The Expansion of Ontologies: Ontologies in Information Systems. L.A. Kalinichenko (IPI RAN). Symposium "Ontological Modeling", KFU, 11-12 October 2010.



Outline
Ontologies as conceptual schemas
Development of description-logic languages in the DB and IS context (before the W3C stack)
W3C language stacks
OWL QL and the DL-Lite family
Ontology-based data access (OBDA) systems
Ontology-based database integration

ONTOLOGIES AS CONCEPTUAL SCHEMAS

Horrocks: what is an ontology

The expansion of ontologies into the IS and DB context
Ontologies are actively expanding into the DB and IS field:
–The use of ontology languages based on description logics in IS and DB is being investigated
–In particular, the use of description logics as conceptual modeling languages and as data definition languages
–Ontologies are applied as database schemas and conceptual schemas
–Architectural solutions for ontology-based data access systems are discussed
This expansion of the ontological approach toward IS and DB is backed by corresponding definitions
The aim of this talk is to analyze the content of this work, its level, its novelty, and its influence on the DB and IS field

Evolution of the notion of "ontology"
In the early 1990s, an effort to create interoperability standards identified the ontology as a standard component of knowledge systems.
According to Gruber, an ontology is a specification of a conceptualization, i.e., a formal description of the concepts and their relations for a universe of discourse.
Gruber 2008: In the context of computer and information sciences, an ontology defines a set of representational primitives with which to model a domain of knowledge or discourse. The representational primitives are typically classes (or sets), attributes (or properties), and relationships (or relations among class members).
In the context of database systems, ontology can be viewed as a level of abstraction of data models, analogous to hierarchical and relational models, but intended for modeling knowledge about individuals, their attributes, and their relationships to other individuals.
Ontologies are said to be at the "semantic" level, whereas database schema are models of data at the "logical" or "physical" level.
Due to their independence from lower level data models, ontologies are used for integrating heterogeneous databases, enabling interoperability among disparate systems, and specifying interfaces to independent, knowledge-based services.

Ontologies: levels
An ontology is a representation scheme that describes a formal conceptualization of a domain of interest (D. Calvanese)
The specification of an ontology usually comprises two distinct levels:
–Intensional level: specifies a set of conceptual elements and of rules to describe the conceptual structures of the domain (compare with the IDB in deductive databases).
–Extensional level: specifies a set of instances of the conceptual elements described at the intensional level (compare with the EDB in deductive databases).
Note: an ontology may also specify a meta-level, which defines a set of modeling categories of which the conceptual elements are instances.

Intensional level
An ontology language for expressing the intensional level usually includes:
–Concepts (vs. entity types in IS)
–Properties of concepts
–Relationships between concepts, and their properties
–Axioms
–Queries
Ontologies are typically rendered as diagrams (e.g., Semantic Networks, Entity-Relationship schemas, UML Class Diagrams).

Ontologies represented as UML class diagrams
Concepts (classes), also called entity types or frames
Properties, also called attributes, slots, or data properties
Relationships, also called associations, relationship types, object attributes, or roles

Axioms

Extensional level
At the extensional level we have individuals and facts:
–An instance represents an individual (or object) in the extension of a concept, e.g., domenico is an instance of Employee
–A fact represents a relationship that holds between instances, e.g., worksFor(domenico, tones)
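The two levels can be illustrated with a minimal Python sketch. It assumes a toy representation in which the intensional level is a set of inclusion axioms and the extensional level is a set of assertions. Only `Employee`, `domenico`, `worksFor`, and `tones` come from the slide; the inclusion `Employee ⊑ Person` and all function names are hypothetical.

```python
# Intensional level (TBox): inclusion axioms written as (sub, super).
# Assumption: the axiom Employee ⊑ Person is invented for illustration.
tbox = {("Employee", "Person")}

# Extensional level (ABox): the instance and the fact given on the slide.
concept_assertions = {("Employee", "domenico")}
role_assertions = {("worksFor", "domenico", "tones")}


def superconcepts(concept):
    """Return the concept plus every concept that subsumes it in the TBox."""
    found, frontier = {concept}, [concept]
    while frontier:
        current = frontier.pop()
        for sub, sup in tbox:
            if sub == current and sup not in found:
                found.add(sup)
                frontier.append(sup)
    return found


def instances_of(concept):
    """Individuals asserted or inferred to belong to the concept."""
    return {ind for c, ind in concept_assertions if concept in superconcepts(c)}


def related(role, subject):
    """Objects linked to the subject through the given role."""
    return {o for r, s, o in role_assertions if r == role and s == subject}
```

Asking for `instances_of("Person")` returns `domenico` even though no such assertion was stored, which is the sense in which the intensional level adds rules on top of the stored instances.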

Comparison with other languages
Ontology languages vs. knowledge representation languages:
–Ontologies are knowledge representation schemas.
Ontology vs. logic:
–Logic is the tool for assigning semantics to ontology languages.
Ontology languages vs. conceptual data models:
–Conceptual schemas are special ontologies, suited for conceptualizing a single logical model (database).
Ontology languages vs. programming languages:
–Class definitions are special ontologies, suited for conceptualizing a single structure for computation.

Classes of ontology languages
Graph-based
–Semantic networks
–Conceptual graphs
–UML class diagrams, Entity-Relationship schemas
Frame-based
–Frame systems
–OKBC, XOL
Logic-based
–Description Logics (e.g., SHOIQ, DLR, DL-Lite, OWL, ...)
–Rules (e.g., RuleML, LP/Prolog, F-Logic)
–First Order Logic (e.g., KIF)
–Non-classical logics (e.g., non-monotonic, probabilistic)

Horrocks: OBIS
Ontology-based Information Systems
–View of data that is independent of logical/physical schema
–Queries use terms familiar to users
–Answers reflect schema & data, e.g.: Patients suffering from Vascular Disease
–Query expansion/navigation/refinement
–Incomplete and semi-structured data
–Integration of heterogeneous sources

What is new?
One gets the impression that ontologies are opening the way to new directions after decades of development and research in conceptual data models, heterogeneous database integration systems, semantic interoperability, and deductive databases
The necessary theoretical foundation and concrete high-level models already exist
These results are now being, as it were, rediscovered and declared achievements of ontology research
In reality, what is at issue is research into the capabilities of ontology languages for the conceptual modeling of IS and DB
Databases, their schemas, information systems, and conceptual schemas do not cease to be such when one set of language facilities or another is used
The terminological husk should be discarded, and what is genuinely new, contributed to DB and IS theory and practice by ontological means, should be filtered out

What is new?
The ontological models considered in publications on their use in DB and IS systems are based on first-order predicate logic, most often on its subsets, the description logics.
In essence, in the DB and IS context ontology languages (in particular, description-logic languages) play the role of [conceptual] data models, and nothing more
One should therefore concentrate on studying the features of description-logic languages and the novelty they bring to the DB and IS context.
What can description-logic data models offer in comparison with relational, object, and other data models?

Conceptual modeling
Conceptual modeling provides abstract, semantic modeling of a subject domain (defining the classes of domain objects, their relationships, and constraints) that is independent of implementation and serves as a means of producing a reference specification reflecting the consensus of a community that includes developers, IS users, and the information systems themselves
Conceptual schemas are also used as global schemas when integrating information resources (databases)
Conceptual schemas are used both at IS design time and at run time

Conceptual modeling (2)
A conceptual schema describes the structure of the subject domain, whereas an ontology is oriented mainly toward defining the concepts used in the domain
Conceptual schemas of a database, besides describing domain object classes and constraints, contain descriptions of object behavior (methods, functions, processes), which ontologies do not contain (yet).
An ontological model of the domain provides definitions of concepts, which are used to annotate the corresponding names of conceptual-schema definitions; this is an example of where ontologies could make an original contribution

Development of description-logic languages in the DB and IS context before the W3C stack

Guarino, 1998
Considers the idea of building ontology-driven information systems.
Discusses in general terms possible approaches to using ontologies at IS design time and at run time.
In essence, it presents well-known concepts of semantically interoperable systems and information resource integration systems
In the scenarios considered, ontologies play the role of resource-independent specifications
The paper restated known approaches in ontology terms.

Description logics of this period
CLASSIC [Borgida, 1989], used by different systems including OBSERVER [1996]
GRAIL [1997]
LOOM [1991] (SIMS [1996])
OIL [2000], used for terminology integration
These languages support almost all common concept-forming operators. An exception is CLASSIC, which does not allow the use of disjunction and negation in concept definitions
OIL can also be used to define instances
LOOM provides reasoning support for the ABox and TBox, but it cannot guarantee soundness and completeness
The terminological axioms that seem to be important are equality and disjointness.

Description logics with rules
CARIN [1999]: a description logic extended with function-free Horn rules (used in the DWQ project by Calvanese)
AL-log [1998]: a combination of a simple description logic with Datalog
DLR: a description logic with n-ary relations, used by Calvanese for information integration [2001]
Integrating description logics with rule-based reasoning makes it necessary to restrict the expressive power of the terminological part of the language in order to remain decidable

Classical frame-based/logic-based languages
Ontolingua [1993]
OKBC [1998]
F-Logic [1995] (used in Ontobroker [1998] and COIN [1997])
These languages provide common elements for the definition of concepts and relations, such as typing, default values and cardinalities.
Further, compared to the description logic languages, the FOL- and frame-based languages have a larger variety of options for capturing terminological knowledge. This is mainly a result of the possibility to define first-order axioms in ontology specifications.

W3C language stacks

Single Semantic Web Stack Architecture (SSA)

SSA ontologically related languages
Languages in the stack directly related to ontology specification:
RDF is a simple language for expressing data, which refer to objects ("resources") and their relationships. An RDF-based model can be represented in XML syntax.
RDF Schema extends RDF and is a vocabulary for describing properties and classes of RDF-based resources, with semantics for generalized-hierarchies of such properties and classes.
OWL adds more vocabulary for describing properties and classes: among others, relations between classes (e.g. disjointness), cardinality (e.g. "exactly one"), equality, richer typing of properties, characteristics of properties (e.g. symmetry), and enumerated classes.

Why not SSA
It is a vain hope that a single upward-compatible language, developed in the Semantic Web's infancy, will suffice for all the future semantic developments on the Web
Every technology, including language design, eventually becomes obsolete, and no technology can address all problems
A more realistic architecture must allow multiple technological stacks to exist side-by-side
E.g., SWRL is a technology which extends OWL-DL. However, rule-based technology (not description logic based) is mature, with decades of theoretical development and practical and commercial use (logic programming and nonmonotonic reasoning (LPNMR))

Multiple Semantic Web Stack Architecture

Overlaps among Different Logics

DLP layer
The Description Logic Programs (DLP) layer is the set of all statements in Description Logic that are translatable into Horn rules (FOL)
This layer at least should be assumed in order to assure upward compatibility: the semantics of DLP in the OWL stack and in the rules stack are the same
Logic languages that are based on pure first-order logic (OWL) do not support constraints and have no notion of their violation. Instead, they provide restrictions (statements about the desired state of the world)
They produce inferences: if a person is said to have at most one spouse and the knowledge base records that John has two, Mary and Ann, then OWL would conclude that Mary and Ann are the same person.
In contrast, a rule base with nonmonotonic semantics will view such a database as inconsistent.
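The spouse example can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not an implementation of OWL or of any rule engine: the "restriction" reading infers equalities between the recorded spouses, while the "constraint" reading (with unique names assumed) reports a violation. The function names are hypothetical; the individuals are the ones named on the slide.

```python
from itertools import combinations

# The knowledge base from the slide: John has two recorded spouses.
spouse = {("john", "mary"), ("john", "ann")}


def group_by_subject(facts):
    """Collect the objects recorded for each subject."""
    grouped = {}
    for subject, obj in facts:
        grouped.setdefault(subject, set()).add(obj)
    return grouped


def restriction_reading(facts):
    """'At most one spouse' as an OWL-style restriction, no unique names:
    several recorded spouses are inferred to denote the same individual."""
    same_as = set()
    for objects in group_by_subject(facts).values():
        same_as.update(combinations(sorted(objects), 2))
    return same_as


def constraint_reading(facts):
    """'At most one spouse' as an integrity constraint, unique names assumed:
    subjects with several recorded spouses violate the constraint."""
    return {s for s, objects in group_by_subject(facts).items() if len(objects) > 1}
```

Here `restriction_reading(spouse)` yields the pair `("ann", "mary")`, read as "ann and mary are the same person", while `constraint_reading(spouse)` yields `{"john"}`, read as "the data about john is inconsistent".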

Interoperability in MSA through language extension: problems
It is claimed that it is incorrect to say that Datalog is an extension of the DLP layer because, given a single fact such as knows(pat,jo), DLP and Datalog give different answers to the question of whether pat knows exactly one person.
Under the OWL semantics the answer will be "unknown", since it is not possible to either prove or disprove that pat knows exactly one person; under the rules semantics the answer will be "yes".
Both answers are right! A user who chooses to write an application using the rules stack does so because of a desire to use the language and semantics of that stack. Otherwise, a competent user should choose OWL and SWRL.
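The two answers to "does pat know exactly one person?" can be reproduced with a small Python sketch. It is a toy three-valued evaluation under stated assumptions, not a reasoner: the parameters `at_most_one` (a cardinality axiom is present) and `distinct` (pairs of individuals known to differ) are hypothetical names introduced here for illustration.

```python
knows = {("pat", "jo")}  # the single fact from the slide


def closed_world_exactly_one(facts, subject):
    """Rules-stack reading: what is not recorded is false."""
    objects = {o for s, o in facts if s == subject}
    return "yes" if len(objects) == 1 else "no"


def open_world_exactly_one(facts, subject, at_most_one=False, distinct=frozenset()):
    """OWL-style reading: answer only what can be proved either way."""
    objects = {o for s, o in facts if s == subject}
    # "No" needs two recorded objects that are provably different individuals.
    two_distinct = any(
        frozenset((a, b)) in distinct for a in objects for b in objects if a != b
    )
    if two_distinct and at_most_one:
        raise ValueError("inconsistent knowledge base")
    if two_distinct:
        return "no"
    # "Yes" needs at least one recorded object plus an at-most-one axiom.
    if objects and at_most_one:
        return "yes"
    return "unknown"
```

With only the fact `knows(pat, jo)`, the closed-world function answers "yes" and the open-world function answers "unknown"; the open-world answer becomes "yes" only once an at-most-one axiom is supplied.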

Databases and MSA
MSA is extensible, and additional stacks can be added to it as long as they can be made interoperable (e.g., OWL QL, OWL RL)
Each layer is a syntactic and semantic extension of the previous layer
Application of the W3C description logics in DB & IS:
–Straightforward: OWL+RDFS, RDF
–As an extension above large databases (probably relational), e.g., OWL QL
However, it seems that OWL cannot be considered a pure extension of the relational layer, at least because of the difference between constraints and restrictions mentioned above. It would be good to analyze OWL QL in this respect

OWL QL and the DL-Lite family

DL-Lite family
DL-Lite objectives: to capture basic ontology languages while keeping the complexity of reasoning low (reasoning also includes answering unions of conjunctive queries over the instance level (ABox))
DL-Lite reasoning tasks are polynomial in the size of the TBox, and query answering is LogSpace in the size of the ABox
DL-Lite allows for a separation between TBox and ABox reasoning during query evaluation: the part of the process requiring TBox reasoning is independent of the ABox, and the part of the process requiring access to the ABox can be carried out by an SQL engine
The logics of the DL-Lite family are the maximal DLs supporting efficient query answering over large amounts of instances.

The DL-Lite family of description logics

DL-Lite family formation
DL-Lite core allows for expressing ISA assertions on concepts, disjointness between concepts, role-typing, and participation constraints
DL-Lite F adds to the core the possibility of expressing functionality restrictions on roles
DL-Lite R adds ISA and disjointness assertions between roles. OWL 2 QL is based on DL-Lite R
DL-Lite A adds the possibility of using role inclusion assertions and functionality assertions together, and so on
Very simple DLs like DL-Lite R are suitable for supporting basic ontology languages, conceptual data models (e.g., Entity-Relationship), and object-oriented formalisms (e.g., UML class diagrams)

OWL axioms supported by DL-Lite R
–subclass axioms (SubClassOf)
–class expression equivalence (EquivalentClasses)
–class expression disjointness (DisjointClasses)
–inverse object properties (InverseObjectProperties)
–property inclusion (SubObjectPropertyOf not involving property chains, and SubDataPropertyOf)
–property equivalence (EquivalentObjectProperties and EquivalentDataProperties)
–property domain (ObjectPropertyDomain and DataPropertyDomain)
–property range (ObjectPropertyRange and DataPropertyRange)
–disjoint properties (DisjointObjectProperties and DisjointDataProperties)
–symmetric properties (SymmetricObjectProperty)
–assertions other than the equality assertions (DifferentIndividuals, ClassAssertion, ObjectPropertyAssertion, and DataPropertyAssertion)

OWL axioms not supported by DL-Lite R (1)
–existential quantification to a class expression or a data range (ObjectSomeValuesFrom and DataSomeValuesFrom in the subclass position)
–self-restriction (ObjectExistsSelf)
–existential quantification to an individual or a literal (ObjectHasValue, DataHasValue)
–nominals (ObjectOneOf, DataOneOf)
–universal quantification to a class expression or a data range (ObjectAllValuesFrom, DataAllValuesFrom)
–cardinality restrictions (ObjectMaxCardinality, ObjectMinCardinality, ObjectExactCardinality, DataMaxCardinality, DataMinCardinality, DataExactCardinality)
–disjunction (ObjectUnionOf, DisjointUnion)

OWL axioms not supported by DL-Lite R (2)
–property inclusions (SubObjectPropertyOf involving property chains)
–functional and inverse-functional properties (FunctionalObjectProperty, InverseFunctionalObjectProperty, and FunctionalDataProperty)
–transitive properties (TransitiveObjectProperty)
–reflexive properties (ReflexiveObjectProperty)
–irreflexive properties (IrreflexiveObjectProperty)
–asymmetric properties (AsymmetricObjectProperty)
–keys (HasKey)

ONTOLOGY BASED DATA ACCESS (OBDA) SYSTEMS

Ontologies at the core of information systems
Use of information resources based on a conceptualization of subject domains.
Data access mediated by an ontology (a conceptual view of the data)

Linking data to DL-Lite R
Relational databases store values, whereas instances of concepts are objects; each object should be denoted by an ad hoc identifier (the impedance mismatch problem)
This idea traces back to the work done in deductive object-oriented databases, which provided Skolem functions taking values as arguments and returning an OID
Mapping relational schemas into concept definitions is straightforward
Through a mapping we associate a conjunctive query over atomic concepts, domains, roles, attributes, and role attributes (generically referred to as predicates in the following) with a first-order (more precisely, SQL) query of the appropriate arity over the database.
Formally, a mapping assertion is an assertion of the form φ ⇝ ψ, where φ is an arbitrary SQL query of arity n > 0 over DB, and ψ is a UCQ over T of arity n > 0 without non-distinguished variables
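A mapping assertion of this kind can be sketched with Python's standard `sqlite3` module. Everything concrete here is an assumption made for illustration: the table `emp(ssn, project)`, the Skolem-style constructors `pers` and `proj`, and the dictionary layout of a mapping. The sketch materializes the ABox induced by the mapping, whereas an OBDA system would normally keep it virtual.

```python
import sqlite3


def pers(ssn):
    """Hypothetical Skolem-style constructor: value -> object identifier."""
    return f"pers({ssn})"


def proj(name):
    """Hypothetical Skolem-style constructor for project objects."""
    return f"proj({name})"


# Hypothetical source database holding plain values.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (ssn TEXT, project TEXT)")
conn.execute("INSERT INTO emp VALUES ('123', 'tones')")

# One mapping assertion: an SQL query (left-hand side) and the ontology
# atoms it populates (right-hand side), built from object terms.
mapping = {
    "sql": "SELECT ssn, project FROM emp",
    "head": lambda ssn, project: [
        ("Employee", pers(ssn)),
        ("worksFor", pers(ssn), proj(project)),
    ],
}


def induced_abox(connection, mappings):
    """Evaluate each SQL query and instantiate the right-hand side atoms."""
    facts = set()
    for m in mappings:
        for row in connection.execute(m["sql"]):
            facts.update(m["head"](*row))
    return facts


abox = induced_abox(conn, [mapping])
```

The database only ever contains the values `'123'` and `'tones'`; the objects `pers(123)` and `proj(tones)` exist only on the ontology side, which is how the object terms bridge the impedance mismatch.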

QA applying DL-Lite R
An ontology with mappings is denoted (T, M, DB), where DB is a database as defined above, T is a DL-Lite R TBox, and M is a set of mapping assertions between DB and T.
An interpretation I is a model of (T, M, DB) if it is a model of T and satisfies all mapping assertions in M with respect to DB.
The ontology conveys only incomplete information about the domain of interest, and we want to guarantee that the answers to a query that we obtain are certain, independently of how we complete this incomplete information.
For QA we split each mapping assertion φ ⇝ ψ into several assertions of the form φ ⇝ p, one for each atom p in ψ
We unify the atoms in the query q to be evaluated with the right-hand side atoms of the mappings, thus obtaining a UCQ
Then, we unfold each atom with the corresponding left-hand side mapping query. Observe that, after unfolding, we obtain an SQL query.
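The rewrite-then-unfold pipeline can be sketched as follows, using Python's standard `sqlite3` module. This is a deliberately reduced illustration, not the algorithm of any particular system: it handles only single-atom queries over atomic concepts and only inclusions between atomic concepts. The TBox axiom `Manager ⊑ Employee`, the tables `emp` and `mgr`, and all function names are hypothetical.

```python
import sqlite3

# Hypothetical TBox: inclusions between atomic concepts, written (sub, super).
tbox = [("Manager", "Employee")]

# Hypothetical split mappings: one SQL query per ontology atom.
mappings = {
    "Employee": ["SELECT 'pers(' || ssn || ')' FROM emp"],
    "Manager": ["SELECT 'pers(' || ssn || ')' FROM mgr"],
}


def rewrite(concept, axioms):
    """TBox step: expand an atomic query into the set of concepts whose
    instances are certain answers (independent of the data)."""
    result, frontier = {concept}, [concept]
    while frontier:
        current = frontier.pop()
        for sub, sup in axioms:
            if sup == current and sub not in result:
                result.add(sub)
                frontier.append(sub)
    return result


def unfold(concepts, atom_mappings):
    """Mapping step: replace each atom by its SQL queries and union them.
    Returns None when no atom is mapped to the sources."""
    parts = [sql for c in sorted(concepts) for sql in atom_mappings.get(c, [])]
    return " UNION ".join(parts) if parts else None


# Hypothetical source data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (ssn TEXT)")
conn.execute("CREATE TABLE mgr (ssn TEXT)")
conn.execute("INSERT INTO emp VALUES ('1')")
conn.execute("INSERT INTO mgr VALUES ('2')")

sql = unfold(rewrite("Employee", tbox), mappings)
answers = {row[0] for row in conn.execute(sql)}
```

The TBox is consulted only in `rewrite` and the data only when the final SQL string is executed, which mirrors the separation between intensional reasoning and evaluation by the SQL engine. The manager with `ssn` 2 is returned as an employee even though the `emp` table does not contain it.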

QA by "rewriting"

Reasoning with respect to the TBox and ABox

The QuOnto system
QuOnto is a tool for representing and reasoning over ontologies of the DL-Lite family. The basic functionalities it offers are:
–Ontology satisfiability check
–Intensional reasoning services: concept/property subsumption and disjunction, concept/property satisfiability
–Query answering of UCQs
Reasoning services are optimized
Can be used with internal and external DBMSs (includes drivers for Oracle, DB2, IBM Information Integrator, SQL Server, MySQL, etc.)
Implemented in Java; APIs are available for selected projects upon request.

ONTOLOGY-BASED DATABASE INTEGRATION

The conceptual level in OBDI

A solution based on DL-Lite
The (federated) source database is external and independent from the conceptual view (the ontology).
Mappings relate information in the sources to the ontology; they define a sort of virtual ABox
GAV is used: mappings are such that the result of an (arbitrary) SQL query on the source database is considered a (partial) extension of a concept/role.
The distinction between objects and values in DL-Lite A is used to deal with the impedance mismatch problem

MASTRO-I: the OBDI system
MASTRO-I is based on the system QuOnto, a reasoner for DL-Lite A, and is coupled with a data federation tool:
–the global schema is expressed in terms of a TBox of DL-Lite A
–the mapping language allows for expressing GAV sound mappings between the sources and the global schema (the source schema in MASTRO-I is assumed to be ONE flat relational database schema obtained from the federated DB)
–the mapping language has specific mechanisms for addressing the impedance mismatch problem (values in the sources vs. instances of concepts in the ontology as objects)
–answering unions of conjunctive queries can be done through a very efficient technique (LOGSPACE with respect to data complexity) which reduces this task to standard SQL query evaluation.

Discussion
The analysis shows that work on ontologies in the DB and IS field brings nothing new to that field. Rather, it manages to reproduce some results from deductive databases and integration systems
In effect, the research comes down to creating ontology languages for conceptual modeling, i.e., creating conceptual data models based on description logics
The resulting databases with an OWL schema can easily be integrated with other databases in database integration systems. For example, in the SYNTHESIS (СИНТЕЗ) system, integrating such OWL-based databases only requires creating a corresponding adapter (a semantics-preserving mapping of OWL into SYNTHESIS has been constructed)

Discussion
The creation of the DL-Lite family of description logics, as maximal subsets of constructs with acceptable complexity, should be considered an important result
These results are reflected in OWL 2 QL
The mapping of databases into DL-Lite in the presence of integrity constraints in database schemas should be investigated
Architecturally, there are no notable results
Instead, an active terminological expansion is under way, a campaign that should be treated with indulgence.
What is the outlook? Is the expansion of ontologies toward DB and IS a reflection of a crisis of ideas?

References
Nicola Guarino. Formal Ontology in Information Systems. Proceedings of FOIS98, Trento, Italy, 6-8 June. Amsterdam, IOS Press, pp
H. Wache, T. Vögele, U. Visser, H. Stuckenschmidt, G. Schuster, H. Neumann and S. Hubner. Ontology-Based Integration of Information, A Survey of Existing Approaches, 2001 WS
Thomas R. Gruber. Toward Principles for the Design of Ontologies Used for Knowledge Sharing. International Journal Human-Computer Studies 43, p
T. Gruber (2008). "Ontology". In the Encyclopedia of Database Systems, Ling Liu and M. Tamer Özsu (Eds.), Springer-Verlag
Michael Kifer, Jos de Bruijn, Harold Boley, and Dieter Fensel. A Realistic Architecture for the Semantic Web, 2005
OWL 2 Web Ontology Language: Profiles, 2009

References
Diego Calvanese, Giuseppe De Giacomo, Domenico Lembo, Maurizio Lenzerini, Antonella Poggi, Riccardo Rosati. Ontology-based database access. Proc. of the 15th Italian Conf. on Database Systems (SEBD 2007)
Diego Calvanese. Ontology-based Data Management. Masters Ontology Spring School 2009, September, projects/events/moss09-1/MOSS-09-OBDM-calvanese-draft.pdf
Diego Calvanese, Giuseppe De Giacomo, Domenico Lembo, Maurizio Lenzerini, Antonella Poggi, Mariano Rodriguez-Muro, and Riccardo Rosati. Ontologies and Databases: The DL-Lite Approach. Reasoning Web 2009, LNCS 5689, pp. 255–356, Springer-Verlag Berlin Heidelberg 2009

Expressiveness of the evaluated description logics

Expressiveness of the extended description logics

Expressiveness of FOL and frame-based languages