Velina Slavova (Bulgaria) Vladimir Polyakov (Russia) THE METRICS OF COMPLEXITY BASED ON SYSTEM OF CASE RELATIONS IN TYPOLOGICAL STRUCTURE OF THE LANGUAGE.


Velina Slavova (Bulgaria), Vladimir Polyakov (Russia)
THE METRICS OF COMPLEXITY BASED ON SYSTEM OF CASE RELATIONS IN TYPOLOGICAL STRUCTURE OF THE LANGUAGE (ON THE DATA OF DB «LANGUAGES OF THE WORLD») (*)
* The research was supported by the Russian Scientific Foundation of Humanities (grant в)

Screenshots. Win Version

Source of Data for DB JM
- The encyclopedic series Jaziki Mira (Languages of the World): 18 volumes, published by the Institute of Linguistics of the Russian Academy of Sciences since 1993.
- Large Encyclopedic Dictionary. Linguistics (edited by V.N. Yarceva): contains the interpretation of all terms used in the model of the DB.

List of some Encyclopedic Publications: Jaziki Mira (Languages of the World)
- Languages of the world: Uralic (1993).
- Languages of the world: Paleoasiatic languages. Moscow: Publ. Indrik (1996).
- Languages of the world: Turkic. Moscow: Publ. Indrik (1997).
- Languages of the world: Mongolic languages. Manchu-Tungus languages. Japanese. Korean. (Ed.: Kibrik A.A., Rogova N.B., Romanova O.I.) Moscow: Publ. Indrik (1997).
- Languages of the world: Iranian languages. I. South-Western Iranian languages. Moscow: Publ. Indrik (1997).
- Languages of the world: Iranian languages. II. North-Western Iranian languages. Moscow: Publ. Indrik (1999), 302 p.
- Languages of the world: Dardic and Nuristani languages. Moscow: Publ. Indrik (1998).
- Languages of the world: Iranian languages. III. East Iranian languages. Moscow: Publ. Indrik (1999).
- Languages of the world: Germanic languages. Celtic languages. Moscow: Publ. Academia (1999).
- Languages of the world: Caucasian languages. RAS, Institute of Linguistics. Moscow: Publ. Academia (2001).
- Languages of the world: Romance languages. Moscow: Publ. Academia (2001).
- Languages of the world: Indo-Aryan languages of Ancient and Middle Period. Moscow: Publ. Academia (2004).
- Languages of the world: Slavonic languages. RAS, Institute of Linguistics. (Ed.: A.M. Moldovan, S.S. Skorvid, A.A. Kibrik) Moscow: Publ. Academia (2005).
- Languages of the world: Baltic languages. RAS, Institute of Linguistics. (Ed.: V.N. Toporov, M.V. Zavyalov, A.A. Kibrik) Moscow: Publ. Academia (2006), 224 p.

Dictionary and source books
(Images: the dictionary; two of the 18 source books.)

Characteristics of the Data Base Languages of the World. Content
The Data Base Languages of the World has the following quantitative characteristics:
- it contains more than 3800 features;
- it covers 315 Eurasian languages;
- it describes the following spheres of language: phonetics, morphology, syntax;
- representation of data: binary.
The following language families and unities are represented in the Data Base: Austroasiatic, Austronesian, Altaic, Afroasiatic, Indo-European, Caucasian, Paleoasiatic, Sino-Tibetan, Uralic, Hurro-Urartian.
The DB also contains descriptions of language isolates: Ainu, Nivkh, Burushaski, Sumerian, Elamite.
A unique peculiarity of the Data Base is its large collection of descriptions of extinct languages, which includes 54 essays. There are no analogues of such a detailed and systematic description of extinct languages.
The main principles behind the model of language description are binarity, hierarchicity and paradigmaticity.

Task Formulation
1. Different grammatical constructions are supposed to require different brain resources for processing.
2. A second supposition is that the total amount of brain resources needed to process utterances of approximately equal meaning must be constant.
3. Semantic cases (Fillmore's cases) can serve as an example of a complex construction for verifying these statements.
4. The DB Jazyki Mira contains semantic cases that form a rather wide paradigm.

Example
Let us study an example of the accusative case:
Суд обвинил Вас-ю в краже.
The court accused Basil of robbery.
In Russian, case is marked by the form of the noun (Вас-ю) and by a preposition (в); in English, it is marked only by a preposition (of).

Method of Data Processing
Velina Slavova used the data of the DB Jazyki Mira to obtain a more convenient representation of the case paradigm. After a rather sophisticated reduction, we obtained the first results, which show examples of correlation between different case systems.

Case description in DB. Scope of the research.
In the DB JM, 405 grammar features are devoted to the case system (in the Part number of Model). In this research only actant case meanings were investigated (140 grammar features). They were divided into six fragments:
--subject/object
--contrastive case formation of subject
--contrastive case formation of object
--method of expressing subject-object meanings
--other actant cases
--case of nominal predicate
At the first step only four fragments were investigated.

Examples of case description
The first list shows part of the subject/object paradigms in the DB; the fragment that follows it shows part of the description of the English language.

--subject/object
---absolutive
---absolutive/relative
---dative
---narrative
---nominative/accusative
---nominative/accusative/genitive
---nominative/accusative-genitive
---nominative/accusative/indefinite accusative
---nominative/accusative/genitive/partitive
---nominative/accusative/privative/sociative
---nominative/accusative/locative
---nominative/accusative/partitive
---nominative/dative-accusative
---nominative/narrative
---nominative/partitive
---nominative/genitive
---nominative/genitive/partitive
---nominative/general indirect
---nominative/ergative
---nominative/ergative/genitive

*LANGUAGE DENOMINATION.English
…………………………………………………
CASE MEANINGS
.actant case meanings
..subjective/objective
...general case/accusative
..contrastive case formation
...of object
....nouns and pronouns
..method of expressing subject-object meanings
...case affixes
...word order
...auxiliary words
....in preposition
.case of attributive relation
..prepositional construction
.case of possessive relation
..prepositional construction
..possessive affix at possessor's name
.case of locative relations
..method of expression
...prepositions
………………………………………………..
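The description fragment for English can be read as a tree, assuming that the number of leading dots encodes nesting depth (an interpretation of the notation, not something the slides state). A minimal sketch in Python of how such a fragment could be turned into full feature paths; the function name `parse_dotted` is hypothetical and not part of the DB software:

```python
def parse_dotted(lines):
    """Turn dot-indented lines into full feature paths (one tuple per line).

    Assumption: the count of leading dots is the nesting depth of the feature.
    """
    paths = []
    stack = []  # labels of the current ancestors, one per depth level
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        depth = len(line) - len(line.lstrip("."))
        label = line.lstrip(".").strip()
        # Drop ancestors at the same or a deeper level, then descend.
        del stack[depth:]
        stack.append(label)
        paths.append(tuple(stack))
    return paths


# Lines copied from the English fragment shown on the slide.
fragment = [
    "CASE MEANINGS",
    ".actant case meanings",
    "..subjective/objective",
    "...general case/accusative",
    "..contrastive case formation",
    "...of object",
    "....nouns and pronouns",
]

english_paths = parse_dotted(fragment)
for path in english_paths:
    print(" > ".join(path))
```

Each printed line is the full path from the root feature to one node, which makes it straightforward to check whether a given feature is present in a language description.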

Metrics of complexity
A separate metric of complexity was developed for each of the six parts.

Part of case description (complex characteristic)    Type of feature coding        Metric
--subject/object                                     Paradigm, only one choice     Maximal number of cases marked in the language
--contrastive case formation of subject              Multi-choice                  Number of features present in the language
--contrastive case formation of object               Multi-choice                  Number of features present in the language
--method of expressing subject-object meanings       Multi-choice                  Number of features present in the language
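The two kinds of metric can be sketched in Python as follows. This is only an illustration under two assumptions that the slides do not confirm: that the cases of a subject/object paradigm are the slash-separated labels in its name, and that a multi-choice part is stored as a mapping from feature name to a binary flag. The function names are hypothetical.

```python
def paradigm_size(paradigm_label):
    """Metric for a single-choice paradigm: number of cases it names.

    Assumption: cases are separated by "/" in the paradigm label.
    """
    return len([part for part in paradigm_label.split("/") if part.strip()])


def features_present(feature_flags):
    """Metric for a multi-choice part: number of features marked as present."""
    return sum(1 for present in feature_flags.values() if present)


# Values taken from the English fragment on the previous slide.
english_subject_object = "general case/accusative"
english_expression_methods = {
    "case affixes": True,
    "word order": True,
    "auxiliary words": True,
}

print(paradigm_size(english_subject_object))
print(features_present(english_expression_methods))
```

Under these assumptions a label such as nominative/accusative-genitive counts as two cases, because the hyphenated part is treated as a single label.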

Correlation Analysis
There are good correlations between three of the complex characteristics (marked in yellow).
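The slides do not say which correlation coefficient was used. As a sketch, assuming the Pearson coefficient, the correlation between two complex characteristics could be computed like this. The numbers below are made-up placeholders, not values from the DB:

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up illustration only: one value per language for two characteristics.
paradigm_sizes = [1, 2, 2, 4, 6]       # hypothetical subject/object metric
expression_counts = [5, 4, 4, 2, 1]    # hypothetical expression-method metric

r = pearson(paradigm_sizes, expression_counts)
print(round(r, 3))
```

A coefficient close to -1 would mean that one characteristic grows as the other shrinks; a coefficient close to +1 would mean that they grow together.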

Factor Analysis
There are two groups of factors (#1 in yellow, #2 in blue).

Tree Analysis
The distances between languages under this measure of the complexity of subject/object (SO) syntactic rules seem to keep languages from some genealogical groups close together. Nevertheless, the Indo-European languages are VERY dispersed. OLD languages seem to stay apart!
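The slides do not name the distance measure or the tree-building method. As a sketch under stated assumptions (Euclidean distance between per-language metric vectors, single-linkage agglomerative clustering), a tree of this kind could be built as follows. The language names and vectors are made-up placeholders, not DB data:

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two metric vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def single_linkage(vectors):
    """Agglomerative clustering; returns the merge history.

    vectors: dict mapping a language name to its tuple of metric values.
    Each history entry is (cluster_a, cluster_b, distance), where clusters
    are frozensets of language names.
    """
    clusters = [frozenset([name]) for name in vectors]
    history = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            # Single linkage: distance between the closest pair of members.
            d = min(euclidean(vectors[p], vectors[q]) for p in a for q in b)
            if best is None or d < best[2]:
                best = (a, b, d)
        a, b, d = best
        clusters = [c for c in clusters if c not in (a, b)]
        clusters.append(a | b)
        history.append((a, b, d))
    return history


# Placeholder names and made-up vectors of the four complexity metrics.
toy = {
    "lang_A": (2, 3, 1, 3),
    "lang_B": (2, 3, 2, 3),
    "lang_C": (6, 0, 0, 1),
    "lang_D": (7, 0, 0, 1),
}

merges = single_linkage(toy)
for a, b, d in merges:
    print(sorted(a), "+", sorted(b), "at distance", round(d, 2))
```

The merge history read from first to last entry gives the tree from the leaves upward; languages that merge early are the ones the complexity measure places close together.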

ANALYSIS OF RESULTS
1. The hypothesis that the complexity of the grammatical structure of a language is preserved at a certain level was confirmed. The study showed that languages with a complex case paradigm have simpler grammatical means of expressing cases and fewer differences in the description of cases for subject/object. Languages with a simple case paradigm have more complex means of expressing case relations and more differences in the description of cases for subject/object. This dichotomy explains 76% of the variation in the content of the DB Jazyki Mira.
2. In general, such a description of the case system (as two groups of factors) correlates well with the genealogical tree. The exception is the Indo-European language family, which may be explained by the wide geographical spread of the Indo-European languages and, consequently, intensive borrowing during areal contacts. This hypothesis requires additional checking.

AS A CONCLUSION
The present report is intended to show that the DB Jazyki Mira is an interesting resource for studying the complexity of different parts of the grammar of a language. We have gained only our first experience. The methods and approaches are still being established and developed. Work in this direction will be continued.

Thank you for your attention Contacts: