1 NOOJ Conference Inalco, Saarbruecken June 5th, 2013 Vincent BÉNET INALCO CREE Recherche assistée par ordinateur Conception and realisation of semantic.

Transcript:

1 NooJ Conference, Inalco, Saarbruecken, June 5th, 2013
Vincent BÉNET, INALCO CREE, Recherche assistée par ordinateur
Design and implementation of semantic tags for the Russian language for Max Silberztein's NooJ software
Russian Module for NooJ: Semantic annotation

2 Russian Module for NooJ: design and implementation of lexical and grammatical resources
- one main dictionary (95,000 entries)
- two annex dictionaries: one for proper nouns, one for noun-adjectives

3 Russian Module for NooJ: design and implementation of basic semantic resources
Semantic tagging or annotation?
How?
- by adding tags to the general dictionary
- by writing grammars

4 Writing semantic resources for the Russian language
The semantic tags of the Russian National Corpus:
Taxonomy (a lexeme's thematic class): for nouns, verbs, adjectives and adverbs
Mereology (part-whole and element-aggregate relationships): for concrete and abstract nouns
Topology: for concrete nouns
Causation: for verbs
Evaluation: for abstract and concrete nouns, adjectives and adverbs

5 Writing semantic resources for the Russian language
27 semantic taxonomic tags for verbs, including:
t:move: movement (бежать, дергаться, бросить, нести)
t:be: sphere of existence (жить, возникнуть, убить)
t:loc: location (лежать, стоять, положить)
t:poss: sphere of possession (иметь, дать, подарить, приобрести)
t:ment: mental sphere (знать, верить, догадаться, помнить)
t:perc: perception (смотреть, слышать, нюхать, чуять)
t:speech: speech (говорить, советовать, спорить, каламбурить)
t:sound: sounds (гудеть, шелестеть)
t:light: light (гаснуть, лучиться)

6 Semantic information in the Russian National Corpus (Verbs)

7 Writing semantic resources for the Russian language
khodit,V+Mvt+Indet+ipf+intr+FLX=ходить
Idti,V+Mvt+Det+ipf+intr+FLX=идти
Vkhodit,V+Mvt+Pvb+ipf+intr+FLX=ходить
Vojti,V+Mvt+Pvb+pf+intr+FLX=идти
Vykhodit,V+Mvt+Pvb+ipf+intr+FLX=ходить
Priezzhat,V+Mvt+Pvb+ipf+intr+FLX=акать
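The entries above follow a regular pattern: a lemma, a comma, a category, then features joined by "+". As a rough illustration of what selecting motion verbs amounts to, here is a minimal Python sketch. It is not NooJ's own implementation; the function names parse_entry and select are hypothetical, and the sample list mixes a few entries from the slides.

```python
def parse_entry(line):
    """Split 'lemma,CAT+feat+key=value' into its parts."""
    lemma, _, info = line.partition(",")
    category, *rest = [part.strip() for part in info.split("+")]
    flags = {p for p in rest if "=" not in p}                 # e.g. Mvt, ipf
    props = dict(p.split("=", 1) for p in rest if "=" in p)   # e.g. FLX=...
    return {"lemma": lemma.strip(), "category": category,
            "flags": flags, "props": props}


def select(entries, category, *required):
    """Return lemmas of the given category that carry every required flag."""
    parsed = (parse_entry(e) for e in entries)
    return [p["lemma"] for p in parsed
            if p["category"] == category and set(required) <= p["flags"]]


# Sample entries copied from the slides (one noun added as a non-match).
ENTRIES = [
    "khodit,V+Mvt+Indet+ipf+intr+FLX=ходить",
    "Idti,V+Mvt+Det+ipf+intr+FLX=идти",
    "Vojti,V+Mvt+Pvb+pf+intr+FLX=идти",
    "malchik,N+an+Hum+FLX=buldog",
]

motion_verbs = select(ENTRIES, "V", "Mvt")
perfective_motion = select(ENTRIES, "V", "Mvt", "pf")
print(motion_verbs)
print(perfective_motion)
```

Adding further flags to the call narrows the selection, which mirrors how combining tags in the dictionary lets one query finer classes of verbs.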

8 Grammar to locate the verbs of motion

9 Searching for "verbs of motion" with NooJ

10 Searching for "verbs of motion" with NooJ

11 Writing semantic resources for the Russian language
concrete nouns (девочка, стол, молоко)
abstract nouns (вождение, яркость, время)
proper names (Иван, Эйнштейн, Петроград)
person (человек, учитель)
ethnonyms (эфиоп, итальянка)
kinship terms (брат, бабушка)
supernatural creatures (русалка, инопланетянин)
animals (корова, жираф, сорока, ящерица, муравей)
plants (береза, роза, трава)
and so on

12 Semantic information in the Russian National Corpus (Nouns)

13 Semantic information in the Russian National Corpus (Adjectives)

14 Semantic information in the Russian National Corpus (Adverbs)

15 Writing basic semantic resources for the Russian language
NooJ properties.def file:
N_Genre = m | f | n ;
N_SGenr = an | inan ;
N_Nombre = s | p;
N_Cas = Im | Vi | Ro | R2 | Da | Tv | Pr | P2 | Zv ;
…
V_Type = Mvt;
V_Morph = Pref | Suff;
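The property definitions above read as "Category_Property = value | value ;". A small Python sketch, assuming only that surface pattern, shows how such a file can be read into a lookup table so that a bare value such as inan can be traced back to the property it belongs to. The helper names parse_properties and property_of are hypothetical and this is not NooJ's own loader.

```python
# Definitions copied from the slide (the elided part is left out).
PROPERTIES_DEF = """
N_Genre = m | f | n ;
N_SGenr = an | inan ;
N_Nombre = s | p;
N_Cas = Im | Vi | Ro | R2 | Da | Tv | Pr | P2 | Zv ;
V_Type = Mvt;
V_Morph = Pref | Suff;
"""


def parse_properties(text):
    """Map (category, property) to the list of values it allows."""
    table = {}
    for statement in text.split(";"):
        if "=" not in statement:
            continue  # skip blank remainders between statements
        name, values = statement.split("=", 1)
        category, _, prop = name.strip().partition("_")
        table[(category, prop)] = [v.strip() for v in values.split("|")]
    return table


def property_of(table, category, value):
    """Find which property of a category a given value belongs to."""
    for (cat, prop), values in table.items():
        if cat == category and value in values:
            return prop
    return None


table = parse_properties(PROPERTIES_DEF)
print(property_of(table, "N", "inan"))
print(property_of(table, "V", "Mvt"))
```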

16 Writing basic semantic resources for the Russian language
NooJ properties.def file:
A_Sem = Animal; Color ( Hum = App)
N_Sem = Hum | Prof | Parents | Body
Conc | Abstr | Org | Text | Animal | Food | Health | Arts | Lit | Music | Sports
Topo | Country | River | City | Mount | Lake | Posit | Time | Color ;
ADV_Sem = Time | Topo | Modal;
V_Sem = Color | Topo | Posit | Modal;

17 Writing semantic resources for the Russian language
malchik,N+an+Hum+FLX=buldog
pered tem kak,CONJ+UNAMB+Time
Moskva,N+f+inan+City+FLX=Москва
Don,N+m+inan+River+FLX=Дон
Katar,N+Country+m+s+FLX=Ленинград
Nora,N+Forename+Hum+f+an+FLX=Лена

18 Writing semantic resources for the Russian language
zelënyj,A+Color+FLX=novyj
zelenovatyj,A+Color+FLX=
zelënenkij,A+Color+FLX=novyj
temno-zelënyj,A+Color+FLX=novyj
zelen,N+f+inan+Color+FLX=smes
zelenet,V+intr+ipf+Color+FLX=belet
zazelenet,V+intr+pf+Color+FLX=belet
zazelenetsja,V+sja+pf+Color+FLX=….
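The Color tag is attached to adjectives, nouns and verbs alike, so one semantic tag can gather a whole word family across parts of speech. The following Python sketch, which assumes only the "lemma,CAT+feature+..." layout of the entries shown on the slides, groups the lemmas carrying a given tag by category. The function name lemmas_by_category is hypothetical.

```python
from collections import defaultdict

# A few entries from the slides, plus one entry without the Color tag.
ENTRIES = [
    "zelënyj,A+Color+FLX=novyj",
    "zelen,N+f+inan+Color+FLX=smes",
    "zelenet,V+intr+ipf+Color+FLX=belet",
    "zazelenet,V+intr+pf+Color+FLX=belet",
    "Moskva,N+f+inan+City+FLX=Москва",
]


def lemmas_by_category(entries, tag):
    """Group the lemmas that carry `tag` by their part-of-speech category."""
    groups = defaultdict(list)
    for line in entries:
        lemma, _, info = line.partition(",")
        category, *features = [f.strip() for f in info.split("+")]
        if tag in features:
            groups[category].append(lemma.strip())
    return dict(groups)


color_words = lemmas_by_category(ENTRIES, "Color")
print(color_words)
```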

19 Writing basic semantic resources for the Russian language
Number of items per tag:
Prof = 900
Parent = 160
Forenames = 2280
Animal = 370
Food = 280 (Liquid = 25)
Body = 285
Health = 175
Arts = 65
Lit = 40
Music = 155
Sport = 65
Topo = 40
Country = 180
River = 15
City = 175
Mount = 5
Lake = 5
Posit = 25
Time = 135
Modal = 15
Color = 275

20 Searching for "colors" with NooJ

21 Searching for "body parts" with NooJ

22 Searching for "parents (relatives)" words with NooJ

23 Writing basic semantic resources for the Russian language
Next work to be done:
- completing the dictionary for concrete nouns using thematic dictionaries
- adding a new parameter to the dictionary, +Translation=, so that NooJ can be used as a resource to build basic dictionaries for parallel corpora
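To make the planned +Translation= parameter concrete, here is a hedged Python sketch of how a basic bilingual word list could be pulled from entries once they carry that field. The slides do not show any entry with a translation, so the sample entries, the English values and the function name translation_pairs are all assumptions made for illustration.

```python
# Hypothetical entries: the Translation values are illustrative assumptions,
# not taken from the actual dictionary.
ENTRIES = [
    "malchik,N+an+Hum+Translation=boy+FLX=buldog",
    "Moskva,N+f+inan+City+Translation=Moscow+FLX=Москва",
    "Don,N+m+inan+River+FLX=Дон",  # no translation field yet
]


def translation_pairs(entries):
    """Collect lemma -> translation for entries that have a Translation field."""
    pairs = {}
    for line in entries:
        lemma, _, info = line.partition(",")
        for feature in info.split("+"):
            key, sep, value = feature.partition("=")
            if sep and key.strip() == "Translation" and value.strip():
                pairs[lemma.strip()] = value.strip()
    return pairs


glossary = translation_pairs(ENTRIES)
print(glossary)
```

Entries without the field are simply skipped, so such a word list could grow as the dictionary is completed.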

24 NooJ Conference, Inalco, Saarbruecken, June 5th, 2013
Thank you for your attention
Russian Module for NooJ: Semantic annotation