The presentation was published 9 years ago by user Руслан Трындин
1 Russian Module for NooJ: Semantic annotation
Conception and realisation of semantic tags for the Russian language for Max Silberztein's NooJ software
Vincent BÉNET, INALCO CREE, Recherche assistée par ordinateur
NOOJ Conference Inalco, Saarbruecken, June 5th, 2013
2 Russian Module for NooJ: design and implementation of lexical and grammatical resources
- one main dictionary (95000 entries)
- two annex dictionaries
  - one for proper nouns
  - one for noun-adjectives
3 Russian Module for NooJ: design and implementation of basic semantic resources
Semantic Tagging or Annotation?
How?
- by adding tags to the general dictionary
- by writing grammars
4 Writing semantic resources for the Russian language
The semantic tags of the Russian National Corpus:
- Taxonomy (a lexeme's thematic class): for nouns, verbs, adjectives and adverbs
- Mereology (part-whole and element-aggregate relationships): for concrete and abstract nouns
- Topology: for concrete nouns
- Causation: for verbs
- Evaluation: for abstract and concrete nouns, adjectives and adverbs
5 Writing semantic resources for the Russian language
27 semantic taxonomic tags for verbs
- t:move: movement (бежать, дергаться, бросить, нести)
- t:be: sphere of existence (жить, возникнуть, убить)
- t:loc: location (лежать, стоять, положить)
- t:poss: sphere of possession (иметь, дать, подарить, приобрести)
- t:ment: mental sphere (знать, верить, догадаться, помнить)
- t:perc: perception (смотреть, слышать, нюхать, чуять)
- t:speech: speech (говорить, советовать, спорить, каламбурить)
- t:sound: sounds (гудеть, шелестеть)
- t:light: light (гаснуть, лучиться)
6 Semantic information in the Russian National Corpus (Verbs)
7 Writing semantic resources for the Russian language
khodit,V+Mvt+Indet+ipf+intr+FLX=ходить
Idti,V+Mvt+Det+ipf+intr+FLX=идти
Vkhodit,V+Mvt+Pvb+ipf+intr+FLX=ходить
Vojti,V+Mvt+Pvb+pf+intr+FLX=идти
Vykhodit,V+Mvt+Pvb+ipf+intr+FLX=ходить
Priezzhat,V+Mvt+Pvb+ipf+intr+FLX=акать
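The entries on this slide follow a "lemma,CATEGORY+feature+...+KEY=value" layout. As a minimal sketch of how such lines could be read outside NooJ, the Python below splits one entry into its lemma, category, plain features and KEY=value properties. The `Entry` class and `parse_entry` function are illustrative names, not part of NooJ, and the sketch assumes that lemmas contain no commas.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One dictionary line, split into its parts (illustrative structure)."""
    lemma: str
    category: str
    features: set = field(default_factory=set)
    properties: dict = field(default_factory=dict)


def parse_entry(line: str) -> Entry:
    """Parse 'lemma,CAT+feat+...+KEY=value'. Assumes no comma inside the lemma."""
    lemma, _, info = line.partition(",")
    parts = [p.strip() for p in info.split("+") if p.strip()]
    if not parts:
        raise ValueError(f"no category found in entry: {line!r}")
    entry = Entry(lemma=lemma.strip(), category=parts[0])
    for part in parts[1:]:
        if "=" in part:
            # KEY=value items such as FLX=ходить (the inflection paradigm)
            key, _, value = part.partition("=")
            entry.properties[key.strip()] = value.strip()
        else:
            # Plain features such as Mvt, ipf, intr
            entry.features.add(part)
    return entry


entries = [
    parse_entry("khodit,V+Mvt+Indet+ipf+intr+FLX=ходить"),
    parse_entry("Vojti,V+Mvt+Pvb+pf+intr+FLX=идти"),
]
for e in entries:
    print(e.lemma, e.category, sorted(e.features), e.properties)
```

Keeping plain features in a set and KEY=value items in a separate dictionary makes it easy to test for a semantic tag such as Mvt without confusing it with the FLX paradigm name.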
8 Grammar to locate the verbs of motion
9 Searching for "verbs of motion" with NooJ
10 Searching for "verbs of motion" with NooJ
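The search slides are screenshots, so the grammar itself is not recoverable here. As a rough illustration of what selecting verbs of motion by tag amounts to, the sketch below filters a toy lexicon with a query written as `<V+Mvt>`. The query notation is modelled on NooJ's lexical symbols, but `parse_query`, `matches` and the toy lexicon are assumptions made for this example and do not reproduce NooJ's own matching.

```python
def parse_query(query: str):
    """Split a query such as '<V+Mvt>' into (category, required features)."""
    body = query.strip()
    if not (body.startswith("<") and body.endswith(">")) or len(body) < 3:
        raise ValueError(f"query must look like <CAT+feat>: {query!r}")
    parts = body[1:-1].split("+")
    return parts[0], set(parts[1:])


def matches(query: str, tag_string: str) -> bool:
    """True if the tag string has the queried category and all queried features."""
    category, required = parse_query(query)
    parts = tag_string.split("+")
    # Ignore KEY=value items (for example FLX=...) when comparing features
    features = {p for p in parts[1:] if "=" not in p}
    return parts[0] == category and required <= features


# Toy lexicon: lemma -> tag string, copied from the dictionary slides
lexicon = {
    "khodit": "V+Mvt+Indet+ipf+intr+FLX=ходить",
    "Vojti": "V+Mvt+Pvb+pf+intr+FLX=идти",
    "malchik": "N+an+Hum+FLX=buldog",
    "zelenet": "V+intr+ipf+Color+FLX=belet",
}

motion_verbs = sorted(lemma for lemma, tags in lexicon.items() if matches("<V+Mvt>", tags))
print(motion_verbs)
```

Only the two entries carrying both V and Mvt are selected; the colour verb is a verb but lacks the Mvt tag, and the noun fails on category.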
11 Writing semantic resources for the Russian language
- concrete nouns (девочка, стол, молоко)
- abstract nouns (вождение, яркость, время)
- proper names (Иван, Эйнштейн, Петроград)
- person (человек, учитель)
- ethnonyms (эфиоп, итальянка)
- kinship terms (брат, бабушка)
- supernatural creatures (русалка, инопланетянин)
- animals (корова, жираф, сорока, ящерица, муравей)
- plants (береза, роза, трава)
- and so on
12 Semantic information in the Russian National Corpus (Nouns)
13 Semantic information in the Russian National Corpus (Adjectives)
14 Semantic information in the Russian National Corpus (Adverbs)
15 Writing basic semantic resources for the Russian language
NooJ properties.def file
N_Genre = m | f | n ;
N_SGenr = an | inan ;
N_Nombre = s | p;
N_Cas = Im | Vi | Ro | R2 | Da | Tv | Pr | P2 | Zv ;
…
V_Type = Mvt;
V_Morph = Pref | Suff;
16 Writing basic semantic resources for the Russian language
NooJ properties.def file
A_Sem = Animal; Color ( Hum = App)
N_Sem = Hum | Prof | Parents | Body
Conc | Abstr | Org | Text | Animal | Food | Health | Arts | Lit | Music | Sports
Topo | Country | River | City | Mount | Lake | Posit | Time | Color ;
ADV_Sem = Time | Topo | Modal;
V_Sem = Color | Topo | Posit | Modal;
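The property definitions on these two slides share one pattern: a CATEGORY_Property name, an equals sign, values separated by "|", and a closing semicolon. The sketch below reads definitions of that shape into a Python dictionary and checks whether a feature is declared for a category. `parse_properties` and `is_declared` are illustrative helpers, and the sample text copies only the cleanly formed definitions from the slides; this is not a full parser for NooJ's properties.def format.

```python
def parse_properties(text: str) -> dict:
    """Map (category, property) -> list of values from 'CAT_Prop = a | b ;' statements."""
    props = {}
    for statement in text.split(";"):
        if "=" not in statement:
            continue  # skip blank fragments between statements
        name, _, values = statement.partition("=")
        category, _, prop = name.strip().partition("_")
        props[(category, prop)] = [v.strip() for v in values.split("|") if v.strip()]
    return props


def is_declared(props: dict, category: str, feature: str) -> bool:
    """True if the feature appears among any property values of the category."""
    return any(feature in values for (cat, _), values in props.items() if cat == category)


# Definitions copied from the slides (well-formed ones only)
sample = """
N_Genre = m | f | n ;
N_SGenr = an | inan ;
N_Nombre = s | p;
V_Type = Mvt;
V_Morph = Pref | Suff;
ADV_Sem = Time |Topo | Modal;
"""

props = parse_properties(sample)
print(props[("N", "Genre")])
print(is_declared(props, "V", "Mvt"), is_declared(props, "N", "Mvt"))
```

A check of this kind can flag dictionary entries whose tags were never declared, for example a Mvt tag attached to a noun.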
17 Writing semantic resources for the Russian language
malchik,N+an+Hum+FLX=buldog
pered tem kak,CONJ+UNAMB+Time
Moskva,N+f+inan+City+FLX=Москва
Don,N+m+inan+River+FLX=Дон
Katar,N+Country+m+s+FLX=Ленинград
Nora,N+Forename+Hum+f+an+FLX=Лена
18 Writing semantic resources for the Russian language
zelënyj,A+Color+FLX=novyj
zelenovatyj,A+Color+FLX=
zelënenkij,A+Color+FLX=novyj
temno-zelënyj,A+Color+FLX=novyj
zelen,N+f+inan+Color+FLX=smes
zelenet,V+intr+ipf+Color+FLX=belet
zazelenet,V+intr+pf+Color+FLX=belet
zazelenetsja,V+sja+pf+Color+FLX=….
19 Writing basic semantic resources for the Russian language
Number of items per semantic tag:
Prof = 900
Parent = 160
Forenames = 2280
Animal = 370
Food = 280 (Liquid = 25)
Body = 285
Health = 175
Arts = 65
Lit = 40
Music = 155
Sport = 65
Topo = 40
Country = 180
River = 15
City = 175
Mount = 5
Lake = 5
Posit = 25
Time = 135
Modal = 15
Color = 275
20 Searching for "colors" with NooJ
21 Searching for "body parts" with NooJ
22 Searching for "parents (relatives)" words with NooJ
23 Writing basic semantic resources for the Russian language
NEXT WORK TO BE DONE
- completion of the dictionary for concrete nouns using thematic dictionaries
- a new parameter in the dictionary, +Translation=, so that NooJ can be used as a resource to build basic dictionaries for parallel corpora
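The +Translation= parameter is announced as future work, so its final form is not shown. Purely as an illustration of the idea, the sketch below assumes that a translation would be stored like any other KEY=value item in an entry and collects lemma-to-translation pairs into a small glossary. The `translation_pairs` function and the English values in the sample are hypothetical and do not come from the presentation.

```python
def translation_pairs(lines):
    """Collect lemma -> translation from entries carrying a Translation=... item."""
    pairs = {}
    for line in lines:
        lemma, _, info = line.partition(",")
        for part in info.split("+"):
            key, sep, value = part.partition("=")
            if sep and key.strip() == "Translation" and value.strip():
                pairs[lemma.strip()] = value.strip()
    return pairs


# Entries from the slides, extended with hypothetical Translation values
sample = [
    "malchik,N+an+Hum+FLX=buldog+Translation=boy",
    "Moskva,N+f+inan+City+FLX=Москва+Translation=Moscow",
    "Don,N+m+inan+River+FLX=Дон",  # no translation yet, so it is skipped
]

glossary = translation_pairs(sample)
print(glossary)
```

Entries without a Translation item are simply left out, which would let the glossary grow as the dictionary is completed.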
24 Thank you for your attention