Functional annotation. M. Gelfand, "Comparative Genomics", BiBi, 4th year, autumn 2009.

Goals of annotation
What: function
When: regulation of expression, lifetime
Where: localization
- Inside/outside the cell
- Organelles and compartments
How: mechanism
- Specificity, regulation

Functions (roughly)
Enzymes
- Metabolism (catabolism, anabolism)
- Biosynthesis of macromolecules
Transporters
Regulators
- Receptors
- Signaling cascade proteins
- Transcription factors, etc.
Structural and "auxiliary" proteins
- Cytoskeleton, motility, division
- Cell-cell interactions (receptors)
- Chaperones. Large complexes

Gene Ontology

Three hierarchies
Molecular function
Biological process
Cellular component
Example: cytochrome c
- Molecular function: electron transport
- Biological process: oxidative phosphorylation
- Cellular component: mitochondrial inner membrane
Genome databases:
FlyBase (Drosophila)
SGD (Saccharomyces Genome Database)
MGD (Mouse Genome Database)

Molecular function: examples
Broad categories:
- Catalytic activity
- Transporter activity
- Binding
Narrow categories:
- Adenylate cyclase activity
- Ca2+ binding
Other classifications are possible (EC, TC); more on this later

Biological process: examples
Broad categories:
- Cellular physiological processes
- Signal transduction
Narrow categories:
- Pyrimidine metabolism
- Alpha-glucoside transport
- Asymmetric cell division

GO: processes

Structure of the hierarchy: a network
Biological process
  Cellular process
    Cellular physiological process
      Cell division
        Asymmetric cell division
          Regulation of asymmetric cell division
        Regulation of cell division
          Regulation of asymmetric cell division
      Regulation of cellular physiological process
        Regulation of cell division
          Regulation of asymmetric cell division
  Physiological process
    Cellular physiological process …
    Regulation of physiological process …
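Because the hierarchy is a network rather than a tree, a term can be reached from the root along several routes. The following is a minimal Python sketch that enumerates them. The edge table is one reading of the flattened slide fragment above, not data taken from GO itself, and the function name `paths_to_root` is hypothetical.

```python
from typing import Dict, List

# Child -> parents edges, transcribed from the slide fragment (an assumption
# about how the flattened indentation should be read, not real GO data).
PARENTS: Dict[str, List[str]] = {
    "biological process": [],
    "cellular process": ["biological process"],
    "physiological process": ["biological process"],
    "cellular physiological process": ["cellular process", "physiological process"],
    "regulation of physiological process": ["physiological process"],
    "cell division": ["cellular physiological process"],
    "regulation of cellular physiological process": ["cellular physiological process"],
    "asymmetric cell division": ["cell division"],
    "regulation of cell division": [
        "cell division",
        "regulation of cellular physiological process",
    ],
    "regulation of asymmetric cell division": [
        "asymmetric cell division",
        "regulation of cell division",
    ],
}


def paths_to_root(term: str, parents: Dict[str, List[str]]) -> List[List[str]]:
    """Return every root-to-term path in a directed acyclic graph."""
    if not parents[term]:
        return [[term]]  # the root itself
    paths = []
    for parent in parents[term]:
        for path in paths_to_root(parent, parents):
            paths.append(path + [term])
    return paths


if __name__ == "__main__":
    for path in paths_to_root("regulation of asymmetric cell division", PARENTS):
        print(" > ".join(path))
```

With this edge table a single term is reached by several distinct paths, which is what the exercise below asks to draw by hand.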

Exercise
Draw the paths leading to:
(А-Д) GO: : positive regulation of cell budding
      GO: : phosphoenolpyruvate carboxykinase (ATP) activity
(Е-К) GO: : arabinose catabolism
      GO: : double-stranded RNA adenosine deaminase activity
(Л-Н) GO: : Golgi vesicle membrane
      GO: : pectate lyase activity
(О-П) GO: : hexose biosynthesis
      GO: : aspartate racemase activity
(Р-С) GO: : ethanol catabolism
      GO: : cytochrome-c oxidase activity
(Т-Я) GO: : regulation of cell migration
      GO: : RNA polymerase II transcription factor activity, enhancer binding
using AmiGO
AmiGo search_constraint=terms&action=replace_tree&session_id=7922b

BLAST home page

BLAST parameters: word size
Cysteine proteases from the alfalfa weevil and the cattle tick: 61% identity, yet BLASTN finds nothing.
For DNA the word size is 11 (minimum 7); for proteins it is 3.
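A rough illustration of why a fairly high percent identity can still leave a word-based search without a seed. The Python sketch below uses a synthetic sequence (not the actual protease genes) in which every third base differs, as often happens at codon wobble positions. It looks only at the aligned diagonal of an ungapped alignment, which is a simplification of real BLAST seeding, and all function names are hypothetical.

```python
def mutate_third_positions(seq: str) -> str:
    """Replace every third base (codon position 3) with its transition partner."""
    swap = {"A": "G", "G": "A", "C": "T", "T": "C"}
    return "".join(swap[c] if i % 3 == 2 else c for i, c in enumerate(seq))


def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions in two equal-length, ungapped sequences."""
    assert len(a) == len(b)
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def longest_exact_run(a: str, b: str) -> int:
    """Length of the longest stretch of consecutive identical positions."""
    best = run = 0
    for x, y in zip(a, b):
        run = run + 1 if x == y else 0
        best = max(best, run)
    return best


def has_seed(a: str, b: str, word_size: int) -> bool:
    """True if the aligned diagonal contains an exact match of word_size."""
    return longest_exact_run(a, b) >= word_size


gene_a = "ATGGCTAAAGGTGAACTGCCGTTCACCGATCAG"  # synthetic, 11 codons
gene_b = mutate_third_positions(gene_a)

if __name__ == "__main__":
    print(f"identity: {percent_identity(gene_a, gene_b):.1f}%")
    print("longest exact run:", longest_exact_run(gene_a, gene_b))
    for w in (11, 7):
        print(f"seed with word size {w}:", has_seed(gene_a, gene_b, w))
```

In this toy case about two thirds of the positions are identical, but no exact run is longer than two bases, so neither word size 11 nor word size 7 gives a seed on the true diagonal.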

Similarity ≠ homology
The BLAST E-value is a measure of the non-randomness of sequence similarity
Possible causes of similarity:
- homology
- domain homology
- low complexity, coiled-coil, transmembrane and other types of regions with non-standard amino acid composition
Homology ≠ same function. Normally:
- similar (general) function (e.g. enzymatic activity)
- maybe different specificity

Specificity prediction: the tree splits into two branches, so all is well (A novel type of Ni/Co ABC transporters. Transmembrane component CbiM/NikM)
[Figure: tree with a NikM (Ni2+) branch and a CbiM (Co2+) branch; labels: + CbiN; + NikL, NikK; + NikN; + NikL]

Specificity prediction: everything is mixed together, so no prediction is possible (The NiCoT transporters family)

Specificity prediction: a change of specificity leads to errors (The NikABCDE family of ABC transporters. Substrate-binding component NikA)

Noradrenaline transporter in an archaeon?

SOURCE      Methanococcus jannaschii.
  ORGANISM  Methanococcus jannaschii
            Archaea; Euryarchaeota; Methanococcales; Methanococcaceae; Methanococcus.
FEATURES             Location/Qualifiers
     source          /organism="Methanococcus jannaschii"
                     /db_xref="taxon:2190"
     Protein         /product="sodium-dependent noradrenaline transporter"
     CDS             /gene="MJ1319"
                     /note="similar to EGAD:HI0736 percent identity: 38.5; identified by sequence similarity; putative"
                     /coded_by="U67572: "
                     /transl_table=11

Now corrected: Hypothetical sodium-dependent transporter MJ1319.

Lesson(s)
1. Avoid overprediction (homology does not necessarily mean same cellular role or specificity)

Similarity to hypothetical proteins: somebody else's errors… The only correct annotation!

Genes with curious functional assignments
C75604: Probable head morphogenesis protein, Deinococcus radiodurans
O05360: Automembrane protein H, Yersinia enterocolitica
Q8TID9: Benzodiazepine (valium) receptor TspO, Methanosarcina acetivorans
NP_069403: DR-beta chain MHC class II, Archaeoglobus fulgidus

Errors in experimental papers

SwissProt:
DEFINITION  Hypothetical 43.6 kDa protein.
ACCESSION   P
KEYWORDS    Hypothetical protein.
SOURCE      Debaryomyces occidentalis
  ORGANISM  Debaryomyces occidentalis
            Eukaryota; Fungi; Ascomycota; Saccharomycotina; Saccharomycetes; Saccharomycetales; Saccharomycetaceae; Debaryomyces.
[CAUTION]   Was originally (Ref.1) thought to be 3-isopropylmalate dehydrogenase (LEU2).

PIR:
DEFINITION  3-isopropylmalate dehydrogenase (EC ) - yeast (Schwanniomyces occidentalis).
ACCESSION   S55845
KEYWORDS    oxidoreductase.

SwissProt entry DSDX_ECOLI -!- CAUTION: An ORF called dsdC was originally (Ref.3) assigned to the wrong DNA strand and thought to be a D-serine deaminase activator, it was then resequenced by Ref.2 and still thought to be "dsdC", but this time to function as a D-serine permease. It is Ref.1 that showed that dsdC is another gene and that this sequence should be called dsdX. It should also be noted that the C-terminal part of dsdX (from 338 onward) was also sequenced (Ref.6 and Ref.7) and was thought to be a separate ORF (don't worry, we also had difficulties understanding what happened!).

Lesson(s)
1. Avoid overprediction (homology does not necessarily mean same cellular role or specificity)
2. Check carefully the source(s) of annotations in the list of homologs

mastermind protein of Drosophila

Filtering of low-complexity segments
- is often insufficient
- may lose non-trivial information
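To make the idea of a low-complexity segment concrete, here is a minimal Python sketch that flags windows with low Shannon entropy of amino acid composition. This is a simplified stand-in, not the SEG algorithm or the filter BLAST actually applies. The window length, the threshold and the test sequence are illustrative assumptions.

```python
import math
from collections import Counter
from typing import List


def shannon_entropy(segment: str) -> float:
    """Shannon entropy (bits) of the residue composition of a segment."""
    n = len(segment)
    return -sum(c / n * math.log2(c / n) for c in Counter(segment).values())


def low_complexity_windows(seq: str, window: int = 12,
                           threshold: float = 2.2) -> List[int]:
    """Start positions of windows whose entropy falls below the threshold."""
    return [
        i
        for i in range(len(seq) - window + 1)
        if shannon_entropy(seq[i:i + window]) < threshold
    ]


# Synthetic sequence: a diverse stretch followed by a glutamine run.
synthetic = "MKTAYIAKLRDEVWSHGNPF" + "Q" * 15

if __name__ == "__main__":
    flagged = low_complexity_windows(synthetic)
    print("flagged window starts:", flagged)
```

Raising the threshold masks more of the sequence and risks hiding real signal, while lowering it lets biased regions through. That trade-off is the one the slide describes.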

Lesson(s)
1. Avoid overprediction (homology does not necessarily mean same cellular role or specificity)
2. Check the source(s) of annotations in the list of homologs
3. Beware of similarity in low-complexity regions, non-globular domains, transmembrane segments

Homology of domains
I64228: DNA polymerase homolog (in fact, 5'-3' exonuclease)
[Figure labels: Klenow fragment; Bacterial DNA polymerases]

BLAST domains page

InterPro domains

Lesson(s)
1. Avoid overprediction (homology does not necessarily mean same cellular role or specificity)
2. Check the source(s) of annotations in the list of homologs
3. Beware of similarity in low-complexity regions, non-globular domains, transmembrane segments
4. Do not extend domain homology to annotation of the whole protein

PROSITE
Multiple alignment, conserved positions, patterns
Degenerate patterns
P-loop ATPases: [GA]x(4)GK[ST]
Very low selectivity
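The P-loop pattern above can be tried out directly. The Python sketch below translates a simple PROSITE-style pattern into a regular expression and estimates how often it would match by chance. The converter handles only the syntax used here (bracketed alternatives, `x`, repeat counts), the chance estimate assumes uniform amino acid frequencies, and the test sequence is synthetic. All function names are hypothetical.

```python
import re
from typing import List, Tuple


def prosite_to_regex(pattern: str) -> str:
    """Convert a simple PROSITE-style pattern to Python regex syntax."""
    regex = pattern.replace("-", "")  # optional position separators

    def repeat(match: "re.Match") -> str:
        low, high = match.group(1), match.group(2)
        return "{" + low + ("," + high if high else "") + "}"

    regex = re.sub(r"\((\d+)(?:,(\d+))?\)", repeat, regex)  # x(4) -> x{4}
    return regex.replace("x", ".")  # x means any residue


def find_matches(pattern: str, sequence: str) -> List[Tuple[int, str]]:
    """All (start, matched_text) hits, overlapping ones included."""
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start(), m.group(1)) for m in regex.finditer(sequence)]


def chance_match_probability(allowed_per_position: List[int],
                             alphabet_size: int = 20) -> float:
    """Probability of a match at one position, assuming uniform frequencies."""
    p = 1.0
    for allowed in allowed_per_position:
        p *= allowed / alphabet_size
    return p


P_LOOP = "[GA]x(4)GK[ST]"
# Residues allowed at each of the 8 pattern positions.
P_LOOP_ALLOWED = [2, 20, 20, 20, 20, 1, 1, 2]

if __name__ == "__main__":
    print(prosite_to_regex(P_LOOP))
    print(find_matches(P_LOOP, "MSDKPLVAGPSGSGKSTLA"))  # synthetic sequence
    p = chance_match_probability(P_LOOP_ALLOWED)
    print(f"chance match probability per position: {p:.1e}")
```

Under the uniform-frequency assumption the pattern matches at roughly one position in 40,000, so a database with millions of residues will contain many chance hits. That is one way to read "very low selectivity".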

caspases/paracaspases/metacaspases

Profiles. PSI-BLAST
Significance (E=0.005): 1 spurious hit per 200 searches
Manual cleaning between iterations
Automatic mode: iterate until convergence
Asymmetry
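The "1 per 200" figure follows from reading the E-value as the expected number of chance hits in one search. A small Python sketch of that arithmetic follows. The function names are hypothetical, and the second function assumes chance hits follow a Poisson distribution.

```python
import math


def expected_false_hits(e_value: float, n_searches: int) -> float:
    """Expected number of chance hits over n independent searches."""
    return e_value * n_searches


def prob_at_least_one_false_hit(e_value: float) -> float:
    """P(at least one chance hit in a single search), Poisson assumption."""
    return 1.0 - math.exp(-e_value)


if __name__ == "__main__":
    print(expected_false_hits(0.005, 200))
    print(prob_at_least_one_false_hit(0.005))
```

For small E-values the probability and the E-value are nearly equal, which is why a threshold of 0.005 is read as about one spurious sequence per 200 searches.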

Lesson(s)
1. Avoid overprediction (homology does not necessarily mean same cellular role or specificity)
2. Check the source(s) of annotations in the list of homologs
3. Beware of similarity in low-complexity regions, non-globular domains, transmembrane segments
4. Do not extend domain homology to annotation of the whole protein
5. A correct pattern should be conserved in (close) orthologs; the main catalytic residues should be conserved

Protein analysis in the absence of homologs
Signal peptides. SignalP (neural network)
Transmembrane segments. Two dozen servers (TMHMM, PHDhtm, HMMTOP)
- Hydrophobic/hydrophilic
- Signal at the boundary
- Topology (positive inside)
- Use of alignments
- Beta proteins. Porins
Localization. PSORT, TargetP
Coiled coil. COILS, Paircoil/Multicoil
Secondary and three-dimensional structure. Threading
Comparative genomics and non-genomic data
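The "hydrophobic/hydrophilic" criterion for transmembrane segments can be sketched as a sliding-window hydropathy profile on the Kyte-Doolittle scale. This Python sketch shows the classic approach only; it is not what the servers listed above do (TMHMM and HMMTOP, for example, are based on hidden Markov models). The window of 19 residues and the 1.6 cutoff are commonly quoted values used here as assumptions, and the test sequence is synthetic.

```python
from typing import List

# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 19) -> List[float]:
    """Mean hydropathy of every window of the given length."""
    return [
        sum(KYTE_DOOLITTLE[aa] for aa in seq[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def candidate_tm_windows(seq: str, window: int = 19,
                         threshold: float = 1.6) -> List[int]:
    """Start positions of windows hydrophobic enough to span a membrane."""
    return [
        i
        for i, value in enumerate(hydropathy_profile(seq, window))
        if value > threshold
    ]


# Synthetic sequence: polar flank, hydrophobic core of 19 residues, polar flank.
synthetic = "MKKRDEESNQ" + "LIVLLAVIALLVIGLAVFL" + "RKDEQNSKRE"

if __name__ == "__main__":
    print("candidate window starts:", candidate_tm_windows(synthetic))
```

A profile like this gives only the positions of hydrophobic stretches. Topology (for instance the "positive inside" rule) and the use of alignments, both listed above, need additional information.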