
Bioinformatics Analysis of the Proteome


Bioinformatics for Proteome

武汉金开瑞生物工程有限公司

OUTLINE
• Summary of Proteome
• Bottleneck in Proteome based on MS/MS
• Qualitative Proteome Analysis
• Quantitative Proteome Analysis
• Post Translational Modifications (PTMs)
• Functional Annotation of large scale protein data set

Summary of proteome & proteomics

Bottleneck in Proteome based on MS/MS

Reproducibility Quantification

Accuracy

Reliability

Identification

Identification workflow

How to identify a protein/peptide sequence

Raw data from mass spectrum
MS spectra: peptide ions (precursors, parent ions)
MS/MS spectra: fragment ions (product ions)

example.mgf
Charge of precursor ion

Mass of precursor ion

Peptide fragment ions: m/z intensity charge

End symbol of one MS/MS spectrum
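
The fields annotated above match the layout of the Mascot Generic Format (MGF). A minimal sketch of a reader follows, assuming the common `BEGIN IONS` / `END IONS` block layout with `PEPMASS` and `CHARGE` header lines. The names `parse_mgf` and `precursor` and the sample spectrum are illustrative only and are not taken from example.mgf; an optional third peak column (fragment charge) is ignored here.

```python
def parse_mgf(text):
    """Parse MGF-formatted text into a list of spectrum dicts."""
    spectra = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line == "BEGIN IONS":
            current = {"params": {}, "peaks": []}
        elif line == "END IONS":  # end symbol of one MS/MS spectrum
            if current is not None:
                spectra.append(current)
            current = None
        elif current is None:
            continue  # ignore anything outside a spectrum block
        elif "=" in line and not line[0].isdigit():
            key, value = line.split("=", 1)  # header line such as PEPMASS=...
            current["params"][key.upper()] = value
        else:
            fields = line.split()  # peak line: m/z intensity [charge]
            mz = float(fields[0])
            intensity = float(fields[1]) if len(fields) > 1 else 0.0
            current["peaks"].append((mz, intensity))
    return spectra


def precursor(spectrum):
    """Return (precursor m/z, charge) from the PEPMASS and CHARGE headers."""
    mz = float(spectrum["params"]["PEPMASS"].split()[0])
    charge_text = spectrum["params"].get("CHARGE", "")
    digits = "".join(ch for ch in charge_text if ch.isdigit())
    charge = int(digits) if digits else None
    if charge is not None and charge_text.strip().endswith("-"):
        charge = -charge
    return mz, charge


# Hypothetical spectrum used only to exercise the parser.
SAMPLE = """BEGIN IONS
TITLE=hypothetical spectrum 1
PEPMASS=445.12 12345.0
CHARGE=2+
110.07 1500.0
175.12 3200.5
END IONS
"""

spectra = parse_mgf(SAMPLE)
print(precursor(spectra[0]), len(spectra[0]["peaks"]))
```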

Peptide ion fragmentation
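
As an illustration of fragment ions, the sketch below computes theoretical m/z values for the b- and y-ion series of an unmodified peptide. It assumes singly charged fragments and monoisotopic residue masses; the function name `fragment_ions` is hypothetical, and other ion types (a, c, x, z) and neutral losses are not covered.

```python
# Monoisotopic residue masses in Da (standard amino acids, unmodified).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276   # mass of a proton
WATER = 18.010565   # monoisotopic mass of H2O


def fragment_ions(peptide):
    """Return singly charged b- and y-ion m/z values keyed by ion label."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    ions = {}
    for i in range(1, len(peptide)):
        # b_i: first i residues plus one proton
        ions[f"b{i}"] = sum(masses[:i]) + PROTON
        # y_i: last i residues plus water plus one proton
        ions[f"y{i}"] = sum(masses[-i:]) + WATER + PROTON
    return ions


for label, mz in sorted(fragment_ions("GAS").items()):
    print(label, round(mz, 4))
```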

Available search engines

Why ProteinPilot
• Paragon™ algorithm: a "sequence approach" search algorithm for peptide ID. "The Paragon Algorithm, a Next Generation Search Engine That Uses Sequence Temperature Values and Feature Probabilities to Identify Peptides from Tandem Mass Spectra", Shilov IV, Seymour SL et al (2007) MCP 6.9, 1638
• Unique feature: hypothesis selection stage. "… there is greater potential for improvement from advances in determining what to score, not how to score it." Shilov IV, Seymour SL et al (2007) MCP 6.9, 1638
• ProGroup™ algorithm for protein inference
• Quantitative results for stable isotope label quant experiments
• Extensive AA Modification Catalog
• Global & local FDR filtering

ID results evaluation
• Filtering criteria
  - By identification confidence: protein Unused score > 1.3 (corresponding to 95% confidence). This is the most commonly used criterion.
  - By local FDR cutoff: FDR below 1% or 5%.
  - By number of unique peptides: at least 1 unique peptide per protein group.
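
The criteria above can be sketched as a simple filter. This is an illustration only: the dictionary keys (`unused`, `local_fdr`, `unique_peptides`) and the function name are hypothetical and do not reflect the ProteinPilot export format, and applying all three criteria together is an assumption, since they can also be used individually.

```python
def passes_filters(protein, min_unused=1.3, max_local_fdr=0.01,
                   min_unique_peptides=1):
    """Return True if a protein group meets all three filtering criteria."""
    return (
        protein["unused"] > min_unused               # Unused score > 1.3
        and protein["local_fdr"] < max_local_fdr     # local FDR below cutoff
        and protein["unique_peptides"] >= min_unique_peptides
    )


# Hypothetical protein groups used only for demonstration.
groups = [
    {"id": "P1", "unused": 2.0, "local_fdr": 0.001, "unique_peptides": 3},
    {"id": "P2", "unused": 1.1, "local_fdr": 0.001, "unique_peptides": 2},
    {"id": "P3", "unused": 4.2, "local_fdr": 0.030, "unique_peptides": 5},
]
kept = [g["id"] for g in groups if passes_filters(g)]
print(kept)
```

Passing `max_local_fdr=0.05` relaxes the cutoff to the 5% level mentioned above.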

• ID statistics

Sample name   #Total spectra   #ID spectra   ID percentage   #ID peptides   #ID proteins
ALL           345040           158011        45.8%           28125          4741

• Feature distribution

Unique peptide number

Peptide length distribution

Venn Diagram among multiple experiments

Protein coverage distribution

How to quantify a PSM?
• MS-based quantification: XIC (eXtracted Ion Current)

• MS/MS-based quantification: XIC (eXtracted Ion Current) and intensity

XICs of fragment ions: MRM, SWATH
Intensities of fragment ions (labeled reporter ions): iTRAQ, TMT
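
To make the XIC idea concrete, here is a minimal sketch that builds an extracted ion trace for one target m/z across scans and integrates it. The scan representation (a retention time paired with a peak list), the ppm tolerance, and the trapezoidal integration are assumptions chosen for illustration, not the method of any particular software.

```python
def extract_xic(scans, target_mz, tol_ppm=10.0):
    """Sum intensities near target_mz in each scan.

    scans: list of (retention_time, [(mz, intensity), ...]) tuples.
    Returns a list of (retention_time, summed_intensity) points.
    """
    tol = target_mz * tol_ppm / 1e6  # absolute m/z tolerance
    trace = []
    for rt, peaks in scans:
        total = sum(inten for mz, inten in peaks if abs(mz - target_mz) <= tol)
        trace.append((rt, total))
    return trace


def xic_area(trace):
    """Integrate the trace over retention time with the trapezoidal rule."""
    area = 0.0
    for (rt0, i0), (rt1, i1) in zip(trace, trace[1:]):
        area += (rt1 - rt0) * (i0 + i1) / 2.0
    return area


# Hypothetical scans: only the middle scan contains the target ion.
scans = [
    (0.0, [(500.0, 0.0)]),
    (1.0, [(500.001, 100.0), (600.0, 50.0)]),
    (2.0, [(500.0, 0.0)]),
]
trace = extract_xic(scans, 500.0)
print(trace, xic_area(trace))
```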

Quantitative Proteome

A general workflow of data analysis
• Calculate peptide abundance: XICs, intensity, spectral count
• Normalize peptide abundance: mean, median, quantile, linear regression
• Determine protein abundance and fold change (FC): mean, median, total, weighted ratio
• Statistical analysis: t-test, ANOVA, PCA, Fisher's exact test
• Identify significantly differentially expressed proteins: FC cutoff of 1.2, 1.3, 1.5 or 2, with a p-value less than 0.05
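
The workflow steps can be sketched in a few lines. The example picks one option from each step (median normalization, median peptide ratio as the protein fold change, an FC cutoff of 1.5) purely for illustration; the function names are hypothetical, and the p-value is assumed to come from a separate statistical test such as a t-test.

```python
from statistics import median


def median_normalize(ratios):
    """Divide every ratio by the median so the median ratio becomes 1."""
    m = median(ratios)
    return [r / m for r in ratios]


def protein_fold_change(peptide_ratios):
    """Summarize a protein's fold change as the median of its peptide ratios."""
    return median(peptide_ratios)


def is_differential(fold_change, p_value, fc_cutoff=1.5, p_cutoff=0.05):
    """Flag up- or down-regulation beyond the FC cutoff with p < p_cutoff."""
    changed = fold_change >= fc_cutoff or fold_change <= 1.0 / fc_cutoff
    return changed and p_value < p_cutoff


# Hypothetical peptide ratios for one protein.
normalized = median_normalize([2.0, 4.0, 8.0])
fc = protein_fold_change(normalized)
print(normalized, fc, is_differential(fc, 0.01))
```

Checking both `fc >= cutoff` and `fc <= 1 / cutoff` treats up- and down-regulation symmetrically.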

Quant. Results evaluation
• Evaluation of biological/technical replicates

Quantified overlap between two experiments

Correlation between two experimental replicates ratio pairs
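
A minimal sketch of both replicate checks, assuming each replicate is a mapping from protein ID to ratio: it counts the proteins quantified in both replicates and computes the Pearson correlation of their log2 ratios. Working in log2 space and using Pearson correlation are assumptions; the function names and data are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def replicate_correlation(rep1, rep2):
    """Return (number of shared proteins, correlation of their log2 ratios)."""
    shared = sorted(set(rep1) & set(rep2))
    x = [math.log2(rep1[p]) for p in shared]
    y = [math.log2(rep2[p]) for p in shared]
    return len(shared), pearson(x, y)


# Hypothetical ratios from two replicates.
rep1 = {"A": 1.0, "B": 2.0, "C": 4.0, "D": 3.0}
rep2 = {"A": 2.0, "B": 4.0, "C": 8.0, "E": 1.0}
overlap, r = replicate_correlation(rep1, rep2)
print(overlap, r)
```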

• Identify significantly differentially expressed proteins
If biological or technical replicates are included, one may take the mean or median of the comparable ratios, or decide by the number of replicate comparisons in which the protein shows a significant difference.
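
Both strategies can be sketched as follows. The threshold `min_hits=2` and the function names are assumptions for illustration; the appropriate number of occurrences depends on how many replicates are compared.

```python
from statistics import median


def combined_ratio(replicate_ratios):
    """Combine comparable ratios from several replicates by their median."""
    return median(replicate_ratios)


def significant_by_occurrence(flags, min_hits=2):
    """Call a protein significant if enough replicate comparisons agree.

    flags: one boolean per replicate comparison (True = significant there).
    """
    return sum(1 for f in flags if f) >= min_hits


# Hypothetical values for one protein across three replicates.
print(combined_ratio([1.6, 1.8, 2.4]))
print(significant_by_occurrence([True, True, False]))
```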

Post Translational Modifications

Post-translational modification (PTM) is a step in protein biosynthesis. Ribosomes translate mRNA into polypeptide chains, and these chains undergo PTM (such as folding, cleavage and other processes) before becoming the mature protein product.

Unimod: the aim is to create a community-supported, comprehensive database of protein modifications for mass spectrometry applications. That is, accurate and verifiable values, derived from elemental compositions, for the mass differences introduced by all types of natural and artificial modifications. Other important information includes any mass change (neutral loss) that occurs during MS/MS analysis, and site specificity (which residues are susceptible to modification, and any constraints on the position of the modification within the protein or peptide).

Website: http://www.unimod.org
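
To show what "derived from elemental compositions" means in practice, the sketch below computes the monoisotopic mass difference of a modification from its added atoms. The two compositions shown (phosphorylation as HPO3, acetylation as C2H2O) are standard examples; the function name `delta_mass` is hypothetical and this is not an interface to the Unimod database.

```python
# Monoisotopic masses of the most abundant isotopes, in Da.
MONOISOTOPIC = {
    "H": 1.00782503207,
    "C": 12.0,
    "N": 14.0030740048,
    "O": 15.99491461956,
    "P": 30.97376163,
    "S": 31.97207100,
}


def delta_mass(composition):
    """Mass difference for a modification given as {element: atom count}."""
    return sum(MONOISOTOPIC[el] * count for el, count in composition.items())


PHOSPHO = {"H": 1, "P": 1, "O": 3}   # phosphorylation adds HPO3
ACETYL = {"C": 2, "H": 2, "O": 1}    # acetylation adds C2H2O

print(round(delta_mass(PHOSPHO), 6))
print(round(delta_mass(ACETYL), 6))
```

Negative counts can be used for modifications that remove atoms, for example `{"H": -2, "O": -1}` for loss of water.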

Workflow for PTMs

Shortcomings of the conventional ID workflow

A new concept: site localization for PTMs

Ascore

PTM score

MD score

Some visual results

GO annotation

workflow

Result statistics

• GO annotation results

KEGG Pathway annotation

Workflow

Results

Functional enrichment

Annotating genes/proteins directly yields a large number of functional nodes, many of which overlap, and this can confuse researchers. Enrichment analysis is therefore used to filter and screen the annotations so that only the more significant functional nodes remain.

• Fisher's exact test
• Cumulative hypergeometric test
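
A minimal sketch of the cumulative hypergeometric test for over-representation, which gives the same p-value as a one-sided Fisher's exact test. The parameter names are assumptions: `N` is the number of background proteins, `K` the background proteins annotated with the term, `n` the size of the protein list of interest, and `k` the proteins in that list annotated with the term.

```python
from math import comb


def enrichment_p_value(N, K, n, k):
    """Upper-tail hypergeometric probability P(X >= k).

    Probability of seeing at least k annotated proteins when drawing n
    proteins without replacement from N, of which K are annotated.
    """
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i)
                     for i in range(k, upper + 1))
    return favourable / comb(N, n)


# Hypothetical counts: 5 of 10 background proteins carry the term,
# and all 5 proteins in the list of interest carry it as well.
print(enrichment_p_value(10, 5, 5, 5))
```

Exact integer arithmetic with `math.comb` avoids rounding problems for small counts; for many terms, the resulting p-values are usually corrected for multiple testing, which is not shown here.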

• GO enrichment results

A GO term is considered significantly enriched when its p-value is less than 0.05.

• Pathway enrichment results
