Diagnostics and correction of batch effects in large-scale proteomic studies: a tutorial
Abstract
Advancements in mass spectrometry-based proteomics have enabled experiments encompassing hundreds of samples. While these large sample sets deliver much-needed statistical power, handling them introduces technical variability known as batch effects. Here, we present a step-by-step protocol for the assessment, normalization, and batch correction of proteomic data. We review established methodologies from related fields and describe solutions specific to proteomic challenges, such as ion intensity drift and missing values in quantitative feature matrices. Finally, we compile a set of techniques that enable control of batch effect adjustment quality. We provide an R package, "proBatch", containing functions required for each step of the protocol. We demonstrate the utility of this methodology on five proteomic datasets each encompassing hundreds of samples and consisting of multiple experimental designs. In conclusion, we provide guidelines and tools to make the extraction of true biological signal from large proteomic studies more robust and transparent, ultimately facilitating reliable and reproducible research in clinical proteomics and systems biology.
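As a rough illustration of the normalization and batch correction steps named in the abstract, the sketch below applies sample-wise median normalization followed by per-batch median centering of each feature, leaving missing values missing. It is not the proBatch API: the function names, the toy matrix, and the batch labels are hypothetical, and median centering is only one simple correction method chosen here for brevity.

```python
import math
from statistics import median


def median_normalize(matrix):
    """Shift each sample (column) so that its median equals the global median.

    `matrix` is a list of feature rows holding log-intensities; NaN marks a
    missing value and is ignored when medians are computed.
    """
    n_samples = len(matrix[0])
    col_medians = []
    for j in range(n_samples):
        observed = [row[j] for row in matrix if not math.isnan(row[j])]
        col_medians.append(median(observed))
    global_median = median(col_medians)
    return [
        [v - col_medians[j] + global_median for j, v in enumerate(row)]
        for row in matrix
    ]


def center_per_batch(matrix, batches):
    """For each feature, remove the batch median and restore the overall median.

    Missing values (NaN) stay missing: they are skipped when medians are
    computed and propagate unchanged through the arithmetic.
    """
    corrected = []
    for row in matrix:
        observed = [v for v in row if not math.isnan(v)]
        overall = median(observed) if observed else math.nan
        batch_medians = {}
        for b in set(batches):
            vals = [v for v, bb in zip(row, batches)
                    if bb == b and not math.isnan(v)]
            batch_medians[b] = median(vals) if vals else math.nan
        corrected.append(
            [v - batch_medians[b] + overall for v, b in zip(row, batches)]
        )
    return corrected


# Hypothetical toy data: 2 features x 4 samples of log2 intensities,
# with batch "B" shifted upward by about one unit.
log_matrix = [
    [10.0, 10.2, 11.0, 11.2],
    [8.0, math.nan, 9.0, 9.2],
]
batches = ["A", "A", "B", "B"]

normalized = median_normalize(log_matrix)
corrected = center_per_batch(log_matrix, batches)
```

In a real study the choice of correction method and the order of steps would follow the diagnostics described in the protocol; this toy example only shows the shape of the computation.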
Permanent link
https://doi.org/10.3929/ethz-b-000504182
Publication status
published
Journal / Series
Molecular Systems Biology
Publisher
EMBO Press
Subject
batch effects; data analysis; large-scale proteomics; normalization; quantitative proteomics
Organisational unit
03663 - Aebersold, Rudolf (emeritus) / Aebersold, Rudolf (emeritus)
02072 - Proteomics Plattform D-HEST
Funding
668858 - PERSONALIZED ENGINE FOR CANCER INTEGRATIVE STUDY AND EVALUATION, a tool for cancer patient risk-stratification and pers. drug selection through multi-omic data integration. (SBFI)
670821 - Proteomics 4D: The proteome in context (EC)
163911 - Development of systems biology and bioinformatics approaches for the study of protein post-translational modifications in health and disease with the focus on application to the pregnancy related disorders (SNF)