Big Data and Proteomics: Mining Public Proteomics Data

TIME: Wednesday, April 6, 2016, at 12:30

VENUE: Sala de Grados, Facultad de Informática

TITLE: "Big Data and Proteomics: Mining Public Proteomics Data."

ABSTRACT: On average, 75% of the spectra measured in a mass spectrometry (MS) experiment remain unidentified. The main reasons are incomplete databases and bioinformatics pipelines. We propose a "big data" approach to shed light on this "dark matter" of proteomics. PRIDE Archive is one of the largest public MS proteomics data repositories worldwide. By clustering all publicly available tandem MS spectra in PRIDE Archive, which come from hundreds of datasets, we were able to consistently characterize three distinct groups of spectra that constitute this "dark matter": 1) incorrectly identified spectra, 2) spectra that were correctly identified but scored below the set threshold, and 3) truly unidentified spectra. Using a multitude of complementary analysis approaches, we were able to identify 20% of them.

SPEAKER: Dr. Yasset Pérez Riverol (Bioinformatician, PRIDE Group, Proteomics Services, European Bioinformatics Institute (EBI)).

LANGUAGE: Spanish

ESTIMATED TOTAL DURATION: 90 minutes