Thursday, November 29, 2018

BISTRO - Hani Habra

4:00 PM

2036 Palmer Commons

BISTRO is restricted to U-M Bioinformatics Graduate Program students and faculty.

"The Binner Algorithm: Deep Annotation, Visualization, and Reduction of Untargeted Metabolomics Data"

Abstract

The goal of untargeted metabolomics is to achieve comprehensive characterization of all measurable analytes in a set of biological samples. Currently, metabolite profiling experiments can confidently identify only a small fraction of detectable compounds. Liquid chromatography-electrospray ionization-mass spectrometry (LC-ESI-MS) detects metabolites as multiple chemical species: isotopes, adducts, fragments, and multimers. The presence of these features complicates compound identification and leads to inflated false discovery rates in downstream statistical analysis. Many existing metabolomics annotation programs exploit the high pairwise correlation, close retention time, and recognized mass relationships between features derived from a common parent metabolite, but do not facilitate a thorough investigation of related mass spectral features or exploration of potentially novel adducts and fragments. As a result, metabolomics datasets remain insufficiently explored and highly redundant.
I will present the algorithm for Binner, a stand-alone application that performs unsupervised clustering and deep annotation of untargeted metabolomics data while providing visual cues to find additional relationships that may be missed by automated processing. The algorithm consists of the following steps: quality control, retention-time-based binning, silhouette-based hierarchical correlation clustering, and iterative adduct and fragment searching. With Binner, we can achieve an average data reduction of 30-40% and provide putative annotations for thousands of unknown features, an essential step towards their identification. We hope to use this tool to catalog frequently observed known and unknown ions that appear in our profiling experiments and facilitate statistical and bioinformatics data interpretation.
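
To make the steps above concrete, here is a minimal Python sketch of two of them: retention-time binning and a mass-difference search within each bin. It is an illustration under stated assumptions, not Binner's actual implementation. The gap-based binning rule, the function names, the tolerances (max_gap, mz_tol, min_corr), and the toy feature table are all hypothetical. A plain pairwise Pearson correlation filter stands in for the silhouette-based hierarchical clustering step, which is not reproduced here. The two mass differences are standard values for singly charged ions.

```python
from itertools import combinations
from math import sqrt

# Standard m/z differences between singly charged ions, in Da.
KNOWN_DELTAS = {
    "13C isotope": 1.003355,
    "[M+Na]+ vs [M+H]+": 21.981944,
}


def bin_by_retention_time(features, max_gap=0.05):
    """Group features into bins, starting a new bin whenever the gap
    between consecutive retention times exceeds max_gap (assumed rule)."""
    bins = []
    for feat in sorted(features, key=lambda f: f["rt"]):
        if bins and feat["rt"] - bins[-1][-1]["rt"] <= max_gap:
            bins[-1].append(feat)
        else:
            bins.append([feat])
    return bins


def pearson(x, y):
    """Pearson correlation of two equal-length intensity vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def annotate_bin(bin_features, mz_tol=0.002, min_corr=0.8):
    """Label pairs in one bin that are well correlated across samples
    and whose m/z difference matches a known mass relationship."""
    annotations = []
    by_mz = sorted(bin_features, key=lambda f: f["mz"])
    for low, high in combinations(by_mz, 2):
        if pearson(low["intensities"], high["intensities"]) < min_corr:
            continue
        delta = high["mz"] - low["mz"]
        for label, expected in KNOWN_DELTAS.items():
            if abs(delta - expected) <= mz_tol:
                annotations.append((low["name"], high["name"], label))
    return annotations


# Toy feature table (hypothetical values): F1-F3 co-elute, F4 does not.
features = [
    {"name": "F1", "mz": 181.0707, "rt": 2.50, "intensities": [100, 200, 300, 400]},
    {"name": "F2", "mz": 182.0740, "rt": 2.51, "intensities": [7, 13, 20, 27]},
    {"name": "F3", "mz": 203.0526, "rt": 2.52, "intensities": [50, 90, 160, 210]},
    {"name": "F4", "mz": 300.1000, "rt": 5.00, "intensities": [400, 100, 300, 200]},
]

bins = bin_by_retention_time(features)
for rt_bin in bins:
    print([f["name"] for f in rt_bin], annotate_bin(rt_bin))
```

In this toy table, F2 and F3 are flagged as an isotope peak and a sodium adduct of F1, so three co-eluting features collapse to a single putative parent ion. This is the kind of redundancy the data reduction step targets.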