Yin Liu, Ph.D.
Ph.D., 2007, Bioinformatics, Yale University
Within bioinformatics, my research focuses on developing computational and statistical approaches for the large-scale analysis of cellular networks, biological pathways, genomic sequences, and gene expression.
The research in our group primarily focuses on developing sound statistical models that integrate information from diverse sources in order to reconstruct biological networks, such as protein interaction networks. In genome-wide protein interaction prediction, high-throughput techniques for systematically identifying protein interactions, such as yeast two-hybrid screening, suffer from high false positive and false negative rates because of the limitations of these techniques. We have been developing statistical methods that integrate large-scale protein interaction data from diverse organisms in order to improve the reliability of protein interaction inference.
Another systems biology topic we are interested in is the reconstruction of signal transduction pathways. We have developed an approach that integrates protein-protein interaction data and microarray gene expression data to predict the order of signaling pathway components, assuming that all components of the pathway are known. Our current research on this topic concentrates on incorporating other types of information, such as protein phosphorylation data, and on developing more elaborate statistical approaches for further prediction and modeling of signal transduction networks.
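As a rough illustration of why combining several noisy data sets can make an inferred protein interaction more reliable, the following minimal sketch applies Bayes' rule to a candidate protein pair. It is not our group's actual model: the function name, the prior, and the error rates are hypothetical, and the data sets are assumed to be conditionally independent given whether the proteins truly interact.

```python
def posterior_interaction(prior, observations):
    """Posterior probability that two proteins interact.

    prior        -- assumed prior probability of interaction (hypothetical value)
    observations -- list of (detected, fpr, fnr) tuples, one per data set:
                    detected is True if that data set reports the interaction,
                    fpr and fnr are its assumed false positive and false
                    negative rates.
    Assumes the data sets are conditionally independent given the truth.
    """
    odds = prior / (1.0 - prior)
    for detected, fpr, fnr in observations:
        if detected:
            # Likelihood ratio P(detected | interacts) / P(detected | no interaction)
            odds *= (1.0 - fnr) / fpr
        else:
            # Likelihood ratio P(missed | interacts) / P(not reported | no interaction)
            odds *= fnr / (1.0 - fpr)
    return odds / (1.0 + odds)


# Hypothetical numbers: a rare interaction reported by one or two noisy screens.
one_screen = posterior_interaction(0.01, [(True, 0.05, 0.5)])
two_screens = posterior_interaction(0.01, [(True, 0.05, 0.5), (True, 0.05, 0.5)])
print(one_screen, two_screens)
```

Under these assumed rates, a second independent report raises the posterior substantially compared with a single report, which is the intuition behind integrating data from several sources.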
From a statistical point of view, we are interested in Bayesian inference and its applications in bioinformatics. Bayesian inference has been widely used in the analysis of high-throughput bioinformatics data because biological evidence can be flexibly incorporated into Bayesian models and because these models lend themselves naturally to efficient computational methods. Currently, we are developing a Bayesian approach, coupled with Markov chain Monte Carlo (MCMC), to infer protein complexes and functional modules from high-throughput mass spectrometry data, taking into account the topological structure of the protein interaction networks when making the inference.
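For readers unfamiliar with MCMC, the following minimal sketch shows a random-walk Metropolis sampler on a deliberately simple problem. It is only an illustration of the general technique, not the complex-inference method described above: the parameter theta (for example, an unknown error rate), the counts k and n, the uniform prior, and all function names are hypothetical.

```python
import math
import random


def log_posterior(theta, k, n):
    """Unnormalized log posterior for a binomial rate theta under a uniform prior,
    given k events observed in n trials (hypothetical data)."""
    if not 0.0 < theta < 1.0:
        return -math.inf
    return k * math.log(theta) + (n - k) * math.log(1.0 - theta)


def metropolis(k, n, steps=20000, burn_in=2000, step_sd=0.1, seed=0):
    """Random-walk Metropolis sampler; returns the samples kept after burn-in."""
    rng = random.Random(seed)
    theta = 0.5
    current = log_posterior(theta, k, n)
    samples = []
    for i in range(steps):
        # Symmetric Gaussian proposal centered on the current state.
        proposal = theta + rng.gauss(0.0, step_sd)
        candidate = log_posterior(proposal, k, n)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            theta, current = proposal, candidate
        if i >= burn_in:
            samples.append(theta)
    return samples


samples = metropolis(k=30, n=100)
print(sum(samples) / len(samples))  # compare with the analytic mean (k + 1) / (n + 2)
```

In this toy case the posterior is a Beta distribution with a known mean, so the sampler can be checked against the exact answer. MCMC becomes useful when, as in network-based models, no such closed form is available.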