Methods of Genomic Data Fusion: An Overview


The abundance of high-throughput biological data, such as that produced by microarray or protein-protein interaction assays, has led to a need for new methods of data analysis that can infer useful information from large amounts of very noisy and indirect measurements. One solution could be provided by data fusion. Data fusion is a relatively recent term describing machine learning methods that integrate disparate datasets and thereby reduce the overall noise, increase statistical significance, and leverage the interactions and correlations between the datasets to obtain more refined, higher-level information. This paper gives a very brief overview of two general and well-developed approaches to data fusion: Bayesian networks and kernel methods. It may therefore be of interest to a reader not previously familiar with these terms who wishes to gain a basic understanding of the underlying ideas.


Page last updated: 18.06.2006