KC-SMARTR: An R package for detection of statistically significant aberrations in multi-experiment aCGH data

JJ de Ronde, C Klijn, A Velds, H Holstege… - BMC Research Notes, 2010 - Springer
Background
Most approaches used to find recurrent or differential DNA Copy Number Alterations (CNA) in array Comparative Genomic Hybridization (aCGH) data from groups of tumour samples depend on the discretization of the aCGH data to gain, loss or no-change states. This causes loss of valuable biological information in tumour samples, which are frequently heterogeneous. We have previously developed an algorithm, KC-SMART, that bases its estimate of the magnitude of the CNA at a given genomic location on kernel convolution (Klijn et al., 2008). This accounts for the intensity of the probe signal, its local genomic environment and the signal distribution across multiple samples.
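The idea of a kernel-based estimate can be illustrated with a minimal sketch. It assumes a plain Gaussian kernel, gains only, and normalization by the summed kernel weights; the function names, the toy data and the kernel choice are illustrative assumptions, not the KC-SMART implementation.

```python
import math

def gaussian_kernel(distance, width):
    # Plain Gaussian weight; the kernel shape is an assumption of this sketch.
    return math.exp(-0.5 * (distance / width) ** 2)

def kernel_smoothed_estimate(positions, samples, grid, width):
    """Smooth the summed positive log-ratios of all samples over a genomic grid.

    positions: probe coordinates (same order as the values in each sample)
    samples:   one list of log2 ratios per tumour sample
    grid:      genomic positions at which the estimate is evaluated
    width:     kernel width, which sets the genomic scale of the analysis
    """
    # Keep signal magnitude (no gain/loss calls): sum positive values per probe.
    gains = [sum(max(s[i], 0.0) for s in samples) for i in range(len(positions))]
    estimates = []
    for g in grid:
        weights = [gaussian_kernel(g - p, width) for p in positions]
        total = sum(weights)
        # Dividing by the summed weights compensates for uneven probe density.
        value = sum(w * x for w, x in zip(weights, gains)) / total if total > 0 else 0.0
        estimates.append(value)
    return estimates

# Toy data (hypothetical): two samples, five probes, a shared gain near 300-400.
positions = [100, 200, 300, 400, 500]
samples = [
    [0.0, 0.1, 1.2, 1.0, -0.3],
    [0.1, 0.0, 0.8, 0.9, -0.5],
]
grid = [100, 300, 500]
kse = kernel_smoothed_estimate(positions, samples, grid, width=100.0)
```

In this toy case the estimate peaks at the grid point inside the shared gain. A wider kernel would emphasize broad aberrations and a narrower one focal aberrations; losses could be handled the same way using the negative values.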
Results
Here we extend the approach to allow comparative analyses of two groups of samples and introduce the R implementation of these two approaches. The comparative module allows for a supervised analysis to be performed, to enable the identification of regions that are differentially aberrated between two user-defined classes.
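One way to picture such a supervised comparison is a label-permutation test on already smoothed per-sample profiles. The sketch below is an assumption for illustration only: the abstract does not specify the test statistic or null model used by the package, and all names and data here are hypothetical.

```python
import random

def class_difference(profiles, labels):
    # Mean smoothed profile of class 1 minus that of class 0, per grid point.
    a = [p for p, lab in zip(profiles, labels) if lab == 1]
    b = [p for p, lab in zip(profiles, labels) if lab == 0]
    n_points = len(profiles[0])
    return [
        sum(p[i] for p in a) / len(a) - sum(p[i] for p in b) / len(b)
        for i in range(n_points)
    ]

def permutation_pvalues(profiles, labels, n_perm=500, seed=0):
    # Two-sided empirical p-value per grid point from shuffled class labels.
    rng = random.Random(seed)
    observed = class_difference(profiles, labels)
    exceed = [0] * len(observed)
    shuffled = list(labels)
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null = class_difference(profiles, shuffled)
        for i, (obs, nul) in enumerate(zip(observed, null)):
            if abs(nul) >= abs(obs):
                exceed[i] += 1
    # Add-one correction keeps the estimate strictly above zero.
    return [(e + 1) / (n_perm + 1) for e in exceed]

# Toy data (hypothetical): class 1 carries a gain at the middle grid point only.
profiles = [[0.0, 0.0, 0.0]] * 4 + [[0.0, 1.0, 0.0]] * 4
labels = [0, 0, 0, 0, 1, 1, 1, 1]
pvals = permutation_pvalues(profiles, labels)
```

Grid points where the two classes do not differ receive a p-value of 1, while the middle point receives a small one. With many grid points, a multiple-testing correction would be needed before calling regions significant.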
We analyzed data from a series of B- and T-cell lymphomas and were able to retrieve all positive control regions (VDJ regions) in addition to a number of new regions. A t-test on segmented data, which we also implemented, located all the positive control regions and a number of new regions as well, but these regions were highly fragmented.
Conclusions
KC-SMARTR offers recurrent CNA and class-specific CNA detection, at different genomic scales, in a single package without the need for additional segmentation. It is memory efficient and runs on a wide range of machines. Most importantly, it does not rely on data discretization and therefore maximally exploits the biological information in the aCGH data.
The program is freely available from the Bioconductor website http://www.bioconductor.org/ under the terms of the GNU General Public License.