Identifying gene regulatory elements by genome-wide recovery of DNase hypersensitive sites

GE Crawford, IE Holt, JC Mullikin, D Tai…
Proceedings of the National Academy of Sciences, 2004 - National Acad Sciences
Analysis of the human genome sequence has identified ≈25,000–30,000 protein-coding genes, but little is known about how most of these are regulated. Mapping DNase I hypersensitive (HS) sites has traditionally represented the gold-standard experimental method for identifying regulatory elements, but the labor-intensive nature of this technique has limited its application to only a small number of human genes. We have developed a protocol to generate a genome-wide library of gene regulatory sequences by cloning DNase HS sites. We generated a library of DNase HS sites from quiescent primary human CD4+ T cells and analyzed ≈5,600 of the resulting clones. Compared to sequences from randomly generated in silico libraries, sequences from these clones were found to map more frequently to regions of the genome known to contain regulatory elements, such as regions upstream of genes, within CpG islands, and in sequences that align between mouse and human. These cloned sites also tend to map near genes that have detectable transcripts in CD4+ T cells, demonstrating that transcriptionally active regions of the genome are being selected. Validation of putative regulatory elements was achieved by repeated recovery of the same sequence and real-time PCR. This cloning strategy, which can be scaled up and applied to any cell line or tissue, will be useful in identifying regulatory elements controlling global expression differences that delineate tissue types, stages of development, and disease susceptibility.
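
The comparison against randomly generated in silico libraries can be illustrated with a small sketch. This is not the authors' pipeline: the genome length, annotated regions, clone coordinates, and all function names below are hypothetical toy values chosen for illustration, and the uniform sampling of random positions is an assumed null model. The idea is to measure the fraction of cloned sites that fall inside annotated regions (for example, regions upstream of genes or CpG islands) and compare it with the same fraction in many random libraries of equal size.

```python
import bisect
import random


def fraction_in_regions(positions, regions):
    """Return the fraction of positions inside any [start, end) region.

    `regions` must be sorted by start and non-overlapping.
    """
    starts = [start for start, _ in regions]
    hits = 0
    for pos in positions:
        # Find the last region whose start is <= pos, then check its end.
        i = bisect.bisect_right(starts, pos) - 1
        if i >= 0 and pos < regions[i][1]:
            hits += 1
    return hits / len(positions)


def random_library_fractions(n_sites, genome_length, regions, n_libraries, seed=0):
    """Build random in silico libraries and return each one's overlap fraction.

    Assumption: random sites are drawn uniformly along a single linear genome.
    """
    rng = random.Random(seed)
    fractions = []
    for _ in range(n_libraries):
        sites = [rng.randrange(genome_length) for _ in range(n_sites)]
        fractions.append(fraction_in_regions(sites, regions))
    return fractions


def empirical_p_value(observed, null_fractions):
    """One-sided empirical p-value with the usual +1 correction."""
    at_least = sum(1 for f in null_fractions if f >= observed)
    return (at_least + 1) / (len(null_fractions) + 1)


# Hypothetical toy data: a 1 Mb "genome" with three annotated regions.
genome_length = 1_000_000
regions = [(10_000, 12_000), (250_000, 252_000), (700_000, 702_000)]
clone_sites = [10_500, 11_900, 251_000, 400_000, 701_500, 900_000]

observed = fraction_in_regions(clone_sites, regions)
null = random_library_fractions(len(clone_sites), genome_length, regions, 1000)
p_value = empirical_p_value(observed, null)
print(f"observed fraction: {observed:.3f}, empirical p-value: {p_value:.4f}")
```

A real analysis would use actual genome coordinates per chromosome and would likely need a null model that accounts for mappability and sequence composition; the uniform draw here only conveys the shape of the comparison.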