We systematically analyzed 771 high-quality human CTCF ChIP-seq datasets (with at least 2000 peaks) from the public domain. These datasets cover over 200 human cell types, including normal tissues and multiple cancer types. We collectively identified 688,429 distinct CTCF binding sites by merging shared peaks called from each dataset.
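The merging step described above can be pictured with a minimal sketch. It assumes that peaks are half-open (chrom, start, end) intervals and that any two overlapping peaks collapse into one site; the exact merge rule used for this resource is not stated here, and the function name `merge_peaks` is hypothetical.

```python
from collections import defaultdict

def merge_peaks(peaks):
    """Merge overlapping (chrom, start, end) peaks into union sites.

    Assumption: intervals are half-open, and any overlap merges two peaks.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in peaks:
        by_chrom[chrom].append((start, end))

    merged = []
    for chrom in sorted(by_chrom):
        intervals = sorted(by_chrom[chrom])
        cur_start, cur_end = intervals[0]
        for start, end in intervals[1:]:
            if start < cur_end:
                # Overlaps the current site: extend it.
                cur_end = max(cur_end, end)
            else:
                # No overlap: close the current site and start a new one.
                merged.append((chrom, cur_start, cur_end))
                cur_start, cur_end = start, end
        merged.append((chrom, cur_start, cur_end))
    return merged

# Peaks pooled from several hypothetical datasets.
pooled = [
    ("chr1", 100, 200),
    ("chr1", 150, 250),
    ("chr1", 300, 400),
    ("chr2", 10, 20),
]
sites = merge_peaks(pooled)
```

Pooling the peaks from every dataset before merging yields one union site per cluster of overlapping peaks.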
We assigned an occupancy score to each CTCF site by counting the ChIP-seq datasets that have a peak within the site. We obtained 285,467 high-confidence CTCF binding sites with an occupancy score ≥ 3. We also identified 22,097 constitutive CTCF binding sites, defined as sites bound in at least 80% of all 771 datasets, a cutoff determined by an empirical model.
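The occupancy score and the two thresholds above can be illustrated with a short sketch. It assumes that a dataset counts toward a site when any of its peaks overlaps the site, and it takes the 80% cutoff as given rather than reproducing the empirical model. The function names and category labels are hypothetical.

```python
def occupancy_scores(sites, datasets):
    """Count, for each site, how many datasets have a peak overlapping it.

    sites: list of (chrom, start, end) tuples.
    datasets: list of peak lists, one list of (chrom, start, end) per dataset.
    """
    scores = []
    for chrom, start, end in sites:
        n = sum(
            any(c == chrom and s < end and e > start for c, s, e in peaks)
            for peaks in datasets
        )
        scores.append(n)
    return scores

def classify(score, n_datasets, min_occupancy=3, constitutive_fraction=0.8):
    """Label a site by occupancy (labels are illustrative only)."""
    if score / n_datasets >= constitutive_fraction:
        return "constitutive"
    if score >= min_occupancy:
        return "high-confidence"
    return "low-occupancy"

# Toy example: one site, three datasets, two of which overlap the site.
sites = [("chr1", 100, 250)]
datasets = [
    [("chr1", 120, 180)],
    [("chr1", 240, 300)],
    [("chr2", 100, 200)],
]
scores = occupancy_scores(sites, datasets)
```

With 771 datasets, the 80% cutoff corresponds to an occupancy score of at least 617. Under these definitions every constitutive site also meets the high-confidence threshold; the sketch reports only the stricter label.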
We identified cancer-specific CTCF binding patterns in six cancer types: T-cell acute lymphoblastic leukemia (T-ALL), acute myeloid leukemia (AML), breast cancer (BRCA), colorectal cancer (CRC), lung cancer (LUAD) and prostate cancer (PRAD). We found that gain of CTCF binding in cancer was associated with increased chromatin interactions and cancer-specific gene activation, while loss of CTCF binding occurred at promoters of genes with lower expression in cancer than in normal cells. Our results substantiate CTCF binding alteration as a functional epigenomic signature of cancer.
If you use any data from this website, please cite:
Cancer-specific CTCF binding facilitates oncogenic transcriptional dysregulation
Celestia Fang*, Zhenjia Wang*, Cuijuan Han, Stephanie L. Safgren, Kathryn A. Helmin, Emmalee R. Adelman, Valentina Serafin, Giuseppe Basso, Kyle P. Eagen, Alexandre Gaspar-Maia, Maria E. Figueroa, Benjamin D. Singer, Aakrosh Ratan, Panagiotis Ntziachristos#, Chongzhi Zang#
Genome Biology 21, 247 (2020)
BED format
with chrom, start, end, id (hg38)
CSV format
with extra annotation (CTCF sequence motif, occupancy score, genomic annotation, etc.)
BED format
with chrom, start, end, id (hg38)
CSV format
with extra annotation (CTCF sequence motif, occupancy score, genomic annotation, etc.)
BED format
with chrom, start, end, id (hg38)
CSV format
with extra annotation (CTCF sequence motif, occupancy score, genomic annotation, etc.)
T-ALL gained
T-ALL lost
AML gained
AML lost
BRCA gained
BRCA lost
CRC gained
CRC lost
LUAD gained
LUAD lost
PRAD gained
PRAD lost
List of collected CTCF ChIP-seq datasets
Lists of CTCF ChIP-seq datasets for identification of cancer-specific CTCF binding sites
Last modified: December 15, 2020