Analysis and prediction of super-enhancers using sequence and chromatin signatures
Abstract
Background: Super-enhancers are clusters of active enhancers densely occupied by Mediator, transcription factors and chromatin regulators, and they control the expression of cell identity and disease-associated genes. Current studies have demonstrated that multiple factors may play important roles in super-enhancer formation; however, a systematic analysis to assess the relative contribution of chromatin and sequence features of super-enhancers and their constituents is lacking. In addition, a predictive model that integrates various types of data to predict super-enhancers has not been established.
Results: Here, we integrated diverse types of genomic and epigenomic datasets to identify key signatures of super-enhancers and their constituents and to investigate their relative contribution. Through computational modelling, we identified Cdk8, Cdk9 and Smad3 as new key features of super-enhancers, along with many known features. Comprehensive analysis of these features in embryonic stem cells and pro-B cells revealed their role in super-enhancer formation and cellular identity. Further, we observed significant correlation and combinatorial predictive ability among many cofactors at the constituents of super-enhancers. Using these features, we developed computational models that can accurately predict super-enhancers and their constituents. We validated these models using cross-validation and independent datasets in four human cell types.
Conclusions: Our analysis of these features and our prediction models can serve as a resource to further characterize and understand the formation of super-enhancers. Taken together, our results also suggest possible cooperative and synergistic interactions of numerous factors at super-enhancers and their constituents. We have made our analysis pipeline available as an open-source tool with a command-line interface at https://github.com/asntech/improse.
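The idea of training a classifier on per-region feature signals and checking it with cross-validation, as the abstract describes, can be illustrated with a minimal sketch. This is not the authors' pipeline and does not use the improse tool: the data are synthetic, the three feature columns are hypothetical stand-ins for factor signals, and plain logistic regression is swapped in purely for illustration. All function names below are assumptions introduced for this example.

```python
import math
import random


def make_toy_data(n=200, seed=0):
    """Synthetic regions with 3 hypothetical feature signals each.

    Label 1 stands for a 'super-enhancer-like' region. Giving positives a
    higher mean signal is an assumption made only for illustration.
    """
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        mean = 1.0 if label else 0.0
        X.append([rng.gauss(mean, 1.0) for _ in range(3)])
        y.append(label)
    return X, y


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=300):
    """Fit logistic regression with batch gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(model, X):
    """Return 0/1 predictions at a 0.5 probability threshold."""
    w, b = model
    return [
        1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0
        for xi in X
    ]


def k_fold_accuracy(X, y, k=5, seed=0):
    """Return the held-out accuracy of each of k cross-validation folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        model = fit_logistic([X[i] for i in train], [y[i] for i in train])
        preds = predict(model, [X[i] for i in held_out])
        correct = sum(p == y[i] for p, i in zip(preds, held_out))
        scores.append(correct / len(held_out))
    return scores


if __name__ == "__main__":
    X, y = make_toy_data()
    scores = k_fold_accuracy(X, y)
    print("per-fold accuracy:", [round(s, 2) for s in scores])
```

Each fold is held out once while the model is fitted on the remaining regions, so every accuracy value reflects data the model did not see during training. Validation on independent datasets, which the abstract also mentions, would instead apply one fitted model to data from a different source.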
Related resources
GRHL3 binding and enhancers rearrange as epidermal keratinocytes transition between functional states
Transcription factor binding, chromatin modifications and large scale chromatin re-organization underlie progressive, irreversible cell lineage commitments and differentiation. We know little, however, about chromatin changes as cells enter transient, reversible states such as migration. Here we demonstrate that when human progenitor keratinocytes either differentiate or migrate they form compl...
Genome-Wide Prediction and Validation of Intergenic Enhancers in Arabidopsis Using Open Chromatin Signatures.
Enhancers are important regulators of gene expression in eukaryotes. Enhancers function independently of their distance and orientation to the promoters of target genes. Thus, enhancers have been difficult to identify. Only a few enhancers, especially distant intergenic enhancers, have been identified in plants. We developed an enhancer prediction system based exclusively on the DNase I hyperse...
DELTA: A Distal Enhancer Locating Tool Based on AdaBoost Algorithm and Shape Features of Chromatin Modifications
Accurate identification of DNA regulatory elements becomes an urgent need in the post-genomic era. Recent genome-wide chromatin states mapping efforts revealed that DNA elements are associated with characteristic chromatin modification signatures, based on which several approaches have been developed to predict transcriptional enhancers. However, their practical application is limited by incomp...
ChromaSig: A Probabilistic Approach to Finding Common Chromatin Signatures in the Human Genome
Computational methods to identify functional genomic elements using genetic information have been very successful in determining gene structure and in identifying a handful of cis-regulatory elements. But the vast majority of regulatory elements have yet to be discovered, and it has become increasingly apparent that their discovery will not come from using genetic information alone. Recently, h...
Chromatin proteomics reveals novel combinatorial histone modification signatures that mark distinct subpopulations of macrophage enhancers
The integrated activity of cis-regulatory elements fine-tunes transcriptional programs of mammalian cells by recruiting cell type-specific as well as ubiquitous transcription factors (TFs). Despite their key role in modulating transcription, enhancers are still poorly characterized at the molecular level, and their limited DNA sequence conservation in evolution and variable distance from target...