Transcription factor landscape
For a living cell to function in its environment, a large number of regulatory processes are needed, including regulation of cell proliferation, cell differentiation and cell death. The underlying mechanisms include regulation of gene expression, and an important class of regulatory proteins is the transcription factors, which determine when genes are switched on and off. Here we explore the 1485 human transcription factors and their expression landscape across different cell types of the human body, as well as in cancer cell lines. When studying the expression landscape of transcription factors in different cell types, 1188 transcription factor genes show some level of elevated expression in one cell type or a group of cell types compared to others. This analysis is based on the specificity categories of gene expression and on four different cell-centric datasets: cell types representing the whole body, cell types compared within the brain, circulating immune cells, and a comparison between cancer cell lines.
Here, we provide an overview of the expression landscape of transcription factors, separated into the four datasets. Click the respective cell types for more details and an overview of the transcription factors with enriched expression profiles.

Single cell types representing the body
Single cell type transcriptomics data is based on single cell RNA sequencing (scRNAseq) data from 31 human tissues, including peripheral blood mononuclear cells (PBMCs), representing 81 different cell types grouped into 15 main cell type groups. Out of the 81 cell types representing 31 tissues, germ cells is the cell type group with the most transcription factors classified as cell type enriched (31), followed by neuronal cells with 15 transcription factors classified as cell type enriched.
Single nuclei with human brain details
Brain single nuclei transcriptomics data is based on single nuclei RNA sequencing (snRNAseq) data representing 11 brain regions, based on 461 cell clusters, and 31 superclusters, here shown as 34 superclusters/cell types based on the published data (Siletti K et al. (2023)) and their cell type classification. This brain single nuclei dataset further expands the representation of the human brain. Brain cells are represented in the whole body comparison of cell types (above), but there they are limited to the cerebral cortex and 76533 cells. The overlap between the datasets is further discussed on the respective cell type pages. When comparing the gene expression of cells within the brain, the heterogeneous group of glial cells holds the most transcription factors with enriched expression (248), reflecting the wide range of cells that are referred to as glial cells (astrocytes, Bergmann glia, microglia, oligodendrocytes, OPCs and ependymal cells). The different clusters of neurons, throughout the different brain regions, include several transcription factors with an elevated expression profile.
Immune cell details
Immune cell type transcriptomics data is based on flow-sorted cells from blood and covers 18 cell types grouped into 6 immune cell lineages. This immune cell dataset further expands the representation and description of the different immune subtypes, with expression comparison between the different immune cell types present in circulating blood. Immune cells are included in the whole body comparison of cell types (above), but that comparison is limited to the main types of immune cells and focuses on immune cells resident in different tissue types. Among circulating immune cells, granulocytes have the highest number of transcription factors classified as enriched (59) compared to the other immune cell lineages.
Cancer cell lines representing cancer types
Cancer cell line transcriptomics data provides RNA expression profiles of human protein-coding genes in 1132 cancer cell lines, representing 28 different cancer types. The cell line specificity is classified by comparing the expression profiles across the 28 grouped cancer cell lines. Three cancer types stand out when comparing the expression profiles of transcription factors: testis cancer (14), neuroblastoma (15) and bone cancer (15) have the highest numbers of transcription factors classified as cancer cell line enriched. It is important to note that testis cancer is represented by one cell line, while both neuroblastoma and bone cancer are represented by a mean expression value of several representative cell lines.
Some cancers show very low numbers of elevated transcription factors, such as bladder cancer and pancreatic cancer.
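The note about testis cancer can be made concrete with a small sketch. It assumes, as the text states, that each cancer type is summarised by the mean expression of its cell lines. The function name, the cell line names and the expression values are made up for illustration and are not real data.

```python
from collections import defaultdict
from statistics import mean


def mean_expression_by_cancer_type(samples):
    """Average one gene's expression over the cell lines of each cancer type.

    `samples` is a list of (cancer_type, cell_line, expression) tuples.
    """
    grouped = defaultdict(list)
    for cancer_type, _cell_line, expression in samples:
        grouped[cancer_type].append(expression)
    return {cancer_type: mean(values) for cancer_type, values in grouped.items()}


# Hypothetical values for a single transcription factor gene (not real data).
samples = [
    ("testis cancer", "line_T1", 40.0),
    ("neuroblastoma", "line_N1", 10.0),
    ("neuroblastoma", "line_N2", 30.0),
    ("neuroblastoma", "line_N3", 20.0),
]
profile = mean_expression_by_cancer_type(samples)
print(profile)
```

With a single cell line, the value for the cancer type is simply that line's own expression, so nothing is averaged out. This may be one reason a cancer type represented by one cell line can show a comparatively high number of enriched genes.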
Regulation of gene expression
Transcription factors are regulatory proteins, and they are considered to be the most diverse and important mechanism of gene regulation. According to the TFClass database and with data in both the UniProt and Ensembl databases, 1485 human genes are classified as transcription factors. They have DNA-binding domains that bind, specifically and with high affinity, to consensus DNA sequences and thereby activate (or in rare cases inhibit) transcription. Transcription factors are classified into families based either on the highly conserved sequences of the DNA-binding domains or on their three-dimensional protein structure. These structural motifs determine their specificity for the consensus sequence, and the major classes include 772 proteins with zinc-coordinating DNA-binding domains (zinc-finger proteins), 171 proteins with basic domains (helix-loop-helix and leucine-zipper factors), and 389 proteins with helix-turn-helix domains (homeodomain factors). In Table 1, the transcription factors are classified according to structural motif as in the TFClass database.
Table 1. Structural classification of transcription factors.
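As a quick arithmetic check on the counts above, the following minimal sketch computes the share of the 1485 transcription factors that falls in each of the three major structural classes. The counts are taken from the text. The variable and function names are illustrative only, and the "other (remainder)" bucket is simply what is left over, not a class defined by the database.

```python
# Counts quoted in the text above; all names here are illustrative.
TOTAL_TFS = 1485
MAJOR_CLASSES = {
    "zinc-coordinating": 772,
    "basic domain": 171,
    "helix-turn-helix": 389,
}


def class_shares(counts, total):
    """Return each class's share of the total, plus the unassigned remainder."""
    shares = {name: n / total for name, n in counts.items()}
    shares["other (remainder)"] = (total - sum(counts.values())) / total
    return shares


if __name__ == "__main__":
    for name, share in class_shares(MAJOR_CLASSES, TOTAL_TFS).items():
        print(f"{name}: {share:.1%}")
```

The three major classes together account for 1332 of the 1485 genes, so about 10% of the transcription factors belong to other structural classes.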
Examples of transcription factors from different structural classes
The zinc-finger is a structural motif in which one or more zinc ions stabilize the protein fold, as exemplified by the three-dimensional schematic representation of the estrogen receptor ESR1 (purple with zinc ions in red) binding to DNA. ESR1 is a nuclear hormone receptor, here shown to be expressed in glandular cells and cells in endometrial stroma of the uterus by staining with the antibody CAB000037.
The structural motif known as the leucine-zipper consists of a leucine repeat region, which forms an alpha helix with a hydrophobic region responsible for dimerization. It is exemplified here by a three-dimensional schematic representation of the proto-oncogene JUN (purple) binding as a homodimer to DNA. JUN is a basic leucine-zipper factor, here shown to be expressed in glandular cells of the colon by immunohistochemical staining using the antibody CAB007780.
The helix-turn-helix motif is a DNA-binding motif composed of two α-helices, which make contacts with DNA and are joined by a short turn. The three-dimensional schematic representation shows the transcription factor GBX1 (purple) binding to DNA. GBX1 is a homeodomain factor, here shown to be expressed in follicle cells of the ovary by staining with the antibody HPA055783.
Relevant links and publications
Uhlén M et al., Tissue-based map of the human proteome. Science (2015)
Karlsson M et al., A single-cell type transcriptomics map of human tissues. Sci Adv (2021)
Siletti K et al., Transcriptomic diversity of cell types across the adult human brain. Science (2023)
Uhlén M et al., A genome-wide transcriptomic analysis of protein-coding genes in human blood cells. Science (2019)
Jin H et al., Systematic transcriptional analysis of human cell lines for gene expression landscape and tumor representation. Nat Commun (2023)