Brain - Methods summary

The Brain resource of the Atlas gives an overview of protein expression and distribution in the mammalian brain. Externally and in-house generated data are integrated to explore regional protein expression in the human, pig and mouse brain. Protein expression data are based on quantification of messenger RNA using RNA sequencing and in situ hybridization. Protein distribution data are generated using antibody-based immunohistochemistry and immunofluorescence. The Brain resource can be used to create an overview of regional and cross-species expression of proteins of interest, or to identify regionally or functionally clustered genes based on expression levels across regions of the brain.

Key publications:
Sjöstedt E et al. (2020) “An atlas of the protein-coding genes in the human, pig, and mouse brain.” Science 367(6482):eaay5947
Zhong W et al. (2022) “The neuropeptide landscape of human prefrontal cortex.” Proc Natl Acad Sci U S A 119(33):e2123146119


How has the data been generated?


Figure 1. Brain regions, areas and nuclei were micro-dissected from human, pig and mouse brains. RNA was extracted using the QIAGEN RNeasy Lipid Tissue Mini Kit, RIN values were determined using QIAxcel, and Qubit was used to measure mRNA concentrations. Samples were enriched for mRNA using poly(A) purification or ribosomal RNA depletion. After library preparation, the quality of the libraries was assessed using Qubit and TapeStation. Samples were sequenced on the Illumina and MGI sequencing platforms (PE100 or PE150).

RNA expression data

Transcriptomics data are based on micro-dissected areas and regions of the human brain (n=193), human prefrontal cortex (n=17), mouse brain (n=19) and pig brain (n=32). Human brain samples were provided by the Human Brain Tissue Bank of the Semmelweis University in Hungary. Following RNA extraction, messenger RNA was enriched using either a poly(A) purification step (human prefrontal cortex and mouse) or a ribosomal RNA clean-up strategy (human and pig). Samples were sequenced on the Illumina (mouse brain and human prefrontal cortex) or MGI (human brain and pig brain) RNA-seq platforms, and reads were mapped to the corresponding genes in the Ensembl version used in the Human Protein Atlas.

Immunofluorescence on mouse brain sections

Protein targets were selected based on their brain, brain regional or brain cell-type elevated expression. Antibodies against these targets were applied to a series of coronal mouse brain sections that cover all major brain regions and cell types of the mammalian brain. Sections were scanned on a fluorescence slide scanning microscope to generate complete overview images with microscopic resolution.

Immunohistochemistry on tissue microarrays

The protein expression tissue microarray data include cerebral cortex, hippocampal formation, caudate nucleus and cerebellum, and were derived from antibody-based protein profiling using immunohistochemistry (the Tissue resource contains data on 44 human tissue types). Tissue microarrays of 1 mm diameter samples were stained with primary antibodies, visualized with DAB (3,3'-diaminobenzidine) and counterstained with hematoxylin. Each brain region is represented by samples from three individuals. For selected proteins, additional tissues were stained, such as eye, cerebral cortex, hypothalamus, cerebellum and substantia nigra. Immunohistochemically stained sections from tissue microarrays were scanned to allow for subsequent analysis and presentation at the HPA web portal.


How has the data been analyzed?

Figure 2. All reads of all samples were mapped to the selected Ensembl version using the Kallisto pseudoalignment algorithm. Transcripts Per Million (TPM) was calculated for all protein-coding transcripts. Technical variation (cohort, platform) and individual (donor) variation were removed using a data normalization approach. Each protein-coding gene is classified based on expression in brain vs. peripheral tissues and between regions of the brain. Expression (normalized TPM) is calculated for all regions and subregions. Brain and regional elevated genes for human, pig and mouse are listed on the brain region pages.
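As an illustration of the TPM unit mentioned above, the following is a minimal sketch of the standard Transcripts Per Million calculation. It is not the Atlas pipeline: Kallisto estimates abundances itself and works with effective transcript lengths, whereas this sketch assumes plain transcript lengths. The function name, transcript names, counts and lengths are all hypothetical.

```python
def tpm(counts, lengths_bp):
    """Convert read counts to Transcripts Per Million (simplified sketch).

    counts:     transcript id -> number of reads assigned to it
    lengths_bp: transcript id -> transcript length in base pairs (must be > 0)
    """
    # Step 1: reads per kilobase, which corrects for transcript length.
    rpk = {t: counts[t] / (lengths_bp[t] / 1000.0) for t in counts}
    # Step 2: scale so that the values of one sample sum to one million.
    scale = sum(rpk.values()) / 1_000_000
    if scale == 0:
        return {t: 0.0 for t in counts}
    return {t: value / scale for t, value in rpk.items()}


# Hypothetical counts and lengths, for illustration only.
counts = {"tx_a": 100, "tx_b": 300, "tx_c": 0}
lengths = {"tx_a": 1000, "tx_b": 3000, "tx_c": 500}
values = tpm(counts, lengths)
print(values)
```

Here tx_a and tx_b end up with the same TPM because tx_b has three times the reads but is also three times as long.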


RNA expression data

Transcript expression levels are determined by counting the reads that match each protein-coding sequence. Normalization between experiments, platforms, species and individuals is performed based on the assumption that the normal distribution of gene expression for all protein-coding genes is similar between samples. Normalized data are presented on the gene summary pages as normalized Transcripts Per Million (nTPM). For the non-human species, only data on proteins with one-to-one human orthologues are presented. Pig data mapped to the pig genome are available in the Pig RNAatlas. Based on the anatomical and developmental organization of the brain, samples are grouped in 14 main regions of the central nervous system (olfactory bulb, cerebral cortex, hippocampal formation, amygdala, basal ganglia, thalamus, hypothalamus, midbrain, pons, cerebellum, medulla oblongata, spinal cord, choroid plexus and white matter structures). For each region, protein expression is calculated as the maximum expression of any of the areas and subregions included in the group.
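The region-level summary described above (maximum over the areas and subregions in a group) can be sketched as follows. This is only an illustration under stated assumptions: the function name, the subregion-to-region mapping and the nTPM values are hypothetical and do not come from the Atlas data.

```python
def region_expression(subregion_ntpm, region_of):
    """Summarize subregion nTPM values per main region by taking the maximum.

    subregion_ntpm: subregion name -> nTPM value for one gene
    region_of:      subregion name -> main region it belongs to
    """
    region_max = {}
    for subregion, value in subregion_ntpm.items():
        region = region_of[subregion]
        # Keep the highest value seen so far for this main region.
        region_max[region] = max(value, region_max.get(region, value))
    return region_max


# Hypothetical mapping and values for a single gene.
region_of = {
    "dentate gyrus": "hippocampal formation",
    "CA1": "hippocampal formation",
    "cerebellar cortex": "cerebellum",
}
subregion_ntpm = {"dentate gyrus": 12.5, "CA1": 30.1, "cerebellar cortex": 4.0}
summary = region_expression(subregion_ntpm, region_of)
print(summary)
```

Taking the maximum rather than the mean means that a gene strongly expressed in a single small nucleus is still visible at the main-region level.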

Mouse brain virtual microscope

Protein distribution maps for 303 proteins have been generated. These maps provide an overview of protein distribution in the many regions of the mouse brain, and also allow inspection of cells and cellular compartments.

Knowledge-based annotation of protein expression

Human brain tissue microarray images have been annotated for positivity in glial cells, neuronal cells, endothelial cells and neuropil. For cerebellum, Purkinje cells and cells in the granular and molecular layers have been annotated. Mouse IF data, based on a series of coronal sections, have been annotated for cell type (astrocytes, microglia, oligodendrocytes, neurons, ependymal cells or endothelia) and subcellular distribution (soma, nucleus, endfeet, myelin sheath, dendrites, axons or synapses). Fluorescence intensity is measured in 129 brain regions and subfields and summarized in the 13 main brain regions based on the maximum mean fluorescence intensity of any of their subregions and subfields.


What can you learn from the Brain resource?


Figure 3. The gene summary pages provide an overview of protein expression in the brain. The brain is divided into 13 main regions, each represented by a bar. For each main brain region, expression data for individual (sub)regions, areas and nuclei can be explored. For all one-to-one orthologues, gene expression in the mouse and pig brain can also be explored.

Learn about:

  • Expression levels for all human proteins in regions and subregions of the human brain
  • Expression levels for all proteins with human orthologs in regions and subregions of the pig and mouse brain
  • Brain enriched genes with higher expression in any of the regions of the brain compared to peripheral organs
  • Regional enriched genes with higher expression in a single or few regions of the brain
  • Cell-type and cell-compartment distribution of selected proteins in the human and mouse brain
  • Differences in gene expression between mammalian species
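Cross-species comparison relies on the one-to-one orthologues mentioned for the non-human data. A minimal sketch of how such pairs could be selected from a list of orthologue pairs is given below. The function name and gene identifiers are placeholders, and the sketch makes no claim about how the Atlas derives its orthologue table.

```python
from collections import Counter


def one_to_one(pairs):
    """Keep only pairs where both genes occur in exactly one orthologue pair.

    pairs: iterable of (human_gene, other_species_gene) tuples
    """
    # Drop exact duplicate pairs while preserving order.
    unique_pairs = list(dict.fromkeys(pairs))
    human_count = Counter(human for human, _ in unique_pairs)
    other_count = Counter(other for _, other in unique_pairs)
    return [
        (human, other)
        for human, other in unique_pairs
        if human_count[human] == 1 and other_count[other] == 1
    ]


# Placeholder identifiers, for illustration only.
pairs = [
    ("H1", "M1"),  # one-to-one
    ("H2", "M2"),  # H2 maps to two genes: one-to-many
    ("H2", "M3"),
    ("H3", "M4"),  # M4 maps to two human genes: many-to-one
    ("H4", "M4"),
]
kept = one_to_one(pairs)
print(kept)
```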


Data overview

Data type | Count | Data | Coverage (nr genes)
RNA expression (HPA) | 13 | Gene level RNA seq data for 13 brain regions in the HPA dataset used for expression profiling and classification | 20162
RNA expression (HPA) | 193 | Gene level RNA seq data for 193 brain subregions in the HPA dataset used for expression profiling | 20162
RNA expression (GTEx) | 10 | Gene level RNA seq data for 10 brain regions from the GTEx RNA dataset used for expression profiling | 19266
RNA expression (FANTOM) | 14 | Gene level RNA seq data for 14 brain regions from the FANTOM dataset used for expression profiling | 18292
RNA expression (pfc) | 20 | Gene level RNA seq data for 20 prefrontal cortex (pfc) subregions used for expression profiling | 20162
RNA expression (HPA pig) | 15 | Gene level RNA seq data for 15 pig brain regions used for expression profiling | 16614
RNA expression (HPA mouse) | 13 | Gene level RNA seq data for 13 mouse brain regions from HPA used for expression profiling | 16679
RNA expression (Allen mouse) | 10 | Gene level RNA seq data for 10 mouse brain regions from Allen brain atlas used for expression profiling | 15156
RNA gene clustering | 193 | Gene expression cluster analysis based on 193 brain subregions | 17832

Mouse brain virtual microscope

A genome-wide classification of the protein-coding genes with regard to tissue distribution and specificity has been performed using between-sample normalized data (Tissue Section). In this comparison, brain was represented by the maximum expression level in any of the 13 main regions of the brain. In the Brain section, regional distribution is classified by comparing gene expression across the 13 main regions of the central nervous system. The genes were classified according to specificity into (i) enriched genes, with at least four-fold higher expression levels in one tissue type or brain region as compared with any other analysed tissue or brain region; (ii) group-enriched genes, with enriched expression in a small number of tissues or brain regions (2 to 5); and (iii) tissue-enhanced genes, with only moderately elevated expression in brain or a brain region. In the figure, the numbers of tissue-enriched and group-enriched genes are shown.
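The specificity categories above can be sketched in code. Only the four-fold criterion for enriched genes and the group size of 2 to 5 are taken from the text. Everything else is an assumption made for illustration: the detection cutoff of 1 nTPM, reusing the four-fold factor for groups, the rule for "enhanced" (four-fold above the mean of all regions), the labels "not detected" and "low specificity", the function name and the example values.

```python
def classify(expr, fold=4.0, max_group=5, min_level=1.0):
    """Classify one gene from its per-region expression (illustrative sketch).

    expr: region name -> normalized expression; at least two regions required.
    Returns (category, list of regions the category refers to).
    """
    if len(expr) < 2:
        raise ValueError("need at least two regions to compare")
    ranked = sorted(expr.items(), key=lambda item: item[1], reverse=True)
    names = [name for name, _ in ranked]
    values = [value for _, value in ranked]

    # Assumed detection cutoff (not stated in the text).
    if values[0] < min_level:
        return "not detected", []
    # (i) Enriched: top region at least `fold` times any other region.
    if values[0] >= fold * values[1]:
        return "enriched", names[:1]
    # (ii) Group enriched: 2 to max_group regions, each at least `fold`
    # times the highest region outside the group (assumed criterion).
    for size in range(2, min(max_group, len(values) - 1) + 1):
        if values[size - 1] >= fold * values[size]:
            return "group enriched", names[:size]
    # (iii) Enhanced: at least `fold` times the mean (assumed criterion).
    mean = sum(values) / len(values)
    enhanced = [name for name, value in ranked if value >= fold * mean]
    if enhanced:
        return "enhanced", enhanced
    return "low specificity", []


# Hypothetical expression values, for illustration only.
single = classify({"cerebellum": 40.0, "cerebral cortex": 5.0, "thalamus": 2.0})
group = classify({"amygdala": 40.0, "hypothalamus": 36.0, "pons": 5.0, "midbrain": 2.0})
flat = classify({"amygdala": 10.0, "pons": 9.0, "midbrain": 8.0})
print(single, group, flat)
```

Ranking the regions first keeps each check to a single comparison between neighbouring values in the sorted list.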


Assays and Annotations

Detailed information on the transcriptomics data, gene classification, immunofluorescence and immunohistochemistry can be found on the Assays and Annotations page.