Downloadable data

Programmatic access

To access a subset of the data programmatically, see the help page for more information.

Search results

The data files presented here include the data available in the Human Protein Atlas version 24.0. A subset of this data, restricted to the genes in the current search result, can also be downloaded from the Search page in the following formats: XML, RDF, TSV and JSON.

Single entry

Data for a single entry can be accessed in XML, RDF (trig), TSV or JSON format by appending the corresponding format extension to the Ensembl ID, as in the URLs below:
https://www.proteinatlas.org/ENSG00000134057.xml
https://www.proteinatlas.org/ENSG00000134057.trig
https://www.proteinatlas.org/ENSG00000134057.tsv
https://www.proteinatlas.org/ENSG00000134057.json
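
The URL pattern above can be sketched in Python as follows. This is a minimal illustration, not an official client: the function names `entry_url` and `fetch_entry` and the timeout value are assumptions made for the example, and only the host, the Ensembl ID and the four extensions come from the URLs listed above.

```python
from urllib.request import urlopen

BASE_URL = "https://www.proteinatlas.org"
# Extensions taken from the example URLs; "trig" is the RDF serialization.
FORMATS = ("xml", "trig", "tsv", "json")


def entry_url(ensembl_id: str, fmt: str) -> str:
    """Build the single-entry URL by appending the format extension."""
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    return f"{BASE_URL}/{ensembl_id}.{fmt}"


def fetch_entry(ensembl_id: str, fmt: str, timeout: float = 30.0) -> bytes:
    """Download one entry and return the raw response body (hypothetical helper)."""
    with urlopen(entry_url(ensembl_id, fmt), timeout=timeout) as response:
        return response.read()


if __name__ == "__main__":
    # Only prints the URLs; call fetch_entry() to actually download.
    for fmt in FORMATS:
        print(entry_url("ENSG00000134057", fmt))
```

Keeping URL construction separate from the download makes the pattern easy to check without network access.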

Archived data

From version 16 of the Human Protein Atlas onwards, each version of the site can be reached using the URL structure "vXX.proteinatlas.org", where XX is the version number. For example, version 16 of the Human Protein Atlas has the URL https://v16.proteinatlas.org.
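
As a small sketch of the versioned URL structure, the helper below builds the archive address for a given version number. The name `archive_url` is hypothetical, and the check only enforces the lower bound of 16 stated above; it does not verify that a given version actually exists.

```python
def archive_url(version: int) -> str:
    """Return the archive URL for a Human Protein Atlas version (16 or later)."""
    if version < 16:
        # The vXX structure is only described for version 16 onwards.
        raise ValueError("versioned URLs are described for version 16 onwards")
    return f"https://v{version}.proteinatlas.org"


if __name__ == "__main__":
    print(archive_url(16))
```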

Tissue

Data type | Count | Data | Coverage (nr genes)
Protein expression (IHC) | 45 | Protein expression levels based on IHC staining for 76 cell types in 45 normal tissues | 15302
RNA expression (consensus) | 50 | Gene level RNA seq data based on 50 consensus tissues used for expression profiling and classification | 20162
RNA expression (HPA) | 40 | Gene level RNA seq data based on the 40 tissues in the HPA dataset | 20162
RNA expression (GTEx) | 35 | Gene level RNA seq data for 35 tissues based on 46 tissue subtypes in the GTEx dataset | 19266
RNA expression (FANTOM) | 46 | Gene level RNA seq data for 46 tissues based on 66 tissue subtypes in the FANTOM dataset | 18292
RNA gene clustering | 50 | Gene expression cluster analysis based on 50 tissues | 18492
RNA expression (HPA) | 186 | Transcript level RNA seq data based on 186 tissue samples from 40 HPA tissues | 20162
Protein expression (mIHC/IF) | | Multiplex profiling using fluorescent multiplex IHC (mIHC/IF) in a selection of tissues and ciliated cells | 1021

Brain

Data type | Count | Data | Coverage (nr genes)
RNA expression (HPA) | 13 | Gene level RNA seq data for 13 brain regions in the HPA dataset used for expression profiling and classification | 20162
RNA expression (HPA) | 193 | Gene level RNA seq data for 193 brain subregions in the HPA dataset used for expression profiling | 20162
RNA expression (HPA) | 966 | Transcript level RNA seq data for 966 brain samples in the HPA dataset used for expression profiling | 20162
RNA expression (GTEx) | 10 | Gene level RNA seq data for 10 brain regions from the GTEx RNA dataset used for expression profiling | 19266
RNA expression (GTEx) | 105 | Transcript level RNA seq data for 105 GTEx retina samples used for expression profiling | 20162
RNA expression (FANTOM) | 14 | Gene level RNA seq data for 14 brain regions from the FANTOM dataset used for expression profiling | 18292
RNA expression (pfc) | 20 | Gene level RNA seq data for 20 prefrontal cortex (pfc) subregions used for expression profiling | 20162
RNA expression (HPA pig) | 15 | Gene level RNA seq data for 15 pig brain regions used for expression profiling | 16614
RNA expression (HPA pig) | 144 | Transcript level RNA seq data for 144 pig brain samples | 16614
RNA expression (HPA mouse) | 13 | Gene level RNA seq data for 13 mouse brain regions from HPA used for expression profiling | 16679
RNA expression (HPA mouse) | 75 | Transcript level RNA seq data for 75 mouse brain samples from HPA | 16679
RNA expression (Allen mouse) | 10 | Gene level RNA seq data for 10 mouse brain regions from Allen brain atlas used for expression profiling | 15156
RNA gene clustering | 193 | Gene expression cluster analysis based on 193 brain subregions | 17832

Single cell type

Data type | Count | Description | Coverage (nr genes)
RNA expression | 31 | RNA read count for genes per cell across 31 tissues | 20082
RNA expression | 557 | RNA expression for genes across 557 clusters | 20082
RNA expression | 81 | RNA expression levels per gene and cell type | 20082

Single nuclei brain

Data type | Count | Description | Coverage (nr genes)
RNA expression | 11 | RNA read count for genes per cell across 11 brain regions | 19580
RNA expression | 260 | RNA expression for genes across 260 clusters | 19580
RNA expression | 34 | RNA expression for genes across 34 cluster types | 19580

Immune cell

Data type | Count | Description | Coverage (nr genes)
RNA expression (HPA) | 19 | RNA seq data for 19 immune cells in the HPA dataset used for expression profiling | 20162
RNA expression (HPA) | 18 | RNA seq data for 18 immune cells in the HPA dataset used for RNA classification | 20162
RNA expression (HPA) | 109 | RNA seq data for 109 immune cell samples in the HPA dataset used for expression profiling | 20162
RNA expression (HPA) | 109 | Transcript level RNA seq data for 109 immune cell samples in the HPA dataset | 20162
RNA expression (Monaco) | 30 | Monaco immune cell info | 20162
RNA expression (Schmiedel) | 15 | RNA seq data for the 15 immune cells in the Schmiedel dataset used for expression profiling and classification | 20162
RNA gene clustering | 103 | Gene expression cluster analysis based on 103 immune cell samples in the HPA dataset | 12863

Subcellular

Data type | Count | Data | Coverage (nr genes)
Protein location | 49 | Protein location data across 13534 genes | 13534

Cancer

Data type | Count | Data | Coverage (nr genes)
Prognostic data | 21 | Analysis of association between TCGA RNA expression and cancer survival in 21 cancers | 13698
Prognostic data | 10 | Validation of association between RNA expression and cancer survival in 10 cancers | 13698
Protein expression (IHC) | 20 | IHC estimated protein expression in 20 cancer types | 15302
RNA expression (TCGA) | 31 | Classification of RNA expression based on 8384 samples from 31 prognostic and validation cancers corresponding to 21 cancer types | 19973
RNA expression (TCGA) | 21 | RNA expression for 8384 samples from 21 cancer types | 19973
Protein expression (CPTAC MS) | 11 | Differential expression between cancerous and matched normal tissues across 11 cancer types | 13814

Blood disease

Data type | Count | Data | Coverage (nr genes)
Protein expression | 59 | Differential expression analysis across 59 diseases | 1162

Blood protein

Data type | Count | Data | Coverage (nr genes)
Protein concentration | | Protein concentrations across 453 genes measured with immunoassays | 453
Protein concentration | | Protein concentrations across 4294 genes measured with mass spectrometry | 4294

Cell line

Data type | Count | Data | Coverage (nr genes)
RNA expression | 1206 | RNA expression in 1206 cell lines | 20162
RNA expression | 28 | RNA expression and classification based on 28 cancer cell line groups | 20162
RNA expression | 1132 | Comparison between 1132 cancer cell lines and TCGA cancers using ranking and correlation | 19973
RNA cell line analysis | 1206 | PROGENy and CytoSig analysis on 1206 cell lines |
RNA gene clustering | 1206 | Gene expression cluster analysis based on 1206 cell lines | 19508

Interaction

Data type | Count | Data | Coverage (nr genes)
Protein-protein interaction | | Interacting protein pairs from the consensus dataset | 7712

Protein atlas data

Data from the Human Protein Atlas in tab-separated format
This file contains a subset of the data in the Human Protein Atlas version 24.0 corresponding to the data seen in the search result. This data can also be downloaded for a resulting gene set when using the search function (via the TSV link on the result page).
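
A tab-separated file of this kind can be read with Python's standard `csv` module, as in the sketch below. The column names `Gene` and `Ensembl` and the one-row sample are assumptions made for illustration and should be checked against the header of the actual file; the gene name `GENE1` is a placeholder.

```python
import csv
import io
from typing import Dict, Iterator, TextIO


def read_tsv_rows(handle: TextIO) -> Iterator[Dict[str, str]]:
    """Yield one dict per data row, keyed by the column names in the header."""
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        yield row


# Hypothetical sample with assumed column names; the real file has more columns.
SAMPLE = "Gene\tEnsembl\nGENE1\tENSG00000134057\n"

if __name__ == "__main__":
    for row in read_tsv_rows(io.StringIO(SAMPLE)):
        print(row["Ensembl"], row["Gene"])
```

For a downloaded and uncompressed file, the same function can be given a handle opened with `open(path, newline="", encoding="utf-8")`.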

Data from the Human Protein Atlas in JSON format
This file contains the same subset of the data as the above proteinatlas.tsv, but in JSON format, which may be more useful for third-party web APIs. This data can also be downloaded for a resulting gene set when using the search function (via the Download: Custom TSV/JSON link on the result page).

Data from the Human Protein Atlas in XML format
The XML file contains most of the data in the Human Protein Atlas version 24.0, including protein expression data (in normal and tumor tissues and in cell lines), antigen sequences, Western blot data for antibodies, protein array data for antibodies, RNA-seq data, external references such as UniProt identifiers, and more. The data is based on Ensembl version 109. The file structure is described in the XSD schema. This data can also be downloaded for a resulting gene set when using the search function (via the XML link on the result page). Because of its size, the XML file provided here is compressed with gzip. It can be uncompressed with an archive program such as 7-Zip.
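
As an alternative to uncompressing the file first, a gzip-compressed XML file can be streamed directly with Python's standard library. The sketch below counts elements with a given tag name without loading the whole document at once. The function name `count_elements` is hypothetical, and the tag name used in the usage comment is an assumption that should be checked against the XSD schema.

```python
import gzip
import xml.etree.ElementTree as ET


def count_elements(gz_path: str, tag: str) -> int:
    """Count elements named `tag` in a gzip-compressed XML file, streaming."""
    count = 0
    with gzip.open(gz_path, "rb") as handle:
        # "end" events fire once an element has been read completely.
        for _event, elem in ET.iterparse(handle, events=("end",)):
            # Ignore any "{namespace}" prefix on the tag name.
            if elem.tag.rsplit("}", 1)[-1] == tag:
                count += 1
                # Drop the element's children and text to limit memory use.
                elem.clear()
    return count


# Usage (file name and tag name are assumptions):
# count_elements("proteinatlas.xml.gz", "entry")
```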

Data from the Human Protein Atlas in RDF format
This file contains a subset of the data in the Human Protein Atlas version 24.0 corresponding to the tissue annotations at the gene level. This data can also be downloaded for a resulting gene set when using the search function (via the RDF link on the result page). This RDF release is a beta and will be extended and developed in coming releases. We thank Mark Thompson, Rajaram Kaliyaperumal and Eelke van der Horst (LUMC, The Netherlands), and Christine Chichester (SIB, Switzerland) for providing templates for generating the first beta release of HPA nanopublications. Their contribution was made possible by IMI project Open PHACTS and EU FP7 project RD-Connect. This beta was developed within an ELIXIR collaboration.

Cell graphic
Schematic cell containing all structures annotated within the Human Protein Atlas.