Explore the Subcellular location UMAP

Uniform Manifold Approximation and Projection (UMAP) is an analytical technique for reducing the dimensionality of a data set (Becht E et al. (2018)). The subcellular location UMAP is generated from the large collection of confocal microscopy images showing the subcellular localization patterns of human proteins. A machine learning model trained to classify the subcellular locations in these images is used to extract 1024 features from each image in the subcellular section of the Human Protein Atlas (Ouyang W et al. (2019)). UMAP then reduces this 1024-dimensional dataset to two or three dimensions, and the result is displayed as a two- or three-dimensional scatter plot in which each data point represents one image. This tool provides a new way to visualize and explore the high-dimensional protein localization data that makes up the subcellular section, and to find images that group together based on similarity of these features. When the data points are colored according to subcellular localization, it is evident that images of proteins localizing to the same compartment tend to cluster together. Overlaying the UMAP projection with different data can help you find new staining patterns and identify interesting groups of genes in a large and complex data set.
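The observation that images from the same compartment tend to cluster together can be illustrated with a minimal sketch. It assumes the 2D UMAP coordinates are already available; the coordinates, the two example locations, and the function names below are hypothetical and are not taken from the Human Protein Atlas data or code. The sketch compares how far points lie from the centroid of their own location with how far they lie from the centroid of all points.

```python
from collections import defaultdict
from math import dist
from statistics import mean

# Hypothetical 2D UMAP coordinates, each with one annotated location.
points = [
    ((0.0, 0.0), "Nucleoplasm"),
    ((0.2, 0.1), "Nucleoplasm"),
    ((0.1, 0.3), "Nucleoplasm"),
    ((5.0, 5.0), "Mitochondria"),
    ((5.2, 4.9), "Mitochondria"),
    ((4.8, 5.1), "Mitochondria"),
]


def centroid(coords):
    """Return the mean (x, y) position of a list of 2D coordinates."""
    xs, ys = zip(*coords)
    return (mean(xs), mean(ys))


def spread_by_location(points):
    """Mean distance of each location's points to that location's centroid."""
    groups = defaultdict(list)
    for coord, location in points:
        groups[location].append(coord)
    return {
        location: mean(dist(c, centroid(coords)) for c in coords)
        for location, coords in groups.items()
    }


def overall_spread(points):
    """Mean distance of all points to the centroid of the whole plot."""
    coords = [coord for coord, _ in points]
    center = centroid(coords)
    return mean(dist(c, center) for c in coords)


within = spread_by_location(points)
overall = overall_spread(points)
```

If every value in `within` is clearly smaller than `overall`, the points for each location form tighter groups than the plot as a whole, which is what the colored UMAP shows visually.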

Clicking a data point in the plot displays the corresponding image together with information about gene name, cell line, annotated subcellular location(s), and antibody. The legend below the UMAP can be used to toggle the different subcellular locations on and off in the UMAP. Click on one location in the legend to only display data points for images with an annotation of that structure. You can select multiple subcellular locations at the same time. Clicking again on one of the selected subcellular locations will deselect it, while clicking on Clear filter will reset and display all data points in the UMAP again. Images with annotations of multiple locations, representing multilocalizing proteins, are shown in grey.
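The legend behavior described above can be sketched as a small filter over image annotations. This is an illustration only, not the site's implementation: the gene names, the color map, and the function names are hypothetical, and the sketch assumes that selecting several locations shows images annotated with any of them.

```python
GREY = "grey"

# Hypothetical color assignment for a few locations.
LOCATION_COLORS = {
    "Nucleoplasm": "blue",
    "Cytosol": "green",
    "Mitochondria": "red",
}

# Hypothetical images with their annotated subcellular locations.
images = [
    {"gene": "GENE_A", "locations": ["Nucleoplasm"]},
    {"gene": "GENE_B", "locations": ["Cytosol"]},
    {"gene": "GENE_C", "locations": ["Nucleoplasm", "Cytosol"]},
    {"gene": "GENE_D", "locations": ["Mitochondria"]},
]


def toggle(selected, location):
    """Return a new selection with `location` switched on or off."""
    return selected ^ {location}


def visible(images, selected):
    """Images to display; an empty selection (Clear filter) shows all."""
    if not selected:
        return list(images)
    # Assumption: an image is shown if it has any selected location.
    return [img for img in images if selected & set(img["locations"])]


def color(img):
    """Multilocalizing images are grey; others use their location color."""
    locations = img["locations"]
    if len(locations) > 1:
        return GREY
    return LOCATION_COLORS.get(locations[0], GREY)
```

For example, `visible(images, toggle(set(), "Nucleoplasm"))` returns the images for GENE_A and GENE_C, and toggling "Nucleoplasm" a second time returns to the empty selection, which displays all images again.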

A strength of the HPA database is the gene-centric integration of a large collection of different datasets. The search function allows you to search for an individual gene, but also to perform complex filtering of the data points in the UMAP. Using pre-defined search terms, images can be filtered based on general gene information (e.g. gene name or chromosome location) as well as data from all the different sections of the HPA (e.g. tissue expression or prognostic cancer association). Read more about how to use the search function here.
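Gene-centric filtering of this kind can be sketched as matching field and term pairs against per-gene records and keeping the images whose gene matches. The field names, terms, gene names, and functions below are hypothetical and do not reflect the actual HPA search syntax; the sketch also assumes that all criteria must match.

```python
# Hypothetical gene-level records combining data from different sections.
genes = {
    "GENE_A": {"chromosome": "1", "tissue_specificity": "Tissue enriched"},
    "GENE_B": {"chromosome": "1", "tissue_specificity": "Low tissue specificity"},
    "GENE_C": {"chromosome": "X", "tissue_specificity": "Tissue enriched"},
}

# Hypothetical images, each linked to one gene.
images = [
    {"image_id": 1, "gene": "GENE_A"},
    {"image_id": 2, "gene": "GENE_A"},
    {"image_id": 3, "gene": "GENE_B"},
    {"image_id": 4, "gene": "GENE_C"},
]


def matches(record, criteria):
    """True if the gene record has the given term for every field."""
    return all(record.get(field) == term for field, term in criteria)


def filter_images(images, genes, criteria):
    """Keep images whose gene satisfies all (field, term) criteria."""
    return [
        img for img in images
        if matches(genes.get(img["gene"], {}), criteria)
    ]


criteria = [("chromosome", "1"), ("tissue_specificity", "Tissue enriched")]
selected = filter_images(images, genes, criteria)
```

With these example records, `selected` contains the two images for GENE_A, since it is the only gene on chromosome 1 that is also marked as tissue enriched.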


Images: 76094

Genes: 13083

Plot axes: UMAP1, UMAP2


        Legend - click to toggle in UMAP

        Multilocalizing

        Cytoplasm

        Actin filaments
        Aggresome
        Centriolar satellite
        Centrosome
        Cleavage furrow
        Cytokinetic bridge
        Cytoplasmic bodies
        Cytosol
        Focal adhesion sites
        Intermediate filaments
        Microtubule ends
        Microtubules
        Midbody
        Midbody ring
        Mitochondria
        Mitotic spindle
        Rods & Rings

        Endomembrane system

        Cell Junctions
        Endoplasmic reticulum
        Endosomes
        Golgi apparatus
        Lipid droplets
        Lysosomes
        Peroxisomes
        Plasma membrane
        Vesicles

        Nucleus

        Kinetochore
        Mitotic chromosome
        Nuclear bodies
        Nuclear membrane
        Nuclear speckles
        Nucleoli
        Nucleoli fibrillar center
        Nucleoli rim
        Nucleoplasm