The organelle proteome

Spatial compartmentalization of biological functions is a fundamental strategy that enables multiple biological processes to occur in parallel without undesired interference. An organelle is a subunit of the eukaryotic cell with a specialized function. The name "organelle" stems from the analogy between the different roles of organelles in the cell and the different roles of organs in the human body as a whole. A distinction is often made between membrane-bound and non-membrane-bound organelles. The membrane-bound organelles, such as the nucleus and the Golgi apparatus, have a clearly defined physical boundary that separates the internal space from the outside. In contrast, non-membrane-bound organelles and subcellular compartments, like the cytoskeleton and nucleoli, constitute spatially distinct assemblies of proteins, and sometimes RNA, within the cell without a physical boundary. This partitioning of cellular components creates specific environments where the concentration of different molecules can be tailored to fit the purpose of the organelle or subcellular structure, and provides important opportunities for regulation and coordination of cellular processes.

A major function of proteins is to catalyze, conduct and control cellular processes in time and space. As different organelles and subcellular structures offer distinct environments, with distinct physiological conditions and interaction partners, the subcellular localization of a protein is an important factor for protein function. Consequently, mis-localization of proteins is associated with cellular dysfunction and various human diseases (Kau TR et al. (2004); Laurila K et al. (2009); Park S et al. (2011)). Knowledge of the spatial distribution of proteins at the subcellular level is essential for understanding protein function and molecular interactions, as well as for identifying the components of different cellular processes. Thus, studying how cells generate and maintain their spatial organization is central for understanding the mechanisms of living cells.

Within the subcellular resource, 13534 human proteins have been mapped at single-cell level to 49 different organelles and subcellular structures (Figure 1), which has enabled the definition of 14 major organelle proteomes. The final group consists of proteins localizing to substructures of the highly specialized sperm cells.

The analysis also reveals that approximately half of the proteins localize to multiple compartments and identifies many proteins with single-cell variation in terms of protein abundance and/or spatial distribution.

Subcellular localization of proteins

Several approaches for systematic analysis of protein localization have been described. The first major approach is organelle fractionation combined with quantitative mass spectrometry, which can identify the subcellular location of proteins by comparing their distribution profiles across the fractions with those of known organelle markers (Park S et al. (2011); Christoforou A et al. (2016); Itzhak DN et al. (2016)). The second major approach is to use protein-protein interaction studies to deduce the local spatial proteome of proteins. In this case, affinity purification or enzyme-mediated proximity labelling is used to deduce unknown protein subcellular locations from interaction networks overlaid with known subcellular localizations (Itzhak DN et al. (2016); Roux KJ et al. (2012); Lee SY et al. (2016)). The third major approach is imaging-based methods, which enable the exploration of the subcellular distribution of proteins in situ in single cells. These approaches are complementary, but imaging-based subcellular profiling has the advantage of effectively identifying single-cell variability and multi-organelle localization. Imaging-based approaches can be performed using tagged recombinant proteins (Huh WK et al. (2003); Simpson JC et al. (2000); Stadler C et al. (2013)) or affinity reagents, such as antibodies.
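As a rough illustration of the fractionation-based approach, the sketch below assigns a protein to the organelle whose marker profile correlates best with the protein's own distribution across fractions. It uses a simple nearest-profile rule based on Pearson correlation, which is a stand-in and not the classification method of the cited studies. The function names and all profile values are hypothetical and chosen only for demonstration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long abundance profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no localization signal
    return cov / (sd_x * sd_y)


def assign_by_marker_profile(profile, marker_profiles):
    """Return the best-matching organelle and all correlation scores."""
    scores = {
        organelle: pearson(profile, reference)
        for organelle, reference in marker_profiles.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical relative abundances across four fractions (not real data).
marker_profiles = {
    "Nucleus": [0.70, 0.20, 0.05, 0.05],
    "Mitochondria": [0.05, 0.10, 0.60, 0.25],
    "Cytosol": [0.05, 0.05, 0.10, 0.80],
}
query_profile = [0.08, 0.12, 0.55, 0.25]

best, scores = assign_by_marker_profile(query_profile, marker_profiles)
print(best)
```

A protein present in several organelles would show a mixed profile that matches no single marker well, which is one reason multi-organelle localization is harder to resolve with this kind of rule than with imaging.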

The subcellular resource employs an immunofluorescence (IF) based approach combined with confocal microscopy to enable high-resolution investigation of the spatial distribution of proteins in fixed cells (Thul PJ et al. (2017); Stadler C et al. (2013); Barbe L et al. (2008); Stadler C et al. (2010); Fagerberg L et al. (2011)). With a diffraction-limited resolution of about 200 nm, a confocal image gives detailed insights into organization at the subcellular level. The spatial distribution of the protein is investigated using indirect IF in up to three cell lines, usually U-2 OS and two additional cell lines selected from a panel of 41 human cell lines based on mRNA expression of the corresponding gene. Some proteins have also been mapped in ciliated cell lines and/or in human sperm cells. The protein of interest is visualized in green, while reference markers for microtubules (red), endoplasmic reticulum (yellow) and nucleus (blue) are used to outline the cell and the nucleus. From small dots like nuclear bodies to larger structures such as the nucleoplasm, the distinct patterns in the images together with the reference markers make it possible to precisely determine the spatial distribution of a protein within the cell. The localization of each protein is assigned to one or more of 49 organelles and subcellular structures (Figure 1).


Figure 1 panels: Nucleoplasm, Nuclear speckles, Nuclear bodies, Nucleoli, Nucleoli fibrillar center, Nucleoli rim, Mitotic chromosome, Kinetochore, Nuclear membrane, Cytosol, Cytoplasmic bodies, Rods & Rings, Aggresome, Mitochondria, Centrosome, Centriolar satellites, Microtubules, Microtubule ends, Mitotic spindle, Cytokinetic bridge, Midbody, Midbody ring, Cleavage furrow, Intermediate filaments, Actin filaments, Focal adhesion sites, Endoplasmic reticulum, Golgi apparatus, Vesicles, Endosomes, Lysosomes, Lipid droplets, Peroxisomes, Plasma membrane, Cell junctions, Primary cilium, Primary cilium tip, Primary cilium transition zone, Basal body, Acrosome, Equatorial segment, Perinuclear theca, Calyx, Connecting piece, Flagellar centriole, Midpiece, Principal piece, End piece, Annulus.

Figure 1. An example of confocal immunofluorescence images of different proteins (green) localized to each of the subcellular organelles and substructures currently annotated in the subcellular resource in a representative set of cell lines. Microtubules are marked with an anti-tubulin antibody (red) and the nucleus is counterstained with DAPI (blue). For more example images and details describing all the 49 patterns annotated in the subcellular resource, see the cell structure dictionary.

Protein distribution in human cells

Figure 2 shows the distribution of all classifications across the 49 organelles and subcellular structures for 13534 genes with protein localization data in the subcellular resource. The plot is sorted by meta-compartment: cytoplasm, nucleus, and endomembrane system. Most proteins are found in the nucleus, followed by the cytosol and vesicles, which consist of transport vesicles as well as small membrane-bound organelles like endosomes or peroxisomes. In total, 60% (n=8187) of the proteins were detected in more than one location (multilocalizing proteins), and 27% (n=3670) displayed single-cell variation in expression level or spatial distribution.

[Figure 2: bar plot. x-axis: the 49 organelles and subcellular structures; y-axis: log Genes (0 to 4); legend: Cytoplasm, Endomembrane system, Nucleus, Primary cilium, Sperm.]

Figure 2. Bar plot showing the distribution of classifications of proteins in organelles and subcellular structures in the subcellular resource. Note that one protein can localize to more than one compartment. The bars are colored according to meta compartment.
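The percentages quoted for the protein distribution can be reproduced directly from the counts given in the text and in Table 1. The short sketch below does this arithmetic and also illustrates why per-location counts add up to more than the number of mapped proteins. The function and variable names are illustrative only.

```python
TOTAL_PROTEINS = 13534  # proteins with localization data, as stated in the text


def percent_of_mapped(count, total=TOTAL_PROTEINS):
    """Share of all mapped proteins, in percent, rounded to one decimal."""
    return round(100 * count / total, 1)


# Counts taken from the text.
multilocalizing = percent_of_mapped(8187)
single_cell_variation = percent_of_mapped(3670)

# Counts for the three largest locations, taken from Table 1.
largest_locations = {"Nucleoplasm": 6244, "Cytosol": 5157, "Vesicles": 2404}
location_percent = {
    name: percent_of_mapped(n) for name, n in largest_locations.items()
}

# One protein can be counted in several locations, so location counts
# are not expected to sum to the number of proteins.
exceeds_total = sum(largest_locations.values()) > TOTAL_PROTEINS

print(multilocalizing, single_cell_variation, location_percent, exceeds_total)
```

The three largest locations alone account for more entries than there are proteins, which is consistent with the majority of proteins being multilocalizing.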

Validation of antibodies and location data

The quality and use of antibodies in research have been frequently debated (Baker M. (2015)). As antibody off-target binding can cause false positive results, all results in the subcellular resource are manually scored in terms of reliability. A reliability score is provided for every annotated location on a four-grade scale: Enhanced, Supported, Approved, and Uncertain, as described in detail in the assay & annotation section. Enhanced reliability scores are obtained through antibody validation according to one of the validation "pillars" proposed by an international working group (Uhlen M et al. (2016)): (i) genetic methods using siRNA silencing (Stadler C et al. (2012)) or CRISPR/Cas9 knock-out, (ii) expression of a fluorescent protein-tagged protein at endogenous levels (Skogs M et al. (2017)) or (iii) independent antibodies targeting different epitopes (Stadler C et al. (2010)). Supported reliability scores are given for locations that are in agreement with external experimental data (UniProtKB/Swiss-Prot database). The reliability score Approved indicates that there is no external experimental information available to confirm the observed location or that external data is only partially supportive. An Uncertain location contradicts external protein localization data or protein function, but is shown if it cannot be ruled out that the data is correct, and further experiments are needed to establish the reliability of the antibody staining. The individual location reliability scores are summarized into an overall gene reliability score. The distribution of reliability scores in the subcellular resource is shown in Figure 3. Approximately 43% (n=5826) of the protein localizations provided are Enhanced or Supported. Table 1 details the organelle distribution of all localized proteins and the distribution of reliability scores on the basis of individual organelles and subcellular structures.
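The four-grade scale can be summarized as a small decision rule. The sketch below is a simplification for illustration only: the actual scoring in the subcellular resource is done manually, the function name and evidence labels are hypothetical, and giving pillar-based validation precedence over external data is an assumption. How location scores are combined into the overall gene reliability score is not specified here and is therefore left out.

```python
from enum import IntEnum


class Reliability(IntEnum):
    """Location reliability, ordered from weakest to strongest."""
    UNCERTAIN = 1
    APPROVED = 2
    SUPPORTED = 3
    ENHANCED = 4


def score_location(validated_by_pillar, external_evidence):
    """Map evidence for one annotated location to a reliability score.

    validated_by_pillar: True if the antibody passed one of the validation
        pillars (genetic methods, tagged protein, independent antibodies).
    external_evidence: 'agrees', 'partial', 'none', or 'contradicts'
        (hypothetical labels for agreement with external experimental data).
    """
    if validated_by_pillar:  # assumed to take precedence
        return Reliability.ENHANCED
    if external_evidence == "agrees":
        return Reliability.SUPPORTED
    if external_evidence in ("partial", "none"):
        return Reliability.APPROVED
    if external_evidence == "contradicts":
        return Reliability.UNCERTAIN
    raise ValueError(f"unknown evidence label: {external_evidence!r}")


print(score_location(False, "agrees").name)
```

Using an ordered enumeration makes it straightforward to filter annotations by a minimum reliability, for example keeping only locations scored Supported or higher.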

Figure 3. Pie chart showing level of reliability of the localized proteins, where each piece is the number of proteins with one type of score, out of the four reliability scores Enhanced, Supported, Approved, and Uncertain.

Table 1. Table showing the number of proteins localized to every organelle, structure, and substructure in the subcellular resource, along with the distribution of reliability scores.

                                                   Location reliability
Location                        Proteins      %  Enhanced  Supported  Approved  Uncertain
Intermediate filaments               151    1.1        11         19        94         27
Actin filaments                      252    1.9        12         42       158         40
Focal adhesion sites                 146    1.1        10         28        86         22
Microtubules                         342    2.5        14         51       209         68
Microtubule ends                       7    0.1         0          2         4          1
Cytokinetic bridge                   216    1.6         4         18       138         56
Midbody                               53    0.4         2          1        23         27
Midbody ring                          25    0.2         1          1        20          3
Cleavage furrow                        2      0         0          0         1          1
Mitotic spindle                      134      1         2         22        75         35
Centriolar satellite                 229    1.7         8         38       125         58
Centrosome                           478    3.5         7        106       287         78
Mitochondria                        1117    8.3       120        398       513         86
Aggresome                             21    0.2         0          0        18          3
Cytosol                             5157   38.1       297       1530      2635        695
Cytoplasmic bodies                    74    0.5         2          2        44         26
Rods & Rings                          20    0.1         0          1        17          2
Endoplasmic reticulum                566    4.2        51        198       286         31
Golgi apparatus                     1256    9.3        63        214       820        159
Vesicles                            2404   17.8        95        368      1563        378
Peroxisomes                           24    0.2         6         12         5          1
Endosomes                             16    0.1         6          9         1          0
Lysosomes                             19    0.1         2         13         3          1
Lipid droplets                        40    0.3         6         11        20          3
Plasma membrane                     2196   16.2        98        712      1099        287
Cell Junctions                       330    2.4        20         90       179         41
Nucleoplasm                         6244   46.1       678       1950      2936        680
Nuclear membrane                     287    2.1        14         70       170         33
Nucleoli                            1106    8.2        92        285       565        164
Nucleoli fibrillar center            319    2.4        14         82       193         30
Nucleoli rim                         151    1.1        28         47        66         10
Nuclear speckles                     497    3.7        47        170       228         52
Nuclear bodies                       611    4.5        37        227       291         56
Kinetochore                            7    0.1         0          5         2          0
Mitotic chromosome                    75    0.6        14         16        40          5
Primary cilium                       384    2.8         2         60       198        124
Primary cilium tip                    55    0.4         0          6        28         21
Primary cilium transition zone        88    0.7         0          9        46         33
Basal body                           401      3         1         72       216        112
Acrosome                             133      1         1          9       102         21
Equatorial segment                    81    0.6         0          2        79          0
Perinuclear theca                     68    0.5         0          3        61          4
Calyx                                 70    0.5         0          4        64          2
Connecting piece                      93    0.7         0          4        86          3
Flagellar centriole                  115    0.8         0         19        94          2
Mid piece                            350    2.6         2         24       307         17
Principal piece                      387    2.9         2         30       318         37
End piece                            194    1.4         0         19       159         16
Annulus                               50    0.4         0          5        44          1
Number of proteins                 13534    100      1364       5180      7979       2212

Relevant links and publications

Kau TR et al., Nuclear transport and cancer: from mechanism to intervention. Nat Rev Cancer. (2004)
PubMed: 14732865 DOI: 10.1038/nrc1274

Laurila K et al., Prediction of disease-related mutations affecting protein localization. BMC Genomics. (2009)
PubMed: 19309509 DOI: 10.1186/1471-2164-10-122

Park S et al., Protein localization as a principal feature of the etiology and comorbidity of genetic diseases. Mol Syst Biol. (2011)
PubMed: 21613983 DOI: 10.1038/msb.2011.29

Christoforou A et al., A draft map of the mouse pluripotent stem cell spatial proteome. Nat Commun. (2016)
PubMed: 26754106 DOI: 10.1038/ncomms9992

Itzhak DN et al., Global, quantitative and dynamic mapping of protein subcellular localization. Elife. (2016)
PubMed: 27278775 DOI: 10.7554/eLife.16950

Roux KJ et al., A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. J Cell Biol. (2012)
PubMed: 22412018 DOI: 10.1083/jcb.201112098

Lee SY et al., APEX Fingerprinting Reveals the Subcellular Localization of Proteins of Interest. Cell Rep. (2016)
PubMed: 27184847 DOI: 10.1016/j.celrep.2016.04.064

Huh WK et al., Global analysis of protein localization in budding yeast. Nature. (2003)
PubMed: 14562095 DOI: 10.1038/nature02026

Simpson JC et al., Systematic subcellular localization of novel proteins identified by large-scale cDNA sequencing. EMBO Rep. (2000)
PubMed: 11256614 DOI: 10.1093/embo-reports/kvd058

Stadler C et al., Immunofluorescence and fluorescent-protein tagging show high correlation for protein localization in mammalian cells. Nat Methods. (2013)
PubMed: 23435261 DOI: 10.1038/nmeth.2377

Thul PJ et al., A subcellular map of the human proteome. Science. (2017)
PubMed: 28495876 DOI: 10.1126/science.aal3321

Barbe L et al., Toward a confocal subcellular atlas of the human proteome. Mol Cell Proteomics. (2008)
PubMed: 18029348 DOI: 10.1074/mcp.M700325-MCP200

Stadler C et al., A single fixation protocol for proteome-wide immunofluorescence localization studies. J Proteomics. (2010)
PubMed: 19896565 DOI: 10.1016/j.jprot.2009.10.012

Fagerberg L et al., Mapping the subcellular protein distribution in three human cell lines. J Proteome Res. (2011)
PubMed: 21675716 DOI: 10.1021/pr200379a

Baker M., Reproducibility crisis: Blame it on the antibodies. Nature. (2015)
PubMed: 25993940 DOI: 10.1038/521274a

Uhlen M et al., A proposal for validation of antibodies. Nat Methods. (2016)
PubMed: 27595404 DOI: 10.1038/nmeth.3995

Stadler C et al., Systematic validation of antibody binding and protein subcellular localization using siRNA and confocal microscopy. J Proteomics. (2012)
PubMed: 22361696 DOI: 10.1016/j.jprot.2012.01.030

Skogs M et al., Antibody Validation in Bioimaging Applications Based on Endogenous Expression of Tagged Proteins. J Proteome Res. (2017)
PubMed: 27723985 DOI: 10.1021/acs.jproteome.6b00821