The organelle proteome
Spatial compartmentalization of biological functions is a fundamental strategy that enables multiple biological processes to occur in parallel without undesired interference. An organelle is a subunit of the eukaryotic cell with a specialized function. The name "organelle" stems from the analogy between the roles of organelles in the cell and the roles of organs in the human body. A distinction is often made between membrane-bound and non-membrane-bound organelles. Membrane-bound organelles, such as the nucleus and the Golgi apparatus, have a clearly defined physical boundary that separates the internal space from the outside. In contrast, non-membrane-bound organelles and subcellular compartments, like the cytoskeleton and nucleoli, constitute spatially distinct assemblies of proteins, and sometimes RNA, within the cell without a physical boundary. This partitioning of cellular components creates specific environments where the concentration of different molecules can be tailored to fit the purpose of the organelle or subcellular structure, and provides important opportunities for regulation and coordination of cellular processes.
A major function of proteins is to catalyze, conduct and control cellular processes in time and space. As different organelles and subcellular structures offer distinct environments, with distinct physiological conditions and interaction partners, the subcellular localization of a protein is an important factor for protein function. Consequently, mislocalization of proteins is associated with cellular dysfunction and various human diseases (Kau TR et al. (2004); Laurila K et al. (2009); Park S et al. (2011)). Knowledge of the spatial distribution of proteins at the subcellular level is essential for understanding protein function and molecular interactions, as well as for identifying the components of different cellular processes.
Thus, studying how cells generate and maintain their spatial organization is central to understanding the mechanisms of living cells. Within the subcellular resource, 13534 human proteins have been mapped at single-cell level to 49 different organelles and subcellular structures (Figure 1), which has enabled the definition of 14 major organelle proteomes. The final group consists of proteins localizing to sub of the highly specialized sperm cells have been grouped into a s
The analysis also reveals that more than half of the proteins localize to multiple compartments and identifies many proteins with single-cell variation in protein abundance and/or spatial distribution.
Subcellular localization of proteins
Several approaches for systematic analysis of protein localization have been described. The first major approach is organelle fractionation combined with quantitative mass spectrometry, which identifies the subcellular location of proteins by comparing their distribution profiles across the fractions with those of known organelle markers (Park S et al. (2011); Christoforou A et al. (2016); Itzhak DN et al. (2016)). The second major approach is to use protein-protein interaction studies to deduce the local spatial proteome of proteins. In this case, affinity purification or enzyme-mediated proximity labelling is used to deduce unknown subcellular locations from interaction networks overlaid with known subcellular localizations (Itzhak DN et al. (2016); Roux KJ et al. (2012); Lee SY et al. (2016)). The third major approach is imaging-based methods, which enable the exploration of the subcellular distribution of proteins in situ in single cells. These approaches are complementary, but imaging-based subcellular profiling has the advantage of effectively identifying single-cell variability and multi-organelle localization. Imaging-based approaches can be performed using tagged recombinant proteins (Huh WK et al. (2003); Simpson JC et al. (2000); Stadler C et al. (2013)) or affinity reagents, such as antibodies. The subcellular resource employs an immunofluorescence (IF) based approach combined with confocal microscopy to enable high-resolution investigation of the spatial distribution of proteins in fixed cells (Thul PJ et al. (2017); Stadler C et al. (2013); Barbe L et al. (2008); Stadler C et al. (2010); Fagerberg L et al. (2011)).
With a diffraction-limited resolution of about 200 nm, a confocal image gives detailed insights into organization at the subcellular level. The spatial distribution of the protein is investigated using indirect IF in up to three cell lines, usually U-2 OS and two additional cell lines selected from a panel of 41 human cell lines based on mRNA expression of the corresponding gene. Some proteins have also been mapped in ciliated cell lines and/or in human sperm cells. The protein of interest is visualized in green, while reference markers for microtubules (red), endoplasmic reticulum (yellow) and nucleus (blue) are used to outline the cell and the nucleus. From small dots like nuclear bodies to larger structures such as the nucleoplasm, the distinct patterns in the images, together with the reference markers, make it possible to precisely determine the spatial distribution of a protein within the cell. The localization of each protein is assigned to one or more of 49 organelles and subcellular structures (Figure 1).
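The resolution of about 200 nm quoted above is consistent with the Abbe diffraction limit for lateral resolution. As a rough illustration only, the emission wavelength and numerical aperture below are assumed values typical of green fluorescence and a high-NA oil immersion objective; they are not taken from the subcellular resource.

```latex
d = \frac{\lambda}{2\,\mathrm{NA}},
\qquad
\lambda = 520\ \text{nm},\ \mathrm{NA} = 1.4
\;\Rightarrow\;
d = \frac{520\ \text{nm}}{2.8} \approx 186\ \text{nm}
```

Here d is the smallest resolvable lateral distance, λ the wavelength of the detected light, and NA the numerical aperture of the objective. Structures closer together than d cannot be separated in the image, which is why patterns such as nuclear bodies appear as small dots rather than resolved assemblies.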
Figure 1. Examples of confocal immunofluorescence images of different proteins (green) localized to each of the organelles and subcellular structures currently annotated in the subcellular resource, in a representative set of cell lines. Microtubules are marked with an anti-tubulin antibody (red) and the nucleus is counterstained with DAPI (blue). For more example images and details describing all 49 patterns annotated in the subcellular resource, see the cell structure dictionary.
Protein distribution in human cells
Figure 2 shows the distribution of all classifications across the 49 organelles and subcellular structures for the 13534 genes with protein localization data in the subcellular resource. The plot is sorted by meta-compartment: cytoplasm, nucleus, and endomembrane system. Most proteins are found in the nucleus, followed by the cytosol and vesicles, which consist of transport vesicles as well as small membrane-bound organelles like endosomes or peroxisomes. 60% (n=8187) of the proteins were detected in more than one location (multilocalizing proteins), and 27% (n=3670) displayed single-cell variation in expression level or spatial distribution.
Figure 2. Bar plot showing the distribution of classifications of proteins in organelles and subcellular structures in the subcellular resource. Note that one protein can localize to more than one compartment. The bars are colored according to meta-compartment.
Validation of antibodies and location data
The quality and use of antibodies in research have been frequently debated (Baker M. (2015)). As antibody off-target binding can cause false positive results, all results in the subcellular resource are manually scored for reliability. A reliability score is provided for every annotated location on a four-grade scale: Enhanced, Supported, Approved, and Uncertain, as described in detail in the assay & annotation section.
Enhanced reliability scores are obtained through antibody validation according to one of the validation "pillars" proposed by an international working group (Uhlen M et al. (2016)): (i) genetic methods using siRNA silencing (Stadler C et al. (2012)) or CRISPR/Cas9 knock-out, (ii) expression of a fluorescent protein-tagged protein at endogenous levels (Skogs M et al. (2017)) or (iii) independent antibodies targeting different epitopes (Stadler C et al. (2010)). Supported reliability scores are given for locations that are in agreement with external experimental data (UniProtKB/Swiss-Prot database). The reliability score Approved indicates that there is no external experimental information available to confirm the observed location, or that external data is only partially supportive. An Uncertain location contradicts external protein localization data or known protein function, but is shown if it cannot be ruled out that the data is correct; further experiments are needed to establish the reliability of the antibody staining. The individual location reliability scores are summarized into an overall gene reliability score. The distribution of reliability scores in the subcellular resource is shown in Figure 3. Approximately 43% (n=5826) of the protein localizations provided are Enhanced or Supported. Table 1 details the organelle distribution of all localized proteins and the distribution of reliability scores for individual organelles and subcellular structures.
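As a quick arithmetic check, the counts quoted in this section can be converted back into percentages. The sketch below assumes that all three counts share the same denominator, the 13534 mapped proteins; the text does not state this explicitly for the reliability figure, and the dictionary keys are illustrative names only.

```python
# Counts quoted in the text. Using 13534 as the common denominator
# is an assumption made for this check.
TOTAL_MAPPED = 13534

counts = {
    "multilocalizing": 8187,
    "single_cell_variation": 3670,
    "enhanced_or_supported": 5826,
}


def share_percent(count: int, total: int = TOTAL_MAPPED) -> float:
    """Return count as a percentage of total."""
    return 100.0 * count / total


# Percentage of mapped proteins in each category.
shares = {name: share_percent(n) for name, n in counts.items()}

for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")
```

Rounded to whole numbers, the three shares match the 60%, 27% and 43% given in the text, which suggests the same denominator was used throughout.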
Figure 3. Pie chart showing the reliability of the localized proteins, where each slice is the number of proteins with one of the four reliability scores: Enhanced, Supported, Approved, and Uncertain.
Table 1. Number of proteins localized to every organelle, structure, and substructure in the subcellular resource, along with the distribution of reliability scores.
Relevant links and publications
Kau TR et al., Nuclear transport and cancer: from mechanism to intervention. Nat Rev Cancer. (2004)