The metabolic proteome

Metabolism is a complex network of biochemical reactions in which various compounds are created, decomposed and interconverted to meet the cell’s demand for energy, building blocks and waste disposal. These reactions need to happen in an orchestrated way: their direction, rates, and localization are tightly controlled. This provides a timely and sufficient, but not excessive, supply of the necessary compounds, such as nutrients, structural components of the cell or signaling molecules. A total of 2882 human proteins are involved in cell metabolism.

Many biochemical reactions naturally happen at extremely slow rates, yield a mix of products instead of a specific desired compound, or are reversible and reach an equilibrium, where the substrate and the product coexist. To solve these problems, cells employ a special class of proteins, called enzymes, as catalysts and regulatory points. A classic example is glycolysis, the process by which glucose is converted into pyruvate, with the released energy stored in a form that the cell can use. The first glycolytic reaction is catalyzed by hexokinase (HK1, HK2, HK3), which converts glucose to glucose-6-phosphate (G6P). In cells, this reaction is coupled to the hydrolysis of ATP. The combined reaction is thermodynamically favorable and could happen spontaneously in solution, but it has a high activation energy. HK1 binds both ATP and glucose in a way that brings the two substrates close together in space and lowers the activation energy, facilitating the reaction that produces G6P. Additionally, HK1 is inhibited by its own product, which prevents the enzyme from using up all available glucose and ATP. Enzymes usually have more than one regulatory mechanism, which allows the cell to fine-tune their activity to match its needs.
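The coupling to ATP hydrolysis can be made concrete with a short worked sum. The standard free energy values below are approximate, commonly quoted textbook figures and are not taken from this resource. They are included only to illustrate why the combined reaction is favorable even though phosphorylating glucose with free phosphate is not.

```latex
\begin{align*}
\text{Glucose} + \mathrm{P_i} &\rightarrow \text{G6P} + \mathrm{H_2O} & \Delta G^{\circ\prime} &\approx +13.8\ \text{kJ/mol} \\
\text{ATP} + \mathrm{H_2O} &\rightarrow \text{ADP} + \mathrm{P_i} & \Delta G^{\circ\prime} &\approx -30.5\ \text{kJ/mol} \\
\text{Glucose} + \text{ATP} &\rightarrow \text{G6P} + \text{ADP} & \Delta G^{\circ\prime} &\approx -16.7\ \text{kJ/mol}
\end{align*}
```

The negative sum describes only the thermodynamics. The rate is still limited by the activation energy, which is the part the enzyme lowers.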

Metabolic reactions are linked in pathways

Direct conversion of a substrate into the required product might not always be possible. In this case, the conversion usually happens in several steps. Each of the individual reactions can be catalyzed by a different enzyme, which uses the product of the previous step as its substrate. Such linked reactions are called metabolic pathways (Figure 1). In the case of glycolysis, the product of HK1, G6P, is used by another enzyme, glucose-6-phosphate isomerase (GPI), which converts it to fructose-6-phosphate (F6P). G6P is, therefore, removed and cannot inhibit the previous step in the pathway. At the same time, this reaction is reversible, and its direction is controlled by the concentrations of G6P and F6P. At high levels of F6P, the same enzyme will work in reverse, producing G6P and impeding its synthesis by HK1. The product of a reaction in one metabolic pathway can also serve as a substrate for a different pathway, connecting the individual pathways into metabolic networks. The flow of the reactions in these networks is determined by the concentrations of all the substrates and products, as well as the amount and activity of all enzymes involved. Regulating metabolic flows allows the cell to adapt to changing internal and external stimuli. Metabolic pathways and subsystems can be explored in more detail within the Metabolic maps resource of the Human Protein Atlas (HPA) or in the Metabolic Atlas (Robinson JL et al. (2020); Li F et al. (2023)).
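How concentrations set the direction of a reversible step such as the GPI reaction can be sketched with the standard relation ΔG = ΔG°' + RT ln([product]/[substrate]). The snippet below is a minimal illustration, not part of the HPA resource. The standard free energy of about +1.7 kJ/mol for G6P to F6P is an assumed textbook value, the concentrations are arbitrary, and the function names are hypothetical.

```python
import math

R_KJ = 8.314e-3  # gas constant in kJ/(mol*K)


def delta_g(delta_g0_prime, product_conc, substrate_conc, temperature_k=310.15):
    """Actual free energy change (kJ/mol) for substrate -> product."""
    ratio = product_conc / substrate_conc  # mass-action ratio Q
    return delta_g0_prime + R_KJ * temperature_k * math.log(ratio)


def direction(dg, tolerance=1e-9):
    """Net direction of a reversible reaction for a given free energy change."""
    if dg < -tolerance:
        return "forward"
    if dg > tolerance:
        return "reverse"
    return "equilibrium"


# Assumed textbook value for G6P -> F6P; not taken from the source.
GPI_DELTA_G0_PRIME = 1.7

# Arbitrary illustrative concentrations (same units for both metabolites).
low_f6p = delta_g(GPI_DELTA_G0_PRIME, product_conc=0.1, substrate_conc=1.0)
high_f6p = delta_g(GPI_DELTA_G0_PRIME, product_conc=10.0, substrate_conc=1.0)

print(direction(low_f6p))   # F6P is scarce relative to G6P
print(direction(high_f6p))  # F6P has accumulated
```

With little F6P present the net flow runs from G6P to F6P, and once F6P accumulates the same calculation favors the reverse direction, which matches the behavior described for GPI.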

Interactive pathway explorer

Figure 1. Interactive metabolic pathway explorer. Different metabolic pathways can be accessed with a drop-down menu. Pathways are allocated to the subcellular compartments based on the Metabolic Atlas database. Metabolic enzymes are shown in rectangles, metabolites in light-blue ellipses, and reactions in white diamonds. Gray arrows indicate the direction of the reactions. Individual proteins are color-coded based on their main subcellular locations in HPA. Proteins that show single-cell variability (SCV) in HPA data (for example, due to cell cycle) are marked with a cyan tag in the upper right corner. Clicking on a protein brings up an information box with an immunofluorescent image (if available), subcellular locations, biological processes and ligands (from UniProt), as well as the number of known interacting proteins. Clicking the name of the protein in the box opens the protein summary page.

Spatial organization of metabolism

An important aspect of metabolism is the spatial context (Bar-Peled L et al. (2022)). For a reaction to happen, all its components, including the substrates and the enzyme, have to be in physical proximity. Transporter proteins, pores and channels are often needed to bring the substrates to the place where the reaction should happen, particularly if the substrate needs to cross a membrane. Changes in the expression levels and localization of those proteins affect the concentrations of the substrates and, by extension, the possible biochemical reactions. A good example is the glucose transporters, which facilitate the uptake of glucose by cells. Their presence on the cell membrane is regulated by the levels of available glucose and insulin. Translocation of glucose transporters is often dysregulated in insulin-resistant cells, as seen in type 2 diabetes.

Enzymes are often organized into larger enzymatic complexes, with individual proteins catalyzing one or a few steps in the overall reaction. The pyruvate dehydrogenase complex, for example, consists of three different catalytic subunits (pyruvate dehydrogenase, dihydrolipoyl transacetylase and dihydrolipoyl dehydrogenase) that together convert pyruvate into acetyl-CoA. Each of these subunits has multiple proteins or multiple copies of the same protein. Another example is ATP synthase, which produces ATP during oxidative phosphorylation in mitochondria. This large complex consists of proteins encoded by 20 different genes, but only a few of these proteins have catalytic activity (Song J et al. (2018)).

Enzymes of a specific metabolic pathway tend to be found in the same intracellular compartment (Figure 2). One of the most notable examples is oxidative phosphorylation, which occurs exclusively in the mitochondria, and this is where most of the associated proteins are found (Figure 3). At the same time, certain pathways, such as cholesterol synthesis, span multiple cellular compartments (Maxfield FR et al. (2010)). Pathways with fewer enzymes tend to be more spatially confined than pathways with more enzymes, with some notable exceptions, such as oxidative phosphorylation or aminoacyl-tRNA biosynthesis (Figure 2C). Metabolic systems that share intermediates, such as lipid synthesis and lipid degradation, also share subcellular localizations. However, lipid synthesis proteins are more often found in the endoplasmic reticulum and lipid degradation proteins in mitochondria, reflecting metabolic compartmentalization (Figure 3).

Figure 2. Cell images reveal subcellular localization of the metabolic proteome. (A) UMAP visualization of the image features of the entire HPA image dataset, showing metabolic proteins (Ouyang W et al. (2019)). Protein images with a single location are colored according to this location; gray data points correspond to multilocalizing proteins. (B) Pairwise cosine distances calculated for proteins within the same pathway, pathway group, and across random controls in the latent space of the image-classification model. Center line, median; box, first (Q1) and third (Q3) quartiles; whiskers, 1.5× interquartile range (IQR) below Q1 and above Q3. Enzymes within the same pathways are significantly more compartmentalized compared to random protein sets. (C) Logarithmic correlation between the cosine distance from (A) and the number of proteins in a pathway. Cholesterol biosynthesis (red), oxidative phosphorylation (blue) and aminoacyl-tRNA biosynthesis (yellow) are highlighted (Gnann C et al. (2024)).
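The comparison in Figure 2B can be sketched as follows: compute the cosine distance for every pair of image feature vectors in a protein set, then compare the median for a pathway with that of a random set. This is a minimal illustration under stated assumptions and not the published analysis. The two-dimensional vectors are invented toy data, and the function names are hypothetical.

```python
import math
from itertools import combinations
from statistics import median


def cosine_distance(u, v):
    """Cosine distance: 1 minus the cosine of the angle between u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def median_pairwise_distance(vectors):
    """Median cosine distance over all unordered pairs of vectors."""
    return median(cosine_distance(u, v) for u, v in combinations(vectors, 2))


# Toy feature vectors (hypothetical, for illustration only).
pathway_features = [[1.0, 0.1], [0.9, 0.2], [1.0, 0.0]]  # similar localization
random_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # unrelated proteins

print(median_pairwise_distance(pathway_features))
print(median_pairwise_distance(random_features))
```

A smaller median distance within a pathway than within a random set corresponds to the tighter compartmentalization described in the caption. A real analysis would use the high-dimensional features of the image-classification model and many random controls.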

Figure 3. Metabolic pathways show a high level of protein multilocalization. Oxidative phosphorylation occurs in mitochondria and has relatively few multilocalizing proteins. Cholesterol synthesis spans multiple compartments, including the endoplasmic reticulum, cytosol and plasma membrane. The tricarboxylic acid cycle takes place in mitochondria, but shows connections to cytosol and nuclear structures. Glycolysis occurs in the cytosol, but most of its proteins also localize to the nucleus, plasma membrane and vesicles. Larger metabolic systems that share intermediates, such as lipid synthesis and lipid degradation, also share subcellular localizations. However, lipid synthesis proteins are more often found in the endoplasmic reticulum (orange), whereas lipid degradation proteins are more often found in mitochondria (yellow) and vesicles, including peroxisomes (magenta). Specific intracellular compartments are color-coded across the panels. Ticks and numbers indicate the protein count in a specific compartment. Each connecting line represents a single multilocalizing protein. Pathway annotations are based on protein functions described in UniProt.

As discussed above, the localization of enzymes can affect their activity through the availability of substrates and of possible interaction partners. In fact, enzymes are overrepresented among multilocalizing proteins, that is, proteins found in more than one subcellular compartment. Multilocalization has been shown for 1199 metabolic proteins. Examples of multilocalizing proteins are given in Figure 4.

Image panels: ALDH3A1 (A-549), HMGCS1 (U2OS), ALAS1 (U-251MG), NDUFB4 (PC-3), GCH1 (U2OS), SLC7A5 (U2OS), CERT1 (A-431), GAPDH (MCF-7), PDHB (A-431).

Figure 4. Examples of metabolic proteins localizing to multiple compartments. ALDH3A1 is involved in detoxification of aldehydes (detected in nucleoplasm, vesicles and cytosol of A-549 cells). HMGCS1 participates in cholesterol biosynthesis (detected in nucleoplasm and cytosol of U2OS cells). ALAS1 catalyzes the rate-limiting step in heme biosynthesis (detected in nucleoplasm and mitochondria of MCF-7 cells). NDUFB4 is part of the mitochondrial respiratory chain (detected in mitochondria and nuclear membrane of PC-3 cells). GCH1 is the rate-limiting enzyme in tetrahydrobiopterin biosynthesis (detected in nucleoplasm and cytosol of U2OS cells). SLC7A5 is an amino acid transporter (detected in cytoplasm and plasma membrane of U2OS cells). CERT1 is involved in ceramide transport (detected in Golgi apparatus and nucleoplasm of A-431 cells). GAPDH is a key enzyme in glycolysis, but is also known to regulate stress response and cell death (detected in cytoplasm, nuclear membrane and plasma membrane of MCF-7 cells). PDHB links glycolysis to the tricarboxylic acid cycle (detected in mitochondria and nucleoplasm of A-431 cells). The target protein is shown in green, microtubules in red, and the nucleus in blue. In some images, certain channels are omitted for better presentation.

Certain multilocalizing enzymes preserve their function in different compartments. NOX2 (CYBB) is an enzyme that immune cells use to produce reactive oxygen species in a regulated manner. It can be found either on the plasma membrane, where it contributes to extracellular antimicrobial activity, or in phagosomes, where it is essential for killing internalized pathogens. In contrast, some enzymes perform different functions when found in different compartments; these are moonlighting metabolic proteins. GAPDH, a vital glycolytic enzyme, is normally found in the cytoplasm. In the presence of stressors, however, a smaller portion of it can undergo additional posttranslational modifications and translocate to the nucleus. Nuclear GAPDH has a completely different set of interaction partners than the cytoplasmic form and serves as an intracellular sensor, participating in the stress response. ENO1, another glycolytic enzyme, performs its main function in the cytoplasm, but can also be found on the plasma membrane, where it serves as a plasminogen receptor and promotes cancer progression. Overlaying spatial proteomic data with protein-protein interaction measurements can reveal the distinct interaction partners in each of the possible locations of a multilocalizing protein, giving hints at its multiple functions.
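The overlay of localization and interaction data described above can be sketched as a simple grouping step: for each interaction partner of a protein, keep the compartments that both proteins share. This is a minimal sketch under stated assumptions, not an HPA method. The protein names, locations and interactions are invented placeholders, and the function name is hypothetical.

```python
from collections import defaultdict


def partners_by_location(protein, locations, interactions):
    """Group interaction partners of `protein` by the compartments they share with it."""
    grouped = defaultdict(set)
    own_locations = locations.get(protein, set())
    for a, b in interactions:
        if protein not in (a, b):
            continue  # this interaction does not involve the protein of interest
        partner = b if a == protein else a
        # Only compartments where both proteins were detected are candidates.
        for compartment in own_locations & locations.get(partner, set()):
            grouped[compartment].add(partner)
    return dict(grouped)


# Toy data (hypothetical placeholders, not real annotations).
locations = {
    "ENZ": {"cytosol", "nucleoplasm"},
    "P1": {"cytosol"},
    "P2": {"nucleoplasm"},
    "P3": {"mitochondria"},
}
interactions = [("ENZ", "P1"), ("P2", "ENZ"), ("ENZ", "P3"), ("P1", "P2")]

print(partners_by_location("ENZ", locations, interactions))
```

Partners that share no compartment with the protein, such as P3 in this toy set, drop out. In real data such cases might reflect incomplete localization annotation rather than a false interaction, so they would deserve a closer look instead of being discarded.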

Single-cell variation of metabolic proteins

Perhaps unsurprisingly, metabolic proteins show large-scale single-cell variation (824 out of 2050 metabolic proteins with known subcellular localization). Individual cells within a population tend to have different levels of enzyme expression, as well as different preferred localizations in the case of multilocalizing proteins (Figure 5).

Image panels (cell line and pancreas tissue): PHGDH (U2OS, pancreas), CHST10 (U-251MG, pancreas), PFKM (U-251MG, pancreas), NDUFA1 (HEK293, pancreas), RPN1 (A-431, pancreas), HMGCS1 (U2OS, pancreas).

Figure 5. Single-cell variation of metabolic proteins can be found in cell lines and tissues (pancreatic acinar cells) across different metabolic pathway groups (Gnann C et al. (2024)). The target protein is shown in green in the immunofluorescent images.

SCV is observed across different metabolic pathways and is often present in more than one cell line, as well as in tissues (Figure 5). This variation is not genetic and is re-established in clonal cell populations as the number of cells increases (Gnann C et al. (2024)). We hypothesize that enzymatic SCV reflects the cell’s metabolic state, both in terms of the current metabolic activity and the buffering potential, which can be used to quickly adapt to a changing environment. Such metabolic states can be difficult to detect using a single omics approach, so future simultaneous spatial profiling of multiple omics layers may provide novel insights into the regulation of cellular metabolism.

Our resource provides a robust baseline for studying metabolic compartmentalization and variation at the subcellular level, constructing spatially resolved proteome-wide metabolic flux models, as well as manipulating metabolic states and developing new treatment designs.

Relevant links and publications

Gnann C et al., Dissecting autonomous enzyme variability in single cells. bioRxiv. (2024) DOI: 10.1101/2024.10.03.616530

Robinson JL et al., An atlas of human metabolism. Sci Signal. (2020)
PubMed: 32209698 DOI: 10.1126/scisignal.aaz1482

Li F et al., GotEnzymes: an extensive database of enzyme parameter predictions. Nucleic Acids Res. (2023)
PubMed: 36169223 DOI: 10.1093/nar/gkac831

Bar-Peled L et al., Principles and functions of metabolic compartmentalization. Nat Metab. (2022)
PubMed: 36266543 DOI: 10.1038/s42255-022-00645-2

Song J et al., Assembling the mitochondrial ATP synthase. Proc Natl Acad Sci U S A. (2018)
PubMed: 29514954 DOI: 10.1073/pnas.1801697115

Maxfield FR et al., Cholesterol, the central lipid of mammalian cells. Curr Opin Cell Biol. (2010)
PubMed: 20627678 DOI: 10.1016/j.ceb.2010.05.004

Ouyang W et al., Analysis of the Human Protein Atlas Image Classification competition. Nat Methods. (2019)
PubMed: 31780840 DOI: 10.1038/s41592-019-0658-6