The Intestine-specific proteome

The intestine is the major site of food breakdown and nutrient absorption. It encompasses the duodenum, jejunum, ileum, colon and rectum. It receives chyme (semifluid mass) from the stomach and bile and pancreatic fluids from the pancreaticobiliary duct in the duodenum to start degrading lipids and carbohydrates to enable uptake in the jejunum and ileum. In the colon the remaining water, electrolytes and vitamins are absorbed after which the feces go to the rectum where they are stored until release via the anal canal. Transcriptome analysis shows that 75% (n=15219) of all human proteins (n=20162) are expressed in the intestine and 949 of these genes show an elevated expression in the intestine compared to other tissue types.

  • 949 elevated genes
  • 128 enriched genes
  • 250 group enriched genes
  • Intestine has most group enriched gene expression in common with lymphoid tissue


The Intestine transcriptome

Transcriptome analysis of the intestine can be visualized with regard to the specificity and distribution of transcribed mRNA molecules (Figure 1). Specificity illustrates the number of genes with elevated or non-elevated expression in the intestine compared to other tissues. Elevated expression comprises three subcategories:

  • Tissue enriched: At least four-fold higher mRNA level in intestine compared to any other tissues.
  • Group enriched: At least four-fold higher average mRNA level in a group of 2-5 tissues compared to any other tissue.
  • Tissue enhanced: At least four-fold higher mRNA level in intestine compared to the average level in all other tissues.
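The three definitions above can be sketched as a small classifier. This is a minimal illustration that applies the rules only as they are worded here, not the atlas's actual pipeline. The function name, the "not elevated" label, the order in which the rules are tested and the requirement that a group contain the tissue of interest are all assumptions made for the example.

```python
from statistics import mean

FOLD = 4.0  # "at least four-fold" threshold from the definitions above


def classify_specificity(ntpm, tissue, fold=FOLD, max_group=5):
    """Classify one gene for `tissue`, given a dict of tissue -> nTPM.

    Assumes `ntpm` contains at least one tissue besides `tissue`.
    """
    level = ntpm[tissue]
    others = {t: v for t, v in ntpm.items() if t != tissue}

    # Tissue enriched: at least four-fold above every other tissue.
    if level > 0 and level >= fold * max(others.values()):
        return "tissue enriched"

    # Group enriched: the average of a group of 2-5 tissues is at least
    # four-fold above any tissue outside the group. For a fixed group size,
    # the best candidate is the tissue plus the highest-ranked other tissues.
    ranked = sorted(others, key=others.get, reverse=True)
    for size in range(2, max_group + 1):
        group = [tissue] + ranked[:size - 1]
        outside = ranked[size - 1:]
        if not outside:
            break
        group_mean = mean(ntpm[t] for t in group)
        if group_mean > 0 and group_mean >= fold * max(others[t] for t in outside):
            return "group enriched"

    # Tissue enhanced: at least four-fold above the average of all other tissues.
    if level > 0 and level >= fold * mean(others.values()):
        return "tissue enhanced"

    return "not elevated"  # label assumed for this sketch
```

With made-up values, `classify_specificity({"intestine": 100, "lymphoid tissue": 80, "liver": 10, "lung": 5}, "intestine")` returns `"group enriched"`, because the intestine and lymphoid tissue average of 90 is more than four times the highest remaining level of 10.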

Distribution, on the other hand, visualizes how many genes have, or do not have, detectable levels (nTPM≥1) of transcribed mRNA molecules in the intestine compared to other tissues. As evident in Table 1, all genes elevated in intestine are categorized as:

  • Detected in single: Detected in a single tissue
  • Detected in some: Detected in more than one but less than one-third of tissues
  • Detected in many: Detected in at least a third but not all tissues
  • Detected in all: Detected in all tissues
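The four distribution categories above can be expressed as a short function. This is a sketch under stated assumptions: the function name and the "not detected" label for genes below the cutoff in every tissue are introduced here for illustration, and only the nTPM≥1 cutoff and the category boundaries come from the text.

```python
DETECTION_CUTOFF = 1.0  # nTPM >= 1 counts as detected, as stated above


def classify_distribution(ntpm, cutoff=DETECTION_CUTOFF):
    """Classify one gene by the number of tissues in which it is detected.

    `ntpm` maps tissue name -> nTPM for a single gene.
    """
    n_tissues = len(ntpm)
    n_detected = sum(1 for value in ntpm.values() if value >= cutoff)

    if n_detected == 0:
        return "not detected"  # label assumed for this sketch
    if n_detected == 1:
        return "detected in single"
    if n_detected == n_tissues:
        return "detected in all"
    # Integer comparison avoids rounding issues at exactly one-third.
    if 3 * n_detected < n_tissues:
        return "detected in some"
    return "detected in many"
```

With 36 tissues, a gene detected in 11 tissues falls in "detected in some", while a gene detected in 12 tissues (exactly one-third) falls in "detected in many".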

Figure 1 panels: (A) Specificity, (B) Distribution.

Figure 1. (A) The distribution of all genes across the five categories based on transcript specificity in intestine as well as in all other tissues. (B) The distribution of all genes across the six categories, based on transcript detection (nTPM≥1) in intestine as well as in all other tissues.

As shown in Figure 1, 949 genes show some level of elevated expression in the intestine compared to other tissues. The three categories of genes with elevated expression in intestine compared to other organs are shown in Table 1. Table 2 lists the 12 genes with the highest enrichment in intestine.

Table 1. The number of genes in the subdivided categories of elevated expression in intestine.

                   Distribution in the 36 tissues
Specificity        Detected in single   Detected in some   Detected in many   Detected in all   Total
Tissue enriched                    15                 84                 24                 5     128
Group enriched                      0                139                103                 8     250
Tissue enhanced                     4                136                372                59     571
Total                              19                359                499                72     949
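The counts in Table 1 can be cross-checked against their marginal totals. The sketch below reads the four counts in each row as "detected in single", "some", "many" and "all", in that order. The variable and function names are illustrative.

```python
# Rows of Table 1: ([single, some, many, all], row total)
TABLE_1 = {
    "Tissue enriched": ([15, 84, 24, 5], 128),
    "Group enriched": ([0, 139, 103, 8], 250),
    "Tissue enhanced": ([4, 136, 372, 59], 571),
}
COLUMN_TOTALS = [19, 359, 499, 72]
GRAND_TOTAL = 949


def table_is_consistent(rows, column_totals, grand_total):
    """Return True if every row and column sums to its stated total."""
    for counts, row_total in rows.values():
        if sum(counts) != row_total:
            return False
    for i, column_total in enumerate(column_totals):
        if sum(counts[i] for counts, _ in rows.values()) != column_total:
            return False
    return sum(column_totals) == grand_total


print(table_is_consistent(TABLE_1, COLUMN_TOTALS, GRAND_TOTAL))
```

Every row and column sums to its stated total under this reading, and the grand total matches the 949 elevated genes quoted in the text.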


Table 2. The 12 genes with the highest level of enriched expression in intestine. "Tissue distribution" describes the transcript detection (nTPM≥1) in intestine as well as in all other tissues. "mRNA (tissue)" shows the transcript level in intestine as nTPM values. "Tissue specificity score (TS)" corresponds to the fold-change between the expression level in intestine and the tissue with the second-highest expression level.

Gene       Description                        Tissue distribution   mRNA (tissue)   Tissue specificity score
TMPRSS15   transmembrane serine protease 15   Detected in some              952.0                        927
DEFA6      defensin alpha 6                   Detected in some             4676.4                        865
DEFA5      defensin alpha 5                   Detected in some             9494.8                        772
MLN        motilin                            Detected in single            247.0                        411
RBP2       retinol binding protein 2          Detected in many             2632.2                        358
GIP        gastric inhibitory polypeptide     Detected in some              251.8                        208
S100G      S100 calcium binding protein G     Detected in some              231.2                        201
LCT        lactase                            Detected in some              219.2                        186
FABP2      fatty acid binding protein 2       Detected in some              534.5                        173
SI         sucrase-isomaltase                 Detected in some              530.0                        150
ALPI       alkaline phosphatase, intestinal   Detected in single            137.3                        139
INSL5      insulin like 5                     Detected in some              118.8                        106
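The tissue specificity score described in the caption of Table 2 is a simple ratio, which can be sketched as follows. The function name, the handling of a zero denominator and the example values are assumptions made for illustration. They are not taken from the atlas data.

```python
def tissue_specificity_score(ntpm, tissue):
    """Fold-change between `tissue` and the highest-expressing other tissue.

    `ntpm` maps tissue name -> nTPM for a single gene. The score matches the
    caption's definition when `tissue` has the highest expression of all.
    """
    level = ntpm[tissue]
    runner_up = max(value for name, value in ntpm.items() if name != tissue)
    if runner_up == 0:
        # Assumption for this sketch: report an unbounded score rather than fail.
        return float("inf")
    return level / runner_up


# Hypothetical expression profile, not real measurements.
example = {"intestine": 952.0, "tissue_a": 1.0, "tissue_b": 0.5}
print(tissue_specificity_score(example, "intestine"))
```

A score of several hundred, as for the genes at the top of Table 2, therefore means the gene is nearly silent in every tissue other than intestine.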


Protein expression of genes elevated in intestine

In-depth analysis of the elevated genes in intestine using antibody-based protein profiling allowed us to visualize the expression patterns of these proteins in different functional compartments including proteins involved in microvilli organisation and proteins involved in breakdown and uptake of nutrients.

Proteins involved in microvilli organization

Several genes with elevated expression in the intestine encode proteins related to microvilli organization. VIL1 is a calcium-regulated actin-binding protein that caps, severs and bundles actin filaments in the intestinal brush border. Microvilli are attached to each other via heterophilic complexes between CDHR2 and CDHR5, which thereby control the packing of the microvilli at the apical membrane.


Images: VIL1, CDHR2, CDHR5.

Proteins involved in digestion

The intestine is responsible for the uptake of many different dietary nutrients, such as fatty acids, amino acids and carbohydrates. Fatty acid uptake, metabolism and transport are facilitated by fatty acid binding proteins (FABPs). For example, FABP2 is thought to be involved in triglyceride-rich lipoprotein synthesis. Complex carbohydrates like starch, on the other hand, are broken down by SI, which is localized on the intestinal brush border. Simple carbohydrates are transported across the luminal membrane via SLC5A1, a sodium-dependent glucose transporter. SLC5A1 is the main transporter for glucose and galactose, and mutations in this gene have been associated with glucose-galactose malabsorption.


Images: FABP2, SI, SLC5A1.


Genes shared between intestine and other tissues

There are 250 group enriched genes expressed in intestine. Group enriched genes are defined as genes showing an at least four-fold higher average level of mRNA expression in a group of 2-5 tissues, including intestine, compared to all other tissues.

To illustrate the relation of intestine tissue to other tissue types, a network plot was generated, displaying the number of genes with a shared expression between different tissue types.

Figure 2. An interactive network plot of the intestine enriched and group enriched genes connected to their respective enriched tissues (grey circles). Red nodes represent the number of intestine enriched genes and orange nodes represent the number of genes that are group enriched. The sizes of the red and orange nodes are related to the number of genes displayed within the node. Each node is clickable and results in a list of all enriched genes connected to the highlighted edges. The network is limited to group enriched genes in combinations of up to 5 tissues, but the resulting lists show the complete set of group enriched genes in the particular tissue.
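The counts behind such a network plot can be sketched by tallying, for each other tissue, how many group enriched genes it shares with intestine. The gene names and tissue groups below are hypothetical placeholders, and the function name is introduced only for this example.

```python
from collections import Counter


def shared_group_counts(gene_groups, tissue):
    """Count group enriched genes shared between `tissue` and each other tissue.

    `gene_groups` maps gene name -> set of tissues in which the gene is
    group enriched.
    """
    counts = Counter()
    for group in gene_groups.values():
        if tissue in group and len(group) > 1:
            for other in group:
                if other != tissue:
                    counts[other] += 1
    return counts


# Hypothetical groupings for illustration only.
example_groups = {
    "GENE_A": {"intestine", "lymphoid tissue"},
    "GENE_B": {"intestine", "lymphoid tissue", "bone marrow"},
    "GENE_C": {"intestine", "liver"},
    "GENE_D": {"stomach", "salivary gland"},
}
print(shared_group_counts(example_groups, "intestine").most_common())
```

The tissue at the top of the resulting tally is the one that shares the most group enriched genes with intestine, which for the real data is lymphoid tissue.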


Intestine shares most group enriched gene expression with lymphoid tissue. One example of a gene expressed in both intestine and lymphoid tissue is CD79A, which encodes a protein involved in the initiation of signal transduction activated by antigen binding to the B-cell antigen receptor complex. As can be expected, most genes shared between intestine and lymphoid tissue mainly show staining of lymphoid cells in the lymphoid patches in the appendix and of some cells in the lamina propria. The open reading frame C6orf58 is thought to be involved in early liver development, but its function is unknown. C6orf58 is expressed in Brunner's glands in the submucosa of the duodenum, as well as in stomach and salivary gland. The intestinal tract also shares expression of several genes with the liver. The enzyme OTC catalyzes the second step of the urea cycle and is expressed in the liver and intestine.


Images: CD79A (duodenum, appendix, lymph node); C6orf58 (duodenum, stomach, salivary gland); OTC (duodenum, small intestine, liver).


Duodenum histology

The duodenum is the most proximal and widest part of the small intestine. It starts at the pylorus of the stomach, ends at the duodenojejunal junction and is about 25 cm long. It receives partly digested food (chyme) from the stomach and bile and pancreatic fluids from the pancreaticobiliary duct. After entering the duodenum, the acidic contents from the stomach are neutralized by secretions from the intestine and pancreas. Enzymes secreted from the pancreas initiate the degradation of lipids, carbohydrates and proteins to enable absorption.

As in all of the intestine, the mucosa forms finger-like projections called villi that extend into the intestinal lumen. These are epithelial folds lined by two types of cells, enterocytes and goblet cells. Enterocytes are simple columnar cells with basal elongated nuclei and an apical brush border. The brush border is the microscopic representation of small protrusions of the cell membrane, microvilli, which greatly increase the surface area of the cell and thereby enhance its absorptive capacity. The other cell type is the mucus-secreting goblet cell, which can be recognized by the presence of an apical mucous cup. The core of the villus is part of the lamina propria. The most numerous cells in the lamina propria are immune cells, most of which are lymphocytes. Because villi are the site of nutrient absorption, they have a rich blood supply: each villus is supplied by central arterioles and drained by central venules and a central lymph vessel.

Underlying the villi are the intestinal glands, also called the crypts of Lieberkühn. These glands are lined with numerous relatively undifferentiated columnar cells that usually undergo two rounds of mitosis before differentiating into either absorptive cells or goblet cells. Enterocytes, goblet cells, enteroendocrine cells and Paneth cells also line the crypt. Paneth cells secrete antibacterial enzymes and are recognized by eosinophilic granules in their apical cytoplasm. A thin layer of smooth muscle, the muscularis mucosae, marks the end of the mucosa. The submucosa contains numerous pale-staining glands, namely Brunner's glands. These are branched tubular or tubuloalveolar glands lined with columnar secretory epithelium. They secrete large amounts of alkaline mucus that neutralizes the acidic contents from the stomach.

The histology of human duodenum including detailed images and information can be viewed in the Protein Atlas Histology Dictionary.


Small intestine histology

The small intestine (jejunum and ileum) measures about 6 meters and absorbs nutrients, water and electrolytes. It is similar to the duodenum in histology and composition. The permanent transverse submucosal folds extending into the lumen of the intestine are termed plicae circulares. The plicae circulares consist of mucosa as well as submucosa. The core is the submucosa, composed of loose connective tissue, blood vessels, nerves and dispersed lymphoid tissue. A distinctive feature of the jejunum and ileum is the lack of glands in the submucosa.

The mucosa is characterized by numerous finger-like villi that protrude into the lumen of the intestine. Enterocytes, which are columnar epithelial cells with basally located oval nuclei and an apical brush border, line the villi. The enterocytes located on the villi have a mainly absorptive function. Interspersed between the enterocytes are goblet cells, which are recognized by their content of a large mucous globule, resembling a small "empty bubble" within the epithelial lining. The goblet cells are connected to the basement membrane by a thin cytoplasmic strand that is difficult to distinguish in hematoxylin and eosin staining. Underlying the intestinal villi are the intestinal glands. They are straight tubular glands that are slightly dilated at their bottom. Intestinal stem cells line the proximal part of the glands. The stem cells give rise to all the cells in the epithelium, which are the Paneth cells, the enterocytes, the goblet cells and the enteroendocrine cells. The Paneth cells secrete antibacterial enzymes and are located in the lower portion of the intestinal glands. Paneth cells are recognized by eosinophilic granules in the cytoplasm.

The histology of human small intestine including detailed images and information can be viewed in the Protein Atlas Histology Dictionary.


Colon histology

The colon is divided into four parts, the ascending, transverse, descending and sigmoid colon, and is on average 1.5 meters long. Its main function is the reabsorption of fluid, electrolytes and vitamins. Since the colon has no villi or plicae circulares, the mucosa is smooth. Simple tubular intestinal glands (crypts of Lieberkühn) extend through the entire thickness of the mucosa. The surface columnar epithelium and the cells lining the crypts are enterocytes, with an oval basal nucleus and apical brush border, the microscopic representation of microvilli. There are also numerous mucus-secreting goblet cells, recognized by their content of a large mucous globule. The lamina propria, with connective tissue and inflammatory cells, surrounds the crypts. A thin smooth muscular layer, the lamina muscularis mucosae, marks the border between the mucosa and submucosa. The submucosa consists of loose connective tissue with vessels and nerves. Some solitary lymph follicles are also seen. The muscular layer (muscularis externa) consists of an inner circular smooth muscle layer and an outer longitudinal muscle layer. Unlike in the rest of the gastrointestinal tract, the outer longitudinal layer is not continuous but is divided into three thickened muscular bands called teniae coli.

The histology of human colon including detailed images and information can be viewed in the Protein Atlas Histology Dictionary.


Background

Here, the protein-coding genes expressed in intestine are described and characterized, together with examples of immunohistochemically stained tissue sections that visualize corresponding protein expression patterns of genes with elevated expression in intestine.

Transcript profiling was based on a combination of two transcriptomics datasets (HPA and GTEx), corresponding to a total of 14590 samples from 50 different normal human tissue types. The final consensus normalized expression (nTPM) value for each tissue type was used to classify all genes according to tissue-specific expression in two different ways, based on specificity or distribution.


Relevant links and publications

Uhlén M et al., Tissue-based map of the human proteome. Science (2015)
PubMed: 25613900 DOI: 10.1126/science.1260419

Fagerberg L et al., Analysis of the human tissue-specific expression by genome-wide integration of transcriptomics and antibody-based proteomics. Mol Cell Proteomics. (2014)
PubMed: 24309898 DOI: 10.1074/mcp.M113.035600

Gremel G et al., The human gastrointestinal tract-specific transcriptome and proteome as defined by RNA sequencing and antibody-based profiling. J Gastroenterol. (2015)
PubMed: 24789573 DOI: 10.1007/s00535-014-0958-7
