Antigen and antibody production

PrEST regions (Agaton C et al. (2003); Lindskog M et al. (2005)) are first amplified with RT-PCR from total RNA template pools with specific oligonucleotide primers for each PrEST. Amplicons are automatically processed with solid phase restriction, and ligated into the plasmid vector pAff8c (Larsson M et al. (2000)) where the human gene fragment is fused to a dual tag consisting of a hexahistidyl (His6) tag in frame with an immunopotentiating Albumin Binding Protein (ABP)-tag. After transformation into E. coli Rosetta(DE3), inserts are verified by DNA sequencing to omit clones with mutations and approved clones are single cell streaked. Plasmids are collected from all purified clones for deposition in the clone library and glycerol stocks are prepared and used as starting material for protein production.

All proteins are expressed as His6ABP fusions in E. coli shake flask cultures upon induction with isopropyl-β-D-thiogalactopyranoside (IPTG). A fully automated protein purification system has been developed to allow purification of up to 60 cell lysates at a time. One-step purification is enabled by the hexahistidine affinity tag and Immobilized Metal Affinity Chromatography (IMAC), and is performed under denaturing conditions. After evaluation of protein concentration and purity, the molecular weight of the PrEST proteins is determined by mass spectrometry as a final quality control. The purified proteins are then used to prepare antigens and affinity columns with PrEST-ligands. In addition, affinity resin with His6ABP-ligand is produced.

After immunization with the antigens, the polyclonal antisera, generated together with collaborative partners, are carefully purified in three steps: depletion of unwanted specificity, capture of wanted specificity and a final buffer exchange. Depletion of antibodies with unwanted specificity is carried out manually on gravity-flow columns. The following steps are performed on the ÄKTAxpress chromatography system, enabling a high-throughput semi-automated process in which captured antibodies are eluted by a low-pH glycine buffer and automatically loaded onto a desalting column for buffer exchange. Antibodies are supplemented with 50% glycerol and 0.02% sodium azide for long-term storage at -20°C. The binding specificity of all antibodies is determined on protein microarrays to certify that only antibodies with high specificity and low background binding are approved for immunohistochemistry analysis. All approved antibodies are further analyzed in a high-throughput WB platform using protein lysates from human cell lines (RT-4 and U-251 MG), human plasma depleted of IgG and HSA, and whole tissue lysates from human liver and tonsil. A selection of the published antibodies, initially scored as uncertain in the standard WB panel, have been revalidated in a WB set-up comprising an over-expression lysate (VERIFY Tagged Antigen™, OriGene Technologies, Rockville, MD) as a positive control.

Antibody validation

The usefulness of antibodies in different assays is dependent on both sensitivity and specificity of epitope binding, and in order to provide the best estimate of protein expression across tissues and cells, antibody validation is a crucial part of the Human Protein Atlas. All antibodies are validated by a set of defined criteria, as described below. Only antibodies that pass the minimum criteria of standard antibody validation are published on the Human Protein Atlas. In addition to the standard quality assurance, enhanced antibody validation strategies are performed in an application-specific manner. The different criteria for enhanced antibody validation are described below.

Standard antibody validation

All antibodies produced internally within the Human Protein Atlas project (HPA antibodies) must pass steps 1-4 in the list below in order to be used for immunohistochemistry and immunocytochemistry/IF. Steps 5-7 provide the basis for evaluating and scoring the antibody reliability. All antibodies that provide a reasonable pattern of immunoreactivity are added to the Human Protein Atlas portal. Feedback from the research community is appreciated and needed for continuous curation of data.
Quality assurance steps for antibodies generated within the Human Protein Atlas project:

  1. The antigen (protein epitope signature tag (PrEST)) for a protein is selected as a stretch of 20-150 amino acids with as low identity as possible to proteins from all other putative protein-coding genes, and not including signal peptides or transmembrane regions. Multitarget PrESTs are PrESTs that have more than 80% identity to proteins from more than one gene, and are expected to generate antibodies with multiple targets.
  2. Plasmid inserts are sequenced to assure that the correct PrEST sequence is cloned.
  3. Size of the resulting recombinant protein (including the specific PrEST) is analyzed using mass spectrometry to assure that the correct antigen has been produced and purified.
  4. To control for cross-reactivity, affinity purified antibodies are tested for sensitivity and specificity on protein arrays consisting of glass slides with spotted PrEST fragments.
  5. Antibody specificity is analyzed using Western blot in a standardized setup. Total protein lysates from a limited number of tissues (liver and tonsil), cell lines (RT4 and U-251 MG), and human plasma are used to evaluate the antibody target binding in a Western blot setting. Antibodies with an uncertain standard Western blot are reanalyzed using an over-expression lysate as a positive control.
  6. Immunohistochemical staining of normal and cancer tissue is examined and annotated by specially trained personnel, and the staining patterns are compared with available gene/RNA/protein characterization data.
  7. High resolution confocal microscopy images of human cell lines stained by indirect immunofluorescence are annotated for subcellular localizations by trained cell biologists, and the subcellular localization patterns are compared with the immunohistochemical staining and available experimental protein characterization data.

For antibodies supplied through commercial or other academic sources (CAB antibodies), immunocytochemistry and immunohistochemistry have been performed and validated in a similar manner as for HPA antibodies. These antibodies have also been tested on Western blot in a standardized setup. For each commercially available antibody, a link to the antibody provider is given on the "Antibody validation" page. For further validation we refer to quality controls provided by the respective company. Detailed descriptions of the strategies used for standard antibody validation in the different assays are available further down on this page.

Enhanced antibody validation

Antibodies used for Western blot, immunocytochemistry and immunohistochemistry in the Human Protein Atlas undergo enhanced antibody validation based on the five "pillars" described by the International Working Group for Antibody Validation (IWGAV), presented in "A proposal for validation of antibodies" (Uhlen M et al. (2016)). The enhanced validation principles are adapted for validation in Western blot, immunocytochemistry and immunohistochemistry applications. Antibodies that fulfil the criteria are labelled "Enhanced". The following Enhanced antibody validation strategies are used for each assay:

  • Immunocytochemistry: Genetic validation, Recombinant expression validation, and Independent antibody validation
  • Immunohistochemistry: Orthogonal validation and Independent antibody validation
  • Western blot: Genetic validation, Recombinant expression validation, Independent antibody validation, Orthogonal validation, Capture MS validation

Detailed descriptions are available under each section describing the different assays.


Genetic validation: Knock-down (knock-out) of the target protein using genetic methods, such as CRISPR or siRNA, in a suitable cell line. The staining of the antibody is evaluated before and after knock-down of the corresponding target gene.

Orthogonal validation: Comparing the staining pattern with an antibody-independent method analysing the expression level of the target protein. At least two samples must be used, and the samples must express the target protein at different levels. The levels of the target protein in the different samples determined by the two independent methods must show the same pattern.

Independent antibody validation: Comparing the staining pattern using two independent antibodies with non-overlapping epitopes. The staining pattern generated by the two antibodies is compared in at least two tissues or cell lines, preferably expressing the target protein at different levels. The two antibodies must show a similar result.

Recombinant expression validation: Over-expression of the target protein in a cell line preferably not expressing the target protein, or recombinant expression of a fluorescently tagged version of the target protein in a cell line, preferably at endogenous level. The staining is evaluated by comparing the signal from the over-expressed or tagged version of the target protein with that from the unmodified or endogenous target protein.

Capture MS validation: Comparing the staining pattern and protein size of the antibody with results obtained by a capture MS method. The size detected by the antibody should be equivalent to the size of the corresponding target protein detected in capture MS.

Immunocytochemistry - cells

Standard antibody validation - ICC

For each antibody, the observed staining in the different cell lines is assigned a validation score based on concordance with available experimental gene/protein characterization data in the UniProtKB/Swiss-Prot database. The validation scores for up to three cell lines are merged into one of the main categories (Supported, Approved, or Uncertain) to represent the overall antibody staining in all analyzed cell lines.

Validation scores for Immunocytochemistry/IF:

Supported

  • The antibody yields a staining pattern supported by available experimental gene/protein characterization data (UniProtKB/Swiss-Prot).

Approved

  • The antibody yields a staining pattern for a gene with no available experimental gene/protein characterization data.
  • The antibody yields a staining pattern where available experimental gene/protein characterization data is partly supporting and partly conflicting.

Uncertain

  • The antibody yields a staining pattern that is not consistent with available experimental gene/protein characterization data.

Validation scores for Immunocytochemistry/IF - multitargeting antibodies: The validation of antibodies targeting PrESTs encoded by two or more genes (here called multitargeting) is based on the conformance of the expression pattern to available gene/protein characterization data. Similarity between paired antibodies is not taken into account due to the complexity of multiple gene targets.

Supported

  • The multitargeting antibody yields a staining pattern consistent with available gene/protein characterization data for all of the genes.
  • The multitargeting antibody yields a staining pattern that is partly consistent with available gene/protein characterization data for all of the genes.

Approved

  • The multitargeting antibody yields a staining pattern for a gene with no available gene/protein characterization data.
  • The multitargeting antibody yields a staining pattern that is consistent with available gene/protein characterization data for at least one of the genes, but not all.

Uncertain

  • The multitargeting antibody yields a staining pattern that is not consistent with available experimental gene/protein characterization data.

Enhanced antibody validation - ICC

Genetic validation - siRNA

For enhanced genetic validation of the antibody, which also confirms the determined subcellular localization of the target protein, the staining procedure has been repeated on siRNA-transfected U-2 OS cells in order to knock down the expression of the protein (Stadler C et al. (2012)). After siRNA transfection, cells are fixed and stained according to the standard ICC-IF protocol. For each antibody, the assay is performed in duplicate using two different siRNA sources, and the results are compared to negative control cells transfected with scrambled siRNA. Images are acquired using objectives with 10x and 40x magnification. An automated image analysis protocol segments the cells and extracts features from all acquired images before statistical software automatically compares the cell population median staining intensity between siRNA-coated and negative control samples. Relative Fluorescence Intensity (RFI) denotes the percentage of staining intensity remaining after siRNA-mediated downregulation. The distribution of RFI for the cells within a sample is presented in a box plot, and the significance of the downregulation is evaluated using the Wilcoxon rank sum test (Mann-Whitney). A p-value below 0.01 is considered significant. For each siRNA assay the decrease in antibody-based staining intensity upon target protein downregulation is evaluated.

Antibodies that meet one of the following criteria will receive the validation score "Enhanced" by genetic method:

  • Significant downregulation >25 % by both siRNAs.
  • Significant downregulation >25 % by one siRNA and >10 % by the other.
  • Significant downregulation >25 % by one siRNA.
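
The decision rule above can be illustrated with a minimal Python sketch. It is not the statistical software used in the project: the function names are hypothetical, RFI is assumed here to be the ratio of the population medians expressed as a percentage, and the rank sum test uses a plain normal approximation without tie correction. Since the three criteria all include a significant downregulation of more than 25 % by at least one siRNA, the sketch checks only that condition.

```python
from statistics import NormalDist, median


def relative_fluorescence_intensity(knockdown, control):
    """Remaining staining intensity in percent (assumed: ratio of medians)."""
    return 100.0 * median(knockdown) / median(control)


def rank_sum_p_value(x, y):
    """Two-sided Wilcoxon rank sum (Mann-Whitney) p-value.

    Normal approximation with average ranks for ties and no tie
    correction of the variance; adequate for large cell populations.
    """
    values = list(x) + list(y)
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    n1, n2 = len(x), len(y)
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5
    z = (u - mean_u) / sd_u
    return 2 * (1 - NormalDist().cdf(abs(z)))


def enhanced_by_sirna(control, knockdowns, alpha=0.01, min_decrease=25.0):
    """True if at least one siRNA gives a significant decrease above 25 %."""
    for cells in knockdowns:
        decrease = 100.0 - relative_fluorescence_intensity(cells, control)
        if decrease > min_decrease and rank_sum_p_value(cells, control) < alpha:
            return True
    return False
```

In this sketch, `control` holds per-cell intensities from the scrambled-siRNA sample and `knockdowns` is a list with one per-cell intensity list per siRNA.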

Recombinant expression validation - Tagged protein/GFP

Antibodies targeting a subset of genes have been further analyzed in HeLa cells stably expressing a GFP-tagged version of the target protein. These cell lines have been kindly provided by the group of Professor Anthony Hyman, Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany (Poser I et al. (2008); Skogs M et al. (2017)). They are produced using Bacterial Artificial Chromosome (BAC) TransgeneOmics technology. BACs contain all regulatory elements, and transfection with BACs results in near-endogenous expression of the recombinant protein. Analysis is performed directly on the clone pool, where individual cells show variations in tagged protein expression level. An anti-GFP antibody is used to enhance the signal in order to detect even low-abundance tagged target protein. All images are manually annotated to one or several subcellular locations. The location of the tagged protein is taken into account when performing the knowledge-based annotation of subcellular location for each gene described here. The antibody staining intensity is classified as negative, weak, moderate or strong based on the laser power and detector gain settings used for image acquisition in combination with the visual appearance of the image. GFP intensity is classified as positive or negative. Finally, the signals of the antibodies targeting the endogenous protein and the GFP-tagged protein, respectively, are compared.

Antibodies that meet one of the following criteria will receive the validation score "Enhanced" by recombinant expression:

  • Antibody staining overlaps with tagged protein.
  • Antibody staining overlaps with tagged protein but shows additional locations.

Independent antibody validation

Antibodies have also been validated using independent antibody strategies. In this case, two (or more) independent antibodies directed towards independent epitopes (non-overlapping) on the protein have been used to assess the reliability of the staining. Individual antibodies are scored based on the similarity of the staining with its "sibling" antibodies.

Antibodies that meet the criteria of independent antibody validation will receive the validation score "Enhanced".

Immunohistochemistry - tissues

Standard antibody validation - IHC

For each antibody, the observed staining, obtained by immunohistochemistry (IHC), is assigned a validation score. The validation score is based on the result of three different validations that are separately evaluated: literature conformity, RNA consistency and similarity between paired antibodies (several antibodies targeting non-overlapping sequences) with regard to spatial expression pattern.

Literature conformity refers to the conformance of antibody staining in 44 normal tissues with available gene/protein characterization data in scientific literature and data from bioinformatic predictions. UniProt is used as the main source of gene/protein characterization data and, when relevant, available publications and other sources of information are researched in depth. Extensive or sufficient gene/protein data requires that there is evidence of existence on the protein level with information on both tissue specificity and subcellular localization, and that the tissue specificity was determined using human samples. Limited protein/gene characterization data does not require evidence of existence on the protein level and refers to genes for which only bioinformatic predictions and scarce published experimental data are available.

RNA consistency is based on a comparison of antibody staining in 44 normal tissues with RNA-seq data combined from HPA, GTEx and FANTOM. Consistency is categorized as High, Medium, Low, Very low or Cannot be evaluated.

Similarity between paired antibodies is evaluated based on staining patterns in 44 normal tissues.

The different levels of validation score are Supported, Approved or Uncertain.

Supported
If one of the following criteria is fulfilled:

  • At least one antibody shows high or medium consistency between RNA levels and staining pattern, but the antibody does not qualify for Orthogonal validation and staining pattern is consistent with valid literature, or there is no valid literature available
  • At least one antibody has RNA consistency defined as “Cannot be evaluated” and staining pattern is consistent with valid literature
  • Paired antibodies (several antibodies targeting non-overlapping sequences) show similar staining pattern, but the antibodies do not qualify for Independent antibody validation and staining pattern is consistent with valid literature, or there is no valid literature available

Approved
If one of the following criteria is fulfilled:

  • At least one antibody shows high or medium consistency between RNA levels and staining pattern and staining pattern is inconsistent with valid literature
  • At least one antibody shows low consistency between RNA levels and staining pattern and staining pattern is consistent with valid literature
  • At least one antibody has RNA consistency defined as “Cannot be evaluated” and staining pattern is partly consistent with valid literature, or consistent with limited literature
  • Paired antibodies show partly similar expression patterns

Uncertain
If one of the following criteria is fulfilled:

  • Only multi-targeting antibodies are available. Multi-targeting antibodies are used for genes where it was not possible to generate single-targeting antibodies due to high sequence identity among proteins belonging to different genes. These genes are in many cases closely related and belong to known gene families, and in these cases a multi-targeting antibody was produced that has >80% sequence identity to transcripts of the genes belonging to the family and low sequence identity to the transcripts of all other human genes.
  • At least one antibody shows low or very low consistency between RNA and staining pattern, or RNA consistency is defined as “Cannot be evaluated” and staining pattern is inconsistent with valid literature, or there is no valid literature available
  • Paired antibodies show dissimilar expression patterns

Enhanced antibody validation - IHC

Orthogonal validation

Orthogonal validation is based on manual evaluation of the correlation between the staining intensity of a single-target antibody and corresponding mRNA levels across up to 46 normal tissues. For single-target antibodies where consistency between staining intensity and mRNA levels is scored as high or medium, two representative images corresponding to tissues with high and low expression of protein and mRNA are selected. If the difference in mRNA levels between the two representative tissues is at least 4-fold, the antibody will receive the validation score "Enhanced".
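
The fold-change condition can be written as a short check. The following is an illustrative Python sketch only: the function name and arguments are hypothetical, the expression unit is left open, and the handling of a low-expression tissue with zero mRNA is an assumption. The consistency score itself comes from manual evaluation and is passed in as given.

```python
def orthogonal_enhanced(consistency, mrna_high, mrna_low, min_fold=4.0):
    """Check the IHC orthogonal criterion for one single-target antibody.

    consistency: manually assigned RNA consistency ("High", "Medium", ...).
    mrna_high / mrna_low: mRNA levels of the two representative tissues.
    """
    if consistency.strip().lower() not in {"high", "medium"}:
        return False
    # Multiplication instead of division avoids dividing by zero when the
    # low-expression tissue has no detectable mRNA (assumed to qualify as
    # long as the high-expression tissue is above zero).
    return mrna_high > mrna_low and mrna_high >= min_fold * mrna_low
```

For example, `orthogonal_enhanced("High", 40.0, 10.0)` satisfies the 4-fold requirement, whereas `orthogonal_enhanced("Medium", 39.0, 10.0)` does not.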

Independent antibody validation

This method is based on comparing the staining pattern using two single-target independent antibodies with non-overlapping epitopes. The spatial localization of the staining pattern generated by immunohistochemistry using the two antibodies is compared in 44 different normal tissues. For antibodies that show a similar spatial localization, four representative images are chosen for each antibody. Antibodies that meet the criteria of independent antibody validation will receive the validation score "Enhanced".

Immunohistochemistry/IF - mouse brain

Standard antibody validation - IHC/IF

In order to generate and present reliable and valuable data several validation steps are incorporated in our work flow.

Antibody selection: Based on sequence homology, only antibodies raised against PrESTs with >60% homology with corresponding mouse genes are selected.

Translational validation: Antibodies are tested against mouse brain lysates by Western blot to identify possible off-target interactions with mouse proteins.

Internal comparative validation: If available, multiple antibodies raised against different fragments of the targeted protein are applied to mouse brain tissue. The reliability score increases when two or more antibodies reveal similar staining patterns.

External multidisciplinary validation: Staining patterns are evaluated using peer-reviewed published data on cellular and regional distribution of proteins. In addition, protein distribution data is assessed using expression data available in the Allen Brain Atlas.

Protein array (PA)

All purified antibodies are analyzed on antigen microarrays. The specificity profile for each antibody is determined based on the interaction with 384 different antigens, including its own target. The antigens present on the arrays are consecutively exchanged in order to correspond to the next set of 384 purified antibodies. Each microarray is divided into 21 replicated subarrays, enabling the analysis of 21 antibodies simultaneously. The antibodies are detected through a fluorescently labeled secondary antibody, and a dual color system is used in order to verify the presence of the spotted proteins. A specificity profile plot is generated for each antibody, where the signal from the binding to its own antigen is compared to any off-target interactions with all the other antigens. The vast majority (86%) of antibodies are given a pass and the remainder fail due to either low signal or low specificity.

Standard antibody validation - PA

Supported

  • Pass with single peak corresponding to interaction only with its own antigen.

Approved

  • Pass with quality comment low specificity (binding to 1-2 PrESTs >15% and <40% of the signal from the target PrEST).

Uncertain

  • No or weak signal.
  • Low specificity (one antigen with >40% signal or more than two antigens with signal >15% of the signal from the target PrEST).
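
The thresholds above translate into a small classification routine. The Python sketch below is an illustration under stated assumptions, not the project's scoring code: the function name is hypothetical, the cut-off for "no or weak signal" is not given in the text and is therefore a required parameter (`min_target_signal`), and an off-target signal of exactly 40% is treated as not exceeding the limit.

```python
def score_protein_array(target_signal, off_target_signals, min_target_signal):
    """Return "Supported", "Approved" or "Uncertain" for one antibody.

    target_signal: signal from the antibody's own PrEST.
    off_target_signals: signals from all other antigens on the array.
    min_target_signal: hypothetical cut-off for "no or weak signal".
    """
    if target_signal <= 0 or target_signal < min_target_signal:
        return "Uncertain"  # no or weak signal

    # Express every off-target signal as a percentage of the target signal.
    relative = [100.0 * s / target_signal for s in off_target_signals]

    if any(r > 40.0 for r in relative):
        return "Uncertain"  # one antigen above 40% of the target signal

    n_above_15 = sum(r > 15.0 for r in relative)
    if n_above_15 == 0:
        return "Supported"  # single peak at its own antigen
    if n_above_15 <= 2:
        return "Approved"   # 1-2 PrESTs between 15% and 40%
    return "Uncertain"      # more than two antigens above 15%
```

Passing the 383 off-target signals of a subarray as a list reproduces the three outcomes in the scoring table above.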

Western blot (WB)

Western blot analysis of antibody specificity has been done using a routine sample setup composed of IgG/HSA-depleted human plasma and protein lysates from a limited number of human tissues and cell lines. A selection of antibodies with an uncertain routine WB have been revalidated using an over-expression lysate (VERIFY Tagged Antigen™, OriGene Technologies, Rockville, MD) as a positive control. Antibody binding was visualized by chemiluminescence detection in a CCD-camera system using a peroxidase (HRP) labeled secondary antibody.

Antibodies included in the Human Protein Atlas have been analyzed without further efforts to optimize the procedure and therefore it cannot be excluded that certain observed binding properties are due to technical rather than biological reasons and that further optimization could result in a different outcome.

Standard antibody validation - WB

Supported

  • Bands corresponding to the predicted size in kDa (+/-20%).
  • Band of predicted size in kDa (+/-20%) with additional bands present.

Uncertain

  • Single band larger than predicted size in kDa (+20%) but partly supported by predicted transmembrane region, signal peptide or by other available data.
  • No bands detected.
  • Single band differing more than +/-20% from predicted size in kDa and not supported by predicted transmembrane region, signal peptide or by other available data.
  • Weak band of predicted size in kDa (+/-20%) but with additional bands of higher intensity also present.
  • Only bands not corresponding to the predicted size.
  • Target too small/large to be analyzed with the present setup.
  • Current setup is not applicable due to low RNA count.

For antibodies showing uncertain Western blot data the corresponding image is not shown.
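
The +/-20% size window used throughout the criteria above can be expressed as a one-line check. This Python sketch is illustrative only; the function names are hypothetical, and a band exactly at the edge of the window is assumed to count as matching. It covers only the size comparison, not the intensity or supporting-data considerations in the scoring list.

```python
def within_predicted_size(observed_kda, predicted_kda, tolerance=0.20):
    """True if a band lies within +/-20% of the predicted size in kDa."""
    return abs(observed_kda - predicted_kda) <= tolerance * predicted_kda


def bands_of_predicted_size(observed_bands_kda, predicted_kda):
    """Return the observed bands that fall inside the +/-20% window."""
    return [b for b in observed_bands_kda if within_predicted_size(b, predicted_kda)]
```

For a protein with a predicted size of 50 kDa, bands at 45 and 55 kDa fall inside the window, while bands at 39 or 61 kDa do not.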

Enhanced antibody validation - WB

Genetic validation - siRNA

This method is based on the knock-down or knock-out of the target protein in a suitable cell line using genetic methods, such as CRISPR or siRNA. The staining of the antibody is evaluated by Western blot through analyses of cell lysates before and after knock-down of the corresponding target gene. The results should show no band or a weaker band in the lysate from the knock-down cell line.

Antibodies that meet one of the following criteria will receive the validation score “Enhanced” by genetic method:

  • Signal downregulation > 25 % by both siRNAs.
  • Signal downregulation > 25 % by one siRNA.

Recombinant expression validation

This method is based on over-expression of the target protein in a cell line preferably not expressing the target protein. The staining of the antibody is evaluated by Western blot through analyses of cell lysates with and without recombinant expression of the target protein. The results should show no band or a weak band in the unmodified cell line lysate and a strong band in the cell line with recombinant expression.

Independent antibody validation

This method is based on comparing the staining pattern using two independent antibodies with non-overlapping epitopes. The staining of the two antibodies is compared by Western blot through analyses of samples from at least two cell lysates, preferably expressing the target protein at different levels. The results should show similar Western blot patterns for the independent antibodies.

Orthogonal validation

This method is based on manual comparison of the antibody band intensity with the corresponding protein levels quantified by mass spectrometry (MS). Antibodies are considered enhanced where the staining intensity and protein expression levels show the same pattern. At least two cell or tissue samples must be used, and the samples must express the target protein at different levels. This method can also be used to compare the protein expression levels determined by the antibody with the corresponding RNA levels in each cell line or tissue.

Capture MS validation

This method is based on comparing the molecular weight of the band visualized by the antibody with the protein size obtained by a capture MS method, in which multiple gel slices are cut out from the electrophoretic separation and analysed separately by proteomics. The proteins in each gel slice are digested into peptides, and the presence of the protein and its migration in the gel are verified by the subsequent proteomics analysis. The band detected by the antibody should be equivalent to the size of the intended target protein and its peptide(s).