IEDB Analysis Resource

Antibody Epitope Prediction - Tutorial

I. Methods for predicting continuous antibody epitopes from protein sequences
General basis: Parameters such as hydrophilicity, flexibility, accessibility, turns, exposed surface, polarity and antigenic propensity of polypeptide chains have been correlated with the location of continuous epitopes. This has led to a search for empirical rules that would allow the position of continuous epitopes to be predicted from certain features of the protein sequence. All prediction calculations are based on propensity scales for the 20 amino acids. Each scale consists of 20 values, one per amino acid residue, assigned on the basis of the residue's relative propensity to possess the property described by the scale.
General method: When computing the score for a given residue i, the amino acids in an interval of the chosen length, centered on residue i, are considered. In other words, for a window size n, the (n-1)/2 neighboring residues on each side of residue i are used to compute the score for residue i. Unless specified otherwise, the score for residue i is the average of the scale values for these amino acids (see Table 1 for method-specific implementation details). In general, a window size of 5 to 7 is appropriate for finding regions that may be antigenic.
Interpretation of output graphs and tables: On the graphs, the Y-axis shows the score for each residue (averaged over the specified window), whether that is a BepiPred score or a score on a scale such as the Karplus and Schulz flexibility scale; the X-axis shows the residue position in the sequence. The tables list the calculated score for each residue. A higher score suggests that a residue is more likely to be part of an epitope (such residues are colored yellow on the graphs). However, the presented methods do not predict epitopes per se, whether linear or discontinuous; they can only guide researchers toward protein regions worth exploring as genuine epitopes.
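The windowed average described above can be sketched in Python as follows. This is a minimal illustration, not the IEDB implementation: the function name `window_scores`, the restriction to odd window sizes, and the choice to score only residues that have a full window around them are assumptions. The scale values are those of the Parker hydrophilicity scale in Table 1.

```python
# Parker hydrophilicity scale values, as listed in Table 1.
PARKER = {
    "A": 2.1, "C": 1.4, "D": 10.0, "E": 7.8, "F": -9.2,
    "G": 5.7, "H": 2.1, "I": -8.0, "K": 5.7, "L": -9.2,
    "M": -4.2, "N": 7.0, "P": 2.1, "Q": 6.0, "R": 4.2,
    "S": 6.5, "T": 5.2, "V": -3.7, "W": -10.0, "Y": -1.9,
}


def window_scores(sequence, scale, window=7):
    """Return (position, residue, score) for each residue with a full window.

    Positions are 1-based. Assumption: residues closer than (window - 1) / 2
    to either end of the sequence are skipped rather than scored with a
    truncated window.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("this sketch assumes an odd, positive window size")
    sequence = sequence.upper()
    half = (window - 1) // 2
    values = [scale[aa] for aa in sequence]
    results = []
    for i in range(half, len(values) - half):
        # Residue i plus `half` neighbors on each side.
        segment = values[i - half : i + half + 1]
        results.append((i + 1, sequence[i], sum(segment) / window))
    return results
```

For example, `window_scores("ACDEF", PARKER, window=3)` scores positions 2 to 4, and position 2 receives the mean of the values for A, C and D, which is 4.5.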

Table 1. Implemented methods
Method
Chou and Fasman beta turn prediction
  • Scale:
    A     C     D     E     F     G     H     I     K     L     M     N     P     Q     R     S     T     V     W     Y
    0.66  1.19  1.46  0.74  0.6   1.56  0.95  0.47  1.01  0.59  0.6   1.56  1.52  0.98  0.95  1.43  0.96  0.5   0.96  1.14
Emini surface accessibility scale
  • Scale:
    A     C     D     E     F     G     H     I     K     L     M     N     P     Q     R     S     T     V     W     Y
    0.49  0.26  0.81  0.84  0.42  0.48  0.66  0.34  0.97  0.4   0.48  0.78  0.75  0.84  0.95  0.65  0.7   0.36  0.51  0.76
Karplus and Schulz flexibility scale
  • Reference: Karplus PA, Schulz GE. Prediction of Chain Flexibility in Proteins - A tool for the Selection of Peptide Antigens. Naturwissenschaften 1985; 72:212-3.
  • Description: This method uses a flexibility scale derived from the mobility of protein segments, based on the known temperature (B) factors of the α-carbons of 31 proteins of known structure. The calculation is similar to the classical windowed calculation, except that the reference position is the first amino acid of a six-residue window, and three scales are used to describe flexibility instead of a single one.
Kolaskar and Tongaonkar antigenicity scale
  • Scale:
    A      C      D      E      F      G      H      I      K      L      M      N      P      Q      R      S      T      V      W      Y
    1.064  1.412  0.866  0.851  1.091  0.874  1.105  1.152  0.93   1.25   0.826  0.776  1.064  1.015  0.873  1.012  0.909  1.383  0.893  1.161
Parker Hydrophilicity Prediction
  • Scale:
    A      C      D      E      F      G      H      I      K      L      M      N      P      Q      R      S      T      V      W      Y
    2.1    1.4    10.0   7.8    -9.2   5.7    2.1    -8.0   5.7    -9.2   -4.2   7.0    2.1    6.0    4.2    6.5    5.2    -3.7   -10.0  -1.9
BepiPred-1.0 Linear Epitope Prediction
  • Reference: Jens Erik Pontoppidan Larsen, Ole Lund and Morten Nielsen. Improved method for predicting linear B-cell epitopes. Immunome Res. 2006; 2:2.
  • Description: BepiPred predicts the location of linear B-cell epitopes using a combination of a hidden Markov model and a propensity scale method. Residues with scores above the threshold (default value 0.35) are predicted to be part of an epitope; they are colored yellow on the graph (where the Y-axis shows residue scores and the X-axis shows residue positions in the sequence) and marked with "E" in the output table. The values of the scores are not affected by the selected threshold. The table below shows the relationship between the selected threshold and the sensitivity/specificity of the prediction method, calculated on the basis of the epitope/non-epitope predictions. The table is based on a large benchmark calculation containing close to 85 B cell epitopes.
    Threshold Sensitivity Specificity
    -0.20 0.75 0.50
    0.20 0.56 0.68
    0.35 0.49 0.75
    0.90 0.25 0.91
    1.30 0.13 0.96
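Applying the threshold is a per-residue comparison, which can be sketched as below. The sketch is illustrative only: `label_residues`, the "." marker for non-epitope residues, and the example scores are hypothetical and are not output from BepiPred. It shows that changing the threshold changes the "E" labels but not the scores themselves.

```python
def label_residues(scores, threshold=0.35):
    """Mark a residue "E" if its score is above the threshold, else "."."""
    return ["E" if s > threshold else "." for s in scores]


# Made-up scores for illustration; real values come from the prediction output.
example_scores = [-0.40, 0.10, 0.36, 0.95, 1.40, 0.35]

default_labels = "".join(label_residues(example_scores))        # "..EEE."
strict_labels = "".join(label_residues(example_scores, 1.30))   # "....E."
```

A score exactly equal to the threshold (the last residue above) is not labeled, following the "above the threshold" wording.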
BepiPred-2.0: Sequential B-Cell Epitope Predictor
  • Reference: Jespersen MC, Peters B, Nielsen M, Marcatili P. BepiPred-2.0: improving sequence-based B-cell epitope prediction using conformational epitopes. Nucleic Acids Res 2017.
  • Description: The BepiPred-2.0 server predicts B-cell epitopes from a protein sequence, using a Random Forest algorithm trained on epitope and non-epitope amino acids determined from crystal structures. A sequential prediction smoothing is performed afterwards. Residues with scores above the threshold (default value 0.5) are predicted to be part of an epitope; they are colored yellow on the graph (where the Y-axis shows residue scores and the X-axis shows residue positions in the sequence) and marked with "E" in the output table. The values of the scores are not affected by the selected threshold. The table below shows the relationship between the selected threshold and the sensitivity/specificity of the prediction method.
    Threshold Sensitivity Specificity
    0 1 0
    0.05 1 0
    0.10 1 0
    0.15 1 0
    0.20 1 0.00019
    0.25 0.99743 0.00419
    0.30 0.98995 0.0276
    0.35 0.97212 0.07036
    0.40 0.93605 0.15606
    0.45 0.82607 0.3307
    0.50 0.58564 0.57158
    0.55 0.29159 0.81655
    0.60 0.09559 0.95116
    0.65 0.01969 0.99272
    0.70 0.00182 0.99954
    0.75 0.99743 0.00419
    0.80 0 1
    0.85 0 1
    0.90 0 1
    0.95 0 1
    1 0 1
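Sensitivity and specificity at a given threshold follow from comparing the thresholded predictions with known epitope annotations. The sketch below assumes the standard definitions (sensitivity = TP / (TP + FN), specificity = TN / (TN + FP)); the function name and the toy data are hypothetical and do not reproduce the benchmark behind the table.

```python
def sensitivity_specificity(scores, is_epitope, threshold):
    """Return (sensitivity, specificity) for one threshold.

    A residue is predicted as epitope when its score is above the threshold.
    """
    tp = fp = tn = fn = 0
    for score, truth in zip(scores, is_epitope):
        predicted = score > threshold
        if predicted and truth:
            tp += 1
        elif predicted:
            fp += 1
        elif truth:
            fn += 1
        else:
            tn += 1
    # NaN signals that a class is absent from the annotations.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy data for illustration only (not taken from any benchmark).
toy_scores = [0.20, 0.45, 0.55, 0.65, 0.48, 0.52]
toy_truth = [False, False, True, True, True, False]

for t in (0.0, 0.5, 1.0):
    sens, spec = sensitivity_specificity(toy_scores, toy_truth, t)
    print(f"{t:.2f} {sens:.3f} {spec:.3f}")
```

On the toy data, a threshold of 0 labels every residue as epitope (sensitivity 1, specificity 0) and a threshold of 1 labels none (sensitivity 0, specificity 1), the same trade-off seen at the two ends of the table.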
II. Input
  1. Enter a protein sequence in plain format
  2. Select a prediction method
  3. Click submit
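If "plain format" is taken to mean the bare one-letter amino acid sequence with no FASTA header line, a sequence can be checked before submission with a sketch like the one below. That interpretation, the function name `to_plain_sequence`, and the decision to reject any character outside the 20 standard amino acids are assumptions, not documented behavior of the tool.

```python
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def to_plain_sequence(text):
    """Drop FASTA header lines and whitespace, then validate the residues."""
    lines = [line.strip() for line in text.splitlines()]
    body = "".join(line for line in lines if line and not line.startswith(">"))
    sequence = "".join(body.split()).upper()
    invalid = sorted(set(sequence) - STANDARD_AMINO_ACIDS)
    if invalid:
        raise ValueError(f"unexpected characters in sequence: {invalid}")
    return sequence
```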
III. Output