Automated Server Benchmarks

Joining the MHC I benchmark

To join the automated server benchmark, participants need to set up a RESTful web service that can parse the data sent by the automated benchmarking framework and return its predictions in a specified format. Once the web service is running, send an email to help@iedb.org with the following information:

Name of server/prediction tool
Contact email
Author list
PubMed ID (if the tool has been published)
Server description
URL of the server
HTTP request method
Benchmark data format
Max number of peptides per request
Correlation between prediction value and binding affinity
URL to a list of supported alleles and lengths

Result format
Participating servers should return their predictions to the automated benchmarking framework in a three column format with the following fields.

Allele Peptide Prediction

The three fields may be separated by spaces or tabs. If predictions for multiple peptides are returned, each peptide should be on a separate line, and each line should include all three fields.
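As a rough illustration of the three-column format, the sketch below turns a list of predictions into one line per peptide. The function name `format_predictions` and the scores are hypothetical placeholders, not part of the benchmarking framework.

```python
def format_predictions(allele, predictions):
    """Return one 'Allele Peptide Prediction' line per peptide, tab-separated."""
    lines = []
    for peptide, score in predictions:
        lines.append(f"{allele}\t{peptide}\t{score}")
    return "\n".join(lines)


# Placeholder scores, for illustration only (not real predictions).
print(format_predictions("HLA-A*02:01", [("SLAKNERFV", 25.3), ("WVATYNDSL", 4100.0)]))
```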

HTTP request method
We have designed the framework to allow some flexibility in how benchmark data is sent to the participating servers. Participants may choose to have the data sent to their web service using either the GET or the POST HTTP request method. See the “Setting up a RESTful web service” section for examples of how to handle either type of request. Depending on the request method selected, a different cURL command is used.

GET:
> curl "<SERVER_URL>?<DATA>"

POST:
> curl --data "<DATA>" <SERVER_URL>

where the <SERVER_URL> field will be replaced by the server’s URL and the <DATA> field will be replaced by the benchmark data to be run (explained in the next subsection).

Benchmark data format
Benchmark data is sent out to participating servers with the following default format.

Default:
peptide=<PEPTIDE>&allele=<ALLELE>

where <PEPTIDE> will be replaced by the peptide sequence and <ALLELE> will be replaced by the name of the MHC molecule. Participants may customize this format during the signup process. The custom format must include the <PEPTIDE> and <ALLELE> fields, but the rest of the string may be customized as desired. An optional <LENGTH> field is also supported, which will be replaced by the peptide length (8 to 11). The default GET and POST request formats are shown below.

GET:
> curl "<SERVER_URL>?peptide=<PEPTIDE>&allele=<ALLELE>"

POST:
> curl --data "peptide=<PEPTIDE>&allele=<ALLELE>" <SERVER_URL>
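For local testing before signup, the placeholder substitution and the two request styles can be mimicked in Python. This is only a sketch and not the framework's own code: the helper names `fill_template` and `build_request` and the URL `http://localhost:8080/predict` are assumptions made for illustration. The requests are built but not sent.

```python
import urllib.request

DEFAULT_FORMAT = "peptide=<PEPTIDE>&allele=<ALLELE>"


def fill_template(template, peptide, allele):
    """Substitute the placeholder fields in a benchmark data format string."""
    return (template.replace("<PEPTIDE>", peptide)
                    .replace("<ALLELE>", allele)
                    .replace("<LENGTH>", str(len(peptide))))  # single peptide assumed


def build_request(server_url, data, method="POST"):
    """Build (but do not send) a request equivalent to the curl commands above."""
    if method == "GET":
        # GET: the data string is appended to the URL as the query string.
        return urllib.request.Request(f"{server_url}?{data}")
    # POST: the data string is sent as the request body, as with curl --data.
    return urllib.request.Request(server_url, data=data.encode("ascii"))


data = fill_template(DEFAULT_FORMAT, "SLAKNERFV", "HLA-A*02:01")
req = build_request("http://localhost:8080/predict", data, method="GET")  # hypothetical URL
print(req.get_method(), req.full_url)
```

Passing a built request to `urllib.request.urlopen` would send it to a server running at that address.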

Max request
The framework also supports sending multiple peptides in a single request. The peptides are sent as a single comma-separated string at the location indicated by the <PEPTIDE> field in the benchmark data format. All peptides sent in the same request are of the same length, and all predictions should be made for the specified allele. A sample POST request sending three peptides to be predicted against HLA-A*02:01 is shown below.

Example:
> curl --data "peptide=SLAKNERFV,WVATYNDSL,YPQWVADSV&allele=HLA-A*02:01" <SERVER_URL>

The default maximum number of peptides per request is 1. Raising this limit during the signup process is highly recommended, as it substantially reduces the benchmark run time.
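On the server side, a multi-peptide request only requires splitting the peptide field on commas. The sketch below assumes the field has already been extracted from the request. The function name `split_peptides` is hypothetical, and the length check is an optional safeguard rather than something the framework requires.

```python
def split_peptides(peptide_field):
    """Split the comma-separated peptide field into a list of sequences."""
    peptides = [p.strip() for p in peptide_field.split(",") if p.strip()]
    # Peptides in one request are expected to share a single length.
    lengths = {len(p) for p in peptides}
    if len(lengths) > 1:
        raise ValueError(f"peptides of mixed length in one request: {sorted(lengths)}")
    return peptides


print(split_peptides("SLAKNERFV,WVATYNDSL,YPQWVADSV"))
```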

Correlation between prediction value and binding affinity
MHC-I binding prediction tools commonly return predictions as binding affinities (nM), where lower values indicate stronger binding and higher values indicate weaker binding. However, some prediction tools instead return scores indicating how likely a peptide is to bind an MHC-I molecule, in which case higher values indicate stronger binding. This relationship must be known in order to evaluate the prediction tool properly. At signup, indicate whether the tool's predictions correlate positively with binding affinity in nM (lower = stronger) or negatively (higher = stronger).
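To see why the direction matters, consider ranking peptides from strongest to weakest predicted binder. The sketch below is not the benchmark's evaluation code. The function names and all numbers are made up for illustration, and flipping the sign is just one simple way to put both kinds of output on a common orientation.

```python
def orient_scores(predictions, lower_is_stronger):
    """Return scores oriented so that a larger value always means stronger binding."""
    if lower_is_stronger:
        return [-p for p in predictions]
    return list(predictions)


def rank_strongest_first(peptides, predictions, lower_is_stronger):
    """Order peptides from strongest to weakest predicted binder."""
    scores = orient_scores(predictions, lower_is_stronger)
    order = sorted(range(len(peptides)), key=lambda i: scores[i], reverse=True)
    return [peptides[i] for i in order]


peptides = ["SLAKNERFV", "WVATYNDSL", "YPQWVADSV"]
# Made-up affinity-style values in nM (lower = stronger).
print(rank_strongest_first(peptides, [50.0, 5000.0, 500.0], lower_is_stronger=True))
# Made-up likelihood-style scores (higher = stronger).
print(rank_strongest_first(peptides, [0.9, 0.1, 0.5], lower_is_stronger=False))
```

Both calls produce the same ranking only because the direction flag is set correctly for each kind of output. With the wrong flag, the ranking would be reversed.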

Supported MHC molecules and peptide lengths
Each participating server is required to provide a list of supported alleles and peptide lengths, accessible online. Each line should contain the allele name in column 1 and a comma-separated list of the peptide lengths supported for that allele in column 2. The columns may be separated by spaces or tabs. An example is given below.

HLA-A*01:01 9,10,11
HLA-A*02:01 8,9,10,11
HLA-B*07:02 9,10
HLA-B*58:01 8,9

The list will be checked every time the weekly benchmark is run, and may be updated as support for more alleles and peptide lengths is added.
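A list in this two-column layout is simple to read programmatically. The following sketch parses it into a dictionary, which can be used to check a list before publishing it. The function name `parse_supported_alleles` is hypothetical, and skipping malformed lines is a choice made here, not documented framework behavior.

```python
def parse_supported_alleles(text):
    """Map each allele name to the list of peptide lengths it supports."""
    supported = {}
    for line in text.splitlines():
        fields = line.split()  # splits on spaces or tabs
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        allele, lengths = fields
        supported[allele] = [int(n) for n in lengths.split(",")]
    return supported


example = """HLA-A*01:01 9,10,11
HLA-A*02:01 8,9,10,11
HLA-B*07:02 9,10
HLA-B*58:01 8,9"""
print(parse_supported_alleles(example))
```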

Setting up a RESTful web service
For tool developers with no experience of REST or web services in general, setting up a RESTful web service may seem like a big hurdle. To make this step as smooth as possible, we provide two examples of how to create a web service compatible with the automated server benchmark. The first example is a RESTful web service created using Bottle, a simple and lightweight web framework written in the Python programming language. The second example uses the Common Gateway Interface (CGI).

Bottle is distributed as a Python module and may be downloaded here. We have written a template for a web service, which only requires you to fill in the path to your prediction tool and, if the tool's output differs from the format expected by the automated server benchmark, to add some parsing of the results it returns. Note that the template assumes the default POST request format. If the request format is changed, the input parsing in the template will need to be updated.

Another solution is to use CGI. Using CGI, a web server is able to link a URL to an executable program or script, which may be written in any programming language. We provide two CGI templates, written in Python and Perl. As with the Bottle template, the path to the prediction tool and the parsing of the results should be filled in. Both templates assume the default GET request format is used. The Bottle template as well as the two CGI templates may be downloaded here.
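To give a feel for what a CGI script handling the default GET format might look like, here is a minimal sketch using only the Python standard library. It is not one of the distributed templates. `run_prediction_tool` is a hypothetical stand-in that returns placeholder scores and would need to be replaced by a call to the actual prediction tool.

```python
import os
from urllib.parse import parse_qs


def run_prediction_tool(allele, peptides):
    """Hypothetical stand-in: replace with a call to the real prediction tool."""
    return [0.0 for _ in peptides]  # placeholder scores, not real predictions


def handle_query(query_string):
    """Turn a default-format query string into (status, response body)."""
    params = parse_qs(query_string)
    if "peptide" not in params or "allele" not in params:
        return "400 Bad Request", "missing peptide or allele\n"
    allele = params["allele"][0]
    peptides = params["peptide"][0].split(",")  # one or more peptides
    scores = run_prediction_tool(allele, peptides)
    # One "Allele Peptide Prediction" line per peptide.
    lines = [f"{allele}\t{p}\t{s}" for p, s in zip(peptides, scores)]
    return "200 OK", "\n".join(lines) + "\n"


def main():
    # Under CGI, the web server passes the GET query string in QUERY_STRING.
    status, body = handle_query(os.environ.get("QUERY_STRING", ""))
    print(f"Status: {status}")
    print("Content-Type: text/plain")
    print()  # blank line separates headers from the body
    print(body, end="")


if __name__ == "__main__":
    main()
```

Keeping the parsing in `handle_query`, separate from the CGI output in `main`, allows the request handling to be tested without a web server.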