Rapid determination of HLA B*07 ligands from the West Nile virus NY99 genome.

Defined T cell epitopes for West Nile (WN) virus may be useful for developing subunit vaccines against WN virus infection and diagnostic reagents to detect WN virus-specific immune responses. We applied a bioinformatics (EpiMatrix) approach to search the WN virus NY99 genome for HLA B*07 restricted cytotoxic T cell (CTL) epitopes. Ninety-five of 3,424 WN virus peptides scored above a predetermined cutoff, suggesting that these would be likely to bind to HLA B*07 and would also be likely candidate CTL epitopes. Compared with other methods for genome mapping, derivation of these ligands was rapid and inexpensive. Major histocompatibility complex ligands identified by this method may be used to screen T cells from WN virus-exposed persons for cell-mediated response to WN virus or to develop diagnostic reagents for immunopathogenesis studies and epidemiologic surveillance.


West Nile Virus
these hypotheses will require the development of reagents such as the T-cell epitopes defined in this study.

Applying Bioinformatics to Defining T-Cell Epitopes
New bioinformatics tools developed by the TB/HIV Research Lab and EpiVax (Providence, RI) enable researchers to move rapidly from genome sequence to epitope selection (16). EpiMatrix is a computer-driven pattern-matching algorithm that identifies T-cell epitopes. BlastiMer permits the analysis of protein sequences for homology with other known proteins.
The goal of this project was to demonstrate the utility of a bioinformatics and computational immunology approach for the rapid selection of T-cell epitope reagents. Defining these reagents will permit the evaluation of cell-mediated responses in the immunopathogenesis of WN virus, promote the development of diagnostic reagents such as tetramers (17), and provide components for epitope-based preventive or therapeutic vaccines (18-20). A secondary goal was to determine the time required to select and screen epitope candidates in vitro, since time may be a critical factor in the development of vaccines and diagnostic reagents in response to emerging infectious pathogens.
On the basis of experience with the EpiMatrix HLA B*07 prediction tool, we selected peptides for this pilot study that were expected to be restricted by HLA B*07. In studies of HIV-1 peptides, 60% of peptides selected by EpiMatrix HLA B*07 stimulated T-cell responses in vitro. We therefore expected that approximately 60% of WN virus peptides selected by the same criteria would bind to HLA B*07 and stimulate T-cell responses.
We screened 16 WN virus peptides and identified 12 epitope candidates, 5 of which exhibited strong binding to HLA B*07 at a range of peptide concentrations in vitro. The largest source of delay in the screening process was peptide synthesis (4 weeks from placement of order to receipt of the first set of peptides and 8 weeks until delivery of the final set of peptides). This process could be accelerated if more rapid access to MHC ligands were necessary.
The binding studies we describe are a first step to confirming immunogenicity. In cases such as WN virus, in which access to T cells from infected persons is limited, both the bioinformatics step and the binding assays can be carried out without clinical specimens. Once the epitope candidates selected by this method are confirmed in cytotoxic T-cell (CTL) assays, they may be useful 1) for screening exposed persons for T-cell responses, 2) for investigating the immunopathogenesis of WN virus disease in humans, 3) as components of diagnostic kits developed for WN virus surveillance, 4) as reagents for measuring WN virus vaccine-related immune responses, and possibly 5) as components of a subunit vaccine for WN virus. Confirmation of T-cell response to the peptides will depend on availability of peripheral blood cells from WN virus-infected patients during the 2001 transmission season. Additional peptides also need to be identified and screened for binding to other HLA alleles, to broaden the MHC specificity of the diagnostic reagent or immunopathogenesis tools developed by this approach.

Bioinformatics Analysis
We obtained the NY 1999 WN virus sequence from GenBank (GenBank accession number AF196835) (21). The 3,433 amino acids in the GenBank translation were parsed into 3,424 10-amino acid long frames, each 10 amino acid-long peptide sequence overlapping the previous peptide sequence by nine amino acids. The sequences of these 3,424 decamers were stored in a database.
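The parsing step described above can be sketched in a few lines. This is a minimal illustration, not the software used in the study: the function name `overlapping_frames` is hypothetical, and the sequence is a dummy placeholder of the stated length (3,433 residues) rather than the AF196835 translation.

```python
def overlapping_frames(sequence, length=10, step=1):
    """Return every window of `length` residues, advancing `step` residues each time."""
    last_start = len(sequence) - length
    return [sequence[i:i + length] for i in range(0, last_start + 1, step)]

# Placeholder polyprotein: only its length (3,433 residues) comes from the text;
# the residues are dummy values, not the real WN virus sequence.
polyprotein = "A" * 3433

decamers = overlapping_frames(polyprotein)
print(len(decamers))  # 3433 - 10 + 1 = 3424 frames, each overlapping the previous by 9
```

With `step=1`, a sequence of N residues yields N - 9 decamers, which is how 3,433 amino acids give 3,424 frames.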
Each of the peptides in the database was then evaluated by EpiMatrix, a matrix-based algorithm that ranks 9 and 10 amino acid peptides by estimated probability of binding to a selected MHC molecule (22). The estimated binding potential (EBP) is derived by comparing the EpiMatrix score with those of known binders and presumed nonbinders. The EBP describes the proportion of peptides with EpiMatrix scores as high or higher than known binders for a given MHC molecule. Both retrospective and prospective studies of EpiMatrix predictions have confirmed the accuracy of this T-cell epitope selection method (22-24). EpiMatrix is available for use by HIV researchers on the TB/HIV Research Laboratory website (http://tbhiv.biomed.brown.edu/) and under collaborative and commercial arrangements with the TB/HIV Research Laboratory and EpiVax, Inc. (Providence, RI), respectively. Table 1 illustrates the process of selecting candidate B*07 ligands from the WN virus genome. Of six overlapping peptides in the region of the WN virus sequence shown (Table 1), WN virus B7 0019 scored in the same range as known B*07 ligands and HLA B*07-restricted epitopes (EBP 22.49). Therefore, this peptide would be considered the most likely candidate to show binding to HLA B*07 of the six peptides in this illustration.
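The general idea of matrix-based scoring can be illustrated with a toy example. This sketch is not EpiMatrix: the matrix `TOY_MATRIX`, its weights, and the function `toy_score` are invented for illustration and encode only the conventional anchor preferences mentioned in the Discussion (proline at position 2, leucine or phenylalanine at the C-terminus). The real EpiMatrix matrix and the EBP calculation are not reproduced here.

```python
# Hypothetical weights keyed by 0-based position; -1 means the C-terminal residue.
# These numbers are illustrative assumptions, not EpiMatrix values.
TOY_MATRIX = {
    1: {"P": 2.0},
    -1: {"L": 1.5, "F": 1.0},
}

def toy_score(peptide, matrix=TOY_MATRIX):
    """Sum the weight of the residue found at each position listed in the matrix."""
    score = 0.0
    for position, weights in matrix.items():
        score += weights.get(peptide[position], 0.0)
    return score

# Two sequences quoted in this article plus an arbitrary all-glycine decamer.
peptides = ["GPGHKARVLA", "AAKKKGASLL", "GGGGGGGGGG"]
for peptide in sorted(peptides, key=toy_score, reverse=True):
    print(peptide, toy_score(peptide))
```

A full matrix would carry a weight for every residue at every position, which is what allows a method of this kind to rank peptides that lack the canonical anchors.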
EBPs for the WN virus peptides ranged from >20% (highly likely to bind) to <1% (very unlikely to bind) (Figure 1). We also scored 10,000 random peptides of natural amino acid composition (25) derived from the ExPASy (Expert Protein Analysis System) proteomics server at the Swiss Institute of Bioinformatics (Randseq, http://www.expasy.ch/tools/randseq.html). We compared the HLA B*07 EpiMatrix scores of this set of random peptides with those of a set of >300 known binders (compiled and maintained at EpiVax) and with the scores of the set of WN virus peptides selected for this study (Figure 2).

Selection of Peptides
Peptides with EpiMatrix EBP scores in the range of 7 to 50 are more likely to bind to MHC and stimulate T cells in vitro (23). Peptides with an EBP score >50 are less likely to be immunogenic, although they may bind to B7 in vitro (16,23). Four peptides with very low EBP scores (Table 2a) were also selected to test the hypothesis that low-scoring peptides derived from WN virus would not bind to HLA B*07 in vitro (predicted nonbinders). One well-defined B*07-restricted epitope, GPGHKARVLA (derived from HIV), was also chosen as a positive control for the assays (26).

Cross-Reactive Analysis
After the EpiMatrix analysis, the Conservatrix tool (EpiVax, Providence, RI) was used to align and compare the WN virus sequences with those of other related flaviviruses (21). In an intermediate step designed to avoid selecting epitopes that may have cross-reactivity with "self," each of the highly selected epitopes was passed through the Blast engine at the National Center for Biotechnology Information, using the BlastiMer tool (EpiVax, Providence, RI). Any sequence that was similar to (i.e., >80% identical to the 10 amino acid WN virus NY99 sequence) a peptide component of equivalent length in the human genome (accessible and published to date) was excluded from the study set.
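The exclusion rule (>80% identity to a human peptide of equal length) can be sketched as a simple ungapped comparison. This is an assumption-laden stand-in: the study used BLAST through the BlastiMer tool, whereas the hypothetical helpers `percent_identity` and `resembles_self` below just slide a window along each sequence. The "human" sequences in the example are made up.

```python
def percent_identity(a, b):
    """Percentage of positions at which two equal-length peptides match."""
    if len(a) != len(b):
        raise ValueError("peptides must be the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)

def resembles_self(peptide, human_sequences, threshold=80.0):
    """True if any equal-length window of any sequence is more than `threshold`% identical."""
    n = len(peptide)
    for seq in human_sequences:
        for i in range(len(seq) - n + 1):
            if percent_identity(peptide, seq[i:i + n]) > threshold:
                return True
    return False

# Invented "human" sequences, for illustration only.
print(resembles_self("AAKKKGASLL", ["MMAAKKKGASLVMM"]))  # contains a 9-of-10 match
print(resembles_self("AAKKKGASLL", ["AAKKKGASVV"]))      # best match is 8 of 10
```

For a decamer, "more than 80% identical" means at least 9 of 10 residues match, so a peptide with exactly 8 matching residues is retained under this reading of the rule.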

Peptide Synthesis
Peptides corresponding to the epitope selections were prepared by 9-fluorenylmethoxycarbonyl synthesis on an automated Rainin Symphony/Protein Technologies synthesizer (Synpep, Dublin, CA). The peptides were delivered 90% pure as ascertained by high-performance liquid chromatography, mass spectrometry, and UV scan. The peptides were shipped as lyophilized powder, which was dissolved in a minimal volume of dimethyl sulfoxide and then diluted to stock concentrations in RPMI 1640 medium (Sigma, St Louis, MO). Peptides that could not be purified to specifications within the study period were not evaluated.
Figure 1. Distribution of scores for the complete set of 3,424 peptides obtained by parsing the West Nile (WN) virus genome into 10 amino acid-long peptides, each overlapping by 9 amino acids, as scored on the EpiMatrix motif for HLA B*07. Peptides with estimated binding potential (EBP) scores >7 and <50 with the HLA B*07 motif are highly likely to bind to HLA B*07 in T2B7 assays and to stimulate T cells. WN virus peptides with EBP scores between 20 and 50 were considered for study.
Figure 2. EpiMatrix HLA B*07 score distributions for a random set of 10,000 peptides (dark blue), a set of 20 West Nile (WN) virus peptides selected for screening (magenta), and a set of known HLA B*07 ligands (light blue) are compared. The natural log of estimated binding potential (EBP) for all three sets (random, known binders, and WN virus selections) fell within the range -5 to 5. Scores for the set of WN virus peptides selected for this study are higher than those of most random peptides and are within the same range as scores of published HLA B*07 binders.

MHC Binding Studies
The T2B7 binding assay method (23,24) relies on the ability of exogenously added peptides to stabilize the class I MHC/beta 2 microglobulin structure on the surface of transporters associated with antigen processing (TAP)-deficient cell lines (27,28). Briefly, the HLA B*07 T2 cell line was prepared for the assay by incubating overnight (16 hours) at 26°C. Before the binding assay, these cells were washed twice in serum-free media. Solutions of the test peptides at three concentrations (final concentrations of 10, 20, and 200 µg/mL in RPMI 1640 [Sigma, St Louis, MO]) were plated in triplicate wells of a 96-well, round-bottom assay plate (Becton Dickinson, Lincoln Park, NJ). Sixteen wells containing cells without peptide were included in each plate as background controls.
After 100,000 cells were added to each well, the plates were incubated for 4 hours at 37°C, 5% CO2, followed by centrifugation at 110 x g for 10 minutes at 4°C. The supernatant was discarded, and the remaining cells were resuspended. One hundred µL of anti-HLA-B*07 primary antibody-containing hybridoma supernatant was diluted in staining buffer (1: The binding studies are predicated on the assumption that the primary antibody recognizes an epitope on the HLA with a configuration that is unchanged by the stabilizing peptide. The plates were incubated for 30 minutes at 4°C, then washed three times with staining buffer. The contents of each well were then resuspended in 200 µL of fixing buffer (PBS, 1% paraformaldehyde).
The 16 negative control wells in each plate contained no peptide but did contain cells, primary antibody, and secondary antibody. An additional set of wells was plated with peptide at the highest concentration (200 µg/mL), but no primary antibody was added to the wells as a control for nonspecific secondary antibody binding. One positive control peptide (the known B*07 binder) was tested in triplicate at three concentrations (final concentrations of 10, 20, and 200 µg/mL in RPMI 1640) in each assay plate.
Following fixing, the presence of fluorescent secondary antibody on the surface of T2 cells (gated to the appropriate cell size) was measured at 488 nm on a FACScan flow cytometer (Becton Dickinson). The mean linear fluorescence of 10,000 events was measured and compared with the background fluorescence of cells plated in control wells. The entire assay was repeated 4 times, so that each peptide was tested in a total of 36 wells (triplicate wells, three concentrations, four assays).
The B*07 molecule was considered to be stabilized on the surface of the T2B7 cells if the average of the mean linear fluorescence for the triplicate wells at each concentration of peptide was >10% higher than the average of the 16 negative control wells (and p<0.05 in two-way comparisons by ANOVA). Binding was rated as strong, moderate, weak, or none, based on the number of significantly positive wells by pair-wise ANOVA (Table 3).
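The fluorescence part of this criterion can be sketched as follows. This is a partial illustration under stated assumptions: the function name `exceeds_background` and all fluorescence values are made up, and the accompanying significance test (p<0.05 by ANOVA) is deliberately left out, so the function alone does not reproduce the full binding call or the strong/moderate/weak ratings of Table 3.

```python
from statistics import mean

def exceeds_background(test_wells, control_wells, margin=0.10):
    """True if the mean of the peptide wells is more than `margin` above the control mean."""
    return mean(test_wells) > (1.0 + margin) * mean(control_wells)

# Invented mean linear fluorescence values, for illustration only.
controls = [100.0] * 16               # 16 no-peptide wells on one plate
peptide_high = [115.0, 112.0, 118.0]  # triplicate wells at one concentration
peptide_low = [104.0, 106.0, 105.0]

print(exceeds_background(peptide_high, controls))  # mean 115 vs threshold of about 110
print(exceeds_background(peptide_low, controls))   # mean 105 vs threshold of about 110
```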

Results
The 3,424 decamers derived from the WN virus genome were scored by EpiMatrix for match to the stored HLA B*07 matrix pattern. Most decamers scored for the entire WN virus genome (by the HLA B*07 scoring matrix) had EBP scores <1% (Figure 1). Figure 2 shows the distribution of HLA B*07 scores of a set of 10,000 random peptides (plotted as their natural logs, to allow better distribution of EBP scores <1), compared with scores for the set of >300 known HLA B*07 binders and with the scores of the selected WN virus peptides. The set of peptides selected for study scored well within the EBP range of the comparison set of >300 known HLA B*07 ligands (Figure 2).
Each peptide in the entire WN virus-NY99 dataset of peptides was scored by EpiMatrix. Ninety-five of the 3,424 decamers had EBP scores >7%. Of these 95 peptides, 20 of the 25 with EBP scores between 20% and 50% (Table 2a) were selected for screening. Three peptides with EBP scores >50 (0001, 0002, 0003) were eliminated from the set of peptides tested because peptides with scores in this range are less likely to be B*07 ligands and epitopes (TB/HIV Research Lab and EpiVax, unpub. data). The amino acid sequence of peptide 0012 overlapped substantially with the human genome, and for that reason this peptide was also excluded. Three of the original 25 peptides (0014, 0021, 0025) could not be synthesized to sufficient purity within the study timeframe. Two peptides with EBP scores between 20 and 50 (0016 and 0022) were also not tested because they did not fall within a region of the WN virus genome belonging to a mature WN virus protein, based on information in the GenBank database. Sixteen WN virus peptides remained in the final selection.
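The exclusions above can be tallied with simple set arithmetic. One assumption is made here and is not stated in the text: that the 25 top-ranked candidates carry the consecutive identifiers 0001 through 0025. The exclusion groups and their peptide numbers are taken from the paragraph above.

```python
# Assumption: the 25 candidates are numbered 0001-0025 consecutively.
candidates = {f"{i:04d}" for i in range(1, 26)}

excluded = {
    "EBP score >50": {"0001", "0002", "0003"},
    "overlap with human sequence": {"0012"},
    "insufficient purity": {"0014", "0021", "0025"},
    "outside a mature protein": {"0016", "0022"},
}

remaining = set(candidates)
for reason, ids in excluded.items():
    remaining -= ids

print(len(remaining))     # 25 - 3 - 1 - 3 - 2 = 16
print(sorted(remaining))
```

Under that numbering assumption, the nine exclusions leave the 16 peptides that entered the binding assays.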
The final set of 16 WN virus peptides included two from NS-1, four from NS-2A, five from NS-3, one from NS-4A, five from NS-5, one from env gp E, and one from prM (Table 2a). In addition to these peptides, four predicted nonbinder peptides and a known binder (1291) were also synthesized. Twenty-one peptides were tested in vitro in T2B7 binding assays.

Binding Results
Triplicate wells of peptide at 10, 20, and 200 µg/mL were evaluated in each of the T2B7 binding assays. Table 3 provides information on the mean fluorescence index for the peptide at 200 µg/mL; the average fold increase over background for the peptide at 10, 20, and 200 µg/mL; and the ANOVA analysis for each pairwise comparison (between fluorescence for cells incubated with one of the concentrations of the study peptide and the fluorescence of the cells in control wells).
Twelve of the 16 study peptides demonstrated consistent binding in the four replicate assays. Of these peptides, four (0017, 0019, 0020, and 0023) stabilized HLA B*07 on the surface of T2B7 cells substantially more often than controls in the four replicate assays (strong binders, Table 3). Two WN virus peptides (0008, 0009) stabilized HLA B*07 to a moderate degree. Six WN virus peptides (0005, 0006, 0013, 0015, 0018, and 0024) were weak binders, and four did not bind.
The positive control peptide, 1291, was tested with each set of peptides. The peptide bound significantly over background (based on ANOVA) in all three assays. Four negative control peptides selected for low EBP scores (3399, 3404, 3411, and 3415, all with scores of 0.0%) did not stabilize T2B7 to a significant degree.

Cross Strain Analysis Results
Peptide 0019, a strong binder, was conserved in all strains of WN virus (100% or 10 of 10 amino acids) and Kunjin virus; it was 80% conserved in JE virus strains (

Estimated Cost
The 3,329 peptides with EBP scores <7% (3,424 minus 95) were considered unlikely to bind to HLA B*07. The EpiMatrix approach reduced the number of candidate peptides by 97% (3,329/3,424). Some researchers have adopted a standard overlapping (OL) approach (constructing a set of 10 amino acid-long peptides overlapping by 5 amino acids covering the entire genome [29]). This strategy (10/5 OL set) would have required the synthesis of 685 decamer peptides for the WN virus genome, more than 7 times the number (95) selected by the EpiMatrix approach.
The cost of synthesizing the 16 putative ligands and four controls (at a cost of $250 per peptide) for this project was $5,000. Synthesizing the entire selected set and four controls would have cost $24,750. Had the standard overlapping peptide approach been used, the cost of synthesizing OL peptides would have been approximately $170,000 ($250 for each of 685 peptides). The cost of synthesizing and mapping the complete overlapping set of peptides representing decamer peptides overlapping by 9 amino acids (3,424 peptides) would be $856,000 (Table 4).
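The synthesis figures above follow from the per-peptide price and the peptide counts. The sketch below reproduces the arithmetic; the function name `synthesis_cost` is hypothetical, and the exhaustive set is taken as the 3,424 decamers described under Bioinformatics Analysis. Screening costs are not included.

```python
COST_PER_PEPTIDE = 250  # US dollars per peptide, as stated in the text

def synthesis_cost(n_peptides, unit_cost=COST_PER_PEPTIDE):
    """Synthesis cost only; screening costs are not modeled."""
    return n_peptides * unit_cost

pilot = synthesis_cost(16 + 4)          # 16 putative ligands plus 4 controls
full_selected = synthesis_cost(95 + 4)  # all 95 selected peptides plus 4 controls
overlap_10_5 = synthesis_cost(685)      # 10-mers overlapping by 5
exhaustive = synthesis_cost(3424)       # 10-mers overlapping by 9

reduction = (3424 - 95) / 3424          # fraction of decamers set aside by EpiMatrix

print(pilot, full_selected, overlap_10_5, exhaustive)  # 5000 24750 171250 856000
print(round(100 * reduction))                          # 97
```

The 10/5 set comes to $171,250, which the text rounds to approximately $170,000.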
If the WN virus B7 peptides behave as observed in previous studies of HLA B*07 peptide datasets (23; De Groot et al., unpub. data), additional HLA-B7 ligands would be identified (approximately 76%; 72 of the set of 95 WN virus peptides with EBP scores >7). If, by performing more overlapping peptide assays, this larger set of 72 (putative) ligands had been found, the cost per ligand with the OL approach would have been approximately $3,600 per ligand, compared with $617 per ligand for 72 ligands with the EpiMatrix approach. If no epitopes were to be missed, the exhaustive approach could be used at an estimated cost of $18,000 per ligand. This approach would have cost approximately five times more than the OL approach and 30 times more than the EpiMatrix approach.
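The ratios quoted above can be checked directly from the per-ligand figures in the text. The per-ligand costs themselves include screening costs that are not itemized in this section, so they are taken as given here rather than recomputed; the dictionary keys are just labels.

```python
# Per-ligand "discovery" costs in US dollars, copied from the text (72-ligand scenario).
per_ligand = {"EpiMatrix": 617, "OL (10/5)": 3600, "exhaustive": 18000}

ratio_vs_ol = per_ligand["exhaustive"] / per_ligand["OL (10/5)"]
ratio_vs_epimatrix = per_ligand["exhaustive"] / per_ligand["EpiMatrix"]
expected_ligands = round(0.76 * 95)  # 76% of the 95 peptides with EBP scores >7

print(ratio_vs_ol)                   # 5.0
print(round(ratio_vs_epimatrix, 1))  # 29.2, i.e., roughly 30
print(expected_ligands)              # 72
```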

Time Required for Analysis
Analysis of the WN virus genome and selection of the WN virus peptides was performed during one working week. Selected peptides were obtained in batches over a 4-week period. T2B7 binding assays were performed as the peptides arrived. Overall, the T2B7 binding assays and data analysis took place over 20 working days, and the entire process from peptide selection to completion of data analysis took 8 weeks. Eliminating delays associated with peptide synthesis would have reduced the time required to 4 weeks.

Discussion
Using the EpiMatrix approach, we rapidly identified four excellent B*07-restricted T-cell epitope candidates for WN virus. Overall, 12 (75%) of 16 selected peptides bound in T2B7 binding assays. These binding results compare favorably with those of other T2B7 binding results for HIV-1 (16,23). Xia Jin et al. tested 29 HIV-1 peptides with EBP scores of 7%, of which 10 (35%) bound to T2B7 cells in vitro and 4 (14%) were subsequently demonstrated to be HLA-B7 restricted CTL epitopes in assays performed with CD8+ T-cell lines derived from an HIV-infected patient (23). In a separate study (16) of HLA B*07-restricted peptides, 25 peptides were tested, including a known HLA B*07-restricted epitope (peptide 1291, also used in this study). Nineteen (76%) of 25 peptides were shown to bind to T2B7 cells in vitro, and 60% of the peptides stimulated gamma-interferon release in T-cell assays performed with HIV-1-infected patients' cells.
Based on these experiences with EpiMatrix HLA-B7 selection, additional peptides from the original list of 95 WN virus peptides (EBP scores >7) might be expected to bind to HLA B*07 and stimulate T-cell responses. If the rest of the WN virus B7 peptides behave as observed in the HIV-1 datasets, 21 to 60 additional HLA-B7 ligands might be identified (76%, or 72 of the set of 95 WN virus peptides with EBP scores >7). This observation is also consistent with estimates of the number of epitopes in a given protein (30). Even at this higher number of total ligands, the cost per ligand of the OL approach would still have been higher than that of the EpiMatrix approach. Furthermore, the exhaustive approach would have cost approximately five times more than the OL (10/5) approach and 30 times more than the EpiMatrix approach. The EpiMatrix approach would also be substantially more rapid than OL or exhaustive testing of overlapping peptides.
EpiMatrix is one of several epitope mapping tools available to researchers, including the tool available at the SYFPEITHI (31) website and the HLA binding prediction tool available on the National Institutes of Health (BIMAS) site (32). Neither of these sites returned exactly the same predictions as EpiMatrix for the WN virus genome; however, no direct comparison was made. Either of these web-based epitope-mapping tools could also accelerate the process of epitope mapping the WN virus genome by the approach described here.
The matrix-based approach used by EpiMatrix developers occasionally results in the selection of peptides that do not fit standard anchor-based and extended anchor motifs such as those available on the SYFPEITHI website. As a result, WN virus peptides selected by the EpiMatrix method and included in this study did not always fit the conventional, anchor-based format of proline in position 2 and leucine or phenylalanine in position 9 (17). For example, the sequence of one weak WN virus binder, AAKKKGASLL, has little in common with published HLA B*07 motifs, illustrating how EpiMatrix is able to prospectively identify ligands that do not necessarily match anchor-based motifs.
Although EpiMatrix appears to provide excellent discrimination between most published HLA B*07 ligands and a set of random peptides (Figure 2), there is still overlap between the lower-scoring published HLA B*07 ligands and the scores of some of the random peptides. Since the universe of HLA B*07 ligands is unknown, some of the set of random peptides could be previously unidentified HLA B*07 ligands. Furthermore, EpiMatrix scored several known HLA B*07 ligands very low, reflecting either inaccuracy of the HLA B*07 matrix or inaccurate reporting of these ligands. Further study of these low-scoring HLA B*07 ligands may improve knowledge of the rules determining HLA B*07 binding.
Epitopes that are specific for WN virus could be used to develop diagnostic tests such as tetramer assays for WN virus (17). The tetramer staining assay relies only on the interaction between the tetramer reagent and T-cell receptors on the surface of T cells; it can be performed in <30 minutes on as little as 2 mL of blood. Peptide 0008 was unique to WN virus, with only 8 of 10 amino acids in this sequence conserved in Kunjin virus; the sequence was even less well conserved in other members of the flavivirus family. Peptide 0009 would also be a strong candidate reagent for a diagnostic test, as it was conserved in Kunjin and in many strains of WN virus but not in any other member of the flavivirus family.
Table 4 notes: The standard overlapping approach, constructing a set of 10 amino acid-long peptides overlapping by 5 amino acids (10/5 OL set), would require the synthesis of 685 decamer peptides, approximately 30 times the number synthesized and tested by the EpiMatrix approach. The "discovery" cost per ligand was calculated by dividing the total cost of synthesis and screening for each of the approaches by the number of ligands expected to be discovered (12 ligands, a low estimate, and 72 ligands, a high estimate). b Based on the assumption that only 12 ligands will be found. c Based on the assumption that as many as 72 ligands may be found. In that case, 95 peptides would be synthesized for EpiMatrix, 685 for OL (10 by 5), and 3,424 for the exhaustive approach.
The incubation period in humans (i.e., time from infection to onset of disease symptoms) for WN virus encephalitis is usually 5 to 15 days. Antibodies are detectable within 3 to 7 days; however, to confirm infection, antibody assays must be repeated in the acute and convalescent phases. In contrast, recent tetramer-staining studies (33) indicate that cell responses may be detectable 2 to 3 days after acute infection. The initial CTL response to acute infection with a virus, as measured by tetramer technology, can be dramatic. For example, during the acute immune response to lymphocytic choriomeningitis virus (LCMV) in BALB/c mice, 55% of all CD8+ splenocytes are stained with an LCMV-specific tetramer (34). The method is extremely robust and can detect antigen-specific populations at frequencies as low as 1:5,000 CD8+ T cells, or approximately 1:50,000 peripheral blood mononuclear cells (35). Results of the studies performed here suggest that peptides 0008 and 0009, which are relatively specific for WN virus and which score in the range of EpiMatrix scores shown to be compatible with immunogenicity (24), would be reasonable first candidates for the development of a tetramer-based diagnostic reagent for WN virus.
No specific vaccine or antiviral treatment exists for WN virus infection. CTL response will likely be one critical component of the immune response against WN virus. Development of a preventive or therapeutic vaccine against this public health threat would be greatly expedited if the correlates of immune response were determined and appropriate components rapidly incorporated into a vaccine. Epitopes defined by methods such as the one described here are likely to contribute substantially to the development of new research and diagnostic reagents and vaccines for WN virus and other emerging infectious diseases.
Dr. De Groot is director of the TB/HIV Research Laboratory and assistant professor of community health and medicine at Brown University. She is also CEO and President of EpiVax, Inc., a privately owned bioinformatics and vaccine design company in Providence, RI. She trained in informatics and immunology at the National Institutes of Health and is an HIV/AIDS specialist in correctional settings.