EpiMatrix

Variation in human MHC ensures that the surveillance capabilities of the human immune system are both broad and deeply redundant, making immune escape through mutation more difficult for pathogenic organisms. Unfortunately, this variation also greatly complicates T cell epitope selection for vaccine designers. Selecting T cell epitopes for too many alleles creates a larger pool of epitopes than can practically be incorporated into a vaccine. On the other hand, selecting epitopes for too few alleles may result in a vaccine that is effective in only a small portion of the population. Fortunately, some alleles are much more common than others, and the binding repertoires of many alleles overlap significantly. By focusing on “archetypal” or “super-type” alleles that are both common and different from each other, one can reduce the search space to a manageable size. For Class I, we focus on six super-type alleles (A*0101, A*0201, A*0301, A*2402, B*0702, and B*4403), and for Class II, on eight super-type alleles (DRB1*0101, *0301, *0401, *0701, *0801, *1101, *1301, and *1501); together, these alleles “cover” the genetic background of most humans worldwide.

In a typical analysis, protein antigens are parsed into overlapping 9-mer frames, where each 9-mer overlaps the previous one by eight amino acids. Each 9-mer is then scored for predicted binding affinity against a panel of Class I or Class II alleles. The EpiMatrix algorithm compares the amino acid sequence of each 9-mer to the coefficients contained in the corresponding matrix and produces a raw score. To allow potential epitopes to be compared across multiple HLA alleles, EpiMatrix raw scores are converted to a normalized “Z” scale. Peptides scoring above 1.64 on the EpiMatrix “Z” scale (typically the top 5% of any given sample) are likely to be MHC ligands and are worthy of further consideration.
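The frame parsing, scoring, and normalization steps described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the EpiMatrix implementation: the additive scoring over a position-specific matrix, the use of a reference set of raw scores to obtain the mean and standard deviation, and all function names (`parse_frames`, `raw_score`, `z_scores`, `predicted_ligands`) are hypothetical. Only the 9-mer frame length, the eight-residue overlap, and the 1.64 cutoff come from the text.

```python
from statistics import mean, pstdev

FRAME_LENGTH = 9    # 9-mer frames, as described in the text
Z_THRESHOLD = 1.64  # cutoff on the Z scale, as described in the text


def parse_frames(sequence, length=FRAME_LENGTH):
    """Return all overlapping frames; neighbors share length - 1 residues."""
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


def raw_score(frame, matrix):
    """Assumed additive model: sum one coefficient per position.

    `matrix` is a list with one dict per position mapping residue -> coefficient.
    Residues missing from a position's dict contribute 0.0 (an assumption).
    """
    return sum(matrix[pos].get(residue, 0.0) for pos, residue in enumerate(frame))


def z_scores(raw, reference):
    """Normalize raw scores against a reference set of raw scores (assumed)."""
    mu = mean(reference)
    sigma = pstdev(reference)
    return [(r - mu) / sigma for r in raw]


def predicted_ligands(sequence, matrix, reference):
    """Return (frame, z) pairs whose Z score exceeds the threshold."""
    frames = parse_frames(sequence)
    raws = [raw_score(frame, matrix) for frame in frames]
    zs = z_scores(raws, reference)
    return [(frame, z) for frame, z in zip(frames, zs) if z > Z_THRESHOLD]
```

In this sketch one matrix and one reference set would be supplied per allele, so that Z scores from different alleles land on a common scale. A protein shorter than nine residues yields no frames.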