Reverse Vaccinology Approach in Constructing a Multi-Epitope Vaccine Against Cancer-Testis Antigens Expressed in Non-Small Cell Lung Cancer

Background: The 5-year survival rate of non-small cell lung cancer (NSCLC) patients has not significantly improved despite advancements in currently applied treatments. Thus, efforts are being put forth to develop novel immunotherapeutic agents targeting cancer-testis antigens (CTAs) in NSCLC. This work utilized a reverse vaccinology approach to design a novel multi-epitope vaccine targeting melanoma-associated antigen 3 (MAGEA3), MAGEA4, New York esophageal squamous cell carcinoma-1 (NY-ESO-1), and Kita-Kyushu lung cancer antigen 1 (KK-LC1), which are the most frequently expressed CTAs in NSCLC. Methods: Epitopes were mapped from the sequences of the CTAs. The population coverage (PC) of the identified CD4+ and CD8+ epitopes was estimated. Candidate linear B cell (BL), CD4+, and CD8+ epitopes were adjoined in a multi-epitope construct (Mvax) with a flagellin domain as an adjuvant. The antigenicity and cross-reactivity of Mvax were examined. The tertiary structure of Mvax was modelled and validated. All epitopes included in the vaccine were docked with their human leukocyte antigen (HLA) binders. The immunogenicity of epitopes in Mvax was validated through molecular dynamics analysis. Results: Mvax contains 22 epitopes from MAGEA3, MAGEA4, NY-ESO-1, and KK-LC1. It is classified as antigenic, non-allergenic, and non-toxic, and it possesses physicochemical stability. The epitopes have no significant hits with other human proteins, except for 2 other CTAs frequently expressed in NSCLC. The stretch of BL epitopes in Mvax confers flexibility and accessibility, emphasizing its antigenicity. The tertiary structure analysis showed that the Mvax model has good structural quality. All epitopes included in the vaccine are highly immunogenic, as indicated by favorable binding affinity, low binding energy, and acceptable root-mean-square deviation (RMSD). CD4+ and CD8+ epitopes have global PC of 81.81% and 84.15%, respectively.
Conclusion: Overall, in silico evaluations show that Mvax is a potential immunotherapeutic agent against NSCLC.


Introduction
Lung cancer is the leading cause of cancer-related deaths in the world, with non-small cell lung cancer (NSCLC) making up 84% to 87% of lung cancer cases (Bray et al., 2018). Current treatments include radiation therapy, surgery, chemotherapy, monoclonal antibodies, [...] the most frequently expressed CTAs in NSCLC (Hanagiri et al., 2013; Keshavarz-Fathi and Rezaei, 2020). Similar to MAGEA3, MAGEA4 is also frequently expressed in NSCLC patients (Chen et al., 2013; Hou et al., 2020). Another commonly expressed CTA is New York esophageal squamous cell carcinoma-1 (NY-ESO-1), also known as cancer/testis antigen 1. It is encoded by the gene cancer/testis antigen 1A (CTAG1A), which is expressed in testis, ovary, and various cancer tissues such as NSCLC. NY-ESO-1 is currently known as the most immunogenic CTA (Gjerstorff et al., 2013; Xia et al., 2018; Smith and Iwenofu, 2018). Kita-Kyushu lung cancer antigen 1 (KK-LC1), encoded by cancer/testis antigen 83 (CT83), is a single-pass membrane protein normally expressed in testis, and in several cancer types including lung cancer (Fukuyama et al., 2006; Jin et al., 2018; Marcinkowski et al., 2019; Ichiki et al., 2020). The strong immunogenicity and the tumor-restricted, biased expression of MAGEA3, MAGEA4, NY-ESO-1, and KK-LC1 can offer vast drug-development opportunities, including immunotherapeutic agents and vaccines. Therefore, the primary goal of this work is to apply reverse vaccinology to efficiently design a novel multi-epitope vaccine that can potentially induce immune responses against multiple CTAs expressed in NSCLC. A multi-epitope vaccine can be more advantageous than whole recombinant proteins, or even single peptides. It may achieve wider population coverage, and exclude cross-reactive sequences to prevent adverse reactions. With the aid of immunoinformatics, the cost, effort, and time required to develop vaccines can be minimized (Oyarzún et al., 2016).

Identification of linear B cell and T cell epitopes
The amino acid sequences of Homo sapiens MAGEA3 (P43357), MAGEA4 (P43358), NY-ESO-1 (P78358), and KK-LC1 (Q5H943) were retrieved from the UniProt database. Among the four tumor antigens, only KK-LC1 has an extracellular sequence; thus, it was the only antigen used to identify linear B cell (BL) epitopes. The extracellular sequence of KK-LC1 was mapped using ABCPred, together with the Emini Surface Accessibility (ESA), BepiPred Linear Epitope (BLE), and Kolaskar and Tongaonkar Antigenicity (KTA) tools in the Immune Epitope Database (IEDB). Epitopes from ABCPred whose sequences overlapped with predictions from at least 2 tools in IEDB were chosen as final BL epitopes. The full-length amino acid sequences of MAGEA3, MAGEA4, NY-ESO-1, and KK-LC1 were mapped for T cell epitopes using default thresholds. CD4+ epitopes and their corresponding major histocompatibility complex (MHC) II binders were identified using NN-align-2.3 (NetMHCII-2.3) and NetMHCIIPan4.0E in IEDB. A reference list of the most common MHC II binders was utilized (Greenbaum et al., 2011). Consensus CD4+ epitopes (15 residues) with good binding affinity to at least 5 MHC II (IC50 ≤ 150 nM) in NN-align-2.3, and with rank ≤ 10 in NetMHCIIPan4.0E, were selected as final CD4+ epitopes. CD8+ epitopes and their MHC I binders were identified using the NetMHCcons method in the Proteasomal cleavage/TAP transport/MHC class I combined predictor. NetMHCcons combines three MHC-peptide binding prediction methods to give a more reliable result (Karosiene et al., 2012). A list containing the most frequent MHC I binders was uploaded (Weiskopf et al., 2012). Epitopes with 8-11 residues, binding to at least 5 MHC I, with TAP and proteasome scores > 1.0, and with IC50 ≤ 150 nM, were selected as final CD8+ epitopes. Peptides with IC50 < 500 nM are classified as good binders (Jensen et al., 2018). A multi-epitope vaccine can offer larger population coverage (PC). To estimate global PC, the sets of final CD4+ and CD8+ epitopes were queried separately using the Population Coverage tool in IEDB.
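The CD8+ selection criteria above can be sketched as a simple filter. This is a minimal illustration only, not the IEDB implementation: the record layout, the function name, and the example peptides and scores are hypothetical, and the sketch assumes that "binding to at least 5 MHC I" means at least 5 alleles with a predicted IC50 at or below the 150 nM cutoff.

```python
from dataclasses import dataclass


@dataclass
class Cd8Candidate:
    """Hypothetical record for one predicted CD8+ epitope."""
    peptide: str
    tap_score: float
    proteasome_score: float
    ic50_by_allele: dict  # MHC I allele name -> predicted IC50 in nM


def passes_cd8_filter(candidate, ic50_cutoff_nm=150.0, min_alleles=5,
                      min_processing_score=1.0):
    """Apply the length, binder-count, TAP, and proteasome criteria."""
    # Assumption: an allele counts as a binder when IC50 <= cutoff.
    binders = [allele for allele, ic50 in candidate.ic50_by_allele.items()
               if ic50 <= ic50_cutoff_nm]
    return (8 <= len(candidate.peptide) <= 11
            and len(binders) >= min_alleles
            and candidate.tap_score > min_processing_score
            and candidate.proteasome_score > min_processing_score)


# Placeholder inputs (not data from this work).
good = Cd8Candidate("AAAAAAAAA", 1.2, 1.5,
                    {f"allele_{i}": 100.0 for i in range(5)})
too_few_binders = Cd8Candidate("AAAAAAAAA", 1.2, 1.5,
                               {f"allele_{i}": 100.0 for i in range(4)})
too_long = Cd8Candidate("A" * 12, 1.2, 1.5,
                        {f"allele_{i}": 100.0 for i in range(5)})
```

An analogous filter for CD4+ epitopes would replace the processing scores with the rank ≤ 10 criterion from the second predictor.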

Multi-epitope construct
KK-LC1 CD4+ epitopes with > 2 residues of overlap were shortlisted using BL epitopes as templates. CD8+ epitopes that overlap with CD4+ epitopes in the same antigen were shortlisted using CD4+ epitopes as templates. Overlapping epitopes of the same antigen were merged into continuous peptides. BL and CD4+ epitopes were adjoined using GPGPG linkers, and AAY was used to connect CD8+ epitopes. Peptides from the same antigen were arranged next to each other according to their sequence position. The Salmonella typhimurium flagellin (fliC) sequence was retrieved from the UniProt database (P06179) and used as an adjuvant. Its N-terminal domain 1 (ND1) region (46-175) was linked to the cluster of CD8+ and CD4+ epitopes via a rigid EAAAK linker. Then, the C-terminus of the construct was fused with the C-terminal D1 (CD1) region of fliC (398-455) using EAAAK. The series of BL epitopes was connected to CD1 of fliC. Lastly, a valine residue was added at the N-terminus of the construct (Mvax) with the aim of increasing its half-life (Herrera, 2020).
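The assembly order described above can be illustrated with a short sketch. It is a simplified, hypothetical rendering: all sequences are placeholders rather than the actual fliC domains or epitopes, and two details not stated in the text are assumed, namely that the linker placed before a peptide is chosen by that peptide's type, and that the BL series is attached to CD1 through GPGPG.

```python
GPGPG = "GPGPG"   # joins BL and CD4+ epitopes
AAY = "AAY"       # joins CD8+ epitopes
EAAAK = "EAAAK"   # joins the fliC D1 regions to the epitope cluster


def join_t_cell_cluster(peptides):
    """Join (sequence, kind) pairs, with kind being "CD8" or "CD4".

    Assumption: the linker before each peptide follows that peptide's kind.
    """
    parts = []
    for index, (sequence, kind) in enumerate(peptides):
        if index > 0:
            parts.append(AAY if kind == "CD8" else GPGPG)
        parts.append(sequence)
    return "".join(parts)


def assemble_construct(nd1, cd1, t_cell_peptides, bl_peptides):
    """Return V + ND1 + EAAAK + T cell cluster + EAAAK + CD1 + BL series."""
    cluster = join_t_cell_cluster(t_cell_peptides)
    bl_series = GPGPG.join(bl_peptides)
    # Assumption: GPGPG also bridges CD1 and the first BL epitope.
    return "V" + nd1 + EAAAK + cluster + EAAAK + cd1 + GPGPG + bl_series


# Placeholder sequences only (hypothetical, not the real construct).
demo = assemble_construct(
    nd1="NNNN",
    cd1="CCCC",
    t_cell_peptides=[("WWWWWWWWW", "CD8"), ("YYYYYYYYY", "CD8"),
                     ("F" * 15, "CD4")],
    bl_peptides=["SSSSSS", "TTTTTT"],
)
```

Keeping the linkers as named constants makes it easy to test alternative linker choices without touching the assembly logic.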

Evaluation of antigenicity, allergenicity, toxicity, cross-reactivity, and physicochemical properties of Mvax
The whole sequence of the Mvax construct was assessed for antigenicity in VaxiJen2.0 using a threshold of ≥0.5 in the tumor model. VaxiJen utilizes an alignment-independent method that predicts antigens with 70%-89% accuracy (Doytchinova and Flower, 2007). The sequence was further evaluated in ANTIGENpro, which has an estimated accuracy of 76% in cross-validated experiments (Magnan et al., 2010). AllergenFPv1.0 was used to identify potential allergen sequences in Mvax. This tool reports the highest Tanimoto score to the nearest allergenic sequence in its database (Dimitrov et al., 2014). Sequences with an exact match to the human proteome, other than the target CTAs, may result in cross-reactivity; thus, the sequence of Mvax was queried against human proteome databases using default settings in the protein Basic Local Alignment Search Tool (BLASTp). The physicochemical properties of Mvax were estimated in silico with the ExPASy ProtParam tool (https://web.expasy.org/protparam/).
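As one example of the physicochemical properties estimated here, the aliphatic index can be computed directly from amino acid composition. The sketch below assumes the standard Ikai formula, AI = X(Ala) + 2.9·X(Val) + 3.9·(X(Ile) + X(Leu)), with X as mole percent; the function name is hypothetical and the input strings are toy sequences, not Mvax.

```python
def aliphatic_index(sequence):
    """Aliphatic index from the mole percent of Ala, Val, Ile, and Leu."""
    seq = sequence.upper()
    length = len(seq)
    if length == 0:
        raise ValueError("sequence must not be empty")

    def mole_percent(residue):
        return 100.0 * seq.count(residue) / length

    # Coefficients 2.9 and 3.9 weight the side-chain volumes of Val and
    # Ile/Leu relative to Ala (assumed from the Ikai formula).
    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


# Toy inputs only (hypothetical sequences).
all_alanine = aliphatic_index("AAAA")
mixed = aliphatic_index("AVIL")
```

A higher value reflects a larger relative volume of aliphatic side chains, which is commonly read as a sign of greater thermostability.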

Secondary structure analysis, and tertiary structure modelling with validation
The series of B cell epitopes in Mvax must be flexible and exposed enough for B cell receptors (BCR) to bind to it effectively. In this work, the Mvax sequence was evaluated for its secondary structure composition and for its disordered, accessible, and hydrophilic regions. The positions of the B cell epitopes were particularly investigated. The GOR4 web tool (https://npsa-prabi.ibcp.fr/cgi-bin/npsa_automat.pl?page=npsa_gor4.html) was utilized to evaluate the secondary structure composition of Mvax. Disordered and solvent-accessible regions were identified in RaptorX (http://raptorx.uchicago.edu/StructurePropertyPred/predict/). The GalaxyTBM tool (http://galaxy.seoklab.org/cgi-bin/submit.cgi?type=TBM) was used to generate a tertiary structure model for Mvax. The GalaxyRefine server (http://galaxy.seoklab.org/cgi-bin/submit.cgi?type=REFINE) was employed to improve the resulting tertiary structure model. Different validation tools were employed to check the quality of the refined structures. PROCHECK generated a Ramachandran plot to show the percentage of residues lying within the favoured and disallowed regions. ProSA-web calculates the z-score of a structure by estimating its deviation from validated X-ray crystallography and NMR structures of native proteins (Wiederstein and Sippl, 2007). The qualitative model energy analysis (QMEAN) score provides an estimate of the degree of nativeness of the structural model by comparing it to experimental structures of similar size. Scores ≤ -4.0 indicate a low-quality model (Benkert et al., 2011). Finally, the best structure model for Mvax was chosen and viewed in PyMOL.
Dissociation constants of docked complexes at 37 °C were calculated in the PRODIGY web server, which makes use of both intermolecular interactions and non-interface surface properties (Xue et al., 2016). Molecular dynamics simulation was performed to evaluate the stability of interaction within each complex using a plot of root-mean-square deviation (RMSD) per residue. For this process, C-alpha Brownian dynamics was set to a simulation time of 100 ps, a time step of 0.01 ps, a distance of 3.8 Å between alpha carbon atoms, an output frequency of 10 steps, and a force constant of 40 kcal/mol Å². It was performed in the MDWeb server (http://mmb.irbbarcelona.org/MDWeb/), which employs the Amber-99sb force field in a GROMACS MD setup with solvation (Hospital et al., 2012).

Antigenicity, allergenicity, toxicity, cross-reactivity, and physicochemical properties of Mvax
Mvax is validated as antigenic in the VaxiJen server (0.5956) and in ANTIGENpro (0.903634). It is classified as a non-allergen, having the highest Tanimoto similarity index of 0.86 with Q96PE2. Potentially toxic sequences were not mapped in the vaccine. Results showed that four epitopes have an exact sequence match with other known CTAs, MAGEA6 and CTAG2. These include MAGEA3 CD8+ 137-147, 142-151, and 176-186 with MAGEA6, and NY-ESO-1 CD4+ 143-157 with CTAG2. More importantly, none of the epitopes has a significant match with other human proteins in the databases.

Structural B cell epitopes
The BL epitopes incorporated in Mvax must protrude enough for BCRs to bind to them. This work utilized ElliPro to identify structural epitopes on the tertiary structure model of Mvax. Reported as the best-performing structure-based algorithm among those compared, ElliPro predicts conformational and linear epitopes based on the protrusion index (PI) of each residue, and provides a PI score for each protruding sequence (Ponomarenko et al., 2008).

Secondary structure composition and tertiary structure model of Mvax
Mvax consists of 16.67% extended strands, 39.63% alpha helix, and 43.70% random coil. Figure 2 shows that the residues of the BL epitopes (474-540) lie within the disordered region of Mvax (2A), and have medium to full exposure (2B). The tertiary structure model of Mvax improved after refinement (Figure 3A). The residues within the favoured regions increased (90.1% to [...] Figure 3B) to -4.52 (Figure 3C), moving closer to the z-scores of native proteins. Table 2 shows that the series of BL epitopes (474-540) in the tertiary structure of Mvax can function as both linear and discontinuous structural epitopes. These sequences are extremely protruded, as indicated by high protrusion scores. Figure 4 shows all CD4+ and CD8+ epitopes docked within the binding groove of HLA. The formation of epitope-MHC docked complexes is energetically favoured (Table 3) [...] P15 (GSVVGNWQYFF-HLA B*3501), whose ΔGbind is even smaller than that of the influenza NP418-HLA B*3501 complex (−8.3 kcal/mol) (Adhikari et al., 2018). Formation of the peptide-HLA complexes is further favoured, as indicated by very small dissociation constants (Kd). All epitopes have good to high binding affinities (Kd < 5.0E-07 M) to their HLA binders (Koyanagi et al., 2010; Paul et al., 2013). Figure 5 shows the RMSD plot per residue of each peptide-HLA complex. The mobility of a residue is often represented by its RMSD value during molecular dynamics simulation. The lower the RMSD value, the weaker the mobility, making interactions more stable. All complexes formed have RMSD values between 0 and 1.0 Å, indicating positive and stable interactions (Fu et al., 2018).
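The binding free energies and dissociation constants discussed above are linked by the standard thermodynamic relation ΔG = RT ln(Kd). The sketch below applies that relation at 37 °C; it illustrates the conversion only and is not a reproduction of the docking server's scoring function. The function names and the value used for the gas constant (in kcal/(mol·K)) are choices made for this example.

```python
import math

GAS_CONSTANT_KCAL = 0.0019872  # kcal/(mol*K), assumed value of R


def kd_from_delta_g(delta_g_kcal_per_mol, temperature_celsius=37.0):
    """Dissociation constant (M) from binding free energy (kcal/mol)."""
    temperature_kelvin = temperature_celsius + 273.15
    return math.exp(delta_g_kcal_per_mol
                    / (GAS_CONSTANT_KCAL * temperature_kelvin))


def delta_g_from_kd(kd_molar, temperature_celsius=37.0):
    """Binding free energy (kcal/mol) from dissociation constant (M)."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    temperature_kelvin = temperature_celsius + 273.15
    return GAS_CONSTANT_KCAL * temperature_kelvin * math.log(kd_molar)


# Reference free energy quoted in the text for the influenza complex.
reference_kd = kd_from_delta_g(-8.3)
# Free energy corresponding to the Kd cutoff of 5.0E-07 M.
cutoff_delta_g = delta_g_from_kd(5.0e-7)
```

Under these assumptions, the Kd cutoff of 5.0E-07 M corresponds to a ΔG of roughly −8.9 kcal/mol at 37 °C, so a complex with a more negative ΔG falls below that cutoff.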

Discussion
The pathophysiology of NSCLC is made more complicated by the expression of various cancer-testis antigens (CTAs) involved in its tumor-progression mechanisms. This calls for a more extensive, yet more specific, antigen-targeting approach to increase efficacy and to cover the majority of cases. A clinical trial that utilized an engineered T cell receptor targeting MAGEA3 led to severe adverse events, which might have been due to cross-reaction of the peptide with MAGEA12. MAGEA12, A1, A8, and A9 were also found to be expressed at low levels in the brain (Morgan et al., 2013). A recently concluded phase III clinical trial utilized MAGEA3 as an adjuvant immunotherapeutic agent for NSCLC patients, but the disease-free survival rate did not improve (Vansteenkiste et al., 2016). Thus, this work targeted more than 1 type of antigen. The CTAs most frequently associated with NSCLC were utilized with the objective of designing a multi-epitope vaccine that can potentially induce immune responses against NSCLC expressing MAGEA3, MAGEA4, NY-ESO-1, and KK-LC1.
Epitopes were carefully evaluated to prevent potential cross-reaction with MAGEA12, A1, A8, and A9. Sequences with a significant match to these antigens were excluded. Three BL epitopes identified from the extracellular sequence of KK-LC1 are highly antigenic, and can potentially induce a humoral immune response against KK-LC1. Because this response is induced with the aid of T helper cells, CD4+ epitopes were also mapped in KK-LC1. CD4+ epitopes were likewise identified from MAGEA3, MAGEA4, and NY-ESO-1. To be able to induce cytotoxic immune responses, CD8+ epitopes were also identified. A consensus approach was employed, and at least 2 tools were utilized in the identification of all BL, CD4+, and CD8+ epitopes, increasing the accuracy of prediction. Besides good binding affinity, proteasomal cleavage and TAP transport scores were also considered. In addition, Mvax possesses large global population coverage with its CD4+ (81.81%) and CD8+ (84.15%) epitopes. Some MHC binders identified in this study are not currently available in the IEDB tool and were not included by the tool in its calculations; thus, it must be noted that the actual PC may be larger than the estimates reported in this work.
The choice of adjuvant is another crucial step in vaccine design. Herein, the type of pattern recognition receptor (PRR) activated by the adjuvant was carefully considered, as the activation of some toll-like receptors (TLRs) is associated with lung cancer progression (Chatterjee et al., 2014). In contrast, TLR5 activation was found to have antitumor effects in NSCLC cells (Zhou et al., 2014). Flagellin is a well-studied ligand of TLR5. Due to its reported efficacy and safety, fliC was incorporated as the adjuvant in Mvax. Among all 4 domains of fliC, D1 is known to be highly conserved, and has the essential binding sites to activate TLR5 (Song et al., 2017). Only the D1 sequence was included to avoid possible adverse reactions. Overall, the inclusion of flagellin in Mvax may offer antitumor effects while enhancing the immunogenicity of the epitopes. EAAAK linkers were used to preserve the bioactivity of fliC D1 (Arai et al., 2001). A total of 17 peptides, merged from 22 epitopes, were adjoined using AAY and GPGPG linkers, which are known to effectively present epitopes in vivo (Jin et al., 2009). Besides efficacy, vaccines must possess safety and stability. Mvax is classified as non-allergenic and non-toxic. In addition, no sequence in the human proteome databases was found to significantly match the epitopes in Mvax, except for the 4 epitopes that match exactly with MAGEA6 and CTAG2. MAGEA6 and CTAG2 are expressed in testis, and at very low levels in placenta, but are observed to be highly expressed in NSCLC and various cancer types (Wang et al., 2004; McCormack et al., 2013; Pineda et al., 2015; Tsang et al., 2020). Thus, the potential cross-reactivity of Mvax with MAGEA6 and CTAG2 may offer additional antitumor benefits. The vaccine is classified as stable (instability index < 40), and can also be utilized in areas with warmer climates, as it possesses thermostability indicated by a higher aliphatic index. In addition, the valine residue linked to the N-terminus of Mvax has notably lengthened its half-life.
Mvax is classified as antigenic in tumor models. It has a large percentage of random coils, disordered or flexible regions, and exposed sequences, most importantly within the series of B cell epitopes (Figure 2), which provides evidence for the existence of antigenic regions (Barlow et al., 1986). Antigenicity is further emphasized by the high protrusion scores of the B cell epitopes in the tertiary structure of Mvax. Moreover, the tertiary structure model used for Mvax is of good and acceptable quality. The need for refinement was validated by the significant improvements in the quality of the refined tertiary model. The immunogenicity of the CD4+ and CD8+ epitopes in Mvax is supported by the stability of the epitope-HLA complexes formed.
In conclusion, this is the first work to use a reverse vaccinology approach in designing a multi-epitope vaccine targeting MAGEA3, MAGEA4, NY-ESO-1, and KK-LC1 in NSCLC. In silico assessments showed that Mvax confers antigenicity, immunogenicity, stability, and safety. In vitro and in vivo studies are anticipated.

Author Contribution Statement
The author solely conducted all the requirements for the accomplishment of this work.