Molecular characterization of three enteroviral strains isolated in Kuwait from young children with serious conditions

Introduction: Human enteroviruses are single stranded RNA viruses associated with many serious diseases such as encephalitis and myocarditis. They consist of up to 100 immunologically and genetically distinct types. Three enteroviral isolates, 2104, 3936 and 3988, were previously isolated from patients with neurological disorders or sepsis-like illness. In this study, the molecular characterization of the isolates was investigated. Methodology: A full genome sequencing was performed by Sanger method, followed by phylogenetic and bootscanning analyses. A detailed analysis of genetic differences between the clinical and prototype isolates were investigated by mapping polymorphisms at nucleotide and amino acid levels, and by comparing RNA secondary structure in the noncoding regions. Results: Based on the phylogenetic analysis of the VP1 gene and complete genome, 2104 was typed as coxsackievirus B1, 3936 as coxsackievirus B5, and 3988 as echovirus 7. Similarity and bootscan plots provided support for intraand intertypic recombination with crossover points occurring mainly along the nonstructural coding regions of the isolates. A sequence divergence of 12 to 14% was detected in the 5’-noncoding region between the clinical isolates and their corresponding prototype strains. Synonymous and nonsynonymous substitutions could be also mapped to different coding regions of the isolates, including those coding for the Puff, Knob and the hydrophobic pocket of the capsid. Examination of relative frequencies of synonymous and nonsynonymous substitutions in different coding regions of enteroviral isolates showed no evidence for selective pressure. Conclusion: The results provided a better understanding of the genetic variations, evolution and adaptation of enteroviruses in Kuwait.


Introduction
Enteroviruses (EVs) can cause many different diseases affecting a variety of target organs causing neurological, respiratory, and cardiovascular diseases [1]. They belong to the Picornaviridae family. The nonsegmented positive-strand RNA genome is about 7,500 nucleotides long, encoding the four structural proteins, VP4, VP2, VP3 and VP1 (forming the P1 region), and the seven non-structural 2A to 2C (P2 region) and 3A to 3D (P3 region) viral proteins. The open reading frame (ORF) is flanked by the 5' and 3' untranslated regions (5'UTR and 3'UTR). The 5'UTR contains the internal ribosome entry site (IRES) responsible for capindependent initiation of translation, as well as other secondary structural elements responsible for genome replication. The 3'UTR plays a role in initiating the synthesis of negative-strand RNA [2].
In Kuwait, EVs are highly prevalent in patients with aseptic meningitis or encephalitis, and in patients suffering from febrile illness. Genotyping of detected viral sequences in clinical samples showed the presence of echoviruses 4, 7, 9, 11 and 30, coxsackieviruses B2-6, and other serotypes [4][5][6]. Since only a limited number of complete sequences of clinical EV isolates have been reported so far, compared to complete nucleotide sequences available for all of the prototype EV strains, a full-genome analysis of clinical EVs might contribute to foster better understanding of the genetic variations, evolution and adaptation of the circulating EV strains, and their role in EV outbreaks and virulence.
In this study, the complete nucleotide sequences of three clinical enteroviral isolates were compared to the corresponding prototype strains. A detailed analysis of genetic differences between the clinical and prototype isolates were investigated by mapping polymorphisms at nucleotide and amino acid levels, and by comparing RNA secondary structure in the noncoding regions. To shed light on the evolution of the three clinical EV isolates, the phylogenetic relationship of the clinical isolates to other human enteroviruses were determined, a recombination analysis was performed, and the type of evolutionary selection pressure acting at the molecular level was identified.

Viruses
Three EV strains isolated on Vero cell line from three different patients were used in this study. The 2104 strain was isolated in 2008 from the cerebrospinal fluid of a 1-year-old female with respiratory distress and aseptic meningitis. The 3936 strain was isolated in 2009 from the stool of a 4-year-old male with aseptic meningitis. Finally, the 3988 strain was isolated in 2009 from the blood of a neonate with sepsis-like illness.

Nucleotide sequencing of enteroviruses
Full genome sequencing of the enteroviral strains was carried out as described previously [7]. Briefly, amplification of enteroviral genome was done by RT-PCR using "primer-walking" strategy. The Qiagen One-Step RT-PCR Kit (Qiagen GmbH, Hilden, Germany) was used in all amplification reactions. The amplified PCR products were analyzed on 2% agarose gel electrophoresis. The PCR product was then purified using the Wizard SV GEL and PCR Clean-Up System kit (Promega Corporation, Madison, USA), and the nucleotide sequences of both DNA strands were determined by direct double-strand DNA cycle sequencing in two directions using the ABI PRISM BigDye Terminator Cycle Sequencing v3.1 kit (Applied Biosystems, Foster City, USA). Post sequencing PCR purification was performed to remove unbound fluorescent dye deoxy terminators using BigDye XTerminator Purification kit (Applied Biosystems, Foster City, USA). The samples were then denatured for 2 minutes at 95˚C, and immediately chilled on ice, and loaded on an ABI 3130xl Genetic Analyzer (Applied Biosystems, Foster City, USA). DNA sequences were subjected to electrophoresis on a 50-cm capillary array using POP6 polymer (Applied Biosystems, Foster City, USA) as a separation medium, and analyzed using the Sequencing Analysis Software v 5.3.1 (Applied Biosystems, Foster City, USA).
The derived sequences of all segments were assembled by using The Molecular Evolutionary Genetics Analysis (MEGA) software version 4.02 [8]. A homology search was conducted using the BLAST server at the National Center for Biotechnology Information (National Library of Medicine, Bethesda, USA). The resulting DNA sequences were aligned with complete prototype HEV genome sequences available in the GenBank database. Alignment analysis was done using the ClustalW method in the MEGA software version 4.02 [8]. Highly variable regions were identified in the coding and non-coding sequences. The complete genome sequence of EV sequences from this study has been deposited in GenBank under accession number KP260537, KP233830 and KP202389.

Prediction of RNA secondary structures
Putative RNA secondary structures of the 5'UTR and 3'UTR of the clinical enteroviral isolates were constructed using RNAstructure software version 5.7 [9], and the lowest folding free energy levels for the optimal structures were compared to those of corresponding prototypes.

Evolutionary divergence analysis
To determine whether the mutations supposed to be detected in the enteroviral isolates are the consequence of selection pressure, the average number of synonymous substitutions per synonymous site (dS) and the average number of nonsynonymous substitutions per nonsynonymous site (dN), for each coding region were calculated for all available EV genomes using the Nei-Gojobori (p-distance) method in MEGA software [10]. Identical genes and genes with evidence of recombination were excluded from the analysis. The dN/dS ratio was used as an index to assess positive selection; dN/dS >1 indicates positive (diversifying) selection, dN/dS = 1 indicates neutral selection, whereas dN/dS <1 means negative (purifying) selection [11].

Phylogenetic and recombination analysis
Phylogenetic analysis was performed for the full enteroviral genome, and for different genetic regions separately. The Phylogenetic trees were reconstructed using neighbor-joining method [12], with evolutionary distances computed using the Kimura 2-parameter method [13]. A bootstrap test with 1,000 replicates was used to estimate the confidence of branching patterns in the trees [14]. The resulting alignments were analyzed for evidence of recombination using the Simplot software package version 3.5.1 (Johns Hopkins University, Baltimore, USA). Similarity plots were built with a window size of 200 nt, incremental steps of 20 bases, and a transition-transversion ratio of 2. To assess potential recombinational relationships, aligned sequences were subsequently analyzed by using the bootscanning method implemented in SimPlot. Phylogenetic trees were generated for each 200nucleotide window in 20-nucleotide increments, by using the neighbor-joining method, with Kimura twoparameter model, and a transition-transversion ratio of 2. The bootstrap values for all possible sequence comparisons were plotted as a function of genome position.

Phylogenetic analysis
Based on the similarity of the VP1 nucleotide sequence of the three EV isolates with that of other strains available in GenBank database, 2104 was identified as CV-B1 (best BLAST hit, 99%), 3936 as CV-B5 (best BLAST hit, 95%), and 3988 as echovirus 7 (best BLAST hit, 88%). The phylogenetic analysis was then conducted for the complete nucleotide sequences of HEV-B species including all available full genome sequences for CV-B1, CV-B5 and echovirus 7 ( Table 1). The phylogenetic dendrogram obtained with the full EV genome ( Figure 1A) and the structural coding P1 region ( Figure 1B)  The results of the phylogenetic analysis of the noncoding and nonstructural coding regions (P2 and P3) were not consistent ( Table 2). The three EV isolates could not be typed using P3 region, whereas using P2 region, only the 2104 isolate could be typed as CV-B1 (bootstrap value >70%). Furthermore, the phylogenetic analysis conducted using 5'UTR was not reliable for the 3988 isolate (Table 2). This conflicting tree topologies Figure 1. Phylogenetic analysis based on the full-genome (A) and P1 region (B) of human Enteroviruses B. The evolutionary history was inferred using the Neighbor-Joining method. The bootstrap consensus tree inferred from 1000 replicates is taken to represent the evolutionary history of the taxa analyzed. Branches corresponding to partitions reproduced in less than 50% bootstrap replicates are collapsed. The tree is drawn to scale, with branch lengths in the same units as those of the evolutionary distances used to infer the phylogenetic tree. The evolutionary distances were computed using the Kimura 2parameter method and are in the units of the number of base substitutions per site. The black circles next to the taxa represent enteroviral sequences from this study. Each enterovirus reference sequence is labeled with its corresponding type.
in the 5' UTR, P2 and P3 genomic regions may be due to a previous recombination event resulting in chimeric genome sequences.
The 5'UTR of 3988 isolate had 80-95% nucleotide sequence similarity with EV-B86 (nucleotides 400-600), whereas P1 region had 80-91% nucleotide sequence similarity with the echovirus 7 UMMC strain. However, P2 region was most closely related to the EV-B86 BAN00-10354 strain, whereas P3 region had 80-90% nucleotide sequence similarity with the echovirus 7 DH22G strain (5400-7300), and 78-88% similarity with the EV-B86 BAN00-10354 strain (nucleotides 5057-7300) ( Figure 2E). Evidence of recombination between the echovirus 7 UMMC strain, the EV-B86 BAN00-10354 strain, and the E7 DH22G strain was found by bootscan analysis ( Figure 2F). Table 3 shows the nucleotide and amino acid similarities of the 2104 isolate with the prototype and other CV-B1 strains. Within the complete genome, the similarities between the clinical isolate and the prototype strain were 80 % for nucleotides and 95% for amino acids. The highest nucleotide sequence similarity was found with the 1167438-pmMC strain (88.2%) whereas the highest amino acid sequence similarity was found with the Chi07 strain (96.25%) and the 1167438-pmMC strain (96.20%). The 5'UTR of the 2104 strain is 5 nucleotides longer than that of the Conn-5 strain. However, 5'UTRs of the 2104 and Conn-5 strains have comparable minimum free energy secondary structures (G= -253.7 Kcal/mol and -251.7 Kcal/mol, respectively). The two virus genomes share identical nucleotide lengths within the ORF and 3'UTR.       The 3'UTR of the Conn-5 strain has a thermodynamically more favourable free energy level (G= -28 Kcal/mol) compared with that of the 2104 isolate (G= -23.8 Kcal/mol). The 2A, 2B, 3A and 3D proteins had the highest level of amino acid sequence variability ( Table 3). Examination of relative frequencies of synonymous and nonsynonymous substitutions in different genomic regions of the CV-B1 isolates showed no evidence for selective pressure (Table 4). Based on alignment with the threedimensional structure of coxsackievirus B3 [15], a number of amino acid polymorphisms could be mapped to specific capsid regions. In comparison with the Conn-5 strain, seven amino acid substitutions were noted in the Puff region of the VP2 protein ( Figure 3A, in bold), four in the Knob region of the VP3 protein ( Figure 3A, underlined), and two within the hydrophobic pocket region of the VP1 protein ( Figure  3A, double underlined).

Sequence analysis
The similarities between the 3936 isolate and the CV-B5 prototype strain (Faulkner) were 79% for nucleotides and 94 % for amino acids ( Table 5). The highest nucleotide and amino acid sequence similarity was found with the 1654/85/UK strain (84.6% for nucleotides and 92.4 % for amino acids). The 5'UTR of the 3936 strain was 1 nucleotide longer than that of the Faulkner strain. Both the 5' and 3'UTR RNA secondary structures of the 3936 isolate (G= -262.6 Kcal/mol and -45.4 Kcal/mol) have a thermodynamically more favourable free energy level compared with those of the Faulkner strain (G= -255.4 Kcal/mol and -35.3 Kcal/mol). When the extra nucleotide in the 5'UTR of the 3936 isolate was removed, the RNA folded again at a thermodynamically more stable free energy level (G= -262.8 Kcal/mol) than that of the Faulkner strain. No deletions or insertions were observed in the ORF region. The highest level of amino acid sequence variability were detected at the VP3 and 2A proteins (Table 5). Nine amino acid substitutions were noted in the Puff region of the VP2 protein ( Figure 3B, in bold), four in the Knob region of the VP3 protein ( Figure 3B, underlined), and one within the hydrophobic pocket region of the VP1 protein (Fig 3B, double underlined). Calculation of the dN/dS ratios indicated no evidence of selective pressure exerted in the structural and nonstructural proteins ( Table 6).
The 3988 isolate has 80% nucleotide and 96% amino acid sequence similarity with the echovirus 7 prototype strain Wallace ( Table 7). The highest nucleotide similarity was found with the DH22G strain (82.88%) and the UMMC strain (82.85%), whereas the amino acid sequence similarity was higher with the DH22G strain (97.30%) than with the UMMC strain (96.89%). The 3'UTR of 3988 strain is 2 nucleotides longer than that of the Wallace strain. The two virus genomes share identical nucleotide lengths within the ORF and 5'UTR. The 5'UTRs of the Wallace and 3988 strains have comparable minimum free energy secondary structures (G= -261.0 Kcal/mol and -264.7 Kcal/mol, respectively). However, the 3'UTR of the 3988 strain has a thermodynamically more favourable free energy level (G= -35.5 Kcal/mol) compared with that of the Wallace strain (G= -31.1 Kcal/mol). The 2A, 3A and VP1 proteins of the 3988 isolate had the highest level of amino acid sequence variability compared to other viral proteins. Two amino acid substitutions were noted in the Puff region of the VP2 protein ( Figure 3C, in bold), and three in the Knob region of the VP3 protein ( Figure 3C, underlined). Nonsynonymous mutations were not detected in the hydrophobic pocket region of the VP1 protein (Fig 3C,  double underlined). Furthermore, no evidence for selective pressure was found in the structural and nonstructural coding regions (Table 8).

Discussion
The CV-B1 2104 strain was isolated in 2008 from cerebrospinal fluid of 1-year-old female with respiratory distress and neurological signs. It had shown decreased susceptibility to the antiviral activity of MxA protein [16], and increased potential to induce inflammatory cytokines [17], compared to the prototype strain. The molecular basis for the phenotypic differences observed between the clinical and prototype CVB1 has not been yet investigated. The 2104 isolate shared high nucleotide and amino acid sequence similarity with CV-B1 Chi07 strain which caused 6 neonatal deaths in USA in 2007 due to heart failure [18,19]. However, results of similarity plot and bootscanning analyses suggest that a recombination had occurred in the structural and nonstructural coding regions between CV-B1 11167438_pmMC and echovirus 1 Farouk.
The CVB5 3936 strain was isolated in 2009 from stool of 4-year-old male with aseptic meningitis. It is most closely related to CV-B5 1954/85/UK strain that was isolated from the vesicular fluid within skin lesions of a patient with hand food and mouth disease (HFMD) in UK in 1985; its structural coding region was reported to be most closely related to that of swine vesicular disease virus [20]. However, according to the bootscan analysis, the 3936 strain is a mosaic virus resulting from recombination between CV-B5 1954/85/UK, EV-B86 BAN00-10354 and EV-B88 BAN01-10398 strain. The last two strains were isolated from stool specimens of patients presenting with acute flaccid paralysis (AFP) in Bangladesh during the years 2000 and 2001 [21].
The E7 3988 strain was isolated in 2009 from the blood of a neonate with sepsis-like illness. It shared high nucleotide and amino acid sequence similarity with E7 UMMC at P1 genetic region. UMMC strain was isolated from the cerebrospinal fluid of a Malaysian child with HFMD and fatal encephalomyelitis in 2000 [22,23]. P2 region was most closely related to EV-B86 BAN00-10354 strain, whereas P3 region was most closely related to E7 DH22G strain that was isolated from the stool of a Chinese patient with HFMD in 2012 [24]. Recombination between E7 UMMC, E7 DH22G, and EV-B86 BAN00-10354 strains was detected with strong bootstrap support.
There was no evidence of positive selective pressure exerted on the three EV strains tested in this study. However, a sequence divergence of 12-14% was detected in the 5'UTR between the clinical isolates and their corresponding prototype strains. Mutations in the EV 5'UTR which plays an important role in the control of replication and translation [25], have been proposed as major determinants of cardiovirulence [26], neurovirulence [23], and myopathogenicity [27]. Similar to the 5′ UTR, the EV 3′ UTR contains secondary structural elements and has been shown to play a role in both translation and replication [28]. It is noteworthy that the secondary structures of both untranslated regions of the 3936 isolate fold at a thermodynamically more favourable free energy level than those of the prototype strain. More investigations are required to determine whether the lower free energy level of EV RNA secondary structures observed in this study is associated with viral virulence and more severe clinical outcome, as has been described earlier for EV71 [29].
Interestingly, compared to the prototype strain, several amino acid substitutions were noted in the Puff, Knob, and the hydrophobic pocket regions of the viral capsid protein. The VP2 Puff region is known to be the largest and most variable surface loop of VP2 among group B coxsackieviruses [15]. It is a major neutralization site in both the polioviruses and rhinoviruses, suggesting its role in receptor binding or recognition [30]. The three clinical isolates in this study had amino acid substitutions in the Puff region of the VP2 protein. Amino acid substitutions in the Puff region of CVB3 VP2 protein were previously linked to cardiovirulence [31]. Like the Puff region of VP2, the Knob region of VP3 is known to contain a major neutralization site for both poliovirus and rhinovirus [30,32]. Furthermore, mutations in the VP3 Knob region have been suggested to have a role in the CVB3induced cardiovirulence [31]. Whether or not the nonsynonymous mutations detected in the capsid region of the three clinical isolates play a role in virulence and pathogenesis of enterovirus infection, needs further investigation.

Conclusion
Phylogenetic analysis of the enterovirus strains suggested that the cases resulted from circulation of different genetic lineages of species B enteroviruses. Results from the similarity plots and bootscan analyses indicated accumulation of multiple inter-and intratypic recombination events along the EV genome. Whether the observed recombination results or not into better adaptation to new environments, or more potential to persist, replicate and cause disease, needs to be investigated.