Characterization of H5N1 influenza A virus that caused the first highly pathogenic avian influenza outbreak in Saudi Arabia

Introduction: Saudi Arabia (SA) experienced a highly pathogenic avian influenza (HPAI) H5N1 outbreak in domesticated birds in 2007. Methodology: Forty-three hemagglutinin (HA) and 41 neuraminidase (NA) genes of HPAI H5N1 viruses were sequenced, and phylogenetic analyses of the completely sequenced genes were performed to compare them with other viral HA and NA gene sequences available in public databases. Results: Molecular characterization of the H5N1 viruses revealed two genetically distinct clades, 2.2.2 and 2.3.1, circulating in the area. Amino acid sequence analysis of the HA gene indicated that viruses from clade 2.2.2 contained the sequence SPQGERRRK-R/G at the cleavage site, while viruses from clade 2.3.1 contained the sequence SPQRERRRK-R/G. Additionally, a few amino acid substitutions, such as M226I at an N-linked glycosylation site, were identified in two of these isolates. The amino acid sequence of the NA gene showed a 20-amino-acid deletion in the NA stalk region, which is reported to be required for enhanced virulence of influenza viruses and for their adaptation from wild birds to domestic chickens. As close contact between humans and birds is unavoidable, there is a need for a thorough understanding of the virus epidemiology, the factors affecting the spread of the virus, and the molecular characteristics, such as phylogeny and substitution rates, of H5N1 viruses circulating in the region. Conclusion: Two genetically distinct clades were found to be circulating in the country, which could result in recombination and the emergence of more virulent viral strains. These findings could help the authorities devise control measures against these viruses.


Introduction
Avian influenza is an acute infectious disease of poultry, waterfowl, wild birds, and other animals, and it can be transmitted zoonotically to humans. The disease, caused by influenza A virus, is listed by the World Organization for Animal Health as one of the most dangerous diseases for animal health (OIE; http://www.oie.int/eng/normes/mmanual/A_summry.html). Influenza A virus belongs to the family Orthomyxoviridae; its envelope carries two surface glycoproteins, hemagglutinin (HA) and neuraminidase (NA) [1]. Currently, 18 HA types (H1 to H18) and 11 NA types (N1 to N11) of influenza A virus are known, based on the composition and activity of the HA and NA genes [2,3]. The numerous HA and NA types allow genetic reassortment between HA and NA, which generates new subtypes [4]. Highly pathogenic avian influenza (HPAI) H5N1 virus was detected for the first time in China in 1996 [5]. Since then, infection has been continuously detected and recorded in avian species and in humans [6,7]. In April 2005, a large-scale outbreak of H5N1 viruses was recorded in bar-headed geese (Anser indicus) at Qinghai Lake, China. These viruses were genetically characterized as clade 2.2 [8]. Later, H5N1 clade 2.2 viruses were detected in wild birds and domestic poultry in various countries of Asia, Europe, and Africa. In South Asia, H5N1 was first detected in India and Pakistan in February 2006 [6,7]. The first reports of H5N1 virus infection in poultry from Bangladesh and Nepal were in March 2007 and January 2009, respectively [6]. These introductions may have resulted from trade in poultry and from wild bird migration across borders [8,10]. It has also been reported that six humans in Bangladesh and Pakistan were diagnosed with H5N1 virus infection, with one death in Pakistan [6].
The continuous detection of the H5N1 virus in South Asia indicates that the virus has now become endemic in some parts of the region. Since late 2005, both wild birds and poultry have been affected by HPAI H5N1 viruses in Europe, the Middle East, and Africa [11,12]. In the European Union, the disease has been seen mainly in wild birds, and a few outbreaks have occurred in poultry as a result of spread from wild birds or from poultry to other domestic birds. Viral infections have therefore been promptly eradicated in the vast majority of European countries [13]; in contrast, the situation in Africa and the Middle East is quite different. Since the first outbreaks were reported in early 2006, there has been extensive spread and circulation of the H5N1 virus in poultry and domesticated birds, but only an extremely limited number of isolations from wild birds [14-16]. These birds are traded to the Middle East from Central Asia [17]. Trading of live birds across borders without proper quarantine measures represents a potential vehicle for the introduction and spread of avian influenza viruses in the Kingdom of Saudi Arabia (KSA) and in other countries of the Gulf area. In addition, rearing of these birds may result in an increased risk of human exposure to H5N1 compared with other avian rearing practices. An improved understanding of the epidemiology and viral characteristics of H5N1 viruses circulating in the region appears essential for managing the human health risks. There have been more than five hundred outbreaks of HPAI H5N1 in the KSA, in falconry, in backyard poultry, and later in industrially reared poultry [18]. During the last quarter of 2007, HPAI H5N1 virus spread rapidly in the Al-Kharj area and other areas near Riyadh, and was limited to domesticated birds.
The aim of this study was to examine the genetic diversity of different clades of H5N1 viruses circulating in Saudi Arabia and to establish genetic relatedness, phylogeny, and the substitution rate of viral isolates compared to other isolates around the world. Understanding of virus diversity and epidemiology is essential for devising resistance and control strategies against these viruses.

Sample collection and virus isolation
Forty-three birds (28 chickens, 4 turkeys, 5 ostriches, 4 falcons, and 2 houbara) that showed typical symptoms of HPAI H5N1 virus infection and were identified in different localities of Riyadh during the 2007 outbreak were used in this study. Tracheal and cloacal swab samples were taken by the veterinary health authorities of the Ministry of Agriculture. All samples were collected in phosphate-buffered saline (PBS, pH 7-7.4) supplemented with antibiotics and transported to the Central Veterinary Diagnostic Laboratory (CVDL) in Riyadh for virus isolation, identification, and propagation. Tracheal and cloacal swabs were inoculated into specific-pathogen-free eggs. The amnio-allantoic fluids (AAF) of inoculated embryos were harvested 72 hours post-inoculation. The harvested AAF was tested for hemagglutinating activity and typed according to standard methods [18] using subtype-specific antisera obtained from VLA Weybridge (Surrey, UK).

RNA extraction, cDNA synthesis and polymerase chain reaction (PCR)
Viral RNA was extracted from infected amnio-allantoic fluid using the QIAamp Viral RNA mini kit (Qiagen, Hilden, Germany) and was reverse-transcribed using the RevertAid H Minus First Strand cDNA Synthesis Kit (Fermentas, Glen Burnie, MD, USA) with the universal 12-mer primer Uni12 (5'-AGCAAAAGCAGG-3'), as per the manufacturer's protocol. PCR amplification and sequencing of the HA and NA genes were performed using specific primers (Table 1).

Sequence assembly, manipulation, and analysis
PCR product was gel purified and sequenced using an automated DNA sequencing system (ABI 3100) and BigDye Terminator v3.1 cycle sequencing kit (Applied Biosystems, Foster City, CA, USA) according to the manufacturer's instructions. Sequences were assembled and analyzed using the Lasergene version 8 package (DNASTAR, Madison, Wisconsin). The sequences obtained were compared with sequences available in public databases using BLASTn (National Center for Biotechnology Information, NCBI). The coding regions and coding capacities of HA/NA clones were predicted using the EditSeq module of Lasergene.

Phylogenetic analysis and estimation of nucleotide substitution rates
Two viral gene segments (HA and NA) were analyzed phylogenetically, together with representative nucleotide sequences of viruses that have recently been circulating in Asia. These sequences were retrieved from the NCBI Influenza Virus Sequence Database [19] and the EpiFlu database of the Global Initiative on Sharing All Influenza Data [20]. All nucleotide sequences were aligned using the ClustalW tool in MEGA 6.6 (http://www.megasoftware.net/mega.html) and trimmed to equal lengths covering at least 90% of the coding region, as follows: HA, 1630 nt; NA, 1350 nt. Phylogenetic trees were generated using the neighbor-joining (NJ) method in MEGA 6.6. Bootstrap support for tree topologies was estimated using the NJ method with 1,000 replicates. For the two datasets (HA and NA), nexus files were generated after aligning the sequences in MEGA 6.6. The general time-reversible (GTR+E) substitution model was chosen, and each nucleotide dataset was partitioned into three sets (codon positions 1, 2, and 3). The coalescent Bayesian skyline was chosen as the tree prior in the tree panel, and each dataset was run for a chain length of 4×10⁷ in the Markov chain Monte Carlo (MCMC) panel of the BEAUti module in BEAST to ensure an adequate sample size [21].
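The distance-based and bootstrap steps described above can be illustrated with a minimal Python sketch. It is not MEGA's implementation: it assumes a simple p-distance (proportion of differing sites) as the distance measure, uses a made-up three-sequence toy alignment, and stops at the distance matrices that a neighbor-joining routine would take as input. All function and variable names are hypothetical.

```python
import random


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites, skipping columns with a gap in either sequence."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def distance_matrix(alignment: dict) -> dict:
    """Pairwise p-distances keyed by (name_1, name_2), with names in sorted order."""
    names = sorted(alignment)
    return {
        (m, n): p_distance(alignment[m], alignment[n])
        for i, m in enumerate(names)
        for n in names[i + 1:]
    }


def bootstrap_replicate(alignment: dict, rng: random.Random) -> dict:
    """Resample alignment columns with replacement to build one pseudo-alignment."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in columns) for name, seq in alignment.items()}


# Made-up 20-nt toy alignment; these are not sequences from the study.
toy_alignment = {
    "isolate_A": "ATGGAGAAAATAGTGCTTCT",
    "isolate_B": "ATGGAGAAAATAGTACTTCT",
    "isolate_C": "ATGGAAAAAATAGTACTTTT",
}

observed = distance_matrix(toy_alignment)
rng = random.Random(0)  # fixed seed so the sketch is reproducible
# 1,000 replicates, matching the number of bootstrap replicates reported in the text.
replicates = [
    distance_matrix(bootstrap_replicate(toy_alignment, rng)) for _ in range(1000)
]
```

In a full analysis, a tree would be built from each replicate matrix, and the bootstrap support of a branch would be the fraction of replicate trees that contain it.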

Influenza A viruses in Saudi domesticated birds
The Kingdom of Saudi Arabia experienced a severe outbreak of H5N1 influenza viruses in 2007. Samples were collected from freshly dead birds, or from those showing clinical signs, i.e., respiratory and/or nervous manifestations, marked blue to purple discoloration of unfeathered parts of the body, or swelling of the infraorbital sinuses and head region. In most cases, the occurrence of sudden high mortality in a flock without clinical symptoms is the only suggestive sign of HPAI. Forty-three samples were analyzed, and out of these samples, molecular and

Molecular characteristics of Saudi H5N1 viruses
Alignment of the sequences of both genes (HA and NA of H5N1 viruses isolated in this study) with representative sequences from NCBI and Global Initiative on Sharing All Influenza Data identified some amino acid residues in each genomic segment that were molecular signatures of the viruses isolated in the South Asian geographic region (Tables 2 and 3).

HA gene
All H5N1 viruses isolated in Saudi Arabia had multiple basic amino acids at the cleavage site (CS) at positions 321-333 of the HA gene, which is a marker for high pathogenicity in chickens. Clade

NA gene
All the Saudi H5N1 viruses had a 20-amino-acid deletion (positions 49-68) in the NA stalk region, which is reported to be required for influenza viruses to adapt from wild aquatic birds to domestic chickens and for enhanced virulence in mice [22,23]. Amino acids 116V, 119E, 136Q, 156R, 199D, 223I, 247S, 275H, 277E, and 295N (N2 numbering) in the NA conferred sensitivity to oseltamivir and/or zanamivir in the H5N1 isolates [24]. Most of the isolates had NA 149V, suggesting sensitivity to zanamivir, but two 149I mutations (NA/houbara/Saudi Arabia/37/2007), which have not previously been reported to cause resistance, were found at this position.
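A marker check of this kind can be sketched in a few lines of Python. The marker table below copies the residues listed above (N2 numbering); the function names and the example input are hypothetical, and the sketch assumes that residues have already been mapped to N2 numbering, which is the non-trivial part of a real analysis.

```python
# Residues reported in the text as associated with sensitivity to
# oseltamivir and/or zanamivir (N2 numbering).
SENSITIVITY_MARKERS = {
    116: "V", 119: "E", 136: "Q", 156: "R", 199: "D",
    223: "I", 247: "S", 275: "H", 277: "E", 295: "N",
}

# Stalk deletion reported in the text: positions 49-68, inclusive.
STALK_DELETION = (49, 68)


def deletion_length(start: int, end: int) -> int:
    """Number of residues in an inclusive position range."""
    return end - start + 1


def deviating_markers(observed: dict) -> dict:
    """Return {position: (expected, observed)} for markers that differ.

    `observed` maps N2-numbered positions to one-letter residues; positions
    missing from `observed` are skipped rather than reported.
    """
    return {
        pos: (expected, observed[pos])
        for pos, expected in SENSITIVITY_MARKERS.items()
        if pos in observed and observed[pos] != expected
    }


# Hypothetical isolate carrying a made-up change at position 119,
# used only to show the shape of the output.
example_isolate = dict(SENSITIVITY_MARKERS)
example_isolate[119] = "G"
flagged = deviating_markers(example_isolate)
```

A deviation flagged this way is only a prompt for closer inspection; whether a given substitution changes drug susceptibility has to come from the literature or from phenotypic assays.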

Phylogenetic analysis of Saudi H5N1 isolates
To track the evolutionary relationships of the H5N1 viruses isolated from the KSA with viruses representing different H5N1 clades and lineages, as well as other influenza viral subtypes, clade affiliations of all viruses included in the study were assigned in the HA phylogenetic tree.

H5N1 viruses from clade 2.2.2
Phylogenetic analysis of the HA gene of 32 H5N1 isolates from the KSA showed that 29 isolates belonged to clade 2.2.2 (Figure 1A). In the phylogenetic tree, HA genes clustered together and were closely related to H5N1 viruses from clade 2.2.2 (Figure 1A).
Phylogenetic analysis of the NA gene of 42 H5N1 isolates from the KSA showed that these isolates belonged to clade 2.2.2 (Figure 1B). In the phylogenetic tree, NA genes clustered together and

Estimation of nucleotide substitution rates for HA and NA genes
The mean nucleotide substitution rates for the HA and NA genes were determined using recombination-free datasets with the relaxed clock and the Bayesian skyline plot method. For each dataset, the sequences were partitioned into the three codon positions. The mean substitution rates for the HA and NA genes were 2.036×10⁻³ and 2.072×10⁻³ substitutions/nucleotide/year, respectively (Table 4). The mean substitution rate at each codon position was also measured for both genes (HA and NA) from the Saudi isolates. The mean substitution rate at the third codon position of the HA gene was higher (1.718) than at codon positions 1 and 2 (0.767 and 0.516, respectively) (Table 4). Similarly, the mean substitution rate in the NA gene was higher at codon position 3 (1.928) than at codon positions 1 and 2 (0.754 and 0.32, respectively) (Table 4). The mutation rate and the mean substitution rates at all three codon positions were similar between the HA and NA genes.
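The codon-position values can be checked with a short Python sketch. It assumes, which the text does not state explicitly, that these are relative rates scaled so that the three codon positions average to about 1, as is common for partitioned models; the numbers are consistent with that reading. The third-position NA value (1.928) is taken from the Discussion. Variable and function names are hypothetical.

```python
# Relative substitution rates per codon position, as reported in the text.
HA_RATES = {1: 0.767, 2: 0.516, 3: 1.718}
NA_RATES = {1: 0.754, 2: 0.32, 3: 1.928}


def mean_relative_rate(rates: dict) -> float:
    """Average of the three codon-position rates."""
    return sum(rates.values()) / len(rates)


def fold_over_position(rates: dict, position: int) -> float:
    """How many times faster the third codon position is than `position`."""
    return rates[3] / rates[position]


for gene, rates in (("HA", HA_RATES), ("NA", NA_RATES)):
    print(
        f"{gene}: mean relative rate = {mean_relative_rate(rates):.3f}, "
        f"third/first = {fold_over_position(rates, 1):.2f}, "
        f"third/second = {fold_over_position(rates, 2):.2f}"
    )
```

Under this reading, the third codon position evolves more than twice as fast as the first position in both genes.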

Discussion
Saudi Arabia experienced its first major HPAI H5N1 outbreak in February 2007 in the falconry and poultry sectors, and viruses belonging to two different lineages, clades 2.2.2 and 2.2.3 [16], were isolated. Phylogenetic analyses suggested that there had been two separate introductions of H5N1 into the KSA. In this study, we extended the analysis to 43 samples from the same outbreak and found that viruses belonging to clade 2.3.1 are also circulating in the country. Our analysis showed that these viruses might have been introduced into the KSA by migratory birds from Mongolia and Nigeria in early 2007 (Figure 1A). Since then, the H5N1 virus has become endemic in poultry in the country. Phylogenetic analyses of the clade 2.2.2 isolates from the KSA revealed a close relationship with H5N1 viruses from Qinghai, China, and with viruses from wild birds in Mongolia. Although the exact pathway of introduction of clade 2.2.2 into the KSA remains unclear, it is possible that virus transmission to neighboring territories occurred through birds after the virus had already been introduced into Nepal from Mongolia by migratory birds. Clade 2.2.2 is currently circulating in poultry in the KSA. Phylogenetic analyses of the clade 2.3.1 isolates from the KSA revealed a close relationship with H5N1 viruses from China. We found only three viruses representing this clade in the collection of 43 samples; therefore, the diversity and prevalence of these viruses in the region need further clarification. All HPAI H5N1 viruses isolated in this study were detected in domesticated birds of the Al-Kharj area, which underlines the important role that these birds can play in perpetuating and maintaining both clades. Our sample size and sampling area were not large, and thus may not be representative of all farms in the country.
The phylogenetic analyses suggest that both clades 2.2.2 and 2.3.1 might have had one initial introduction each into the region by migratory birds, followed by an intensive evolutionary process driven by poultry movements. New H5N1 viruses were introduced into Saudi Arabia either via wild-bird migrations or by the poultry trade, but they have not persisted in the poultry population.
Analysis of the amino acid sequences of the Saudi H5N1 HA proteins revealed a pattern typical of HPAI. The signal peptide consisted of 16 amino acids, including 11 nonpolar residues, so that the peptide possessed the high level of hydrophobicity that is necessary for initiation of HA protein synthesis and successful virus propagation [22]. The mature HA of the Saudi H5N1 isolates contained the connecting segment between the HA1 and HA2 chains, the cleavage site, which consists of the basic amino acid motifs PQGERRRKKR/GLF and PQRERRRKKR/GLF. The multi-basic cleavage site is one of the important characteristics of highly pathogenic influenza virus strains [25,26]. It is known that this structure may be cleaved by both trypsin-like extracellular and furin-like cellular proteases. Moreover, this sequence possesses a high level of hydrophilicity that, in turn, makes the site highly accessible to proteolytic enzymes. Furthermore, the receptor-binding-site residue at position 226 was changed to isoleucine (226I) in one isolate, A/Saudi Arabia/33/2009 (Table 2), which indicates a preference for binding to the avian receptor, as reported previously [22]. We also observed that the Saudi H5N1 isolates possessed residues S120, D124, S129, Q138, R140, S141, N154, N155, T159, R162, and R189 (H5 numbering) (Tables 2 and 3), described as important antigenic sites of clade 2.2 HPAI H5N1 viruses [27]. Several mutations found in the Saudi isolates have previously been reported to cause binding to α-2,6 receptors and increased pathogenicity in mice [28,29]. In addition, some other amino acids of the receptor-binding pocket (Y98, S136, W153, H183, E190, and L194) that were identified among the isolates are known to bind preferentially to α-2,3-linked but not to α-2,6-linked sialic acids [22,30]. Molecular analysis showed that the NA protein of one of the isolates had the I117V mutation, which may cause reduced susceptibility to the commonly used neuraminidase inhibitor oseltamivir.
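The two cleavage-site motifs quoted in this paragraph can be compared with a small Python sketch. It assumes that "basic" means arginine (R) and lysine (K) only, and that "/" marks the cleavage position between HA1 and HA2; the function names are hypothetical.

```python
BASIC_RESIDUES = frozenset("RK")  # assumption: arginine and lysine only

MOTIFS = {
    "motif_1": "PQGERRRKKR/GLF",
    "motif_2": "PQRERRRKKR/GLF",
}


def upstream(motif: str) -> str:
    """Residues on the HA1 side of the cleavage position ('/')."""
    return motif.split("/", 1)[0]


def basic_count(motif: str) -> int:
    """Total number of basic residues upstream of the cleavage position."""
    return sum(res in BASIC_RESIDUES for res in upstream(motif))


def terminal_basic_run(motif: str) -> int:
    """Length of the uninterrupted basic stretch ending at the cleavage position."""
    run = 0
    for res in reversed(upstream(motif)):
        if res not in BASIC_RESIDUES:
            break
        run += 1
    return run


def differences(motif_a: str, motif_b: str) -> list:
    """1-based positions (within the upstream part) where two motifs differ."""
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(upstream(motif_a), upstream(motif_b)), start=1)
        if a != b
    ]


motif_diffs = differences(MOTIFS["motif_1"], MOTIFS["motif_2"])
```

On these inputs, both motifs end in the same six-residue basic stretch (RRRKKR) and differ at a single upstream position, where glycine is replaced by arginine.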
The H5N1 viruses isolated in the KSA had an NA protein with a deletion of 20 amino acids, similar to the isolates from Israel and Gaza [31,32]. This deletion is typical of strains of this subtype and enables the viruses to broaden their host range from wild birds to domesticated birds. H5N1 viruses isolated from poultry contain this deletion in the NA protein [22,23]. All the isolates studied here contained amino acid residues in the catalytic and framework sites that have been shown elsewhere to be inherent in viruses sensitive to NA-inhibitor drugs [24,33,34]. To evaluate the real effect of the amino acid substitutions on the antigenicity of these viruses, a series of recombinant mutants should be generated, and the cross-reactivities of antisera to these mutants should be determined.
Our molecular and phylogenetic analyses of the H5N1 viruses showed that the majority of H5N1 isolates from the KSA, from both clade 2.2.2 and clade 2.3.1, were genetically similar to and clustered together with contemporary H5N1 isolates from Qinghai, China. The viruses from clade 2.2.2 did not show any evidence of genetic reassortment and formed a separate cluster in the phylogenetic tree. They seemed to have established a stable lineage for the period of their circulation in the KSA without reassorting with other subtypes. Of the H5N1 viruses from clade 2.2.2, 99% had common ancestry and formed an evolutionarily distinct "Saudi" cluster in both the HA and NA gene phylogenetic trees. To analyze the Saudi isolates in a global perspective, the substitution rates of the HA and NA genes were determined. The mean substitution rates for the HA and NA genes were 2.036×10⁻³ and 2.072×10⁻³ substitutions/nucleotide/year, respectively. These rates are lower than those estimated earlier for the influenza virus (4.23×10⁻³ and 4.27×10⁻³ substitutions/nucleotide/year, respectively) [35,36]. A possible reason is the controlled or limited movement of these birds into Saudi Arabia. The mean substitution rate at each codon position was measured for both genes (HA and NA) from the Saudi isolates. The mean substitution rates at the third codon position in the HA and NA genes (1.718 and 1.928, respectively) were higher than at codon position 1 (0.767 and 0.754) and codon position 2 (0.516 and 0.32). The higher rate at the third codon position in the HA and NA genes is plausible, given that it is the wobble position in the genetic code.
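The rate comparison above amounts to simple arithmetic, sketched below in Python. The rates are those quoted in the text; pairing the earlier estimates with HA and NA follows the "respectively" in the sentence. Combining the rates with the aligned lengths from the Methods (HA, 1630 nt; NA, 1350 nt) to obtain expected substitutions per year is an illustrative step added here, not a calculation reported by the authors. All names are hypothetical.

```python
# Mean substitution rates (substitutions/nucleotide/year) quoted in the text.
SAUDI_RATES = {"HA": 2.036e-3, "NA": 2.072e-3}
EARLIER_RATES = {"HA": 4.23e-3, "NA": 4.27e-3}

# Aligned lengths from the Methods section (nucleotides).
ALIGNED_LENGTH_NT = {"HA": 1630, "NA": 1350}


def rate_ratio(gene: str) -> float:
    """Saudi estimate as a fraction of the earlier estimate."""
    return SAUDI_RATES[gene] / EARLIER_RATES[gene]


def expected_substitutions_per_year(gene: str) -> float:
    """Expected substitutions per year across the aligned region of one lineage."""
    return SAUDI_RATES[gene] * ALIGNED_LENGTH_NT[gene]


for gene in ("HA", "NA"):
    print(
        f"{gene}: {rate_ratio(gene):.2f} of the earlier estimate, "
        f"about {expected_substitutions_per_year(gene):.1f} substitutions/year "
        f"over {ALIGNED_LENGTH_NT[gene]} nt"
    )
```

Both Saudi estimates come out at a little under half of the earlier values, which corresponds to roughly three substitutions per year over the aligned region of each gene.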
The results of our study emphasize the need for continuous surveillance activities in Saudi Arabia that would enable detection of any newly emerging avian influenza viruses with pandemic potential.