Swine Influenza A Evolution via Recombination – Genetic Drift Reservoir
- Recombinomics, Inc
- Document Type:
- Manuscript
- Date:
- Received 07 July 2007 03:02 UTC; Posted 10 July 2007
- Subjects:
- Ecology, Genetics & Genomics
- Abstract:
The looming influenza pandemic has focused attention [1-4] on the rapid evolution of H5N1 and other human and avian serotypes. The basic tenets of influenza genetics [5] define gradual changes as drift caused by point mutations created by a polymerase that lacks a proofreading function. More abrupt changes have been linked to reassortment, which shuffles the eight sub-genomic segments of the influenza genome in a dually infected host. The complex evolution of these viruses has created a challenge in vaccine development. Swine influenza isolates from 2003 and 2004 have been identified [6] that have acquired a human influenza gene, PB1. My analysis of the eight gene segments found large portions of two genes, PB2 and PA, which were identical matches with 1977 swine isolates [7,8]. Additional regions were exact matches with 1998 and 2002 isolates [9,10], demonstrating homologous recombination between earlier genomes. The absolute fidelity discounts the role of point mutations in genetic drift. Moreover, the human PB1 gene represented a reservoir for acquisition of polymorphisms in human seasonal flu. These observations challenge the basic tenets of influenza genetics and provide a method for predicting the changes in seasonal and pandemic influenza, as well as other rapidly evolving genomes.
Discussion
The acquisitions can also be nested, which increases the number of alternating sequences. These alternating sequences are quite evident in the 1918 H1N1 pandemic sequences, which can be generated by swine H1N1 and human H1N1, eliminating the need to jump from avian to mammal.
This relationship applies to all eight gene segments. These data were presented at the Options VI meeting in Toronto last month.
http://www.recombinomics.com/presentations.html
The demonstrated pattern of unbroken blocks of identical sub-sequences is inconsistent with a mechanism of randomly acquired and cumulative transcription errors, and consistent with sub-gene exchange of genetic information. Clearly, mechanisms other than random drift are at work; these data suggest a much more sophisticated method of conserving well-adapted sequences and combining them to create novel versions of the virus.
Yes, the absolute fidelity in replication of long continuous stretches is exact at the nucleotide level. Even synonymous changes are absent. In one case the regions of identity, albeit somewhat shorter, go back to swine sequences from 1931 (see legend in figure 2b).
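The idea of long continuous stretches of nucleotide identity can be made concrete with a short sketch. The function below is illustrative only: it assumes the two sequences are already aligned to the same length, and the name `identical_runs` and the toy sequences are hypothetical, not taken from the manuscript.

```python
def identical_runs(a, b, min_len=1):
    """Return half-open (start, end) index pairs where aligned a and b match exactly."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    runs = []
    start = None  # start index of the current run of identity, if any
    for i, (x, y) in enumerate(zip(a, b)):
        if x == y:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i))
            start = None
    # close a run that extends to the end of the alignment
    if start is not None and len(a) - start >= min_len:
        runs.append((start, len(a)))
    return runs


# Toy aligned sequences (hypothetical), differing at a single position
recent = "ACGTACGTTTGACCA"
older = "ACGTACGTATGACCA"
print(identical_runs(recent, older, min_len=5))
```

Applied to real isolates, the interesting output would be runs hundreds of bases long that match an isolate collected decades earlier, with no synonymous changes inside the run.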
Speaking of random drift, the data for dispersal of SNPs is included in hdl:10101/npre.2007.459.2, while the data for aggregation is in doi:10.1038/npre.2007.553.1.
These exchanges involving genetic drift are far from random.
Sequences recently released from Switzerland add additional support for regional markers in H5N1 clade 2.2 isolates. The Swiss sequences are similar to a sub-clade from Germany, and two of these regional markers, G295A and C1480T, have been acquired by the human sequence from Nigeria. Although the number of regional markers is limited, such markers have been aggregated into the Nigerian HA sequence.
The aggregation is far from random, and can be mapped to predict future acquisitions.
Speaking of conservation of large blocks of sequences, new H5N1 sequences from Thailand create another problem for explanations of influenza evolution via random mutations. In the swine sequences, the most dramatic examples of sequence conservation were in two internal genes, PB2 and PA. These genes encode proteins that are part of the polymerase complex, and would therefore have considerable selection pressure for high fidelity replication, although the sequence identities were at the nucleotide level, so even silent changes were absent.
Today HA and NA sequences from a 2007 H5N1 isolate from a duck in Thailand were released at Genbank (A/duck/Phitsanulok/NIAH6-5-0001/2007). HA and NA genes encode surface glycoproteins which change rapidly to escape from the immune response of the host. This “antigenic drift” is of interest, because most vaccines target these two surface proteins and the vaccines quickly become obsolete because of antigenic drift.
The recent sequences from Thailand are also of interest because Thailand now has two distinct H5N1 clades co-circulating. Clade 1 was first identified in 2003/2004 isolates from Vietnam and Thailand. Clade 1 was found in birds and humans. Recently, the Fujian strain (Clade 2.3) was reported in multiple countries in southeast Asia, including Thailand. New human cases in Thailand and Vietnam have refocused attention on H5N1 circulating there.
The sequences released today were both clade 1. However, the partial 2007 HA sequence (1101 BP) was an exact match of three 2004 sequences from Thailand (listed below). Similarly, the partial NA sequence (432 BP) also exactly matched the available sequences (up to 406 BP) from the same three isolates.
Thus, even the rapidly evolving surface genes were copied with absolute fidelity for three years, raising additional questions about the role of random mutations generated by an error prone polymerase in the evolution of influenza.
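One way to put a rough number on this argument is a simple Poisson model of substitutions along a single lineage. The sketch below is only illustrative: the substitution rate is a placeholder assumption rather than a value from this discussion, and the model ignores selection, sampling, and lineage structure. The 1101 BP length and the three-year interval come from the comment above.

```python
import math


def prob_no_change(rate_per_site_per_year, length_bp, years):
    """Expected substitutions and Poisson probability of zero substitutions."""
    expected = rate_per_site_per_year * length_bp * years
    return expected, math.exp(-expected)


# ASSUMPTION: placeholder substitution rate, not taken from the source
ASSUMED_RATE = 1e-3  # substitutions per site per year (hypothetical)

expected, p_zero = prob_no_change(ASSUMED_RATE, length_bp=1101, years=3)
print(f"expected changes: {expected:.2f}, P(no change): {p_zero:.3f}")
```

Under a different assumed rate the probability changes accordingly, so the result should be read as a sensitivity check on the argument, not as a measurement.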
A/duck/Uthaithani-2-02/2004
A/chicken/Nakornsawan-2-06/2004
A/chicken/Thailand/Phitsanulok/01/2004
The Journal of Virology ahead of print (J. Virol. doi:10.1128/JVI.02683-07) paper, “Homologous Recombination is Very Rare or Absent in Human Influenza A Virus” cites this pre-print as bioinformatics evidence for influenza recombination.
However, the paper fails to find strong homologous recombination evidence in the dataset analyzed. There are several reasons for this failure. The study was limited to complete human gene sequences, which eliminated several clear examples in human HA sequences from South Korea in 2002. The requirement for human sequences eliminated additional examples, such as the swine recombinants described in this pre-print, and other examples in swine or birds. The full sequence requirement also eliminated many sequences from China, South Korea, and southeast Asia where many donor sequences originate.
The above requirements created a database that was largely composed of sequences generated under an NIAID Influenza Sequencing Project, which largely consisted of isolates from the United States or Australia. Therefore, most co-infections would involve closely related sequences. Recombination between closely related sequences would generate a limited number of differences in the recombinants relative to the parental sequences. Moreover, the study was directed toward the identification of recombinants plus the two parental sequences. Although there were hundreds of examples of short regions supporting recombination, only two isolates were identified as recombinants which had recombined regions greater than 100 BP, which was the minimum requirement for confirmation by phylogenetic analysis. However, the paper also suggested that isolates with clear-cut evidence for recombination involving large regions could have been generated during amplification because of contaminating sequences.
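The point about closely related parents can be illustrated with a small sketch: only positions where the two parental sequences differ can reveal which parent contributed a region, so near-identical parents leave few or no informative sites. The function name `informative_sites` and the toy sequences are hypothetical, and the sequences are assumed to be aligned.

```python
def informative_sites(parent_a, parent_b, start, end):
    """Positions in [start, end) where the aligned parents differ.

    Only these positions can distinguish which parent donated the region.
    """
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must be aligned to the same length")
    return [i for i in range(start, end) if parent_a[i] != parent_b[i]]


# Toy parents (hypothetical) that differ at only two positions
parent_a = "ACGTACGTACGTACGTACGT"
parent_b = "ACGAACGTACGTACGTAGGT"

print(informative_sites(parent_a, parent_b, 0, 20))  # whole alignment
print(informative_sites(parent_a, parent_b, 5, 15))  # a region with no informative sites
```

A recombination event confined to a region with no informative sites would be invisible, which is the limitation the comment attributes to a dataset dominated by closely related isolates.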
The recombinants described in this paper involved large regions of identity between the recombinants and the parental sequences. For the most striking examples, in PB2 and PA, generation of these recombinants during amplification would require the presence of two 1977 Tennessee isolates, a 1998 North Carolina isolate, a 2002 Korea isolate, and a 1931 Iowa isolate. Moreover, several sequences would have required the simultaneous presence of many of these isolates during the amplification, and the above isolates would have had to selectively contaminate the PB2 and PA sequences. Additional contamination would have been required during amplification of additional genes, and would have had to be absent from the amplification of the human genes, to artificially create the other gene sequences.
Such a scenario is highly unlikely.
The Journal of Virology ahead of print (doi:10.1128/JVI.00101-08) paper, “Anomalies in the Influenza Genome Database: New Biology or Lab Errors?”, has two appendices that include the Canadian swine isolates discussed in “Swine Influenza A Evolution via Recombination – Genetic Drift Reservoir” above. The isolates are in the appendices because the acquired sequences are evolutionarily conserved, which include those found in 1977 isolates. The ahead of print paper speculates that these sequences are lab contaminants, and the apparent recombination is due to amplification of the contaminating sequences. The authors predict that re-sequencing of the samples will yield new sequences because maintenance of the cross-over points in the amplicons would be unlikely.
However, the published swine sequences include independent samples which have maintained cross-over points. In the PB2 sequences in figure 1A, three isolates (11112, 57561, and 56626) have acquired sequences from a 1998 North Carolina isolate, A/swine/North Carolina/35922/98, which begin at position 1326 and end at position 1594. Moreover, two of the isolates have a 1977 Tennessee sequence, A/swine/Tennessee/24/77, which is nested within the North Carolina sequence, between positions 1006 and 1326. These shared cross-over points between two or more isolates would minimize the possibility of artifact sequences due to amplification of contaminating viral RNA.
Similarly, the examples of recombination in human H3N2 HA sequences from South Korea include sets of sequences which have identical cross-over points. The six sequences fall into two groups of three. All six sequences are contemporary 2002 HA sequences which have a nested set of HA sequences which match H3N2 in circulation a decade earlier. These acquired sequences are between positions 575 and 1008. Three of the isolates (A/Incheon/260/2002, A/Daejeon/258/2002, and A/Kyongbuk/320) have the earlier sequences uninterrupted. However, the other three sequences (A/Chennam/323/2002, A/Chennam/340/2002, and A/Chennam/432/2002) have contemporary 2002 sequences nested between positions 685-806 as well as 879-963. Once again these cross-over points match in three or more isolates, reducing the likelihood that these earlier sequences are due to lab RNA contamination.
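The shared cross-over points described for the six South Korean HA sequences can be tallied with a short sketch. The positions are taken from the comment above; how they are encoded as blocks of earlier sequence per isolate (for example, treating 685 and 806 as block boundaries) is an assumption made for illustration, as is the function name `shared_breakpoints`.

```python
from collections import Counter

# Blocks of "earlier" sequence per isolate, as (start, end) positions.
# ASSUMPTION: block boundaries are inferred from the positions quoted in the text.
uninterrupted = [(575, 1008)]
interrupted = [(575, 685), (806, 879), (963, 1008)]

blocks_by_isolate = {
    "A/Incheon/260/2002": uninterrupted,
    "A/Daejeon/258/2002": uninterrupted,
    "A/Kyongbuk/320": uninterrupted,
    "A/Chennam/323/2002": interrupted,
    "A/Chennam/340/2002": interrupted,
    "A/Chennam/432/2002": interrupted,
}


def shared_breakpoints(blocks_by_isolate, min_isolates=2):
    """Map each breakpoint position to the number of isolates that share it."""
    counts = Counter()
    for blocks in blocks_by_isolate.values():
        # count each position once per isolate
        points = {position for block in blocks for position in block}
        counts.update(points)
    return {p: n for p, n in sorted(counts.items()) if n >= min_isolates}


print(shared_breakpoints(blocks_by_isolate))
```

Breakpoints that recur in several independently collected isolates are the feature the comment relies on when arguing against amplification artifacts.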
Moreover, samples representing each group were re-sequenced by two independent laboratories, the CDC in Atlanta and NIID in Tokyo, and each lab generated sequences which were exact matches with the earlier sequences of the CDC in South Korea. The independent sequences were included in the Science paper “The Global Circulation of Seasonal Influenza A (H3N2) Viruses” (DOI: 10.1126/science.1154137).
Thus, the swine sequences in the above pre-print, as well as the human H3N2 sequences, contain contemporary genes that have acquired evolutionarily conserved sequences which match those in circulation a decade or more earlier. Each dataset includes multiple isolates which share cross-over points, greatly reducing the likelihood that the sequence data are due to lab contamination, and supporting the observation that such sequences are generated by homologous recombination.
H1N1 sequences have been recently released from South Africa, which confirm initial reports that 100% of H1N1 isolates include neuraminidase (NA) H274Y, which confers oseltamivir resistance. As expected, all eight NA sequences had H274Y and the eight HA sequences were closely related to Brisbane/59 (clade 2B) sequences reported earlier from other countries with widespread oseltamivir resistance. The HA sequences, however, fell into two distinct sub-clades. Five of the eight sequences had a cluster of polymorphisms (A558G, G604A, G605A, C610T, and G617A). G617A had been reported earlier in the dominant H274Y containing sub-clade in the United States. Published phylogenetic trees suggested that this change was also present in northern Europe, as was seen in one public sequence from England (see list below). G617A generated the non-synonymous change A193T (H3 numbering), which was also in circulation in H1N1 in the 1940’s as shown below.
However, the adjacent change in the South African isolates, C610T was also present in a sub-set of the 1940 isolates, providing additional evidence for sequential acquisition of earlier adjacent polymorphisms.
This type of acquisition is most easily explained by homologous recombination.
Isolates with 16 BP region of identity including G617A
A/Johannesburg/46/2008
A/Johannesburg/35/2008
A/Johannesburg/34/2008
A/Johannesburg/25/2008
A/Johannesburg/10/2008
A/South Carolina/01/2008
A/Wisconsin/01/2008
A/Hawaii/02/2008
A/North Carolina/02/2008
A/New Jersey/10/2008
A/Memphis/03/2008
A/Washington/01/2008
A/New Jersey/08/2008
A/New Jersey/AF1291/2008
A/England/557/2007
A/New Jersey/05/2008
A/New Jersey/20/2007
A/New Jersey/06/2008
A/Pennsylvania/02/2008
A/Maryland/04/2007
A/New Jersey/16/2007
A/New Jersey/15/2007
A/Albany/4836/1950
A/Roma/1949
A/Albany/4835/1948
A/Lepine/1948
A/Fort Monmouth/1/1947
A/Rhodes/47
A/Cam/46
A/Hickox/1940
Isolates with extended identity including G617A and C610T
A/Johannesburg/46/2008
A/Johannesburg/35/2008
A/Johannesburg/34/2008
A/Johannesburg/25/2008
A/Johannesburg/10/2008
A/Fort Monmouth/1/1947
A/Rhodes/47
A/Cam/46
A/Hickox/1940
The upcoming Virology paper (YVIRO-04809) “Homologous recombination evidence in human and swine influenza A viruses” by He et al. describes recombination in A/swine/Ontario/57561/03(H1N1) and A/Swine/Ontario/53518/03(H1N1), which included donor sequences from A/swine/Ontario/55384/04(H1N2) and A/swine/Alberta/56626/03(H1N1).
These examples of recombination are described in the Nature Precedings paper above.
Twenty full sets of swine influenza sequence from Korea were recently released at Genbank under the title “Swine Influenza: seroprevalence and genetic evolution under vaccine pressure in Korean swine herds”. The H3N2, H1N2, and H1N1 isolates had a constellation of genes similar to the majority of the Canadian swine isolates, with seven swine influenza gene segments and a human PB1 most closely related to human seasonal flu sequences from the mid 90’s.
However, these sequences had clear examples of recombination with avian flu sequences from wild birds. Two of the isolates, A/swine/Korea/CY07/2007(H3N2) and A/swine/Korea/CY08/2007(H1N2) had the same wild bird sequence in the first half of PA. The H1N2 isolate also had long segments of avian sequences in PB2.
However, recombination was also present in the human PB1 sequence. One isolate, A/swine/Korea/CY09/2007(H3N2), had long regions of identity with wild bird sequences. Moreover, the other isolate, A/swine/Korea/CAS05/2004(H3N2), had clade 2.2 H5N1 sequences in its human PB1. In addition, there were clade 1 H5N1 sequences in PB2 for this isolate.
These sequences have examples of recombination which complement the analysis of the Canadian swine sequences. In contrast to the Canadian sequences, which represent recombination between contemporary and earlier swine influenza sequences, the isolates in Korea have genes that are recombinants between swine and avian or human and avian influenza genes.
Although swine reassortants with swine, avian, and human influenza genes have been described previously, these sequences represent the first reported swine/avian or human/avian recombinant genes.
The above sequences are included in Virus Research 2008 Sep 10. [Epub ahead of print] by Pascua PN, et al.
The recent H1N1 sequences from South Africa with H274Y have now been described in “Widespread Oseltamivir Resistance in Influenza A Viruses (H1N1), South Africa” by Besselaar TG et al. Epub ahead of print DOI:10.3201/eid1411.080958
The recent release of H1N1 HA and NA sequences from this season in the United States (by the CDC), coupled with the release of HA sequences from isolates collected last season in Norway, allows for phylogenetic analysis and polymorphism tracing to detail the fixing of H274Y in clade IIB (Brisbane/59).
Earlier analysis indicated that the widespread distribution of H274Y was linked to the emergence of a dominant sub-clade, which included isolates in Norway, where H274Y levels in H1N1 exceeded 65%. Previously, H274Y had been reported in clade IIC (Hong Kong) in China, followed by clade I (New Caledonia) in the United States and United Kingdom, followed by clade IIB in Hawaii. However, although H274Y could jump from one genetic background to another, which is most easily explained by recombination, the earlier isolates did not become dominant.
However, when H274Y jumped to a sub-clade with tandem NA changes, D344N and D354G, which had been found in earlier IIC isolates, frequencies of H274Y rose. The jump of these two polymorphisms to clade IIB led to H274Y dominance in Norway, as well as high levels in several countries in northern Europe.
Over the summer, the dominance rose to 100% in South Africa. Those isolates had the tandem NA changes, but they also had a stretch of 16 BP near position 190 in the receptor binding domain, which was in the dominant sub-clade in the United States, and also found in H1N1 isolates from the 1940’s. This stretch encoded for A193T, which is found in all of the public clade IIC sequences from the United States this season.
Moreover, 49 out of 50 H1N1 isolates in the United States are resistant, suggesting this sub-clade is widespread. The published sequences are from isolates in Hawaii, Texas, Pennsylvania, and Wisconsin, as well as a Washington state isolate collected over the summer, which had the cluster of changes in the South African sequences.
The presence of A193T may have influenced influenza serotypes in the United States and Europe. Only one public European sequence from last season contained A193T. It was not present in the more than 30 HA sequences from Norway, or other sequences from several European countries, but is in last season’s dominant sub-clade in the United States, and in all clade II isolates published for this season. In the United States, most influenza is H1N1 and all but one was clade IIB, while in Europe most influenza A is H3N2.
New sequences with 16 BP region including A193T
A/Wisconsin/12/2008
A/Texas/18/2008
A/Texas/17/2008
A/Texas/16/2008
A/Texas/15/2008
A/Pennsylvania/09/2008
A/Pennsylvania/08/2008
A/Hawaii/21/2008
A/Hawaii/19/2008
Very elegant work. I think it’s fascinating that these polymorphisms that are important in antigenic determination and antiviral resistance can be tracked back over many years and also that strong predictions of them occurring on new templates can be made. Especially interesting that this can be tracked on multiple gene segments.
Would be interesting to see these mechanisms used to help choose seasonal and pandemic vaccine candidates.
Is the data sufficient yet to apply the reasoning of your recombination argument to predict future evolutionary outcome?
You’ve made predictions on this topic in the past, but they were general, assumedly due to the paucity of data.
Has the new data, not present in the H5N1 literature (hoarding being imo onerous and obviously risky to the world’s health), present in H1N1 and H3N2 been sufficient to begin to establish test predictive rules and consequent predictions?
A recent update from the NIH in Japan includes phylogenetic analysis of HA , NA , and MP of three isolates (A/Sendai/103/2008, A/Sendai/104/2008, A/Sendai/105/2008) from the elementary school students linked to the closing of 10 schools because of an influenza outbreak in October.
The three isolates are identical to each other and are the Brisbane/59 sub-clade 2B with H274Y. The HA sequences have three changes flanking position 190 (H3 numbering). These three changes, G189V, A193T, and H196R, match the three changes in isolates collected this season from Hawaii, Texas, and Pennsylvania, and support the role of A193T as the driver in the hitch hiking of H274Y associated with worldwide fixing of oseltamivir resistance in H1N1.
A193T was also present in the H1N2 isolates from the three Canadian swine that had human HA and NA genes, as well as human H1N2 in circulation from 2001-2003.
The movement of A193T from one genetic background to another also supports recombination.
Can this be used as a match pattern for the next inclusion round when selecting the next trivalents? It’s in Japan. So, is there a way of determining probability of its sustainability at least long enough to move from East to West, from Japan to the US?
The NIH in Japan just released a report on H1N1 including an NA phylogenetic tree which included 33 local isolates from this season. All had D354G, which was in the dominant H1N1 sub-clade last season in Europe and North America. Most also mapped with US isolates from this season that had HA A193T, which emerged last season in clade 2B in the US and England in late 2007, in advance of vaccine target selection for the current season. However, neither HA A193T nor NA D354G or H274Y are in the current vaccine target, Brisbane/59.
The Japan NIH has issued a report on H1N1 that included phylogenetic analysis of 33 of the 50 H274Y isolates reported to date in Japan. All 33 mapped into the sub-clade with D354G, which was in the dominant sub-clade with H274Y last season. However, this season the isolates in the United States emerged from a sub-clade within this dominant sub-clade, and all isolates had A193T on HA.
An earlier phylogenetic analysis from Japan had three isolates from a major outbreak in Sendai. Those sequences matched the dominant sub-clade from the United States, which had three receptor binding domain changes (G189V, A193T, H196R). The NA from those sequences had N28S, which was seen in isolates from Yokohama and Sendai (see list below). Another grouping had two other NA polymorphisms, A86T and T339A (from Miyagi and Yamaguchi). Those changes matched A/Hawaii/19/2008, which also had A193T.
The largest branch from Japan had isolates that did not have non-synonymous changes, but mapped on a branch with the above isolates, suggesting this branch (with isolates from Sapporo, Mie, Osaka, Sakai, Shizuoka, Kobe, and Hiroshima) also identified isolates with A193T on HA.
In addition, one isolate had N73K, which was in the major sub-clade from South Africa, which also had A193T. Thus, it appears that 31/33 of the isolates from Japan have HA A193T. The two exceptions were from Shiga.
Moreover, an isolate from China, A/Jilin-Chaoyang/1191/08, maps with the US isolates, suggesting this same sub-clade has spread to China. Another isolate from China, A/Beijing-Huairou/15/08, is closely related, raising concerns that H274Y levels in H1N1 in China are also near 100%.
The spread of resistant H1N1 to China raises concerns with regard to co-circulating H5N1. China has just reported four confirmed cases of H5N1 and four additional contacts are suspect cases. These patients, as well as contacts, are treated with oseltamivir, raising concerns that patients or contacts infected with H5N1 could also be infected with H1N1 with H274Y, which could lead to the emergence of H5N1 with H274Y.
Japan isolates (31/33)
N28S
Yokohama/95/08
Yokohama/96/08
Sendai/103/08
Sendai/104/08
Sendai/105/08
Sendai-H/2103/08
Texas/15/2008
Texas/16/2008
Texas/17/2008
Texas/18/2008
Pennsylvania/08/08
Pennsylvania/09/08
Hawaii/21/08
A86T + T339A
Miyagi/35/08
Yamaguchi/26/08
Yamaguchi/27/08
Yamaguchi/28/08
Hawaii/19/08
Related to above
Hiroshima/44/08
Hiroshima/45/08
Hiroshima/46/08
Hiroshima/48/08
Kobe/49/08
Mie/32/08
Osaka-C/39/08
Sakai/30/08
Sakai/47/08
Sapporo/63/08
Sapporo/64/08
Sapporo/66/08
Sapporo/68/08
Sapporo/70/08
Sapporo/72/08
Sapporo/80/08
Sapporo/83/08
Shizuoka-C/51/08
Shizuoka-C/54/08
N73K
Chiba/86/08
Johannesburg/12/2008
Johannesburg/14/2008
Johannesburg/33/2008
Johannesburg/35/2008
HA sequences from Kenya have been released through the US Air Force surveillance program. These sequences identified high levels of oseltamivir-resistant H1N1 in Kenya, and provided additional support for recombination and the role of A193T in the fixing of H274Y.
Most of the sequences fell into two groups. One matched the sequences released from isolates collected last season in Kenya. These sequences had G189N (H3 numbering), which was encoded by two adjacent non-synonymous changes, G604A and G605A. However, a second group matched the oseltamivir-resistant isolates identified in South Africa over the summer. These isolates also had G189N, but it was flanked by two additional changes, S187N and A193T. S187N was found in clade 2C in Hong Kong, while A193T had also been in clade 2C prior to acquisition by the clade 2B sub-clade that emerged worldwide this season. There were six isolates (see list below) that had the three receptor binding changes, S187N, G189N, and A193T, a combination that was dominant in South Africa and was also in an isolate from Washington State. Moreover, phylogenetic analysis of the NA sequences suggests an isolate from Chiba, Japan also matches this group. The isolates in Kenya were collected in October and November, but support an earlier Kenya origin for G189N.
In addition to the series that matches the isolates from South Africa, another isolate, A/Mbagathi/7586/2008, has G189A and A193T, which matches another isolate from the US that emerged this season, A/Hawaii/19/2008. This combination also matches another series from Japan this season, based on phylogenetic analysis of NA.
Another Kenya isolate from the summer, A/Kisumu/6543/2008, has G189N plus H196R. H196R is in the dominant H1N1 sub-clade reported to date in the United States, as well as multiple locations in Japan this season.
Yet another isolate, A/Kisii/7577/2008, has G189N plus A193T.
Thus, the isolates from Kenya contain a number of combinations of receptor binding domain polymorphisms which match H1N1 isolates with H274Y on NA, as well as A193T on HA.
S187N, G189N, A193T
Kisii/7541/2008
Kisii/7547/2008
Kisii/7559/2008
Kisii/7562/2008
Kisii/7570/2008
Kisii/7576/2008
Johannesburg/10/2008
Johannesburg/25/2008
Johannesburg/34/2008
Johannesburg/35/2008
Johannesburg/46/2008
Washington/05/2008
Chiba/86/2008
G189A, A193T
Mbagathi/7586/2008
Hawaii/19/2008
Miyagi/35/08
Yamaguchi/26/08
Yamaguchi/27/08
Yamaguchi/28/08
H196R (G189V, A193T also in all but Kisumu)
Kisumu/6543/2008
Hawaii/21/2008
Pennsylvania/08/2008
Pennsylvania/09/2008
Texas/15/2008
Texas/16/2008
Texas/17/2008
Texas/18/2008
Sendai/103/08
Sendai-H/2103/08
Sendai/104/08
Sendai/105/08
Yokohama/95/08
Yokohama/96/08
Additional information
- License:
- This document is licensed to the public under the Creative Commons Attribution 2.5 License
- How to cite this document:
-
Niman, Henry. Swine Influenza A Evolution via Recombination – Genetic Drift Reservoir. Available from Nature Precedings <http://hdl.handle.net/10101/npre.2007.385.1> (2007)
- Version info:
-
Other versions of this document in Nature Precedings
None.
Other versions of this document elsewhere on the web
None known.
NS Unis on 16 July 2007 14:41 UTC
Fidelity of over 1,600 nucleotides in the PB2 of Influenza across 27 years is a substantial find. Adding to that fact, the author presents data showing that this same fidelity has occurred at least twice.
The analysis of identities from the other associated gene segments provides further evidence.
Great color-coded figures showing the aggregated construction of each recent isolate as potentially composed by recombination of prior sequences.