Document information
Swine Influenza A Evolution via Recombination – Genetic Drift Reservoir
- Recombinomics, Inc
- Document Type: Manuscript
- Date: Received 07 July 2007 03:02 UTC; Posted 10 July 2007
- Subjects: Biotechnology, Evolution and Ecology, Genetics, Microbiology, Bioinformatics
- Abstract:
The looming influenza pandemic has focused attention [1-4] on the rapid evolution of H5N1 and other human and avian serotypes. The basic tenets of influenza genetics [5] define gradual changes as drifts caused by point mutations created by a polymerase that lacks a proofreading function. More abrupt changes have been linked to reassortment, which shuffles the eight sub-genomic segments of the influenza genome in a dually infected host. The complex evolution of these viruses has created a challenge in vaccine development. Swine influenza isolates from 2003 and 2004 have been identified [6] that have acquired a human influenza gene, PB1. My analysis of the eight gene segments found large portions of two genes, PB2 and PA, which were identical matches with 1977 swine isolates [7,8]. Additional regions were exact matches with 1998 and 2002 isolates [9,10], demonstrating homologous recombination between earlier genomes. The absolute fidelity discounts the role of point mutations in genetic drift. Moreover, the human PB1 gene represented a reservoir for acquisition of polymorphisms in human seasonal flu. These observations challenge the basic tenets of influenza genetics and provide a method for predicting the changes in seasonal and pandemic influenza, as well as other rapidly evolving genomes.
Discussion
The acquisitions can also be nested, which increases the number of alternating sequences. These alternating sequences are quite evident in the 1918 H1N1 pandemic sequences, which can be generated by swine H1N1 and human H1N1, eliminating the need to jump from avian to mammal.
This relationship applies to all eight gene segments. These data were presented at the Options VI meeting in Toronto last month.
http://www.recombinomics.com/presentations.html
The demonstrated pattern of unbroken blocks of identical sub-sequences is inconsistent with a mechanism of randomly acquired and cumulative transcription errors, and consistent with sub-gene exchange of genetic information. Clearly, mechanisms other than random drift are at work; these data suggest a much more sophisticated method of conserving well-adapted sequences and combining them to create novel versions of the virus.
Yes, the absolute fidelity in replication of long continuous stretches is exact at the nucleotide level. Even synonymous changes are absent. In one case the regions of identity, albeit somewhat shorter, go back to swine sequences from 1931 (see legend in figure 2b).
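The kind of comparison described here can be sketched in a few lines. This is only an illustration, assuming the two sequences have already been aligned to the same length; the function name `identity_blocks` and the `min_len` parameter are hypothetical and not taken from the pre-print.

```python
def identity_blocks(seq_a, seq_b, min_len=1):
    """Return maximal runs of nucleotide identity between two aligned sequences.

    Each run is reported as (start, end), 1-based and inclusive.
    Alignment gaps ('-') are never counted as identical, so they end a run.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    blocks = []
    start = None  # 0-based start of the run currently being extended
    for i, (a, b) in enumerate(zip(seq_a.upper(), seq_b.upper())):
        same = a == b and a != "-"
        if same and start is None:
            start = i
        elif not same and start is not None:
            if i - start >= min_len:
                blocks.append((start + 1, i))
            start = None

    # Close a run that extends to the end of the alignment.
    if start is not None and len(seq_a) - start >= min_len:
        blocks.append((start + 1, len(seq_a)))
    return blocks
```

For two toy sequences that differ only at position 5, `identity_blocks("ACGTACGTAA", "ACGTTCGTAA")` returns `[(1, 4), (6, 10)]`. Because the comparison is made at the nucleotide level, a single synonymous change would split a block in the same way.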
Speaking of random drift, the data for dispersal of SNPs are included in hdl:10101/npre.2007.459.2, while the data for aggregation are in doi:10.1038/npre.2007.553.1.
These exchanges involving genetic drift are far from random.
Sequences recently released from Switzerland add further support for regional markers in H5N1 clade 2.2 isolates. The Swiss sequences are similar to a sub-clade from Germany, and two of these regional markers, G295A and C1480T, have been acquired by the human sequence from Nigeria. Although the number of regional markers is limited, such markers have been aggregated into the Nigerian HA sequence.
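Checking a sequence for markers written in this notation is simple to sketch. The example below assumes that a marker such as G295A means "reference base G, 1-based position 295, acquired base A" and that positions use the numbering of the sequence being tested; the function names are hypothetical.

```python
import re


def parse_marker(marker):
    """Split a marker such as 'G295A' into (reference base, position, acquired base)."""
    match = re.fullmatch(r"([ACGT])(\d+)([ACGT])", marker.upper())
    if match is None:
        raise ValueError(f"unrecognised marker: {marker!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def markers_present(sequence, markers):
    """Report, for each marker, whether the sequence carries the acquired base.

    Positions are assumed to be 1-based and in the numbering of `sequence`.
    A marker beyond the end of a partial sequence is reported as False.
    """
    result = {}
    for marker in markers:
        _, position, acquired = parse_marker(marker)
        in_range = position <= len(sequence)
        result[marker] = in_range and sequence[position - 1].upper() == acquired
    return result
```

Calling `markers_present(ha_sequence, ["G295A", "C1480T"])` on an HA sequence held in a hypothetical variable `ha_sequence` would return a dictionary showing which of the two markers it carries.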
The aggregation is far from random, and can be mapped to predict future acquisitions.
Speaking of conservation of large blocks of sequences, new H5N1 sequences from Thailand create another problem for explanations of influenza evolution via random mutations. In the swine sequences, the most dramatic examples of sequence conservation were in two internal genes, PB2 and PA. These genes encode proteins that are part of the polymerase complex, and would therefore be under considerable selection pressure for high-fidelity replication. However, the sequence identities were at the nucleotide level, so even silent changes were absent.
Today HA and NA sequences from a 2007 H5N1 isolate from a duck in Thailand were released at GenBank (A/duck/Phitsanulok/NIAH6-5-0001/2007). HA and NA genes encode surface glycoproteins which change rapidly to escape from the immune response of the host. This “antigenic drift” is of interest because most vaccines target these two surface proteins, and the vaccines quickly become obsolete as a result.
The recent sequences from Thailand are also of interest because Thailand now has two distinct H5N1 clades co-circulating. Clade 1 was first identified in 2003/2004 isolates from Vietnam and Thailand, and was found in birds and humans. Recently, the Fujian strain (Clade 2.3) was reported in multiple countries in southeast Asia, including Thailand. New human cases in Thailand and Vietnam have refocused attention on H5N1 circulating there.
The sequences released today were both clade 1. However, the partial 2007 HA sequence (1101 bp) was an exact match of three 2004 sequences from Thailand (listed below). Similarly, the partial NA sequence (432 bp) also exactly matched the available sequences (up to 406 bp) from the same three isolates.
Thus, even the rapidly evolving surface genes were copied with absolute fidelity for three years, raising additional questions about the role of random mutations generated by an error-prone polymerase in the evolution of influenza.
A/duck/Uthaithani-2-02/2004
A/chicken/Nakornsawan-2-06/2004
A/chicken/Thailand/Phitsanulok/01/2004
The Journal of Virology ahead of print (J. Virol. doi:10.1128/JVI.02683-07) paper, “Homologous Recombination is Very Rare or Absent in Human Influenza A Virus”, cites this pre-print as bioinformatics evidence for influenza recombination.
However, the paper fails to find strong homologous recombination evidence in the dataset analyzed. There are several reasons for this failure. The study was limited to complete human gene sequences, which eliminated several clear examples in human HA sequences from South Korea in 2002. The requirement for human sequences eliminated additional examples, such as the swine recombinants described in this pre-print, and other examples in swine or birds. The full sequence requirement also eliminated many sequences from China, South Korea, and southeast Asia where many donor sequences originate.
The above requirements created a database that was largely composed of sequences generated under an NIAID Influenza Sequencing Project, which largely consisted of isolates from the United States or Australia. Therefore, most co-infections would involve closely related sequences. Recombination between closely related sequences would generate only a limited number of differences in the recombinants relative to the parental sequences. Moreover, the study was directed toward the identification of recombinants plus the two parental sequences. Although there were hundreds of examples of short regions supporting recombination, only two isolates were identified as recombinants with recombined regions greater than 100 bp, which was the minimum requirement for confirmation by phylogenetic analysis. However, the paper also suggested that isolates with clear-cut evidence for recombination involving large regions could have been generated during amplification because of contaminating sequences.
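The point about closely related parents can be made concrete with a small sketch. A recombinant can only be assigned to one parent or the other at sites where the two parents differ, so when the parents are nearly identical very few sites carry any signal. The code assumes all three sequences are aligned to the same length; the function names and the labels 'A', 'B' and '?' are illustrative and are not the method used in the cited paper.

```python
def informative_sites(parent_a, parent_b):
    """Return 1-based positions at which two aligned parental sequences differ."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parental sequences must be aligned to the same length")
    return [i + 1 for i, (a, b) in enumerate(zip(parent_a, parent_b)) if a != b]


def parent_pattern(recombinant, parent_a, parent_b):
    """Label each informative site by the parent the recombinant matches there.

    Returns a list of (position, label) pairs, where label is 'A', 'B',
    or '?' when the recombinant matches neither parent at that site.
    """
    if len(recombinant) != len(parent_a):
        raise ValueError("recombinant must be aligned to the parental sequences")

    pattern = []
    for position in informative_sites(parent_a, parent_b):
        base = recombinant[position - 1]
        if base == parent_a[position - 1]:
            label = "A"
        elif base == parent_b[position - 1]:
            label = "B"
        else:
            label = "?"
        pattern.append((position, label))
    return pattern
```

With toy parents `"AAAAAAAAAA"` and `"AACAAGAATA"`, the sequence `"AAAAAGAATA"` gives `[(3, 'A'), (6, 'B'), (9, 'B')]`, which places a crossover somewhere between positions 3 and 6. If the parents differed at a single site, no switch between them could be seen at all.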
The recombinants described in this paper involved large regions of identity between the recombinants and the parental sequences. For the most striking examples, in PB2 and PA, generation of these recombinants during amplification would require the presence of two 1977 Tennessee isolates, a 1998 North Carolina isolate, a 2002 Korea isolate, and a 1931 Iowa isolate. Moreover, several sequences would have required the simultaneous presence of many of these isolates during the amplification, and the above isolates would have had to selectively contaminate the PB2 and PA sequences. Additional contamination would have been required during amplification of the other genes, yet would have had to be absent from the amplification of the human genes, to artificially create the other gene sequences.
Such a scenario is highly unlikely.
The Journal of Virology ahead of print (doi:10.1128/JVI.00101-08) paper, “Anomalies in the Influenza Genome Database: New Biology or Lab Errors?”, has two appendices that include the Canadian swine isolates discussed in “Swine Influenza A Evolution via Recombination – Genetic Drift Reservoir” above. The isolates are in the appendices because the acquired sequences are evolutionarily conserved, including those found in 1977 isolates. The ahead of print paper speculates that these sequences are lab contaminants, and that the apparent recombination is due to amplification of the contaminating sequences. The authors predict that re-sequencing of the samples will yield new sequences because maintenance of the cross-over points in the amplicons would be unlikely.
However, the published swine sequences include independent samples which have maintained cross-over points. In the PB2 sequences in figure 1A, three isolates (11112, 57561, and 56626) have acquired sequences from a 1998 North Carolina isolate, A/swine/North Carolina/35922/98, which begin at position 1326 and end at position 1594. Moreover, two of the isolates have a 1977 Tennessee sequence, A/swine/Tennessee/24/77, which is nested within the North Carolina sequence, between positions 1006 and 1326. These cross-over points, shared between two or more isolates, minimize the possibility that the sequences are artifacts due to amplification of contaminating viral RNA.
Similarly, the examples of recombination in human H3N2 HA sequences from South Korea include sets of sequences which have identical cross-over points. The six sequences fall into two groups of three. All six sequences are contemporary 2002 HA sequences which have a nested set of HA sequences which match H3N2 in circulation a decade earlier. These acquired sequences are between positions 575 and 1008. Three of the isolates (A/Incheon/260/2002, A/Daejeon/258/2002, and A/Kyongbuk/320) have the earlier sequences uninterrupted. However, the other three sequences (A/Chennam/323/2002, A/Chennam/340/2002, and A/Chennam/432/2002) have contemporary 2002 sequences nested between positions 685-806 as well as 879-963. Once again these cross-over points match in three or more isolates, reducing the likelihood that these earlier sequences are due to lab RNA contamination.
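The shared cross-over points in the six South Korean HA sequences can be tabulated with a short sketch. The positions and isolate names are those given above; treating each region boundary as a cross-over point, and the names `ISOLATES` and `shared_crossovers`, are simplifying assumptions made only for this illustration.

```python
from collections import Counter

# Positions as stated in the text: the acquired earlier sequence spans
# 575-1008; in the Chennam isolates, contemporary 2002 sequence is nested
# at 685-806 and 879-963.
ACQUIRED = (575, 1008)
NESTED = [(685, 806), (879, 963)]

ISOLATES = {
    "A/Incheon/260/2002": [ACQUIRED],
    "A/Daejeon/258/2002": [ACQUIRED],
    "A/Kyongbuk/320": [ACQUIRED],
    "A/Chennam/323/2002": [ACQUIRED] + NESTED,
    "A/Chennam/340/2002": [ACQUIRED] + NESTED,
    "A/Chennam/432/2002": [ACQUIRED] + NESTED,
}


def shared_crossovers(isolates, min_isolates=2):
    """Count how many isolates share each region boundary.

    Every start and end position of a region is treated as a cross-over
    point (an assumption of this sketch). Only points present in at least
    `min_isolates` isolates are returned, keyed by position.
    """
    counts = Counter()
    for regions in isolates.values():
        points = {position for region in regions for position in region}
        counts.update(points)
    return {pos: n for pos, n in sorted(counts.items()) if n >= min_isolates}
```

Here `shared_crossovers(ISOLATES)` gives `{575: 6, 685: 3, 806: 3, 879: 3, 963: 3, 1008: 6}`: the outer boundaries are common to all six sequences and the nested boundaries to the three Chennam sequences.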
Moreover, samples representing each group were re-sequenced by two independent laboratories, the CDC in Atlanta and NIID in Tokyo, and each lab generated sequences which were exact matches with the earlier sequences of the CDC in South Korea. The independent sequences were included in the Science paper “The Global Circulation of Seasonal Influenza A (H3N2) Viruses” (DOI: 10.1126/science.1154137).
Thus, the swine sequences in the above pre-print, as well as the human H3N2 sequences, contain contemporary genes that have acquired evolutionarily conserved sequences which match those in circulation a decade or more earlier. Each dataset includes multiple isolates which share cross-over points, greatly reducing the likelihood that the sequence data are due to lab contamination and supporting the observation that such sequences are generated by homologous recombination.
Additional information
- License: This document is licensed to the public under the Creative Commons Attribution 2.5 License
- How to cite this document: Niman, Henry. Swine Influenza A Evolution via Recombination – Genetic Drift Reservoir. Available from Nature Precedings <http://hdl.handle.net/10101/npre.2007.385.1> (2007)
- Version info:
Other versions of this document in Nature Precedings
None.
Other versions of this document elsewhere on the web
None known.
NS Unis on 16 July 2007 14:41 UTC
Fidelity of over 1,600 nucleotides in the PB2 of Influenza across 27 years is a substantial find. Adding to that fact, the author presents data showing that this same fidelity has occurred at least twice.
The analysis of identities from the other associated gene segments provides further evidence.
Great color-coded figures showing the aggregated construction of each recent isolate as potentially composed by recombination of prior sequences.