E. coli EDL933 is the prototypic strain for enterohemorrhagic E. coli serotype O157:H7.
The publicly available sequence of the EDL933 genome has gaps and more than 6,000 ambiguous base calls, so researchers have presented an updated, unambiguous genome sequence with no assembly gaps.
Their analysis includes so-called jumping genes, which can move around within a genome, sometimes damaging individual genes or enabling antibiotic resistance.
Isolated in the early 1980s
Strain EDL933 (ATCC 43895) was isolated from ground beef linked to a hamburger outbreak in Michigan, USA in 1982.
It gained public attention following its association with an outbreak in 1993 related to the US fast-food chain Jack-in-the-Box.
“With a complete genome sequence, we can now pinpoint the precise location of all such elements, which might help to track and treat future outbreaks,” said Ramy Aziz, the senior author.
The genome sequence was first published in 2001, but it contained many gaps that could not be closed with the sequencing technology available at the time.
“Although the full genome of EDL933 was sequenced and published in 2001, the deposited assembled genome has >6,000 ambiguous base calls and a chromosomal gap of 4,000 bp,” said the researchers.
“While the utility of this reference genome…is indisputable, several analyses reliant on a pristine reference (e.g., single nucleotide polymorphism studies) are hindered by those ambiguities and gaps.
“EDL933 has long phage-associated repeat regions >7 kb. Microbial genomes with these characteristics are the most complex to assemble, so we resorted to single-molecule sequencing using PacBio followed by polishing using Illumina short-reads to complete the EDL933 sequence.
“This produced a gapless genome assembly, with no ambiguous base calls, and an updated genome annotation.”
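To make "ambiguous base calls" and "gaps" concrete, here is a minimal Python sketch of how one might count them in a FASTA file. It is an illustration only, not the method the researchers used. The function names (parse_fasta, ambiguity_report) and the min_gap threshold of 10 are hypothetical choices, and the sketch assumes that any symbol other than A, C, G or T counts as ambiguous and that a gap appears as a run of N characters.

```python
import re
from typing import Dict, List, Tuple


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA-formatted text into {record_id: uppercase sequence}."""
    records: Dict[str, List[str]] = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the record id.
            header = line[1:].split()
            current = header[0] if header else "unnamed"
            records[current] = []
        elif current is not None:
            records[current].append(line.upper())
    return {name: "".join(parts) for name, parts in records.items()}


def ambiguity_report(seq: str, min_gap: int = 10) -> Tuple[int, List[Tuple[int, int]]]:
    """Return (number of non-ACGT symbols, [(start, length)] of N runs >= min_gap).

    min_gap is an arbitrary illustrative threshold (assumption), not a standard.
    """
    ambiguous = sum(1 for base in seq if base not in "ACGT")
    gaps = [(m.start(), len(m.group()))
            for m in re.finditer(r"N{%d,}" % min_gap, seq)]
    return ambiguous, gaps


if __name__ == "__main__":
    # Toy record: two isolated N calls, two other ambiguity codes, one 12-base gap.
    toy = ">toy demo\nACGTNNRY\nACGT" + "N" * 12 + "ACGT\n"
    for name, seq in parse_fasta(toy).items():
        count, gaps = ambiguity_report(seq)
        print(name, count, gaps)
```

Run on a gapless, unambiguous assembly such as the one described here, a check of this kind would be expected to report zero ambiguous symbols and no gap runs.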
No place to hide
Aziz, a professor at Cairo University in Egypt, led the research as a visiting scientist working in Bernhard Palsson’s Systems Biology Research Group at UC San Diego Jacobs School of Engineering.
“New sequencing and assembly methods are enabling a full expose of pesky pathogens; there is no place to hide genetic characteristics anymore,” said Palsson, the Galletti Professor of Bioengineering at UC San Diego.
“The full genetic delineation of multiple pathogenic strains is likely to not only improve our understanding of their characteristics, but to find and exploit their vulnerabilities.”
In a separate project, the US Food and Drug Administration (FDA), University of California, Davis, Agilent Technologies and the Centers for Disease Control and Prevention (CDC) are working on the 100k Foodborne Pathogen Genome Project.
The five-year effort will create the largest public database of 100,000 foodborne pathogen genomes to speed identification of bacteria responsible for outbreaks and reduce public health response time from weeks to days.
Source: Genome Announcements
Online ahead of print, DOI: 10.1128/genomeA.00821-14
“A Gapless, Unambiguous Genome Sequence of the Enterohemorrhagic Escherichia coli O157:H7 Strain EDL933”
Authors: Haythem Latif, Howard J. Li, Pep Charusanti, Bernhard Ø. Palsson, Ramy K. Aziz