Gene Interval Updater is a tool for converting U00096.2 gene (and feature) genomic addresses to U00096.3 coordinates. The MG1655(Seq) [ATCC 700926] sequence represented in GenBank U00096.2 [4,639,675 bp] has several sequence errors, including two missed IS element insertions, that were reported in 2012 by Freddolino et al. [pmid: 22081388]. We have confirmed these DNA sequence errors and corrected them, creating U00096.3 [4,641,652 bp]. MG1655(Seq) has three previously unknown gene mutations, in crl, gatC and glpR. Two sequencing errors in ylbE were also corrected, including a frameshift error; correcting it restored this ORF, which was previously thought to be a pseudogene.
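The idea behind this kind of coordinate conversion can be sketched as a cumulative offset lookup. The sketch below is an illustration only and is not the Gene Interval Updater's actual code: the function names are hypothetical, and the edit positions and lengths in EXAMPLE_EDITS are invented placeholders, not the real differences between U00096.2 and U00096.3.

```python
from bisect import bisect_right

# HYPOTHETICAL edits, NOT the real U00096.2 -> U00096.3 differences.
# Each entry is (old_position, net_length_change): the change is assumed to
# occur immediately after old_position, so only coordinates greater than
# old_position are shifted.
EXAMPLE_EDITS = [(1000, 5), (5000, -2), (9000, 10)]


def build_offsets(edits):
    """Return sorted edit positions and the running total shift after each."""
    positions, cumulative = [], []
    total = 0
    for position, delta in sorted(edits):
        total += delta
        positions.append(position)
        cumulative.append(total)
    return positions, cumulative


def convert(coord, positions, cumulative):
    """Map a 1-based coordinate in the old sequence to the new sequence."""
    # Count the edits located strictly before this coordinate.
    count = bisect_right(positions, coord - 1)
    return coord + (cumulative[count - 1] if count else 0)


positions, cumulative = build_offsets(EXAMPLE_EDITS)
converted = [convert(c, positions, cumulative) for c in (1000, 1001, 5001, 9500)]
```

This simplification does not handle coordinates that fall inside a deleted stretch, which a real converter would have to flag or resolve.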

Download a customized table from the current EcoGene database. The table is a tab-delimited ASCII text file. You can specify which fields to download, and the fields will be separated by tabs. Please be aware that the database contents are altered daily and that the contents of this download can change at any time.
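A tab-delimited table of this kind can be read with standard tools. The following is a minimal sketch using Python's csv module; the column names (eg_id, gene, length) and the sample rows are hypothetical placeholders, since the actual field names depend on the fields selected for download.

```python
import csv
import io

# HYPOTHETICAL sample content; real downloads will have other columns and rows.
SAMPLE_TABLE = "eg_id\tgene\tlength\nEG00001\tgeneA\t900\nEG00002\tgeneB\t1200\n"


def read_table(text, wanted_fields):
    """Parse tab-delimited text and keep only the requested columns."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    available = reader.fieldnames or []
    missing = [field for field in wanted_fields if field not in available]
    if missing:
        raise KeyError(f"fields not present in table: {missing}")
    return [{field: row[field] for field in wanted_fields} for row in reader]


rows = read_table(SAMPLE_TABLE, ["eg_id", "length"])
```

Because the database changes daily, a script like this should check for the expected columns rather than rely on fixed column positions.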

A graphic presentation of Boolean query comparisons of two or three GeneSets can be displayed in an interactive Venn diagram. Boolean queries can be executed using EcoGene GeneSets or user-specified GeneSets. EcoGene GeneSets are collections of genes clustered in EcoTopics or EcoArray. User-provided GeneSets are lists of EG accession numbers uploaded at a "GeneSets Venn Diagram and Boolean Query" page, accessible using the Download tab or Download in the Navigation menu. EcoSearch can be used to obtain a list of user-specified EG numbers from a query or from an uploaded ID file containing gene names or any other unique gene identifiers. Gene names that are synonyms are mapped to primary genes.
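The Boolean comparisons behind a Venn diagram correspond to ordinary set operations. The sketch below shows one way to compute the regions for two or three GeneSets and to map synonyms to primary names first. It is an illustration only: the function names, the identifiers (EG00001 and so on) and the synonym table are hypothetical placeholders, not EcoGene data or code.

```python
def normalize(names, synonyms):
    """Replace each synonym with its primary name; other names pass through."""
    return {synonyms.get(name, name) for name in names}


def venn_regions(named_sets):
    """Group items by the exact combination of sets that contain them.

    Keys are sorted tuples of set labels, e.g. ("A",) for items found only
    in A and ("A", "B") for items shared by A and B but absent elsewhere.
    """
    regions = {}
    for item in set().union(*named_sets.values()):
        key = tuple(sorted(label for label, members in named_sets.items()
                           if item in members))
        regions.setdefault(key, set()).add(item)
    return regions


# HYPOTHETICAL GeneSets made of placeholder accession numbers.
gene_sets = {
    "A": {"EG00001", "EG00002", "EG00003"},
    "B": {"EG00002", "EG00003", "EG00004"},
    "C": {"EG00003", "EG00005"},
}
regions = venn_regions(gene_sets)
```

Each key of the result corresponds to one area of the diagram, so a query such as "A AND B NOT C" is simply the region keyed by ("A", "B").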

The Cross Reference Mapping and Download page was created to give users access to many additional accession numbers and other gene identifiers, such as gene names and synonyms. The E. coli K-12 gene accession numbers from over 30 other databases were collected to enable the construction of hyperlinks from EcoGene GenePages to the gene pages at other websites, and to make it easy to update tables that lack EG or ECK ids with the most recent gene names and functions.

PrimerPairs is an embedded design tool for obtaining genome-wide PCR primer sequences for generating deletion-insertions or making an ordered clone library based on the current EcoGene gene interval annotations. The user can set the primer length, the offset length in basepairs inside or outside of genes, add-on sequences for adding restriction sites, and add-on sequences for amplifying kan or cat antibiotic resistance cassettes. PrimerPairs can detect and correct primer sequences that delete part of an adjacent overlapping gene. The offending primers are repositioned automatically to avoid the double deletion problem found in the Keio mutant collection. PrimerPairs has previously been described in detail (Zhou and Rudd 2011).
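How primer length, offset and add-on sequences interact can be illustrated with a small sketch. This is not the PrimerPairs algorithm: the function names, the sign convention for the offset (positive moves the primers inside the gene, negative moves them outside) and the toy sequence are assumptions made for illustration, and the repositioning of primers around overlapping genes is not attempted.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of an uppercase DNA string."""
    return sequence.translate(_COMPLEMENT)[::-1]


def primer_pair(genome, gene_start, gene_end, primer_len=20, offset=0,
                fwd_addon="", rev_addon=""):
    """Return (forward, reverse) primers for a gene given in 1-based,
    inclusive forward-strand coordinates.

    ASSUMED convention: offset > 0 shifts both primers into the gene,
    offset < 0 shifts them outward into the flanking sequence.
    """
    left = gene_start + offset    # first base covered by the forward primer
    right = gene_end - offset     # last base covered by the reverse primer
    if left < 1 or right > len(genome) or right - left + 1 < primer_len:
        raise ValueError("primers do not fit in the requested interval")
    forward = genome[left - 1:left - 1 + primer_len]
    reverse = reverse_complement(genome[right - primer_len:right])
    # Add-on sequences (e.g. a restriction site) are attached at the 5' end.
    return fwd_addon + forward, rev_addon + reverse


# HYPOTHETICAL toy sequence with a "gene" at positions 11..50.
toy_genome = "A" * 10 + "ACGTACGTAC" + "G" * 20 + "TTGCCTGAAA" + "C" * 10
forward, reverse = primer_pair(toy_genome, 11, 50, primer_len=10,
                               fwd_addon="GAATTC")
```

A fuller implementation would also compare each primer interval with neighboring gene intervals and shift any primer that removes part of an adjacent gene.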

Download any E. coli K-12 genomic DNA sub-sequence. Instead of downloading the complete sequence, users can request a sub-sequence by specifying its start and end positions (genomic addresses) on the complete sequence. Users can also request the reverse complement of the sub-sequence. The current MG1655 genome sequence in GenBank (U00096.3) is the default, but the two previous versions (U00096.2, U00096.1) are also available for comparison purposes.
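Extracting a sub-sequence by genomic address, with an optional reverse complement, can be sketched as follows. The function name is hypothetical, and the sketch assumes 1-based, inclusive start and end positions on a sequence already held in memory as a string; the example sequence is a toy string, not genome data.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def subsequence(genome, start, end, reverse_complement=False):
    """Return genome[start..end] using 1-based, inclusive coordinates."""
    if not 1 <= start <= end <= len(genome):
        raise ValueError("start and end must satisfy 1 <= start <= end <= length")
    piece = genome[start - 1:end]
    if reverse_complement:
        # Complement each base, then reverse the order.
        piece = piece.translate(_COMPLEMENT)[::-1]
    return piece


toy_sequence = "AACCGGTT"
forward_piece = subsequence(toy_sequence, 2, 5)
reverse_piece = subsequence(toy_sequence, 2, 5, reverse_complement=True)
```

The sketch treats the sequence as linear, so a request that spans the origin of the circular chromosome would need to be split into two pieces.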

Download from the current database the intergenic region information (in a tab-delimited ASCII table) or intergenic DNA sequences (in a FASTA library file). The intergenic sequence intervals are named and oriented according to the flanking genes. One option removes the repeat family DNA sequences, such as CRISPR and REP, from the intergenic regions, so that the intervals are named after the flanking genes or repeats. Another option adds in the overlapping regions of adjacent genes, listing their shared DNA sequences with a negative length value. The intergenic regions contain many conserved regulatory sequences such as promoters, terminators, and transcription factor binding sites.
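The relationship between gene intervals, intergenic regions and negative-length overlaps can be made concrete with a short sketch. It is an illustration under simplifying assumptions, not the EcoGene procedure: the function name and gene names are hypothetical, coordinates are 1-based and inclusive, and strand orientation, repeat removal, nested genes and the circular chromosome are all ignored.

```python
def intergenic_regions(genes, include_overlaps=False):
    """Return (name, start, end, length) for the gaps between adjacent genes.

    genes is a list of (name, start, end) tuples. When include_overlaps is
    True, overlapping neighbors are reported with the shared interval and a
    negative length; directly abutting genes (length 0) are never reported.
    """
    ordered = sorted(genes, key=lambda gene: gene[1])
    regions = []
    for (left_name, _, left_end), (right_name, right_start, _) in zip(
            ordered, ordered[1:]):
        length = right_start - left_end - 1
        name = f"{left_name}-{right_name}"
        if length > 0:
            regions.append((name, left_end + 1, right_start - 1, length))
        elif length < 0 and include_overlaps:
            regions.append((name, right_start, left_end, length))
    return regions


# HYPOTHETICAL gene intervals: geneB and geneC overlap by 10 bp.
example_genes = [("geneA", 1, 100), ("geneB", 151, 300), ("geneC", 291, 400)]
gaps_only = intergenic_regions(example_genes)
with_overlaps = intergenic_regions(example_genes, include_overlaps=True)
```

Here the gap between geneA and geneB spans positions 101 to 150, while the overlap of geneB and geneC spans positions 291 to 300 and is reported with length -10.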

Intergenic Repeat Families

Complete: A complete listing of the Verified Set for downloading.

Verified Set

Daily updated EcoProt and EcoGene

  • EcoProt (no pseudogenes) download
  • EcoProt pseudogenes (including incomplete and hypothetically reconstructed pseudogene translations) download
  • EcoGene (protein-coding, no pseudogenes) download
  • EcoGene (protein-coding, including incomplete and hypothetically reconstructed pseudogenes) download
  • EcoGene (RNA genes only) download
Pseudogene

Daily Updated PDF Map of the E. coli chromosome download

