New deep-sequenced reference genome and genome browser available for NCTC 14078 Streptococcus pneumoniae
A new, detailed reference genome for the NCTC 7466 descendant strain NCTC 14078 / D39V Streptococcus pneumoniae, accompanied by its own novel genome browser and produced using a combination of state-of-the-art sequencing and bioinformatics technologies, has recently been published in Nucleic Acids Research [1].
Long-read genome sequencing (PacBio), supplemented by Sanger sequencing and existing Illumina sequences, was used to unambiguously determine the genome sequence of the NCTC 14078 / D39V strain. In addition to automated and extensive manual genome annotation, the strain’s transcriptome (the sum of mature mRNA molecules expressed from a cell’s DNA at any given time) expressed under four infection-relevant conditions was extracted, sequenced and mapped to the newly generated genome. This sharpened the annotation of several genomic features: transcriptional start sites were identified more accurately, 89 new protein-encoding genes were found, 45 previously misannotated start codons were corrected and 63 small RNA features, such as bacterial small RNAs (sRNAs), were annotated.
As well as depositing a high-quality annotated reference genome in GenBank, Slager et al. also released a new user-friendly and highly accessible genome browser, PneumoBrowse.
PneumoBrowse overlays the NCTC 14078 / D39V whole genome sequence with annotated features such as protein-coding genes, transcription start sites and terminators, putative operons and other predicted regulatory features in a visually intuitive way, with further information on each annotated feature available on the platform. This annotation will be periodically updated with additional information from future peer-reviewed studies to ensure that it continues to reflect current knowledge in the field.
Despite the availability of vaccines and antibiotic therapies, Streptococcus pneumoniae remains a major opportunistic human pathogen globally, causing a range of non-invasive and invasive infections in both the developed and developing world. The combination of the detailed D39V genome visualised by PneumoBrowse and the availability of the live culture from the National Collection of Type Cultures means that NCTC 14078 / D39V has great potential as a model organism in the study of pneumococcal biology and pathogenesis. Slager et al. highlight the value of this research in supporting the discovery of novel targets for vaccines and antibiotics.
The NCTC 14078 / D39V assembled and annotated whole genome sequence is available in GenBank under accession number CP027540.
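For readers who would rather retrieve the record programmatically than through the GenBank website, the following is a minimal sketch that assumes access via NCBI's public E-utilities efetch service. The function names, the output file name and the choice of the "gbwithparts" return type (a full GenBank flat file including sequence) are illustrative assumptions, not something taken from Slager et al. or from NCTC.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# NCBI E-utilities efetch endpoint (public service).
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

# Accession number quoted in the article for NCTC 14078 / D39V.
ACCESSION = "CP027540"


def build_efetch_url(accession: str, rettype: str = "gbwithparts") -> str:
    """Build an efetch URL for a nucleotide record.

    rettype "gbwithparts" requests the annotated GenBank flat file with
    sequence; "fasta" requests the sequence only.
    """
    params = {
        "db": "nuccore",
        "id": accession,
        "rettype": rettype,
        "retmode": "text",
    }
    return f"{EFETCH_URL}?{urlencode(params)}"


def download_record(accession: str, path: str, rettype: str = "gbwithparts") -> None:
    """Download the record to a local file (hypothetical helper; needs network access)."""
    with urlopen(build_efetch_url(accession, rettype)) as response, open(path, "wb") as out:
        out.write(response.read())


if __name__ == "__main__":
    # Print the URL only; call download_record(ACCESSION, "D39V.gb") to save the file.
    print(build_efetch_url(ACCESSION))
```

Calling download_record(ACCESSION, "D39V.gb") would save the annotated record locally; passing rettype="fasta" would fetch the sequence alone. Anyone scripting repeated requests should check NCBI's current usage guidelines first.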
The Veening Lab’s PneumoBrowse tool, including a user guide and list of updates, can be accessed here.
References
1. Slager J, Aprianto R, Veening JW. Deep genome annotation of the opportunistic human pathogen Streptococcus pneumoniae D39. Nucleic Acids Res. 2018 Aug 13. https://www.ncbi.nlm.nih.gov/pubmed/30107613
Written by Jake D. Turnbull
Follow Jake on Twitter @hotchpotchjake