Towards structural genomics of prophages
Sankaran Krishnaswamy

K V Srividhya#, M R Sankar Narayanan#, Tarun Kumar Bhatt#, Geeta V. Rao, Dinesh Kumar#, GP Singh#, L Raghavendaran#, Krishnamohan Katta#, Preeti Mehta#, Jaime Prilusky%,$, Tamar Unger§,$, Yoav Peleg§,$, Shira Albeck§,$, Orly Dym§,$, Harry Greenblatt$, Joel Sussman§,$ and S Krishnaswamy# *

#School of Biotechnology, Madurai Kamaraj University, Madurai, India
%Biological Services, §Israel Structural Proteomics Center, $Dept of Structural Biology, Weizmann Institute of Science, Rehovot, Israel
*Presentation author: S. Krishnaswamy

Integrated virus genomes, termed prophages, can constitute 10-20% of a bacterial genome. A prophage can be induced under conditions such as UV irradiation to develop into full-fledged virions, leading to cell lysis. In some cases prophage elements are modified or partly lost in the course of evolution, giving rise to cryptic prophages and phage remnants. The large amount of bacterial genome sequence information thus provides a repertoire of prophage-encoded proteins. Prophages contribute to the evolution of bacterial diversity through horizontal transfer. Sequence comparison near the end of the presumptive tail-encoding region of prophages and cryptic prophages shows a mosaic structure indicative of multiple genetic exchanges. Analysis of the nucleic acid sequences of bacteriophages, prophages, cryptic prophages and phage remnants suggests that their evolution is modular, proceeding by acquisition of entire genes or groups of genes. Phage elements, such as prophages, cryptic prophages and phage remnants, contribute substantially to the horizontal gene transfer that has been clearly established from whole-genome sequence analysis. Toxins encoded by prophages are one way in which prophages can contribute to virulence in pathogenic strains of bacteria. Phage elements encode a variety of virulence-related proteins and also show similarity to some pathogenicity islands. Prophages, cryptic prophages and phage remnants can thus provide a hidden structural and functional repertoire of proteins that supports the survival mechanisms and pathogenicity of the organism.
Prophage regions are generally identified either by looking for attP sites where integration could have taken place or by examining regions in the vicinity of genes encoding integrase proteins. Based on our analysis of the e14 cryptic prophage in E. coli, we have developed two methods for detecting prophage regions in genomes, one based on protein similarity and the other on relative dinucleotide abundance. Using these methods we have identified possible new prophage elements.
The prophages reported in the literature have been used to initiate a prophage database (http://203.90.127.174:8082/prophagedb), which makes the data set available for browsing and searching. Genome-level data summarise prophage location, genome size, GC content, taxonomy and functionality of the prophages. Protein-level data give possible functional and domain annotation, whether the protein is likely to be disordered, and other details that are useful as selection criteria for structural genomics initiatives. For example, the forty-five bacterial genomes covered so far in the database contain 226 prophage entries with 6628 ORFs. The database includes thirty-five pathogenic bacteria, which together carry 168 prophages. Most of the proteins from these cryptic phages have less than 30% sequence identity to proteins of known fold. The database holds detailed primary information about prophages and gives access to comprehensive descriptions of the prophages and their encoded proteins. It is a step towards effective and systematic support for prophage data management and can serve as a platform for prophage detection and target selection for structural genomics.
We have selected a set of targets from the E. coli prophage proteins for structural genomics initiatives. Clones for these have been obtained with His and GFP tags and screened for expression and solubilization. Some of these targets are promising and have been taken forward for purification and crystallisation. Details of the identification, database construction and screening of the prophages will be presented.
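
As an illustration of the relative dinucleotide abundance idea mentioned above, the following is a minimal sketch and not the authors' implementation, whose details the abstract does not give. It assumes the commonly used definition rho(XY) = f(XY) / (f(X) f(Y)), computed on a single strand, and compares each sliding window with the whole genome by the mean absolute difference over the 16 dinucleotides. The function names, window size and step are hypothetical choices made for the example.

```python
from itertools import product

BASES = "ACGT"
DINUCS = ["".join(p) for p in product(BASES, repeat=2)]


def relative_abundance(seq):
    """Return rho[XY] = f(XY) / (f(X) * f(Y)) for the 16 dinucleotides.

    Assumption: single-strand counts; characters other than A, C, G, T
    are ignored.
    """
    seq = seq.upper()
    mono = {b: 0 for b in BASES}
    di = {d: 0 for d in DINUCS}
    for base in seq:
        if base in mono:
            mono[base] += 1
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in di:
            di[pair] += 1
    n_mono = sum(mono.values())
    n_di = sum(di.values())
    rho = {}
    for d in DINUCS:
        observed = di[d] / n_di if n_di else 0.0
        expected = (mono[d[0]] / n_mono) * (mono[d[1]] / n_mono) if n_mono else 0.0
        rho[d] = observed / expected if expected > 0 else 0.0
    return rho


def delta(rho_a, rho_b):
    """Mean absolute difference between two relative-abundance profiles."""
    return sum(abs(rho_a[d] - rho_b[d]) for d in DINUCS) / len(DINUCS)


def scan_windows(genome, window=5000, step=1000):
    """Return (start, delta) for each window compared with the whole genome.

    Hypothetical defaults: 5 kb windows moved in 1 kb steps.
    """
    genome_rho = relative_abundance(genome)
    results = []
    for start in range(0, max(len(genome) - window, 0) + 1, step):
        window_rho = relative_abundance(genome[start:start + window])
        results.append((start, delta(window_rho, genome_rho)))
    return results
```

Windows whose profile departs most from the genome-wide profile would be candidates for foreign DNA such as prophages. In practice such a score would need a threshold calibrated on known prophages and would be combined with other evidence, for example the protein similarity method.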