0% (Table 3 and Figure 3). Of the 2,052 genes predicted, 2,001 were protein-coding genes and 51 were RNA genes; 29 pseudogenes were also identified. The majority of the protein-coding genes (64.1%) were assigned a putative function, while the remaining ones were annotated as hypothetical proteins. The distribution of genes into COGs functional categories is presented in Table 4.

Table 3 Genome Statistics

Figure 3 Graphical circular map of the chromosome. From outside to the center: Genes on forward strand (color by COG categories), Genes on reverse strand (color by COG categories), RNA genes (tRNAs green, rRNAs red, other RNAs black), GC content, GC skew.

Table 4 Number of genes associated with the general COG functional categories

Acknowledgements

We would like to gratefully acknowledge the help of Sabine Welnitz (DSMZ) for growing R. anatipestifer cultures. This work was performed under the auspices of the US Department of Energy Office of Science, Biological and Environmental Research Program, and by the University of California, Lawrence Berkeley National Laboratory under contract No. DE-AC02-05CH11231, Lawrence Livermore National Laboratory under Contract No. DE-AC52-07NA27344, and Los Alamos National Laboratory under contract No. DE-AC02-06NA25396, UT-Battelle and Oak Ridge National Laboratory under contract DE-AC05-00OR22725, as well as German Research Foundation (DFG) INST 599/1-2.
A representative genomic 16S rRNA sequence of strain 1651/6T was compared using NCBI BLAST under default settings (e.g., considering only the high-scoring segment pairs (HSPs) from the best 250 hits) with the most recent release of the Greengenes database, and the relative frequencies of taxa and keywords (reduced to their stem) were determined, weighted by BLAST scores. The most frequently occurring genera were Bacteroides (43.5%), Odoribacter (37.9%), Alistipes (15.2%) and Brumimicrobium (3.4%) (21 hits in total). Regarding the two hits to sequences from members of the species, the average identity within HSPs was 99.7%, whereas the average coverage by HSPs was 97.9%. Regarding the two hits to sequences from other members of the genus, the average identity within HSPs was 93.4%, whereas the average coverage by HSPs was 42.5%. The highest-scoring environmental sequence was EF401000 (‘human fecal clone SJTU D 04 48’), which showed an identity of 99.8% and an HSP coverage of 98.2%. The most frequently occurring keywords within the labels of environmental samples which yielded hits were ‘human’ (13.4%), ‘biopsi’ (7.4%), ‘mucos’ (7.1%), ‘fecal’ (6.1%) and ‘colon’ (5.3%) (229 hits in total). The most frequently occurring keyword within the labels of environmental samples which yielded hits of a higher score than the highest scoring species was ‘fecal/human’ (50.0%) (27 hits in total).
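
The score-weighted relative frequencies reported above can be illustrated with a minimal sketch. It assumes that each BLAST hit has already been reduced to a (label, score) pair, where the label is a genus name or a stemmed keyword, and that weighting means each label's share of the summed scores; the exact weighting scheme is not specified in the text. The function name weighted_frequencies and the example values are hypothetical and are not data from this study.

```python
from collections import defaultdict


def weighted_frequencies(hits):
    """Return each label's share of the summed BLAST scores, in percent.

    `hits` is an iterable of (label, score) pairs. This is an assumed
    weighting scheme for illustration, not the authors' exact procedure.
    """
    totals = defaultdict(float)
    for label, score in hits:
        totals[label] += score
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {label: 100.0 * s / grand_total for label, s in totals.items()}


# Hypothetical (genus, score) pairs; NOT values from the study.
example_hits = [
    ("Bacteroides", 900.0),
    ("Odoribacter", 800.0),
    ("Bacteroides", 100.0),
    ("Alistipes", 200.0),
]

frequencies = weighted_frequencies(example_hits)

# Print labels from the highest to the lowest weighted frequency.
for genus, pct in sorted(frequencies.items(), key=lambda kv: -kv[1]):
    print(f"{genus}: {pct:.1f}%")
```

Under this scheme a label supported by a few high-scoring hits can outrank a label with many low-scoring hits, which is why the percentages need not match raw hit counts.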