Genomic differences among strains of Corynebacterium cystitidis isolated from uterus of camels

Introduction: Members of the Corynebacterium cystitidis species are usually isolated from kidney and urine of cow having pyelonephritis. Nevertheless, we have isolated Corynebacterium cystitidis for the first time from uterus of camels, extending the type of mammalian host for this species. Furthermore, it remains unknown whether there are significant genetic variations between strains isolated from different host species and anatomic sites. In this perspective, we investigated the genomic diversity of Corynebacterium cystitidis species, whose pan genome remain unexplored to date. Methodology: Thus, we sequenced and compared the genomes of five Corynebacterium cystitidis of camel origin and a public genome of cow associated Corynebacterium cystitidis. Results: Results revealed open pan genome of 4,038 gene clusters and horizontal gene transfer played a role in the extensive genetic diversity. Further, we found an obvious distinction between cow and camel associated C. cystitidis via phylogenomic analysis and by average nucleotide identity value of 95% between the two distant lineages and > 99% within camel associated C. cystitidis strains. Moreover, our data supports the hypothesis that the gene repertoire of cow associated Corynebacterium cystitidis developed so as to become more adaptable to the urine milieu. These genetic potentials are specifically evident for genes required for benzoate breakdown, iron transport, citrate and alanine utilization. Conclusions: Our findings confirm the differentiation of strains into camel lineage and cow lineage. These different niches, comprising the uterus of camel and urinary tract of cow probably played a role in shaping the gene repertoire of strains.


Introduction
The first described Corynebacterium cystitidis infection was reported in 1978 by Yanagawa and Honda, as they found a Gram-positive bacterial rod causing pyelonephritis in cow [1]. This bacterium remained unreported until 2005, when Rosenbaum et al., isolated C. cystitidis from bovine kidney with pyelonephritis during a survey in slaughterhouse [2]. Recently, Smith et al., 2020 reported the isolation of C. cystitidis from 4 cases of pyelonephritis in cow [3]. Currently, studies on C. cystitidis are rare with one study showing bacterium ability to survive in soil for two months [4] and another study describing adhesion to urinary epithelial cells through pili structures existing on the bacteria cell surface [5]. Yet, there is no information about uterine infection caused by C. cystitidis in animals nor report of isolation from others site in animals than cow.
Currently, only a single closed genome is available for the type strain NCTC 11863 from an isolate cultured from a cow kidney. The genome is circular with a size of 3.01 Mbp that harbor 2,838 proteins. Nevertheless, it remains unknown whether there are significant genetic variations between strains isolated from different host species and anatomic sites. Furthermore, genetic discrimination of strains based on whole genome sequence has become a valuable tool to scrutiny on the population structure of pathogens [6,7]. Yet, no information about the molecular characteristics of C. cystitidis strains.
Uterine contamination by bacteria which can progress to infection such as metritis and endometritis occur mostly after delivery. Natural postpartum events lead to presence of blood inside uterus, therefore the postpartum milieu of the uterine lumen is nutrients rich for growth of bacteria [8]. In contrast, urinary tract (UT) contain urine that is known as a moderately nutrientlimited growth environment. Metabolome profile of cow urine reveals several components such as Benzoate, Citrate, Choline and Alanine [9]. Thus, the bacteria need to have necessary metabolic pathways to utilize nutrients in the urine as well as efficient iron acquisition tools to cope with limited iron concentration in the urine, as iron is essential for bacterial survival and pathogenicity. The UT of cow and uterus of camel are different niche, therefore we expect to impact gene content of C. cystitidis in way that adapt it to the specific niche. Therefore, the aim of this study was to investigate the phylogenetic relationships of the C. cystitidis strains collected from the uterus of camels and the strain NCTC 11863 isolated from the kidney of cow. Also, to identify the genomic makeup of strains belonging to either of the two host in order to test the hypothesis of the presence of strain-specific repertoire of genes associated with adaptation to urine milieu. To address these questions, we sequence, describe and compare the genomes of five C. cystitidis isolates recovered from the uterus of camels showing uterine infection from different location and time point. In addition, we compare the camel associated C. cystitidis strains to already sequenced C. cystitidis strain isolated from cow kidney.

Samples collection and isolates
All C. cystitidis clinical isolates were collected between November 2017 and July-2018 from uterus of five she-camels during routine examination at Veterinary Teaching Hospital with a history of conception failure despite several times mating with fertile camel. Examination of reproductive tract included transrectal palpation and transrectal ultrasound of uterus and ovaries using a 7.5 MHz. linear transducer probe (Aloka, Tokyo, Japan) beside inspection of the vagina and cervix. For uterine swab collection, the perineum area of the camel was cleaned and disinfected with acriflavine 0.1%. Then a doubleguarded sterile swab (Kruuse, Langeskov, Denmark) was then passed through the vagina and cervix into the lumen of the uterus, the cotton swab was released from the double guard tube and rub with the endometrium before being retracted into the guard. Swabs were immediately cultured aerobically at 37 °C in 5% sheep blood agar media.

Genome sequencing, assembly and annotation
DNA was extracted from 48 hours cultured bacteria in blood agar plates using the Wizard genomic DNA purification kit (Promega, Madison, USA). DNA libraries for whole genome sequencing were constructed using the TruSeq Nano DNA Library Preparation Kits (Illumina, San Diego, USA) according to the manufacturer's instructions. Then libraries concentration was measured, equimolar volume of each of the libraries was pooled and sequenced by Macrogen Inc (Macrogen, Seoul, South Korea) on an Illumina's NovaSeq sequencer to produce 2x151 bp paired-end reads. Following sequencing, sequence reads were checked by Fastqc 0.11.8 [10] and low-quality bases were trimmed of using fastp software version 0.20.0 [11]. De novo assembly of passed filter reads was performed with the ABySS 2.2.1 [12], SPAdes 3.11.1 [13] and CLC genomic workbench 20.0.4 (Qiagen, Copenhagen, Denmark). Scaffolds with a coverage value lower than 10% and length < 500 base pairs (bp) were discarded. The best assembly was selected based on number of scaffolds and N50. Contigs were ordered against the closed reference genome Corynebacterium cystitidis strain NCTC11863 using Mauve software [14]. Open reading frames (ORFs) were identified with Prodigal [15] with default parameters. Identified ORFs spanning a sequencing gap region (containing N) were removed. The functional categorization of ORFs was performed using online RPS-BLAST against the cluster of orthologous groups (COG) [16] Pfam databases [17] (E-value 1e −02 ). If no hit was found, a search against the non-redundant (NR) database was done using BLASTP with E-value of 1e −03 , coverage 0.7 and an identity percent of 30% on both query and hit. tRNAs were predicted with the Aragorn software [18] and ribosomal RNAs were predicted with RNAmmer [19].

Isolates identification
The identification of isolates to the species level was performed via calculation of average nucleotide identity (ANI) values via OrthoANIu (OrthoANI using USEARCH) [20] and by amplifications of 16S rRNA gene with published universal primers 27f and 1392R [2]. Briefly, two microliters of Corynebacterium DNA and 0.3 μL of each primer (10 pmol) (Macrogen, Seoul, South Korea) were added to the PCR mixture, consisting of one unit of Max Taq DNA Polymerase (Vivantis, Shah Alam, Malaysia), 5 μL of 10X ViBuffer (Vivantis, Shah Alam, Malaysia) and 2 μL of dNTPs (10 mM). The volume was completed to 25 μL by adding distilled water. DNA amplification was performed on a Tpersonal Thermocycler (BIOMETRA, Göttingen, Germany) with an initial 15 minutes cycle at 95 °C followed by 35 cycles consisting of 30 seconds at 94 °C, 1 minute at 55 °C and 1 minute extension at 72 °C, followed by a 10 minutes final extension step at 72 °C. To exclude DNA or amplicon contamination, molecular grade water negative control was used during the expermints. The PCR products were sequenced by Macrogen Inc. (Seoul, South Korea) using BigDye (Applied Biosystems, Foster, USA) on ABI3730XL DNA sequencer (Applied Biosystems). The obtained sequences were screened against NCBI non-redundant (nr) database to find the closest type species.

Detection of prophages, CRISPRs and virulence genes
The prediction of prophages in the genomes was performed using PHASTER (Phage Search Tool Enhanced Release) [22]. It uses a scoring technique to classify prophage regions as intact (> 90), questionable (70-90), or incomplete (< 70). The detection of genes that are predicted to be acquired by horizontal gene transfer (HGT) events was done using COLOMBO v4.0 with Sigi-HMM v1.0 and Sigi-CRF [23]. The identification of clustered regularly interspaced short palindromic repeats (CRISPRs) was performed using both CRISPRfinder and CRISPRCasFinder software [24,25], which are used to accurately detect CRISPR-Cas subtype, number of CRISPR and spacers. As for detection of potential virulence genes, we performed homology searching via BLAST tool against virulence factor database (VFDB) [26], which currently contains 36 virulence genes for seven Corynebacterium species including C. diphtheriae. Notably, homology searching against VFDB could only detect conserved virulence genes but may fail to detect virulence genes that are evolutionary distant from deposited virulence genes in VFDB.

Pan and core genome analysis
The C. cystitidis pan and core genome size was calculated in a way related to that previously reported approaches [27]. To calculate the number of orthologous genes within this species, we used the GET_HOMOLOGUES scripts [28] and Ortho Markov Cluster algorithm (OMCL) [29]. Orthologous genes were grouped using sequence identity of > 30%, query coverage of > 70% and E-value of > 1e -05 .
lastly, the pan-genome is defined as the entire repertoire of genes present in this species, these genes are grouped into genes conserved in all strains (i.e., core-genome) and set of auxiliary genes (i.e., present in some but not all genomes) and strain specific genes (i.e., present in only one genome). Both auxiliary and strain specific genes represent the variable genomic content. The curve fitting of the core and pan-genome was executed as previously described [27]. Finally, the core and pan-genome were functionally described using COG (Clusters of Orthologous Groups) and KEGG (Kyoto Encyclopedia of Genes and Genomes) annotations [30].

Phylogenomic analysis
For whole-genome phylogenetic analysis of C. cystitidis strains, we used two approaches. First, we used genome-wide core-single nucleotide polymorphism (GWC-SNP) based phylogenetic tree approach as follow; the sequence reads were mapped to the C. cystitidis reference genome (Strain NCTC11863) using MEM command form Burrows-Wheeler Aligner (BWA) version 0.7.17-r1188 with default parameters [31]. The generated Sequence Alignment Maps (sam files) were sorted and indexed to yield Binary Alignment Maps (bam files) using SAMtools version 1.10 (using htslib 1.10.2) [32]. Duplicates reads were checked using flagstat command in Samtools. SNPs were determined using mpileup and call commands from BCFtools version 1.10.2 (using htslib 1.10.2). The SNPs were filtered according to the following parameters: minimal read mapping quality of 60, quality of variant calls > 40, read depth > 10 and genotyping quality >30. The identified SNPs were annotated using snpEff 4.3t [33]. For each genome, pseudosequences of polymorphic positions were used to infer maximum likelihood tree using MEGA7 [34]. In the second approach we constructed parsimony pangenome tree for the 6 C. cystitidis genome derived from presence/absence data in a consensus OMCL pangenome matrix, by using GET_HOMOLOGUES software package.

Isolation and genome comparisons of C. cystitidis isolates
Uterine swab samples were taken from 5 shecamels having infertility and uterine infection symptoms (symptomatic n = 4; asymptomatic n = 1) between November 2017 and July 2018. Clinical examination showed accumulation of purulent content into the uterine lumen. Based on clinical signs and ultrasonography, samples were diagnosed as either clinical endometritis or pyometra (Table 1). Culturing uterine swab on blood agar media followed by16S rRNA sequencing revealed that sample2 and 5 were monoculture and named (C. cystitidis G2 and C. cystitidis G5). As for sample1, 3 and 4 they yielded mixed bacterial culture named (C. cystitidis G1, C. cystitidis G3 and C. cystitidis G4).  I  I  I  I  I  I  II  II  II  II  II  II  III  III  IV  IV  IV  IV  IV  IV  GenBank  accession  LT906473 JAERQP000000000 JAERQQ000000000 JAERQR000000000 JAERQS000000000 JAERQT000000000 Other identified bacterial species found with sample1,3 and 4 were shown in Table 1.
Hence, the genome of these five newly isolated C. cystitidis strains were sequenced using Illumina's NovaSeq platform. De novo assembly of the five strains generated a total genome size ranged from 2.89 to 3.01 Mbp with average coverage per scaffold > 100x. As summarized in Table 1, Gene number ranged from 2837 to 2993 and with coding density of 89%. For each of the 5 camel C. cystitidis genomes, contigs were ordered by mauve against closed genome of cow associated C. cystitidis NCTC11863 and concatenated by N characters yielding single chromosome. Overall, large number of collinear blocks of homologous regions in the genomes were observed (Figure 1). Notable, another publicly available C. cystitidis DSM 20524 unfinished genome was excluded since it is same type strain and therefore identical with C. cystitidis NCTC 11863. Entire genome sequence comparisons were accomplished in a pairwise manner, through calculation of ANI values for every genome pair. The ANI value calculated from pairwise comparisons within camel associated C. cystitidis strains were > 99%. In contrast, the ANI values calculated from comparisons of cow associated C. cystitidis strain with camel associated C. cystitidis strains were in the ranged between 95.81-95.93% ( Figure 2). Four identical copies of 16S rRNA   sequence were found in the genome of C. cystitidis strains of camel origin. Pairwise comparison of 16S rRNA sequence between camel associated C. cystitidis strains and cow associated C. cystitidis were 99.4% while within camel associated strains were 100%. (Figure 2). Moreover, the best second blast match for camel associated C. cystitidis strains was with Corynebacterium mycetoides DSM 20632 with 16S rRNA sequence similarity value of 95.1%.
Core genome and pan genome of the C. cystitidis species We described the core-and pan-genome among 6 C. cystitidis strains to explore their genetic diversity. A total of 4,038 pan-genome gene clusters were found. The core gene clusters (2,082, 62%) were abundant in the category Translation, ribosomal structure and biogenesis, (177, 11.1%) followed by the category Amino acid transport and metabolism (157, 9.8%) and the category Carbohydrate transport and metabolism (140, 8.8%) ( Table 2). The prevalence of strain-specific genes in C. cystitidis was various, ranging from 73 genes (strain G1) to 518 genes (strain NCTC 11863), suggesting high genomic plasticity. These unique genes encode diverse functions, of these functions which were much represented include the category Amino acid transport and metabolism and category Defense mechanisms ( Table 2).

Shared and strain-specific metabolic potentials for survival in urine environment
Genome functional analysis by KEGG revealed shared and specific metabolic pathway. One of the shared traits, urease pathway for decomposition of urea encoded by ureABCEFGD operon. Urea can act as source of organic nitrogen for bacterial growth. The difference between cow and camel associated C. cystitidis strains include, for aromatics compound degradation, cow associated C. cystitidis strain encode the operon benABCD for degradation of benzoate into catechol and cat genes degrading the resulting catechol ( Figure 3A). Thus, cow associated C. cystitidis strain seems to have a potential capacity to utilize benzoate, which is present in cow urine, as a carbon source. In contrast, only camel associated C. cystitidis strains encode genes for tetrahydrofolate biosynthesis. Major differences in molecule transport are also detected among the two strains: Unlike camel associated C. cystitidis strains, cow associated C. cystitidis strain carries putative citrate transport systems encoded by the tctCBA operon ( Figure 3B). Thus, cow associated C. cystitidis seems to have a potential capacity to utilize citrate, which is present in cow urine, as a carbon source. Furthermore, cow associated C. cystitidis harbors ribose transport systems encoded by the rbsACBK operon. The amino acid alanine is present in the urine of cow, unlike camel associated C. cystitidis strains, cow associated C. cystitidis encodes alanine dehydrogenase gene. Thus, cow associated C. cystitidis seems to have a potential capacity to synthesize alanine from pyruvate and utilization of alanine as a nitrogen source.

Potential virulence factors
To define the potential virulence capability of C. cystitidis strains, we inspected the genomes for the presence of virulence genes using blast search tool against virulence factor database (VFDB) [26]. Virulence factors of pathogenic Corynebacterium species in VFDB were classified into 4 main groups, that is, adherence, iron uptake, toxin and regulation. We have identified in all strains two LPXTG motif proteins. Surface proteins that carry an LPXTG motif are covalently link to the cell wall peptidoglycan by membrane-bound sortase enzymes. All C. cystitidis strains carried two LPXTG-anchored proteins arranged in a cluster including, SpaH/EbpB family LPXTGanchored major pilin, sortase A and LPXTG cell wall anchor domain-containing protein. These genes set are highly conserved between camel associated C. cystitidis strains (98-99%) but had low homology (<56%) to the cow associated C. cystitidis strain. This in turn are probably associated with niche adaptation. Another putative surface protein not mentioned in VFDB and present only in C. cystitidis of cow origin include Rib/alpha/Esp surface antigen repeat.
Cow and camel associated C. cystitidis strains have several shared sets of iron acquisition systems with some differences. We found seven clusters of iron related genes that are highly conserved between cow and camel associated C. cystitidis strains (> 98) but cow associated C. cystitidis strain possess extra 13 additional iron related genes involved in iron transport and acquisition. Other commonly detected virulence factors included metal-dependent transcriptional regulator gene (DtxR). Notably, genes encoding the diphtheria toxin of C. diphtheriae and phospholipase D of C. pseudotuberculosis were not found among all C. cystitidis strain genomes.
The two-component systems (TCSs) aid bacteria to sense, act and adapt to a broad variety of stimuli, such as nutrients state, changes in osmolarity and temperature. Furthermore, some TCSs can regulate gene clusters involved in biofilm formation and virulence in pathogenic bacteria [35]. TCS usually composed of sensor histidine kinase (HK), mostly a membrane linked protein and a cytoplasmic response regulator (RR) [35]. Genomic analysis revealed 4 out of 5 TCSs were shared by all strains that include 1) MtrB-MtrA system, involved in osmotic stress response in C. glutamicum [36], 2) SenX3-RegX3 system is required for virulence in Mycobacterium tuberculosis [37], 3) MprB-MprA required for persistent infections in Mycobacterium tuberculosis and the fourth TCS include sensor kinase-response regulator. Unlike cow associated C. cystitidis, the genome of strains C. cystitidis of camel origin carry one more TCS system called DesK-DesR. The DesK-DesR system is triggered at low temperature and preserves membrane lipid fluidity upon temperature changes in the bacteria Bacillus subtilis [38].

C. cystitidis defense strategies
In order to examine the capability of members of the C. cystitidis species to protect themselves against attack by foreign DNA, we examined the existence of CRISPR-Cas systems using CRISPRfinder and CRISPRCasFinder software. Among the 6 C. cystitidis genomes analyzed, we detected CRISPR-Cas system except for strain G5 (Table 1). Based on cas gene content and CRISPR length, we identified 3 type I systems and 2 type II systems, the types were further classified into subtypes I-U and II-C When inspecting the CRISPR sequences, the highest number of spacers was found in cow associated C. cystitidis and this strain possess two different CRISPR sequences.
Other systems that prevent gaining of foreign DNA are represented by restriction-modification (RM) systems [39]. Exploring the predicted C. cystitidis proteome for RM systems shown that type I, II, IV RM systems represent the prevalent gene cluster, being present in all strains, followed by type III RM systems identified on the genomes of C. cystitidis G4 and G5 (Table 1). Hence, it seems that the genomes of C. cystitidis G4 and G5 were armed with a wider genomic collection to prevent the attack by foreign DNA sequences.
Toxin-antitoxin (TA) systems are wide spread in bacteria and have been linked to abortive infection (bacteriophage immunity through altruistic suicide of infected cells) [40] and enhancing bacterial survival in environmental stresses such as heat shock, nutrient starvation and antibiotic. Screening C. cystitidis of cow origin genome revealed at least 8 type II TA loci, followed by C. cystitidis G1 encodes 6 type II TA loci and other strains encode 5 type II TA loci (Table 1). We observed that most of the C. cystitidis TA systems are located within core genome, suggesting that it may achieve a function common to all C. cystitidis strains. Notably, the dispensable genome of cow associated C. cystitidis strain carry two type II TA loci (FitB-FitA and Doc-Phd) that are missing from the camel strains. The TA system, FitB-FitA, is constituted of an antitoxin FitA and a toxin FitB and has been linked to reduce intracellular replication and intracellular trafficking of Neisseria gonorrhoeae leading to the formation of persistent populations [41]. As for Doc-Phd, it contributes to persister formation in Salmonella Typhimurium [42]. Thus, FitB-FitA and Doc-Phd TA system probably play roles in virulence of cow associated C. cystitidis. Another type II TA system, ParE-ParD are involved in plasmid stabilization but the deposited closed genome sequence in the GenBank lack any plasmids associated with C. cystitidis genome. Therefore, the function played by this locus in the genome of C. cystitidis of cow origin and C. cystitidis G1 is unknown. Other loci of TA system are presented in Table 1.

C. cystitidis mobilome
The mobilome is defined as the entire genes that may have been acquired by horizontal genes transfer (HGT). The gaining of foreign genes could transfer new traits, which are essential for pathogenicity and adaptation into various niches [43,44]. The obtained results revealed that 3.7% of the entire gene pools in the case of cow associated C. cystitidis were predicted to represent HGT gained genes and up to 2.7%, 2%, 3.3%, 3.2%, 2.7% of that of camel associated C. cystitidis G1, C. cystitidis G2, C. cystitidis G3, C. cystitidis G4 and C. cystitidis G5 respectively. HGT genes contributed to the auxiliary genes and strain specific genes as follow: we found a percent of 12.6, 7.3, 5.9, 8.3, 7.7 and 5.6 of auxiliary genes were HGT for strain NCTC 11863, G1, G2, G3, G4 and G5 respectively. In addition, 9.5, 15.1, 8, 14, 4.9 and 10.3 and 15.1 to 4.9 of strain specific genes were HGT for strain NCTC 11863, G1, G2, G3, G4 and G5 respectively. Moreover, a total of 11 potential donor taxa were identified including the genus Anaplasma, Brevibacillus, Desulfomonile, Grimontia, Hahella, Mycobacterium, Nitrosococcus, Serratia, Syntrophothermus, Talaromyces and Thiomicrospira. Out of the 11 potential donor taxa the genera Talaromyces and Mycobacterium were the main donor taxa. The COG functional classification of predicted mobilome of C. cystitidis revealed genes encoding diverse functions, of these functions which were much represented include the category Cell wall and membrane envelope biogenesis, Defense mechanisms and "Mobilome: prophages, transposons" ( Table 2). Closer inspection of cow associated C. cystitidis mobilome revealed genes involved in iron acquisition and Type II R-M system. Also, closer inspection of strain G1, G2 and G4 mobilome revealed genes involved in Type IV R-M system. As for the C. cystitidis G3 mobilome, genes involved in Type IV R-M system and CRISPR-Cas system were also found.
Finally, we used in silico analysis to detect candidate prophages throughout the genomes of C. cystitidis strains. This genomic screen revealed no intact or questionable prophages in any of the six C. cystitidis strains, at least one incomplete phage region was found in all strains. As for C. cystitidis of cow origin harbor one incomplete prophage region (7.8kb) which match best to protein sequence of PHAGE_Bacill_phiNIT1_NC_021856. As for C. cystitidis G1, four different incomplete prophage regions (9kb, 4.7kb, 6.2kb and 7.8 kb) were identified and having match best to protein sequence of PHAGE_Staphy_42E_NC_007052, PHAGE_Shigel_SfIV_NC_022749, PHAGE_Sinorh_PBC5_NC_00332 and PHAGE_Achrom_JWAlpha_NC_023556 respectively. Moreover, cystitidis G2, C. cystitidis G3 and C. cystitidis G4 harbor similar one incomplete prophage region (~43.4kb) which match best to PHAGE_Coryne_Poushou_NC_042139. In the case of C. cystitidis G5, two incomplete prophage regions (45.7kb and 37.9kb) were identified and having match best to protein sequence of PHAGE_Coryne_Poushou_NC_042139 and PHAGE_Gordon_Attis_NC_041883 respectively. Notably strain G5 harbor the largest incomplete prophage regions and it lacks CRISPR-Cas system while strain of cow origin has two CRISPR and harbor the smallest incomplete prophage regions. Thus, CRISPR-Cas system perhaps provide protection against phage infection.

Phylogenomic analyses
Phylogenetic relationship analysis concerning strains of a same species needed to be performed using core or whole genome sequences to reach a higher resolution and a more comprehensive analysis [45]. Therefore, we investigated the genetic diversity within the strains of C. cystitidis by constructing the phylogenomic tree based on GWC-SNP. For this purpose, we mapped read sequences of the C. cystitidis G1, G2, G3, G4 and G5 against cow associated C. cystitidis that yield a total of 95338 SNPs, of these, 83397 SNPs (87.48%) were located in the coding region and 11941 (12.52%) in intergenic region. Further analysis of SNPs reveals that 29.03% were missense mutation, 0.23% were nonsense and 70.74% were silent mutation. We generated high resolution maximum likelihood phylogeny based on 95338 SNPs. As displayed in Figure 4), all of the camel associated C. cystitidis strains clustered in a separate branch named lineage1 while the cow associated C. cystitidis formed a distant long branch named lineage 2, indicating a clear distinction between camel and cow associated C. cystitidis strains. lineage1 is further sub-divided into sub-lineage 1A and 1B (Figure 4). In addition to the aforesaid phylogenetic method, we also inferred parsimony pangenome tree for the 6 C. cystitidis genome derived from presence or absence of the 4,038 pan-genes. As shown in Figure 4, the camel associated C. cystitidis strains resolved into two branches named lineage 1 and 2, while cow associated C. cystitidis placed in a distant branch named lineage 3.

Discussion
Since their isolation, members of the C. cystitidis have been associated with the development of pyelonephritis in cow. To our knowledge this is the first report of isolating C. cystitidis from the uterus of animals. Whether C. cystitidis is a true causative agent of endometritis/pyometra or cause disease synergistically with another uterine microbes remain to be clarified.
ANI is authenticated method for comparing genomes, identification and estimating species genetic relationships [46]. The ANI value within camel associated C. cystitidis strains were > 99%, exhibiting high level of relatedness. In contrast, the ANI values calculated from comparisons of cow associated C. cystitidis strain with camel associated C. cystitidis strains were in the ranged between 95.81-95.93% reflecting genetic diversity between C. cystitidis strain isolated from cow and camels. This is similar with a figure of 95.1% for the ANI between two members of Serratia marcescens, strains DB11 and SM39 harbored by two unrelated hosts [47]. Thus, both the ANI value which exceeded the recommended 95% threshold for separating two species [46] and the high 16S rRNA value, confirmed that camel associated Corynebacterium strains are members of the species C. cystitidis.
The variable genomic content constituted (48.4%) of the pan-genome, revealing genetic diversity in C. cystitidis. The variable genomic content is responsible for species diversity and confer selective benefit to the bacteria, such as niche adaptation, virulence traits and antibiotic resistance. The auxiliary gene clusters were abundant in the category General function prediction only, Mobilome: prophages, transposons and Defense mechanisms. Furthermore, the strain-specific gene existing in only one genome are probably conferring special traits. Additionally, viewing the core genome and pan genome plot of analyzed genomes revealed that a curve has not reached a plateau suggesting that this pangenome is still open. Therefore, more than the six sequenced genomes should be sequenced to estimate the diversity of C. cystitidis gene repertoire ( Figure 5A and 5B). This is in contrast to the pan-genome of the intracellular endosymbiont Buchnera aphidicola has been closed by 6 genomes [48]. This is due the fact that obligate intracellular bacteria have a restricted niche and have low capacity to gain foreign genes. On the contrary, free-living and environment linked bacteria occupy numerous niches, facing many external stresses. Additionally, it exhibits a capacity to gain foreign genes by horizontal transfer [49,50]. To our knowledge, C. cystitidis inhibit at least two different niches (urinary tract of cow and camel uterus), thus, the pan-genome probably has capacity to acquire or lose gene to adapt to the different niche.
Cow associated C. cystitidis needs to cope with limited nutrient in the urine and antibacterial effect of urea [51]. Metabolome profile of cow urine reveals several components such as urea, benzoate, citrate and alanine [7]. Unlike camel associated C. cystitidis, cow associated C. cystitidis seems to have a potential capacity to utilize benzoate, citrate [52] and alanine which is present in cow urine, as a carbon source. Knowledge from other bacteria causing UTI such as in Enterococcus faecalis revealed upregulation of genes linked to iron transport during urine growth [53]. In Pseudomonas aeruginosa, the use of lactate, citrate, and amino acids as carbon sources has been shown during growth in synthetic urine [54]. Together, these genes support the notion that C. cystitidis of cow origin is probably more adapted to the nutritional environment in the urinary tract than C. cystitidis of camel origin.
As for the virulence factor, we have identified in all strains two LPXTG motif proteins. Several characterized LPXTG motif proteins have role in adhesion and consequently virulence [55]. Another putative surface protein not mentioned in VFDB and present only in C. cystitidis of cow origin include Rib/alpha/Esp surface antigen repeat. We hypothesize that C. cystitidis use the above-mentioned proteins for adhesion to host cells.
Iron uptake is crucial for bacterial growth inside host organisms because iron is sequestered by the host. Thus, genes associated with iron acquisition are known as virulence factors [26]. The finding of more iron related genes in cow associated C. cystitidis strain indicates that effective transport systems could convene a selective growth advantage to cow associated C. cystitidis strain by assisting it to efficiently acquire iron from the urine milieu. Other commonly detected virulence factors included metal-dependent transcriptional regulator gene (DtxR) that is known to controls the transcription of several genes involved in iron homeostasis and toxin gene [56]. The finding of different donor taxa raise question where the HGT events occur. Notably the uterus of camel is normally contaminated by several bacteria species after delivery, which provide opportunities for HGT to occur. Thus altogether, this information suggests that HGT events contributed particularly to iron transport and defense mechanism and generating genetic diversity within the strains. Of note, one of the donor taxa, Talaromyces is a fungus, this finding is not surprising as HGT from Bacteria and plants to the Fungus was reported recently (57). Notably, prophages exit as remnant in the genome of C. cystitidis strains a finding similar to what have been observed previously in C. bovis [58]. In contrast, pathogenic strains of C. diphtheriae and C. ulcerans harbor the gene encoding diphtheria toxin on a prophage [59,9].
Phylogenomic analysis of C. cystitidis strains resolved into two distant branches. Notably, the detailed topologies differed somewhat between GWC-SNP phylogeny and the pan-genome phylogeny. This discordance indicated that variable genomic content played important role in the evolution of C. cystitidis species, which was subsequently characterized by determinations of several horizontally transferred genes, mobile genetic elements. Regardless of the differences in tree deep branch topology, both trees shown a clear distinction between C. cystitidis strains isolated from camel and the strain of C. cystitidis isolated from cow. In addition, our genome phylogeny unveils no clonality of C. cystitidis strains, suggesting that C. cystitidis strains differences mirror the ecological source of such inspected strains.

Conclusions
In this study, isolation of five C. cystitidis strains from the uterus of camels showing signs of uterine disease for the first time, extend the type of mammalian host for this species. Further investigation is required to clarify the role of C. cystitidis in causing uterine disease. Analysis of five C. cystitidis sequenced genomes and the one publicly available C. cystitidis genome, unveiled that more genome sequencing of new C. cystitidis strains is required to close the current open pan genome. Nonetheless, phylogenomic trees, which were constructed using two different methods (GWC-SNP and gene presence/absence), showed a clear differentiation between the camel associated C. cystitidis strains and cow associated C. cystitidis branch. Likewise, ANI support this finding. Anyhow, camel associated C. cystitidis strains showed large diversity. Furthermore, horizontal gene transfer played a role in C. cystitidis strains diversity. In fact, genome analysis of each strain showed the existence of SNPs and strain specific genes, representing the genetic variations between these six strains. Interestingly, a portion of strain specific genes of cow associated C. cystitidis seems to be involved in benzoate breakdown, iron transport, citrate and alanine utilization, supporting the hypothesis that the gene repertoire of this strain developed to become more adaptable to the urine milieu. Hence, these different niches, comprising the uterus of camel and urinary tract of cow probably played a role in shaping the gene repertoire of strains. Anyhow, our findings confirm the differentiation of strains into camel lineage and cow lineage.