Figure 2

Volume 7, Number 3—June 2001

Spoligotype Database of Mycobacterium tuberculosis: Biogeographic Distribution of Shared Types and Epidemiologic and Phylogenetic Perspectives

Christophe Sola*, Ingrid Filliol*, Maria Cristina Gutierrez†, Igor Mokrousov*, Véronique Vincent†, and Nalin Rastogi*

Author affiliations: *Institut Pasteur de Guadeloupe, Pointe à Pitre, Guadeloupe; †Centre National de Référence des Mycobactéries, Institut Pasteur, Paris, France


Phylogenetic tree of shared types of Mycobacterium tuberculosis constructed by pairwise comparison of patterns using the "1-Jaccard" index and the neighbor-joining algorithm. Approximately 15 branches may be visualized at an arbitrary distance of 0.2. The positions of some reference strains (M. tuberculosis H37Rv, M. bovis BCG) and of well-studied spoligotyping families of isolates (Beijing, Haarlem, and the M. africanum group) are also indicated.


¹For this purpose, the independent sample sizes for Europe and the USA were taken as n₁ and n₂, and the numbers of individuals within a given shared type "x" as k₁ and k₂; the representativeness of shared type "x" in the two samples was then p₁ = k₁/n₁ and p₂ = k₂/n₂, respectively. To assess whether the divergence observed between p₁ and p₂ was due to sampling bias or to the existence of two distinct populations, the percentage of individuals (p₀) harboring shared type "x" in the population studied was estimated by the equation p₀ = (k₁ + k₂)/(n₁ + n₂) = (n₁p₁ + n₂p₂)/(n₁ + n₂), with q₀ = 1 − p₀. The percentages of shared type "x" in the samples of sizes n₁ and n₂ follow normal distributions with mean p₀ and standard deviations σ_p1 = √(p₀q₀/n₁) and σ_p2 = √(p₀q₀/n₂), respectively, and the difference d = p₁ − p₂ follows a normal distribution of mean p₀ − p₀ = 0 and of variance σ_d² = σ_p1² + σ_p2² = p₀q₀/n₁ + p₀q₀/n₂, or σ_d² = p₀q₀(1/n₁ + 1/n₂). Because the two samples were independent, the two variances were additive; the standard deviation σ_d = √[p₀q₀(1/n₁ + 1/n₂)] was calculated, and the homogeneity of the samples was assessed using the quotient d/σ_d = (p₁ − p₂)/√[p₀q₀(1/n₁ + 1/n₂)]. If |d/σ_d| < 2, the two samples were considered to belong to the same population (95% confidence level), and the variation observed in the distribution of isolates for a given shared type could be due to sampling bias. Conversely, if |d/σ_d| > 2, the differences observed in the distribution of isolates for a given shared type were statistically significant and not due to potential sampling bias.
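The test described in this footnote can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the function names (`pooled_quotient`, `same_population`) and the counts used in the usage note are hypothetical, and q₀ is taken to be 1 − p₀, as the variance expression p₀q₀/n implies.

```python
import math


def pooled_quotient(k1, n1, k2, n2):
    """Return the quotient d / sigma_d for one shared type.

    k1, k2: isolates carrying the shared type in samples 1 and 2.
    n1, n2: total isolates in samples 1 and 2.
    (Function and argument names are illustrative, not from the article.)
    """
    p1 = k1 / n1                      # proportion in sample 1
    p2 = k2 / n2                      # proportion in sample 2
    p0 = (k1 + k2) / (n1 + n2)        # pooled proportion
    q0 = 1.0 - p0                     # assumed complement of p0
    # Variances add because the two samples are independent.
    sigma_d = math.sqrt(p0 * q0 * (1.0 / n1 + 1.0 / n2))
    if sigma_d == 0.0:
        raise ValueError("shared type absent from, or fixed in, both samples")
    return (p1 - p2) / sigma_d


def same_population(k1, n1, k2, n2, threshold=2.0):
    """True when |d / sigma_d| is below the threshold of 2 used in the footnote."""
    return abs(pooled_quotient(k1, n1, k2, n2)) < threshold
```

With made-up counts, `same_population(30, 1000, 35, 1000)` returns True (the difference is compatible with sampling bias), whereas `same_population(30, 1000, 60, 1000)` returns False (the difference would be treated as significant).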

Page created: April 26, 2012

Page updated: April 26, 2012

Page reviewed: April 26, 2012

The conclusions, findings, and opinions expressed by authors contributing to this journal do not necessarily reflect the official position of the U.S. Department of Health and Human Services, the Public Health Service, the Centers for Disease Control and Prevention, or the authors' affiliated institutions. Use of trade names is for identification only and does not imply endorsement by any of the groups named above.

Synopsis
