
Volume 8, Number 11—November 2002
Tuberculosis Genotyping

Tuberculosis Genotyping Network, United States

Genotyping Analyses of Tuberculosis Cases in U.S.- and Foreign-Born Massachusetts Residents

Sharon Sharnprapai*, Ann C. Miller*, Robert Suruki*, Edward Corkren*, Sue Etkind*, Jeffrey Driscoll†, Michael McGarry†, and Edward Nardell*‡
Author affiliations: *Massachusetts Department of Public Health, Boston, Massachusetts, USA; †New York State Department of Health, Wadsworth Center, Albany, New York, USA; ‡Harvard Medical School, Boston, Massachusetts, USA



We used molecular genotyping to further understand the epidemiology and transmission patterns of tuberculosis (TB) in Massachusetts. The study population included 983 TB patients whose cases were verified by the Massachusetts Department of Public Health between July 1, 1996, and December 31, 2000, and for whom genotyping results and information on country of origin were available. Two hundred seventy-two (28%) of the TB patients were in genetic clusters, and isolates from U.S.-born patients were twice as likely to cluster as those from foreign-born patients (odds ratio [OR] 2.29, 95% confidence interval [CI] 1.69, 3.12). Our results suggest that restriction fragment length polymorphism analysis has limited capacity to differentiate TB strains when the isolate contains six or fewer copies of IS6110, even with spoligotyping. Clusters of TB patients whose isolates had more than six copies of IS6110 were more likely to have epidemiologic connections than were clusters of TB patients whose isolates had few copies of IS6110 (OR 8.01, 95% CI 3.45, 18.93).

The incidence of tuberculosis (TB) in the United States is closely linked to the global TB epidemic (1). In 2000, 46% of all reported TB cases in the United States occurred among persons not born there (foreign-born), and 20 states reported that >50% of TB cases occurred among the foreign-born (2). In Massachusetts, 202 (71%) of 285 cases reported were among foreign-born persons (from 41 different countries). Being born outside the United States is the primary risk factor for being reported with TB in Massachusetts (3).

The distribution of places of birth among TB patients reported in Massachusetts has changed greatly over the past 3 decades, reflecting changes in populations immigrating to Massachusetts. As late as 1970, 80% of foreign immigrants in Massachusetts were from Europe or Canada; only 5% of the immigrants were from Asia, and less than 3% were from Central and South America combined and Africa (4). Since 1970, the proportion of immigrants to Massachusetts from Europe has declined, and the proportion of those from Asia, the Caribbean Islands, Africa, and South and Central America has risen. Immigrants from Asia increased sharply, from 3% to 16%. Between 1996 and 2000, the proportion of foreign-born TB patients reported in Massachusetts rose from 61% to 72%. This increase was seen primarily among Asians, Africans, and immigrants from Central and South America.

Understanding the factors that contribute to the incidence of TB is critical for TB elimination. Molecular fingerprinting data can be used to further an understanding of the epidemiology and transmission patterns of TB. In this article, we describe the epidemiology of TB patients in Massachusetts and results of using genotyping to evaluate the extent to which genetic clustering of Mycobacterium tuberculosis differs between foreign-born and U.S.-born TB patients.

Methods

In 1996, the Massachusetts Department of Public Health, Division of Tuberculosis Prevention and Control (TB Division) became part of the National Tuberculosis Genotyping and Surveillance Network of the Centers for Disease Control and Prevention (CDC). The TB Division attempted to locate and submit at least one isolate for every culture-confirmed TB case-patient reported from July 1, 1996, through December 31, 2000, to the Northeast Regional Genotyping Laboratory, New York State Department of Health, Wadsworth Center, Albany, New York. DNA genotyping by using IS6110 restriction fragment length polymorphism (RFLP) and the spoligotyping technique (spacer oligotyping) was performed by the Wadsworth Center. RFLP analysis was performed by using the standard method (5,6) with the molecular weight standards provided by CDC. Spoligotyping was performed with a commercially available kit, in accordance with the manufacturer’s instructions (Isogen Bioscience BV, Maarssen, the Netherlands).

Specimen Collection for DNA Fingerprinting Analysis

The following procedures were used to identify patients with positive Mycobacterium tuberculosis cultures and obtain isolates for RFLP analysis. In 1996, a survey of hospitals and private physicians was conducted to ascertain where specimens were being sent for mycobacterial culture. This survey allowed the TB Division to determine which laboratories inside and outside of the state were processing clinical specimens for Massachusetts residents. In addition, a letter was sent to directors of all laboratories in Massachusetts that are licensed under the Clinical Laboratory Improvement Act (CLIA) to perform mycobacteriology services and to other laboratories that were identified through the survey, asking for their cooperation with the TB genotyping network project. Most (71%) hospitals and physicians sent specimens to the Massachusetts State Laboratory Institute (MSLI) for culture identification, susceptibility testing, or both. The TB Division and the Mycobacteriology Laboratory, MSLI, share a joint database in which all bacteriology reports, including drug susceptibility information, are automatically linked to suspected and confirmed cases of TB. For M. tuberculosis specimens that were processed elsewhere, the epidemiologists on the TB genotyping network project identified laboratories by attending routine TB case and cohort reviews conducted monthly by the state TB nurses and the Boston Public Health Commission TB Program. Laboratories were then contacted, and arrangements were made for shipment of specimens to the MSLI and the Wadsworth Center.

Cluster Investigation

RFLP analysis by using IS6110 is a powerful tool for discerning one strain of M. tuberculosis from another when there are many copies of IS6110. However, for strains of M. tuberculosis with low copy numbers of IS6110, RFLP analysis has less discriminating power, and therefore a secondary typing method is used to help differentiate strains (7,8). For the TB genotyping network project, isolates were considered to be clonally related (i.e., were the same strain of TB) if they had identical IS6110 patterns containing seven or more bands or they had identical IS6110 patterns containing six or fewer bands with identical spoligotyping. A cluster was defined as containing two or more patients with clonally related TB strains.
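The cluster definition above can be read as a grouping rule, and the following Python sketch illustrates it under simplifying assumptions. The `Isolate` record, its field names, and the encoding of an IS6110 pattern as a tuple of band positions are hypothetical; exact tuple equality stands in for the pattern matching that a genotyping laboratory would perform, and this is not the network's actual software.

```python
from collections import defaultdict
from typing import NamedTuple, Tuple

class Isolate(NamedTuple):
    # All field names and encodings are illustrative assumptions.
    patient_id: str
    is6110_pattern: Tuple[int, ...]  # one entry per RFLP band
    spoligotype: str                 # any string code for the spoligotype

LOW_COPY_MAX_BANDS = 6  # "six or fewer bands" in the text

def strain_key(iso: Isolate):
    """Return the key two isolates must share to count as clonally related."""
    if len(iso.is6110_pattern) > LOW_COPY_MAX_BANDS:
        # Seven or more bands: an identical IS6110 pattern is sufficient.
        return ("high", iso.is6110_pattern)
    # Six or fewer bands: the spoligotype must also be identical.
    return ("low", iso.is6110_pattern, iso.spoligotype)

def find_clusters(isolates):
    """Group patients by strain key; a cluster needs two or more patients."""
    groups = defaultdict(set)
    for iso in isolates:
        groups[strain_key(iso)].add(iso.patient_id)
    return sorted(sorted(ids) for ids in groups.values() if len(ids) >= 2)

example = [
    Isolate("A", (1, 2, 3, 4, 5, 6, 7, 8), "s1"),
    Isolate("B", (1, 2, 3, 4, 5, 6, 7, 8), "s2"),  # clusters with A on RFLP alone
    Isolate("C", (1, 2), "s1"),
    Isolate("D", (1, 2), "s1"),                    # clusters with C
    Isolate("E", (1, 2), "s3"),                    # same bands, different spoligotype
]
clusters = find_clusters(example)
```

In this toy input, patient E shares a two-band pattern with C and D but is separated from them by spoligotype, which is the role the secondary typing method plays for low-copy strains.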

In 1998, CDC funded the Cluster Investigation Study to evaluate epidemiologic links between clustered cases in a more formal manner. Cluster investigations consisted of standardized medical record reviews wherever a patient was seen for TB and standardized interviews with the patient (or with a proxy if the patient was unable to participate). All patients were eligible for interview, unless strong epidemiologic links were found between all members of the cluster. In that situation, interviews were considered unnecessary. Written informed consent was obtained from all subjects, and interpreters were used as needed. Information collected through medical record reviews and patient interviews included the estimated period of infectivity, demographics, employment history, and social connections and activities during the 2 years before diagnosis. Each patient in a genetic cluster was examined to determine the following: 1) the period of infectivity (by reviewing date of diagnosis, disease type, smear status, chest radiology results, and date treatment started), 2) names of contacts identified, and 3) how and where the patient spent his or her time during the period of infectivity. If a patient identified another patient in the same cluster, or if patients were found to be in the same place at the same time when one was infectious, the likelihood of transmission was classified as “definite.” Transmission was “possible” if patients were thought to have been at the same place at the same time up to 2 years before diagnosis, or if patients identified the same contact as being the source of TB. A final category, “unlikely,” was designated when no common place or other epidemiologic connection was identified or when patients had arrived so recently in the country that transmission was unlikely to have occurred. Further details about the formal cluster investigation study are provided elsewhere (9). Data were analyzed by using Epi Info version 6.03 (10).
The study was reviewed and approved by the Human Research Review Committee, Massachusetts Department of Public Health.
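The three transmission categories described above ("definite," "possible," "unlikely") follow a simple order of precedence, which the Python sketch below makes explicit. The `PairEvidence` record and its field names are hypothetical simplifications of what investigators gathered from record reviews and interviews; the real classification involved judgment that a few boolean flags cannot capture.

```python
from dataclasses import dataclass

@dataclass
class PairEvidence:
    # Hypothetical summary of the evidence linking two patients in one cluster.
    named_as_contact: bool             # one patient identified the other
    same_place_while_infectious: bool  # same place and time while one was infectious
    same_place_within_2_years: bool    # thought to share a place up to 2 years before diagnosis
    shared_source_contact: bool        # both named the same contact as the source

def classify_link(evidence: PairEvidence) -> str:
    """Apply the categories in order of strength of evidence."""
    if evidence.named_as_contact or evidence.same_place_while_infectious:
        return "definite"
    if evidence.same_place_within_2_years or evidence.shared_source_contact:
        return "possible"
    # No common place or other connection, or arrival too recent for transmission.
    return "unlikely"

print(classify_link(PairEvidence(False, False, True, False)))
```

Checking the "definite" criteria first means that a pair meeting both a definite and a possible criterion is reported at the stronger level.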

Results

Epidemiology of TB in Massachusetts and Genotypes

From July 1, 1996, to December 31, 2000, a total of 1,281 cases were reported and verified as TB by the TB Division, of which 1,032 (81%) were confirmed with positive culture for M. tuberculosis. Genotype results were obtained for 984 (95%) of the culture-confirmed cases. For the remaining 48 cases, genotype results were not obtained for a variety of reasons, including inability to obtain M. tuberculosis isolates from private laboratories and too little growth on culture. Of the 984 TB patients for whom DNA fingerprinting results were obtained, epidemiologic analyses were conducted for 983 patients whose country of origin was known. The greatest risk for developing TB in Massachusetts was being born outside the United States.
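The case counts in this paragraph form a simple cascade from reported cases to the analyzed study population. As a quick consistency check, the short Python sketch below recomputes the percentages from the counts given in the text; the variable names are illustrative only.

```python
# Counts taken from the paragraph above.
reported = 1281           # cases reported and verified as TB
culture_confirmed = 1032  # positive culture for M. tuberculosis
genotyped = 984           # genotype results obtained
analyzed = 983            # country of origin known

def pct(part: int, whole: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)

not_genotyped = culture_confirmed - genotyped
print(f"culture confirmed: {pct(culture_confirmed, reported)}% of reported cases")
print(f"genotyped: {pct(genotyped, culture_confirmed)}% of culture-confirmed cases")
print(f"not genotyped: {not_genotyped} cases")
print(f"excluded for unknown country of origin: {genotyped - analyzed}")
```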

Six hundred eighty-four (70%) of the TB patients were foreign-born (from 78 different countries). The largest group of foreign-born patients (295; 43%) were from Asia, followed by the Caribbean region (118; 17%) and Africa (116; 17%). Countries with the highest number of cases were Vietnam, 87 cases (13%); Haiti, 83 (12%); China, 59 (9%); India, 54 (8%); Cambodia, 31 (5%); and the Dominican Republic, 30 (4%). Analyses of intervals between arrival into the United States and diagnosis of TB indicated that 176 (26%) patients were diagnosed with TB within 1 year of arrival and 353 (52%) were diagnosed with TB within 5 years of arrival (Table 1).

Foreign-born patients were likely to be younger than U.S.-born TB patients (Table 2). Three hundred twenty-seven (48%) of the foreign-born patients were ages 25–44, as compared with 75 (25%) of U.S.-born patients; 103 (15%) of foreign-born patients were >65 years, as compared with 108 (36%) of U.S.-born patients. Foreign-born patients were also more likely to have extrapulmonary disease: 232 (34%) of foreign-born patients had extrapulmonary TB compared with 61 (20%) of U.S.-born patients. TB patients born in the United States were more likely to have been homeless within the year before diagnosis, and drug use and excessive alcohol use were higher among U.S.-born patients than among foreign-born TB patients. Definitions of drug use (injecting drug use and noninjecting drug use), homelessness, and excessive alcohol use are based on CDC criteria as contained in the instructions for completing the CDC TB case reporting forms (11).

Distribution of Genotypes

Analyses of RFLP distribution indicated that 208 (21%) of 983 isolates contained six or fewer copies of IS6110. Sixty-seven (22%) of the isolates from 299 U.S.-born TB patients contained few copies of IS6110, as did 141 (21%) of the 684 isolates from foreign-born TB patients. However, isolates from foreign-born patients differed substantially by geographic region and country of birth (Table 3). One hundred one (34%) of isolates from Asian patients contained few copies of IS6110 compared with 4% of isolates from persons born in South America. In addition, 42 (48%) of isolates from Vietnam contained few copies of IS6110 compared with 7 (12%) from China.

Genetic Clustering of TB Cases by Genotyping

Of isolates from 983 TB patients, 711 (72%) had DNA fingerprints unique among Massachusetts isolates. The remaining 272 (28%) were in 82 genetic clusters. However, 171 (22%) of the 775 isolates containing more than six copies of IS6110 were in genetic clusters, compared with 100 (48%) of the 208 isolates containing few copies of IS6110. Of the 208 isolates, 158 (76%) clustered by IS6110 alone; 100 (48%) of the isolates remained clustered even with the addition of spoligotyping data to further differentiate the TB strain. The genetic clusters were relatively small in size; 52 (63%) of 82 clusters contained only 2 people, 25 clusters (30%) contained 3–5 people, and the largest cluster contained 16 people. Among the 299 U.S.-born TB patients, 119 (40%) had isolates in genetic clusters, and 180 (60%) had isolates with a unique fingerprint. These figures compare with 153 (22%) of the 684 foreign-born TB patients who had isolates in genetic clusters and 531 (78%) who had unique fingerprints. U.S.-born TB patients were more likely to cluster than foreign-born TB patients (odds ratio [OR] 2.29, 95% confidence interval [CI] 1.69, 3.12). Foreign-born patients who had lived longer in the United States were more likely to have isolates that clustered than were recent arrivals (chi square for trend 6.31, p<0.05). Overall, 29 (16%) of those diagnosed with TB within 1 year of arrival had isolates that clustered with others, compared with 38 (22%) of those diagnosed from 1 to 5 years after arrival and 26% of those diagnosed >5 years after arrival (Table 4). Stratified analyses by age group (<25, 25–44, 45–64, >65) indicated that clustering was associated with increased time spent in the United States for all age groups; however, the association was strongest among those 25–44 years of age (p<0.05).

Likelihood of Epidemiologic Link among Clustered TB Cases

Although the TB genotyping network was started in 1996, cluster investigation did not formally begin until 1998. Therefore, of the 272 patients found in 82 clusters overall, only 161 patients in 52 clusters were investigated for epidemiologic connections as part of the formal Cluster Investigation Study. Information regarding epidemiologic connections was obtained for 152 (94%) of 161 patients. Epidemiologic connections were established for 68 (45%) of the 152 clustered TB cases, but none were found for 84 (55%) of the clustered TB cases. Epidemiologic connections were more likely to be identified for clusters containing only U.S.-born TB patients than for clusters containing some or all foreign-born TB patients (62% vs. 42% and 33%, respectively; chi square for trend, p<0.05). In addition, clustered TB patients whose isolates had many copies of IS6110 were more likely to have epidemiologic connections than those whose isolates had few copies of IS6110 (OR 8.01; 95% CI 3.45, 18.93). Of the 90 clustered TB patients whose isolates had many copies of IS6110, 57 (63%) had epidemiologic connections identified, compared with 11 (18%) of the 62 clustered TB patients whose isolates had few copies of IS6110. Among the U.S.-born patients, 26 (79%) of the 33 patients with many copies of IS6110 had definite or possible epidemiologic connections, whereas none of the 9 patients with few copies of IS6110 had connections (Table 5).
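Both odds ratios reported in the Results can be recomputed from the 2×2 counts given in the text. The Python sketch below does so and attaches a Woolf (logit) 95% confidence interval. The choice of the Woolf method is an assumption made for illustration: the article does not say which interval method Epi Info applied, so the limits computed here need not match the published ones exactly.

```python
import math

def odds_ratio_woolf(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio for the 2x2 table [[a, b], [c, d]] with a Woolf (logit) interval.

    a, b = exposed group with and without the outcome
    c, d = comparison group with and without the outcome
    """
    or_ = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_) - z * se_log_or)
    upper = math.exp(math.log(or_) + z * se_log_or)
    return or_, lower, upper

# Clustering by birthplace: U.S.-born 119 clustered, 180 unique;
# foreign-born 153 clustered, 531 unique.
birthplace = odds_ratio_woolf(119, 180, 153, 531)

# Epidemiologic link by IS6110 copy number: many copies 57 linked, 33 not
# (57 of 90); few copies 11 linked, 51 not (11 of 62).
copy_number = odds_ratio_woolf(57, 33, 11, 51)

for label, (or_, lower, upper) in [("birthplace", birthplace), ("copy number", copy_number)]:
    print(f"{label}: OR {or_:.2f} (Woolf 95% CI {lower:.2f}, {upper:.2f})")
```

The point estimates reproduce the published 2.29 and 8.01. The Woolf limits come out somewhat narrower than the published intervals, which would be consistent with the original analysis having used a different interval method.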

Of the 152 clustered TB patients, 42 (28%) were in clusters containing only U.S.-born patients, 67 (44%) were in clusters with mixed U.S.-born and foreign-born patients, and 43 (28%) were in clusters containing only foreign-born patients. Analysis of the 67 TB patients in mixed clusters containing both U.S.-born and foreign-born persons indicates that 38 (57%) of the TB patients were foreign-born, and 29 (43%) were U.S.-born. Epidemiologic connections were established for 28 (42%) of the 67 TB patients in mixed clusters, and the 17 resulting relationships were analyzed to determine the direction of TB transmission between the cluster members. Results indicate that TB was transmitted from foreign-born to U.S.-born persons in 6 (35%) relationships, from foreign-born to foreign-born persons in 5 (29%) relationships, from U.S.-born to U.S.-born persons in 3 (18%) relationships, and from U.S.-born to foreign-born persons in 3 (18%) relationships. However, three of the six foreign-born to U.S.-born relationships involved U.S.-born children of foreign-born parents. Epidemiologic relationships were established for 26 (62%) of the 42 TB patients in clusters containing only U.S.-born persons, resulting in 20 relationships. Of the 43 TB patients in clusters containing only foreign-born persons, epidemiologic connections were established for 14 patients (33%), resulting in 8 relationships. Overall, of the 45 relationships established through the 68 clustered TB patients with epidemiologic connections, possible TB transmission between U.S.-born persons occurred in 23 (51%) relationships, from foreign-born to foreign-born persons in 13 (29%) relationships, from foreign-born to U.S.-born persons in 6 (13%) relationships, and from U.S.-born to foreign-born persons in 3 (7%) relationships.
In addition, of the 38 foreign-born TB patients in mixed U.S.-born and foreign-born clusters, TB was diagnosed within 1 year of arrival in 10 (26%), from 1 to 5 years after arrival in 7 (18%), and >5 years after arrival in the United States in 21 (55%). However, TB patients in mixed clusters were no more likely than patients in clusters containing only foreign-born persons to be diagnosed with TB within 1 year, from 1 to 5 years, or >5 years after arrival (chi square for trend 0.038, p=0.85).
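The overall tally of 45 epidemiologic relationships reported in this section is the sum of the counts from mixed, U.S.-born-only, and foreign-born-only clusters. The Python sketch below adds them up by direction of presumed transmission; the dictionary layout and labels are illustrative, while the counts come from the text.

```python
from collections import Counter

# Relationship counts by cluster type and (source, recipient) birthplace.
relationships = {
    "mixed": {
        ("foreign-born", "U.S.-born"): 6,
        ("foreign-born", "foreign-born"): 5,
        ("U.S.-born", "U.S.-born"): 3,
        ("U.S.-born", "foreign-born"): 3,
    },
    "U.S.-born only": {("U.S.-born", "U.S.-born"): 20},
    "foreign-born only": {("foreign-born", "foreign-born"): 8},
}

totals = Counter()
for by_direction in relationships.values():
    totals.update(by_direction)  # Counter.update adds counts per key

n_relationships = sum(totals.values())
shares = {direction: round(100 * n / n_relationships) for direction, n in totals.items()}

for (source, recipient), n in totals.most_common():
    print(f"{source} to {recipient}: {n} ({shares[(source, recipient)]}%)")
```

Summing the two cross-group directions gives 9 of 45 relationships (20%) in which transmission crossed between the foreign-born and U.S.-born groups.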

Discussion

The greatest risk of developing TB in Massachusetts is being foreign-born. This finding is consistent with the results found by Mitnick et al., indicating that the foreign-born were 7.5 times more likely to have TB than U.S.-born residents of this state (3). An analysis of time from arrival to TB diagnosis indicated that among 26%, TB was diagnosed within 1 year of arrival and among another 26%, it was diagnosed from 1 to 5 years after arrival. This increased risk soon after arrival is particularly true for persons arriving from Africa and South America, among whom TB was diagnosed within 1 year of their arrival for 41% and 35%, respectively. In Massachusetts, the TB Division is notified of refugees and immigrants with a class A or B TB condition identified through the overseas screening process. Together with the Massachusetts Refugee and Immigrant Health Program, the TB Division works to ensure that those refugees and immigrants are evaluated for active TB soon after their arrival in the United States. However, most foreign-born persons moving into Massachusetts are not refugees or immigrants but students or tourists, and therefore the TB Division has little or no information that would allow targeted TB screening.

Only 28% of Massachusetts TB patients had M. tuberculosis isolates that were clonally related. Most TB cases were likely the result of reactivation of old infection or recent infection that occurred in the person’s country of origin, rather than new infection acquired in this state. U.S.-born patients were twice as likely to cluster as foreign-born TB patients, suggesting that transmission may be occurring more in the U.S.-born population. U.S.-born TB patients were significantly more likely than foreign-born patients to have a communicable form of TB disease, which may be one more explanation for the increase in clustering among U.S.-born patients. TB transmission between foreign-born and U.S.-born cluster members was established in 9 (20%) of the 45 relationships identified among clustered TB patients with epidemiologic connections; however, fully examining the extent to which U.S.-born and foreign-born TB patients transmit TB to each other in Massachusetts requires a longitudinal investigation of contacts, which was beyond the scope of this investigation. In addition, among those not born in the United States, increased time spent in the United States and clustering appeared to be related. Thus, TB that developed soon after the arrival of the foreign-born appeared to have been acquired abroad, and more of the later onset cases in foreign-born persons appeared to be due to infection acquired in Massachusetts.

The comparison between genotype clustering and epidemiologic connection provides evidence that the ability of DNA fingerprints to differentiate TB strains is limited when there are few copies of IS6110. Only 37% of the isolates in clusters containing few copies of IS6110 had their TB strain differentiated further by spoligotyping. Examination of clustered TB patients with no epidemiologic links indicated that two thirds had few copies of IS6110. Epidemiologic connections were more often discovered when the clusters involved U.S.-born TB patients. Despite the use of interpreters, we may have been less successful in obtaining epidemiologic relationship information from foreign-born patients than from U.S.-born patients because of language and cultural barriers. However, even in the clusters of the U.S.-born patients, in which language was not an issue, epidemiologic connections could not be found in clusters with few RFLP bands. This suggests that the use of RFLP analysis, even with spoligotyping, may not be powerful enough to identify true clustering among isolates with few copies of IS6110.

The drawbacks to the RFLP technique include the following: it is labor-intensive, requires culture growth, is difficult to reproduce, and can require laborious secondary typing techniques (7,8,12). Other genotyping techniques, such as mycobacterial interspersed repetitive units–variable number of tandem repeats, are being considered that may offer advantages, including rapid turnaround time for results, reproducibility, and high sensitivity and specificity for M. tuberculosis. However, those methods may have less discriminating power than RFLP (7,12). Analyses of distribution and clustering of RFLP patterns may provide information regarding the ability of RFLP and other possible DNA fingerprinting methods to differentiate TB strains within various communities. For example, our analysis suggests that the ability of DNA fingerprinting to differentiate TB strains in the Asian community may be limited because one third of the isolates contained few copies of IS6110, and the secondary fingerprinting technique had less discriminatory power (Table 3).

Some limitations of the study must be noted. First, in RFLP analysis, the usual turnaround time between specimen collection and availability of result is lengthy (7,8). In some years, our turnaround time averaged 8 months. This lag time hindered the program’s ability to locate clustered patients for interview and affected the patients’ ability to recall contacts, and thus could have contributed to the relatively low percentage of completed interviews (65%). Of 56 patients eligible for interviews, 41% were lost to follow-up or had moved out of state.

Other limitations include the lack of specificity to differentiate TB strains with few copies of IS6110 (7) and incomplete sampling (13). An overestimation of genetic clustering, particularly among isolates with few copies of IS6110, may have occurred. On the other hand, the number of clustered TB patients may have been underestimated because isolates from our study population may have been clonally related to isolates from patients reported outside of Massachusetts or outside the study time frame. In addition, a certain number of isolates in every population cannot be assigned RFLP types.

Conclusion

Molecular fingerprint data were useful in describing the epidemiology of TB in Massachusetts. Using this information, the TB Division can estimate the number of TB cases that resulted from transmission in this state and design appropriate interventions. However, the capacity of DNA fingerprinting data to differentiate TB strains may vary by community of interest, and RFLP analysis, even with secondary typing, may not identify true clusters when isolates have few copies of IS6110. This situation has implications for genotyping techniques that have less discriminatory power than RFLP analysis. DNA fingerprinting should therefore be used in conjunction with effective surveillance and appropriate epidemiologic investigation.

Ms. Sharnprapai is the director of TB surveillance and epidemiology for the Bureau of Communicable Disease, Division of Tuberculosis Prevention and Control, Massachusetts Department of Public Health.

Acknowledgments

The authors thank Paul Elvin, Alissa Scharf, and the Mycobacteriology Laboratory, Massachusetts State Laboratory; Denise O’Connor, John Bernardo, and the Tuberculosis (TB) Division nurses of the Boston Public Health Commission; Janice Boutotte and the Division nurses of the Massachusetts Department of Public Health; Muriel Day, JoAnn Dopp, and Harry Taber; Al DeMaria, Barbara Ellis, and Jack Crawford for manuscript review; and Christopher R. Braden for project direction.

This research was supported in part by the National Tuberculosis Genotyping and Surveillance Network cooperative agreement U52/CCU100156, Centers for Disease Control and Prevention.

References

  1. Institute of Medicine. Ending neglect. Washington: National Academy Press; 2001:149–58.
  2. Centers for Disease Control and Prevention. Reported tuberculosis in the United States, 2000. Atlanta: U.S. Department of Health and Human Services; 2001. p. 24.
  3. Mitnick C, Furin J, Henry C, Ross J. Tuberculosis among the foreign-born in Massachusetts 1982–1994: a reflection of social and economic disadvantages. Int J Tuberc Lung Dis. 1998;2:S32–40.
  4. Sum AM, Fogg WN, Palma S, Fogg N, Kroshko J, Suozzo P, et al. The changing workforce: immigrants and the new economy in Massachusetts: final report. Boston: Center for Labor Market Studies, Northeastern University; 1999.
  5. van Embden JD, Cave MD, Crawford JT, Dale JW, Eisenach KD, Gicquel B, et al. Strain identification of Mycobacterium tuberculosis by DNA fingerprinting: recommendations for a standardized methodology. J Clin Microbiol. 1993;31:406–9.
  6. Kamerbeek J, Schouls L, Kolk A, van Agterveld M, van Soolingen D, Bunschoten A, et al. Simultaneous detection and strain differentiation of Mycobacterium tuberculosis for diagnosis and epidemiology. J Clin Microbiol. 1997;35:907–14.
  7. van Soolingen D. Molecular epidemiology of tuberculosis and other mycobacterial infections: main methodologies and achievements. J Intern Med. 2001;249:1–26.
  8. Mazars E, Lesjean S, Banuls AL, Gilbert M, Vincent V, Gicquel B, et al. High resolution minisatellite-based typing as a portable approach to global analysis of Mycobacterium tuberculosis molecular epidemiology. Proc Natl Acad Sci U S A. 2001;98:1901–6.
  9. Miller AC, Sharnprapai S, Suruki R, Corkren E, Nardell EA, Driscoll JR, et al. Utility for genotyping in public health practice in Massachusetts. Emerg Infect Dis. 2002;8:1285–9.
  10. Dean AG, Dean JA, Coulombier D, Brendel KA, Smith DC, Burton AH, et al. Epi-Info, version 6: a word processing, database, and statistics program for epidemiology on microcomputers. Atlanta: Centers for Disease Control and Prevention; 1994.
  11. Centers for Disease Control and Prevention. SURVS-TB RVCT instructions version 2.0. Atlanta: U.S. Department of Health and Human Services; 1994.
  12. Supply P, Lesjean S, Evgueni S, Kremer K, van Soolingen D, Locht C. Automated high-throughput genotyping for study of global epidemiology of Mycobacterium tuberculosis based on mycobacterial interspersed repetitive units. J Clin Microbiol. 2001;39:3563–71.
  13. Murray M. Sampling bias in the molecular epidemiology of tuberculosis. Emerg Infect Dis. 2002;8:363–9.


Suggested citation for this article: Sharnprapai S, Miller AC, Suruki R, Corkren E, Etkind S, Driscoll J, et al. Genotyping analyses of tuberculosis cases in U.S.- and foreign-born Massachusetts residents. Emerg Infect Dis [serial online] 2002 Nov [date cited]. Available from

DOI: 10.3201/eid0811.020370



Address for correspondence: Sharon Sharnprapai, Massachusetts Dept. of Public Health, Division of Tuberculosis Prevention and Control, 305 South Street, Boston, MA 02130, USA; fax: 617-983-6990
