Accommodating Error Analysis in Comparison and Clustering of Molecular Fingerprints
Figure 4. Histograms of the fragment lengths for 84 two-banded patterns connected by identity (autoclustered with in-house software) exhibit enough spread in values to make detecting outliers and band shifts difficult (a,b). Aligning the 84 lanes to the mean-value lane for this collection reveals that the lanes do not align well; instead, the fragment lengths show bimodal distributions (c,d). Dividing the 84 fingerprints into two sets according to the distinct distributions detected when aligning all 84 fingerprints shows that 26 fingerprints align well to their mean-value lane (e,f), and the remaining 58 also align well to their own mean-value lane (g,h). The smaller fragment does not appear shifted between the two sets of two-banded patterns (compare e with g), but the larger fragment is clearly shifted (compare f with h).
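The split described in the caption can be illustrated with a minimal sketch. It is not the in-house software, and it rests on several assumptions: the mean-value lane is taken to be the per-band mean across lanes, the bimodal larger-fragment lengths are separated with a simple one-dimensional two-means iteration, and the lane values are invented for illustration. The function names (`mean_value_lane`, `band_spread`, `split_two_means`) are hypothetical.

```python
from statistics import mean, pstdev


def mean_value_lane(lanes):
    """Per-band mean across lanes (assumed definition of the mean-value lane)."""
    return [mean(band) for band in zip(*lanes)]


def band_spread(lanes):
    """Per-band population standard deviation about the mean-value lane."""
    return [pstdev(band) for band in zip(*lanes)]


def split_two_means(values, iterations=50):
    """Label each value 0 or 1 using a 1-D two-means iteration (assumed method)."""
    lo, hi = min(values), max(values)
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assign each value to the nearer of the two current centers.
        labels = [0 if abs(v - lo) <= abs(v - hi) else 1 for v in values]
        group0 = [v for v, g in zip(values, labels) if g == 0]
        group1 = [v for v, g in zip(values, labels) if g == 1]
        if not group0 or not group1:
            break  # only one mode found; nothing to split
        new_lo, new_hi = mean(group0), mean(group1)
        if (new_lo, new_hi) == (lo, hi):
            break  # centers stopped moving
        lo, hi = new_lo, new_hi
    return labels


# Invented two-banded lanes (small band, large band); not data from the figure.
jitter = [-2, -1, 0, 1, 2]
set_a = [(300 + j, 900 + j) for j in jitter]
set_b = [(300 + j, 930 + j) for j in jitter]
lanes = set_a + set_b

# Split on the larger fragment, where the bimodality appears.
labels = split_two_means([large for _, large in lanes])
groups = [[lane for lane, g in zip(lanes, labels) if g == k] for k in (0, 1)]

print("all lanes, per-band spread:", band_spread(lanes))
for k, group in enumerate(groups):
    print("set", k, "size", len(group),
          "mean-value lane", mean_value_lane(group),
          "per-band spread", band_spread(group))
```

In this toy data the spread of the larger band is much smaller within each set than across all lanes, while the smaller band has the same spread either way, which mirrors the qualitative pattern the caption reports for panels e through h.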