1. One considers the whole sequence to be a complex character, comprising several unit characters (the individual nucleotides) which can have states U, C, A or G.
2. One aligns the sequences in all possible ways. For example: UCAGCAAUCCGU UCAGCAAUCCGUAAA or UCAGCAAUCCGU UCAGCAAUCCGUAAA or UCAGCAAUCCGU UCAGCAAUCCGUAAA and so on.
3. For each possible alignment one counts the number of unit characters that match in the two aligned fragments. So, for the three alignments given above, the numbers of matches are one, three and twelve, respectively.
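The counting procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and an "alignment" is taken to mean sliding one fragment along the other without inserting gaps, which may be narrower than the set of alignments the text has in mind.

```python
def count_matches(a, b, offset):
    """Count positions i where a[i] equals b[i + offset], over the overlap only."""
    return sum(
        1
        for i, state in enumerate(a)
        if 0 <= i + offset < len(b) and state == b[i + offset]
    )


def all_ungapped_alignments(a, b):
    """Map every offset with at least one overlapping position to its match count.

    Assumption: only ungapped (sliding) alignments are enumerated.
    """
    return {k: count_matches(a, b, k) for k in range(-(len(a) - 1), len(b))}


# The two fragments used in the example above.
short = "UCAGCAAUCCGU"
long_ = "UCAGCAAUCCGUAAA"

scores = all_ungapped_alignments(short, long_)
best_offset = max(scores, key=scores.get)

print(best_offset, scores[best_offset])
```

With the two fragments placed start to start (offset 0), all twelve unit characters of the shorter fragment match, which corresponds to the last of the three alignments mentioned in the text.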

Principal components are not independent of the scale(s) of the original measurements. Multiplying one of the variables by a constant (for example, by altering the scale from metres to centimetres) will change the covariance matrix and produce a different set of principal components. It should also be remembered that where the original characters are measured in widely different units, linear combinations of them will have no sensible physical dimensions. Consequently, the analysis is often carried out on standardized measurements and the components extracted from the correlation rather than the covariance matrix.
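The scale dependence can be seen in a small two-variable sketch. The data values and variable names below are hypothetical and chosen only for illustration. For two variables, the direction of the first principal axis of the covariance matrix has a closed form, so no linear algebra library is needed.

```python
import math


def covariance(xs, ys):
    """Sample covariance of two equally long lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def first_pc_angle(xs, ys):
    """Angle (radians) of the first principal axis of the 2x2 covariance matrix."""
    sxx, syy, sxy = covariance(xs, xs), covariance(ys, ys), covariance(xs, ys)
    return 0.5 * math.atan2(2.0 * sxy, sxx - syy)


def correlation(xs, ys):
    """Product-moment correlation coefficient."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))


# Hypothetical measurements on five specimens.
length_m = [1.0, 2.0, 3.0, 4.0, 5.0]
width_m = [2.1, 3.9, 6.2, 7.8, 10.1]
length_cm = [100.0 * v for v in length_m]  # same lengths, metres -> centimetres

angle_m = first_pc_angle(length_m, width_m)
angle_cm = first_pc_angle(length_cm, width_m)
r_m = correlation(length_m, width_m)
r_cm = correlation(length_cm, width_m)

print(angle_m, angle_cm)  # the principal axis rotates when the scale changes
print(r_m, r_cm)          # the correlation is unaffected by the change of scale
```

Because the correlation is unchanged by the rescaling, components extracted from the correlation matrix are the same whichever unit is used for length, which is the motivation for working with standardized measurements.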

The size coefficient, C~, will be large when the character states of the two OTUs are quite different in magnitude, and the differences are largely in one direction. For example, a large C~ value would arise if one OTU was very similar to another but much larger along most of the character scales. In many studies in numerical taxonomy this partition into size and shape coefficients may not be of great importance, but it may be useful in a minority of investigations which involve comparing organisms of widely different sizes.
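As a rough illustration of such a partition, the sketch below splits the mean squared character difference between two OTUs into a "size" term (the squared mean difference) and a "shape" term (the variance of the differences about that mean). This follows the general idea of a Penrose-style decomposition, but the exact normalization used for the coefficients in the text is not shown in this passage, so the formulas, the function name and the data here are assumptions.

```python
def size_shape_partition(otu_a, otu_b):
    """Split the mean squared difference between two OTUs into size and shape terms.

    Assumed definitions (not taken from the text):
      size  = (mean of the character differences) squared
      shape = mean squared deviation of the differences about their mean
    With these definitions, size + shape equals the mean squared difference.
    """
    diffs = [a - b for a, b in zip(otu_a, otu_b)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    size = mean_diff ** 2
    shape = sum((d - mean_diff) ** 2 for d in diffs) / n
    return size, shape


# Hypothetical character states for a reference OTU.
small = [10.0, 20.0, 30.0, 40.0]

# An OTU that is uniformly larger on every character: all size, no shape.
large = [v + 10.0 for v in small]
size_shift, shape_shift = size_shape_partition(large, small)

# An OTU whose differences alternate in sign: no size, all shape.
mixed = [12.0, 18.0, 32.0, 38.0]
size_mixed, shape_mixed = size_shape_partition(mixed, small)

print(size_shift, shape_shift)
print(size_mixed, shape_mixed)
```

The first comparison mirrors the case described above, where one OTU resembles another but is larger along the character scales, so the differences all lie in one direction and the size term dominates.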

