However, some care should be taken because 1 and 0 may have different meanings for different variables. Sometimes, we may only wish to consider (1, 1) as a match. For example, suppose we have a binary variable for nationality where x = 1 if a person is a UK citizen, and x = 0 otherwise. In this case, if we are really interested in nationality (rather than just whether UK or not), we cannot tell whether two people with x = 0 match on nationality. If we do not wish to consider (0, 0) as a match, a suitable measure of similarity is a/p.
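
The two measures can be compared on a small example. The following is a minimal sketch, assuming the usual notation in which a counts the variables on which both individuals score 1, d counts those on which both score 0, and p is the total number of variables. The function names and the two response vectors are our own, chosen only for illustration.

```python
def match_counts(x, y):
    """Return (a, b, c, d) for two equal-length sequences of 0/1 values.

    a: both 1, b: x is 1 and y is 0, c: x is 0 and y is 1, d: both 0.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    a = sum(1 for u, v in zip(x, y) if u == 1 and v == 1)
    b = sum(1 for u, v in zip(x, y) if u == 1 and v == 0)
    c = sum(1 for u, v in zip(x, y) if u == 0 and v == 1)
    d = sum(1 for u, v in zip(x, y) if u == 0 and v == 0)
    return a, b, c, d


def simple_matching(x, y):
    """Similarity (a + d) / p: both (1, 1) and (0, 0) count as matches."""
    a, _, _, d = match_counts(x, y)
    return (a + d) / len(x)


def positive_matching(x, y):
    """Similarity a / p: only (1, 1) counts as a match."""
    a, _, _, _ = match_counts(x, y)
    return a / len(x)


# Illustrative (made-up) responses for two individuals on p = 4 variables.
x = [1, 0, 0, 1]
y = [1, 0, 1, 0]
print(match_counts(x, y))       # (1, 1, 1, 1)
print(simple_matching(x, y))    # 0.5
print(positive_matching(x, y))  # 0.25
```

The single (0, 0) agreement raises the similarity from 0.25 to 0.5 under (a+d)/p, which is exactly the contribution we would want to exclude for a variable such as the nationality indicator above.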

For example, we may find evidence that people fall into two groups (essentially pro- or anti-abortion), or that attitudes are more diverse. As there are only four binary variables, the similarity measure r = (a+d)/p has just four distinct values (0, 1/4, 1/2 and 3/4) corresponding to whether two patterns match on 0, 1, 2 or 3 responses. This could only give very limited possibilities for clustering response patterns. 16, and which we explain below. 16 are obtained by weighting each response for each variable to give the following measure of similarity between patterns i and j:

$$\sum_{k=1}^{4} w_{k1}\, x_{ik} x_{jk} + \sum_{k=1}^{4} w_{k0}\, (1 - x_{ik})(1 - x_{jk}),$$

where $x_{ik}$ and $x_{jk}$ are the responses to item k for patterns i and j, respectively, and where $w_{k1}$ and $w_{k0}$ are weights.
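
To see how the weighted measure relates to r = (a+d)/p, the following sketch may help. It assumes the responses are coded 0/1 and takes the weights as two lists, one for matching 1s and one for matching 0s. The function name and the unequal weights at the end are hypothetical and are not the weights used in the text. With every weight set to 1/4, the measure reduces to (a+d)/p, and enumerating all pairs of distinct patterns confirms that only four values occur.

```python
from itertools import product


def weighted_similarity(xi, xj, w1, w0):
    """Sum over items k of w1[k]*xi[k]*xj[k] + w0[k]*(1-xi[k])*(1-xj[k]).

    w1[k] is the weight given to a (1, 1) match on item k,
    w0[k] the weight given to a (0, 0) match on item k.
    """
    if not (len(xi) == len(xj) == len(w1) == len(w0)):
        raise ValueError("all arguments must have the same length")
    return sum(
        w1[k] * xi[k] * xj[k] + w0[k] * (1 - xi[k]) * (1 - xj[k])
        for k in range(len(xi))
    )


# Equal weights of 1/4 reproduce the unweighted measure (a + d) / p.
equal = [0.25] * 4
patterns = list(product([0, 1], repeat=4))  # all 16 response patterns
values = {
    weighted_similarity(p, q, equal, equal)
    for p in patterns
    for q in patterns
    if p != q
}
print(sorted(values))  # [0.0, 0.25, 0.5, 0.75]

# Hypothetical unequal weights (for illustration only) give a finer scale.
w1 = [0.4, 0.3, 0.2, 0.1]
w0 = [0.1, 0.2, 0.3, 0.4]
finer = {
    round(weighted_similarity(p, q, w1, w0), 10)
    for p in patterns
    for q in patterns
    if p != q
}
print(len(finer) > len(values))
```

Unequal weights allow many more distinct similarity values than the four available under equal weighting, which is what makes a weighted measure more useful for clustering a small number of binary items.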

Again, these diagrams may be plotted horizontally or vertically, but the name derives from the vertical form which (with some imagination) looks like hanging icicles. 8. 5. 8 are similarities rather than distances. This means that when looking for the closest pair, we look for the largest number rather than the smallest. Here the individuals are villages labelled V13 to V19. 5. 8 The numbers involved here are too small to demonstrate adequately the usefulness of the icicle plot, and we simply use this example to show how the figure is constructed.
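
The point about similarities can be made concrete with a short sketch. It assumes a symmetric similarity matrix stored as a list of lists. The labels and values are invented for illustration and are not the village data referred to above.

```python
def closest_pair(labels, sim):
    """Return (label_i, label_j, similarity) for the most similar pair.

    With similarities, "closest" means the LARGEST off-diagonal entry;
    with distances we would look for the smallest instead.
    """
    best = None
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            if best is None or sim[i][j] > best[2]:
                best = (labels[i], labels[j], sim[i][j])
    return best


# Made-up similarity matrix for three individuals A, B and C.
labels = ["A", "B", "C"]
sim = [
    [1.0, 0.2, 0.7],
    [0.2, 1.0, 0.4],
    [0.7, 0.4, 1.0],
]
print(closest_pair(labels, sim))  # ('A', 'C', 0.7)
```

In an agglomerative procedure, this pair would be the first to be merged.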

