SINGLE LINKAGE CLUSTERING AND CONTINUUM PERCOLATION

By Mathew D. Penrose.

Suppose $f$ is a probability density function in $d$ dimensions, $d > 1$. A single linkage $a$-cluster on a sample of size $n$ from the density $f$ is a connected component of the union of balls of volume $a$, centred at the sample points. Let $\lambda_c$ be the percolation threshold above which a $d$-dimensional Poisson process of rate $\lambda$ has an unbounded 1-cluster. We show that for large $n$, the `big' single linkage $(\lambda_c/(hn))$-clusters can be used to detect population clusters, i.e. connected components of sets of the form $f^{-1}([h,\infty))$. Here, a `big' cluster is one that contains a positive fraction of the sample points.
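The definition of an $a$-cluster can be made concrete with a small sketch. The Python code below is an illustration only, not the paper's method of proof. It assumes Euclidean balls, a brute-force pairwise comparison, and a non-empty sample. The function names (`ball_radius`, `single_linkage_clusters`, `big_clusters`) and the `fraction` cut-off are hypothetical choices made for this example. The value of $\lambda_c$ is not computed here, so the volume `a` is passed in directly.

```python
import math
from collections import Counter


def ball_radius(a, d):
    """Radius of a d-dimensional Euclidean ball of volume a."""
    unit_ball_volume = math.pi ** (d / 2) / math.gamma(d / 2 + 1)
    return (a / unit_ball_volume) ** (1 / d)


def single_linkage_clusters(points, a):
    """Return one cluster label per sample point.

    Two balls of radius r overlap when their centres are at most 2r apart,
    so the a-clusters are the connected components of that overlap graph.
    """
    d = len(points[0])
    threshold = 2 * ball_radius(a, d)
    parent = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Brute-force O(n^2) scan over pairs; merge points whose balls overlap.
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= threshold:
                parent[find(i)] = find(j)

    return [find(i) for i in range(len(points))]


def big_clusters(labels, fraction):
    """Labels of clusters holding at least `fraction` of the sample (assumed cut-off)."""
    counts = Counter(labels)
    return {label for label, count in counts.items()
            if count >= fraction * len(labels)}
```

For a sample `points` of size `n` and a level `h`, one would call `single_linkage_clusters(points, lam_c / (h * n))`, where `lam_c` is an externally supplied estimate of $\lambda_c$, and then pass the labels to `big_clusters`.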

Journal of Multivariate Analysis 53, 94-109 (1995).