Least squares using RKHS

Statistical Data mining: Clustering

Supervised vs. unsupervised

Natural kinds

Good natural kinds should be easy to separate

Blackboards are a bad model

K-means algorithm
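
The slide names the algorithm without spelling it out, so here is a minimal sketch of the standard alternating scheme (often called Lloyd's algorithm), assuming Euclidean distance and caller-supplied starting centroids. The function name `kmeans`, its parameters, and the toy data are illustrative assumptions, not taken from the lecture.

```python
def kmeans(points, init_centroids, iters=100):
    """Alternate assignment and update steps until the labels stop changing.

    points: list of equal-length tuples; init_centroids: starting centroids
    (an assumption of this sketch: the caller chooses them, e.g. k data points).
    Returns (centroids, labels).
    """
    centroids = list(init_centroids)
    k = len(centroids)
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid
        # (squared Euclidean distance; ties go to the lower index).
        new_labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        if new_labels == labels:
            break  # no point changed cluster, so the centroids are stable too
        labels = new_labels
        # Update step: each centroid moves to the mean of its assigned points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels


# Toy data (hypothetical): two tight groups separated by a wide gap.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.0), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
centroids, labels = kmeans(points, [points[0], points[3]])
```

The result depends on the starting centroids: the procedure only reaches a local optimum, so a poor start can leave it stuck in a poor partition even on data this clean.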


Testing the clusters

Natural kinds don't need perfect algorithms

Natural kinds should be well separated. All other clusters are spurious. So this is really a CS problem rather than a statistical significance problem.
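
One way to make "well separated" concrete is to compare the smallest distance between points in different clusters with the largest distance between points in the same cluster. The sketch below computes that ratio, which is in the spirit of the Dunn index. The lecture does not name a specific measure, so this choice, the function name `separation_ratio`, and the toy data are assumptions.

```python
from itertools import combinations
from math import dist


def separation_ratio(points, labels):
    """Smallest between-cluster distance divided by largest within-cluster distance.

    Values well above 1 suggest a clear gap between the clusters; values
    below 1 mean some cluster is wider than the gap separating it from
    another. Raises ValueError if there are no between-cluster or no
    within-cluster pairs.
    """
    between, within = [], []
    for (p, lp), (q, lq) in combinations(zip(points, labels), 2):
        (within if lp == lq else between).append(dist(p, q))
    return min(between) / max(within)


# Toy data (hypothetical): two tight groups separated by a wide gap.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.0), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
good = separation_ratio(points, [0, 0, 0, 1, 1, 1])  # split along the visible gap
bad = separation_ratio(points, [0, 1, 0, 1, 0, 1])   # arbitrary split
```

Under this reading, the split along the gap scores far above 1 and the arbitrary split scores below 1, which matches the claim that anything short of clear separation is spurious.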