Last modified: Tue Nov 22 14:30:52 EST 2005 by Dean Foster


Least squares using RKHS

Statistical Data mining: Clustering

Supervised vs. unsupervised

Natural kinds

Good natural kinds should be easy to separate

Blackboards are a bad model

K-means algorithm
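
As a reminder of what the algorithm does, here is a minimal sketch of the standard (Lloyd) iteration: assign each point to its nearest center, then move each center to the mean of its points, and repeat until the assignments stop changing. The function names, the random choice of starting centers, and the parameters max_iter and seed are assumptions made for illustration, not something these notes specify.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, k, max_iter=100, seed=0):
    """Lloyd's iteration. Returns (centers, labels).

    Assumption: starting centers are k distinct data points drawn at
    random; other initializations are equally legitimate.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers are too
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its old center
                centers[j] = mean_point(members)
    return centers, labels
```

For example, kmeans([(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)], k=2) splits the points into the two obvious groups of four. The iteration only finds a local optimum, so in general the result can depend on the starting centers.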


Testing the clusters

Natural kinds don't need perfect algorithms

Natural kinds should be well separated. All other clusters are spurious. So this really is a CS problem rather than a statistical significance problem.
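
One rough way to put a number on "well separated" is to compare the gap between cluster centers with the spread of points around their own center. The sketch below is only an illustrative heuristic, not a test proposed in these notes; the name separation_ratio and the particular ratio (smallest center-to-center distance divided by largest point-to-own-center distance) are assumptions.

```python
import math


def separation_ratio(points, labels, centers):
    """Smallest gap between centers divided by the largest cluster radius.

    Hypothetical heuristic: values much larger than 1 suggest the clusters
    sit far apart relative to their size; values near 1 or below suggest
    they run into each other. Requires at least two centers.
    """
    # Largest distance from any point to the center it is assigned to.
    spread = max(math.dist(p, centers[lab]) for p, lab in zip(points, labels))
    # Smallest distance between two different centers.
    gap = min(
        math.dist(centers[i], centers[j])
        for i in range(len(centers))
        for j in range(i + 1, len(centers))
    )
    return gap / spread if spread > 0 else math.inf
```

Two tight groups of points that sit far apart give a large ratio, while an arbitrary split of a single blob gives a ratio near 1, which is the sense in which such a cluster looks spurious.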