Clustering

Algorithms

Evaluation

  • High intra-cluster similarity, low inter-cluster similarity
  • External
    • Precision, Recall
    • F1 score
  • Internal Consistency
    • Sum of Squared Errors (SSE)
    • BetaCV: measures compactness and separability as the ratio of the mean intra-cluster distance to the mean inter-cluster distance (smaller is better)
  • Silhouette Coefficient
    • Cohesion and separation
    • Ranges from -1 to 1
    • Calculate the average coefficient over all points; as a common rule of thumb, > 0.7 is strong, > 0.5 is reasonable, < 0.25 is poor
    • a = mean distance to all other points in the same cluster, b = mean distance to all points in the nearest other cluster; per-point coefficient s = (b - a) / max(a, b)
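
The measures listed above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: it assumes Euclidean distance, at least two clusters, and labels given as NumPy arrays. The function names are hypothetical, and the pair-counting definition of precision and recall used here is only one common choice, since the notes do not say which variant is intended.

```python
import numpy as np

def pairwise_distances(X):
    # Euclidean distance between every pair of rows of X
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def sse(X, labels):
    # Sum of squared distances from each point to its cluster centroid
    total = 0.0
    for k in np.unique(labels):
        members = X[labels == k]
        total += ((members - members.mean(axis=0)) ** 2).sum()
    return float(total)

def beta_cv(X, labels):
    # Mean intra-cluster pair distance / mean inter-cluster pair distance
    D = pairwise_distances(X)
    i, j = np.triu_indices(len(X), k=1)   # each unordered pair once
    same = labels[i] == labels[j]
    return float(D[i, j][same].mean() / D[i, j][~same].mean())

def silhouette(X, labels):
    # Average of s = (b - a) / max(a, b) over all points
    D = pairwise_distances(X)
    clusters = np.unique(labels)
    scores = []
    for p in range(len(X)):
        own = labels == labels[p]
        if own.sum() == 1:
            scores.append(0.0)            # convention: singleton clusters score 0
            continue
        # a: mean distance to the other members of the same cluster
        a = D[p, own].sum() / (own.sum() - 1)
        # b: mean distance to the members of the nearest other cluster
        b = min(D[p, labels == k].mean() for k in clusters if k != labels[p])
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))

def pairwise_prf(labels, truth):
    # Pair-counting precision, recall and F1 against ground-truth classes
    # (assumes at least one same-cluster pair and one same-class pair)
    i, j = np.triu_indices(len(labels), k=1)
    same_pred = labels[i] == labels[j]
    same_true = truth[i] == truth[j]
    tp = np.sum(same_pred & same_true)
    fp = np.sum(same_pred & ~same_true)
    fn = np.sum(~same_pred & same_true)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return float(precision), float(recall), float(f1)

# Toy data: two tight pairs of points, five units apart
X = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 0.0], [5.0, 1.0]])
labels = np.array([0, 0, 1, 1])
print(sse(X, labels), beta_cv(X, labels), silhouette(X, labels))
print(pairwise_prf(labels, np.array([0, 0, 0, 1])))
```

On well-separated data like this toy set, SSE and BetaCV are small and the average silhouette is close to 1; the external scores drop as soon as the predicted clusters disagree with the ground-truth classes.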