DBSCAN

DBSCAN = Density-Based Spatial Clustering of Applications with Noise

Algorithm

  • Find core points in high-density regions and grow clusters outward from them, neighborhood by neighborhood
  • Parameters
    • Eps (ε) = max radius of neighborhood
    • Minimum Points (MinPts) = min number of points required in the ε-neighborhood
    • N_ε(p) = {q : dist(p, q) ≤ ε} = the ε-neighborhood of point p
  • Points
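    • Core points - at least MinPts points (counting the point itself) lie within its neighborhood
    • Border points - not core, but inside the neighborhood of some core point
    • Noise/Outlier - neither core nor border; assigned to no cluster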
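
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the names dbscan, region_query, NOISE and UNVISITED are made up for this sketch, the distance is assumed to be Euclidean, and neighborhoods are found by brute force (no spatial index). The sample points are invented to show one border point and one noise point.

```python
from math import dist  # Euclidean distance, Python 3.8+

NOISE = -1        # label for noise/outlier points (hypothetical convention)
UNVISITED = None  # label for points not yet processed


def region_query(points, i, eps):
    """Return indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [UNVISITED] * len(points)
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # provisional: may become a border point later
            continue
        # points[i] is a core point: start a new cluster and expand it
        cluster_id += 1
        labels[i] = cluster_id
        seeds = list(neighbors)
        k = 0
        while k < len(seeds):
            j = seeds[k]
            k += 1
            if labels[j] == NOISE:
                labels[j] = cluster_id  # reachable from a core point: border point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                seeds.extend(j_neighbors)  # j is also core: keep growing
    return labels


# Two dense groups, one border point (2, 1), one isolated point (5, 5)
points = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 5)]
labels = dbscan(points, eps=1.5, min_pts=4)
print(labels)  # [0, 0, 0, 0, 0, 1, 1, 1, 1, -1]
```

Here (2, 1) has only three points in its neighborhood, so it is not core, but it joins cluster 0 because it lies within eps of the core point (1, 1). The point (5, 5) stays labeled as noise.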

An animation.

Pros and Cons

  • Pros
    • Resistant to noise/outliers
    • Arbitrary cluster shape
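    • Efficiency: one scan of the data, with one neighborhood query per point (roughly O(n log n) with a spatial index, O(n²) without)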
  • Cons
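    • Sensitive to the chosen parameters (neighborhood radius and MinPts): small changes can merge clusters, split them, or turn points into noise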
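
To make the parameter sensitivity concrete, the sketch below counts how many points qualify as core points as the radius changes while MinPts stays fixed. The helper count_core_points and the sample data are hypothetical, chosen only for illustration, and Euclidean distance is assumed.

```python
from math import dist  # Euclidean distance, Python 3.8+


def count_core_points(points, eps, min_pts):
    """Count points whose eps-neighborhood (including the point itself) holds at least min_pts points."""
    count = 0
    for p in points:
        neighborhood_size = sum(1 for q in points if dist(p, q) <= eps)
        if neighborhood_size >= min_pts:
            count += 1
    return count


points = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 1), (5, 5)]
for eps in (0.5, 1.0, 1.5, 6.0):
    print(eps, count_core_points(points, eps, min_pts=4))
# 0.5 0
# 1.0 1
# 1.5 4
# 6.0 5
```

With the radius at 0.5 nothing is dense enough to seed a cluster, so every point would be noise; at 6.0 even the far-away point (5, 5) falls inside the neighborhood of core points and would be absorbed as a border point.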