Different kinds of distance

Image credit for formulas: Maarten Grootendorst [2]

Euclidean distance

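The straight-line distance between two points, or between two series of equal length: the square root of the sum of the squared differences between corresponding values.

[Formula image: Euclidean distance]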
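As a minimal sketch of the definition, assuming the two series are plain Python sequences of numbers with equal length (the function name `euclidean_distance` is just an illustrative choice):

```python
import math


def euclidean_distance(a, b):
    """Square root of the sum of squared differences between two series."""
    if len(a) != len(b):
        raise ValueError("both series must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


print(euclidean_distance([0, 0], [3, 4]))  # 5.0
```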

Manhattan distance

The distance between two series measured as the sum of the absolute differences between corresponding values.

[Formula image: Manhattan distance]
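A minimal sketch, again assuming two equal-length sequences of numbers (the function name is hypothetical):

```python
def manhattan_distance(a, b):
    """Sum of absolute differences between corresponding values."""
    if len(a) != len(b):
        raise ValueError("both series must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


print(manhattan_distance([0, 0], [3, 4]))  # 7
```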

Chebyshev distance

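The distance between two series measured as the largest absolute difference between any pair of corresponding values. Chebyshev distance has limited applicability; a typical use is movement on a grid where a diagonal step costs the same as a straight one, as for a chess king.

[Formula image: Chebyshev distance]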
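A minimal sketch of the same idea in code, assuming two non-empty, equal-length sequences of numbers (the function name is hypothetical):

```python
def chebyshev_distance(a, b):
    """Largest absolute difference between corresponding values."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("both series must be non-empty and of equal length")
    return max(abs(x - y) for x, y in zip(a, b))


print(chebyshev_distance([0, 0], [3, 4]))  # 4
```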

Minkowski distance

This is the general formula that covers the Euclidean, Manhattan and Chebyshev distances as special cases of its order parameter p.

[Formula image: Minkowski distance]
  • p=1: Manhattan distance
  • p=2: Euclidean distance
  • p=∞: Chebyshev distance (reached in the limit)
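The three special cases can be checked with a short sketch. It assumes two non-empty, equal-length sequences of numbers and an order p of at least 1; the function name and the use of `math.inf` to select the Chebyshev case are illustrative choices, not part of the original post.

```python
import math


def minkowski_distance(a, b, p):
    """Minkowski distance of order p; p = math.inf gives the Chebyshev case."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("both series must be non-empty and of equal length")
    if p < 1:
        raise ValueError("p must be at least 1")
    diffs = [abs(x - y) for x, y in zip(a, b)]
    if math.isinf(p):
        # Limit of the general formula as p grows without bound.
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1 / p)


a, b = [0, 0], [3, 4]
print(minkowski_distance(a, b, 1))         # 7.0, the Manhattan distance
print(minkowski_distance(a, b, 2))         # 5.0, the Euclidean distance
print(minkowski_distance(a, b, math.inf))  # 4, the Chebyshev distance
```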

Cosine distance

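One minus the cosine of the angle between two vectors. It depends only on the directions of the vectors, not on their magnitudes.

[Formula image: Cosine distance]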
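A minimal sketch, assuming cosine distance is taken as 1 minus cosine similarity and that neither vector is all zeros (the function name is hypothetical):

```python
import math


def cosine_distance(a, b):
    """1 minus the cosine of the angle between vectors a and b."""
    if len(a) != len(b):
        raise ValueError("both vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1 - dot / (norm_a * norm_b)


print(cosine_distance([1, 0], [0, 1]))  # 1.0, the vectors are perpendicular
```

Vectors that point the same way, such as [1, 2] and [2, 4], come out at (approximately) zero regardless of their lengths.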

Jaccard distance

The Jaccard index is the number of elements that two sets share (their intersection) divided by the total number of distinct elements across both sets (their union). Jaccard distance is 1 minus the Jaccard index.

[Formula image: Jaccard distance]
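A minimal sketch using Python sets. Treating two empty sets as identical (distance 0) is an assumption made here, since the index is otherwise undefined in that case; the function name is hypothetical.

```python
def jaccard_distance(a, b):
    """1 minus |intersection| / |union| of two collections treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # assumption: two empty sets count as identical
    return 1 - len(a & b) / len(union)


print(jaccard_distance({1, 2, 3}, {2, 3, 4}))  # 0.5
```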

Sorensen-Dice distance

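The Sørensen-Dice index is a measure of overlap between two sets: twice the size of their intersection divided by the sum of the two set sizes. The distance is 1 minus the index.

[Formula image: Sørensen-Dice distance]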
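A minimal sketch, assuming the index is computed as 2 * |A ∩ B| / (|A| + |B|) and the distance as 1 minus that value. As with the Jaccard case, returning 0 for two empty sets is an assumption, and the function name is hypothetical.

```python
def sorensen_dice_distance(a, b):
    """1 minus 2 * |intersection| / (|a| + |b|) for two collections as sets."""
    a, b = set(a), set(b)
    total = len(a) + len(b)
    if total == 0:
        return 0.0  # assumption: two empty sets count as identical
    return 1 - 2 * len(a & b) / total


print(sorensen_dice_distance({1, 2, 3}, {2, 3, 4}))  # about 0.333
```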

Haversine distance

The distance between two points on a sphere, measured along the surface (the great-circle distance), given their longitudes and latitudes.

[Formula image: Haversine distance]
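A minimal sketch of the haversine formula, assuming latitudes and longitudes are given in degrees. The default radius of 6371 km is a commonly used mean Earth radius and is an assumption here, as is the function name.

```python
import math


def haversine_distance(lat1, lon1, lat2, lon2, radius=6371.0):
    """Great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    h = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    # Clamp to 1.0 to guard against rounding slightly above the valid range.
    return 2 * radius * math.asin(math.sqrt(min(1.0, h)))


# A quarter of the way around the equator of a unit sphere.
print(haversine_distance(0, 0, 0, 90, radius=1.0))  # about 1.5708 (pi / 2)
```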

Distance in flat space-time (“proper distance” in Special Relativity)

https://sites.pitt.edu/~jdnorton/teaching/HPS_0410/chapters/spacetime/index.html

Distance in curved space-time (General Relativity)

https://physics.stackexchange.com/questions/267138/whats-the-definition-of-distance-in-curved-space-time-in-general-relativity

Hamming distance

This is a measure of distance between two bit strings of equal length. Hamming distance is the number of positions at which the two strings differ.
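A minimal sketch, assuming the bit strings are given as Python strings of equal length (the function name is hypothetical; the same code works for any two equal-length sequences):

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("both strings must have the same length")
    return sum(x != y for x, y in zip(a, b))


print(hamming_distance("10110", "11100"))  # 2
```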

Log-Spectral distance

https://en.wikipedia.org/wiki/Log-spectral_distance

Chi-square distance

Chi-square distance is a measure of dissimilarity between two histograms and has been widely used in various applications such as image retrieval, texture and object classification, and shape classification.

https://stats.stackexchange.com/questions/184101/comparing-two-histograms-using-chi-square-distance
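Several definitions of chi-square distance are in use. The sketch below assumes one common symmetric form, half the sum over bins of (a - b)² / (a + b), applied to two histograms with the same number of non-negative bins. Skipping bins that are empty in both histograms is a choice made here to avoid dividing by zero, and the function name is hypothetical.

```python
def chi_square_distance(hist_a, hist_b):
    """0.5 * sum((a - b)^2 / (a + b)) over the bins of two histograms."""
    if len(hist_a) != len(hist_b):
        raise ValueError("both histograms must have the same number of bins")
    total = 0.0
    for a, b in zip(hist_a, hist_b):
        if a + b > 0:  # skip bins that are empty in both histograms
            total += (a - b) ** 2 / (a + b)
    return 0.5 * total


print(chi_square_distance([1, 0], [0, 1]))  # 1.0, no overlap at all
```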

Pearson correlation coefficient

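This is a very widely used similarity measure, often simply called “correlation”. It ranges from -1 to 1, so it measures similarity rather than distance; a distance is commonly derived from it as 1 minus the coefficient.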

https://en.wikipedia.org/wiki/Pearson_correlation_coefficient
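A minimal sketch of the coefficient, assuming two equal-length series with at least two values each and non-zero variance (the function name is hypothetical):

```python
import math


def pearson_correlation(x, y):
    """Covariance of x and y divided by the product of their spreads."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("both series need the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


print(pearson_correlation([1, 2, 3], [2, 4, 6]))  # 1.0
```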

Mahalanobis distance

https://en.wikipedia.org/wiki/Mahalanobis_distance

Canberra distance

https://en.wikipedia.org/wiki/Canberra_distance

Bray-Curtis distance

https://en.wikipedia.org/wiki/Bray%E2%80%93Curtis_dissimilarity

Kullback–Leibler distance

https://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence

Other correlation measures

Eisen cosine correlation distance, Spearman correlation distance, Kendall correlation distance [4]

Others

There are more, I am sure 🙂

References

[1] Jason Brownlee, “4 Distance Measures for Machine Learning”

[2] Maarten Grootendorst, “9 Distance Measures in Data Science”

[3] McCune and Grace, Chapter 6

[4] https://www.datanovia.com/en/lessons/clustering-distance-measures/

[5] Distance Measures
