Unsupervised Learning – K-Means Clustering

k-means clustering groups data points into a predetermined number of clusters based on their distances to the cluster centres. We first specify the desired number of clusters (k) and choose initial centres for them, often at random. Each data point is then assigned to the cluster with the nearest centre, each centre is moved to the mean of the points assigned to it, and these two steps repeat until the assignments stop changing.
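To make the assign-and-update loop concrete, here is a minimal NumPy sketch. It is only an illustration and not the code from the tutorial file: the function name `kmeans`, the random choice of starting centres, and the synthetic two-blob data are all assumptions made for this example.

```python
import numpy as np

def kmeans(points, k, n_iter=100, seed=0):
    """Cluster `points` (n_samples x n_features) into k groups (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    # Assumption: start from k distinct data points chosen at random.
    centres = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: Euclidean distance from every point to every centre.
        dists = np.linalg.norm(points[:, None, :] - centres[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each centre to the mean of its assigned points
        # (an empty cluster keeps its previous centre).
        new_centres = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centres[j]
            for j in range(k)
        ])
        if np.allclose(new_centres, centres):
            break  # centres stopped moving, so the assignments are stable
        centres = new_centres
    return labels, centres

# Synthetic data: two well-separated blobs of 50 points each.
rng = np.random.default_rng(42)
blob_a = rng.normal(loc=0.0, scale=0.5, size=(50, 2))
blob_b = rng.normal(loc=5.0, scale=0.5, size=(50, 2))
data = np.vstack([blob_a, blob_b])

labels, centres = kmeans(data, k=2)
print(centres)
```

In practice a library implementation such as scikit-learn's `KMeans` is the usual choice; the loop above is only meant to show what happens inside.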

In this tutorial, we’ll learn to cluster stocks based on their weekly ATR (average true range).

The code demonstrates the following concepts:

  • How to get Dow 30 stock tickers from Wikipedia
  • How to use the pandas resample() method
  • How to create an elbow plot to choose the number of clusters
  • How to cluster stocks based on their weekly ATR
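As a rough idea of how resample() and the weekly ATR fit together, the sketch below builds weekly bars from daily prices and averages the true range over them. It runs on synthetic prices, and several details are assumptions rather than the tutorial's own choices: the column names, the helper name `weekly_atr`, the 14-week window, and the use of a simple rolling mean (Wilder's smoothing is a common alternative).

```python
import numpy as np
import pandas as pd

def weekly_atr(daily, n_weeks=14):
    """ATR on weekly bars built from daily OHLC data (illustrative sketch)."""
    # resample("W") groups the daily rows into calendar weeks ending on Sunday.
    weekly = daily.resample("W").agg(
        {"Open": "first", "High": "max", "Low": "min", "Close": "last"}
    ).dropna()
    prev_close = weekly["Close"].shift(1)
    # True range: the largest of the three candidate ranges for each week.
    true_range = pd.concat(
        [
            weekly["High"] - weekly["Low"],
            (weekly["High"] - prev_close).abs(),
            (weekly["Low"] - prev_close).abs(),
        ],
        axis=1,
    ).max(axis=1)
    # Assumption: ATR as a simple rolling mean of the true range.
    return true_range.rolling(n_weeks).mean()

# Synthetic daily prices for one hypothetical stock (52 full weeks).
n = 260
dates = pd.bdate_range("2023-01-02", periods=n)
rng = np.random.default_rng(0)
close = 100 + rng.normal(0, 1, n).cumsum()
open_ = close + rng.normal(0, 0.3, n)
daily = pd.DataFrame(
    {
        "Open": open_,
        "High": np.maximum(open_, close) + rng.uniform(0.1, 1.0, n),
        "Low": np.minimum(open_, close) - rng.uniform(0.1, 1.0, n),
        "Close": close,
    },
    index=dates,
)

atr = weekly_atr(daily)
print(atr.dropna().tail())
```

To cluster several stocks, one option is to compute this series per ticker and use a single value per stock, such as the latest ATR, as the clustering feature.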

It also backtests a strategy that clusters Tesla's (TSLA) daily data on two features: scaled volume (volume / 20-day average volume) and scaled range ((high – low) / ATR).

Each trading day is assigned to one of 3 clusters, and the performance of the clusters is compared.
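The two features and the 3-cluster split can be sketched as follows. This runs on synthetic prices rather than real TSLA data and is not the tutorial's backtest. The 14-day ATR window, the column names, and the use of mean next-day return as the "performance" measure are all assumptions made for this example.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Synthetic daily data standing in for a real price history.
rng = np.random.default_rng(1)
n = 500
dates = pd.bdate_range("2022-01-03", periods=n)
close = 200 * np.exp(rng.normal(0, 0.02, n).cumsum())
df = pd.DataFrame(
    {
        "High": close * (1 + rng.uniform(0.0, 0.03, n)),
        "Low": close * (1 - rng.uniform(0.0, 0.03, n)),
        "Close": close,
        "Volume": rng.lognormal(mean=17, sigma=0.4, size=n),
    },
    index=dates,
)

# Daily ATR (assumption: 14-day simple rolling mean of the true range).
prev_close = df["Close"].shift(1)
true_range = pd.concat(
    [
        df["High"] - df["Low"],
        (df["High"] - prev_close).abs(),
        (df["Low"] - prev_close).abs(),
    ],
    axis=1,
).max(axis=1)
atr = true_range.rolling(14).mean()

# The two features described in the text.
features = pd.DataFrame(
    {
        "scaled_volume": df["Volume"] / df["Volume"].rolling(20).mean(),
        "scaled_range": (df["High"] - df["Low"]) / atr,
    }
).dropna()

# Group the trading days into 3 clusters.
model = KMeans(n_clusters=3, n_init=10, random_state=0)
features["cluster"] = model.fit_predict(features[["scaled_volume", "scaled_range"]])

# Assumption: compare clusters by the return on the following day.
next_ret = df["Close"].pct_change().shift(-1)
summary = (
    next_ret.reindex(features.index)
    .groupby(features["cluster"])
    .agg(["mean", "count"])
)
print(summary)
```

Because the prices here are random, the per-cluster returns carry no meaning; the point is only the shape of the workflow.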

Link to Code (as HTML file)

I have not had time to check the code in this file, so it may contain errors.

KMeans.html
