Clustering in R
- Accessed a data set of customer purchasing habits at a winery. Included was demographic data of the customers and their spending habits at this store.
- Cleaned data, removing unnecessary columns and bad entries.
- Scaled dataset based on standardization.
- Selected optimal number of clusters (2) using silhouette scores and elbow plot.
- Utilized K-means clustering to segment, then employed fviz_cluster from the factoextra package to visualize the results.
- Confirmed that the optimal number of clusters had been chosen with a silhouette plot.
- Separated data entries into the two clusters and plotted average and total values for each.
- The two clusters became customer segments for which a marketing plan was made to be recommended to the winery.
- Created charts of purchasing channels and sales of each product category.
Full code can be found in github repository here