This article presents my analysis of an unsupervised learning experiment on a dataset that contains clusters. The goal of the experiment is to fit a clustering model that recovers those clusters. The experiment is based on \(7,500\) observations and initially \(23\) features (including \(1\) categorical feature).
After analyzing the dataset with principal component analysis (PCA), I applied feature selection, which left \(16\) features.
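To illustrate how a PCA-based analysis can feed into feature selection, here is a minimal sketch. It is not the procedure used in the experiment: the synthetic data, the function name `pca_feature_ranking`, the \(90\%\) variance target, and the loading-based importance score are all assumptions made for illustration, and the sketch assumes that no feature is constant.

```python
import numpy as np

def pca_feature_ranking(X, variance_target=0.9):
    """Rank features by their weight in the leading principal components.

    Hypothetical helper: the importance score (squared loadings weighted
    by explained variance) is one heuristic among several.
    """
    X = np.asarray(X, dtype=float)
    # Standardize so that no feature dominates only because of its scale.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # PCA via SVD: the rows of vt are the principal directions (loadings).
    _, s, vt = np.linalg.svd(Z, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    # Smallest number of components whose cumulative variance reaches the target.
    cumulative = np.cumsum(explained)
    n_components = min(int(np.searchsorted(cumulative, variance_target)) + 1, len(s))
    # Importance of each feature across the retained components.
    weights = explained[:n_components, None]
    importance = (vt[:n_components] ** 2 * weights).sum(axis=0)
    ranking = np.argsort(importance)[::-1]  # most important feature first
    return explained, n_components, ranking

# Synthetic stand-in data: 200 observations, 6 numeric features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
X[:, 5] = X[:, 0] + 0.05 * rng.normal(size=200)  # near-duplicate of feature 0

explained, n_components, ranking = pca_feature_ranking(X)
kept = sorted(ranking[:4].tolist())  # keep the 4 highest-ranked features
```

The number of features to keep (here \(4\)) is a free choice in this sketch. In practice it would be guided by the explained-variance profile and by domain knowledge.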
I trained the clustering model after the feature engineering step. I excluded the categorical feature from training so that I could use it for evaluation.
Finally, I evaluated the model by comparing the assigned cluster with the categorical value for each observation. The evaluation showed an average silhouette coefficient of \(0.7\), which suggests reasonably well-separated clusters. I therefore conclude that my clustering configuration is appropriate.
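The two evaluation ideas mentioned above can be sketched in plain Python. This is an illustration under stated assumptions, not the code used in the experiment: the toy points, labels, and categories are invented, Euclidean distance is assumed, and `purity` is one possible way (my choice here) to compare cluster assignments with a held-out categorical value.

```python
import math
from collections import Counter

def mean_silhouette(points, labels):
    """Average silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette coefficient needs at least two clusters")

    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            scores.append(0.0)  # convention: a singleton cluster scores 0
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: mean distance to the members of the nearest other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

def purity(labels, categories):
    """Share of observations whose category is the majority category of their cluster."""
    by_cluster = {}
    for lab, cat in zip(labels, categories):
        by_cluster.setdefault(lab, Counter())[cat] += 1
    hits = sum(counts.most_common(1)[0][1] for counts in by_cluster.values())
    return hits / len(labels)

# Invented toy data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels = [0, 0, 0, 1, 1, 1]                   # cluster assignments
categories = ["a", "a", "b", "b", "b", "b"]   # held-out categorical feature

score = mean_silhouette(points, labels)
agreement = purity(labels, categories)
```

Note that the two measures answer different questions. The silhouette coefficient uses only the features and the cluster assignments, so it describes how compact and separated the clusters are. The comparison with the categorical value is an external check of whether the clusters match known groups.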