The problem: We've got a dataset with 4 attributes (features, to some) and a classification column that specifies 3 types of iris flower. Can we identify a mechanism that accurately classifies the iris flowers based solely on their attributes?
Data Summary: Using Tableau Public, I loaded a CSV filled with all the instances and acquired the summary statistics along with some really sexy data visualizations.
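For anyone without Tableau Public, here is a minimal sketch of how the same kind of per-attribute summary could be computed in plain Python. The rows are made-up placeholders rather than the actual dataset, and the column names and the `summarize` helper are assumptions for illustration only.

```python
import statistics

# Hypothetical sample rows: (sepal_length, sepal_width, petal_length, petal_width).
# These values are illustrative placeholders, not the data used in this post.
rows = [
    (5.0, 3.4, 1.5, 0.2),
    (6.0, 2.8, 4.5, 1.4),
    (6.5, 3.0, 5.5, 2.0),
    (5.5, 2.6, 4.0, 1.2),
]
columns = ["sepal_length", "sepal_width", "petal_length", "petal_width"]

def summarize(rows, columns):
    """Return min, max, mean, and sample standard deviation for each column."""
    summary = {}
    for i, name in enumerate(columns):
        values = [row[i] for row in rows]
        summary[name] = {
            "min": min(values),
            "max": max(values),
            "mean": statistics.mean(values),
            "stdev": statistics.stdev(values),
        }
    return summary

summary = summarize(rows, columns)
for name, stats in summary.items():
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

With the real CSV, the `rows` list would be replaced by the parsed file contents, for example via the standard `csv` module.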
Data Visualization: Below is a gallery of scatterplot visuals also generated in Tableau Public.
The Sepal: Length to Width graph shows the lower sepal length and width values separating from the rest. This is seen in the smaller, lighter green circles. The classes become easier to tell apart when only one class is visible at a time. This can be seen in the following graph.
The Petal: Length to Width graph shows a clear separation of the lower petal length and width values. This is seen in the blue circles on the bottom left. The other classes are not as clear to the naked eye but become more evident when the scatterplot is partitioned by class in the following graphs.
Results: A full write-up with RStudio commands can be found here. RStudio was a fantastic and intuitive tool for loading the data and training an unsupervised clustering algorithm. I was also able to load a library for plotting the results in 3 dimensions, which made for more interactive models.
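To give a feel for what unsupervised clustering does, here is a minimal k-means sketch. Two caveats: k-means is an assumption on my part, since this post does not name the algorithm, and the sketch is in Python rather than the R commands from the write-up. The points are made-up (petal length, petal width) pairs, and the `kmeans` and `squared_distance` helpers are hypothetical names.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=20):
    """Plain k-means: alternate an assignment step and a centroid update step."""
    # Deterministic start for this sketch: the first k points act as centroids.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids

# Illustrative (petal_length, petal_width) pairs, not the post's dataset.
points = [
    (1.4, 0.2), (4.5, 1.5), (6.0, 2.2),
    (1.5, 0.3), (4.2, 1.3), (5.8, 2.1),
    (1.3, 0.2), (4.7, 1.4), (6.1, 2.3),
]
labels, centroids = kmeans(points, k=3)
print(labels)
```

Note that the class column is never used: the algorithm only sees the attributes, and the resulting cluster labels can be compared against the known classes afterwards. Seeding from the first k points keeps the sketch reproducible; practical implementations usually use random or smarter initialization.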
Attention to detail? Nah, attention to the whole picture.