Cluster Analysis with the BigML Dashboard
15 Takeaways
This document covered Clusterings in detail. We conclude it with a list of key points:
BigML Clusters can learn how your data instances group together based on their similarity.
Each cluster group is represented by its center, called a centroid.
To build a cluster you just need a dataset (see Figure 15.1).
A cluster can be an input to a prediction, to a batch prediction, to a dataset, or to a BigML model (see Figure 15.1).
Create centroids or batch centroids from a cluster to find out which cluster group previously unseen data instances belong to.
You can also create clusters using the BigML REST API or the BigML bindings for your language of choice.
Create a BigML model or a dataset from a cluster to further analyze the instances that belong to any given cluster group discovered by training the cluster. For example, a BigML model may help you identify which fields are most relevant in determining whether a data instance should be considered a member of a cluster group.
Numeric fields are automatically scaled to prevent their different magnitudes from biasing the calculation.
BigML provides two different methods to do the clustering: K-means and G-means. Use G-means when you do not know how many cluster groups can be found.
When you create a BigML Cluster from a dataset, you can define a number of options, such as the number K of clusters (K-means) or the critical value (G-means), field scaling and weighting, and sampling.
BigML visualizes clusters as circles of different colors that represent the centroids found. Each circle is sized according to the number of instances that belong to the corresponding cluster group.
You can use BigML Clusters to calculate the nearest centroid to a given data instance or to a number of instances.
You can download clusters in several languages, including Python, JSON PML, and Node.js, to use for local computation.
At any time you can update a cluster’s descriptive information, move a cluster to a different project, rename it, or delete it permanently.
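Two of the points above, automatic scaling of numeric fields and finding the nearest centroid for a data instance, can be illustrated with a small local sketch. This is not BigML's implementation: the standardization to zero mean and unit variance, the Euclidean distance, the function names, and the sample values are all assumptions chosen for illustration. The example shows how a field with a large magnitude can dominate the distance when no scaling is applied.

```python
import math

def column_stats(rows):
    """Return per-column means and standard deviations (assumed scaling scheme)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return means, stds

def scale(point, means, stds):
    """Standardize one instance using precomputed column statistics."""
    return [(v - m) / s for v, m, s in zip(point, means, stds)]

def nearest_centroid(point, centroids):
    """Return the index of the centroid closest to point (Euclidean distance)."""
    distances = [math.dist(point, c) for c in centroids]
    return distances.index(min(distances))

# Hypothetical training data: a small-magnitude field and a large-magnitude field.
rows = [[1.0, 1000.0], [2.0, 1100.0], [9.0, 5000.0], [10.0, 5200.0]]
# Hypothetical centroids, expressed in the original units.
centroids = [[1.5, 1050.0], [9.5, 5100.0]]
new_instance = [9.0, 2900.0]

means, stds = column_stats(rows)

# Without scaling, the second field dominates the distance.
unscaled_choice = nearest_centroid(new_instance, centroids)

# With scaling, both fields contribute comparably.
scaled_centroids = [scale(c, means, stds) for c in centroids]
scaled_choice = nearest_centroid(scale(new_instance, means, stds), scaled_centroids)

print(unscaled_choice, scaled_choice)
```

In this sketch the same instance is assigned to a different centroid depending on whether the fields are scaled, which is the kind of bias that scaling is meant to prevent.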