Cluster Analysis with the BigML Dashboard
7.6 Descriptive Information
Descriptive information is what allows you to describe a centroid so you can find it later and easily recognize it among other centroids.
Each centroid has an associated name, description, category, and tags. The following subsections briefly describe each of them. Figure 7.32 shows the options that the More info panel offers for editing them.
7.6.1 Name
If you do not specify a name for your predictions, BigML assigns a default name depending on the type of prediction:
Single centroid: the name always follows the structure “Centroid for <objective field name>”.
Batch centroid: BigML combines your prediction dataset name and the cluster name: “Batch centroid for <cluster name> with <dataset name>”.
Centroid names are displayed in the list view and also on the top bar of a prediction view. Centroid names are indexed to be used in searches. You can rename your centroids at any time from the More info panel.
The name of a centroid cannot be longer than 256 characters. There is no restriction on the characters that can be used in a name. More than one centroid can have the same name, even within the same project, since each one is automatically assigned a unique internal identifier.
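The naming patterns and the length limit described in this subsection can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical and not part of any BigML library, and it assumes the 256-character limit is counted in Unicode characters, which the text does not specify.

```python
MAX_NAME_LENGTH = 256  # limit stated in this subsection


def default_single_name(objective_field_name: str) -> str:
    """Build the default name pattern described for a single centroid."""
    return f"Centroid for {objective_field_name}"


def default_batch_name(cluster_name: str, dataset_name: str) -> str:
    """Build the default name pattern described for a batch centroid."""
    return f"Batch centroid for {cluster_name} with {dataset_name}"


def is_valid_name(name: str) -> bool:
    """Check only the documented upper bound; any characters are allowed.

    Assumption: length is measured in characters, not bytes.
    """
    return len(name) <= MAX_NAME_LENGTH


if __name__ == "__main__":
    print(default_batch_name("iris cluster", "iris test"))
    print(is_valid_name("x" * 257))
```

Because names do not need to be unique, a check like this looks only at length; distinguishing two centroids with the same name relies on their internal identifiers.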
7.6.2 Description
Each cluster prediction also has a description, which is very useful for documenting your Machine Learning projects. Centroids take their description from the cluster used to create them.
Descriptions can be written using plain text and also markdown. BigML provides a simple markdown editor that accepts a subset of markdown syntax. (See Figure 7.33.)
Descriptions cannot be longer than 8192 characters and can use almost any character.
7.6.3 Category
Each prediction has an associated category, taken from the cluster used to create it. Categories are useful for classifying predictions according to the domain your data comes from. This helps when you use BigML to solve problems across industries or for multiple customers.
A prediction category must be one of the categories listed in Table 7.1.
Category
Aerospace and Defense
Automotive, Engineering and Manufacturing
Banking and Finance
Chemical and Pharmaceutical
Consumer and Retail
Demographics and Surveys
Energy, Oil and Gas
Fraud and Crime
Healthcare
Higher Education and Scientific Research
Human Resources and Psychology
Insurance
Law and Order
Media, Marketing and Advertising
Miscellaneous
Physical, Earth and Life Sciences
Professional Services
Public Sector and Nonprofit
Sports and Games
Technology and Communications
Transportation and Logistics
Travel and Leisure
Uncategorized
Utilities
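If you keep category labels in your own scripts or spreadsheets, a simple membership check against Table 7.1 can catch typos early. The sketch below is illustrative only: `CATEGORIES` and `is_valid_category` are hypothetical names, and how BigML itself encodes categories internally or in its API is not covered here.

```python
# Category labels copied from Table 7.1.
CATEGORIES = frozenset({
    "Aerospace and Defense",
    "Automotive, Engineering and Manufacturing",
    "Banking and Finance",
    "Chemical and Pharmaceutical",
    "Consumer and Retail",
    "Demographics and Surveys",
    "Energy, Oil and Gas",
    "Fraud and Crime",
    "Healthcare",
    "Higher Education and Scientific Research",
    "Human Resources and Psychology",
    "Insurance",
    "Law and Order",
    "Media, Marketing and Advertising",
    "Miscellaneous",
    "Physical, Earth and Life Sciences",
    "Professional Services",
    "Public Sector and Nonprofit",
    "Sports and Games",
    "Technology and Communications",
    "Transportation and Logistics",
    "Travel and Leisure",
    "Uncategorized",
    "Utilities",
})


def is_valid_category(label: str) -> bool:
    """Return True if the label exactly matches an entry in Table 7.1."""
    return label in CATEGORIES


if __name__ == "__main__":
    print(is_valid_category("Healthcare"))
    print(is_valid_category("Agriculture"))
```

The comparison is exact and case-sensitive, so a label such as "healthcare" would be rejected by this sketch.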
7.6.4 Tags
A prediction can also have a number of tags associated with it. Tags can help you retrieve the prediction via the BigML API or attach some extra information to it. Your prediction inherits the tags from the cluster used to create it. Each tag is limited to a maximum of 128 characters. Each prediction can have up to 32 different tags.
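The two tag limits above can be checked before you assign tags to a prediction. The following is a minimal sketch: `validate_tags` is a hypothetical helper, not a BigML function, and it assumes that repeated tags count once toward the limit of 32, since the text refers to "different" tags.

```python
from __future__ import annotations

MAX_TAG_LENGTH = 128  # maximum characters per tag, as stated above
MAX_TAGS = 32         # maximum number of different tags per prediction


def validate_tags(tags: list[str]) -> list[str]:
    """Return a list of problems found; an empty list means the tags fit.

    Assumption: duplicates are collapsed before counting.
    """
    problems = []
    distinct = set(tags)
    if len(distinct) > MAX_TAGS:
        problems.append(
            f"{len(distinct)} different tags; the limit is {MAX_TAGS}"
        )
    for tag in sorted(distinct):
        if len(tag) > MAX_TAG_LENGTH:
            problems.append(
                f"tag starting with {tag[:20]!r} has {len(tag)} characters; "
                f"the limit is {MAX_TAG_LENGTH}"
            )
    return problems


if __name__ == "__main__":
    for problem in validate_tags(["iris", "demo", "x" * 129]):
        print(problem)
```

Returning a list of problems rather than raising on the first one makes it easier to report every issue at once when tags come from a bulk import.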