Anomaly Detection with the BigML Dashboard
1 Introduction
Some problems require identifying the instances within a dataset that do not conform to a regular pattern, e.g., detecting any kind of fraud or discovering errors in your data. The BigML anomaly detector (called anomaly in BigML) is an unsupervised learning method that can detect anomalous instances in unlabeled datasets. This means that you do not need to collect a training dataset in which you know in advance which instances are anomalous and which are normal. The algorithm can find suspicious patterns in your data given a set of input fields.
BigML anomaly is an optimized implementation of the Isolation Forest algorithm, a highly scalable method that can efficiently deal with high-dimensional datasets. Learn more about Isolation Forests in Chapter 2.
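To make the idea behind Isolation Forests more concrete, the following is a minimal pure-Python sketch of the general technique, not BigML's optimized implementation. It assumes numeric fields only, and every name in it (`build_forest`, `anomaly_score`, the sample size of 64, the 100 trees, the toy data) is a hypothetical choice made for illustration. Instances that are isolated after few random splits receive scores closer to 1, which would correspond to the upper end of a 0% to 100% scale.

```python
import math
import random

EULER_GAMMA = 0.5772156649  # constant used to approximate harmonic numbers


def average_path_length(n):
    """Expected depth of an unsuccessful binary-search-tree lookup over n
    points; used to normalize the depths observed in the random trees."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(rows, depth, max_depth, rng):
    """Recursively split rows on a random field at a random threshold."""
    if depth >= max_depth or len(rows) <= 1:
        return {"size": len(rows)}
    field = rng.randrange(len(rows[0]))
    low = min(row[field] for row in rows)
    high = max(row[field] for row in rows)
    if low == high:  # nothing to split on for this field
        return {"size": len(rows)}
    split = rng.uniform(low, high)
    left = [row for row in rows if row[field] < split]
    right = [row for row in rows if row[field] >= split]
    return {
        "field": field,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(tree, row, depth=0):
    """Depth at which row lands in a leaf, adjusted for the leaf's size."""
    if "field" not in tree:
        return depth + average_path_length(tree["size"])
    branch = "left" if row[tree["field"]] < tree["split"] else "right"
    return path_length(tree[branch], row, depth + 1)


def build_forest(rows, n_trees=100, sample_size=64, seed=0):
    """Grow n_trees random trees, each on a random subsample of rows."""
    if len(rows) < 2:
        raise ValueError("need at least two rows")
    rng = random.Random(seed)
    sample_size = min(sample_size, len(rows))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [
        build_tree(rng.sample(rows, sample_size), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    return trees, sample_size


def anomaly_score(forest, row):
    """Score in (0, 1); shorter average paths give higher scores."""
    trees, sample_size = forest
    mean_depth = sum(path_length(tree, row) for tree in trees) / len(trees)
    return 2.0 ** (-mean_depth / average_path_length(sample_size))


# Toy data (hypothetical): 200 points near the origin plus one far-away point.
data_rng = random.Random(42)
data = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(200)]
data.append((10.0, 10.0))

forest = build_forest(data)
outlier_score = anomaly_score(forest, (10.0, 10.0))
inlier_score = anomaly_score(forest, (0.0, 0.0))
print(f"outlier: {outlier_score:.2f}, inlier: {inlier_score:.2f}")
```

No labels are supplied anywhere in the sketch: the far-away point scores higher than the central one only because random splits separate it from the rest of the data sooner.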
This document provides a comprehensive description of BigML anomalies, including how they can be created with 1-click (Chapter 3), all the configuration options (Chapter 4), and the visualization provided by BigML (Chapter 5). For each instance, BigML computes an Anomaly Score, which can take values between 0% and 100%, and the field importances, an indicator of each field's contribution to the anomaly score (section 2.1). Once your anomaly detector has been created, you can use it to score new instances one by one or in batch (Chapter 6). You can even download the anomaly detector to score new instances locally (see section 7.1). You can also create, configure, retrieve, list, update, delete, and use your anomaly detector to score instances using the BigML API and bindings (section 7.2 and section 7.3).
The fifth tab of the main menu of the BigML Dashboard allows you to list all your available anomalies. The anomaly list view (Figure 1.1) shows the dataset used to create each anomaly, the Name, Top N (the number of top anomalies, explained in section 4.1), Age (time elapsed since it was created), Size, and the number of Scores and Batch Scores that have been created using that anomaly. The search option in the top right menu of the anomaly list view allows you to search your anomalies by name.
When you first create an account at BigML, or every time that you start a new Project, your list of anomalies will be empty. (See Figure 1.2.)
Finally, in Figure 1.3 you can see the icon used to represent an anomaly in BigML.