Anomaly Detection with the BigML Dashboard
14 Takeaways
This document covered anomalies in detail. We conclude it with a list of key points:
Anomaly detection is an unsupervised learning method used to detect instances that do not follow a regular pattern.
BigML anomalies use an optimized implementation of the Isolation Forest algorithm, a highly scalable and efficient method that usually yields better results than other anomaly detection techniques.
BigML computes an anomaly score for each instance and a measure to indicate the relative contribution of each input field to the anomaly score.
BigML anomalies support categorical and numeric fields as inputs; text and items fields are not taken into account when computing the anomaly score.
BigML anomalies also support missing data.
To create an anomaly you just need an existing dataset. The anomaly can then be used to make a single score prediction or a batch score prediction. Additionally, you can create a dataset from anomalies. (See Figure 14.1.)
You can use the 1-click option to create your anomaly, or you can first configure the parameters provided by BigML.
When the anomaly has been created, you get a list of your TOP ANOMALIES ranked by score.
You can inspect the field values of your anomalous instances in the DATA INSPECTOR.
You can create a new dataset that either removes or includes your anomalous instances.
You can use your anomaly to score instances the model has not seen before, either one at a time or in batch.
You can create, configure, update, and use your anomalies programmatically via the BigML API and bindings.
You can download your anomalies to locally score your new instances.
You can add descriptive information to your anomalies.
You can move your anomalies between projects.
You can share your anomalies with other people using the secret link or embedding them into your own applications.
You can stop the creation of an anomaly by deleting it.
You can permanently delete your existing anomalies.
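As a rough illustration of the Isolation Forest idea behind the anomaly score summarized above, the following Python sketch builds random trees and converts the average isolation depth of an instance into a score between 0 and 1. It is not BigML's optimized implementation: the function names (average_path_length, build_tree, fit_forest, anomaly_score), the defaults (100 trees, 64-row subsamples), and the sample data are all assumptions made for illustration, and the sketch handles numeric fields only, with no missing values.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def average_path_length(n):
    """Expected depth of an unsuccessful binary-search-tree lookup over n rows."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(rows, depth, max_depth, rng):
    """Split rows on a random field at a random value until they are isolated."""
    if depth >= max_depth or len(rows) <= 1:
        return {"size": len(rows)}
    field = rng.randrange(len(rows[0]))
    low = min(row[field] for row in rows)
    high = max(row[field] for row in rows)
    if low == high:
        # Simplification for this sketch: stop instead of trying another field.
        return {"size": len(rows)}
    split = rng.uniform(low, high)
    left = [row for row in rows if row[field] < split]
    right = [row for row in rows if row[field] >= split]
    return {
        "field": field,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(tree, row, depth=0):
    """Depth at which row lands in a leaf, adjusted for rows left unsplit there."""
    if "size" in tree:
        return depth + average_path_length(tree["size"])
    child = tree["left"] if row[tree["field"]] < tree["split"] else tree["right"]
    return path_length(child, row, depth + 1)


def fit_forest(rows, n_trees=100, sample_size=64, seed=0):
    """Grow n_trees isolation trees, each on a random subsample of rows."""
    if len(rows) < 2:
        raise ValueError("need at least two rows")
    rng = random.Random(seed)
    sample_size = min(sample_size, len(rows))
    max_depth = max(1, math.ceil(math.log2(sample_size)))
    trees = [
        build_tree(rng.sample(rows, sample_size), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    return trees, sample_size


def anomaly_score(forest, row):
    """Score in (0, 1]; shorter average paths (easier isolation) score higher."""
    trees, sample_size = forest
    mean_depth = sum(path_length(tree, row) for tree in trees) / len(trees)
    return 2.0 ** (-mean_depth / average_path_length(sample_size))


# Hypothetical data: 200 points around the origin plus one distant point.
data_rng = random.Random(42)
data = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(200)]
data.append((8.0, 8.0))

forest = fit_forest(data)
typical_score = anomaly_score(forest, (0.0, 0.0))
outlier_score = anomaly_score(forest, (8.0, 8.0))
print(f"typical: {typical_score:.3f}, outlier: {outlier_score:.3f}")
```

In this sketch, an instance that random splits separate from the rest after only a few cuts gets a score closer to 1, while an instance buried in a dense region gets a lower score. This is the intuition behind ranking instances by anomaly score.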