Anomaly Detection with the BigML Dashboard

5.2 Create a Dataset from Anomalies

In BigML, you can either remove the anomalous instances to create a new, clean dataset, or create a new dataset that includes only the anomalous instances so you can analyze them further. The following sections explain both options.

5.2.1 Remove Anomalous Instances

You can create a new dataset that excludes the anomalous instances from the original dataset used to build the anomaly detector. The new dataset will contain all the input fields used to compute the anomaly score, plus the ID fields. Read more about ID fields in section 4.4.

First, select the top anomalies you want to remove. Then click the icon next to the green button so that the selected instances are removed, and finally click Create dataset (see Figure 5.6).

\includegraphics[]{images/an-remove}
Figure 5.6 Create a dataset removing anomalies

5.2.2 Include Only Anomalous Instances

You can create a new dataset that includes only the instances you select from your top anomalies. The new dataset will contain all the input fields used to compute the anomaly score, plus the ID fields. Read more about ID fields in section 4.4.

First, select the top anomalies you want to include. Then make sure the icon for removing anomalies, located next to the green button, is not enabled, as shown in Figure 5.7. Finally, click the Create dataset button.

\includegraphics[]{images/an-include}
Figure 5.7 Create a dataset including only anomalies
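
To make the difference between the two options concrete, the following is a minimal sketch in plain Python of what each one produces. It does not use the BigML API; the function `dataset_from_anomalies`, the field names (`id`, `amount`, `score`) and the sample rows are all hypothetical and exist only for illustration. The `remove_selected` flag plays the role of the icon next to the green button.

```python
from typing import Any, Dict, List, Set

Row = Dict[str, Any]


def dataset_from_anomalies(
    rows: List[Row],
    selected_ids: Set[str],
    remove_selected: bool,
    id_field: str = "id",
) -> List[Row]:
    """Return a new list of rows built from a selection of anomalies.

    remove_selected=True  -> keep every row EXCEPT the selected ones
                             (the "remove anomalous instances" option).
    remove_selected=False -> keep ONLY the selected rows
                             (the "include only anomalous instances" option).
    """
    if remove_selected:
        return [r for r in rows if r[id_field] not in selected_ids]
    return [r for r in rows if r[id_field] in selected_ids]


# Hypothetical instances with an already computed anomaly score.
rows = [
    {"id": "a", "amount": 12.0, "score": 0.31},
    {"id": "b", "amount": 980.0, "score": 0.82},
    {"id": "c", "amount": 15.5, "score": 0.35},
    {"id": "d", "amount": 0.01, "score": 0.74},
]

# Select the two highest-scoring instances as the "top anomalies".
top = sorted(rows, key=lambda r: r["score"], reverse=True)[:2]
selected = {r["id"] for r in top}

clean = dataset_from_anomalies(rows, selected, remove_selected=True)
only_anomalies = dataset_from_anomalies(rows, selected, remove_selected=False)
```

In this sketch `clean` holds the rows that were not selected and `only_anomalies` holds the selected ones, so the two results together always cover the original rows without overlap.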