Datasets with the BigML Dashboard
8.5 Ordering Instances
The ordering instances option in BigML allows you to sort the rows of a dataset by one or more selected fields in ascending or descending order. Instances are sorted first by the first selected field, then by the second field, and so on. You can select up to 8 different sorting fields. This option is especially useful for time series, when your dataset contains a date field and you need to sort the instances chronologically.
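As a rough illustration of how a multi-field ordering with per-field direction behaves, the following Python sketch sorts a list of rows outside of BigML. The function name `order_instances`, the field names, and the sample rows are hypothetical; only the limit of 8 sorting fields comes from the text above. It is not BigML's implementation, just a minimal model of the same ordering rule.

```python
from operator import itemgetter


def order_instances(rows, sort_fields):
    """Return rows sorted by (field, descending) pairs.

    The first pair is the most significant sorting field.
    """
    if len(sort_fields) > 8:
        raise ValueError("at most 8 sorting fields are supported")
    ordered = list(rows)
    # Python's sort is stable, so sorting by the least significant field
    # first and the most significant field last gives the combined order.
    for field, descending in reversed(sort_fields):
        ordered.sort(key=itemgetter(field), reverse=descending)
    return ordered


# Hypothetical rows: sort by region ascending, then by sales descending.
rows = [
    {"region": "north", "sales": 10},
    {"region": "south", "sales": 30},
    {"region": "north", "sales": 25},
    {"region": "south", "sales": 5},
]
result = order_instances(rows, [("region", False), ("sales", True)])
for row in result:
    print(row["region"], row["sales"])
```

Rows that share the same value of the first field keep the order imposed by the second field, which mirrors the "first field, then second field" behavior described above.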
For example, imagine we have a dataset containing the monthly minimum temperatures in Melbourne (Australia) that is not chronologically sorted (see the left-side table in Figure 8.43). To create a Time series model from this dataset, we first need to sort the instances in ascending order by date, as shown in the right-side table of Figure 8.43.
We can easily do this in BigML by following these steps:
1. From the dataset view, click on the Order instances menu option (see Figure 8.44).
2. You cannot select the full date-time field to sort instances, but you can select its expanded fields (year, month, day of month, etc.). Remember that when you select multiple fields to sort your instances, the first field determines the primary order, the second field orders the instances that share the same value of the first field, and so on. That is why we need to select the largest date unit first, in this case the year, and then the next date unit, in this case the month (see Figure 8.45). Then click on the button.
3. A new dataset will be created with the sorted instances. A confirmation message appears in blue at the top of the dataset view (see Figure 8.46).
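To see why the larger date unit has to be selected first, the sketch below compares the two possible field orders on a handful of rows. The rows and temperature values are made up for illustration and are not taken from the Melbourne dataset; the field names `year`, `month`, and `min_temp` are assumptions.

```python
# Hypothetical rows in arbitrary order; the temperature values are made up.
instances = [
    {"year": 1982, "month": 1, "min_temp": 13.2},
    {"year": 1981, "month": 2, "min_temp": 14.1},
    {"year": 1982, "month": 2, "min_temp": 14.8},
    {"year": 1981, "month": 1, "min_temp": 12.9},
]

# Year first, then month: the result is chronological.
by_year_month = sorted(instances, key=lambda r: (r["year"], r["month"]))

# Month first, then year: every January precedes every February,
# so the years end up interleaved.
by_month_year = sorted(instances, key=lambda r: (r["month"], r["year"]))

chronological = [(r["year"], r["month"]) for r in by_year_month]
interleaved = [(r["year"], r["month"]) for r in by_month_year]

print(chronological)  # [(1981, 1), (1981, 2), (1982, 1), (1982, 2)]
print(interleaved)    # [(1981, 1), (1982, 1), (1981, 2), (1982, 2)]
```

Only the first ordering is suitable as input for a time series, since the second one groups the same month of different years together.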
The ordering option in the Dashboard uses an SQL query underneath. Therefore, once the dataset is created, you can view the SQL query by clicking the option shown in Figure 8.47.
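The exact query that BigML generates is not reproduced here. As a generic illustration of what an ordering query of this kind looks like, the following sketch runs an `ORDER BY` statement against an in-memory SQLite table using Python's standard `sqlite3` module. The table name `temperatures`, its columns, and the sample values are hypothetical, and the SQL dialect BigML uses may differ.

```python
import sqlite3

# In-memory database with a hypothetical table; the values are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE temperatures (year INTEGER, month INTEGER, min_temp REAL)"
)
conn.executemany(
    "INSERT INTO temperatures VALUES (?, ?, ?)",
    [(1982, 1, 13.2), (1981, 2, 14.1), (1982, 2, 14.8), (1981, 1, 12.9)],
)

# Sort by year first, then by month, both in ascending order.
query = (
    "SELECT year, month, min_temp FROM temperatures "
    "ORDER BY year ASC, month ASC"
)
sorted_rows = conn.execute(query).fetchall()
conn.close()

for row in sorted_rows:
    print(row)
```

The order of the columns in the `ORDER BY` clause plays the same role as the order in which the sorting fields are selected in the Dashboard.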