Classification and Regression with the BigML Dashboard

7.1 Introduction

BigML evaluations provide an easy way to measure and compare the performance of Classification and Regression models (i.e., models, ensembles, logistic regressions, deepnets, and fusions created using supervised learning algorithms). The main purpose of evaluations is twofold:

  • First, obtaining an estimate of the model’s performance in production (i.e., when making predictions for new instances the model has never seen before).

  • Second, providing a framework to compare models built using different configurations or different algorithms to help identify the one with the best predictive performance.

The basic idea behind evaluations is to take some test data, different from the data used to train the model, and create a prediction for every instance. The actual Objective Field values of the instances in the test data are then compared against the predictions, and several performance measures are computed based on the correct results as well as the errors made by the model.
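To make this idea concrete, the following toy Python sketch (not BigML’s internal implementation; the class labels and predictions are made up) compares predictions against the actual Objective Field values of the test instances and derives a simple accuracy measure:

\begin{verbatim}
# Not BigML code: a toy illustration of the evaluation idea for a
# classification model, using made-up actual values and predictions.
actual    = ["Iris-setosa", "Iris-virginica", "Iris-versicolor", "Iris-setosa"]
predicted = ["Iris-setosa", "Iris-versicolor", "Iris-versicolor", "Iris-setosa"]

# Count the instances the model classified correctly and derive accuracy.
correct = sum(1 for a, p in zip(actual, predicted) if a == p)
accuracy = correct / len(actual)
print("accuracy:", accuracy)   # 3 out of 4 correct -> 0.75
\end{verbatim}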

The usual way to obtain some test data is to split a Dataset into two disjoint subsets: a training set and a test set. You can easily do this from your BigML Dashboard by using the 1-click menu option that automatically splits your dataset into a random 80% subset for training and a 20% subset for testing, or, if you prefer, you can configure those percentages. Chapter 7 of the Datasets with the BigML Dashboard document [23] explains how to do this.
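If you prefer to script this step, the sketch below uses the BigML Python bindings to perform an equivalent deterministic 80%/20% split; the dataset ID is a placeholder, and the sampling arguments (sample_rate, out_of_bag, seed) follow the BigML API documentation:

\begin{verbatim}
from bigml.api import BigML

api = BigML()  # credentials read from BIGML_USERNAME / BIGML_API_KEY
origin_dataset = "dataset/5d2720d1eba31d41e6000000"  # placeholder ID

# Using the same seed in both calls makes the samples deterministic;
# out_of_bag=True selects the complementary 20% of rows for testing.
train = api.create_dataset(origin_dataset,
                           {"sample_rate": 0.8, "seed": "my split"})
test = api.create_dataset(origin_dataset,
                          {"sample_rate": 0.8, "seed": "my split",
                           "out_of_bag": True})
api.ok(train)
api.ok(test)
\end{verbatim}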

Depending on whether you evaluate a classification or a regression model, different metrics and visualizations are provided, as subsection 7.2.1 and subsection 7.2.2 describe. In BigML you can also perform cross-validation, another popular model evaluation technique, explained in subsection 7.3.6. Moreover, BigML provides tools to compare several evaluations built with different algorithms and configurations, as explained in section 7.6.

BigML evaluations are first-class citizens. This means that they can be created via the BigML API and can also be queried automatically (you can find an example in subsection 7.7.1). This allows you to automate workflows in which you iteratively change your model parameters and see how the performance changes.
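For instance, a minimal sketch with the BigML Python bindings could look like the one below; the resource IDs are placeholders, and the exact keys available under the evaluation’s result object depend on the model type (see the BigML API documentation for the full list):

\begin{verbatim}
from bigml.api import BigML

api = BigML()

# Placeholders for an existing supervised model and a held-out test dataset.
model = "model/5d2720d1eba31d41e6000001"
test_dataset = "dataset/5d2720d1eba31d41e6000002"

evaluation = api.create_evaluation(model, test_dataset)
api.ok(evaluation)  # block until the evaluation has finished

# For classification models the result typically includes measures such as
# accuracy and average F-measure (field names as per the API documentation).
measures = evaluation["object"]["result"]["model"]
print(measures.get("accuracy"), measures.get("average_f_measure"))
\end{verbatim}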

The evaluations section in the BigML Dashboard is found in the third tab under the models’ menu (see Figure 7.1). This section contains all your model evaluations ordered by creation date, so the most recent evaluations are found at the top of the list. You can order your evaluations by Name, Type (classification or regression, and cross-validation; see the icons below), Performance (F-measure or R squared, explained in Classification Measures and subsection 7.2.2, respectively), Age (time since the evaluation was created), Dataset Size (the size of the testing dataset), and Instances (number of instances in the testing dataset). (See Figure 7.2.) The icons on the left of the evaluation names in the evaluation list view allow you to go to the original Resources used to create the evaluation (the model, ensemble, logistic regression, deepnet, or fusion, and the testing dataset).

\includegraphics[]{images/evaluations/evaluations-section}
Figure 7.1 Evaluations section in the BigML Dashboard
\includegraphics[]{images/evaluations/evaluations-listing-view}
Figure 7.2 Evaluation list view

In the evaluation list view, you can search evaluations by name by clicking the search menu option in the top right corner. You can also access the 1-click action menu shown in Figure 7.3:

  • Evaluate a model: create a new evaluation by selecting a model and a testing dataset (see subsection 7.3.1)

  • Evaluate an ensemble: create a new evaluation by selecting an ensemble and a testing dataset (see subsection 7.3.2)

  • Evaluate a logistic regression: create a new evaluation by selecting a logistic regression and a testing dataset (see subsection 7.3.3)

  • Evaluate a deepnet: create a new evaluation by selecting a deepnet and a testing dataset (see subsection 7.3.4)

  • Evaluate a fusion: create a new evaluation by selecting a fusion and a testing dataset (see subsection 7.3.5)

  • Compare evaluations: compare two existing evaluations side by side (see subsection 7.6.1)

  • Compare multiple evaluations: compare multiple existing evaluations (see subsection 7.6.2)

\includegraphics[]{images/evaluations/one-click-options-listing-view}
Figure 7.3 Evaluation list view 1-click action menu

Note: time series evaluations are explained in the document Time Series with the BigML Dashboard [1].

A Name, Description, Category, and Tags are associated with each evaluation; they can be helpful for retrieving and documenting your projects (see section 7.9). You can share your evaluation with others by using the secret link (see section 7.10).

An evaluation is associated with the same Project to which the evaluated model, ensemble, or logistic regression belongs. You can move an evaluation between projects (see section 7.11) or delete it permanently from your account (see section 7.13).

Finally, you can see the corresponding icons used to represent a single evaluation and cross-validation evaluations in Figure 7.4 and Figure 7.5, respectively.

\includegraphics[width=2cm]{images/evaluations/evaluations-icon}
Figure 7.4 Evaluations icon
\includegraphics[width=2cm]{images/evaluations/icon-crossvalidation}
Figure 7.5 Cross-validation icon

Depending on the evaluation’s Type, you can find the following icons in the evaluation list view:

\includegraphics[width=2cm]{images/evaluations/icon-clasification}
Figure 7.6 Classification evaluations icon
\includegraphics[width=2cm]{images/evaluations/icon-regression}
Figure 7.7 Regression evaluations icon
\includegraphics[width=2cm]{images/evaluations/icon-cross-clasification}
Figure 7.8 Classification cross-validation icon
\includegraphics[width=2cm]{images/evaluations/icon-cross-regression}
Figure 7.9 Regression cross-validation icon