Classification and Regression with the BigML Dashboard
7.5 Visualizing Evaluations
After evaluations are created, you can visualize them in the BigML Dashboard.
Different visualizations are provided for classification (see subsection 7.5.2) and regression (see subsection 7.5.3) evaluations. Both types of evaluations share the original resources information at the top of the view (see subsection 7.5.1).
Cross-validation visualizations are very similar to those of single evaluations; the main differences are explained in subsection 7.5.4.
7.5.1 Original Resources Information
For single evaluations, the original model and testing dataset used to create the evaluation appear at the top of the evaluation view along with their corresponding icons, so you can easily return to those resources. (See Figure 7.80.)
The information for each resource includes some of the parameter values used to create them:
For models, ensembles, logistic regressions and deepnets:
Model type (regression or classification).
Objective field name.
Model size.
Number of fields: used to build the model.
Number of instances: used to build the model.
Description (see section 1.10 .)
Sampling and ordering options (see subsection 1.4.7 and subsection 1.4.8 .)
Instances: number of sampled instances.
Sample rate: percentage of the training dataset used to build the model.
Range: instances selected to build the model.
Sampling: deterministic or random.
Replacement: true or false.
Out of bag: true or false.
Ordering: linear, random or deterministic.
Ensembles also include:
Type: Decision Forests or Boosted Trees (See subsection 2.4.3 )
Number of models: total single models in the ensemble
For testing datasets:
Dataset size
Number of fields: in the testing dataset
Number of instances: in the testing dataset
Description: see subsection 7.9.2
Sampling and ordering options (See subsection 7.4.4 .)
Instances: number of sampled instances
Sample rate: percentage of the testing dataset used to create the evaluation
Range: instances selected to sample the testing dataset
Sampling: deterministic or random.
Replacement: true or false.
Out of bag: true or false.
Ordering: linear, random or deterministic.
7.5.2 Classification Evaluations
For classification models, BigML provides several views that are accessible by clicking on the corresponding icons in the top menu.
General Evaluation
This first visualization provides the general confusion matrix, which contains the correctly classified instances as well as the errors made by the model for each of the objective field classes, without applying any threshold.
The table rows contain the predictions for the positive and negative classes, and the columns contain the actual instances in the testing dataset. You can transpose rows and columns by clicking the switcher shown in Figure 7.81. You can also download the confusion matrix in Excel format by clicking the option shown in Figure 7.81.
If you select a positive class in the selector, the confusion matrix cells are colored according to the True Positives (TP), False Positives (FP), True Negatives (TN) and False Negatives (FN). Hovering over a class in the table also shows these colors (Figure 7.82).
In the metrics below the confusion matrix you will see the performance measures for the class selected as the positive class, such as Precision, Recall, F-Measure, Phi Coefficient and Accuracy. If you select “All classes”, you will get the averages for the overall model. You can also compare these metrics against two other types of models: one that uses the mode as its prediction and another that predicts a random class of the objective field. At the very least, you would expect your model to outperform these weak benchmarks. (See Figure 7.83.)
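As a minimal sketch of how these per-class measures relate to the TP/FP/TN/FN counts in the confusion matrix, the following Python snippet computes them from illustrative counts (the numbers are made up, not BigML output):

```python
import math

def class_metrics(tp, fp, tn, fn):
    """Standard per-class measures derived from a 2x2 confusion matrix."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # Phi (Matthews) coefficient: +1 perfect, 0 random, -1 inverse.
    phi = ((tp * tn - fp * fn) /
           math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    return {"precision": precision, "recall": recall,
            "f_measure": f_measure, "accuracy": accuracy, "phi": phi}

# Illustrative counts for one positive class:
print(class_metrics(tp=40, fp=10, tn=45, fn=5))
```

Selecting “All classes” corresponds to averaging these per-class values across every class in the objective field.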
The confusion matrix will display up to five different classes. If your model contains more than five classes in the objective field, you can download the confusion matrix in Excel format by clicking the option shown in Figure 7.84 .
Evaluation curves
BigML offers four different evaluation curves: the ROC curve, the Precision-Recall curve, the Gain curve and the Lift curve. All of them are explained in detail in Evaluation curves.
You can select the positive class and the threshold for each curve and you will see the confusion matrix values and the metrics changing in real-time (see Figure 7.85 ).
The confusion matrices for the curves always have two columns and two rows, even if there are more than two classes in the objective field. This is because the positive class is the one predicted given a certain threshold; if the threshold is not met, the rest of the classes are aggregated into a single “Negative class” and predicted instead. Read more about positive and negative classes in Confusion Matrix.
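The collapse into a 2x2 matrix can be sketched as follows. This is an illustrative Python example, not BigML's implementation: each row pairs an actual class with the model's probability for the chosen positive class, and the threshold decides whether the positive class or the aggregated “Negative class” is predicted.

```python
def two_by_two(rows, positive_class, threshold):
    """Collapse multi-class results into TP/FP/TN/FN for one positive
    class and one probability threshold.

    rows: iterable of (actual_class, probability_of_positive_class).
    """
    tp = fp = tn = fn = 0
    for actual, prob in rows:
        predicted_positive = prob >= threshold   # threshold met -> positive
        actually_positive = actual == positive_class
        if predicted_positive and actually_positive:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif actually_positive:
            fn += 1
        else:
            tn += 1                              # any other class counts here
    return tp, fp, tn, fn

# Illustrative data: "setosa" vs. every other class at threshold 0.5.
rows = [("setosa", 0.9), ("setosa", 0.4), ("virginica", 0.7), ("versicolor", 0.2)]
print(two_by_two(rows, "setosa", 0.5))
```

Moving the threshold slider in the Dashboard re-runs exactly this kind of recount, which is why the confusion matrix values and metrics change in real time.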
From each curve visualization you can:
Download the confusion matrix in Excel format
Download the chart in PNG format with or without legends.
7.5.3 Regression Evaluations
You can visualize the regression measures in the green boxed histograms: Mean Absolute Error, Mean Squared Error and R Squared. Read more about regression measures in subsection 7.2.2.
By default, BigML provides the measures of two other types of models to compare against your model performance. One of them uses the mean as its prediction and the other predicts a random value in the range of the objective field. At the very least, you would expect your model to outperform these weak benchmarks. You can remove the benchmarks by clicking the icons shown in Figure 7.88 .
7.5.4 Cross-Validation Visualization
Since cross-validation measures are calculated by averaging the \(k\) single evaluations, you will not find the confusion matrix and evaluation curves explained in subsection 7.5.2. Instead, the main metrics are plotted in different histograms. Additionally, you can see the standard deviation per measure under the sigma symbol icon. You can find a cross-validation example for a classification model below.
At the top of the view, you can find the list of the single evaluations used to compute the cross-validation measures by clicking on the Evaluations panel. (See Figure 7.90 .)
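The averaging over the \(k\) single evaluations can be sketched as below. The per-fold accuracies are made up for illustration, and whether BigML reports the population or the sample standard deviation is an assumption here (the sketch uses the population form):

```python
import statistics

# Illustrative per-fold accuracies for a k = 5 cross-validation
# (not real BigML output).
fold_accuracies = [0.81, 0.79, 0.84, 0.80, 0.82]

# The reported cross-validation measure is the mean over the k folds...
mean_accuracy = statistics.mean(fold_accuracies)

# ...and the value under the sigma icon is its spread across folds.
# Population standard deviation is an assumption, not documented behavior.
std_accuracy = statistics.pstdev(fold_accuracies)

print(mean_accuracy, std_accuracy)
```

The same averaging applies to every measure shown in the histograms, one mean and one standard deviation per metric.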