Classification and Regression with the BigML Dashboard

7.4 Evaluation Configuration Options

BigML allows you to configure different options for single evaluations and for cross-validation evaluations. For cross-validation, the configuration options are called inputs.

In the following subsections you will find an explanation of each single evaluation configuration option. The last subsection (subsection 7.4.5) details the cross-validation inputs.

For single evaluations, all the options are found under the configuration panel that appears once you select the model and the testing dataset, as shown in Figure 7.70.

\includegraphics[]{images/evaluations/eval-config-panel}
Figure 7.70 Evaluation configuration panel

Note: some configuration options may change depending on whether you are evaluating a model, an ensemble, a logistic regression, a deepnet, or a fusion.

7.4.1 Missing Strategies

This option is only available for models, ensembles, and fusions that contain models and/or ensembles; for logistic regressions and deepnets the option to include missing values is configured when training the model.

When the testing dataset contains instances with missing values for models and ensembles, BigML can take two different approaches to return a prediction for those instances:

  • Last prediction: when a missing value is found in the testing data for a decision node, the prediction returned is the one from the parent node of the missing split. (See Figure 7.71.)

    \includegraphics[width=2cm]{images/evaluations/icon_missing_strategic2}
    Figure 7.71 Last prediction icon
  • Proportional missing strategy: when a missing value is found in the testing data for a decision node, the prediction is calculated taking into account the predictions of all subtrees, weighted by the proportion of data in each subtree. (See Figure 7.72.)

    \includegraphics[width=2cm]{images/evaluations/icon_missing_strategic1}
    Figure 7.72 Proportional missing strategy icon
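The difference between the two strategies can be sketched in a few lines of plain Python (an illustration only, not BigML's implementation; the toy tree, its field, and its counts are invented):

```python
# Toy decision node: it splits on an "age" field, carries its own majority
# prediction, and records how many training rows reached each branch.
node = {
    "prediction": "no",  # the parent node's own (majority) prediction
    "children": [
        {"test": lambda v: v <= 30, "prediction": "yes", "count": 40},
        {"test": lambda v: v > 30, "prediction": "no", "count": 60},
    ],
}

def last_prediction(node, value):
    """Missing value -> return the parent node's own prediction."""
    if value is None:
        return node["prediction"]
    for child in node["children"]:
        if child["test"](value):
            return child["prediction"]

def proportional(node, value):
    """Missing value -> weight each subtree's prediction by its data share."""
    if value is not None:
        return last_prediction(node, value)
    total = sum(c["count"] for c in node["children"])
    dist = {}
    for c in node["children"]:
        dist[c["prediction"]] = dist.get(c["prediction"], 0) + c["count"] / total
    return dist  # a class distribution instead of a single label

print(last_prediction(node, None))  # no
print(proportional(node, None))     # {'yes': 0.4, 'no': 0.6}
```

With Last prediction a missing value simply falls back to the parent node's label, while the proportional strategy keeps the class distribution of all subtrees, which can then be reduced to a final prediction.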

BigML uses Last prediction as the default strategy. Select your preferred option by clicking the corresponding icon shown in Figure 7.73.

\includegraphics[]{images/evaluations/missing_strategies}
Figure 7.73 Missing strategies for evaluations

Note: if the model or ensemble has been trained with Missing splits (learn more about this parameter in the Ensembles chapter, subsection 1.4.4), the Missing strategy may not have any impact, since missing values would already be considered valid values by the model or the ensemble.

7.4.2 Select the voting strategy for Decision Forests

For Decision Forests ensembles, you can select the voting strategy you want to use to calculate the evaluation. See a complete explanation of the three options that BigML provides in Combine single tree predictions: probability, confidence or votes. Selecting a given voting strategy also determines the threshold used to calculate the evaluation curves (see Confidence, Probability and Vote Thresholds).

  • Probability: For classification ensembles, per-class probabilities are averaged across all the trees composing the ensemble, and the class with the highest probability is the winner. For regression ensembles, the probability option averages the predictions of the trees composing the ensemble.

  • Confidence: For classification ensembles, per-class confidences are averaged across all the trees composing the ensemble, and the class with the highest confidence is the winner. For regression ensembles, the confidence option averages the predictions of the trees composing the ensemble.

  • Votes: For classification ensembles, each tree prediction counts as one vote, and the “votes” score of a given class is the percentage of trees in the ensemble that vote for that class. For regression ensembles, the votes option averages the predictions of the trees composing the ensemble, which gives the same results as the probability option.
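To make the combination rules concrete, here is a minimal plain-Python sketch of the probability and votes strategies for a classification ensemble (the per-tree probability distributions are invented for illustration; this is not BigML's internal code):

```python
# Per-tree class probability distributions for one test instance
# (a toy classification ensemble with three trees).
tree_probabilities = [
    {"red": 0.6, "blue": 0.4},
    {"red": 0.3, "blue": 0.7},
    {"red": 0.8, "blue": 0.2},
]

def by_probability(per_tree):
    """Average per-class probabilities across trees; highest average wins."""
    classes = per_tree[0].keys()
    avg = {c: sum(t[c] for t in per_tree) / len(per_tree) for c in classes}
    return max(avg, key=avg.get), avg

def by_votes(per_tree):
    """Each tree casts one vote for its top class; report vote shares."""
    votes = {}
    for t in per_tree:
        winner = max(t, key=t.get)
        votes[winner] = votes.get(winner, 0) + 1
    shares = {c: n / len(per_tree) for c, n in votes.items()}
    return max(shares, key=shares.get), shares

print(by_probability(tree_probabilities)[0])  # red (average ~0.57 vs ~0.43)
print(by_votes(tree_probabilities)[0])        # red (2 of 3 trees vote red)
```

Both strategies agree on the winner here, but the scores they attach to each class differ, which is why the chosen strategy also changes the thresholds used for the evaluation curves.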

\includegraphics[]{images/evaluations/threshold-type}
Figure 7.74 Select the voting strategy to calculate the evaluation

Note: for single decision tree evaluations you can also choose between probabilities and confidences to calculate the thresholds of the evaluation curves.

7.4.3 Fields Mapping

You can specify which fields in the model, ensemble, logistic regression, deepnet, or fusion match the fields in the testing dataset. BigML automatically matches fields by name, but you can also set an automatic match by field ID by clicking on the green switcher shown in Figure 7.75. You can also manually search for fields or remove them from the Dataset fields column if you do not want them to be considered during the evaluation.

\includegraphics[]{images/evaluations/fields_mapping}
Figure 7.75 Fields Mapping for evaluations

Note: the fields mapping from the BigML Dashboard has a limit of 200 fields. For evaluations with a higher number of fields, you can use the BigML API to map your fields.

7.4.4 Sampling your Dataset

Sometimes you do not need all the data contained in your testing dataset to generate your evaluations. If you have a very large dataset, sampling may be a good way of getting faster results. BigML allows you to take a sample before creating an evaluation so you do not need to create a different dataset. You can configure the sampling options detailed in the following subsections. (See Figure 7.76 .)

Rate

The Rate is the proportion of instances to include in your sample. You can set any value between 0% and 100%. Defaults to 100%.

Range

Specifies a subset of instances from which to sample, e.g., choose from instance 1 to 200. The Rate you set will be computed over the configured Range. This option may be useful when you have temporal data and you want to train your model with historical data and test it with the most recent data to check whether the model can predict based on time.

Sampling

By default, BigML selects your instances for the sample by using a random number generator, which means two samples from the same dataset will likely be different even when using the same rates and row ranges. If you choose deterministic sampling, the random-number generator will always use the same seed, thus producing repeatable results. This lets you work with identical samples from the same dataset.

Replacement

Sampling with replacement allows a single instance to be selected multiple times. Sampling without replacement ensures that each instance cannot be selected more than once. By default, BigML generates samples without replacement.

Out of bag

This option creates a sample containing only out-of-bag instances for the currently defined rate. If an instance is not selected as part of a sample, it is considered out of bag. Thus, the final total percentage of instances in your sample will be 100% minus the rate configured for your sample (when replacement is false). This can be useful for splitting a dataset into training and testing subsets. It is only selectable when the sample rate is less than 100%.
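The four sampling options can be sketched together in plain Python (an illustration only; the function name and its arguments are invented to mirror the Dashboard options, and this is not a BigML API call):

```python
import random

def take_sample(n_rows, rate, replacement=False, out_of_bag=False, seed=None):
    """Return row indices for a sample, mimicking the options above.

    rate: proportion of rows to draw (0.0 to 1.0).
    replacement: allow the same row to be picked more than once.
    out_of_bag: return the rows NOT selected instead (needs rate < 1.0).
    seed: set a fixed value for deterministic, repeatable sampling.
    """
    rng = random.Random(seed)
    k = round(n_rows * rate)
    if replacement:
        picked = [rng.randrange(n_rows) for _ in range(k)]
    else:
        picked = rng.sample(range(n_rows), k)
    if out_of_bag:
        return sorted(set(range(n_rows)) - set(picked))
    return sorted(picked)

# Deterministic 80%/20% train/test split of a 10-row dataset:
# the same seed plus out_of_bag=True yields exactly the complement.
train = take_sample(10, 0.8, seed=42)
test = take_sample(10, 0.8, seed=42, out_of_bag=True)
assert len(train) == 8 and len(test) == 2
assert set(train).isdisjoint(test)
```

Note how the train/test split only works because the seed is fixed: with a fresh random seed, the out-of-bag rows would not be the complement of the sampled ones.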

\includegraphics[]{images/evaluations/sampling_params}
Figure 7.76 Sampling options for evaluations

7.4.5 Cross-Validation Configuration Options

Depending on the cross-validation script, you will be able to configure different inputs. As mentioned in subsection 7.3.6, you can find three different types of cross-validation scripts in the BigML Gallery. Each one is explained separately in the following subsections.

Basic 5-fold cross-validation

The basic 5-fold cross-validation script has only one configurable input: the dataset-id. (See Figure 7.77.) Click on the selector and search for the dataset by name. Once you have selected the dataset, you can execute the script; it automatically splits your dataset into \(k=5\) subsets, creating five different models and five evaluations using default inputs. The next subsection, Model’s k-fold cross-validation, explains the default inputs for models.

\includegraphics[]{images/evaluations/cross-val-basic-config}
Figure 7.77 Select a dataset to execute a basic 5-fold cross-validation

Model’s k-fold cross-validation

You can configure the following inputs for your model’s k-fold cross-validation (see Figure 7.78 ):

  • dataset-id: select an existing dataset by clicking on the selector and searching the dataset by name.

  • k-folds: set the number of partitions for your dataset. Your dataset will be split into \(k\) subsets, creating \(k\) different models and \(k\) different evaluations. By default, \(k=5\); \(k=10\) is also commonly used.

  • objective-id: set the ID of the field you want to be the Objective Field. You can find the field ID in your dataset view by mousing over the field type. It can be a categorical or numeric field. If no ID is given, BigML takes the dataset default objective field. (See subsection 1.4.1 .)

  • missing-splits: set this input to true to consider missing values as valid values in your model. By default, it is set to false. (See subsection 1.4.4 .)

  • stat-pruning: apply statistical pruning to all the tree nodes in order to avoid Overfitting. By default, this option is disabled and the smart pruning is enabled. (See subsection 1.4.3 .)

  • balance-objective: enable this option to let BigML automatically balance the classes of the objective field. This is only available for classification models. By default this option is disabled. (See Balance Objective .)

  • weight-field: weigh the instances considering the values of one field. You need to input the field ID. The selected field must be numerical and it must not contain missing values. This is valid for both regression and classification models. (See Weight Field .)

  • objective-weights: set a specific weight for each class of the objective field. You need to list the weights in the same order that classes are found in your dataset histogram, e.g., to weight red, blue and yellow classes you need to input [2,3,1]. If a class is not listed, it is assumed to have a weight of 1. Weights of 0 are also valid. This option is only available for classification models. (See Objective Weights .)

  • node-threshold: set a threshold for the nodes so the model stops growing. You can set a value between 3 and 2,000. By default it is set to -1 which indicates that no threshold applies. This parameter is also useful to avoid overfitting. (See subsection 1.4.5 .)
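The k-fold mechanics behind these inputs can be sketched in plain Python (an illustration, not the actual WhizzML script): the dataset rows are partitioned into \(k\) folds, and each fold in turn serves as the test set while the remaining folds train a model.

```python
def k_fold_splits(indices, k=5):
    """Yield (train, test) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]  # round-robin partition
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i
                 for idx in fold]
        yield train, test

# A 100-row dataset with the default k=5: five 80/20 splits,
# and every row appears in exactly one test fold.
rows = list(range(100))
for train, test in k_fold_splits(rows, k=5):
    assert len(train) == 80 and len(test) == 20
    assert set(train).isdisjoint(test)
```

Each of the \(k\) splits produces one model and one evaluation; the cross-validation result averages the \(k\) evaluation measures.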

\includegraphics[]{images/evaluations/cross-val-model-config}
Figure 7.78 Model’s k-fold cross-validation configuration

Ensemble’s k-fold cross-validation

You can configure the following inputs for your ensemble’s k-fold cross-validation (see Figure 7.79 ):

  • dataset-id: select an existing dataset by clicking on the selector and searching the dataset by name.

  • k-folds: set the number of partitions for your dataset. Your dataset will be split into \(k\) subsets, creating \(k\) different ensembles and \(k\) different evaluations. By default, \(k=5\); \(k=10\) is also commonly used.

  • objective-id: set the ID of the field you want to be the Objective Field. You can find the field ID in your dataset view by mousing over the field type. It can be a categorical or numeric field. If no ID is given, BigML takes the dataset default objective field. (See subsection 2.4.1.)

  • number-of-models: total number of models in the ensemble. You can choose between 2 and 1,000 models. By default it is set to 10 models. (See subsection 2.4.4 .)

  • missing-splits: set this input to true to consider missing values as valid values in your ensemble. By default it is set to false. (See Missing Splits .)

  • stat-pruning: apply statistical pruning to all the tree nodes in order to avoid Overfitting. By default this option is disabled and the smart pruning is enabled. (See subsection 1.4.3 .)

  • balance-objective: enable this option to let BigML automatically balance the classes of the objective field. This is only available for classification ensembles. By default this option is disabled. (See Balance Objective.)

  • weight-field: weigh the instances considering the values of one field. You need to input the field ID. The selected field must be numerical and it must not contain missing values. This is valid for both regression and classification ensembles. (See Weight Field .)

  • objective-weights: set a specific weight for each class of the objective field. You need to list the weights in the same order that classes are found in your dataset histogram, e.g., to weight red, blue and yellow classes you need to input [2,3,1]. If a class is not listed, it is assumed to have a weight of 1. Weights of 0 are also valid. This option is only available for classification ensembles. (See Objective Weights .)

  • node-threshold: when the number of computed nodes is greater than this threshold, model growth stops. You can set a value between 3 and 2,000. By default it is set to -1 which indicates that no threshold applies. This parameter is useful to avoid overfitting. (See Node Threshold .)

  • sample-rate: set a percentage of the dataset to build each single tree. By default it is set to 100% with replacement. (See Rate .)

  • replacement: enable this parameter to allow a single instance to be selected multiple times. Sampling without replacement ensures that each instance cannot be selected more than once. For ensembles, you need to set this parameter to true if your sampling rate is 100% to ensure single trees are built using different subsets of your dataset. (See Replacement .)

  • randomize: set this input to true to use the Random Decision Forests algorithm instead of Bagging to build the ensemble. (See subsection 2.4.3.)

  • seed: write any character to randomize the ensemble sampling. If you leave this input blank, the random-number generator will always use the same seed, producing repeatable results. By default it is configured to create random samples. (See subsection 2.4.9.)
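For reference, the inputs above could be collected into a single input map before executing the script programmatically. This is a hedged sketch only: the name/value layout follows the input names listed above, the values are illustrative, and the dataset ID is a placeholder rather than a real resource.

```python
# Hypothetical input map for the ensemble's k-fold cross-validation script.
# Every name matches an input described above; values are illustrative only,
# and the dataset ID is a placeholder, not a real BigML resource.
inputs = [
    ["dataset-id", "dataset/0123456789abcdef01234567"],  # placeholder ID
    ["k-folds", 10],            # 10 ensembles, 10 evaluations
    ["number-of-models", 20],   # trees per ensemble
    ["sample-rate", 0.8],       # 80% of the dataset per tree
    ["replacement", True],      # rows may be picked more than once
    ["randomize", True],        # Random Decision Forests instead of Bagging
    ["seed", "my test seed"],   # any character randomizes the sampling
]

# Sanity check: every input name is one of the inputs documented above.
valid = {"dataset-id", "k-folds", "objective-id", "number-of-models",
         "missing-splits", "stat-pruning", "balance-objective",
         "weight-field", "objective-weights", "node-threshold",
         "sample-rate", "replacement", "randomize", "seed"}
assert all(name in valid for name, _ in inputs)
```

Inputs omitted from the map keep the defaults described above, so a minimal map with just the dataset-id is also valid.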

\includegraphics[]{images/evaluations/cross-val-ensemble-config}
Figure 7.79 Ensemble’s k-fold cross-validation configuration