Sources with the BigML Dashboard
6.2 Single Field or Multiple Fields
The Single Field or Multiple Fields switch lets you tell BigML whether your source consists of a single field of type items.
6.2.1 Auto-detection of single, item-type fields
Sources containing a field of type items may be submitted without surrounding quotes, in which case the input appears to have a varying number of columns from row to row. Figure 6.2 shows an excerpt of a single-field source. BigML attempts to detect this case rather than assume a “square” CSV format with a large number of bad rows (see Figure 6.3). The criteria are as follows:
The proportion of rows whose column count differs from the most frequent column count is greater than 0.25.
No item is a missing value.
No item is longer than 64 characters.
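The three criteria above can be sketched in a few lines of Python. This is only an illustration of the stated rules, not BigML's actual implementation: the function name, the comma as candidate separator, the treatment of an empty string as a missing value, and the trimming of whitespace around items are all assumptions made for the example.

```python
import csv
from collections import Counter
from io import StringIO


def looks_like_single_items_field(text, separator=",",
                                  max_item_len=64, min_ragged=0.25):
    """Hypothetical check mirroring the three criteria listed above."""
    # Parse the raw text as if it were a multi-column CSV; skip blank lines.
    rows = [row for row in csv.reader(StringIO(text), delimiter=separator)
            if row]
    if not rows:
        return False

    # Criterion 1: more than 25% of rows differ from the most frequent
    # column count.
    counts = Counter(len(row) for row in rows)
    rows_with_modal_count = counts.most_common(1)[0][1]
    ragged_proportion = 1 - rows_with_modal_count / len(rows)
    if ragged_proportion <= min_ragged:
        return False

    items = [item.strip() for row in rows for item in row]

    # Criterion 2: no missing values among the items
    # (assumption: an empty string counts as missing).
    if any(item == "" for item in items):
        return False

    # Criterion 3: no item longer than 64 characters.
    return all(len(item) <= max_item_len for item in items)


baskets = "milk,bread\neggs\nbeer,chips,salsa\ntea,milk\n"
square = "a,b,c\nd,e,f\ng,h,i\n"
print(looks_like_single_items_field(baskets))
print(looks_like_single_items_field(square))
```

In this sketch, the `baskets` text has rows of 2, 1, 3, and 2 columns, so half of its rows differ from the most frequent count and it passes all three checks, whereas the `square` text has a constant column count and fails the first one.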
When a single-column source is detected, its separator is set to the empty string (""), since a separator has no meaning unless there are at least two columns to separate. You can also indicate that a source consists of a single column by setting the separator to the empty string ("") yourself.
Conversely, an erroneous single-column auto-detection can be overridden by updating the source and setting an items separator that is not the empty string.
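For readers who update sources programmatically rather than through the Dashboard switch, the two cases might be expressed as update bodies like the ones built below. This is a hedged sketch: the helper name is hypothetical, and the `source_parser` and `separator` field names are assumptions about the shape of a source update, so check them against the API documentation before relying on them. The code only builds JSON and sends no request.

```python
import json


def separator_update_body(separator):
    """Build a JSON body for a source update (hypothetical helper).

    Assumption: the separator lives under a "source_parser" object.
    """
    return json.dumps({"source_parser": {"separator": separator}})


# Declare a single-column source: the separator is the empty string.
force_single_field = separator_update_body("")

# Override an erroneous single-column detection: any non-empty separator.
force_multiple_fields = separator_update_body(",")

print(force_single_field)
print(force_multiple_fields)
```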