Sources with the BigML Dashboard

4.3 Image Composite Sources

When the component sources of a composite source are all image sources (See section 2.4 ), that is, they all have the format “Image” (See section 2.2 ), the composite source will also have the format “Image”. This is an image composite source.

An image composite source represents a collection of images and is more useful in machine learning than a single image. While a single image can be used for prediction, a machine learning model is trained on a collection of images, not a single image. A dataset intended for image applications contains many images, with each image as an instance in the dataset.

While single image sources cannot be used to create datasets, image composite sources can. In a dataset created from an image composite source, every row corresponds to one of the images in the composite source and has a column holding the image data (type image), a second column holding its filename (type path), and possibly a third column holding its label.

Note: As stated before, BigML allows up to 445,000 components in a composite source. Hence an image composite source can have at most 445,000 images.

\includegraphics[]{images/sources/source-list-image-composites}
Figure 4.10 Image composite sources

4.3.1 Automatic Image Labels

Image labels are important in machine learning and are indispensable in image classification. A common industry practice is to group images into folders and use the folder names as their labels. BigML accommodates this practice by automatically labeling images from their folders.

When an image composite source is created by uploading an archive file, it will in most cases have three fixed fields plus a set of autogenerated image fields (see subsection 4.3.2). The three fixed fields are inherited from the single image sources as follows:

  • image_id is a field with optype image that contains the identifier of the corresponding single image source.

  • filename is a field with optype path that contains the path inside of the archive file for each image file.

  • label is a categorical field whose values are extracted from the filenames. More specifically, they correspond to the innermost directory segment of the filename.

    For instance, the label is “foo” for “bar/baz/foo/image.jpg” or is missing for “another_image.png”.

The label field is omitted in the following cases:

  • If BigML doesn’t detect any directory names in the filenames.

  • If the source creation request is made via API and it includes the property disable_autolabel which is set to the Boolean value true.
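
The labeling rule described above can be illustrated with a short Python sketch. It is not BigML's implementation: the function names label_from_filename and has_label_field are hypothetical, and the sketch only mirrors the stated rules (the label is the innermost directory segment, and the label field is omitted when no directory names are found or when disable_autolabel is true).

```python
import posixpath
from typing import Iterable, Optional


def label_from_filename(filename: str) -> Optional[str]:
    """Return the innermost directory segment of an archive path, or None."""
    directory = posixpath.dirname(filename)
    # An empty result means the path has no directory, so the label is missing.
    return posixpath.basename(directory) or None


def has_label_field(filenames: Iterable[str], disable_autolabel: bool = False) -> bool:
    """Tell whether a label field would be expected under the rules above."""
    if disable_autolabel:
        return False  # mirrors the API property being set to true
    return any(label_from_filename(name) is not None for name in filenames)


print(label_from_filename("bar/baz/foo/image.jpg"))
print(label_from_filename("another_image.png"))
print(has_label_field(["grape/092_0020.jpg", "strawberry/image_0032.jpg"]))
```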

Zip file grape-strawberry-dir.zip has two folders named grape and strawberry:


Archive:  grape-strawberry-dir.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
        0  11-19-2020 22:05   grape/
    44691  01-04-2006 21:33   grape/092_0020.jpg
    13527  01-04-2006 21:33   grape/092_0008.jpg
    37413  02-18-2006 16:46   grape/092_0009.jpg
    ......
        0  11-19-2020 22:05   strawberry/
    10378  05-17-2020 20:57   strawberry/image_0032.jpg
     8912  05-17-2020 20:57   strawberry/image_0026.jpg
     8717  05-17-2020 20:57   strawberry/image_0027.jpg
     ......

Each folder contains a number of images. Once this zip is uploaded to the BigML Dashboard, an image composite source will be created:

\includegraphics[]{images/sources/source-folder-labels}
Figure 4.11 Image labels automatically generated from folder names

As shown, image labels will be automatically generated from their folder names.
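
Before uploading an archive, it can help to check which labels its folder structure would produce. The following is a minimal sketch that uses only the Python standard library. The function name count_folder_labels is hypothetical, and the in-memory archive holds placeholder bytes rather than real images.

```python
import io
import posixpath
import zipfile
from collections import Counter


def count_folder_labels(archive) -> Counter:
    """Count files per innermost folder name in a zip archive."""
    counts = Counter()
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue  # skip directory entries such as "grape/"
            folder = posixpath.basename(posixpath.dirname(info.filename))
            if folder:
                counts[folder] += 1
    return counts


# Build a tiny in-memory archive that imitates the folder layout above.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    for name in ("grape/092_0020.jpg", "grape/092_0008.jpg", "strawberry/image_0032.jpg"):
        zf.writestr(name, b"placeholder")
buffer.seek(0)

label_counts = count_folder_labels(buffer)
print(dict(label_counts))
```

The same function accepts a path to a zip file on disk in place of the in-memory buffer.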

4.3.2 Extracted Image Features

When an image composite source is created, BigML analyzes the images and automatically generates a set of numeric features for each image. Those features appear as new, added fields in the composite source. While those fields can be viewed in the source, their values cannot be changed because they are generated fields.

\includegraphics[]{images/sources/source-image-features}
Figure 4.12 Extracted image features in a composite source

Image features are generated only for image composite sources, not for single image sources.

By default, BigML extracts 234 features per image, representing its histogram of gradients. You can see the number of fields in the description string under the source name:

\includegraphics[]{images/sources/source-list-image-feature-description}
Figure 4.13 Image feature fields in the description string

In total, BigML offers five sets of extracted image features:

  • Dimensions

  • Average pixels

  • Level histogram

  • Histogram of gradients

  • Wavelet subbands

\includegraphics[]{images/sources/source-image-feature-config}
Figure 4.14 Options of image features in the source configuration

These features are all numeric values describing the structure and contents of the images in the composite source.

BigML also offers five pre-trained convolutional neural networks (CNNs):

  • MobileNet

  • MobileNetV2

  • ResNet-18

  • ResNet-50 (Available to customers of virtual private clouds and private deployments)

  • Xception (Available to customers of virtual private clouds and private deployments)

Users can configure and select different combinations of image features in the Source Configuration Options (Chapter 6 ). Different configurations are reflected by the numbers of fields in the description strings under the source names:

\includegraphics[]{images/sources/source-list-image-feature-description-more}
Figure 4.15 Different image feature fields in the description string

Please see “Image Analysis” (section 6.9 ) for the detailed explanation of image features and how to configure them.

By exposing those image features as explicit fields in your datasets, they can be used by all machine learning algorithms and models on the BigML platform, just as any other feature.

4.3.3 Image Composite Views

On the Dashboard, there are three views for an image composite source. When users click on an image composite source in the source list view, it opens in the fields view.

Fields View

A fields view lists all the fields in the composite source:

\includegraphics[]{images/sources/source-composite-fields-view}
Figure 4.16 Fields view of an image composite source

As seen in the figure above, an image comes with two fields, with the following icons representing their types:

\includegraphics[width=2.5cm]{images/sources/source-datatype-icon-image}
Figure 4.17 Image field type
\includegraphics[width=2.5cm]{images/sources/source-datatype-icon-path}
Figure 4.18 Path field type

When the images are organized in folders whose names are intended as their labels (see subsection 4.3.1), an image comes with three fields. In addition to the two fields “image_id” and “filename” shown above, whose respective types are image and path, it has a third field named “label”, whose type is categorical.

\includegraphics[]{images/sources/source-composite-fields-view-label}
Figure 4.19 Fields view of an image composite source with a label field

Users can preview all the fields, including the images, by hovering the mouse over the image IDs:

\includegraphics[]{images/sources/source-composite-fields-view-image}
Figure 4.20 Preview of an image in an image composite source

By default, all fields of the extracted image features are hidden. That’s why only two image fields are shown in Figure 4.20 . However, you can click on the “show image features” icon next to the search box as shown below:

\includegraphics[]{images/sources/source-composite-fields-view-features-icon}
Figure 4.21 Icon to click to show image feature fields

Then, you can preview all fields, including the image features. At this point, when all fields are shown, you can click on the same icon to hide the fields of image features.

\includegraphics[]{images/sources/source-composite-fields-view-all-fields}
Figure 4.22 Preview of all fields of a composite source

This fields view of a composite source is equivalent to the source view of a non-composite source (Figure 1.8 ).

Again, in a fields view, BigML transposes rows and columns compared to the original data. That is, each row is associated with one of the fields of the original data, and each column shows the corresponding values of an instance. It becomes much easier to navigate them using a web browser if they are arranged this way when sources contain hundreds or thousands of fields.

A fields view only shows the first 25 instances of the data. The main goal of this view is to help quickly identify if BigML is parsing the data correctly.
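
The transposition can be pictured with a small Python sketch. It only illustrates the layout and is not how the Dashboard is implemented. The function name fields_view and the image_id values are hypothetical placeholders.

```python
from typing import Any, Dict, List

PREVIEW_INSTANCES = 25  # the fields view shows only the first 25 instances


def fields_view(rows: List[Dict[str, Any]], limit: int = PREVIEW_INSTANCES) -> Dict[str, List[Any]]:
    """Transpose instance rows so that each entry is one field with its values."""
    preview = rows[:limit]
    field_names: List[str] = []
    for row in preview:
        for name in row:
            if name not in field_names:
                field_names.append(name)
    # One row per field, one column per previewed instance.
    return {name: [row.get(name) for row in preview] for name in field_names}


# Hypothetical instances; the image_id values are placeholders.
instances = [
    {"image_id": "image-0001", "filename": "grape/092_0020.jpg", "label": "grape"},
    {"image_id": "image-0002", "filename": "strawberry/image_0032.jpg", "label": "strawberry"},
]
for field_name, values in fields_view(instances).items():
    print(field_name, values)
```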

Using the tabs on top of the field list, users can switch to the other two views.

\includegraphics[]{images/sources/source-composite-fields-view-tabs}
Figure 4.23 Tabs for switching views

Sources View

A sources view lists all the component sources in a composite source:

\includegraphics[]{images/sources/source-composite-sources-view}
Figure 4.24 Sources view of an image composite source

It’s essentially a source list view inside a composite source. Users can click on each component source to view its details. Besides viewing the information of component sources, users can select them to perform the following operations in an open composite source:

\includegraphics[]{images/sources/source-composite-sources-view-select}
Figure 4.25 Selecting component sources of a composite source

  • Delete sources: Delete the selected component sources. This will not only remove the component sources from the composite source, but also delete them from the platform permanently.

  • Exclude sources: Exclude the selected component sources. This will exclude them from the composite source, but they will stay as individual sources on the platform and won’t be deleted.

  • Create composite: Create a new composite source using the selected sources as its component sources.

For a closed composite source, users can only select component sources to perform Create composite .
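
The difference between deleting and excluding component sources can be summarized with a toy model in Python. It illustrates the semantics described above and does not represent BigML's API. The class CompositeSketch and the source identifiers are hypothetical, and treating a newly created composite as open is an assumption of this sketch.

```python
class CompositeSketch:
    """Toy model of a composite source and the platform it lives on."""

    def __init__(self, components, platform, closed=False):
        self.components = set(components)
        self.platform = platform  # shared set of every source id on the platform
        self.closed = closed

    def _require_open(self):
        if self.closed:
            raise PermissionError("a closed composite only allows 'Create composite'")

    def delete(self, selected):
        """Remove from the composite and permanently from the platform."""
        self._require_open()
        self.components -= set(selected)
        self.platform -= set(selected)

    def exclude(self, selected):
        """Remove from the composite only; the sources stay on the platform."""
        self._require_open()
        self.components -= set(selected)

    def create_composite(self, selected):
        """Allowed for open and closed composites alike."""
        return CompositeSketch(set(selected) & self.components, self.platform)


platform_sources = {"src-a", "src-b", "src-c"}
composite = CompositeSketch(platform_sources.copy(), platform_sources)
composite.exclude(["src-a"])  # still on the platform
composite.delete(["src-b"])   # gone from the platform as well
print(sorted(composite.components), sorted(platform_sources))
```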

When making selections, users can use the “Select all sources” checkbox on the top right to select all component sources. They can also use the “Search by name” box, which acts as a name filter. That is, when a text string is typed into the box, all component sources whose names contain the string are shown and available for selection.

Images View

An images view shows all the images in the composite source.

\includegraphics[]{images/sources/source-composite-images-view}
Figure 4.27 Images view of an image composite source

For a closed image composite source, users can only view images, and they can load new images by clicking on the Get new preview button on the bottom right.

\includegraphics[]{images/sources/source-closed-composite-images-view}
Figure 4.28 Images view of a closed image composite source

In the images view of an open image composite source, besides viewing all images, users can select them to perform certain operations. When making selections, users can use the “Select all images” checkbox on the top right to select all images. They can also use the “Search by name” box, which acts as a name filter. That is, when a text string is typed into the box, all images whose names contain the string are shown and available for selection.

After the images are selected, the following operations can be performed on them:

  • Delete images: Delete the selected images. This will not only remove the images from the composite source, but also delete them from the platform permanently.

  • Exclude images: Exclude the selected images. This will exclude them from the composite source, but they will stay as individual image sources on the platform and won’t be deleted.

  • Create composite: Create a new composite source using the selected images as its component sources.

  • Label images: Give a label to the selected images.

When labeling images, there has to be a label field. When no such field exists, there will be a prompt to create one:

\includegraphics[]{images/sources/source-composite-images-view-no-label}
Figure 4.30 Prompt if no field for labels

There are two ways to create a label field.

  • First, on top of the images, on the left, is a “Label field” textbox. Clicking on the “+” next to it opens a dialog box asking for the field name and the field type:

    \includegraphics[]{images/sources/source-composite-images-view-new-label}
    Figure 4.31 Create a new label

    After entering the field name and selecting its type from the dropdown, click the Add button to create the label field.

    When there are one or more labels in the composite source, clicking on the “x” next to the “Label field” textbox will remove them.

    \includegraphics[]{images/sources/source-composite-images-view-remove-label}
    Figure 4.32 Remove a label

    Users will get a modal box to confirm the removal:

    \includegraphics[]{images/sources/source-composite-images-view-remove-label-confirm}
    Figure 4.33 Confirm to remove a label

  • The second way to create a label field is to use the Source Configuration Panel of the composite source, which can be reached by following the link in the prompt (Figure 4.30).

    \includegraphics[]{images/sources/source-config-new-field}
    Figure 4.34 Adding a field in source configuration

    As shown above, there is an Add new label field button above the field list. Clicking on it will add a new field at the bottom of the field list. This is especially useful for creating multiple labels. After entering field names and selecting field types, users have to click on the Update button to save the changes.

After a label field is created, users can go to the images view, select images and label them by giving the label field a value, such as “grape” shown below:

\includegraphics[]{images/sources/source-composite-images-view-label-images}
Figure 4.35 Labeling images
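
The select-then-label workflow of the images view can be sketched in Python as follows. This is only an illustration of the steps, not Dashboard code. The function names are hypothetical, and the case-sensitive substring match is an assumption about how the “Search by name” filter behaves.

```python
from typing import Dict, List


def select_by_name(images: List[Dict[str, str]], text: str) -> List[Dict[str, str]]:
    """Keep the images whose filename contains the typed string (assumed case-sensitive)."""
    return [image for image in images if text in image["filename"]]


def label_images(selected: List[Dict[str, str]], label_field: str, value: str) -> None:
    """Give every selected image the same value in the chosen label field."""
    for image in selected:
        image[label_field] = value


images = [
    {"filename": "092_0020.jpg"},
    {"filename": "092_0008.jpg"},
    {"filename": "image_0032.jpg"},
]
label_images(select_by_name(images, "092_"), "label", "grape")
print(images)
```

Images that were not selected keep no value in the label field until they are labeled in a later pass.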

4.3.4 Importing Label Fields to Image Composites

Label fields can be added to image composites by importing them from CSV or JSON files. This is very useful because sometimes it’s easier to enter labels in a different file, or the labels are prepared separately from the images in the business workflow. Moreover, additional label fields can provide more information about the images, such as captions, comments, or geo-coordinates, which can be stored in the CSV or JSON files.

In this context, CSV or JSON files provide information about the images in a table format, so here we call these files table files.

Importing label fields is basically a join operation, which combines rows from the image source and the table source. The join is based on a related field between the two sources, and the related field is the image filename or path.
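
As a rough illustration of this join, the following Python sketch combines image rows with the rows of a CSV table file that share the same filename. It is written under several assumptions: the function name import_label_fields, the column names filename, caption, and quality, and the choice to leave unmatched images with empty values are all hypothetical and do not describe BigML's actual behavior.

```python
import csv
import io
from typing import Dict, List, Optional


def import_label_fields(
    image_rows: List[Dict[str, Optional[str]]], table_text: str, path_field: str
) -> List[Dict[str, Optional[str]]]:
    """Join table rows onto image rows using the image filename as the key."""
    reader = csv.DictReader(io.StringIO(table_text))
    new_fields = [name for name in (reader.fieldnames or []) if name != path_field]
    table = {row[path_field]: row for row in reader}

    joined = []
    for image in image_rows:
        match = table.get(image["filename"])
        merged = dict(image)
        for name in new_fields:
            # Assumption: images without a matching table row get an empty value.
            merged[name] = match[name] if match else None
        joined.append(merged)
    return joined


table_file = (
    "filename,caption,quality\n"
    "grape/092_0020.jpg,ripe grapes,good\n"
    "strawberry/image_0032.jpg,single strawberry,fair\n"
)
image_rows = [
    {"filename": "grape/092_0020.jpg", "label": "grape"},
    {"filename": "grape/092_0008.jpg", "label": "grape"},
]
for row in import_label_fields(image_rows, table_file, "filename"):
    print(row)
```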

\includegraphics[]{images/sources/source-composite-import-label-fields-menu}
Figure 4.36 The menu icon for importing label fields

From an image composite source, click on the Import label fields from a table source menu icon, as shown in Figure 4.36. The icon is composed of three rows, an arrow, and a source, signifying a table being added to a source.

\includegraphics[]{images/sources/source-composite-import-label-fields}
Figure 4.37 Import label fields from a table source

Users are then presented with the input panel, as shown in Figure 4.37. This panel is for selecting the table source and the image path field.

\includegraphics[]{images/sources/source-composite-import-fields-select-source}
Figure 4.38 Select the table source

As shown in Figure 4.38, clicking on the “Select a source” input box brings up a dropdown list of all available table sources. Select the desired one from the list.

\includegraphics[]{images/sources/source-composite-import-fields-select-path}
Figure 4.39 Select the path field

Then select the path field in the table source that corresponds to the image path field in the current image composite source. Once the selected table source has been parsed, clicking on the input box shows the candidate fields.

\includegraphics[]{images/sources/source-composite-import-label-fields-progress}
Figure 4.40 Progress of the importing

After both selections are made, click on the Import label fields button to start the import. During the operation, a progress bar is shown along with relevant information (Figure 4.40).

\includegraphics[]{images/sources/source-composite-imported-label-fields}
Figure 4.41 Newly imported label fields

After the operation is finished, the new label fields are shown in the Fields view of the composite source (Figure 4.41 ).

Note: if the current image composite source is closed, importing label fields to it from a table source will create a new open image composite source, which will have the newly imported label fields. If the current image composite source is open, the importing operation will add the newly imported label fields to the image composite source and keep it open.
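
The behavior in the note can be restated as a small Python sketch. It is a toy model rather than BigML code: the ImageComposite class, the import_labels function, and the name given to the new composite are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ImageComposite:
    name: str
    closed: bool
    label_fields: List[str] = field(default_factory=list)


def import_labels(composite: ImageComposite, new_fields: List[str]) -> ImageComposite:
    """Return the composite that ends up holding the imported label fields."""
    if composite.closed:
        # A closed composite is left untouched; a new open one receives the fields.
        return ImageComposite(
            name=composite.name + " (with imported labels)",  # hypothetical name
            closed=False,
            label_fields=composite.label_fields + list(new_fields),
        )
    # An open composite gets the fields added in place and stays open.
    composite.label_fields.extend(new_fields)
    return composite


closed_source = ImageComposite("fruits", closed=True, label_fields=["label"])
result = import_labels(closed_source, ["caption"])
print(result.closed, result.label_fields, closed_source.label_fields)
```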

After the new label fields are imported, users can inspect them in the Images view of the composite source. As shown in Figure 4.42 , different label fields can be selected by using the dropdown list of the “Label field” on top. They will show up in the image captions.

\includegraphics[]{images/sources/source-composite-import-label-fields-view}
Figure 4.42 Selecting different label fields to view