Sources with the BigML Dashboard

6.9 Image Analysis

The Image Analysis panel lets users enable or disable image feature extraction and configure which sets of features are generated.

\includegraphics[]{images/sources/source-config-image-analysis}
Figure 6.13 Image analysis panel

The first control is the DISABLED/ENABLED switch. When disabled, no image features are generated.

If enabled, users can select any combination of the following five sets of image features:

  • Dimensions: This gives four values, corresponding to the raw image file size, pixel width, pixel height, and aspect ratio (see the code sketch following this list). 4 numeric fields.

  • Average pixels: This gives the red, green, and blue pixel values for several extremely low-resolution versions of the image (1x1, 3x3, and 4x4). This is fast to calculate and captures high-level spatial and color information, but all detail is lost. 78 numeric fields.

  • Level histogram: This gives the color information in each channel divided into 16 equally spaced histogram bins. Each color histogram is normalized so all values are between 0 and 1. While this gives very detailed color intensity information, all spatial information is lost. 48 numeric fields.

  • Histogram of gradients: Computes a histogram of oriented gradients for the entire image, and for all subimages on 3x3 and 4x4 grids. The histograms are normalized within each subimage, so all values are between 0 and 1. This histogram generally captures useful spatial and detail information, but precise color information is lost. Generally, this extractor is good at classifying different shapes, or images where the orientation of the edges is a defining characteristic. 234 numeric fields.

  • Wavelet subbands: Performs an \(n\)-level Haar wavelet decomposition on the image, where \(n\) is a parameter. This parameter determines the number of recursive decompositions that the featurizer will undertake, and so determines the number of output features. After decomposition, the pixels in each subband are aggregated using mean and standard deviation for both the full image and a 2x2 grid. Since each subband contains all image detail at a certain resolution in one of three directions, this feature type contains both spatial and frequency domain information about the nature of the detail in the image, but the directionality of the detail is only coarsely captured (in contrast to the histogram of gradients). Typically useful for problems where texture is a defining characteristic of the image, or where there is obvious periodicity. 160 numeric fields.
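The simpler pixel-based extractors above are easy to reproduce outside the platform. The following is a minimal sketch, not BigML's internal implementation, of the dimensions, average pixels, and level histogram features using Pillow and NumPy; details such as resizing, channel order, and the exact histogram normalization are assumptions, but the field counts it produces (4, 78, and 48) match the descriptions above.

\begin{verbatim}
# Sketch only: illustrates the idea behind three of the extractors above.
# It is not BigML's implementation; scaling, channel ordering, and
# normalization details may differ.
import os
import numpy as np
from PIL import Image

def dimensions_features(path):
    """File size, width, height, aspect ratio -> 4 numeric fields."""
    img = Image.open(path)
    width, height = img.size
    return [os.path.getsize(path), width, height, width / height]

def average_pixels_features(path):
    """RGB values of 1x1, 3x3, and 4x4 thumbnails -> (1+9+16)*3 = 78 fields."""
    img = Image.open(path).convert("RGB")
    features = []
    for side in (1, 3, 4):
        small = np.asarray(img.resize((side, side)), dtype=float)
        features.extend(small.reshape(-1).tolist())
    return features

def level_histogram_features(path, bins=16):
    """16-bin histogram per RGB channel, scaled to [0, 1] -> 48 fields."""
    pixels = np.asarray(Image.open(path).convert("RGB"), dtype=float)
    features = []
    for channel in range(3):
        hist, _ = np.histogram(pixels[..., channel], bins=bins, range=(0, 256))
        # Normalizing by the pixel count keeps every bin value in [0, 1];
        # the exact normalization used by the platform is an assumption here.
        features.extend((hist / pixels[..., channel].size).tolist())
    return features
\end{verbatim}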

Users can also select one of five pre-trained CNNs (Convolutional Neural Networks):

  • MobileNet: 1024 numeric fields.

  • MobileNetV2: 1280 numeric fields.

  • ResNet-18: 512 numeric fields.

  • ResNet-50: 2048 numeric fields.

  • Xception: 2048 numeric fields.

Note: ResNet-50 and Xception are only available to customers of virtual private clouds and private deployments.

Each option uses the top layer before the softmax output of an ImageNet pre-trained CNN as the input features, as sketched below. These features generally capture high-level concepts useful for real-world object classification (the presence of eyes, wheels, or striped fur, for example). While they are easily the best choice for natural image classification, poor capture conditions and artificial domains (handwriting, images of documents, low-resolution security video, etc.) can make them unsuitable.
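To illustrate what "the top layer before the softmax output" means, the sketch below extracts a 512-dimensional feature vector from an ImageNet pre-trained ResNet-18 using PyTorch and torchvision. This is an analogy to the Dashboard's behavior rather than BigML's own extraction code; the preprocessing constants are the standard torchvision ImageNet values.

\begin{verbatim}
# Sketch only: extracts the penultimate-layer activations (the layer that
# feeds the softmax classifier) of an ImageNet pre-trained ResNet-18,
# giving 512 numeric features per image. Not BigML's implementation.
import torch
from PIL import Image
from torchvision import models, transforms

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = torch.nn.Identity()   # drop the final classification layer
model.eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

def resnet18_features(path):
    """Return the 512 penultimate-layer features for one image."""
    batch = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        return model(batch).squeeze(0).tolist()
\end{verbatim}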