Sources with the BigML Dashboard

4 Composite Sources

A composite source is a collection of other sources, called component sources. A component source can be of any type (section 2.1): it can be a single source or another composite source. In other words, composite sources can be nested. A composite source can be used as a component source only if it is closed (see section 2.3).

BigML uses the icon in Figure 4.1 to represent a composite source.

\includegraphics[width=2cm]{images/sources/source-composite-icon}
Figure 4.1 Composite source icon

You can see the composite source icons in the “Type” column of the source list view:

\includegraphics[]{images/sources/source-list-composites}
Figure 4.2 Composite sources in the source list view

The open or closed lock in a composite source's icon indicates whether the composite source is open or closed.

\includegraphics[width=1.5cm]{images/sources/source-composite-icon-open}
\includegraphics[width=1.5cm]{images/sources/source-composite-icon-closed}
Figure 4.3 Icons for open and closed composite sources

When all the component sources of a composite source have the same fields, the composite source inherits those fields, and a dataset can be created from it. The resulting dataset is simply the concatenation of all the rows extracted from each component source. For instance, a composite source may have several CSV component sources created from CSV files with exactly the same fields; the composite source then inherits those fields and behaves like a single source.
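The inheritance rule above can be illustrated with a small toy model. This is only a sketch in plain Python, not BigML code: `ToySource`, `inherited_fields`, and `composite_rows` are hypothetical names, and comparing fields by exact tuple equality is an assumption about what "the same fields" means.

```python
from dataclasses import dataclass


@dataclass
class ToySource:
    """Hypothetical stand-in for a single source: field names plus rows."""
    fields: tuple
    rows: list


def inherited_fields(components):
    """Return the shared fields if every component has the same ones, else None."""
    if not components:
        return None
    first = components[0].fields
    if all(c.fields == first for c in components):
        return first
    return None


def composite_rows(components):
    """Concatenate the rows of all components, in order, when fields are shared."""
    if inherited_fields(components) is None:
        raise ValueError("components do not share the same fields")
    rows = []
    for component in components:
        rows.extend(component.rows)
    return rows


# Two CSV-like sources with identical fields, and one with different fields.
jan = ToySource(("date", "amount"), [("2024-01-03", 10), ("2024-01-09", 25)])
feb = ToySource(("date", "amount"), [("2024-02-01", 7)])
other = ToySource(("id", "label"), [(1, "a")])

print(inherited_fields([jan, feb]))
print(len(composite_rows([jan, feb])))
print(inherited_fields([jan, other]))
```

In this model, `jan` and `feb` together behave like a single three-row source, while adding `other` leaves the group with no common fields to inherit.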

Like any other source, a composite source is created open (section 2.3). Being open means it is modifiable. The following operations can be performed on an open composite source:

  • Add component sources.

  • Remove component sources.

  • Replace the full list of component sources with a new list.

A source can belong to as many composite sources as desired. However, when a source belongs to one or more composite sources, it cannot be modified, regardless of whether it is open or closed. This way all composite sources see the same version of the source all the time.

When component sources are added to or removed from a composite source, the composite source checks the compatibility of the fields of all its component sources and updates its own set of fields.
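The three operations on an open composite source, the rule that a component source cannot be modified, and the field update can be sketched with a toy model. This is an illustration in plain Python, not the BigML API: the class and method names are hypothetical, and representing incompatible fields as `None` is an assumption.

```python
class ToySource:
    """Hypothetical single source that tracks how many composites contain it."""

    def __init__(self, name, fields):
        self.name = name
        self.fields = tuple(fields)
        self.parents = 0  # number of composite sources this source belongs to

    def set_fields(self, fields):
        # A source that belongs to any composite cannot be modified.
        if self.parents > 0:
            raise PermissionError(f"{self.name} belongs to a composite source")
        self.fields = tuple(fields)


class ToyComposite:
    """Hypothetical open composite supporting add, remove, and replace."""

    def __init__(self, components=()):
        self.closed = False
        self.components = []
        self.fields = None
        self.replace(components)

    def _require_open(self):
        if self.closed:
            raise PermissionError("composite source is closed")

    def _refresh_fields(self):
        # Inherit the fields only when every component has the same ones.
        distinct = {c.fields for c in self.components}
        self.fields = distinct.pop() if len(distinct) == 1 else None

    def add(self, *sources):
        self._require_open()
        for source in sources:
            source.parents += 1
            self.components.append(source)
        self._refresh_fields()

    def remove(self, *sources):
        self._require_open()
        for source in sources:
            self.components.remove(source)
            source.parents -= 1
        self._refresh_fields()

    def replace(self, sources):
        self._require_open()
        for source in self.components:
            source.parents -= 1
        self.components = []
        self.add(*sources)


a = ToySource("a", ["x", "y"])
b = ToySource("b", ["x", "y"])
c = ToySource("c", ["z"])

composite = ToyComposite([a, b])
print(composite.fields)   # fields shared by a and b
composite.add(c)
print(composite.fields)   # no common fields once c is added
composite.remove(c)
print(composite.fields)   # shared fields again
```

Because `a` stays inside `composite`, calling `a.set_fields(...)` raises `PermissionError` in this model, while `c` becomes modifiable again once it has been removed.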

Once a composite source’s components are finalized, it must be closed before datasets can be created from it. When a composite source is closed, all its component sources are automatically closed.

You may create a dataset from an open composite source, but the composite source will be closed at the same time.
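The closing behavior described above can be sketched as a toy model: closing a composite source closes every component, and creating a dataset from an open composite source closes it as a side effect. The names `ToySource`, `ToyComposite`, and `create_dataset` are hypothetical and do not correspond to BigML API calls.

```python
class ToySource:
    """Hypothetical source with an open/closed flag."""

    def __init__(self, name):
        self.name = name
        self.closed = False

    def close(self):
        self.closed = True


class ToyComposite(ToySource):
    """Hypothetical composite source whose close() cascades to its components."""

    def __init__(self, name, components):
        super().__init__(name)
        # A composite source can be a component only if it is already closed.
        for component in components:
            if isinstance(component, ToyComposite) and not component.closed:
                raise ValueError("a nested composite source must be closed first")
        self.components = list(components)

    def close(self):
        # Closing a composite source closes all of its components as well.
        for component in self.components:
            component.close()
        self.closed = True


def create_dataset(source):
    """Toy dataset creation: the source is closed at the same time."""
    source.close()
    return {"origin": source.name}


a, b = ToySource("a"), ToySource("b")
composite = ToyComposite("composite", [a, b])
dataset = create_dataset(composite)
print(composite.closed, a.closed, b.closed)
```

In this sketch, an open composite passed to `ToyComposite` as a component is rejected, which mirrors the rule that only closed composite sources can be nested.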

Unlike all other types of sources, composite sources must be explicitly closed by an API call or UI action. This is mainly to avoid closing a composite source by mistake. For instance, because composite sources can have a huge number of component sources, they may be shared and worked on by several collaborators, and accidentally closing the component sources would be costly.

To close a composite source, click on the Close This Composite Source menu item, either from the source’s context menu in a source list view,

\includegraphics[]{images/sources/source-close-composite}
Figure 4.4 Close a composite source in the source list view

or from the cloud action menu in the source view.

\includegraphics[]{images/sources/source-close-composite-source-view}
Figure 4.5 Close a composite source in the source view

Note: BigML currently limits the number of component sources in a composite source to 445,000. In other words, a composite source can have at most 445,000 components.