Association Discovery with the BigML Dashboard
15 Takeaways
This document explains associations in detail. We finish it with a list of key points:
Association Discovery (or associations) finds meaningful relationships among fields and their values in high-dimensional datasets, whereas statistical techniques focus on controlling the risk of making false discoveries.
Associations output is easily expressed as rules that can be understood by non-experts.
You can create associations from datasets that have been created in BigML, and then create a new dataset from the association rules that you discover. (See Figure 15.1.)
Associations require the data to be structured in a specific way, using the items field type.
You can create an association with just one click or configure it as you wish.
There is no single measure (Support, Coverage, Confidence, Leverage, or Lift) that is always more important than the others. Which one matters most depends on your main goals.
You can set minimum levels for a number of association measures that let you focus on more interesting association rules, while filtering out potentially spurious ones.
You can control multiple interestingness measures, and they are easy to tune without having to configure parameters that are difficult to comprehend.
You can easily discretize your numeric fields to transform them into categorical fields.
BigML lets you create associations for a sample of your dataset.
After associations are created, you will get a table that summarizes all the rules discovered, and you can visualize these rules in a network chart.
You can download your association rules in a CSV file, and export the network chart as an image.
You can programmatically create, list, delete, and use your associations through the BigML API and the BigML bindings.
You can furnish your associations with descriptive information (name, description, tags, and category).
You can stop the creation of an association before the task is finished.
You can permanently delete an association.
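To make the point about the five measures concrete, the following is a minimal sketch that computes them for a single rule over a toy list of transactions. It assumes the standard textbook definitions (coverage is the share of instances containing the antecedent, support the share containing both antecedent and consequent, confidence is support divided by coverage, lift is confidence divided by the consequent's share, and leverage is support minus the product of the two shares). The function name `rule_measures` and the sample baskets are hypothetical and are not part of BigML; this is not how BigML computes its rules internally.

```python
def rule_measures(transactions, antecedent, consequent):
    """Compute common interestingness measures for the rule antecedent -> consequent.

    Hypothetical helper for illustration only, using textbook definitions.
    """
    antecedent, consequent = set(antecedent), set(consequent)
    rows = [set(t) for t in transactions]
    n = len(rows)
    if n == 0:
        raise ValueError("need at least one transaction")

    # Count the instances that contain each side of the rule, and both sides.
    n_a = sum(antecedent <= row for row in rows)
    n_c = sum(consequent <= row for row in rows)
    n_both = sum((antecedent | consequent) <= row for row in rows)

    coverage = n_a / n            # share of instances with the antecedent
    consequent_support = n_c / n  # share of instances with the consequent
    support = n_both / n          # share of instances with both sides

    # Confidence and lift are undefined when their denominators are zero.
    confidence = support / coverage if n_a else None
    lift = (confidence / consequent_support
            if confidence is not None and n_c else None)
    leverage = support - coverage * consequent_support

    return {
        "coverage": coverage,
        "support": support,
        "confidence": confidence,
        "lift": lift,
        "leverage": leverage,
    }


# Made-up sample data: five shopping baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]

for name, value in rule_measures(baskets, {"bread"}, {"milk"}).items():
    print(name, value)
```

For the rule bread -> milk in this toy data, confidence is about 0.75, which looks strong on its own, yet lift is below 1 and leverage is slightly negative because milk appears in 4 of the 5 baskets anyway. This is one way to see why no single measure should be read in isolation.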