# Association Rule Learning

**Models**

The aim of a model is to represent the key traits of the reality it seeks to stylize. Models are used to understand and explain phenomena. They therefore contain hypotheses and clarify relationships.

In general, **models refer to a theory**. For example, Keynesian theory postulates that consumption and income are linked, that private investment and interest rates are linked, and that domestic product equals consumption plus investment.

Hypotheses and models **formalize relations** between quantities and the properties of functions. Using the same example, C = f(Y) with f' > 0 means that consumption is an increasing function of income. Likewise, Y = C + I. A classical choice for the function f is C = a0 + a1 × Y. This means that an increase in income entails a proportional increase in consumption, and that there is a level of consumption that cannot be reduced (a0). This choice is not obvious at first glance: we could have chosen another functional form, in which the effect on consumption lessens as income rises, by taking a low propensity to consume (0 < a1 < 1) in the model C = a0 × Y^a1.
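
To make the difference between the two functional forms concrete, the sketch below compares the marginal effect of income on consumption at two income levels. It assumes that the alternative form is the power function C = a0 × Y^a1; the function names and parameter values are illustrative assumptions, not figures from any dataset.

```python
def linear_consumption(income, a0, a1):
    """C = a0 + a1 * Y: incompressible consumption a0 plus a constant share of income."""
    return a0 + a1 * income


def power_consumption(income, a0, a1):
    """C = a0 * Y**a1 (assumed alternative form): the effect flattens when 0 < a1 < 1."""
    return a0 * income ** a1


def marginal_effect(func, income, step=1e-4, **params):
    """Central finite-difference estimate of dC/dY at a given income level."""
    high = func(income + step, **params)
    low = func(income - step, **params)
    return (high - low) / (2 * step)


# Illustrative parameter values, not estimates from real data.
for y in (100.0, 1000.0):
    lin = marginal_effect(linear_consumption, y, a0=20.0, a1=0.8)
    pw = marginal_effect(power_consumption, y, a0=2.0, a1=0.8)
    print(f"Y = {y:>6}: linear dC/dY = {lin:.3f}, power dC/dY = {pw:.3f}")
```

Under these assumptions, the linear form gives the same marginal effect at every income level, whereas the power form gives a smaller effect at the higher income level.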

After specifying the problem, it is time to **select and measure the variables**. If we are dealing with interest rates, should we choose an overnight rate? Or the central bank's base rate? Should we use raw data or seasonally adjusted data?

In fact, validating a theory follows a general sequence of steps: theory → formalization of the theory through modeling → confrontation of the model with data (estimation of parameters) → theory validated or invalidated → if the theory is invalidated, we test on new data or repeat the model specification step.
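
As a rough illustration of the "confrontation of the model with data" step, the sketch below estimates a0 and a1 in C = a0 + a1 × Y by ordinary least squares and then checks the sign predicted by the theory (f' > 0). The data is synthetic and generated from assumed parameter values purely for demonstration; `fit_linear` is a hypothetical helper name.

```python
import random


def fit_linear(y_values, c_values):
    """Closed-form ordinary least squares for C = a0 + a1 * Y (one regressor)."""
    n = len(y_values)
    mean_y = sum(y_values) / n
    mean_c = sum(c_values) / n
    sxy = sum((y - mean_y) * (c - mean_c) for y, c in zip(y_values, c_values))
    sxx = sum((y - mean_y) ** 2 for y in y_values)
    a1 = sxy / sxx
    a0 = mean_c - a1 * mean_y
    return a0, a1


# Synthetic sample: assumed "true" values a0 = 20 and a1 = 0.8 plus random noise.
random.seed(0)
income = [100.0 + 2.0 * i for i in range(51)]
consumption = [20.0 + 0.8 * y + random.gauss(0.0, 1.0) for y in income]

a0_hat, a1_hat = fit_linear(income, consumption)

# The theory postulates that consumption increases with income (a1 > 0).
theory_holds = a1_hat > 0
print(f"a0 = {a0_hat:.2f}, a1 = {a1_hat:.3f}, theory not rejected: {theory_holds}")
```

If the estimated slope contradicted the theory, the next step in the sequence would be to test on new data or revisit the specification.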

In general, we seek to write models involving variables that share a common trend; in this case, we speak of correlation. We then mainly distinguish between cases where the correlation is linear (as in the model linking consumption and income) and cases where it is non-linear.

**Limitations of models**

Depending on the issue at hand, the use of modeling can sometimes be difficult or even impossible. This raises a number of questions regarding the type of solution to be provided for understanding a phenomenon:

- What happens if no existing theory allows us to formulate hypotheses or specify the form of the relations between variables?
- Likewise, if the issue is not sufficiently understood in theoretical terms, how can we choose relevant variables (liable to contain information on a variable to be predicted or explained) from among all the existing variables for which we possess experimental data?

A major limitation of any modeling exercise is that, once the variables of the problem have been delimited and the data used to estimate the parameters has been collected, all the data must be used, in the sense that every observation is considered important in describing the problem. This leads to major issues in handling missing data, extreme values, and outliers. Lastly, there is always a trade-off between the specification issue (you can end up with a bad model if some variables that really affect the output variable were left out) and the estimation issue (the more explanatory variables you include, the lower the risk of omitting relevant variables, but the greater the risk of error in estimating the parameters).

**Prediction through association rule learning**

We seek to predict the status of variable X for a given individual. To do this, we use a sample of individuals for whom we know both the status of X and their characteristics. We identify groups of similar individuals (i.e. individuals with certain characteristics in common) within which the status of variable X is very often the same, in proportions much higher than those observed in the sample as a whole. If the given individual has the same characteristics as the individuals in such a group, we predict that this individual will also have the same status for variable X.
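
The procedure just described can be sketched as follows. This is a minimal, brute-force illustration rather than a production algorithm: the names `find_rules` and `predict`, the thresholds `min_support` and `min_lift`, and the toy records are all assumptions introduced for the example. "Lift" here means the share of a status inside a group divided by its share in the whole sample.

```python
from collections import Counter
from itertools import combinations


def find_rules(records, target, max_len=2, min_support=3, min_lift=1.5):
    """Find groups (conjunctions of characteristics) in which one status of
    `target` is much more frequent than in the sample as a whole."""
    n = len(records)
    base = Counter(r[target] for r in records)
    features = sorted(k for k in records[0] if k != target)
    group_sizes, group_hits = Counter(), Counter()
    for r in records:
        items = [(f, r[f]) for f in features]
        for size in range(1, max_len + 1):
            for cond in combinations(items, size):
                group_sizes[cond] += 1
                group_hits[(cond, r[target])] += 1
    rules = []
    for (cond, status), hits in group_hits.items():
        support = group_sizes[cond]
        confidence = hits / support
        lift = confidence / (base[status] / n)
        if support >= min_support and lift >= min_lift:
            rules.append({"if": dict(cond), "then": status, "support": support,
                          "confidence": confidence, "lift": lift})
    # Strongest rules first.
    return sorted(rules, key=lambda r: (-r["lift"], -r["support"]))


def predict(rules, individual):
    """Return the status given by the strongest matching rule, or None."""
    for rule in rules:
        if all(individual.get(f) == v for f, v in rule["if"].items()):
            return rule["then"]
    return None


# Toy sample, invented for illustration only.
sample = [
    {"rate": "up", "season": "winter", "X": "yes"},
    {"rate": "up", "season": "winter", "X": "yes"},
    {"rate": "up", "season": "winter", "X": "yes"},
    {"rate": "up", "season": "summer", "X": "no"},
    {"rate": "down", "season": "winter", "X": "no"},
    {"rate": "down", "season": "winter", "X": "no"},
    {"rate": "down", "season": "summer", "X": "no"},
    {"rate": "down", "season": "summer", "X": "no"},
]
rules = find_rules(sample, target="X")
print(rules[0])
print(predict(rules, {"rate": "up", "season": "winter"}))
```

An individual who matches no rule receives no prediction (`None`), which reflects the local nature of the approach: rules only speak for the configurations they cover.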

**Advantages of local, model-free approaches**

- No theoretical framework, therefore particularly well suited to situations without such a framework
- No equations postulating a priori the form of the relations between variables, and no parameters to be estimated
- Therefore not subject to modeling and estimation issues
- Based on algorithms that isolate variables that are useful in specific configurations, without having to choose the variables beforehand
- Not sensitive to the influence of extreme data or outliers
- By definition, enable discovery of associations in the data that were not necessarily foreseen (whereas with models, you seek what you expect to find)
- Apply as well as model-based approaches to time series, cross-sectional data, panel data and cohorts
- Even where modeling is possible and produces results, model-free approaches can complement it
- Results are easy to interpret and read

Naturally, there are a number of disadvantages:

- Combinatorial difficulties requiring use of powerful algorithms
- Identification of interesting rules from among a potentially high number of irrelevant rules
- Standard methods do not handle continuous variables directly; such variables must first be discretized
- Technique rarely available in commercial software
- Local rules – not global rules – are obtained: there is no attempt to highlight predominant factors and the main interactions between these factors in a global vision of a phenomenon.
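
Two of the disadvantages above, the combinatorial difficulty and the handling of continuous variables, can be illustrated with a short sketch. It assumes every variable has the same number of modalities and uses simple equal-frequency binning; both function names are hypothetical, and other discretization schemes are equally possible.

```python
from math import comb


def count_candidate_conditions(n_variables, n_modalities, max_len):
    """Number of distinct conjunctions of at most `max_len` conditions,
    assuming each variable has `n_modalities` possible values."""
    return sum(comb(n_variables, size) * n_modalities ** size
               for size in range(1, max_len + 1))


def discretize(values, n_bins=3):
    """Map continuous values to bin indices 0..n_bins-1 with roughly equal counts."""
    ranked = sorted(values)
    # Cut points are taken at equally spaced ranks of the sorted sample.
    cuts = [ranked[(len(ranked) * b) // n_bins] for b in range(1, n_bins)]
    return [sum(v >= c for c in cuts) for v in values]


# The search space grows quickly with the number of variables and rule length.
for length in (1, 2, 3):
    print(length, count_candidate_conditions(20, 3, length))

# Illustrative continuous measurements turned into three ordered classes.
print(discretize([0.1, 0.5, 0.9, 1.3, 2.0, 2.2]))
```

Each candidate conjunction can be paired with each status of the output variable, so the number of candidate rules is larger still, which is why filtering for interesting rules matters.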

**Interpretation of association rules**

- Identification of obvious, trivial or already known rules
- Specification of the modalities on which learning should take place (a single modality of interest vs. modalities with opposing consequences)
- Definition of filters according to how the modalities of the output variable were initially constructed (e.g. if you transformed return series into positive or negative returns, the rules say nothing about magnitude, so a wrong prediction leaves you exposed to the full size of the actual return)
- When learning on different modalities, handling of overlaps, i.e. cases where contradictory rules are activated at the same time on the validation sample
- Choice of a decision-making methodology based on criteria such as majority (more or less strong), unanimity, local or global configurations, etc.
- Individual and/or collective rule analysis
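
The choice of a decision-making methodology can be sketched as a small voting function applied to the conclusions of all rules activated for one individual. This is only one possible design, covering the majority and unanimity criteria mentioned above; the function name `decide` and the `threshold` parameter are assumptions made for the example.

```python
from collections import Counter


def decide(fired_conclusions, method="majority", threshold=0.5):
    """Combine the conclusions of all activated rules for one individual.

    Returns the chosen modality, or None to abstain when the criterion
    is not met (for example when contradictory rules cancel out)."""
    if not fired_conclusions:
        return None
    counts = Counter(fired_conclusions)
    top, top_count = counts.most_common(1)[0]
    share = top_count / len(fired_conclusions)
    if method == "unanimity":
        return top if share == 1.0 else None
    if method == "majority":
        # A higher threshold demands a stronger majority.
        return top if share > threshold else None
    raise ValueError(f"unknown method: {method!r}")


# Three rules fire for the same individual, one of them contradictory.
fired = ["positive", "positive", "negative"]
print(decide(fired))                         # simple majority
print(decide(fired, method="unanimity"))     # abstains on any disagreement
print(decide(fired, threshold=0.7))          # stronger majority required
```

Abstaining rather than forcing a decision is a deliberate choice in this sketch: when contradictory rules fire together, no prediction may be preferable to an arbitrary one.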