Now that we’ve covered univariate measures of central tendency and dispersion, we need to talk about bivariate measures. **Univariate measures are measures with respect to a single variable**. **Bivariate measures are measures with respect to the relationship between two variables**. To quantify such relationships, we use dependence measures.

There is a distinction between a dependence *measure* and a dependence *relationship*. The first is simply a bivariate measure of how one variable is related to the other. A dependence relationship, on the other hand, implies a deeper connection: the variables are not only related, but knowing the value of one gives us information about the other.

There are two main measures of dependence, covariance and correlation, and they are closely related, since correlation is covariance constrained to the range between -1 and 1. **Covariance is a measure of the joint variability of two variables**. It is the expectation of the product of each variable’s deviation from its own expectation:

\[ \operatorname{cov}(x, y) = \operatorname{E} \left[ \left( x - \operatorname{E}(x) \right) \cdot \left( y - \operatorname{E}(y) \right) \right]. \qquad(7)\]

A large positive or negative covariance indicates strong positive or negative joint variability, respectively. It captures the degree to which increases in one variable are accompanied by increases (or decreases) in the other.
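The definition in Equation (7) can be sketched directly in plain Python. This is a minimal illustration, not a production implementation; note that it computes the population covariance (dividing by \(n\)), and the example data are made up for demonstration:

```python
def mean(xs):
    """Sample mean: E(x) estimated from the data."""
    return sum(xs) / len(xs)

def cov(xs, ys):
    """Covariance per Eq. (7): E[(x - E(x)) * (y - E(y))]."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

# Toy data: y moves in lockstep with x, so the covariance is positive.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]
print(cov(x, y))  # → 2.5
```

The positive result reflects that above-average values of `x` coincide with above-average values of `y`.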

Interpreting a given covariance demands knowledge of the units of measurement of both underlying variables, since its magnitude depends on the scales in which the variables are measured. That is why most of the time we use the **correlation, which is a normalized covariance**. The correlation is

\[ \rho(x, y) = \frac{\operatorname{cov}(x, y)}{\sigma_x \cdot \sigma_y}, \qquad(8)\]

where \(\sigma_x\) and \(\sigma_y\) are the standard deviations of \(x\) and \(y\), respectively.
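Equation (8) can be sketched by reusing the covariance function from above: the variance of a variable is its covariance with itself, so \(\sigma_x = \sqrt{\operatorname{cov}(x, x)}\). This is a hedged illustration on made-up data, and it also shows why normalization matters, since rescaling one variable changes the covariance but leaves the correlation untouched:

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def cov(xs, ys):
    # Population covariance per Eq. (7).
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def corr(xs, ys):
    # rho(x, y) = cov(x, y) / (sigma_x * sigma_y), per Eq. (8).
    sx = math.sqrt(cov(xs, xs))  # sigma_x: sqrt of the variance of x
    sy = math.sqrt(cov(ys, ys))
    return cov(xs, ys) / (sx * sy)

x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]

print(corr(x, y))           # perfectly linear: correlation of 1.0

# Re-express y in different units (e.g. meters -> centimeters):
y_cm = [100 * yi for yi in y]
print(cov(x, y_cm))         # covariance scales by the factor of 100
print(corr(x, y_cm))        # correlation is unchanged: still 1.0
```

Because both numerator and denominator scale together, the correlation is unit-free, which is exactly what makes it comparable across variable pairs.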

In Figure 55, we can see some correlations and their underlying scatter plots for 50 randomly generated observations.

As we can see, the correlation depicts the **linear association** between variables. The slope of the dashed line shows the correlation between variables. The further from zero and the closer to \(\pm\) 1, the stronger the association between variables. Finally, the sign of the correlation denotes the type of relationship between variables. A **positive correlation** implies a **positive relationship**: an increase in one variable is associated with an *increase* in the other. A **negative correlation** implies a **negative/inverse relationship**: an increase in one variable is associated with a *decrease* in the other.
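The behavior described above can be reproduced on synthetic data similar in spirit to Figure 55. This is a sketch under the assumption that 50 observations are drawn from a standard normal with a small amount of added noise; the construction of `y_pos` and `y_neg` is made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=50)               # 50 randomly generated observations
noise = rng.normal(scale=0.1, size=50)

y_pos = x + noise    # positive relationship: y increases with x
y_neg = -x + noise   # negative/inverse relationship: y decreases as x increases

# np.corrcoef returns the 2x2 correlation matrix; [0, 1] is rho(x, y).
r_pos = np.corrcoef(x, y_pos)[0, 1]
r_neg = np.corrcoef(x, y_neg)[0, 1]

print(r_pos > 0.9)    # strong positive correlation, close to +1
print(r_neg < -0.9)   # strong negative correlation, close to -1
```

Shrinking the noise scale pushes both correlations toward \(\pm 1\), while inflating it weakens the linear association toward zero.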