## Correlation Analyses

The scatter diagram shows the relationship between energy and production values. If we are interested only in the strength of association between energy and production, we can use correlation analysis to provide a measure of this strength. Correlation analysis measures how closely the two variables vary together and quantifies the strength of their relationship, assuming that the underlying relationship is linear. This assumption holds for most industries, provided that the variability of production output is not too large.

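Spreadsheet software usually has a built-in function to calculate the correlation coefficient for a given data set. The symbol $r$ is normally used to denote the correlation coefficient; $E_i$ and $P_i$ denote the individual energy and production values, $\bar{E}$ and $\bar{P}$ denote the mean (average) energy and production for the given data set, and $n$ is the number of energy/production pairs ($i = 1, 2, 3, \ldots, n$):

$$
r = \frac{\sum_{i=1}^{n} (P_i - \bar{P})(E_i - \bar{E})}{\sqrt{\sum_{i=1}^{n} (P_i - \bar{P})^2 \, \sum_{i=1}^{n} (E_i - \bar{E})^2}}
$$

where:

$$
\bar{P} = \frac{1}{n} \sum_{i=1}^{n} P_i = \text{mean value (arithmetic mean) of } P_i \; (i = 1, 2, 3, \ldots, n)
$$

$$
\bar{E} = \frac{1}{n} \sum_{i=1}^{n} E_i = \text{mean value (arithmetic mean) of } E_i \; (i = 1, 2, 3, \ldots, n)
$$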
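The correlation coefficient can also be computed directly from the energy/production pairs and their means, without a spreadsheet. The following is a minimal Python sketch, assuming that r is the standard Pearson product-moment coefficient. The function name `pearson_r` and the sample figures are hypothetical and are not taken from the case study.

```python
import math

def pearson_r(energy, production):
    """Pearson correlation coefficient between paired energy and production values."""
    if len(energy) != len(production) or len(energy) < 2:
        raise ValueError("need at least two energy/production pairs of equal length")
    n = len(energy)
    e_mean = sum(energy) / n          # mean energy
    p_mean = sum(production) / n      # mean production
    # Sum of cross-products of the deviations from the two means
    s_ep = sum((e - e_mean) * (p - p_mean) for e, p in zip(energy, production))
    # Sums of squared deviations for each variable
    s_ee = sum((e - e_mean) ** 2 for e in energy)
    s_pp = sum((p - p_mean) ** 2 for p in production)
    if s_ee == 0 or s_pp == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return s_ep / math.sqrt(s_ee * s_pp)

# Hypothetical monthly figures, for illustration only (not the case-study data)
production = [100.0, 120.0, 90.0, 150.0, 130.0, 110.0]   # e.g. tonnes
energy = [520.0, 580.0, 500.0, 690.0, 600.0, 560.0]      # e.g. MWh

r = pearson_r(energy, production)
print(f"r = {r:.3f}, r^2 = {r * r:.3f}")
```

The result should agree with a spreadsheet's built-in correlation function applied to the same two columns.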

For our case study with monthly data (Fig. 3.11), the calculated correlation coefficient is $r = 0.815$, with $r^2 = 0.6648$, which denotes a reasonably strong correlation for a real industrial environment. For the daily data set shown in Figure 3.9, the correlation coefficient is $r = 0.293$, with $r^2 = 0.0860$, which denotes a very weak correlation indeed.
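As a quick arithmetic check, each reported pair of values should be consistent: the correlation coefficient is the square root of its square, up to rounding. The sketch below uses only the figures quoted in the text; the dictionary name `reported` is an illustrative choice.

```python
import math

# (r squared, r) pairs as quoted in the text for the two data sets
reported = {"monthly": (0.6648, 0.815), "daily": (0.0860, 0.293)}

for label, (r_squared, r) in reported.items():
    # The square root gives only the magnitude of r; the positive sign is
    # taken here because energy rises with production in both data sets.
    r_from_r_squared = math.sqrt(r_squared)
    print(f"{label}: sqrt({r_squared}) = {r_from_r_squared:.4f}, reported r = {r}")
```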

The correlation coefficient $r$ indicates how well a best-fit line explains variations in the dependent variable, in this case energy. If $r = 1$, all data points lie exactly on the regression line, which is the case of perfect correlation. The larger the scatter around the best-fit line, the smaller the value of $r$ and the weaker the correlation between energy and production.
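The link between scatter around the best-fit line and the strength of correlation can be illustrated numerically. The sketch below assumes an ordinary least-squares line with an intercept, for which the squared correlation coefficient equals one minus the ratio of residual to total variation in energy. The helper names `fit_line` and `r_squared` and all data values are hypothetical.

```python
import math

def fit_line(production, energy):
    """Least-squares line: energy = intercept + slope * production."""
    n = len(production)
    p_mean = sum(production) / n
    e_mean = sum(energy) / n
    s_pe = sum((p - p_mean) * (e - e_mean) for p, e in zip(production, energy))
    s_pp = sum((p - p_mean) ** 2 for p in production)
    slope = s_pe / s_pp
    intercept = e_mean - slope * p_mean
    return slope, intercept

def r_squared(production, energy):
    """Share of the variation in energy explained by the best-fit line."""
    slope, intercept = fit_line(production, energy)
    e_mean = sum(energy) / len(energy)
    # Residual variation: squared vertical distances from the fitted line
    ss_res = sum((e - (intercept + slope * p)) ** 2 for p, e in zip(production, energy))
    # Total variation of energy around its mean
    ss_tot = sum((e - e_mean) ** 2 for e in energy)
    return 1.0 - ss_res / ss_tot

# Hypothetical data, for illustration only
production = [100.0, 110.0, 120.0, 130.0, 140.0, 150.0]
on_line = [200.0 + 3.0 * p for p in production]        # points exactly on a line
offsets = [40.0, -60.0, 50.0, -30.0, 60.0, -60.0]      # made-up scatter
scattered = [e + d for e, d in zip(on_line, offsets)]

r2_exact = r_squared(production, on_line)
r2_scattered = r_squared(production, scattered)
# Both fitted slopes are positive, so r is the positive square root of r^2
print(f"no scatter:   r^2 = {r2_exact:.3f}, r = {math.sqrt(r2_exact):.3f}")
print(f"with scatter: r^2 = {r2_scattered:.3f}, r = {math.sqrt(r2_scattered):.3f}")
```

With no scatter the residual variation is zero and the correlation is perfect. Adding scatter increases the residual variation and lowers both values.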