Stop Mistaking Correlation For Causation In Your Engagement Survey Results

If you have any experience with surveys, or if you've ever worked with data, you've probably come across the term correlation. Correlation describes the degree to which two variables (values) move together, or in other words, how strongly they are related.

A correlation is expressed as a number ranging from -1 to +1. A positive value tells us that the two figures move in the same direction (a positive correlation), whereas a negative value tells us the opposite (a negative correlation). The closer the number is to +1 or -1, the stronger the relationship, and a value near 0 means there is little or no linear relationship. Thus, with a positive correlation, if one value goes up, the other tends to go up as well, and with a negative correlation, an increase in one value is associated with a decrease in the other.
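To make the -1 to +1 range concrete, here is a minimal sketch of the most common correlation measure, the Pearson coefficient, written in plain Python. The article does not say which coefficient a given survey tool uses, so Pearson is an assumption here, and the variable names and scores are invented for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of the products of deviations from the two means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical scores, one pair per respondent (not real survey data).
motivation = [1, 2, 3, 4, 5]
performance_up = [2, 4, 6, 8, 10]    # rises whenever motivation rises
performance_down = [10, 8, 6, 4, 2]  # falls whenever motivation rises

r_up = pearson(motivation, performance_up)
r_down = pearson(motivation, performance_down)
print(f"same direction: {r_up:+.2f}")
print(f"opposite direction: {r_down:+.2f}")
```

Because both example series are perfectly linear, the results sit at the two ends of the scale. Real survey data will land somewhere in between.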

A correlation can thus give the impression that there is a causal relationship between the two values, i.e. that one causes the other to change. This impression often leads to fallacies which, in practice, mean that you invest a lot of effort, time, and money in something that does not deliver the expected result.

So if you want to work well with your survey results, it is useful to understand what correlation truly means and what it tells you.

Let's illustrate this with a simple example. If the data shows you that there is a correlation between phenomenon A and phenomenon B, you know that there is a relationship between the two. But you don't know what that relationship is. The reality may be that phenomenon A affects phenomenon B. In that case, we say that A is the independent variable, whereas B is the dependent variable (dependent on A).

Examples of such variables are age (A) and height (B) in children. It is hardly disputable that the older children are, the taller they tend to be. The reverse relationship does not hold, i.e. body height does not affect age. In this case, causality, i.e. what affects what, can be determined from deeper knowledge and experience. It is logical.

In another case, however, the relationship may be reversed, i.e. B affects A. There may even be a correlation between the two values although neither affects the other, because a third (fourth, fifth, etc.) variable enters the picture and affects both.

A long time ago, research showed that children's weight correlates with intelligence. So what does this mean? What is the dependent variable and what is the independent variable? Is it that the heavier a child is, the smarter they are, or is it the other way around, that the smarter a child is, the more they weigh? Neither sounds very logical. How could body weight be related to intelligence? And yet there was a high correlation between the two. Coincidence? No. The variable the researchers forgot was age. The older the children studied, the more they weighed and the better they performed on intelligence tests. In this instance, age is another variable (C) that affects both A and B without any causal relationship between A and B.
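The effect of a forgotten third variable can be reproduced with a small simulation. The sketch below invents data in which age drives both weight and test score, while weight and score have no direct link. All numbers are made up for illustration and are not taken from the research mentioned above. It then removes the part of each variable that age explains (a simple way of "holding age constant") and correlates what is left.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def residuals(ys, xs):
    """What is left of ys after a least-squares straight-line fit on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

random.seed(0)  # fixed seed so the run is repeatable

# Simulated children: age (C) drives both weight (A) and test score (B).
# Weight and score are generated independently of each other.
ages = [random.uniform(6, 16) for _ in range(2000)]
weights = [3.0 * age + random.gauss(0, 4) for age in ages]
scores = [5.0 * age + random.gauss(0, 6) for age in ages]

raw_r = pearson(weights, scores)
partial_r = pearson(residuals(weights, ages), residuals(scores, ages))

print(f"weight vs. score: {raw_r:.2f}")
print(f"weight vs. score with age held constant: {partial_r:.2f}")
```

The first figure comes out high and the second close to zero: once age is accounted for, the apparent link between weight and score disappears.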

If you are working with survey results, be sure not to draw a false conclusion about cause and effect. Take, for example, variables such as motivation and performance. If you find that there is a correlation between the two, what does that tell you? Does it mean that motivated people perform better or that high performers feel more motivated? It brings to mind the question of what came first: the chicken or the egg?

What to do about it? And how to make the most out of correlations? Often, deeper knowledge of the phenomenon and logic can help, as in the example where age affects the height of children.

For example, in surveys, it might be the correlation between eNPS (the willingness to recommend a company as an employer) and the company's index of Employee care. If a positive correlation appears between the two (as it often does), logic tells us that the more people feel a company cares about them, the more they are willing to recommend it as an employer. The reverse causality is not logical, though not impossible. So here again, you can only establish causality if you have enough information and therefore deeper knowledge.

You can test the reverse hypothesis by offering some people a benefit for recommending the firm and others not, and then observing whether and how this is reflected in their Employee care ratings.

But such experiments may also show that there is no direct relationship between these values, i.e. changing one does not change the other, and both depend on something else, for example on how people perceive their manager. If I am satisfied with my boss, I rate more positively how the company takes care of me and am more likely to recommend the company as an employer.

The ideal way to test such hypotheses is to measure continuously rather than to run a survey once in a while. In that case, you get clear feedback fairly quickly on whether and how other parameters change when you put effort into changing one of them.

Moreover, such continuous measurement gives you information about the evolution of correlations. These are usually not static; they can shift over time. Recall the initial, fairly clear example that children's height depends on age. Even this dependence changes over time. Children can literally grow by leaps and bounds during puberty, and it is not so much the exact age at that point that is crucial, but rather the onset of puberty and the level of hormones affecting growth. Typically, you will then see that gender is also a significant determinant of growth at this age, as girls have an earlier onset of puberty than boys.
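With continuous measurement, one simple way to watch a correlation evolve is to compute it separately for each survey wave. The sketch below does this for Employee care and eNPS-style answers. The wave labels, the scores, and the data layout are all hypothetical, and a real analysis would need far more than four respondents per wave.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical answers per survey wave:
# (Employee care scores, willingness-to-recommend scores), same respondents in order.
waves = {
    "wave 1": ([2, 3, 4, 5], [4, 7, 6, 10]),
    "wave 2": ([2, 3, 4, 5], [6, 5, 8, 7]),
    "wave 3": ([2, 3, 4, 5], [8, 6, 7, 5]),
}

# One correlation per wave, kept in wave order.
trend = {name: pearson(care, recommend) for name, (care, recommend) in waves.items()}

for name, r in trend.items():
    print(f"{name}: {r:+.2f}")
```

In this invented data the relationship weakens from wave to wave and finally turns negative, which is the kind of shift a single one-off survey would never reveal.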

In corporate practice, try postponing people's bonus payments again and again, for example, and you'll find that no matter how much care you give them, they still won't recommend you as an employer 😉.

So the lesson is to form hypotheses, test them, and, going forward, use only the ones that are confirmed.