What is meant by confounding what is a lurking variable What is a confounding variable?

Voting Behavior in the 2012 Election

An instructional resources project sponsored by the APSA, ICPSR, and SETUPS.

  • About the Project
  • Background
    • The 2012 Election
    • Voting Behavior
  • The Dataset
    • Survey Research Methods
    • The Codebook
  • The Exercises
    • Data Analysis
    • Analysis Exercises

Confounding variables are those that affect other variables in a way that produces spurious or distorted associations between two variables. They confound the "true" relationship between two variables.

For example, if we have an association between two variables (X and Y), and that association is due entirely to the fact that both X and Y are affected by a third variable (Z), then we would say that the association between X and Y is spurious and that it is a result of the effect of a confounding variable (Z). Of course, Z may be an confounding variable when it comes to this particular relationship, but it might not be for other relationships.

Confounding variables also can affect two variables that do have some causal connection. For example, if X and Y are associated and also causally related (for example, if X affects Y), the association between X and Y may reflect not only their causal connection but also the influence of a third variable (Z) that affects both of them. Thus, the association between X and Y may exaggerate the causal effect of X and Y because the association is inflated by the effect of Z on both X and Y. In this case, we could say that the relationship between X and Y is confounded by Z, even though it is not a purely spurious relationship.

Not all researchers use these terms or use them in exactly the way that we have defined them here. Some use the term "spurious variable" or "extraneous variable" to refer to a variable that produces a purely spurious association between two other variables. The term confounding variable sometimes is used more narrowly to refer only to the second example that we discuss above, where a causal connection between X and Y is distorted by the effects of a third variable. Regardless of the terminology used, the possible confounding effects of a third variable are recognized by everyone. Thus, we must always remember that the simple bivariate association between two variables can be quite unrepresentative of the true causal connection between the variables.

Published on May 29, 2020 by Lauren Thomas. Revised on July 21, 2022.

In research that investigates a potential cause-and-effect relationship, a confounding variable is an unmeasured third variable that influences both the supposed cause and the supposed effect.

It’s important to consider potential confounding variables and account for them in your research design to ensure your results are valid.

What is a confounding variable?

Confounding variables (a.k.a. confounders or confounding factors) are a type of extraneous variable that are related to a study’s independent and dependent variables. A variable must meet two conditions to be a confounder:

  • It must be correlated with the independent variable. This may be a causal relationship, but it does not have to be.
  • It must be causally related to the dependent variable.
Example of a confounding variableYou collect data on sunburns and ice cream consumption. You find that higher ice cream consumption is associated with a higher probability of sunburn. Does that mean ice cream consumption causes sunburn?

Here, the confounding variable is temperature: hot temperatures cause people to both eat more ice cream and spend more time outdoors under the sun, resulting in more sunburns.

What is meant by confounding what is a lurking variable What is a confounding variable?

Why confounding variables matter

To ensure the internal validity of your research, you must account for confounding variables. If you fail to do so, your results may not reflect the actual relationship between the variables that you are interested in..

For instance, you may find a cause-and-effect relationship that does not actually exist, because the effect you measure is caused by the confounding variable (and not by your independent variable).

ExampleYou find that more workers are employed in states with higher minimum wages. Does this mean that higher minimum wages lead to higher employment rates?

Not necessarily. Perhaps states with better job markets are more likely to raise their minimum wages, rather than the other way around. You must consider the prior employment trends in your analysis of the impact of the minimum wage on employment, or you might find a causal relationship where none exists.

Even if you correctly identify a cause-and-effect relationship, confounding variables can result in over- or underestimating the impact of your independent variable on your dependent variable.

ExampleYou find that babies born to mothers who smoked during their pregnancies weigh significantly less than those born to non-smoking mothers. However, if you do not account for the fact that smokers are more likely to engage in other unhealthy behaviors, such as drinking or eating less healthy foods, then you might overestimate the relationship between smoking and low birth weight.

Receive feedback on language, structure and formatting

Professional editors proofread and edit your paper by focusing on:

  • Academic style
  • Vague sentences
  • Grammar
  • Style consistency

See an example

What is meant by confounding what is a lurking variable What is a confounding variable?

How to reduce the impact of confounding variables

There are several methods of accounting for confounding variables. You can use the following methods when studying any type of subjects—humans, animals, plants, chemicals, etc. Each method has its own advantages and disadvantages.

Restriction

In this method, you restrict your treatment group by only including subjects with the same values of potential confounding factors.

Since these values do not differ among the subjects of your study, they cannot correlate with your independent variable and thus cannot confound the cause-and-effect relationship you are studying.

Restriction example You want to study whether a low-carb diet can cause weight loss. Since you know that age, sex, level of education and exercise intensity are all factors that may be associated with weight loss, as well as with the diet your subjects choose to follow, you choose to restrict your subject pool to 45-year-old women with bachelor’s degrees who exercise at moderate levels of intensity between 100–150 minutes per week.
  • Relatively easy to implement
  • Restricts your sample a great deal
  • You might fail to consider other potential confounders

Matching

In this method, you select a comparison group that matches with the treatment group. Each member of the comparison group should have a counterpart in the treatment group with the same values of potential confounders, but different independent variable values.

This allows you to eliminate the possibility that differences in confounding variables cause the variation in outcomes between the treatment and comparison group. If you have accounted for any potential confounders, you can thus conclude that the difference in the independent variable must be the cause of the variation in the dependent variable.

Matching exampleIn your study on low-carb diet and weight loss, you match up your subjects on age, sex, level of education and exercise intensity. This allows you to include a wider range of subjects: your treatment group includes men and women of different ages with a variety of education levels.

Each subject on a low-carb diet is matched with another subject with the same characteristics who is not on the diet. So for every 40-year-old highly educated man who follows a low-carb diet, you find another 40-year-old highly educated man who does not, to compare the weight loss between the two subjects. You do the same for all the other subjects in your treatment sample.

  • Allows you to include more subjects than restriction
  • Can prove difficult to implement since you need pairs of subjects that match on every potential confounding variable
  • Other variables that you cannot match on might also be confounding variables

Statistical control

If you have already collected the data, you can include the possible confounders as control variables in your regression models; in this way, you will control for the impact of the confounding variable.

Any effect that the potential confounding variable has on the dependent variable will show up in the results of the regression and allow you to separate the impact of the independent variable.

Statistical control exampleAfter collecting data about weight loss and low-carb diets from a range of participants, in your regression model, you include exercise levels, education, age, and sex as control variables, along with the type of diet each subjects follows as the independent variable. This allows you to separate the impact of diet chosen from the influence of these other four variables on weight loss in your regression.
  • Easy to implement
  • Can be performed after data collection
  • You can only control for variables that you observe directly, but other confounding variables you have not accounted for might remain

Randomization

Another way to minimize the impact of confounding variables is to randomize the values of your independent variable. For instance, if some of your participants are assigned to a treatment group while others are in a control group, you can randomly assign participants to each group.

Randomization ensures that with a sufficiently large sample, all potential confounding variables—even those you cannot directly observe in your study—will have the same average value between different groups. Since these variables do not differ by group assignment, they cannot correlate with your independent variable and thus cannot confound your study.

Since this method allows you to account for all potential confounding variables, which is nearly impossible to do otherwise, it is often considered to be the best way to reduce the impact of confounding variables.

Randomization exampleYou gather a large group of subjects to participate in your study on weight loss. You randomly select half of them to follow a low-carb diet and the other half to continue their normal eating habits.

Randomization guarantees that both your treatment (the low-carb-diet group) as well as your control group will have not only the same average age, education and exercise levels, but also the same average values on other characteristics that you haven’t measured as well.

  • Allows you to account for all possible confounding variables, including ones that you may not observe directly
  • Considered the best method for minimizing the impact of confounding variables
  • Most difficult to carry out
  • Must be implemented prior to beginning data collection
  • You must ensure that only those in the treatment (and not control) group receive the treatment

Frequently asked questions about confounding variables

What is a confounding variable?

A confounding variable, also called a confounder or confounding factor, is a third variable in a study examining a potential cause-and-effect relationship.

A confounding variable is related to both the supposed cause and the supposed effect of the study. It can be difficult to separate the true effect of the independent variable from the effect of the confounding variable.

In your research design, it’s important to identify potential confounding variables and plan how you will reduce their impact.

An extraneous variableis any variable that you’re not investigating that can potentially affect the dependent variable of your research study.

A confounding variable is a type of extraneous variable that not only affects the dependent variable, but is also related to the independent variable.

How do I prevent confounding variables from interfering with my research?

There are several methods you can use to decrease the impact of confounding variables on your research: restriction, matching, statistical control and randomization.

In restriction, you restrict your sample by only including certain subjects that have the same values of potential confounding variables.

In matching, you match each of the subjects in your treatment group with a counterpart in the comparison group. The matched subjects have the same values on any potential confounding variables, and only differ in the independent variable.

In statistical control, you include potential confounders as variables in your regression.

In randomization, you randomly assign the treatment (or independent variable) in your study to a sufficiently large number of subjects, which allows you to control for all potential confounding variables.

Cite this Scribbr article

If you want to cite this source, you can copy and paste the citation or click the “Cite this Scribbr article” button to automatically add the citation to our free Citation Generator.

Thomas, L. (2022, July 21). Confounding Variables | Definition, Examples & Controls. Scribbr. Retrieved November 3, 2022, from https://www.scribbr.com/methodology/confounding-variables/

Is this article helpful?

You have already voted. Thanks :-) Your vote is saved :-) Processing your vote...

What is a lurking and confounding variable?

A lurking variable is a variable that has an important effect on the relationship among the variables in the study, but is not one of the explanatory variables studied. Confounding. Two variables are confounded when their effects on a response variable cannot be distinguished from each other.

What is meant by confounding variable?

Confounding variables are those that affect other variables in a way that produces spurious or distorted associations between two variables. They confound the "true" relationship between two variables.

What is meant by confounding?

1 : to throw (a person) into confusion or perplexity tactics to confound the enemy. 2a : refute sought to confound his arguments. b : to put to shame : discomfit a performance that confounded the critics.

What is meant by confounding variable quizlet?

A confounding variable is an explanatory variable that was considered in a study whose effect cannot be distinguished from a second explanatory variable in the study.