Causal Inference and Bias in Learning Analytics

A Primer on Pitfalls Using Directed Acyclic Graphs




Keywords: learning analytics, causal inference, LA, directed acyclic graphs, DAG, research design, observational research, bias


Abstract

As a research field geared toward understanding and improving learning, Learning Analytics (LA) must be able to provide empirical support for causal claims. However, because LA is a highly applied field, tightly controlled randomized experiments are not always feasible or desirable. Instead, researchers often rely on observational data, from which they may be reluctant to draw causal inferences. Recent decades have seen considerable progress on causal inference in the absence of experimental data. This paper introduces directed acyclic graphs (DAGs), an increasingly popular tool for visually determining the validity of causal claims. Using DAGs, three basic pitfalls are outlined: confounding bias, overcontrol bias, and collider bias. The paper then shows how these pitfalls may be present in the published LA literature and suggests possible remedies. Finally, this approach is discussed in light of practical constraints and the need for theoretical development.






How to Cite

Weidlich, J., Gašević, D., & Drachsler, H. (2022). Causal inference and bias in learning analytics: A primer on pitfalls using directed acyclic graphs. Journal of Learning Analytics, 9(3), 183–199.
