Interpretable Predictive Analytics for Online Learning: A Markov-Based Machine Learning Approach

Chaewon Lee; Lan Luo; Shelbi L. Kuhlmann; Robert D. Plumley; Abigail T. Panter; Matthew L. Bernacki; Jeffrey A. Greene; Kathleen M. Gates

doi:10.18608/jla.2025.8375

Authors

Chaewon Lee University of North Carolina at Chapel Hill https://orcid.org/0009-0008-0981-025X
Lan Luo University of North Carolina at Chapel Hill https://orcid.org/0000-0003-0779-2808
Shelbi L. Kuhlmann University of Memphis https://orcid.org/0000-0002-6451-6765
Robert D. Plumley University of North Carolina at Chapel Hill https://orcid.org/0000-0001-8979-6276
Abigail T. Panter University of North Carolina at Chapel Hill https://orcid.org/0000-0001-6914-8490
Matthew L. Bernacki University of North Carolina at Chapel Hill https://orcid.org/0000-0003-1279-2829
Jeffrey A. Greene University of North Carolina at Chapel Hill https://orcid.org/0000-0003-4145-1847
Kathleen M. Gates University of North Carolina at Chapel Hill https://orcid.org/0000-0002-1246-4529

Keywords:

predictive learning analytics, learning management system, machine learning, Markov chain, explainable AI, research paper

Abstract

The increasing use of learning management systems (LMSs) generates vast amounts of clickstream data, opening new avenues for predicting learner performance. Traditionally, LMS predictive analytics have relied on either supervised machine learning or Markov models to classify learners based on predicted learning outcomes. Machine learning excels at pattern recognition but often overlooks temporal learning dynamics and obscures the reasoning behind predictions due to the black-box nature of many algorithms. Alternatively, Markov models provide an effective solution by capturing temporal learning dynamics for prediction, uncovering distinctive learning patterns between high and low performers. Despite these advantages, Markov model classification struggles with the heterogeneity of learning sequences, limiting its broad applicability. To address these limitations and bridge the gap between the two dominant approaches, we propose a hybrid framework: sequence-based Markov machine learning classification (seqMAC). Leveraging early-stage clickstream data, seqMAC provides an interpretable sequence classification method that captures critical behavioural transitions and identifies distinct learning patterns across performance groups. Tested on six LMS samples, seqMAC effectively identified at-risk students despite sequence heterogeneity, uncovering key predictive learning dynamics that differentiate performance groups. It also demonstrated promising generalizability, accurately identifying future at-risk students based on historical clickstream data.
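To make the idea of "behavioural transitions" as predictors concrete, the sketch below turns one learner's click sequence into first-order Markov transition probabilities that could serve as classifier inputs. This is only an illustration under stated assumptions: the abstract does not specify how seqMAC builds its features, and the function name `transition_features`, the action labels, and the example sequence are all hypothetical.

```python
from collections import Counter
from itertools import product


def transition_features(sequence, states):
    """Flatten a sequence's first-order transition probabilities into a dict.

    Each key "a->b" holds the share of moves out of state a that went to b.
    States the learner never left get a probability of 0.0 for every target.
    """
    # Count each consecutive (source, destination) pair in the sequence.
    counts = Counter(zip(sequence, sequence[1:]))

    # Total number of moves leaving each source state.
    row_totals = Counter()
    for (src, _dst), n in counts.items():
        row_totals[src] += n

    features = {}
    for src, dst in product(states, repeat=2):
        total = row_totals[src]
        features[f"{src}->{dst}"] = counts[(src, dst)] / total if total else 0.0
    return features


# Hypothetical LMS action labels and a made-up click sequence (assumptions).
STATES = ["syllabus", "lecture", "quiz"]
clicks = ["syllabus", "lecture", "lecture", "quiz", "lecture", "quiz"]

feats = transition_features(clicks, STATES)
for name, prob in feats.items():
    print(f"{name}: {prob:.2f}")
```

Computing one such vector per learner would give a fixed-width feature table regardless of sequence length, which is one plausible way for a supervised classifier to handle heterogeneous sequences while keeping each predictor readable as a specific transition.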
