Interpretable Predictive Analytics for Online Learning
A Markov-Based Machine Learning Approach
DOI: https://doi.org/10.18608/jla.2025.8375

Keywords: predictive learning analytics, learning management system, machine learning, Markov chain, explainable AI, research paper

Abstract
The increasing use of learning management systems (LMSs) generates vast amounts of clickstream data, opening new avenues for predicting learner performance. Traditionally, LMS predictive analytics have relied on either supervised machine learning or Markov models to classify learners based on predicted learning outcomes. Machine learning excels at pattern recognition but often overlooks temporal learning dynamics and obscures the reasoning behind predictions due to the black-box nature of many algorithms. Alternatively, Markov models provide an effective solution by capturing temporal learning dynamics for prediction, uncovering distinctive learning patterns between high and low performers. Despite these advantages, Markov model classification struggles with the heterogeneity of learning sequences, limiting its broad applicability. To address these limitations and bridge the gap between the two dominant approaches, we propose a hybrid framework: sequence-based Markov machine learning classification (seqMAC). Leveraging early-stage clickstream data, seqMAC provides an interpretable sequence classification method that captures critical behavioural transitions and identifies distinct learning patterns across performance groups. Tested on six LMS samples, seqMAC effectively identified at-risk students despite sequence heterogeneity, uncovering key predictive learning dynamics that differentiate performance groups. It also demonstrated promising generalizability, accurately identifying future at-risk students based on historical clickstream data.
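To make the hybrid idea concrete, a common way to combine Markov models with supervised machine learning is to estimate a first-order transition matrix from each learner's clickstream and feed the flattened transition probabilities to a classifier. The sketch below is illustrative only and is not the authors' seqMAC implementation; the event labels (`login`, `content`, `quiz`, `forum`) and the Laplace smoothing constant are hypothetical choices for the example.

```python
import numpy as np

# Hypothetical LMS event types; a real coding scheme would come from the data.
STATES = ["login", "content", "quiz", "forum"]
IDX = {s: i for i, s in enumerate(STATES)}

def transition_matrix(seq, smoothing=1.0):
    """Estimate a row-stochastic first-order transition matrix
    from one learner's clickstream, with Laplace smoothing so that
    unseen transitions still get nonzero probability."""
    n = len(STATES)
    counts = np.full((n, n), smoothing)
    for a, b in zip(seq, seq[1:]):          # consecutive event pairs
        counts[IDX[a], IDX[b]] += 1
    return counts / counts.sum(axis=1, keepdims=True)

# Flatten each learner's matrix into a feature vector; these vectors
# could then be passed to any supervised classifier.
seqs = {
    "s1": ["login", "content", "quiz", "content", "quiz"],
    "s2": ["login", "forum", "forum", "login", "forum"],
}
X = np.vstack([transition_matrix(s).ravel() for s in seqs.values()])
print(X.shape)  # one 4x4 matrix per learner, flattened to 16 features
```

Because each feature is a named transition probability (e.g. quiz→content), a classifier trained on such vectors remains interpretable: feature importances map directly onto behavioural transitions, which is the kind of explanation the abstract attributes to the Markov component.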
License
Copyright (c) 2025 Journal of Learning Analytics

This work is licensed under a Creative Commons Attribution 4.0 International License.