Analyzing Students’ Problem-Solving Sequences

A Human-in-the-Loop Approach


  • Erica Kleinman, University of California, Santa Cruz
  • Murtuza Shergadwala, University of California, Santa Cruz
  • Zhaoqing Teng, University of California, Santa Cruz
  • Jennifer Villareale, Drexel University
  • Andy Bryant, University of California, Santa Cruz
  • Jichen Zhu, IT University of Copenhagen
  • Magy Seif El-Nasr, University of California, Santa Cruz



Keywords: learning analytics, sequence analysis, visualization, human-in-the-loop methods, mixed methods, game-based learning, research paper


Educational technology is shifting toward facilitating personalized learning. Such personalization, however, requires a detailed understanding of students’ problem-solving processes. Sequence analysis (SA) is a promising approach to gaining granular insights into student problem solving; however, existing techniques are difficult to interpret because they offer little room for human input in the analysis process. Ultimately, in a learning context, a human stakeholder makes the decisions, so they should be able to drive the analysis process. In this paper, we present a human-in-the-loop approach to SA that uses visualization to allow a stakeholder to better understand both the data and the algorithm. We illustrate the method with a case study in the context of a learning game called Parallel. Results reveal six groups of students, grouped by their problem-solving patterns, and highlight individual differences within each group. We compare the results to a state-of-the-art method run on the same data and discuss the benefits of our method and the implications of this work.






How to Cite

Kleinman, E., Shergadwala, M., Teng, Z., Villareale, J., Bryant, A., Zhu, J., & Seif El-Nasr, M. (2022). Analyzing Students’ Problem-Solving Sequences: A Human-in-the-Loop Approach. Journal of Learning Analytics, 1–23.


