Discovery and Temporal Analysis of MOOC Study Patterns
Keywords: Learning Analytics, LA, EDM, Sequence mining, Study pattern, Temporal analysis, MOOCs, Markov model, Clustering
The large-scale, fine-grained interaction data collected in online learning platforms such as massive open online courses (MOOCs) provide unique opportunities to better understand individuals’ learning processes and could facilitate the design of personalized and more effective support mechanisms for learners. In this paper, we present two methods of extracting study patterns from activity sequences. Unlike most previous work, which analyzes activity patterns post hoc, our proposed methods could be deployed during the course and enable learners to receive real-time support and feedback. The first method follows a hypothesis-driven approach: we extract predefined patterns from learners’ interactions with the course materials, then identify and analyze different longitudinal profiles among learners by clustering their study pattern sequences over the course. The second method is a data-driven approach that discovers latent study patterns and tracks them over time in a completely unsupervised manner. We propose a clustering pipeline that models and clusters activity sequences at each time step and then searches for matching clusters in previous steps to enable tracking over time. The pipeline is general and allows for analysis at different levels of action granularity and time resolution in various online learning environments. Experiments with synthetic data show that the second method can accurately detect latent study patterns and track changes in learning behaviours. We demonstrate the application of both methods on a MOOC dataset and study the temporal dynamics of learners’ behaviour in this context.
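The second method's idea of clustering activity sequences at each time step and then matching clusters to earlier steps can be illustrated with a minimal sketch. This is not the authors' implementation: the action vocabulary, the first-order Markov representation with add-one smoothing, the small k-means routine, the squared Euclidean distance, and the `max_dist` threshold are all assumptions chosen only to keep the example self-contained.

```python
ACTIONS = ["video", "quiz", "forum"]  # hypothetical action vocabulary

def transition_probs(seq, actions=ACTIONS, alpha=1.0):
    """Flattened first-order Markov transition probabilities (add-alpha smoothing)."""
    idx = {a: i for i, a in enumerate(actions)}
    n = len(actions)
    counts = [[alpha] * n for _ in range(n)]
    for a, b in zip(seq, seq[1:]):
        counts[idx[a]][idx[b]] += 1
    return [c / sum(row) for row in counts for c in row]

def sq_dist(u, v):
    return sum((x - y) ** 2 for x, y in zip(u, v))

def mean(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def kmeans(points, k, iters=20):
    """Tiny deterministic k-means (assumed stand-in for the clustering step)."""
    # Farthest-point initialisation, starting from the first point.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))

    def assign():
        return [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                for p in points]

    for _ in range(iters):
        labels = assign()
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[j] = mean(members)
    return assign(), centroids

def match_to_previous(centroids, prev_centroids, max_dist=0.1):
    """Link each current cluster to its nearest previous cluster.

    A cluster farther than max_dist (hypothetical threshold) from every
    previous cluster is treated as a new pattern and mapped to None.
    """
    mapping = {}
    for j, c in enumerate(centroids):
        best = min(range(len(prev_centroids)),
                   key=lambda i: sq_dist(c, prev_centroids[i]))
        close = sq_dist(c, prev_centroids[best]) <= max_dist
        mapping[j] = best if close else None
    return mapping

# Toy data: two study patterns per week; learner order differs between weeks,
# so the raw cluster ids are not comparable without the matching step.
week1 = [["video", "video", "video", "video", "quiz"],
         ["video", "video", "video", "quiz", "video"],
         ["quiz", "quiz", "quiz", "forum", "quiz"],
         ["quiz", "quiz", "forum", "quiz", "quiz"]]
week2 = [["quiz", "quiz", "quiz", "quiz", "forum"],
         ["quiz", "forum", "quiz", "quiz", "quiz"],
         ["video", "video", "quiz", "video", "video"],
         ["video", "video", "video", "video", "video"]]

labels1, cents1 = kmeans([transition_probs(s) for s in week1], k=2)
labels2, cents2 = kmeans([transition_probs(s) for s in week2], k=2)
links = match_to_previous(cents2, cents1)
print(labels1, labels2, links)
```

In this toy example the video-heavy and quiz-heavy clusters receive opposite ids in the two weeks, and the matching step links each week-2 cluster back to the week-1 cluster with the most similar transition profile. A learner whose linked cluster changes between steps would then be read as having changed study pattern.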
ISSN 1929-7750 (online). The Journal of Learning Analytics works under a Creative Commons License, Attribution-NonCommercial-NoDerivs 3.0 Unported (CC BY-NC-ND 3.0)
License
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication, with the work simultaneously licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported (CC BY-NC-ND 3.0) license that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).