Words of Wisdom: A Journey through the Realm of NLP for Learning Analytics - A Systematic Literature Review

Authors

DOI:

https://doi.org/10.18608/jla.2024.8403

Keywords:

natural language processing, text analytics, computational linguistics, learning analytics, research paper

Abstract

Learning Analytics (LA) is one of the world’s most influential research fields related to educational technology. Among the many themes that the LA community considers, Natural Language Processing (NLP) algorithms have been widely adopted to extract information from textual data generated in learning environments (e.g., student essays and short answers, online discussions and chats). NLP can shed light on the learning process and student outcomes in different contexts. Given the importance of NLP for education, this paper presents a systematic literature review of how the LA community has been applying NLP. Our methodology combines automatic and manual methods to extract information about authors, relevant papers, and specific data related to educational applications and algorithms used in the field. The review selected 156 papers that reveal essential aspects of the topic: (i) the majority of the works focused on the analysis of online discussions and essay assessment; (ii) in general, the authors did not apply the developed models in real settings; (iii) the more recent selected papers evaluate deep learning models (e.g., BERT) more frequently; (iv) the datasets used in the experiments are usually small and contain English text; (v) on average, model performance reaches a Cohen’s kappa of 0.54 and an accuracy of 0.79. The results of this study and its practical implications are further discussed.

References

Aggarwal, C. C., & Zhai, C. (2012). A survey of text classification algorithms. In C. C. Aggarwal & C. Zhai (Eds.), Mining text data (pp. 163–222). Springer. https://doi.org/10.1007/978-1-4614-3223-4_6

Aguerrebere, C., Cobo, C., Gomez, M., & Mateu, M. (2017). Strategies for data and learning analytics informed national education policies: The case of Uruguay. In Proceedings of the Seventh International Conference on Learning Analytics and Knowledge (LAK 2017), 13–17 March 2017, Vancouver, British Columbia, Canada (pp. 449–453). ACM. https://doi.org/10.1145/3027385.3027444

Ahadi, A., Singh, A., Bower, M., & Garrett, M. (2022). Text mining in education—A bibliometrics-based systematic review. Education Sciences, 12(3), 210. https://doi.org/10.3390/educsci12030210

Albano, V., Firmani, D., Laura, L., Mathew, J. G., Paoletti, A. L., & Torrente, I. (2023). NLP-based management of large multiple-choice test item repositories. Journal of Learning Analytics, 10(3), 28–44. https://doi.org/10.18608/jla.2023.7897

Albano, V., Firmani, D., Laura, L., Paoletti, A. L., & Torrente, I. (2022). Managing large multiple-choice test items repositories. In E. Banissi, A. Ursyn, M. W. M. Bannatyne, J. M. Pires, N. Datia, K. Nazemi, B. Kovalerchuk, R. Andonie, M. Nakayama, F. Sciarrone, W. Huang, Q. V. Nguyen, M. S. Mabakane, A. Rusu, M. Temperini, U. Cvek, M. Trutschl, H. Mueller, H. Siirtola, . . . V. Geroimenko (Eds.), Proceedings of the 26th International Conference on Information Visualisation (IV 2022), 19–22 July 2022, Vienna, Austria (pp. 275–279). IEEE. https://doi.org/10.1109/IV56949.2022.00054

Allen, L. K., Mills, C., Jacovina, M. E., Crossley, S., D’Mello, S., & McNamara, D. S. (2016). Investigating boredom and engagement during writing using multiple sources of information: The essay, the writer, and keystrokes. In Proceedings of the Sixth International Conference on Learning Analytics and Knowledge (LAK 2016), 25–29 April 2016, Edinburgh, UK (pp. 114–123). ACM. https://doi.org/10.1145/2883851.2883939

Allen, L. K., Mills, C., Perret, C., & McNamara, D. S. (2019). Are you talking to me? Multi-dimensional language analysis of explanations during reading. In Proceedings of the Ninth International Conference on Learning Analytics and Knowledge (LAK 2019), 4–8 March 2019, Tempe, Arizona, USA (pp. 116–120). ACM. https://doi.org/10.1145/3303772.3303835

Allen, L. K., Snow, E. L., & McNamara, D. S. (2015). Are you reading my mind? Modeling students’ reading comprehension skills with natural language processing techniques. In Proceedings of the Fifth International Conference on Learning Analytics and Knowledge (LAK 2015), 16–20 March 2015, Poughkeepsie, New York, USA (pp. 246–254). ACM. https://doi.org/10.1145/2723576.2723617

Azevedo, R. (2015). Defining and measuring engagement and learning in science: Conceptual, theoretical, methodological, and analytical issues. Educational Psychologist, 50(1), 84–94. https://doi.org/10.1080/00461520.2015.1004069

Balducci, B., & Marinova, D. (2018). Unstructured data in marketing. Journal of the Academy of Marketing Science, 46(4), 557–590. https://doi.org/10.1007/s11747-018-0581-x

Barbosa, A., Ferreira, M., Mello, R. F., Lins, R. D., & Gasevic, D. (2021). The impact of automatic text translation on classification of online discussions for social and cognitive presences. In Proceedings of the 11th International Conference on Learning Analytics and Knowledge (LAK 2021), 12–16 April 2021, Irvine, California, USA (pp. 77–87). ACM. https://doi.org/10.1145/3448139.3448147

Barbosa, G., Camelo, R., Cavalcanti, A. P., Miranda, P., Mello, R. F., Kovanovic, V., & Gasevic, D. (2020). Towards automatic cross-language classification of cognitive presence in online discussions. In Proceedings of the 10th International Conference on Learning Analytics and Knowledge (LAK 2020), 23–27 March 2020, Frankfurt, Germany (pp. 605–614). ACM. https://doi.org/10.1145/3375462.3375496

Benedetto, L., Cappelli, A., Turrin, R., & Cremonesi, P. (2020). R2DE: A NLP approach to estimating IRT parameters of newly generated questions. In Proceedings of the 10th International Conference on Learning Analytics and Knowledge (LAK 2020), 23–27 March 2020, Frankfurt, Germany (pp. 412–421). ACM. https://doi.org/10.1145/3375462.3375517

Boroujeni, M. S., Hecking, T., Hoppe, H. U., & Dillenbourg, P. (2017). Dynamics of MOOC discussion forums. In Proceedings of the Seventh International Conference on Learning Analytics and Knowledge (LAK 2017), 13–17 March 2017, Vancouver, British Columbia, Canada (pp. 128–137). ACM. https://doi.org/10.1145/3027385.3027391

Buckingham Shum, S., Sándor, Á., Goldsmith, R., Wang, X., Bass, R., & McWilliams, M. (2016). Reflecting on reflective writing analytics: Assessment challenges and iterative evaluation of a prototype tool. In Proceedings of the Sixth International Conference on Learning Analytics and Knowledge (LAK 2016), 25–29 April 2016, Edinburgh, UK (pp. 213–222). ACM. https://doi.org/10.1145/2883851.2883955

Canale, L., Farinetti, L., & Cagliero, L. (2021). From teaching books to educational videos and vice versa: A cross-media content retrieval experience. In W. K. Chan, B. Claycomb, H. Takakura, J.-J. Yang, Y. Teranishi, D. Towey, S. Segura, H. Shahriar, S. Reisman, & S. I. Ahamed (Eds.), Proceedings of the IEEE 45th Annual Computers, Software, and Applications Conference (COMPSAC 2021), 12–16 July 2021, online (pp. 115–120). IEEE. https://doi.org/10.1109/COMPSAC51774.2021.00027

Carnell, S., Lok, B., James, M. T., & Su, J. K. (2019). Predicting student success in communication skills learning scenarios with virtual humans. In Proceedings of the Ninth International Conference on Learning Analytics and Knowledge (LAK 2019), 4–8 March 2019, Tempe, Arizona, USA (pp. 436–440). ACM. https://doi.org/10.1145/3303772.3303828

Castañeda, L., & Selwyn, N. (2018). More than tools? Making sense of the ongoing digitizations of higher education. International Journal of Educational Technology in Higher Education, 15(1), 1–10. https://doi.org/10.1186/s41239-018-0109-y

Cavalcanti, A. P., Diego, A., Mello, R. F., Mangaroska, K., Nascimento, A., Freitas, F., & Gasevic, D. (2020). How good is my feedback? A content analysis of written feedback. In Proceedings of the 10th International Conference on Learning Analytics and Knowledge (LAK 2020), 23–27 March 2020, Frankfurt, Germany (pp. 428–437). ACM. https://doi.org/10.1145/3375462.3375477

Cavalcanti, A. P., Mello, R. F., Gasevic, D., & Freitas, F. (2023). Towards explainable prediction feedback messages using BERT. International Journal of Artificial Intelligence in Education. https://doi.org/10.1007/s40593-023-00375-w

Chang, X., Wang, B., & Hui, B. (2022). Towards an automatic approach for assessing program competencies. In Proceedings of the 12th International Conference on Learning Analytics and Knowledge (LAK 2022), 21–25 March 2022, online (pp. 119–129). ACM. https://doi.org/10.1145/3506860.3506875

Chen, B., & Resendes, M. (2014). Uncovering what matters: Analyzing transitional relations among contribution types in knowledge-building discourse. In Proceedings of the Fourth International Conference on Learning Analytics and Knowledge (LAK 2014), 24–28 March 2014, Indianapolis, Indiana, USA (pp. 226–230). ACM. https://doi.org/10.1145/2567574.2567606

Chen, G., Rolim, V., Mello, R. F., & Gasevic, D. (2020). Let’s shine together! A comparative study between learning analytics and educational data mining. In Proceedings of the 10th International Conference on Learning Analytics and Knowledge (LAK 2020), 23–27 March 2020, Frankfurt, Germany (pp. 544–553). ACM. https://doi.org/10.1145/3375462.3375500

Chowdhary, K. R. (2020). Natural language processing. In Fundamentals of artificial intelligence (pp. 603–649). Springer. https://doi.org/10.1007/978-81-322-3972-7_19

Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20(1), 37–46. https://doi.org/10.1177/001316446002000104

Cross, S., Waters, Z., Kitto, K., & Zuccon, G. (2017). Classifying help seeking behaviour in online communities. In Proceedings of the Seventh International Conference on Learning Analytics and Knowledge (LAK 2017), 13–17 March 2017, Vancouver, British Columbia, Canada (pp. 419–423). ACM. https://doi.org/10.1145/3027385.3027442

Crossley, S., Liu, R., & McNamara, D. (2017). Predicting math performance using natural language processing tools. In Proceedings of the Seventh International Conference on Learning Analytics and Knowledge (LAK 2017), 13–17 March 2017, Vancouver, British Columbia, Canada (pp. 339–347). ACM. https://doi.org/10.1145/3027385.3027399

Crossley, S., Paquette, L., Dascalu, M., McNamara, D. S., & Baker, R. S. (2016). Combining click-stream data with NLP tools to better understand MOOC completion. In Proceedings of the Sixth International Conference on Learning Analytics and Knowledge (LAK 2016), 25–29 April 2016, Edinburgh, UK (pp. 6–14). ACM. https://doi.org/10.1145/2883851.2883931

Dascalu, M., Trausan-Matu, S., Dessus, P., & McNamara, D. S. (2015). Discourse cohesion: A signature of collaboration. In Proceedings of the Fifth International Conference on Learning Analytics and Knowledge (LAK 2015), 16–20 March 2015, Poughkeepsie, New York, USA (pp. 350–354). ACM. https://doi.org/10.1145/2723576.2723578

Del Gobbo, E., Guarino, A., Cafarelli, B., & Grilli, L. (2023). GradeAid: A framework for automatic short answers grading in educational contexts—design, implementation and evaluation. Knowledge and Information Systems, 65(10), 4295–4334. https://doi.org/10.1007/s10115-023-01892-9

Donnelly, P. J., Blanchard, N., Olney, A. M., Kelly, S., Nystrand, M., & D’Mello, S. K. (2017). Words matter: Automatic detection of teacher questions in live classroom discourse using linguistics, acoustics, and context. In Proceedings of the Seventh International Conference on Learning Analytics and Knowledge (LAK 2017), 13–17 March 2017, Vancouver, British Columbia, Canada (pp. 218–227). ACM. https://doi.org/10.1145/3027385.3027417

Dood, A., Winograd, B., Finkenstaedt-Quinn, S., Gere, A., & Shultz, G. (2022). PeerBERT: Automated characterization of peer review comments across courses. In Proceedings of the 12th International Conference on Learning Analytics and Knowledge (LAK 2022), 21–25 March 2022, online (pp. 492–499). ACM. https://doi.org/10.1145/3506860.3506892

Erickson, J. A., Botelho, A. F., McAteer, S., Varatharaj, A., & Heffernan, N. T. (2020). The automated grading of student open responses in mathematics. In Proceedings of the 10th International Conference on Learning Analytics and Knowledge (LAK 2020), 23–27 March 2020, Frankfurt, Germany (pp. 615–624). ACM. https://doi.org/10.1145/3375462.3375523

Farrow, E., Moore, J., & Gasevic, D. (2019). Analysing discussion forum data: A replication study avoiding data contamination. In Proceedings of the Ninth International Conference on Learning Analytics and Knowledge (LAK 2019), 4–8 March 2019, Tempe, Arizona, USA (pp. 170–179). ACM. https://doi.org/10.1145/3303772.3303779

Farrow, E., Moore, J. D., & Gasevic, D. (2023). Names, nicknames, and spelling errors: Protecting participant identity in learning analytics of online discussions. In Proceedings of the 13th International Conference on Learning Analytics and Knowledge (LAK 2023), 13–17 March 2023, Arlington, Texas, USA (pp. 145–155). ACM. https://doi.org/10.1145/3576050.3576070

Ferguson, R., & Clow, D. (2017). Where is the evidence? A call to action for learning analytics. In Proceedings of the Seventh International Conference on Learning Analytics and Knowledge (LAK 2017), 13–17 March 2017, Vancouver, British Columbia, Canada (pp. 56–65). ACM. https://doi.org/10.1145/3027385.3027396

Ferguson, R., Khosravi, H., Kovanovic, V., Viberg, O., Aggarwal, A., Brinkhuis, M., Buckingham Shum, S., Chen, L. K., Drachsler, H., Guerrero, V. A., Hanses, M., Hayward, C., Hicks, B., Jivet, I., Kitto, K., Kizilcec, R., Lodge, J. M., Manly, C. A., Matz, R. L., . . . Yan, V. X. (2023). Aligning the goals of learning analytics with its research scholarship: An open peer commentary approach. Journal of Learning Analytics, 10(2), 14–50. https://doi.org/10.18608/jla.2023.8197

Ferguson, R., & Shum, S. B. (2012). Social learning analytics: Five approaches. In Proceedings of the Second International Conference on Learning Analytics and Knowledge (LAK 2012), 29 April–2 May 2012, Vancouver, British Columbia, Canada (pp. 23–33). ACM. https://doi.org/10.1145/2330601.2330616

Ferreira, M., Rolim, V., Mello, R. F., Lins, R. D., Chen, G., & Gasevic, D. (2020). Towards automatic content analysis of social presence in transcripts of online discussions. In Proceedings of the 10th International Conference on Learning Analytics and Knowledge (LAK 2020), 23–27 March 2020, Frankfurt, Germany (pp. 141–150). ACM. https://doi.org/10.1145/3375462.3375495

Ferreira, M. A. D., Mello, R. F., Kovanovic, V., Nascimento, A., Lins, R., & Gasevic, D. (2022). NASC: Network analytics to uncover socio-cognitive discourse of student roles. In Proceedings of the 12th International Conference on Learning Analytics and Knowledge (LAK 2022), 21–25 March 2022, online (pp. 415–425). ACM. https://doi.org/10.1145/3506860.3506978

Ferreira Mello, R., Fiorentino, G., Oliveira, H., Miranda, P., Rakovic, M., & Gasevic, D. (2022). Towards automated content analysis of rhetorical structure of written essays using sequential content-independent features in Portuguese. In Proceedings of the 12th International Conference on Learning Analytics and Knowledge (LAK 2022), 21–25 March 2022, online (pp. 404–414). ACM. https://doi.org/10.1145/3506860.3506977

Ferreira-Mello, R., Andre, M., Pinheiro, A., Costa, E., & Romero, C. (2019). Text mining in education. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 9(6), e1332. https://doi.org/10.1002/widm.1332

Fiallos, A., & Ochoa, X. (2019). Semi-automatic generation of intelligent curricula to facilitate learning analytics. In Proceedings of the Ninth International Conference on Learning Analytics and Knowledge (LAK 2019), 4–8 March 2019, Tempe, Arizona, USA (pp. 46–50). ACM. https://doi.org/10.1145/3303772.3303834

Fougt, S. S., Siebert-Evenstone, A., Eagan, B., Tabatabai, S., & Misfeldt, M. (2018). Epistemic network analysis of students’ longer written assignments as formative/summative evaluation. In Proceedings of the Eighth International Conference on Learning Analytics and Knowledge (LAK 2018), 7–9 March 2018, Sydney, New South Wales, Australia (pp. 126–130). ACM. https://doi.org/10.1145/3170358.3170414

Gibson, A., Aitken, A., Sándor, Á., Buckingham Shum, S., Tsingos-Lucas, C., & Knight, S. (2017). Reflective writing analytics for actionable feedback. In Proceedings of the Seventh International Conference on Learning Analytics and Knowledge (LAK 2017), 13–17 March 2017, Vancouver, British Columbia, Canada (pp. 153–162). ACM. https://doi.org/10.1145/3027385.3027436

Guo, L., Du, J., & Zheng, Q. (2023). Understanding the evolution of cognitive engagement with interaction levels in online learning environments: Insights from learning analytics and epistemic network analysis. Journal of Computer Assisted Learning, 39(3), 984–1001. https://doi.org/10.1111/jcal.12781

Herodotou, C., Rienties, B., Hlosta, M., Boroowa, A., Mangafa, C., & Zdrahal, Z. (2020). The scalable implementation of predictive learning analytics at a distance learning university: Insights from a longitudinal case study. The Internet and Higher Education, 45, 100725. https://doi.org/10.1016/j.iheduc.2020.100725

Hsiao, I.-H., & Awasthi, P. (2015). Topic facet modeling: Semantic visual analytics for online discussion forums. In Proceedings of the Fifth International Conference on Learning Analytics and Knowledge (LAK 2015), 16–20 March 2015, Poughkeepsie, New York, USA (pp. 231–235). ACM. https://doi.org/10.1145/2723576.2723613

Hu, X. (2017). Automated recognition of thinking orders in secondary school student writings. Learning: Research and Practice, 3(1), 30–41. https://doi.org/10.1080/23735082.2017.1284253

Hu, Y., Donald, C., Giacaman, N., & Zhu, Z. (2020). Towards automated analysis of cognitive presence in MOOC discussions: A manual classification study. In Proceedings of the 10th International Conference on Learning Analytics and Knowledge (LAK 2020), 23–27 March 2020, Frankfurt, Germany (pp. 135–140). ACM. https://doi.org/10.1145/3375462.3375473

Hunkins, N., Kelly, S., & D’Mello, S. (2022). “Beautiful work, you’re rock stars!”: Teacher analytics to uncover discourse that supports or undermines student motivation, identity, and belonging in classrooms. In Proceedings of the 12th International Conference on Learning Analytics and Knowledge (LAK 2022), 21–25 March 2022, online (pp. 230–238). ACM. https://doi.org/10.1145/3506860.3506896

Iqbal, S., Rakovic, M., Chen, G., Li, T., Ferreira Mello, R., Fan, Y., Fiorentino, G., Radi Aljohani, N., & Gasevic, D. (2023). Towards automated analysis of rhetorical categories in students essay writings using Bloom’s taxonomy. In Proceedings of the 13th International Conference on Learning Analytics and Knowledge (LAK 2023), 13–17 March 2023, Arlington, Texas, USA (pp. 418–429). ACM. https://doi.org/10.1145/3576050.3576112

Iqbal, S., Swiecki, Z., Joksimovic, S., Ferreira Mello, R., Aljohani, N., Ul Hassan, S., & Gasevic, D. (2022). Uncovering associations between cognitive presence and speech acts: A network-based approach. In Proceedings of the 12th International Conference on Learning Analytics and Knowledge (LAK 2022), 21–25 March 2022, online (pp. 315–325). ACM. https://doi.org/10.1145/3506860.3506908

Jayakodi, K., Bandara, M., Perera, I., & Meedeniya, D. (2016). WordNet and cosine similarity based classifier of exam questions using Bloom’s taxonomy. International Journal of Emerging Technologies in Learning (Online), 11(4), 142. https://doi.org/10.3991/ijet.v11i04.5654

Jung, Y., & Friend Wise, A. (2020). How and how well do students reflect? Multi-dimensional automated reflection assessment in health professions education. In Proceedings of the 10th International Conference on Learning Analytics and Knowledge (LAK 2020), 23–27 March 2020, Frankfurt, Germany (pp. 595–604). ACM. https://doi.org/10.1145/3375462.3375528

Khalil, M., Prinsloo, P., & Slade, S. (2021). Realising the potential of learning analytics: Reflections from a pandemic. In J. Liebowitz (Ed.), Online learning analytics (pp. 79–94). Auerbach Publications. https://doi.org/10.1201/9781003194620-5

Khurana, D., Koli, A., Khatter, K., & Singh, S. (2023). Natural language processing: State of the art, current trends and challenges. Multimedia Tools and Applications, 82(3), 3713–3744. https://doi.org/10.1007/s11042-022-13428-4

Kitchenham, B. A., Budgen, D., & Brereton, P. (2015). Evidence-based software engineering and systematic reviews. Chapman & Hall/CRC.

Kitto, K., Manly, C. A., Ferguson, R., & Poquet, O. (2023). Towards more replicable content analysis for learning analytics. In Proceedings of the 13th International Conference on Learning Analytics and Knowledge (LAK 2023), 13–17 March 2023, Arlington, Texas, USA (pp. 303–314). ACM. https://doi.org/10.1145/3576050.3576096

Knight, S., Buckingham Shum, S., Ryan, P., Sándor, Á., & Wang, X. (2018). Designing academic writing analytics for civil law student self-assessment. International Journal of Artificial Intelligence in Education, 28, 1–28. https://doi.org/10.1007/s40593-016-0121-0

Knight, S., & Littleton, K. (2015). Developing a multiple-document-processing performance assessment for epistemic literacy. In Proceedings of the Fifth International Conference on Learning Analytics and Knowledge (LAK 2015), 16–20 March 2015, Poughkeepsie, New York, USA (pp. 241–245). ACM. https://doi.org/10.1145/2723576.2723577

Kong, B., Hemberg, E., Bell, A., & O’Reilly, U.-M. (2023). Investigating student’s problem-solving approaches in MOOCs using natural language processing. In Proceedings of the 13th International Conference on Learning Analytics and Knowledge (LAK 2023), 13–17 March 2023, Arlington, Texas, USA (pp. 262–272). ACM. https://doi.org/10.1145/3576050.3576091

Kovanovic, V., Joksimovic, S., Mirriahi, N., Blaine, E., Gasevic, D., Siemens, G., & Dawson, S. (2018). Understand students’ self-reflections through learning analytics. In Proceedings of the Eighth International Conference on Learning Analytics and Knowledge (LAK 2018), 7–9 March 2018, Sydney, New South Wales, Australia (pp. 389–398). ACM. https://doi.org/10.1145/3170358.3170374

Kovanovic, V., Joksimovic, S., Waters, Z., Gasevic, D., Kitto, K., Hatala, M., & Siemens, G. (2016). Towards automated content analysis of discussion transcripts: A cognitive presence case. In Proceedings of the Sixth International Conference on Learning Analytics and Knowledge (LAK 2016), 25–29 April 2016, Edinburgh, UK (pp. 15–24). ACM. https://doi.org/10.1145/2883851.2883950

Kumar, V., & Boulanger, D. (2020). Explainable automated essay scoring: Deep learning really has pedagogical value. Frontiers in Education, 5, 572367. https://doi.org/10.3389/feduc.2020.572367

Larusson, J. A., & White, B. (2012). Monitoring student progress through their written “point of originality.” In Proceedings of the Second International Conference on Learning Analytics and Knowledge (LAK 2012), 29 April–2 May 2012, Vancouver, British Columbia, Canada (pp. 212–221). ACM. https://doi.org/10.1145/2330601.2330653

Lee, A. V. Y., & Tan, S. C. (2017). Temporal analytics with discourse analysis: Tracing ideas and impact on communal discourse. In Proceedings of the Seventh International Conference on Learning Analytics and Knowledge (LAK 2017), 13–17 March 2017, Vancouver, British Columbia, Canada (pp. 120–127). ACM. https://doi.org/10.1145/3027385.3027386

Leeman-Munk, S. P., Wiebe, E. N., & Lester, J. C. (2014). Assessing elementary students’ science competency with text analytics. In Proceedings of the Fourth International Conference on Learning Analytics and Knowledge (LAK 2014), 24–28 March 2014, Indianapolis, Indiana, USA (pp. 143–147). ACM. https://doi.org/10.1145/2567574.2567620

Li, Y., Sha, L., Yan, L., Lin, J., Rakovic, M., Galbraith, K., Lyons, K., Gasevic, D., & Chen, G. (2023). Can large language models write reflectively. Computers and Education: Artificial Intelligence, 4, 100140. https://doi.org/10.1016/j.caeai.2023.100140

Liu, Z., Kong, X., Chen, H., Liu, S., & Yang, Z. (2023). MOOC-BERT: Automatically identifying learner cognitive presence from MOOC discussion data. IEEE Transactions on Learning Technologies, 16(4), 528–542. https://doi.org/10.1109/TLT.2023.3240715

Marquez, L., Henríquez, V., Chevreux, H., Scheihing, E., & Guerra, J. (2024). Adoption of learning analytics in higher education institutions: A systematic literature review. British Journal of Educational Technology, 55(2), 439–459. https://doi.org/10.1111/bjet.13385

McAuley, J., O’Connor, A., & Lewis, D. (2012). Exploring reflection in online communities. In Proceedings of the Second International Conference on Learning Analytics and Knowledge (LAK 2012), 29 April–2 May 2012, Vancouver, British Columbia, Canada (pp. 102–110). ACM. https://doi.org/10.1145/2330601.2330630

McNamara, D. S., Allen, L. K., Crossley, S. A., Dascalu, M., & Perret, C. A. (2017). Natural language processing and learning analytics. In C. Lang, G. Siemens, A. Wise, & D. Gasevic (Eds.), Handbook of learning analytics (Vol. 93). Society for Learning Analytics Research. https://doi.org/10.18608/hla17

McNamara, D. S., Graesser, A. C., McCarthy, P. M., & Cai, Z. (2014). Automated evaluation of text and discourse with Coh-Metrix. Cambridge University Press. https://doi.org/10.1017/CBO9780511894664

Mello, R. F., Neto, R., Fiorentino, G., Alves, G., Aredes, V., Silva, J. V. G. F., Falcao, T. P., & Gasevic, D. (2022). Enhancing instructors’ capability to assess open-response using natural language processing and learning analytics. In I. Hilliger, P. J. Munoz-Merino, T. D. Laet, A. Ortega-Arranz, & T. Farrell (Eds.), Educating for a new future: Making sense of technology-enhanced learning adoption (EC-TEL 2022) (pp. 102–115). Springer. https://doi.org/10.1007/978-3-031-16290-9_8

Min, B., Ross, H., Sulem, E., Veyseh, A. P. B., Nguyen, T. H., Sainz, O., Agirre, E., Heintz, I., & Roth, D. (2023). Recent advances in natural language processing via large pre-trained language models: A survey. ACM Computing Surveys, 56(2), 1–40. https://doi.org/10.1145/360594

Molenaar, I., & Chiu, M. M. (2015). Effects of sequences of socially regulated learning on group performance. In Proceedings of the Fifth International Conference on Learning Analytics and Knowledge (LAK 2015), 16–20 March 2015, Poughkeepsie, New York, USA (pp. 236–240). ACM. https://doi.org/10.1145/2723576.2723586

Morris, W., Crossley, S., Holmes, L., & Trumbore, A. (2023). Using transformer language models to validate peer-assigned essay scores in massive open online courses (MOOCs). In Proceedings of the 13th International Conference on Learning Analytics and Knowledge (LAK 2023), 13–17 March 2023, Arlington, Texas, USA (pp. 315–323). ACM. https://doi.org/10.1145/3576050.3576098

Nicoll, S., Douglas, K., & Brinton, C. (2022). Giving feedback on feedback: An assessment of grader feedback construction on student performance. In Proceedings of the 12th International Conference on Learning Analytics and Knowledge (LAK 2022), 21–25 March 2022, online (pp. 239–249). ACM. https://doi.org/10.1145/3506860.3506897

Niemann, K., Schmitz, H.-C., Kirschenmann, U., Wolpers, M., Schmidt, A., & Krones, T. (2012). Clustering by usage: Higher order co-occurrences of learning objects. In Proceedings of the Second International Conference on Learning Analytics and Knowledge (LAK 2012), 29 April–2 May 2012, Vancouver, British Columbia, Canada (pp. 238–247). ACM. https://doi.org/10.1145/2330601.2330659

Oliveira, H., Ferreira Mello, R., Barreiros Rosa, B. A., Rakovic, M., Miranda, P., Cordeiro, T., Isotani, S., Bittencourt, I., & Gasevic, D. (2023). Towards explainable prediction of essay cohesion in Portuguese and English. In Proceedings of the 13th International Conference on Learning Analytics and Knowledge (LAK 2023), 13–17 March 2023, Arlington, Texas, USA (pp. 509–519). ACM. https://doi.org/10.1145/3576050.3576152

Oncel, P., Flynn, L. E., Sonia, A. N., Barker, K. E., Lindsay, G. C., McClure, C. M., McNamara, D. S., & Allen, L. K. (2021). Automatic student writing evaluation: Investigating the impact of individual differences on source-based writing. In Proceedings of the 11th International Conference on Learning Analytics and Knowledge (LAK 2021), 12–16 April 2021, Irvine, California, USA (pp. 620–625). ACM. https://doi.org/10.1145/3448139.3448207

Paredes, W. C., & Chung, K. S. K. (2012). Modelling learning & performance: A social networks perspective. In Proceedings of the Second International Conference on Learning Analytics and Knowledge (LAK 2012), 29 April–2 May 2012, Vancouver, British Columbia, Canada (pp. 34–42). ACM. https://doi.org/10.1145/2330601.2330617

Park, K., Sohn, H., Mott, B., Min, W., Saleh, A., Glazewski, K., Hmelo-Silver, C., & Lester, J. (2021). Detecting disruptive talk in student chat-based discussion within collaborative game-based learning environments. In Proceedings of the 11th International Conference on Learning Analytics and Knowledge (LAK 2021), 12–16 April 2021, Irvine, California, USA (pp. 405–415). ACM. https://doi.org/10.1145/3448139.3448178

Pugh, S. L., Rao, A. R., Stewart, A. E., & D’Mello, S. K. (2022). Do speech-based collaboration analytics generalize across task contexts? In Proceedings of the 12th International Conference on Learning Analytics and Knowledge (LAK 2022), 21–25 March 2022, online (pp. 208–218). ACM. https://doi.org/10.1145/3506860.3506894

Qiao, C., & Hu, X. (2019). Measuring knowledge gaps in student responses by mining networked representations of texts. In Proceedings of the Ninth International Conference on Learning Analytics and Knowledge (LAK 2019), 4–8 March 2019, Tempe, Arizona, USA (pp. 275–279). ACM. https://doi.org/10.1145/3303772.3303822

Qureshi, B. (2023). ChatGPT in computer science curriculum assessment: An analysis of its successes and shortcomings. In Proceedings of the Ninth International Conference on e-Society, e-Learning and e-Technologies (ICSLT 2023), 9–11 June 2023, Portsmouth, UK (pp. 7–13). ACM. https://doi.org/10.1145/3613944.3613946

Rajala, J., Hukkanen, J., Hartikainen, M., & Niemela, P. (2023). “Call me Kiran”—ChatGPT as a tutoring chatbot in a computer science course. In Proceedings of the 26th International Academic Mindtrek Conference (Mindtrek 2023), 3–6 October 2023, Tampere, Finland (pp. 83–94). ACM. https://doi.org/10.1145/3616961.3616974

Rakovic, M., Fan, Y., Van Der Graaf, J., Singh, S., Kilgour, J., Lim, L., Moore, J., Bannert, M., Molenaar, I., & Gasevic, D. (2022). Using learner trace data to understand metacognitive processes in writing from multiple sources. In Proceedings of the 12th International Conference on Learning Analytics and Knowledge (LAK 2022), 21–25 March 2022, online (pp. 130–141). ACM. https://doi.org/10.1145/3506860.3506876

Rakovic, M., Iqbal, S., Li, T., Fan, Y., Singh, S., Surendrannair, S., Kilgour, J., van Der Graaf, J., Lim, L., Molenaar, I., Bannert, M., Moore, J., & Gasevic, D. (2023). Harnessing the potential of trace data and linguistic analysis to predict learner performance in a multi-text writing task. Journal of Computer Assisted Learning, 39(3), 703–718. https://doi.org/10.1111/jcal.12769

Rakovic, M., Winne, P. H., Marzouk, Z., & Chang, D. (2021). Automatic identification of knowledge-transforming content in argument essays developed from multiple sources. Journal of Computer Assisted Learning, 37(4), 903–924. https://doi.org/10.1111/jcal.12531

Robinson, C., Yeomans, M., Reich, J., Hulleman, C., & Gehlbach, H. (2016). Forecasting student achievement in MOOCs with natural language processing. In Proceedings of the Sixth International Conference on Learning Analytics and Knowledge (LAK 2016), 25–29 April 2016, Edinburgh, UK (pp. 383–387). ACM. https://doi.org/10.1145/2883851.2883932

Ruan, S., Wei, W., & Landay, J. (2021). Variational deep knowledge tracing for language learning. In Proceedings of the 11th International Conference on Learning Analytics and Knowledge (LAK 2021), 12–16 April 2021, Irvine, California, USA (pp. 323–332). ACM. https://doi.org/10.1145/3448139.3448170

Rudian, S., Dittmeyer, M., & Pinkwart, N. (2022). Challenges of using auto-correction tools for language learning. In Proceedings of the 12th International Conference on Learning Analytics and Knowledge (LAK 2022), 21–25 March 2022, online (pp. 426–431). ACM. https://doi.org/10.1145/3506860.3506867

Schlippe, T., & Sawatzki, J. (2022). AI-based multilingual interactive exam preparation. In D. Guralnick, M. Auer, & A. Poce (Eds.), Innovations in learning and technology for the workplace and higher education: Proceedings of ‘The Learning Ideas Conference 2021,’ Lecture notes in networks and systems (pp. 396–408, Vol. 349). Springer. https://doi.org/10.1007/978-3-030-90677-1_38

Sekiya, T., Matsuda, Y., & Yamaguchi, K. (2015). Curriculum analysis of CS departments based on CS2013 by simplified, supervised LDA. In Proceedings of the Fifth International Conference on Learning Analytics and Knowledge (LAK 2015), 16–20 March 2015, Poughkeepsie, New York, USA (pp. 330–339). ACM. https://doi.org/10.1145/2723576.2723594

Selva Birunda, S., & Kanniga Devi, R. (2021). A review on word embedding techniques for text classification. In J. Raj, A. Iliyasu, R. Bestak, & Z. Baig (Eds.), Innovative data communication technologies and application: Proceedings of ICIDCA 2020, Lecture notes on data engineering and communications technologies (pp. 267–281, Vol. 59). Springer. https://doi.org/10.1007/978-981-15-9651-3_23

Serrat, O. (2017). Social network analysis. In Knowledge solutions (pp. 39–43). Springer. https://doi.org/10.1007/978-981-10-0983-9_9

Settles, B., & Meeder, B. (2016). A trainable spaced repetition model for language learning. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Vol. 1: Long Papers), 7–12 August 2016, Berlin, Germany (pp. 1848–1858). Association for Computational Linguistics. https://doi.org/10.18653/v1/P16-1174

Sherin, B. (2012). Using computational methods to discover student science conceptions in interview data. In Proceedings of the Second International Conference on Learning Analytics and Knowledge (LAK 2012), 29 April–2 May 2012, Vancouver, British Columbia, Canada (pp. 188–197). ACM. https://doi.org/10.1145/2330601.2330649

Shibani, A., Knight, S., & Buckingham Shum, S. (2019). Contextualizable learning analytics design: A generic model and writing analytics evaluations. In Proceedings of the Ninth International Conference on Learning Analytics and Knowledge (LAK 2019), 4–8 March 2019, Tempe, Arizona, USA (pp. 210–219). ACM. https://doi.org/10.1145/3303772.3303785

Shibani, A., Koh, E., Lai, V., & Shim, K. J. (2016). Analysis of teamwork dialogue: A data mining approach. In J. Joshi, G. Karypis, L. Liu, X. Hu, R. Ak, Y. Xia, W. Xu, A.-H. Sato, S. Rachuri, L. Ungar, P. S. Yu, R. Govindaraju, & T. Suzumura (Eds.), 2016 IEEE International Conference on Big Data (Big Data), 5–8 December 2016, Washington, DC, USA (pp. 4032–4034). IEEE. https://doi.org/10.1109/BigData.2016.7841100

Shusterman, E., Kim, H. G., Facciotti, M., Igo, M., Sripathi, K., Karger, D., Segal, A., & Gal, K. (2021). Seeding course forums using the teacher-in-the-loop. In Proceedings of the 11th International Conference on Learning Analytics and Knowledge (LAK 2021), 12–16 April 2021, Irvine, California, USA (pp. 22–31). ACM. https://doi.org/10.1145/3448139.3448142

Siemens, G., & Baker, R. S. d. (2012). Learning analytics and educational data mining: Towards communication and collaboration. In Proceedings of the Second International Conference on Learning Analytics and Knowledge (LAK 2012), 29 April–2 May 2012, Vancouver, British Columbia, Canada (pp. 252–254). ACM. https://doi.org/10.1145/2330601.2330661

Siemens, G., & Gasevic, D. (2012). Guest editorial—Learning and knowledge analytics. Journal of Educational Technology & Society, 15(3), 1–2. https://drive.google.com/file/d/1SJQZSFOrix9 WZTvBtzvUL70bsLa eqQ/view

Snow, E. L., Allen, L. K., Jacovina, M. E., Perret, C. A., & McNamara, D. S. (2015). You’ve got style: Detecting writing flexibility across time. In Proceedings of the Fifth International Conference on Learning Analytics and Knowledge (LAK 2015), 16–20 March 2015, Poughkeepsie, New York, USA (pp. 194–202). ACM. https://doi.org/10.1145/2723576.2723592

Sousa, E., Alexandre, B., Ferreira Mello, R., Pontual Falcao, T., Vesin, B., & Gasevic, D. (2021). Applications of learning analytics in high schools: A systematic literature review. Frontiers in Artificial Intelligence, 4. https://doi.org/10.3389/frai.2021.737891

Southavilay, V., Yacef, K., Reimann, P., & Calvo, R. A. (2013). Analysis of collaborative writing processes using revision maps and probabilistic topic models. In Proceedings of the Third International Conference on Learning Analytics and Knowledge (LAK 2013), 8–13 April 2013, Leuven, Belgium (pp. 38–47). ACM. https://doi.org/10.1145/2460296.2460307

Stone, C., Quirk, A., Gardener, M., Hutt, S., Duckworth, A. L., & D’Mello, S. K. (2019). Language as thought: Using natural language processing to model noncognitive traits that predict college success. In Proceedings of the Ninth International Conference on Learning Analytics and Knowledge (LAK 2019), 4–8 March 2019, Tempe, Arizona, USA (pp. 320–329). ACM. https://doi.org/10.1145/3303772.3303801

Swiecki, Z., & Shaffer, D. W. (2020). Isens: An integrated approach to combining epistemic and social network analyses. In Proceedings of the 10th International Conference on Learning Analytics and Knowledge (LAK 2020), 23–27 March 2020, Frankfurt, Germany (pp. 305–313). ACM. https://doi.org/10.1145/3375462.3375505

Tan, J. P.-L., Yang, S., Koh, E., & Jonathan, C. (2016, April). Fostering 21st century literacies through a collaborative critical reading and learning analytics environment: User-perceived benefits and problematics. In Proceedings of the Sixth International Conference on Learning Analytics and Knowledge (LAK 2016), 25–29 April 2016, Edinburgh, UK (pp. 430–434). ACM. https://doi.org/10.1145/2883851.2883965

Tausczik, Y. R., & Pennebaker, J. W. (2010). The psychological meaning of words: LIWC and computerized text analysis methods. Journal of Language and Social Psychology, 29(1), 24–54. https://doi.org/10.1177/0261927X09351676

Teplovs, C., Fujita, N., & Vatrapu, R. (2011). Generating predictive models of learner community dynamics. In Proceedings of the First International Conference on Learning Analytics and Knowledge (LAK 2011), 27 February–1 March 2011, Banff, Alberta, Canada (pp. 147–152). ACM. https://doi.org/10.1145/2090116.2090139

Thomas, D. A. (2014). Searching for significance in unstructured data: Text mining with Leximancer. European Educational Research Journal, 13(2), 235–256. https://doi.org/10.2304/eerj.2014.13.2.235

Tsai, Y.-S., & Gasevic, D. (2017). Learning analytics in higher education—challenges and policies: A review of eight learning analytics policies. In Proceedings of the Seventh International Conference on Learning Analytics and Knowledge (LAK 2017), 13–17 March 2017, Vancouver, British Columbia, Canada (pp. 233–242). ACM. https://doi.org/10.1145/3027385.3027400

Tsai, Y.-S., Whitelock-Wainwright, A., & Gašević, D. (2020). The privacy paradox and its implications for learning analytics. In Proceedings of the 10th International Conference on Learning Analytics and Knowledge (LAK 2020), 23–27 March 2020, Frankfurt, Germany (pp. 230–239). ACM. https://doi.org/10.1145/3375462.3375536

Ullmann, T. D. (2017). Reflective writing analytics: Empirically determined keywords of written reflection. In Proceedings of the Seventh International Conference on Learning Analytics and Knowledge (LAK 2017), 13–17 March 2017, Vancouver, British Columbia, Canada (pp. 163–167). ACM. https://doi.org/10.1145/3027385.3027394

Vytasek, J. M., Patzak, A., & Winne, P. H. (2019). Topic development to support revision feedback. In Proceedings of the Ninth International Conference on Learning Analytics and Knowledge (LAK 2019), 4–8 March 2019, Tempe, Arizona, USA (pp. 220–224). ACM. https://doi.org/10.1145/3303772.3303816

Wang, X., Wen, M., & Rosé, C. (2016). Towards triggering higher-order thinking behaviors in MOOCs. In Proceedings of the Sixth International Conference on Learning Analytics and Knowledge (LAK 2016), 25–29 April 2016, Edinburgh, UK (pp. 398–407). ACM. https://doi.org/10.1145/2883851.2883964

Whitelock, D., Twiner, A., Richardson, J. T. E., Field, D., & Pulman, S. (2015). OpenEssayist: A supply and demand learning analytics tool for drafting academic essays. In Proceedings of the Fifth International Conference on Learning Analytics and Knowledge (LAK 2015), 16–20 March 2015, Poughkeepsie, New York, USA (pp. 208–212). ACM. https://doi.org/10.1145/2723576.2723599

Wilson, A., Watson, C., Thompson, T. L., Drew, V., & Doyle, S. (2017). Learning analytics: Challenges and limitations. Teaching in Higher Education, 22(8), 991–1007. https://doi.org/10.1080/13562517.2017.1332026

Yan, L., Martinez-Maldonado, R., & Gasevic, D. (2024). Generative artificial intelligence in learning analytics: Contextualising opportunities and challenges through the learning analytics cycle. In Proceedings of the 14th International Conference on Learning Analytics and Knowledge (LAK 2024), 18–22 March 2024, Kyoto, Japan (pp. 101–111). ACM. https://doi.org/10.1145/3636555.3636856

Zylich, B., & Lan, A. (2021). Linguistic skill modeling for second language acquisition. In Proceedings of the 11th International Conference on Learning Analytics and Knowledge (LAK 2021), 12–16 April 2021, Irvine, California, USA (pp. 141–150). ACM. https://doi.org/10.1145/3448139.3448153

Published

2024-10-30

How to Cite

Ferreira, R., Freitas, E., Cabral, L., Dawn, F., Rodrigues, L., Rakovic, M., Raniel, J., & Gasevic, D. (2024). Words of Wisdom: A Journey through the Realm of NLP for Learning Analytics - A Systematic Literature Review. Journal of Learning Analytics, 11(3), 82–105. https://doi.org/10.18608/jla.2024.8403

Section

Research Papers