The Effects of Explanations in Automated Essay Scoring Systems on Student Trust and Motivation




explainable artificial intelligence, automated essay scoring systems, trust, motivation, academic writing, research paper


Ethical considerations, including transparency, play an important role when using artificial intelligence (AI) in education. Explainable AI has been proposed as a way to provide more insight into the inner workings of AI algorithms. However, carefully designed user studies on how to design explanations for AI in education remain limited. The current study aimed to identify the effect of explanations of an automated essay scoring system on students’ trust and motivation. The explanations were designed using a needs-elicitation study with students, combined with guidelines and frameworks from explainable AI. Two types of explanations were tested: full-text global explanations and an accuracy statement. The results showed that neither explanation had an effect on student trust or motivation compared to no explanation. Interestingly, the grade provided by the system, and especially the difference between the student’s self-estimated grade and the system grade, had a large influence. Hence, it is important to consider the effects of the system’s outcome (here: the grade) when evaluating the effect of explanations of AI in education.






How to Cite

Conijn, R., Kahr, P., & Snijders, C. (2023). The effects of explanations in automated essay scoring systems on student trust and motivation. Journal of Learning Analytics, 10(1), 37–53.



Special Section on Fairness, Equity, and Responsibility in Learning Analytics
