Assessing Creativity Across Multi-Step Intervention Using Generative AI Models
DOI: https://doi.org/10.18608/jla.2025.8571

Keywords: creativity, divergent thinking (DT), Alternative Uses Test (AUT), generative AI (GenAI), longitudinal study, automated scoring, educational assessment, practice opportunities, research paper

Abstract
Creativity is an essential skill for today’s learners, one that makes important contributions to issues of inclusion and equity in education. Assessing creativity is therefore of major importance in educational contexts. However, scoring creativity with traditional tools suffers from subjectivity and is highly time- and labour-intensive. This is indeed the case for the commonly used Alternative Uses Test (AUT), in which participants are asked to list as many different uses as possible for an everyday object. The test measures divergent thinking (DT), which involves exploring multiple possible solutions across various semantic domains. This study leverages recent advancements in generative AI (GenAI) to automate the AUT scoring process, potentially increasing efficiency and objectivity. Using two validated models, we analyze the dynamics of creativity dimensions in a multi-step intervention aimed at improving creativity through repeated AUT sessions (N=157 9th-grade students). Our research questions focus on the behavioural patterns of DT dimensions over time, their correlation with the number of practice opportunities, and the influence of response order on creativity scores. The results show improvement in fluency and flexibility as a function of practice opportunities, as well as various correlations between DT dimensions. By automating the scoring process, this study aims to provide deeper insights into the development of creative skills over time and to explore the capabilities of GenAI in educational assessment. Ultimately, automatic evaluation can embed creativity assessment in a wide range of educational processes at scale.
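To make the two DT dimensions reported above concrete, the following is a minimal illustrative sketch, not the study's actual pipeline: fluency counts the responses a participant lists, and flexibility counts the distinct semantic categories those responses span. The `score_aut` function and the hand-assigned category labels are hypothetical; in the study itself, categorization and scoring are performed by validated GenAI models.

```python
# Hypothetical sketch of two AUT divergent-thinking dimensions:
# fluency = number of responses; flexibility = number of distinct
# semantic categories. Category labels are hand-assigned here for
# illustration only.

def score_aut(responses):
    """Return (fluency, flexibility) for one participant's AUT responses.

    `responses` is a list of (use, category) pairs, where `category`
    is a semantic-domain label for the proposed use.
    """
    fluency = len(responses)                          # total ideas listed
    flexibility = len({cat for _, cat in responses})  # distinct domains
    return fluency, flexibility

# Example: alternative uses for a brick.
brick_uses = [
    ("doorstop", "weight"),
    ("paperweight", "weight"),
    ("hammer substitute", "tool"),
    ("garden border", "construction"),
]

print(score_aut(brick_uses))  # → (4, 3)
```

Note that the two dimensions can diverge: many responses drawn from one domain yield high fluency but low flexibility, which is why the study tracks them separately across practice opportunities.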
License
Copyright (c) 2024 Journal of Learning Analytics

This work is licensed under a Creative Commons Attribution 4.0 International License.