Exploring Automated Assessment of Primary Students’ Creativity in a Flow-Based Music Programming Environment
DOI:
https://doi.org/10.18608/jla.2025.8835
Keywords:
creativity, automated assessment, generative AI, music programming, flow-based programming, K-12, research paper
Abstract
Creativity is a vital skill in science, technology, engineering, and mathematics (STEM)-related education, fostering innovation and problem-solving. Traditionally, creativity assessments relied on human evaluations, such as the consensual assessment technique (CAT), which are resource-intensive, time-consuming, and often subjective. Recent advances in computational methods, particularly large language models (LLMs), have enabled automated creativity assessments. In this study, we extend research on automated creativity scoring to a flow-based music programming environment, a context that integrates computational and creative thinking. We collected 383 programming artifacts from 194 primary school students (2022–2024) and employed two automated approaches: an evidence-centred design (ECD) framework-based approach and an LLM-based approach using ChatGPT-4 with few-shot learning. The ECD-based approach integrates divergent thinking, complexity, efficiency, and emotional expressiveness, while the LLM-based approach uses CAT ratings and ECD examples to learn creativity scoring. Results revealed moderate to strong correlations with human evaluations (ECD-based: \( r = 0.48 \); LLM-based: \( r = 0.68 \)), with the LLM-based approach demonstrating greater consistency across varying learning examples (\( r = 0.82 \)). These findings highlight the potential of automated tools for scalable, objective, and efficient creativity assessment, paving the way for their application in creativity-focused learning environments.
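As a rough illustration of how agreement between automated scores and human ratings can be quantified, the sketch below computes a correlation coefficient for two lists of scores. It assumes the reported \( r \) values are Pearson correlations, which the abstract does not state explicitly. The function name `pearson_r` and the two score lists are hypothetical toy values chosen for illustration; they are not data from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical toy ratings for five artifacts, NOT data from the study.
human_ratings = [1, 2, 3, 4, 5]
automated_scores = [2, 1, 4, 3, 5]

r = pearson_r(human_ratings, automated_scores)
print(round(r, 2))  # prints 0.8
```

With real data, each position in the two lists would correspond to the same student artifact, so that the coefficient reflects how closely the automated ranking tracks the human one.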
License
Copyright (c) 2025 Journal of Learning Analytics

This work is licensed under a Creative Commons Attribution 4.0 International License.