AI-Augmented Advising: A Comparative Study of GPT-4 and Advisor-based Major Recommendations

Kasra Lekan; Zachary A. Pardos

doi:10.18608/jla.2025.8593

Authors

Kasra Lekan University of California, Berkeley
Zachary A. Pardos University of California, Berkeley https://orcid.org/0000-0002-6016-7051

Keywords:

advising, major selection, GPT, LLM, AI-human collaboration, higher education, generative AI, experimental study, research paper

Abstract

Choosing an undergraduate major is an important decision that impacts academic and career outcomes. In this work, we investigate augmenting personalized human advising for major selection using a large language model (LLM), GPT-4. Through a three-phase survey, we compare GPT suggestions and responses for undeclared first- and second-year students (n = 33) to expert responses from university advisors (n = 25). Undeclared students were first surveyed on their interests and goals. These responses were then given to both campus advisors and GPT to produce a major recommendation for each student. In the case of GPT, information about the majors offered on campus was added to the prompt. Overall, advisors rated the recommendations of GPT to be highly helpful (4.0 out of 5 on its explanation for the recommendation and 3.8 on its answers to individual student questions) and agreed with its recommendations 33% of the time. Advisors also agreed more often with the AI's major recommendations when they saw those recommendations before making their own, although this difference was not statistically significant. We categorize qualitative feedback from advisors with an affinity diagram and outline five design implications for future AI-assisted academic advising systems. The results provide a first signal as to the viability of LLMs for personalized major recommendation and shed light on the promise and limitations of AI for advising support.

