Analysis of Student Behavior Using the R Package crsra

John Muschelli
Jeffrey Leek


Because Massive Open Online Courses (MOOCs) differ fundamentally from traditional education and continue to grow in popularity, more research is needed to understand current and future trends in education. Although research in the field has grown rapidly in recent years, the complexity and messiness of the data remain one of the main challenges facing researchers. It is therefore imperative to provide tools that pave the way for more research on MOOCs. This paper introduces crsra, a package for the statistical software R that helps tidy and perform preliminary analysis on the large volumes of data provided by Coursera. The package offers three advantages: (a) faster loading and organizing of data for analysis, (b) an efficient method for combining data from multiple courses and even across institutions, and (c) a set of functions for analyzing student behavior.






