Higher education institutions are responsible for understanding the characteristics of the learners who choose to take courses with them, whether online or on campus, and what it takes to keep them learning at the institution. Taking heed and modifying structures, communications, and services will help learners and institutions in an ever-expanding online degree market where organizations compete globally for learners. Today, acquiring learners through marketing and recruitment accounts for a large portion of the higher education budget, and online learners are retained at rates 10-20% lower than learners in face-to-face offerings (Herbert, 2006). It is therefore paramount to the success of distance and online institutions to figure out how to keep these learners. Knowing who they are and what is important to them, as well as the factors that drive retention, will help us set benchmarks and devise plans to see these learners through to graduation.

We examine the research and literature available on online learners and retention (using key terms such as “online learner population,” “online learner retention,” and “distance learner retention”) alongside our own statistical analysis of Colorado State University Online learner retention. Together, these will help us identify the characteristics of a retained population so that we can advise learners on credit hours, provide services that support their learning, and recognize when certain learner populations might need extra support to be retained.


Semester-to-semester retention is a key metric for college administrators to predict student success because students who “stop out” are less likely to graduate (DesJardins et al., 2006). At schools with higher graduation rates, more of the students who stop out ultimately return (EAB, n.d.), but the delay pushes back the potential earning gains normally seen from completing. In the wake of COVID-19, between July 2020 and July 2021, 1.4 million people stopped out of higher education programs without earning a credential, bringing the total population of Americans with incomplete degrees and certificates up to 40 million (National Student Clearinghouse Research Center, 2022).

We know that “nontraditional”/online learners take more breaks from their education before they finally attain their degree, which has a lot to do with who these learners are and the competing priorities in their lives. Colorado State University recently commissioned a report from Hanover Research on student caregivers and the impact of caregiving on our learner population. The report shows that 20-30% of our learners have dependents: 29% of our graduate learner population and 20% of our undergraduate population (Rodgers, 2024). Learners in this population are less likely to persist, and they are more likely to take fewer courses, to face financial and mental health challenges, and to be first-generation and non-white (Rodgers, 2024). One of the first things online learners choose when deciding where to take courses is their instructional modality (Stokes, 2023).

Online learning was a growing field prior to 2020; by 2024 it is familiar to, if not fully understood by, most households in the United States. “In 2021, about 60% of all postsecondary degree seekers in the U.S. took at least some online classes. Around 30% studied exclusively online” (Hamilton, 2023). Within Colorado State University Online alone, student credit hours have increased by 12-17% per term (year over year) for the last two years. The composition of our learner population has also changed with these increases; our current population is younger and more diverse, both geographically and ethnically, than it was prior to the pandemic.

According to the 2023 Wiley Voice of the Online Learner report, online learners are price-conscious and are earning their degrees to help them achieve career goals and/or personal growth (Stokes, 2023). We know online learners have more constraints on their time and study locations than their peers (Mowreader, 2024). This could be because a higher share of online learners are female (National Center for Education Statistics, 2023) and because, on average, women take on a greater share of family and household duties than their male counterparts (Jolly et al., 2014). Obligations to family are a primary and recurring reason why online learners drop an online course (Evans, 2009). Comparing the obligations of a traditional-age, face-to-face learner with those of a learner who has a family and works full-time, it is unsurprising that online learners are not retained at the same rate as their face-to-face counterparts. However, home factors are not for us to control, nor for us to dictate. The factors we can help with are within our institution: how we communicate with learners and how we support the whole learner. In their 2021 journal article, Seery, Barreda, and Hein address this as “rethinking the retention process” (p. 82), noting that distance learners have different characteristics and face differing demands, both of which need to be considered when designing retention incentives and alternatives.

Literature Review 

Research into challenges impacting online learner retention separates factors into three primary categories: internal challenges, such as time management and motivation; external challenges, such as lack of employer support, financial problems, and limited environments to study; and program-related challenges, such as low interaction with educators and peers, overly demanding programs, and lack of institutional support (Kara et al., 2019).

Available research to date focuses on qualitative aspects of successful online learners: their characteristics and the institutional factors that contribute to their success (summarized below). Quantitative research into learner success, however, is either lacking, outdated, or cited less often than the qualitative research. Our own research in this paper focuses on quantitative analysis. Within the online modality, we have an even greater responsibility to retain and support our learners because online programs reach a traditionally marginalized and unreached population of learners (Prinsloo, 2022).

Online Learner characteristics 
  • Often determine modality of learning first (Stokes, 2023) 
  • Price-conscious (Stokes, 2023) 
  • Values collaboration and interaction (Dabbagh, 2007) 
  • Intrinsically motivated learners (Dabbagh, 2007) 
  • Learners possessing an internal locus of control (Dabbagh, 2007) 
  • Less location bound and of more diverse backgrounds (Dabbagh, 2007) 
Institutional characteristics/strategies for success  
  • Mandatory orientation programs (Bawa, 2016) 
  • Collaborative learning (Bawa, 2016)  
  • Social engagement (Seery, Barreda, & Hein, 2021) 
  • Student engagement and student sense of belonging (Muljana and Luo, 2019) 
  • Learning facilitation which focuses on instructor interaction, logical course structures and organization of content (Muljana and Luo, 2019) 
  • Course development strategies which support differing learning tactics and connect curriculum to past experiences (Seery, Barreda, & Hein, 2021) 
  • Student services which support the whole learner (Muljana and Luo, 2019) 
Quantitative Research Methodology

The key goals of this study are:

(1) to propose a model that can accurately predict online learners' decision to stay or leave an academic institution

(2) to investigate critical online learners' features that impact their decision to stay or drop out, and

(3) to examine the nature of the relationship between learners who stay and who drop out.

This study contributes to the literature on learners' retention behavior in two ways. First, we use an interpretable machine learning method to identify online learners' critical features and the relationship between the response variable and the predictors. The proposed data-driven, non-parametric method does not impose prior assumptions and evaluates all predictors to isolate key features. Second, we use a large institutional dataset to examine online learners' retention indicators.

Variable selection

We collected available online learner data from a four-year flagship institution in Colorado (Colorado State University, Fort Collins). The cross-sectional dataset includes records for 3,300+ online learners covering academic years 2020-2022. It includes graduate and undergraduate degree-seeking students enrolled online at the institution. The dependent variable is an online learner's decision to stay or leave from one fall to the next. A comprehensive list of variables affecting student retention behavior at an academic institution is given in the literature (Parvez & Brown, 2019). Following that literature, this study's explanatory (predictor) variables include student level, application type, attempted credit hours, earned credit hours, grade point average (GPA), and student demographics (e.g., age, race, first-generation status, residency status, and gender). Both statistical and machine learning models are trained and tested on these independent variables to estimate their impact on learners' retention behavior. All model variables are reported in Table 1.

Table 1. Dataset Specifications (selected variables)

| Sr No | Variables | Descriptions | Units |
|---|---|---|---|
| 1 | retained | Indicator for persistence: the learner is either still enrolled at the institution or has graduated | Y = 1; N = 0 |
| 2 | FirstGen | Indicator for whether a student is a first-generation student during the first fall | Y = 1; N = 0 |
| 3 | Gpa_cal | Semester grade point average during the first fall | Number |
| 4 | Age | Age as of census during the first fall | Years |
| 5 | Attempted_credits | Total attempted credits during the first fall | Number |
| 6 | Completed_credits | Total completed credits during the first fall | Number |
| 7 | CO_resident | Indicator for Colorado resident | Y = 1; N = 0 |
| 8 | Student_level | Students enrolled in undergrad programs or not | Y = 1; N = 0 |
| 9 | Application_type | Students enrolled as new or transfer | Y = 1; N = 0 |
| 10 | Gender_female | Indicator for whether a student is female during the first fall | 0 = Male; 1 = Female |
| 11 | Race_white | Indicator for whether a student is White | Y = 1; N = 0 |
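
The definition of the retained indicator in Table 1 can be illustrated with a short sketch. Only the rule itself (a learner who is still enrolled or has graduated counts as retained) comes from the table; the record layout and the field names `enrolled_next_fall` and `graduated` are hypothetical and are not taken from the institutional dataset.

```python
from dataclasses import dataclass


@dataclass
class LearnerRecord:
    # Hypothetical fields for illustration; not the institution's actual schema.
    learner_id: str
    enrolled_next_fall: bool
    graduated: bool


def retained_flag(record: LearnerRecord) -> int:
    """Return 1 if the learner is still enrolled or has graduated, else 0."""
    return 1 if (record.enrolled_next_fall or record.graduated) else 0


# Toy records, invented for this example only.
records = [
    LearnerRecord("A", enrolled_next_fall=True, graduated=False),
    LearnerRecord("B", enrolled_next_fall=False, graduated=True),
    LearnerRecord("C", enrolled_next_fall=False, graduated=False),
]
flags = [retained_flag(r) for r in records]
retention_rate = sum(flags) / len(flags)  # share of toy learners coded as retained
```

Counting graduates as retained keeps learners who finished their program from being coded as departures.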

Results and Discussion:

This study employed binomial logit and machine learning (ML) models to identify the factors affecting learners' behavior. The use of binomial logistic regression alongside data mining algorithms to predict student retention behavior is established in the literature (Parvez et al., 2020; Parvez et al., 2023). According to the descriptive statistics, female online learners (56.2 percent) outnumber male learners (43.8 percent) at this institution. White learners make up the largest racial group (68.53 percent) compared with all other groups combined (31.47 percent), and 15.81 percent of learners identified themselves as first-generation. The majority of online learners were retained by the institution (86% of undergraduate and 83% of graduate learners). One out of three online learners is a state resident. A total of 77.69% of online learners are new students, compared with 22.31% who are transfer students. Finally, most online learners (75.48%) are enrolled in graduate programs compared to 34.52% who are enrolled in undergraduate programs at this institution.

Table 2. Descriptive Statistics for online learners (numeric variable only)

| Variables | Mean (UG)* | S.D.** | Mean (GR)* | S.D.** |

*UG refers to undergraduate and GR refers to graduate online learners;

**S.D. refers to standard deviation

Mean GPA is higher for graduate online learners (3.38) than for undergraduate online learners (2.68). The mean college entrance age of undergraduate online learners is 28.67 with a standard deviation (s.d.) of 9.20, compared with 32.74 (s.d. 9.30) for graduate learners. Undergraduate online learners registered for (attempted) 9.53 credits on average in their first semester and completed 7.45 credits. In contrast, graduate online learners attempted 5.87 credits on average in their first semester and completed 5.41 credits (Table 2).
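
As a rough illustration of how such group summaries are computed, the sketch below calculates a mean and standard deviation by student level on invented toy records, then derives a completion ratio from the means reported above. The toy rows and the helper `summarize` are hypothetical. The use of the sample standard deviation is an assumption, since the study does not state which form was reported. The ratio of two means is also not the same as the average of each learner's own completion ratio.

```python
from statistics import mean, stdev

# Toy first-fall records, invented for illustration only:
# (student_level, attempted_credits, completed_credits)
toy_records = [
    ("UG", 12, 9),
    ("UG", 9, 9),
    ("UG", 6, 3),
    ("GR", 6, 6),
    ("GR", 3, 3),
    ("GR", 9, 6),
]
ATTEMPTED, COMPLETED = 1, 2  # column positions in each toy record


def summarize(records, level, column):
    """Return (mean, sample standard deviation) of one column for one student level."""
    values = [row[column] for row in records if row[0] == level]
    return mean(values), stdev(values)


ug_attempted = summarize(toy_records, "UG", ATTEMPTED)
gr_completed = summarize(toy_records, "GR", COMPLETED)

# Completion ratios computed from the means reported in the text.
ug_completion_ratio = 7.45 / 9.53
gr_completion_ratio = 5.41 / 5.87
```

Using the reported means, undergraduate learners completed about 78% of the credits they attempted in their first semester, and graduate learners about 92%.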

Table 3. Effects of predictor variables on online learners' behavior 

Logistic Regression Model Output - Response variable (dependent variable): learners’ retained (yes=1) 

| Explanatory variables | Marginal Effects |
|---|---|
| FirstGen | 0.003 (0.020) |
| GPA_Cal | 0.095*** (0.008) |
| Age | -0.002*** (0.000) |
| Attempted_Credits | -0.029*** (0.005) |
| Completed_Credits | 0.027*** (0.005) |
| Gender_female | -0.034** (0.015) |
| CO_resident | 0.044** (0.015) |
| Race_White | -0.026 (0.020) |
| Race_international | 0.244*** (0.062) |
| Race_hispanic_latino | -0.025 (0.022) |
| No. of observations | 3,308 |

Note: Reported values are the estimated marginal effects and, in parentheses, standard errors.

*** significant at 1%, ** significant at 5%.
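
For readers less familiar with marginal effects in a logit model, the sketch below shows one common way to compute them: averaging, over observations, the coefficient times p(1 - p), where p is the predicted probability. This is a minimal illustration under stated assumptions, not the study's estimation code. The coefficients, intercept, and toy rows are hypothetical, and the study does not state whether its marginal effects are averaged over observations or evaluated at the means.

```python
import math


def logistic(z: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def average_marginal_effects(rows, intercept, coefs):
    """Average over rows of coef_j * p_i * (1 - p_i) for each predictor j.

    This derivative-based formula suits continuous predictors; for 0/1
    indicators a discrete difference in predicted probability is often used.
    """
    n = len(rows)
    totals = [0.0] * len(coefs)
    for x in rows:
        p = logistic(intercept + sum(b * v for b, v in zip(coefs, x)))
        for j, b in enumerate(coefs):
            totals[j] += b * p * (1.0 - p)
    return [t / n for t in totals]


# Hypothetical coefficients and toy rows (gpa, attempted_credits);
# these are NOT the study's estimates or data.
toy_rows = [(3.5, 6.0), (2.0, 12.0), (3.0, 9.0)]
ame = average_marginal_effects(toy_rows, intercept=0.5, coefs=[0.8, -0.2])
```

Each marginal effect carries the sign of its coefficient and is expressed on the probability scale, so it can be read directly as a change in retention probability.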

We estimate p-values for each explanatory variable; according to the marginal effects, most explanatory variables are statistically significant. For those variables, there is evidence of an independent association with the probability of a learner being retained (rather than a difference observed due to chance). Key regression results (marginal effects) indicate that learner GPA in the first semester has a positive and statistically significant impact on retention behavior. Learners who are state residents are also more likely to be retained by the institution than non-resident learners. Online learners who registered for a higher number of credits are less likely to be retained; however, the total number of credits completed in the first semester is positively related to retention. Another key finding is that international online learners are highly likely to be retained by the academic institution. The other race variables (e.g., White and Hispanic or Latinx) are statistically insignificant, as is first-generation status. Being female is negatively related to retention, a result we did not expect (Table 3).
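
The marginal effects in Table 3 can be read as approximate changes in retention probability per unit change in a variable. The sketch below applies that reading to a few hypothetical scenarios, using three values copied from Table 3. It is a first-order (linear) approximation for illustration only; the scenarios and the helper `approx_probability_change` are invented here and are not part of the study.

```python
# Marginal effects as reported in Table 3 (change in retention probability per unit).
MARGINAL_EFFECTS = {
    "GPA_Cal": 0.095,
    "Attempted_Credits": -0.029,
    "Completed_Credits": 0.027,
}


def approx_probability_change(changes):
    """First-order approximation: sum of marginal effect times change in each variable."""
    return sum(MARGINAL_EFFECTS[name] * delta for name, delta in changes.items())


# Hypothetical scenarios, for illustration only.
three_more_attempted_not_completed = approx_probability_change({"Attempted_Credits": 3})
three_more_attempted_and_completed = approx_probability_change(
    {"Attempted_Credits": 3, "Completed_Credits": 3}
)
half_point_gpa = approx_probability_change({"GPA_Cal": 0.5})
```

Under this approximation, three additional attempted credits that are not completed correspond to a retention probability about 8.7 percentage points lower, whereas three additional credits that are both attempted and completed nearly cancel out (about 0.6 percentage points lower). A half-point higher GPA corresponds to roughly 4.75 percentage points higher.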

Implications of this research

Retaining online learners is a concern for a growing number of institutions and learners who embark upon such a journey. Our findings suggest that in order to keep learners at our institution, it is important for undergraduate learners to attempt around 8 credits per term and for graduate learners to attempt around 6 credits per term. This does not mean that CSU Online will be limiting the credits a learner can take or adding to the barriers and red tape which already exist for our learners, but rather means that in the advising and orientation process for our learners we will be explaining how attempting certain credit amounts can support the learners’ success.

Our learners are retained at 86% for undergraduate and 83% for graduate learners. We know that online learners are more often female than male, and this research shows that our female learners are not retained at as high a rate as their male counterparts in the same program of learning. This means that we need to support our female learners differently to retain them.

We see that learner GPA has a positive impact on learner retention. This may seem unremarkable to some; however, it could be that learners who feel more successful want to continue, rather than simply that a good GPA signals the aptitude to continue.

One factor we had not considered in advance is that Colorado residency is positively associated with retention. It is possible that this is due to the timing and availability of extra resources for learners who are closer to campus; we will look further into which aspects of being in Colorado affect persistence.

Limitations and suggestions for future research

Our research reviewed literature available in the wider online learning space, but our data for this study was limited to Colorado State University (CSU) and our Online learners only. We are limited to the centrally supported offerings of CSU which is a land-grant R1 institution. If an institution offers programs differently or is a fully supported online program apart from a physical university, these findings may be very different.

We would like to continue our research in the following ways:

  • More statistical analysis of which GPA ranges lead to the highest retention rates.
  • Which levels of completed credits, for undergraduate and graduate learners, lead to the highest retention rates.
  • Retention rates across different programs.
  • What contributes to Colorado residents being retained more successfully than learners outside Colorado?
References (APA):
  1. Bawa, P. (2016). Retention in online courses. SAGE Open, 6(1), 215824401562177. https://doi.org/10.1177/2158244015621777
  2. Castro, M. D. B., & Tumibay, G. M. (2021). A literature review: Efficacy of online learning courses for higher education institution using meta-analysis. Education and Information Technologies, 26, 1367–1385. https://doi.org/10.1007/s10639-019-10027-z
  3. Dabbagh, N. (2007). The online learner: Characteristics and pedagogical implications. Contemporary Issues in Technology and Teacher Education, 7(3), 217–226.
  4. Dhawan, S. (2020). Online learning: A panacea in the time of COVID-19 crisis. Journal of Educational Technology Systems, 49(1), 5–22. https://doi.org/10.1177/0047239520934018
  5. DesJardins, S., Ahlburg, D., & McCall, B. (2006). The effects of interrupted enrollment on graduation from college: Racial, income, and ability differences. Economics of Education Review, 25(6), 574–590. https://doi.org/10.1016/j.econedurev.2005.06.002
  6. Dumford, A. D., & Miller, A. L. (2018). Online learning in higher education: Exploring advantages and disadvantages for engagement. Journal of Computing in Higher Education, 30(3), 452–465. https://doi.org/10.1007/s12528-018-9179-z
  7. EAB. (n.d.). What happens to stopouts? Improve graduation rates. Retrieved March 8, 2024, from https://eab.com/resources/infographic/what-happens-to-stopouts-improve-graduation-rates/
  8. Evans, T. N. (2009). An investigative study of factors that influence the retention rates in online programs at selected state, state-affiliated, and private universities (Doctoral dissertation). Retrieved from ProQuest Dissertations and Theses database. (ProQuest document ID: 1937608371).
  9. Fung, C. Y., Perry, E. J., Su, S. I., & Garcia, M. B. (2022). Development of a socioeconomic inclusive assessment framework for online learning in higher education (pp. 23–46). https://doi.org/10.4018/978-1-6684-4364-4.ch002
  10. Greenhow, C., Graham, C. R., & Koehler, M. J. (2022). Foundations of online learning: Challenges and opportunities. Educational Psychologist, 57(3), 131–147. https://doi.org/10.1080/00461520.2022.2090364
  11. Hamilton, I. (2023, May 24). By the numbers: The rise of online learning in the U.S. Forbes. https://www.forbes.com/advisor/education/online-colleges/online-learning-stats/#:~:text=Online%20colleges%20and%20universities%20enroll,Around%2030%25%20studied%20exclusively%20online.
  12. Herbert, M. (2006). Staying the course: A study in online student satisfaction and retention. Online Journal of Distance Learning Administration, 9(4). Retrieved from http://www.westga.edu/~distance/ojdla/winter94/herbert94.htm
  13. Hussain, S., & Khan, M. Q. (2021). Student-Performulator: Predicting students’ academic performance at secondary and intermediate level using machine learning. Annals of Data Science. https://doi.org/10.1007/s40745-021-00341-0
  14. Jolly, S., Griffith, K. A., DeCastro, R., Stewart, A., Ubel, P., & Jagsi, R. (2014). Gender differences in time spent on parenting and domestic responsibilities by high-achieving young physician-researchers. Annals of Internal Medicine, 160(5), 344–353. https://doi.org/10.7326/M13-0974
  15. Kara, M., Erdoğdu, F., Kokoç, M., & Cagiltay, K. (2019). Challenges faced by adult learners in online distance education: A literature review. Open Praxis, 11(1), 5. https://doi.org/10.5944/openpraxis.11.1.929
  16. Lee, Y., & Choi, J. (2010). A review of online course dropout research: Implications for practice and future research. Educational Technology Research and Development, 59(5), 593–618. https://doi.org/10.1007/s11423-010-9177-y
  17. Martin, F., Chen, Y., Moore, R. L., & Westine, C. D. (2020). Systematic review of adaptive learning research designs, context, strategies, and technologies from 2009 to 2018. Educational Technology Research and Development, 68, 1903–1929. https://doi.org/10.1007/s11423-020-09793-2
  18. Mowreader, A. (2024, January 12). Online learners less likely to complete compared to peers. Inside Higher Ed. https://www.insidehighered.com/news/student-success/academic-life/2024/01/12/online-learners-less-likely-complete-compared-peers
  19. Muljana, P. S., & Luo, T. (2019). Factors contributing to student retention in online learning and recommended strategies for improvement: A systematic literature review. Journal of Information Technology Education: Research, 18, 19–57. https://doi.org/10.28945/4182
  20. National Center for Education Statistics. (2023). Postbaccalaureate enrollment. Condition of Education. U.S. Department of Education, Institute of Education Sciences. Retrieved July 20, 2023, from https://nces.ed.gov/programs/coe/indicator/chb
  21. National Student Clearinghouse Research Center. (2022, May 10). Some college, no credential. https://nscresearchcenter.org/some-college-no-credential/
  22. Ouatik, F., et al. Predicting student success using big data and machine learning algorithms. International Journal of Emerging Technologies in Learning (iJET). Retrieved February 27, 2024, from online-journals.org/index.php/i-jet/article/view/30259
  23. Parvez, R., Tarantino, A., & Meerza, S. I. A. (2023). Understanding the prediction of student retention behavior during COVID-19 using effective data mining techniques. Research Square (Preprint). https://doi.org/10.21203/rs.3.rs-2948727/v1
  24. Parvez, R., Meerza, S. I. A., & Khan Chowdhury, N. H. (2020). Economics of student retention behavior in higher education. In Proceedings of the Annual Meeting, Agricultural and Applied Economics Association (AAEA), Kansas City, Missouri. https://doi.org/10.22004/ag.econ.304405
  25. Parvez, R., & Brown, K. (2019). An empirical assessment of student retention in community colleges. In Proceedings of the Annual Meeting, Association of Institutional Research (AIR) Forum, Denver, CO.
  26. Paulino, R. A. (2022, September 15). Analyzing distance learning and flexible offerings during the COVID-19 era and the possible impact on future practices. NASPA. https://www.naspa.org/blog/analyzing-distance-learning-and-flexible-offerings-during-the-covid-19-era-and-the-possible-impact-on-future-practices
  27. Prinsloo, P. (2022). Improving student retention and success: Realizing the (im)possible. West African Journal of Open & Flexible Learning, 1(11), 127–134.
  28. Rodgers, M. (2024, February). Student caregivers. Hanover Research. https://hanoverresearch.com
  29. Seaman, J. E., Allen, I. E., & Seaman, J. (2018). Grade increase: Tracking distance education in the United States. Babson Survey Research Group.
  30. Seery, K., Barreda, A., & Hein, S. (2021). Retention strategies for online students: A systematic literature review. Journal of Global Education and Research, 5(1), 72–84. https://www.doi.org/10.5038/2577-509x.5.1.1105
  31. Stokes, K. (2023, June 20). Voice of the online learner 2023. Wiley. https://universityservices.wiley.com/voice-of-the-online-learner-2023/
  32. Turnbull, D., Chugh, R., & Luck, J. (2021). Transitioning to e-learning during the COVID-19 pandemic: How have higher education institutions responded to the challenge? Education and Information Technologies, 26, 6401–6419. https://doi.org/10.1007/s10639-021-10633-w
  33. Xu, X., Wang, J., Peng, H., & Wu, R. (2019). Prediction of academic performance associated with internet usage behaviors using machine learning algorithms. Computers in Human Behavior, 98, 166–173. https://doi.org/10.1016/j.chb.2019.04.015
  34. Zhang, L., et al. (2022). Academia’s responses to crisis: A bibliometric analysis of literature on online learning in higher education during COVID‐19. British Journal of Educational Technology, 53(3), 620–646. https://doi.org/10.1111/bjet.13191

*This paper was one of three selected as a "Best Paper" among DLA 2024 proceedings, Jekyll Island, Georgia, July 28-31, 2024.