Students' retention on online learning: Establishing a predictive model at a private university in Vietnam

Low levels of student retention have become one of the most significant issues in online learning. Most studies in the literature have pointed out factors contributing to student retention in online learning environments; however, few have focused on establishing a model that helps minimize student dropout rates. Hence, this paper aims to formulate a predictive model to tackle this issue. Using a quantitative survey design and the PLS-SEM approach to data analysis, the research involved 162 students. The analysis identified several factors and their relationships with student retention: academic locus of control, flow experience, satisfaction, and learning strategies. The study also indicated that, to improve students' retention in online learning, student satisfaction should receive more attention than the other factors.


Introduction
Online learning has, without a doubt, long been considered a global phenomenon. With the support of rapid technological advances, online learning has overcome its obstacles and has been accepted worldwide (Sorensen & Donovan, 2017). Many recent articles have praised online learning as the key to the new era due to its benefits. Notably, online learning can bridge the gap among areas within a country and beyond; provide a flexible learning environment (J. Watson & Johnson, 2011); develop critical and technological skills (Ngo, 2021; Wardani, Martono, Pratomo, Rusydi, & Kusuma, 2018); and enhance the traditional classroom (Fadde & Vu, 2014; Tran & Nguyen, 2022). For institutions employing online learning, Tareen and Haand (2020) describe it as a new educational market with high profits and stable growth in the 21st century. However, some requirements and features of online learning, such as suitable infrastructure, different patterns of interaction, and learner autonomy, can lead to severe problems for educational organizations, including school dropouts (Hamid, Sentryo, & Hasan, 2020). Student retention is therefore important for every country that seeks to maintain a qualified workforce. Hanson (2021) reports that the college dropout rate is 40%, of which 30% are freshmen who leave before finishing their first year. In some Asian countries, such as Japan, it is reported that from April to December 2020, the number of student dropouts was 1,300 (Kakuchi, 2021). In Vietnam, VNS (2017) reports that nearly 20 percent of students do not complete their last year in college.
Within the context of online learning and COVID-19, the situation seems to be more critical. Online learning offers more opportunities for students to participate in the learning environment via various tools, especially in the current context. However, many complain that online interactions do not suit all students. As a result, college dropouts have become a greater challenge for online learning than for traditional face-to-face classrooms (Mubarak, Cao, & Zhang, 2020; Pham & Van Nghiem, 2022; Simpson, 2018). Furthermore, with the sudden shift from offline to online mode, many higher education institutions are losing revenue due to this issue (Kakuchi, 2021). If the situation does not improve, institutions may be discouraged from investing in online learning. Therefore, it is necessary to investigate students' dropout intentions.

Factors impacting school dropouts in online learning at the tertiary level
Many scholars have identified factors that affect school dropouts in online learning in different higher education contexts. For example, Rovai (2003) proposes that school dropout among college students is a complicated phenomenon that needs careful attention. The author concludes with a model consisting of four main factors: student styles before school entrance, student competence before school entrance, and external and internal influences after entering school. Additionally, H. Choi (2016) presents some factors in college dropouts, i.e., the learner (age, gender, social status, self-motivation), external factors (social encouragement, family finance and support, personal problems), internal factors (academic performance, technological and motivational issues), and outcomes (GPA). These factors vary from study to study; however, they are mainly related to the teaching and learning environment, the learners, and the school assistance. In an attempt to give an overall picture, Y. Lee and Choi (2013) propose a model of five latent variables, including (1) internal academic locus of control (ALOC) (the students' control of their learning), (2) student satisfaction (students' satisfaction with their learning and related issues), (3) student flow experience (students' deep engagement in their learning activities), (4) use of learning strategies (the strategies students employ in their learning process), and (5) student retention (persistence in completing the online course). Additionally, Chongbang (2021) explains how parents' socioeconomic condition also affects the affordability, availability, and accessibility of virtual learning.
The locus of control was first proposed by Rotter (1954) as the personal control of a particular result. There are two aspects of perception: internal (if a person believes that their own action causes the effect) and external (if a person assumes that the result is caused by chance or other people's behavior). Concerning the educational context, Findley and Cooper (1983) confirm that students with an internal academic locus of control have better academic performance than those with an external one. This is because, with a high internal academic locus of control, students are more satisfied with their results and try to avoid failure.
Student satisfaction: The focus on student satisfaction arises from bringing the client-customer relationship to education (Mark, 2013). Elliott (2002) and Wu, Tennyson, and Hsia (2010) define student satisfaction as the result of the educational experience a particular institution gives to students.
Student flow experience: Csikszentmihalyi and Csikzentmihaly (1990) conclude that flow is a state of deep engagement in an activity. In terms of education, student flow experience refers to the intensive focus on a learning process that helps a student succeed.
Use of learning strategies: Learning strategies refer to perceiving, storing, and using the given information (Alexander, Graham, & Harris, 1998). McKeachie (1987) states that learning strategies could be classified according to student cognition levels, such as cognitive strategies and metacognitive strategies.
Student retention: Villano, Harrison, Lynch, and Chen (2018) define student retention as the decision to remain at school. In other words, it indicates the continuity of a student's education at an educational institution.
These latent variables were derived by reviewing a variety of studies related to factors impacting student retention as well as their interrelationships (Joo, Joung, & Sim, 2011; Keller & Blomann, 2008; E. Lee, 2001; Levy, 2007; Morris, Wu, & Finnegan, 2005; Ro & Guo, 1988; Zimmerman, 1990). The relationships among these variables were also established by reviewing these studies. Firstly, Rotter (1966) and Gianakos (2002) point out the relationship between internal locus of control and job satisfaction. In the educational context, Morris et al. (2005) confirm the influence of internal academic locus of control on student persistence in online learning courses. These authors also indicate that this variable effectively predicts students' course completion.
What is more, Levy (2007) adds the relationship between student retention and student satisfaction. Remarkably, he states in his research that the less satisfied students are with their online courses, the more likely they are to quit. Finally, in terms of locus of control, the study of Y. Lee and Choi (2013) shows that student satisfaction has a mediating effect on the relationship between the student's internal ALOC and retention.
Additionally, Joo et al. (2011) and Keller and Blomann (2008) identify the relationship between internal ALOC and flow experience. Specifically, the stronger a person's internal ALOC, the more likely they are to attain a high level of flow experience. Moreover, Shin (2006) and Joo et al. (2011) propose that flow experience positively affects student satisfaction. In particular, Joo et al. (2011) report a mediating effect of flow experience in the relationship between ALOC and student retention.
E. Lee (2001) finds that learning strategies positively impact students' flow experience. The study concluded that learning strategies also affect student satisfaction through the mediator "flow experience." In other words, learning strategies enhance student satisfaction via deep involvement in the learning process (flow experience). In addition, learning strategies mainly deal with internal factors such as learning control (Gall, Gall, Jacobsen, & Bullock, 1990; Pintrich, 1988). As a result, student learning strategies are affected by internal ALOC.

Research Question
By looking closely at the current situation and the literature gap, this study aims to identify the predictive factors affecting students' retention in online learning within the context of a private university in Vietnam. Therefore, the following research question was formulated: What factors affect students' retention in online learning at a private university in Vietnam?

Hypotheses and conceptual framework of the study
From the above review and the study of Y. Lee and Choi (2013), this study aimed to re-examine the effects of internal academic locus of control (ALOC), student satisfaction, student flow experience, and use of learning strategies on student retention. Moreover, as mentioned in the literature review, the mediating roles of student satisfaction and student flow experience were considered. Hence, five hypotheses were established:
H1: Internal ALOC, student flow experience, and student satisfaction have a positive effect on retention.
H2: Internal ALOC and student flow experience positively affect student satisfaction.
H3: Internal ALOC and learning strategies positively affect student flow experience.
H4: Student satisfaction mediates the positive effects of Internal ALOC on student retention.
H5: Student flow experience mediates the positive effects of learning strategies on student satisfaction.
The conceptual framework, or structural model, of the study was drawn from these hypotheses.
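The structural model implied by the hypotheses can be pictured as a small directed graph. The following is a minimal sketch, assuming only the direct paths stated in H1 to H3; the abbreviations (`ALOC`, `FE`, `SAT`, `LS`, `RET`) and the helper `indirect_paths` are illustrative names, not taken from the study.

```python
# Direct paths implied by H1-H3. The abbreviations are illustrative only:
# ALOC = internal ALOC, FE = flow experience, SAT = satisfaction,
# LS = learning strategies, RET = retention.
DIRECT_PATHS = {
    "ALOC": ["RET", "SAT", "FE"],  # H1, H2, H3
    "FE": ["RET", "SAT"],          # H1, H2
    "SAT": ["RET"],                # H1
    "LS": ["FE"],                  # H3
}


def indirect_paths(source, target):
    """Return every two-step route source -> mediator -> target."""
    return [
        (source, mediator, target)
        for mediator in DIRECT_PATHS.get(source, [])
        if target in DIRECT_PATHS.get(mediator, [])
    ]


# The route through SAT corresponds to the mediation tested in H4.
print(indirect_paths("ALOC", "RET"))
# The route through FE corresponds to the mediation tested in H5.
print(indirect_paths("LS", "SAT"))
```

Listing the two-step routes this way makes it easy to see which mediated relationships the direct hypotheses already make possible.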

Setting and sampling method
The research was conducted at Van Lang University (VLU), particularly in the Faculty of Foreign Languages. The researcher was familiar with VLU and this faculty; therefore, it was easier to reach the participants. Due to COVID-19, all VLU students had to change their form of learning from offline to online. To recruit the participants, the convenience sampling method was used in the study.

Participants
The participants were mostly students at Van Lang University, especially at the Faculty of Foreign Languages, where the researcher was working. After the questionnaires were distributed via the convenience sampling method, 162 responses from the participants were recorded. A detailed description of the participants is included in the next section.

Research design
This research adopted a quantitative survey design. According to R. Watson (2015), quantitative research aims to explore phenomena using statistical approaches. Additionally, Straits (2005) and Creswell (2014) suggest that a survey study investigates the attitudes and opinions of participants through their responses to a set of questions. As a result, this design was suitable for identifying the factors that impact student retention in online learning.

Research Instrument
The main instrument of this study was a closed-ended questionnaire, adapted from the study of Y. Lee and Choi (2013). Some modifications were made to suit the current context, such as the institution's name and courses. Briefly, the questionnaire consisted of 23 items covering six constructs, as presented in Table 1.

Table 1 (columns: No.; Constructs; Number of items)

Construct 1 was designed as two multiple-choice questions. All the remaining items were answered on a 5-point Likert scale, from 1 (strongly disagree) to 5 (strongly agree).

Data collection and analysis
Regarding the data collection process, the questionnaire was first distributed to all the participants via MS Teams, the university's LMS, and other social networks. Then, responses were collected for analysis. Finally, all the data were processed with the PLS-SEM approach in SmartPLS software. The PLS-SEM approach is a suitable method for identifying the relationships among different latent variables and validating measurement and structural models (Hair, Ringle, & Sarstedt, 2013; Sarstedt, Ringle, & Hair, 2017). According to these authors, there are three main stages in the data analysis: (1) coding the data, (2) assessing the measurement model, and (3) assessing the structural model. In this research, these steps were conducted with the addition of hypothesis testing in the last stage.
Notably, in the data coding stage, all the indicators of each latent variable were coded in a table.
Next, the measurement model was assessed in terms of its reliability and validity. Following Hair et al. (2013), the following statistical indices were employed in this stage: Outer Factor Loading, Construct Reliability (C.R.), Convergent Validity (AVE), and Discriminant Validity. Finally, in assessing the structural model, the bootstrapping technique with 5,000 resamples was used to test all the statistical hypotheses through t-values at a significance level of 0.05.

Validity and Reliability
The validity and reliability of the study were ensured by employing piloting and statistical techniques. Firstly, the measurement model and the questionnaire had been validated in the study by Y. Lee and Choi (2013). Additionally, before the questionnaire was distributed to the participants, 50 responses were recorded as part of the piloting process. Then, some minor adjustments were made to create the final version of all items in the questionnaire. Statistical indices were also used to assess validity and reliability and are presented in the following section.

Results/Findings and discussion
Descriptive statistics

Assessing the measurement model
In assessing the measurement model, the factor loadings were first examined in order to eliminate unsatisfactory indicators. The factor loading of each indicator should be 0.7 or above. Therefore, in the current model, the indicators "SAT6", "RE1", "RE4", and "FE2" were eliminated.
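The screening step can be sketched in a few lines. This is only an illustration under stated assumptions: the loading values below are hypothetical and are not the study's results, and the helper name `split_by_loading` is invented for the example; only the 0.7 cut-off comes from the text.

```python
# Hypothetical outer loadings for illustration; not the study's actual values.
loadings = {
    "SAT1": 0.81, "SAT2": 0.77, "SAT6": 0.58,
    "RE1": 0.62, "RE2": 0.84,
    "FE1": 0.79, "FE2": 0.66,
}

THRESHOLD = 0.7  # cut-off stated in the text


def split_by_loading(loadings, threshold=THRESHOLD):
    """Separate indicators to keep (loading >= threshold) from those to drop."""
    kept = {name: value for name, value in loadings.items() if value >= threshold}
    dropped = sorted(name for name, value in loadings.items() if value < threshold)
    return kept, dropped


kept, dropped = split_by_loading(loadings)
print(dropped)  # ['FE2', 'RE1', 'SAT6']
```

In practice the model is re-estimated after indicators are removed, because the remaining loadings change once an indicator is dropped.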
After the removal of unsatisfactory indicators, Hair et al. (2013) and Hair, Risher, Sarstedt, and Ringle (2019) suggest that to validate the measurement model, the following statistical indices should be assessed:
• Composite Reliability (C.R.)
• Average Variance Extracted (AVE)
• Discriminant Validity
• Cross loadings of each latent variable
• The HTMT matrix
The results of all these indices are presented in the following sections with the essential explanations. According to Hair et al. (2019), in a measurement model, C.R. must be above 0.78, and AVE should be 0.5 and above. In table 4, all the C.R. and AVE figures of the current model met the requirements. Regarding discriminant validity and cross-loadings, it is required that the square root of the AVE of a construct be higher than its correlation with any other construct. According to table 5, all the figures reach the standard. The HTMT matrix was examined in order to ensure that the indicators of each construct were distinct from those of the other constructs. Henseler, Ringle, and Sarstedt (2015) propose that all the figures of the HTMT matrix should be smaller than 0.78. From table 6, all the figures of the model met the requirement.
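To make the reliability and validity indices concrete, the sketch below computes AVE and composite reliability from standardized loadings using the standard formulas, and checks the square-root-of-AVE criterion against construct correlations. The loadings, the correlation value, and all function names are hypothetical; SmartPLS reports these indices directly, so this is only an illustration of what the numbers mean.

```python
import math


def ave(loadings):
    """Average Variance Extracted: mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)


def composite_reliability(loadings):
    """Composite reliability: (sum of loadings)^2 over itself plus error variance."""
    total = sum(loadings)
    error = sum(1 - l ** 2 for l in loadings)
    return total ** 2 / (total ** 2 + error)


def sqrt_ave_criterion_ok(ave_by_construct, correlations):
    """True if sqrt(AVE) of each construct exceeds its correlation with the others."""
    for (first, second), r in correlations.items():
        limit = min(math.sqrt(ave_by_construct[first]),
                    math.sqrt(ave_by_construct[second]))
        if abs(r) >= limit:
            return False
    return True


# Hypothetical standardized loadings for two constructs (illustration only).
sat_loadings = [0.80, 0.80, 0.80]
ret_loadings = [0.70, 0.90]
aves = {"SAT": ave(sat_loadings), "RET": ave(ret_loadings)}

print(round(aves["SAT"], 3))                            # 0.64
print(round(composite_reliability(sat_loadings), 3))    # 0.842
print(sqrt_ave_criterion_ok(aves, {("SAT", "RET"): 0.60}))  # True
```

With these made-up loadings, the satisfaction construct would pass the AVE threshold of 0.5 and the C.R. threshold quoted in the text.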
Based on the results of statistical indices, it was apparent that the measurement model of the research was valid.
Concerning collinearity, VIF values were examined. In the current model, the maximum VIF value was 2.847, which was smaller than the threshold of 3.3 (Roberts & Thatcher, 2009). As a result, there was no risk of collinearity. Then, when assessing the coefficient of determination (R² and adjusted R²), the figures were above 0.25 and smaller than 0.5. According to … (2015), the in-sample predictive power of the current model was primarily moderate, except for the variable "Learning Strategies."
In terms of the f² effect sizes of the path coefficients, the results were presented in table 7. According to Cohen (2013), the f² effect size should be above 0.02 to indicate a meaningful impact of an input variable on the output one. Therefore, from table 7, it was concluded that nearly all variables had the power to explain the other variables, such as "Student Flow Experience" - "Student Satisfaction," "Internal ALOC" - "Learning Strategies," etc. However, the variable "Student Flow Experience" was not considered to explain the variable "Student Retention" effectively.
To identify the model's predictive power within the sample, predictive relevance Q² was examined (Dolce, Vinzi, & Lauro, 2017) via the blindfolding procedure. Tenenhaus, Vinzi, Chatelin, and Lauro (2005) propose that if the Q² values of all latent variables are above 0, the structural model reaches global quality. Also, Hair et al. (2019) recommend that if the Q² value is from 0 to below 0.25, the predictive power is low, and if this value is from 0.25 to 0.5, the power of prediction is moderate. In the current model, all the Q² values were above 0, and most of them were smaller than 0.25, except the Q² value of the variable "Student Satisfaction." Consequently, the structural model had global quality, and the within-sample predictive power was low.
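The structural-model indices above follow simple formulas, sketched here under stated assumptions. The VIF and f² definitions are the standard ones; the labelling bands reuse the thresholds quoted in the text (0.02 and 0.15 for f²; 0, 0.25, and 0.5 for Q²); the function names and the example R² values are hypothetical.

```python
def vif(r2_of_predictor):
    """Variance inflation factor for a predictor regressed on the other predictors."""
    return 1 / (1 - r2_of_predictor)


def f_squared(r2_included, r2_excluded):
    """Effect size of one predictor: change in R² relative to unexplained variance."""
    return (r2_included - r2_excluded) / (1 - r2_included)


def label_f2(value):
    """Label an f² value using the thresholds quoted in the text."""
    if value >= 0.15:
        return "moderate or larger"
    if value >= 0.02:
        return "low"
    return "negligible"


def label_q2(value):
    """Label a Q² value using the bands quoted in the text."""
    if value <= 0:
        return "no predictive relevance"
    if value < 0.25:
        return "low"
    if value <= 0.5:
        return "moderate"
    return "above the moderate band"


# Hypothetical R² values for illustration only.
effect = f_squared(r2_included=0.45, r2_excluded=0.40)
print(round(effect, 3), label_f2(effect))  # 0.091 low
print(vif(0.5))                            # 2.0
print(label_q2(0.30))                      # moderate
```

Read this way, the reported maximum VIF of 2.847 corresponds to a predictor whose variance is about 65% explained by the other predictors, which is still below the 3.3 threshold.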

Hypothesis tests
The hypotheses were tested by running the bootstrapping technique with 5,000 resamples at a significance level of 0.05. In brief, there were five hypotheses in the research, including: H1: Internal ALOC, student flow experience, and student satisfaction positively affect retention.
H2: Internal ALOC and student flow experience positively affect student satisfaction.
H3: Internal ALOC and learning strategies positively affect student flow experience.
H4: Student satisfaction mediates the positive effects of Internal ALOC on student retention.
H5: Student flow experience mediates the positive effects of learning strategies on student satisfaction.

Table 10 (columns: Hypotheses; Relationship between variables; T-Values; P-Values; Result. First row: H1, Internal ALOC -> Student Retention.)

As Hair et al. (2019) and Kock (2016) suggest, a supported hypothesis must satisfy two conditions: (1) the t-value is higher than 1.96, and (2) the p-value is smaller than 0.05. In table 10, it was apparent that all the hypotheses satisfied the requirements. Hence, all the proposed hypotheses were supported. However, there were some comments on each hypothesis in light of the f² effect sizes. According to Cohen (2013), Hair et al. (2019), and H. Nguyen and Vu (2020), the f² should be above 0.02 for low effects on output variables and above 0.15 for moderate effects. In table 11, in combination with the path coefficients, it was readily apparent that the relationship "Student Flow Experience -> Student Retention" (in H1) was the least powerful compared to the others.
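The logic of a bootstrapped t-value can be illustrated with a simplified stand-in. The sketch below is not the SmartPLS procedure: it replaces the full PLS-SEM estimation with an ordinary least-squares slope between two synthetic score vectors, resamples cases with replacement 5,000 times, and divides the original estimate by the standard deviation of the bootstrap estimates. The data, the seed values, and the function names are all assumptions made for the example.

```python
import random
import statistics


def slope(xs, ys):
    """Least-squares slope of ys on xs (a stand-in for one path coefficient)."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def bootstrap_t(xs, ys, n_resamples=5000, seed=0):
    """Return the original coefficient and its bootstrap t-value."""
    rng = random.Random(seed)
    n = len(xs)
    estimates = []
    for _ in range(n_resamples):
        # Resample cases with replacement and re-estimate the coefficient.
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(slope([xs[i] for i in idx], [ys[i] for i in idx]))
    coefficient = slope(xs, ys)
    standard_error = statistics.stdev(estimates)
    return coefficient, coefficient / standard_error


# Synthetic 5-point scores with a built-in positive relationship (illustration only).
data_rng = random.Random(1)
xs = [1 + (i % 5) for i in range(40)]
ys = [0.6 * x + data_rng.gauss(0, 0.5) for x in xs]

coefficient, t_value = bootstrap_t(xs, ys)
print(f"coefficient = {coefficient:.3f}, t = {t_value:.2f}, "
      f"supported = {abs(t_value) > 1.96}")
```

The decision rule in the last line mirrors the first condition stated above: a path is treated as supported when its bootstrap t-value exceeds 1.96.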
What is more, regarding hypotheses H4 and H5, the mediation effects were examined. Notably, there were two mediating variables: "Student Satisfaction" in H4 and "Student Flow Experience" in H5. According to Hair et al. (2019), if a × b is not 0 and the indirect effect is significant, the mediating effect is partial mediation. In the current study, this requirement was satisfied. In other words, the variable "Student Flow Experience" partially mediated the relationship between learning strategies and student satisfaction.
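The mediation decision can be sketched as a small classifier. This is a hedged illustration that assumes the decision scheme commonly used in the PLS-SEM literature, in which a is the path from predictor to mediator, b the path from mediator to outcome, and c' the direct path; the function name, the label strings, and the coefficient values are hypothetical and are not the study's results.

```python
def classify_mediation(a, b, c_direct, indirect_significant, direct_significant):
    """Classify a mediation pattern from path coefficients and significance flags.

    a: predictor -> mediator, b: mediator -> outcome, c_direct: predictor -> outcome.
    """
    indirect = a * b
    if not indirect_significant:
        return "direct-only (no mediation)" if direct_significant else "no effect"
    if not direct_significant:
        return "full (indirect-only) mediation"
    # Both effects are significant: partial mediation, typed by the sign of a*b*c'.
    if indirect * c_direct > 0:
        return "complementary partial mediation"
    return "competitive partial mediation"


# Hypothetical coefficients for illustration only.
print(classify_mediation(0.40, 0.50, 0.30, True, True))
# complementary partial mediation
print(classify_mediation(0.40, 0.50, 0.05, True, False))
# full (indirect-only) mediation
```

Under this scheme, a complementary pattern means the indirect and direct effects point in the same direction, so the mediator reinforces the direct relationship.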

Discussion
The current research investigated the predictive factors affecting student retention in an online learning environment at a private university in Vietnam. Notably, the structural model was adapted from the study of Y. Lee and Choi (2013), and its variables and their relationships were formulated as five statistical hypotheses:
H1: Internal ALOC, student flow experience, and student satisfaction have a positive effect on retention.
H2: Internal ALOC and student flow experience positively affect student satisfaction.
H3: Internal ALOC and learning strategies positively affect student flow experience.
H4: Student satisfaction mediates the positive effects of Internal ALOC on student retention.
H5: Student flow experience mediates the positive effects of learning strategies on student satisfaction.
After examining the responses from 162 participants, this research generally confirmed the original study of Y. Lee and Choi (2013), which examined the same relationships among five variables: Internal ALOC, student flow experience, learning strategies, student satisfaction, and student retention. To be more specific, all the hypotheses tested were supported through statistical assessment and analysis.
What is more, hypothesis H1 was confirmed in the study, which showed the positive effect of Internal ALOC and student flow experience on student retention (as found in the studies of Morris et al. (2005), Shin (2006), and Joo et al. (2011)). Specifically, all the indicators related to student scores showed strong factor loadings, in line with the finding of Morris et al. (2005) that student academic performance plays a significant role in retention. In addition, Shin (2006) and Joo et al. (2011) indicate that student flow experience and Internal ALOC significantly impact student retention.
Regarding hypothesis H2, the study's finding was similar to those of Gianakos (2002) and Shin (2006), i.e., student satisfaction was impacted by Internal ALOC and student flow experience. Notably, Gianakos (2002) revealed that Internal ALOC, concerning positive thinking in overcoming difficulties (the indicator ALOC3), fostered student satisfaction. What is more, Shin (2006) concludes that student flow experience predicted student satisfaction. That is, the more students concentrated on the lesson, the more satisfied they were.
For the last direct relationship among three variables, "Internal ALOC," "learning strategies," and "student flow experience," the study pointed out that Internal ALOC and learning strategies have a positive impact on student flow experience. Joo et al. (2011) and Keller and Blomann (2008) state that the direct impact of Internal ALOC on student flow experience was significant. They conclude that students with the ambition to achieve high academic performance engaged more in their learning activities. Additionally, Joo et al. (2011) and E. Lee (2001) consider that different ways of studying positively affect student flow experience. Also, the study confirmed the finding of Y. Lee and Choi (2013) that Student Satisfaction and Student Flow Experience had mediating effects on the relationship between Internal ALOC and student retention and on the relationship between Learning Strategies and Student Satisfaction, respectively.
Besides these similarities to the previous studies, this research also produced some significant findings that reflected an in-depth exploration of the relationships among all variables in the model. Firstly, unlike the study of Y. Lee and Choi (2013), this research confirmed the impact of student flow experience on student retention. Additionally, the strength of the relationships between variables was examined via f² effect sizes. Notably, among all the relationships stated in the hypotheses, the effect of Student Flow Experience on Student Retention was the least powerful in the research context.
Moreover, in the original article of Y. Lee and Choi (2013), the mediating roles of the variables "Student Satisfaction" and "Student Flow Experience" were stated. However, the specific types of mediation were not mentioned. This study bridged this gap by indicating clearly that the mediation of Student Satisfaction in the relationship between Internal ALOC and student retention was complementary. Specifically, Student Satisfaction strengthened this relationship. Finally, the effect of the mediating variable student flow experience on the relationship between learning strategies and student satisfaction was partial mediation. In other words, student flow experience partly mediated the effect of learning strategies on student satisfaction.
Regarding the power of prediction, based on the result of the predictive relevance Q², Student Satisfaction had the strongest power of prediction. These results and findings make it easier for stakeholders to pay more attention to specific variables in the structural model. For example, in the current data, the path coefficient from Student Satisfaction to Student Retention was larger than those from Internal ALOC and Student Flow Experience. In addition, Student Satisfaction had the strongest predictive power within the current context. Therefore, more consideration should be given to student satisfaction in order to increase student retention.
To sum up, the findings of the data confirmed the factors affecting students' retention in online learning in the current context, including Internal ALOC, student flow experience, learning strategies, and student satisfaction. Also, among these factors, student satisfaction has the strongest power to predict whether the students continue their online learning or not.

Conclusion
This study was conducted to explore the factors affecting student retention in online learning. To achieve this purpose, the model of Y. Lee and Choi (2013) was employed, and five hypotheses describing the relationships among Internal ALOC, Student Flow Experience, Student Satisfaction, Learning Strategies, and Student Retention were formed. Notably, the first three hypotheses related to the direct effects of variables, and the others concerned the mediation effects of Student Satisfaction and Student Flow Experience. Through the quantitative survey research design, 162 participants were involved, and the data were analyzed using the PLS-SEM approach with SmartPLS software. After completing data analysis, the research showed that all the hypotheses were supported, which solidly confirmed the previous studies. In terms of a new contribution to the literature, firstly, the study pointed out the effects of student flow experience on student retention, which was not mentioned in the study of Y. Lee and Choi (2013). Additionally, the types of mediation were identified, i.e., (1) Student Satisfaction strengthened the relationship between Internal ALOC and Student Retention, and (2) Student Flow Experience partially mediated the effect of Learning Strategies on Student Satisfaction. Lastly, the study suggested that Student Satisfaction had the strongest power in predicting student retention. In other words, stakeholders should pay more attention to this variable to keep students completing their studies.
Besides these significant findings, the research is limited in some aspects. First of all, due to the complicated COVID-19 situation in the country, the researcher could not reach a larger number of participants. What is more, hypothesis testing would have been better served by a random sampling method; nevertheless, as stated above, this was hard for the researcher to do because of the COVID-19 context. Hence, it is suggested that future research involve more participants and a random sampling method. Secondly, the results of the study might not generalize to other contexts. Notably, the study established and validated the model for predicting student retention; however, the predictive power tested was in-sample only. Consequently, to use this model for predicting student retention, the same data analysis process should be replicated. Finally, this study included only five primary variables. As a result, to establish a more powerful model, other researchers need to pay more attention to other variables.