Implementation of Learning Analytics in MOOC by Using Artificial Unintelligence

Massive Open Online Courses (MOOCs), web-based e-learning tools, are increasingly used by educational institutions. To prevent a high non-passing rate, instructors need to know which learners have the potential to pass a course and which do not. Learners who are likely to fail also need immediate advice from the instructor or the system to overcome it. Learning Analytics (LA) is needed to collect and analyze learners' activity logs on a MOOC and predict their passing potential. The prototype application is developed using the Rational Unified Process (RUP) software development method. Implementing LA in a MOOC is feasible and is suggested for analyzing learners' success factors by consuming learners' activity logs and visualizing them in a scatter diagram and a node-link diagram. Instructors can advise learners based on the success factors generated by LA.


Introduction
A Massive Open Online Course (MOOC) is a web-based LMS that usually provides free (open) online courses and can accommodate a very large (massive) number of learners. It allows learners to learn anywhere, at any time and on any device. MOOCs are believed to increase learners' engagement and learning outcomes, and they are increasingly used by educational institutions (Yulianto et al., 2016a;2017).
A MOOC contains learning content such as slides, video, audio, textbooks and pictures, and also provides assessments to learners (Layona et al., 2017). At the end of the course, learners take an exam to evaluate their learning outcomes. The evaluation result determines whether the learner passes or fails the course.
Educational institutions expect every learner to pass all courses taken. To prevent a high non-passing rate, the instructor (or institution) needs to know which learners have the potential to pass the course and which do not. Learners who are likely to fail need immediate advice to overcome it (Yulianto et al., 2016b).
MOOCs and other e-learning tools are advised to implement Learning Analytics (LA) to collect and analyze learners' activity logs (records) and predict their passing potential. LA is the collection, measurement, analysis, reporting and prediction of data about learners and their contexts. Its purpose is to understand and optimize learning and the environments in which it occurs (Siemens and Gasevic, 2012). LA involves predictive modeling and other advanced analytic techniques to characterize learners' learning processes and increase the support available to learners and instructors (Ruiz et al., 2014).
Chatti implements LA by using B technique to detect hidden patterns in educational data sets (Chatti et al., 2012). There are four such techniques: statistics, information visualization, data mining and social network analysis.
LA is also implemented in systems such as SmartKlass, Blackboard Analytics and the Open Academic Analytics Initiative (OAAI). Each system differs in cost, platform and algorithm. SmartKlass provides a machine learning algorithm. Blackboard Analytics offers paid, web-based LA that can be used with general LMSs and uses a "black box learning" algorithm. OAAI provides LA free of charge but is not publicly available, runs on Windows only and uses several machine learning algorithms such as decision tree, support vector machine and Bayesian network.
Many data sets (factors) are used in LA to predict the learning process. Erik conducted LA research on an online course with UNED COMA (Santos et al., 2014) and reported that forum activity is related to the student passing rate. According to Milligan (2015), forum activity also reflects individual ability.
Warburton found that absence and the time taken to complete assignments relate directly to a learner's final score. Seat position in class and sitting with a specific group also positively affect the final score, whereas the time gap between laboratory and theory sessions has no effect (Akhtar et al., 2017). Strang found that age, gender and culture did not directly affect a learner's final score, but login times, reading and quiz activity did (Strang, 2017).
Based on previous research (Fig. 1), the grade (or final score) is commonly used by many systems as the passing requirement (Layer 1). Factors that influence the grade directly are assignment, exam or project scores (Layer 2). Factors that influence the grade indirectly, through Layer 2, are the activities that systems record technically (Layer 3), such as login time and count, posts, submitted assignments and learning duration. Finally, the factors that influence Layer 3 are hard to record (quantify), such as age, gender, city, personality, learning style and previous skill (Layer 4).
This study records and analyzes Layer 3 data (learning activities or logs) of learners who pass and fail the course. It then gives recommendations to instructors and learners so that they can act immediately to prevent failure (Yulianto et al., 2013).
Many machine learning, deep learning and statistical methods are applied in LA, such as decision tree, support vector machine, Bayesian network, back propagation, simple linear regression and paired t-test. All are regarded as artificial intelligence methods. The term 'intelligence' means one's capacity for logic, understanding, learning and problem solving. It can be described as the ability to perceive information and transform it into knowledge. Intelligence is most commonly studied in humans, but it has also been observed in inanimate objects such as machines or computers, where it is known as 'artificial intelligence' (Albin, 2015).
Objections to artificial intelligence theory have begun to emerge. What the machine analyzes, generates and predicts is entirely rule-bound data that the machine itself does not understand (Higgins, 1987;1988), so the term 'intelligence' is not appropriate. Paradoxically, if making humans more intelligent is not easy, how can it be done for a machine? We should never assume that computers always get things right (Broussard, 2018). The term 'unintelligence' is therefore proposed as the counterpart of 'intelligence' for machines. This study uses several artificial unintelligence methods to analyze, generate and predict the learner's learning process.
The objectives of this study are to use LA to consume recorded learners' activity logs on a MOOC and to analyze and generate the learners' learning process. To visualize it, we implement a prototype application with a scatter diagram that shows the student passing rate (Hai-Jew, 2015; Heymann and Le Grand, 2016) and a node-link diagram, built with the simple linear regression and paired t-test statistical methods, that shows learners' success factors (Ward et al., 2010; Illinois, 2016). Finally, LA is used to predict whether learners will pass or fail the course by showing the score prediction in a table, using artificial unintelligence methods such as back propagation, support vector machine, multiple linear regression and decision tree.

Method
A literature study is used to identify the algorithms and methods commonly used in previous studies to implement LA. These algorithms are used in the prototype application that is implemented in the MOOC. The prototype application is developed with the Rational Unified Process (RUP); RUP is not discussed further because it is not the objective of this study and is well documented elsewhere. The prototype application uses statistical methods, namely simple linear regression and paired t-test, to consume and analyze the logs recorded from learners' activities on the MOOC and generate success factors. To predict learners' learning results (scores), it uses artificial unintelligence algorithms such as back propagation and support vector machine and statistical methods such as multiple linear regression and decision tree. The prototype application is developed in Java, and all methods and algorithms are implemented with the JSAT library by Edward Raff (2017a;2017b).
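The two statistical methods named for success-factor analysis can be illustrated with a minimal plain-Java sketch. It does not use JSAT and is not the prototype's code: the class name, method names and sample values (login counts, scores, two quiz attempts) are assumptions chosen only for illustration. The first method fits a least-squares line between one Layer 3 activity and the score; the second computes the paired t statistic for two measurements taken on the same learners.

```java
import java.util.Locale;

/** Illustrative sketch only; all names and sample values are hypothetical. */
class SuccessFactorSketch {

    /** Least-squares fit y = intercept + slope * x. Returns {slope, intercept, rSquared}. */
    static double[] fitLine(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two equally sized samples with n >= 2");
        }
        int n = x.length;
        double meanX = 0, meanY = 0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;
        double sxx = 0, sxy = 0, syy = 0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sxx += dx * dx;
            sxy += dx * dy;
            syy += dy * dy;
        }
        if (sxx == 0) {
            throw new IllegalArgumentException("x has no variance");
        }
        double slope = sxy / sxx;
        double intercept = meanY - slope * meanX;
        // With constant y the line fits exactly, so report 1.0 instead of 0/0.
        double rSquared = (syy == 0) ? 1.0 : (sxy * sxy) / (sxx * syy);
        return new double[] {slope, intercept, rSquared};
    }

    /** Paired t statistic for the mean of a[i] - b[i]; degrees of freedom are n - 1. */
    static double pairedT(double[] a, double[] b) {
        if (a.length != b.length || a.length < 2) {
            throw new IllegalArgumentException("need two equally sized samples with n >= 2");
        }
        int n = a.length;
        double meanDiff = 0;
        for (int i = 0; i < n; i++) {
            meanDiff += a[i] - b[i];
        }
        meanDiff /= n;
        double sumSquares = 0;
        for (int i = 0; i < n; i++) {
            double d = (a[i] - b[i]) - meanDiff;
            sumSquares += d * d;
        }
        double stdDev = Math.sqrt(sumSquares / (n - 1));
        // If all differences are identical, stdDev is 0 and the result is infinite or NaN.
        return meanDiff / (stdDev / Math.sqrt(n));
    }

    public static void main(String[] args) {
        double[] loginCount = {2, 4, 6, 8, 10};      // hypothetical Layer 3 activity
        double[] finalScore = {50, 60, 70, 80, 90};  // hypothetical scores
        double[] fit = fitLine(loginCount, finalScore);
        System.out.printf(Locale.ROOT, "slope=%.2f intercept=%.2f r2=%.2f%n", fit[0], fit[1], fit[2]);

        double[] secondQuiz = {60, 70, 80, 90};      // hypothetical paired measurements
        double[] firstQuiz = {55, 68, 74, 85};
        System.out.printf(Locale.ROOT, "t=%.3f%n", pairedT(secondQuiz, firstQuiz));
    }
}
```

A steep slope with a high R² would mark an activity as a candidate success factor. The t statistic still has to be compared against a t distribution with n - 1 degrees of freedom to obtain a p-value, which this sketch leaves out.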

Proposed System
The prototype application is developed in two parts: a front end (web) and a back end (desktop). The front end is intended for guests, learners and instructors, while the back end is intended for the administrator. The front end provides a scatter diagram and a node-link diagram (Fig. 2) and works on browsers such as Chrome, Firefox, Safari and IE and on operating systems such as Windows, Linux and MacOS. It is also developed separately in three web programming languages: PHP, JSP and C# ASP.NET, so institutions can select the one that fits their existing MOOC system. The scatter diagram shows the student passing rate and the node-link diagram shows learners' success factors.
The back-end application provides settings to be configured by the administrator. The configuration includes the database to be analyzed (Fig. 3), with its server location, tables and output; the processing schedule (how many processes per day and on which day or date); and the algorithm selection (for analysis and prediction). It provides the scatter and node-link diagrams (Fig. 4) and the prediction results (Fig. 5). The back end is a desktop application developed in Java. It works on Windows, Linux and MacOS.

Evaluation
The application was tested to determine the prediction accuracy of the four methods. The evaluation analyzes the learning process logs (Layer 3) of about 800 former learners. Equation models for each method are then generated and tested on 20 new learners who use the institution's internal MOOC system. The application compares each learner's actual and predicted scores and then calculates the mean difference. The larger the mean difference, the lower the accuracy (Table 2).
The accuracy results show that prediction with the back propagation neural network (BPNN) has a mean difference of 2.88, which is smaller than that of the other algorithms. However, this does not mean that BPNN always gives better prediction accuracy. Accuracy can depend on many factors, and institutions are advised to try other algorithms as well.
An evaluation is also conducted to test the back-end speed when analyzing and visualizing the report in the scatter and node-link diagrams. The evaluation uses 500 data records, with 16 experiments for each. It takes about 21.6 milliseconds to analyze and visualize the scatter diagram and about 91.4 milliseconds for the node-link diagram.
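The comparison of actual and predicted scores can be sketched as follows. The text does not say whether the difference is signed or absolute, so this sketch assumes the mean of absolute differences. The class name, method name and score values are hypothetical and are not taken from Table 2.

```java
import java.util.Locale;

/** Illustrative sketch only; names and values are hypothetical. */
class PredictionErrorSketch {

    /** Mean of |actual[i] - predicted[i]| (assumed definition of the mean difference). */
    static double meanAbsoluteDifference(double[] actual, double[] predicted) {
        if (actual.length != predicted.length || actual.length == 0) {
            throw new IllegalArgumentException("need two non-empty arrays of equal length");
        }
        double total = 0;
        for (int i = 0; i < actual.length; i++) {
            total += Math.abs(actual[i] - predicted[i]);
        }
        return total / actual.length;
    }

    public static void main(String[] args) {
        double[] actual = {80, 65, 90, 70};     // hypothetical actual scores
        double[] predicted = {78, 69, 91, 67};  // hypothetical model output
        // Absolute differences are 2, 4, 1 and 3, so the mean is 2.50.
        System.out.printf(Locale.ROOT, "mean difference = %.2f%n",
                meanAbsoluteDifference(actual, predicted));
    }
}
```

Taking absolute values keeps over-predictions and under-predictions from cancelling out, which a signed mean would allow.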
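A timing figure averaged over repeated experiments could be obtained along these lines. This is a sketch under stated assumptions rather than the measurement code used in the study: the class name, method name and stand-in workload are hypothetical, and only `System.nanoTime` from the Java standard library is used.

```java
import java.util.Locale;

/** Illustrative sketch only; the workload is a stand-in, not the prototype's analysis. */
class TimingSketch {

    /** Runs the task the given number of times and returns the mean elapsed time in milliseconds. */
    static double averageMillis(Runnable task, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        long totalNanos = 0;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / (double) runs / 1_000_000.0;
    }

    public static void main(String[] args) {
        double[] data = new double[500];  // same record count as the evaluation
        for (int i = 0; i < data.length; i++) {
            data[i] = i;
        }
        // Stand-in for "analyze and visualize": sums the 500 values.
        Runnable task = () -> {
            double sum = 0;
            for (double v : data) {
                sum += v;
            }
            if (sum < 0) {
                throw new IllegalStateException("unexpected negative sum");
            }
        };
        System.out.printf(Locale.ROOT, "average = %.4f ms%n", averageMillis(task, 16));
    }
}
```

The sketch does no JIT warm-up, so the first runs may be slower than later ones. A benchmark harness would give steadier numbers.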

Conclusion
Implementing LA in a MOOC is feasible and is suggested for collecting and analyzing learners' success factors by consuming learners' activity logs and visualizing them in a scatter diagram and a node-link diagram. In this prototype application, LA is implemented and success factors are modelled with the simple linear regression and paired t-test statistical methods. Learners' scores are predicted with back propagation, support vector machine, multiple linear regression and decision tree. Instructors and institutions are expected to act immediately when a learner is suspected of being likely to fail the course.
For further research, we suggest adding more methods or algorithms for analyzing learners' success factors and for predicting student success, so that the administrator has more options and the results can be compared across analysis and prediction methods. Based on the success factors, a MOOC can provide a recommender system and an adaptive system. A recommender system is needed to recommend what a learner should and should not do to pass the course; learners are free to follow the recommendation or not. An adaptive system is needed to adjust the system based on the learner profile.