A Classification and Prediction Model for Student’s Performance in University Level

Educational Data Mining is a new discipline that focuses on studying methods and building models that use educational data to better understand students and their performance. We applied two techniques to our dataset: classification was used to build a prediction model, and association rule mining was used to find interesting hidden information in the students' records. This study will help students determine their direction and improve where necessary to cope with their studies. It also provides a tool to predict and evaluate which students need attention and corrective action, and to detect any deviation before it turns into a decline in performance, thereby reducing the failure rate.


Introduction
Knowledge Discovery in Databases (KDD) is defined as the "extraction of implicit, unknown and potentially useful information from data". The word implicit means that we are looking for information contained in the database, and unknown refers to a result or information that we did not expect beforehand. KDD consists of many steps, one of which is data mining. The knowledge discovery process takes the raw results of data mining and transforms them into useful and understandable information that can be used in different implementations and decision-making processes. Knowledge discovery is a multistep process that includes data integration, preparation and transformation, data mining, and evaluation of the data mining results. These steps can be iterative: with each iteration the results may be enhanced or new information discovered, Geist (2002).
Data mining, the process of extracting hidden predictive information from large databases, is a powerful new technology with the potential to help companies focus on the most important information in their data warehouses and to support decision making by saving time and revealing aspects not previously considered. Data mining tools (association rules, clustering) predict future trends and behaviors, allowing businesses to make practical, knowledge-driven decisions. "Higher Educational Institution (HEI) is greatly concern on the student's enrollment data to understand the influence on student's decision to attend their institution and on their study's information to check their performance" Abu Haris et al. (2016).
Data mining commonly involves four classes of task:
• Classification: Arranging the data into predefined groups or classes. For example, a university classifies its students into undergraduate and postgraduate. Common algorithms include nearest neighbor, the Naive Bayes classifier and neural networks.
• Clustering: Similar to classification, but the groups are not predefined, so the algorithm tries to group similar items together around a center or cluster point.
• Regression: Attempts to model the data with a function that represents the whole sample with the least error. A common method is to use Genetic Programming.
• Association rule learning: Searches for relationships among items in large databases. For example, a supermarket might collect data on its customers and, using association rule learning, find out which products are most likely to be bought together. This kind of information can be used for marketing purposes and to improve the sales process, Agrawal et al. (1993).
Educational Data Mining is a new discipline that focuses on studying methods and building models that use educational data to better understand students and their performance.
Educational data mining is an emerging field that students, academics and research analysts can all use. It provides students with tools and measures to check their performance, and it provides academics with indicators and predictions of students' performance. Researchers find it an interesting basis for building applications and implementations. Educational data has been little used until recently, which makes this a very promising field.
"The main Goals of educational data mining are (EDM, 2017):
• Predicting students' future learning behavior by creating student models that incorporate such detailed information as students' knowledge, motivation, metacognition and attitudes
• Discovering or improving domain models that characterize the content to be learned and optimal instructional sequences
• Studying the effects of different kinds of pedagogical support that can be provided by learning software
• Advancing scientific knowledge about learning and learners through building computational models that incorporate models of the student, the domain and the software's pedagogy"

Related Work
Educational data mining is a new and interesting area of research. It has recently gained popularity due to the vast amount of data available from the educational process (which can be mined) and the increased emphasis on the quality of education at university level. Tools and models are now required to identify average and poor students so that detection and corrective steps can be taken before it is too late.
Baradwaj and Saurabh (2011) created an ID3 classifier to classify students' division using a database of 50 students collected from one university. Seven attributes were used to build a tree that predicts and classifies a student's end-of-semester mark. Their study uses the ID3 classification algorithm and the entropy of educational data to help students and teachers improve the division of the students.
Santillan et al. (2016) built an incremental interaction system to predict students' performance. Data was gathered from two different years, three classification algorithms were applied to the same data, and the three outputs were compared. In our study we used both classification and association rule mining, since using more than one data mining technique to analyze the data tends to be more accurate and efficient.
Widyahastuti et al. (2017) took a different approach, using linear regression to predict students' performance by monitoring their use of an online discussion forum. In their proposed prediction model the user is the main input of the dataset, which was divided into three parts: online discussion, forums and course assessment. Eleven features were used as input to the data mining tool. A correlation analysis was conducted, and the results were discussed and explained.
Ahadi et al. (2017) studied students' performance over the whole semester (on a week-by-week and assignment-by-assignment basis) and compared it with performance towards the end of the semester. Students were clustered based on their weekly performance. The study showed that students' performance declines towards the end of the semester, owing to the nature of programming subjects, which increase in difficulty week by week.
Daud et al. (2017) tackled the student performance prediction problem. Data was collected from graduate and undergraduate universities, making up 3000 records; after preprocessing, this was reduced to around 700 records. The study tried to answer the question of whether a student will complete his or her studies. A feature analysis was conducted and feature spaces were selected from the relevant attributes. Four categories of attributes were introduced, and the study's result was the influence of those categories on students' performance.
Maja et al. (2015) applied association rule mining in education to data extracted from a learning management system (Moodle), taking one subject as a case study with 77 records. Five attributes were selected for the mining process, and the student's grade was the class label. After the mining algorithm was run on the dataset, a number of rules were generated with a specified support and confidence. The generated rules can give an indication of students' performance in general.

Data Preparation and Data Mining
Educational Data Mining is concerned with developing models for exploring and analyzing the vast amount of (unused) data that comes from educational institutions, and with using those models to better understand students, predict their performance and help them perform better in their studies.
Nowadays, students' performance is measured by internal assessment methods such as midterm exams, quizzes, final exams and assignments. A student's result must be above a certain mark to pass the subject.
Academic institutions (schools and universities) generate huge amounts of data on students, courses, faculty and staff, including managerial systems, organizational personnel, lecture details and so on. This data is the basis for any data mining application and an input that any academic institution can use to improve the quality of the education process, Ranjan and Malik (2007).
The dataset of 242 students used in this study was obtained from the College of Art and Science of the Applied Science University (Bahrain) for the 2015/2016 session. It was obtained from the registration department according to the students' latest information and recorded in one main table.

Feature Selection
This is a fundamental filtering step that discards irrelevant attributes and reduces the dimensionality of the data while improving accuracy.
We used Gain Ratio Attribute Evaluation, which evaluates the worth of an attribute by measuring its gain ratio with respect to the class and ranks all attributes according to their importance. We chose 12 attributes to use in our model, as explained in Table 1.
Other attributes, such as parents' background knowledge, student's age, student's distance from college and long vs. short semester, among others, were discarded because they were at the bottom of the ranked list.
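The ranking step can be illustrated with a minimal sketch, assuming nominal attributes and the standard definition of gain ratio (information gain divided by the attribute's own split information). The function names and the toy records below are hypothetical; they are not the study's data or the tool that was actually used.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a non-empty list of nominal values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def gain_ratio(attribute, labels):
    """Information gain of `attribute` about `labels`, divided by split information."""
    total = len(labels)
    # Group the class labels by attribute value.
    groups = {}
    for value, label in zip(attribute, labels):
        groups.setdefault(value, []).append(label)
    # Expected entropy of the class once the attribute value is known.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(attribute)
    return info_gain / split_info if split_info > 0 else 0.0


# Toy records (hypothetical values, not the study's data).
study_time = ["high", "high", "low", "low", "high", "low"]
gender = ["m", "f", "m", "f", "m", "f"]
result = ["pass", "pass", "fail", "fail", "pass", "fail"]

# Rank attributes from most to least informative about the class.
scores = {
    "study_time": gain_ratio(study_time, result),
    "gender": gain_ratio(gender, result),
}
ranking = sorted(scores.items(), key=lambda item: item[1], reverse=True)
print(ranking)
```

Attributes at the bottom of such a ranking carry little information about the class, which is the criterion the text describes for discarding them.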

Data Selection
Only the relevant fields that were suitable for the data mining process were selected, and some of the information for those fields was extracted from the database. The dataset was stored in a nominal format to suit the classification and association rule mining processes.

The Random Tree Classifier
The random trees classifier is a powerful general-purpose classification technique that is resistant to overfitting and can work with segmented fields. It performs the Random Trees classification on a field basis, based on the input training feature file.
Random Trees is a collection of individual decision trees, where each tree is generated from different samples and subsets of the training data. They are called decision trees because, for every field that is classified, a number of decisions are made in rank order of importance.
When these decisions are graphed for a field, the result looks like a branch. When the entire dataset is classified, the branches form a tree. The method is called random trees because the dataset is classified a number of times, each time based on a random subselection of training fields, which results in many decision trees, Ali et al. (2012).
To make a final decision, each tree has a vote. This process works to mitigate overfitting. Random Trees is a supervised machine-learning classifier based on constructing a multitude of decision trees, choosing random subsets of variables for each tree and using the most frequent tree output as the overall classification.
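The voting idea can be sketched in a few lines. This is a deliberately simplified illustration, not the classifier used in the study: each "tree" here is only a one-level stump trained on a bootstrap sample and one randomly chosen attribute, and the attribute names and records are hypothetical.

```python
import random
from collections import Counter


def majority(labels):
    """Most frequent label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def train_stump(rows, labels, feature):
    """One-level tree: map each value of `feature` to the majority class seen with it."""
    by_value = {}
    for row, label in zip(rows, labels):
        by_value.setdefault(row[feature], []).append(label)
    return {
        "feature": feature,
        "leaf": {value: majority(group) for value, group in by_value.items()},
        "default": majority(labels),  # used for values absent from the sample
    }


def train_ensemble(rows, labels, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample and a randomly chosen feature."""
    rng = random.Random(seed)
    features = list(rows[0].keys())
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        feature = rng.choice(features)
        trees.append(train_stump([rows[i] for i in idx], [labels[i] for i in idx], feature))
    return trees


def predict(trees, row):
    """Each tree votes; the most frequent vote is the ensemble's answer."""
    votes = [t["leaf"].get(row[t["feature"]], t["default"]) for t in trees]
    return majority(votes)


# Toy training data (hypothetical, nominal attributes).
rows = [{"study_time": "high", "advisor_visits": "frequent"}] * 4 + [
    {"study_time": "low", "advisor_visits": "rare"}
] * 4
labels = ["pass"] * 4 + ["fail"] * 4

trees = train_ensemble(rows, labels)
print(predict(trees, {"study_time": "high", "advisor_visits": "frequent"}))
```

Because every tree sees a different sample and a different attribute, an unlucky sample affects only a few votes, which is the intuition behind the resistance to overfitting described above.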

The Apriori Algorithm
When association rule mining was first introduced by Agrawal et al. (1993), an algorithm called AIS was given for discovering the large itemsets, Agrawal and Ramakrishnan (1994). However, the AIS algorithm is not efficient, since it generates too many unnecessary candidates.
In the following year, the Apriori algorithm was proposed, which improves on the performance of AIS by reducing the number of unnecessary candidates. An OCD algorithm with a similar approach was proposed concurrently, Hipp et al. (2000).
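The candidate reduction can be sketched as follows. This is a minimal illustration of the Apriori idea (a candidate itemset is only counted if all of its subsets are already frequent), written under the assumption that each student record is a set of "attribute=value" items. The records, thresholds and function names are hypothetical and are not taken from the study.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}
    frequent = {}
    k = 1
    while current:
        # Keep only the candidates that meet the support threshold.
        level = {c: support(c) for c in current}
        level = {c: s for c, s in level.items() if s >= min_support}
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        k += 1
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-item subset of a candidate must be frequent.
        current = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def rules(frequent, min_confidence):
    """Yield (antecedent, consequent, confidence) from the frequent itemsets."""
    for itemset, supp in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                confidence = supp / frequent[antecedent]
                if confidence >= min_confidence:
                    yield antecedent, itemset - antecedent, confidence


# Toy records (hypothetical values).
records = [
    {"study=high", "visits=frequent", "grade=good"},
    {"study=high", "visits=frequent", "grade=good"},
    {"study=high", "visits=rare", "grade=good"},
    {"study=low", "visits=rare", "grade=poor"},
    {"study=low", "visits=frequent", "grade=poor"},
]
frequent = apriori(records, min_support=0.4)
strong = list(rules(frequent, min_confidence=0.95))
for antecedent, consequent, confidence in strong:
    print(sorted(antecedent), "=>", sorted(consequent), round(confidence, 2))
```

The prune step is what distinguishes this approach from generating and counting every possible combination: infrequent itemsets are never extended.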

Results and Discussions
The dataset of 242 students used in this study was obtained from the computer science department of the Applied Science University (Bahrain) for the 2015/2016 session.
To better understand the data used in our experiment, we chose the fields according to student and professor surveys, field relevance selection and reviews of the factors affecting students' performance. Only useful and relevant fields that we found to affect students' performance were used.
We first ran the Random Tree Classifier on the dataset and obtained the model in Table 2, which can be used to predict and evaluate any student's performance during their studies at the university.
The size of the tree was 73, and we chose a maximum tree depth of 3. The depth can be modified: it can be reduced to 2 or increased beyond 3.
The model clearly shows the importance of the amount of time the student spends studying: this attribute is the root node, on which the rest of the model is split. As a general note, the model also shows that a student's previous GPA is not an indication of what might happen in the following semesters.
Academic advisory visits and follow-up with the supervisor play a major role in students' performance: the model clearly shows that students who frequently visit their academic supervisor tend to perform well academically.
The high school major plays an interesting role in this model. Students from the science stream tend to perform well, while students from the industry stream tend to perform below expectations and need more study time to keep up with students from other streams.
The complete model can be transformed into if-then rules, against which current students' performance can be checked and with which any new student's information can be used to predict his or her performance.
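As a rough sketch of that transformation, a tree stored as nested dictionaries can be flattened into one if-then rule per leaf. The tree below is purely hypothetical: the attribute names echo the discussion above, but the branch values and outcomes are invented for illustration and are not the model in Table 2.

```python
def tree_to_rules(node, conditions=()):
    """Walk a nested-dict tree and return one if-then rule per leaf."""
    if not isinstance(node, dict):  # a leaf holds the predicted class
        premise = " AND ".join(conditions) or "TRUE"
        return [f"IF {premise} THEN performance = {node}"]
    rules = []
    for value, child in node["branches"].items():
        test = f"{node['attribute']} = {value}"
        rules += tree_to_rules(child, conditions + (test,))
    return rules


# Hypothetical tree: values and outcomes are invented, NOT the study's model.
toy_tree = {
    "attribute": "study_time",
    "branches": {
        "high": "good",
        "low": {
            "attribute": "advisor_visits",
            "branches": {"frequent": "average", "rare": "poor"},
        },
    },
}

for rule in tree_to_rules(toy_tree):
    print(rule)
```

Each rule corresponds to one root-to-leaf path, so a tree limited to depth 3 yields rules with at most three conditions.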

Association Rules Mining
The same dataset was then used for association rule mining, in which large itemsets are created and intersected in order to generate interesting rules about students' performance.
The best rules were found using a minimum support of 100% and a confidence of 95%. We reduced the minimum support so that a large number of rules could be generated, and then increased the confidence so that only rules with high impact and relevance were considered. A sample of association rules is presented below.

Cross Validation Analysis
10-fold cross validation was used to validate the robustness of the mining model. 70% of the data was used as a training set and the remaining 30% as a testing set. The results of the model validation process are shown in Fig. 1.
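The fold mechanics of k-fold cross validation can be sketched as below. This is a generic illustration, not the study's validation code: the function names, the fixed seed and the `evaluate` callback are assumptions introduced here, and only the record count of 242 comes from the text.

```python
import random


def k_fold_indices(n_records, k=10, seed=0):
    """Shuffle record indices and yield (train, test) index lists for each fold."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def cross_validate(n_records, evaluate, k=10):
    """Average the score returned by `evaluate(train, test)` over all folds."""
    scores = [evaluate(train, test) for train, test in k_fold_indices(n_records, k)]
    return sum(scores) / len(scores)


# With 242 records, every record is used for testing exactly once.
for train, test in k_fold_indices(242, k=10):
    print(len(train), len(test))
```

In practice `evaluate` would train the classifier on the `train` indices and return its accuracy on the `test` indices; averaging over the folds gives a more stable estimate than a single split.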

Conclusion
In this study, we presented a model for predicting and evaluating students' performance in education. We applied two techniques to our dataset: classification was used to build a prediction model, and association rule mining was used to find interesting hidden information in the students' records.
This study will help students determine their direction and improve where necessary to cope with their studies. It also provides a tool to predict and evaluate which students need attention and corrective action, and to detect any deviation before it turns into a decline in performance, thereby reducing the failure rate.