Research on Efficiency of Applying Gamified Design into University’s e-Courses: 3D Modeling and Programming

Research presented in this paper represents a further step towards proving the efficiency of gamification in higher education. Our research was conducted within two higher education institutions and includes full-time and part-time students who enrolled in the courses 3D modeling and Programming. Based on the research results, three hypotheses were tested. These hypotheses give a better insight into some psychological phenomena. The first hypothesis tested the level of knowledge in experimental and control groups for all students who achieved a minimum of 50% score in the pre-test. Our results confirmed the existence of a statistically significant difference in favor of the experimental group. The other two hypotheses extend these results further. We analyzed the scores of the 50% highest ranked and the 50% lowest ranked students with the use of a t-test. Based on our analysis of the average number of points on the post-test for participants with the lowest ranking we found no statistically significant difference. On the other hand, the same analysis for participants with the highest ranking shows, with a statistically significant difference, that the experimental group achieved a notably better score.


Introduction
Gamification for learning can be defined as the usage of game-based mechanics, aesthetics and game thinking to engage people, motivate action, promote learning and solve problems (Enders, 2013; Kapp, 2012). Mechanics are the functioning components of the game, like a set of rules and feedback loops that make the game fun. Dynamics imply the player's interactions with mechanics, and aesthetics relate to how the game makes the player feel during those interactions (Enders, 2013; Zichermann and Cunningham, 2011). Gamification is applied in e-courses in order to increase participants' motivation, experience and engagement (Domínguez et al., 2013). It relies both on technology (e-learning systems on Web and mobile platforms) and on psychology (people's competitive instinct and sense of pride and achievement) (Glover, 2013).
The main motivation for this research is to extend the existing research results on the influence of gamification on student knowledge and motivation with a more detailed analysis. We would also like to contribute to a better understanding of gamification's impact on students in higher education in the Computer Graphics (CG) and programming fields. Our earlier studies (Bernik et al., 2015) confirmed that an experimental group of participants achieved better results in post-test knowledge tests, but now we are interested in investigating how significant the difference in results is. We developed an experimental e-course in which the results of 50% of the highest ranked students and 50% of the lowest ranked students were analyzed. The results of our research are presented in the following sections, together with the conclusions drawn where statistically significant differences were found.

Related Work
Gamification is a term with its origins in the digital media industry. It was widely adopted from 2010, after the usage of some parallel terms, like productivity games, surveillance entertainment, funware, playful design, behavioral games, game layer or applied gaming (Deterding et al., 2011). There were many questions about the nature of the concept and its real effectiveness for the teaching process. According to Hamari et al. (2014), gamification can be seen through three main parts: the implemented motivational affordances, the resulting psychological outcomes and the further behavioral outcomes. From that standpoint, the main benefit of gamification should be increased motivation for learning different contents. On the other hand, games are seen as a collection of multiple necessary conditions; none of these conditions alone is sufficient to constitute a game, but only the combination of them. Such a combination of conditions could be arranged on three separate abstraction levels (Huotari and Hamari, 2012). The first level simply states that games are systems, i.e., constituted of several interacting sets of mechanisms and actors, and that games always require the active involvement of at least one player. The second abstraction level includes some systemic conditions such as rules, conflicting goals and uncertain outcomes. The third abstraction level should include conditions that are unique to games. However, it is hard to find such specific conditions for games, so Huotari and Hamari (2012) and Hamari et al. (2014) suggest the adoption of the term gamefulness.
There are different approaches to the implementation of the concept of gamification. For example, Brenda Enders (2013) proposes the implementation of gaming elements in e-learning like points, achievements, badges, leaderboards, levels and challenges. The author provides some guidelines for the design of gamified e-learning systems, but concludes that more research on the effectiveness of gamification elements in e-learning systems is required. Hof et al. (2017) studied the use of gamification in acquiring the competences in communication and collaboration that are necessary for applying agile software methods like Scrum. The results show that students like this approach much more than traditional ex-cathedra learning approaches. It encourages team work, while the overall learning effect was moderately increased. Wongso et al. (2014) proposed a conceptual framework design, based on Web 2.0 technology and gamification. They offered a guideline for implementing gamification and Web 2.0 technology in e-learning systems. Their framework includes the phases of analysis, design, development, implementation and evaluation. Garcia et al. (2017) have offered a framework for gamification in software engineering. This framework is composed of an ontology, a methodology for guiding the process and a supporting gamification engine. In a case study, a company used the framework to gamify the areas of project management, requirements management and testing. Urh et al. (2015) have provided a model for the gamification of e-learning systems. They have included the previously listed development phases together with the management of e-learning, important factors for e-learning, game mechanics, game dynamics, gamification elements and their effects on students. The authors have found several important factors like pedagogy, technology, design, administration, people, learning materials and finance. The goals of their model were to maximize student satisfaction, motivation, effectiveness and efficiency. Song et al. (2017) have investigated the impact of gamification on the engagement of college students in class. The results indicated that gamified approaches could be effective in engaging students who are bashful or distracted.
On the other hand, Glover has presented some criticism of gamification (Glover, 2013). First, the educational experience should be rewarding by itself; only then can gamification make it more rewarding. Glover found that learners with high intrinsic motivation can be demotivated by additional external motivation. Therefore, gamification elements in e-learning should be carefully designed and optional for users. Also, gamification could discourage some of the less competitive learners. The author has suggested some questions to find out whether the gamified approach is appropriate in a given situation, such as: Is motivation really a problem? Are there behaviors to encourage? Can an activity be gamified? Would it favor some learners? What rewards would provide the most motivation? Are rewards too easy to obtain? The author concludes that gamification depends a lot on quality materials, activities and experiences, but that it can provide additional motivation if its implementation is carefully considered.
Schreuders and Butterfield explored ways to increase student involvement and motivation in the teaching process and to enhance students' experience of the educational process. Their study lasted for two years and included 32 students. The results of the research showed positive indicators in terms of qualitative and quantitative results in knowledge tests. The authors conclude that their research, regardless of the small number of participants, is in line with other studies that speak of the positive effects of gamification on increasing motivation and improving the students' experience in using the e-learning system (Schreuders and Butterfield, 2016).
Iosup and Epema created two e-courses that were conducted over four semesters with over 450 students. They see gamification as a set of tools that can influence the motivation and behavior of users. The result of their analysis is that more than 75% of students passed the knowledge check in the first exam period. There was also a positive correlation between students who passed the knowledge test and the satisfaction test, which is attributed to gamified elements (Iosup and Epema, 2014).
De-Marcos et al. (2014) developed a gamified add-on for the BlackBoard LMS that enabled tracking of the teaching activities of a total of 371 students, as well as their mutual co-operation and competition. The e-course was open to students for one semester. The authors noted a problem where experimental groups showed very little interest in teaching materials. Approximately 20% of the students actively participated in the research, which makes us wonder whether the research could have been better designed. It is suggested that repeating this research with a focus on the social component, instead of achievements, badges and competition, could be helpful (De-Marcos et al., 2014).
More recently, authors such as Khandelwal et al. (2017), Kosurkar et al. (2017), Llanos et al. (2016) and others have researched the influence of gamification on programming education through the use of specialized software tools and add-ons. Although those examples are of high value to the research subject, the problem remains that e-courses in practice rely on the Moodle platform. We think that Moodle deserves more attention from researchers around the globe, who could jointly create a unified and standardized set of gamified elements for the next generation of students in university-based courses.
Generally, we can note that other authors such as Schreuders and Butterfield (2016), Iosup and Epema (2014) and De-Marcos et al. (2014) carried out similar experiments. The duration of these experiments was longer, while the number of gamified elements used was smaller compared to our research. In total, all of these studies lead to similar results.

Research Plan
The research was conducted in two Croatian higher education institutions and it included both full-time and part-time students. The pre-research included students who enrolled in the course 3D modeling at University North and the main research included students who enrolled in the course Programming 2 at the Faculty of Organization and Informatics, University of Zagreb. In accordance with the obligations stipulated within each course, students were not overburdened with additional attendance at the faculty outside their regular classes. Participation in the research was voluntary; nevertheless, almost all students agreed to participate.
Students were divided into experimental and control groups. In order to keep the interaction between groups to a minimum, the planned timeframe for the research was 20-25 working days. The research goals were presented to the students during this period. Next, in order to determine students' current level of knowledge, a pre-test was conducted. Also, based on the results of the pre-test, we examined the difference in knowledge between all groups of participants.
Students were rewarded with additional points in the 3D modeling and the Programming 2 courses. Within the 3D modeling course, students had the opportunity to earn up to 3 additional points. Students who participated in the complete research received 3 points, students who participated partially (in the pre-test, but not in other activities) received 1 point and students who did not participate at all did not receive any additional points. Within the Programming 2 course, the 25% best-performing students in the experimental and control groups received 4 points. The next 25% of the students of both groups received 3 points, the following 25% received 2 points and, finally, the last group of students, who had the lowest scores, received 1 point.
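The quartile-based reward scheme of the Programming 2 course can be illustrated with a minimal sketch. The function name, the student identifiers and the scores are hypothetical, and it is an assumption that the ranking is done separately within each group and that ties are broken by input order; the text does not specify either detail.

```python
def bonus_points_by_quartile(scores):
    """Assign 4, 3, 2 or 1 bonus points by rank quartile within one group.

    `scores` maps a (hypothetical) student id to a score; a higher score
    is better. Ties keep their input order, which is an assumption.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    n = len(ranked)
    points = {}
    for position, student in enumerate(ranked):
        quartile = position * 4 // n  # 0 = best 25%, 3 = lowest 25%
        points[student] = 4 - quartile
    return points


# Illustrative data only, not taken from the study.
example_group = {"s1": 28, "s2": 25, "s3": 21, "s4": 19,
                 "s5": 15, "s6": 12, "s7": 9, "s8": 4}
example_points = bonus_points_by_quartile(example_group)
print(example_points)
```

With eight students, the two best-ranked students receive 4 points and the two lowest-ranked receive 1 point.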
Students had a minimum of two weeks for the usage of teaching materials from a gamified and an unmodified e-course. In the following week, after using the teaching materials, a post-test was performed. The goal of the post-test was to determine the difference in knowledge, compared to the pre-test results and after using different teaching materials.

Gamified Design Elements
In the unmodified e-course students were able to access digital teaching content that was presented through text, photography and video. Students had the default look and feel of the Moodle system at their disposal (without any embellishment or removal of elements). Students could use the forum for communication purposes, but no other gamified elements were included in the system. The classical e-course does not have a reward system or the ability to look at other students' points or performance. It does not have the ability to conduct assessments and tests or to provide any automated feedback. An integral part of the classical e-course are options like New Announcement, Future Events and Recent Activity, as well as Navigation and Basic System Settings.
The control group had access to an e-course with only a few gamified elements, such as avatars, forum-based communication and non-linear access to teaching materials. In the gamified e-course students had the same ability to access digital teaching content through text, photography and video. The main difference is that they also had at their disposal all the gamified elements listed in Table 1. The look of the gamified e-course is shown in Fig. 1.

Participants and Groups
Participants of the pre-research were second year students of University North who attended the elective course 3D modeling and volunteered to participate. The total number of participants was 55, of which 33% were full-time and 67% part-time students. Participants were divided into four groups with 15 students in each group. 44% of participants were female and 56% male. The average age of participants in this research was 20. A graphical representation of pre-research participants' statuses is displayed in Fig. 2.
For the pre-research purposes, the groups of participants were assigned as follows: G1 and G4 are experimental groups and G2 and G3 are control groups.
Participants of the main research were students of the Faculty of Organization and Informatics, University of Zagreb, who attended the course Programming 2 in the undergraduate study of information science in the winter semester of the academic year 2015/2016. The total number of participants who volunteered for the research was 201. Participants were divided into 14 groups of 15 students. 44 students or 21.89% were female and 157 students or 78.11% were male. The average age of participants was 20. A graphical representation of main research participants' statuses is displayed in Fig. 3. In the main research, G2, G3, G5, G7, G11, G12 and G14 are experimental groups and G1, G4, G6, G8, G9, G10 and G13 are control groups.

Hypotheses and Methods
Aligned with the main goal of this research, to test efficiency of applying gamified design elements in university-level informatics e-courses, we state the following three hypotheses for our research:

H1: The experimental group of participants who achieved a minimum of 50% score in the pre-test will achieve statistically significantly better results than the control group of participants who achieved a minimum of 50% score in the pre-test, with respect to the achieved scores in the post-test.
H2: Regardless of the participants' gender, 50% of the highest ranked students in the experimental group will achieve a statistically significantly better score than 50% of the highest ranked students in the control group.
H3: Regardless of the participants' gender, 50% of the lowest ranked students in the experimental group will achieve a statistically significantly better score than 50% of the lowest ranked students in the control group.

General scientific methods such as observation, description, comparative methods, synthesis methods, analysis and methods for statistical processing of empirically collected data (t-test) were used in order to test the three hypotheses. It is important to emphasize that the hypotheses H2 and H3 only refer to the results of the main research (conducted within the Programming 2 course). The pre-research (conducted within the 3D modeling course) did not have a sufficient sample of participants to test these hypotheses.

Pre-Research Results
Statistical significance of the pre-research, based on post-test results, is displayed in Table 2. The calculation is based on the comparison of all experimental groups (G1, G4) and all control groups (G2, G3). The average number of points before the experiment was 16.00 for the experimental group and 15.37 for the control group. Using the pre-test results, p values were calculated to show that there is no statistically significant difference between the comparison groups. This result is important as it ensures the approximately equal initial conditions that need to be achieved before conducting the experiment.
Experimental groups used experimental (gamified) e-courses, while control groups used classical (unmodified) e-courses. In the first week after the e-course, a post-test was conducted. The average number of points on the post-test increased by 30.5% for the experimental group and amounted to 20.89. The average number of points on the post-test decreased by 0.52% for the control group and amounted to 15.30. The calculated t value after the experiment is 3.99 and the calculated p value is 0.0002. Based on that, we conclude (with a 1% possibility of error) that there is a statistically significant difference between experimental and control groups, considering the average post-test results.
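The reported t values can be checked from the summary statistics alone. The sketch below assumes an equal-variance (pooled) two-sample Student's t-test; the paper only says "t-test", so the variant is an assumption, although it does reproduce the t values reported for the pre-research. The function name is hypothetical, and the means, standard deviations and group sizes are those listed in Table 2.

```python
import math


def pooled_t_statistic(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic with pooled variance (assumed variant)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# Pre-research summary statistics: experimental (G1, G4) vs control (G2, G3).
t_pre, df = pooled_t_statistic(16.00, 5.19, 28, 15.37, 4.48, 27)
t_post, _ = pooled_t_statistic(20.89, 5.78, 28, 15.30, 4.50, 27)
print(round(t_pre, 2), round(t_post, 2), df)  # 0.48 3.99 53
```

A p value could then be read from a t distribution with 53 degrees of freedom, for example with `scipy.stats.ttest_ind_from_stats`, which accepts the same summary statistics.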
For testing the hypothesis H1, additional results analysis was performed only for participants who met the condition of achieving a minimum of 50% of the total points in the pre-test. Of the total number of participants, 25 met this requirement. Students who did not meet this requirement were removed from the results, after which the score points were re-analyzed. Table 3 shows the analysis of post-test results for closed type questions, open type questions and total number of points. It is evident that the average scores in all three cases are higher for the experimental group. Standard deviation is higher for the control group only in the case of open type questions. The calculated p value shows that in the case of open type questions there is a marginal statistically significant difference, which can be attributed to the small number of participants in the pre-research. In the other two cases, closed type questions and total score analysis, the calculated p value shows that the experimental group achieved a better result compared to the control group, with a statistically significant difference. T-test analysis was performed on the overall score. The calculated t value is 3.26, while the p value is 0.003. Based on that, we conclude that there is a statistically significant difference in favor of the experimental group. The average number of points in the post-test for the experimental group was 40.61% higher than the average number of points for the control group. Given the overall score on the post-test, where the results are shown only for those participants who achieved at least 50% of that score on the pre-test, we conclude that the hypothesis H1 of this research is confirmed.
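The selection step used for H1 can be sketched as a simple filter. The function name, the student identifiers, the scores and the maximum of 32 pre-test points are all hypothetical values chosen for illustration; the paper does not state the maximum pre-test score.

```python
def eligible_for_h1(pre_test_scores, max_points):
    """Keep only participants with at least 50% of the pre-test points."""
    threshold = 0.5 * max_points
    return {student: points
            for student, points in pre_test_scores.items()
            if points >= threshold}


# Illustrative data only, not taken from the study.
pre_test = {"p1": 20, "p2": 15, "p3": 16, "p4": 9}
kept = eligible_for_h1(pre_test, max_points=32)
print(sorted(kept))  # ['p1', 'p3']
```

Only the post-test scores of the retained participants would then enter the comparison between the experimental and control groups.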
Table 2: Statistical significance of the pre-research (pre-test and post-test results)
Test | Participants group | Participants | Average points | Standard deviation | t value | p value
Pre-test | G1, G4 | 28 | 16.00 | 5.19 | 0.48 | 0.6328
Pre-test | G2, G3 | 27 | 15.37 | 4.48 | |
Post-test | G1, G4 | 28 | 20.89 | 5.78 | 3.99 | 0.0002
Post-test | G2, G3 | 27 | 15.30 | 4.50 | |

Table 3: Pre-research comparative analysis of post-test responses for control and experimental groups. Columns: participants group (GE - all experimental groups, GC - all control groups); closed type questions (total number of questions = 26); open type questions (total number of questions = 6); overall score (total number of questions = 32).

Main Research Results
Statistical significance of the main research, based on post-test results, is displayed in Table 4. The calculation is based on the comparison of all experimental groups (G2, G3, G5, G7, G11, G12, G14) and all control groups (G1, G4, G6, G8, G9, G10, G13). The average number of points before the experiment was 15.54 for the experimental group and 15.23 for the control group. The experimental group achieved better results by 2.03%. Using the pre-test results, p values were calculated to show that there is no statistically significant difference between the comparison groups. Experimental groups used experimental (gamified) e-courses, while control groups used classical (unmodified) e-courses. In the first week after the e-course, a post-test was conducted. The average number of points after the experiment decreased to 13.89 points for the experimental group (a difference of 11.87% when expressed relative to the post-test average) and to 12.03 points for the control group (26.60% on the same basis). The calculated t value after the experiment is 2.68 and the calculated p value is 0.007. Based on that, we conclude (with a 1% possibility of error) that there is a statistically significant difference between experimental and control groups, considering the average post-test results.
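The percentages in this paragraph depend on which average is taken as the baseline. The sketch below uses only the rounded averages quoted in the text; the helper name is hypothetical, and the observation that the reported decreases match a post-test baseline is an inference from these rounded values, not something the source states.

```python
def percent_difference(value, baseline):
    """Difference of `value` from `baseline`, in percent of `baseline`."""
    return (value - baseline) / baseline * 100


pre_exp, post_exp = 15.54, 13.89    # experimental group averages
pre_ctrl, post_ctrl = 15.23, 12.03  # control group averages

# Change relative to the pre-test average (the conventional baseline).
change_exp = percent_difference(post_exp, pre_exp)     # about -10.62
change_ctrl = percent_difference(post_ctrl, pre_ctrl)  # about -21.01

# Difference relative to the post-test average; these values are close
# to the 11.87% and 26.60% quoted in the text.
alt_exp = percent_difference(pre_exp, post_exp)
alt_ctrl = percent_difference(pre_ctrl, post_ctrl)

# Gap between the groups on the post-test, relative to the control group.
post_gap = percent_difference(post_exp, post_ctrl)     # about 15.46

print(round(change_exp, 2), round(change_ctrl, 2))
print(round(alt_exp, 2), round(alt_ctrl, 2), round(post_gap, 2))
```

Small deviations in the last digit are expected because the published averages are themselves rounded.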
For testing the hypothesis H1, additional results analysis was performed only for participants who met the condition of achieving a minimum of 50% of the total points in the pre-test. Of the total number of participants, 118 met this requirement. Students who did not meet this requirement were removed from the results, after which the score points were re-analyzed. Table 5 shows the analysis of post-test results for closed type questions, open type questions and total points. The average score is higher, as is the standard deviation, for the experimental group in all three cases.
The calculated p value shows that in the case of closed type questions there is no statistically significant difference, although the result is marginal. In the other two cases, open type questions and total score analysis, the calculated p value shows that the experimental group achieved better results with a statistically significant difference. All experimental groups had a higher average score than all control groups, based on the total number of points that the students achieved on the post-test. The points were summed up and a t-test was performed. The calculated t value is 2.53 and the p value is 0.0127. Based on that, we conclude that there is a statistically significant difference between the experimental and control groups. The average post-test score in the experimental group was 21.67% higher than the average post-test score in the control group. Given the overall score on the post-test, where the results are shown only for those participants who achieved at least 50% of that score on the pre-test, the main research also confirms hypothesis H1.
In order to test hypotheses H2 and H3, the participants in the control and experimental groups were additionally divided into the 50% highest ranked and the 50% lowest ranked, based on their pre-test results. Table 6 displays the comparison of the 50% highest ranked participants in the experimental group with the 50% highest ranked participants in the control group, as well as the 50% lowest ranked participants in the experimental group with the 50% lowest ranked participants in the control group. There is a statistically significant difference between the post-test scores of the 50% highest ranked participants. The calculated t value is 2.578, while the p value is 0.011. The experimental group of participants achieved a 21.93% better score than the control group. Based on these results, we conclude that hypothesis H2 of this research is confirmed.

Table 5 columns: participants group (GE - all experimental groups, GC - all control groups); closed type questions (total number of questions = 25); open type questions (total number of questions = 5); overall score (total number of questions = 30).

In the analysis of the 50% lowest ranked participants, the calculated t value is 1.226, while the p value is 0.223. A statistically significant difference does not exist in this case, despite the higher average scores recorded in the experimental group. The experimental group achieved a 12.96% better score than the control group in the post-test. We conclude that there is no statistically significant difference between the control and experimental groups in the analysis of the 50% lowest ranked participants and thus we reject hypothesis H3 of this research.
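The split used for H2 and H3 can be sketched as follows. The function name and all scores are hypothetical; placing the extra participant in the lower half when a group has an odd size, and breaking ties by input order, are assumptions, since the text does not describe either case.

```python
from statistics import mean


def split_halves(pre_test_scores):
    """Split one group into highest and lowest ranked halves by pre-test score."""
    ranked = sorted(pre_test_scores, key=pre_test_scores.get, reverse=True)
    half = len(ranked) // 2  # with an odd size, the lower half is larger
    return ranked[:half], ranked[half:]


# Illustrative data only, not taken from the study.
pre_test = {"a": 18, "b": 12, "c": 22, "d": 9}
post_test = {"a": 20, "b": 11, "c": 25, "d": 10}

top, bottom = split_halves(pre_test)
top_mean = mean(post_test[s] for s in top)
bottom_mean = mean(post_test[s] for s in bottom)
print(top, bottom, top_mean, bottom_mean)
```

The same split would be applied to the experimental and the control group, after which the post-test scores of the matching halves are compared with a t-test.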

Conclusion
In this research we had a total of 256 participants and tested three hypotheses. The confirmation of H1 and H2 was expected, but the rejection of H3 was a surprise. This finding could have a serious impact on future research involving participants with lower prior knowledge. Based on the average number of points on the post-test, where we analyzed only the results of participants who achieved at least 50% of the total score on the pre-test, we confirmed hypothesis H1. Similarly, we analyzed the results of participants with the highest ranking and confirmed hypothesis H2. In our analysis of the results for participants with the lowest ranking we found no statistically significant difference and thus rejected hypothesis H3. This means that the test results of lower-ranked students in the control group did not differ statistically significantly from those of lower-ranked students in the experimental group. Gamification did not measurably influence these students, which is interesting and definitely needs more exploration in the future.
The research extends our earlier results (published at the international conferences CECIIS and MIPRO), where the positive impact of gamified e-courses on students' knowledge was measured. This research shows that there is an even greater difference in the results if only the highest ranked students from the control group are compared to the highest ranked students from the experimental group.
There are still some remaining challenges and open questions in this field. One of the biggest challenges is creating a standardized solution that would be universal for all Moodle e-courses in the IT field as well as in any other field. Open questions we will try to answer in our future work include: when does an e-course start to be a gamified one and when does it stop, and what are the best elements from computer games to apply in higher education e-learning?