A Twitter Sentiment Analysis Model for Measuring Security and Educational Challenges: A Case Study in Saudi Arabia

Ensuring the good psychological health of the community is one of the highest priorities in modern societies. Therefore, having a sense of the community’s rhythm and mood is a very important factor in understanding what challenges it may be facing. Psychological challenges differ from one society to another. Hence, every community has its own specific psychological scales. In the context of present-day Saudi Arabia and many other countries, measuring educational and security challenges is critical, as it can enable decision-makers to avoid anticipated risks. Traditional psychological scales in the form of questionnaires are time consuming to administer and analyze, especially where data need to be collected from a massive sample distributed over a wide geographical area. Such scales are impractical and ineffective in terms of providing critical results, especially in today’s rapidly changing environment. Therefore, this research proposes an approach to identify and measure the educational and security challenges facing Saudi society through Twitter sentiment analysis. The psychological measurement standards of three key categories of educational and security challenges were identified and broken down into selected keywords that best identified these challenges. Then Arabic tweets that contained those keywords were analyzed in order to develop a model that could classify new tweets into one of the three types of challenges. The proposed model was better able to predict tweets belonging to the cross-cultural challenge and to the ethics of dialogue and rules of difference challenge than those belonging to the dominant negative social values challenge. The results of this research show that the sentiment analysis of tweets could provide a faster and cheaper alternative to the use of traditional psychological scales.


Introduction
Psychological rigidity has become one of the hallmarks of the healthy community (Rappaport and Seidman, 2000). Problems and issues in the social and psychological domains are defined as challenges. Classical social and psychological researchers study a problem within a specific scope by applying a psychological test that is suitable for that particular problem. According to Urbina and Anastasi (1997), a psychological test is "an objective and standardized measure of behavior of an individual's performance of tasks that have usually been prescribed beforehand". The most common type of psychological test is a paper-and-pencil test consisting of a series of questions that a respondent answers for the sake of the measurement. The answers to these questions produce a test score. Psychometrics has been recognized as the approved term for the science behind psychological testing (Mellenbergh, 1989). In this study, we introduce Twitter as a psychological test medium. Specifically, we use Twitter to measure the extent of some security and educational challenges in the Kingdom of Saudi Arabia.
A number of security and educational challenges are threatening educational security. Each nation has its own educational ideas that stem from its values and its stylistic and philosophical constants, which shape its identity and character. Educational security protects nations through the maintenance of those educational values (Ali, 2013). The security and educational challenges can be categorized into three main types: (1) cross-cultural, (2) ethics of dialogue and rules of difference and (3) dominant negative social values. The cross-cultural challenge has been defined as following a civilizational style to apply some beliefs and believing that those beliefs represent the absolute truth (Arif, 1995). The ethics of dialogue and rules of difference challenge has been defined as a sort of communication between individuals that esteems those variations with others, which allows for true listening in a safe environment that offers possibilities for the transformation of self-awareness in each individual (Ashki, 2006). Dominant social values have been defined as those values that the majority of people in a society support at a particular time (Prilleltensky, 1989). If these values are negative, they are called dominant negative social values. An example of a dominant negative social value in Saudi Arabia at the moment could be a value such as "men and women are considered not to be equal."
The objective of sentiment analysis is to determine the attitude of a speaker, writer, or other subject with respect to some topic (Cambria and Hussain, 2012). Recently, social media has grown exponentially and now contains a huge amount of human activity, making it a fertile source for sentiment analysis (Pfeffer, 2014). Twitter was created to be a fast communication medium for people from all walks of life and is considered one of the most important social networking sites on the World Wide Web, with a global impact.
The use of Twitter sentiment analysis to understand and predict emotions, opinions, or moods has been proven theoretically and demonstrated practically (Maynard and Bontcheva, 2016). Sánchez et al. (2017) highlighted the value of Twitter as a new data source for the social sciences. Twitter represents an abundant source for many human studies due to its quantitative and qualitative content, not least because more than 140 million active users publish over 400 million tweets every day (Li et al., 2012). Moreover, in the Middle East, Twitter is becoming a prominent player in socio-political events such as the Arab Spring (MacEachren et al., 2011). Vosoughi et al. (2015) discussed the importance of Twitter in the analysis and study of human behavior and referred to the fact that less than 10% of Twitter accounts are private.
The importance of measuring educational and security challenges, together with the availability of a huge amount of data on human activities online, encouraged us to measure these challenges by using social media, and more precisely Twitter.
Twitter is currently the most popular social networking platform in Saudi Arabia. In 2015, according to the Alyaum website (2015), 500,000 tweets were posted per day in Saudi Arabia, and this figure is likely to have increased since then. In this study, we introduce a novel approach to measure psychological hardiness by using tweets originating in Saudi Arabia. The research question we seek to address is: What is the relation between educational and security challenges and timing in Saudi Arabia? It should be noted that in this study we target only Arabic tweets. The methodology section provides details about the data collection and analysis procedures.
The traditional method of measuring educational and security challenges is to conduct a survey by using a questionnaire that is filled in directly by targeted persons at a specific time or at a time of their own choosing. Hence, the answers are dependent on the respondents' mood at the time they complete the questionnaire. According to Choi (2011), educational and security challenges are influenced by variable factors, making them vulnerable to change. For instance, in a psychological test to measure educational and security challenges that was developed by Mikhamar (1996), the first question is "Whatever the obstacles, I can achieve my goals" and the respondent is asked to choose one answer from "Always," "Sometimes" or "Never." A respondent could choose "Always" but at another time he/she might choose "Sometimes." Consequently, measuring psychological hardiness at one point in time, regardless of the sample type or size, may not provide a clear picture of the phenomenon under study or accurate results.

Related Works
In the current decade, Twitter sentiment analysis has attracted many researchers and practitioners due to its promising results. Usually, sentiment analysis is applied in the business domain to understand customer behavior or feedback (Neuendorf, 2017). Numerous research and practical papers have proposed solutions based on sentiment analysis for different business domains, and recently sentiment analysis has been used to provide solutions in the social sciences as well. In this study, we use sentiment analysis as a research tool for the field of psychology. Therefore, we focus our discussion of the related works on a few examples that have proposed sentiment analysis as a solution for problems in various fields, in order to show that it is suitable, applicable and successful beyond the business domain.
For instance, Bollen et al. (2011) used the Profile of Mood States (POMS) to model public mood and emotion by using Twitter sentiment analysis, and they concluded that social, economic and political events do have a direct impact on public mood. As an example of using Twitter sentiment analysis in political science, Al-Khalifa (2012) used graph analysis to understand the underlying social structure of Saudi political activities on Twitter. The author used NodeXL to visualize the shape of political networks for different hashtags and to ascertain the power of the relationships within these networks. In a similar vein, Ullmann (2017) used tweets to understand the effect of Twitter during the Egyptian revolution. In a different strand of research, Alabbas et al. (2017) used Twitter sentiment analysis to detect high-risk floods. On a more personal level, Desai et al. (2012) used Twitter sentiment analysis to check the usefulness of tweets that were issued during Kidney Week, and their results showed that Twitter is an effective medium through which to improve medical awareness. Schwartz (2016) analyzed tweets to predict individual well-being by modifying the Satisfaction With Life (SWL) and Positive Emotions, Engagement, Relationships, Meaning and Accomplishment (PERMA) psychological scales to make them suitable for Twitter sentiment analysis.
Many standard models and tools have been created for the purpose of performing Twitter sentiment analysis on English tweets, but unfortunately Arabic sentiment analysis lacks mature models and tools. However, some notable efforts have been made in this direction. For instance, Alabbas et al. (2017), mentioned above, used classifier techniques to analyze short informal (colloquial) Arabic text. Other examples of such research are Aldayel and Azmi (2016) and El Ballouli et al. (2017). To address the lack of data sources for this area of research, Refaee and Rieser (2014) created an Arabic Twitter corpus for subjectivity and sentiment analysis that has been published by the European Language Resources Association (ELRA). The publication of this corpus by the ELRA indicates that the use of Arabic tweets in sentiment analysis is an attractive area of research. An enormous amount of research has also been undertaken on Twitter as a data source for the sake of improving the automated understanding of the Arabic language (see the review by Boudad et al. (2017)).
However, to the best of our knowledge, our research is the first to use sentiment analysis of Arabic tweets to measure psychological behavior.

Development of a Sentiment Analysis Model for Measuring Security and Educational Challenges
The measurement of security and educational challenges through the analysis of tweets is both a theoretical and a practical challenge. From the theoretical perspective, many psychological tests can be used to measure security and educational challenges. However, for the purpose of this study we needed a psychological test that had been developed for the Middle East context so that we could modify it to make it suitable for Arabic tweet analysis. From the practical perspective, the work was challenging due to the lack of Arabic sentiment analysis tools. Currently, researchers and practitioners in the Arabic sentiment analysis domain are using a standard corpus that has been developed for commercial goals and which is thus not suitable for our purposes. To overcome this limitation, we developed a novel technique for Arabic sentiment analysis. Hence, our developed model makes two novel contributions: offering a way to measure security and educational challenges through tweets and providing a new technique for Arabic sentiment analysis. In the following, we provide details of the developed model.
First, some suitable psychological tests were selected for each security and educational challenge. Due to the importance of measuring these challenges, numerous research studies have used different psychological tests, such as those of Kobasa (1979) and Maddi and Kobasa (1984). This importance drew the attention of researchers from academic and community institutions to study the psychological rigidity of different sectors of society. Psychological factors vary from place to place, and social or religious factors affect them as well. Hence, researchers have developed scales for psychological hardiness to match their area of interest. Therefore, we felt that it would be appropriate to use a general psychological scale for measuring security and educational challenges that was developed to match the nature of Saudi Arabian society. One of the most well-known psychological hardiness scales for Arab countries is that proposed in Mikhamar (1996), which has been used in many PhD theses and research papers. The other source of psychological scales for measuring security and educational challenges in the Middle East is Arif (1995). Therefore, in this research, we used the psychological scales in Mikhamar (1996) and in Arif (1995). These psychological scales are in the form of a traditional questionnaire that consists of direct questions targeted at a specific population and are evaluated by statistical methods.
We modified these psychological scales by combining related questions from both to form specific categories. For instance, in the works of Mikhamar (1996) and Arif (1995) we found that a total of five questions are used to measure "negative thinking," so we replaced these five questions with the category "negative thinking". Then we identified a number of keywords to represent "negative thinking". In brief, in our modification of the psychological scales, we (1) chose a suitable psychological scale for each challenge; (2) grouped the related questions under a specific category; (3) unified the polarity, so that every category is expressed as either a positive or a negative feeling; and (4) represented each category by keywords. Note that this four-step process was evaluated and approved by a group of experts. Table 1 shows a sample of the keywords and their associated challenge. The categories and keywords have been translated into English for the sake of clarity (the original Arabic words are in parentheses).
In order to develop our proposed model, we followed five steps:
Step 1: With the assistance of experts, we identified a group of keywords that represent the security and educational challenges that the proposed model is expected to discover.
Step 2: We used the Search Twitter operator in RapidMiner Studio (set for the Twitter streaming API) to search Twitter for Arabic tweets made within the boundaries of Saudi Arabia. The resulting tweets were generated as records, and each record consisted of {name, to-user, source (device), text, geo-location latitude, geo-location longitude, retweet count}. We removed retweets (-rt) and links (-http) to reduce noise and duplication. The output of this step was a dataset in which each tweet was represented as a record. In the traditional questionnaire process, this dataset equates to the data obtained from filled-in questionnaires.
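The filtering in Step 2 was performed inside RapidMiner through the search query itself. As a rough illustration only, the same idea can be sketched in Python as a filter applied to records that have already been collected. The `TweetRecord` class, the function names, and the exact matching rule (a text that starts with "RT " or contains "http") are assumptions made for this sketch and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class TweetRecord:
    # Fields mirror the record layout listed in Step 2.
    name: str
    to_user: str
    source: str
    text: str
    latitude: float
    longitude: float
    retweet_count: int


def is_noise(record: TweetRecord) -> bool:
    """Return True for retweets and for tweets carrying a link.

    Assumed rule: a retweet starts with "RT " and a link contains "http".
    """
    text = record.text.strip().lower()
    return text.startswith("rt ") or "http" in text


def drop_noise(records: list[TweetRecord]) -> list[TweetRecord]:
    """Keep only original tweets that do not contain links."""
    return [record for record in records if not is_noise(record)]
```

Each surviving record then plays the role of one filled-in questionnaire in the traditional process.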
Step 3: We organized and cleaned the dataset by (1) tokenizing the dataset, i.e., extracting only the words from tweet texts; (2) removing Arabic stop words, such as "من" meaning "from"; and (3) stemming the remaining words to their original roots according to Arabic grammar. The output of this step was a dataset of clean tweet records containing only stem words, on which the proposed classification model would be tested.
Step 4: We developed the proposed classification model based on deep learning (Bengio et al., 2013) as a multilayer feed-forward artificial neural network that was trained with stochastic gradient descent using a backpropagation algorithm. The aim of this step was to build a model that could predict whether new tweets were positive, negative or neutral with respect to the educational and security challenges. The result was cumulative and based on the majority. As our aim was to identify security and educational challenges for the whole community, the output result reflected the community situation. A negative result showed the existence of a risk to security and educational values, and a positive result showed the opposite. The output of this step was a novel classification model.
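The cleaning in Step 3 and the majority-based aggregation in Step 4 can be illustrated with a minimal Python sketch. It is not the RapidMiner process used in the study. The stop-word list is a small illustrative sample, the function names are hypothetical, and root stemming is left out because it requires an Arabic morphological analyzer. The regular expression keeps only runs of basic Arabic letters (U+0621 to U+064A), so diacritics, digits and Latin text are discarded.

```python
import re
from collections import Counter

# Illustrative sample of Arabic stop words; the study's actual list is not given.
ARABIC_STOP_WORDS = {"من", "في", "على", "إلى", "عن"}

# Runs of characters from the basic Arabic letter range U+0621 to U+064A.
ARABIC_WORD = re.compile(r"[\u0621-\u064A]+")


def clean_tweet(text: str) -> list[str]:
    """Tokenize a tweet into Arabic words and drop stop words.

    Stemming to roots is intentionally omitted in this sketch.
    """
    tokens = ARABIC_WORD.findall(text)
    return [token for token in tokens if token not in ARABIC_STOP_WORDS]


def community_label(predictions: list[str]) -> str:
    """Aggregate per-tweet labels into one community-level label by majority."""
    if not predictions:
        raise ValueError("at least one prediction is required")
    return Counter(predictions).most_common(1)[0][0]
```

For example, if most tweets in a collection window are classified as negative, `community_label` returns "negative", which corresponds to the reading of a risk to security and educational values described in Step 4.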
Step 5: In this step, we validated the developed model by employing 10-fold cross-validation (Kohavi, 1995), i.e., k-fold cross-validation with k = 10, to test the prediction accuracy of the developed model on the dataset. Cross-validation split the data into k subsets. Each time one subset was used as the validation set, the remaining k-1 subsets were used as the training set. The results obtained from the k experiments were averaged to get a single validation value. The output of this step was a validated classification model. Table 2 summarizes the process followed in developing our proposed sentiment analysis model and Figure 1 illustrates the life cycle of the proposed model.
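The cross-validation procedure in Step 5 can be sketched in plain Python as follows. This is an illustration of the general technique, not of the RapidMiner operator used in the study. The names `k_fold_indices`, `cross_validate` and `train_and_score` are hypothetical, and the folds are formed by simple interleaving without shuffling or stratification, which is a simplification.

```python
from statistics import mean
from typing import Callable, Sequence


def k_fold_indices(n_samples: int, k: int) -> list[list[int]]:
    """Split indices 0..n_samples-1 into k disjoint folds of near-equal size."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Interleaved assignment: sample i goes to fold i % k.
    return [list(range(start, n_samples, k)) for start in range(k)]


def cross_validate(
    samples: Sequence,
    labels: Sequence,
    train_and_score: Callable[[list, list, list, list], float],
    k: int = 10,
) -> float:
    """Average the validation score over k train/validation splits.

    train_and_score(train_x, train_y, valid_x, valid_y) is a caller-supplied
    function that fits a model on the training part and returns its score
    on the validation part.
    """
    scores = []
    for fold in k_fold_indices(len(samples), k):
        held_out = set(fold)
        train_idx = [i for i in range(len(samples)) if i not in held_out]
        scores.append(
            train_and_score(
                [samples[i] for i in train_idx],
                [labels[i] for i in train_idx],
                [samples[i] for i in fold],
                [labels[i] for i in fold],
            )
        )
    return mean(scores)
```

With k = 10, each tweet is used for validation exactly once and for training nine times, and the returned value is the single averaged validation figure described above.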

Results
The results were obtained by developing the classification model with a deep learning operator in RapidMiner. In line with Al-Rubaiee et al., the model was evaluated in terms of precision, recall and accuracy. Table 3 shows the precision and Table 4 the recall results for Challenge 1 (cross-cultural), for which the accuracy is 64.06 ± 5.56%. From the tables, the highest precision (73.48%) and recall (75.91%) are for the negative tweets. Table 5 shows the precision and Table 6 the recall results for Challenge 2 (ethics of dialogue and rules of difference). From the tables, the highest precision (67.52%) and recall (75.60%) are for the positive tweets. According to equation (3), the accuracy is 61.99 ± 6.23%. Table 7 gives the precision and Table 8 the recall for Challenge 3 (dominant negative social values). According to equation (3), the accuracy is 61.46 ± 6.67%. From the tables, the highest precision (65.78%) is for the negative tweets, while the highest recall (78.64%) is for the positive tweets.
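The equations referred to in this section are not reproduced here. For orientation, the standard per-class definitions of these measures are given below, on the assumption that the study follows them. The symbols are introduced for this illustration only: for a class c (positive, negative or neutral), TP_c, FP_c and FN_c are the numbers of true positives, false positives and false negatives, and N is the total number of classified tweets. The last expression is presumably what the text calls equation (3).

```latex
\begin{align}
\mathrm{Precision}_c &= \frac{TP_c}{TP_c + FP_c} \\
\mathrm{Recall}_c    &= \frac{TP_c}{TP_c + FN_c} \\
\mathrm{Accuracy}    &= \frac{\sum_{c} TP_c}{N}
\end{align}
```

Under these definitions, precision answers how many of the tweets predicted as class c truly belong to it, whereas recall answers how many of the tweets that truly belong to class c were retrieved.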
The precision results indicate that the developed model can best predict the tweets that reflect Challenge 2, while the recall results indicate that it can better retrieve the positive tweets associated with Challenge 3. As the attitude of the public changes with frequent changes in mood and in response to events, this implies that the keywords for the challenges, especially Challenge 3 (dominant negative social values), might need to be reviewed in order to develop an adaptive method that can look for keywords that change with the mood of the public at large.
The results of this research can be used for the following: the assembly of a mechanism that acts as an early warning system about the public mood; the identification of possible educational and security threats; and the protection of the community as part of a strategy of prevention rather than cure.

Conclusion
The impact of Twitter as the most famous microblogging site in Saudi Arabia is a phenomenon that has attracted researchers interested in studying societal behavior. In this research, we introduced a novel way to measure educational and security challenges by applying sentiment analysis to Arabic tweets. The methodology consisted of the following steps: (1) identifying the keywords needed to discover security and educational challenges in Twitter content; (2) using the Twitter streaming API to construct a dataset of tweets posted by Twitter users over a specific period of time; (3) using data mining algorithms to develop a classification model based on the identified keywords to classify tweets; and (4) evaluating the performance of the developed model. The aim of this research was to develop a classification model that could determine whether the current tweet stream contained tweets representing three major security and educational challenges, namely, the (1) cross-cultural, (2) ethics of dialogue and rules of difference and (3) dominant negative social values challenges. The results showed that the model was able to predict Challenge 1 and Challenge 2 tweets more effectively than Challenge 3 tweets.
In addition, this research provided a faster and cheaper alternative to two traditional psychological scales. As future work, detailed scientific studies should be undertaken to compare our developed model with traditional statistical methods.