Trends in Preventive Behaviors Across Countries: An Exploratory Study of English Language Tweets

People adopt preventive behaviors to protect themselves from physical harm, psychological threats, or health issues. Previous studies have suggested that preventive behavior trends are closely connected to cultures and values. However, the specific preventive behavior categories, their specific elements, and the preventive behavior differences between countries have never been fully examined. Therefore, using a topic modeling approach to analyze tweets from across the world, this study categorized the preventive behaviors and identified the specific trends in 68 countries. Six main topic foci were found: life, disease, saving others, physical, psychological, and crime. The extent of these preventive behaviors in each country was analyzed to determine how they were organized and valued in each country. The results of this study could be useful for international firm management, marketing, and government policies and could be the starting point for further international preventive behavior studies.


Introduction
Preventive behavior is important for maintaining health, coping proactively in disaster situations, and remaining secure in possibly dangerous situations (Agliardi et al., 2016). This has been particularly evident during the COVID-19 pandemic, with many healthcare measures being introduced to avoid infection (Gefen and Ousey, 2020; Goodwin et al., 2020). This outbreak has also enhanced concerns about mental health (Niederkrotenthaler et al., 2020). Reger et al. (2020) found that the significant stress from the COVID-19 pandemic in the U.S. resulted in increased suicide rates. Although preventive behaviors are common, the degree of care varies widely because preventive behavior generally depends on a person's risk acceptance. Rieger et al. (2015) examined global risk preferences in a study on 6,912 university students in 53 countries and found that economic conditions and culture resulted in country differences. Rieger et al. (2015) also found that risk aversion was related to the "uncertainty avoidance" construct suggested by Hofstede (1980), which views preventive behaviors as being associated with the degree of anxiety (Hofstede et al., 2005). Hofstede's (1980) global survey found that there were certain cultural trends of uncertainty avoidance. In high-uncertainty-avoidance countries, people were found to be more sensitive to ambiguity and anxiety and, therefore, tended to avoid uncertain or unknown situations (Hofstede et al., 2005).
Preventive behavior targets have also been found to differ. For example, in countries or regions that have low economic development, populations tend to suffer from financial anxiety (Vieider et al., 2018; L'Haridon and Vieider, 2019), and in conflict-affected regions, populations are more anxious about protecting their own and their families' lives. Therefore, the types of preventive behaviors vary depending on the situation. Overall, however, there has been little interdisciplinary cross-country research to determine the commonalities and differences in preventive behavior trends.
Therefore, this study sought to fill this gap and broaden the understanding of preventive behaviors. An exploratory, interdisciplinary framework drawing on multiple theories was used to comprehensively clarify the preventive behavior characteristics associated with health care, psychological concerns, consumption fears, and prosocial issues in each country. The results from this study could inform relevant research on preventive behavior in a range of disciplines, such as cultural studies, business management, health care, and mental health, and could also assist international organization management, marketing, and medical practitioners in better understanding the cultural patterns associated with preventive behaviors.
As there are certain elements affecting the construction of preventive behaviors, this exploratory study also sought to extract certain patterns. In most previous cross-country risk-related cultural studies, multiple country surveys have generally been used; however, it is difficult to measure a subject's unconscious behaviors and collect broad country data from surveys alone. Therefore, this study analyzed the behavioral data embedded in tweets, the microblog posts of Twitter, a social networking service that is widely used across the world. From January to December 2019, 1,763,821 tweets that mentioned "prevention" were collected from 166 countries and then classified into topic model categories using text analysis. Six preventive behavior topic types were found, and the specific trends in each country were analyzed.

Motivations for Preventive Behavior
Preventive behavior is motivated by a personal need to reduce danger, anxiety, ambiguity, and fear (Janis, 1967; Hofstede, 1980; Rogers, 1983; Maddux and Rogers, 1983). Rogers (1975) explained preventive behavior through Protection Motivation Theory (PMT), which identifies three main motivation factors for eliminating and reducing negative threats such as war or crime: (1) the possible harmfulness of an event, (2) the likelihood of the event, and (3) the effectiveness of the coping behavior.
Preventive behavior motivation, however, is not limited to only eliminating or reducing negative threats. Maddux and Rogers (1983) revised the PMT and added a fourth psychological factor, self-efficacy expectancy, which was related to a person's ability to cope with an event (Bandura, 1977); therefore, expectancy could also be a preventive behavior motivation.
The reliability of PMT has been confirmed in many studies. For example, Floyd et al. (2000) conducted a PMT meta-analysis on 65 studies and found that these four preventive behavior motivations were common; that is, most studies considered both the physical and psychological preventive behaviors. The preventive behavior motivations for health and threats have been discussed for many decades. Beck and Frankel (1981) had already suggested that self-efficacy expectancy could be a predictor of preventive health care against potential health threats. PMT has been seen to be effective for many years and is still relevant today, as evidenced in several recent studies (Wang et al., 2019; Bashirian et al., 2020). Therefore, PMT is also relevant to health prevention behavior motivation.
Health protection behaviors have also been described as part of the health belief model (Rosenstock, 1974). Janz and Becker (1984) identified three health belief model streams: Preventive health behaviors before an illness or injury, actions taken after a diagnosis, and reasons for visiting a clinic. In a more recent study, Jansen et al. (2021) connected these streams to people's preventive behaviors. Therefore, based on these previous studies, both the PMT elements and the health threats were considered in this study.

Consumer Activity
Consumption is a critical human activity. Some preventive behaviors related to consumption are reviewed in this section. Consumption refers to product purchase, possession, usage, and disposal activities (Solomon, 2013); that is, post-purchase activities are also included in consumption. Consumption involves exchanging money for products or services; therefore, the anxiety and risks of cost are connected to consumption (Ortega-Egea and García-de-Frutos, 2021). When people purchase, maintain, and sometimes dispose of products, they may experience financial, quality, and security anxiety and risk (Kamalul Ariffin et al., 2018). As a purchase situation involves a balance between cost and quality (Sweeney et al., 1999), if it is unbalanced, people may experience financial, quality, and security anxiety and risk (Peter and Tarpey, 1975). Maintaining products can ensure continued quality; however, the frequency of replacement increases disposal costs.
To avoid this kind of anxiety and risk, people tend to collect information (Ahtola, 1984) to compare offerings and prices and to maintain their possessions, which can reduce their anxiety and risk perceptions. Given this context, preventive behavior related to consumption was also considered in this study.

Prosocial Behavior
PMT, the health belief model, and consumption are all concerned with inner-oriented preventive behavior. To protect their physical safety or psychological state, people try to avoid threatening health and consumption events by maintaining their health and collecting information. There are, however, also other-oriented behaviors, as people sometimes act not only for themselves but also for other people or the environment (Batson and Powell, 2003). Sustainable behavior, which refers to activities that are aimed at protecting and preserving society (Balderjahn et al., 2013), is one example. People also address social problems, such as human rights and environmental issues, to attain a sustainable society. In organizational studies, behavior motivated toward others is called prosocial behavior. Batson and Powell (2003) defined prosocial behavior as an activity focused on benefiting others rather than oneself or making a sacrifice to help others (Bolino and Grant, 2016). From an analysis of 221 salespeople, George (1991) concluded that prosocial behavior enhanced community mood; in a study on 82 university employees and 162 students, O'Reilly and Chatman (1986) found that prosocial behavior was generated when people identified their value within their community; and in four experiments on undergraduate and graduate students, Grant and Gino (2010) found that prosocial behavior generated gratitude. Therefore, it was surmised from these studies that preventive behaviors encompass both inner-oriented and other-oriented behaviors.
In summary, as presented in Table 1, preventive behaviors include PMT, health beliefs, consumption, and prosocial behavior. This article extracted these preventive behavior types from related tweets and then extracted keywords to further analyze preventive behaviors in each country.

Patterns of Preventive Behaviors
This section investigates the numbers and types of preventive behaviors by analyzing text data classified using a topic model. Topic models have been used in social science fields, such as economics, business studies, and political studies (Reisenbichler and Reutterer, 2019; Sterling et al., 2019; Mustak et al., 2021).
This study uses a topic model to investigate data from social networking services, especially Twitter, to estimate preventive behavior patterns. There are several motivations for using social networking services. One of the main motivations is for people to express themselves (Flecha-Ortíz et al., 2021; Valkenburg et al., 2016) by generating personal posts and by sharing other posts, both of which are the main functions of social networking services (Coles and Saleem, 2021; Valkenburg et al., 2016).
Some people also use social networking services to feel a sense of belongingness (Savolainen et al., 2020; Kuss and Griffiths, 2017). Many formal and informal groups are established within social networking services so that people sharing common ideas can communicate, enhancing their sense of belongingness (Dobbins et al., 2021; Valkenburg et al., 2016). Social networking services are also sources of information (Gibson et al., 2021; Gil de Zúñiga et al., 2014), including word-of-mouth information, new information, and information about other people's reactions and experiences. All of this can influence beliefs and decision-making (Rajamma et al., 2020; Gil de Zúñiga et al., 2013). Some studies have shown that people learn about their own identities, relationships with others, and the actions they should take by using social networking services (Abbas et al., 2019; Valkenburg et al., 2016). Therefore, social networking services can be a valuable medium for exploring people's behaviors.
Presently, there are many social networking services, such as Instagram, Facebook, Snapchat, LinkedIn, and Twitter. Among these, Twitter is a widely used global social networking service for sending short microblog posts, called tweets, which were originally limited to 140 characters. Twitter is unique as it allows users to express their feelings, emotions, and thoughts through text rather than image information. These tweets have been collected in recent studies to assess particular behaviors (Le et al., 2019; Doogan et al., 2020; Shahi et al., 2021). For example, Otsuki et al. (2018) forecast disasters in Japan by analyzing people's movements in their tweets; therefore, it is possible to explore and analyze people's preventive behaviors using this kind of behavioral data.

Materials and Methods
A topic modeling classification approach was taken in this study to classify the target documents using specific keywords. This method has been used in a variety of studies to classify mail, newspaper articles, journal articles, customer reviews, and microblogs (Blei, 2012). For example, Bohr (2020) used a topic model to extract 28 topics from 78,000 U.S. climate change newspaper articles over two decades, and Hu et al. (2019) used a topic model to classify the complaints of New York City hotel customers in 27,864 online reviews, from which 10 main topics were found. Similar to the method employed in this study, Mutanga and Abayomi (2020) investigated COVID-19 topics in South Africa using a topic model.
Topic models using the Latent Dirichlet Allocation (LDA) method search for the co-occurrence of words and identify the keyword clusters, which are then used to calculate the topic probabilities in each document and describe the topic contents. Therefore, LDA was applied in this study for the tweet analysis (Blei et al., 2003).
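For readers unfamiliar with LDA, the standard generative model from Blei et al. (2003) can be summarized as below. This is the textbook formulation, not a description of the exact settings used in this study; the symbols K (number of topics), D (number of documents), and the hyperparameters α and β are notation introduced here for illustration.

```latex
\begin{align*}
\phi_k &\sim \mathrm{Dirichlet}(\beta), & k &= 1,\dots,K \\
\theta_d &\sim \mathrm{Dirichlet}(\alpha), & d &= 1,\dots,D \\
z_{d,n} \mid \theta_d &\sim \mathrm{Categorical}(\theta_d) \\
w_{d,n} \mid z_{d,n}, \phi &\sim \mathrm{Categorical}(\phi_{z_{d,n}})
\end{align*}
```

Here \phi_k is the word distribution of topic k (its highest-probability words act as the topic's keywords), \theta_d is the topic distribution of document d (a tweet), z_{d,n} is the topic assigned to the n-th word of document d, and w_{d,n} is the observed word.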

Data Collection and Cleaning
Approval was given by Twitter Inc. to access the database using an API (application programming interface) key to collect tweets from January to December 2019 that mentioned preventive behaviors. Although preventive behaviors have been the main feature of the COVID-19 pandemic, the purpose of this study was to clarify the preventive behaviors in normal times; therefore, only pre-COVID tweet data were collected. This study collected English language tweets for several reasons. First, besides its many native speakers, the English language is used by the largest number of nonnative speakers in the world and is the most widely used global language (Eberhard et al., 2021). Further, because interpreting the results from multiple language text analyses can be complex, this study targeted only English language tweets. To avoid overloading the database, following the suggestion of Mahmud et al. (2020), only tweets at the end of each month were randomly sampled; overall, 1,763,821 tweets were collected.
The Twitter user communication functions include retweets and favorites. Users can retweet or share tweets posted by others and favorite posts they like or agree with. Therefore, retweeted and favorited tweets were included in the analysis because they reflected the interest in the preventive behaviors. Because the raw data could not be directly analyzed, as suggested by Sommer et al. (2012), the data were subjected to a two-step pre-process cleaning procedure before the analysis.
First, extraneous elements, such as @ symbols, URLs, emojis, numbers, and punctuation, and the words listed as stop words in the text mining package (tm) in R, such as "I," "she," "it," and "the," were removed from the raw data. Then, stemming was used to clean the documents by reducing the words to their root form (Paice, 1996).
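The two-step cleaning procedure can be sketched as follows. This is a minimal illustration in Python rather than the R tm pipeline used in the study: the stop-word list is a tiny hypothetical subset, and `crude_stem` is a deliberately simple stand-in for a real stemmer, so its output will differ from that of the study.

```python
import re

# Tiny illustrative stop-word list (assumption); the study used the list in R's tm package.
STOP_WORDS = {"i", "she", "it", "the", "a", "an", "to", "and", "is", "of"}

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
NON_LETTER_RE = re.compile(r"[^a-z\s]")  # drops digits, punctuation, and emojis


def crude_stem(word):
    """Very rough suffix stripping; a hypothetical stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def clean_tweet(text):
    """Step 1: strip noise and stop words. Step 2: stem the remaining tokens."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = NON_LETTER_RE.sub(" ", text)
    tokens = [t for t in text.split() if t not in STOP_WORDS]
    return [crude_stem(t) for t in tokens]


print(clean_tweet("@who She is preventing 3 diseases! https://t.co/abc"))
```

Stemming maps inflected forms such as "preventing" and "prevented" to a shared root, so that they are counted as the same term in the topic model.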

Number of Topics
The four indicators (Fig. 1) proposed by Arun et al. (2010), Cao et al. (2009), Deveaud et al. (2014), and Griffiths and Steyvers (2004) were used to estimate the latent number of topics; the indicators by Arun et al. (2010) and Cao et al. (2009) should be minimized, and the indicators by Deveaud et al. (2014) and Griffiths and Steyvers (2004) should be maximized. As illustrated in Fig. 1, the indicators by Arun et al. (2010) and Cao et al. (2009) are minimized at six topics and the indicators by Deveaud et al. (2014) and Griffiths and Steyvers (2004) are maximized at six topics. Therefore, the topic modeling sample analysis was based on these six topics. Using these settings, the topic model was implemented in R (version 3.6.1) using the topic model and the LDA packages. Table 2 presents the results of the topic modeling and the top 10 words, which were the characteristic terms used to identify and distinguish each topic: PMT, health beliefs, consumption, and prosocial.
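The selection rule described above (two indicators to minimize, two to maximize, all agreeing on one value) can be sketched as follows. The indicator scores below are invented purely for illustration and are not the values plotted in Fig. 1; the function names are likewise hypothetical.

```python
def preferred_k(scores, goal):
    """Return the candidate number of topics at which one indicator is best."""
    pick = min if goal == "min" else max
    return pick(scores, key=scores.get)


def consensus_k(indicators):
    """Return (k, votes); k is None unless every indicator prefers the same k."""
    votes = {name: preferred_k(scores, goal)
             for name, (scores, goal) in indicators.items()}
    distinct = set(votes.values())
    agreed = distinct.pop() if len(distinct) == 1 else None
    return agreed, votes


# Made-up scores keyed by candidate number of topics (NOT the study's data).
indicators = {
    "Arun2010": ({2: 0.9, 4: 0.6, 6: 0.3, 8: 0.5, 10: 0.7}, "min"),
    "Cao2009": ({2: 0.8, 4: 0.5, 6: 0.2, 8: 0.4, 10: 0.6}, "min"),
    "Deveaud2014": ({2: 0.3, 4: 0.6, 6: 0.9, 8: 0.7, 10: 0.5}, "max"),
    "Griffiths2004": ({2: -900, 4: -700, 6: -500, 8: -600, 10: -650}, "max"),
}

k, votes = consensus_k(indicators)
print(k, votes)
```

In practice the four indicators do not always agree; returning `None` in that case makes the disagreement explicit rather than silently picking one indicator.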

Results
Topic 1 was focused on the preventive factors to avoid a decrease in life quality and make people's lives more comfortable. For example, there were tweets about products that made people's lives better and about talking to someone to share information of this kind. This topic was not limited to consumption contexts and dealt with the broader meanings of the information needed to make lives better; therefore, topic 1 was classified as "life." Topic 2 was focused on the tweet keywords for behaviors that prevented diseases, such as disease features, effective drugs, and new treatments. As this topic was closely related to the health threat prevention behaviors recognized in PMT and health belief constructions, topic 2 was classified as "disease." Topic 3 was focused on prosocial behaviors, with many tweets on suicide prevention, such as saving other people; therefore, topic 3 was classified as "saving others." Topic 4 was focused on physical violence and included tweets about gun violence, violence toward women, and school violence. As this topic encompassed the physical preventive behaviors identified in PMT, it was classified as "physical." Topic 5 was focused on psychological health, with many tweets mentioning mental health. Although "health" was included in the top keywords, this topic was different from topic 2 and was in line with the preventive behaviors to lessen psychological threats in the PMT; therefore, topic 5 was classified as "psychological." Topic 6 was focused on preventive behaviors to minimize crime, such as decreasing crime fatalities and avoiding HIV infections and sexual offenses. Although the crime topic also included preventive physical and psychological behaviors, it was considered different from topics 4 and 5; therefore, topic 6 was classified as "crime."
Some of the keywords in each topic, such as "someone," "remind," "look," "new," and "hard," appeared to be general words, but they had specific topic meanings in each context; for example, "Say something nice about someone …," "we'd like to remind everyone there is hope and help …," "My new blog looks at … to prevent 150,000 heart attacks," and "I've been fighting so hard in suicide prevention …." After the six topics were extracted from the tweet data using the topic modeling, the trends in each country were analyzed and compared.

Post-Hoc Analysis of the Topic Trends
Using the keyword frequencies, the trends for each of the six preventive behavior topics were analyzed for each country. The top 10 keywords in each topic were used as the bag of words (BOW), and the keyword frequencies per tweet in each country were calculated. To estimate the topic trends in each country, a four-step data cleaning process was conducted.
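The per-country keyword frequency can be illustrated with a short sketch. It assumes that each tweet has already been cleaned into a list of stemmed tokens and tagged with a country; the keyword sets below are hypothetical placeholders, not the Table 2 keywords, and the function name is introduced here for illustration only.

```python
from collections import defaultdict


def keyword_rate_by_country(tweets, topic_keywords):
    """tweets: iterable of (country, tokens) pairs.

    Returns {country: {topic: mean number of topic keywords per tweet}}.
    """
    hits = defaultdict(lambda: defaultdict(int))
    counts = defaultdict(int)
    for country, tokens in tweets:
        counts[country] += 1
        for topic, keywords in topic_keywords.items():
            hits[country][topic] += sum(1 for t in tokens if t in keywords)
    return {
        country: {topic: hits[country][topic] / counts[country]
                  for topic in topic_keywords}
        for country in counts
    }


# Hypothetical keyword sets and tweets, for illustration only.
topic_keywords = {"disease": {"diseas", "drug"}, "crime": {"crime"}}
tweets = [
    ("A", ["diseas", "drug", "new"]),
    ("A", ["crime"]),
    ("B", ["diseas"]),
]
rates = keyword_rate_by_country(tweets, topic_keywords)
print(rates)
```

Dividing by the number of tweets per country keeps countries with very different tweet volumes comparable.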
First, location data from all tweets were extracted and 166 countries were identified. Second, the favorited tweets were extracted as these reflected the degree of user interest in the topic. Third, a random sample was drawn. Mahmud et al. (2020) suggested that random sampling is very useful when conducting big data analyses, as it can speed up the processing time and enhance scalability; they also indicated that the estimated outcomes were the same for all random samples. Given these advantages, 4,000 samples were randomly collected for each month in 2019. Because only 2,079 samples contained geographical data in January, all January samples were collected. Therefore, overall, 46,079 tweets were included in the post-hoc analysis. Finally, the sample sizes in each country were checked, and countries that had sample sizes of fewer than 10 were eliminated from the study as these would possibly have indicated unstable trends. After the data cleaning, data from 68 countries were analyzed. Fig. 2 shows the six preventive behavior trends in the 68 countries for the six identified topics: life, disease, saving others, physical, psychological, and crime. The retweets and favorite tweets in each country are also shown in Fig. 2.
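The sampling and filtering steps, and the arithmetic behind the total (11 × 4,000 + 2,079 = 46,079), can be sketched as follows. The data structures, function names, and the synthetic month sizes other than January are assumptions made for illustration; only the 4,000 cap, the 2,079 January tweets, and the threshold of 10 come from the text.

```python
import random
from collections import Counter


def monthly_sample(tweets_by_month, per_month=4000, seed=0):
    """Draw up to per_month tweets from each month; keep all if fewer exist."""
    rng = random.Random(seed)
    sample = []
    for tweets in tweets_by_month.values():
        if len(tweets) <= per_month:
            sample.extend(tweets)
        else:
            sample.extend(rng.sample(tweets, per_month))
    return sample


def drop_small_countries(sample, min_size=10):
    """Remove tweets from countries with fewer than min_size sampled tweets."""
    counts = Counter(t["country"] for t in sample)
    return [t for t in sample if counts[t["country"]] >= min_size]


# Synthetic data: January has 2,079 geotagged tweets; other months have 5,000 each.
tweets_by_month = {1: [{"country": "X"} for _ in range(2079)]}
for month in range(2, 13):
    tweets_by_month[month] = [{"country": "X"} for _ in range(5000)]

sample = monthly_sample(tweets_by_month)
print(len(sample))
```

Fixing the random seed makes the draw reproducible, which matters when the downstream country-level estimates are compared across runs.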

Discussion
This study used topic modeling to examine trends in preventive behavior tweets, from which six main topics were identified: life, disease, saving others, physical, psychological, and crime. Then, topic trends in 68 countries were analyzed in a post-hoc analysis. The topic trends identified were basically in agreement with previous studies (Batson and Powell, 2003; Peter and Tarpey, 1975; Rogers, 1975; Rosenstock, 1974).
The life topic, which was focused on preventive behaviors to make people's lives better and reduce information risk, was considered to be aligned with the consumption context in this study; however, it was not limited to consumption contexts as the preventive behaviors against disease also encompassed the health belief context in the PMT. The PMT covers many preventive behaviors, including those associated with physical and psychological threats. This study confirmed the presence of physical, psychological, and crime preventive behavior topics and it seems that physical and psychological topics were associated with crime prevention behaviors. The prosocial construct was included to assist in explaining PMT, health beliefs, and consumption preventive behaviors, with the 'saving others' topic being associated with one of the prosocial preventive behaviors. Overall, six main preventive behaviors were identified.
The PMT, health beliefs, consumption, and prosocial foci assisted in clarifying the preventive behaviors. However, some preventive behaviors could not be fully explained using a single concept. For example, the PMT and health belief preventive behaviors were associated with the disease, physical, and psychological topics, though these perspectives have difficulty explaining the topics about improving life and saving others. Similarly, the consumption and prosocial perspectives have difficulty explaining the disease, physical, and psychological topics. Therefore, an interdisciplinary perspective is needed when assessing the six types of preventive behaviors.
The post-hoc analysis compared the trends for these six preventive behaviors in 68 countries. It was found that the physical preventive behavior trend was stronger in African countries than in other regions, but the regional characteristics in the other preventive behavior trends were not clear.
To enhance the interpretation of the results, the relationships with other indicators, namely people's degree of happiness, the total number of suicides and deaths, the number of crimes, and GDP based on the purchasing power parity (PPP) exchange rate, were also examined. This is because people take preventive behaviors to eliminate anxiety and stress and maintain a happier state (Ortega-Egea and García-de-Frutos, 2021). Because the number of suicides, deaths, and crimes increases in stressed societies (Wells et al., 2017; Kivimäki et al., 2018; Reger et al., 2020), and the preventive behavior trends also vary depending on economic development (Vieider et al., 2018; L'Haridon and Vieider, 2019), the relevance between the results and these indicators was also assessed.
The degree of happiness was determined from the 2020 World Happiness Report published by the Sustainable Development Solutions Network; the total number of suicides and deaths from 1983 to 2016 was extracted from the global health estimates in the World Health Organization mortality database; the number of crimes was taken from the latest United Nations Office on Drugs and Crime data (UNODC, 2022) on the number of people held in prisons per 100,000 population; and the latest GDP was based on the 2011-2019 PPP data from the World Bank world development indicators database. The Pearson's correlations between these data and the six preventive behaviors were calculated, from which three significant correlations were found.
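For reference, the Pearson correlation between a topic score and a country-level indicator can be computed as in the sketch below. The numbers are invented for illustration and are not the study's data; the log transform mirrors the "(logged)" GDP and death counts reported in the results, and the t statistic is the standard one used for testing r against zero with n − 2 degrees of freedom.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def t_statistic(r, n):
    """t value for testing r against zero (n - 2 degrees of freedom)."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Invented example: topic score per country vs. GDP (PPP), log-transformed.
topic_score = [0.10, 0.25, 0.15, 0.40, 0.30]
gdp_ppp = [2_000, 15_000, 6_000, 50_000, 30_000]
log_gdp = [math.log(g) for g in gdp_ppp]

r = pearson_r(topic_score, log_gdp)
print(round(r, 3), round(t_statistic(r, len(topic_score)), 3))
```

Taking logarithms before correlating reduces the influence of a few very large economies (or very large death counts) on the coefficient.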
A positive correlation between saving others and people's degree of happiness was confirmed (r = 0.281, p<0.05), and a positive correlation between saving others and the GDP based on PPP (logged) was also confirmed (r = 0.296, p<0.05). These results suggested that "saving others" preventive behaviors were likely to occur in high-GDP and high-happiness countries. Economically developed countries often help developing countries through "Official Development Assistance" programs (OECD, 2017); this suggests that saving others may be enhanced as countries become more economically developed. The correlation also suggested that people may be more inclined to save others when their happiness is higher.
A negative correlation was found between psychological preventive behaviors and people held in prisons per 100,000 population (r = −0.372, p<0.05). However, the correlations of people held in prisons per 100,000 population with physical preventive behaviors (r = −0.118, n.s.) and with crime preventive behaviors (r = 0.163, n.s.) were not significant. These results suggested that there may be a hierarchical relationship among psychological, physical, and crime preventive behaviors; that is, psychological preventive behaviors possibly become more important when people's anxiety about physical threats and crime decreases. However, the hierarchical relationships between the topics suggested by these results require more precise verification. A negative correlation between disease preventive behaviors and the total number of deaths (logged) was found (r = −0.306, p<0.05), suggesting that disease preventive behaviors are focused on decreasing the total number of deaths and saving lives. This relationship has already been examined, with the previous results validating this supposition (Gefen and Ousey, 2020; Goodwin et al., 2020).

Conclusion
Some cultural studies have found country or regional differences (Hofstede et al., 2005; Schwartz, 2006; Inglehart and Baker, 2000). However, as far as we know, there have been no interdisciplinary verification studies focused on preventive behaviors. Although the uncertainty avoidance index developed by Hofstede (1980) is similar to the preventive behaviors examined in this study, this study provided more detail. The data analysis in this study identified six main preventive behaviors and found that these behaviors were related to economic factors, such as GDP, and to people's values, such as their perceptions of happiness. As suggested by Hofstede et al. (2005), the preventive behaviors in certain cultures are based on people's specific values. Therefore, this study enhances the understanding of cultural differences. Our findings suggest that it is difficult to explain people's preventive behaviors using only a single theoretical framework among PMT, health beliefs, consumption, and prosociality. As these frameworks are complementary, an interdisciplinary perspective is needed when examining preventive behaviors.
Since 2020, there has been a significant global research focus on preventive behaviors because of the COVID-19 pandemic (Gefen and Ousey, 2020; Niederkrotenthaler et al., 2020), as these preventive behaviors have been protecting people and society. However, country differences were found for the six identified preventive behaviors in this study. The health preventive behaviors were found to be significantly relevant to the total number of deaths in each country, and this index may be related to the speed of the virus spread. As the data in this study were focused on the period from January to December 2019, the virus effect had not yet become serious. Therefore, this study researched the effect of preventive behaviors in normal times rather than examining the influence of COVID-19 in each country. The tendency in normal times is considered to be rooted in people's habits and culture. This study suggests that there are characteristics of preventive behaviors in each country. Therefore, considering these characteristics can help motivate people's preventive behavior in the future, in all countries.
This study also has broad implications for management, marketing, and politics. Understanding the preventive behaviors of employees, consumers, and citizens is critical to company management, marketing, and government policies, and as employees and consumers often act beyond their country borders, comprehending their cultural and value differences is important (Ghemawat, 2007; Douglas and Craig, 2011). The analysis of the six preventive behaviors and the country differences revealed the specific cultural foci in each country, which could be useful for forecasting employees', consumers', and citizens' needs. Their motivations can also be inferred by considering our results. This article can be a milestone in the study of cross-cultural preventive behaviors.
However, there were some limitations in this study. First, only English tweets were analyzed, primarily because English is the most used language on Twitter (Eberhard et al., 2021). Also, English is regarded as the language with the largest number of nonnative speakers in the world and it has been adopted as a second or third language in most countries analyzed in this study (World Economic Forum, 2019). In addition, some people are actively learning English to communicate with the world (BC, 2013). Therefore, the results of this study are based on an English language sample only. As other languages such as Spanish and Chinese are used in tweets and similar microblogs such as Weibo in China, the analysis of these other languages and microblogs in future research would more accurately identify the cultural preventive behavior differences.
Second, this study measured six types of preventive behavior during 2019 only; therefore, research at intervals of several years is needed to understand the dynamic trends and the influences of events such as pandemics and war. In particular, the COVID-19 pandemic is likely to have a significant impact on people's attitudes toward preventive behaviors.