A Proposed Framework for Reducing Electricity Consumption in Smart Homes using Big Data Analytics

Smart homes equipped with smart technologies can provide better insights into saving energy and improving quality of life. All connected appliances are Internet of Things (IoT) devices that support applications inside a smart home, and they produce a large amount of data that shows what households do during their daily life. IoT and Big Data Analytics (BDA) have become the most popular technologies in our smart life and are rapidly affecting all areas of technology and business, increasing the benefits for organizations and individuals. This research paper reviews recent papers that use different techniques and discusses the challenges and benefits of BDA in a smart home and its relationship with IoT. Moreover, the paper proposes an approach that uses a clustering algorithm to analyse data in order to address the main research problem: the large power crisis in Egypt caused by high electricity consumption. Association rules are then used to recommend actions based on these data, so that electricity consumption in different houses in Egypt can be reduced according to each inhabitant's interest.


Introduction
A smart home has been defined as a "dwelling incorporating a communications network that connects the key electrical appliances and services and allows them to be remotely controlled, monitored or accessed". Harper (2003) acknowledged that "smart house technologies that most people are pleased with are connected with saving energy or money". Data mining is one of the successful fields and has recently become one of the main solutions for many problems in the business field 2018). The current research focuses on smart homes, for which we need to extract information from raw data collected through wireless devices embedded in IoT applications in order to improve energy efficiency management. One such device is the smart meter, one of the most cost-effective techniques, which relies on an intelligent calculating device capable of reporting information about the amount of power consumed. It is connected to sensor networks and collects information as a measuring device. Such an objective is beneficial for many organizations (Khedr and Kok, 2006; Hegazy et al., 2015).
By monitoring users' consumption habits, the consumption of final users can be adjusted to smooth the demand curve so that consumption peaks are reduced. This can be done by generating consumption predictions using different machine learning approaches. Such approaches have been applied in different fields such as banking (Khedr et al., 2016; Khedr et al., 2014) and education (Khedr, Idrees, 2017A; 2017B), as well as smart homes.
In this paper, we present some recent techniques used in smart homes to collect data about consumers' consumption and their behavior. The paper is organized as follows: Section 2 gives the background. Section 3 reviews the literature in recent papers. Section 4 gives an overview of the approaches used in recent papers. Section 5 presents the research problem in Egypt. Section 6 presents the proposed approach and the data analysis. Section 7 presents the conclusion and the future work.

Background
A smart home system helps to reduce the energy consumed in a building, as people consume a very large part of the electricity (Martirano, 2011). Sensors are one of the IoT technologies that indicate a smart home, together with appliances such as smart TVs, lighting control, home security systems, temperature monitoring and fire detection. Householders can continuously monitor and control all home appliances, even from outdoors, and make decisions under any circumstances with the help of the appliances' sensors, which send surveillance data to a central controller (Talari et al., 2017) as shown in Fig. 1.
Devices and sensors enable information to be shared in an appropriate manner and to communicate with a smart environment. This is offered by the Internet of Things, an innovative technology that has recently been adopted by different wireless technologies to benefit from the opportunities offered by the Internet (Marjani et al., 2017). Home appliances such as air-conditioners, TVs, water heaters and all the surrounding electronic equipment are connected to the Internet of Things and can be controlled remotely to facilitate daily life operations (Marjani et al., 2017).
Sensors, applications and devices continuously generate huge and rapidly growing amounts of structured, unstructured or semi-structured data, known as Big Data. Big Data is often described using five Vs: Volume, Velocity, Variety, Veracity and Value (Hadi et al., 2015). Traditional database systems are inadequate for storing, processing and analyzing this growing amount of data. Big data is somewhat new in IT and business but has been used in previous literature (Marjani et al., 2017). BDA is the ideal first step towards a smarter city. It assures flexible and real-time data processing followed by intelligent decision procedures (Crammer et al., 2013). A large volume of data can be collected from various sources as shown in Fig. 2.
BDA and IoT systems complement each other when combined: one can predict problems and fix them early, and the other can react to problems. Supportive techniques are required for such decisions, and many have been developed in previous research (El Seddawy et al., 2013). Data analytics can draw on the information delivered by the Internet of Everything to build the insights that are required (Shukla, 2017), (Bhati, 2017) as shown in Fig. 3.
Different smart devices such as mobile phones and smart air-conditioners put smart homes into practice and apply the technology of Artificial Intelligence (AI). There are three generations of smart home technologies (Li, 2013; Moltagh et al., 2015):
• First generation: proxy server home automation approach and wireless technology (Bluetooth and Zigbee).
• Second generation: AI controls electrical devices.
• Third generation: a robot buddy "who" can interact with human beings and can stroll around the home.

Research Problem
Recent research has found that the power crisis is a prevailing problem worldwide. Our main research problem concerns Egypt specifically, which is currently experiencing a large power crisis because electricity consumption is increasing faster than capacity is expanding, as shown in Fig. 4.
A research study was conducted at a smart solution company in Maadi, Cairo, Egypt, where all engineers and employees agreed that residents have low awareness of saving energy and reducing electricity bills: residents consume large amounts of electricity and pay high bills. Moreover, they argued that households demand smart devices or appliances not for the sake of saving energy but for other reasons, such as home decoration, the need to control and monitor all devices easily, and the wish for more advanced systems. In addition, households never ask for a full smart home; they ask only for some smart devices.
Egypt has many successful business fields (Nazier et al., 2013;, one of which is residential compounds. Compounds such as Mountain View do not sell fully smart homes; they only build the compound with an infrastructure that supports smart automation systems. This infrastructure goes unused, however, because awareness of the smart home as a way to save electricity and energy does not exist.

Air-Conditioners and Heaters
Air-conditioners and heaters are among the devices that consume the most electricity. Companies have an application that is installed on portable devices and enables clients to remotely control and monitor air-conditioners and heaters; it is a simple application that works by copying some numbers and data from the device's remote control.

Lights
Companies install motion sensors that automatically switch on the lights whenever a person enters a specific room or the kitchen and switch them off when he/she leaves. The same applies to corridors, where companies need to install motion sensors instead of using multi-way switching. Other rooms such as bedrooms, living rooms and dining rooms do not need motion sensors; companies just install dimmer switches to control the lights in each room, which also helps in saving energy. Figure 5 shows the light switching system of an automated place, captured from a smart solution company site in Egypt.

Smart Meter
All smart solution companies agreed that only a few factories use smart meters and that they are never installed in homes, regardless of their huge benefits. Smart meters in homes would provide a report for every appliance in every room, showing the amount of electricity consumed over time and highlighting the time of highest consumption. This would help in saving energy and reducing electricity bills, as people could use the report from the first week to control their devices for the rest of the month.

Literature Review
This section explores the literature, concepts, datasets, findings, proposed work and results from the field of smart home automation systems, together with their existing approaches (mentioned in Table 1). The goal is to extract proven ideas for such a framework and to analyze them. In addition, the techniques of BDA and IoT mentioned in Table 1 are discussed in detail in Table 2.

Table 1 (only the last rows are recoverable)
Preceding row (beginning not recoverable)
Dataset: 25 households in the city of Svebolle, Kalundborg, Denmark, with 8 months of data collected.
Remaining text: [...] combined with learning algorithms for improving the recognition's accuracy. It also generates natural language reports of the user's behavior from the recognized appliances. [...] It also uses uniform interfaces to exchange data seamlessly between its items. It relies on a shared database which enables easy upgrades of the framework components.
9- (Ebeid et al., 2016)
Technique: Machine Learning (ML) approaches
Dataset: Smart Home Energy (SHE) project (SHE Consortium, 2012)
Description: Big Data allows large volumes of varied data to be managed and offers support for ML algorithms, data mining visual tools, near real-time monitoring and other information analysis and processing possibilities that fit perfectly with the requirements.
Findings: They proved that big data technology allows information to be analyzed in more detail than with traditional technology and that its application to the energy sector is an innovative idea.
*Centre for Advanced Studies in Adaptive Systems (CASAS): CASAS was established in the School of Electrical Engineering and Computer Science at Washington State University (WSU Pullman campus). CASAS tests technologies using smart homes to meet research needs. WSU CASAS offers many tools that can be used (Cook et al., 2017).
*Smart Grid, Smart City (SGSC): The SGSC project is Australia's first commercial-scale smart grid. The project is funded by Ausgrid in partnership with the Australian Government, Energy Australia and their consortium partners (Moltagh et al., 2015). The SGSC project tests the smart grid in a real-world context by gathering robust information about costs and benefits. All electricity providers, governments, technology suppliers and consumers across Australia could use the outcomes of the project for future decisions (Sami and Honarvar, 2016), (Ayaz et al., 2017).

Table 2: Big Data Analytics and Internet of Things techniques
1. Activity Recognition Algorithm
Description: Activity recognition uses sensory observations and aims to identify and recognize the agents, actions and goals from a series of observations and environmental conditions (Mayr, 2016).
Advantages: Activity recognition can detect concurrent activities and interleaved activities easily. It can observe and analyze human activities and interpret ongoing events successfully (Mayr, 2016).
Disadvantages: Activity recognition requires a detailed analysis and understanding of the domain in which activities occur (Mayr, 2016).
2. Smart Windowing
Description: A technique that can be used for online sensor streaming and consists of two phases: (1) an online phase that uses the same windowing technique to recognize activities and (2) an offline phase for analyzing and windowing the streaming data (Conrad, 2016).
Advantages: It helps to avoid situations in which the sensor data of an activity is spread over several different chunks or the window size, i.e. the set of simple events, is not sufficient to predict a specific activity.
Disadvantages: Expensive to manufacture and expensive to install.
3. Spatiotemporal Feature Analysis
Description: A set of different statistical spatiotemporal features is used to recognize activities in real time (Mayr, 2016).
Advantages: Allows the persistence of patterns over time to be studied and illuminates any unusual patterns.
Disadvantages: Assessing both the temporal and spatial dimensions of data adds complexity to data analysis.
4. Clustering Method
Description: Clustering groups together objects that are more like each other than like objects in other groups (Moltagh et al., 2015).
Advantages: Automatic recovery from failure and easy to implement.
Disadvantages: Complexity and failure to recover from database corruption.
5. Prefix Span
Description: Prefix Projected Sequential Pattern Growth. It discovers the frequent single items, then wraps this information into a frequent-pattern tree, or FP-tree. Pattern growth is a method of frequent-pattern mining that does not require candidate generation (Han and Lim, 2010).
Advantages: PrefixSpan mines the complete set of patterns but greatly reduces the effort of candidate subsequence generation. Moreover, prefix-projection substantially reduces the size of projected databases and leads to efficient processing.
Disadvantages: The database will keep shrinking.

Proposed Approach and Data Analysis
To address the high electricity consumption caused by heavy use of the appliances mentioned in the research problem, we used a clustering algorithm to analyze the data and find the consumption pattern, and then used association rules to recommend actions based on the inhabitant's interest, as shown in Fig. 6. None of the recent papers mentioned in the literature review used both algorithms together.
The clustering algorithm groups households based on their satisfaction level, which is described by the household's characteristics and activity behavior, by applying k-means to the raw data. In addition, association rule mining (the Apriori algorithm) helps to find relationships between data items within large datasets in various types of databases.
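The k-means grouping step described above can be illustrated with a minimal sketch. It is not the implementation used in this study: the two features per household (mean daily consumption and share of evening use), the sample values and the choice of the first k points as initial centroids are all assumptions made for illustration.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means on a list of equal-length numeric tuples."""
    # Assumption: the first k points serve as initial centroids (deterministic).
    centroids = [tuple(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                new_centroids.append(
                    tuple(sum(v) / len(members) for v in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return assignment, centroids

# Hypothetical features per household: (mean daily kWh, share of use in the evening).
households = [(8.0, 0.30), (30.0, 0.70), (9.5, 0.35), (28.0, 0.65), (7.5, 0.25)]
labels, centers = kmeans(households, k=2)
print(labels)   # cluster index per household
print(centers)  # one centroid per cluster
```

In practice the features would be scaled before clustering, since a raw kWh value dominates a ratio between 0 and 1 in the Euclidean distance.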

Research Methodology
The dataset was collected by visiting five different Egyptian houses over several months and measuring the electricity consumption of their devices to identify the device with the highest consumption. A reading was recorded every 6 sec, which produced millions of data records. Every record was stamped in UNIX time, which was converted into date and time using RStudio, and every 30 min of data were merged to find the usage of each device.
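The timestamp conversion and 30-min merging described above were done in RStudio. The following Python sketch only illustrates the same idea under stated assumptions: readings are (UNIX seconds, watts) pairs, "merging" is taken to mean averaging, and timestamps are interpreted as UTC. The function name and sample values are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timezone

def half_hour_means(readings):
    """Average power (watts) per 30-minute window.

    readings: iterable of (unix_seconds, watts) pairs, e.g. one every 6 s.
    Returns {window start as UTC datetime: mean watts}.
    """
    sums = defaultdict(lambda: [0, 0])  # window start -> [total watts, count]
    for ts, watts in readings:
        window_start = ts - ts % 1800  # floor to a 30-minute boundary
        bucket = sums[window_start]
        bucket[0] += watts
        bucket[1] += 1
    return {
        datetime.fromtimestamp(start, tz=timezone.utc): total / count
        for start, (total, count) in sorted(sums.items())
    }

# Hypothetical readings: three samples in one window, one in the next.
sample = [(1500000000, 100), (1500000006, 200), (1500000012, 300), (1500001800, 50)]
result = half_hour_means(sample)
for start, mean_watts in result.items():
    print(start.isoformat(), mean_watts)
```

Averaging keeps the unit in watts; summing the readings and multiplying by the 6 s sampling interval would instead give energy, which may be preferable when comparing devices by total consumption.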

Fig. 6 (diagram labels: smart home, smart meter, big data, big data analytics, association rule, user behavior report)
Each record in a CSV file is a non-negative integer that gives the power demand of the downstream electrical load in watts; these values are also merged into 30-min intervals. All graphs for this analysis were produced using Power BI. We collected and analyzed the data to check the consumption time of each appliance in a non-smart house; Fig. 7 shows an example for an electric heater in house 3.

Relevant Types of Behavior Patterns
To know which types of behavior patterns are needed to suggest actions, we need to define the characteristics that identify the relevant patterns in the result set of the frequent (and/or periodic) pattern mining algorithm. The data analysis of a recommender system can be based on a variety of methods as shown in Fig. 8.
To be usable for suggesting actions, a relevant pattern must be composed of two main components as shown in Fig. 9. First, it must contain at least one action that lowers energy usage in the smart home (action). Second, it must contain normal events that serve as a condition for suggesting the action at the right time (condition); a one-event condition is not enough to suggest an action. A pattern is therefore formed by at least three events, normally one action and two events, although in exceptional cases three actions are possible. Because a relevant pattern consists of normal events (condition) and an action, it can be interpreted as an association rule, Eq. (1) and Fig. 10. An association rule is an implication of the form:
$$X \rightarrow Y \tag{1}$$
The rule states that when X occurs, Y occurs with a certain probability (Deenadayalan et al., 2014).
Association rules do not consider the order of the items, which defines time. Also, patterns can be explained as association rules. Patterns in which the action is at the beginning or the center are defined as relevant patterns as shown in Fig. 11:

Patterns with Multiple Actions
As noted, patterns can also contain two or more actions. To apply the association rule approach to such patterns, one association rule is created for each action in a pattern. As actions are just a special form of events, the other actions are treated as normal events and serve as part of the condition for the rule. As shown in the example of Fig. 12, a four-event pattern containing 3 actions results in 3 rules. For each action (1, 2 and 4) shown in the pattern, a rule is formed and the other actions are treated as normal events. To recommend an action, the condition must appear in the behavior data without the action.
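The one-rule-per-action construction described above can be sketched as follows. The event names and the function `rules_from_pattern` are hypothetical and serve only to illustrate the idea; the sketch assumes a pattern is an ordered tuple of event names and that the set of energy-saving actions is known.

```python
def rules_from_pattern(pattern, actions):
    """Create one (condition, action) rule per action in a pattern.

    pattern: ordered tuple of event names.
    actions: set of event names that count as energy-saving actions.
    Every other event, including the remaining actions, becomes part
    of the rule's condition.
    """
    rules = []
    for i, event in enumerate(pattern):
        if event in actions:
            condition = pattern[:i] + pattern[i + 1:]
            rules.append((condition, event))
    return rules

# Hypothetical four-event pattern with actions at positions 1, 2 and 4.
pattern = ("tv_off", "light_off", "door_closed", "heater_off")
actions = {"tv_off", "light_off", "heater_off"}
rules = rules_from_pattern(pattern, actions)
for condition, action in rules:
    print(condition, "->", action)
```

The four-event pattern with three actions yields three rules, each with a three-event condition.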

Architecture
Recommender systems are information filtering systems that predict user preferences and suggest meaningful actions or items based on the interest of the user of the system (Zehnder et al., 2015). The design of a recommendation engine depends on the characteristics of the data. The data analysis methods and techniques can be classified into one of three groups: content-based systems, collaborative filtering systems and association rule-based systems (Ye and Huang, 2011). These methods can be combined in some exceptional cases. The method chosen for this project is association rules. The architecture of the recommender system developed in this project can be divided into three main parts:
• The storage of the association rules
• The event stream of the current behavior data inside the smart home
• The matching algorithm for both previous points

Prioritization
The design of the recommender system allows more than one rule to match at the same time. For this reason, rules should be weighted to decide which rule is most suitable. Furthermore, prioritization criteria can be used to exclude rules below a certain threshold. The following indicators can be considered when calculating the weight of a rule:
1) Support (count) of the pattern relative to all events
2) The number of events in a pattern
3) The position of the action in the pattern
4) Date when the rule was mined
5) The confidence of the pattern
The support (count) is calculated by the pattern-mining algorithm Window Sliding with De-Duplication (WSDD), whereas the confidence must be calculated after mining by the recommendation system.
Fig. 13: Formula to calculate the confidence of a rule (Zehnder et al., 2015)
Fig. 14: If the action is not the first or the last event, the support count is independent (Zehnder et al., 2015)
Fig. 15: Adapted formula for confidence (Zehnder et al., 2015)

Confidence
Confidence in association rule mining, shown in Eq. (2), is defined as follows: the rule holds in T with confidence conf if conf % of transactions that contain X also contain Y (Deenadayalan et al., 2014):
$$\mathrm{conf}(X \rightarrow Y) = \frac{\mathrm{supp}(X \cup Y)}{\mathrm{supp}(X)} \tag{2}$$
The confidence of a rule shows how often the action Y appears in a pattern that contains the event X (condition). If a rule has a confidence of 100%, the pattern never occurs without the action. The confidence is calculated using the pattern support (count) (Deenadayalan et al., 2014) as shown in Fig. 13: the support count of the pattern including the action is divided by the support count of the pattern without the action.
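As a rough illustration of this ratio, the sketch below counts patterns directly in an event list. It is a simplification under stated assumptions: patterns are matched as contiguous runs, the action is the last event of the pattern, and the WSDD mining algorithm itself is not reproduced. All names and sample events are hypothetical.

```python
def count_occurrences(events, pattern):
    """Count positions where `pattern` appears as a contiguous run in `events`."""
    n = len(pattern)
    target = tuple(pattern)
    return sum(
        1 for i in range(len(events) - n + 1) if tuple(events[i:i + n]) == target
    )

def rule_confidence(events, condition, action):
    """Support count of (condition + action) divided by support count of condition."""
    base = count_occurrences(events, condition)
    if base == 0:
        return 0.0  # the condition never occurred, so no evidence for the rule
    with_action = count_occurrences(events, tuple(condition) + (action,))
    return with_action / base

# Hypothetical event log: the condition occurs four times, three of them
# followed by the action "light_off".
events = [
    "door_closed", "tv_off", "light_off",
    "door_closed", "tv_off", "window_open",
    "door_closed", "tv_off", "light_off",
    "door_closed", "tv_off", "light_off",
]
conf = rule_confidence(events, ("door_closed", "tv_off"), "light_off")
print(conf)
```

Because every occurrence of the longer pattern also counts as an occurrence of its condition, the ratio stays between zero and one when the action is at the end of the pattern.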
When mining overlapping patterns, the support of a shorter pattern must be equal to or higher than the support of a longer pattern that contains the same events, because every occurrence of the longer pattern also increases the count of the shorter pattern. Therefore, the confidence calculated as shown in Fig. 13 will be between zero and one. However, if the action is not at the beginning or at the end of the pattern, removing the action creates a new pattern whose support count is completely independent of the original one.
In that case the confidence would not be between zero and one, as shown in Fig. 14, and would therefore not be comparable to that of other patterns. The formula in Fig. 13 can be modified as shown in Fig. 15.
A functional prototype of the system must be developed for use in the field test. Suggesting or recommending an action is not difficult in itself. However, the suggested action must not decrease the comfort level of the inhabitants of the tested smart homes. To measure whether a suggestion decreases their comfort level, we could allow the inhabitants to vote on each recommendation.
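The matching and prioritization steps of the architecture could be prototyped along the following lines. This is a sketch under stated assumptions rather than the system built in this work: a rule matches when its condition equals the most recent events, rules below a confidence threshold are excluded, and the weight uses only confidence and condition length instead of all five indicators listed under Prioritization. The dictionary keys, threshold and event names are hypothetical.

```python
from collections import deque

def recommend(recent_events, rules, min_confidence=0.5):
    """Return the action of the best matching rule, or None.

    rules: list of dicts with hypothetical keys 'condition' (tuple of
    events), 'action' (str) and 'confidence' (float between 0 and 1).
    A rule matches when its condition equals the tail of recent_events.
    """
    recent = tuple(recent_events)
    candidates = [
        r for r in rules
        if r["confidence"] >= min_confidence
        and len(r["condition"]) <= len(recent)
        and recent[len(recent) - len(r["condition"]):] == r["condition"]
    ]
    if not candidates:
        return None
    # Assumption: prefer higher confidence, then longer (more specific) conditions.
    best = max(candidates, key=lambda r: (r["confidence"], len(r["condition"])))
    return best["action"]

rules = [
    {"condition": ("door_closed", "tv_off"), "action": "light_off", "confidence": 0.75},
    {"condition": ("tv_off",), "action": "heater_off", "confidence": 0.6},
    {"condition": ("tv_off",), "action": "ac_off", "confidence": 0.2},
]

# Sliding window over the event stream of the current behavior data.
stream = deque(maxlen=5)
stream.extend(["window_open", "door_closed", "tv_off"])
suggestion = recommend(stream, rules)
print(suggestion)
```

Inhabitant votes on each recommendation could later be folded into the weight, for example by lowering the weight of rules whose suggestions are repeatedly rejected.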

Results
The recommended actions lowered energy usage in the smart homes: the energy consumption of some devices was lower than before the actions were executed. This is shown in Fig. 16 for a weekday and in Fig. 17 for the weekend. The reduction was achieved either by reducing the use of devices or by turning them off. Turning devices off was the most obvious and effective approach to lowering energy usage in a smart home, provided that the actions were of interest to the inhabitant. Therefore, an action is characterized by 2 main attributes:
• Lower energy usage
• Does not decrease the comfort level

Conclusion and Future Work
The proposed framework was applied to different houses and appliances, which were measured and analyzed using k-means. Every house had its own usage, which differed from that of other houses; in some houses lighting was the highest-consuming device, while in others it was the air-conditioner. Based on these findings, the recommender chose the most suitable action to lower the usage without decreasing the comfort level of the inhabitants, using the Apriori algorithm. After the recommended actions were applied, most of the devices consumed less electricity.
Decision making can be optimized by data analytics based on data collection. To identify patterns, large amounts of unorganized data must be reviewed, which helps decision makers and enables a better understanding of behaviors. Data analytics aims at handling huge volumes of data to identify trends and patterns and to collect irreplaceable findings by applying BDA.
In future work, we need more information, such as the time between two actions, the time of day, the weekday or the season in which a pattern occurs, all of which will improve the accuracy of the suggestions that the recommender system derives from the mining algorithm. With such attributes, better conclusions can be drawn and the system can react to different behavior patterns at various times, rather than relying only on the date when the rule was mined (e.g., a pattern that happened regularly 1 year ago on the same weekday as today could lead to better results than a pattern mined recently).
In addition, it would be better to recommend actions for every inhabitant of the house individually, taking into consideration age, time, culture and gender, as the recommended actions and their results may differ accordingly.