Feasibility of Hybrid Neuro-Fuzzy (ANFIS) Machine Learning Model with Classical Multi-Linear Regression (MLR) for the Simulation of Solar Radiation: A Case Study Abuja, Nigeria

The extremely variable nature of solar radiation makes it difficult for solar power plants to keep up with predicted power output and demand curves. As a result, solar radiation simulation is crucial to the efficient design, administration, and operation of any solar power plant. Empirical models have been routinely employed in Nigeria to predict solar radiation from easily measurable environmental characteristics such as temperature, humidity, and cloud cover, but with only partially satisfactory results. Despite the global trend toward machine learning, only a few machine learning models have been used to predict solar radiation in Nigeria, and almost no published work uses Abuja as a case study. By contrasting the performance of the conventional Multi-Linear Regression (MLR) model with the cutting-edge machine learning model, ANFIS, this study seeks to close this gap and establish which model is more suitable and accurate for forecasting solar radiation in Abuja, Nigeria. Daily measured climatic variables, namely maximum and minimum temperatures, relative humidity, precipitation, maximum and minimum wind speeds, sunshine hours, and solar radiation, were retrieved for this study over ten years from the National Space Research and Development Agency (NASRDA), Abuja. Various model combinations were simulated, and their performance in both the training and testing stages was assessed using R, R², RMSE, and MSE. Compared with the best MLR model simulation, ANFIS model 8 was shown to generate more accurate results.


Introduction
In Nigeria, just 40% of the population has access to the national grid, where the majority of the power is produced from fossil fuels like coal, gas, and oil (Aliyu et al., 2015), which are harmful to both the environment and people. Due to the substantial distance between rural areas and the closest utility grid connection point, the rural residents of the nation make up the remaining portion of the population without access to electricity (Shaaban and Petinrin, 2014). The Federal Government of Nigeria (FGN) is supplying electricity to these rural communities using renewable energy sources such as solar, wind, and small hydro through the Rural Electrification Strategy to make up for the shortfall in power generation. Knowledge of the solar radiation hitting the ground is crucial for the full design, operation, management, and financial sustainability of solar power plant projects (Habte et al., 2021; Ağbulut et al., 2021). Several radiometers, such as the pyranometer, albedometer, and pyrheliometer, are used to measure solar radiation. This equipment must be constantly calibrated by experts and has substantial acquisition and maintenance costs (Feng et al., 2020). This restricts the measurement of solar radiation, particularly in developing nations like Nigeria. Because of this, it is crucial to be able to simulate or predict solar radiation using easily measurable environmental variables (such as temperature, humidity, cloud cover, and wind speed). To integrate solar resources into electrical networks and enable significant penetration, it is crucial to understand their variability. Temporal and spatial scales have an impact on variability, and these scales are essential for creating effective solutions for reducing it (Perez, 2018).
Empirical models are recognized as useful in predicting solar radiation because they are based on mathematical calculations (Ağbulut et al., 2021). Sunshine-based models were calibrated to be more accurate, and Akpabio et al. (2004) and Falayi and Rabiu (2007) used them to estimate the monthly mean global solar radiation reaching a horizontal surface in various regions of Nigeria. Myers (2017) suggested a model that predicts solar radiation using maximum and minimum temperatures. Kasten and Czeplak (1980) derived equations for calculating the solar energy reaching a horizontal surface from cloud cover observations. Although these empirical models have been widely used to forecast solar radiation, they have not adequately captured the intricate, non-linear relationships between the dependent and independent variables (Falayi and Rabiu, 2007), particularly in humid regions where heavy clouds predominate during rainy seasons. Empirical models yielded findings that were only partially correct, with even worse projections for small data samples (Muhammad et al., 2018). Artificial Intelligence (AI) has become more widely used in nearly all engineering fields as a result of recent technological developments (Huang et al., 2020; Najashi et al., 2014). Machine Learning (ML), a subset of AI, has been used to predict solar radiation, and previous research has demonstrated that ML models outperform empirical models in accuracy (Quej et al., 2017; Liu et al., 2020). Tymvios et al. (2005) compared Artificial Neural Network (ANN) models with Angstrom's empirical models for predicting global solar radiation. The results showed that ANN models produced better forecasts than Angstrom-type models. Hassan et al. (2016) explored how well three machine learning algorithms, ANFIS, Support Vector Machine (SVM), and Multi-Layer Perceptron (MLP), predicted the amount of solar energy reaching a horizontal surface. The MLP model produced the best results in this study, followed by the ANFIS and SVM models. Govindasamy and Chetty (2021) investigated the effectiveness of ANN, GRNN, Support Vector Regression (SVR), and Random Forest (RF) for solar radiation forecasting across South Africa. The authors added PM10 air pollutant concentration to readily measurable meteorological parameters; ANN models produced the best results, with high correlation coefficients and minimal forecast errors.
To predict the monthly mean horizontal global solar radiation in Jos, Iseyin, and Maiduguri, Olatomiwa et al. (2015) created a hybrid model using SVM and the Firefly Algorithm (FFA). The accuracy of this model was assessed with several standard metrics, and the findings demonstrated that it produced more accurate forecasts than ANN and GA models. Kuhe et al. (2021) forecasted solar radiation in Makurdi using RBNN, GRNN, and a Feed-Forward back-propagation Neural Network (FFNN); an ensemble of the ANNs produced forecasts with increased accuracy. Khahro et al. (2015) identified the ideal tilt angle for a prospective location in Pakistan and proposed nine new empirical models to estimate diffuse solar radiation on a tilted surface. They suggested adjusting the tilt angle at the study location every six months. Deng et al. (2010) estimated the average daily global solar radiation in China using the Least Squares-Support Vector Machine (LS-SVM) technique. The data were divided into three sets: one for testing and two for validation. The LS-SVM model's parameters were tuned using grid search, an efficient optimization tool. With an R² of 0.98, the model delivered excellent results. Hossain et al. (2013) demonstrated that machine learning model accuracy can be greatly increased by using selected feature subsets and optimized parameters. To support this strategy, they used Least Median Square (LMS), MLP, and SVM.

Materials and Methods
In this study, daily measured maximum and minimum temperatures (Tmax and Tmin, respectively), Relative Humidity (RH), Precipitation (Pc), surface Pressure (Ps), maximum and minimum wind speeds (WSmax and WSmin, respectively), Sunshine Hours (SH), and solar radiation on a horizontal surface (Rs) were collected and pre-processed for a period of ten years, from 1st January 2010 to 30th April 2021. For performance evaluation, the daily data were divided into training and testing phases, with 75% used for training and 25% used for testing.
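The 75%/25% division can be sketched as follows. This is a minimal illustration, not the authors' procedure: the text does not say whether the split was chronological or random, so the chronological cut used here is an assumption, and `chronological_split` is a hypothetical helper name. Only the start and end dates come from the text.

```python
from datetime import date, timedelta


def chronological_split(records, train_fraction=0.75):
    """Split time-ordered records into (train, test) without shuffling."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must lie strictly between 0 and 1")
    cut = int(len(records) * train_fraction)
    return records[:cut], records[cut:]


# Daily index spanning the record period stated in the text.
start, end = date(2010, 1, 1), date(2021, 4, 30)
days = [start + timedelta(days=k) for k in range((end - start).days + 1)]

# Assumption: the earliest 75% of days train the model, the rest test it.
train_days, test_days = chronological_split(days)
print(len(days), len(train_days), len(test_days))
```

A chronological cut keeps the test period strictly after the training period, which avoids leaking future observations into training; a random split would be an equally valid reading of the text.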

Adaptive Neuro-Fuzzy Inference System (ANFIS)
ANFIS is a soft computing strategy that combines fuzzy logic and ANN techniques (Kemal and Alhasa, 2016; Cheng et al., 2005; Sharma et al., 2017). Fuzzy reasoning can convert the qualitative aspects of human knowledge and insight into a process of precise quantitative analysis. Although it can convert human thought into a rule-based Fuzzy Inference System (FIS), it lacks a defined method for doing so, and tuning the Membership Functions (MFs) takes a lot of effort. ANN, by contrast, has a higher capacity to adapt to its environment during learning. As a result, ANN can be used to adjust the MFs automatically and reduce the error rate when determining fuzzy rules (Kemal and Alhasa, 2016).

ANFIS Architecture
This neuro-fuzzy network, which has five layers, maps an input space to an output space using fuzzy reasoning and neural network learning techniques. The ANFIS architecture is shown in Fig. 3.
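A first-order Sugeno fuzzy model with two inputs, x and y, has the following rules:

Rule 1: if x is A1 and y is B1, then
\[ f_1 = p_1 x + q_1 y + r_1 \tag{1} \]
Rule 2: if x is A2 and y is B2, then
\[ f_2 = p_2 x + q_2 y + r_2 \tag{2} \]
where A1, B1, A2, and B2 are the membership functions for the x and y inputs and p1, q1, r1, p2, q2, and r2 are the output function parameters. The structure and formulation of ANFIS follow a five-layer neural network arrangement.

Layer 1: In this layer, every node i is an adaptive node with the node function given in Eq. 3:
\[ Q^1_i = \mu_{A_i}(x), \quad i = 1, 2; \qquad Q^1_i = \mu_{B_{i-2}}(y), \quad i = 3, 4 \tag{3} \]
where $Q^1_i$ is the membership grade for input x or y. The membership function chosen was Gaussian because it has the lowest prediction error.

Layer 2: In this layer, every rule between inputs is connected by a T-norm operator (here the product) that acts as an 'AND' operator:
\[ Q^2_i = w_i = \mu_{A_i}(x)\,\mu_{B_i}(y), \quad i = 1, 2 \tag{4} \]

Layer 3: In this layer, every neuron is labeled Norm and the output is called the 'normalized firing strength':
\[ Q^3_i = \bar{w}_i = \frac{w_i}{w_1 + w_2}, \quad i = 1, 2 \tag{5} \]

Layer 4: In this layer, every node i is an adaptive node with the node function given in Eq. 6:
\[ Q^4_i = \bar{w}_i f_i = \bar{w}_i \left( p_i x + q_i y + r_i \right) \tag{6} \]
where p_i, q_i, and r_i are the parameters of the node, referred to as 'consequent parameters'.

Layer 5: In this layer, the overall output is computed as the summation of all incoming signals:
\[ Q^5 = \sum_i \bar{w}_i f_i = \frac{\sum_i w_i f_i}{\sum_i w_i} \tag{7} \]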
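The five layers can be traced in a short forward-pass sketch for a two-input, two-rule first-order Sugeno system. It is illustrative only: it assumes Gaussian membership functions and a product T-norm, the function and parameter names (`gaussian_mf`, `anfis_forward`, `premise`, `consequent`) are hypothetical, and the numeric parameters are arbitrary rather than fitted values from this study. Training of the parameters is not shown.

```python
import math


def gaussian_mf(value, center, sigma):
    """Gaussian membership grade of `value` for a set with given center and width."""
    return math.exp(-0.5 * ((value - center) / sigma) ** 2)


def anfis_forward(x, y, premise, consequent):
    """One forward pass through a two-rule first-order Sugeno ANFIS.

    premise:    {"A": [(center, sigma), (center, sigma)],
                 "B": [(center, sigma), (center, sigma)]}
    consequent: [(p1, q1, r1), (p2, q2, r2)]
    Returns (overall output, normalized firing strengths).
    """
    # Layer 1: membership grades of each input.
    mu_a = [gaussian_mf(x, c, s) for c, s in premise["A"]]
    mu_b = [gaussian_mf(y, c, s) for c, s in premise["B"]]
    # Layer 2: rule firing strengths via the product T-norm ('AND').
    w = [a * b for a, b in zip(mu_a, mu_b)]
    # Layer 3: normalized firing strengths.
    total = sum(w)
    if total == 0.0:
        raise ValueError("all firing strengths underflowed to zero")
    w_bar = [wi / total for wi in w]
    # Layer 4: weighted first-order rule outputs.
    rule_out = [wb * (p * x + q * y + r)
                for wb, (p, q, r) in zip(w_bar, consequent)]
    # Layer 5: overall output is the sum of all incoming signals.
    return sum(rule_out), w_bar


# Arbitrary illustrative parameters (not fitted to any data).
premise = {"A": [(0.0, 1.0), (1.0, 1.0)], "B": [(0.0, 1.0), (1.0, 1.0)]}
consequent = [(1.0, 1.0, 0.0), (2.0, 0.0, 1.0)]
output, strengths = anfis_forward(0.5, 0.5, premise, consequent)
print(output, strengths)
```

With the inputs placed midway between the two membership centers, both rules fire equally, so the output is the plain average of the two rule outputs.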

Multi-Linear Regression (MLR)
Multi-Linear Regression (MLR) is a well-known technique for statistically modeling the linear relationship between one or more independent variables and a dependent variable. In general, the dependent variable y may be related to n regressor variables; such a model is known as an MLR model with n regressor variables. Ladlani et al. (2014) provide the equation:
\[ y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n \tag{8} \]
where $\beta_0$ is the intercept, $\beta_1, \dots, \beta_n$ are the regression coefficients, and $x_1, \dots, x_n$ are the regressor variables. The least squares method is frequently used to obtain the values of the intercept and the regression coefficients in Eq. 8 (Kemal and Alhasa, 2016).
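A least squares fit of the MLR equation can be sketched in plain Python by solving the normal equations. The helper names (`solve_linear_system`, `fit_mlr`, `predict_mlr`) are hypothetical, the data are synthetic and noise-free, and a production analysis would more likely rely on a numerical library; this only illustrates how the intercept and coefficients are obtained.

```python
def solve_linear_system(a, b):
    """Solve a·x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: regressors may be collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_mlr(rows, targets):
    """Return [beta_0, beta_1, ..., beta_n] minimizing the squared error."""
    design = [[1.0] + list(row) for row in rows]  # leading 1 gives the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict_mlr(beta, row):
    """Evaluate the fitted linear model for one row of regressors."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], row))


# Synthetic rows generated from y = 1 + 2*x1 - 3*x2 (illustrative only).
rows = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
targets = [1.0 + 2.0 * x1 - 3.0 * x2 for x1, x2 in rows]
beta = fit_mlr(rows, targets)
print(beta, predict_mlr(beta, (3.0, 2.0)))
```

Because the synthetic targets are exactly linear, the fit recovers the generating coefficients up to floating-point error; with real meteorological data the residuals would not vanish.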
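The performance evaluation metrics used to assess the models include the coefficient of determination (R²), correlation coefficient (R), Mean Square Error (MSE), and Root Mean Square Error (RMSE) (Abba et al., 2021a, b). MSE and RMSE are defined as:
\[ \mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( x_i - y_i \right)^2 \]
\[ \mathrm{RMSE} = \sqrt{\mathrm{MSE}} \]
where n is the number of observations and $x_i$ and $y_i$ are the observed and simulated values, respectively.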
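The four evaluation metrics can be computed as in the sketch below. Two points are assumptions rather than statements from the text: R is taken to be the Pearson correlation between observed and simulated values, and R² is taken to be its square (another common definition is one minus the ratio of residual to total sum of squares). The function names and the sample values are hypothetical.

```python
import math


def mean(values):
    return sum(values) / len(values)


def mse(observed, predicted):
    """Mean of the squared differences between paired values."""
    return mean([(o - p) ** 2 for o, p in zip(observed, predicted)])


def rmse(observed, predicted):
    """Square root of the mean square error."""
    return math.sqrt(mse(observed, predicted))


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between the two series."""
    mo, mp = mean(observed), mean(predicted)
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


def r_squared(observed, predicted):
    """Assumption: R2 is defined as the square of the Pearson R."""
    return pearson_r(observed, predicted) ** 2


# Small made-up series on a normalized 0-1 scale (not study data).
observed = [0.2, 0.4, 0.6, 0.8]
predicted = [0.25, 0.35, 0.65, 0.75]
print(mse(observed, predicted), rmse(observed, predicted),
      pearson_r(observed, predicted), r_squared(observed, predicted))
```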

Results and Discussion
Using a correlation matrix and conventional sensitivity analysis, the most relevant and appropriate input combinations for the target variable were examined. Table 1 shows the linear relationships between the variables, which serve as a basic indicator of the correlation between variable sets.
According to Fig. 4, the linear correlations are strongest for the relevant variables with a probability value below 0.05 (p < 0.05). Negative correlation values indicate inverse relationships between two variables. The overall weakness of the correlation values suggests that conventional linear approaches are inadequate for simulating such intricate relationships and that more robust tools are needed.
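A correlation matrix of the kind summarized in Table 1 can be built as in the following sketch. The column values are tiny synthetic series chosen only to show a positive and a negative relationship; they are not the Abuja measurements, and `correlation_matrix` is a hypothetical helper. Significance testing (the p-values mentioned above) is not covered here.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    sa = math.sqrt(sum((u - ma) ** 2 for u in a))
    sb = math.sqrt(sum((v - mb) ** 2 for v in b))
    return cov / (sa * sb)


def correlation_matrix(columns):
    """Return a nested dict: matrix[row_name][col_name] = Pearson r."""
    names = list(columns)
    return {r: {c: pearson(columns[r], columns[c]) for c in names}
            for r in names}


# Synthetic placeholder columns (not the study data).
columns = {
    "Tmax": [30.0, 32.0, 34.0, 33.0, 31.0],
    "RH": [80.0, 70.0, 55.0, 60.0, 75.0],
    "Rs": [15.0, 18.0, 22.0, 20.0, 17.0],
}
matrix = correlation_matrix(columns)
for name, row in matrix.items():
    print(name, {k: round(v, 3) for k, v in row.items()})
```

Reading down the `Rs` column of such a matrix gives the strength and sign of each input's linear relationship with solar radiation, which is the information used to rank candidate inputs.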

Model Combination
The model combinations were created based on the strength of the relationship between each variable and solar radiation, Rs, as shown in Table 1 and Fig. 4. Tmax and SH have the strongest and weakest relationships with Rs, with correlation values of 0.5374 and 0.0411, respectively. The resulting input combinations, used in both the ANFIS and MLR models, are M1, M2, M3, M4, M5, M6, M7, and M8, as indicated in Table 2. Each model used the atmospheric variables as inputs and Rs as the output. The Neuro-Fuzzy Designer tool of MATLAB was used to forecast solar radiation with ANFIS. A Sugeno-type fuzzy inference system was produced by tuning the input and output parameters of the Membership Function (MF). A triangular MF type was chosen for the input parameter and a constant MF type was chosen for the output parameter. The FIS was trained over 50 iterations (epochs) with an error tolerance of 0.005.
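One plausible way to derive eight input combinations from a correlation ranking is to add one input at a time, from the strongest to the weakest. This is an assumption for illustration only: the actual compositions of M1 to M8 are given in Table 2, which is not reproduced here. In the sketch, only the first (Tmax) and last (SH) positions follow the text; the order of the middle six inputs is hypothetical, as is the helper name `nested_combinations`.

```python
def nested_combinations(ranked_inputs):
    """Build cumulative input sets: M1 uses the top input, M2 the top two, etc."""
    return {f"M{k}": ranked_inputs[:k]
            for k in range(1, len(ranked_inputs) + 1)}


# Tmax first and SH last follow the text; the middle order is hypothetical.
ranked = ["Tmax", "Tmin", "RH", "Ps", "WSmax", "WSmin", "Pc", "SH"]
models = nested_combinations(ranked)
for name, inputs in models.items():
    print(name, inputs)
```

Under this scheme M8 contains all eight inputs, which would be consistent with M8 being the best-performing combination for both model types, though the text does not confirm how the combinations were formed.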
To assess the effectiveness of ANFIS in forecasting solar radiation, the data were divided into training (75%) and testing (25%) sets, and the simulated solar radiation values were evaluated for each. Table 3 presents the results for the performance criteria. With R² = 0.4345, R = 0.6592, MSE = 0.0128, and RMSE = 0.1133, ANFIS-M8 generated the best results of the ANFIS models, whereas MLR-M8 produced the best results of the MLR models with R² = 0.3633, R = 0.6027, MSE = 0.0145, and RMSE = 0.1202. The best ANFIS model therefore performed better than the best MLR model. However, given the poor performance values in Table 3, neither ANFIS nor MLR can predict daily recorded solar radiation efficiently. This is illustrated by the radar plots in Fig. 5 and Fig. 6, which show the R² and R values for the training and testing phases of the ANFIS (M1-M8) and MLR (M1-M8) models, respectively.
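As a quick arithmetic check, the best-model figures quoted above can be compared with one another. The sketch assumes that R² is the square of the reported R and that RMSE is the square root of the reported MSE; small differences in the last digit are expected because the reported values are already rounded to four decimals.

```latex
\begin{gather*}
\text{ANFIS-M8:}\quad 0.6592^2 \approx 0.4345, \qquad \sqrt{0.0128} \approx 0.1131 \\
\text{MLR-M8:}\quad 0.6027^2 \approx 0.3632, \qquad \sqrt{0.0145} \approx 0.1204 \\
\text{Relative RMSE reduction:}\quad \frac{0.1202 - 0.1133}{0.1202} \approx 0.057 \\
\text{Gain in } R^2\text{:}\quad 0.4345 - 0.3633 = 0.0712
\end{gather*}
```

Under these assumptions the reported metrics agree with one another to within rounding. The RMSE of ANFIS-M8 is roughly 6% lower than that of MLR-M8, and its R² is about 0.07 higher, which quantifies the advantage of ANFIS while also showing that both models leave much of the variation in daily solar radiation unaccounted for.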

Conclusion
In this study, solar radiation in Abuja, Nigeria, was forecast using both the conventional Multi-Linear Regression (MLR) model and the cutting-edge machine learning model, ANFIS. Among the different input combinations, Model 8 produced the best results for both methods: ANFIS-M8 achieved R² = 0.4345, R = 0.6592, MSE = 0.0128, and RMSE = 0.1133, while MLR-M8 achieved R² = 0.3633, R = 0.6027, MSE = 0.0145, and RMSE = 0.1202. A significant degree of agreement between the observed solar radiation data and the forecasted values can be inferred from the plotted graphs. The best ANFIS model fared better than the best MLR model.

Table 1: Sensitivity analysis between the experimental variables

Table 3: Prediction results of ANFIS and MLR based on the evaluation criteria