Using Hyper Clustering Algorithms in Mobile Network Planning

Problem statement: As large amounts of data are stored in spatial databases, users often want to find groups of data that share similar features, which makes cluster analysis an important research area in data mining. Clustering analysis has been applied in many fields, including the construction of clusters served by base stations in a mobile network. Deciding on the optimum placement of base stations to achieve the best service while reducing cost is a complex task that requires vast computational resources. Approach: This study addresses the antenna placement problem, also called the cell planning problem, which involves locating and configuring infrastructure for mobile networks, by modifying the original Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm. In recent work the authors modified the original Partitioning Around Medoids clustering algorithm and proposed a new algorithm. In this study, the original DBSCAN algorithm is modified and combined with that earlier algorithm to produce the hybrid algorithm Clustering Density Base and Clustering with Weighted Node-Partitioning Around Medoids, which solves problems in mobile network planning. Results: The algorithm is applied to a real case study. The results demonstrate that the proposed algorithm has minimum run time, minimum cost and a high grade of service. Conclusion: The proposed hybrid algorithm divides the area into clusters quickly, because the density-based algorithm needs only a limited number of iterations, and it offers accuracy (no sampling method is used) and a high grade of service, because the base station locations (medoids) move toward the heavily loaded (weighted) nodes.


INTRODUCTION
Clustering is one of the most important research areas in the field of data mining (Velmurugan and Santhanam, 2011; Suguna and Thanushkodi, 2011). It specializes in techniques for grouping similar objects into clusters, so that objects inside a cluster exhibit a certain degree of similarity while dissimilar objects are separated into different clusters. From a geometric point of view, clustering is the process of identifying dense regions of the data space, separated by sparse regions, and treating them as clusters. Clustering analysis has been applied in many fields, especially spatial data analysis. As ever larger amounts of data are obtained from satellites, finding clusters in spatial data has become an active research area.
In a mobile network, each cellular service area is divided into regions called cells. Each cell contains an antenna and is controlled by a solar- or AC-powered network station called the Base Station (BS). Each base station, in turn, is controlled by a switching office called a Mobile Switching Center (MSC). The MSC coordinates communication between all the base stations and the telephone central office. Cell planning is challenging because of its inherent complexity, which stems from requirements concerning radio modeling and optimization. Manual design alone is of limited use in creating highly optimized networks. Because of this complexity, Artificial Intelligence (AI) (Raivio et al., 2001; Ibrahim, 2011), Ant-Colony-Based algorithms (Raivio et al., 2001; Ibrahim, 2011) and clustering techniques (Ibrahim, 2005; 2011; Ibrahim et al., 2005; 2009; Fattouh et al., 2003) have been successfully deployed in wire network planning. Tabu Search (TS) (St-Hilaire et al., 2006), Genetic Algorithms (GA) (Guan et al., 2006) and clustering techniques (Harby and Ibrahim, 2008; Ibrahim and Al Harbi, 2008a; 2008b) have been successfully deployed in mobile network planning.
This study introduces density-based clustering in the first stage of solving the mobile network planning problem. In the second stage we classify each cluster into one of two categories. If the cluster has non-homogeneous density, we adjust the capacity and coverage constraints using the CWN-PAM algorithm (Harby and Ibrahim, 2008). If the cluster has homogeneous density, we use density-based spatial clustering in the adjustment. Combining these two clustering algorithms yields the hybrid algorithm CDB-CWNPAM (Clustering Density Base and Clustering with Weighted Node-Partitioning Around Medoids). It divides the area into clusters quickly, because the density-based algorithm needs only a limited number of iterations, and it offers high accuracy and a high grade of service, because applying the CWN-PAM algorithm moves the base station locations (medoids) toward the heavily loaded (weighted) nodes.
Main phases used in radio network planning: The radio network planning process can be divided into three phases. The first is the pre-planning phase, in which the basic general properties of the future network are investigated: for example, what kinds of mobile services the network will offer, what requirements those services impose on the network and the basic network configuration parameters. The second phase is the main phase. A site survey of the area to be covered is carried out and the possible sites for the base stations are investigated. All the data on the geographical properties and the estimated traffic volumes at different points of the area are incorporated into a digital map, which consists of pixels, each of which records all the information about its point. Based on the propagation model, the link budget is calculated, which helps to define the cell range and coverage threshold. Several important parameters greatly influence the link budget, for example the sensitivity and antenna gain of the mobile equipment and the base station, the cable loss and the fade margin. Based on the digital map and the link budget, computer simulations evaluate the different ways of building the radio network using optimization algorithms. The goal is to achieve as much coverage as possible with the optimal capacity while keeping costs as low as possible. Coverage and capacity planning are of essential importance in radio network planning: coverage planning determines the service range, and capacity planning determines the number of base stations to be used and their respective capacities.
In the third phase, adjustments are made continually to improve the network plan. Through drive tests the simulated results are examined and refined until the best compromise between all of the factors is achieved. The final radio plan is then ready to be deployed in the area to be covered and served.
The two important mobile technologies are the Global System for Mobile Communications (GSM) and the Universal Mobile Telecommunications System (UMTS). This study uses GSM technology.
GSM is referred to as 2G. It operates in the 900 MHz frequency band, and a variation of it operates in the 1800 MHz band. GSM planning is divided into two phases, coverage planning and capacity planning, which are independent of each other. Frequency is one of the important resources in GSM systems.

Coverage planning in GSM:
The coverage planning depends on the received signal strength. Base stations are placed to ensure that the signal strength is sufficiently high in all areas of the region to be served. In this stage the link budget and Okumura-Hata function are calculated, which will help to define the cell range.
When considering the coverage of a cell, the maximum radius of the cell must be determined. Coverage is determined with respect to the maximum path loss that the signal can tolerate. The maximum path loss is calculated for the reverse link, since the transmission power of the subscriber antenna is much lower than that of the base station. The link budget is designed to calculate the maximum path loss. It is defined in (Allen et al., 2004) as the accounting of all of the gains and losses from the radio transmitter (the source of the radio signal), through cables, connectors and free air, to the receiver. A simple link budget equation looks like this:

Allowed propagation loss = Transmitted EIRP + Receiver Gains − Total margin (Losses)
The elements of a link budget: The elements can be broken down into three main parts: the transmitting side, the receiving side and the propagation part.
Transmitting side with effective transmit power:
Cable Loss: Losses in the radio signal take place in the cables that connect the transmitter and the receiver to the antennas. The losses depend on the type of cable and the frequency of operation.
Body Loss: Allow at least 0.25 dB (loss) for each connector in cabling.
Antenna Gain: Is defined as the ratio of the radiation intensity of an antenna in a given direction, to the intensity of the same antenna as it radiates in all directions (isotropically).
Receiving side with effective receiving sensitivity: It can be calculated by the following equation:

Effective Receiving Sensitivity = Receiver Sensitivity − (Cable RX loss + Body RX loss) + Antenna RX gain

where the cable loss, body loss and antenna gain are defined as for the transmitting side above.
Receiver sensitivity: This parameter deserves special attention, as it indicates the minimum power needed to successfully decode or extract "logical bits" and achieve a certain bit rate.
Propagation part with propagation losses: The propagation losses are related to all attenuation of the signal that takes place when the signal has left the transmitting antenna until it reaches the receiving antenna. One of the main causes for the power of a radio signal to be lost in the air is fading. Shadow fading is a phenomenon that occurs when a mobile moves behind an obstruction and experiences a significant reduction in signal power:

Total Margins = Fading Margin + Interference Margin + Penetration Margin + Other Margins
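The two link budget equations above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function names and every numeric value below are placeholders chosen for the example, not figures from the study.

```python
def total_margin_db(fading, interference, penetration, other=0.0):
    """Total Margins = Fading + Interference + Penetration + Other margins (dB)."""
    return fading + interference + penetration + other


def allowed_propagation_loss_db(transmitted_eirp, receiver_gains, total_margin):
    """Allowed propagation loss = Transmitted EIRP + Receiver Gains - Total margin."""
    return transmitted_eirp + receiver_gains - total_margin


# Placeholder inputs (assumptions for illustration only).
margin = total_margin_db(fading=8.0, interference=3.0, penetration=15.0)
max_loss = allowed_propagation_loss_db(
    transmitted_eirp=33.0, receiver_gains=120.0, total_margin=margin
)
print(margin, max_loss)
```

Keeping the margin calculation separate from the loss calculation mirrors the text, where the total margin is introduced as its own quantity.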
Once the maximum allowed propagation loss in a cell is known, the maximum cell range and coverage area can be evaluated by applying a propagation loss model such as the Okumura-Hata model (Allen et al., 2004). The propagation model is the algorithm that the prediction tool uses to calculate signal strength. Each model is developed to predict propagation in a particular environment, such as overlay, open area, suburban area, urban area, high-density urban and low-density urban. The Okumura-Hata model is widely used for coverage calculation in macrocell network planning (taken from the lesson RF-Basic Concept: Technical Parameter and Link Budget, CISCOM Cellular Integrated Services Company).
The Okumura-Hata model is valid for the following conditions: The path loss is expressed as the sum A + B log10(d) + C, where the constant coefficients A, B and C are dependent upon the propagation terrain and d is the distance between the transmitter and receiver.
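As a minimal sketch of how the path loss expression A + B log10(d) + C can be used, the Python below evaluates the loss at a distance and inverts the expression to obtain the largest distance whose loss stays within a given budget. The coefficient values and the choice of kilometres for d are assumptions made for the example; the fitted values belong to the planning tool's tables.

```python
import math


def path_loss_db(d_km, a, b, c):
    """Path loss A + B*log10(d) + C for a distance d (assumed here to be in km)."""
    return a + b * math.log10(d_km) + c


def max_cell_range_km(max_loss_db, a, b, c):
    """Invert the path loss expression: the distance at which loss equals max_loss_db."""
    return 10 ** ((max_loss_db - a - c) / b)


# Hypothetical coefficients, for illustration only.
A, B, C = 120.0, 35.0, 0.0
cell_range = max_cell_range_km(130.0, A, B, C)
print(round(cell_range, 3), "km")
```

Because the model is monotonic in d for positive B, the inversion gives a single cell range for each allowed propagation loss.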
The parameters A and B are set by the user according to Table 1 (taken from Alcatel GSM Network). These values have been determined by fitting the model to measurements.
Core points are points in the interior of a density-based cluster. A border point has fewer than MinPts points within Eps, but it is in the neighborhood of a core point. A noise point is any point that is neither a core point nor a border point. Figure 1 shows the original DBSCAN clustering algorithm.
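The three DBSCAN point types can be illustrated with a short brute-force Python sketch. It assumes that a point counts as core when at least MinPts points, including itself, lie within Eps. That threshold convention is an assumption, and the function name is hypothetical.

```python
import math


def classify_points(points, eps, min_pts):
    """Label each 2-D point as 'core', 'border' or 'noise' (brute force, O(n^2))."""
    n = len(points)
    # neighbors[i] holds the indices of all points within eps of point i (itself included).
    neighbors = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    core = {i for i in range(n) if len(neighbors[i]) >= min_pts}
    labels = []
    for i in range(n):
        if i in core:
            labels.append("core")
        elif any(j in core for j in neighbors[i]):
            labels.append("border")  # not dense itself, but next to a core point
        else:
            labels.append("noise")
    return labels


sample = [(0, 0), (0, 1), (1, 0), (1, 1), (2.4, 1), (10, 10)]
print(classify_points(sample, eps=1.5, min_pts=4))
```

In the sample, the four corner points of the unit square are dense enough to be core points, the point at (2.4, 1) lies within Eps of only one of them and so is a border point, and the isolated point is noise.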

CWN-PAM clustering algorithm:
The CWN-PAM algorithm is based mainly on the idea of the Modified Partitioning Around Medoids (M-PAM) algorithm (Ibrahim and Al Harbi, 2008a), with a cost function modified to handle node load. Consider a database that contains n points {n_1, n_2, …, n_n}, where n_h is the medoid (the real data point that satisfies the minimum cost) of cluster C_h, n_i denotes a non-medoid point and k is the number of clusters. The direct Euclidean distance from a point n_h to n_i is dis(n_h, n_i). The cost function in M-PAM (Eq. 1) is modified: TC is modified to WTC, where L_hi is the subscriber load cost of this distance. According to Eq. 2, the medoids, which are the locations of the base stations, move toward the heavily loaded (weighted) nodes.
For each cluster, we apply the coverage and capacity plans and calculate the number of base stations (BSs) needed.
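The effect of a load-weighted cost on the medoid can be illustrated with a small Python sketch. Eq. 1 and Eq. 2 are not reproduced in this text, so the cost used below, the sum of distance times node load over the cluster, is only an assumed stand-in for WTC, and the function names are hypothetical.

```python
import math


def weighted_total_cost(medoid, members, loads):
    """Assumed cost: sum over cluster members of dis(medoid, member) * member load."""
    return sum(math.dist(medoid, p) * w for p, w in zip(members, loads))


def best_medoid(members, loads):
    """Pick the real data point that minimises the assumed weighted cost."""
    return min(members, key=lambda m: weighted_total_cost(m, members, loads))


cluster = [(0, 0), (1, 0), (2, 0), (3, 0), (10, 0)]
print(best_medoid(cluster, [1, 1, 1, 1, 1]))   # equal loads: the central node wins
print(best_medoid(cluster, [1, 1, 1, 1, 50]))  # one heavy node pulls the medoid to it
```

With equal loads the cost reduces to the plain sum of distances, so the sketch also shows the unweighted behaviour for comparison.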

Fig. 2: CWN-PAM algorithm
If any cluster needs more than one base station, we increase the number of clusters only within the cluster that violates its mobile network constraints. Figure 2 describes the CWN-PAM algorithm.

CDB-CWNPAM (Clustering Density Base and Clustering with Weighted Node-Partitioning Around Medoids) is a hybrid algorithm. We modified the DBSCAN algorithm so that it divides the area under consideration according to the mobile network constraints, and we then applied the CWN-PAM clustering algorithm to each resulting area.
Problem statement: In a given area that contains a number of subscribers, we need to determine the number of base stations required and define their boundaries so that good quality of service is achieved at minimum cost. The problem is stated as follows:
• Input: a set N of data points {n_1, n_2, …, n_n} in a 2-D map, subscriber loads and communication constraints, such as the maximum length of cables and the set of streets
• Objective: partition the city into k clusters {C_1, C_2, …, C_k} that satisfy the clustering constraints, such that the cost function is minimized with a high grade of service
• Output: k clusters, base station locations and the boundaries of each cluster
The proposed algorithm contains three phases. Figure 3 describes the CDB-CWNPAM algorithm. The following sub-sections describe these three phases.

Fig. 3: CDB-CWNPAM algorithm
Pre-planning: This phase is divided into two steps.
Step 1, convert the map from raster form to digital form.
Step 2, determine the initial number of clusters.
Step 1: Map and its Data Entries. The maps used for planning are scanned images supplied by the user, and they need some preprocessing before they can be used as digital maps. We draw the streets and intersection nodes on the raster map. The beginning and end of each street are transformed into data nodes, defined by their coordinates, and the streets themselves are transformed into links between data nodes. The subscriber loads are taken as the weights of the nodes. For each intersection node and street, the user can right-click to enter the characteristics of the intersection node (number, name, capacity) or of the street (street number, street name, street load, ...). All of these data are saved in a database.
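One way to picture the digital map described above is as a small graph of nodes and streets. The Python sketch below is a hypothetical in-memory layout, not the database schema used in the study. The class names and the `load` field on a node, taken here to hold the subscriber load used as the node weight, are assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    number: int
    name: str
    x: float
    y: float
    load: int = 0  # assumed: subscriber load, used as the node weight


@dataclass
class Street:
    number: int
    name: str
    load: int
    start: int  # number of the node where the street begins
    end: int    # number of the node where the street ends


@dataclass
class DigitalMap:
    nodes: Dict[int, Node] = field(default_factory=dict)
    streets: List[Street] = field(default_factory=list)

    def add_node(self, node: Node) -> None:
        self.nodes[node.number] = node

    def add_street(self, street: Street) -> None:
        # A street links two data nodes, so both must already be on the map.
        if street.start not in self.nodes or street.end not in self.nodes:
            raise KeyError("both end nodes must be added before the street")
        self.streets.append(street)

    def total_load(self) -> int:
        return sum(node.load for node in self.nodes.values())


city = DigitalMap()
city.add_node(Node(1, "A", 0.0, 0.0, load=120))
city.add_node(Node(2, "B", 3.0, 4.0, load=80))
city.add_street(Street(10, "Main St", load=200, start=1, end=2))
print(city.total_load())
```

Storing streets as links between node numbers keeps the coordinates in one place, which is convenient when the clustering steps only need node positions and weights.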
Step 2: Determine Initial Number of Clusters. In cell planning, the planned area must be divided into a number of cells, each served by a BS that guarantees the quality of service for all subscribers. In this study we use GSM radio network planning to calculate, for the planned area, the number of cells needed by coverage planning and the number of cells needed by capacity planning:

Number of cells needed by coverage planning = Total area / Area of the cell
Number of cells needed by capacity planning = Total number of subscribers / Total subscribers per cell
Initial number of clusters K = the maximum of the two values
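The calculation of the initial number of clusters can be sketched as follows. Rounding each ratio up to a whole number of cells is an assumption added here, since a fractional cell cannot be deployed. The function name and the sample figures are illustrative only.

```python
import math


def initial_cluster_count(total_area, cell_area, total_subscribers, subscribers_per_cell):
    """K = max(cells needed for coverage, cells needed for capacity)."""
    cells_by_coverage = math.ceil(total_area / cell_area)
    cells_by_capacity = math.ceil(total_subscribers / subscribers_per_cell)
    return max(cells_by_coverage, cells_by_capacity)


# Illustrative figures: 50 km^2 area, 8 km^2 per cell, 12000 subscribers, 1500 per cell.
print(initial_cluster_count(50.0, 8.0, 12000, 1500))
```

In this example coverage alone would need 7 cells, but capacity needs 8, so the capacity requirement sets the initial K.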
Main-planning stage: In this phase, the goal is to split the entire database into clusters. Two parameters, MinPts and Eps, must be determined before DBSCAN is applied.
MinPts is set equal to the number of subscribers per cell, which is calculated by capacity planning, and Eps is set equal to the radius of the cell, which is calculated by coverage planning.
Partition database: After MinPts and Eps are determined, we use the DBSCAN algorithm to classify each node into one of the following types:
• Core point: a node that has more than the specified number of points (MinPts) within Eps. Each core point becomes the location of a base station, and the circumference of the circle with radius Eps around the node becomes the boundary of the served area for its cluster
• Noise point: in real planning all subscribers must be served, so a noise point is served by the nearest mobile tower
• Border point: a node that belongs to a certain cluster. A point K belongs to the core point C if the distance between K and C is the minimum distance between K and all core points
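The assignment rule in the list above, where every border or noise node is served by its nearest core point, can be sketched in Python. The sketch assumes the core points have already been identified and are passed in by index. The function name is hypothetical.

```python
import math


def assign_to_nearest_core(points, core_indices):
    """Map each core index to the indices of the nodes it serves (itself included)."""
    clusters = {c: [c] for c in core_indices}
    for i, p in enumerate(points):
        if i in clusters:
            continue  # core points already head their own cluster
        # Border and noise nodes alike go to the closest core point (base station).
        nearest = min(core_indices, key=lambda c: math.dist(p, points[c]))
        clusters[nearest].append(i)
    return clusters


nodes = [(0, 0), (10, 0), (1, 1), (9, 1), (4, 0)]
print(assign_to_nearest_core(nodes, core_indices=[0, 1]))
```

Treating noise nodes the same way as border nodes reflects the planning requirement that no subscriber is left unserved.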
Adjustment stage: For each cluster, we apply the coverage and capacity plans and calculate the number of base stations (BSs) needed. If any cluster needs more than one base station, we add more clusters.
In this stage we classify each cluster into one of two categories. If the cluster has non-homogeneous density, we adjust the capacity and coverage constraints using the CWN-PAM algorithm (Harby and Ibrahim, 2008). If the cluster has homogeneous density, we use density-based spatial clustering in the adjustment. Combining these two clustering algorithms yields the hybrid algorithm CDB-CWNPAM (Clustering Density Base and Clustering with Weighted Node-Partitioning Around Medoids). It divides the area into clusters quickly, because the density-based algorithm needs only a limited number of iterations, and it offers high accuracy (no sampling method is used) and a high grade of service, because applying the CWN-PAM algorithm moves the base station locations (medoids) toward the heavily loaded (weighted) nodes.

RESULTS
The CDB-CWNPAM algorithm is applied to an area of a map. Figure 4 shows the area after the density-based algorithm is applied; in this first stage the area is divided into 7 clusters. Figure 5 shows the area after the adjustment stage. Three clusters keep their base station locations because their subscribers are uniformly distributed. The fourth and fifth clusters are each divided into two clusters because their capacity or coverage constraints are not satisfied. The sixth and seventh clusters are re-formed using the CWN-PAM algorithm because their subscribers are not homogeneously distributed, so their base station locations move toward the heavy loads. Table 2 compares the proposed method with other methods used in mobile network planning. The first two methods compared are Tabu Search and Genetic Algorithms.
Tabu Search and Genetic Algorithms need a large number of estimated input parameters (initial probability, mutation probability, crossover probability, number of iterations, selection pressure and frequency factor), which can affect the results.
M-PAM (Modified Partitioning Around Medoids) is based mainly on the idea of the PAM (Partitioning Around Medoids) algorithm. We embedded the capacity and coverage algorithm to initialize the variable k.
The CWN-PAM (Clustering with Weighted Node-Partitioning Around Medoids) algorithm is based mainly on the idea of the M-PAM algorithm. It handles the constraints of network planning by modifying the PAM distance function. PAM is the most accurate of the partitioning-based algorithms because of its flexibility to check all nodes in each cell when determining the best location for the base station. Because of the modified cost function in CWN-PAM, the base station locations (medoids) move toward the heavily loaded (weighted) nodes and increase their grade of service. The hybrid algorithm CDB-CWNPAM (Clustering Density Base and Clustering with Weighted Node-Partitioning Around Medoids) divides the area into clusters quickly, because the density-based algorithm needs only a limited number of iterations, and it offers high accuracy (no sampling method is used) and a high grade of service, because the base station locations (medoids) move toward the heavily loaded (weighted) nodes when the subscriber density is not uniformly distributed.

CONCLUSION
Clustering analysis is one of the major tasks in various research areas. Clustering aims to identify and extract significant groups in the underlying data. Based on certain clustering criteria, the data are grouped so that the data points in a cluster are more similar to each other than to points in different clusters. This study presents the hybrid algorithm CDB-CWNPAM (Clustering Density Base and Clustering with Weighted Node-Partitioning Around Medoids). It divides the area into clusters quickly, because the density-based algorithm needs only a limited number of iterations, and it offers accuracy (no sampling method is used) and a high grade of service, because the second algorithm, CWN-PAM, moves the base station locations (medoids) toward the heavily loaded (weighted) nodes. Comparisons with other clustering methods are presented, showing the advantages of the CDB-CWNPAM algorithm introduced in this study.
Applying this system to a number of areas of different sizes in different countries is expected to verify its capabilities more universally. The next generation of mobile communication systems is expected to transmit multimedia information at multiple rates. UMTS technology can therefore be implemented instead of GSM technology by modifying only the coverage and capacity planning algorithms.