A DYNAMIC TEMPORAL NEURO FUZZY INFERENCE SYSTEM FOR MINING MEDICAL DATABASES

The analysis and representation of temporal data are becoming increasingly important in many areas of research and application. Fuzzy Cognitive Maps (FCMs) are an efficient modeling method for knowledge representation and fuzzy reasoning in time series analysis. In the past, they have been used to represent a complex causal system as a collection of concepts and causal relationships among those concepts. However, most of the FCMs available now are constructed manually, and assessing their reliability depends on the intervention of human experts. This study proposes a new temporal mining system that discovers temporal dependencies between the concepts of a complex causal system by building a Fuzzy Temporal Cognitive Map (FTCM), an extension of the FCM. For this purpose, a four-layer fuzzy temporal neural network is proposed and implemented, which automatically creates FTCMs from the given data. The FTCM is generated from the medical temporal database records of diabetic patients, and medical diagnosis is performed by converting the fuzzy cognitive map into a fuzzy temporal rule-based inference system using Allen's temporal relationships and fuzzy temporal rules.


INTRODUCTION
Temporal data mining is the process of extracting temporal patterns from large collections of data. It applies methods such as clustering, neural networks, genetic algorithms and decision trees to mine data with the intention of uncovering hidden temporal patterns. Time series data is a sequence of data measured successively at uniform time intervals. The prediction of future values in a time series is based on past and present information and is therefore very useful in medical applications. Time series analysis is particularly applicable to temporal clinical databases, where it is used to predict a patient's health and to plan medical therapy.
Medical data is temporal in nature and therefore conventional data mining techniques are not suitable for making effective decisions in medical applications. Moreover, medical knowledge can be represented using different methods such as medical ontologies and cognitive maps. However, medical ontologies fail to map symbolic knowledge to numerical knowledge in order to perform inference. The Fuzzy Cognitive Map (FCM) has several properties such as flexibility, abstraction, differentiability and fuzzy reasoning (Stylios and Groumpos, 2004) and is capable of mapping symbolic knowledge to numerical knowledge.

Fuzzy Cognitive Map
Fuzzy cognitive maps, proposed by Kosko (1986), are signed fuzzy digraphs for representing causal reasoning. Fuzzy logic is capable of performing reasoning under uncertainty, which is not possible with first order logic. Fuzziness describes the ambiguity of an event, whereas randomness describes the uncertainty of occurrence of an event. Fuzzy sets use linguistic variables rather than quantitative variables to represent imprecise concepts. The modelling of complex systems requires new methods that can utilize the existing knowledge and human experience. A fuzzy neural network, or neuro fuzzy system, is a learning machine that finds the parameters of a fuzzy system (i.e., fuzzy sets, fuzzy rules) by exploiting approximation techniques from neural networks.
Fuzzy Cognitive Maps (FCMs) constitute a novel, yet attractive approach that encompasses advantageous modeling features. FCMs are the extension to the generic model of cognitive maps (Kosko, 1986). Moreover, FCM can be used to describe the behavior of a collection of concepts. FCM introduced fuzzy values for quantifying the concepts of a complex system in which the degree of uncertainty can be addressed.

Objective
The main objective of this work is to propose a dynamic neuro fuzzy inference system that builds a model named the Fuzzy Temporal Cognitive Map (FTCM), based on a four-layer fuzzy neural network (Song et al., 2010), by identifying the membership functions in a prespecified time interval and computing the relationships among the concepts within that interval. Further, the causalities among the concepts are represented as temporal relationship patterns by integrating Allen's interval algebra (Allen, 1983). The FTCM is then used for fuzzy temporal reasoning by converting it into a rule-based fuzzy inference system.

Related Work
The Fuzzy Cognitive Map (FCM) concept introduced by Kosko (1986) has evolved from the concept called the Cognitive Map (CM) (Axelrod, 1976). Fuzzy cognitive maps are used in a wide range of applications, such as decision making (Papageorgiou et al., 2003), system control (Kottas and Boutalis, 2006) and medicine (Innocent and John, 2004). Many applications of FCMs are found in Decision Support Systems. Several extensions of FCMs have been proposed recently. FCMs can model dynamical complex systems that change with time following nonlinear laws. There are many works in the literature that deal with FCMs. For example, Papageorgiou et al. (2003) proposed an integrated two-level hierarchical system based on FCMs for decision making in radiation therapy. Stylios and Groumpos (2004) used FCMs to model complex systems. They presented a mathematical description of the FCM and examined a new methodology based on fuzzy logic techniques for developing it. In their approach, experts describe the relationships between concepts and determine the influence of one concept on another. Kottas et al. (2004) proposed an efficient cause-effect method for reaching equilibrium points by extracting the "dominant" influences between nodes. The equilibrium states obtained by this approach reflect the behavior of the system more realistically, but they are not as "secure" as those of the traditional approach.
One of the paradigms used to automate the development of FCMs started from the Hebbian law. The first attempt to learn FCMs using this approach was proposed by Dickerson and Kosko (1994) and was referred to as Differential Hebbian Learning (DHL). This method was further extended into Nonlinear Hebbian Learning (NHL) (Innocent and John, 2004). Wojciech et al. (2008) proposed a new approach named data-driven NHL that extends NHL by using historical data of the input concepts. The improvement in the quality of the learning process depends on the historical data. An enhancement of the learning process is suggested in (Song et al., 2010), which incorporates the inference mechanism of the FCM with automatic identification of membership functions and quantification of causalities. Miao et al. (2002) and Leong and Miao (2005) proposed fuzzy cognitive agents, which deploy fuzzy cognitive maps as decision making engines based on user preferences and domain knowledge. However, the decisions made using fuzzy cognitive maps are not sufficient for making final decisions in many complex applications, including medical data sets, where decisions must be based on past and present data and the model must be able to predict future states. Therefore, it is necessary to include temporal phenomena in decision making. The goal is to build a reliable knowledge representation model for inferencing and prediction using medical databases that is compatible with the FCM knowledge representation. This study proposes the novel Fuzzy Temporal Cognitive Map (FTCM), which defines a complete discrete temporal extension and fuzzy inference mechanism of the FCM. In the FTCM, the temporal dependencies of concepts during a particular time interval are measured. The FTCM presents a hybrid model of two types of causality: equality causality and difference causality.

Problem Statement
Let $T = \{t_0, t_1, \ldots, t_n\}$ be an ordered set of time series values and let $C = \{c_1, c_2, \ldots, c_n\}$ be the set of concept labels. The strength of activation of each concept at time $t_i$ is the value associated with the medical observation of a disease or a symptom. The temporal relationships among the concept activation values are used to assess the disease symptoms of a patient. The activation values of the concepts at time $t_i$, together with the causal relationships among them, can be used for predicting the state of a particular concept at time $t_{i+1}$. As far as the FTCM cause-effect relationships, or network links, are concerned, high (absolute) values of causality signify strong cause-effect relationships between the concepts. Using this training, rules are generated and stored in the knowledge base. The connectivity of the FTCM can be represented by an adjacency weight matrix $W$ as given below. Since no cause can cause itself, all $w_{ii} = 0$:

$$W = \begin{pmatrix} 0 & w_{12} & \cdots & w_{1n} \\ w_{21} & 0 & \cdots & w_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ w_{n1} & w_{n2} & \cdots & 0 \end{pmatrix}$$

where $w_{ij}$ represents the impact of concept $c_i$ on concept $c_j$. Generally, FCMs are created manually and their reliability depends on the experts' knowledge. To overcome this problem, we construct a four-layer fuzzy neural network as defined in (Song et al., 2010) with automatic identification of the membership functions.
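The weight matrix described above can be illustrated with a minimal Python sketch. The function name `make_weight_matrix` and the triple format are hypothetical; the zero diagonal follows the statement that no cause can cause itself, and the range check uses the [-1, +1] bound stated later in the Results section.

```python
def make_weight_matrix(n, links):
    """Build an n x n FTCM weight matrix from (cause, effect, weight) triples.

    Hypothetical helper for illustration only; not the authors' code.
    """
    W = [[0.0] * n for _ in range(n)]
    for i, j, w in links:
        if i == j:
            # No concept may cause itself, so the diagonal stays at zero.
            raise ValueError("w_ii must be 0")
        if not -1.0 <= w <= 1.0:
            # Weights are expected to lie between -1 and +1.
            raise ValueError("weight out of range [-1, 1]")
        W[i][j] = w
    return W


# Three concepts with illustrative (made-up) causal strengths.
W = make_weight_matrix(3, [(0, 1, 0.7), (1, 2, -0.4), (2, 0, 0.2)])
```

Storing the map as a dense list of lists keeps the sketch dependency-free; a real implementation with many concepts would likely use an array library.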

System Architecture
The basic organization of the proposed neuro-fuzzy inference system is shown in Fig. 1. The diabetic patients' records are taken from the medical database. Data preprocessing is performed to correct missing or incorrect data and also for attribute reduction. A generalized FTCM is constructed by automatically selecting the membership function and quantifying the cause-effect relationships. Deciding the cause-effect function that specifies the relationships between causal effects in a temporal domain is the key factor of this system. Supervised learning is performed in the FTCM to infer useful patterns by integrating Allen's temporal relations. The remainder of this section describes each operation in detail.

Medical Database: Input Datasets
Input to the system is taken from a diabetic patient data set (http://archive.ics.uci.edu/ml/datasets/diabetes). The data folder contains 70 text files. Each file represents the history of one individual patient. A knowledge base is built using frames as the knowledge representation scheme, in order to develop a therapy for monitoring and controlling the blood glucose level by proper insulin dosage. Since the observations vary with time, each one is represented by date, time, concept label and value. The dataset contains details of insulin-deficient patients with high blood glucose, obtained through continuous assessment. To plan the insulin therapy, the concentration of blood glucose should be continuously monitored.

Data Pre-Processing
Data preprocessing is carried out on the records of the temporal clinical database for efficient modelling. In preprocessing, records having null values are eliminated. When the data set is very large, it is classified using the C4.5 classifier and only the relevant data records are retained. Each dimension of the dataset is defined with a concept label. The blood glucose level of a patient may increase or decrease for many reasons, which may result in large variations. Hence temporal information, such as whether a measurement was taken before or after food, is recorded for every measurement in a day. These observations are made at prespecified meal times for the patients. In preprocessing, thousands of records from the medical dataset are integrated into one relational database table, and records that fail to match the format are deleted. If any value of a data type or concept falls outside the range around the average value, those records are also deleted. Records with incorrect or missing data are deleted as well.
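The cleaning rules above can be sketched as follows. This is only an illustration: the record fields (`date`, `time`, `code`, `value`) mirror the representation described for the input data, while the function name `preprocess` and the outlier rule (dropping values more than `k` standard deviations from the per-concept mean) are assumptions, since the paper does not state how "out of the range of the average value" is measured.

```python
from statistics import mean, pstdev


def preprocess(records, k=3.0):
    """Drop incomplete, malformed and out-of-range records (illustrative)."""
    # Step 1: remove records with missing fields or non-numeric values.
    clean = []
    for r in records:
        if any(r.get(f) in (None, "") for f in ("date", "time", "code", "value")):
            continue
        try:
            value = float(r["value"])
        except (TypeError, ValueError):
            continue
        clean.append({**r, "value": value})

    # Step 2: per concept code, remove values far from the average.
    # The k-standard-deviation rule is an assumption, not from the paper.
    by_code = {}
    for r in clean:
        by_code.setdefault(r["code"], []).append(r["value"])
    stats = {c: (mean(v), pstdev(v)) for c, v in by_code.items()}

    kept = []
    for r in clean:
        m, s = stats[r["code"]]
        if s == 0 or abs(r["value"] - m) <= k * s:
            kept.append(r)
    return kept
```

The attribute reduction with C4.5 mentioned above is not covered by this sketch.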

Construction of FTCM
To design the FTCM, the following are the major concerns:
• Input to the map
• Involved concepts from the domain
• Causal effect relationships between concepts
A discrete temporalized domain is used for defining the FTCM. The FTCM is created by constructing a fuzzy neural network with fuzzy temporal decisions. Conventionally, FCMs are constructed manually based upon the domain experts' knowledge. The proposed approach makes use of a four-layer fuzzy neural network for the generation of the FTCM, based on the structure given in (Song et al., 2010). A hybrid type of causality, i.e., a combination of equality causality and difference causality, is defined to present the FTCM.
The four layers are implemented using the following steps:
Step 1: Layer 1 represents the identified concepts (input variables) of the investigated system. Concepts are either observational or interventional. There are 3 observational concepts of type insulin dose with codes 33, 34, 35 and 17 interventional concepts, such as blood glucose measurement, with codes 48, 57-73 in the data folder. Time is divided into 4 logical time slots. For instance, concept 33 is interpreted as four concepts, i.e., 331, 332, 333 and 334, in which the added third digit of the label corresponds to the number of the assumed period of the day. Each concept is then represented as a node in this layer and each node sends its input value directly to the second layer.
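The labelling scheme of Layer 1 can be sketched in a few lines of Python. The function names and the parameter `day_start_hour` are hypothetical. The paper does not say when the first slot begins; the default of 06:00 is an assumption chosen so that the example given in the Results section (code 58 measured at 08:00 becomes concept 581) holds with 6 h slots.

```python
def time_slot(hhmm, day_start_hour=6, n_slots=4):
    """Return the 1-based slot number of a clock time such as '08:00'.

    day_start_hour is an assumed start of slot 1, not stated in the paper.
    """
    hour, minute = map(int, hhmm.split(":"))
    slot_len = 24 * 60 // n_slots  # 360 minutes for 4 slots
    minutes = ((hour - day_start_hour) % 24) * 60 + minute
    return minutes // slot_len + 1


def concept_label(code, hhmm, day_start_hour=6):
    """Append the slot number as a third digit to a two-digit concept code."""
    return code * 10 + time_slot(hhmm, day_start_hour)


label = concept_label(58, "08:00")
```

With a different assumed start of the day, the same measurement would fall into a different slot, so this parameter should be fixed once for the whole dataset.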
Step 2: Layer 2 represents the fuzzification of the input variables.
For each input concept identified in layer 1, represent it using 3 linguistic terms, i.e., small (S), medium (M) and large (L), using a membership function and temporal constraints. Find the maximum and minimum value for each concept in a file and set 3 ranges for small, medium and large. The output of the second layer is $e^{-a/b}$, where $a$ = (original value of the concept in the input file - mean of the values of the concept) and $b$ = (range of max or min)$^2$. Each linguistic term ($x_{il}$) represents a fuzzy subset in the universe of the layer 1 input variables. The fuzzy sets with linguistic variables are related to the input variables and are modelled using a symmetric Gaussian membership function with center ($C$) and spread ($\sigma$), which are again checked by a set of temporal constraints. The use of a symmetric Gaussian membership function ensures differentiability, which is a necessary property for the back propagation algorithm employed in the learning process.
Moreover, the membership function is depicted using a graphical representation to show the magnitude of participation of each input. It associates a weight with each of the inputs, which are processed based on the weight matrix formed from these weights. This study considers a functional overlap between inputs in order to determine a suitable output response.
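A minimal sketch of the fuzzification step is given below. Several choices are assumptions rather than details from the paper: the deviation in the exponent is squared, as a symmetric Gaussian requires; the interval between the minimum and maximum is split into three equal ranges; the center of each term is the midpoint of its range; and the spread is the width of the range. The temporal constraints are not modelled.

```python
import math


def gaussian(x, center, spread):
    """Symmetric Gaussian membership exp(-(x - center)^2 / spread^2)."""
    return math.exp(-((x - center) ** 2) / (spread ** 2))


def fuzzify(values, x):
    """Membership of x in the terms S, M and L of one concept.

    Equal-width ranges and midpoint centers are assumptions.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / 3 or 1.0  # fall back to 1.0 for constant data
    memberships = {}
    for k, term in enumerate(("S", "M", "L")):
        center = lo + (k + 0.5) * width
        memberships[term] = gaussian(x, center, width)
    return memberships


# Illustrative blood glucose readings; 90 lies at the center of "small".
m = fuzzify([60, 120, 240], 90)
```

Because neighbouring Gaussians overlap, a value generally has non-zero membership in more than one term, which matches the functional overlap between inputs mentioned above.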
Step 3: Layer 3 is composed of calculating causalities among the concepts in FTCM and performing the defuzzification process.
The nodes in this layer represent the linguistic values of the output variables. A hybrid causality, i.e., the combination of equality causality and difference causality, is measured. In the FTCM, since both the input and the output variable represent the same concept, the same set of linguistic values is used for the output variables. The linguistic terms that represent the output variable ($y_{il}$) are also described by symmetric Gaussian membership functions. Layer 3 nodes are connected to layer 2 nodes by fuzzy weights, which are calculated using mutual subsethood, i.e., $1 - \varepsilon(x_{il}, y_{il})$. Mutual subsethood (Song et al., 2010) measures the similarity between the input and output variables, which describes the cause-effect relationship from the input linguistic term to the output linguistic term. It also ensures that no concept will cause itself. The process of defuzzification is also implemented in this layer, using a centroid-based approach.
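The fuzzy weight computation can be illustrated numerically. The sketch below assumes the commonly used definition of mutual subsethood as the cardinality of the intersection of two fuzzy sets divided by the cardinality of their union, and approximates both cardinalities by sums over a grid; the exact formulation in Song et al. (2010) is not reproduced in this paper, so this definition, the grid size and all function names are assumptions.

```python
import math


def gaussian(x, center, spread):
    """Symmetric Gaussian membership function."""
    return math.exp(-((x - center) ** 2) / (spread ** 2))


def mutual_subsethood(c1, s1, c2, s2, lo, hi, steps=2000):
    """Approximate |A intersect B| / |A union B| for two Gaussian fuzzy sets."""
    inter = union = 0.0
    for k in range(steps + 1):
        x = lo + (hi - lo) * k / steps
        a, b = gaussian(x, c1, s1), gaussian(x, c2, s2)
        inter += min(a, b)  # membership in the intersection
        union += max(a, b)  # membership in the union
    return inter / union if union else 0.0


def causal_weight(c1, s1, c2, s2, lo, hi):
    """Fuzzy weight 1 - epsilon between an input and an output term."""
    return 1.0 - mutual_subsethood(c1, s1, c2, s2, lo, hi)


same = causal_weight(100, 20, 100, 20, 0, 200)
apart = causal_weight(50, 10, 150, 10, 0, 200)
```

Under this definition, identical input and output sets give a weight of exactly zero, which is consistent with the remark that no concept causes itself.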
Step 4: Layer 4 represents the non-fuzzy output concept.
The fourth layer consists of the non-fuzzy output variables. Each is calculated by integrating the crisp weight from the output linguistic term to the output variable. The relationships between concepts are learned with respect to the selection of the membership function during the stage-by-stage development. The selection of a membership function and the proper construction of the FTCM is a crucial problem, since it converts various measurements to a common reasoning paradigm. The FTCM has therefore been constructed with a set of concepts with time intervals, a set of membership functions, a set of cause-effect temporal relationships and a weight matrix.

Inferencing and Prediction
The dynamically generated FTCM is analyzed using Allen's temporal relations to infer useful patterns in the medical diabetic data set. The first phase of the algorithm, which generalizes temporal events into one event with a temporal interval, has the following steps:
• Sort the dataset according to patient identifiers (PID) and timestamps of transactions
• Based on patient identifiers, calculate frequent sign types
• Delete non-frequent sign types from transactions
• Calculate a set of sign sequences for each patient for only frequent sign types
• Generalize each sign sequence into a generalized sign with a time interval and get a database of generalized signs GD
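The first phase can be sketched as a short Python function. The tuple layout `(pid, timestamp, sign)`, the sign names and the support threshold `min_patients` are hypothetical; the paper does not state how frequency is thresholded, so counting the number of distinct patients showing a sign is an assumption.

```python
from collections import defaultdict


def generalize(records, min_patients=2):
    """Turn (pid, timestamp, sign) events into (pid, sign, start, end)."""
    # Tuples sort by patient identifier first, then by timestamp.
    records = sorted(records)

    # A sign type is treated as frequent if enough patients show it
    # (assumed support measure).
    patients_per_sign = defaultdict(set)
    for pid, _, sign in records:
        patients_per_sign[sign].add(pid)
    frequent = {s for s, p in patients_per_sign.items() if len(p) >= min_patients}

    # Collapse each patient's sequence of a frequent sign into one interval.
    spans = {}
    for pid, ts, sign in records:
        if sign not in frequent:
            continue
        start, _ = spans.get((pid, sign), (ts, ts))
        spans[(pid, sign)] = (start, ts)
    return [(pid, sign, s, e) for (pid, sign), (s, e) in spans.items()]


events = [(2, 8, "high_bg"), (1, 8, "high_bg"), (1, 12, "high_bg"),
          (1, 9, "hypo"), (2, 20, "high_bg")]
GD = generalize(events)
```

Here timestamps are plain numbers (hours) to keep the sketch short; real records would carry full dates and times.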

The second phase discovers temporal relational rules from the database of generalized signs by the following steps:
• All candidate temporal interval relations for each patient ID are found
• A set of frequent temporal interval relations is extracted
The algorithm for prediction using the FTCM is detailed below:
1. Construct the generalized FTCM
2. Retrieve the generalized pattern list GP{$x_1, x_2, \ldots, x_n$} with a general set of frequent temporal interval relations
3. Repeat
     Generate the FTCM for each patient record
     for each concept do
       find the activation value $Av_i$ of all pairs of concepts ($c_i$, $c_j$) at time $t_i$
       calculate the future activation value $fAv_i$ at time $t_{i+1}$:
         $fAv_i(t_{i+1}) = T\left(\sum_{i,j} w_{ij}\, Av_i(t_i)\right)$, where $T(x) = 1/(1 + e^{-cx})$
       if $fAv_i(t_{i+1}) \neq fAv_i(t)$ then adjust the weight
       form the pattern list PL{ } with the candidate patterns CP
       // set $TH_l$ and $TH_u$ for $Av_i$ and $w_j$, $1 \le i \le n$ and $\forall x \in j$ and $j \neq i$
       if ($ATH_l \le Av_i \le ATH_u$) && ($WTH_l \le w_{ij} \le WTH_u$), add(c) → P{ } where c = {i, t, $Av_i$}
     end for
   Until no patient record exists
4. While (PL{ } ≠ ∅)
     Compare the candidate pattern with the general pattern
     For each y in CP and for each x in GPL
       if equals(x, y) ∧ (overlap(x, y) ∨ contains(x, y)) then
         Generate Rule: Retrieve the activation value for the i-th concept $Av_i(t_i)$
         Severity of concept i to concept j with respect to $W_{ij}$
   End While
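The pattern comparison in step 4 relies on Allen's interval relations, three of which are sketched below for intervals given as `(start, end)` pairs. Because Allen's relations are mutually exclusive for a single pair of intervals, the sketch reads `equals` in the rule condition as a comparison of the sign labels and applies `overlaps` and `contains` to the intervals; this reading, the function `matches` and the pattern layout `(sign, interval)` are assumptions.

```python
def equals(a, b):
    """Allen's 'equals': both intervals share start and end."""
    return a[0] == b[0] and a[1] == b[1]


def overlaps(a, b):
    """Allen's 'overlaps': a starts first and ends inside b."""
    return a[0] < b[0] < a[1] < b[1]


def contains(a, b):
    """Allen's 'contains': b lies strictly inside a."""
    return a[0] < b[0] and b[1] < a[1]


def matches(general, candidate):
    """Assumed reading of the rule condition in step 4."""
    (g_sign, g_iv), (c_sign, c_iv) = general, candidate
    return g_sign == c_sign and (overlaps(g_iv, c_iv) or contains(g_iv, c_iv))


hit = matches(("high_bg", (6, 12)), ("high_bg", (8, 10)))
miss = matches(("high_bg", (6, 12)), ("hypo", (8, 10)))
```

A candidate pattern that matches a general pattern in this sense would then trigger rule generation, using the activation value and weight of the concepts involved.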
A generalized FTCM is constructed from the preprocessed input dataset. A generalized pattern list is retrieved from that FTCM by collecting the frequently occurring cause-effect sign sequences, each with a time interval. For a new testing sample record, an FTCM is constructed and, for each concept $C_i$ of the FTCM, the activation value $Av_i$ is obtained between all pairs of concepts ($C_i$, $C_j$) at a particular timestamp $t_i$. During the next period, the effect of $C_i$ on $C_j$ is measured as the product of the weighted sum of the activation values of the cause-effect relation from $C_i$ to $C_j$ and the threshold TH. If the difference between these two observations is greater than zero, then a causal relationship from concept $C_i$ to $C_j$ is observed. This observation can be repeated iteratively by adjusting the weights in the FTCM. The cause-effect values represent the candidate patterns in an ordered set of time intervals. The temporal relationships among the concepts constitute the symptoms of a disease pattern. If the observed patterns differ from their usual values, a biological disorder is indicated. In prediction, the candidate patterns found are compared with the general pattern list. By using temporal interval relational algebra, the severity of the disease is predicted.
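The activation update used in the prediction algorithm can be sketched as below. The index notation in the algorithm is ambiguous, so the sketch follows the usual FCM reading, in which the next activation of an effect concept j is the squashed weighted sum over all cause concepts i; this reading and the function names are assumptions, while the sigmoid with steepness `c` is the threshold function given in the algorithm.

```python
import math


def squash(x, c=1.0):
    """Threshold function T(x) = 1 / (1 + exp(-c * x))."""
    return 1.0 / (1.0 + math.exp(-c * x))


def next_activations(W, av, c=1.0):
    """One-step prediction of all concept activations.

    W[i][j] is the influence of concept i on concept j;
    av[i] is the activation of concept i at the current time.
    """
    n = len(av)
    return [squash(sum(W[i][j] * av[i] for i in range(n)), c) for j in range(n)]


# Two concepts: concept 0 fully active and strongly causing concept 1.
W = [[0.0, 1.0],
     [0.0, 0.0]]
predicted = next_activations(W, [1.0, 0.0])
```

Comparing `predicted` with the activations observed at the next time step gives the error signal that the algorithm uses to adjust the weights.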

RESULTS AND DISCUSSION
To facilitate analysis of the experimental results, the experiments were run using different numbers of linguistic labels. The sample diabetic patient records are taken from the benchmark medical database (http://archive.ics.uci.edu/ml/datasets/diabetes). The training datasets are preprocessed and stored in the form of relational database records. During the construction of the FTCM, the temporal membership function is selected automatically based upon the values of the concepts. A symmetric Gaussian membership function with temporal constraints is used. The weights associated with the concepts are also generated automatically, which greatly reduces human intervention. Layer 1 is a set of input values, which are the identified concepts of the investigated system. Time is divided into 4 logical time slots: the 24 h day is divided into intervals of 6 h each. Each time slot has a start and an end time. The end of the first slot is the beginning of the second time slot. A one-to-one mapping of the observations of the system to concept labels is performed. Each concept is defined with a concept label of three digits, where the third digit is the number of the slot of the day. For example, if the input file data is 03-03-2012 08:00 58, then the concept is 581, where 1 is the first time slot. Layer 2 represents the fuzzification of the input variables. The membership function is a graphical representation of the magnitude of participation of each input. It associates a weight with each of the inputs that are processed, defines the functional overlap between inputs and ultimately determines an output response. The rules use the input membership values as weighting factors to determine their influence on the fuzzy output sets of the final output conclusion. Once the functions are inferred, scaled and combined, they are defuzzified into a crisp output which drives the system.
In this study, the initial weight matrix $W_{ij}$ represents the impact of concept i on concept j, where i ≠ j, so the main diagonal of the matrix is 0. The weights should be between -1 and +1. The weights are adjusted based on the concept values, the time intervals and the temporal relationships between the concepts. Layer 4 represents the output concept. The result is an FTCM, which can be visualised with the prefuse tool. The knowledge representation through the FTCM can be used as well as a domain expert in predicting the severity of the diabetic disease to make decisions. The graph in Fig. 2 shows the correlation between the values produced by the FTCM and those of the domain expert. From Fig. 2 it is observed that the predictions made by our inference system become more accurate when the number of patient records exceeds 500.

A generalized FTCM is constructed with the diabetic dataset taken from (http://archive.ics.uci.edu/ml/datasets/diabetes). The results of the learning process are as close as possible to the expected result under the following settings: the concepts of the investigated system are the observations made periodically and the time interval is 4 slots per day. The state of the cause-effect relation of a concept is obtained for each time slot and, if the value is not measured in a time period, it may be taken from the previous value. Previously, the experiment was carried out using different delays and decays in the concepts, and drastic changes were observed in the temporal values of the weights. In the proposed approach, due to the use of a fuzzy temporal neural network, the weights between the concepts remain constant and are close to the expected values. We have arrived at the same results in fewer iterations, as shown in Fig. 3.
Inferencing and forecasting for the incoming patient record are performed based on Allen's interval algebra and rules. The frequently occurring patterns, or generalized pattern lists, are obtained with temporal intervals. The dynamically generated FTCM is analysed using Allen's temporal relations to infer useful patterns in the diabetic patient record. The similarity of the general pattern with the candidate pattern is observed. If there is a similarity based upon the temporal relations, then the activation value of the concept is observed for a biological disorder, such as an abnormal blood glucose level or insulin level. The learned weights predict the severity of the disease through the impact of concept i on concept j. The flow of such cause-effect relationship values can represent the disease symptoms of a patient at a particular time. These values are governed by certain temporal relations that bring them back to their normal values, and the decision on the medical therapy is made accordingly. The prediction accuracy of this approach is compared with that of a simple neural network, and the proposed system yields a better accuracy than the existing approach, as shown in Fig. 4.
This FTCM approach is viewed as an efficient knowledge representation scheme for temporal medical databases with prediction accuracy of 91% for large datasets. Figure 5 shows the performance comparison of the proposed FTCM with BPN (Back Propagation) Neural Network.
The cause-effect relationships of the concepts are used to retrieve the temporal relational rules by integrating Allen's temporal logic. The inferred rules are used to derive useful information based upon the observations made on the blood glucose level. Figure 6 shows that the rules inferred through the FTCM are efficiently used in a decision making system in order to perform diagnosis and prediction.

CONCLUSION
In this study, a knowledge representation technique called the fuzzy temporal cognitive map and a fuzzy temporal inference system for mining temporal medical datasets have been proposed. Using this FTCM, the quantification of causalities and the identification of cause-effect relationships are performed automatically, which reduces the number of iterations needed to arrive at an accurate result. In this manner, FTCM models of the investigated systems are constructed automatically from the data, thereby reducing the excessive dependence on expert knowledge. The proposed approach also provides better prediction accuracy than FCM models when the number of patient records exceeds five hundred. The selection of the membership function for the process of fuzzification in the FTCM yields a better relation than the domain expert.
The proposed work can be extended by using other approaches like genetic algorithms to assign the membership functions and feature selection. This study can be extended by building a fuzzy rule based classifier for enhancing the prediction accuracy.