Deduction of Oral Cancer Using Fuzzy Linear Regression

Problem statement: To examine the risk factors of oral cancer at an early stage. Smoking, chewing and drinking, the major risk factors for oral cancer, were considered as input variables. Approach: A case-control study was conducted at JKK Nataraj Dental College and Hospital in Namakkal District, Tamilnadu, India, from September 2007 to November 2009. The collected data were analyzed using Fuzzy Linear Regression, for which a JAVA program was developed. Results: Using the fuzzy linear regression model, smoking, drinking and chewing were identified as potent risk factors of oral cancer. Conclusion: Smoking, drinking and chewing are the most dangerous risk factors for oral cancer. This study will help to improve clinical practice and provide guidance for analyzing the risk factors of oral cancer.


INTRODUCTION
Oral cancer is one of the most common life-threatening cancers all over the world, particularly in Asian countries, and tobacco is considered to be the most potent risk factor for it. Oral cancer is a significant cause of morbidity and mortality and consistently ranks as one of the top ten cancers worldwide, with broad differences in geographic distribution (Delavarian, 2010). Worldwide, oral cancer is reported to have the lowest survival rate. This has led to increased concern over the role of cancer screening programs (Mohd Dom Rosma et al., 2010). According to the World Health Organization, around 40% of the oral cancers diagnosed worldwide occur in India, Pakistan, Bangladesh and Sri Lanka. India has one of the highest rates in the world, accounting for one-third of the total cancers, and unfortunately this figure continues to rise. Tobacco use causes 80-90% of oral cancers. Many epidemiological studies conducted over the last three decades in America, Europe and Asia have provided strong evidence of an association between alcohol and tobacco use and an increased risk of oral cancer. Low socio-economic status is also significantly associated with increased oral cancer risk in high- and lower-income countries across the world.
Tobacco is one of the most preventable causes of oral cancer. Although there is evidence that smoking (cigarette, cigar and pipe) is associated with oral cancer, smokeless tobacco (often called chewing tobacco or spit tobacco) seems to be strongly associated with it as well. These findings are based primarily on the epidemiological association of tobacco use with increased incidence of oral cancer.
The co-factors are age, gender, ethnicity, smoking, lack of fruits and vegetables in the diet, alcohol drinking, chewing, chronic irritation of the lining of the mouth, alcohol-containing mouthwash and Human Papilloma Virus (HPV) infection. Oral cancers are two to four times as common in men as in women. This may be because men are more likely to use tobacco and alcohol (Arulchinnappan et al., 2011).
The most common treatments are radiation therapy and chemotherapy. New treatment methods for cancer are introduced every day, but decision making is a complex practice that should be carried out with extreme care and conscientiousness. Decisions based on intuitive thinking may sometimes lead to wrong diagnosis and treatment. Methodical decision making is more reliable and should form the basis for such decisions. Rather than only identifying the stages and development of cancer, it is important to identify the risk of oral cancer so that it can be prevented.
A risk factor is anything that may increase a person's chance of developing a disease. Some people with one or more risk factors never develop the disease, while others develop the disease and have no known risk factors. Nevertheless, knowing the risk factors for a disease can guide a person towards appropriate actions, including changing behaviors and being clinically monitored for the disease.
The purpose of this study is to present the use of fuzzy regression models in the prediction of oral cancer susceptibility as a function of the demographic profile (age, gender and ethnicity) and the risk habits (cigarette smoking, alcohol drinking, tobacco chewing, sunlight, mouthwash, chronic irritations and diet).
Regression models have long been the main predictive modeling technique. Predictive regression models characterize the relationship between inputs and outputs using linear equations. To extend the capability of regression models, non-linear transformations are often applied to the model inputs or output.

Fuzzy regression prediction model:
Fuzzy set theory is a mathematical theory for describing variables of interest that depend on uncertain factors or variables, such as seasonal inflows. The relationship between input and output variables is defined by fuzzy rules that follow human processes of thinking and decision making. In addition, fuzzy rules are relatively easy to explain and understand. Recently, the fuzzy model has been accepted as a way to describe the relationship between uncertain variables (Hassan et al., 2010). Often, fuzzy models have been calibrated by manually adjusting (trial and error) the membership functions and rule bases (Teerawat et al., 2011). A fuzzy regression model is a nonparametric model that can be used to explain the variation of a dependent variable Y in terms of the variation of the independent variable X as Y = f(x), where f(x) is a linear function. Fuzzy regression provides a means of handling regression problems that lack a significant amount of data and have vague relationships between the explanatory and response variables. It was first introduced by Tanaka in 1982. A fuzzy linear regression model expresses the regression coefficients as fuzzy numbers in interval form. The estimated dependent variable Y is therefore also a fuzzy number.
There are two approaches to fuzzy regression. The first, also known as possibilistic regression, is based on minimizing fuzziness as the optimality criterion. The second is based on the least squares of errors as the fitting criterion. In fuzzy regression, deviations between observed and estimated values are assumed to be due to system fuzziness or fuzziness of the regression coefficients (Mohd Dom Rosma et al., 2010).
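For reference, the possibilistic approach is commonly written as a linear program. The block below is a sketch of that standard textbook formulation, not a statement of the exact program solved in this study. It assumes crisp observations $y_i$ with inputs $x_{ij}$ ($x_{i0} = 1$), symmetric triangular coefficients with centres $\alpha_j$ and half-widths $c_j$, and a chosen threshold level $h \in [0, 1)$; the symbols $n$, $x_{ij}$ and $h$ are introduced here for illustration.

```latex
\begin{aligned}
\min_{\alpha,\,c}\quad & J = \sum_{j=0}^{K} c_j \sum_{i=1}^{n} |x_{ij}| \\
\text{subject to}\quad
& \sum_{j=0}^{K} \alpha_j x_{ij} + (1-h) \sum_{j=0}^{K} c_j |x_{ij}| \ge y_i, \\
& \sum_{j=0}^{K} \alpha_j x_{ij} - (1-h) \sum_{j=0}^{K} c_j |x_{ij}| \le y_i, \\
& c_j \ge 0, \qquad i = 1, \ldots, n,\quad j = 0, \ldots, K.
\end{aligned}
```

The objective is the total spread of the fitted outputs, and the two constraint families require every observation to lie inside the estimated fuzzy interval at level $h$.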
In this study, the fuzzy regression model used is based on Tanaka's possibilistic regression, which describes the response variable Y as:

$$Y = A_0 + A_1 X_1 + A_2 X_2 + \cdots + A_K X_K$$

where Y is the fuzzy output, $X = [X_1, X_2, \ldots, X_K]^T$ is the real-valued input vector of independent variables and each regression coefficient $A_j$, $j = 0, 1, \ldots, K$, is assumed to be a symmetric triangular fuzzy number with centre $\alpha_j$ and half-width $c_j$, $c_j > 0$.
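To make the role of the centres and half-widths concrete, the following Java sketch evaluates the fuzzy output for a crisp input vector. With symmetric triangular coefficients the output is again a symmetric triangular number, with centre $\alpha_0 + \sum_j \alpha_j X_j$ and half-width $c_0 + \sum_j c_j |X_j|$. The class name `FuzzyLinearModel` and the coefficient values are hypothetical and are not taken from the study's JAVA program or data.

```java
import java.util.Arrays;

/** Evaluates a fuzzy linear model whose coefficients are symmetric triangular fuzzy numbers. */
final class FuzzyLinearModel {
    private final double[] centres;    // alpha_0 .. alpha_K
    private final double[] halfWidths; // c_0 .. c_K

    FuzzyLinearModel(double[] centres, double[] halfWidths) {
        if (centres.length != halfWidths.length) {
            throw new IllegalArgumentException("centres and half-widths must have equal length");
        }
        for (double c : halfWidths) {
            if (c < 0) {
                throw new IllegalArgumentException("half-widths must be non-negative");
            }
        }
        this.centres = centres.clone();
        this.halfWidths = halfWidths.clone();
    }

    /** Returns {lower, centre, upper} of the fuzzy output for crisp inputs x_1 .. x_K. */
    double[] predict(double[] x) {
        if (x.length != centres.length - 1) {
            throw new IllegalArgumentException("expected " + (centres.length - 1) + " inputs");
        }
        double centre = centres[0];    // intercept centre alpha_0
        double spread = halfWidths[0]; // intercept half-width c_0
        for (int j = 0; j < x.length; j++) {
            centre += centres[j + 1] * x[j];
            spread += halfWidths[j + 1] * Math.abs(x[j]);
        }
        return new double[] {centre - spread, centre, centre + spread};
    }

    public static void main(String[] args) {
        // Hypothetical coefficients, for illustration only (not values from the study).
        FuzzyLinearModel model = new FuzzyLinearModel(
                new double[] {0.1, 0.3, 0.2},
                new double[] {0.05, 0.1, 0.02});
        System.out.println(Arrays.toString(model.predict(new double[] {1.0, 0.5})));
    }
}
```

The returned lower and upper values correspond to the interval form of the estimated output described above.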
The fuzzy regression equation is considered as:

$$Y = (\alpha_0, c_0) + (\alpha_1, c_1) X_1 + \cdots + (\alpha_K, c_K) X_K$$

Fuzzifying the dataset:
First, in order to represent a continuous fuzzy set, we need to express it as a function and then map the elements of the set to their degree of membership. Triangular membership functions are used to represent the fuzzy sets because of their simplicity, ease of comprehension and computational efficiency. Membership functions are usually predefined by experienced experts, but they can also be derived through automatic adjustment. We use three linguistic terms (low, medium and high), so three fuzzy membership values are produced.
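The fuzzification step can be sketched in Java as follows. The sketch assumes the three terms are evenly spaced triangles over a variable's range, with "low" peaking at the minimum, "medium" at the midpoint and "high" at the maximum. These breakpoints, the class name `TriangularFuzzifier` and the age range in the example are assumptions for illustration; the study's actual bounds are those given in Table 1.

```java
import java.util.Arrays;

/** Maps a crisp value to membership degrees in the terms low, medium and high. */
final class TriangularFuzzifier {

    /** Triangular membership with feet a and c and peak b, where a <= b <= c and a < c. */
    static double triangular(double x, double a, double b, double c) {
        if (x == b) {
            return 1.0;                 // peak (also covers shoulders where a == b or b == c)
        }
        if (x <= a || x >= c) {
            return 0.0;                 // outside the support
        }
        return x < b ? (x - a) / (b - a)   // rising edge
                     : (c - x) / (c - b);  // falling edge
    }

    /** Returns {low, medium, high} for x, clamped to the range [min, max]. */
    static double[] fuzzify(double x, double min, double max) {
        double v = Math.max(min, Math.min(max, x));
        double mid = (min + max) / 2.0;
        return new double[] {
            triangular(v, min, min, mid), // low: peak at the minimum
            triangular(v, min, mid, max), // medium: peak at the midpoint
            triangular(v, mid, max, max)  // high: peak at the maximum
        };
    }

    public static void main(String[] args) {
        // Hypothetical age range of 35 to 75 years, for illustration only.
        System.out.println(Arrays.toString(fuzzify(45.0, 35.0, 75.0)));
    }
}
```

With this layout the memberships of any in-range value sum to one, which keeps the three terms easy to interpret. Expert-defined breakpoints could be substituted by calling `triangular` directly.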

Study of population:
The majority of the subjects were textile mill and sago factory workers employed in and around Namakkal. In addition, workers from allied textile industries, transport workers and electrical power plant employees were examined.
The original study population comprised 221 industrial workers.
All participants were 35 years of age or older. All examinations were conducted with the help of specially trained dentists.
Identifying information such as the age and gender of patients (demographic profile) and information on cigarette smoking, alcohol drinking and tobacco chewing (oral cancer risk habits) were recorded and used as input variables.
The output refers to the health condition of the patients as either healthy or unhealthy.
The demographic and disease variables of patients that were reported to be risk factors associated with oral cancer were used as the predictor variables in developing the fuzzy linear regression models.
The output was coded as either 'cancer' (1) or 'healthy' (0). Variable descriptions and the lower and upper bound values of the membership functions are shown in Table 1.
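After calculating the lower and upper bounds of the fuzzy coefficients, the quantities entering the crisp system of linear equations are, with $\sum y$ and $\sum Xy$ each evaluated for the two bounds:

$$\sum X = 66,\quad \sum y = 6.56,\quad \sum y = 8.41,\quad \sum Xy = 50.79,\quad \sum Xy = 39.23,\quad \left(\sum X\right)^2 = 4356$$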
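One way such a crisp system can be solved is sketched below in Java. The sketch assumes the system consists of the two least-squares normal equations for a line $y = a + bX$, solved once for the lower-bound series and once for the upper-bound series. The source does not spell this out, so the method, the class name `CrispLineFit` and the data in `main` are all assumptions for illustration.

```java
import java.util.Arrays;

/** Solves the normal equations of a straight-line fit y = a + b*x from summary sums. */
final class CrispLineFit {

    /** Returns {a, b} given n, sum(x), sum(y), sum(x*y) and sum(x*x). */
    static double[] solve(int n, double sumX, double sumY, double sumXY, double sumXX) {
        double det = n * sumXX - sumX * sumX;
        if (Math.abs(det) < 1e-12) {
            throw new ArithmeticException("x values are degenerate; slope is undefined");
        }
        double b = (n * sumXY - sumX * sumY) / det; // slope
        double a = (sumY - b * sumX) / n;           // intercept
        return new double[] {a, b};
    }

    /** Accumulates the sums for paired observations and solves for {a, b}. */
    static double[] fit(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need at least two paired observations");
        }
        double sumX = 0, sumY = 0, sumXY = 0, sumXX = 0;
        for (int i = 0; i < x.length; i++) {
            sumX += x[i];
            sumY += y[i];
            sumXY += x[i] * y[i];
            sumXX += x[i] * x[i];
        }
        return solve(x.length, sumX, sumY, sumXY, sumXX);
    }

    public static void main(String[] args) {
        // Hypothetical data, for illustration only (not the study's observations).
        double[] x = {1, 2, 3, 4};
        double[] lowerY = {0.1, 0.2, 0.3, 0.4};
        double[] upperY = {0.3, 0.5, 0.7, 0.9};
        System.out.println("lower line {a, b}: " + Arrays.toString(fit(x, lowerY)));
        System.out.println("upper line {a, b}: " + Arrays.toString(fit(x, upperY)));
    }
}
```

Fitting the two series separately yields a lower and an upper line whose gap plays the role of the spread of the fuzzy output.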

Fuzzy linear regression algorithm:
Stage 1: Set-up stage:
Step 1: Find the membership function.

Find fuzzy regression problem:
Step 4: Find lower and upper bound values.

DISCUSSION
Oral cancer is the sixth most common cancer for both sexes in the general population and the third most common cancer in developing nations. It is well known to occur in the age group of 35 and above and is the most life-threatening of all dental conditions. Unfortunately, most malignant oral tumors are not detected until they are at an advanced stage. Oral cancer is a major health problem among tobacco users all over the world. The early-stage development of oral cancer is therefore a matter of great concern in this study, which attempts to evaluate how the fuzzy method can improve the early detection of the risk factors that can cause oral cancer.

CONCLUSION
Oral cancer risk factors were found to be prevalent. The purpose of this study was to evaluate the ability of a fuzzy regression model to predict the likelihood of an individual developing oral cancer based on knowledge of their risk habits and demographic profile. The data connected with oral cancer were fuzzified and the results were obtained using fuzzy linear regression implemented in a JAVA program. The results show that smoking, drinking and chewing are the most dangerous risk factors for oral cancer. Through this innovative Fuzzy Linear Regression algorithm, it is possible to identify the risk factors of oral cancer. The findings of this study suggest that fuzzy regression models provide a good alternative to human expert prediction of oral cancer susceptibility.