A BI-RADS Based Expert Systems for the Diagnoses of Breast Diseases

We propose an expert system, based on the interpretation of mammographic and ultrasound images, that may be used by expert and non-expert doctors to interpret and classify patient cases. The software consists of a mammographic (MAMMEX) and a breast ultrasound (SOUNDEX) medical expert system, which may be used to classify cases according to the Breast Imaging Reporting and Data System (BI-RADS) based upon patients’ history, physical and clinical assessment as well as mammograms and breast ultrasound images. The systems were tested on a total of 179 retrospective cases from the Radiology Department, Hospital Universiti Sains Malaysia (HUSM), Kubang Kerian, Kelantan. The accuracy, sensitivity and specificity were 97%, 96% and 92% for MAMMEX and 99%, 98% and 100% for SOUNDEX, respectively.


INTRODUCTION
Breast cancer is the most common form of malignant disease amongst women. However, mortality rates have fallen noticeably, especially in the United Kingdom, partly because of the widespread practice of breast screening [1]. For this reason, the use of ultrasound in conjunction with physical and mammographic examination has been promoted in order to obtain a thorough assessment in breast screening.
Breast screening is not without its problems. Mass screening would increase the caseload of radiologists, which would in turn raise the chance of improper diagnosis. Diagnosticians with the training and experience to interpret mammographic images and breast ultrasounds are scarce. This is further aggravated by the requirement, in certain practices, that two radiologists read each case. Mammography reading is a very hard skill to teach and requires years of experience and frequent scrutinizing of images [2]. Radiologists' training for mammography traditionally involves viewing large numbers of films, and radiologists need to maintain a high throughput (approximately 7000 cases per year) in order to perform well in reading and interpreting mammograms [2]. Sensitivity and specificity are crucial in clinical practice, as only 15-30% of patients referred for biopsy are found to have a malignancy. Unnecessary biopsies increase health care costs and may cause patient anxiety and morbidity. It is therefore important to improve the accuracy of interpreting mammographic lesions [3], thereby improving the positive predictive values of detection modalities.
Routine and repetitive use of computer-based systems developed for experiments would bring several benefits. Radiologists could be trained to evaluate the perceptual features appropriately [4] . The existence of an expert system would make diagnostic expertise more widely and readily available in the clinical community. The availability of this system would facilitate computer aided study and learning. This system would also prove to be useful in the training of radiologists in the early part of their career. The archiving of knowledge gathered in this area with patient cases would also promote the interpretation of images in a more consistent and standard manner and may be referred to from time to time.
The lack of a standardized description and categorization of breast assessment [5,6,7] led to the realization that a standard guideline or system was urgently needed. In view of this, the American College of Radiology convened a task force on breast cancer in the late 1980s and appointed a committee to develop guidelines for standardized reporting, initially for mammographic findings. This was published as the "Breast Imaging Reporting and Data System", referred to as BI-RADS [8,9], which was intended to standardize the terminology used in reporting, starting with the mammogram report. BI-RADS was not in routine use until 1997 [10]. The standardized assessment categories were used initially to describe findings on mammography and were then extended to the other modalities. On the basis of the level of suspicion, detected lesions or abnormalities can be placed into one of the BI-RADS assessment categories.
Even though BI-RADS was introduced to help standardize feature analysis and final management of breast modality findings, there still exist variations in its interpretation. Continued efforts to educate radiologists are still needed to promote maximum consistency [11].
The earliest study encountered was by Cook & Fox [12], in which mammographic image analysis was investigated using a decision table to represent all the parameters and possibilities in 41 rules, all centred upon masses and lesions. Other related works were mostly based on Artificial Neural Networks (ANN) for decision making in the diagnosis of breast cancer, and some of this work relates to breast biopsy decisions [13,14,15,16,17]. A study was also carried out by Floyd et al. [18] using case-based reasoning, but none of these fits exactly what this study is intended to achieve.
The Proposed Method:
It usually takes five to ten person-years to build even a moderate expert system, and the most crucial stage in expert system technology is knowledge acquisition [19], as it involves the effort dedicated to identifying the facts that comprise the knowledge base. It is often very difficult for clinicians and health care providers to set out systematically on paper the knowledge base and/or algorithms that they use in diagnosis and/or treatment.
Certain guidelines may be adopted to ease the whole process. The three steps in the knowledge acquisition process, shown in Figure 1, are acquiring explanations from the experts, capturing the knowledge and organizing the knowledge. The capturing stage is the process of documenting the objects, relations and actions that make up the knowledge. The organization stage is the process of ordering the knowledge in a form that is ready to be mapped into the knowledge base being developed.

Figure 1: The knowledge acquisition process (figure labels: Explanation, Capture, Organization, Knowledge).
Explanation includes interviews, the interview environment, the do's and don'ts that the interviewer needs to abide by, as well as obtaining and capturing knowledge that exists in written form.
In addition, knowledge acquisition from a human expert is a delicate task that needs to be carefully thought out and deliberately conducted. It also typically lacks an organizational format to guide the activity, and it has to undergo the processes shown in Figure 2.
As the development of the knowledge base is the most important task that the knowledge engineer performs, a stringent and diligent process needed to be employed to give a systematic, thoughtful procedure for knowledge-base construction. In the development of the expert system in this study, knowledge acquisition and knowledge representation proceeded virtually hand in hand, as this was vital for the integrity of the end result.
The complete knowledge base or expert system was then developed incrementally. The system development was based on production rules and, therefore, decision tables were considered. The formation of the rules was based on the different modalities and their associated features.
The knowledge base or expert system developed for this work is divided into two parts, namely MAMMEX, the Mammogram Expert System, and SOUNDEX, the Ultrasound Expert System. It is envisaged that these two expert systems will produce results that are on par and consistent with those generated by the primary domain expert. In other words, each time MAMMEX and SOUNDEX are consulted, they are expected to provide the same advice as an expert, which in this case is the BI-RADS category that the radiologist assigns at the end of the assessment of each case. This established the benchmark against which MAMMEX and SOUNDEX were tested.
Several criteria were considered in the initial stage of development of MAMMEX and SOUNDEX. These included choosing the most suitable language and working environment: the software should be able to be stored and run on a portable computer in a Windows environment; it should be simple enough for an end-user to learn to run it; it should exhibit high performance in terms of speed and reliability in order to be a useful tool; it should propose correct and consistent solutions in a reasonable amount of time; its screen definition language should be powerful enough to customize the way the questions are asked; and it should allow interface calls to other external routines or programs. After careful deliberation, including consideration of established knowledge-based shells, the Builder C++ language environment was chosen as the medium for the implementation.

Developing a Prototype:
The process of interpreting mammograms is a multifaceted medical decision-making task [20], as a constellation of characteristics and features has to be considered before a conclusion or decision can be made. As such, the diagnosis of breast diseases requires a thorough investigation of all possibilities and procedures. Moreover, the knowledge is a poorly structured collection of many isolated facts, and it is unclear which distinctions between the facts are the important ones. It was therefore necessary to include heuristic or other appropriate methods that do not require perfect data, so that the solutions derived by the system may be proposed with varying degrees of certainty. It was also important to obtain explanations of how the expert system arrived at its answer and justifications for the knowledge itself. Therefore, the use of rules or assertions was preferred to represent the knowledge.
The rule base was created from discussions with practicing clinicians and radiologists and through the extraction of rules from journals and texts on established practices for patients who present for assessment or with complaints.
When the system is run, questions pertaining to the patient history, clinical and physical assessment, mammographic features and, eventually, ultrasound features are displayed on the screen in a Windows environment. This is how the data needed as input to arrive at a decision are obtained.
All questions are multiple choice. A question displays a statement ending in a verb, followed by a numbered list of possible choices that complete the sentence. The user selects the number of the correct choice for the situation with a click of the mouse. Questions continue to be displayed one after another, following the path determined by the answers chosen. For example, if the user finds no mass on mammogram assessment, the subsequent questions pertaining to the mammogram mass are skipped and the questions pertaining to the next feature (calcifications, for example) are asked instead.
The user is required to answer all the questions displayed by the system. At the end, the system provides the user with a conclusion listing the BI-RADS categories for the particular modality. The category with the highest associated numeric value is taken as the answer returned by the system.
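The branching question flow and the final selection described above can be illustrated with a minimal sketch. The question identifiers, question texts, branching and numeric values below are hypothetical and are not taken from the MAMMEX rule base; the sketch only shows how an answer can determine the next question and how the category with the highest value is returned.

```python
# Hypothetical question graph: every choice names the question that follows it.
# Texts, choices and branching are illustrative, not the MAMMEX rule base.
QUESTIONS = {
    "mass_present": {
        "text": "On the mammogram, a mass is",
        "choices": {1: ("present", "mass_margin"), 2: ("absent", "calc_present")},
    },
    "mass_margin": {
        "text": "The margin of the mass is",
        "choices": {1: ("well defined", "calc_present"), 2: ("irregular", "calc_present")},
    },
    "calc_present": {
        "text": "Calcifications are",
        "choices": {1: ("present", None), 2: ("absent", None)},
    },
}


def questions_asked(answers, start="mass_present"):
    """Return the ids of the questions displayed for a given set of answers."""
    asked = []
    current = start
    while current is not None:
        asked.append(current)
        # The chosen answer decides which question is displayed next.
        _label, current = QUESTIONS[current]["choices"][answers[current]]
    return asked


def final_category(scores):
    """Return the BI-RADS category carrying the highest numeric value."""
    return max(scores, key=scores.get)


# No mass found: the mass-margin question is skipped, calcifications come next.
path = questions_asked({"mass_present": 2, "calc_present": 2})
# Hypothetical values accumulated for each category at the end of a session.
category = final_category({"B1": 0.10, "B2": 0.70, "B3": 0.20})
```

With these made-up inputs, `path` contains only the mass-presence and calcification questions, and `category` is the entry with the largest value.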
The consultation is essentially a search through a tree of goals. The top goal at the root of the tree is the action part of the goal rule, i.e. the suspicion level returned or the diagnosis of the disease. Subgoals further down the tree include determining the other features involved and seeing whether these are significant. Many of these subgoals have sub-subgoals of their own; for the mammographic features, for example, these are the presence of a mass and its details.
The tree is a very useful structure for representing the way in which goals can be expanded into subgoals by a program. The basic idea is that the root node of the tree represents the main goal, the terminal nodes represent primitive actions that can be carried out and the non-terminal nodes represent subgoals that are susceptible to further analysis.
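The goal tree described above can be sketched as a small recursive data structure. The `Goal` class, the goal names and the tree below are assumptions made for illustration; they are not the structure used inside MAMMEX or SOUNDEX.

```python
from dataclasses import dataclass, field


@dataclass
class Goal:
    """A node of the goal tree; a node without subgoals is a primitive action."""
    name: str
    subgoals: list = field(default_factory=list)


def primitive_actions(goal):
    """Depth-first walk that collects the terminal nodes in left-to-right order."""
    if not goal.subgoals:
        return [goal.name]
    actions = []
    for sub in goal.subgoals:
        actions.extend(primitive_actions(sub))
    return actions


# Hypothetical tree: root goal, two subgoals, three primitive actions.
tree = Goal("determine BI-RADS category", [
    Goal("assess mass", [Goal("ask mass margin"), Goal("ask mass shape")]),
    Goal("assess calcifications", [Goal("ask calcification morphology")]),
])
leaves = primitive_actions(tree)
```

Here the root stands for the main goal, the two inner nodes for subgoals, and `leaves` lists the primitive actions (the questions to ask) in the order in which they would be reached.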
The search strategy and the manner in which the rules are executed may be described in more technical terms as a forward chaining search and inference technique with pruning, a natural way to design expert systems for analysis and interpretation. That is, the process begins with certain data concerning the category that is most likely; for example, its mass features (if any), calcification features (if any) and so on. These data, along with the constraints, serve to highlight the potential alternatives and to eliminate the unlikely ones. This is consistent with the way a domain expert reacts when confronted with patients' cases when they arrive at the hospital for a check-up. The expert first needs to gather some information and then tries to infer from it whatever can be inferred [19]. Thus the search ultimately arrives at a listing from which a final selection is made based on the highest score. The pruning process reduces the search requirements.
The actual premises for each of the different modalities are listed in Tables 1, 2, 3 and 4. Table 1 lists the premises used in the classifier system for patient history, Table 2 the premises used in the physical and clinical assessment, Table 3 the premises used in the mammographic assessment and Table 4 the premises for the ultrasound assessment. The premises for each of the sections underwent numerous changes. Detailed discussions were held from time to time, which entailed numerous trips to HUSM to ensure that the work progressed within an acceptable frame of reference. Finally, the above premises were obtained with the consensus of the radiologists.
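The forward chaining search with pruning described above can be sketched in a few lines. The rules, fact names and exclusion table below are invented for illustration and do not reproduce the MAMMEX or SOUNDEX rule base; the sketch only shows data-driven rule firing followed by the removal of unlikely categories.

```python
def forward_chain(facts, rules):
    """Fire every rule whose premises are all known until nothing new is derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and premises <= known:
                known.add(conclusion)
                changed = True
    return known


def prune(candidates, known, excluded_by):
    """Keep only the candidate categories not ruled out by a known fact."""
    return [c for c in candidates if not (excluded_by.get(c, set()) & known)]


# Hypothetical production rules: (premises, conclusion).
RULES = [
    (frozenset({"mass present", "margin irregular"}), "suspicious mass"),
    (frozenset({"suspicious mass"}), "not normal"),
]
# Hypothetical constraints: facts that exclude a category from the listing.
EXCLUDED_BY = {"B1": {"not normal"}, "B2": {"suspicious mass"}}

known = forward_chain({"mass present", "margin irregular"}, RULES)
remaining = prune(["B1", "B2", "B3", "B4", "B5"], known, EXCLUDED_BY)
```

In this toy run the two observed facts trigger both rules in turn, and the pruning step removes the categories excluded by the derived facts, leaving a shorter listing from which the highest-scoring category would be selected.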
An attempt was also made to incorporate patients' images into the expert systems to allow for image manipulation and processing.

Table 3: The premises used in the classifier system pertaining to the mammographic features
The margin of the mass is (well defined, sharp halo, microlobulated, macrolobulated, ill-defined, irregular, obscured, uncertain)
The shape of the mass is (round, oval, irregular, stellate, uncertain)
The size is (less than 1.0 cm, equal to 1.0 cm, greater than 1.0 cm, uncertain)
The density of the mass is (fat density, low density, isodense, high, has central lucency)
The masses are (multicentred, multifocused, multicentred/multifocused, uncertain)
If the masses are multicentred or multifocused, then they are also bilateral (yes, no)
There is no architectural distortion (yes, no, uncertain)
There is skin thickening (yes, no, uncertain)
There is nipple retraction/abnormality (yes, no, uncertain)
Calcifications are present (yes, no, uncertain)
The calcification is (micro, macro, mixed (micro, macro), uncertain)
The morphology of the calcifications is (lucent-centered, parallel tracks/linear tubular, coarse/popcorn-like, large rod-like, round, eggshell/rim, milk of calcium, suture calcifications, dystrophic, punctate, amorphous/indistinct, granular sand-like, pleomorphic/heterogeneous/granular, fine linear/fine linear branching/casting)
The calcification distribution is (grouped/cluster, linear, segmental, regional, diffused/scattered)
The number of calcifications per cubic cm is (1, less than 5, greater than 5, uncertain)
There is presence of node in axilla (yes, no, uncertain)
There are multiple nodes (yes, no, uncertain)
The shape of node in axilla is (round, ovoid/ellipsoid, bean-shaped, slightly lobulated, spiculated, uncertain)
The margin of the node is well-circumscribed (yes, no, uncertain)
Nodes are bilateral (yes, no, uncertain)
Size of node is (less than 2.0 cm, more than 2.0 cm, uncertain)
Node has central lucency (yes, no, uncertain)

Table 4: The premises used in the classifier system pertaining to the ultrasound features
The mass is detected on the ultrasound image (yes, no, uncertain)
Location of breast mass is (upper outer quadrant, upper inner quadrant, lower outer quadrant, lower inner quadrant, retroareolar, inner middle, outer middle, upper middle, lower middle, uncertain)
The shape of the mass is (round, ovoid/ellipsoid, irregular, lobular, spiculated, uncertain)
The orientation of the axis of the mass is (taller than wide, wider than tall, almost equal, uncertain)
Overall mass margin is (smooth/well-circumscribed, gentle lobulations, radial/ductal extension, branch pattern, angular margin, uncertain)
The number of lobulations is (less than 3, greater than 3, uncertain)
Echo pattern of mass is (anechoic, hypoechoic, hyperechoic, isoechoic, mixed, uncertain)
Posterior to the mass, there is (acoustic enhancement, normal/no enhancement/shadowing, complete shadowing, uncertain)
The lesion has calcifications (yes, no, uncertain)

The Quest For Information - Gathering Of Facts, Figures And Building The Decision Table:
Facts and information had to be gathered so that enough knowledge could be incorporated in the expert system. Based upon the premises listed previously, work then began on developing the framework of the expert system. For each modality and its various features, fact-finding had to be undertaken. To determine reasonable numerical values for each factor making up the sections in MAMMEX and SOUNDEX, numerous papers were mined. In other words, every piece of information relating to the main modalities had to be investigated and gathered. From this, more reliable numerical values could be found and used in the eventual knowledge-based system.

Collating Past Works And The Use Of Decision Tables:
To illustrate the whole process, it was helpful to develop a table before actually embarking on the fact finding. This formed the decision tables, which accommodate the previous works on each characteristic of the overall processes involved in breast assessment. For example, consider the mammographic feature assessment. The presence of a mass entails a search for facts related to, for example, the mass margin. These premises are listed on the left-hand side of the table, making up the rows. The column headers of the table are the resources and previous studies found to support evidence on the characteristics mentioned.
For each study that was gathered, the Positive Predictive Values (PPVs) for the benign and malignant features reported for the associated premises were entered in the appropriate cells of the decision table. Some papers focused on benign features only, some dealt mainly with malignant features, and some took a much broader view and encompassed benign as well as malignant features.
After an 'exhaustive' search of previous studies, the mean of the positive predictive values was calculated separately for the benign and malignant values. In this way, values pertaining to the benign and malignant cases emerged. These were taken to represent the possible range of certainty values for benign and malignant findings, differing for each feature of each modality. An example is shown in Table 5, which is a subsection of the whole decision table.
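The averaging step described above can be sketched as follows. The premises, the number of studies and every PPV in this fragment are made up for illustration; they are not the values of Table 5.

```python
from statistics import mean

# Hypothetical decision-table fragment: rows are premises, and each row holds
# the PPVs (in percent) reported by different studies. All numbers are made up.
DECISION_TABLE = {
    "mass margin: well defined": {"benign": [90.0, 80.0], "malignant": [10.0]},
    "mass margin: irregular": {"benign": [20.0], "malignant": [70.0, 80.0, 90.0]},
}


def mean_ppvs(table):
    """Average the PPVs per premise, separately for benign and malignant."""
    return {
        premise: {label: mean(values) for label, values in by_label.items() if values}
        for premise, by_label in table.items()
    }


certainty = mean_ppvs(DECISION_TABLE)
```

Each premise then carries one benign and one malignant mean value, which is the form in which certainty values could be attached to rules derived from the decision table.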
The entire decision table represents a method for visualizing the large number of possible situations in a single table. Rules can then be created directly from the decision table. From the decision tables, from which the knowledge base or expert system may begin to be constructed, certainty values may be formulated based upon the findings from the collection of referred papers on the various parameters. Work then proceeded towards determining the framework of the knowledge-based system, i.e. constructing the backbone of the entire system in the set of decision tables.
... their corresponding ultrasound images. Prior to these years, a database of patient images and cases in DICOM format was non-existent.
The simple two-by-two table is one of the most intuitive methods for the analysis of diagnostic examinations [21]. Despite its simplicity, the method is powerful in illuminating the performance of diagnostic examinations. The basic idea of diagnostic test interpretation is to calculate the probability that a patient has the disease under consideration given a certain test result. For this, a two-by-two table is used as a mnemonic device [22]. The table is labelled with the test results on the left side and the disease status on top, as shown in Figure 6. Table 7 shows fictitious data from an experiment to evaluate the accuracy of a certain test T for a set of patients with clinical suspicions. The data are numbers of women with malignant or benign breast tumours. Referring to the table, the sensitivity and specificity may be illustrated. Using the numbers in the 'Malignant' column, the sensitivity of the fictitious test T in the sample of women is approximately 88% (65 positive results out of 74 women with malignant lesions), and the specificity of test T is calculated as 75% (126 negative results out of 169 women with benign lesions).
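The sensitivity and specificity quoted for the fictitious test T can be checked with a short sketch that uses the standard two-by-two definitions. The function name is illustrative, and the false negative and false positive counts are derived from the totals given in the text (74 malignant and 169 benign cases).

```python
def two_by_two_measures(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from the four cell counts."""
    sensitivity = tp / (tp + fn)   # positives among the diseased
    specificity = tn / (tn + fp)   # negatives among the non-diseased
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Counts implied by the figures quoted for test T:
# 65 of 74 malignant cases test positive, 126 of 169 benign cases test negative.
sens, spec, acc = two_by_two_measures(tp=65, fn=74 - 65, tn=126, fp=169 - 126)
print(f"sensitivity={sens:.1%} specificity={spec:.1%} accuracy={acc:.1%}")
```

The computed sensitivity and specificity are about 87.8% and 74.6%, which round to the 88% and 75% stated above.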
The Modified Two-By-Two Table:
The categorization of cases in certain studies is not as straightforward as in Table 7. This is because the measures of accuracy (sensitivity and specificity) require a positivity threshold for classifying the test results as either positive or negative [23,24]. In mammography and breast ultrasound, the BI-RADS scoring system was used to classify the findings as normal, benign, probably benign, suspicious and malignant (i.e. BI-RADS B1, B2, B3, B4 and B5 respectively). Therefore, it was suggested that certain modifications be made based on the categories [25], called the Modified Two-by-Two Table Analysis.
The modified calculation of the accuracy, sensitivity and specificity in the two-by-two table, as in the above equations, was applied to the results obtained from the execution of MAMMEX and SOUNDEX.
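The role of a positivity threshold can be illustrated with a hedged sketch. The cut-off used here (B4 and above counted as positive) and the case list are assumptions chosen purely for illustration; they are not taken from the Modified Two-by-Two Table Analysis of [25], and the function name is hypothetical.

```python
def confusion_counts(cases, threshold=4):
    """Count (TP, FN, TN, FP) when categories >= threshold are called positive.

    `cases` is a list of (birads_category, is_malignant) pairs, where the
    category is an integer 1..5 standing for B1..B5. The default threshold
    is an assumption, not the cut-off used in the study.
    """
    tp = fn = tn = fp = 0
    for category, is_malignant in cases:
        positive = category >= threshold
        if is_malignant and positive:
            tp += 1
        elif is_malignant:
            fn += 1
        elif positive:
            fp += 1
        else:
            tn += 1
    return tp, fn, tn, fp


# Made-up cases for illustration only.
CASES = [(5, True), (4, True), (3, True), (2, False), (1, False), (4, False)]
counts_at_b4 = confusion_counts(CASES)
counts_at_b3 = confusion_counts(CASES, threshold=3)
```

Moving the threshold from B4 to B3 turns the B3 malignant case from a false negative into a true positive, which shows why the choice of threshold has to be fixed before sensitivity and specificity can be reported.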
A summary of the accuracy, sensitivity, specificity, true positive, true negative, false positive and false negative values for MAMMEX and SOUNDEX is shown in Table 8.

CONCLUSION
A BI-RADS based mammographic and ultrasound expert system for breast diseases (MAMMEX and SOUNDEX) has been successfully developed in this study. The Modified Two-by-Two Table results indicate that the expert systems have high performance and reliability, with accuracies of 97% and 99%, sensitivities of 96% and 98% and specificities of 92% and 100% for MAMMEX and SOUNDEX respectively.