Restricted Boltzmann Machines for Fundus Image Reconstruction and Classification of Hypertension Retinopathy

Corresponding Author: Bambang Krismono Triwijoyo, Department of Computer Science, University of Bumigora, Mataram, Indonesia. Email: bkrismono@universitasbumigora.ac.id

Abstract: Hypertensive retinopathy is conventionally classified by experts who analyze fundus images, but the results of this method depend heavily on the accuracy of observation and the experience of the expert. In this study, we propose a fundus image reconstruction and hypertensive retinopathy classification model using Restricted Boltzmann Machines (RBM), with the labeled Messidor database as the dataset. The experimental results show that the model achieves an accuracy of 99.05% and can generalize an input image into one of nine classes of hypertensive retinopathy severity.


Introduction
Medical image classification is a challenging research topic. One instance is retinal image classification, which is an important factor in screening for eye diseases, including Hypertension Retinopathy (HR), whose physical signs are changes in the retinal microvasculature in response to high blood pressure (Wong and Mitchell, 2004). The physical symptoms of retinopathy are narrowing of the retinal vessels, retinal bleeding and cotton wool spots.
The conventional method used by ophthalmologists is to evaluate fundus or retinal images of the eye to determine the evolutionary phase of hypertensive retinopathy. This method is weak in the accuracy and consistency of observations because it relies only on the eye doctor's vision; in particular, the early stages of hypertensive retinopathy are difficult to identify manually (Khitran et al., 2014). For these reasons, early diagnosis of hypertensive retinopathy through automatic analysis of retinal images is needed to help the ophthalmologist screen accurately for the prevention and treatment of hypertensive retinopathy.
This study aims to develop a classification model of hypertension retinopathy through deep learning methods, using Restricted Boltzmann Machines, and to analyze the performance of the classification model with retinal images from the MESSIDOR database as input data.
The Restricted Boltzmann Machine (RBM) is a learning rule based on the Boltzmann Machine method (Hinton, 2012). An RBM is a probabilistic generative model that can automatically extract features from input data using an unsupervised learning algorithm (Hinton, 2002; Smolensky, 1986). RBM uses a recurrent network architecture. Technically, an RBM is a stochastic neural network: neural network means that it has neuron units whose binary activations depend on the neurons they are connected to, whereas stochastic means that the activations have a probabilistic element. It consists of two layers of binary units, namely the visible layer, which holds the observed states, and the hidden layer, which acts as feature detectors, together with bias units. Each visible unit is connected to all hidden units through an array of weights, so each hidden unit is also connected to all visible units, and the bias units are connected to all visible and hidden units. The RBM is controlled by this set of weights and biases across all layers.
In general, the purpose of the RBM algorithm is to reconstruct the input as accurately as possible. In the forward pass, the input is transformed by the weights and biases into an output at the hidden layer. In the next stage, this output becomes the input of the backward pass, in which the visible layer turns the hidden activations into a reconstruction of the input; the reconstruction is then compared with the original input (Ranzato et al., 2010).
In the case of computer vision, each visible unit corresponds to a pixel value from the image while the hidden units represent independent specific features of the image. The weights connecting the visible and the hidden units are usually trained using contrastive divergence learning which is an approximation of maximum likelihood learning (Xia et al., 2016). Methods using RBMs have become more popular in recent years and they are successfully applied to image recognition (Yamashita et al., 2014).
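The reconstruction pass described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used in this study: the layer sizes, the random toy input and the names `reconstruct`, `n_visible` and `n_hidden` are hypothetical, and the mean squared difference is only one possible way to compare the reconstruction with the input.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def reconstruct(v, W, b_visible, b_hidden, rng):
    """One visible -> hidden -> visible pass of a binary RBM."""
    # Probability that each hidden unit switches on, given the input.
    p_hidden = sigmoid(v @ W + b_hidden)
    # Stochastic binary hidden states sampled from those probabilities.
    hidden = (rng.random(p_hidden.shape) < p_hidden).astype(float)
    # Reconstruction of the visible units from the hidden states.
    return sigmoid(hidden @ W.T + b_visible)

rng = np.random.default_rng(0)
n_visible, n_hidden = 12, 5          # toy sizes, chosen only for illustration
W = rng.normal(0.0, 0.01, size=(n_visible, n_hidden))
b_visible = np.zeros(n_visible)
b_hidden = np.zeros(n_hidden)

v = rng.random((4, n_visible))       # four toy inputs with values in [0, 1]
v_recon = reconstruct(v, W, b_visible, b_hidden, rng)
reconstruction_error = float(np.mean((v - v_recon) ** 2))
```

With untrained weights the reconstruction is close to 0.5 everywhere; training adjusts the weights and biases so that this error shrinks.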

Related Work
Previous research has used the RBM learning algorithm as a feature extraction method, as proposed by Hinton and Salakhutdinov (2006). RBM has a strong capacity for feature extraction and representation. Empirical research has shown that using features extracted by the RBM algorithm instead of raw data yields significant improvements in different machine learning applications, such as the classification of color images (Larochelle and Bengio, 2008) and speech and object recognition (Li et al., 2015). The learning algorithm in RBM is designed to extract discriminative features from large and complex datasets by introducing hidden units in an unsupervised way.
Previous studies on the classification of hypertensive retinopathy used AVR features with the DRIVE and VICAVR datasets (Khitran et al., 2014). They used a hybrid classifier that combines Naive Bayes and SVM, with an accuracy of 98% on the DRIVE dataset and 96.5% on the VICAVR dataset. Preprocessing steps are still needed to detect AVR properly and eliminate noise. Abbasi and Akram (2014) used the ratio of arterial to venous diameter (AVR) as a feature. They used 100 images of hypertensive retinopathy patients and four methods, Artificial Neural Networks (ANN), Naive Bayes, Decision Tree (DT) and Support Vector Machine (SVM), with accuracies of 76, 75, 68 and 81%, respectively. Agurto et al. (2014) used AVR features and the Tortuosity Index on a local dataset with the Partial Least Squares (PLS) method and reached 80% accuracy. This method needs additional features of AV nicking, vascular branching angles and embolic plaque for vascular changes. Cavallari et al. (2015) used the AVR feature and the Tortuosity Index on 16 retinal images from a local dataset. They used the average fractal dimension (mean-D) method, with an accuracy of 68.8%.
The classification of hypertensive retinopathy using deep learning was conducted by Triwijoyo et al. (2017), who used a Convolutional Neural Network (CNN) and the DRIVE dataset, with an accuracy of 98.6%.
Akbar et al. (2018) proposed detection of hypertensive retinopathy using edge detection of arterial and venous vessels on retinal images from three datasets, INSPIRE-AVR, VICAVR and AVRDB, with accuracies of 95, 96.8 and 98.8%, respectively. The detection of hypertensive retinopathy using neural networks has also been proposed by Syahputra et al. (2018) and Arsalan et al. (2019). Syahputra

Materials and Methods
In this section, we discuss the dataset input, data balancing, architecture and learning algorithm of the classification model using RBM.

Dataset
We used the Methods to Evaluate Segmentation and Indexing Techniques in the Field of Retinal Ophthalmology (MESSIDOR) database as the dataset (Messidor, 2010). Messidor is a research program funded by the French Ministry of Research and Defense within a 2004 TECHNO-VISION program. This database can be used, free of charge, only for research and educational purposes. The Messidor database consists of 1200 color digital eye fundus images of the posterior pole, which were acquired by three ophthalmologic departments using a color video 3CCD camera on a Topcon TRC NW6 non-mydriatic retinograph with a 45 degree field of view. Figure 1 shows an example of fundus images from the Messidor database.
The images, saved in uncompressed TIFF format, were captured using 8 bits per color plane at a resolution of 1440×960, 2240×1488 or 2304×1536 pixels. Figure 2 shows the diagram of the proposed method, which in general has six steps. It starts with four preprocessing steps on the input image from the Messidor database: cropping and resizing, segmentation, measuring AVR and labeling to determine the class of hypertensive retinopathy. These steps produce a new dataset of hypertensive retinopathy consisting of nine classes. Next is the training process of the RBM model, and the last step is testing the trained model on test data to produce a classification of hypertensive retinopathy.

Method
Preprocessing includes cropping the original image to remove the left and right parts of the background, focus on the retina and reduce complexity. The cropping process changes the original image size from 1440×960 to 900×900 pixels, from 2240×1488 to 1380×1380 pixels and from 2304×1536 to 1452×1452 pixels. After cropping, the three sizes of cropped images are resized to a single dimension of 256×256 pixels to be used as input to the classification model using Restricted Boltzmann Machines.
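The cropping and resizing step can be illustrated with the sketch below. It is an assumption-laden example rather than the procedure used in this study: the crop is taken around the image center, the resize uses nearest-neighbor sampling, and the names `center_crop_square`, `resize_nearest`, `preprocess` and `CROP_SIDE` are hypothetical. Only the source and target sizes come from the text.

```python
import numpy as np

# (height, width) of each Messidor resolution -> side of the square crop.
CROP_SIDE = {(960, 1440): 900, (1488, 2240): 1380, (1536, 2304): 1452}

def center_crop_square(image, side):
    """Cut a side x side window around the image center (assumed placement)."""
    height, width = image.shape[:2]
    top = (height - side) // 2
    left = (width - side) // 2
    return image[top:top + side, left:left + side]

def resize_nearest(image, size):
    """Resize to size x size by nearest-neighbor sampling (assumed method)."""
    height, width = image.shape[:2]
    rows = np.arange(size) * height // size
    cols = np.arange(size) * width // size
    return image[rows[:, None], cols]

def preprocess(image, out_size=256):
    side = CROP_SIDE[image.shape[:2]]
    return resize_nearest(center_crop_square(image, side), out_size)

# Blank stand-in for a 1440x960 fundus image (width x height), 3 color planes.
fundus = np.zeros((960, 1440, 3), dtype=np.uint8)
small = preprocess(fundus)
```

In practice an interpolating resize from an image library would usually give smoother results than nearest-neighbor sampling.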
The sample data from the dataset are divided into a training dataset and a validation dataset. From each class, 60% of the data is taken for training and 40% is used for validation. We used a cross-validation training method with leave-one-out, adopted from Cawley and Talbot (2003). The leave-one-out cross-validation resulted in seven times faster training time as well as a relatively lower error rate than the k-fold cross-validation.
We calculated the ratio between arterial and venous width (AVR) for 89 retinal image samples by adopting the methods of Hubbard et al. (1999) and Bhuiyan et al. (2013). The steps are segmenting the retinal blood vessels, measuring the AVR and labeling the retinal images into nine classes based on AVR for model training, using a modification of the HR categories of Abbasi and Akram (2014). Table 1 shows the proposed new categorization of hypertensive retinopathy based on AVR.
Table 2 shows the results of the data labeling process. The number of retinal images per class is not balanced, so duplication and augmentation are used to add data to classes that have fewer than 133 images (class labels 0 to 5) or fewer than 134 images (class labels 6 to 8). For classes that exceed 133 images (class labels 0 to 5) or 134 images (class labels 6 to 8), the amount of data is reduced, so that a balanced dataset is obtained that is ready for model training.
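The balancing and per-class split described above can be sketched as follows. The per-class targets (133 for classes 0 to 5, 134 for classes 6 to 8) and the 60/40 split come from the text; everything else is an assumption made for illustration, including the toy file names, the use of plain duplication in place of the unspecified augmentation transforms, and the function names `balance` and `split_per_class`.

```python
import random

# Target image count per class label, as stated in the text.
TARGET = {label: 133 if label <= 5 else 134 for label in range(9)}

def balance(samples_by_class, targets, seed=0):
    """Reduce large classes and pad small ones up to the target count."""
    rng = random.Random(seed)
    balanced = {}
    for label, samples in samples_by_class.items():
        target = targets[label]
        if len(samples) >= target:
            # Too many images: keep a random subset.
            balanced[label] = rng.sample(samples, target)
        else:
            # Too few images: duplicate random members (stand-in for augmentation).
            extra = [rng.choice(samples) for _ in range(target - len(samples))]
            balanced[label] = list(samples) + extra
    return balanced

def split_per_class(balanced, train_fraction=0.6, seed=0):
    """Split every class separately into training and validation parts."""
    rng = random.Random(seed)
    train, validation = [], []
    for label, samples in balanced.items():
        shuffled = list(samples)
        rng.shuffle(shuffled)
        cut = round(train_fraction * len(shuffled))
        train += [(name, label) for name in shuffled[:cut]]
        validation += [(name, label) for name in shuffled[cut:]]
    return train, validation

# Toy, deliberately unbalanced labeling result with hypothetical file names.
toy = {label: [f"img_{label}_{i}" for i in range(5 + 40 * label)] for label in range(9)}
balanced = balance(toy, TARGET)
train, validation = split_per_class(balanced)
```

With these targets the balanced set holds 6 × 133 + 3 × 134 = 1200 images, and a 60/40 split per class leaves 480 images for validation.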
In this study, we used RBM for the classification of hypertension retinopathy based on retinal images. Figure 3 shows an illustration of the architecture of the RBM model for image classification, where v is the visible layer, h is the hidden layer, D is the number of visible units, P is the number of hidden units and the training dataset consists of N vectors. The input of the RBM model is a retinal color image. The intensity value of each image pixel is read and converted into a value between 0 and 1 and then becomes the input of a visible node, so the number of visible nodes corresponds to the number of pixels of the input image. The first iteration then adjusts the connection weights between each visible node and each hidden node until the output of the hidden nodes is obtained, which in turn updates the values of the visible nodes. The process is repeated for each subsequent iteration until the last epoch.
The training phase consists of setting the model parameters and running the experiment scenarios for the model architecture. The testing phase evaluates the model produced by the training phase. In this phase, a test set of 30 samples from the MESSIDOR database was used; these samples were not used in the model training process. The dataset consists of nine classes of hypertensive retinopathy that have been categorized and labeled. The number of epochs is 20, the batch size is 30 and the 30 test images are selected randomly.
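As a small illustration of how an image becomes the visible vector v, the sketch below scales 8-bit intensities to the range 0 to 1 and flattens the image. Dividing by 255 is an assumption based on the 8 bits per color plane of the source images, and the random image, the 64×64×3 size and the name `to_visible_vector` are used only for the example.

```python
import numpy as np

def to_visible_vector(image_uint8):
    """Scale 8-bit intensities to [0, 1] and flatten to one value per visible node."""
    return image_uint8.astype(np.float64).reshape(-1) / 255.0

rng = np.random.default_rng(0)
# Random stand-in for a preprocessed 64x64 color retinal image.
image = rng.integers(0, 256, size=(64, 64, 3)).astype(np.uint8)

v = to_visible_vector(image)
D = v.size               # number of visible units
P = 1500                 # number of hidden units used in the experiments
weight_shape = (D, P)    # one weight per visible-hidden pair
```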
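The algorithm of Restricted Boltzmann Machines follows Salakhutdinov and Hinton (2009). Within each outer iteration, an inner loop runs to completion (EndFor); the weights are then updated and the learning rate α_t is decremented before the next outer iteration begins (EndFor).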
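Because only the loop structure, the weight update and the learning rate decrement are stated here, the sketch below fills in the remaining steps with one-step contrastive divergence (CD-1), the learning method mentioned in the Introduction. It should be read as an assumption, not as the listing from the cited work: the multiplicative decay factor `lr_decay`, the per-batch placement of the update, the weight initialization and the toy binary data are all hypothetical. The epoch count, batch size and learning rate default to the values used in the experiments.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, epochs=20, batch_size=30, lr=0.05, lr_decay=0.95, seed=0):
    """Train a binary RBM with CD-1 and return weights, biases and per-epoch error."""
    rng = np.random.default_rng(seed)
    n_visible = data.shape[1]
    W = rng.normal(0.0, 0.01, size=(n_visible, n_hidden))
    b_v = np.zeros(n_visible)
    b_h = np.zeros(n_hidden)
    errors = []
    for epoch in range(epochs):
        order = rng.permutation(len(data))
        squared_error = 0.0
        for start in range(0, len(data), batch_size):
            v0 = data[order[start:start + batch_size]]
            # Positive phase: hidden probabilities and sampled states from the data.
            p_h0 = sigmoid(v0 @ W + b_h)
            h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
            # Negative phase: reconstruct the visible layer, then the hidden layer.
            v1 = sigmoid(h0 @ W.T + b_v)
            p_h1 = sigmoid(v1 @ W + b_h)
            # Update weights and biases from the difference of the two phases.
            n = len(v0)
            W += lr * (v0.T @ p_h0 - v1.T @ p_h1) / n
            b_v += lr * (v0 - v1).mean(axis=0)
            b_h += lr * (p_h0 - p_h1).mean(axis=0)
            squared_error += float(np.sum((v0 - v1) ** 2))
        errors.append(squared_error / data.size)
        # Decrement the learning rate (assumed multiplicative schedule).
        lr *= lr_decay
    return W, b_v, b_h, errors

# Toy data: 90 noisy-free copies of three random binary patterns.
rng = np.random.default_rng(1)
prototypes = (rng.random((3, 48)) < 0.5).astype(float)
data = prototypes[rng.integers(0, 3, size=90)]
W, b_v, b_h, errors = train_rbm(data, n_hidden=16)
```

The returned `errors` list holds the mean squared reconstruction error of each epoch, which is the kind of curve plotted against the epoch number in the Results section.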

Results
The experiments were carried out on a laptop with an Intel Core i7-7500U processor, 12 GB RAM, an NVIDIA GeForce GTX 960 GPU and the Windows 10 operating system, using the Python 3.6 programming language in a Jupyter notebook. Table 3 shows the results of the training experiments for four types of RBM models. The number of visible nodes in each RBM model follows the input image size: 28×28×3, 64×64×3, 128×128×3 and 256×256×3. The four RBM models use the same number of hidden layer nodes, 1500 units, and a learning rate of 0.05.

Experimental Results Using a Different Image Size
The four RBM models train well, with training and validation accuracy both above 98%. The difference in accuracy among the four RBM models is small, less than 0.19%. From these empirical facts, it can be concluded that the size of the input image does not significantly affect the accuracy of the RBM model training results.
As for the training time, it is strongly related to the size of the input image: the larger the input image, the greater the number of visible nodes of the RBM model and the longer the training time. The testing results show that the more visible nodes the model has, the lower its testing accuracy. Figure 4 shows a graph of the training results of the four RBM models with varying input image sizes. The blue line is the training error level from epoch 0 to epoch 19, or 20 epochs in total, while the green line is the validation error level from epoch 0 to epoch 19. From the four graphs, it appears that from epoch 0 to epoch 3 the validation error rate is relatively lower than the training error rate, which indicates overfitting; after the third epoch, the training and validation error rates follow the same trend until the 20th epoch.
The smaller the input image, and therefore the smaller the number of visible nodes, the faster the training and validation errors of the RBM model decrease. Finally, the training error level and the validation error level of the four types of RBM models converge after the fifth epoch. Based on the three facts above, the next experiment scenarios use the second model, namely the RBM model with an input size of 64×64×3 pixels. Each pixel value of the input image is read by one visible node, so the total number of visible nodes is 12288.
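The link between input size and model size can be checked with simple arithmetic. The short sketch below counts the visible nodes for the four input sizes and the visible-to-hidden weights for 1500 hidden nodes; the function names `visible_nodes` and `weight_count` are illustrative only.

```python
def visible_nodes(height, width, channels=3):
    """One visible node per pixel value of a color image."""
    return height * width * channels

def weight_count(n_visible, n_hidden=1500):
    """One weight for every visible-hidden pair."""
    return n_visible * n_hidden

for side in (28, 64, 128, 256):
    n_visible = visible_nodes(side, side)
    print(f"{side}x{side}x3: {n_visible} visible nodes, {weight_count(n_visible)} weights")
```

For the selected 64×64×3 input this gives 12288 visible nodes, while the 256×256×3 input needs 16 times as many, which is consistent with the longer training time reported for larger images.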

Experimental Results Using a Different Number of the Hidden Nodes
In this experiment, we compared four RBM models with different numbers of hidden nodes: 500, 1000, 1500 and 2000. Each RBM model is trained for 20 epochs using the Messidor dataset with a learning rate of 0.05. The training dataset contains 1200 retinal images with dimensions of 64×64×3 pixels; 40% of the data, or 480 images, is used for validation and a sample of 30 images is used for testing the RBM model.
Table 4 shows three facts. First, the more hidden nodes, the lower the training accuracy and the validation accuracy of the RBM model. Second, the more hidden nodes, the longer the training process. Third, the more hidden nodes, the lower the testing accuracy of the RBM model, although the difference is very small.
Figure 5 shows a graph of the training results of the four RBM models with varying numbers of hidden nodes. The blue line is the training error level from epoch 0 to epoch 19, while the green line is the validation error level from epoch 0 to epoch 19. From the four graphs, it appears that the performance of the RBM model is almost the same as in the trials with varying numbers of visible nodes: from epoch 0 to epoch 5 the validation error rate is relatively lower than the training error rate, which indicates overfitting, but after the fifth epoch the training and validation errors follow the same trend until the 19th epoch. The fewer the hidden nodes, the faster the training and validation errors of the RBM model decrease. Finally, the training error rate and the validation error rate of the four types of RBM models converge after the tenth epoch.

Experimental Results Using a Different Learning Rate
This section describes the results of experiments on an RBM model with 12288 visible nodes and 1500 hidden nodes. We tried four learning rate values: 0.5, 0.05, 0.005 and 0.0005. The RBM model is trained for 20 epochs using the Messidor dataset. The training dataset contains 1200 retinal color images with dimensions of 64×64 pixels; 40% of the data, or 480 images, is used for validation and a sample of 30 images is used for testing the RBM model.
Table 5 shows that, down to a learning rate of 0.005, the training, validation and testing accuracy is relatively stable above 98%, but at a learning rate of 0.0005 the training, validation and testing accuracy drops drastically to the range of 39%. This empirical result suggests that a suitable learning rate for the RBM model is 0.005 or greater. The training time of the RBM models is around 6 min, except for the model with a learning rate of 0.05, whose training time is 8.44 min; that model nevertheless has the highest training, validation and testing accuracy among the four RBM models tested.
From the three experimental scenarios and the empirical data of the experimental results, the RBM model with 1500 hidden nodes and a learning rate of 0.05 performs best. The analysis and discussion of the experimental results of the Retinopathy Hypertension Classification Model using RBM lead to two conclusions. First, the model can reconstruct the input image into one of the image classes with a relatively small error rate. Second, the RBM training time is relatively faster than the other model.
The model that we propose from the results of this study is still very open for further development. The results of this study are useful in four ways. First, they provide a new dataset for classification of hypertensive retinopathy into nine classes, which can be used as a standard dataset for other researchers to test their proposed models. Second, the RBM classification model can be applied to the classification of noisy retinal images, because the RBM model is capable of reconstructing images. Third, the model we propose can be applied to the classification of other medical images, such as images of the prostate, lungs and others. Fourth, the model that we propose can be developed into a tool that assists ophthalmologists in the diagnosis and early detection of hypertensive retinopathy based on the patient's retinal image.

Discussion
The contributions of this study are as follows. First, it provides a new dataset of hypertensive retinopathy consisting of nine classes according to the degree of severity, with AVR as the indicator for class categorization and labeling, built from retinal images taken from the Messidor database. Second, the experimental results of the Retinopathy Hypertension Classification Model using RBM show that the model can reconstruct the input image into one of the image classes with a relatively small error rate.
Regarding the comparison of hypertensive retinopathy classification results between the previous research methods and the method we propose, as presented in Table 1, most previous research extracted the AVR feature through segmentation, so those methods depend on the feature extraction algorithm. The method we use is a deep learning approach with RBM, where the input is the retinal image and feature extraction is carried out by the model during image classification. Our method is shown to produce better accuracy.
The limitations of this study are as follows. First, the output is a reconstructed image, not a class label, so it is still necessary to add a classification layer such as Softmax or a Support Vector Machine (SVM) so that the output takes the form of class labels, as required for the classification of hypertensive retinopathy. Second, the accuracy of the model is strongly influenced by the amount of labeled training data: the greater the amount of labeled training data, the higher the model accuracy. In our study we used a sample of only 89 labeled training images, which were then expanded to 1200 images using the duplication and augmentation technique, taking into account the balance of each training data class, as presented in Table 2.
Our future work is to develop a hypertensive retinopathy classification model architecture that combines the RBM model with CNN and other machine learning methods, to increase the amount of training data and to involve experts in image labeling, which is expected to improve model performance. The next stage is implementing the model by building an interface in a mobile application to support telemedicine.
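To make the first limitation concrete, the sketch below shows one way a Softmax layer could sit on top of the RBM hidden layer to turn hidden activations into one of nine class labels. It is a hypothetical illustration and not part of the model evaluated in this study: the classifier parameters `U` and `c` are untrained random values that would have to be fitted on labeled data, and the layer sizes are toy values.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(z):
    # Subtract the row maximum for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def classify(v, W, b_hidden, U, c):
    """RBM hidden activations followed by a softmax layer over the classes."""
    hidden = sigmoid(v @ W + b_hidden)
    probs = softmax(hidden @ U + c)
    return probs.argmax(axis=1), probs

rng = np.random.default_rng(0)
n_visible, n_hidden, n_classes = 48, 16, 9          # toy sizes, nine HR classes
W = rng.normal(0.0, 0.01, size=(n_visible, n_hidden))   # stands in for trained RBM weights
b_hidden = np.zeros(n_hidden)
U = rng.normal(0.0, 0.01, size=(n_hidden, n_classes))   # hypothetical softmax weights
c = np.zeros(n_classes)

labels, probs = classify(rng.random((5, n_visible)), W, b_hidden, U, c)
```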

Conclusion
This research develops a classification model of hypertension retinopathy using RBM. The experimental results show that the model performs well at reconstructing images, with an accuracy of 99.05%, meaning that the model can generalize an input image into one of nine output classes. However, the output of the model is still an image, so the model needs to be combined with a layer such as Softmax to produce class labels. Our next research plan is to develop a classification model for hypertension retinopathy that combines Restricted Boltzmann Machines with Convolutional Neural Networks to obtain better classification results in the form of class labels.