Signal Domain in Respiratory Sound Analysis: Methods, Application and Future Development

Corresponding Author: Achmad Rizal, Department of Electrical Engineering and Information Technology, Universitas Gadjah Mada, Yogyakarta, Indonesia. Email: rizal.s3te14t@ugm.ac.id

Abstract: Advances in digital signal processing technology have encouraged researchers to develop better methods for automatic lung sound recognition than the existing ones. Lung sounds were originally assessed manually, relying on the doctor's expertise, and signal processing techniques are intended to reduce this subjectivity. Researchers develop these techniques according to their own view of the lung sound signal: some work in the time domain, while others work in the frequency domain or combine several signal domains. This paper describes the sensors, the datasets, the feature extraction techniques and the classifiers used in systems developed by previous researchers. The final section describes possible future developments and potential applications of lung sound analysis.


Introduction
Auscultation is a standard procedure for assessing the health of the respiratory organs. Although it has several disadvantages (Melbye, 2001), it remains in use because of its advantages (Pasterkamp et al., 1997). With advances in digital signal processing, lung sounds can be recorded, processed and analyzed so that they can be classified automatically. Such a system is called computerized respiratory sound analysis (CORSA) (Sovijärvi et al., 2000). Research on lung sound analysis remains active because the number of patients with lung disease increases from year to year (Buist et al., 2007). Many researchers have developed digital signal processing techniques for lung sound analysis, and many papers have reviewed these techniques from a variety of viewpoints.
A review of lung sound digital signal processing with a very broad scope is presented by Earis and Cheetam (2000). That paper discusses all stages of lung sound analysis and counts how many studies address particular kinds of lung sounds. Reichert et al. (2008) presented a similar paper with more recent data, discussing the markers of each adventitious lung sound and the methods used by previous researchers. Palaniappan et al. (2013a) discussed lung sound research according to the analysis technique: visual analysis, statistical analysis and analysis using machine learning. Palaniappan et al. (2013) investigated the performance of a variety of machine learning techniques in lung sound analysis and showed that hybrid machine learning improves lung sound classification performance. A study of a particular case is given by Shaharum et al. (2012), who discussed various techniques for detecting wheeze in patients with asthma. A review organized by the signal domain of the processing method has not been done before. The signal domain in which processing is performed indicates where a researcher considers the valuable information to lie, and each researcher has their own considerations in choosing the signal domain for lung sound feature extraction.
This paper discusses lung sound classification methods according to signal domain. Signal processing techniques may be divided into the time domain, the frequency domain and the time-frequency domain. The wavelet domain is treated as a class of its own because, in principle, it differs from the time-frequency domain. Many researchers use methods from more than one signal domain, for example the time domain together with the frequency domain, which suggests that a single domain alone is sometimes insufficient for lung sound processing. Reviewing previous work in this way is expected to make it easier to develop signal processing methods for detecting abnormalities of the lungs and respiratory organs from lung sounds.

Respiratory Sound Overview
Lung Sound Classification
Respiratory sounds are produced by turbulent air flow in the respiratory tract. On inspiration, air moves through progressively narrower airways towards the alveoli at the end of the tract. When the air hits the walls of the respiratory tract, it becomes turbulent and produces sound. On expiration, the air flows in the opposite direction, towards wider airways. Less turbulence occurs in this phase, so normal expiration produces a quieter sound than inspiration (Pasterkamp et al., 1997).
The first attempt to analyze lung sounds quantitatively was made by McKusick (McKusick et al., 1955; Bohadana et al., 2014). Although that attempt was made long ago, research on lung sound signal processing methods continues today. Traditionally, lung sounds were analyzed according to intensity, pitch, location and the ratio of inspiration to expiration. Table 1 shows the classification of lung sounds and their types.

Normal Lung Sound
Normal lung sounds are produced by healthy lungs at particular locations. They are divided into four types, named after the locations where they are heard.

Tracheal Sound
The tracheal sound is heard over the trachea, in the upper airway. In practice it is rarely used in routine auscultation. It has a high pitch, and its inspiratory and expiratory phases have the same length (Bohadana et al., 2014).

Bronchial Sound
The bronchial sound is heard over the bronchi, the branches of the airway. It has a high pitch and a pause between the inspiratory and expiratory phases, and the expiratory phase lasts longer than the inspiratory phase. If the bronchial sound is heard elsewhere on the lung surface, it indicates a lung disorder (Pasterkamp et al., 1997).

Bronchovesicular Sound
The bronchovesicular sound has medium intensity and pitch, and its inspiratory and expiratory phases have the same length. It is heard over the upper chest wall. If it is audible everywhere, it usually indicates an area of consolidation (Loudon and Murphy, 1984).

Vesicular Sound
Vesicular breath sounds are the most common normal lung sound and are heard over almost the entire lung surface. The sound is soft and low-pitched, and the inspiratory sound is longer than the expiratory sound (Bohadana et al., 2014). The vesicular sound may be harsher and audible for somewhat longer during rapid, deep ventilation (for example after exercise) or in children, who have thinner chest walls.

Abnormal Lung Sound
Abnormal lung sounds arise in two conditions. The first is when the bronchial sound is heard at an improper location, which indicates consolidation of the lungs; in this case there is usually fluid in the lungs. The second is when lung sounds have low intensity or disappear altogether, which indicates that the respiratory tract is blocked by fluid or a foreign object.

Adventitious Lung Sound
Adventitious (additional) lung sounds are of two kinds: Continuous Adventitious Sound (CAS) and Discontinuous Adventitious Sound (DAS). Each kind is divided into two types. Each adventitious lung sound is explained in detail below.

Wheeze
CAS, often called wheeze, is a continuous, high-pitched, sighing sound that is usually heard on expiration and sometimes on inspiration. It occurs when air flows through airways narrowed by secretions, a foreign body or an injury that obstructs air flow (Abbasi et al., 2013). Wheeze can occur in the inspiratory phase, the expiratory phase or both. Some references split CAS into two categories, wheeze and rhonchi, based on pitch. High-pitched wheeze is called stridor, while low-pitched wheeze is called rhonchi. Wheeze usually occupies a frequency range of 400-600 Hz and lasts more than 100 ms. Abnormalities associated with wheeze include asthma, Congestive Heart Failure (CHF), chronic bronchitis and pulmonary edema.

Crackle
Crackles are discontinuous, nonmusical, short and explosive sounds, heard more often on inspiration. They are classified as fine crackles and coarse crackles. A fine crackle has a high pitch, high intensity and a very short duration. It occurs when a narrowed airway that closed during the previous respiratory cycle suddenly opens. A coarse crackle has a lower intensity and a lower pitch than a fine crackle, and its duration is longer. It usually occurs at the beginning of inspiration and sometimes during expiration. Coarse crackles occur when there is fluid in the respiratory tract (Bohadana et al., 2014). Health problems associated with crackles include ARDS, asthma, bronchiectasis, chronic bronchitis, consolidation, early CHF, interstitial lung disease and pulmonary edema.

Lung Sound Recording and Lung Sound Database
Studies on lung sounds generally draw on two sources of data: recordings made directly from patients in hospital and recordings taken from a database. Reyes et al. (2008) recorded lung sounds from patients with interstitial pneumonia using electret microphones and a pneumotachometer. Yamashita et al. (2014) recorded lung sounds from patients with pulmonary emphysema and from normal subjects using a piezoelectric microphone. Lung sound recordings are usually taken from patients with a particular lung disease, with normal subjects as controls.
Some lung sound databases are available on CD or as files accessible over the internet. One database often used in lung sound studies is the Rale database, used in (Palaniappan and Sundaraj, 2013) and (Mayorga et al., 2012). Another available database is the Marburg Respiratory Sound (MARS) database, and data available on the internet are used in (Jain and Vepa, 2008).

Sensor Types and Sensor Placement
The most common device for acquiring lung sounds is the electronic stethoscope (Hashemi et al., 2011; Maciuk et al., 2012; Emmanouilidou et al., 2012; Lin et al., 2006; Ayari et al., 2012; İçer and Gengeç, 2014). The stethoscope is the primary device for auscultation, and an electronic stethoscope makes it possible to record and analyze lung sounds. Some researchers use a microphone slightly modified so that it can be placed on the chest (Taplidou and Hadjileontiadis, 2007; Jin et al., 2008; Reichert et al., 2008; Alsmadi and Kahya, 2002). One of the most frequently used microphones is the ECM from Sony. Several researchers used a piezoelectric contact microphone or a condenser microphone to acquire lung sounds (Lozano et al., 2013; Xu et al., 1998). The stethoscope is often combined with a pneumotachometer to measure air flow and identify inspiration and expiration (Taplidou and Hadjileontiadis, 2007; Ponte et al., 2013; Aydore et al., 2009). Other additional devices include a PVT spirometer, an accelerometer and a flowmeter (Homs-Corbera et al., 2000; Gnitecki and Moussavi, 2005; Kahya et al., 2006).
Besides the type of device, the number of devices used to record lung sounds also varies, from a single electronic stethoscope (Emmanouilidou et al., 2012; Ayari et al., 2012) to a 5×5 microphone array mounted on the chest (Reyes et al., 2008) or a chest belt containing seven electronic stethoscopes (Becker et al., 2013).

Analogue Prefiltering
Analog prefiltering is intended to remove unwanted frequency components, such as the DC component or high-frequency components, or to serve as an anti-aliasing filter. Mayorga et al. (2012) use a BPF built from a 7.5 Hz HPF and a 2.5 kHz LPF. Alsmadi and Kahya (2002) use a BPF with a different bandwidth, 90-1200 Hz. Other choices are a 1 kHz LPF (Hadjileontiadis, 2009) or a 75 Hz HPF to reduce heart sound (Charleston-Villalobos et al., 2007). The choice of passband depends on the lung sound to be analyzed. Hadjileontiadis uses a 1 kHz LPF because the lung sounds to be processed are crackles, which have frequencies below 1000 Hz (Hadjileontiadis, 2009). Mayorga et al. (2012), in contrast, analyze asthma, crackle, wheeze, stridor and normal sounds, and some of the data used in that study have frequency components above 1000 Hz.

Sampling Frequency
In digital signal processing, the sampling frequency plays a significant role. It determines the bandwidth that can be processed and may limit the noise that enters the signal (Lu et al., 2013). The design of the filter also depends on the selected sampling frequency. The standard sampling frequency for music is 44100 Hz (Fs). Although this rate is higher than lung sounds (<2500 Hz) require, some researchers use Fs for lung sound acquisition (Jin et al., 2008; Emmanouilidou et al., 2012). Other researchers use Fs/2, Fs/4 or Fs/8 (Taplidou and Hadjileontiadis, 2007; Kandaswamy et al., 2004; Lin et al., 2006; Uysal et al., 2014).
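The relation between these sampling rates and the usable bandwidth can be checked with a short sketch. It assumes NumPy and SciPy are available; the 400 Hz test tone and the helper name `downsample` are illustrative choices, not taken from the cited studies.

```python
import numpy as np
from scipy.signal import decimate

FS_AUDIO = 44100  # Hz, the audio-standard sampling frequency (Fs)


def downsample(x, factor):
    """Anti-alias filter x, then keep every `factor`-th sample."""
    # An FIR filter applied with zero phase avoids phase distortion.
    return decimate(x, factor, ftype="fir", zero_phase=True)


# Illustrative input: a 1 s, 400 Hz tone sampled at Fs.
t = np.arange(FS_AUDIO) / FS_AUDIO
tone = np.sin(2 * np.pi * 400 * t)

for factor in (1, 2, 4, 8):
    fs_new = FS_AUDIO / factor
    y = tone if factor == 1 else downsample(tone, factor)
    # The Nyquist limit fs_new / 2 is the highest representable frequency.
    print(f"Fs/{factor}: fs = {fs_new:.1f} Hz, "
          f"Nyquist = {fs_new / 2:.2f} Hz, samples = {len(y)}")
```

Even at Fs/8 (5512.5 Hz) the Nyquist limit is about 2756 Hz, which still covers the <2500 Hz band quoted above for lung sounds.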

Denoising Methods
One of the problems in lung auscultation with a stethoscope is noise. One of the most significant noises that cannot be eliminated directly is heart sound. Heart sounds arise from the opening and closing of the heart valves as the heart pumps blood. Their presence in lung sound recordings cannot be avoided because the heart keeps beating during the recording. Heart sound occupies a frequency range of 20-150 Hz, which overlaps with the low-frequency components of lung sound (Hadjileontiadis and Panas, 1997). Heart sounds and lung sounds follow different patterns, so the heart sounds appear at different points in each phase of respiration (Al-Naggar, 2013). The simplest way to reduce heart sound is an HPF with a cut-off frequency of 70-100 Hz or a 100-2000 Hz BPF (Homs-Corbera et al., 2000). More complex techniques include adaptive filtering, higher-order statistics, independent component analysis, fractal-based methods and others (Gnitecki and Moussavi, 2003; Ahlstrom et al., 2005; Hadjileontiadis and Panas, 1997; Chien et al., 2006).
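The simple high-pass approach can be sketched as follows. This is a minimal digital example, assuming SciPy is available; the fourth-order Butterworth design, the 100 Hz cut-off and the synthetic 40 Hz "heart-like" and 400 Hz "lung-like" tones are illustrative assumptions rather than the settings of any cited study.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt


def highpass(x, fs, cutoff_hz=100.0, order=4):
    """Zero-phase Butterworth high-pass filter."""
    sos = butter(order, cutoff_hz, btype="highpass", fs=fs, output="sos")
    # Forward-backward filtering cancels the phase shift of the filter.
    return sosfiltfilt(sos, x)


fs = 8000                                        # Hz, assumed sampling rate
t = np.arange(2 * fs) / fs
heart_like = np.sin(2 * np.pi * 40 * t)          # inside the 20-150 Hz band
lung_like = 0.5 * np.sin(2 * np.pi * 400 * t)    # stand-in for lung sound
filtered = highpass(heart_like + lung_like, fs)

mid = slice(fs // 2, -fs // 2)                   # ignore edge transients
residual = np.sqrt(np.mean((filtered[mid] - lung_like[mid]) ** 2))
print(f"RMS difference from the lung-like tone: {residual:.5f}")
```

Because heart sound overlaps the low-frequency part of lung sound, a fixed cut-off of this kind also removes lung sound components below it, which is one reason the more complex techniques listed above were developed.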
Another noise that often arises is the swallowing sound, produced by the body's mechanism for preparing food for consumption while avoiding aspiration (entry of food or liquid into the airway) (Lazareck and Moussavi, 2002). The patient's swallowing sound appears in the lung sound recording when the patient is nervous or when the recording takes too long. These sounds can be removed by signal processing based on, for example, root mean square calculations, average power and fractal measures (Aboofazeli and Moussavi, 2004; 2005). Other noises that can interfere with lung sound recordings include movement of the stethoscope, conversation between the patient and the physician and the crying of infant patients (Emmanouilidou and Elhilali, 2013).

Respiratory Sound Signal Processing
To ease comparison of lung sound recognition methods, the methods of previous researchers are grouped by signal processing domain. For each study, the sensors, the data set, the method, the extracted features and the classification technique are described. Some studies do not report system performance because they only measure or test the characteristics of lung sounds.

Time Domain Signal Processing
In time-domain signal processing research, the most widely used sensor for direct data acquisition is the electret microphone, followed by the electronic stethoscope. Some studies instead use databases from the internet as input. Autoregressive modeling (AR modeling) (Alsmadi and Kahya, 2002; Kahya et al., 1999) and Empirical Mode Decomposition (EMD) (Charleston-Villalobos et al., 2007; Lozano et al., 2013) are among the most widely used methods. A more specific method is used by Ayari et al. (2012), where crackles are recognized from crackle parameters such as Initial Deflection Width (IDW) and Largest Deflection Width (LDW). In the classification stage, commonly used methods include the Back-Propagation Neural Network (BP NN) and K-means clustering. Several studies take an empirical approach to show the difference between two types of lung sounds (Lozano et al., 2013; Castañeda-Villa et al., 2013): the differences between the data classes are shown only through graphs or plots, so that the processing results are assessed visually. Lung sound studies that use time domain signal processing are listed in Table 2.
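AR modeling represents each sample as a weighted sum of previous samples plus noise, and the weights serve as features. The sketch below estimates them with the Yule-Walker equations, which is one common estimator and not necessarily the one used in the cited studies; the function name `ar_yule_walker` and the synthetic AR(2) test signal are assumptions made for illustration.

```python
import numpy as np


def ar_yule_walker(x, order):
    """Estimate a[1..p] in x[n] = sum_k a[k] * x[n-k] + e[n]."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    # Biased autocorrelation estimates r[0], ..., r[order].
    r = np.array([np.dot(x[: n - k], x[k:]) / n for k in range(order + 1)])
    # Toeplitz autocorrelation matrix of the Yule-Walker equations.
    R = np.array([[r[abs(i - j)] for j in range(order)]
                  for i in range(order)])
    coeffs = np.linalg.solve(R, r[1:])
    noise_var = r[0] - np.dot(coeffs, r[1:])
    return coeffs, noise_var


# Illustrative check on a synthetic AR(2) process (not lung sound data).
rng = np.random.default_rng(0)
true_coeffs = np.array([0.75, -0.5])
e = rng.standard_normal(20000)
x = np.zeros_like(e)
for i in range(2, len(e)):
    x[i] = true_coeffs[0] * x[i - 1] + true_coeffs[1] * x[i - 2] + e[i]

est_coeffs, est_var = ar_yule_walker(x, order=2)
print("estimated coefficients:", est_coeffs)
print("estimated noise variance:", est_var)
```

In a lung sound setting the same function would be applied to short frames of the recording, and the coefficient vector of each frame would be passed to the classifier.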

Frequency Domain Signal Processing
Lung sound signal processing in the frequency domain is the approach least used by researchers. Because lung sounds are non-stationary, frequency analysis alone cannot show how their frequency components change over time (Mondal et al., 2014). Nevertheless, several frequency-domain methods have been proposed. Mayorga et al. (2012) use a quantile vector as the lung sound feature. The quantile vector is calculated from the FFT of 400 ms of signal, taking the frequencies at the octile coefficients 0.125, 0.250, ..., 0.875. The distribution of the quantile vectors over all frames is used to form a codebook using Gaussian Mixture Models (GMM). Another method based on the Fourier transform is used by Xu et al. (1998) and by Wang et al. (2012); both groups used cepstral analysis to analyze lung sounds. Other studies analyze the frequency spectrum using Welch spectra, spectra DT or PSD calculation by autoregressive modeling (AR modeling) (Jané et al., 2004; Oud et al., 2000). The results show that frequency analysis produces features that can distinguish normal from abnormal lung sounds with high accuracy. Table 3 summarizes research on lung sound using frequency domain signal processing.
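The octile frequencies described above can be sketched as the frequencies below which 1/8, 2/8, ..., 7/8 of the spectral power of a 400 ms frame lies. This reading of the quantile vector, the Hann window, the 8000 Hz sampling rate and the function name `quantile_frequencies` are assumptions made for illustration; the exact definition in Mayorga et al. (2012) may differ.

```python
import numpy as np

OCTILES = np.arange(1, 8) / 8.0  # 0.125, 0.250, ..., 0.875


def quantile_frequencies(frame, fs, quantiles=OCTILES):
    """Frequencies at which the cumulative spectral power reaches each quantile."""
    windowed = frame * np.hanning(len(frame))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / fs)
    cumulative = np.cumsum(power) / np.sum(power)
    # First frequency bin whose cumulative power reaches each quantile.
    idx = np.searchsorted(cumulative, quantiles)
    return freqs[idx]


fs = 8000                        # Hz, assumed sampling frequency
n = 3200                         # samples in a 400 ms frame at 8000 Hz
t = np.arange(n) / fs
frame = np.sin(2 * np.pi * 400 * t)   # synthetic tone, not lung sound data
print(quantile_frequencies(frame, fs))
```

For a pure tone all seven octile frequencies cluster around the tone frequency, whereas a broadband sound spreads them across the spectrum, which is what makes the vector usable as a feature.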

Time-Frequency Domain Signal Processing
Given the non-stationary nature of lung sounds, time-frequency domain (TF domain) analysis is a more appropriate choice. One of the most widely used methods is the Short-Time Fourier Transform (STFT), a Fourier transform performed on one segment of the data at a time and formulated as in Equation 1, where w(t-τ) is the window function and e^(-j2πmfτ) is the complex sinusoid that maps the signal into the frequency domain. Features extracted from the STFT result include peak frequency (Rizal and Suryani, 2008), local maxima, peak coexistence, discontinuity (Taplidou and Hadjileontiadis, 2007), mean, amplitude deviation, local maximum, discontinuity criteria (Taplidou et al., 2003), mean and median frequency, spectral crest factor, entropy, relative power factor, high order frequency moment (Morillo et al., 2013) and so on. Another approach is to treat the STFT as an image and then apply image processing techniques (Rizal et al., 2009). The advantages of the STFT are that it is computationally simple and makes it easy to observe the frequency content of the signal at each time. Its drawbacks are its relatively low resolution and the uncertainty about when a frequency occurs, because the frequencies are calculated over fixed intervals.
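A frame-based STFT and the peak-frequency feature can be sketched with NumPy alone. The frame length, hop size, Hann window and the two-tone test signal are illustrative assumptions, and `stft` and `peak_frequency_track` are hypothetical helper names rather than functions from any cited work.

```python
import numpy as np


def stft(x, fs, frame_len=512, hop=256):
    """Windowed FFT of successive, overlapping frames of x."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop: i * hop + frame_len] * window
                       for i in range(n_frames)])
    spectrum = np.fft.rfft(frames, axis=1)
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)
    times = (np.arange(n_frames) * hop + frame_len / 2) / fs  # frame centres
    return freqs, times, spectrum


def peak_frequency_track(x, fs, **kwargs):
    """Frequency of the largest spectral magnitude in each frame."""
    freqs, times, spectrum = stft(x, fs, **kwargs)
    return times, freqs[np.argmax(np.abs(spectrum), axis=1)]


# Synthetic signal: 300 Hz for the first second, 600 Hz for the second.
fs = 4000
t = np.arange(2 * fs) / fs
x = np.where(t < 1.0, np.sin(2 * np.pi * 300 * t), np.sin(2 * np.pi * 600 * t))
times, peaks = peak_frequency_track(x, fs)
print(peaks[0], peaks[-1])
```

With a 512-sample frame at 4000 Hz the frequency bins are about 7.8 Hz apart and each frame spans 128 ms, which illustrates the trade-off between time and frequency resolution noted above.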
Another TF domain method is the Wigner-Ville Distribution (WVD), which is regarded as a special case of Cohen's class of distributions. The WVD is formulated mathematically as follows (Maciuk et al., 2012):

$$W_x(t, f) = \int_{-\infty}^{\infty} x\left(t + \frac{\tau}{2}\right) x^{*}\left(t - \frac{\tau}{2}\right) e^{-j2\pi f\tau}\, d\tau$$

The variable τ is the time lag in the autocorrelation, while * denotes the complex conjugate of the signal x. The WVD is used by (Maciuk et al., 2012; Ponte et al., 2013) to show the differences between normal and pathological lung sounds. Although the WVD has a high TF resolution, it requires heavy computation and produces cross-terms, spurious frequency components that appear even though they are absent from the original signal (Boashash, 2003).
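A direct discrete implementation makes the computational cost visible: every time instant needs its own FFT over all available lags. The sketch below is a minimal version that assumes SciPy for the analytic signal, omits the conventional scaling factor and uses a synthetic 200 Hz tone; `wigner_ville` is a hypothetical helper name.

```python
import numpy as np
from scipy.signal import hilbert


def wigner_ville(x, fs):
    """Discrete Wigner-Ville distribution of a real signal (unscaled)."""
    z = hilbert(x)                       # analytic signal of x
    n = len(z)
    W = np.zeros((n, n))
    for t in range(n):
        max_lag = min(t, n - 1 - t)      # lags that stay inside the signal
        lags = np.arange(-max_lag, max_lag + 1)
        kernel = np.zeros(n, dtype=complex)
        # Instantaneous autocorrelation z[t + m] * conj(z[t - m]).
        kernel[lags % n] = z[t + lags] * np.conj(z[t - lags])
        # The kernel is Hermitian in the lag, so its FFT is real.
        W[t] = np.real(np.fft.fft(kernel))
    # A lag step of one sample shifts the two factors two samples apart,
    # so bin k corresponds to k * fs / (2 * n) Hz.
    freqs = np.arange(n) * fs / (2 * n)
    return freqs, W


fs = 1000
n = 256
tone = np.cos(2 * np.pi * 200 * np.arange(n) / fs)
freqs, W = wigner_ville(tone, fs)
print(freqs[np.argmax(W[n // 2])])
```

The result is an n-by-n array obtained from n FFTs of length n, which is why the method becomes expensive for long recordings.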
Another frequently used method is the Hilbert-Huang Transform (HHT), which consists of Empirical Mode Decomposition (EMD) and Huang Spectra for calculating the Instantaneous Frequency (IF) of the lung sounds. Several studies use only EMD, which is a time domain method (Charleston-Villalobos et al., 2007), while others go on to calculate the IF of lung sounds (Lozano et al., 2013). Table 4 shows previous lung sound analysis studies that use the TF domain.

Wavelet Domain Signal Processing
The wavelet transform is a signal processing technique that makes it easy to adjust the resolution of the signal, so it is also called Multiresolution Analysis (MRA) (Semmlow and Griffel, 2014). A frequently cited wavelet-based lung sound study is that of Kandaswamy et al. (2004). In that study, lung sounds are decomposed by wavelet transform up to level 7 using several mother wavelets. Sub-bands D1, D2 and A7 are not used because their values are close to zero. The mean, the average power and the standard deviation of each sub-band, together with the ratio of the mean absolute values of adjacent sub-bands, are taken as signal features. ANN is used as the classifier. Hashemi et al. (2011) added skewness and kurtosis calculations on each sub-band to Kandaswamy's method. Abbasi et al. (2013) replaced only the ANN with an SVM to test Kandaswamy's method; the SVM performed better than the ANN on the Daubechies 8 wavelet decomposition. Different wavelet decomposition strategies are shown in other studies (Rizal et al., 2006a; 2006b). There, wavelet packet decomposition of the lung sounds is carried out to level 5, and selected sub-bands with different bandwidths are taken at frequencies of 0-1000, 1000-2000, 2000-3000 and 3000-4000 Hz. The energy of each selected sub-band is used as a feature, giving an accuracy of more than 85%. Some studies on lung sound classification using wavelet methods are listed in Table 5.
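The sub-band statistics described above can be sketched with a seven-level decomposition. To stay self-contained the example uses a hand-written Haar wavelet, which is a stand-in for the mother wavelets used in the cited studies (for example Daubechies 8); the feature definitions follow one reading of the description above, and `haar_dwt` and `subband_features` are hypothetical helper names.

```python
import numpy as np


def haar_dwt(x, levels):
    """Multi-level Haar DWT; returns (A_levels, [D1, D2, ..., D_levels])."""
    approx = np.asarray(x, dtype=float)
    details = []
    for _ in range(levels):
        approx = approx[: len(approx) - len(approx) % 2]  # force even length
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2))         # detail sub-band
        approx = (even + odd) / np.sqrt(2)                # approximation
    return approx, details


def subband_features(x, levels=7, first_used=3):
    """Statistics of D3..D7; D1, D2 and the final approximation are dropped."""
    _, details = haar_dwt(x, levels)
    used = details[first_used - 1:]
    mean_abs = np.array([np.mean(np.abs(d)) for d in used])
    avg_power = np.array([np.mean(d ** 2) for d in used])
    std = np.array([np.std(d) for d in used])
    ratios = mean_abs[:-1] / mean_abs[1:]   # adjacent sub-band ratios
    return np.concatenate([mean_abs, avg_power, std, ratios])


# Synthetic noise stands in for a lung sound frame.
rng = np.random.default_rng(0)
x = rng.standard_normal(4096)
features = subband_features(x)
print(features.shape)
```

With five retained sub-bands this yields 5 + 5 + 5 + 4 = 19 values per frame, which would then be passed to a classifier such as an ANN or SVM.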

Multi Domain Signal Processing
Some researchers combine signal processing methods from two or more domains. For example, time domain methods are combined with frequency domain methods, such as AR modeling in the time domain and the quantile frequency in the frequency domain (Yilmas and Kahya, 2006). AR modeling and the Discrete Wavelet Transform (DWT) have also been used together. A combination of TF domain and wavelet methods, using STFT and WPD for feature extraction, is presented in (Emrani and Krim, 2013). Meanwhile, Welch's method for Power Spectral Density (PSD) calculation and the STFT are used to characterize lung sounds in research conducted by (Mazic et al., 2003). Such combined methods have the advantage of providing a more complete characterization of lung sounds. Their drawback is a longer computation time, which makes them unsuitable for real-time detection. Some multidomain signal processing methods for lung sound analysis are shown in Table 6.

Clinical Application
One of the main objectives of lung sound research is to build a system that can detect lung abnormalities from lung sounds. In practice, recent studies only distinguish types of lung sounds, for example normal, crackle and wheeze. Some researchers have tried to identify diseases from lung sounds, but usually for only a few cases, for instance pulmonary tuberculosis (Becker et al., 2013) or asthma and pneumonia (Bouzakine et al., 2005). There is therefore still a long way to go before diseases can be detected directly from lung sounds alone.
Even so, with advances in electronics, several commercial electronic stethoscopes have been developed to help physicians analyze lung sounds. Their features include volume adjustment, heart sound reduction, recording and wireless transmission to a computer. With the signal displayed on a computer screen, the doctor obtains more information from the observed lung sounds.
The electronic stethoscope also makes it possible to build a telemonitoring system that monitors patients' lung sounds remotely. Transmission can be wired or wireless, over a short distance (between rooms in a hospital) or a long distance (from a home or small clinic to a large hospital). The lung sounds can be analyzed automatically by a computer or manually by a physician.

Education Application
A direct benefit of CORSA is in education. Various kinds of recorded lung sounds are available as learning materials for medical students. In the past, a particular type of lung sound had to be heard directly from a patient's lungs; now lung sound recordings can be heard anytime and anywhere. Sources covering various cases, both commercial and free, can be accessed easily on the internet (Ward, 2005).

Conclusion
Intuitively, a lung sound processing method is good if it is straightforward, requires little computing time and can still distinguish many lung sound classes. In general, we cannot conclude which method or which domain gives the best signal processing technique for lung sounds. The final goal of lung sound signal processing is the highest accuracy. In reality, however, the methods are not directly comparable, for reasons such as differing lung sound databases (lung sound types, amount of data, recording location, sensors used, sampling frequency) and differing classification methods.
In many cases, lung disease cannot be detected from lung sounds alone. Examinations in other modalities, such as X-ray and laboratory tests, may be required to establish the diagnosis. Future development needs a method that can combine lung sound data with data from other modalities such as X-ray results. Lung sounds are retained as the main diagnostic data because of their practicality.
Although there is still a considerable gap to clinical application, CORSA can already be used for telemonitoring of lung disease. For telemonitoring, the ability of the system to determine the types of sounds that occur is considered sufficient to monitor the patient's health. Such a system can help compensate for the shortage of lung disease specialists in remote areas. The development of telemonitoring systems is strongly supported by the availability of electronic stethoscopes, which simplify the acquisition, transmission and recording of lung sounds. Another possible development is the use of mobile devices for lung health monitoring: applications on today's mobile devices can already record, send and analyze lung sounds.
Internet technology supports medical education, particularly the teaching of auscultation skills. Lung sound databases on the Internet make it easy for students to listen to and analyze lung sounds. In the future, augmented reality-based interactive applications for auscultation learning may appear.