GC-MS Analysis and PLS-DA Validation of the Trimethyl Silyl-Derivatization Techniques

Problem statement: The conventional and power-adjusted Microwave-Assisted (MA) techniques of TMS derivatization were compared and validated using Gas Chromatography-Mass Spectrometry (GC-MS) and Multivariate Analysis (MVA) by Partial Least Squares Discriminant Analysis (PLS-DA). A further improvement of the conventional technique using vigorous shaking was also tested and analyzed. Approach: Cross-validation and response permutation tests of PLS-DA and the S-plot of Orthogonal Partial Least Squares Discriminant Analysis (OPLS-DA) were applied to extracellular data of Lactococcus lactis analyzed by GC-MS. The samples were first derivatized by methoximation (MeOX) and N-methyl-N-trimethylsilyltrifluoroacetamide (MSTFA), with either conventional or power-adjusted MA heating treatment. Results: The supervised PLS-DA applied to the extracellular data failed to show the same clustering results for the conventional and power-adjusted MA techniques. This suggested that the type of heating used in the derivatization technique affected which groups of metabolites were detected. Furthermore, using UV scaling, the S-plot and Variable Importance for Projection (VIP), about 40 metabolites responsible for the clustering and separation shown in the PLS-DA score plots were identified. The conventional technique with vigorous shaking showed clear clustering according to groups compared to the MA technique. Conclusion: The type of heating applied in TMS derivatization affected the detection of metabolites, with the conventional technique showing stronger clustering than the MA technique.


INTRODUCTION
Microbial metabolomics, which focuses on the measurement of low molecular mass metabolites, is routinely used to screen for metabolites that are indicative of specific regulators and mechanisms. This rapidly expanding field typically employs chromatographic techniques coupled to Mass Spectrometry (MS) to separate and identify metabolites of interest (Kamenik et al., 2010; Varghese et al., 2010).
Profiling of extracellular metabolites from microbial fermentation broth offers several important advantages for determining the overall chemical properties of the produced metabolites (Kamenik et al., 2010; Kiefer et al., 2008). Since the intracellular fraction comprises dynamic, low-concentration metabolites, analysis of the extracellular fraction gives a better interpretation of the produced metabolites, both qualitatively and quantitatively. Generally, the composition of microbial metabolites can vary considerably, depending on the organism of interest and the way the microbe is stressed. Extracellular metabolites are chemically diverse: they consist of amino and non-amino acids, including short branched fatty acids and carboxylic acids, sugars and sugar phosphates, and steroids (Villas-Boas et al., 2006).
Given the complexity of metabolites, analyzing and measuring all of them comprehensively with a single instrument is typically impossible. Therefore, a variety of measurement tools and extraction methods exist to acquire these diverse metabolites (Kouremenos et al., 2010).
However, since the main extracellular metabolites share chemical characteristics such as the same functional groups, including polar groups and active hydrogen (OH, NH, COOH, SH), their quantification using a single analytical instrument is possible. Furthermore, with current separation technologies, multiple techniques and methods can be developed around a single analytical instrument to quantify and profile all of these metabolites.
Gas Chromatography-Mass Spectrometry (GC-MS) is generally used for the analysis of volatile compounds (Kamenik et al., 2010; Varghese et al., 2010). It is an excellent high-throughput analytical tool that offers reproducibility and high resolution and is relatively inexpensive compared to other analysis tools (Park et al., 2010). GC-MS with a quadrupole analyzer, in particular, provides high linearity and an improved dynamic range (Kiefer et al., 2008). As many of the extracellular metabolites of interest have high polarity but low volatility, converting them into more volatile and stable derivatives by substituting the polar groups through silylation (Pietrogrande and Bacco, 2011) allows them to be analyzed using GC-MS.
Silylation, reported since 1965, is a classical derivatization that uses the trimethylsilyl group, (CH3)3Si- (TMS), to improve the volatility and stability of non-volatile compounds. The derivatization agent is commonly employed in GC-MS because the reaction is fast, chemically stable and gives good yields (Kushnir and Komaromy-Hiller, 2000). TMS derivatization usually comprises a long heating treatment at 37-40°C in two steps: methoximation using methoxyamine hydrochloride dissolved in pyridine, and silylation using N-Methyl-N-trimethylsilyltrifluoroacetamide (MSTFA) or N,O-Bis(trimethylsilyl)trifluoroacetamide (BSTFA). BSTFA with 1% Trimethylchlorosilane (TMCS) is more favored as the chemical mixture gives less interference (matrix effect) (Deng et al., 2005). MSTFA usually generates by-products or artifacts which interfere with the GC-MS analysis (Little, 1999). The use of MSTFA with 1% TMCS described by Kushnir and Komaromy-Hiller (2000), however, showed stable ion fragmentation with no interference observed. Several improvements have been described to shorten the duration of the heat treatment and increase the efficiency of the chemical derivatization reaction (Liebeke et al., 2010; Bowden et al., 2009). The most recent is Microwave-Assisted (MA) derivatization (Liebeke et al., 2010), also called Microwave-Accelerated Derivatization (MAD) (Kouremenos et al., 2010; Bowden et al., 2009; Deng et al., 2005), which uses a domestic microwave oven instead of the typical heating block and successfully accelerates the derivatization step by decreasing the overall heating time from 90 to 3 min. The use of microwave irradiation as a heating treatment for methoximation and silylation prior to GC-MS analysis is now acknowledged and widely applied to almost all biological samples, including environmental analysis, herbicides and industry-related processes (Chu et al., 2001; Ruiz-Matute et al., 2011; Ranz et al., 2008; Sandra et al., 2010; Beale et al., 2010).
Generally, the facultatively anaerobic and mesophilic L. lactis produces a mixture of metabolites including amino acids, organic acids, fatty acids, alcohols, aldehydes, esters and ketones (Ayad et al., 1999; Garde et al., 2007; 2002; Szymanska et al., 2012). As the bacterium can grow at temperatures of 25-40°C and can survive with or without oxygen, these conditions were used and tested. Although a few groups of metabolites (e.g., esters) react less with the derivatization reagent, these groups are still readily detected by GC-MS with TMS derivatization.
The aim of this study is to compare and validate the conventional and the MA techniques of TMS derivatization using Multivariate Analysis (MVA) by PLS-DA, followed by identification of influential metabolites using an S-plot of the OPLS-DA supported by the regression coefficient plot and VIP values. The supervised PLS-DA is useful to visualize high-dimensional data, perform discriminant analysis and identify relevant metabolites potentially involved in metabolic changes. In addition, validation using cross-validation and the response permutation test in PLS-DA allows differences in high-dimensional data to be determined, whereas the S-plot is generally used to further explain the results of the PLS-DA model. To our knowledge, a comprehensive comparison of TMS derivatization techniques using MVA has not been well presented or described.
As the optimization of the MA technique by Villas-Boas et al. (2006) and Bowden et al. (2009) showed that MSTFA functions optimally with microwave irradiation, we focused on the use of MSTFA to analyze extracellular metabolites of L. lactis grown at 30-37°C and under agitation. Furthermore, we used an incubation temperature of 40°C instead of 37°C with an incubation period of 90 min, as these conditions are commonly used in MSTFA derivatization. In addition, this study describes the use of vigorous shaking as a modification that significantly improves the conventional technique.

MATERIALS AND METHODS
Reagents and materials: The derivatization reagents N-Methyl-N-trimethylsilyltrifluoroacetamide (MSTFA) and methoxyamine hydrochloride were obtained from Sigma (St. Louis, MO, USA). Reagent-grade pyridine was purchased from Merck Chemicals (Germany). Ultrapure water generated by a Milli-Q system (Millipore, Milford, MA, USA) was used in the experiments. The derivatization agents were stored at 4°C. The internal standard, D4-alanine, was obtained from Sigma (St. Louis, MO, USA).

Extraction of extracellular samples:
Approximately 15 mL of fermented culture medium was collected during the exponential growth phase (5-6 h) at a final OD600nm of 1.0. Briefly, the sample was filtered through a cellulose acetate membrane filter (0.2 µm pore size) and separated into 1 mL aliquots (n = 9). About 10 mL of ultrapure water was added, and each sample was then spiked with 0.2 µL of internal standard (10 ml of 2,3,3,3-D4-alanine). Each replicate was vigorously mixed for exactly 1 min and stored at -20°C before freeze-drying prior to silylation derivatization.

TMS derivatization:
The derivatization techniques used in the study were based on the protocols proposed by Roessner et al. (2001) and Villas-Boas et al. (2006), with modification. The first derivatization technique, based on Roessner et al. (2001), was performed by adding 80 µL of methoxyamine hydrochloride dissolved in pyridine (2 g 100 mL⁻¹) to the dried sample, which was then subjected to heat treatment with agitation of up to 500 rpm for 90 min at 40°C. For silylation, 80 µL of MSTFA was then added and the mixture was subjected to the same heating treatment and agitation.
The second derivatization technique, using microwave irradiation as described by Villas-Boas et al. (2006), was carried out by adding 80 µL of methoxyamine hydrochloride dissolved in pyridine (2 g 100 mL⁻¹) to the dried sample, followed by incubation in a domestic microwave oven for 2.8 min at 50% of exit power. Then 80 µL of MSTFA was added to the sample, followed by 3 min of incubation in the domestic microwave oven at the same power setting.
The final incubated mixture was then transferred to a GC-MS vial and kept at ambient temperature before GC-MS analysis. Prior to silylation derivatization, a control sample consisting of methoxyamine hydrochloride in pyridine and MSTFA, prepared identically for each technique, was analyzed by GC-MS to check for any contamination in the derivatization reagents. Fresh methoximation reagent was prepared prior to TMS derivatization.
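As a quick arithmetic check of the two protocols above, the following minimal sketch converts the stated reagent concentration (2 g per 100 mL) into the mass of methoxyamine hydrochloride delivered in 80 µL and compares the total heating time of each technique. It assumes, as one reading of the text, that the conventional silylation step also lasts 90 min; the function and variable names are illustrative only.

```python
def reagent_mass_mg(conc_g_per_100ml: float, volume_ul: float) -> float:
    """Mass (mg) of solute in volume_ul of a solution given in g per 100 mL."""
    mg_per_ml = conc_g_per_100ml * 1000.0 / 100.0  # g/100 mL -> mg/mL
    return mg_per_ml * volume_ul / 1000.0          # µL -> mL, then mg

# Methoxyamine hydrochloride delivered per sample (2 g/100 mL, 80 µL).
meox_mg = reagent_mass_mg(2.0, 80.0)

# Total heating time per sample, in minutes.
# Assumption: conventional silylation uses the same 90 min as methoximation.
conventional_min = 90.0 + 90.0
microwave_min = 2.8 + 3.0
speedup = conventional_min / microwave_min

print(f"methoxyamine HCl per sample: {meox_mg:.2f} mg")
print(f"conventional: {conventional_min:.0f} min, MA: {microwave_min:.1f} min")
print(f"MA heating is roughly {speedup:.0f}-fold shorter")
```

Under these assumptions each sample receives 1.6 mg of methoxyamine hydrochloride, and the MA protocol needs about 6 min of heating in total instead of 180 min.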

GC-MS analysis:
GC-MS analysis was performed using a Perkin Elmer Turbo Mass Clarus 600 system equipped with a quadrupole mass spectrometer operated in Electron Ionization (EI) mode at 70 eV. The column used for the analysis was an Elite-5MS capillary column coated with 5% diphenyl crosslinked and 95% dimethylpolysiloxane (30 m × 0.25 mm i.d. × 0.25 µm film thickness, Perkin Elmer, USA). The MS was operated in scan mode (start after 8.0 min, mass range 40-600 amu at 0.5 s scan⁻¹). All injections were performed at a split ratio of 1:50 with a volume of 1 µL. The GC temperature program, based on our in-house method optimized for MSTFA derivatization, ran from 70 to 300°C with helium flowing constantly at 1.1 mL min⁻¹. The GC column was equilibrated for 6 min prior to each analysis.
Data analysis and validation:
Data analysis of the two derivatization techniques was performed separately using Turbomass 4.1.1 software (PerkinElmer Inc., USA) by extracting the peak heights of the TMS derivatives. The signal-to-noise ratio threshold was set to 3, followed by peak smoothing, before the peaks were aligned, deconvoluted and extracted. Identification of GC peaks was based on the NIST mass spectral library (NIST, 2008) and on available reference standards that were prepared and analyzed identically to the samples. Quantification of the internal standard was done manually for each sample. A data matrix consisting of peak intensities and Retention Times (RT) was generated. The data matrix was then normalized to the total sum of the Total Ion Chromatogram (TIC) and to the internal standard, log transformed, and statistically validated using one-way Analysis of Variance (ANOVA), with comparisons by Fisher's Least Significant Difference (LSD) method at significance levels of p<0.05, p<0.01 and p<0.001. The clean, validated data matrix was further analyzed using Principal Component Analysis (PCA) for initial observation, followed by PLS-DA and the S-plot of the OPLS-DA in SIMCA-P+ version 12.0 (Umetrics AB, Umeå, Sweden) using UV and Pareto scaling.
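The pre-processing chain described above (normalization, log transformation, then UV or Pareto scaling) can be illustrated with the following minimal sketch. It is not the Turbomass or SIMCA-P+ implementation: the peak heights are hypothetical, log10 is an assumed base, and because the text does not state how the total-sum and internal-standard normalizations were combined, the two are shown as separate helper functions.

```python
import math
from statistics import mean, stdev

def normalize_to_total(peaks):
    """Divide each peak height by the sample's summed signal."""
    total = sum(peaks)
    return [p / total for p in peaks]

def normalize_to_internal_standard(peaks, is_height):
    """Divide each peak height by the internal standard's peak height."""
    return [p / is_height for p in peaks]

def log_transform(values):
    """Log-transform positive values (base 10 is an assumption)."""
    return [math.log10(v) for v in values]

def scale_column(column, method="uv"):
    """Mean-center one variable, then divide by SD (UV) or sqrt(SD) (Pareto)."""
    m, s = mean(column), stdev(column)
    divisor = s if method == "uv" else math.sqrt(s)
    return [(x - m) / divisor for x in column]

# Hypothetical peak heights: rows are samples, columns are metabolites.
raw = [[1200.0, 300.0, 50.0],
       [1500.0, 280.0, 70.0],
       [900.0, 350.0, 40.0]]

rows = [log_transform(normalize_to_total(r)) for r in raw]
columns = [list(c) for c in zip(*rows)]  # one list per metabolite
uv = [scale_column(c, "uv") for c in columns]
pareto = [scale_column(c, "pareto") for c in columns]
```

After UV scaling every metabolite has unit variance, so low-abundance metabolites weigh as much as abundant ones; Pareto scaling only partly reduces the dominance of large peaks, leaving each variable with a standard deviation equal to the square root of its original one.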

Validation parameter of the GC-MS method:
Linearity and sensitivity of the method were investigated using available reference standards including amino acids, organic acids and sugars, with concentrations ranging from 5 to 200 µg mL⁻¹. These standards were prepared from 200 µg mL⁻¹ stock solutions by dilution with dH2O. The Limit of Detection (LOD) and Limit of Quantification (LOQ) were measured as the lowest concentration of each standard giving a signal-to-noise ratio of 3. Each point on the calibration curve, expressed as peak height, was obtained from a minimum of five replicate measurements. The relationship between peak height and standard concentration was determined by linear regression with R² > 0.99.
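The linearity criterion above can be sketched as an ordinary least-squares fit of peak height against concentration, followed by a check of R² > 0.99. The calibration points below are hypothetical: only the 5-200 µg mL⁻¹ range comes from the text, and the intermediate levels and peak heights are invented for illustration.

```python
def linear_fit(x, y):
    """Ordinary least squares for y = intercept + slope * x; returns R^2 too."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot

# Hypothetical calibration levels (µg/mL) and mean peak heights.
conc = [5.0, 25.0, 50.0, 100.0, 200.0]
height = [52.0, 248.0, 510.0, 995.0, 2010.0]

slope, intercept, r2 = linear_fit(conc, height)
is_linear = r2 > 0.99  # acceptance criterion stated in the text
print(f"slope={slope:.3f}, intercept={intercept:.2f}, R2={r2:.4f}, linear={is_linear}")
```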

RESULTS
An initial PCA of the three fermentation conditions, derivatized using the different TMS derivatization techniques, failed to reveal the same clustering for both techniques with either UV or Pareto scaling. Data generated using the conventional technique with vigorous shaking (Fig. 1a) showed reasonable clustering associated with temperature changes, where the 30°C condition and the condition agitated at 150 rpm clustered close together. The MA technique (Fig. 1b) gave a rather opposite clustering, with replicates of the agitated condition clustered away from the 30 and 37°C conditions.
Owing to the inconsistency and poor separation obtained by PCA, supervised PLS-DA was carried out by assigning each replicate to its fermentation condition. An excellent separation according to the assigned fermentation conditions was achieved in the PLS-DA score plot (PC1 versus PC2) of the conventional technique (Fig. 2a). The PLS-DA score plot of the MA technique, however, exhibited poor clustering, with the 30°C condition clustered together with the agitated condition (Fig. 2b).
Visualization of metabolites using an S-plot of the OPLS-DA was carried out to identify potential metabolites responsible for the discrimination shown in the PLS-DA score plots. Metabolites that lie far from the origin and close to the vertical axis of the S-plots shown in Fig. 3a and b are potentially responsible for the observed separation. Identification using available standards and the NIST library showed that metabolites belonging to butanoate metabolism are responsible for the separation in the conventional technique, while the MA technique suggested that metabolites associated with homo- and heterofermentation (formate, lactate, ethanol) and several short branched-chain fatty acids influence the clustering and separation shown in the PLS-DA score plot.
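For readers unfamiliar with the S-plot, each metabolite is placed using two quantities computed against the predictive score vector: its covariance with the scores (magnitude of contribution) and its correlation with the scores (reliability), as described by Wiklund et al. (2008). The sketch below computes these two coordinates for hypothetical data; the score vector and variables are invented, and this is not the SIMCA-P+ implementation.

```python
from statistics import mean, stdev

def s_plot_coordinates(scores, variable):
    """Return (covariance, correlation) of one variable with the score vector."""
    n = len(scores)
    mt, mx = mean(scores), mean(variable)
    cov = sum((t - mt) * (x - mx) for t, x in zip(scores, variable)) / (n - 1)
    corr = cov / (stdev(scores) * stdev(variable))
    return cov, corr

# Hypothetical predictive scores for five observations.
t = [-2.0, -1.0, 0.0, 1.0, 2.0]

# Hypothetical metabolites: one tracking the scores, one opposing, one noisy.
metabolites = {
    "tracks_scores": [3.0 * ti + 1.0 for ti in t],
    "opposes_scores": [-ti for ti in t],
    "noisy": [0.4, -0.3, 0.1, -0.2, 0.3],
}

for name, values in metabolites.items():
    cov, corr = s_plot_coordinates(t, values)
    print(f"{name}: covariance={cov:.3f}, correlation={corr:.3f}")
```

Metabolites with both a large absolute covariance and a correlation close to +1 or -1 fall at the outer ends of the S shape and are the candidates of interest.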
The influence of these metabolites on the separation in the PLS-DA models was further analyzed using the regression coefficient plot with 95% jackknifed confidence intervals, where metabolites with Variable Importance for Projection (VIP) values exceeding 1.0 were selected as the cut-off. The selected metabolites detected by the two techniques (41 metabolites for the conventional technique and 40 metabolites for MA) had positive and negative values in the regression coefficient plot (Fig. 4a and b) and significantly affected the separation. In brief, positive values indicate a relatively high concentration of the metabolite, while negative values represent a low concentration of the metabolite in the samples.
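The VIP > 1.0 cut-off can be illustrated with the commonly used VIP definition, in which each variable's squared, normalized PLS weights are summed over components, weighted by the Y sum of squares each component explains. The sketch below assumes that definition; whether SIMCA-P+ computes VIP in exactly this way is not stated in the text, and the weights and explained sums of squares are hypothetical.

```python
import math

def vip_scores(weights, ssy):
    """VIP per variable.

    weights[a][j]: PLS weight of variable j in component a.
    ssy[a]: Y sum of squares explained by component a.
    """
    n_vars = len(weights[0])
    total_ssy = sum(ssy)
    vips = []
    for j in range(n_vars):
        acc = 0.0
        for w_a, ss_a in zip(weights, ssy):
            norm_sq = sum(w * w for w in w_a)  # squared length of weight vector
            acc += ss_a * (w_a[j] ** 2) / norm_sq
        vips.append(math.sqrt(n_vars * acc / total_ssy))
    return vips

# Hypothetical two-component model with four metabolites.
weights = [[0.8, 0.5, 0.3, 0.1],
           [0.1, 0.7, 0.2, 0.6]]
ssy = [6.0, 2.0]

vips = vip_scores(weights, ssy)
selected = [j for j, v in enumerate(vips) if v > 1.0]  # cut-off used in the text
print([round(v, 3) for v in vips], selected)
```

With this definition the mean of the squared VIP values equals 1, which is why 1.0 is a natural threshold for above-average importance.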

DISCUSSION
PLS-DA was carried out to improve on the poor clustering of the conventional and MA techniques obtained from the PCA models. PLS-DA combined with the S-plot and PLS regression coefficients offers visualization of metabolic changes caused by different treatments or conditions (Szymanska et al., 2012). Collectively, the normalized data matrix for both techniques contained 27 observations (n = 9), and a total of 130 (conventional) and 187 (MA) metabolites were validated and identified. Both data sets were scaled using UV and Pareto scaling to maximize separation. Of the two scaling methods, UV scaling (correlation-based) gave a clean separation of replicates and was therefore selected for PLS-DA analysis.
The obtained PLS-DA models were then further analyzed using analysis of variance of the sevenfold cross-validated predictive residuals (CV-ANOVA) and response permutation with 20 random reclassifications. Cross-Validation (CV) was used to determine a sufficient number of Principal Components (PCs), assessed by the total explained X-variance (R²X), explained Y-variance (R²Y) and cross-validated predictive ability (Q²Y). To narrow down the total number of tested metabolites, metabolites with a major influence on the separation were selected based on their VIP values and further examined using the regression coefficient plot and the S-plot of the OPLS-DA. Regression coefficients reflect the metabolic changes of the metabolites, while the S-plot shows potentially significant metabolites based on their contribution and reliability to the separation observed in the PLS-DA models (Wiklund et al., 2008).
The PLS-DA score plot of the conventional technique with vigorous shaking showed strong separation using the first two PCs, with R²Y and Q²(cum) values exceeding 50% (Table 1). The PLS-DA score plot of the MA technique, however, required 8 PCs to reach a total predictive ability of 51%. This may relate to the two strong outliers situated outside the Hotelling's T2 ellipse (Fig. 2b), which affect the score plots. In general, values of R² and Q² over 50% are considered good for metabolic experiments.
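The logic of sevenfold cross-validation and of the response permutation test with 20 reclassifications can be sketched as follows. To keep the example short and dependency-free, a univariate least-squares model stands in for PLS-DA, a single 0/1 response replaces the class matrix, and the data are simulated; only the fold count, the number of permutations and the 27 observations are taken from the text. Q² is computed as 1 - PRESS/SS, where PRESS is the sum of squared prediction errors for held-out observations.

```python
import random

def kfold_indices(n, k=7):
    """Assign observation i to fold i % k (interleaved folds)."""
    return [[i for i in range(n) if i % k == f] for f in range(k)]

def fit_predict(train_x, train_y, test_x):
    """Stand-in model: univariate least squares (NOT PLS-DA)."""
    n = len(train_x)
    mx, my = sum(train_x) / n, sum(train_y) / n
    sxx = sum((x - mx) ** 2 for x in train_x)
    slope = sum((x - mx) * (y - my) for x, y in zip(train_x, train_y)) / sxx
    return [my + slope * (x - mx) for x in test_x]

def cross_validated_q2(x, y, k=7):
    """Q2 = 1 - PRESS / SS, predicting each fold from the remaining folds."""
    press = 0.0
    for fold in kfold_indices(len(x), k):
        held_out = set(fold)
        train = [i for i in range(len(x)) if i not in held_out]
        preds = fit_predict([x[i] for i in train], [y[i] for i in train],
                            [x[i] for i in fold])
        press += sum((y[i] - p) ** 2 for i, p in zip(fold, preds))
    my = sum(y) / len(y)
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - press / ss_tot

def permutation_q2(x, y, n_perm=20, seed=0):
    """Q2 values obtained after randomly reassigning the response labels."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_perm):
        shuffled = y[:]
        rng.shuffle(shuffled)
        results.append(cross_validated_q2(x, shuffled))
    return results

# Simulated data: 27 observations, 9 of them in the class of interest.
rng = random.Random(1)
y = [1.0] * 9 + [0.0] * 18
x = [2.0 * yi + rng.gauss(0.0, 0.3) for yi in y]

q2_real = cross_validated_q2(x, y)
q2_perm = permutation_q2(x, y)
print(f"Q2 (real labels) = {q2_real:.2f}; max Q2 (permuted) = {max(q2_perm):.2f}")
```

A model is considered valid in this test when the Q² values from permuted labels are all lower than the Q² from the real labels, which corresponds to the pattern read from the permutation plots.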
Analysis of the quality of separation shown by both PLS-DA models using CV-ANOVA indicated that only the conventional technique was statistically significant (p<0.05). This suggested that the discriminant analysis of the conventional technique with vigorous shaking was more consistent than that of the MA technique. The response permutation tests (Fig. 5a and b), however, indicated that a satisfactory model was achieved by both techniques, as the blue Q2-values to the left were lower than the original points to the right. Visual comparison of the response permutation tests nevertheless clearly suggested that the conventional technique was better than MA.
As mentioned previously, the tested metabolites were narrowed down to a total of 41 metabolites for the conventional technique and 40 metabolites for MA. The selection was based on VIP>1 and p<0.05 as criteria for potential metabolites. Markers with VIP>1 were considered to be significantly associated with the separation shown in both S-plot models, since VIP, which is calculated from the weighted sum of squares of the PLS weights, indicates the importance of a selected variable to the whole model. To support this, we examined the selected metabolites using the PLS-DA column loading plot. The markers had higher or lower w*c or p(corr) weight values, with those having lower values being the more relevant markers for explaining the discrimination according to fermentation conditions (Fig. 6a and b). Furthermore, this explained why UV scaling (the correlation method) was preferred over Pareto scaling (the covariance method) for this particular data set.

CONCLUSION
The considerable variation in the metabolites detected with the different TMS derivatization techniques showed that the type of heating used in the methoximation and silylation steps significantly affected the way the PLS-DA models were visualized and interpreted. In this study we demonstrated the importance of the heating treatment in TMS derivatization by comparing conventional and MA techniques through validation of PLS-DA models. In the PLS-DA score plot, the conventional technique with vigorous shaking exhibited strong separation among the three fermentation conditions, with total predictive values (Q²) exceeding 50%. The MA technique, however, required 8 PCs to explain the same value of Q², which was reflected in the poor separation of replicates in its PLS-DA score plot. In general, regulating and fixing the heating treatment in the TMS derivatization process is important for the quality of derivatization and thus for maximizing the detection of metabolites. Since microwave irradiation is governed by the exit power of irradiation, fixing the temperature at specific values is reasonable. Moreover, the conventional method with vigorous shaking contributed to the reproducibility of the PLS-DA model and should be considered. Further analysis to determine a suitable shaking speed is needed to improve the quality of the conventional technique, especially when using MSTFA as the TMS derivatization agent.