FLYER: A Simple Yet Robust Model for Estimating Yield Loss from Rice False Smut Disease (Ustilaginoidea virens)

Corresponding Author: Moin U. Salam, Department of Agriculture and Food Western Australia, 3 Baron-Hay Court, South Perth, WA 6151, Australia. Tel: +61 8 9368 3162, Fax: +61 8 9368 3082, Email: moinsalam1@gmail.com

Abstract: Rice False Smut (RFSm) is presently an internationally important fungal disease of rice. While the Yield Loss (YL) from this disease is reported in many countries, there exists no tool to instantly estimate the YL by visual field inspection. This study developed a simple model, FLYER, for this purpose. The model is run by two inputs: (i) fraction of productive but diseased tillers in a field and (ii) average number of smut balls present in the diseased panicles. FLYER was developed using data from Bangladesh, India and Japan. The driving algorithm of the model, the yield reduction in a diseased panicle as a function of number of smut balls present in the panicle, was validated with additional data from Bangladesh and Japan. When tested with independent data from fields infected naturally by RFSm, FLYER closely estimated the Yield Loss (YL, %) against observed datasets from Bangladesh (Root Mean Squared Deviation (RMSD) = 1.15% YL), Egypt (RMSD = 1.65% YL) and India (RMSD = 1.68% YL). This model could contribute to rapid assessment of regional and variety-specific yield loss and strategic management of the disease on a field-by-field basis.


Introduction
Rice False Smut (RFSm) is a fungal disease (anamorph: Ustilaginoidea virens (Cooke) Takah.; teleomorph: Villosiclava virens (Nakata) E. Tanaka and C. Tanaka) of rice (Oryza sativa L.) that has worldwide importance (Tanaka et al., 2008). It affects individual panicles (floral organs) in rice crops. Symptoms become visible only after flowering, when the fungus infects individual spikelets and replaces the seed with a large, velvety, orange to green ball (smut ball) (Ou, 1972). The smut balls, also known as pseudosclerotia, consist of mycelial tissue and spore masses and incorporate remnants of anthers and portions of paleae and lemmas (Ikegami, 1961).
Historically, the disease has been treated as minor (Webster and Gunnell, 1992) because it did not cause significant Yield Loss (YL) on a regular basis across wide geographical regions of the world. However, it is now an emerging disease and is reported to be a concern in a number of rice growing countries and regions, including Africa (Ou, 1972), Bangladesh (Dhaka Tribune, 2013), China (Guo et al., 2012; Li et al., 2013), Egypt (Atia, 2004), Fiji (Ou, 1972), India (Devi and Singh, 2007; Arumugam and Tangamuthu, 2010; Singh et al., 2012), Italy (Ou, 1972), Japan (Ashizawa et al., 2010), Papua-New Guinea (Ou, 1972), Korea (Kim and Park, 2007), South America (Ou, 1972) and the United States (Brooks et al., 2009). A disease becomes economically important especially when the YL becomes evident. Given the present geographical spread of RFSm, knowing the region- and field-specific YL has become an integral decision issue for farmers, extension agents, researchers and policy-makers.
Yield loss from RFSm has been reported in many countries, with figures ranging from 1% (Atia, 2004) to over 75% (Upadhyay and Singh, 2013), while it has not yet been estimated in some countries (such as Bangladesh). Measuring YL from RFSm in the field is tricky, because the disease is often not homogeneously distributed (Singh et al., 2014) and successful mass artificial inoculation has yet to be reported. Rigorous sampling therefore has to be done on a panicle-by-panicle basis. Rice plants grow in clusters (called hills) and produce tillers asynchronously, which generates panicles of varying size. If healthy and diseased panicles of similar size are not paired during sampling, the calculated YL could be misleading. A simple model that could quickly and reliably estimate the YL would be a useful replacement for the time-consuming and cumbersome technique of field measurement of YL from RFSm. To the best of our knowledge, no such model has been reported in the literature.
To succeed, such a model needs to (i) capture the underlying factors of yield formation in rice panicles and (ii) relate the disease to those factors influencing the panicle weight. The literature agrees that chaffiness (i.e., unfilled and/or partially filled spikelets) and the weight of individual spikelets in the panicle are the two factors governing yield loss due to RFSm (Sinha et al., 2003; Atia, 2004; Upadhyay and Singh, 2013). However, there is an information gap on the relative quantitative contribution of these two factors to YL and on how the disease quantitatively influences them.
This study attempted to fill these research gaps with the specific aims of (i) developing a simple generic model to estimate yield loss from rice false smut disease and (ii) testing the model in diverse environments across national boundaries to enhance its universal applicability.

Fields for Data Collection
The primary data for measuring attributes of yield loss in relation to Rice False Smut (RFSm) disease, and for model development and validation, were collected from the experimental farm of the Bangladesh Rice Research Institute (BRRI), Gazipur, Bangladesh, located at 23°59′ N latitude, 90°24′ E longitude. Over the last 40 years this farm has developed into an intensive rice ecosystem, with three rice crops grown annually in 88 fields spread over 35 hectares. The site is about 35 m above mean sea level and has a subtropical climate, which is strongly influenced by the south-western monsoon. The average annual rainfall is 2000 mm, with more than 80% of it occurring from mid-June to the end of September. Mean temperature is lowest (15°C) in January and highest (30°C) in May. The soil of the experimental farm is "Chhiata" clay loam, a member of the fine, hyperthermic Vertic Endoaquept (Saleque et al., 2004). The initial soil chemical properties at 0-15 cm soil depth were broadly as follows: pH 6.1, organic matter 2.02%, total Nitrogen (N) content 0.07%, available phosphorus 10.14 mg kg⁻¹ (0.5 M NaHCO₃ extracted), exchangeable potassium 0.17 meq/100 g soil (neutral 1.0 N NH₄OAc extracted), available sulphur 20 mg kg⁻¹ (Ca(H₂PO₄)₂ extracted) and available zinc 2.8 mg kg⁻¹ (0.01 N HCl extracted) (Khatun et al., 2015). Monsoonal rice (locally known as "transplanted Aman" or "T. Aman") was hand-transplanted on the farm during July-August 2014 using seedlings about 30 days old. Two or three seedlings were transplanted per hill, maintaining a hill-to-hill distance of 20 cm and a line-to-line distance of 20 cm. Field size varied from 5×4 m to 400×250 m. The rice variety "BRRI dhan49" was grown in all the fields from which data were collected.
The crops were fertilised with recommended doses of Nitrogen (N) (200 kg ha⁻¹ as urea), Phosphorus (P) (63 kg ha⁻¹ as triple super phosphate), Potassium (K) (84 kg ha⁻¹ as muriate of potash) and Sulphur (S) (56 kg ha⁻¹ as gypsum) (BRRI, 2013). Nitrogen was top dressed in three equal splits at 20, 35 and 50 Days After Transplanting (DAT), whereas P, K and S were applied once, during final land preparation. The crops received moisture predominantly through monsoonal rains, supplemented by irrigation water to maintain a water level of 2 to 3 cm. Management of the crops included manual weed control twice, at 30 and 45 DAT. No chemicals (insecticides or fungicides) were used for pest and disease control.

Measurement of Attributes of Yield Loss in Relation to Rice False Smut Disease
Three attributes of yield loss in relation to Rice False Smut (RFSm) disease were measured: filled spikelets per panicle, chaffiness and weight of a single filled spikelet. "Chaffiness" was defined as unfilled or partially filled spikelets. Five hundred and thirty-six panicles, 268 each of healthy and diseased, were collected from five fields on the experimental farm of BRRI, Gazipur, Bangladesh (see section "Fields for data collection") during October and November, towards the end of the ripening stage of the crops. "Healthy" refers to the absence of any smut ball within a panicle, whereas "diseased" refers to the presence of one or more smut balls in a panicle. Samples were collected in pairs, healthy and diseased, across the whole range of panicle size (small to large) and disease status. Disease status is denoted here by the number of smut balls per panicle; qualitatively, the more smut balls, the more severe the disease. Paired samples were drawn within a hill or, when not available, in the close vicinity of the diseased hill. Spikelets from the sampled panicles were separated, and filled spikelets and unfilled and/or partially filled spikelets were counted manually on a panicle-by-panicle basis. The number of smut balls on individual panicles was also counted. Chaffiness was expressed as the percentage of unfilled and/or partially filled spikelets in the total spikelets, by count, per panicle. Filled spikelets were oven-dried at 48°C for 72 h and weighed for each panicle to three decimal places in grams (g). The weight of a single filled spikelet, expressed in mg, was calculated by dividing the panicle weight by the corresponding number of filled spikelets. All data were summarised under five panicle size categories (Table 1).

The Model and Model Development
The model FaLse smut induced Yield loss Estimator in Rice (FLYER) estimates the yield loss in rice due to Rice False Smut (RFSm) disease at the field scale. Here, the field scale is independent of the size of the field. The model uses two inputs: (i) the fraction of productive tillers infected by the disease (as evident from the presence of smut balls in the panicles) and (ii) the average number of smut balls present in the diseased panicles. The calculation follows Equation 1:

$$\mathrm{RFSmYL} = \mathrm{RFSmT} \times \mathrm{RFSmPYR} \quad (1)$$

where RFSmYL is the yield loss in a field infected by RFSm (expressed as percentage), RFSmT is the diseased productive tillers expressed as a fraction of total productive tillers in the field and RFSmPYR is the yield reduction in a diseased panicle as a function of the number of smut balls present in the panicle (expressed as percentage). As noted earlier, RFSmT is an input of the model. The RFSmPYR was calculated by the following equation of "exponential rise to maximum value" (Miura, 2005):

$$\mathrm{RFSmPYR} = \mathrm{YRhp} + \mathrm{YRmax} \times \left(1 - e^{-\mathrm{YRrate} \times \mathrm{bip}}\right) \quad (2)$$

where YRmax, a parameter, is the amplitude of yield reduction in a diseased panicle, YRhp, a parameter, is the offset of RFSmPYR from 0 and YRrate, a parameter, is the rate constant in relation to the number of smut balls present in the diseased panicle (bip); bip is the second input of the model. Using the estimated values of the parameters, the model was run with inputs RFSmT (in the range of 0 to 1, i.e., 0 to 100% diseased productive tillers in a field) and bip (in the range of 0 to 160 smut balls per diseased panicle) to generate a yield loss chart.
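The two-step calculation described above can be sketched in Python. This is a minimal illustration rather than the authors' implementation: it assumes that the "exponential rise to maximum" function has the common form YRhp + YRmax × (1 − exp(−YRrate × bip)) and that field yield loss is the diseased-tiller fraction multiplied by the per-panicle reduction. The parameter values (YRmax = 100, YRhp = 0, YRrate = 0.03) are those reported in the parameter estimation and results sections; the function names are illustrative.

```python
import math

# Parameter values as reported in the paper; the functional form is an assumption.
YR_MAX = 100.0   # maximum yield reduction in a diseased panicle (%)
YR_HP = 0.0      # yield reduction with zero smut balls (%)
YR_RATE = 0.03   # rate constant per smut ball


def rfsm_pyr(bip, yr_max=YR_MAX, yr_hp=YR_HP, yr_rate=YR_RATE):
    """Yield reduction (%) in a diseased panicle carrying `bip` smut balls."""
    if bip < 0:
        raise ValueError("bip must be non-negative")
    return yr_hp + yr_max * (1.0 - math.exp(-yr_rate * bip))


def rfsm_yl(rfsm_t, bip):
    """Field yield loss (%) from the diseased-tiller fraction and mean balls per panicle."""
    if not 0.0 <= rfsm_t <= 1.0:
        raise ValueError("rfsm_t must be a fraction between 0 and 1")
    return rfsm_t * rfsm_pyr(bip)


# Hypothetical field: 10% of productive tillers diseased, 5 smut balls per diseased panicle.
print(round(rfsm_yl(0.10, 5), 2))
```

Keeping the per-panicle function separate from the field-scale multiplication mirrors the way the two components are parameterised and validated separately in the study.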

Parameter Estimation
The model assumes that if smut balls form in all the spikelets of a panicle, that panicle contributes no yield; hence the value of the parameter YRmax can be set to 100, representing a yield reduction of 100%. FLYER further assumes that zero smut balls in a healthy panicle translate into no yield penalty in that panicle; hence the value of the parameter YRhp can be set to 0, representing a yield reduction of 0%. The value of the parameter YRrate can be derived by fitting Equation 2 to measured data on yield reduction by number of smut balls per diseased panicle. We used the "solver" function of Microsoft Excel 2007 to derive this parameter value. The "solver" function is designed to find an optimal value for a formula that includes one or more parameters. This tool was previously used to estimate parameter values for plant disease models (Salam et al., 2007). In our case, the "solver" function was used to minimise the mean squared deviation between observed and simulated values.
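The same one-parameter fit can be reproduced outside Excel. The sketch below is an assumption-laden illustration, not the authors' procedure: it swaps the Excel "solver" for a golden-section search, assumes the exponential-rise form for Equation 2 with YRmax = 100 and YRhp = 0, and uses synthetic data generated from a known rate instead of the 72 measured data-points.

```python
import math


def predict(bip, yr_rate, yr_max=100.0, yr_hp=0.0):
    """Assumed form of Equation 2: exponential rise to a maximum."""
    return yr_hp + yr_max * (1.0 - math.exp(-yr_rate * bip))


def msd(yr_rate, balls, observed):
    """Mean squared deviation between predicted and observed yield reduction."""
    return sum((predict(b, yr_rate) - o) ** 2 for b, o in zip(balls, observed)) / len(balls)


def fit_yr_rate(balls, observed, lo=0.0, hi=1.0, tol=1e-6):
    """Golden-section search for the YRrate that minimises the MSD on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if msd(c, balls, observed) < msd(d, balls, observed):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return (a + b) / 2.0


# Synthetic, noise-free data generated with a rate of 0.03 (not the paper's measurements).
balls = [1, 2, 5, 10, 20, 40, 67]
observed = [predict(b, 0.03) for b in balls]
print(round(fit_yr_rate(balls, observed), 4))
```

Golden-section search suits this case because only one parameter is free and the MSD has a single minimum in the rate; with real, noisy data a general-purpose optimiser would serve equally well.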

Data for Parameter Estimation
Seventy-two data-points, 50 primary and 22 secondary, were used to estimate the value of the parameter YRrate. Nine hundred and twenty-eight paired panicles (healthy versus diseased), bulked into 50 samples, were randomly collected from six fields on the experimental farm of BRRI, Gazipur, Bangladesh (described in section "Fields for data collection") in the five categories of panicle size described in Table 1. However, not all samples covered all five categories of panicle size, because severely diseased panicles were not available in every category. Sampling was done at crop maturity during October to November 2014. Spikelets from the sampled panicles were separated, and unfilled and/or partially filled spikelets were discarded from all 50 samples. The number of smut balls on the panicles was counted and averaged by dividing by the number of panicles in the corresponding sample. The filled spikelets were oven-dried at 48°C for 72 h and weighed for individual samples. The yield reduction in each sample (YR, expressed as %) was calculated as follows and is presented in Table 2:

$$\mathrm{YR} = \frac{W_{\mathrm{healthy}} - W_{\mathrm{diseased}}}{W_{\mathrm{healthy}}} \times 100 \quad (3)$$

where W_healthy and W_diseased are the weights of filled spikelets in healthy and diseased panicles, respectively.

The 22 secondary data-points were derived from the literature. Table 2 lists 17 data-points from Japan, where samples were collected from 10 fields in Gifu prefecture (details in Ikegami, 1959). That study presented the weight of the panicles (accounting for filled spikelets only) by smut ball number, and we calculated the yield reduction using Equation 3. Table 2 also lists five data-points from India, where 10 bulk paired samples (healthy versus diseased) were collected from the Agricultural Research Station, Mugad, in the State of Karnataka (Hegde and Anahosur, 2000). That study presented yield reduction in diseased panicles for narrow ranges of ball number per panicle; we averaged each range of ball numbers and related it to the corresponding yield reduction (Table 2).

Table 2. Data from Bangladesh, India and Japan on yield reduction, expressed as percentage, in a panicle of rice (figures in Columns 2-4) from rice false smut disease as a function of number of smut balls per diseased panicle (Column 1). Bangladeshi data were generated from field measurements and Indian and Japanese data were sourced from literature and processed (details in "Materials and methods" section). These data were used for development of the "RFSmPYR" component of FLYER model (Equation 1 and Fig

Table 3. Data from Bangladesh and Japan on yield reduction, expressed as percentage, in a panicle of rice (figures in Columns 2-3) from rice false smut disease as a function of number of smut balls per diseased panicle (Column 1). Bangladeshi data were generated from field measurements and Japanese data were sourced from literature and processed (details in "Materials and methods" section). These data were used for validation of the "RFSmPYR" component of FLYER model (Fig. 3)
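The per-sample calculations described in this section can be sketched as follows. The sketch assumes that yield reduction is the relative loss in filled-spikelet weight of diseased panicles compared with their healthy pairs, which is consistent with the negative values reported when a diseased sample outweighs its healthy pair. All weights and counts are hypothetical and the function names are illustrative.

```python
def yield_reduction(healthy_weight, diseased_weight):
    """Yield reduction (%) of a bulked diseased sample relative to its healthy pair.

    Weights are oven-dry weights of filled spikelets in the same unit.
    """
    if healthy_weight <= 0:
        raise ValueError("healthy_weight must be positive")
    return (healthy_weight - diseased_weight) / healthy_weight * 100.0


def mean_balls_per_panicle(total_balls, n_panicles):
    """Average number of smut balls per diseased panicle in a bulked sample."""
    return total_balls / n_panicles


def range_midpoint(low, high):
    """Single ball number representing a reported range (as done for the Indian data)."""
    return (low + high) / 2.0


# Hypothetical bulked sample: 20 paired panicles, 50.0 g healthy versus 42.5 g diseased.
print(round(yield_reduction(50.0, 42.5), 1))    # reduction in %
print(mean_balls_per_panicle(78, 20))           # balls per diseased panicle
print(range_midpoint(6, 10))                    # representative ball number for a 6-10 range
```

A negative result simply means the diseased sample was heavier than the healthy one, which can happen with one or two smut balls and natural variation between paired panicles.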

Model Validation
The model was validated in two steps: (i) testing the RFSmPYR component of the model (the yield reduction in a diseased panicle as a function of the number of smut balls present) and (ii) testing the FLYER model at the field scale. Two datasets, one from Bangladesh (seven data-points) and the other from Japan (eight data-points), were used for validating the RFSmPYR component of the model. Three datasets, from Bangladesh (four data-points), Egypt (seven data-points) and India (five data-points), were used for validating the model in the field.

Data for Model Validation
For the Bangladesh dataset, yield reduction as a function of the number of smut balls per diseased panicle was measured in seven RFSm-infected fields at the research station at BRRI headquarters, Gazipur, Bangladesh (section "Fields for data collection"). Bulk paired samples (healthy versus diseased) of 71, 125, 97, 268, 105, 177 and 230 panicles from Fields 1, 2, 3, 4, 5, 6 and 7, respectively, were collected randomly under the five categories of panicle size described in Table 1. These samples were separate from those used for model development. Sampling was done at crop maturity during October to November 2014. Spikelets from the sampled panicles were separated, and unfilled and/or partially filled spikelets were discarded for each of the seven samples. The number of RFSm balls in the panicles was counted and averaged by dividing by the number of panicles in the corresponding sample. The filled spikelets were oven-dried at 48°C for 72 h and weighed for individual samples. The yield reduction in each sample (expressed as %) was calculated following Equation 3 and is presented in Table 3. The Japanese dataset, presented in Table 3, was sourced from the literature (Ikegami, 1959). This is the same source from which we derived the Japanese dataset (17 data-points) for model development; however, the dataset for model validation came from a different set of experiments conducted in a different year. The Japanese study presented the weight of the panicles (accounting for filled spikelets only) by smut ball number; we calculated the yield reduction using Equation 3 and present it in Table 3.
For validating the FLYER model at the field scale, we measured yield loss in four fields at the research station at BRRI headquarters, Gazipur, Bangladesh (section "Fields for data collection"). The yield loss was measured from bulk paired samples (healthy versus diseased) of 125, 230, 177 and 268 panicles from Fields A, B, C and D, respectively, collected randomly under the five categories of panicle size described in Table 1. These samples were separate from those used for model development and for validating the RFSmPYR component of the model. Sampling was done at crop maturity during October to November 2014. Spikelets from the sampled panicles were separated, and unfilled and/or partially filled spikelets were discarded for each of the four samples. The number of smut balls in the panicles was counted and averaged by dividing by the number of panicles in the corresponding sample. The filled spikelets were oven-dried at 48°C for 72 h and weighed for individual samples. The yield reduction in each sample (expressed as %) was calculated following Equation 3 and is presented in Table 4. The Egyptian dataset was reported in Atia (2004) and the Indian dataset in Sinha et al. (2003). The Indian dataset reported the amount of smut balls as a percentage of grains (filled spikelets); we treated this figure as the number of smut balls per panicle in order to use it as one of the model inputs. For validation purposes, FLYER was run with the inputs shown in Table 4.

Table 4. List of model inputs and observed yield loss (%) either measured (Bangladesh, four data-points) or sourced from literature (Egypt, seven data-points; and India, five data-points) (details in "Materials and methods" section). The FLYER model was run with the two inputs (Columns 2 and 3) and validated against the observed yield loss data (Column 4) as shown in Fig

Statistical Analysis
Data on attributes of yield loss in relation to the disease were analysed as means with corresponding 95% confidence intervals. In addition, the weight of single panicles was regressed on the number of filled spikelets per panicle in 268 healthy and 268 diseased panicles separately. The intercepts and slopes of the regression lines were statistically compared using a paired t-test. Performance of the model in validating the RFSmPYR component of FLYER was analysed statistically using a correlation-regression approach (predicted value versus observed value) (Kobayashi and Salam, 2000). For this approach, two regression statistics were used: (i) the coefficient of determination (R²) for the 1:1 (y = x) line and (ii) the slope (m) of the regression line forced through the origin (Asseng et al., 2000). The standard error of the slope, the level of significance (P) for testing whether the slope differed from 1 and the number of points (n) included in the regression analysis were also used.
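The correlation-regression statistics named above can be sketched in Python. Several details are assumptions, since the text does not spell them out: observed values are placed on the x-axis and predicted values on the y-axis, R² for the 1:1 line is taken as one minus the ratio of squared deviations from y = x to the total sum of squares of the observations, and the standard error of the through-origin slope uses n − 1 residual degrees of freedom. The data are hypothetical.

```python
import math


def slope_through_origin(observed, predicted):
    """Least-squares slope m of predicted = m * observed, with its standard error."""
    sxx = sum(x * x for x in observed)
    sxy = sum(x * y for x, y in zip(observed, predicted))
    m = sxy / sxx
    residual_ss = sum((y - m * x) ** 2 for x, y in zip(observed, predicted))
    se = math.sqrt(residual_ss / (len(observed) - 1) / sxx)
    return m, se


def r2_about_one_to_one(observed, predicted):
    """Coefficient of determination measured against the 1:1 line (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted yield reductions (%).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.9]

m, se = slope_through_origin(observed, predicted)
t_stat = (m - 1.0) / se  # compare with a t table at n - 1 degrees of freedom
print(round(m, 4), round(se, 4), round(t_stat, 3))
print(round(r2_about_one_to_one(observed, predicted), 3))
```

A slope that does not differ significantly from 1, together with a high R² about the 1:1 line, indicates that predictions track observations without systematic scaling error.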
Performance of FLYER at the field scale was analysed statistically using three approaches: (i) the correlation-regression approach (as described above), (ii) a paired mean testing approach (predicted value versus observed value) (Mead et al., 2002) and (iii) a deviation approach (predicted value minus observed value) (Kobayashi and Salam, 2000). For the paired mean testing approach, the Standard Error of the Difference (SED) between two means was calculated from SDP and nP, the standard deviation and number of data-points in the model's predictions, and SDO and nO, the standard deviation and number of data-points in the observations. The Least Significant Difference (LSD) was calculated using the SED and the t-value at the 5% level of significance, and the means of the model's predictions and the observations were compared. This comparison was performed across the datasets of the three countries and between the datasets within a country. For the deviation approach, two deviation statistics were used. The first was the Root Mean Squared Deviation (RMSD), which is the square root of the mean of the squared deviations over all "data-point pairs" in the two datasets (Kobayashi and Salam, 2000). The second was the Mean Squared Deviation (MSD). MSD has three components: Squared Bias (SB), the squared difference between predicted and observed standard deviations (SDSD) and the lack of positive correlation weighted by the standard deviations of predicted and observed values (LCS). MSD measures the total deviation between predicted and observed values; the lower the MSD, the closer the predicted values are to the observed values. SB indicates the agreement between the predicted and observed means, whereas SDSD and LCS together show how closely the model predicts variability around the mean. The two sources of this variability are the magnitude of fluctuation among the n observations and the pattern of fluctuation across the n observations; SDSD and LCS quantify the ability of the model to describe the magnitude and pattern of fluctuation, respectively.
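The deviation and paired-mean statistics can be illustrated with a short Python sketch. The MSD decomposition follows the definitions of Kobayashi and Salam (2000), using population standard deviations so that MSD = SB + SDSD + LCS holds exactly. The SED formula is an assumption (the usual unpooled form for two means), and the critical t-value is supplied by the user from standard tables. The data are hypothetical.

```python
import math


def mean(values):
    return sum(values) / len(values)


def msd_components(predicted, observed):
    """MSD and its components SB, SDSD and LCS, plus RMSD."""
    n = len(predicted)
    mp, mo = mean(predicted), mean(observed)
    msd = sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n
    sd_p = math.sqrt(sum((p - mp) ** 2 for p in predicted) / n)  # population SD
    sd_o = math.sqrt(sum((o - mo) ** 2 for o in observed) / n)
    cov = sum((p - mp) * (o - mo) for p, o in zip(predicted, observed)) / n
    sb = (mp - mo) ** 2                  # squared bias
    sdsd = (sd_p - sd_o) ** 2            # magnitude of fluctuation
    lcs = 2.0 * (sd_p * sd_o - cov)      # equals 2 * SDp * SDo * (1 - r)
    return {"MSD": msd, "RMSD": math.sqrt(msd), "SB": sb, "SDSD": sdsd, "LCS": lcs}


def sample_sd(values):
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


def sed(predicted, observed):
    """Standard error of the difference between two means (assumed unpooled form)."""
    return math.sqrt(sample_sd(predicted) ** 2 / len(predicted)
                     + sample_sd(observed) ** 2 / len(observed))


def lsd(predicted, observed, t_crit):
    """Least significant difference for a user-supplied critical t-value."""
    return t_crit * sed(predicted, observed)


# Hypothetical predicted and observed yield losses (%).
predicted = [1.0, 2.0, 4.0, 5.0]
observed = [1.5, 2.5, 3.0, 6.0]

parts = msd_components(predicted, observed)
print({k: round(v, 4) for k, v in parts.items()})

# Two-sided 5% t-value for 4 + 4 - 2 = 6 degrees of freedom, from standard tables.
difference = abs(mean(predicted) - mean(observed))
print(difference < lsd(predicted, observed, t_crit=2.447))
```

Reporting the three MSD components alongside RMSD shows whether a discrepancy comes from bias in the mean, from the size of the fluctuations or from their pattern.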

Attributes of Yield Loss in Relation to Rice False Smut Disease (RFSm)
RFSm caused a yield difference in rice when diseased panicles were compared with healthy ones. Table 5 shows that, averaged over 268 panicles in each group, the weight of a single diseased panicle carrying an average of 3.9±0.1 smut balls was significantly lower (2.5±0.1 g) than that of a healthy panicle (2.9±0.1 g), which translates into a 13.5% reduction in panicle weight (± is the 95% confidence interval). This difference in panicle weight tended to be larger and was significant in the medium, medium-large and large panicle-size categories, but not in the small and small-medium categories. More smut balls were recorded on larger than on smaller diseased panicles (Table 5). The reduction in the weight of diseased panicles across the panicle-size categories was strongly related to the number of smut balls on the panicles (r = 0.93 across the averages of the five panicle-size categories and r = 0.60 across the 268 samples).
Compared with healthy panicles, diseased panicles on average had a significantly lower number of filled spikelets (151±7 versus 170±7) and a higher percentage of chaffiness (28±2 versus 16±1), but filled spikelets of almost similar size (weight of single filled spikelet in mg, 16.8±0.2 versus 17.3±0.1) (Table 6). In both healthy and diseased panicles, the number of filled spikelets increased significantly, whereas the weight of a single filled spikelet decreased insignificantly, with increased panicle size. On the other hand, with increased panicle size, chaffiness significantly increased in diseased panicles but remained statistically similar in healthy panicles (Table 6). On average across the panicle-size categories, there was a large (72.2%) increase in chaffiness but a small (3.4%) decrease in the weight of a single filled spikelet in diseased panicles compared with healthy panicles. Figure 1 indicates that diseased panicles showed a similar pattern of variation in chaffiness and in number of smut balls per panicle across the five categories of panicle size. The association between chaffiness and the number of smut balls per panicle was very strong (r = 0.96 across the five panicle-size categories and r = 0.74 across the 268 paired samples).
A regression of single panicle weight against the number of filled spikelets per panicle accounted for 96% of the variance in healthy panicles (R² = 0.96, intercept = 0.1196, intercept standard error = 0.0388, slope = 0.0166, slope standard error = 0.0002, p ≤ 0.05, n = 268). A similar proportion of the variance was accounted for in diseased panicles (R² = 0.94, intercept = 0.0927, intercept standard error = 0.0391, slope = 0.0161, slope standard error = 0.0002, p ≤ 0.05, n = 268) (Fig. 2). A paired t-test showed that neither the intercepts nor the slopes of the two regression lines in Fig. 2 differed statistically, indicating that single panicle weight responded similarly to the number of filled spikelets per panicle in both healthy and diseased panicles.
The relative contributions of the number of filled spikelets and the weight of a single filled spikelet to the variation in diseased panicle weight, in relation to the average number of smut balls per panicle, are shown in Fig. 3 for each of the five panicle-size categories and for all categories combined. The figure shows that the additive effect of these two yield loss attributes almost entirely explained the variation in weight of the diseased panicles, relative to healthy panicles, within and between the panicle-size categories. Of the two, the variation in the number of filled spikelets per panicle contributed over 75% of the variation in the weight of the diseased panicles.

The Estimate of RFSmPYR Component of the Model
The RFSmPYR component of the model (Equation 2), which is the yield reduction in a diseased panicle as a function of the number of smut balls present in the panicle (expressed as percentage), is presented in Fig. 4. Using the "solver" function of Microsoft Excel, the value of the parameter YRrate was estimated as 0.03 by minimising the Mean Squared Deviation (MSD) between the 72 observed data-points and the model's predictions. In the range of 1 to 67 smut balls per panicle, with observed yield reduction ranging from -4.5 to 100%, the minimised MSD was 60.2. Theoretically, an MSD as high as 5394 is possible with an extreme value of the parameter. Compared with this extreme, the MSD at the estimated value of YRrate in Equation 2 was very low. With no consequence for yield in the absence of the disease, this parameterised component of the model estimates the Yield Reduction (YR) at about 3% per smut ball initially and approaches the maximum of 100% YR exponentially, reaching it when 387 smut balls are present on a panicle.
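The behaviour of the parameterised curve can be checked numerically. The sketch below assumes the exponential-rise form YRmax × (1 − exp(−YRrate × bip)) with YRmax = 100, YRhp = 0 and the rounded estimate YRrate = 0.03, and adds a hypothetical inverse function that returns the number of smut balls corresponding to a given yield reduction.

```python
import math

YR_MAX = 100.0
YR_RATE = 0.03  # rounded estimate reported in the text


def rfsm_pyr(bip):
    """Yield reduction (%) in a diseased panicle with `bip` smut balls (assumed form)."""
    return YR_MAX * (1.0 - math.exp(-YR_RATE * bip))


def balls_for_reduction(yr_percent):
    """Inverse of the curve: smut balls per panicle giving a reduction of `yr_percent`."""
    if not 0.0 <= yr_percent < YR_MAX:
        raise ValueError("yr_percent must be in [0, 100)")
    return -math.log(1.0 - yr_percent / YR_MAX) / YR_RATE


print(round(rfsm_pyr(1), 2))              # reduction caused by the first smut ball
print(round(balls_for_reduction(50), 1))  # balls needed to halve panicle yield
print(rfsm_pyr(387) > 99.999)             # the curve is effectively at its maximum here
```

Because the curve saturates, each additional smut ball removes a fixed fraction of the remaining yield, so the marginal reduction per ball is largest for lightly infected panicles.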

Validation of the RFSmPYR Component of the Model
The RFSmPYR component of the model accurately predicted the variation in Yield Reduction (YR) with the number of smut balls present in the panicle. This accuracy was observed in both datasets, from Bangladesh (Root Mean Squared Deviation (RMSD) = 2.3% YR) and from Japan (RMSD = 1.2% YR). A regression of predicted against observed YR in the two countries accounted for 96% of the variance (R² = 0.96, slope = 0.99, standard error of the slope = 0.05, n = 15) (Fig. 5). Further statistical analysis of the slope against the 1:1 line showed no significant difference (p ≤ 0.05) between predicted and observed yield reduction.

Table 5. Comparison of yield difference in healthy and diseased (by rice false smut) rice under five panicle size categories. Here, "healthy" refers to the absence of any smut ball within a panicle, whereas "diseased" refers to the presence of one or more smut balls in a panicle. Also presented is the total number of spikelets per panicle accounted for in this study under healthy and diseased panicle samples, together with the number of smut balls per diseased panicle. ± is 95% confidence interval.

Table 6. Comparison of attributes driving yield difference in rice due to rice false smut disease. Healthy and diseased panicles are compared side by side under five panicle size categories. "Healthy" refers to the absence of any smut ball within a panicle, whereas "diseased" refers to the presence of one or more smut balls in a panicle. ± is 95% confidence interval. Columns: number of filled spikelets per panicle, chaffiness (%), weight of single filled spikelet (mg).

Table 7. Two statistics, LSD (least significant difference using standard error of difference between two means) and RMSD (root mean squared deviation), comparing yield loss predicted by the FLYER model and observed yield loss in Bangladesh (four data-points), Egypt (seven data-points) and India (five data-points). The "mean of yield loss as %" in Columns 2 and 3 refers to the average yield loss in percentage across the data-points within a country or across all the countries.

The FLYER Model and its Performance in the Field
Output from the FLYER model on the estimated yield loss from Rice False Smut (RFSm) disease across the range of diseased productive tillers and average number of smut balls in the diseased panicles is shown in Fig. 6. As an example, the figure indicates that with five smut balls per diseased panicle, the yield loss can be expected to range from <1 to ~14% depending on the incidence of the disease (as percent diseased productive tillers) in a population. When the model's output was validated at the field scale, the performance of FLYER was strong against the observed datasets from Bangladesh, Egypt and India (Table 7 and Fig. 7). Across the three countries with 16 data-points, the average yield loss predicted by the model (4.68%) and the observed average (5.52%) were close (RMSD = 1.25%) and not statistically different (LSD = 2.40). In Bangladesh, Egypt and India, the difference between prediction and observation, measured as RMSD, was 1.25, 1.65 and 1.65%, respectively, and the paired mean differences were not statistically significant either (Table 7). A regression of predicted on observed yield loss (%) over all the data-points from the three countries explained 91% of the variance (R² = 0.91) in the observations, further supporting the model's strength in estimating yield loss in wider environments (Fig. 7). Additional statistical analysis of the slope of the regression against the 1:1 line showed no significant difference (p ≤ 0.05) between predicted and observed values (slope = 1.10, standard error of the slope = 0.09, n = 16). The nature of the discrepancy between the model's predictions and the observed datasets, although small in the statistical analyses, is examined in Fig. 8. The figure shows that the nature of this discrepancy was not the same in the datasets of the three countries. In Bangladesh, the small failure of the model to estimate the observations perfectly is attributed to all three deviation statistics: SB (i.e., agreement between the predicted and observed means), SDSD (i.e., the magnitude of fluctuation in the observed data-points) and LCS (i.e., the pattern of fluctuation in the observed data-points). On the other hand, SB and SDSD were the major causes of discrepancy in the Egyptian and Indian datasets, respectively. In these two countries, the model largely predicted the pattern of fluctuation (LCS) in the observed data-points.

Figure caption (fragment): ... (Table 1). The term "all PS" represents the average of all panicles (268 each of healthy and diseased) across the categories.

Figure caption (fragment): The shaded area shows how the rice false smut disease influences the system. The broken arrow indicates that the influence is either small or not well understood.
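A yield loss chart of the kind described for Fig. 6 can be approximated by evaluating the model over a grid of the two inputs. The sketch assumes the exponential-rise form for the per-panicle reduction with YRmax = 100, YRhp = 0 and YRrate = 0.03, and multiplies it by the diseased-tiller fraction; the grid values are illustrative choices, not those used for the published figure.

```python
import math


def rfsm_pyr(bip, yr_max=100.0, yr_hp=0.0, yr_rate=0.03):
    """Yield reduction (%) in a diseased panicle with `bip` smut balls (assumed form)."""
    return yr_hp + yr_max * (1.0 - math.exp(-yr_rate * bip))


def yield_loss_chart(incidences, ball_counts):
    """Field yield loss (%) for every combination of incidence and balls per panicle."""
    return {(t, b): t * rfsm_pyr(b) for t in incidences for b in ball_counts}


incidences = [0.05, 0.25, 0.50, 0.75, 1.00]   # fraction of diseased productive tillers
ball_counts = [1, 5, 10, 20, 40, 80, 160]     # average smut balls per diseased panicle
chart = yield_loss_chart(incidences, ball_counts)

# Print the chart as a plain-text table: rows are incidences, columns are ball counts.
print("incidence " + "".join(f"{b:>8d}" for b in ball_counts))
for t in incidences:
    print(f"{t:>9.2f} " + "".join(f"{chart[(t, b)]:>8.2f}" for b in ball_counts))
```

Reading down the column for five smut balls reproduces the range quoted in the text: below 1% at 5% incidence and about 14% when every productive tiller is diseased.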

Discussion
It is not new to report that Rice False Smut (RFSm) disease causes yield loss in the crop. What is new in this study is that we have linked this yield loss to the smut balls on the diseased panicles. We used the number of smut balls to quantify the yield reduction in a diseased panicle and then used this yield reduction function as a multiplier of the disease incidence to develop FLYER, an estimator of yield loss from false smut disease in rice fields.
As contributors to the yield loss in rice from the disease, previous studies identified chaffiness and the weight of individual filled spikelets as the factors influencing yield reduction in rice panicles (Hu, 1985; Chib et al., 1992). The increase in chaffiness due to RFSm has been reported to be around 20% (in the range of 9 to 41%) (Baruah et al., 1992; Hegde and Anahosur, 2000; Sinha et al., 2003; Atia, 2004; Srivastava et al., 2004). As an exception, a considerably higher chaffiness of 50 to 75% was recorded by Li et al. (1986). Compared to chaffiness, the weight of individual spikelets in a panicle has been found to be more influential, showing a decrease of around 30% (in the range of 22 to 37%) (Hegde and Anahosur, 2000; Atia, 2004; Srivastava et al., 2004). In contrast, we show that chaffiness is more influential than weight reduction in individual spikelets in explaining yield reduction due to RFSm.
Panicle-producing tillers are the source of yield in rice crops. Development of tillers in rice is asynchronous: the primary tillers, having bigger panicles, produce more and better-quality grains (spikelets) than the secondary or tertiary tillers that initiate later in the development phase (Vergara et al., 1990). Bigger panicles carry a larger number of spikelets, most of which develop into filled spikelets (also termed grains). The share that does so is governed by chaffiness. In healthy panicles, physiological variation in chaffiness (also termed sterility) can be expected with panicle size (Mohapatra and Kariali, 2008), whereas the weight of individual spikelets usually relates to the number of filled spikelets per panicle (Oldeman et al., 1987). This is also reflected in our study, as represented in the flow diagram in Fig. 9. In RFSm-infected panicles, the effect of chaffiness on the reduction of panicle weight is straightforward, as it directly affects the number of filled spikelets. On the other hand, the precise effect of the weight of individual spikelets in a diseased panicle is not clear, as it is confounded by the number of filled spikelets in the panicle, which has already been dictated by the disease through chaffiness.
This study shows that, whether healthy or diseased, the bigger the panicle, the higher the yield. This yield was attributable almost entirely to the higher number of filled spikelets per panicle, in spite of the relatively lower weight of individual spikelets from the bigger panicles. With RFSm infection, the bigger the panicle, the higher the number of smut balls in it. Relative to healthy panicles, this in turn resulted in a higher percentage of chaffiness, reducing the number of filled spikelets and finally causing yield reduction, with a small additional contribution from the reduction in individual spikelet weight. Thus, the number of smut balls per panicle can largely explain the yield loss in rice with RFSm disease, as it relates to panicle size, largely influencing chaffiness and slightly affecting single spikelet weight. We have quantified this relationship of smut ball numbers to yield reduction in a panicle and developed an estimator of yield loss from false smut disease in rice fields.
The model, FLYER, is simple because the yield loss is estimated through a single algorithm, basically with a single estimated parameter (relating yield reduction in a panicle to the average number of smut balls present in the panicle). In spite of this simplicity and empiricism, FLYER showed robustness because (i) the yield reduction algorithm was developed using data from three countries, (ii) the developed yield reduction algorithm was validated in two countries and (iii) the model's yield loss estimates were tested in three countries. During model development and validation, we strictly applied the principle of not using data from the same experiments or the same fields for both purposes (Spedding, 1975). We applied a paired mean test, a correlation-regression approach and a deviation-based approach to perform rigorous statistical analysis of the "usefulness" of the model (Baker and Curry, 1976). The rice growing scenarios in the four countries from which data were used for model development and/or validation (Bangladesh, Egypt, India and Japan) differed with respect to climate, soil type, crop management system and variety. In addition, model testing covered the ranges of naturally occurring RFSm incidence and severity (denoted as number of smut balls present in diseased panicles) observed in the fields (Singh et al., 2014). Successful testing under those diverse scenarios enhanced the credibility of the model for use across national boundaries. To the best of our knowledge, no other such model is available for estimating yield loss in rice from rice false smut disease.
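As one illustration of the correlation-regression approach, the slope reported in the Results (slope = 1.10, standard error = 0.09, n = 16) can be compared with the 1:1 line using a t statistic. The sketch below assumes a simple linear regression with n - 2 degrees of freedom and a two-sided test at the 5% level; the exact test used in the study is not stated in this section, and the function name is hypothetical.

```python
def slope_t_statistic(slope, se_slope, reference=1.0):
    """t statistic for H0: regression slope equals `reference` (1 for 1:1)."""
    if se_slope <= 0:
        raise ValueError("standard error must be positive")
    return (slope - reference) / se_slope


# Values reported in the Results: slope = 1.10, SE = 0.09, n = 16.
t_value = slope_t_statistic(1.10, 0.09)
degrees_of_freedom = 16 - 2  # assumes an intercept and a slope were fitted

# Tabulated two-sided 5% critical value of Student's t for 14 degrees
# of freedom (hard-coded to keep the sketch dependency-free).
T_CRITICAL_DF14 = 2.145

slope_differs_from_one = abs(t_value) > T_CRITICAL_DF14
print(f"t = {t_value:.2f} on {degrees_of_freedom} df; "
      f"differs from 1:1 line: {slope_differs_from_one}")
```

Under these assumptions the t statistic is about 1.11, below the critical value, which agrees with the reported absence of a significant departure from the 1:1 line.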
In the conventional method of estimating RFSm-related yield loss in rice, diseased and healthy panicles are sampled, their weights are measured and the proportion of the weights is multiplied by the proportion of diseased panicles in the population (Sinha et al., 2003; Atia, 2004; Upadhyay and Singh, 2013). This method requires extensive sampling and measurement in every field where the yield loss is to be determined, and it does not account for disease status (number of smut balls) in the panicles. Therefore, on the one hand, it is a time-consuming process and, on the other, the yield losses may not be comparable because of variation in the scale (i.e., disease status). Our mathematical approach not only overcomes both problems but also validly estimates the yield loss.
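For comparison, the conventional calculation can be sketched as below. The sketch interprets "the proportion of the weights" as the relative weight reduction of diseased panicles compared with healthy ones, which is an assumption; the function name and the sample weights are hypothetical.

```python
def conventional_yield_loss(healthy_weights, diseased_weights,
                            diseased_fraction):
    """Yield loss (%) from sampled panicle weights and disease incidence.

    healthy_weights   -- grain weights of sampled healthy panicles
    diseased_weights  -- grain weights of sampled diseased panicles
    diseased_fraction -- fraction (0-1) of panicles that are diseased
    """
    if not healthy_weights or not diseased_weights:
        raise ValueError("both samples must contain at least one panicle")
    if not 0.0 <= diseased_fraction <= 1.0:
        raise ValueError("diseased_fraction must lie between 0 and 1")

    mean_healthy = sum(healthy_weights) / len(healthy_weights)
    mean_diseased = sum(diseased_weights) / len(diseased_weights)
    # Assumed interpretation: relative weight reduction of diseased panicles.
    relative_reduction = 1.0 - mean_diseased / mean_healthy
    return 100.0 * diseased_fraction * relative_reduction


# Hypothetical panicle grain weights (g), for illustration only.
healthy = [2.0, 2.2, 1.8]
diseased = [1.5, 1.7, 1.3]
print(f"{conventional_yield_loss(healthy, diseased, 0.20):.1f}% yield loss")
```

Every field needs its own weight samples for this calculation, and the result carries no information about how many smut balls the diseased panicles held.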
While FLYER performed well, it must be acknowledged that the model did not perfectly predict every data-point in every dataset. In fact, a model is not expected to be perfect, as reality is always simplified in a model, partly because our understanding of basic processes is limited and partly because this enables us to handle the model (van Keulen et al., 1975). A model is a working hypothesis (Whisler et al., 1986) and efforts may be made to improve it further. Using the deviation statistics (SB, SDSD and LCS), we identified how and where the model deviated from what would have been a perfect match of the observations in all three countries. The model behaved differently in Bangladesh, Egypt and India. For example, FLYER showed a consistent bias in the Egyptian data by underestimating the yield loss, whereas in the Indian data it did not fully capture the magnitude of fluctuation in the observations. With the Bangladesh data, the model's imperfect match was attributable almost equally to bias and to failure to simulate the magnitude and pattern of fluctuations in the observed data-points. The model needs to be tested with more data covering a wider range of disease incidence and severity to ascertain whether these country-specific deviation statistics are consistent. If they are, and the deviations become large, the model may need to be re-parameterised for specific countries. While testing the model, it should be kept in mind that FLYER estimates yield loss assuming the loss is directly related to the RFSm disease, not to its association with other diseases in the population. The association of RFSm with other diseases, such as sheath blight or kernel bunt, is not uncommon (Tyagi and Sharma, 1978; Tsuda et al., 2003; Kapse et al., 2012); this should be taken into consideration while collecting data for model testing, or observing a field for estimating yield loss using this model.

Conclusion
By simply inspecting a crop affected by rice false smut disease, the model can be used as an instant estimator of yield loss in a field. A hard copy of the model's output, in the form of a matrix of disease incidence and averaged smut balls per infected panicle, can be used by extension agents to quickly assess regional yield loss. It can also be used by farmers for strategic management of the disease. The disease is predominantly soil-borne (TeBeest et al., 2010) and the estimated field-by-field yield loss can help a farmer to identify the fields where management options may be warranted. Where a field is large, yield loss can be estimated in grids, so that precision management can be employed in specific locations, saving management cost. Given that the number of smut balls on panicles or the degree of blanking (chaffiness) is related to the level of resistance of the cultivar (Cartwright et al., 2003), the model also has potential to be used by researchers in assessing variety-specific yield loss.