Proposal of a Genome Editing System for Genetic Resistance to Tomato Spotted Wilt Virus

Corresponding Author: Federico Martinelli, Department of Agricultural and Forest Sciences, University of Palermo, Viale Delle Scienze 13, 90128 Palermo, Italy
Email: federico.martinelli@unipa.it

Abstract: Viruses cause considerable economic losses in agriculture. New molecular approaches to develop genetic resistance, based on translational genomics and precision genetic modifications, are much needed. The type II Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) system, which includes the Cas9 nuclease, represents a promising and very powerful tool to specifically modulate the expression and activity of genes involved in biotic stress responses. In this study, we describe an approach to develop a platform based on the CRISPR system for genome editing in tomato. Tomato is an excellent plant for this approach considering its high-quality genome sequence, rapid life cycle and highly efficient in vitro culture and transformation protocols. Genome editing can be used to provide resistance to Tomato Spotted Wilt Virus (TSWV) infections by achieving two specific objectives: (1) development of a Genome Editing (GE) system using CRISPR-Cas9 in tomato and (2) testing of the system in inducing genetic resistance to TSWV infections. First, it will be necessary to model the molecular dynamics of key host and pathogen proteins, predicting how targeted mutations affect their interactions. These host players will then be targeted by CRISPR-Cas9 technology. The plants obtained can be evaluated for their phenotypic resistance and analyzed in depth using "omic" platforms to gain insight into the gene regulatory networks of plant resistance.
The proposed project has essentially three outcomes: (1) identify host target proteins that interact with pathogen proteins and model their dynamic interactions; (2) develop a platform technology that can be used to obtain TSWV-resistant tomatoes by inducing targeted genetic modifications in the genome; (3) facilitate the adaptation of this platform to the improvement of important traits in other specialty crops.


Environmental and Climatic Parameters Important for Tomato
The development and growth of tomato are influenced not only by the genotype and cultivation techniques, but also by climatic and environmental conditions (temperature, humidity, wind speed), which in turn influence the management of irrigation.
Climatic conditions affect crop evapotranspiration (ET), which is computed as the product of the potential (reference) evapotranspiration (ET0) and the crop coefficient (Kc). Kc is higher for tomato than for other crops (Todorovic et al., 2009) and is particularly sensitive to temperature variations, especially during the growth period, when water stress is highest (Todorovic et al., 2009).
A correct estimate of ET0 is also important. It often cannot be obtained with the Penman-Monteith method, the best model currently available (Allen et al., 1998), because that method requires the measurement of many climatic variables. ET0 is therefore very often evaluated with the Hargreaves-Samani model, which requires only daily air-temperature data, which are widely measured (Samani, 2000), and gives results close to those of the Penman-Monteith model (Grillone et al., 2009). A reliable ET estimate leads to a correct estimate of water requirements and hence to good irrigation management, which is necessary when rainfall and runoff are insufficient. Runoff, often measured by indirect methods, is not always estimated correctly (D'Asaro and Baiamonte et al., 2015), especially in arid and semi-arid climates, and is strongly influenced not only by rainfall but also by air temperature, which differs between farmland located in or close to urban centers and farmland in the open countryside (Agnese et al., 2008).
In essence, air temperature is the decisive climatic and environmental variable for tomato: it greatly influences the duration of the crop cycle (Todorovic et al., 2009), ET0 (Grillone et al., 2009), Kc (Todorovic et al., 2009) and the runoff coefficient, and therefore the estimate of the water requirements on which accurate seasonal and daily irrigation scheduling depends.
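The relation between air temperature, ET0 and crop ET can be illustrated with a minimal Python sketch. It assumes the commonly cited form of the Hargreaves-Samani equation, ET0 = 0.0023 · Ra · (Tmean + 17.8) · sqrt(Tmax − Tmin), with the extraterrestrial radiation Ra expressed as equivalent evaporation (mm/day). The function names, the example temperatures and the Kc value are illustrative assumptions, not values taken from the cited studies.

```python
import math


def hargreaves_samani_et0(t_max, t_min, ra_mm_day):
    """Estimate ET0 (mm/day) from daily air temperatures (deg C).

    ra_mm_day is the extraterrestrial radiation expressed as
    equivalent evaporation (mm/day); it depends on latitude and date
    and is assumed to be supplied by the caller.
    """
    if t_max < t_min:
        raise ValueError("t_max must be >= t_min")
    t_mean = (t_max + t_min) / 2.0
    return 0.0023 * ra_mm_day * (t_mean + 17.8) * math.sqrt(t_max - t_min)


def crop_et(et0, kc):
    """Crop evapotranspiration as the product of ET0 and Kc."""
    return kc * et0


# Illustrative summer day (assumed values, not measured data).
et0 = hargreaves_samani_et0(t_max=32.0, t_min=18.0, ra_mm_day=16.5)
et_tomato = crop_et(et0, kc=1.15)  # kc is an assumed mid-season value
print(f"ET0 = {et0:.2f} mm/day, ET = {et_tomato:.2f} mm/day")
```

Because only the daily maximum and minimum temperatures enter the calculation besides Ra, the sketch makes explicit why air temperature dominates the irrigation estimate when the simplified model is used.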

Economic Importance of Viral Diseases in Agriculture
Viral diseases represent a risk factor for agricultural activity. The primary sector, compared with the secondary and tertiary ones, is exposed to greater risks (Cecchini et al., 2013; Lanfranchi et al., 2014a; 2014b; Lanfranchi and Giannetto, 2014; Monarca et al., 2014; Tudisca et al., 2014a; Santeramo et al., 2012). Generally, risk is an intrinsic part of business (Tudisca et al., 2015a). When an entrepreneur makes a decision, its consequences for economic results are uncertain (Volpato, 2000; Tudisca et al., 2015b), but the risk attached to these decisions is the source of profit. In fact, if the results of entrepreneurial choices were assured, the remuneration of the resources employed, in the absence of market power, would be known and equal to their marginal productivity, without any surplus (Cosmina et al., 2012; Schotter, 1995). In the agricultural sector, the farmer's decisions are riskier because they depend on the biological nature of crop production (Lanfranchi et al., 2014c; Sgroi et al., 2014e; Testa et al., 2014a; 2014b; 2014c; Tudisca et al., 2014b; 2014c). Crop yields are conditioned by various factors that the farmer can only partially control, typically by means of suitable statistical tools for variables (Lupo, 2014a; 2014b) and/or for attributes (Inghilleri et al., 2013; Lupo, 2014c; 2013). Crop-production risk, that is, the possibility that the quantity or quality produced is lower than expected because of weather calamities or pathogens, has a huge impact on farm revenue and causes considerable uncertainty. Risk reduction is an essential condition for creating a competitive advantage for enterprises in the medium to long term (Sgroi et al., 2014d; Squatrito et al., 2014; Tudisca et al., 2013c). It can be achieved by means of insurance instruments, which reduce uncertainty, or scientific research, which helps contain the risks due to pathogen attack.
The development of resistant genotypes using novel genome editing technologies is highly desirable to significantly reduce entrepreneurial risk.

Characteristics of Tomato Spotted Wilt Virus (TSWV)
Tomato Spotted Wilt Virus (TSWV), of the genus Tospovirus in the family Bunyaviridae, is one of the most destructive plant viruses, affecting many crops worldwide, including tomato, pepper, potato, tobacco, peanut, lettuce, bean and ornamental species (Milne and Francki, 1984). TSWV virions are quasi-spherical and consist of an outer membrane envelope derived from the host, with two embedded virus-encoded glycoproteins (GN and GC). Inside the viral particle, the RNA-dependent RNA polymerase (RdRp) and nucleoproteins are present. The genome is composed of three negative-sense or ambisense RNA segments: segment L (~9 kb) encodes a putative RNA-dependent RNA polymerase; segment M (~5 kb) encodes the cell-to-cell movement protein, NSm, and the precursor of the surface glycoproteins, GN/GC, involved in TSWV transmission by thrips; and segment S (~3 kb) encodes a silencing suppressor, NSs, and the nucleocapsid protein. TSWV management is difficult because of the broad host range and the resistance that thrips develop to chemical control (Boiteux and Giordano, 1993). At present, only two dominant genes, Sw-5 and Tsw, have been shown to confer effective resistance, and they have been introgressed into tomato (Solanum lycopersicum) and pepper (Capsicum annuum) cultivars (Boiteux et al., 1995; Moury et al., 1998; Stevens et al., 1991). These genes confer resistance to a wide spectrum of TSWV isolates. They were derived from two wild species, Solanum peruvianum and Capsicum chinense, respectively. Resistance is caused by a gene-for-gene mechanism. Sw-5 triggers a Hypersensitive Response (HR) at the sites of infection, which prevents systemic infection of the host. Sw-5 recognizes the pathogen avirulence protein NSm (Peiro et al., 2014). Although five paralogues are present in the genome, only one, Sw-5b, is necessary and sufficient to induce resistance against TSWV (Spassova et al., 2001). The Sw-5b protein comprises 1246 amino acids.
It is a member of the coiled-coil, nucleotide-binding (NB-ARC: nucleotide-binding adaptor shared by APAF-1, certain R gene products and CED-4) and leucine-rich repeat group of resistance gene candidates (Meyers et al., 1999).
The emergence of TSWV Resistance-Breaking (RB) isolates has reduced the effectiveness of management practices based on Sw-5 (Aramburu and Marti, 2003; Gordillo et al., 2008; Zaccardelli et al., 2008). The lack of a TSWV infectious clone has hampered the study of the molecular mechanisms associated with Sw-5 RB isolates. Comparative analysis of nucleotide and amino acid sequences from RB and Non-Resistance-Breaking (NRB) isolates revealed that the breakdown of Sw-5 resistance occurs when a tyrosine at position 118 (Y118) or an asparagine at position 120 (N120) is present in the NSm protein (Lopez et al., 2011). This knowledge makes it possible to predict protein-protein interactions between the mutated, resistance-breaking version of NSm and modified host proteins, and to identify the effects of these substitutions on the host proteins that interact with NSm. Through molecular dynamics modeling, it is possible to identify alternative forms of host proteins that interact with these mutated pathogen versions and induce broad resistance to different virus isolates.
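As a simple illustration of how the sequence comparison could be automated, the following Python sketch flags an NSm protein sequence as a putative RB isolate. It assumes, following the association reported by Lopez et al. (2011), that a tyrosine at position 118 or an asparagine at position 120 marks resistance breaking. The function name and the toy sequences are hypothetical and are not real NSm sequences.

```python
# Residues assumed to mark resistance-breaking NSm variants
# (1-based protein positions; an assumption based on the text above).
RB_MARKERS = {118: "Y", 120: "N"}


def is_putative_rb(nsm_protein):
    """Return True if the NSm sequence carries Y118 or N120."""
    seq = nsm_protein.upper()
    for position, residue in RB_MARKERS.items():
        # Convert the 1-based position to a 0-based string index.
        if len(seq) >= position and seq[position - 1] == residue:
            return True
    return False


# Toy sequences built only to exercise positions 118 and 120.
toy_nrb = "A" * 117 + "C" + "A" + "T" + "A" * 10
toy_rb_118 = "A" * 117 + "Y" + "A" + "T" + "A" * 10
toy_rb_120 = "A" * 117 + "C" + "A" + "N" + "A" * 10

for name, seq in [("NRB-like", toy_nrb),
                  ("Y118", toy_rb_118),
                  ("N120", toy_rb_120)]:
    print(name, is_putative_rb(seq))
```

A screen of this kind only sorts isolates by genotype; the structural consequences of each substitution still have to be assessed by modeling.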

Detection Methods of Plant Diseases
Rapid, early and accurate diagnosis in cultivation areas is essential to counter the threat of dangerous crop pathogens. Currently, crop diseases are diagnosed by visual scouting for symptoms. Disease confirmation is generally performed using serological and molecular methods such as Enzyme-Linked Immunosorbent Assay (ELISA), western blots, immuno-strip assays, dot-blot and specific electron microscopy (Van Vuurde et al., 1987). ELISA techniques based on both polyclonal and monoclonal antibodies are useful for identifying many bacteria, and this has allowed numerous detection kits to be developed. Nucleic acid detection can be divided into DNA-based methods (FISH and the many PCR variants) and RNA-based methods (RT-PCR, NASBA and AmpliDet RNA) (Lopez et al., 2009).
Growers and the food production industry would greatly benefit from new tools to detect pathogen infections at early stages, which would help them better contain the spread of crop disease and mitigate the potential for economic devastation. To be successful, rapid diagnosis needs to identify a complex network of relationships between pathogen, vector and host plant. Especially for tree crops, local growers struggle to afford the cost of regular scouting teams, vector control measures and tree removal. Even when used, these practices are not enough to prevent the spread of the disease, which is present in the crop before it can be detected by these technologies. Advanced methods based on the analysis of volatiles for detecting asymptomatic plant disease directly in the field have been developed (Martinelli et al., 2014a) and will complement classical molecular methods (Panno et al., 2014). New sensor devices that provide detection results directly in the field have been proposed for different crops (Dandekar et al., 2010; Aksenov et al., 2013; Martinelli et al., 2014b). RNA-seq using next-generation sequencing has made it possible to identify host biomarkers for early disease detection (Martinelli et al., 2012a). Microarray techniques have been shown to be effective tools for characterizing plant or organ stress status (Rizzini et al., 2010). Metabolomics, widely used to elucidate plant physiological processes (Tosetti et al., 2012; Martinelli et al., 2011a; 2012b; Ibanez et al., 2014), is another potent tool for identifying metabolic biomarkers for early detection of diseases. Genes involved in phenylpropanoid pathways or chaperones (Natali et al., 2007) may be potential targets to characterize plant stress status.

Precision Genetic Modification Techniques for Improvement of Disease Resistance
When symptoms of a disease first appear in a new area and pathogen detection is confirmed, plant eradication is commonly performed. These methods, however, are often only partly successful because pathogens survive in neighboring areas. Therapeutic treatments usually target pathogen vectors. There is now overwhelming evidence that some of these chemicals pose high risks to humans and other living organisms (Forget, 1993). Nobody is completely protected against the risks of pesticides, especially in developing countries. The effects of organic metabolites used in crop protection on natural ecosystem communities are not known. The high-risk groups exposed to pesticides include production workers, formulators, sprayers, mixers, loaders and agricultural farm workers. Pesticides can harm any living organism. It is therefore necessary to develop new methodologies that counteract pathogens without the use of chemical compounds, through innovative approaches aimed at: (1) boosting plant immune responses, (2) modulating plant responses to pathogen attack (small-molecule hormones and natural harmless metabolites), (3) creating new (non-transgenic) resistant plants using precision genetic modification techniques. All these strategies need to exploit the great progress made in bioinformatics, computational methods and functional genomics. Transgenic model and crop plants expressing antimicrobial peptides (e.g., insect cecropins) have been shown to be resistant to various bacterial and fungal pathogens. An effective protein chimera combining recognition and lysis domains has been used to obtain transgenic grape plants resistant to Xylella fastidiosa (Dandekar et al., 2012, PNAS 109:3721-3725).
Transgenic Genetic Modification (GM) techniques have allowed significant enhancement of qualitative traits (Martinelli et al., 2009a) and the clarification of the role of genes and enzymes (Martinelli et al., 2011b). However, public opinion has turned against GM because of concerns about gene flow, ecological consequences, toxicity and allergenicity of GM crops. Advances in alternative biotechnological techniques are therefore highly desirable. Using Precision Genetic Modification (PGM) techniques, researchers can target specific sequences in complex genomes and precisely introduce modifications, including single nucleotide changes. The type II Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) system, which includes the Cas9 nuclease, is a very recent tool for genome editing (Shan et al., 2013). Since no transgene is placed in the genome, this method is widely considered non-transgenic. In bacteria, the CRISPR-Cas9 system is employed as an adaptive immune response against bacterial viruses. The system includes the Cas9 nuclease, which makes double-stranded breaks. This enzyme is guided by two noncoding RNAs, a crRNA and a tracrRNA, which form a complex with Cas9. These RNAs hybridize with the target genomic sequence, which is cleaved at a site next to the hybridization (Fig. 1). Specificity is provided by a 20 nt guide sequence in the crRNA, which matches the target immediately upstream of a 5'-NGG Protospacer Adjacent Motif (PAM) sequence. It was demonstrated that the crRNA and tracrRNA can be linked to constitute a single guide RNA (sgRNA) (Jinek et al., 2012). Computational tools are available to design an sgRNA that targets virtually any gene sequence for breakage and disruption. The DNA at the target site is cleaved by the Cas9 endonuclease and the resulting ends are joined by an error-prone Non-Homologous End Joining (NHEJ) process, which produces small deletions and creates a functional knockout.
In addition, a repair DNA template spanning the cut site can be incorporated by homologous recombination to obtain custom modifications of gene function (Baltes et al., 2014). These genome editing strategies can be used efficiently to induce crop resistance to stresses, including biotic attacks.
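The targeting rule described above (a 20 nt guide sequence lying immediately 5' of an NGG PAM) can be sketched in a few lines of Python. This is a minimal illustration, not a replacement for dedicated sgRNA design tools: it scans both strands for candidate sites and ignores off-target scoring, GC content and other design criteria. The function names and the example sequence are hypothetical.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_target_sites(dna, guide_len=20):
    """List candidate sites as (strand, start, guide, pam) tuples.

    start is the 0-based offset of the guide on the scanned strand
    (for '-' it refers to the reverse-complemented sequence).
    """
    dna = dna.upper()
    # The lookahead lets overlapping candidate sites all be reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    sites = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for match in pattern.finditer(seq):
            sites.append((strand, match.start(),
                          match.group(1), match.group(2)))
    return sites


# Toy sequence: a 20 nt stretch followed by an AGG PAM.
example = "ATGCATGCATGCATGCATGC" + "AGG" + "TTT"
for site in find_target_sites(example):
    print(site)
```

In practice each candidate would still need to be checked against the whole genome for near-identical sequences before an sgRNA is synthesized.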

Objectives and Methodology for Genome Editing in Tomato to Provide TSWV Resistance
A project aimed at obtaining genetic resistance to TSWV can be structured around two specific objectives: (1) develop a genome editing system in tomato, (2) evaluate the transcriptome and phenotype of the plants obtained in relation to resistance to TSWV (Fig. 2).
To obtain TSWV resistance by genome editing it is necessary to: (1) predict the molecular dynamics of the interactions between host proteins and NSm, (2) obtain tomato plants using CRISPR-Cas9 technology targeting key host players that interact with TSWV. For the testing of these GE-obtained plants, two activities may be considered: (1) analysis of the GE-modified plants using "omic" tools, (2) phenotypic evaluation of these plants.

Dynamic Modeling of Pathogen and Host Proteins Interactions
Prediction of the three-dimensional structure of a protein from its amino acid sequence is one of the fundamental challenges of structural biology. If a homologue of the target protein has already been solved, the task is relatively easy and high-resolution models can be built. If a structural homologue does not exist, or cannot be identified, models can be constructed from scratch by a procedure called ab initio modeling (Klepeis et al., 2005). The availability of high-quality tomato genome sequences allows these chimeric proteins to be used to interrogate a database that in theory covers nearly the entire tomato proteome. In the case of TSWV resistance mediated by Sw-5, the three-dimensional structure of NSm will be deduced with ab initio modeling software, such as the Rosetta suite. Structures obtained with this methodology can be relaxed under physiological conditions by Molecular Dynamics (MD) simulations in which water molecules and ions are added explicitly. Host proteins whose structure may interact with pathogen proteins such as NSm can then be identified. It is essential to predict the molecular model of interaction between the Sw-5 protein and other tomato proteins in order to elucidate indirect intermediates between Sw-5 and pathogen proteins. A functional study of the promoters of these genes using public gene expression databases (e.g., Genevestigator) will be important. Endogenous proteins that interact with pathogen proteins can be identified to discover how protein domains interact with each other. The purpose is to predict alternative forms of host proteins that interact with the mutated pathogen versions and provide broad-spectrum resistance.

Development of a Genome Editing Platform in Tomato
The aim of this activity is to construct and test a tomato-optimized RNA-guided genome editing system. The guide RNA will incorporate sequences that target a model gene whose disruption clearly shows whether the system is working. A possible target is the Phytoene Desaturase (PDS) gene. The system may have the following components: (1) a Cas9 endonuclease required for DNA cleavage, (2) an sgRNA containing the 20-base target sequence for phytoene desaturase as well as the region that forms a complex with the Cas9 nuclease. Free online tools can be used to search the coding region for an appropriate 20 bp target sequence, located immediately upstream of a PAM sequence, that will be cleaved by the Cas9 nuclease. Micro-Tom tomato is a good model for developing the genome editing system in plants. This cultivar was originally developed for home gardening but has several qualities favorable for functional genomics studies. Like Arabidopsis, it grows well in a laboratory setting under artificial light. It has a short life cycle and can grow at densities of up to 1357 plants/m2. A genome editing system may be composed of two binary vectors: the first expresses NPTII, dsRed and the Cas9 nuclease, and the second expresses the sgRNA and NPTII (selectable marker) (Fig. 2). Binary vectors 1 and 2 may be introduced into a disarmed Agrobacterium strain to create a functional transformation system; standard laboratory protocols may be used to accomplish this. A Cas9 coding sequence may be synthesized and codon-optimized for tomato based on the protein sequence from Streptococcus pyogenes. A tag (e.g., FLAG) may be added to the N-terminus of Cas9 and a nuclear localization sequence to the C-terminus. The Cas9 and dsRed genes can be driven by a CaMV 35S promoter, as can sgRNA transcription.
During tissue culture and shoot regeneration, wild-type plants will have green shoots whereas the mutants will display a bleached albino phenotype. Agrobacterium carrying the above constructs will be used to infect tomato tissues. Regenerated plants obtained by in vitro culture (Martinelli and Sebastiani, 2009) may be evaluated for the presence of the targeted mutations. At days 4 and 6 after infection, protein and RNA are extracted. Expression of the Cas9 nuclease may be determined by ELISA using a FLAG antibody, and qRT-PCR is used to estimate the expression level of the Cas9 mRNA and the sgRNA. A replacement strategy can be tested at a specific location by introducing, via homologous recombination, an altered gene with a modified function. The strategy is to construct and test an optimized dsRed gene sequence flanked by the gene sequence of a tomato interacting protein identified by the previous modeling. This construct will serve as a template and will be placed, along with the RNA-guided genome editing system, on the same plasmid vector. The guide RNA will target the tomato gene and the adjoining dsRed-containing DNA will serve as a template for replacement. In the case of TSWV resistance, the gene sequence can be modified on the basis of the molecular modeling to allow interaction with the protein of non-resistance-breaking strains. Regenerated plants can be screened for a phenotype that stably expresses dsRed. Mutant plants can be checked for the presence of the targeted mutation as previously described. The aim is to create plants that express the modified version of the host interacting proteins and thereby gain broad resistance to TSWV.

RNA-Seq Analysis to Identify Host and TSWV Biomarkers in Specific Infected Tissues
Tomato plants obtained by genome editing and control plants may be inoculated with different non-resistance-breaking and resistance-breaking TSWV isolates. Mature leaf portions may be taken from greenhouse-grown tomato plants infected with TSWV and confirmed positive by PCR. A possible experimental design samples a total of 10 plants in each category, with 10 leaves collected from the uppermost branches of plants in each of the infected and uninfected categories. Time points may be 3, 7 and 21 days and 2 months after infection. cDNA libraries can be constructed with commercial kits such as the TruSeq RNA Sample Preparation Kit (Illumina Inc., San Diego, USA) and sequenced on a deep next-generation sequencing platform. Assembled reads can be functionally analyzed to identify differentially regulated genes using appropriate bioinformatics software such as Mapman, Pathexpress, Cytoscape and Blast2GO. The RNA-seq data can be validated by qRT-PCR on candidate genes that play a key role in host-virus interactions. A knowledge base of the transcriptional regulation of early events during tomato responses to viral infections may be built. The data will clarify which key molecules play an important role in the transcriptional regulation of these interactions.
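The step of identifying differentially regulated genes can be illustrated with a deliberately simplified Python sketch that computes a log2 fold change between infected and control samples and flags genes above a threshold. The pseudocount, the threshold and the gene names are illustrative assumptions; a real analysis would use normalized counts and a statistical test from a dedicated package rather than a fold-change cutoff alone.

```python
import math
from statistics import mean


def log2_fold_change(infected, control, pseudocount=1.0):
    """log2 ratio of mean infected to mean control expression.

    The pseudocount (an assumption of this sketch) avoids division
    by zero for genes with no reads in one condition.
    """
    return math.log2((mean(infected) + pseudocount)
                     / (mean(control) + pseudocount))


def flag_candidates(counts, min_abs_lfc=1.0):
    """Return {gene: lfc} for genes whose |log2 FC| meets the cutoff.

    counts maps a gene name to (infected_values, control_values).
    """
    flagged = {}
    for gene, (infected, control) in counts.items():
        lfc = log2_fold_change(infected, control)
        if abs(lfc) >= min_abs_lfc:
            flagged[gene] = lfc
    return flagged


# Hypothetical read counts for three placeholder genes.
toy_counts = {
    "geneA": ([7, 7], [1, 1]),   # induced by infection
    "geneB": ([3, 3], [3, 3]),   # unchanged
    "geneC": ([0, 0], [3, 3]),   # repressed by infection
}
print(flag_candidates(toy_counts))
```

Genes flagged in this way would be the natural candidates for the qRT-PCR validation mentioned above.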

Molecular and Phenotypic Evaluation of Genetic Resistance
The same GE-modified plants may be evaluated at the molecular, agronomic and phenotypic levels and compared with untransformed control plants. These two categories of plants may be inoculated with TSWV isolates and evaluated longitudinally at defined times after infection (3, 6, 21 days, 2 months after infection). RNA-seq, microarrays and even subtractive hybridization methods have been shown to be particularly effective for characterizing global metabolic changes in plant tissues (Rizzini et al., 2010; Galla et al., 2009; Martinelli et al., 2013a). qRT-PCR analysis is conducted to determine the expression level of defense response genes generally involved in plant-virus interactions. The strategy is to focus on tomato genes that were differentially regulated by TSWV infection, as identified in previously published work. Molecular markers commonly used for phylogenetic studies (Martinelli et al., 2008; Martinelli et al., 2009b; Minnocci et al., 2010) may also be employed to determine untargeted effects on genomic structure. The phenotype will be analyzed by measuring key morphological parameters such as leaf, stem and whole-plant dimensions, length, size and health. It is also highly desirable to determine whether other agronomic and morphological parameters, not related to the response to TSWV infection, are affected by genome editing. The analysis will be conducted both before and after infection. Titer and molecular analysis of the nucleic acids of the pathogen will be conducted at the sites of infection. We expect to see improved genetic resistance in GE-modified plants, as shown by phenotypic parameters corroborated by molecular analysis.

Conclusion
CRISPR-Cas9 technologies represent the future of genetic improvement using precision genetic modification techniques. Hundreds of publications have been produced in just two years. This system has some important practical applications and benefits:
• Applications include (1) inactivating harmful genes, such as those that increase susceptibility to pests and pathogens, or (2) improving the function of some genes, such as those that improve fruit flavor and quality
• CRISPR-Cas9 can be delivered into plants using an Agrobacterium-mediated plant transformation system. Since no foreign genes are inserted in the modified scion, this platform should be less regulated than transgenics and gain more public acceptance
• The varietal phenotype will be modified only for the targeted feature; no other characteristic of the modified variety will be changed
• The system is very effective at specifically targeting one genomic sequence, with a low level of unintended off-target modifications