# **Frontiers in RNAi**  *(Volume 1)*

# **Edited By**

# **Ralph A. Tripp**

*Department of Infectious Diseases University of Georgia Athens, GA USA* 

# **&**

# **Jon M. Karpilow**

*Biosciences Division Thermo Fisher Scientific Lafayette, CO USA* 

© 2014 by the Editor / Authors. Chapters in this eBook are Open Access and distributed under the Creative Commons Attribution (CC BY 4.0) license, which allows users to download, copy and build upon published chapters, as long as the author and publisher are properly credited, which ensures maximum dissemination and a wider impact of our publications. The book taken as a whole is © 2014 Bentham Science Publishers under the terms and conditions of the Creative Commons license CC BY-NC-ND.

## **CONTENTS**



## **FOREWORD**

Great technologies follow similar roads to maturity. A breakthrough discovery leads to rapid early adoption, which (more often than not) precedes the identification of limitations. Eventually, solutions are identified and the technology, with all its strengths and weaknesses, is put to the test through widespread implementation. This process of maturation takes years, and in this regard RNA interference (RNAi) is no different from yeast two-hybrid, chip-based gene expression profiling, and monoclonal antibodies.

Once the scientific community overcame the sense of excitement associated with the discovery of non-coding RNA-based posttranscriptional gene regulation, early adopters between 2000 and 2005 immediately gained insights into the technology's limitations. Not all siRNAs silenced with equal efficiency. Some siRNA designs activated the innate immune response. Other siRNAs exhibited off-target effects and could induce false positive phenotypes. To some, these challenges might have appeared overly daunting, but given the potential of RNAi, researchers in both academic and industrial settings pressed to find solutions. Algorithms were designed to address issues associated with functionality, and position-specific chemical modifications were adopted to minimize off-target effects and activation of the innate immune system. Self-delivering siRNAs were developed to address the need to deliver reagents to cell lines that were refractory to lipid-mediated delivery, and viral-based gene silencing was constructed to facilitate experiments in systems that required extended periods of knockdown.

While challenges still persist (*e.g.*, therapeutic delivery), overcoming the first wave of technical hurdles has been paramount to expansion of RNAi into new fields of interest. As evidenced in this ebook, continued mechanistic studies are now combined with a host of new developments where RNAi technology is being merged with miniaturized screening platforms, field-specific database infrastructures, host-pathogen interaction mapping, and new (3D) tissue culture models. At the same time, the technology is reaching into more applied fields. RNAi screening is slowly becoming a module of synthetic biology and cell line engineering, and a key component of therapeutic drug repurposing. This expansion of the technology mirrors the growth and development observed with other platforms (*e.g.*, NG Sequencing, PCR) and is indicative that over the course of little more than a decade, RNAi has grown up!

> *Devin Leake*  Vice President of Research and Development Gen9 Inc. Cambridge, MA USA

## **PREFACE**

RNA interference (RNAi) has developed into an important tool for elucidating gene function by pairing an endogenous gene-silencing pathway with well-designed triggers for the systematic silencing of particular genes. Incorporation of RNAi into high-throughput screening platforms has led to important discoveries in fields such as cancer biology and pathogenesis, successes driven by the establishment of best practices and by developments in instrumentation, analysis, and the growing sophistication of cell-based assays. The following chapters represent some of the most important considerations and recent developments in RNAi screening technology. It is certainly our hope that this compendium contains something for everyone, novice and experienced researchers alike.

The topics in this collection range from recent automation platforms for ultra-high-throughput screens (see Chapters 2 and 8) to novel applications of this gene modulation technology to cells grown in 3D culture (see Chapter 9). Chapter 1 provides a review of RNAi as a research tool as well as experimental and bioinformatics approaches to reduce and identify off-target effects in high-throughput screening data. Important considerations regarding deposition of large data sets in public repositories are presented in Chapter 3. Pooled shRNA screening methods are discussed in Chapter 4. Chapters 5 and 6 describe how RNAi technologies are being used to interrogate the host-pathogen interface. RNAi screening in difficult-to-transfect immune cells is addressed in Chapter 7, and applications beyond target discovery are presented in Chapters 10 and 11.

While it is not evident at first glance, a research alliance underlies all the contributions detailed in this book. All authors are affiliated with a lab that participates in the RNAi Global Initiative, a world-wide association of biomedical researchers established to broaden and accelerate the utility and application of RNAi technology. Founded in 2005 with eleven member institutions, the organization focused on developing and disseminating information on the basics of RNAi screening. As the organization grew and technology developed, focus shifted toward more complex, often biological, challenges.

Nine years after its inauguration, the RNAi Global Initiative now includes over 60 academic institutes spread across five continents. Members continue to share their findings in a range of fields (stem cell biology, cancer, and host-pathogen interactions) and, as evidenced by this book, collaborate in the development of new bioinformatic tools, screening methods, and programs in applied biology. The longevity of this alliance is the result of continued contributions on the part of its membership and is a testament to the importance of collaboration in building a cohesive scientific community.

> *Ralph Tripp*  University of Georgia College of Veterinary Medicine Department of Infectious Diseases Athens, GA, USA

> *&*

> *Jon Karpilow*  Thermo Fisher Scientific Lafayette, CO, USA

## **List of Contributors**





## **ACKNOWLEDGEMENTS**

The authors would like to thank the founders and sponsors of the RNAi Global Initiative: Dharmacon Inc. Their dedication to the field of RNAi screening and passionate support of this community not only made this ebook possible, but continues to drive advances to the benefit of all. In particular, we would like to acknowledge Mike Deines, Dr. Queta Smith, as well as Dr. Doris Beylkin for their instrumental roles in conceiving, organizing, and driving RNAi Global since its inception.

Additionally, the contributors would like to thank Dr. Beylkin for her leadership in formulating the content and organization of this book.

**CHAPTER 1** 

# **RNAi and Off-Target Effects**

**Amanda Birmingham<sup>1,†</sup>, Andreas Kaufmann<sup>2,†</sup> and Karol Kozak<sup>2,3,†,\*</sup>**

*<sup>1</sup> Dharmacon, part of GE Healthcare, 2650 Crescent Dr., suite 100, Lafayette, CO 80026 USA; <sup>2</sup> LMSC, ETH, Schafmattstrasse 18, 8093 Zurich, Switzerland and <sup>3</sup> Medical Faculty, Technical University Dresden, Photenhauerstr 41, 01307 Dresden, Germany* 

**Abstract:** RNA interference (RNAi) is widely used in high-throughput reverse genetics screens for basic research and drug discovery. However, this technique is known to produce "off-target" effects, or phenotypic results caused by an RNAi reagent's knockdown of unintended genes rather than the gene of interest. Off-target effects arise through multiple mechanisms within the cell, and their presence can greatly complicate the interpretation of experimental data. We review the biology of off-targeting and discuss both bioinformatics and experimental approaches to reduce RNAi reagents' off-target effects. Since such techniques cannot completely eliminate off-target effects, we also discuss analysis methods developed to identify off-target effects in RNAi screening data and, in some cases, even leverage them to uncover novel biological functionality.

**Keywords:** bioinformatics, miRNA, mRNA, off-target effects, off-targets, pools, RNAi, screening, seeds, silencing, siRNA.

## **INTRODUCTION**

RNA interference (RNAi) is a widely used tool for functional genetic studies and genetic screens in a number of organisms. In addition, RNAi-based technologies seem to have significant potential for the treatment of human disease [1, 2]. RNAi targets messenger RNA (mRNA) for degradation or translational attenuation and is mediated by small interfering RNA (siRNA) or microRNA (miRNA) bound to the RNA-induced silencing complex (RISC). Initially, it was believed that this gene silencing is highly specific because it requires near-perfect complementarity between the siRNA and its target transcript [3, 4]. However, siRNA-mediated gene silencing can lead to widespread, unintended effects that severely complicate the interpretation of experimental results or disease treatment [5-7]. Non-target-specific side-effects caused by the introduction of siRNAs, such as broad changes in gene expression profiles caused by cationic lipids present in transfection reagents [8] and a non-specific interferon response [9, 10], can be successfully addressed by optimizing delivery methods and siRNA design algorithms. However, RISC-dependent dysregulation of unintended transcripts containing full or partial sequence complementarity [5-7, 11-14] has proven more challenging to control. This dysregulation leads to sequence-dependent false positive results, or so-called off-target effects. The use of bioinformatics tools alone has been unsuccessful in preventing off-target effects, but a combination of bioinformatics and chemical modifications of the siRNA has been found to reduce unintended targeting, and bioinformatics analysis is increasingly able to identify off-target effects in experimental results [15-21].

**Ralph A. Tripp & Jon M. Karpilow (Eds) © 2014 The Author(s). Published by Bentham Science Publishers**

**<sup>\*</sup>Corresponding author Karol Kozak:** Medical Faculty, Technical University Dresden, Photenhauerstr 41, 01307 Dresden, Germany; Tel: 0351 8823 248; E-mail: karkoz1@gmx.de

<sup>†</sup> These authors contributed equally to this work.

### **Mechanisms of RNAi**

RNAi leads either to cleavage of the target mRNA or to its translational repression and/or degradation.

### *The Cleavage-Based Pathway*

Cleavage of the target mRNA is facilitated by small double-stranded RNA molecules having complete or near-complete complementarity to their target. Such molecules can be produced when exogenous (500-1000 nucleotides long) double-stranded RNA (dsRNA) is processed in the cytoplasm by Dicer, an RNase III enzyme, into siRNAs, which are approximately 21-nucleotide-long double-stranded RNA fragments with a two-nucleotide overhang at both 3′ ends [22, 23]. Alternatively, synthetic siRNAs that mimic Dicer cleavage products can be directly introduced into cells [3], thus bypassing target-nonspecific inhibitory mechanisms, such as the interferon response, elicited by longer dsRNA in mammalian cells [24]. The siRNA duplexes consist of a guide or antisense strand, which is complementary in sequence to the target mRNA, and a passenger or sense strand. Dicer facilitates RISC formation and entry of the siRNA into RISC [25]. Argonaute 2 (Ago2), a functional subunit of RISC, is required to cleave the passenger strand and unwind the guide strand from the siRNA duplex. The cleaved passenger strand is then released and the guide strand is loaded into the now-activated RISC [26-29]. It is not fully understood how the activated RISC locates the target mRNA within the cell, but binding of the target mRNA occurs by complementary base pairing, and Ago2 cleaves the mRNA at the point opposite bases 10 and 11 of the guide [30]. The guide strand is not affected; thus, RISC can cleave multiple copies of the target mRNA, leading to effective gene silencing.
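The position of the cut relative to the guide can be illustrated with a short sketch. It assumes perfect, full-length complementarity between guide and target and uses the RNA alphabet throughout; the function names and the example sequence are hypothetical and are not taken from any published tool.

```python
def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(rna.upper()))


def cleavage_sites(guide: str, mrna: str) -> list[int]:
    """Locate predicted cleavage points for perfectly matched target sites.

    Each returned value is a 0-based mRNA index i such that the cut falls
    between mrna[i - 1] and mrna[i].
    """
    site = reverse_complement(guide)  # mRNA sequence the guide pairs with
    n = len(guide)
    sites = []
    start = mrna.find(site)
    while start != -1:
        # Guide position p (1-based, counted from the guide 5' end) pairs
        # with mrna[start + n - p]. The cut lies between the partners of
        # guide positions 11 and 10.
        sites.append(start + n - 10)
        start = mrna.find(site, start + 1)
    return sites


# Hypothetical 21-nt guide and a toy mRNA containing one matching site.
guide = "UUCGAAGUACUCAGCGUAAGU"
mrna = "GGG" + reverse_complement(guide) + "AAA"
print(cleavage_sites(guide, mrna))
```

Because the guide and the mRNA pair in antiparallel orientation, guide positions are counted from the 3′ end of the target site, which is why the index is computed from `start + n`.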


Although siRNAs and shRNAs (short hairpin RNAs) produce a similar functional outcome, shRNAs differ significantly in their structure and processing before entering the cleavage-based pathway of RNAi. Experimentally, shRNAs can be produced from plasmids or viral vectors [31] and have a hairpin structure [32]. In the nucleus, shRNAs are expressed as pri-shRNAs (primary shRNAs) and converted into pre-shRNAs (precursor shRNAs) by the RNase III nuclease Drosha [33-35]. Exportin-5 (Exp5) then binds to and transfers the pre-shRNA to the cytoplasm [36], where Dicer processes it to a functional siRNA [37].

### *The Non-Cleavage-Based Pathway*

The non-cleavage-based RNAi pathway that leads to translational repression and mRNA degradation is mainly facilitated by miRNAs, which comprise a large family of endogenously encoded, non-protein-coding RNAs approximately 21 nucleotides long. In many organisms, miRNAs are vital for post-transcriptional modulation of gene expression. In this non-cleavage RNAi pathway, pri-miRNA molecules of several kilobases in length are transcribed in the nucleus by RNA polymerase II [38]. Drosha, together with DGCR8, which is essential for the recognition of pri-miRNA, processes the pri-miRNA transcripts into approximately 70-nucleotide-long hairpin pre-miRNAs [33, 38] that are actively exported from the nucleus by exportin-5 [36, 39, 40]. In the cytoplasm, the pre-miRNAs are again processed by Dicer into mature miRNA duplexes [22, 41, 42]. Mature miRNAs function together with Argonaute (the catalytic component) and GW182 (a scaffold protein) in ribonucleoprotein complexes called miRISCs (miRNA-induced silencing complexes), which regulate protein synthesis by binding to target mRNAs.

In animals, miRNAs generally bind to the 3′-untranslated region (3′ UTR) of target mRNAs and attenuate protein synthesis either by inhibiting translation or by inducing mRNA degradation through deadenylation, although it has also been observed that some miRNAs target the 5′ UTR and coding regions of mRNAs and/or may even enhance translation [43, 44]. Binding of a miRNA to its target mRNA does not require a fully complementary sequence. Instead, association is usually mediated by a six-to-eight base 5′-seed region that provides most of the base-pairing specificity (reviewed in [45, 46]). This matching may in some cases accommodate mismatches and bulged nucleotides [47]. Other partial complementarity such as that conferred by "centered seeds" [48] and "3′ compensatory seeds" [45] has been shown to direct targeting in some situations. Because these targeting mechanisms require such limited pairing, a single miRNA may regulate multiple mRNAs, and conversely a single mRNA may be regulated by multiple miRNAs [49, 50].

### **Mechanisms of RISC-Dependent Off-Targeting**

Gene silencing by RNAi was initially believed to be highly sequence-specific, only tolerating a few mismatches between the guide strand and the target mRNA [4, 51-53]. However, subsequent genome-wide analysis of siRNA efficacy and specificity revealed that as few as eleven contiguous nucleotides of identity between an siRNA and a transcript are in some cases sufficient to silence non-targeted genes [5]. Furthermore, since RISC-dependent off-targeting is driven by the strand of the siRNA duplex that has been loaded into RISC, off-target effects can be caused by loading of either the sense or the antisense strand [5, 14].

The primary cause of off-target effects appears to be siRNAs functioning in translational repression by a mechanism that is similar to that of endogenous miRNAs [11]. When siRNAs silence off-targeted genes through the non-cleavage-based pathway, full complementarity of the siRNA to the off-targeted mRNA is not required. A single complementary binding site for the hexamer or heptamer seed region (positions 2-7 or 2-8) of the antisense strand of the siRNA within the 3′-untranslated region (UTR) of the mRNA can be sufficient for repression [7, 12, 14]. Silencing by this mechanism generally leads to modest changes in expression, with a magnitude of two-fold or less [13].
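As a minimal sketch of the seed-match rule just described, the following code extracts the hexamer (positions 2-7) or heptamer (positions 2-8) seed of an antisense strand and lists where its complement occurs in a 3′ UTR. The function names and sequences are hypothetical, and exact matching is an assumption; it ignores the mismatches and bulges that real seed pairing can tolerate.

```python
def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(rna))


def seed(antisense: str, length: int = 7) -> str:
    """Return the seed: antisense positions 2-7 (hexamer) or 2-8 (heptamer)."""
    if length not in (6, 7):
        raise ValueError("seed length must be 6 or 7")
    return antisense.upper().replace("T", "U")[1:1 + length]


def seed_sites(antisense: str, utr: str, length: int = 7) -> list[int]:
    """Return 0-based start positions of exact seed complements in a 3' UTR."""
    target = reverse_complement(seed(antisense, length))
    utr = utr.upper().replace("T", "U")
    return [
        i
        for i in range(len(utr) - len(target) + 1)
        if utr[i:i + len(target)] == target
    ]


# Hypothetical antisense strand and toy 3' UTR.
antisense = "UUCGAAGUACUCAGCGUAAGU"
print(seed(antisense), seed_sites(antisense, "CCACUUCGACC"))
```

A non-empty result only marks a transcript as a candidate: as discussed later in the chapter, most transcripts with a seed complement are not in fact off-targeted.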

Although the majority of observed off-target effects appear to be due to siRNAs acting through a miRNA-like mechanism, off-targeting through the cleavage-based pathway can occur when the RISC-loaded strand has perfect or near-perfect complementarity to an unintended mRNA. Since cleavage-based silencing is generally stronger than non-cleavage-based silencing, the magnitude of such off-target effects can rival that of the intended effect, knocking down expression of the unintended target almost completely. Off-target effects of certain siRNAs can also be caused by complementarity with an unintended target that is complete save for G:U wobbles in positions 9 and 10 of the antisense strand; while wobbles at these key positions likely prevent cleavage of the unintended target, protein levels for that target have been shown to nonetheless suffer considerable reduction [54, 55].

### **Effects of Off-Targeting**

When present but unaccounted for in experimental data, off-targets lead to misinterpretation of biological results; in high-throughput screens, off-targets are particularly problematic because they inflate the false-positive rate of the screen, thus increasing cost and time spent in validation. While side-effects due to lipid usage and interferon activation can be adequately controlled in almost all cases, some screens have been found to be particularly vulnerable to RISC-dependent off-targets: for example, Lin *et al*. [7] reported a dual luciferase reporter screen designed to identify novel effectors of the HIF-1 pathway in which all three top hits were found to be caused by off-target effects, and Sudbery *et al*. [56] described an alamarBlue screen for sensitivity to TRAIL-induced apoptosis in which off-targets outnumbered true hits by two to one in their top 20 siRNAs. Schultz *et al*. [57] hypothesize that off-target effects may overwhelm on-target ones in assays that measure particularly dose-sensitive pathways such as TGF-β. However, it is not known how many or which assays fall into this category. Even in generally less-affected screens such as those based on dsRNAs, off-targets can considerably inflate the number of false positives [58]. For these reasons, considerable effort has been devoted to the minimization of off-target effects through optimized reagent design, adoption of assays that are less sensitive to such effects, and incorporation of post-experimental data analysis.

## **PREVENTING OFF-TARGETS**

### **Reagent Design**

Because of the considerable costs and problems posed by off-target effects, reagent designers employ a wide array of strategies to reduce reagent-induced off-targeting. A sampling of the most successful approaches to date is highlighted below.

### *Avoiding Toxicity and Interferon Effects*

While less problematic than RISC-dependent off-target effects, toxic effects caused by an siRNA's activation of the innate immune system must still be addressed during reagent design. Exogenous double-stranded RNA is prone to recognition by Toll-like receptors (TLRs). TLR7 and TLR8, in particular, recognize synthetic siRNAs in a sequence-dependent manner [59, 60] and seem to prefer a GU-rich subset of sequences, so siRNA target sites lacking GU-rich regions are preferable. Avoidance of motifs such as 5′-GTCCTTCAA-3′, 5′-TGTGT-3′, and tetrad-forming poly-(G) stretches also helps to overcome Toll-like receptor recognition [59, 61]. Furthermore, sequence motifs such as UGGC can lead to toxicity through RISC-dependent mechanisms [62], and excluding such sequence-specific triggers of cellular stress where possible can reduce widespread off-target dysregulation.
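A motif screen of this kind could be sketched as follows. The motif list comes from the paragraph above; the threshold of four or more consecutive G residues for a "poly-(G) stretch", the function name, and the choice of which strand to screen are assumptions made for illustration only.

```python
import re

# Motifs named in the text, written in the RNA alphabet.
FLAGGED_MOTIFS = ("GUCCUUCAA", "UGUGU", "UGGC")

# Assumption: treat four or more consecutive G residues as a poly-G stretch.
POLY_G = re.compile(r"G{4,}")


def flag_motifs(sequence: str) -> list[str]:
    """Return the flagged motifs found in one siRNA strand."""
    rna = sequence.upper().replace("T", "U")
    found = [motif for motif in FLAGGED_MOTIFS if motif in rna]
    if POLY_G.search(rna):
        found.append("poly-G")
    return found


# Hypothetical candidate sequences.
for candidate in ("AAUGUGUCCAUCAUCAUCA", "ACACACACACACACACACA"):
    print(candidate, flag_motifs(candidate))
```

In a design pipeline, candidates returning an empty list would pass this filter; the filter says nothing about general GU content, which would need a separate check.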


When designing shRNAs, additional consideration must be given to the choice of scaffold and promoter. MicroRNA-derived shRNA scaffolds appear to reduce cellular toxicity compared to simple hairpins, presumably because of more efficient processing by the Microprocessor and Dicer [63-66]. Additionally, extremely strong promoters such as the human U6 small nuclear promoter can create toxicity, perhaps due to overexpression of the shRNA that overwhelms the cell's RNAi machinery [67]. They should therefore be avoided in favor of more moderate promoters.

### *Designing for Efficient RISC Entry of the Antisense Strand*

In the early steps of RISC formation, both strands of the siRNA duplex are initially incorporated into the complex. The siRNA strand with the less thermodynamically stable 5′-end is preferentially utilized as the antisense strand [68, 69] and the remaining (sense) strand is subsequently cleaved and released [29, 70, 71]. Because of this, siRNA guide strands with relatively low 5′-end stability (such as those in duplexes having at least three (A/U)s in the final five nucleotides at the 3′-end of the sense strand) are preferable in order to prevent off-targeting driven by unintended RISC-loading of the sense strand [72]. A different approach to prevent passenger strand entry uses an antisense strand that is complemented with two shorter, 9-13 nucleotide-long sense strands. Because the RISC is unable to incorporate the bipartite sense strands, such a three-stranded construct completely eliminates unintended mRNA targeting by the sense strand [19]. In addition to sequence- and structure-based design parameters, synthetic siRNAs (but not shRNAs expressed from plasmids or viral vectors) can be chemically modified to prevent loading of the sense strand into RISC (see below).
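The (A/U) rule of thumb quoted above is easy to express in code. This is a minimal sketch that assumes the sense strand is supplied as its duplexed core, without 3′ overhangs; the function name and default thresholds simply restate the rule in the text and are not taken from any published implementation.

```python
def favors_antisense_loading(sense: str, window: int = 5, min_au: int = 3) -> bool:
    """Apply the rule: at least `min_au` A/U bases in the last `window`
    nucleotides at the 3' end of the sense strand.

    Assumes `sense` excludes any 3' overhang, so that its 3' end pairs
    with the 5' end of the antisense strand.
    """
    tail = sense.upper().replace("T", "U")[-window:]
    return sum(base in "AU" for base in tail) >= min_au


# Hypothetical 19-nt sense-strand cores.
print(favors_antisense_loading("GCGCGCGCGCGCGCAUAUA"))  # A/U-rich 3' end
print(favors_antisense_loading("AUAUAUAUAUAUAUGCGCG"))  # G/C-rich 3' end
```

A count of A/U bases is only a proxy for duplex-end stability; a nearest-neighbor free energy calculation would be a more faithful, though longer, alternative.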

### *Avoiding Near-Perfect Complementarity with Unintended Targets*

As described above, mRNAs other than intended targets that exhibit perfect or near-perfect sequence complementarity with the siRNA can suffer unintended cleavage. Considerable success in predicting such cleavage-based off-targets is possible, since this is largely analogous to predicting on-target cleavage-based functionality of an RNAi reagent. Generally, such off-targets are predicted by checking for perfect or near-perfect sequence matches against the targeted transcriptome; they are then avoided by choosing only reagents that do not have 17 or more bases of sequence complementarity to any unintended mRNA target [73]. However, this approach can sometimes lead to false positive off-target predictions if the anticipated off-targeted transcript is not expressed in the cell line or tissue to be assayed.
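One way to sketch the filter described above is to measure the longest run shared between the siRNA target site and each transcript. The 17-base threshold comes from the text; interpreting it as a contiguous run, the function names, and the toy transcriptome are assumptions for illustration. Production pipelines would typically use an alignment tool rather than this quadratic scan.

```python
def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(rna.upper()))


def longest_shared_run(a: str, b: str) -> int:
    """Length of the longest contiguous substring common to a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            if x == y:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def cleavage_off_target_candidates(
    antisense: str,
    transcripts: dict[str, str],
    intended: str,
    threshold: int = 17,
) -> list[str]:
    """Names of unintended transcripts sharing >= threshold contiguous
    bases with the site complementary to the antisense strand."""
    site = reverse_complement(antisense)
    return sorted(
        name
        for name, seq in transcripts.items()
        if name != intended
        and longest_shared_run(site, seq.upper().replace("T", "U")) >= threshold
    )


# Hypothetical antisense strand and toy transcriptome.
antisense = "UUCGAAGUACUCAGCGUAAGU"
site = reverse_complement(antisense)
toy = {
    "GENE_A": "GGG" + site + "AAA",       # intended target
    "GENE_B": "GG" + site[:17] + "GG",    # 17-base match
    "GENE_C": "GG" + site[:16] + "GG",    # 16-base match
}
print(cleavage_off_target_candidates(antisense, toy, intended="GENE_A"))
```

Restricting `transcripts` to those expressed in the assayed cell line would address the false positive predictions mentioned above.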

Prediction of a reagent's seed-based off-targets has proven much more challenging than prediction of its cleavage-based off-targets. While it is the case that most off-targeted genes have 3′ UTR seed complementarity to the off-targeting reagent, it is *not* the case that most genes with such seed complementarity are in fact off-targeted [12]. Since seed-based off-targets are understood to be produced by an miRNA-like mechanism, miRNA target prediction tools (such as TargetScan, miRanda, and DIANA microT, among others) are frequently employed to predict siRNA-mediated off-targets [74-76]. Such tools are usually based on hexamer, heptamer, or octamer seeds, and are usually limited to identifying reverse complements in the 3′ UTRs of the transcriptome. Sometimes they also include criteria of limited relevance to off-target prediction applications, such as conservation filters. While these approaches perform better than simple seed-match counting, they generally lead to massive over-prediction of targets [77]. Nonetheless, since the precise mechanism of seed-based targeting is an area of intense scientific interest, much research is underway that should lead to improvement in these techniques [78-84].

Until these techniques are perfected, reagent designers will continue to rely on heuristic approaches to reducing an siRNA's potential for miRNA-like off-targeting. Such methods include avoiding siRNAs whose antisense strands contain seeds already believed to cause miRNA-like targeting in the system of interest (such as those shared by known miRNAs in that system) [73] or preferring siRNAs with low-stability seeds, since these have been shown to have reduced off-target activity [85]. Other approaches examine the potential space of seed-based off-targets for a given antisense seed by evaluating the frequency of complements to it in the 3′ UTRs of the transcriptome of interest. It has been shown that siRNAs with low seed complement frequencies (SCFs), and thus a low number of potential seed-matched transcripts, generally off-target a smaller number of unintended transcripts [86]. Conversely, it has also been suggested that siRNAs whose seeds have a particularly high abundance of potential target sites in the transcriptome may not function efficiently as miRNAs due to dilution of their ability to repress any individual unintended transcript [74]. Thus, designers may choose to favor siRNAs containing low and/or high (but not medium) frequency seeds in order to lower their off-targeting potential.
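A seed complement frequency can be sketched as a count over a set of 3′ UTRs. Here the SCF is taken to be the number of distinct 3′ UTRs containing at least one exact complement to the hexamer seed (positions 2-7); that definition, the function names, and the toy data are assumptions for illustration and may differ in detail from the published method [86].

```python
def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(rna))


def seed_complement_frequency(
    antisense: str, utrs: dict[str, str], seed_length: int = 6
) -> int:
    """Count 3' UTRs that contain at least one complement to the seed."""
    seed = antisense.upper().replace("T", "U")[1:1 + seed_length]
    site = reverse_complement(seed)
    return sum(site in utr.upper().replace("T", "U") for utr in utrs.values())


def rank_by_scf(candidates: list[str], utrs: dict[str, str]) -> list[str]:
    """Order candidate antisense strands from lowest to highest SCF."""
    return sorted(candidates, key=lambda s: seed_complement_frequency(s, utrs))


# Hypothetical 3' UTR set and two candidate antisense strands.
utrs = {
    "g1": "AACUUCGAAA",
    "g2": "GGGGG",
    "g3": "CUUCGACUUCGA",
    "g4": "AACCCCCCAA",
}
candidates = ["UUCGAAGUACUCAGCGUAAGU", "UGGGGGGUACUCAGCGUAAGU"]
print([(c, seed_complement_frequency(c, utrs)) for c in candidates])
```

Ranking alone implements only the "prefer low SCF" heuristic; favoring both low and high frequencies, as suggested above, would need explicit cut-offs chosen for the transcriptome in question.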

### **Chemical Modifications**

The most promising strategies so far to reduce off-target effects rely on chemical modification of the siRNA, although the utility of such modifications is limited to exogenous siRNAs and not those produced through shRNA processing. Several attempts have been made to chemically modify the guide strand of siRNAs to enhance on-target specificity and decrease off-target recognition. In addition, certain modifications of the passenger strand prevent it from incorporation into RISC.

### *Chemical Modifications of the Guide Strand*

The seed region of the guide strand initiates recognition of the target mRNA when presented to RISC [87], and the thermodynamic stability of this interaction correlates positively with off-targeting [11]. Thus, destabilizing seed-target interactions is thought to improve on-target specificity and minimize off-target effects by requiring base-pairing of additional residues outside the seed region for the siRNA to bind to a target mRNA. Consistent with this idea, the incorporation of an unlocked nucleic acid (UNA) monomer, a strongly destabilizing modification, at position 7 of the guide strand has been shown to reduce off-targets with a minimal decrease in siRNA efficiency [17]. A similar destabilizing effect has been achieved by substituting residues 1-8 of the guide strand with DNA [16]. 2′-O-methyl modification of position 2 of the guide strand also reduces off-targeting, presumably due to size constraints on Ago2's accommodation of such a modification [15], and a similar effect has been found when introducing an additional nucleotide at position 2 of the antisense strand that forms a bulge when binding the 19-nucleotide target sequence [18].

### *Chemical Modifications of the Passenger Strand*

In addition to the guide strand, the passenger strand can also contribute significantly to off-target effects [5, 14]. Thus, blocking sense strand incorporation into RISC should significantly decrease off-target silencing. As noted above, RISC preferentially loads the siRNA strand that has the less thermodynamically stable 5′ end. Accordingly, selective thermodynamic stabilization of the passenger strand 5′ end by incorporation of locked nucleic acid (LNA) disfavors passenger strand incorporation into RISC [20]. Similarly, blockage of the passenger strand 5′ phosphate by 5′-O-methyl modification [21], 5-nitroindole modification at position 15 of the passenger strand [88], or chemically modified 3′ overhangs of either strand [89] also reduces sense strand selection and thus off-target effects.

### **Pooling of siRNAs**

As both siRNA and miRNA effects are concentration-dependent [5, 53], off-target effects can be reduced by pooling several different synthetic siRNAs targeting the same gene. The strategy behind this approach is to minimize off-target effects from each individual siRNA while maintaining an effective concentration of the total siRNAs for the intended target: the total concentration of siRNA used is divided among multiple individual siRNAs, each of which presumably has a distinct off-target signature, and therefore each individual siRNA is less capable of causing off-targets while still contributing to effective on-target knockdown. A similar effect is achieved when using esiRNAs, which are mixtures of siRNA oligos generated by cleaving long double-stranded RNA with an endoribonuclease such as *Escherichia coli* RNase III or Dicer [90]. In this case, a heterogeneous mixture of many dozens of different siRNAs contributes to the efficient on-target knockdown. As is the case with pools of synthetic siRNAs, the off-target effect caused by any single siRNA in the esiRNA mixture is thought to be "diluted" to a degree below the detection level. Like shRNAs, esiRNAs cannot be chemically modified, so chemical modifications that would increase efficiency, specificity, or stability of the molecules are not usable with this technique. Overall, while these strategies may be successful in reducing off-target effects for individual target transcripts, they can also complicate large-scale screening; because strong off-target effects by one siRNA may mask the phenotype of the other siRNAs in the pool, all gene targets identified through pooled screens must be validated *via* techniques such as deconvolution of the individual reagents or testing of pools of reagents with fully independent sequences (and thus separate off-target effects) [91].

## **IDENTIFYING OFF-TARGETS**

While reagent sequence selection and modification choices can considerably increase siRNA specificity, it is not yet possible to eliminate off-target effects entirely for all reagents. For this reason, researchers employ a number of techniques to identify potential off-target effects in their experimental data.

### **Reannotating RNAi Libraries**

For large-scale genetic screens, several companies offer sets of RNAi reagents targeting thousands of genes. Such RNAi libraries target the whole genome or a family of genes (*e.g.*, the kinome) of a given organism. There is usually a delay of months to years between the design of a library and the analysis of screening results produced with it. During this delay, the annotation of transcript sequences or genome assemblies may change, providing additional information about the potential targets of siRNAs in the library. This information may reveal a number of problematic situations: 1) a given oligonucleotide's target gene is now believed to be a pseudogene; 2) the oligonucleotide targets a different gene than originally predicted; 3) the oligonucleotide targets multiple genes; 4) the oligonucleotide targets no gene at all; and/or 5) the oligonucleotide has been shown to be likely to function in an miRNA-like fashion through its seed region. To detect such situations and exclude problematic RNAi reagents from the analysis of screening results, RNAi library sequences need to be reanalyzed from time to time. This information is provided to the siRNA community by resources such as the RNAiAtlas [92] and GenomeRNAi [93] databases; see the accompanying chapter on RNAi Databases for further details.
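Situations 1 to 4 above can be sketched as a simple classification against a current transcript set. This is an illustrative outline only: the `Transcript` record, the status labels, and the use of exact substring matching are assumptions, and real reannotation pipelines rely on sequence aligners and curated annotation releases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transcript:
    gene: str
    sequence: str
    is_pseudogene: bool = False


def reannotate(
    target_site: str, original_gene: str, transcripts: list[Transcript]
) -> str:
    """Classify one reagent against an updated transcript annotation.

    `target_site` is the sense-strand sequence the reagent was designed
    against; only exact matches are considered.
    """
    site = target_site.upper().replace("T", "U")
    hits = {
        t.gene: t.is_pseudogene
        for t in transcripts
        if site in t.sequence.upper().replace("T", "U")
    }
    if not hits:
        return "no target"
    if len(hits) > 1:
        return "multiple targets"
    gene, is_pseudogene = next(iter(hits.items()))
    if gene != original_gene:
        return "different target"
    if is_pseudogene:
        return "pseudogene"
    return "unchanged"


# Hypothetical updated annotation with two genes.
annotation = [
    Transcript("GENE_A", "GGGACUUACGCUGAGUACUUCGAAAAA"),
    Transcript("GENE_B", "CCCCCCUUUUUUGGGGGGAAAAAA"),
]
print(reannotate("ACUUACGCUGAGUACUUCGAA", "GENE_A", annotation))
```

Situation 5 (miRNA-like activity through the seed) is not covered here, since it depends on experimental evidence rather than on sequence lookup alone.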

### **Detecting Evidence of off-Targets**

Bioinformatics techniques designed to detect evidence of off-targets generally have the goal of narrowing the list of intended targets to follow up on from the set of those tested in a broader experiment (although, in some cases, detection of off-target effects may also suggest additional biology of interest and thus become a hypothesis-generation tool in its own right). While it is entirely possible for phenotypic results from a given reagent to be due to a combination of both its on- and off-target effects, several methods are in use to identify those whose results are likely to be *primarily* due to off-target effects. If multiple reagents per target were tested separately during the experiment, then much can be learned by combining their information, using methods ranging from simple counting approaches to more complex weighted rankings. For example, a basic but common heuristic for validating screening hits is to count the number of independent reagents for a target that give the phenotype of interest; if only a minority of them produce the desired phenotype, then it is more likely to be due to off-targeting than to knockdown of the intended target. This approach has been formalized as the "H score" [94]. More nuanced aggregation of multiple reagent results can be achieved with techniques such as RSA (Redundant siRNA Activity) [95] or cSSMD (Collective Strictly Standardized Mean Difference) [96]. These approaches examine the degree of phenotype produced by all reagents for a given target and rank targets by how consistent the phenotype is across those reagents; while high-ranking targets are likely real effects (and are usually the desired output of these methods), low-ranking targets can be interpreted as probable victims of off-target effects. A considerable advantage of techniques based on multiple reagents per target is that they do not require any assumptions about the mechanism of off-targeting. Thus, they can detect evidence of both cleavage-based and seed-based off-target effects, as well as clues to any other potential mode of off-targeting.
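The simple counting heuristic described above can be sketched in a few lines of Python. This is only an illustration of counting active reagents per target; it is not the published H score, RSA, or cSSMD calculation. The gene names, scores, activity threshold, and the `min_fraction` cutoff are all hypothetical values chosen for the example.

```python
from collections import defaultdict


def count_active_reagents(results, threshold):
    """results: iterable of (gene, reagent_id, score).

    Returns {gene: (n_active, n_total)}, where a reagent is called active
    when the absolute value of its score reaches the threshold.
    """
    counts = defaultdict(lambda: [0, 0])
    for gene, _reagent_id, score in results:
        counts[gene][1] += 1
        if abs(score) >= threshold:
            counts[gene][0] += 1
    return {gene: tuple(c) for gene, c in counts.items()}


def classify(counts, min_fraction=0.5):
    """Label each gene by the fraction of its reagents that were active."""
    labels = {}
    for gene, (n_active, n_total) in counts.items():
        if n_active == 0:
            labels[gene] = "inactive"
        elif n_active / n_total > min_fraction:
            labels[gene] = "supported"
        else:
            # Only a minority of reagents gave the phenotype.
            labels[gene] = "possible off-target"
    return labels


# Hypothetical normalized scores for four reagents per gene.
results = [
    ("GENE_A", "A-1", -2.8), ("GENE_A", "A-2", -3.1),
    ("GENE_A", "A-3", -2.4), ("GENE_A", "A-4", -0.3),
    ("GENE_B", "B-1", -3.5), ("GENE_B", "B-2", 0.2),
    ("GENE_B", "B-3", -0.1), ("GENE_B", "B-4", 0.4),
]
counts = count_active_reagents(results, threshold=2.0)
labels = classify(counts)
```

In this toy data set, three of four reagents support GENE_A, whereas only one of four reagents for GENE_B is active, so the GENE_B hit would be flagged for follow-up as a possible off-target effect.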

If multiple separate reagents per target were not tested during the experiment, then some mechanistic assumptions must be made in order to find signs of off-target effects in the data. Cleavage-based off-targeting is usually not widespread enough to be detectable by general analysis of a screening data set, but seed-based off-targeting can be investigated by checking for recurring association of a seed with a given phenotype. In the first approach, the tested reagents that have been chosen as screening hits are examined. If the reagent sequences are enriched for any particular seed, this indicates that the seed may be causing an off-target effect leading to the tested phenotype [56]. When using this method, it is important to compare the enrichment against a relevant set of tested but not positive reagents, since enrichment of certain seeds is likely in the overall tested set due to design algorithm bias; for example, the Reynolds *et al*. siRNA design algorithm [72] favors (A/U)-rich seed regions, so seeds with high (A/U) content will likely be enriched in all tested siRNAs designed with this algorithm, regardless of whether or not they are hits. The second approach, Common Seed Analysis (CSA) [97], examines the overall performance of all reagents containing a particular seed that were assayed in the experiment. The expectation is that if a seed has no off-target effect, those reagents containing it will give phenotypes randomly distributed around zero. Those seeds whose reagents show a concerted shift away from zero in either direction are considered likely off-target drivers. Unlike the first method, this approach does not require identifying hits, although it would not be effective on data from a screen in which most reagents were expected to have a real biological effect (such as that of a targeted library).
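The idea behind the second approach, grouping reagents by seed and asking whether their phenotypes shift away from zero, can be illustrated with the sketch below. It is not the published CSA implementation: the one-sample t statistic is a simple stand-in, the seed is assumed to be positions 2 to 8 of the guide strand, and all sequences and scores are invented for the example.

```python
import math
from collections import defaultdict
from statistics import mean, stdev


def seed_of(guide, start=2, length=7):
    """Seed of a guide (antisense) strand written 5'->3'; `start` is 1-based."""
    return guide[start - 1:start - 1 + length].upper()


def seed_shift(reagents, min_reagents=3):
    """reagents: iterable of (guide_sequence, normalized_score).

    Returns {seed: (n, mean_score, t_statistic)} for every seed carried by
    at least `min_reagents` reagents (never fewer than two, because a
    standard deviation is needed).
    """
    by_seed = defaultdict(list)
    for guide, score in reagents:
        by_seed[seed_of(guide)].append(score)

    summary = {}
    for seed, scores in by_seed.items():
        n = len(scores)
        if n < max(min_reagents, 2):
            continue
        m, s = mean(scores), stdev(scores)
        if s > 0:
            t = m / (s / math.sqrt(n))  # one-sample t statistic against zero
        elif m == 0:
            t = 0.0
        else:
            t = math.copysign(math.inf, m)
        summary[seed] = (n, m, t)
    return summary


# Hypothetical guide strands and scores: the first three share one seed,
# the last three share another.
reagents = [
    ("UGCAUUCGAAGCUUAGCUA", -2.1),
    ("AGCAUUCGCCUAGGAUCAA", -1.8),
    ("CGCAUUCGUUAACGGAUCU", -2.4),
    ("AUACGGAUCCGAUUAGCAU", 0.3),
    ("GUACGGAUGGCAUAUCGAA", -0.2),
    ("CUACGGAUAAGCGCUAUGC", -0.1),
]
summary = seed_shift(reagents)
```

Here the reagents sharing the first seed show a consistent negative shift and a large t statistic, while those sharing the second seed scatter around zero. In a real screen, a multiple-testing correction across all seeds would also be needed.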

### **Identifying off-Targeted Genes**

Considerable efforts have been made to develop bioinformatics methods that identify particular gene(s) off-targeted by a given RNAi reagent or set of reagents. As discussed earlier, *de novo* prediction of genes that will be off-targeted by a reagent, based only on its sequence, is relatively successful for cleavage-based off-targets but often disappointing for seed-based ones. However, outcomes can be improved when experiments have already been performed and phenotypic data can be incorporated into the analysis. In such situations, identification of genes affected by cleavage-based off-targeting is largely ad hoc; a reagent that has been identified by one of the above-discussed methods as likely to produce off-target effects is checked for perfect or near-perfect sequence matches against the assayed transcriptome using string searching, BLAST [98], or other similar means. Identifying a concise set of genes putatively affected by seed-based off-targeting in experimental results is usually more labor-intensive, and is frequently frustrated by the extremely combinatorial nature of seed-based regulation effects [50]. While it is theoretically possible to use microRNA target prediction algorithms to identify potential seed-based off-targets and then narrow the extensive list using one's phenotypic data, this is in general feasible only if the tested reagent shares a seed with a known miRNA, because many of these algorithms accept only miRNAs (not arbitrary antisense sequences) as input. However, other approaches have been developed. For example, GESS (Genome-wide Enrichment for Seed Sequences) [99] identifies transcripts in the tested transcriptome that are enriched for complementarity to 7mer seeds of active siRNAs (relative to those of inactive siRNAs). The developers of CSA have also recently proposed an additional method called Haystack [100], which uses a linear model to assess, for each transcript in the tested transcriptome, the correlation between each tested reagent's phenotype and the off-target effect predicted for that reagent's seed. For each significantly correlated transcript, the program estimates how much of the overall screening results is explainable by off-target effects on that transcript and calculates a p-value for it. As the Haystack developers note, such approaches to finding putative off-target genes have the potential to uncover novel biology. Off-targets represent real biological effects on the pathway of interest, and thus may draw attention to genes that are involved with the phenotype of interest but were not tested in the initial screen.
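As a minimal illustration of the "string searching" option for cleavage-based off-targets, the sketch below scans transcripts for sites that are perfectly or nearly perfectly complementary to a guide strand. It is a naive ungapped comparison, not a substitute for BLAST or a dedicated aligner, and the guide, the transcript names and sequences, and the mismatch limit are all hypothetical.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def target_site(guide):
    """mRNA sequence (5'->3') that is perfectly complementary to the guide."""
    rna = guide.upper().replace("T", "U")
    return rna.translate(COMPLEMENT)[::-1]


def find_matches(guide, transcripts, max_mismatches=2):
    """Return (transcript_name, offset, n_mismatches) for every window of a
    transcript that differs from the guide's target site by at most
    `max_mismatches` substitutions (no insertions or deletions)."""
    site = target_site(guide)
    n = len(site)
    hits = []
    for name, seq in transcripts.items():
        seq = seq.upper().replace("T", "U")
        for i in range(len(seq) - n + 1):
            mismatches = sum(a != b for a, b in zip(site, seq[i:i + n]))
            if mismatches <= max_mismatches:
                hits.append((name, i, mismatches))
    return hits


# Hypothetical 19-nt guide strand and toy transcripts.
guide = "UAGCUAAGCUUCGAAUGCA"
transcripts = {
    "TX_ON": "GGGAAA" + "UGCAUUCGAAGCUUAGCUA" + "CCC",   # perfect site
    "TX_NEAR": "AA" + "UGCAUUCGACGCUUAGCUA" + "GG",      # one mismatch
    "TX_NONE": "ACGUACGUACGUACGUACGUACGUACGU",           # no close site
}
hits = find_matches(guide, transcripts)
```

A quadratic scan like this is adequate only for small examples; a transcriptome-scale search would normally rely on an indexed aligner.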

### **CONCLUSION AND FUTURE DIRECTIONS**

Unintended regulation of gene expression generated by off-target effects represents a major limitation of RNAi-based technologies in research, therapeutics, and diagnostics. Strategies such as chemical modifications, reagent pooling, and novel rational design filters can significantly reduce off-target effects without reducing on-target gene knockdown. However, knockdown of unintended genes by off-targeting remains common in RNAi experimental results.

A key goal of the RNAi field is thus to be able to design RNAi reagents that cause no off-target effects. However, since every possible hexamer seed sequence is represented in the transcriptome, some degree of seed-based off-targeting may be unavoidable. Even avoidance of cleavage-based off-targets may not always be possible, especially in systems such as shRNAs that cannot be chemically modified. Therefore, a fallback aim is to reliably predict, without experimental work, which off-targets a given reagent will have. Efforts to date demonstrate that this is still very difficult, because seed-based off-target effects are demonstrably different for different experimental systems, cell lines, and assays [101-103]. Much more data will be needed to enable more successful predictive systems.

Until such improved systems are developed, results of any computational off-target identification or prediction technique must be treated with some skepticism. It is critical to recall that although these methods identify potential off-targets, they do not prove off-target activity, which can only be accomplished with further experimental work. Techniques like testing alternate reagents against the intended target and/or the putative off-target of a reagent, or evaluating control reagents such as seed chimeras [56, 104] or C911 constructs [105], can confirm or refute computational predictions and expand our understanding of off-target biology.

### **ACKNOWLEDGEMENTS**

Declared None.

### **CONFLICT OF INTEREST**

The authors confirm that this chapter's contents present no conflicts of interest.

### **REFERENCES**




© 2014 The Author(s). Published by Bentham Science Publisher. This is an open access chapter published under CC BY 4.0 https://creativecommons.org/licenses/by/4.0/legalcode

**CHAPTER 2** 

# **Automation Considerations for RNAi Library Formatting and High Throughput Transfection**

**Sean M. Johnston, Caroline E. Shamu and Jennifer A. Smith\***

*ICCB-Longwood Screening Facility, Harvard Medical School, 250 Longwood Avenue, SGM Building Room 604, Boston, MA 02115 USA* 

**Abstract:** Laboratory automation impacts nearly all aspects of high throughput RNAi screens. It is particularly relevant when considering library format and storage, and also when planning high throughput transfection protocols. In situations where libraries are stored as screening copies that are used for just-in-time dispensing, automation can be utilized as a tool to accommodate diverse assay and cell types and can enable forward or reverse transfections into a variety of different microplates. Automation has the ability to increase the feasibility and decrease consumable costs for assays that require a large number of replicates. It is also an important tool when considering more complex assays, such as those that utilize non-standard plate types or electroporation, as automation increases reliability and can improve assay performance. This chapter highlights important considerations for library formatting and ways in which laboratory automation can be implemented to facilitate RNAi high throughput screening.

**Keywords:** HTS, laboratory automation, library format, RNAi, robotics, siRNA transfection.

### **INTRODUCTION**

There are many aspects to consider when planning an RNAi-based high throughput screen (HTS). Among the more important is access to laboratory automation. This must be taken into account when planning library formats and high throughput transfection logistics. This chapter will discuss common methods for RNAi library formatting and storage, and illustrate how automation can facilitate diverse assay protocols.

This chapter presents the perspective of one academic screening center, the ICCB-Longwood Screening Facility (ICCB-L) at Harvard Medical School, which has supported genome-scale siRNA screening since 2006. ICCB-L is organized as a modular screening facility, with the instrumentation set up in a workstation format rather than integrated into fully automated systems. This enables multiple screening projects to be performed simultaneously. ICCB-L operates under an investigator-initiated, staff-assisted screening model. Investigators provide the scientific rationale and overall assay design for their projects and carry out the bulk of the work (*e.g.*, cell culture, plating, and assay readout) for their own screens. ICCB-L personnel maintain the HTS infrastructure and perform all complex automation tasks, including those that involve the libraries. Critical to each screen's success, ICCB-L personnel provide advice and assistance during all aspects of the screen. This includes training on instrument use, and advising on assay development and optimization as well as data analysis.

**<sup>\*</sup>Corresponding author Jennifer A. Smith:** ICCB-Longwood Screening Facility, Harvard Medical School, 250 Longwood Avenue, SGM Building Room 604, Boston, MA 02115 USA; Tel: 617-432-5735; Fax: 617-432-6424; E-mail: jennifer\_smith@hms.harvard.edu

The goal of this chapter is to provide examples from ICCB-L demonstrating how laboratory automation can be used for formatting siRNA libraries and in implementing a variety of transfection strategies for high throughput screens.

## **LIBRARY FORMAT**

There are a number of considerations when deciding how to array an siRNA library. One of the first choices to be made is the initial state of the reagents. Depending on the vendor, a new library may arrive lyophilized or already in suspension. Lyophilized libraries are more easily transported and stored because they are stable at room temperature for some length of time [1]. Receiving and storing libraries lyophilized allows the screening facility to control resuspension volume and concentration, but quality control testing may still be necessary to ensure that the appropriate quantity of each library reagent has been received and completely resuspended. At ICCB-L, siRNA libraries are received lyophilized and are then resuspended at 10 or 20 µM to generate stock copies. Subsequently, stocks are diluted to make "screening" copies of each library at 1 µM for use in transfections. See the Library Storage section below for a more detailed description of ICCB-L siRNA library stock and screening plates.

High throughput RNAi screening is generally carried out in 96- or 384-well microplate format. The layout of the library in microplates must be carefully planned to ensure it is suitable for the assays that will be conducted in the screening facility. Re-arraying an siRNA library should be avoided whenever possible. Moving sets of wells introduces the potential for costly errors: physical mistakes (transferring reagents to the wrong well or inadvertently combining wells), or misannotation in tracking library contents. In addition, every time a library is transferred, a small amount of library material is lost to the pipet tips and the source labware. These pitfalls of re-arraying may be avoided by coordinating the desired plate layouts with the vendor.

When possible, it is advisable to avoid plating siRNA reagents in edge wells (*e.g.*, rows A and P and columns 1 and 24 in a 384-well plate), as edge effects can have a major impact on assay robustness and data quality [2]. This edge effect variation can be caused by uneven evaporation and temperature fluctuations, both of which are major concerns for RNAi assays, which usually have incubation times of 72 to 96 hours. Leaving the outer two rows and columns of a library plate free of siRNAs would likely significantly decrease assay edge effects, but this would increase the total number of plates required to hold the library by approximately 25%. Because there are other strategies that can be used to mitigate edge effects (*e.g.*, breathable seals, MicroClime lids, active humidity controlled incubators, *etc.*), a good practice is to leave at least one column and one row empty at each outer edge. Fig. **1** illustrates the current siRNA library plating strategy at ICCB-L.
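The trade-off between empty border wells and plate count is simple arithmetic, sketched below. The library size of 20,000 reagents is a hypothetical figure, and the sketch counts only border wells; it ignores the control columns that a real layout would also reserve, so the exact percentage depends on which baseline layout is assumed.

```python
import math


def usable_wells(rows=16, cols=24, empty_border=1):
    """Wells left for reagents after emptying `empty_border` outer rings."""
    return (rows - 2 * empty_border) * (cols - 2 * empty_border)


def plates_needed(n_reagents, rows=16, cols=24, empty_border=1):
    """Number of plates required to hold n_reagents, one reagent per well."""
    return math.ceil(n_reagents / usable_wells(rows, cols, empty_border))


n_reagents = 20_000  # hypothetical library size
one_ring = plates_needed(n_reagents, empty_border=1)   # 308 wells per plate
two_rings = plates_needed(n_reagents, empty_border=2)  # 240 wells per plate
relative_increase = two_rings / one_ring - 1
```

With a single empty ring as the baseline, a 384-well plate holds 308 reagents versus 240 with two empty rings, so moving to two rings raises the plate count by somewhat more than a quarter in this simplified count.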

There are two classes of controls that must be taken into account when planning the library format: "library" controls and "assay-specific" positive and negative controls. The library controls are arrayed onto the library plates and are used to monitor successful transfection and library integrity. Assay-specific controls are added to the assay plates directly at the time of transfection and thus have corresponding empty wells in the appropriate position on the library plate. The library controls should consist of broadly applicable transfection controls, such as death-inducing siRNAs (*e.g.*, PLK1 or KIF11) whose effects can be monitored in many different assay types, non-targeting siRNA negative controls, and even RISC-free fluorescent controls that are useful when quantitating siRNA uptake efficiency. In contrast, assay-specific controls should be selected by individual investigators for each screen. The positive control is an siRNA that produces the desired phenotype sought in the screen. When possible, it is good practice to include multiple different siRNA positive controls, *e.g.*, two to three with strong or medium effects in the assay, so that the dynamic range of the assay might be monitored. Additional negative controls should be included on assay plates if those included as "library" controls are not sufficient. The negative controls should be non-targeting siRNAs that have been demonstrated to have no significant phenotype in the screening assay; this must be tested explicitly as no "universal" siRNA negative control exists that has no phenotype in all cell-based assays. If the library is a focused set of siRNAs (*e.g.*, a small library targeting kinases) where many wells might be expected to display some phenotype, it is crucial that enough wells of the library are left empty for a robust set of negative controls, as the plate median or average will not be a useful baseline for assay quantitation [3].

**Figure 1**: **Current siRNA library plate format at ICCB-Longwood**. **A**) 96-well siRNA library layout. Location of experimental and empty wells was determined by ICCB-L and formatted by the vendor. Library-specific controls were added to original 96-well source plates (Quadrant A2 and Quadrant B2) by ICCB-L staff prior to formatting into 384-well copies. **B**) In the corresponding 384-well siRNA library layout, quadrant A1 maps to wells A1, A3, A5, C1, C3, C5, *etc.* in the 384-well plate layout. Quadrant A2 maps to wells A2, A4, A6, C2, C4, C6, *etc.* in the 384-well format. Quadrant B1 maps to wells B1, B3, B5, D1, D3, D5, *etc.* in the 384-well format. Quadrant B2 maps to B2, B4, B6, D2, D4, D6, *etc.* in the 384-well format. The legend indicates siRNA reagent type in each well.


**Figure 2**: **Formatting 384-well library plate from identical 96-well quadrants**. **A**) 96-well siRNA layout received from vendor, with the location of experimental and empty wells pre-determined by the vendor. **B**) Library-specific controls were added to original 96-well source plates (Quadrant A1 and Quadrant B1) and two of the 96-well quadrants (A2 and B2) were rotated 180° by ICCB-L staff to better distribute empty wells on each side of the resultant 384-well plate prior to 384-well formatting. **C**) In the resulting 384-well format, columns 1, 22 and 24 are empty and can be populated by assay-specific controls. Column 3 contains the library controls. The legend indicates siRNA reagent type in each well.


The standard siRNA library layout at ICCB-L is a 384-well format with the outermost rows (A and P) and columns (1 and 24) empty, as well as one column (2) designated for a variety of library controls and an additional column (23) left empty for assay-specific controls (Fig. **1B**). While the ICCB-L RNAi libraries are stored and screened in 384-well format, they are usually purchased and received in 96-well plates (Fig. **1A**). The larger volume capacity of a 96-well plate is useful when diluting a lyophilized library to an appropriate stock concentration, and 96-well plates are amenable to both automated and manual pipetting. For this reason, many vendors stock products in 96-well plate format. Other formats and densities may require a surcharge or be unavailable entirely. A 384-well plate can be subdivided into four sets of 96 wells, or quadrants [4]. The quadrants interleave with even spacing, such that the 96-well tips will go into every other well in each direction (described in the Fig. **1** legend). The quadrants are designated A1, A2, B1, or B2, depending on which well of the 384-well plate corresponds to well A1 of a particular 96-well plate (Fig. **1A**). The standard ICCB-L 384-well layout required requesting customized 96-well plates from the vendor, such that the four quadrants have non-identical layouts. A library vendor may be willing to provide the reagents in the desired format, or the purchaser may re-organize the collection by moving samples with a pipettor, although this may be a daunting task across an entire genome. In instances where a library is shipped in 96-well plates with identical layouts (Fig. **2A**), a reasonable option for 384-well formatting may be to rotate two of the 96-well quadrants when transferring them into 384-well format (Fig. **2B**), thus equally distributing the empty wells to each side of the 384-well plate (Fig. **2C**).
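The quadrant interleaving described above can be written down as a small mapping function, which is useful when tracking well annotations through reformatting. The sketch below is an illustration only: the function name and the `rotate_180` flag are assumptions made for this example and do not correspond to any particular liquid handler's software.

```python
import string

ROWS_96 = string.ascii_uppercase[:8]    # A..H
ROWS_384 = string.ascii_uppercase[:16]  # A..P

# (row offset, column offset) of each quadrant within a 2 x 2 block of wells;
# a quadrant is named after the 384-well position that receives 96-well A1.
QUADRANT_OFFSETS = {"A1": (0, 0), "A2": (0, 1), "B1": (1, 0), "B2": (1, 1)}


def to_384(quadrant, well_96, rotate_180=False):
    """Map a 96-well position (e.g. 'B3') to its 384-well position.

    If rotate_180 is True, the 96-well plate is assumed to have been turned
    180 degrees before the transfer, so A1 lands where H12 normally would.
    """
    row = ROWS_96.index(well_96[0].upper())
    col = int(well_96[1:]) - 1
    if rotate_180:
        row, col = 7 - row, 11 - col
    row_offset, col_offset = QUADRANT_OFFSETS[quadrant]
    return f"{ROWS_384[2 * row + row_offset]}{2 * col + col_offset + 1}"


examples = {
    ("A1", "A2"): to_384("A1", "A2"),
    ("A1", "B1"): to_384("A1", "B1"),
    ("B2", "B3"): to_384("B2", "B3"),
    ("A2", "A1", "rotated"): to_384("A2", "A1", rotate_180=True),
}
```

Applying the same function to the plate map file as to the physical transfer keeps annotation and plate contents in step, which addresses the misannotation risk noted for re-arraying.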

For focused libraries or smaller collections, such as ICCB-L's miRNA mimic and inhibitor libraries of fewer than 1,000 wells, one may customize a layout with less concern for the total number of plates. As seen in Fig. **3**, the miRNA library has the two outer rows (A, B, O and P) and columns (1, 2, 23 and 24) empty, which creates a strong buffer against edge effects. There is also a field of 4 columns in the middle of the plate for controls, insulating them from edge effects and allowing for the highest fidelity control data. In this example, 2 of the 4 columns are utilized for library-specific controls (13 and 14) and the other 2 columns remain available for assay-specific controls (11 and 12).

## **LIBRARY STORAGE**

Several factors need to be considered simultaneously when determining how to store the libraries. There are three basic ways in which RNAi libraries are stored: 1) as pre-dispensed ready-to-go assay plates; 2) as screening copies that are used for just-in-time dispensing; and 3) in individual reagent tubes for the creation of focused libraries on demand. The latter option utilizes a tube storage system (*e.g.*, TTP Labtech comPOUND or HighRes Biosolutions SampleVault) and enables the formation of a custom library for each experiment rather than being restricted to a pre-formatted library. While use of reagent tubes is ideal in terms of possibilities for customization of plate format and contents, this type of storage system is expensive and may be challenging for typical academic screening centers to implement. The former two storage strategies work well for many applications and will be the focus here.

**Figure 3**: **Small library plate format at ICCB-Longwood**. Location of experimental and empty wells was determined by ICCB-L and formatted by the vendor. Library-specific controls were added to each original source plate by ICCB-L staff prior to formatting into copies. The legend indicates RNAi reagent type in each well.

Several screening centers elect to pre-dispense libraries into ready-to-go assay plates; for all screens, every assay plate holds the same amount of each siRNA reagent. This strategy offers certain advantages, including simplified screener-staff logistics, plate availability whenever a screener is ready for that portion of the library, and the ability to mass produce library aliquots, thus decreasing both consumables cost and staff time. However, ready-to-go assay plate formatting might put limits on assay protocol development. It requires reverse transfection protocols, and assay plate types must be determined at the time the library plate is formatted. siRNA amounts would also be fixed at the time of formatting. Thus, while it might be possible to slightly modify the final siRNA concentration in ready-to-go plates by altering assay volumes in some cases, there may be a narrow acceptable volume range that is compatible with the assay protocol, making it impossible to use ready-to-go plates.


Just-in-time dispensing overcomes several of the challenges associated with ready-to-go assay plates. This type of dispensing places few limitations on assay protocols. It allows for a high degree of flexibility in terms of forward or reverse transfection, plate type and well density utilized, final siRNA concentration, and number of replicates included in the screen. It is also easy to automate the addition of assay-specific controls and mixing of siRNA and transfection reagent with tips. Yet there are drawbacks that must be considered, including higher consumables cost, increased staff time for development of custom automation protocols and for siRNA transfection, as well as additional logistics in terms of scheduling (staff, investigator, and equipment).

RNAi libraries at ICCB-L are stored as stock plates and screening copies, with the screening copies utilized for just-in-time dispensing during automated transfections. The majority of the library is stored in high concentration (10 or 20 µM) stock plates. The stocks can be diluted 10- or 20-fold in 1X siRNA buffer to create 1 µM screening copies as necessary. We have found that 1 µM is an ideal library screening copy concentration because 1.5 – 3 µL of 1 µM reagent can be added to a 40 – 50 µL assay well volume in 384-well plates to achieve the recommended siRNA transfection concentration of 25 – 50 nM. In addition, a volume as low as 1 µL can be accurately and precisely dispensed with automated 384-channel pipettors and does not introduce an excessive amount of library buffer salts to the assay.
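The dilution arithmetic behind these volumes follows from C_final = C_stock × V_added / V_total. The sketch below assumes that the quoted assay well volume is the total final volume including the added siRNA, which the text does not state explicitly; the function names are illustrative.

```python
def final_concentration_nM(stock_uM, added_uL, total_uL):
    """Final siRNA concentration (nM) after adding `added_uL` of a stock at
    `stock_uM` to a well whose total final volume is `total_uL`."""
    return stock_uM * 1000.0 * added_uL / total_uL


def volume_needed_uL(target_nM, stock_uM, total_uL):
    """Volume of stock (µL) needed to reach `target_nM` in `total_uL`."""
    return target_nM * total_uL / (stock_uM * 1000.0)


# 1 µM screening copy added to 40 or 50 µL assay wells.
table = {
    (added, total): final_concentration_nM(1.0, added, total)
    for added in (1.5, 3.0)
    for total in (40.0, 50.0)
}
low_volume = volume_needed_uL(25.0, 1.0, 50.0)   # µL for 25 nM in 50 µL
high_volume = volume_needed_uL(50.0, 1.0, 50.0)  # µL for 50 nM in 50 µL
```

Under this assumption, microliter-scale additions of a 1 µM copy give final concentrations in the tens of nanomolar, and the volumes required stay within the range that a 384-channel pipettor can dispense reliably.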

The plate type used for RNAi library storage at ICCB-L is the Eppendorf twin.tec 384-well PCR plate. These plates are low profile, thus minimizing storage space. They have steep conical wells, promoting high volume recovery, and, unlike most PCR plates, the twin.tec plates have a full rigid polycarbonate skirt that is compatible with a wide array of automation platforms. The plates are available in a variety of colors, allowing color coding of libraries and custom arrays as desired. The maximum comfortable well volume of the twin.tec plates is 30 µL. This allows for volume displacement when pipet tips are inserted and prevents cross contamination when plates are stacked.

At ICCB-L, library plates are heat sealed and stored at -20 °C in plastic containers (*e.g.*, 13" x 7-1/2" x 4-1/4", Container Store catalog #10008759) within non-defrosting standard lab freezers (*e.g.*, Thermo Scientific Revco Ultima II). Heat sealing is an excellent way to preserve the library, as the well chimneys of the storage plate melt slightly and form a bond with the foil seal material. Adhesive seals also work, but they can be more difficult to apply consistently and effectively: they may leave an adhesive residue, thus causing plates to stick together, or the seals may not bind very well and fall off in storage. However, heat sealing is not without its challenges. After selecting a seal material that is suitable for long-term storage (*e.g.*, aluminum- and plastic-based laminate), one must optimize the sealing time and temperature. These parameters may vary slightly with different types of storage plates and materials, so it is important to thoroughly test the settings.

Whether using heat or adhesive methods, if the seal fails, the sample will evaporate. Differential evaporation is a major concern for RNAi libraries. Over time, seals can become imperfect while in deep -20 °C storage, resulting in partially or completely dried-down samples in some wells across library plates. Partially dried samples have a higher concentration and may yield false toxicity if screened, while fully dried samples will not transfer to assay plates and will potentially cause false negative results. It is important to check libraries for differential evaporation. In instances where this has been observed, we have found that drying the contents of the entire plate (*e.g.*, by placing the plate, unsealed, in a sterile tissue culture hood overnight) and re-suspending the library in nuclease-free water is an effective method for recovery.

## **EXAMPLE ACADEMIC RNAi SCREENING FACILITY SET-UP**

Screening instruments at ICCB-L are set up in workstation format as modules rather than integrated into a fully automated system (Fig. **4**). There are a number of benefits to modularity in a screening room. The flexibility of having individual workstations allows for the rapid addition of new or special equipment for specific protocols. Equipment that is not needed for a particular assay is available for other users, whereas it may be unavailable in a fully automated system even when it is not being utilized as part of an assay in production. In modular mode, the ability to have all workstations in use simultaneously allows for the interleaving or "dove-tailing" of projects, where at any moment there may be separate screens: 1) preparing the transfection reagent and assay plates; 2) delivering the library and forming the transfection complex; 3) preparing cells and adding them to the transfection complex; 4) incubating; and/or 5) reading out assay endpoints. In modular screening facilities that have instrument redundancy, troubleshooting broken or defective equipment does not need to delay screening efforts as it is frequently possible to quickly transfer screening plates to a functional workstation. The workstation format is not without challenges, though. Transfections and assays can be time sensitive, so it is important that the screener defines the effective temporal window for each step and strives to consistently work within it. This can be made manageable by working in smaller batches and making advance preparations, such as labeling assay plates, measuring reagents and performing necessary calculations. The logistics are nontrivial in this model, as the screener must reserve all necessary pieces of equipment and coordinate their work at the screening facility with the staff and other screeners.

**Figure 4**: **Modular workstation illustration**. Several screens have the capacity to operate simultaneously with the use of modular workstations. **A**) Incubators (1) provide space for assays to be incubated. Hoods containing bulk reagent dispensers (2) can be utilized to prepare reagents, fill assay and intermediate plates, prepare cells, plate cells, *etc*. **B**) Workstation (3) dedicated to delivering the library to assay plates and forming the transfection complex. **C**) A variety of instrumentation (4), *e.g.*, plate reader and imager, throughout the lab space is utilized to read out assay end-points.

The ICCB-L RNAi platform employs three types of automated liquid handling during the screening process. Transfection and assay reagent additions are performed using bulk reagent plate fillers, such as the peristaltic pump-based Matrix WellMate (Thermo Scientific), which uses disposable tubing cartridges that can be distributed to users so they have control of their own fluid paths. siRNA library reagents are transferred into assay plates using an automated 384-channel pipettor, the Agilent Bravo. The Bravo is also utilized for all library reformatting. In contrast to the peristaltic pump-based bulk reagent plate fillers, in which wells in rows are filled 8-at-a-time, an automated 384-channel pipettor will transfer fluid to all wells in a microplate simultaneously. A two-channel Tecan EVO75 and an eight-channel Tecan EVO150 pipetting station have been customized to pull screening positives or "cherry picks" from library plates to create custom sets of siRNA reagents for follow-up work. More information about laboratory automation for liquid handling can be found in other overview publications [5].

### **AUTOMATION OF siRNA TRANSFECTION**

The majority of siRNA screens at ICCB-L are performed *via* reverse transfection, where the transfection complex (transfection reagent, Opti-MEM (Life Technologies) and siRNA) is generated within the assay plate and the cell suspension is subsequently added to the complexed siRNA. A typical reverse transfection involves diluting the transfection reagent in Opti-MEM and adding it to assay plates using a WellMate plate filler, then adding the siRNA from library screening plates to assay plates using an automated pipettor, and finally adding the cell suspension with the WellMate. Following 72 to 96 hours of incubation, the assay end-point is reached, samples are processed, and results read out.

The automated pipettor used for library formatting and transfections (*e.g.*, the Agilent Bravo at ICCB-L) should be selected with an emphasis on accuracy at lower volumes and have ample deck space for a variety of applications. It is recommended that disposable tips are utilized with an automated pipettor because they minimize concerns of carry over and library contamination. As disposable automation pipet tips are inherently expensive, it is impractical to change them after siRNA library delivery between assay plate replicates. It is also unreasonable to return to the source plate with potentially dirty pipet tips. Thus serial dispensing is a necessity. Serial dispensing can be defined as aspirating a bulk volume and dispensing a smaller volume to several replicates in series.

### **32** *Frontiers in RNAi, Vol. 1 Johnston et al.*

**Figure 5**: **Schematic of two replicate automated transfection**. **1**) The Bravo (384-channel pipettor) moves to the tip box and attaches tips. **2**) Bravo moves to the Library plate and aspirates 3 µL. **3**) Bravo moves to Replicate (REP) A assay plate (which already contains 8 µL of transfection reagent and Opti-MEM) and dispenses 1.5 µL. **4**) Bravo moves to REP B and dispenses 1.5 µL. It then mixes REP B by aspirating 5 µL (half of the total volume in each well) up and down three times. **5**) Bravo moves back to REP A and mixes by aspirating and dispensing 5 µL three times.

When ICCB-L performed its first automated transfections in 2006, the design closely followed established chemical screening practices. Libraries were screened in duplicate and controls were manually added to assay plates. Most screeners opted for reverse transfection rather than forward transfection because it required fewer steps, was less expensive and easier, and some reported higher transfection efficiency with this method. In one example (Fig. **5**), 3 µL of siRNA was aspirated from the library plate and 1.5 µL was dispensed to each of two assay plates. The transfection complex was then mixed in each assay plate using the automated pipettor. This first generation of screens identified areas for improvement in the screening strategy. RNAi screens tend to have greater variability compared to chemical screens for a variety of reasons, including differences in cell plating, transfection efficiency, toxicity, and edge effects over longer incubation times [3]. Beginning in 2007, a third replicate was added to all new projects in an effort to improve statistical analyses. The assay-specific control addition step was also automated, thus improving the reproducibility of these controls and enabling screeners to easily increase the number of assay-specific controls per plate. As expected, data quality improved with these changes. In the updated example that illustrates these improvements (Fig. **6**), 4.5 µL of siRNA is aspirated from the library plate and 1.5 µL is dispensed to each of three assay plates. The pipettor then utilizes the same set of tips to aspirate 4.5 µL from a "control" source plate that has a layout complementary to the library and contains only assay-specific siRNA controls (Fig. **6B**). It then dispenses 1.5 µL to each of the three replicates and mixes the transfection complex in each assay plate.
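The volume arithmetic behind the two- and three-replicate serial dispenses described above can be sketched in a few lines of Python. This is only an illustration, not the ICCB-L automation code: the function and field names are hypothetical, and the mix volume is simply taken as half of the final well volume, which the figure captions round to 5 µL.

```python
from dataclasses import dataclass


@dataclass
class SerialDispense:
    aspirate_ul: float   # single bulk aspiration from the source plate
    dispense_ul: float   # volume delivered to each replicate plate
    replicates: int      # number of assay plates served by one aspiration
    mix_ul: float        # volume cycled up and down when mixing a well


def plan_serial_dispense(dispense_ul, replicates, prefill_ul):
    """Plan one bulk aspiration that serves several replicate plates.

    prefill_ul is the volume already present in each assay well
    (transfection reagent plus Opti-MEM). Hypothetical helper.
    """
    if dispense_ul <= 0 or replicates < 1:
        raise ValueError("need a positive dispense volume and >= 1 replicate")
    aspirate_ul = dispense_ul * replicates
    # Assumption: mixing cycles half of the final well volume.
    mix_ul = (prefill_ul + dispense_ul) / 2
    return SerialDispense(aspirate_ul, dispense_ul, replicates, mix_ul)


two_replicates = plan_serial_dispense(1.5, replicates=2, prefill_ul=8.0)
three_replicates = plan_serial_dispense(1.5, replicates=3, prefill_ul=8.0)
print(two_replicates.aspirate_ul, three_replicates.aspirate_ul)
```

With an 8 µL prefill and a 1.5 µL dispense, the sketch gives aspirations of 3 µL and 4.5 µL, matching the duplicate and triplicate examples in the text.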

Although most of the screens performed at ICCB-L are run in 384-well format, 96-well screening is also possible. Some assays have technical limitations that necessitate a larger well format. For example, assays monitoring rare events require examination of many more cells per test condition, or specific plate types may not be available in 384-well format. Many automated liquid handlers offer interchangeable 96- and 384-well pipettor heads, where the 96-well head is able to index to the four quadrants within a 384-well plate; thus, it is easy to accommodate a 96-well assay even when library stocks are plated in 384-well format. Screening in 96-well format requires larger volumes of costly reagents and tends to have slower throughput. In addition, library controls may not be evenly distributed amongst all four 96-well plates if siRNAs are aliquoted from 384-well library plates. Thus, this screening strategy is only recommended for assays that have proven to be unsuccessful in 384-well format.
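To make the quadrant indexing concrete, the sketch below maps a 96-well position to the 384-well position a 96-channel head would address in a given quadrant. It assumes the common interleaved layout, in which the four quadrants start at wells A1, A2, B1 and B2 of the 384-well plate; the chapter does not state which convention a given instrument uses, and the function name is hypothetical.

```python
import string

ROWS_384 = string.ascii_uppercase[:16]  # rows A to P of a 384-well plate


def well_96_to_384(well_96, quadrant):
    """Map a 96-well position (e.g. "B3") to a 384-well position.

    Assumes interleaved quadrants numbered 0-3 that start at
    A1, A2, B1 and B2 of the 384-well plate, respectively.
    """
    row = ord(well_96[0].upper()) - ord("A")  # 0-7
    col = int(well_96[1:]) - 1                # 0-11
    if not (0 <= row < 8 and 0 <= col < 12 and quadrant in range(4)):
        raise ValueError("invalid 96-well position or quadrant")
    row_offset, col_offset = divmod(quadrant, 2)
    return f"{ROWS_384[2 * row + row_offset]}{2 * col + col_offset + 1}"


for quadrant in range(4):
    print(quadrant, well_96_to_384("A1", quadrant))
```

Running the same mapping over the control positions of a 384-well library plate shows which of the four 96-well daughter plates receive each control, which is one way to check how evenly a given layout distributes its controls.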

To meet the needs of assay protocols that directly compare two experimental conditions (*e.g.*, effects of siRNA on cancer cells *versus* non-malignant cells), automation protocols have been established for six replicate assays (three replicates per experimental condition). However, automating six assay replicates is not straightforward. The serial dispensing that was used in previously described protocols (Figs. **5** and **6**) proved to be inaccurate across greater numbers of replicates (5+), thus necessitating an alternative approach. Since larger dispense volumes and smaller numbers of replicates tend to be more accurate, an intermediate dilution plate for the siRNA library was created as needed for


**Figure 6**: **Schematic of three replicate automated transfection with control plate**. **A**) **1**) Bravo moves to the tip box and attaches tips. **2**) Bravo moves to the Library plate and aspirates 4.5 µL. **3**) Bravo moves to REP A, REP B and then REP C assay plates (which already contain 8 µL of transfection reagent and Opti-MEM), dispensing 1.5 µL of the library into each replicate plate. **4**) With the same set of tips, Bravo moves to the Control plate (CTRLS, Fig. **6B**) and aspirates 4.5 µL. **5**) Bravo moves to REP A, REP B and then REP C, dispensing 1.5 µL of the controls into each replicate plate. **6**) Bravo mixes REP C, REP B and then REP A by aspirating and dispensing 5 µL three times. **B**) Control plate, with gray wells indicating empty wells (these correspond to wells that are full in the library plate, as shown in Fig. **1B**) and pink wells representing locations for assay-specific controls.

### *Automation for RNAi*

individual assays. In the two-condition example (Fig. **7**), 10 µL of siRNA is aspirated from the library plate and dispensed into a twin.tec plate containing 10 µL of either 1X siRNA buffer or Opti-MEM, creating a 1:1 dilution of the library.

**Figure 7**: **Three replicate x 2 automated transfection with intermediate plate. 1**) Bravo moves to the tip box and attaches tips. **2**) Bravo moves to the Library plate and aspirates 10 µL. **3**) Bravo moves to the Intermediate plate (INT W/ CTRLS), which contains pre-aliquoted assay-specific controls and 10 µL diluent (Opti-MEM and transfection reagent, Opti-MEM, or 1X siRNA buffer), and dispenses 10 µL of the library. It mixes the Intermediate plate by aspirating and dispensing 10 µL three times and then aspirates 9 µL. **4**) Bravo moves to REP A, REP B and then REP C assay plates (which already contain 7 µL of transfection reagent and Opti-MEM or Opti-MEM), dispensing 3 µL of the diluted library and controls into each plate. **5**) Bravo moves to the Intermediate plate and aspirates another 9 µL. **6**) Bravo moves to REP D, REP E and then REP F assay plates, dispensing 3 µL of the diluted library and controls into each plate. **7**) Bravo mixes REP F, REP E and then REP D by aspirating and dispensing 5 µL three times. **8**) Lastly, Bravo mixes REP C, REP B and then REP A by aspirating and dispensing 5 µL three times. One set of tips is utilized for all of the steps outlined above.


This new assay-specific, intermediate copy of the library can be mixed using the automated pipettor and dipped into multiple times with the same set of tips because there is no concern for library contamination. It can also be pre-loaded with assay-specific controls for automated delivery. For transfection, siRNAs from the intermediate copy are dispensed to the six assay plates in two passes, adding twice the volume to each that would have been added from standard library screening plates. The pipettor aspirates 9 µL from the intermediate copy and dispenses 3 µL to each of three replicates, then repeats the process for the remaining three replicates. After all six assay plates have received siRNA, the copies are mixed in reverse order by aspirating and dispensing 50% of the well volume three times.
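A short Python sketch can help check the volume budget of the intermediate plate approach described above. It is only a back-of-the-envelope aid with hypothetical names, and it ignores dead volume and pipetting error, both of which matter in practice.

```python
def intermediate_plate_plan(library_ul, diluent_ul, dispense_ul, replicates_per_pass):
    """Summarize volumes for dispensing from a diluted intermediate plate.

    replicates_per_pass lists how many assay plates each aspiration serves,
    e.g. [3, 3] for two passes of three replicates. Hypothetical helper.
    """
    well_total_ul = library_ul + diluent_ul
    library_fraction = library_ul / well_total_ul
    aspirations_ul = [dispense_ul * n for n in replicates_per_pass]
    leftover_ul = well_total_ul - sum(aspirations_ul)
    if leftover_ul < 0:
        raise ValueError("intermediate well does not hold enough volume")
    return {
        "library_fraction": library_fraction,
        "aspirations_ul": aspirations_ul,
        # Volume of undiluted library stock that each assay well receives.
        "stock_equivalent_ul": dispense_ul * library_fraction,
        "leftover_ul": leftover_ul,
    }


plan = intermediate_plate_plan(10.0, 10.0, 3.0, replicates_per_pass=[3, 3])
print(plan)
```

For the six-replicate example, each 3 µL dispense of the 1:1 dilution carries the equivalent of 1.5 µL of library stock. The sketch suggests that about 2 µL of the 20 µL intermediate volume remains after both passes.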

The initial challenges associated with automating a two-condition assay emphasized the importance of pipettor calibration to validate transfer volumes. Robust calibration can be achieved through a variety of methods, including gravimetric, photometric and fluorometric quantitation. The gravimetric calibration is simple and fast, but it does not provide information about well-to-well variability, and, since the readout is a well average, it is only recommended for larger volumes of 10 µL or more. Fluorometric and photometric calibrations require a series of standards to which each well may be compared, and independent values for each well can indicate failing pipettor channels or faulty tips, as well as general accumulation of error across a serial dispense that would not be obvious through gravimetric means. Photometric dyes (*e.g.*, Tartrazine) do not photo-bleach and are more stable than fluorometric markers (*e.g.*, Rhodamine Green, Fluorescein), but they lack the sensitivity that is ideal for quantitating sub-microliter transfers. ICCB-L employs high sensitivity fluorometric calibration for sub-microliter calibrations, a simple photometric method for 1–10 µL transfer validations, and a gravimetric approach for any transfer over 10 µL. The gravimetric readout may also be coupled with a photometric reading to identify any outlying wells.
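The volume ranges quoted above can be written as a small lookup, together with a per-well check of the kind that photometric or fluorometric readouts make possible. Both functions are hypothetical illustrations: the boundaries follow the ICCB-L ranges stated in the text, while the 10% tolerance is an arbitrary placeholder rather than a recommended acceptance criterion.

```python
from statistics import median


def choose_calibration_method(transfer_ul):
    """Pick a calibration method from the transfer volume (in microliters)."""
    if transfer_ul <= 0:
        raise ValueError("transfer volume must be positive")
    if transfer_ul < 1.0:
        return "fluorometric"   # sub-microliter transfers
    if transfer_ul <= 10.0:
        return "photometric"    # 1-10 microliter transfers
    return "gravimetric"        # transfers over 10 microliters


def flag_outlier_wells(measured_ul, tolerance=0.10):
    """Return wells whose measured volume deviates from the plate median.

    measured_ul maps well names to volumes derived from a standard curve.
    tolerance is a fractional deviation; 0.10 is only a placeholder value.
    """
    center = median(measured_ul.values())
    return sorted(
        well
        for well, volume in measured_ul.items()
        if abs(volume - center) / center > tolerance
    )


print(choose_calibration_method(0.5), choose_calibration_method(3.0))
print(flag_outlier_wells({"A1": 1.00, "A2": 1.02, "A3": 0.70}))
```

A well that is flagged repeatedly across calibration runs would point to a failing channel, whereas deviations that drift along the dispense order would suggest accumulated error across a serial dispense.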

It is possible to perform siRNA transfections on a variety of interfaces beyond traditional tissue culture-treated polystyrene or glass assay plates. Two examples are transwell plates and hydrogel coated plates. A Transwell 96-well plate (Corning) is comprised of two parts: an upper compartment with individual wells and a lower compartment that has either individual wells or a communal trough. These two compartments are separated by a membrane, thus transwell plates can facilitate a wide array of trafficking assays. Hydrogel coated plates [6] can be created with tunable rigidity, more closely simulating the interfaces found in mammalian tissues compared to glass or polystyrene surfaces. When working with these forms of labware, it is important to carefully consider the pipettor depth and piston speed while aspirating, dispensing and mixing, as close proximity or vigorous liquid handling may disrupt the cell layer or interface.

**Figure 8**: **Three replicate x 2 and two replicate x 1 automated transfection with intermediate plate. 1**) Bravo moves to the tip box, attaches tips (not displayed) and then moves to the Library plate and aspirates 13 µL. **2**) Bravo moves to the Intermediate plate containing assay-specific controls and 13 µL diluent, and dispenses 13 µL of the library. It then mixes the Intermediate plate by aspirating and dispensing 13 µL three times. Bravo aspirates 9 µL. **3**) Bravo moves to REP A, REP B and then REP C, dispensing 3 µL of the diluted library and controls into each plate. **4**) Bravo moves to the Intermediate plate and aspirates another 9 µL. **5**) Bravo moves to REP D, REP E and then REP F, dispensing 3 µL of the diluted library and controls into each plate. **6**) Bravo moves to the Intermediate plate and aspirates another 6 µL (not displayed). REP A and REP B are replaced with REP G and REP H. Bravo moves to REP G and then REP H, dispensing 3 µL of the diluted library and controls into each plate. **7**) Bravo mixes REP F through REP H (F, E, D, C, H and then G) by aspirating and dispensing 5 µL three times. **8**) REP G and REP H are replaced with REP A and REP B. Bravo mixes REP B and then REP A by aspirating and dispensing 5 µL three times. One set of tips is utilized for all of the steps outlined above.


Finally, it should be noted that siRNA screens may also be run without the use of transfection reagents. Electroporation has long been a promising alternative for introducing siRNA into difficult-to-transfect cell lines, but concerns including increased cell death, transfection efficiency and cost of consumables must be addressed. High throughput electroporation platforms have recently become commercially available (*e.g.*, Lonza HT Nucleofector System). Integrating these platforms into laboratory automation requires careful logistical planning and testing of each step.

The complexity of the automated transfection can increase as needed. ICCB-L has successfully implemented eight replicate assays, with two experimental conditions in triplicate plus a viability arm in duplicate (Fig. **8**). In addition, a 15 replicate, five cell line forward transfection campaign has been performed. However, these major endeavors are costly, time consuming and require meticulous attention to the timing and logistics of every step.

In conclusion, laboratory automation impacts nearly all aspects of RNAi screening, with some of the greatest influences found in the method of library storage and how siRNA transfections are performed. This important technology can be used to implement simple and complex experimental protocols for a wide variety of RNAi assays.

### **ACKNOWLEDGEMENTS**

We are grateful to members of the ICCB-L Screening Facility, particularly S. Rudnicki, D. Flood and S. Chiang for helpful discussions and feedback. This work was supported by National Institutes of Health Grant U54 AI057159 and Harvard Medical School.

## **CONFLICT OF INTEREST**

The authors confirm that the contents of this chapter present no conflict of interest.

### **REFERENCES**



© 2014 The Author(s). Published by Bentham Science Publisher. This is an open access chapter published under CC BY 4.0 https://creativecommons.org/licenses/by/4.0/legalcode

# **Public Repositories for RNAi Screening Data**

**Esther E. Schmidt<sup>1,\*</sup>, Michael S. Banos<sup>2</sup>, Jennifer A. Smith<sup>3</sup>, Amanda Birmingham<sup>2</sup>, Michael Boutros<sup>1</sup> and Caroline E. Shamu<sup>3</sup>**

*<sup>1</sup> German Cancer Research Center (DKFZ), Division Signaling and Functional Genomics and Heidelberg University, Department for Cell and Molecular Biology, Medical Faculty Mannheim, Im Neuenheimer Feld 580, 69120 Heidelberg, Germany; <sup>2</sup> Dharmacon, part of GE Healthcare, 2650 Crescent Dr., suite 100, Lafayette, CO 80026 USA and <sup>3</sup> ICCB-Longwood Screening Facility, Harvard Medical School, 250 Longwood Ave., Seeley G. Mudd Building, Room 604, Boston MA 02115-5731, USA* 

**Abstract:** Public repositories for genomic data, such as sequencing and expression studies, play key roles in the dissemination of large-scale studies. It can be expected that repositories for functional genomic data, such as RNAi screens, will have a similar important role. RNAi data repositories store information about RNAi reagents and results from RNAi screening experiments, and present them in a structured and searchable manner. Implementation and use of robust, public RNAi databases is critical to realizing the potential of RNAi experiments. These databases allow investigators to re-analyze deposited datasets to ask new and different questions, and they are a rich source for functional gene annotation. This chapter describes challenges faced as databases for genome-scale RNAi screening results are developed: the diversity of RNAi assays carried out in multiple cell types and organisms; the variety of identifiers and annotations used to describe RNAi reagents; the lack of an established and accepted ontology to describe RNAi experiments; and the challenge of curating RNAi screening results and collecting complete datasets. Examples of Laboratory Information Management Systems (LIMS) that store RNAi data and of RNAi reagent and result annotation databases are provided.

**Keywords:** Data annotation, data comparison, data curation, data management, data repository, database, genome-scale RNAi screen, high throughput screening (HTS), RNA interference, RNAi, shRNA, siRNA.

### **THE POTENTIAL OF RNAi DATA REPOSITORIES**

Since the discovery of RNAi in 1998, screening experiments based on this mechanism have moved from experimental applications to highly automated, well-established routine procedures. High-throughput technology enables the efficient production of RNAi knockdown phenotypes on a large scale in a variety of species [1, 2]. The number of large-scale RNAi screening experiments reported in the literature has substantially increased in recent years: a PubMed search for "genome rnai" returns 59 entries for the year 2001 and 552 for the year 2013.

**<sup>\*</sup>Corresponding author Esther E Schmidt:** German Cancer Research Center (DKFZ), Im Neuenheimer Feld 580, 69120 Heidelberg, Germany; Tel: +49 6221 421958; Fax: +49 6221 421959; E-mail: e.schmidt@dkfz.de

RNAi technology has proven useful in a number of different applications: it has, for example, been used in the elucidation of developmental phenotypes in *Caenorhabditis elegans* [3] and *Drosophila melanogaster* [4], and it serves as a tool for deciphering the structure and regulation of complex signaling pathways [5, 6]. The concept of synthetic lethality (here referring to RNAi double knockdown with the aim of identifying genes that are lethal when abrogated in combination but not individually) is a promising tool in the search for new drug targets in cancer [7, 8].

As more and more RNAi screening experiments are done at genome-wide coverage, they have the potential to provide unbiased answers to research questions and give unexpected new insights into gene functions. For example, Luo *et al.* conducted a genome-wide RNAi screen to identify synthetic lethal interactions with the *KRAS* oncogene (Gene ID: 3845). Among the list of Ras synthetic lethal candidates, they found only a few genes from pathways known to be implicated in Ras-driven oncogenesis. The vast majority of the candidate genes were spread over multiple cellular functions not previously associated with Ras activity, thus opening new avenues in terms of exploiting target genes for therapeutic intervention [9].

Apart from providing answers to a given research question, RNAi screening data may enable other investigators to use an existing dataset to ask new and different questions in addition to what the screen was initially designed for. RNAi data repositories will constitute an important contribution to this end by collecting diverse screening data in one place and presenting them in a structured and searchable manner. Moreover, datasets would be accompanied by well-annotated experimental and data analysis protocols, facilitating reproduction of results as needed.

RNAi screening data are also a rich source for functional gene annotation. The systematic interference with gene expression and the use of a variety of screening assays contributes to the elucidation of gene function across all areas of biology. Collecting this information in a central repository constitutes a valuable task, complementing details currently available in gene catalogues. For example, phenotype data from the GenomeRNAi database [10] have been integrated into the popular FlyMine data mining tool [11], allowing the user to view and download RNAi phenotypes along with gene information from multiple other data sources.

RNAi technology was first discovered and applied to the model organism *C. elegans* [12], then later to the fruit fly *D. melanogaster* [13]. Such model organisms are far easier to study than mammalian species, especially as they can be subjected to *in vivo* experiments. In addition, their genomes show less redundancy, thus avoiding the difficulty encountered when duplicated genes "make up" for the loss of the gene under examination. Therefore, these model organisms are a popular choice for RNAi experiments and the results are utilized to generate hypotheses for the function of homologous genes in other species. RNAi databases that encompass information on screening data from different species, interlinked *via* gene homology information, represent a very useful resource for cross-species comparison of RNAi results.



**Figure 1:** Summary of potential benefits and challenges in providing RNAi screening data repositories.

RNAi screening data collected in data repositories also lend themselves to meta-analyses, spanning multiple screens and/or analysis methods. For example, genes that share a phenotype within a screen could be represented as a network, and overlaid onto networks obtained from other screens, or indeed onto networks based on other genomic data, such as protein-protein interaction or genomic studies. RNAi data could provide new insights into pathway relationships, contributing functional data complementary to other data sources. Another useful application concerns the comparison of RNAi screens. Phenotype results from multiple screens could reveal central "hub" genes that are involved in many diverse biological mechanisms on the one hand, but comparison across different screens also has the potential to highlight problematic reagents that produce so-called off-target effects, by inactivating partially matching gene transcripts in addition to their intended transcript target. As indicated above, clustering of phenotype data from multiple screens could yield new, possibly unexpected relationships and interactions between genes.
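As a toy illustration of the meta-analyses described above, the following Python sketch derives a co-phenotype network from one screen and looks for candidate hub genes across several screens. The gene names, screen names and the `min_screens` cut-off are placeholders, and a real analysis would need to account for differences in assays, libraries and scoring between screens.

```python
from collections import Counter
from itertools import combinations


def cophenotype_edges(hits):
    """Return sorted gene pairs that share a phenotype within one screen."""
    return list(combinations(sorted(set(hits)), 2))


def hub_genes(hits_by_screen, min_screens=2):
    """Return genes scored as hits in at least min_screens screens."""
    counts = Counter(
        gene for hits in hits_by_screen.values() for gene in set(hits)
    )
    return sorted(gene for gene, n in counts.items() if n >= min_screens)


# Hypothetical hit lists; gene and screen names are placeholders.
hits_by_screen = {
    "screen_1": ["GENE_A", "GENE_B", "GENE_C"],
    "screen_2": ["GENE_A", "GENE_D"],
    "screen_3": ["GENE_A", "GENE_C"],
}
print(cophenotype_edges(hits_by_screen["screen_2"]))
print(hub_genes(hits_by_screen))
```

The same counting approach could be applied to reagent identifiers instead of gene identifiers when looking for reagents that score in unexpectedly many unrelated screens, although recurrence alone does not establish an off-target effect.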

RNAi technology is still comparatively new and its potential is only beginning to be realized. We expect it to continue to grow in significance, making major contributions to the elucidation of gene function and the exploration of interactions across multiple biological pathways. RNAi databases will play a pivotal role in making this data widely available and usable. Fig. **1** summarises the benefits of providing such repositories, as well as challenges faced, as described in the following section.

## **THE CHALLENGES FACED IN DEVELOPING RNAi DATABASES**

## **The Variety of Assays Makes Data Comparison Difficult**

In order for databases to be useful, standardized data representations are needed to allow implementation of efficient search strategies, to enable comparison between different RNAi experiments, and to allow the display of grouped data (*e.g.* all screens performed in a given cell line). The wide variety of assays, which here refers to the experimental strategies to measure phenotypes, being employed in RNAi screening experiments is a major challenge in pursuing data standardization and comparability. RNAi can be performed in whole organisms, *e.g.* in *C. elegans*, *Drosophila* or *Anopheles*, using various reagent delivery methods such as soaking, microinjection, feeding of bacteria or transgenesis [14-16]. Knockdown of gene expression can also be limited to specified tissues by the use of tissue-specific promoters in the vectors utilized for transgenesis [15, 17, 18]. More frequently, RNAi experiments are performed on cultured cells, introducing the reagents directly through the cell culture medium for *Drosophila* cells or *via* siRNA transfection, electroporation or lentiviral infection as required in vertebrate cells [1, 19, 20].

The types of assays used in RNAi experiments vary widely, from single read-out assays (*e.g.* monitoring expression of a luciferase reporter gene) to the observation of multiple complex phenotypic features by high-content microscopy. Microscope images provide a spectrum of measurable features, including cell number, size, shape, arrangement, density, *etc.* [21, 22]. Likewise, whole animal experiments also result in diverse sets of data, and can monitor phenotypes such as organism behavior, organ structure, protein localization in tissues, and many others [23-27]. Even simple read-out results are often combined with other measurements or results from counterscreens carried out in parallel as part of a primary RNAi screen. A typical example would be measurement of inhibition of expression from a specific promoter, combined with the additional condition that the RNAi reagent also meets a certain viability threshold for it to be considered a screening positive. More complex combinations have also been implemented [2, 20]. It should be noted that while most RNAi screens produce quantitative measurements, yielding numerical data, some assay readouts are only textual descriptions of visual observations.
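The combined criterion described above, a reporter readout plus a viability condition, might be expressed as follows. This is a minimal sketch: the normalization to negative controls and both threshold values are assumptions chosen for illustration, not values taken from any particular screen.

```python
def is_screening_positive(reporter_ratio, viability_ratio,
                          max_reporter=0.5, min_viability=0.8):
    """Score a reagent using a reporter readout and a viability counterscreen.

    Both ratios are assumed to be normalized to the negative control
    (1.0 means no change). Threshold defaults are placeholders.
    """
    reporter_inhibited = reporter_ratio <= max_reporter
    cells_viable = viability_ratio >= min_viability
    return reporter_inhibited and cells_viable


print(is_screening_positive(0.3, 0.95))  # reporter inhibited, cells viable
print(is_screening_positive(0.3, 0.40))  # reporter loss may reflect toxicity
```

A repository that stores only the final positive or negative call loses the two underlying measurements, which is one reason the definition of a "hit" needs to be recorded alongside the data.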

### **Standardization of Identifiers and Annotations is Needed**

RNAi assays cover a range of observations because they constitute individual approaches for any biological question that a researcher might pursue. This results in an abundance of phenotype definitions that is challenging to contain in the sort of structured ontology framework that would be desirable for collection and comparison of RNAi screening data. Moreover, RNAi assay protocols are complex, characterized by a great number of reagents, analysis strategies, and variables that must be well-described in order to enable proper interpretation and comparison of experiments. For example, the Minimum Information About an RNAi Experiment (MIARE) reporting guidelines (http://miare.sourceforge.net/HomePage) include a checklist of more than 50 items required to provide only the minimum information for the description of an RNAi experiment. Another effort to provide a standardized framework for the description of experiments involving cellular assays more generally is Minimum Information About a Cellular Assay (MIACA) (http://miaca.sourceforge.net/). MIACA might also be applied to describing RNAi experiments. However, development of both the MIARE and MIACA guidelines appears to have stalled somewhat, possibly due to the current lack of corresponding data repositories. In the field of microarray expression data, where results are routinely deposited into public databases, the Minimum Information About a Microarray Experiment (MIAME) [28] guidelines represent a very successful example of data standardization. Three major data repositories (ArrayExpress, GEO and CIBEX) accept and distribute MIAME-compliant data, secondary data resources have been developed, and most scientific journals require the deposition of data formatted according to the MIAME standard [29].

To describe the targets of RNAi reagents, a number of commonly accepted identifiers are available, *e.g.* NCBI (National Center for Biotechnology Information) Gene ID [30], Ensembl [31], RefSeq [32], FlyBase [33]. The usage of multiple identifier types by different authors presents a challenge. When attempting to compare different data sets, it is critical to select "equivalent" identifiers. Ideally, a database uses one type of identifier as its key (typically an internal identifier), which allows the grouping of all relevant data for a particular gene, *e.g.* for display on a single web page. Other identifiers need to be "mapped" to the reference identifier based on known relationships.

Such a mapping procedure is not always straightforward as several issues may complicate matters. These include: a) the other identifier set may not contain an equivalent identifier at all; b) in the case of closely related genes, the other identifier set may contain several possible equivalent identifiers, leaving ambiguity as to which is the correct one; and c) there may be multiple equivalent identifiers due to the type of identifiers used, *e.g.* when mapping a gene identifier to a set of transcript identifiers. Mapping procedures are further complicated by the fact that identifier sets are updated over time, and different identifier resources have different update cycles. Thus, an identifier relationship established at one time point may not be reproducible at a later time point if one or more of the underlying resources have changed. Mapping procedures may have to be repeated regularly in order to avoid too many out-of-date relationships. Similar challenges have been described in the context of array-based expression data, and tools have been developed to address the (re-) mapping of identifiers across different data sets [34-36].
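The mapping outcomes listed above can be made explicit when identifiers are converted, rather than being silently resolved. The sketch below is a hypothetical illustration: the identifiers are invented, the mapping table is assumed to be a plain dictionary, and the `release` label stands in for whatever versioning the underlying resources provide.

```python
def classify_mappings(source_ids, mapping, release):
    """Sort source identifiers by how they map to the reference identifier set.

    mapping maps each source identifier to a collection of equivalent
    reference identifiers; release records which snapshot was used.
    """
    report = {"release": release, "unmapped": [], "unique": {}, "multiple": {}}
    for source_id in source_ids:
        targets = sorted(mapping.get(source_id, ()))
        if not targets:
            report["unmapped"].append(source_id)      # case a: no equivalent
        elif len(targets) == 1:
            report["unique"][source_id] = targets[0]
        else:
            report["multiple"][source_id] = targets   # cases b and c
    return report


# Invented identifiers, for illustration only.
mapping = {"SRC:1": {"REF:10"}, "SRC:2": {"REF:20", "REF:21"}}
report = classify_mappings(["SRC:1", "SRC:2", "SRC:3"], mapping, "snapshot-A")
print(report)
```

Note that the sketch cannot tell an ambiguous match between closely related genes from a legitimate one-to-many relationship such as gene to transcripts; separating the two requires knowing the entity types involved. Storing the release label with each report makes it possible to detect relationships that change when the mapping is repeated against a later snapshot.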

### **Ontology Development is Essential but Requires a Community Effort**

The use of established ontologies and controlled vocabularies to describe experimental reagents and protocols as they are incorporated into databases is essential for data standardization. Ontologies have been actively developed in several areas, each attempting to define terms for the description of data relevant to the respective field. For example, Gene Ontology (GO) [37] is widely used for the annotation of genes, providing term definitions to describe the cellular location, molecular function or biological process of the gene in question. Ontology development relies heavily on community input and consensus in order to assure the quality of the resource and its acceptance by the research community. "The Open Biological and Biomedical Ontologies" (OBO) Foundry has been set up to coordinate such efforts and to provide a framework for efficient integration of the various ontology projects [38]. The RNAi community is in need of a well-structured phenotype ontology in order to facilitate the unambiguous description of phenotypic observations. There is an OBO-candidate ontology, "Mammalian phenotype", under active development [39], which is currently geared towards the description of mouse knockout phenotypes. The "Phenotypic Quality Ontology" (formerly "Phenotype and Trait Ontology", PATO) [40] provides terms to be used in conjunction with other ontologies to refer to phenotypes, *e.g.* "red", "high temperature", "ectopic". For the annotation of cell-based RNAi phenotypes, however, the challenge remains to develop an ontology, or indeed to define a combination of ontologies, addressing the abundance of phenotypic observations produced by a broad spectrum of biological assays. This task cannot be tackled by an individual research unit alone but would have to be coordinated within a wider representation of the RNAi community.

### **Comprehensive Data are Critical and Valuable**

Comparison of RNAi screening data is impeded by another issue: many data producers do not make available the complete set of results they have obtained from large-scale screens. Frequently, publications describe large-scale experiments with regard to the technical details, but the actual results reported focus only on "interesting hits". These hits are elaborated on in detailed validation studies and follow-up experiments. Other hits may or may not be mentioned, while "negative" phenotype results are often left out completely. It is not uncommon for a publication on a genome-wide screen, carried out as a high-throughput experiment, to leave out the full list of genes included in the study. As a consequence, valuable data are actually lost. High-content RNAi screens represent a particular challenge due to the difficulty of storing and providing the large amount of raw and processed image files they produce.

Though often not of great interest to the scientist undertaking a high-throughput experiment with a specific biological question in mind, comprehensive results, even negative ones, are valuable assets to the research community as a whole. It makes a considerable difference whether a gene has been abrogated in a study and found to have no effect on a specific biological mechanism, as opposed to a gene not being included in the study at all. Here we perceive a strong need for adequate data repositories to welcome submission of comprehensive data, as such data usually exceed the scope and focus of traditional journals [41].
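The distinction drawn above, between a gene that was tested without effect and a gene that was never tested, only survives in a repository if the full list of screened genes is deposited alongside the hits. A minimal sketch with placeholder gene names and a hypothetical function shows the three states a query should be able to return.

```python
def gene_status(gene, screened_genes, hit_genes):
    """Report whether a gene was a hit, tested without effect, or not tested."""
    if gene in hit_genes:
        return "hit"
    if gene in screened_genes:
        return "tested, no phenotype"
    return "not tested"


# Placeholder gene names, for illustration only.
screened = {"GENE_A", "GENE_B", "GENE_C"}
hits = {"GENE_A"}
for gene in ("GENE_A", "GENE_B", "GENE_X"):
    print(gene, gene_status(gene, screened, hits))
```

If only the hit list is published, the second and third states cannot be told apart, which is exactly the loss of information described in the text.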

Comprehensive information about RNAi experiments also includes sequence data on RNAi reagents. These are necessary to assess the specificity of reagents and to annotate RNAi constructs when gene models change; ideally RNAi repositories should require deposition of sequence information for constituent RNAi reagents. A further issue is the appropriate description of controls used in RNAi experiments. Both assay- and library-specific controls are essential for the interpretation of RNAi screening data. The MIARE and MIACA efforts mentioned above encourage authors to supply sufficient details to ensure proper interpretation and/or re-analysis; this should be supported by RNAi data repositories.

## **Curation Strategies – How to Populate Data Repositories?**

We have stressed the need for comprehensive, well-structured and standardized data repositories for RNAi screening data. A question arises as to which is the best method of populating these with data. Two major approaches can be envisaged, namely manual curation by trained curators or direct submission by data producers.

The first approach relies on dedicated curators who extract information from the literature or review information supplied by data producers or by other means. Curators have been trained in applying annotation guidelines, are frequently involved in developing them in the first place, and are familiar with the complexity of the annotation process. The obvious advantage is a high level of consistency in the data representation, while the disadvantage is the high cost in terms of time and resources, resulting in very slow progress towards a comprehensive data repository. It should be noted that curators can only take into account those details that have actually been published or submitted to them, and there is a risk of mis-interpretation of reported data as the curator has not been involved in the data production at source.

The second approach shifts the task of populating the data repository to the data producers themselves. This requires a suitable infrastructure for uploading the data, well-documented annotation guidelines, and, not least, a strategy to motivate data producers to make the effort to submit their data in the first place. Benefits of this approach are the lower cost involved, and the fact that the data producers know their data best and have the entire data set and all background information at their disposal. Given the variety of experimental setups and the resulting complexity of annotation guidelines, expected drawbacks are a lower level of consistency in the data representation, incompleteness of data submissions, and the relatively steep learning curve for individual investigators to accomplish a limited number of submissions.

Neither of these approaches alone will satisfy the requirements of robust, useful databases, so the likely mode of action will have to be a compromise between the desire for strict data standardization and for simple direct submissions. The best case might be a model combining both aspects, namely a staff-assisted data submission procedure. In this model, data producers submit their data themselves, helped along by adequate tools and guidelines, and accompanied by assistance and final review by expert curators. This model has been implemented fairly successfully by the U.S. National Center for Biotechnology Information (NCBI) for their PubChem BioAssay database (http://www.ncbi.nlm.nih.gov/pcassay). GenomeRNAi [10] has also begun to take this approach.

## **TYPES OF RNAi DATABASES**

Databases for storing high throughput screening data can broadly be divided into two types: Laboratory Information Management Systems (LIMS) and result- or reagent-focused annotation databases.

## **Laboratory Information Management Systems (LIMS)**

LIMS store information relevant to laboratory workflow, *e.g.* user and administrative information, in addition to screening results. This type of database is essential for screening facility operations and is usually implemented so that most of the data stored are not publicly accessible, although at least some LIMS have been set up so that reagent and screening data stored in them can be made public *via* the LIMS framework when appropriate. Examples of LIMS developed in academe that support RNAi screening data management include Screensaver [42], MScreen [43], and FlyRNAi.org [44]. Screensaver and MScreen are open-source projects (software code is freely available for new users to customize), and software for FlyRNAi is available from the developers on request. Several commercial LIMS packages that support RNAi screening are also available, including ActivityBase (by IDBS) and BioRails (by Accelrys).

Because LIMS systems integrate administrative and experimental workflows within laboratories, they are usually created to be dynamic and customizable. Many facilities that use LIMS employ at least one staff software developer to carry out ongoing improvements and to adapt the LIMS as lab policies and workflows change. On the administrative side, LIMS store and report on information about screens relevant to operation of the screening facility. This can include information about investigators that perform screens (*e.g.*, contact information), critical correspondence pertaining to investigator work in the screening facility, and, if relevant, billing and accounting information. LIMS also store important screen-specific information, such as details about reagents used in screens (*e.g.*, sequence and catalogue information), well-described assay and data analysis protocols, information about RNAi library copies used for specific screening experiments, dates on which screening activities were performed, and perhaps files that record primary screen hits chosen for follow-up work.

A key function of any LIMS is to match screening results with the contents of library wells that were screened. Thus, a LIMS stores information about RNAi library contents and plate formats in addition to screen data. Many LIMS link out to external, public databases, such as GenomeRNAi or the NCBI Probe database (see below), for additional information describing RNAi reagents. Most LIMS also facilitate library management within the facility, allowing tracking of library plate copies and storing information about plate locations, well volumes, accumulated freeze/thaw cycles, *etc*.
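The matching step described above can be illustrated with a minimal sketch. It assumes a library map keyed by plate and well; all identifiers (`P001`, `R-0001`, `GENE_A`, and so on) are hypothetical placeholders and do not reflect the schema of any particular LIMS.

```python
from typing import NamedTuple


class WellKey(NamedTuple):
    """Plate barcode plus well position; together they identify one library well."""
    plate: str
    well: str


# Hypothetical library map: (plate, well) -> reagent annotation.
library_map = {
    WellKey("P001", "A01"): {"reagent_id": "R-0001", "gene": "GENE_A"},
    WellKey("P001", "A02"): {"reagent_id": "R-0002", "gene": "GENE_B"},
}

# Hypothetical raw readouts as they might come off a plate reader.
raw_results = [
    {"plate": "P001", "well": "A01", "value": 0.42},
    {"plate": "P001", "well": "A02", "value": 1.05},
    {"plate": "P001", "well": "A03", "value": 0.98},  # no library entry
]


def annotate(results, library):
    """Attach reagent annotation to each readout; collect wells with no match."""
    annotated, unmatched = [], []
    for row in results:
        reagent = library.get(WellKey(row["plate"], row["well"]))
        if reagent is None:
            unmatched.append(row)
        else:
            annotated.append({**row, **reagent})
    return annotated, unmatched


annotated, unmatched = annotate(raw_results, library_map)
```

Reporting unmatched wells separately, rather than silently dropping them, makes it easier to spot plate-map errors or control wells that are not part of the library.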

For screening results, LIMS typically store both raw and processed data, recording plate and well layout of assay results for plate-based screens, as well as the identities of and data from experimental controls, including their positions on assay plates. This information enables anyone accessing the screen data to reanalyze the data as necessary (for example, to perform per-plate normalization, Z' factor calculations, *etc*.) and also to compare results directly with those from other screens carried out in the facility. In public databases where the primary goal is to provide annotation for reagents and screening data (see the section below), raw screen data are usually not available, nor is layout information for library and assay wells or detailed information about controls. Thus, re-analysis of experimental data from these databases is somewhat limited and original analyses cannot be perfectly reproduced from the information provided.
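To make concrete why control positions and raw values matter for re-analysis, the following sketch computes a Z' factor and a simple per-plate normalization from control wells. The Z' factor is taken in its usual form, 1 − 3(σ_pos + σ_neg)/|μ_pos − μ_neg|. Normalizing to the median of the negative controls is only one of several possible schemes and is an assumption here, as are the example values.

```python
from statistics import mean, median, stdev


def z_prime(pos, neg):
    """Z' factor: 1 - 3 * (sd_pos + sd_neg) / |mean_pos - mean_neg|."""
    return 1 - 3 * (stdev(pos) + stdev(neg)) / abs(mean(pos) - mean(neg))


def normalize_to_negative(samples, neg):
    """Express each sample well as a fraction of the plate's negative-control median."""
    reference = median(neg)
    return [value / reference for value in samples]


# Illustrative raw values from one assay plate (arbitrary units).
pos_ctrl = [10.0, 12.0, 11.0, 9.0]
neg_ctrl = [100.0, 104.0, 98.0, 102.0]
samples = [95.0, 50.0, 101.0]

plate_z_prime = z_prime(pos_ctrl, neg_ctrl)
normalized = normalize_to_negative(samples, neg_ctrl)
```

Neither calculation is possible without the raw control values for each plate, which is the information that annotation-focused public databases typically omit.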

### **Result- or Reagent-Focused Annotation Databases**

While LIMS databases are typically internally facing, a considerable number of public databases aggregating various data about RNAi reagents and their experimental results have also been developed (for a partial listing, see Table **1**). Several of particular interest are highlighted below.

### *Probe - http://www.ncbi.nlm.nih.gov/probe*

The Probe database [45] from NCBI is a registry of nucleic acid screening reagents. Information currently available consists of individual and/or pooled siRNAs and shRNAs offered by providers such as The RNAi Consortium, Dharmacon (part of GE Healthcare), Life Technologies, and Sigma-Aldrich, although reagent listings may be incomplete. For each reagent record, the expected target location in the current version of the genome is shown and a link to the respective supplier's website is given. Sequence information is also available for a subset of reagents.

## *PubChem - http://pubchem.ncbi.nlm.nih.gov/*

NCBI's PubChem [45] is an aggregation of the databases Substance, BioAssay, and Compound (which is largely irrelevant for RNAi screening). Information on screening samples, including user-submitted RNAi reagents such as siRNAs and shRNAs, is contained in Substance, with each substance record including links to an appropriate Probe record where relevant. Dharmacon (part of GE Healthcare) has begun submitting its RNAi reagents to Substance, making them available for experimenters to use in depositing experimental results generated with these reagents to BioAssay.

BioAssay is "a public repository for biological activity data of small molecules and RNAi reagents" [46] that focuses on information regarding whether or not a tested reagent was determined to be "active" by the testing researcher's criteria of interest. Although predominantly used for small molecule data to date, BioAssay's capacities have been extended to enable it to support submission of information on RNAi screens, such as those done with siRNAs, shRNAs, and dsRNAs against a variety of organisms. BioAssay also now supports deposition of MIARE-compliant reagent and assay protocol information. These data from BioAssay are available to download on either a per-assay or a per-reagent-of-interest basis, although BioAssay lacks the full information about plate layouts, tested controls, and raw screen data that would enable further secondary analysis.

## *GenomeRNAi - http://www.genomernai.org*

GenomeRNAi [10] provides information for human and *Drosophila* RNAi libraries from reagent providers. When available either as a result of literature curation or experimenter-submitted data, associated phenotypic information for a gene/reagent is provided, as well as reagent sequences where approved by the provider. Direct submission of experimental data is encouraged, and guidelines for proper annotation of data using a controlled vocabulary are provided, in effect standardizing the way data can later be consumed. A further step towards data standardization is taken by regularly mapping author-provided gene identifiers to the corresponding NCBI Gene ID. Additionally, reagent specificity and efficiency information are updated regularly using NEXT-RNAi [47].

As well as listing RNAi reagents, the GenomeRNAi website provides the ability to browse, search, and download data; one may also view and download frequent hitters (lists of genes frequently showing a phenotype) as well as links to other resources such as FlyBase, FlyMine, UniProt, and GeneCards. Furthermore, GenomeRNAi data can be accessed *via* a DAS server [48], and the website includes a DAS-based genome browser.

## *RNAiAtlas - http://rnaiatlas.ethz.ch/*

RNAiAtlas [49] acts as an annotation resource for commercially available siRNAs. It includes both annotations provided by the manufacturer and those generated by an independent annotation pipeline based on publicly available transcriptomic data. Reagent annotations include target site details for predicted intended target and off-target transcripts, providing a granular view of which genes and individual variants may be affected in an experiment. Versioned copies of older annotations can also be viewed, allowing users to see how expected targets have changed as transcript information has evolved. Reagent sequences are available to registered users who have a product license from the reagent supplier.

## *RNAi Codex - http://cancan.cshl.edu/cgi-bin/Codex/Codex.cgi*

The RNAi Codex database [50] is composed of sets of public shRNA designs, providing full genome coverage for human, mouse, rat, and *Drosophila*, as well as partial coverage of the *Arabidopsis* genome. All designs are based on the miR-30 backbone, and include the full hairpin and a barcode sequence. Registration provides access to additional designs.

## *WormBase - http://www.wormbase.org/*

WormBase [51] is a repository for a large collection of RNAi reagents for worms, predominantly for *C. elegans*. For each reagent listed, users have the ability to contribute experimental data pertaining to affected phenotypes. Bi-monthly data releases allow researchers to consume up-to-date repository information offline in addition to exploring the contents through an online search interface. In addition to reagent information, WormBase maintains an active user base, offering general resources for the research community, including a discussion board, meeting announcements, and links to other relevant research resources.

## *FLIGHT, ParameciumDB, HIVsirDB, and VIRsiRNAdb*

Several additional databases have emerged catering to the research communities for specific target organisms. These include, but are not limited to, FLIGHT (http://flight.icr.ac.uk/) [52], ParameciumDB (http://paramecium.cgm.cnrs-gif.fr/) [53], HIVsirDB (http://crdd.osdd.net/raghava/hivsir/) [54], and VIRsiRNAdb (http://crdd.osdd.net/servers/virsirnadb/) [55] for *Drosophila*, *Paramecium tetraurelia*, HIV, and human viruses, respectively. All provide varying levels of detail (from reagent type and intended target to experimental efficacy) for screening reagents targeted to their respective organism and allow users to contribute their own experimental data, promoting a community-driven approach to building RNAi knowledgebases. In addition to providing data for siRNA intended targets, HIVsirDB and VIRsiRNAdb also offer experimental data for efficacy of reagents against viral escape sequences when available.


**Table 1:** Summary of selected public RNAi databases



### **CONCLUSION**

The growth of new high-throughput technologies such as whole-genome RNAi screening (as well as microarrays and next-generation sequencing) over the last decade has exponentially increased the volume of biological data being generated. However, because data-management strategies have not matured as quickly as data-generation technologies, the full potential of this tidal wave of biological information has not yet been realized. Aggregation of these data into robust databases is critical but challenging due to the wide range of assays and the variety of annotations that hamper a standardized description of RNAi data.

LIMS for tracking reagent inventory, raw data, and progress through the experimental process are indispensable to labs and/or screening facilities actually performing high-throughput RNAi experiments, as evidenced by the independent development of several such systems. Likewise, public databases of reagents and published phenotypes from the full range of RNAi screens are crucial to the larger community, since combination of these data offers unprecedented opportunities for mitigating limitations of the RNAi technique (such as off-target effects) and for generating novel biological insights. Recognizing this need, a considerable number of RNAi-related databases have been developed to serve various constituencies.

The field as a whole is only beginning to appreciate the benefits of submitting full screening datasets, including sequences of RNAi reagents and both positive and negative results, to public data repositories. Obstacles to establishing widely accepted and utilized repositories for RNAi data (such as the need for more curation efforts, for appropriate ontologies, and for techniques to manage large image collections) are considerable but not insurmountable. We must vigorously pursue the goal of comprehensive data repositories if we hope to ride the wave of RNAi high-throughput data, rather than be left floundering in the surf.

### **ACKNOWLEDGEMENTS**

We would like to thank Ben Shoemaker and Yanli Wang at NCBI for helping to retrieve PubChem summary statistics. This work was supported by National Institutes of Health Grant U54 HG006097 (C.E.S) and Harvard Medical School (C.E.S and J.A.S), and Helmholtz Alliances for Systems Biology (M.B).

### **CONFLICT OF INTEREST**

The authors confirm that the contents of this chapter present no conflicts of interest.

### **REFERENCES**





© 2014 The Author(s). Published by Bentham Science Publisher. This is an open access chapter published under CC BY 4.0 https://creativecommons.org/licenses/by/4.0/legalcode

# **Pooled shRNA Screening**

### **Annaleen Vermeulen<sup>2</sup>, Anja van Brabant Smith<sup>2</sup>, Sarah B. Anderson<sup>3</sup>, Roderick L. Beijersbergen<sup>1,\*</sup> and Kaylene J. Simpson<sup>4,5</sup>**

*<sup>1</sup> The Netherlands Cancer Institute, Plesmanlaan 121, 1066CX, Amsterdam, The Netherlands; <sup>2</sup> Dharmacon, part of GE Healthcare, 2650 Crescent Dr., Suite 100, Lafayette, CO 80026, United States of America; <sup>3</sup> Challenge Technology, Inc., 4950 Ward Road, Wheat Ridge, CO 80033, USA; <sup>4</sup> Victorian Centre for Functional Genomics, Peter MacCallum Cancer Centre, East Melbourne, Victoria, 3002, Australia and <sup>5</sup> Sir Peter MacCallum Department of Oncology, University of Melbourne, Parkville, Victoria 3052, Australia*

**Abstract:** Short hairpin RNA interference (shRNA) screens have earned their place in the technical repertoire of high throughput screening approaches by virtue of their broad applicability to targeting regular and primary cell types and the capacity to perform both positive and negative selection screens *in vitro* and *in vivo*. This chapter focuses primarily on pooled shRNA screens, outlining the breadth of resources available, important library features and methods to establish effective transduction. We discuss assay development and optimization, followed by strategies for hit identification, principally using Next Generation Sequencing (NGS) approaches. Validation of any screen is essential, and our collective experience guides the reader through a range of approaches for confirming that targets identified in the screen recapitulate the biological premise of the screen. We conclude with a thought-provoking discussion on the future of shRNA screens, the challenges and the scope we can look forward to.

**Keywords:** Next Generation Sequencing, pooled screens, RNA interference, shRNA, validation.

### **INTRODUCTION**

Large-scale screens using RNA interference (RNAi) have demonstrated utility in identifying gene targets that play a role in specific biological pathways or disease progression. RNAi screens have been performed using synthetic small interfering RNAs (siRNAs), synthetic microRNA mimics or inhibitors, or expressed shRNAs to trigger the gene silencing event. While RNAi screens using synthetic siRNAs have been widely used [1-12], the applications for RNAi screens using synthetic siRNA reagents are limited by the requirements that the cells used in the screen be relatively easy to transfect and that the phenotype of interest be apparent before the effects of gene silencing diminish. Screens using expressed shRNAs circumvent many of these limitations. Virally-expressed shRNAs can be transduced into most cell types and stable expression of the shRNAs allows analysis of phenotypes that require longer times to develop.

**<sup>\*</sup>Corresponding author Roderick L. Beijersbergen:** The Netherlands Cancer Institute, Plesmanlaan 121, 1066CX, Amsterdam, The Netherlands; Tel: +31 20 512 1960; E-mail: r.beijersbergen@nki.nl

shRNA screens can be performed in either an arrayed or pooled format. In arrayed screening, individual shRNAs are distributed across individual wells of multi-well culture plates and phenotypes are screened on a well-by-well basis. Arrayed shRNA screens have been used to study a variety of processes, including circadian rhythm and mitotic progression [13, 14]. In pooled shRNA screens, hundreds or thousands of different shRNAs are introduced into a population of cells that are then selected for the phenotype of interest; shRNAs that are either enriched or depleted in the selected population are identified by PCR amplification of the shRNA from genomic DNA (gDNA); changes in relative abundance of the individual shRNAs between control and experimental cell populations are evaluated using microarray or NGS technologies (Fig. **1**). Pooled shRNA screens have been used to identify genetic components of various cellular processes that have been similarly interrogated by siRNA screens including cell proliferation [15], tumorigenicity [16, 17], adhesion [18], and migration [19] in cell culture systems. However, there are applications and biological questions that can currently only be addressed using pooled shRNA screening approaches; these include analysis of large gene sets across hundreds of cell lines and *in vivo* screens that will be discussed in the future directions section.
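The comparison of relative shRNA abundance between control and experimental populations can be sketched as a log2 fold change on depth-normalized read counts. This is an illustration under stated assumptions, not the analysis method of any specific study: the counts-per-million normalization, the pseudocount of 1, and the shRNA names are all chosen here for the example, and a real screen would add replicate-aware statistics.

```python
import math


def log2_fold_changes(t0_counts, t1_counts, pseudocount=1.0):
    """Return log2(T1/T0) per shRNA after scaling each sample to counts per million.

    The pseudocount avoids division by zero for shRNAs that drop out entirely.
    """
    total0 = sum(t0_counts.values())
    total1 = sum(t1_counts.values())
    result = {}
    for shrna, c0 in t0_counts.items():
        c1 = t1_counts.get(shrna, 0)
        cpm0 = (c0 + pseudocount) / total0 * 1e6
        cpm1 = (c1 + pseudocount) / total1 * 1e6
        result[shrna] = math.log2(cpm1 / cpm0)
    return result


# Hypothetical read counts for three shRNAs in the reference (T0)
# and selected (T1) populations.
t0 = {"sh1": 1000, "sh2": 1000, "sh3": 1000}
t1 = {"sh1": 250, "sh2": 1000, "sh3": 1750}

lfc = log2_fold_changes(t0, t1)
```

In this toy example `sh1` is depleted (negative value), `sh2` is unchanged, and `sh3` is enriched (positive value).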

For the purposes of this review, we will focus on pooled shRNA screens and the important factors that should be considered in order to ensure that meaningful results are obtained from these screens. A successful pooled shRNA screen relies on the quality of the library, careful experimental design, hit identification and hit validation. Finally, we will discuss how pooled shRNA screening has been applied in novel areas and where pooled shRNA screening will take us in the future.

## **AVAILABLE shRNA LIBRARIES**

The earliest collections of shRNA vectors were generated by academic efforts. Initially, these targeted a small number of genes with three to five individual shRNAs for each gene [20]. Subsequently, larger collections spanning thousands of genes were generated by the Netherlands Cancer Institute [20, 21] and Hannon/Elledge groups [22, 23] and these were made available to the scientific community. The first generation shRNA vectors were based on the expression of a small hairpin RNA under the control of H1 or U6 pol III promoters. These shRNAs consisted of a gene specific sequence between 19 and 27 bases in length, a loop sequence of four to 12 bases (sometimes including a restriction site), followed by the reverse complement of the gene specific sequence terminating in a stretch of five T's. Further insight into endogenously expressed siRNAs led to the generation of microRNA-embedded shRNAs. In such systems, the shRNA is placed into the scaffold of an endogenous microRNA, *e.g.* miR-30. The microRNA-based hairpins resemble native microRNA structures and are processed by the endogenous microRNA pathway, potentially leading to more efficient knockdown of gene expression. These microRNAs can be expressed using RNA polymerase II promoters, allowing for promoters with cell- or tissue type specific expression. In addition, inserts can be generated in which the microRNA is fused to a reporter gene (*e.g.* GFP) to monitor expression. For both systems, inducible vectors have been generated to allow for sequential activation and inactivation of gene expression in stable cell lines. These promoter and shRNA structures are usually incorporated into viral vector systems, including retroviral, adeno-associated viral, adenoviral and lentiviral platforms.

**Figure 1: Pooled shRNA screening workflow** *in vitro***.** A population of cells is transduced with an shRNA library such that on average each cell contains a single shRNA, with a recommended 500-1000 copies of each shRNA in the cell population. Depending on the shRNA vector used, cells can be subjected to antibiotic selection or sorted based on fluorescence to generate a population of cells expressing only hairpins. These cells are split into a reference sample that remains untreated (T0) and a sample that is subjected to a selective pressure (T1), ranging simply from growth for multiple passages to drug dosage to identify resistant or sensitive shRNA targets. Following a certain time period, the genomic DNA of the T0 and T1 samples is isolated, shRNA-associated sequences are amplified and NGS is applied to count the number of shRNA sequences in each sample to determine the enriched and depleted shRNAs in the experiment, resulting in a statistically ranked hit list.

Currently, there are several large collections of shRNA vectors available (reviewed in [24]), including the Netherlands Cancer Institute (NKI) libraries [20, 21], Hannon-Elledge libraries [22, 23] (available from Dharmacon, part of GE Healthcare), The RNAi Consortium (TRC) collection [13, 25] (available from Dharmacon, part of GE Healthcare, and from Sigma-Aldrich) and TransOMIC collections. Large libraries that are already in the pooled shRNA format are available from several vendors including GeneNet libraries from System Biosciences, Decipher libraries from Cellecta [26] and Dharmacon Decode Pooled Libraries. From these large collections, smaller libraries have also been generated to target specific gene families or other genes of interest.

The characteristics of a pooled shRNA library directly impact the quality of data obtained from a pooled screen. Important characteristics include: (1) the robustness of knockdown, (2) the composition of the library, including gene coverage and the number of shRNAs per gene, and (3) the uniformity of the pooled library.

## **Robust Knockdown**

The efficiency of individual shRNAs to suppress gene expression is central to a robust pooled shRNA screen. Over time various algorithms have been developed to enable the selection of highly functional targeting sequences. Although these algorithms are quite successful for siRNA design, achieving on average more than 80% knockdown, this is in general not the case for shRNA vectors. Also, additional criteria have to be implemented for microRNA embedded shRNAs, ensuring proper processing by the endogenous microRNA machinery, including processing by Drosha and Dicer [22]. Due to the inability to select highly active shRNAs, libraries in general contain multiple, unique shRNAs targeting the same transcript. The number of shRNAs per gene can range from three to 25. The presence of multiple independent shRNAs per gene allows for the calculation of a combined score for each gene that can determine a value for hit selection. The presence of multiple shRNAs per gene also facilitates the discrimination between on- and off-target hits in large scale screens (see Hit Validation).
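One simple way to picture a combined per-gene score is to summarize the individual shRNA values by their median and to count how many independent shRNAs pass a threshold. The sketch below assumes per-shRNA log2 fold changes as input. The choice of median, the threshold of -1.0, and the gene and shRNA names are illustrative assumptions; published screens use a variety of scoring schemes.

```python
from collections import defaultdict
from statistics import median


def gene_scores(shrna_lfc, shrna_to_gene, threshold=-1.0):
    """Summarize per-shRNA log2 fold changes into a per-gene record."""
    by_gene = defaultdict(list)
    for shrna, lfc in shrna_lfc.items():
        by_gene[shrna_to_gene[shrna]].append(lfc)
    return {
        gene: {
            "median_lfc": median(values),
            "n_shrnas": len(values),
            # Number of independent shRNAs at or below the depletion threshold.
            "n_below_threshold": sum(v <= threshold for v in values),
        }
        for gene, values in by_gene.items()
    }


# Hypothetical values: three shRNAs agree for GENE_A, only one scores for GENE_B.
shrna_lfc = {"a1": -2.0, "a2": -1.5, "a3": -1.8, "b1": -2.5, "b2": 0.1, "b3": 0.0}
shrna_to_gene = {"a1": "GENE_A", "a2": "GENE_A", "a3": "GENE_A",
                 "b1": "GENE_B", "b2": "GENE_B", "b3": "GENE_B"}

scores = gene_scores(shrna_lfc, shrna_to_gene)
```

Here `GENE_A` is supported by all three shRNAs, whereas `GENE_B` is driven by a single shRNA, a pattern that would be more consistent with an off-target effect.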

### **Library Composition**

Depending on the scope of the screen, libraries can be designed to target all genes in the organism of interest (whole genome) or all the genes in a specific biological area of interest (*e.g.* protein kinases, DNA damage pathway or apoptosis collections). Because it is essential to have multiple shRNAs per gene, the size of a collection can become quite significant. If one were to include five shRNAs per gene in a human whole genome collection, this would effectively mean a collection of more than 100,000 shRNAs. The use of such large collections in pooled screens is a logistical challenge requiring large numbers of cells and amounts of assay reagents (discussed in Assay Development and Screen Optimization). As such, screens are frequently designed around a limited number of genes and collections between 1,000 and 5,000 shRNAs.
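The scale of the logistical challenge can be estimated with simple arithmetic. The sketch below assumes roughly 20,000 protein-coding genes for a human whole genome collection, a 500-fold representation per shRNA, and that about 30% of cells are transduced. These three values are assumptions for illustration and should be replaced with the parameters of the actual screen.

```python
import math


def library_size(n_genes, shrnas_per_gene):
    """Total number of shRNAs in the collection."""
    return n_genes * shrnas_per_gene


def cells_to_transduce(n_shrnas, fold_representation, fraction_transduced):
    """Cells needed at transduction so each shRNA is carried by
    `fold_representation` transduced cells on average."""
    return math.ceil(n_shrnas * fold_representation / fraction_transduced)


# Whole-genome example: about 20,000 genes (assumed) with five shRNAs each.
genome_wide = library_size(20_000, 5)
cells_genome_wide = cells_to_transduce(genome_wide, 500, 0.3)

# Focused example: a 5,000-shRNA collection under the same assumptions.
cells_focused = cells_to_transduce(5_000, 500, 0.3)
```

Under these assumptions the whole genome collection requires on the order of 1.7 × 10^8 cells at transduction, compared with roughly 8 × 10^6 cells for the focused collection, which illustrates why smaller collections are often preferred.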

## **Pool Uniformity**

Critical to the success of a pooled shRNA experiment is the requirement that each shRNA is represented at approximately equal levels in the pool. This ensures that each individual shRNA is interrogated in the screen and reaches a sufficient threshold to allow identification of targets, especially in the case of depletion compared to the control population. Several strategies to minimize differences in abundance have been used to create libraries, including growth of individual bacterial cultures or the plating of individual colonies on agar plates. Ideally, one could isolate DNA from individual cultures and add equal amounts to the plasmid pools for the libraries; however, this requires a sophisticated infrastructure, especially when creating large collections. Generally, pooling of individual cultures or scraping and mixing of bacterial colonies is used for plasmid DNA isolation. The relative representation of each individual shRNA can be determined by deep sequencing of the DNA pool used to make the virus [27]. This provides the researcher with a solid starting point to determine subsequent enrichment or depletion under assay conditions.
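Once the plasmid pool has been deep sequenced, its uniformity can be summarized in a few numbers. The following sketch reports the fraction of shRNAs whose read counts fall within a chosen fold of the median and lists shRNAs with no reads. The 5-fold window and the example counts are assumptions for illustration, not an accepted quality standard.

```python
from statistics import median


def uniformity_report(counts, fold=5.0):
    """Summarize how evenly shRNAs are represented in a sequenced pool."""
    values = list(counts.values())
    med = median(values)
    within = [c for c in values if med / fold <= c <= med * fold]
    return {
        "median_count": med,
        "fraction_within_fold": len(within) / len(values),
        # shRNAs that were not detected at all in the pool.
        "missing": sorted(name for name, c in counts.items() if c == 0),
    }


# Hypothetical read counts from sequencing a small plasmid pool.
pool_counts = {"sh1": 900, "sh2": 1000, "sh3": 1100, "sh4": 150, "sh5": 0}

report = uniformity_report(pool_counts)
```

In this example three of the five shRNAs lie within 5-fold of the median, `sh4` is under-represented, and `sh5` is absent and could not be scored for depletion in a screen.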

### **ASSAY DEVELOPMENT AND SCREEN OPTIMIZATION**

Prior to running a screen, it is crucial to optimize all assay and screening parameters. These optimization steps increase reproducibility between samples and hit calling capabilities. Experimental variability can arise in practically all stages of transduction and screening, including packaging of viral particles; transduction medium, additives and duration; cell density at transduction; functional viral titer in the cell line of interest; selection of transduced cells; multiplicity of infection (MOI); average shRNA fold representation during transduction, cell passaging and PCR; number of biological replicates; and application of selective pressure and phenotypic selection. These variables must be considered and addressed in order for a pooled shRNA screen to be run robustly.

Lenti- or retroviral particles are the most commonly used transduction vehicles for pooled shRNA screening because of their ability to transduce cell types that are refractory to transient transfection or other RNAi delivery methods, and their prolonged expression following genomic integration. In addition, viral vectors allow for the introduction of a single shRNA expressing vector per cell, which can be important for phenotypic selection in a screen. Packaging of virus from plasmid pools should be performed using a large number of cells to ensure adequate representation of shRNAs in the pool and to maintain consistent viral stocks. Once the virus is packaged, ideal transduction conditions and efficiency will vary for every cell line and must be determined empirically.
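The relationship between MOI and the number of shRNA vectors per cell is commonly approximated with a Poisson distribution. The sketch below uses that approximation to estimate the fraction of cells transduced and, among those, the fraction carrying exactly one integration. The Poisson model itself is an assumption (it treats infection events as independent and cells as equally susceptible), and the MOI values shown are examples only.

```python
import math


def poisson_pmf(k, moi):
    """Probability that a cell receives exactly k integrations at a given MOI."""
    return math.exp(-moi) * moi ** k / math.factorial(k)


def transduction_summary(moi):
    """Fraction of cells transduced, and fraction of those with a single integration."""
    p_zero = poisson_pmf(0, moi)
    p_one = poisson_pmf(1, moi)
    transduced = 1 - p_zero
    return {
        "fraction_transduced": transduced,
        "single_among_transduced": p_one / transduced,
    }


low_moi = transduction_summary(0.3)
high_moi = transduction_summary(2.0)
```

Under this model, lowering the MOI increases the share of transduced cells that carry a single shRNA, at the cost of transducing fewer cells overall.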

Well characterized positive and negative control shRNAs should be used to fully determine transduction parameters prior to beginning a pooled shRNA screen. Positive control shRNAs with validated knockdown efficiency against a biologically relevant target can be used to assess the level of knockdown in a given set of conditions and the strength of the selective phenotype under screening conditions. Negative control shRNAs, both technical (*e.g.* a gene target that induces cell death) and biological (on target knockdown but no phenotype in the screen) can be used to assess the effects of transduction conditions on cell health and viability and the variation or noise under phenotypic selection. In addition to these shRNA controls, cell passage number and reagent lot numbers should be closely monitored to ensure consistency between assay optimization experiments and the pooled shRNA screen itself.

Preliminary transduction optimization steps include establishment of standard media additives such as hexadimethrine bromide (commonly known as Polybrene) and serum. Polybrene is a cationic polymer that is thought to aggregate viral particles and neutralize cell surface charge, facilitating viral access to cell surface receptors [28]. Polybrene is generally used at 1–10 µg/mL and should be tested in every cell line of interest to assess effects on transduction efficiency and viability. Ideally, transductions should be performed in serum-free medium; however, this is not possible in all cell lines, in which case the minimal allowable serum concentration should be assessed to maximize transduction efficiency.

Once medium conditions have been established, cell density and duration of transduction can be determined. Optimal cell density for transduction can be assessed by plating cells at a range of densities and transducing at a single MOI. Since cell number is a critical parameter in MOI calculations, it is important to know the doubling time of the cell line of interest, and therefore the number of cells at the time of transduction. Duration of viral transduction can also vary dramatically between cell lines. Times ranging from 4 to 24 hours are appropriate for viral transduction and should be determined for every cell line.

Functional titer, or infectious titer, is a measure of the transducing units per milliliter of virus in a cell line of interest. This metric must be determined empirically for every cell line and culture condition. Functional titer cannot be directly converted from titers that have been determined by other methods, such as p24 titering assays or colony forming unit assays in a different cell line. To determine the functional titer for a specific cell line, the viral pool should be titered in the cell line of interest using a colony formation assay or a FACS assay. Alternatively, a relative functional titer can be determined using a negative control shRNA virus that has been titered in parallel with the viral pool. This negative control is titered using the exact conditions determined in the optimization experiments, resulting in a conversion factor that can be calculated from the original titer of the negative control and the titer under screening conditions. This conversion factor, or relative functional titer, can then be used to determine the functional titer of the viral pool. For researchers who make their own virus, the former method is recommended, while the latter technique is useful for researchers who purchase pre-packaged virus or who plan to use the virus under several different circumstances.
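The relative functional titer calculation can be illustrated with a minimal sketch. It assumes the conversion factor is the simple ratio of the negative control titer measured under screening conditions to its original titer, which is one plausible reading of the description above. The function names and all titer values are hypothetical and chosen only for illustration.

```python
import math

def conversion_factor(control_titer_original, control_titer_screen):
    """Ratio of the control titer under screening conditions to its original titer.

    Assumption: a simple ratio is an adequate conversion factor.
    """
    if control_titer_original <= 0:
        raise ValueError("original titer must be positive")
    return control_titer_screen / control_titer_original

def functional_titer(pool_titer_original, factor):
    """Estimated functional titer (TU/mL) of the viral pool in the cell line of interest."""
    return pool_titer_original * factor

def virus_volume_ml(moi, n_cells, titer_tu_per_ml):
    """Volume of virus needed to reach a target MOI for a given number of cells."""
    return moi * n_cells / titer_tu_per_ml

# Hypothetical numbers: the control drops from 1e8 to 2e7 TU/mL under screening conditions.
factor = conversion_factor(1.0e8, 2.0e7)
pool_titer = functional_titer(5.0e8, factor)
volume_ml = virus_volume_ml(0.3, 3.0e6, pool_titer)

print(f"conversion factor: {factor:.2f}")
print(f"pool functional titer: {pool_titer:.2e} TU/mL")
print(f"virus volume for MOI 0.3 on 3e6 cells: {volume_ml * 1000:.1f} µL")
```

The volume calculation is included because the estimated functional titer is normally used to decide how much virus to add for a chosen MOI and cell number.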

Most viral backbones contain a mammalian antibiotic selection cassette and a fluorescent reporter such as GFP or RFP. Any of these markers can be used to select a population of transduced cells. If using an antibiotic selection cassette, an antibiotic kill curve should be established during assay optimization to determine the concentration and duration of antibiotic selection. In general, the antibiotic treatment is established as the lowest concentration that kills 100% of untransduced cells in three to six days. If using a fluorescent reporter to analyze transduced cells by FACS, fluorescent gates and time after transduction need to be optimized using cells transduced with control reporter constructs at the same MOI established for the screen.

Selecting the appropriate MOI for a screen is a crucial step in the optimization process. To identify shRNAs that are depleted from the population, it is important to have no more than one shRNA integration per cell; for screens based on positive selection, this requirement is less stringent. The number of integrations per cell at any given MOI follows a Poisson distribution, so low MOIs decrease the likelihood that any given cell is infected with more than one shRNA virus. The highest recommended MOI is 0.3, where more than 70% of infected cells are likely to have only one shRNA integrant and only a small minority (under a Poisson model, roughly 14% of infected cells, or about 4% of all cells) are likely to have more than one. MOIs greater than 0.3 increase the likelihood that any cell will have more than one shRNA integrant. In practice, MOIs of 0.3 to 5.0 are commonly used and the choice of MOI depends on the specific assay being used, toxicity of shRNA hairpins and available cell number.
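The Poisson relationship between MOI and integrations per cell can be checked with a short calculation. The sketch below assumes an ideal Poisson process (every cell equally susceptible, every particle independent), which real transductions only approximate. The function names are illustrative.

```python
import math

def poisson_pmf(k, moi):
    """Probability that a cell receives exactly k integrations at a given MOI."""
    return math.exp(-moi) * moi ** k / math.factorial(k)

def integration_fractions(moi):
    """Fractions of cells with zero, one, or several integrants under a Poisson model."""
    uninfected = poisson_pmf(0, moi)
    single = poisson_pmf(1, moi)
    infected = 1.0 - uninfected
    multiple = infected - single
    return {
        "infected": infected,
        "single_among_infected": single / infected,
        "multiple_among_infected": multiple / infected,
        "multiple_among_all": multiple,
    }

for moi in (0.1, 0.3, 1.0):
    f = integration_fractions(moi)
    print(
        f"MOI {moi}: infected {f['infected']:.1%}, "
        f"single among infected {f['single_among_infected']:.1%}, "
        f"multiple among all cells {f['multiple_among_all']:.1%}"
    )
```

At an MOI of 0.3 this model gives about 26% infected cells, of which roughly 86% carry a single integrant; cells with more than one integrant make up about 4% of the whole population.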

A critical and necessary consideration of pooled lentiviral shRNA screening is the extent to which any given shRNA construct in a pooled library will be represented in the screen (*i.e.*, the number of cells that contain an independent genomic integration of any given shRNA or the number of biological replicates of each shRNA integration event). High shRNA representation results in high reproducibility between biological replicates and ensures that there is a sufficient window for detection of changes in shRNA representation after phenotypic selection [27]. shRNA fold representation between 500 and 1,000 is recommended and increasing this number can further enhance the reproducibility of a screen [29]. These recommendations are particularly important if you are interested in observing shRNA depletion hits or more subtle hits with less fold-change between reference and experimental samples.

Once a shRNA fold representation has been established, it should be maintained throughout all steps of the pooled shRNA screen, including cell passages, PCR and NGS. For example, if the shRNA pool contains 1,000 constructs and the shRNA fold representation is 1,000, then 1 × 10<sup>6</sup> transduced cells should be obtained; this means the transduction of at least 3 × 10<sup>6</sup> cells at an MOI of 0.3. The number of 1 × 10<sup>6</sup> cells should be maintained for each biological replicate and at each passage, and 1 × 10<sup>6</sup> genomes' worth of genomic DNA (~ 6 µg gDNA) should be used to recover the shRNA inserts by PCR. If shRNA representation is not maintained through all steps of the screen, biological reproducibility decreases and hits with smaller changes between reference and experimental samples are lost.
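The representation arithmetic in this example can be sketched as follows. The calculation assumes a Poisson model for the infected fraction and roughly 6.6 pg of DNA per diploid human genome; both are assumptions added here for illustration, as are the function names.

```python
import math

def cells_required(n_constructs, fold_representation):
    """Transduced cells needed so each construct is present in the desired number of cells."""
    return n_constructs * fold_representation

def cells_to_transduce(transduced_needed, moi):
    """Cells to expose to virus, given the Poisson-expected infected fraction at this MOI."""
    infected_fraction = 1.0 - math.exp(-moi)
    return math.ceil(transduced_needed / infected_fraction)

def gdna_micrograms(n_genomes, pg_per_genome=6.6):
    """Mass of gDNA (µg) for a number of genomes; 6.6 pg per diploid human genome is assumed."""
    return n_genomes * pg_per_genome * 1e-6

needed = cells_required(1_000, 1_000)
to_transduce = cells_to_transduce(needed, 0.3)
gdna_ug = gdna_micrograms(needed)

print(f"transduced cells needed: {needed:,}")
print(f"cells to transduce at MOI 0.3: {to_transduce:,}")
print(f"gDNA to carry into PCR: {gdna_ug:.1f} µg")
```

Under these assumptions, the number of cells to transduce comes out somewhat above the 3 × 10<sup>6</sup> lower bound quoted in the example, because only about 26% of cells are expected to be infected at an MOI of 0.3.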

Each pooled shRNA screen requires independent biological replicates to ensure detection of biological variation as a result of treatment or selection. Multiple biological replicates are required to perform rigorous statistical analysis of the results and hit selection. In general, three biological replicates are used. If resources are limited, higher shRNA fold representation is more economical than running additional biological replicates; however, at least two biological replicates should be performed in all screens. In addition to replicates, multiple samples obtained at different time points can be used to establish a temporal pattern in the changes of relative abundance of individual shRNAs.

Pooled shRNA screens depend on selective enrichment or depletion of cells based on a phenotype induced after gene knockdown. The types of selective pressure and phenotypic selection are screen-specific and techniques vary widely. A classic approach involves selection of cells based on their rate of proliferation or cell survival, assayed in the absence of treatment (straight lethality screens) [30-32] or under selective pressure of a specific drug (resistance or enhancer screens) [33-38]. However, cell selection can also report on other phenotypes such as expression of cell surface proteins [39-41], intracellular fluorescent reporters [42, 43], and migration or adhesion [44]. If a screening approach allows for generation of a reference and selected population, one can identify shRNAs that are specifically enriched or depleted in selected samples compared to a reference sample.

To determine the relative abundance of each individual shRNA in cell populations obtained under assay conditions (including reference and experimental), genomic DNA (gDNA) is extracted and hairpins are recovered by PCR amplification. PCR primers, PCR conditions, amount of gDNA and the number of PCR cycles should be optimized to ensure that reactions remain in the linear phase of log amplification. Through the addition of adaptor sequences and index tags to permit multiplexing per lane, the samples can be analyzed by NGS. As mentioned above (assay optimization), it is important to maintain shRNA fold representation during PCR amplification. In general this means that several independent PCR reactions must be performed for each sample and then combined for sequencing. The number of PCR reactions can be reduced by enrichment of shRNA inserts by restriction digests and fragment isolation of gDNA or by the use of DNA capture technologies.

## **NGS AND HIT IDENTIFICATION**

Measuring the relative abundance of each shRNA within different cell populations is fundamental to the process of pooled screening. For every population, the abundance of each shRNA should be determined in a quantitative manner. Originally, DNA microarrays containing complementary oligonucleotides to shRNA sequences or associated barcodes were used for abundance detection. Although initially these approaches were sufficient for detection, increasing shRNA library size and more advanced design rules for effective shRNAs (which reduced sequence differences between unique shRNAs) resulted in reduction in specificity and high background hybridization. This higher background hybridization made it challenging to identify a decrease in abundance. Recently, NGS techniques were used to detect abundance of each individual shRNA in each sample. NGS allows for detection of each shRNA in complex libraries even when shRNA sequences are similar. Direct counting of sequence reads has the advantage of dramatically increasing the dynamic range thereby widening the screening window. In addition, multiple screening samples can be combined in one sequencing lane, eliminating artifacts due to different microarray hybridization experiments.

The analysis of individual shRNA abundance is based on sampling from the entire population with the goal of estimating the real frequency of each shRNA in that population. This sampling results in errors, known as Poisson sampling errors, which are caused by insufficient representation of each shRNA in the different populations. To reduce sampling error effects, each shRNA should be represented multiple times, usually at least 1,000 times. However, as pooled shRNA libraries are not normalized with respect to the abundance of each shRNA, a 1,000-fold representation does not avoid sampling error on low frequency shRNAs in the population. The screening system can also strongly influence noise and the correlation between biological replicates. For example, a low frequency of background colonies present in a screen designed to identify genes that upon knockdown cause resistance against drug-induced apoptosis will produce a small number of colonies with different shRNAs. However, these shRNAs will not be shared among replicates because the number of these colonies is too small to represent sufficient sampling from the entire library. To be able to resolve real hits from background, it is essential to compare biological replicates. The different types of sampling errors must be taken into account for the analysis of pooled shRNA screening results.
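The effect of representation on Poisson sampling error can be made concrete with a small calculation. For a Poisson-distributed count with mean n, the standard deviation is √n, so the coefficient of variation is 1/√n. The sketch below assumes pure Poisson sampling and ignores other noise sources such as PCR bias; the function name and the example counts are illustrative.

```python
import math

def sampling_cv(mean_count):
    """Coefficient of variation of a Poisson count with the given mean (1 / sqrt(mean))."""
    if mean_count <= 0:
        raise ValueError("mean count must be positive")
    return 1.0 / math.sqrt(mean_count)

# An shRNA at the nominal 1,000-fold representation versus under-represented shRNAs.
for cells_per_shrna in (1000, 100, 10):
    print(f"{cells_per_shrna:>5} cells per shRNA -> sampling CV {sampling_cv(cells_per_shrna):.1%}")
```

An shRNA present in 1,000 cells has a sampling CV of about 3%, whereas one that is 100-fold under-represented in the library (10 cells) has a CV above 30%, which is why low frequency shRNAs remain noisy even at a nominal 1,000-fold representation.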

Data generated by NGS of pooled shRNA libraries resemble count data, similar to the output from RNA sequencing experiments. Laboratories may set up their own NGS analysis pipelines, and several methods, including DESeq [45] and edgeR [46], have been developed to generate hit lists. A data normalization step is recommended to correct for differences in read numbers between the different samples. Following this step, every construct in each population can be compared as a relative ratio.
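The normalization and ratio steps can be illustrated with a deliberately simplified sketch. It scales raw read counts to counts per million and computes a log2 ratio with a pseudocount. This is not the DESeq or edgeR procedure, which use more refined normalization and dispersion models; the shRNA names, read counts and pseudocount are hypothetical.

```python
import math

def counts_per_million(counts):
    """Scale raw read counts so that each sample sums to one million."""
    total = sum(counts.values())
    return {shrna: n * 1e6 / total for shrna, n in counts.items()}

def log2_fold_change(reference, selected, pseudocount=1.0):
    """log2(selected / reference) on normalized counts; the pseudocount avoids division by zero."""
    ref_cpm = counts_per_million(reference)
    sel_cpm = counts_per_million(selected)
    return {
        shrna: math.log2((sel_cpm[shrna] + pseudocount) / (ref_cpm[shrna] + pseudocount))
        for shrna in reference
    }

# Hypothetical read counts; the selected sample was sequenced twice as deeply.
reference = {"sh1": 500, "sh2": 500, "sh3": 1000}
selected = {"sh1": 2000, "sh2": 500, "sh3": 1500}

lfc = log2_fold_change(reference, selected)
for shrna, value in lfc.items():
    print(f"{shrna}: log2 fold change {value:+.2f}")
```

Without normalization, sh2 would appear unchanged (500 reads in both samples); after correcting for sequencing depth it is roughly two-fold depleted.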

As a general concept, these methods estimate variation based on large numbers of different shRNAs with similar count frequencies, and result in a list of shRNAs that are either enriched or depleted. Cut-offs can then be based on significance and false discovery rates in combination with a fold change threshold. However, it is difficult, due to the nature of shRNA reagents, to translate this directly into a gene hit list without applying additional criteria for selection. All RNAi-based reagents have two characteristics that complicate the performance and analysis of large scale screens: (1) the varying degree of knockdown and (2) the existence of off-target effects [47]. At this moment both the efficiency of knockdown and the degree of off-target effects for individual RNAi reagents are largely unpredictable. It has been suggested that off-target effects due to seed complementarity are less abundant, though not eliminated, with shRNA reagents compared to siRNA. To address the challenges of limited efficiency and off-target effects, most shRNA libraries contain multiple, different shRNAs targeting the same transcript. Hits can be prioritized based on the number of shRNAs per target scoring as a hit in the assay. Both concerns of silencing efficiency and off-target effects are diminished for hits that confirm with multiple, independent reagents. If sufficiently large numbers of different shRNAs per gene are included in the library, a score or metric for each shRNA can be calculated and all shRNAs targeting the same gene can subsequently be tested for a significant change in distribution compared to non-targeting controls [29]. Alternatively, a criterion of the presence of two individual shRNAs targeting the same gene among the hit list is also used, especially for libraries in which five or more shRNAs are present per gene. Another approach based on the same concept is a selection based on the second best hairpin. This method is based on the assumption that when at least two shRNAs for one gene are significantly enriched, the score for both shRNAs should also be significant. If only one shRNA scores statistically as a hit, the gene may still warrant further investigation if there is a strong biological rationale. It is clear that a pooled shRNA screen will yield a list of potential shRNAs and their corresponding genes. Because of the nature of the RNAi reagents, including variable knockdown efficiency and off-target effects, validation of these hits is still required and strategies are outlined below.
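The two gene-level criteria described above (at least two scoring shRNAs per gene, and ranking by the second best hairpin) can be sketched in a few lines. The gene names, scores and threshold are hypothetical, and the score is assumed to be an enrichment metric where higher means stronger enrichment.

```python
def second_best_score(scores):
    """Second-highest shRNA score for a gene, or None if fewer than two shRNAs were measured."""
    ranked = sorted(scores, reverse=True)
    return ranked[1] if len(ranked) >= 2 else None

def gene_hits(shrna_scores, threshold, min_hairpins=2):
    """Genes with at least min_hairpins shRNAs scoring at or above the threshold."""
    hits = {}
    for gene, scores in shrna_scores.items():
        n_scoring = sum(1 for s in scores if s >= threshold)
        if n_scoring >= min_hairpins:
            hits[gene] = n_scoring
    return hits

# Hypothetical per-shRNA enrichment scores (e.g. log2 fold changes), five shRNAs per gene.
shrna_scores = {
    "GENE_A": [2.1, 1.8, 0.2, 0.1, -0.3],
    "GENE_B": [3.0, 0.1, 0.0, -0.2, 0.3],
}

hits = gene_hits(shrna_scores, threshold=1.0)
second_best = {gene: second_best_score(s) for gene, s in shrna_scores.items()}

print("genes with >= 2 scoring shRNAs:", hits)
print("second best hairpin scores:", second_best)
```

In this toy example GENE_B has the single strongest shRNA but fails both criteria, which is the situation where a lone hairpin may reflect an off-target effect and needs additional constructs before the gene is treated as a hit.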

## **HIT VALIDATION**

All high throughput screening approaches require a validation step, whereby the statistically defined primary screen target list is reduced to a more manageable high confidence gene list. Validation strategies can take multiple paths and depend on: 1) the biological question being asked, *e.g.* is there only one assay for your biological question or are there additional assays that can reliably report on your phenotype; 2) the resources available, *e.g.* can you screen in additional cell lines, or can you afford orthogonal assays; and 3) the RNAi reagents available, *e.g.* do you have access to additional shRNA or siRNA sequences. The goal is to perform experiments that progressively increase your confidence that the target and the phenotype are related.

Validation of individual shRNAs can be quite laborious due to the requirement to prepare DNA for each construct, make and titer virus, perform appropriate assays and evaluate knockdown. Therefore, validation strategies can be highly influenced by the magnitude of the hit list. Small numbers of hits (less than 20 targets) from a primary shRNA screen are not usually observed (unless using an especially stringent assay), although it is possible that several targets will exhibit a higher magnitude of enrichment or depletion after the primary screen. The screener is often faced with a fairly large list that can be triaged on the number of shRNA sequences scoring for the same target gene as described above, and also on bioinformatic pathway analysis implicating target genes involved in the same signaling pathways. As mentioned, gene targets where multiple shRNA constructs have high scores can be ranked as high priority hits; however, where only a single shRNA construct is significantly represented in the experiment, additional constructs must be experimentally tested in order to validate the target as a true hit.

For small numbers of primary hits, individual shRNA constructs can easily be obtained and processed in low throughput to verify phenotype and knockdown. In addition, new shRNA constructs can be designed and tested to confirm a gene target. For a larger number of hits, where the screener has no immediate sense of which targets to pursue, one option is to create a sub-pool library of the shRNAs targeting the hit genes and additional constructs, if required, with the aim of identifying three hairpins with the same phenotype. In this case, the sub-pool library must be created with a similar number of non-enriched targets. This approach is relatively high throughput and affords the ability to review a reasonable number of high ranking targets. The sub-pool library is created and assayed as per the primary screen to identify the relative representation of the hairpins (still using NGS) within the context of a smaller pool. Those hairpins that might have influenced the phenotype, but may not have scored significantly due to their effect being statistically diluted by the vast numbers of other constructs, may be more identifiable in a smaller pool. For a depletion screen, a smaller pool translates to the possibility of working with a higher representation at the outset of the screen, requiring fewer cells (relative to that required when working with a large pool) and the opportunity for more closely timed assay points to more definitively identify targets regulating death. Again, the endpoint of this step is to refine the target list down to a workable number of genes.

Validation at the level of knockdown and confirmation that gene knockdown correlates with phenotype is critical. Once a gene target has been verified by the extensive phenotypic methods identified above, the level of gene knockdown must be confirmed. Reagent suppliers guarantee that target gene knockdown should be greater than 75%, confirmed at the mRNA level after short term knockdown or at the protein level after longer knockdown. A non-silencing negative control shRNA construct can be used as a reference for comparing the extent of knockdown. While knockdown alone cannot completely verify that the intended target gene is causing the phenotypic outcome, it remains a very important step in ensuring that there is on-target activity.
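As an illustration of how knockdown relative to a non-silencing control might be quantified at the mRNA level, the sketch below assumes quantitative RT-PCR data analyzed with the common 2<sup>−ΔΔCt</sup> method and 100% amplification efficiency. The chapter does not specify the measurement method, so this is only one possible approach, and the Ct values are hypothetical.

```python
def percent_knockdown(ct_target_kd, ct_ref_kd, ct_target_ctrl, ct_ref_ctrl):
    """Percent knockdown from Ct values using the 2^-ddCt method.

    'kd' is the sample expressing the target shRNA, 'ctrl' the non-silencing control;
    'ref' is a housekeeping gene used for normalization. Assumes 100% PCR efficiency.
    """
    delta_ct_kd = ct_target_kd - ct_ref_kd
    delta_ct_ctrl = ct_target_ctrl - ct_ref_ctrl
    delta_delta_ct = delta_ct_kd - delta_ct_ctrl
    remaining_fraction = 2.0 ** (-delta_delta_ct)
    return (1.0 - remaining_fraction) * 100.0

# Hypothetical Ct values: the target amplifies three cycles later in knockdown cells.
knockdown = percent_knockdown(
    ct_target_kd=27.0, ct_ref_kd=18.0,
    ct_target_ctrl=24.0, ct_ref_ctrl=18.0,
)
print(f"knockdown relative to non-silencing control: {knockdown:.1f}%")
```

Under these assumptions a ΔΔCt of 2 cycles corresponds to exactly 75% knockdown, so the 75% threshold mentioned above requires a shift of more than two cycles.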

The gold standard for confirming that a particular siRNA or shRNA sequence is responsible for a phenotype is to develop a phenotype rescue assay [48]. Usually, a silent mutation is introduced into the target, or a target ortholog is used, so that the rescue construct cannot be targeted by the RNAi reagent but preserves the phenotype. The constructs are transfected into cells and assayed under screening conditions. Complete reversal of the phenotypic effect caused by specific gene knockdown should be observed. While phenotypic rescue is perhaps the ultimate means of verification, it is very low throughput and can usually only be performed for a few targets. The success of this approach, however, can also be limited depending on the extent of the effect of over-expression of the target.

A common alternative validation strategy is the use of chemical compounds that have target activity against your gene of interest. Standard cell culture and assay conditions can be utilized in this approach, although drug sensitivity will need to be calculated over a dose curve. Not all drugs are completely specific for one protein, but this offers a very attractive means of verifying the functional effect of the gene of interest, particularly when the project has a translational component. Confirmation of targeting by western analysis is required to verify the phenotypic outcome. This is also an attractive strategy in the scenario where only a single shRNA has been identified as a hit.

As with other experimental systems, verification that the biological effect can be repeated in additional cell lines and with additional experimental approaches is critical to confirming that the biological effect you observed during the screen is robust and reproducible. When using different cell lines, it is always prudent to confirm gene knockdown.

Following the validation process, the screener will have arrived at a relatively small number of targets to assess further in a biological context. This will include standard approaches such as alternative assays, complementing the shRNA strategies with shorter term siRNA approaches, identification of synthetic compounds that can phenocopy the knockdown result and *in vivo* strategies. The latter can include using inducible shRNA constructs (particularly for genes causing cell death) in orthotopic mouse models, or *C. elegans* and *Drosophila* knockout models.

## **FUTURE DIRECTIONS**

Over the past ten years pooled shRNA screens have become recognized as a powerful method for functional genomic screening in mammalian cells. During the same period, screening models have evolved from relatively straightforward phenotypes such as proliferation or survival to more complex phenotypes including adhesion, migration and even gene expression. In addition, pooled shRNA screens are no longer restricted to *in vitro* models, but are also used for *in vivo* screening (described below). Although the number of novel targets for therapy identified by large scale shRNA screening is limited, it has proven powerful in the identification of biomarkers or potential combination therapies.

Although genome wide screens were appealing at first, it has become clear that such approaches require a tremendous effort in validation and follow-up of the primary screen, as discussed above. Also, the interpretation of results from primary shRNA screens is not straightforward due to the inherent nature of RNAi technology, including off-target effects and low knockdown efficiency. As an alternative, screeners are moving toward smaller, pre-selected gene sets based on gene-families (protein kinases, phosphatases, metabolic enzymes, chromatin modifiers, *etc*.) or other types of information such as expression, mutation or pathway analysis. Another strategy is to use larger numbers of different shRNAs per gene to allow for the identification of multiple independent shRNAs per gene as hits. This type of approach can yield higher confidence in the genes selected as hits, potentially allowing integrated analysis of these genes with other data types such as gene set enrichment, protein-protein interaction or pathway analysis. Indeed, technical improvements in oligonucleotide synthesis on glass slides have greatly facilitated the custom generation of libraries with up to 50 individual shRNAs per gene [43]. Although this increase in number of shRNAs per gene has advantages in hit selection, a potential complicating factor is the representation of all individual shRNAs in a screen. This can become a major hurdle when more complex phenotypes are used as a selection method in a screen. A possible solution to this problem is the use of shRNA collections displaying more efficient or validated knockdown for the genes targeted, thereby reducing the necessity for large numbers of shRNAs in a library. One strategy to generate validated collections is the use of a target site reporter linked to a fluorescent marker present in the same construct driving the expression of the target specific shRNA [49]. While the short target site cannot fully predict shRNA activity on native mRNA, selecting those cells that have reduced expression of the reporter can enrich for active shRNAs in the population. Subsequently these cell collections can be used for screening or the generation of improved algorithms for better prediction of active shRNAs. Together these different technologies will undoubtedly result in better technology platforms for pooled shRNA screening and thereby increase the possible screening models to be used. In particular, *in vivo* screening will become more feasible.

Large scale RNAi screening has moved screens in mammalian cells towards the characteristics of genetic screens in model organisms. However, major differences still exist. In general, RNAi screens score phenotypes based on partial depletion of protein expression. Although at first this can complicate the interpretation of the effects of knockdown, it also creates the opportunity to observe phenotypes based on partial or incomplete knockdown. In the case of lethal genes, this can be sufficient to rescue the lethal phenotype and at the same time cause a biological phenotype (*e.g.* resistance to drug treatment). In addition, one has to deal with potential off-target effects obscuring a biological phenotype. In model organisms gene deletions can be introduced (*e.g.* the yeast deletion collection [50]), generating null alleles for individual genes without affecting other transcripts. As mentioned above, this cannot be used for essential genes, and either hypomorphic or temperature-sensitive alleles are needed to study these essential genes.

Recently, new technologies have been developed enabling gene-editing in mammalian cells, which could potentially be used for large scale screening. The Brummelkamp group has developed a screening system based on haploid cells [51]. These cells carry only one copy of each chromosome and can be used in combination with gene trap technologies to randomly inactivate genes. Analogous to pooled shRNA screens, millions of independent integrations can be generated, selected for a biological phenotype of interest and recovered by NGS [51]. A current limitation for haploid screens in mammalian cells is the limited availability of haploid cell line types and the inability to pre-select cell lines with characteristics associated with the biological question, *e.g.* sensitive and resistant cell line pairs. A potential solution is the use of novel gene-editing technologies such as zinc-finger nucleases (ZFNs) [52], transcription activator-like effector nucleases (TALENs) [53] and RNA-guided endonucleases (RGENs) (reviewed in [54]). These technologies use either proteins or protein-RNA complexes to introduce sequence-specific double-stranded DNA breaks, which have been harnessed to engineer mammalian genomes. Although at this moment the application of these technologies in large scale screening is still challenging, recently there has been considerable progress with the use of the CRISPR (clustered regularly interspaced short palindromic repeats)-associated nuclease CAS9 to modify specific genomic loci on a genome wide scale. The specificity of the CAS9 nuclease is determined by short guide RNA sequences. Consequently, large scale libraries that are suitable for pooled screening can be constructed using array-based oligonucleotide synthesis followed by cloning into lentiviral vectors. Indeed, this approach has been applied successfully in positive "resistance" [55, 56] and negative "lethality" screens in mammalian cells, including human embryonic stem cells [57]. A significant difference between CRISPR/CAS9 based screens *versus* shRNA screens is the complete loss *versus* partial knock-down of the expression of the targeted genes. Although this would certainly aid in the identification of straight lethal genes [58], it could also hinder the identification of genes that only display a phenotype at reduced levels rather than complete absence. Finally, the problem of off-target effects is not eliminated with these new gene-editing technologies [59]. Partial similarity of genomic sequences to the guide RNA can result in mutations or deletions in genes other than the intended target. As a consequence, multiple independent sequences should also be used in CRISPR/CAS9 screening systems to confirm the on-target phenotype.

Despite these recent advances in genome editing technologies, the application of large scale pooled shRNA screening is still expanding. There is a clear trend of moving from screens based on differential viability to more advanced screening models coupled with sophisticated read-outs. The latter include flow cytometry-based high content screening technologies where cell surface or intracellular marker expression can be used to select cells, or even selection based on quantitative analysis of translocation from the cytoplasm to the nucleus. In addition, screening models are moving from 2D culture systems to 3D culture systems and *in vivo* screening. One of the first examples of successful large scale *in vivo* shRNA screens was based on models of leukemia [60]. In these models the grafting efficiency is sufficiently high to screen large collections of shRNA vectors. It is possible to re-generate full organ systems, *e.g.* the liver, under conditions compatible with shRNA screening. Upon ablation of most of the liver with a drug, it can be reconstituted with a drug resistant population of progenitor cells carrying large collections of different shRNA vectors [61]. Another approach is the use of orthotopic transplantation, in which a cell population containing an shRNA library is transplanted into a specific organ. An elegant example of such an approach is a screen for key regulators of neural and malignant glioma stem cells [62].

Besides the development of more advanced screening models, the relative ease at which large scale pooled shRNA screens can be performed allows for the generation of large numbers of screens in many different (tumor) cell lines under different conditions. This allows for the generation of a compendium of genes required for the survival of tumor cells with specific genetic alterations, *e.g.* RAS mutations, PTEN loss or receptor over-expression. The results of such screening efforts can be integrated with genome scale analyses for genomic alterations, gene expression and protein expression and modification thus generating a platform to discover specific dependencies and novel targets for treatment. An interesting strategy is the generation of interaction maps using shRNA vectors that contain 2 different shRNAs, each targeting a specific gene. These can also be applied in a pooled format and the combined effect of knock down of both genes can be addressed [43] thereby identifying genetic dependencies, analogous to the yeast genetic interaction maps [50, 63]. Finally, the combination of pooled shRNA screening with many different compounds can provide data allowing for the clustering of functional classes of compounds and a better understanding of their mechanism of action and potential effect modifiers.

Pooled shRNA screening will undoubtedly find further application in many areas of research. The development of more advanced screening models and better quality libraries, combined with cumulative experience, will enable new discoveries that will find their way into the clinic.

## **COMPETING INTERESTS STATEMENT**

A Smith, S Anderson, A Vermeulen are employed or were formerly employed by Dharmacon. Some of the materials described are products sold by Dharmacon. There are no further patents, products in development or marketed products referenced in this article to declare. RL Beijersbergen and KJ Simpson declare that they have no competing interests.

### **ACKNOWLEDGEMENTS**

Declared None.

### **CONFLICT OF INTEREST**

The authors confirm that this chapter's contents have no conflicts of interest.

### **REFERENCES**




T.R. Golub, M. Meyerson, N. Hacohen, W.C. Hahn, E.S. Lander, D.M. Sabatini, and D.E. Root, Highly parallel identification of essential genes in cancer cells. Proc Natl Acad Sci U S A 105 (2008) 20380-5.




© 2014 The Author(s). Published by Bentham Science Publisher. This is an open access chapter published under CC BY 4.0 https://creativecommons.org/licenses/by/4.0/legalcode

**CHAPTER 5** 

# **RNAi for Viral Disease Control**

**Cameron R. Stewart<sup>1,\*</sup>, S. Mark Tompkins<sup>2</sup>, Kristie A. Jenkins<sup>1</sup>, Leonard H. Izzard<sup>3</sup>, John Stambas<sup>3</sup>, Andrew G. Bean<sup>1</sup>, Mark L. Tizard<sup>1</sup>, Timothy J. Doran<sup>1</sup> and John W. Lowenthal<sup>1</sup>**

*<sup>1</sup>CSIRO Australian Animal Health Laboratory, Geelong 3220, Victoria, Australia; <sup>2</sup>Department of Infectious Diseases, University of Georgia, Athens, Georgia, USA and <sup>3</sup>Deakin University School of Medicine, Waurn Ponds, 3216, Victoria, Australia*

**Abstract:** Zoonotic viruses emerging from wildlife and domesticated animals pose a serious threat to human and animal health and are recognised as the most likely source of the next pandemic. Containment of emerging infectious disease (EID) outbreaks is often difficult due to their unpredictability and the absence of effective control measures, such as vaccines, therapies and diagnostics. RNA interference (RNAi) provides a novel and effective therapeutic strategy to combat infectious diseases through modulation of pathogen and/or host gene expression. In this chapter we discuss the applications of RNAi to combat EIDs. We discuss how RNAi has furthered understanding of virus lifecycles by making possible genome-wide functional genomics studies to discover host functions that are essential for virus replication, and in the process, identify new targets for antiviral therapies. We also discuss the advantages and hurdles associated with the use of RNAi as antiviral therapeutics, in addition to the engineering of disease-resistant livestock using RNAi to protect both humans and animals from EIDs.

**Keywords:** Functional genomics, host-pathogen interactions, RNAi, RNAi delivery.

### **THE IMPACT OF RNAi DISCOVERY ON HOST-VIRUS STUDIES**

### **RNA Interference**

Fire and Mello received the Nobel Prize in 2006 for their discovery of RNA interference (RNAi) in the worm, *C. elegans* [1]. They found that injection of double-stranded RNA (dsRNA) into worms was much more effective in interfering with gene expression than injection of single-stranded antisense RNA. With the discovery that the RNAi process is triggered by dsRNA, the intricacies of the RNAi pathway were quickly unravelled. Work from numerous groups subsequently confirmed that RNAi is a natural, sequence-specific posttranscriptional gene silencing pathway, and is an ancient phenomenon that is shared across kingdoms from fungi, to plants, insects and animals [2-4].

**<sup>\*</sup>Corresponding author Cameron R. Stewart:** CSIRO Australian Animal Health Laboratory, Geelong 3220, Victoria, Australia; E-mail: Cameron.Stewart@csiro.au

At the time of the original discovery, it is unlikely that the researchers would have appreciated the scale and breadth of the impact that RNAi technology would have in little over a decade. Applications of RNAi in plants and animals have revolutionized our understanding of gene regulation, and RNAi has proven an invaluable research tool for understanding the function of specific genes in numerous organisms. Furthermore, RNAi provides a novel and effective therapeutic strategy to combat infectious diseases through modulation of pathogen and/or host gene expression.

## **One Health Approach to Fighting Zoonotic Viruses**

Over 70% of emerging infectious diseases (EIDs) in humans are zoonotic; that is, they originate from animals. Despite scientific advancements over the last three decades, EIDs continue to inflict substantial social and economic costs. Furthermore, they will continue to pose serious threats into the future due to the effects of climate change and increased volumes of trade and human travel. The term One Health refers to the combined use of human and animal health disciplines, and it is widely agreed that this approach offers the best chance to reduce the global impact of EIDs on people, animals and the environment. Perhaps the most impactful examples of zoonotic EIDs are H5N1 (avian) and H1N1 (swine) influenza. Since 2003, persistent outbreaks of highly pathogenic H5N1 avian influenza viruses have decimated poultry production in some regions of the world and have shown 60% mortality rates when transmitted from chickens to humans [5]. In 2009, the H1N1 pandemic virus spread from pigs to humans in over 200 countries and quickly became the dominant circulating strain in the human population [6]. These recurring outbreaks highlight the persistent and devastating nature of influenza infections and increase the risk of new pandemics. The prospect of highly pathogenic strains of influenza virus becoming transmissible between humans has recently raised concerns [7, 8]. In addition to avian influenza virus, we have recently witnessed the emergence of pathogenic zoonotic viruses from previously unappreciated reservoir hosts. Bats harbour a large range of viruses, a subset of which are highly pathogenic in humans, including rabies virus [9], Hendra virus [10], Nipah virus [11], Ebola virus [12] and SARS-like coronaviruses [13].

### **Novel Strategies for Anti-Viral Drug Discovery**

Current control strategies for influenza are largely ineffective. Protection *via* vaccination is complicated by the ability of the virus to rapidly mutate and reassort, and is largely non-existent or ineffective against the highly pathogenic forms. Furthermore, viral resistance to current antiviral therapeutics is increasing (reviewed in [14]), clearly signaling that new strategies to defeat EIDs such as influenza are urgently required. An important first step will be to increase our understanding of host-virus interactions at a cellular and molecular level. RNAi technology provides an opportunity to greatly expand our knowledge in this area.

## **RNAi SCREENS OF VIRUS-HOST INTERACTIONS**

### **Introduction**

Viruses are among the most important causal agents of animal and human disease. For example, influenza virus infection is currently the principal source of combined morbidity and mortality in the world [15] - each year affecting up to 15% of the human population, causing acute illness in millions of people, and resulting in almost 500,000 deaths globally [16].

The rapid evolution of many viruses, coupled with the consistent emergence of new ones, highlights a demand for increased understanding of virus biology and the need for new strategies for antiviral therapies. All viruses lack the full complement of proteins required for the production of infectious virus; for instance, the influenza A genome consists of only 8 segments encoding 10-12 proteins. The limited genomes of viruses, particularly RNA viruses, result in elements of host cells being 'hijacked' and used to facilitate the viral life cycle. Some of the more complex viruses also express proteins that help to evade the host's anti-viral processes. Understanding the host contribution to viral replication and immune evasion is essential for discovering new therapeutic strategies.

The determination of the human genome sequence, coupled with tremendous gains in understanding of RNAi design, has made possible genome-wide functional genomic screens. This technology is an unbiased means of discovery and has led to the identification of previously unappreciated or unknown cellular pathways involved in health and disease. In the context of the virus life cycle, we can now identify hundreds of host genes required for all stages of the virus replication cycle, from cellular virus entry and genome replication to assembly/budding. These data can be considered alongside analogous technologies such as two-hybrid screens, transcriptomics and compound library screens, collectively providing a comprehensive view of the host-virus interactome. In addition to greatly enhancing our understanding of host-virus interactions, RNAi screens identify candidates for new antiviral therapeutics due to the druggable nature of many virus-essential host genes [17].

In this section we briefly discuss genome-wide RNAi screens of host-virus interactions, and profile a select few host genes whose roles in host-virus interactions have been revealed by this technology.

## **Virus-Host Screens**

The use of RNAi screens to identify host factors associated with influenza virus replication has been reviewed recently by our group and by others [18, 19]. We will therefore only cover this topic briefly in this chapter. In total, five RNAi screens have been performed to detect host contributions to influenza virus replication. Some screens focused on early replication events only by employing viruses incapable of replication [20, 21] while others included both early and late events [22-24]. Together, the five screens generated a list of potentially relevant genes that represented approximately 2% of the screened genes. Meta-analysis of candidate genes across the five screens demonstrates a poor degree of candidate gene overlap; however, this might be expected given the different methods, reagents, cell types (immortalized human lung epithelial cells used in [17, 25], primary human bronchial epithelial cells used in [24], Drosophila D-Mel2 cells used in [20], and a human osteosarcoma cell line used in [22]) and viruses employed (A/WSN (H1N1) used in [17, 25], A/PR8 (H1N1) used in [22, 24] and a VSV-influenza pseudovirus constructed to facilitate influenza replication in Drosophila cells [20]) among the five studies. For example, one screen was conducted using reagents in Drosophila, two studies focused on early virus replication events, and three on the entire replication cycle. It is important to note that the five influenza screens did identify common pathways relevant to virus infection, specifically receptor tyrosine kinase (RTK), protein kinase C (PKC), phosphatidylinositol 3-kinase (PI3-K), Raf/MEK/ERK and NF-κB signaling.
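The degree of candidate-gene overlap between screens can be quantified with a simple set comparison. The sketch below is illustrative only: the screen labels and hit lists are invented placeholders, not data from the cited studies, and the Jaccard index is just one of several reasonable overlap measures.

```python
from itertools import combinations

# Hypothetical hit lists: placeholder screen labels and gene symbols,
# NOT the published results of the five influenza screens.
hits = {
    "screen_A": {"GENE1", "GENE2", "GENE3", "GENE4"},
    "screen_B": {"GENE3", "GENE5", "GENE6"},
    "screen_C": {"GENE2", "GENE3", "GENE7", "GENE8"},
}


def jaccard(a, b):
    """Shared hits divided by all hits in either screen (0 = none, 1 = identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_overlap(hit_lists):
    """Jaccard index for every pair of screens."""
    return {
        (s1, s2): jaccard(hit_lists[s1], hit_lists[s2])
        for s1, s2 in combinations(sorted(hit_lists), 2)
    }


def shared_by_all(hit_lists):
    """Genes reported as hits in every screen."""
    return set.intersection(*hit_lists.values())


if __name__ == "__main__":
    for pair, score in pairwise_overlap(hits).items():
        print(pair, round(score, 2))
    print("hits common to all screens:", sorted(shared_by_all(hits)))
```

Low pairwise scores alongside a small common core would mirror the pattern described above, where individual gene hits rarely coincide even when screens target the same virus.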

There have been a variety of other partial and full-genome screens against other virus infections, including human immunodeficiency virus (HIV) [26-28], hepatitis C virus (HCV) [29-33], Dengue virus [34, 35], West Nile virus [36], and more recently, vesicular stomatitis virus [37]. Each of these screens has used different approaches and, as with influenza, even where the same virus was used for the screen, there has been limited overlap of genes identified as important for virus replication [38, 39]. However, despite the lack of overlap of specific genes, common pathways have been identified through meta-analysis [39]. Specific host genes that influence virus replication have also been validated in some of these virus screens. For example, a screen of kinases involved in HCV replication identified carboxyl-terminal Src kinase (Csk) as important; when Csk was silenced by RNAi, HCV replication was reduced in an HCV subgenomic replicon system. This was confirmed using a small molecule inhibitor of Csk, JK239 [30].

### **Host Factors Impacting Virus Life Cycles Identified by RNAi Screens**

### *Interferon-Inducible Trans-Membrane Proteins*

One of the most exciting aspects of RNAi screens is the characterization of host proteins with previously unappreciated roles in virus replication. One example is the interferon-inducible trans-membrane proteins (IFITMs), which, as the name suggests, were first identified due to their up-regulation in response to interferon treatment in human neuroblastoma cells [40]. The IFITM protein family has over 30 member proteins. A genome-wide RNAi screen of influenza A H1N1 virus identified IFITM3 as a gene inhibiting influenza replication, which was subsequently validated in a secondary screen using deconvoluted siRNAs. IFITM1, 2 and 3 were shown to disrupt the early stages of influenza replication, while IFITM3 also inhibited replication of flaviviruses such as West Nile virus and dengue virus. Demonstrating the critical nature of this protein in antiviral immune responses, IFITM3 was required by interferon (IFN)-α and IFN-γ to confer antiviral resistance in U2OS and HeLa cells challenged with influenza virus. IFITM3 in particular therefore plays an important role in the IFN response.

Subsequent work demonstrated that IFITM proteins 1, 2 and 3 inhibited the early replication stages of HIV [41], a function that relied on the intracellular domains of the proteins. Studies were subsequently expanded to show that IFITM proteins display antiviral properties against Marburg and Ebola filoviruses and SARS coronavirus [42]. Here it was shown that IFITM proteins differentially restrict early virus replication depending on virus type; for instance, IFITM1 inhibits Marburg and Ebola viruses more so than IFITM3, while IFITM3 was again shown to be more important for blocking influenza virus than IFITM1 or IFITM2. The exact mechanisms of IFITM antiviral activities are still being elucidated. As they are relevant for a diverse range of viruses, it is currently thought that IFITMs antagonise viruses during the late endocytic pathway [42]. Collectively these studies demonstrate a role for IFITM proteins in the early antiviral response against a broad range of enveloped viruses.


### *The Coatomer Proteins*

Proteins associated with the COP-I coatomer complex are perhaps the best-known example of a host factor required for the replication of a diverse range of viruses. In spite of the poor overlap of hit candidates between screens of the same virus, COP-I complex members have been identified as required for influenza virus replication, and for other viruses such as HIV, Dengue and West Nile virus. The COP-I complex is comprised of 7 protein subunits, α-, β-, β'-, γ-, δ-, ε- and ζ-COP [43]. COP-I coated vesicles are approximately 75-100 nm in diameter, and were first identified localised to the Golgi apparatus [44, 45] and to the endoplasmic reticulum [46]. The precise role of COP-I complexes in the early secretory pathway is not clear. COP-I coated vesicles are involved in the retrograde transport of transmembrane proteins to the endoplasmic reticulum [47, 48], while prospective roles in anterograde transport are less clear. A recent model proposes that COP-I vesicles move in both retrograde and anterograde directions to mediate anterograde cargo transport [49].

The protein components of these COP-I coated vesicles have been identified in a number of high-throughput siRNA screening studies as important host factors during virus infection [31, 36, 50]. The α-COP protein in particular is of interest as an important protein involved during virus infection [51]. RNAi-mediated inhibition of COP proteins, including COPA in PC3 prostate cancer cells [52], and COP-B [53], COPZ-1 [52] and COPZ-2 [52] in other cell types, causes a marked dispersion or "collapse" of Golgi structure. Cytotoxicity associated with COP protein depletion is considered a consequence of modulating the autophagy pathway as a result of ER stress caused by Golgi dispersion [54]. Treatment of cells with the fungal metabolite Brefeldin A (BFA) inhibits COP-I vesicle function in a manner analogous to RNAi, and also results in Golgi dispersion. This can be attributed to the effect of BFA on ADP ribosylation factor 1 (Arf1), a GTP-binding protein required for COP-I complex formation. The activation of Arf1 by a guanine nucleotide exchange factor (GEF) containing a Sec7 domain [55] drives the exchange of bound GDP for GTP and leads to the recruitment of the subunits (subcomplexes) from the cytosol to Golgi membranes [56]. Subsequently, assembly takes place to unite these subcomplexes into one structure. The effect of BFA on Arf1 potently inhibits anterograde membrane transport and the binding of COP proteins to membranes [57], and culminates in Golgi apparatus dispersion and redistribution of Golgi proteins to the ER. BFA impacts replication of a range of viruses, including poliovirus [58], vaccinia virus, and HIV-1 [59].

The mechanism by which COPA and COP-I modulation impacts virus replication may be multi-pronged. Gross impairment of the trans-Golgi network observed for COP depletion would be reasonably expected to impair maturation of the virus proteins requiring glycosylation, such as the haemagglutinin (HA) glycoprotein, therefore inhibiting assembly of infectious virus particles. Indeed, BFA treatment decreases the expression of HA on the plasma membrane of influenza-infected cells [60]. However, COP-I coatomer proteins are also associated with endocytosis [61, 62]. Relatedly, depleting the temperature-sensitive epsilon-COP inhibits endocytosis of Semliki Forest virus and vesicular stomatitis virus (VSV) [63]. A more recent study has demonstrated that COP-I inhibition causes a primary inhibition of VSV internalisation, with a secondary impairment of viral RNA replication [64].

## *Rab6 GTPase*

Rab6 is a Golgi-localised GTPase that regulates endosome-to-Golgi and Golgi-to-ER retrograde transport. Rab6 was identified in a genome-wide RNAi screen of host factors required for HIV replication [28]. While the depletion of Rab6 in HeLa cells restricted HIV replication, this inhibitory effect was not observed when HIV was pseudotyped with VSV G envelope protein, which alters the mode of HIV cellular entry from direct cell membrane fusion to virus endocytosis followed by endosome membrane fusion.

Interestingly, while this first report suggested that Rab6 facilitates HIV entry, subsequent work indicated a role for Rab6 in human CMV assembly. The intracellular formation of CMV virions occurs at perinuclear assembly compartments where viral proteins – including the CMV tegument protein pp150 – localise. The localisation of pp150 to assembly compartments requires an interaction with Bicaudal D1 (BicD1), a host effector protein of Rab6. BicD1 colocalizes with Rab6 in the trans-Golgi network [65]. Disruption of Rab6 function in CMV-infected cells interrupts pp150 intracellular movement and virus production without disturbing formation of assembly complexes [66].

## *Calcium/Calmodulin-Dependent Protein Kinase (CaM Kinase) II Beta*

Calcium/calmodulin-dependent protein kinase (CaM kinase) II beta (CAMK2B) is a widely expressed calcium-binding protein that modulates a range of cellular functions, including those tied to actin-mediated cytoskeletal control and CREB-facilitated transcription. A genome-wide RNAi screen identified CAMK2B as a host gene impacting influenza A (WSN/33) replication [17]. Inhibition of CAMK2B by the chemical compound KN-93 inhibited replication of influenza A/WSN/33 and swine-origin influenza virus, but not VSV, in MDCK cells, hinting that pharmacological modulation of CAMK2B may be a valuable antiviral strategy. How influenza virus replication is facilitated by CAMK2B is not fully understood. Cells with depleted CAMK2B show impaired nuclear translocation of the influenza NP protein, but no defects in cellular virus entry or the formation of viral ribonucleoproteins in the nucleus [17]. The authors suggested that CAMK2B regulates viral RNA transcription.

CAMK2B has been implicated in the replication of another virus, the double-stranded DNA virus African swine fever virus (ASFV) [67]. Replication of ASFV involves the restructuring of scaffolding proteins known as type III intermediate filaments, and a major protein component of such structures known as vimentin [68]. CAMK2B is required for the phosphorylation of vimentin, which triggers the disassembly and movement of vimentin filaments on microtubules. Both viral DNA replication and vimentin post-translational modifications (*e.g.*, phosphorylation) are prevented by KN-93 in Vero cells [67].

Members of the calcium/calmodulin kinase networks were also over-represented in an RNAi screen of druggable genes regulating the replication of Ebola virus [69]. Treatment of HEK 293 cells with KN-93 inhibited infection efficiency of Ebola virus, Zaire strain, pseudotyped with lentivirus. The authors of this study point out that while CAMK2B inhibitors have not been tested clinically as antivirals, studies that implicate calcium modulation in virus replication identify additional targets for therapeutic development.

## **Towards Broad-Spectrum Antiviral Strategies**

The plethora of data generated from RNAi screens of host-virus interactions not only furthers understanding of host pathways impacting virus replication cycles, it also provides an opportunity to drive ahead translational outcomes such as new antiviral therapeutics, diagnostics and vaccines. One example of this, the enhancement of influenza virus vaccine production in embryonated chicken eggs, resulting from RNAi-mediated impairment of the host antiviral response, has received considerable commercial interest, and will be discussed later in this book.

While strategies to treat emerging infectious diseases benefit from virus-specific genome-wide RNAi screens, there is also a case for compiling existing information from completed RNAi screens for the development of broad-spectrum antivirals. This is particularly relevant for sudden disease outbreaks, where a lack of time and resources hinders the management of outbreaks during the crucial early stages. Of the approximately fifty FDA-approved antiviral drugs, over 80% have viral targets, particularly viral enzymes [70]. While this strategy offers benefits with regard to minimising side effects, it encounters obvious limitations with regard to the application of drugs to other viruses. Few if any of these drugs are broadly active. The alternate strategy, targeting the host, requires a detailed knowledge of virus-essential host factors for a wide range of viruses and, furthermore, knowledge of the consequences of modulating host gene expression.

The concept of meta-analysing screen data from many RNAi screens presents several challenges. Firstly, there is the issue of a poor degree of agreement in candidate genes observed when screens are conducted by different research groups. However, taking influenza virus as a case in point, there are obvious explanations for these perceived anomalies. For example, for the 5 RNAi screens performed on this virus, there were differences in the type of virus used, multiplicity of infection, duration of infection, cell type used, readout method deployed, among others. Due to these technical differences from screen to screen, it has been proposed that entire pathways be the subject of meta-analysis, rather than individual genes. Performing this analysis clearly identifies pathways relevant to influenza infection [71-73]. We have discussed these pathways in a recent review [73].
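The pathway-level comparison described above can be sketched in a few lines. Everything in the example is hypothetical: the gene-to-pathway map and the hit lists are toy placeholders (a real analysis would draw annotations from a curated pathway database), and counting how many screens hit each pathway is a deliberately simple stand-in for the statistical enrichment methods used in published meta-analyses.

```python
from collections import defaultdict

# Toy annotation: gene symbols and pathway names are placeholders.
gene_to_pathway = {
    "GENE1": "pathway_X", "GENE2": "pathway_X", "GENE3": "pathway_X",
    "GENE4": "pathway_Y", "GENE5": "pathway_Y",
    "GENE6": "pathway_Z",
}

# Toy hit lists in which no gene is shared between any two screens.
hits = {
    "screen_A": {"GENE1", "GENE4"},
    "screen_B": {"GENE2", "GENE6"},
    "screen_C": {"GENE3", "GENE5"},
}


def screens_per_pathway(hit_lists, annotation):
    """Map each pathway to the set of screens with at least one hit in it."""
    support = defaultdict(set)
    for screen, genes in hit_lists.items():
        for gene in genes:
            pathway = annotation.get(gene)  # unannotated genes are skipped
            if pathway is not None:
                support[pathway].add(screen)
    return dict(support)


def recurrent_pathways(hit_lists, annotation, min_screens=2):
    """Pathways hit in at least `min_screens` independent screens."""
    support = screens_per_pathway(hit_lists, annotation)
    return sorted(p for p, s in support.items() if len(s) >= min_screens)


if __name__ == "__main__":
    common_genes = set.intersection(*hits.values())
    print("genes shared by all screens:", sorted(common_genes))
    print("pathways hit in >= 2 screens:", recurrent_pathways(hits, gene_to_pathway))
```

In this toy case no gene is common to the screens, yet two pathways recur, which is the situation that motivates comparing screens at the pathway level.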

Another issue for consideration is the safety of modulating host gene expression, however briefly, to combat a viral infection. The consequences of modulating any host protein may be deleterious. One example in this context is the COP-I complex proteins, mentioned in the previous section. While COP-I proteins are required for the replication of most viruses tested by RNAi screening, their integral role in ER to Golgi cargo trafficking, and the effect of COP-I depletion on Golgi apparatus structure (and function), call into question whether it would be tolerable for the host to impair this pathway, however briefly. In such instances, a relevant discussion point may be the pathogenicity of the viral pathogen. For example, an individual infected with Hendra virus or Ebola virus would likely be more willing to be treated with a therapy that impairs COP-I function (a potentially life-threatening scenario) than someone infected with a low-pathogenicity virus such as seasonal influenza virus.

### **BIOLOGICAL DELIVERY OF RNAi**

### **Antiviral siRNAs Targeting Host and Virus Genes**

The objective of deploying RNAi to control viral disease in a human disease scenario inevitably requires the use of synthetically produced siRNA as a form of antiviral drug. Options that employ a transgenic approach for humans will be very restricted in technical applicability and will encounter substantial regulatory hurdles. Since the mechanism of RNAi was first recognized, its application to control viral infection by delivery of siRNA has been proposed and tested. The most obvious targets were those viruses that initially or predominantly infect accessible tissues, principally those of the respiratory tract, including respiratory syncytial virus (RSV), parainfluenza virus (PIV) and influenza virus. Early studies showed promise for the activity of siRNA against all of these viruses [74, 75]. For such an accessible location it may be possible to use siRNA as it is, without formulating or complexing reagents. However, the entry of siRNA molecules into the target cells in which their antiviral action is required is a complex process that is not fully understood. Although siRNAs are relatively small as nucleic acids, only 19-21 base pairs on average, as drug molecules they are very large, being in the range of 15 kDa, and highly negatively charged. This means they will not passively transit the lipid bilayer of the cell membrane. If entry is not *via* direct membrane transit then the uptake of siRNA into the cell is expected to be *via* one or more of a series of mechanisms, including receptor-mediated endocytosis, nonspecific endocytosis, clathrin-mediated endocytosis, caveolae, micro-pinocytosis, macro-pinocytosis or phagocytosis. This leaves the siRNA entrapped within the cell in a vesicular body which may be on a pre-programmed pathway through the cell, possibly on a destructive journey to a lysosome. At some point the siRNA has to escape its membrane-bound chamber in order to access the cytoplasm and interact with the RNA-induced silencing complex (RISC) to deliver the desired effect of gene-specific silencing. Mediating this outcome is the design objective of various biomaterials, from simple preparations of polyethyleneimine (PEI) and modifications of chitosan through to more complex Dynamic Polyconjugates [76] and other biopolymers [77].
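The size quoted above (19-21 base pairs, in the range of 15 kDa) can be checked with back-of-envelope arithmetic. The sketch assumes a round-number average mass of 330 Da per RNA nucleotide residue; this constant is an assumption for illustration, not a value from this chapter, and it ignores overhangs, terminal groups, counter-ions and chemical modifications, so the result is an order-of-magnitude estimate only.

```python
# Assumed round-number average mass of one RNA nucleotide residue (in daltons).
# Typical approximations fall around 320-340 Da; this value is an assumption
# for illustration and is not taken from the chapter.
AVG_NT_MASS_DA = 330.0


def duplex_mass_kda(length_nt, strands=2, nt_mass_da=AVG_NT_MASS_DA):
    """Approximate mass in kDa of an RNA duplex with `length_nt` nucleotides per strand."""
    return strands * length_nt * nt_mass_da / 1000.0


if __name__ == "__main__":
    for n in (19, 21):
        print(f"{n}-mer duplex: ~{duplex_mass_kda(n):.1f} kDa")
```

Under this assumption a 19-21-mer duplex comes out at roughly 12.5-14 kDa, consistent with the figure in the text and far larger than the few hundred daltons typical of small-molecule drugs.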

There are a number of research reagents available for transfection of siRNA which work to great effect on a number of cell types *in vitro*. Some have been used *in vivo* experimentally but none of these have made the transition to clinical use in humans. Bitko *et al.* (2004) [74] employed TransIT-TKO (Mirus Bio Corp, USA), administered intranasally in a mouse model, and showed significant knockdown of both RSV and parainfluenza virus. Influenza A virus replication has been controlled in mouse models of infection using hydrodynamic (short duration, high pressure intravenous) delivery of siRNAs targeting the NP or PA genes, in combination with intranasal administration of siRNA formulated in Oligofectamine (Invitrogen) [78], or by intravenous delivery of PEI complexed with siRNAs targeting the NP gene [75].

Systemic or parenteral delivery may be required for non-respiratory virus infection or when systemic spread has occurred. The tissues may be more difficult to access, especially if this involves the brain or other privileged sites beyond the normal vasculature, *e.g.,* behind the blood-retinal barrier or blood-testis barrier. Probably the most accessible tissue subject to morbid viral disease is the liver. Its role as part of the reticuloendothelial system (RES), also known as the mononuclear phagocytic system (MPS), in scavenging wastes is a double-edged sword. The Kupffer cells of the liver pick up a great deal of material from circulation in their role in the MPS, but it is often the liver hepatocytes that are the target of drug delivery or viral infection. Hepatitis B virus (HBV) is a widespread infection, a significant public health burden and a beckoning target for siRNA therapy. However, the need to deliver intravenously and get the majority of the dose to hepatocytes in the liver remains a significant hurdle [79]. Early applications of lipid-encapsulated, chemically modified siRNA showed promise in terms of controlling HBV in a mouse model of virus replication, but required repeated daily dosing to achieve a 1 to 2 log reduction in circulating HBV DNA [80]. Two methods with similar approaches using cationic lipid/cholesterol formulations have had some success in controlling replication of this virus in a mouse model. One utilized co-formulation with apolipoprotein A-1 to target siRNA delivery to hepatocytes and showed reduction in HBV markers for 8 days [81]. Another study used co-formulation with polyethyleneglycol (PEG) as a stealth agent and hydrodynamic i.v. delivery of the siRNA to increase the duration of the anti-HBV effect [82]. More recently this formulation was used to treat hepatitis C virus infection in the mouse, achieving 65-75% inhibition of viral gene expression with a 2 mg/kg dose of siRNA, with an improved efficiency of 95% and an increased duration of effect to 6 days by using a 2'-O-methyl modified siRNA [83].
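Knockdown results in this section are reported both as log reductions (HBV DNA) and as percent inhibition (HCV gene expression). A minimal sketch of the conversion between the two scales follows, assuming base-10 logs and treating the figures purely as arithmetic rather than as a reanalysis of the cited data.

```python
import math


def log_reduction_to_percent(log10_reduction):
    """Percent decrease corresponding to a given log10 reduction."""
    return (1.0 - 10.0 ** (-log10_reduction)) * 100.0


def percent_to_log_reduction(percent_inhibition):
    """log10 reduction corresponding to a given percent decrease (must be < 100)."""
    if not 0.0 <= percent_inhibition < 100.0:
        raise ValueError("percent_inhibition must be in [0, 100)")
    return -math.log10(1.0 - percent_inhibition / 100.0)


if __name__ == "__main__":
    for logs in (1, 2):
        print(f"{logs}-log reduction = {log_reduction_to_percent(logs):.0f}% decrease")
    for pct in (65, 75, 95):
        print(f"{pct}% inhibition = {percent_to_log_reduction(pct):.2f} log10")
```

On this scale a 1 to 2 log reduction corresponds to a 90-99% decrease, while 95% inhibition is about 1.3 log10, so the two sets of results are of broadly similar magnitude.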

A wide range of delivery approaches have been employed *in vivo* with varying degrees of success against a variety of significant viruses [84]. A number of delivery approaches are discussed by Guzman-Villanueva *et al.* [85]. These include lipoplexes (particularly those containing cationic lipids), PEI (alone or complexed with PEG), dendrimers, polymeric carriers (including Dynamic Polyconjugates), gold nanospheres, direct cholesterol conjugates and stable nucleic acid lipid particles (SNALPs). Chemically modified siRNAs have been used in combination with many of these systems, but they have also been used alone, so-called naked siRNA delivery. These modifications are generally based around the 2' hydroxyl of the ribose sugars in the RNA backbone and include modifications such as 2'-O-methyl, 2'-fluoro, 2'-O-methoxyethyl, phosphorothioate, locked nucleic acid, and unlocked nucleic acid. In many mouse models, naked or formulated siRNA has been administered by hydrodynamic delivery, which involves rapid (5-7 sec) injection of a fluid volume of up to 10% of animal body weight. This method is clearly not acceptable for human use.
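The scale of hydrodynamic delivery is easier to appreciate with numbers. The sketch below assumes the injected fluid has a density of about 1 g/mL and uses illustrative body weights (a 20 g mouse and a 70 kg adult); neither weight comes from the studies cited here.

```python
FLUID_DENSITY_G_PER_ML = 1.0  # assumption: dilute aqueous solution, ~1 g/mL


def hydrodynamic_volume_ml(body_weight_g, fraction=0.10):
    """Injection volume in mL for a dose equal to `fraction` of body weight."""
    return body_weight_g * fraction / FLUID_DENSITY_G_PER_ML


def mean_flow_rate_ml_per_s(volume_ml, duration_s):
    """Average flow rate needed to deliver `volume_ml` within `duration_s` seconds."""
    return volume_ml / duration_s


if __name__ == "__main__":
    mouse_ml = hydrodynamic_volume_ml(20)      # illustrative 20 g mouse
    human_ml = hydrodynamic_volume_ml(70_000)  # illustrative 70 kg adult
    rate = mean_flow_rate_ml_per_s(mouse_ml, 5)
    print(f"mouse: {mouse_ml:.1f} mL, {rate:.2f} mL/s over 5 s")
    print(f"human-scale equivalent: {human_ml / 1000:.1f} L")
```

A volume of about 2 mL in a 20 g mouse would scale to roughly 7 L in a 70 kg adult, which illustrates why the method is confined to animal models.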

Of the methods for delivery of siRNA that have made it through to clinical assessment, it is predominantly SNALPs and naked (modified) siRNA that have been targeted to viral disease [85]. TKM-Ebola (Tekmira; intravenous SNALP formulation) and ALN-RSV01 (Alnylam; intranasal) are the key contenders.

The use of siRNA offers great potential as a treatment for viruses that inflict a variety of human afflictions. Demonstrating the effectiveness of therapeutic siRNA against viral target genes remains relatively trivial *in vitro*, but the translation of this to humans in the clinic currently remains tantalizingly out of reach for most of our targets. A simple, effective formulation of siRNA that will get it to the appropriate target cell with sufficient penetration and duration remains the Holy Grail. As we learn more it is clear that simplicity may be a vain hope. RNAi has the potential to be the one-size-fits-all solution to knocking down unwanted or exogenous gene expression. But it is most likely that each virus, with its characteristic tissue tropism and mode of spread, will require a tailor-made delivery vehicle to bring the effector to the target. In the long term the current advances in lipid-based and polymeric delivery vehicles point to great hope for a future in which RNAi can make the impact on viral disease that antibiotics made on bacterial disease in the middle of the last century.

The biosecurity needs of a burgeoning world population and the growing threat of EIDs put tremendous pressure on the medical system as it struggles to combat a variety of viral diseases. Capability now exists to rapidly identify and characterize new viral pathogens, including extracting their nucleotide sequence. This provides the Achilles Heel to attack these pathogens through RNAi in a way that can be rapidly adapted to these new viruses. With the devastating mortality and economic consequences of outbreaks in recent decades of H5N1, SARS, Nipah virus, Hendra virus and others, it is clear that emphasis on getting RNAi delivery through to the clinic should be a priority research objective.

### **Biological Delivery of RNAi**

For a siRNA delivery system to be successful, it must have a number of specific properties. To begin with, it must be able to protect the siRNA from degradation. It must also be able to bind to and enter the target cell, and must be able to efficiently deliver the siRNA to the cell nucleus. For this reason, scientists have utilised virus delivery systems, as viruses have evolved to display all of these properties (4). Current literature has described successful delivery of both miRNAs and siRNAs using a variety of different virus vectors, from lentiviruses and adenoviruses to replication-competent avian with splice acceptor (RCAS) retroviruses. These systems are discussed briefly below.

### **Lentivirus Systems**

Lentiviruses are members of the retrovirus family and contain two copies of a single-stranded RNA genome [86]. Members of the retrovirus family are often highly pathogenic and include HIV. As such, lentiviral vectors are typically engineered as 'self-inactivating' (SIN) vectors. This usually involves removing the U3 region in the LTR and replacing it with a heterologous promoter such as CMV [87]. The key benefits of lentivirus delivery systems are that they are capable of infecting both dividing and non-dividing cells, making them prime candidates for neurological RNAi studies. In addition, they are able to integrate into the genome of the host and stably persist, and can be modified to broaden their tropism [86].

Limitations associated with lentiviral delivery include multiple insertions into the genome, no control of the location of insertion which may impact gene expression, a restriction on cell lineages and types infected with the virus, and the insertion of viral elements that may be unwanted into a genome.

### **Adenovirus Systems**

Adenoviruses are one of the most popular virus delivery systems, with over 20 % of gene therapy clinical trials utilizing adenovirus transport systems (Wiley Gene Therapy Database, 2012; http://www.wiley.com/legacy/wileychi/genmed/clinical/). Adenoviruses are double-stranded DNA viruses with the ability to package up to 36 kb of foreign DNA [88]. Of note is their ability to express high levels of siRNA and the ease of vector construction. These systems have been used for various RNAi therapies, including cancer targeting [89], neurodegenerative disease treatment (motor neuron disease) [90] and the targeting of other viruses such as HBV [91]. However, as with all delivery systems, a number of limitations have been noted.


Studies have shown that in some cases, high levels of immunostimulation have been observed following adenovirus administration. Moreover, relatively short siRNA expression periods have also been problematic [86].

## **Other Delivery Systems**

## *Replication-Competent Avian with Splice Acceptor*

Replication-competent avian with splice acceptor (RCAS) retroviruses are used to permanently integrate siRNAs into the genome of avian models, most typically chickens [92]. However, modified versions have also previously been used in mouse models [93].

## *Adeno-Associated Viruses*

Adeno-associated viruses (AAV) have small genomes (4.8 kb) comprised of ssDNA. The viruses depend upon a second virus (for example adenovirus) for productive infection to occur [94]. The entire genome of AAV contains only two genes that can be replaced, allowing up to 5 kb of foreign DNA to be added. Unlike adenoviruses, AAVs are considered to have a low immunostimulatory response and can be modified to change their tropism, and thereby the cells they target. However, similar to adenoviruses, AAVs are not capable of integrating with the host genome and therefore are not a viable option for long term gene silencing [87].

## *Influenza virus and Herpes Simplex Virus*

Influenza A viruses have been engineered to express a neuron-specific microRNA (miR-124) *in vitro* and *in vivo* [95], demonstrating that it is possible for single-stranded RNA viruses to express miRNAs. Herpes simplex viruses have also been effectively used as a shRNA delivery system [96]. They have silenced the expression of an amyloid precursor protein gene and have reduced levels of amyloid-β peptide (a cleavage product of amyloid precursor protein) believed to be important in Alzheimer's disease pathogenesis.

## *Bacterial Delivery Systems*

While extensive preclinical and clinical studies have largely focused on viral delivery of shRNAs, bacteria have also been modified to function as delivery systems. Xiang *et al.* transfected *Escherichia coli* with a plasmid encoding genes that allow it not only to enter target cells in the colon but also to release a shRNA targeting the CTNNB1 cancer gene [97]. The same gene has also been targeted with shRNA delivered by *Salmonella enterica* [97].

## **RNAi and Clinical Trials**

The ability of RNAi to regulate endogenous gene expression is currently being harnessed by numerous research groups, in collaboration with commercial partners, for clinical application in the treatment of human disease. Herein we describe a number of completed, ongoing or pending trials and discuss future directions of this technology.

Non-biological delivery systems have been based on nanoparticles, aptamers, cholesterol and stable nucleic acid lipid particles. Details of the individual systems and appropriate references have been discussed in previous sections. Many of these have now been deployed in Phase I, II and, in one case, Phase III clinical trials. The pathway to clinical translation has been difficult. A number of significant obstacles have been encountered and must be considered in current and future applications. These include: (i) delivery; (ii) RNAse-mediated degradation and renal clearance, and the associated need for chemical modification of RNAi molecules to ensure bioreactivity; (iii) targeting to specific tissues; and (iv) careful analysis of safety profiles to reduce or remove unwanted off-target effects. Having said this, the NIH clinical trials database lists 33 current, completed or pending siRNA studies (see http://www.clinicaltrials.gov). These trials cover a number of conditions including macular degeneration, glaucoma, numerous cancers (*e.g.,* myeloid leukaemia, solid tumours, melanoma, liver and pancreatic), kidney transplantation and renal failure (see Table **1** below).


**Table 1:** National Institutes of Health RNAi Clinical Trials


## **Clinical Trial Highlights**

It is important to note here that regulatory authorities have not as yet recorded serious adverse reactions following RNAi delivery. At the moment, the most advanced therapeutic in the RNAi pipeline is ALN-RSV (Alnylam Pharmaceuticals), which is currently undergoing Phase IIb trials and targets the expression of the RSV N-protein. The target endpoint focused on the occurrence of new or progressive bronchiolitis obliterans syndrome (BOS) at 180 days post viral infection. BOS is a major source of morbidity and mortality in lung transplant patients, with re-transplantation being the only definitive BOS treatment. RSV is responsible for up to 25 % of viral lung infections occurring in BOS transplant patients. A human experimental infection study showed that ALN-RSV01 significantly decreased the infection rate in healthy volunteers infected with RSV intranasally [98]. The phase II clinical trial fell short in an intent-to-treat (ITTc) program, but showed greater promise in a prospectively defined analysis of ITTc recipients [99].

## **APPLICATION OF RNAi AND TRANSGENIC TECHNOLOGIES TO DEVELOP VIRUS RESISTANT ANIMALS**

### **Introduction**

Transgenic (Tg) species will play a vital role in meeting global dietary and caloric needs, and transgenic technology has been successfully applied to both agricultural and aquatic animals (see Tables in [100-102]). Proof-of-principle studies have demonstrated the potential application of RNAi in: (1) production of predominantly female progeny (dairy and egg industries), thereby minimizing *e.g.*, castration and elimination of males; (2) Tg swine to improve meat output by minimizing fatality and increasing the rate at which young animals mature; (3) Tg cows whose mammary glands express an anti-bacterial agent against pathogens responsible for mastitis in dairy cattle; (4) Tg swine that synthesize omega-3 fatty acids; and (5) "Enviro" pigs which release reduced amounts of phosphorus into the surroundings.

In aquatic species, similar enhancements have been achieved (reviewed in [103]) that augment: (1) growth; (2) bacterial resistance; (3) nutritional appeal; and (4) temperature tolerance.

This section focuses on how RNAi in combination with transgenic technologies could control viral diseases and thus facilitate animal production. It was almost a decade ago that Clark and Whitelaw proposed in their review "A future for transgenic livestock" that the advent of the then new method of RNA interference (RNAi) for modifying genomes would underpin a resurgence of research using transgenic livestock [104]. They suggested this may be an important alternative to traditional breeding and could lead to the generation of farm animals that are more resistant to infectious disease (*e.g.*, influenza resistant poultry). With the advent of RNAi technology and the newest generation of gene editing tools, genetically engineered animals become an alternative to vaccines and small molecule drugs [105], circumventing the costs associated with these controls, which often limit their uptake and effective implementation.

### **Applications of RNAi for Disease Resistant Animals**

There are a number of approaches that take advantage of combined RNAi and transgenic techniques to ultimately develop disease resistant animals. These include (1) targeting host genes for functional analysis of host pathogen interactions; (2) targeting host genes required by the virus for infection; and (3) targeting of viral genes to prevent replication and spread of infection.

The biggest impact of RNAi so far has been in the study of gene function. RNAi is now a common tool for studying biological processes including host pathogen interactions. This work has primarily been conducted *in vitro* using siRNAs (see RNAi screening section in this chapter) or shRNA expression constructs. From these screens key genes of interest are identified; however, the transfer of this technology to laboratory and production animals has been limited due to difficulties in the delivery of RNAi molecules *in vivo* (see delivery section in this chapter). This can now be circumvented using transgenic technology, which allows integration of shRNA/miRNA sequences into the genomes of target animal species. This also allows tissue specific, constitutive or inducible expression of shRNA/miRNAs depending on the host-pathogen model. Whilst the outcome of this application may not directly lead to the development of viral resistant animals, it will lead to advances in the development of new vaccines and antiviral therapeutics. It may also lead to the identification of key regulatory sequences within genomes which can be used in conventional selective breeding programs for viral disease control.

A more direct application of RNAi and transgenics is to target key genes involved in virus interaction with host cells, thereby limiting or preventing the attachment of virus to cells, replication of virus once in the cell and subsequent spread of infection from cell to cell. Luo *et al.* recently used shRNAs to target the porcine integrin αv subunit, which is the foot and mouth disease virus (FMDV) receptor, in a porcine cell line [106]. They reported that the inhibitory effect of the receptor knock down on FMDV growth was >3-fold and was accompanied by a >99% decrease in virus titre when cells were exposed to 10<sup>2</sup> TCID<sub>50</sub> of FMDV. This study identifies shRNAs as plausible reagents to control FMD infection and dissemination in pigs and other susceptible livestock species. For this strategy, receptor gene knock down may be preferable to alternative knock out approaches, as complete abrogation of these proteins may be detrimental to the host. Although this will not completely eliminate the virus, it should reduce virus levels to allow the host immune response to work in synergy with the RNAi activity and successfully fight infection.
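Percentage decreases in virus titre, such as the >99% figure quoted above, are often easier to compare once converted to a log10 scale. The short sketch below is only an arithmetic illustration; the function names are hypothetical and no values from the cited study other than the 99% figure are used.

```python
import math

def log10_reduction(percent_decrease):
    """Convert a percentage decrease in titre into a log10 reduction."""
    if not 0 <= percent_decrease < 100:
        raise ValueError("percent_decrease must be in the range [0, 100)")
    fraction_remaining = 1 - percent_decrease / 100
    return -math.log10(fraction_remaining)

def remaining_titre(initial_titre, percent_decrease):
    """Titre left after the stated percentage decrease (same units as the input)."""
    return initial_titre * (1 - percent_decrease / 100)

# A 99% decrease corresponds to roughly a 2-log10 reduction.
print(round(log10_reduction(99), 3))
```

On this scale a 99% decrease is a reduction of about two orders of magnitude, so a residual population of virus remains for the host immune response to clear, as the text notes.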

Perhaps the most promising approach is targeting of viral genes within host cells. The basic concept of this technology is that the transgenic animal has an RNAi transgene inserted into the genome, leading to expression in every host cell. This transgene expresses shRNA(s) specifically targeting conserved regions within key viral genes. By this approach, essential viral functions, including viral replication, can be eliminated or slowed, thereby disrupting the viral lifecycle. Furthermore, by designing the constructs to target multiple viral genes, this strategy can allow broad protection across a range of serotypes and strains while helping to avert the appearance of resistant viral mutants [105]. There are numerous examples of virus-resistant transgenic animals currently in development, including avian influenza resistant chickens [107], Foot and Mouth Disease resistant ruminants [108], viral hemorrhagic septicaemia resistant zebrafish as a model for salmon infection [109] and transgenic swine capable of inhibiting porcine retrovirus replication [110]. Indirectly, the latter pathogen is also particularly important for humans, since the virus can replicate in human cells *in vitro* and is thus an obstacle to swine xenotransplant advancements.

## **Design of RNAi Transgenes**

There are a number of considerations to take into account when designing RNAi transgenes for this purpose. These include (1) targeting choice; (2) shRNA design; and (3) promoter choice.

A combinatorial RNAi approach is desirable (Burkout *et al.*) targeting conserved sequence regions within a number of essential viral genes such as polymerases or key structural genes. Sometimes referred to as a "multi-warhead transgene" this also enables targeting of strategic genes at various time points of the virus infection cycle, maximising the opportunity to interfere with and prevent virus spread.
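The idea of targeting sequence regions that are conserved across strains can be illustrated with a minimal sketch. It assumes the strain sequences are already available as plain strings and simply reports windows of a fixed length that occur in every strain. The function name `conserved_kmers`, the default window length and the toy sequences are hypothetical; a real design pipeline would start from a multiple sequence alignment and would tolerate limited mismatches.

```python
def conserved_kmers(sequences, k=21):
    """Return (position, k-mer) pairs for k-mers found in every sequence.

    Positions refer to the first sequence. RNA input (U) is converted
    to the DNA alphabet (T) before comparison.
    """
    if not sequences:
        return []
    normalised = [s.upper().replace("U", "T") for s in sequences]

    def kmer_set(seq):
        return {seq[i:i + k] for i in range(len(seq) - k + 1)}

    shared = set.intersection(*(kmer_set(s) for s in normalised))
    first = normalised[0]
    hits, seen = [], set()
    for i in range(len(first) - k + 1):
        kmer = first[i:i + k]
        if kmer in shared and kmer not in seen:
            hits.append((i, kmer))
            seen.add(kmer)
    return hits

# Toy example: three made-up "strains" sharing an 11-nt core.
core = "GATTACAGGCT"
strains = ["AAAA" + core + "CCCC", "TTTT" + core + "GGGG", "CC" + core + "AT"]
for position, kmer in conserved_kmers(strains, k=8):
    print(position, kmer)
```

Candidate windows from several essential viral genes could then be combined into a single multi-target construct, in the spirit of the "multi-warhead" design described above.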

The design of the RNAi hairpin against the identified target is also critical to enable effective processing *via* the RNAi pathway in the cell. There are a number of algorithms available for siRNA design; however, these do not always translate to effective shRNA sequences. One algorithm to screen siRNAs to see if they may be effective as shRNAs was developed by Taxman *et al.* and is commonly used for this purpose [111]. It is still essential, however, to validate shRNAs against the target virus in an *in vitro* virus replication assay prior to undertaking the considerable task of generating a transgenic animal. It is also crucial to ensure that the shRNAs do not inhibit host gene expression in an off-target manner. Homology between the shRNAs and the host genome should be examined (*e.g.,* BLASTN) and sequences with near-perfect identity should be excluded.
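The homology check described above is normally performed with an alignment tool such as BLASTN. The sketch below is not a substitute for that; it is a small, self-contained illustration of the underlying idea, assuming a handful of host transcripts held in memory. It scans each transcript for sites that pair with the guide strand with no more than a chosen number of mismatches. The function names, the mismatch threshold and all sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement in the DNA alphabet (U is treated as T)."""
    return seq.upper().replace("U", "T").translate(COMPLEMENT)[::-1]

def off_target_hits(guide, host_transcripts, max_mismatches=2):
    """List (transcript name, position, mismatches) for near-perfect sites.

    The guide strand is antisense, so the site searched for in each
    transcript is the reverse complement of the guide.
    """
    site = reverse_complement(guide)
    n = len(site)
    hits = []
    for name, seq in host_transcripts.items():
        seq = seq.upper().replace("U", "T")
        for i in range(len(seq) - n + 1):
            mismatches = sum(a != b for a, b in zip(site, seq[i:i + n]))
            if mismatches <= max_mismatches:
                hits.append((name, i, mismatches))
    return hits

# Toy example with made-up sequences.
guide = "UUCGAUCCGGAUACGUAGCUA"
site = reverse_complement(guide)
host = {
    "geneA": "GGGG" + site + "CCCC",          # perfect site
    "geneB": "GGGG" + "C" + site[1:] + "CC",  # one mismatch
    "geneC": "A" * 40,                        # unrelated
}
print(off_target_hits(guide, host))
```

Any shRNA returning hits at or near perfect identity would be a candidate for exclusion before moving on to *in vitro* validation.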

Approximately 60 % of studies use the Brummelkamp loop [112]; however, a recent development in the design of antiviral shRNAs is mimicking the structure of naturally occurring miRNA sequences, which a number of studies have shown to have benefits such as increased processing to the mature form. To date, studies of shRNAs with a miRNA structure have primarily focused on changing the loop to a miRNA loop sequence (*e.g.,* mir30) but not on introducing the bulges into the stem that miRNAs also contain [113, 114]. One example is the work of Boden *et al.*, who enhanced gene silencing efficacy using an anti-HIV shRNA designed with a mir30 structure. In general, shRNAs and miRNAs act in different ways to knock down gene expression (mRNA degradation *vs* translational repression), and this must also be taken into consideration when designing and developing an RNAi transgene. A study by Lebbink *et al.* [114] found that miRised shRNAs driven by a Pol II promoter were less effective in achieving target knockdown than conventional shRNAs expressed from the RNA polymerase III promoter U6.

The third consideration when designing an RNAi transgene is promoter choice. Promoter strength is vital, as too much expression can be detrimental or lethal to particular cell types or to embryo development. This is thought to result from saturation of the RNAi pathway, meaning that key miRNAs required for cell differentiation are outcompeted by the antiviral shRNAs. The limiting step is thought to be transport of expressed hairpins from the nucleus to the cytoplasm *via* Exportin 5 protein [115]. RNA polymerase III (PolIII) promoters are commonly used for shRNA expression. PolIII family members include U6, 7SK and H1 promoters. The H1 promoter is the weakest in strength and is therefore often the promoter of choice for transgenic animal production. U6 is commonly used *in vitro* but is too strong for *in vivo* use, although this can be overcome by site-directed mutagenesis of key promoter elements, leading to attenuated promoter strength. For particular viral targets it may be preferable to choose tissue specific or inducible promoters such as RNA polymerase II promoters. Species specific promoters are also an important consideration for the development of optimised RNAi transgenes. This may also be an advantage when considering regulatory approval and consumer acceptance of transgenic animals in the future.

## **Transgenic Technologies**

We have seen many advances in transgenesis technology in animals over recent years, including the use of retroviruses (*e.g.*, lentivirus), transposons (*e.g.*, piggyBac and Tol2) and non-homologous recombination. These methods all result in the random integration of the transgene into the target genome, often in multiple copies and sometimes into important regulatory sequences within the genome. Therefore, a large amount of screening work is often required to identify and characterise the number of insertions and their locations, the level of transgene expression and ultimately the founder animals that will be used to breed transgenic offspring for viral challenge experiments to validate resistance. The recent advances in meganuclease technologies (zinc-finger nucleases, TALENs) will allow the targeted integration of RNAi transgenes into selected regions of host genomes (often termed "safe harbours"). We see this as a major advance in the development of transgenic animals with RNAi transgenes conferring resistance to viral pathogens.

## **CONCLUSION**

The ability to produce viral resistant livestock will increase the welfare status of production animals, contribute to increasing the quality and safety of food production particularly in intensively reared animals such as poultry, and most importantly serve to enhance food security worldwide. Perhaps more importantly, developing animals that are resistant to zoonotic viruses with pandemic potential such as H5N1 and H1N1 influenza is a key strategy for reducing the risk of pandemic emergence in humans.

## **ACKNOWLEDGEMENTS**

Declared none.

## **CONFLICT OF INTEREST**

The authors confirm that this chapter's content has no conflict of interest.

### **REFERENCES**




© 2014 The Author(s). Published by Bentham Science Publisher. This is an open access chapter published under CC BY 4.0 https://creativecommons.org/licenses/by/4.0/legalcode

**CHAPTER 6** 

# **Host-Encoded miRNAs Involved in Host-Pathogen Interactions**

**Samantha Barichievy<sup>1</sup> and Abhijeet Bakre<sup>2,\*</sup>** 

*<sup>1</sup> Gene Expression and Biophysics Group, Synthetic Biology–Emerging Research Area, Council for Scientific and Industrial Research, Pretoria, South Africa and <sup>2</sup> University of Georgia, College of Veterinary Medicine, Department of Infectious Diseases, 111 Carlton Street, Athens, GA 30602, USA* 

**Abstract:** Humans display a remarkably diverse susceptibility to infection, the foundation of which lies in our genetic variation and ability to respond to selective pressures applied by various infectious agents. The evolution of our complex and multiplayer immune system underlines the dominance of the human host following a microbial infection. However, given the nature of obligate intracellular pathogens, their complete reliance on host gene expression machinery has led to the evolution of complex interplays between the two, such that pathogens actively and strategically maneuver their way through the host terrain. Our traditional view of this terrain as being comprised of protein-coding genes, translation intermediates (mRNAs) and protein counterparts is far too simplistic, particularly in the context of infection. The discovery of the RNA interference (RNAi) pathway has greatly enhanced our understanding of the host terrain. Small noncoding RNAs (ncRNAs) termed microRNAs (miRNAs) were shown to be key regulators of gene expression that function within the RNAi pathway to post-transcriptionally modulate mRNA stability and subsequent translation [1]. Indeed, it is now understood that miRNAs are able to rapidly, and with exquisite specificity, modulate gene expression in response to numerous environmental cues in a highly coordinated, complex and tissue-specific manner. Given the reliance of intracellular pathogens on host gene expression machinery, the RNAi pathway, and specifically miRNAs, are now understood to lie at the nexus of the host-pathogen interplay. The focus of this chapter will be on the characteristics and roles of these small noncoding RNAs in host-pathogen interactions.

**Keywords:** Hepatitis C virus, Herpesviruses, HIV-1, host-pathogen interactions, Influenza, infectious disease, miRNAs, target identification, Respiratory syncytial virus.


<sup>\*</sup>**Corresponding author Abhijeet Bakre:** University of Georgia, College of Veterinary Medicine, Department of Infectious Diseases, 111 Carlton Street, Athens, GA 30602, USA; Tel: 706-542-2205; Fax: 706-583-0176; E-mail: bakre@uga.edu

### **AN INTRODUCTION TO miRNAs**

### **miRNAs in the Noncoding RNA Space**

Only 2% of the metazoan genome encodes protein, yet more than 50% is transcribed and we have little knowledge regarding these transcripts that function in the absence of protein production. In fact, stable ncRNA transcripts have been referred to as 'dark matter' within the cellular environment [2]. Despite improvements in the human draft genome sequence, ncRNAs remain difficult to define and thus quantify [3, 4]. However, numerous evolutionary studies have revealed that ncRNAs are estimated to be expressed at 4-fold excess compared to their protein-coding counterparts and are highly conserved across eukaryotic genomes [3]. Currently, ncRNAs have been systematically organized (according to length) into long ncRNAs (lncRNAs; > 200 bp) or small ncRNAs (< 200 bp) and make up greater than twenty-six functional categories that reflect the breadth of ncRNA function and diversity [3, 5]. In the case of humans, lncRNAs such as autonomous and non-autonomous retrotransposons, retrovirus-like elements, and DNA transposon fossils comprise about 48% of the genome [6]. Small ncRNAs including miRNAs, repeat associated small interfering RNA (rasiRNAs) and piwi associated RNAs (piRNAs) constitute a small but significant portion of the genome. While new ncRNAs are rapidly being uncovered, functional data remains sparse particularly at the host-pathogen interface. Analysis of miRNA diversity in the animal kingdom revealed that miRNAs co-evolved with eukaryotic genomes [7, 8] with less than 12 miRNA genes lost during the entire deuterostome evolution, and significant miRNA gene expansion at major evolutionary nodes. Ancestral miRNAs were typically located in inter-genic regions but have since adapted to colonize the introns of eukaryotic genomes (intronic exaptation) presumably to overcome dependence on host transcription machinery and promote generation of novel miRNAs [9]. 
About 50% of human miRNA genes are located within introns [9], while the remainder are found in either intergenic regions, coding regions or untranslated regions (UTRs). miRNAs may also be produced through the processing of structural RNAs such as tRNAs [10], snoRNAs [11] or transposable element encoded direct and indirect repeats [12].

### **miRNA Biogenesis**

Endogenous miRNAs are usually transcribed from RNA polymerase II promoters [13] as primary miRNA transcripts (pri-miRNAs) several kb in length, containing various stem-loop structures and ssRNA flanking segments. Pri-miRNAs are processed in two compartmentalized steps *via* the actions of distinct protein complexes (Fig. **1** and reference [14]). The initial step involves an RNAse-III protein family member, Drosha, which cleaves at the base of the stem of the pri-miRNA to release a hairpin structure termed a pre-miRNA [15, 16]. The DiGeorge syndrome critical region gene 8 (DGCR8) protein co-interacts with Drosha and the pri-miRNA to form a 650 kiloDalton (kDa) Microprocessor complex [15, 17]. DGCR8 specifically recognizes the junction between the single stranded (ss) and dsRNA regions of the pri-miRNA stem and directs Drosha-mediated cleavage 11 bp from this site [18]. The resultant pre-miRNA is approximately 70 bp in length and includes a 5' phosphate and two-nucleotide 3' overhang. Interestingly, pri-miRNA processing may occur co-transcriptionally and prior to intronic splicing in the nucleus [19]. Drosha-mediated cleavage of intronic miRNAs does not impair transcript splicing [20] but if the miRNA is located exonically, Drosha processing destabilizes the transcript and affects downstream protein translation [21]. Cellular splicing machinery also plays a role in the biogenesis of a distinct class of miRNAs that bypass Drosha-mediated cleavage [22, 23]. Termed mirtrons, these intronic miRNAs are derived from Pol II-transcribed primary-mirtron precursors that encode canonical splice sites (Fig. **1**). Host splicing machinery recognizes the splice sites and cleaves the transcript generating a hairpin configuration that resembles a Drosha product [24].

Following nuclear cleavage, pre-miRNAs are conveyed to the cytoplasm *via* exportin-5 (Exp5) in a RanGTP-dependent manner [25]. Upon entering the cytoplasm, a second RNAse III enzyme, Dicer, cleaves the loop of the pre-miRNA and releases a miRNA duplex. Dicer is comprised of two copies of a conserved RNAse III domain, a dsRBD in the carboxy-terminus, an amino-terminus helicase domain, a domain of unknown function (DUF) and a PAZ (Piwi-Argonaute-Zwille) domain [26]. The gap between the PAZ and RNAse III domains corresponds with the average length (25 bp) of a pre-miRNA stem duplex. Dicer thus functions as a 'molecular ruler' [27] by binding the 3' end of the pre-miRNA duplex in the PAZ domain and cutting at a set distance to generate miRNA products 21 bp in length with characteristic 5' phosphate groups and two-nucleotide 3' overhangs [28, 29]. Dicer interacts with TRBP (TAR RNA binding protein) [30], PACT (Protein Kinase R activator) [31] and a member of the Argonaute (AGO) family to create the RNA-induced silencing complex (RISC) that is required for RNA interference [32-35]. One strand of the duplex remains bound within RISC as the mature miRNA or 'guide strand' while the remaining passenger strand is degraded [36]. The relative thermodynamic stability of the 5' ends of the miRNA duplex determines which strand is selected as the mature miRNA [37].
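The strand-selection rule at the end of the preceding paragraph can be sketched in a few lines. The sketch makes strong simplifying assumptions: the duplex is taken to be perfectly paired at both ends, and stability is approximated by counting hydrogen bonds (three for G or C, two for A or U) over the first four bases of each strand rather than by nearest-neighbour free energies. The strand whose 5' end is less stably paired is returned as the likely guide. The function names, the window size and the sequences are hypothetical.

```python
def five_prime_end_score(strand, window=4):
    """Crude stability proxy: hydrogen bonds made by the first `window` bases.

    Assumes each of these bases is Watson-Crick paired with the other strand.
    """
    strand = strand.upper().replace("T", "U")
    return sum(3 if base in "GC" else 2 for base in strand[:window])

def pick_guide(strand_a, strand_b, window=4):
    """Return the strand with the less stable 5' end, or None on a tie."""
    score_a = five_prime_end_score(strand_a, window)
    score_b = five_prime_end_score(strand_b, window)
    if score_a == score_b:
        return None
    return strand_a if score_a < score_b else strand_b

# Toy duplex: two 21-nt strands with a 19-bp paired core and UU 3' overhangs.
strand_a = "UAAUCGAUGCAUGGCCGGCUU"  # A/U-rich 5' end
strand_b = "GCCGGCCAUGCAUCGAUUAUU"  # G/C-rich 5' end
print(pick_guide(strand_a, strand_b))
```

In this toy duplex `strand_a` is selected because its 5' end is A/U-rich. The same asymmetry is commonly considered when designing siRNA and shRNA sequences so that the intended antisense strand is loaded into RISC.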

Mature miRNAs regulate post-transcriptional protein synthesis by base pairing to cognate mRNAs. Depending on the level of complementarity between the mature miRNA and its mRNA target sequence, the associated protein payload, as well as the specific AGO protein involved in RISC loading, multiple mRNA silencing mechanisms can occur including enhanced exonucleolytic mRNA decay, site-specific endonucleolytic cleavage or translational repression [38, 39]. The initial bases from positions 2 to 7 of the mature miRNA are termed the 'seed' sequence and they provide most of the pairing specificity. In some cases, complete pairing between the seed region and target mRNA is sufficient to mediate AGO2 associated RISC cleavage of the cognate phosphodiester backbone [40, 41]. More typically for mammalian and viral mRNA targets, however, cleavage activity of RISC is severely impaired by mismatched pairing in the seed and other regions and translational inhibition occurs [42, 43]. In this pathway, RISC-bound mRNAs associate with cytoplasmic processing bodies (P-bodies) that exclude translational machinery and incorporate proteins required for mRNA remodeling [44]. All four human AGO proteins bind miRNAs, although endonuclease activity is a property of AGO2 alone [40]. Thus RISC complexes containing any of the four AGO proteins can catalyze a translational block but only complexes containing AGO2 can trigger transcript decay in the case of perfect complementarity between a miRNA and its target mRNA [45]. Proteins of the TNRC6A (trinucleotide repeat containing 6A; also known as the GW182 protein family) are also essential for transcript decay activity in the above situation [46]. Intriguingly, since the complementary length of seed sequence required for miRNAs to target cognate mRNAs is short, each miRNA can target and modulate hundreds of transcripts. Indeed, current estimates predict that thousands of human transcripts are regulated by miRNAs [47-49]. 
Furthermore, a single miRNA can regulate multiple mRNA molecules that can in turn also be acted upon by numerous distinct miRNAs [50, 51]. Importantly, most miRNAs decrease target protein levels by less than 2-fold [52], but this non-linear tuning mechanism can still exert a large physiological effect [53]. Thus, the endogenous miRNA pathway represents a highly efficient system to simultaneously fine-tune the expression of numerous genes as well as modulate specific functional pathways.
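Because the seed is only six nucleotides long, candidate target sites are easy to enumerate, which helps explain why a single miRNA can have so many potential targets. The following sketch takes positions 2 to 7 of a miRNA, as defined above, derives the complementary site expected in an mRNA, and lists every occurrence in a 3' UTR. The function names and both sequences are made up for illustration. Real target prediction tools also weigh site context, conservation and pairing beyond the seed.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def seed_site(mirna, start=2, end=7):
    """mRNA site (DNA alphabet) that pairs with miRNA positions start..end (1-based)."""
    seed = mirna.upper().replace("U", "T")[start - 1:end]
    return seed.translate(COMPLEMENT)[::-1]

def find_seed_matches(mirna, utr):
    """Return 0-based start positions of perfect seed matches in the UTR."""
    site = seed_site(mirna)
    utr = utr.upper().replace("U", "T")
    positions = []
    i = utr.find(site)
    while i != -1:
        positions.append(i)
        i = utr.find(site, i + 1)  # allow overlapping matches
    return positions

# Toy example with a made-up miRNA and a made-up UTR containing two sites.
mirna = "UCCGAAUGCAUGGCUAAGUCA"
utr = "GGGGGAUUCGGCCCCAUUCGGUU"
print(seed_site(mirna), find_seed_matches(mirna, utr))
```

A six-nucleotide site is expected by chance roughly once every 4<sup>6</sup> = 4096 nucleotides of random sequence, which is consistent with the large number of predicted targets per miRNA noted above.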

### **miRNAs as Part of the Host Innate Immune Response**

Animals have evolved a cellular and organismal response to environmental challenges including infections with intra- and extra-cellular pathogens. The primary response to such a challenge is the innate immune response, which is nonspecific and functions to identify sites of infection and contain them. The innate immune system primes a pathogen-specific potent adaptive response which functions to 1) eliminate the infection, 2) generate a memory response that can respond to future challenges from the same pathogen, and 3) restore immune homeostasis. Innate immunity depends on the identification of specific pathogen associated molecular patterns (PAMPs) by cellular pattern recognition receptors (PRRs) that lead to induction of specific cellular responses. Toll like receptors (TLRs), Retinoic acid inducible gene (RIG-I/DDX-58) like receptors (RLRs), Nucleotide oligomerization domain (NOD) like receptors (NLRs) and Scavenger receptors form the four main families of PRRs that are located either on the cell surface, internal membranes or secreted into extracellular milieu. TLRs and RLRs form the major PRR class for the recognition of intra-cellular pathogens such as viruses. Recent studies suggest that miRNAs are also integral components of the innate immune response and both the host cell and the pathogen can utilize miRNA-mediated gene regulation toward their own survival and propagation.

**Figure 1:** The mammalian RNAi pathway. Endogenous miRNAs are transcribed from RNA Pol II promoters to generate pri-miRNAs with a stem-loop structure approximately 33 bp in length flanked by single stranded regions. The RNAse III protein Drosha complexed to DGCR8 forms a Microprocessor complex that binds the pri-miRNA and cleaves to release a pre-miRNA hairpin structure with characteristic 2 nucleotide 3' overhangs. Pre-miRNAs can also be generated without Drosha from intronic sequences termed mirtrons that contain splice donor and acceptor sites recognized by host splicing machinery. Both forms of pre-miRNA are exported from the nucleus by Exp5, which recognizes and binds the 2 nucleotide 3' overhang. In the cytoplasm, the loop of the pre-miRNA is cleaved by Dicer and its dsRBD partner TRBP to liberate a miRNA duplex with 2 nucleotide 3' overhangs. AGO2 binds the miRNA duplex to form RISC, wherein the miRNA duplex is unwound, leaving the guide strand (mature miRNA) to base-pair with its cognate mRNA. If this pairing is incompletely complementary, translational suppression occurs, usually in P-bodies. If the mature miRNA pairs 100 % with the target mRNA, AGO2-mediated cleavage of the mRNA occurs.

Multiple studies that analyzed the expression of the human miRnome in response to viral, bacterial and protozoal infection identified subsets of host miRNAs that are differentially expressed [54-60]. While mechanisms and target gene repertoires regulated by these miRNAs are yet to be fully elucidated, evidence from Drosha and Dicer ablation studies suggests that components throughout the entire miRNA pathway participate in innate and adaptive immune responses, as well as cellular development. For example, Drosha silencing significantly diminishes the capacity of endothelial cells to form tubes and sprout capillaries [61]. Drosha silencing also alters normal development and egg hatching in the nematode *Meloidogyne incognita* [62] and inhibits cellular proliferation in *Drosophila melanogaster* [63], and Drosha is essential for survival of vascular smooth muscle cells (VSMCs) [64]. Drosha-generated small RNAs (DDRNAs) were also recently shown to be essential for the DNA damage response [65]. Similarly, Dicer knock-out (KO) and knock-down (KD) studies showed that T regulatory (Treg) cell development and function are regulated *via* miRNA-mediated modulation of the crucial transcription factor Foxp3 [66], and Dicer-deficient Treg cells lack suppressor function [67]. Dicer is also important for the development of invariant natural killer T (iNKT) cells [68], and bone marrow derived macrophages (BMDMs) from Dicer KO animals show increased expression of inflammatory cytokines [69]. Furthermore, Dicer-2 (dcr2) regulates the citric acid cycle, mitochondrial oxidative phosphorylation and energy metabolism in fruit flies [70].

Changes in Dicer expression levels have also been linked to altered immune responses to various pathogenic infections. Dicer expression is significantly reduced in clinical cases of Respiratory Syncytial Virus (RSV) infection [71], and Dicer controls Hepatitis C Virus (HCV) RNA accumulation by specifically regulating the abundance of miR-122 [72]. Respiratory epithelial cells with reduced Dicer exhibit increased susceptibility to influenza virus infection [73], and the protozoan parasite *Leishmania donovani* down-regulates Dicer 1 to regulate miR-122 [74]. *Listeria monocytogenes* and the human vaccine strain used against tuberculosis infection, *Mycobacterium bovis* bacillus Calmette-Guérin (BCG), both decrease miR-29, which controls interferon gamma (IFN-γ) in natural killer (NK) and T cells [75]. *Salmonella*, an invasive bacterium, down-regulates members of the Let-7 family of miRNAs causing de-repression of target interleukin 10 (IL-10) mRNAs which attenuate pro-inflammatory cytokines [76]. Similarly, bacterial lipopolysaccharides (LPS) stimulate miR-21 expression leading to up-regulation of IL-10 mRNAs and subsequent attenuation of inflammatory activation [77]. Finally, mice with reduced Dicer expression in the gut are more susceptible to worm infestations [78], highlighting the central role of Dicer and cellular immunity within intestinal mucosa homeostasis.

Considering that human cells encode >1000 miRNA species, many of which function in innate immunity and apoptosis, it is unsurprising that pathogens (and viruses in particular) have evolved mechanisms to subvert these cellular components [79]. The particular mechanisms by which viruses manipulate the host immune system are as varied as the viruses themselves, but if one focuses on viral interactions with cellular microRNA machinery, the options are surprisingly few and constrained to a fairly limited number of human viruses. However, the above observations strongly support the role of the miRNAs and proteins involved in their biogenesis in regulating crucial aspects of host responses to environmental stimuli including infection. Furthermore, as miRNAs are expressed in a very tissue-specific manner that can vary depending on the cell cycle stage, the interactions between host miRNAs and pathogens are clearly complex. Understanding which components are involved, how the networks are regulated and what tips the host-pathogen balance in favor of infection or not will shed valuable light on the role of miRNAs in the first-line of defense innate immune response.

## **DEVELOPMENT OF miRNA-RELATED TECHNOLOGIES**

A host organism's ability to build an innate immune response against a pathogen is vital and many cellular mRNAs that control host defenses are regulated by miRNAs. The promiscuity of miRNAs in regulating their mRNA targets, coupled with their importance in post-transcriptional regulation of host gene expression, makes unraveling the role of miRNAs at the host-pathogen interface extremely challenging. Resolving these interactions requires identification of the specific pathogen-encoded stimuli that induce changes in the host miRnome following infection, assessment of which transcripts are targeted by miRNAs as well as which miRNAs are responsible, quantification of the miRNA-induced changes to the infection transcriptome, analysis of downstream effects on related protein outputs, and validation of each step to ensure a robust understanding of such a complex network of interactions. To achieve these goals with miRNAs, a range of strategies and reagents have been developed over the course of the last decade.

### **Identification and Detection**

The ever-expanding list of miRNAs is comprised of both predicted and validated sequences. As might be expected, the number of predicted sequences is vast and primarily generated through the analysis of large quantities of information derived from deep sequencing efforts. Unsurprisingly, only a fraction of the predicted miRNAs has been validated with experimental data [80]. This discrepancy is primarily due to the fact that complex genomes contain large numbers of inverted repeats capable of folding into miR-like hairpin structures [81]. Refinement of the list of predicted miRNA genes has generally involved incorporation of filters that eliminate well-characterized recurring sequences (*e.g.*, long interspersed nuclear elements (LINEs) and Alu repeats). Additionally, researchers have incorporated computational techniques that identify defined nucleotide content associated with stem length or Drosha cleavage sites, and evolutionary conservation [82-88]. These approaches have led to the development of a host of public and private bioinformatic tools for computational identification of miRNA sequences including, but not limited to, CID-miRNA (http://mirna.jnu.ac.in/cidmirna/), miRScan (http://genes.mit.edu/mirscan/), and miRFinder (http://www.bioinformatics.org/mirfinder/).

In addition to bioinformatic approaches, there are a number of technologies that have been adopted for the discovery and validation of miRNA sequences. In general, the overall usefulness of each is judged on the basis of specificity, sensitivity, and throughput. Early on, extraction of total RNA followed by Northern Blot Analysis [89] led to quick determination of the presence of a predicted sequence. Yet due to the relatively labour intensive nature of the technique, the need for significant amounts of starting material, the comparatively poor sensitivity, and overall low throughput, alternative technologies and services have quickly replaced this approach. One such technology that provides a fairly global picture of a cell's miRNA signature is based on microarray gene expression profiling [90]. Gene expression chips containing the entire complement of an organism's known or predicted miRNA content can be probed with labelled sequences generated from as little as 50-100 nanograms of total RNA. Current chip designers (*e.g.*, Agilent and Affymetrix) and service providers (*e.g.*, Asuragen and Exiqon) update miRNA content on a regular basis using frequent miRBase releases, thereby offering researchers the ability to rapidly obtain a cell's miRNA profile. Not surprisingly, the greatest limitations associated with array-based profiling relate to specificity and sensitivity. Since arrays are based on hybridization, cross-hybridization of sequences to closely related probes (*e.g.*, miRNA family members) can lead to false positives. Given the additional limitation of arrays to accurately detect i) poorly expressed sequences, or ii) small changes in miRNA expression, the overall effectiveness of the technology may be limited to a subset of a cell's entire miRnome.

The most recent advances in miRNA quantitation utilize qRT-PCR and Next Generation Sequencing (NGS). For qRT-PCR, several companies in the research tool space (including Life Technologies, Qiagen, and Exiqon) have developed dedicated kits to facilitate quantitation of miRNAs. In most cases, workflows combine novel systems for small RNA purification with unique amplification strategies (*e.g.*, adaptor ligation) specifically designed for quantitation of this small collection of uncommon targets. Assays can be performed on as little as 1 picogram of RNA and in many cases allow for the distinction between closely related family members. Most recently, sequencing has become the gold standard for miRNA quantitation. As with qRT-PCR, the sequencing workflow begins with small RNA purification and adaptor ligation followed by RT and PCR amplification. Subsequent sequencing using any number of available platforms is greatly simplified by the fact that miRNAs are generally shorter than 25 nucleotides in length, thus eliminating the need for sequence alignment. As such, the relative signature or profile of a cell can be generated from the overall abundance of individual sequences within the total population of reads.
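
The abundance-based profile described above can be illustrated with a minimal Python sketch. It assumes reads have already been adaptor-trimmed and that each read is matched to a known mature sequence by exact identity; the function name `mirna_profile`, the reads-per-million scaling, and the toy sequences are illustrative assumptions rather than part of any published workflow, and real pipelines typically tolerate length variants and mismatches.

```python
from collections import Counter


def to_rna(seq):
    """Normalise a sequence to upper-case RNA letters (T -> U)."""
    return seq.upper().replace("T", "U")


def mirna_profile(reads, known):
    """Return reads per million (RPM) for each known mature miRNA.

    reads: iterable of adaptor-trimmed read sequences (DNA or RNA letters).
    known: dict mapping a miRNA name to its mature sequence (hypothetical input).
    Only reads identical to a known mature sequence are counted (a simplification).
    """
    seq_to_name = {to_rna(seq): name for name, seq in known.items()}
    counts = Counter()
    total = 0
    for read in reads:
        total += 1
        name = seq_to_name.get(to_rna(read))
        if name is not None:
            counts[name] += 1
    if total == 0:
        return {name: 0.0 for name in known}
    # Scale by the total read population so libraries of different depth are comparable.
    return {name: counts[name] * 1_000_000 / total for name in known}


# Toy data, invented for illustration only.
known = {
    "miR-A": "UGGAGUGUGACAAUGGUGUUUG",
    "miR-B": "UAGCUUAUCAGACUGAUGUUGA",
}
reads = ["TGGAGTGTGACAATGGTGTTTG"] * 3 + ["UAGCUUAUCAGACUGAUGUUGA", "ACGU"]
profile = mirna_profile(reads, known)
```

Normalising to the total read count is one common choice among several; it is used here only because the text defines the profile relative to the total population of reads.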

### **Modulation**

Modulation of endogenous genes often provides essential clues into a gene's function or role in cell biology. To that end, both academic and industrial research groups have focused much of their effort on developing collections of miRNA mimics and inhibitors to mediate up- and down-regulation of miRNA concentrations (respectively). To date, miRNA mimics are available in multiple forms including synthetic and expressed constructs. Synthetic miRNA mimics such as those present in the Dharmacon miRIDIAN collection closely resemble native mature miRNA sequences with the exception that duplexes contain chemical modifications that bias strand entry into RISC. Expressed constructs, such as the shMIMIC collection (Dharmacon, part of GE Healthcare), embed the DNA encoding the desired mature miRNA sequence into a lentiviral backbone. Subsequent packaging of the construct into a viral particle and transduction into target cells leads to integration of the expression construct into the host genome and consistent, long-term expression of the desired miRNA mimic sequence.

Synthetic miRNA inhibitors have also been developed and successfully used. Early inhibitor designs described by Ebert [91] included a collection of molecules comprised of multiple, tandem (antisense) miRNA binding sites. Referred to as "sponges", these competitive inhibitors were shown to efficiently suppress miRNA function in a seed-specific fashion. Vermeulen and colleagues [92] have described additional inhibitor designs. In contrast to the molecular sponges described by Ebert, Vermeulen successfully combined a single miRNA reverse complement sequence with unrelated (flanking) secondary structural elements. Through an unknown mechanism, this combination of traits significantly increased overall inhibitor potency and allowed for multi-miRNA knockdown at sub-nanomolar inhibitor concentrations.

While experiments employing mimic and inhibitor strategies have successfully broadened our understanding of the contribution miRNAs make to cell biology (including host-pathogen interactions), there are caveats to working with these reagents. As described in an article by Robertson [93], inhibitors designed against one member of a miRNA family can silence related family members. This cross-reactivity can significantly confound interpretation of results, particularly as researchers move downstream in their attempts to identify relevant miRNA targets. Similar issues of cross-reactivity may complicate the interpretation of results that employ miRNA mimics. As noted by several researchers, the intracellular concentrations of mimics resulting from transfection are predicted to far exceed endogenous concentrations [94]. If non-physiological concentrations of miRNAs result in the silencing of endogenous mRNAs that under normal circumstances would not be targeted, the risk of false positive phenotypes may be heightened. Considering events upstream of actual miRNA targeting, it is conceivable that an over-abundance of a miRNA mimic (particularly those expressed as pri-miRNAs) may saturate one or more steps of the RNAi pathway. If this sort of over-saturation hinders endogenous miRNA biogenesis, false positive phenotypes may be observed.

### **Target Identification**

As is the case with miRNA gene identification, tools for identifying miRNA targets include both *in silico* and benchtop resources. Computational target prediction began with a large body of work from the Bartel laboratory [95, 96]. Predictions primarily focus on the identification of complementary miRNA seed matches (nucleotides 2-7) in the 3' UTR of target mRNA. By combining seed-related parameters with constraints that demand the presence of position-specific nucleotides [48], evolutionary preservation of target sites, and target mRNA structure [97], researchers have significantly improved the target prediction field. However, while these parameters have been incorporated into multiple search programs including miRBase [98] and miRanda [99], it is acknowledged that the number of mRNA targets identified by these methods exceeds the true target number for any individual miRNA [100]. These shortcomings have led researchers to combine a range of experimental approaches with *in silico* predictions. These include differential gene expression profiling using microarrays, sequencing following treatment of cells with miRNA mimics and/or inhibitors [49, 101], identification of miRNA-mRNA target pairs through immunoprecipitation of RISC [102], and mimic target validation using 3' UTR reporter constructs [103], as well as other techniques that focus on correlating miRNA expression with a particular transcript(s).
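
The core seed-matching step can be sketched in a few lines of Python. This is a minimal illustration assuming only perfect Watson-Crick complementarity to nucleotides 2-7 of the miRNA; the function names and toy sequences are hypothetical, and the additional constraints mentioned above (position-specific nucleotides, conservation, and mRNA structure) are deliberately omitted, which is one reason a scan of this kind over-predicts targets.

```python
def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence (A/C/G/U only)."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(rna))


def seed_sites(mirna, utr):
    """Return 0-based start positions of perfect seed matches in a 3' UTR.

    The seed is taken as miRNA nucleotides 2-7 (1-based), as in the text.
    Both inputs may use DNA or RNA letters; T is converted to U.
    """
    mirna = mirna.upper().replace("T", "U")
    utr = utr.upper().replace("T", "U")
    seed = mirna[1:7]                   # nucleotides 2-7
    site = reverse_complement(seed)     # sequence expected in the mRNA
    width = len(site)
    return [i for i in range(len(utr) - width + 1) if utr[i:i + width] == site]


# Toy sequences, invented for illustration only.
toy_mirna = "UGGAGUGUGACAAUGGUGUUUG"
toy_utr = "AAACACUCCAAAGGGCACUCCUU"
sites = seed_sites(toy_mirna, toy_utr)
```

In this toy case the seed is `GGAGUG`, so the scan looks for `CACUCC` in the UTR. Candidate sites found this way would still need the experimental validation described above.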

## **miRNAs INVOLVED IN INFECTIOUS DISEASE**

While there are a large number of studies that have analyzed miRNA deregulation during infection, studies involving viruses are predominant owing to the intracellular nature of the pathogen. For this reason, we limit our summary of miRNA deregulation during infectious diseases to those caused by viral pathogens in order to highlight the unique features of miRNA regulation at the host-virus interface.

### **Hepatitis C Virus**

Hepatitis C virus (HCV) infects about 17 thousand people annually in the US and an estimated 3.9 million people live with HCV infection [104]. Though anti-HCV treatment is effective, lack of diagnosis and adequate care prevents HCV containment. A significant proportion of HCV-infected people (>85%) are chronically infected, of whom ~50% develop chronic liver disease (CLD) [105, 106].

HCV is a positive strand RNA virus. The entire open reading frame of the HCV genome is translated into a single viral polyprotein that is subsequently cleaved to generate the core protein, two envelope proteins E1 and E2, ion-channel protein p7, and six non-structural proteins NS2, NS3, NS4A, NS4B, NS5A, and NS5B.

Infection of Huh-7 cells with three different HCV clones was found to induce the expression of miR-142-3p and downregulate miR-128a and -196a [107]. Expression of these three miRNAs was inversely correlated with the expression profiles of 37 genes deregulated during HCV infection of A549 cells and 4 genes (HNMT, XPO1, PMPCB and HMGB1) were subsequently validated. Similarly, HCV chronically infected peripheral monocytes (cHCV) showed an elevation in miR-155 expression (a positive regulator of multiple cytokines including TNF-α [108]) and downregulation of miR-125b. Naïve monocytes induced miR-155 significantly upon treatment with HCV core or NS3 and NS5 proteins, and this miRNA was found to be elevated in sera taken from cHCV patients [108]. The HCV-induced miRNA miR-130a was independently shown to regulate the expression of IFITM1 [109], resulting in reduction of cellular IFITM1 expression. IFITM1 proteins are small transmembrane proteins and overexpression of IFITM1 has an anti-viral effect, reducing HCV infectivity [110]. Recently it was shown that IFITM1 induced during HCV infection interacts with HCV co-receptor CD81 and occludin and inhibits viral entry [111].

miR-122 plays a vital role in HCV infection. miR-122 expression is negatively regulated by HCV core protein, leading to decreased miR-122 at later stages of infection [112]. miR-122 expression has been shown to increase HCV RNA concentration [113]. It was also shown that miR-122 inhibition abrogated HCV replication but did not alter viral RNA translation or RNA stability. miR-122-mediated upregulation of HCV replication is affected by the abundance of Dicer and TRBP, and TRBP was found to be essential for HCV RNA accumulation [72]. Additionally, AGO1 and AGO2 proteins play critical roles in recruitment of miR-122 to the HCV 5' UTR but not the 3' UTR [114], and these proteins delay HCV RNA degradation. Indeed miR-122 prevents HCV RNA degradation by cytosolic 5' exonuclease Xrn1 [115]. Current studies suggest that miR-122-HCV complexes form tertiary RNA structures that recruit other cellular proteins to decrease HCV transcript decay and promote translation [116]. Similar studies with GB virus B (GBV-B), another (+) ssRNA virus which infects chimpanzees, also validated the interaction between HCV 5' UTR and miR-122 and the formation of a tertiary structural feature [117]. Later it was demonstrated that ectopic expression of miR-122 in non-hepatic cells that do not express HCV receptors not only confers susceptibility to HCV infection but also enables completion of the viral life cycle [118-121].

The miR-122-HCV interaction utilizes sequences in the 5' non-coding region (NCR) of the HCV genome, and the spacing of individual miR-122 binding sites also determines the efficacy of miR-122-mediated HCV RNA upregulation [122]. This interaction is crucial to viral replication since *in vivo* suppression of miR-122 expression in mice, green monkeys [123] and chimpanzees [124] leads to a significant long-term reduction in serum cholesterol and reduced viremia. Indeed, recent human clinical trials with Miravirsen, a locked nucleic acid antisense oligonucleotide against miR-122, showed a long-term reduction of HCV RNA levels in chronically infected HCV patients without emergence of miR-122 viral escape mutants or adverse effects [125]. Recently it was shown that miR-122 also regulates the expression of suppressor of cytokine signaling 1 (SOCS1), thus reducing the secretion of anti-viral IFN [126]. miRNA-mediated translational blocks occur in cytosolic P-bodies, and HCV infection reduces the frequency of cytosolic P-bodies while inducing the formation of stress granules that are sites of active viral RNA synthesis [127].

In addition to miR-122, a number of other miRNAs have been shown to have a role in HCV biology. HCV-induced miR-21 targets two important mediators of the innate immune response, MyD88 and IRAK1, causing suppression of host innate immunity and mediating viral persistence during chronic infection [128]. miR-27a has been shown to regulate the metabolism of lipids in Huh cells infected with HCV by regulating the expression of transcription factor RXR, and lipid metabolism associated genes FASN, SREBP1, SREBP2, PPAR, PPAR, Apo1, ApoB100, and ApoE3. miR-27a inhibition increased cellular lipid production and reduced interferon signalling, while miR-27a upregulation increased IFN activity and efficacy of pegylated IFN and Ribavirin therapy against HCV [129]. This suggests that native miR-27a induction plays a role in establishment of a chronic, low viral load state. Finally, the HCV core protein also induced expression of miR-345 which regulates apoptosis by targeting p21 [130].

### **Human Immunodeficiency Virus**

Despite the discovery of the human immunodeficiency virus (HIV-1) nearly three decades ago and the expanding use of potent combinatorial antiretroviral regimens, HIV-1 currently infects 34 million people globally [131]. A deeper understanding of how this pathogen manipulates human cells during infection is paramount to the development of more targeted therapies and an efficacious vaccine. As an obligate intracellular pathogen, HIV-1 relies on host cellular machinery to complete its life cycle. Integral to this is the modulation of host gene expression to ensure a coordinated regulation of pro- and anti-viral host factors [94, 132, 133]. HIV-1 actively manipulates expression of some host miRNAs during infection, although differences in pseudovirus transfection *vs*. molecular infectious clone infections, viral titres, viral subtypes, infection durations and cell types used make comparisons unreliable. As such, host miRNAs that can modulate HIV expression in a natural infection setting remain to be robustly identified and characterized. Furthermore, it is unclear whether HIV-1 affects global miRNA expression by altering miRNA biogenesis or processing, or if the virus affects individual miRNAs through changes in transcription or miRNA maturation, or a combination of these. What is clear, however, is that HIV-1 infection, regardless of the model used, perturbs the cellular miRnome compared to uninfected control cells and thus continued research in this arena remains important.

### *HIV-1 and the miRNA Biogenesis Pathway*

The ability of HIV-1 to affect global miRNA expression would not be uncommon among viruses. Plant and insect viruses in particular have been shown to encode proteins that act as suppressors of RNA silencing (SRS) [134-137], thought to have developed as a counter-strategy to the anti-viral effects of the RNAi pathways in these organisms. The human T-cell lymphotropic virus (HTLV), which encodes an SRS protein, Rex, was shown to interact with Dicer to suppress RNAi-mediated silencing of the virus in mammalian cells [138]. Similarly, the VP35 protein of Ebola was shown to suppress RNAi-mediated silencing [139]. The HIV-1 Tat protein, required for transactivation of viral transcription, may act as an SRS protein but the data related to this are conflicting. Exogenous Tat in HeLa cells [140] and viral-derived Tat in Jurkat and CEM-SS cells [141] have been reported to interact with Dicer and function as an SRS protein. In contrast, however, Tat failed to inhibit global RNAi in persistently infected T cells even when stably expressed and assessed at varying time points over the course of infection [142].

During infection, Tat binds to the TAR (*trans*-activation response) element, which is an RNA structure present in the 5' LTR of all HIV-1 transcripts. This interaction recruits cellular factors including TRBP (TAR RNA binding protein) and the P-TEFb complex (CDK9 and cyclinT1), resulting in phosphorylation of stalled RNA polymerase II at the 5' LTR. Importantly, TRBP is a co-factor required for Dicer processing of pre-miRNAs as well as RISC loading (Fig. **1**), and thus titration of this protein by the Tat/TAR interaction was suggested as a mechanism for global RNAi repression [143]. Supporting data on the sequestration of TRBP by Tat remain to be independently validated, while contrasting data show that global suppression of RNAi does not occur in cells infected with replicating HIV-1 [144]. Another HIV-1 protein, Vpr (viral protein R), was suggested to act as an SRS by binding to Dicer in monocytes concomitant with a global decrease in miRNA production [145]. This study was complicated by the fact that monocytes generally express low levels of most miRNA biogenesis-related proteins, and that Dicer is usually only detectable once the cells differentiate into macrophages. Furthermore, these results remain to be independently validated.

Not only does the validity of HIV-1-encoded SRS proteins continue to be unresolved, the effect of the miRNA pathway on HIV-1 also remains uncertain. In one study using VSVG-pseudotyped pNL4-3Luc reporter vectors in HEK293T cells, HIV-1 infection was enhanced in cells with decreased Drosha and Dicer levels [146]. A second study showed that knockdown of either Drosha or Dicer in peripheral blood mononuclear cells (PBMCs) from HIV-1 infected donors resulted in faster replication kinetics of the virus [147]. In addition, the repression of Dicer and Drosha also negatively affected HIV replication in Jurkat cells as well as latently infected U1 cells [147]. An important caveat from both studies is that Drosha and Dicer knockdown mediated by exogenous siRNAs may severely complicate the interpretation of such experiments. Given the critical role of the RNAi pathway in normal cellular gene expression, it is unsurprising that knockdown of these proteins would affect HIV-1 replication in cells depleted of such factors. Distinguishing a direct *vs*. off-target effect of depleted Drosha or Dicer on HIV-1 replication is very difficult, and until robust observations can be made, the effects of the miRNA pathway on HIV-1 replication remain obscure.

### *HIV-Mediated Perturbations of Host miRNA Profiles*

The dysregulation of host miRNA profiles in response to HIV-1 infection has been examined in various cell lines as well as physiologically relevant primary cells although to date, only one quarter (~ 320) of the known human miRnome covering ~ 1200 miRNAs has been investigated. Overall, changes in the miRNA profiles do not correlate well between different studies but this is not surprising given the cell and tissue-specific expression of miRNAs in combination with varied experimental plans. In 2005, the first of such published studies compared changes in the miRNA profiles between uninfected HeLa cells and those transfected with pNL4-3 pseudovirus using miRNA microarrays that detected 312 miRNAs [148]. The dominant pattern showed HIV-mediated down-regulation of these host miRNAs, but this was unsupported in a later study using the same miRNA microarrays [147]. Eleven miRNAs were up-regulated while expression of a polycistronic miRNA cluster (miR-17/92) was significantly decreased in pNL4-3 infected *vs*. uninfected Jurkat cells [147]. Intriguingly, the miR-17/92 cluster was shown to target PCAF (P300/CBP-associated factor), a histone acetyltransferase that binds Tat to enhance HIV-1 gene expression. Suppression of the miR-17/92 cluster resulted in increased expression of PCAF and enhanced viral replication in PBMCs from HIV-infected donors.

The PBMC miRNA profiles derived from HIV-1 seropositive individuals have been fairly well characterized although the total number of patients sampled was small [149, 150]. In one study, expression levels of 327 miRNAs were assessed by miRNA microarrays in PBMCs isolated from four classes of HIV-1 donors stratified according to CD4+ T cell counts and viral loads [149]. Overall 62 miRNAs were dysregulated in response to HIV-1 infection, 59 of which were down-regulated. Of these, 12 miRNAs were common to all four classes with the remainder separated into class-specific miRNA signatures. These profiles overlapped significantly with those identified in a second study using miRNA microarrays probing for 518 mature miRNAs in HIV-positive donor PBMCs, pNL4-3-infected PBMCs and pNL4-3-infected CEMx174 cells [150]. In total 62 miRNAs were dysregulated in HIV-infected PBMCs, 32 of which were downregulated, and 33 of which were similarly expressed in CEMx174 cells. Expression of two members of the miR-17/92 cluster, miR-17-5p and miR-20a, was suppressed by HIV-1 infection, in line with previous studies [147, 149]. The results of these studies would benefit from additional data collected across significantly larger populations of HIV-positive donors, encompassing different HIV subtypes, and from profiling of the entire known miRnome. However, the current data already suggest that miRNAs may fine-tune HIV's changeover from latency to activation [151], the clearing of viral reservoirs in T cells, and the reduction of HIV replication [152].

### *HIV Regulation by Host-Encoded miRNAs*

Host miRNAs could potentially regulate HIV-1 replication by targeting cellular factors required by the virus during infection or by directly inhibiting viral transcripts. As described above, suppression of the miR-17/92 cluster caused an increase in PCAF binding to Tat and subsequent enhanced viral replication [147]. Similarly, targeting of cyclinT1 by miR-198 reduced pNL4-3-Luc expression in the promonocytic cell line, Mono Mac 6 (MMC) following stimulation with PMA [153]. CyclinT1 forms part of the P-TEFb complex required for RNA Pol II-mediated transcription elongation, thus the mechanism of HIV-1 down-regulation seems to be *via* miR-198 inhibition of this cellular factor that is required by the virus. Notably, monocytes express high levels of miR-198 but this is substantially reduced following their differentiation into macrophages, and this oscillating miRNA expression was suggested to be a host-encoded mechanism for restricting HIV-1 infection of monocytes [153].

The Nef sequence of HIV-1 acts as the 3' UTR for the majority of viral transcripts due to its location in the genome. Nef is believed to play a part in pathogenesis [154], and viruses encoding Nef mutations have been linked to slower disease progression [155], suggesting that miRNAs targeted to this region may negatively affect HIV-1 replication. Both miR-29a and miR-29b were shown to target Nef when expressed in a luciferase reporter in HEK293 cells [152]. These two miRNAs were also identified in PBMCs and their over-expression reduced HIV-1 replication in Jurkat and HEK293 cells. In a separate study focused on miRNA profiles in activated *vs*. resting CD4+ T cells, miR-125b, miR-150, miR-223 and miR-382 were highly expressed in the resting cells only and were predicted to target Nef although this was not experimentally validated [151].

Viral replication in H9 T lymphocytes was also decreased in response to miR-29a expression although the mechanism did not involve targeting Nef [146]. Instead, miR-29a enhanced the association of HIV-1 mRNA with RISC proteins and processing body (P-body) complexes. P-bodies include proteins such as Rck/p54 (RNA helicase), GW182 (interacts with Argonaute proteins), LSm-I (RNA binding protein) and XRN1 (5' to 3' exonuclease) that sequester mRNA transcripts as part of a translation control mechanism [156]. Down-regulation of any of these proteins by siRNAs caused a decrease in pNL4-3-Luc expression in HeLa cells, and knock-down of RCK/p54 released the translation suppression in H9 T lymphocytes [156] and led to virus re-activation in PBMCs from patients on combinatorial antiretrovirals [157].

Taken together, these limited studies provide evidence that some host miRNAs regulate HIV-1 replication, but many outstanding questions remain. Which additional miRNAs regulate HIV-1 expression? What roles do differentially expressed host miRNAs play in HIV-1 pathogenesis? How do P-bodies (and potentially stress granules) assist in miRNA-mediated translational control of HIV-1? Can HIV-1 alter the host RNA content of P-bodies and if so, what are the molecular players involved? These questions, among others will need to be answered before we have a confident understanding of how host miRNAs guide HIV-1 infection.

### *HIV-Encoded miRNAs*

Plants regularly process virus-encoded dsRNAs into siRNAs that target viral transcripts as part of a highly effective mechanism to control infection [158]. Numerous mammalian viruses have been shown to encode miRNAs including EBV, members of the Herpes family, CMV, MDV and SV40 [79, 159]. Contrasting data exist on whether or not HIV-1 encodes miRNAs. A single publication revealed an HIV-encoded miRNA, termed vsiRNA1, which targeted Env mRNA and reduced subsequent protein levels [141]. One group published data on HIV-encoded miR-H1 that targets cellular AATF thereby initiating apoptosis in human mononuclear cells [160]. A separate group showed miR-N367 to be encoded within *nef* and suggested this miRNA could reduce HIV-1 expression by binding to the negative response element within the U3 region of the 5' LTR promoter [161]. A major limitation of these studies is the lack of independent validation and the observation that none of these sequences were detected in persistently infected T cells [142]. In contrast to this, two independent groups reported two miRNAs encoded within the TAR loop of HIV-1 that aid in chromatin remodeling at the viral LTR, and prevent apoptosis in infected cells by down-regulating cellular ERCC1 and IER3 transcripts [162-165]. Importantly, only TAR-3p was detectable by Northern blot and both miRNAs were 17-18 nt in length which is smaller than the previously established Dicer products (21-22 nt).

With increased access to deep sequencing, additional HIV-encoded small RNAs have been identified. A highly abundant 18 nt RNA was identified in HIV-infected MT4 T cells that is antisense to the viral primer binding site (PBS) and associates with AGO2 [166]. A more in-depth sequencing analysis revealed viral-encoded small RNAs that target both host and viral transcripts in HIV-infected T lymphocytes [167]. Numerous small RNAs corresponding to the HIV genome were detected and due to their positive polarities were defined as viral miRNAs (vmiRNAs). Antagomirs targeted to the vmiRNAs stimulated viral production suggesting a 'classical' anti-viral RNAi effect related to these transcripts. However, a small proportion of viral-encoded RNAs had negative polarities, corresponded to the 3' UTR of the viral genome, and potently suppressed HIV transcription (so-called viral siRNAs or vsiRNAs). Importantly, while the vmiRNAs were transcribed from the integrated HIV promoter, the vsiRNAs were transcribed from cellular promoters positioned downstream of the inserted HIV provirus, thereby providing two separate sets of transcripts that act in concert to direct host and/or viral gene expression. Overall, while other RNA viruses are known to encode miRNAs that regulate their own gene expression, HIV-1 may not have such control due to its exceptionally high mutation rate. The TAR loop is the most highly conserved region, suggesting the miRNAs encoded within this sequence may indeed regulate HIV-1, but additional supportive data are required.

## **Respiratory Syncytial Virus**

Respiratory syncytial virus (RSV) leads to considerable morbidity and severe lower respiratory tract disease in both the young and senior populations. RSV infections account for >14 000 deaths per annum [168] and despite nearly 6 decades of research on RSV, vaccine and therapeutic approaches are limited. RNAi efforts targeting crucial host/viral pathways have been recently shown to limit viral replication [169-171] and siRNAs against the RSV N gene have progressed to Phase III clinical trials [172]. Like many other RNA viruses, RSV modulates the host immune response *via* surface and internal viral proteins. Chief among these are the non-structural proteins (NS1/2), which work together to inhibit activation and nuclear translocation of IFN regulatory factor 3 (IRF3) [173, 174], and play a role in the suppression of cytokine production by proteasome-mediated breakdown of the signal transducer and activator of transcription factor 2 (STAT2) [175, 176]. Together with nucleolin, the F and G proteins participate in attachment and entry [177]. Additionally, RSV G associates with Toll-like receptors (TLR) [178-180], and negatively affects type I IFN [181, 182] and cytokine and chemokine expression [183], in part through the activation of suppressor of cytokine signaling (SOCS) proteins in normal human bronchoepithelial (NHBE) cells and mouse lung epithelial cells [178, 181]. Furthermore, the G protein's CX3C motif imitates fractalkine (CX3CL1), thereby altering immune responses mediated by this chemokine [184].

*In vitro* and *in vivo* RSV infections stimulate early, middle and late genome-wide transcription profile changes in the host [20, 185]; however, these modifications are not precisely mirrored in the host proteome [186]. RSV infection results in G1/S arrest in A549 [187] and HEp-2 cells [188], and a G2/M arrest in primary human bronchial epithelial cells through stimulation of TGF-β1 and diminution of p53 both *in vitro* and *in vivo*. In HEp-2 cells, the virus similarly triggers stress granule formation in a protein kinase R-dependent manner, prompting enhanced viral replication in cytosolic viral inclusion bodies [189, 190], which share components with P-bodies in the cytosol [191, 192].

Two studies have recently reported the deregulation of miRNA expression following infection with RSV. Othumpangat *et al.* reported the repression of 24 miRNAs and upregulation of two (miR-886-3p and miR-375), with ~2-fold deregulation observed following RSV infection [193]. Of the miRNAs uncovered, six were expected to influence the neurotrophin nerve growth factor (NGF) gene, which is involved in regulating the NGF-TrKA axis and may allow airway epithelium to tolerate RSV challenge. Transfection with miR-221 mimics reduced NGF transcripts and protein, although the miR-221 seed does not have a direct target in the NGF 5'- or 3' UTR and shows insignificant matches within the NGF coding region. miR-221 is a negative regulator of several tumor suppressor genes including PTEN [194], Bim [195], PUMA [196] and transcription factor Foxo3a [197], suggesting that pre-miR-221 treatment enhances apoptosis by reducing the activity of these tumor suppressors (also observed by Othumpangat [193]). We recently reported that when A549 cells are infected with RSV, five miRNAs (let-7f, miR-337, miR-520a, miR-24, miR-26b) are upregulated while two miRNAs (miR-198 and miR-595) are downregulated. Furthermore, we observed that RSV G induced let-7f miRNA in the context of RSV infection [198]. Computational predictions and comparison with existing gene expression data for RSV in A549 cells showed that a substantial number of genes deregulated during RSV infection were predicted to be regulated by let-7f and other RSV deregulated miRNAs. By transfecting miRIDIAN miRNA mimics or inhibitors along with CMV-driven Luciferase-UTR constructs of a subset of let-7f targets, we showed let-7f mediated regulation of many RSV deregulated genes, suggesting that let-7f may be serving a pro-viral role. Inhibition of let-7f reduced viral replication while upregulation of let-7f activity increased viral replication relative to control transfections.
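The seed-based reasoning used in this paragraph (checking whether a miRNA seed has a match in a UTR) can be illustrated with a minimal sketch. It assumes the common convention that the seed is nucleotides 2-7 of the mature miRNA and that a candidate site is a perfect reverse-complement match; the function names and the toy sequences are hypothetical and are not taken from the cited studies, which relied on dedicated prediction tools and experimental validation.

```python
# Minimal sketch: locate perfect seed-complementary sites in a UTR.
# Assumption: seed = miRNA nucleotides 2-7 (6-mer); real prediction tools
# also weigh site context, conservation and RNA secondary structure.

def to_rna(seq: str) -> str:
    """Upper-case a sequence and express it in the RNA alphabet."""
    return seq.upper().replace("T", "U")

def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(to_rna(rna)))

def seed_sites(mirna: str, utr: str) -> list[int]:
    """Return 0-based UTR positions that pair perfectly with the seed."""
    seed = to_rna(mirna)[1:7]           # nucleotides 2-7 of the guide
    site = reverse_complement(seed)     # sequence expected in the mRNA
    target = to_rna(utr)
    return [i for i in range(len(target) - len(site) + 1)
            if target[i:i + len(site)] == site]

# Toy guide and toy UTR fragments (illustrative only, not real transcripts).
toy_mirna = "UGAGGUAGUAGAUUGUAUAGUU"
print(seed_sites(toy_mirna, "AAACUACCUCAGGG"))
print(seed_sites(toy_mirna, "AAAAAAAAAAAAAA"))
```

An empty result, as in the second call, corresponds to the situation described for miR-221 and NGF, where a phenotypic effect is observed without a direct seed match and an indirect mechanism must be considered.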

### **Herpesviruses**

Herpesviruses are a group of large, enveloped dsDNA viruses comprised of α, β and γ subfamilies based on genomic sequence. Intriguingly, of the >200 viral-encoded miRNAs currently known, the vast majority have been identified in Herpesvirus members (see Table 1 of [199]). Typically dsDNA viruses utilize bidirectional transcription to generate mRNAs and thus viral-targeted miRNAs can be readily encoded within the antisense transcripts. Furthermore, the location of dsDNA viruses within host nuclei provides them access to nuclear RNAi processing machinery allowing them to take advantage of this evolutionarily conserved pathway. This is particularly important for Herpesviruses as they, like HIV, establish long-term latent infections and thus require mechanisms to control the host immune response in an ongoing manner. Given the flexibility in miRNA-mediated mRNA targeting and the ability of miRNAs to modulate expression of numerous discrete transcripts, miRNAs represent potent tools used by Herpesviruses, among others, to control host gene expression and subsequent immune responses to infection.

## *α-Herpesviruses*

Human herpes simplex virus 1 (HSV-1) and 2 (HSV-2) are both α-herpesviruses that establish latent infections in neuronal cells, causing cold sores and genital herpes respectively. During latent infection, noncoding latency associated transcripts (LATs) predominate, and massively parallel sequencing has uncovered numerous miRNAs within the LAT region that down-regulate viral mRNAs [200-202]. Overall, nine miRNAs are conserved between these two viruses, particularly in the seed sequences [200]. A single miRNA (miR-H1), which is encoded roughly 300 bp upstream of the LAT region in HSV-1 only, exhibits elevated expression during productive infection [94, 200], and a functional analog (miR-H6) that shares seed sequence conservation with miR-H1 has been identified in cells productively infected with HSV-2 [200]. The combination of a single miRNA expressed during active infection and multiple distinct miRNAs expressed during latency suggests that these α-herpesviruses utilize these self-encoded sequences to facilitate and maintain stable latent infections in host cells.

## *β-Herpesviruses*

Human cytomegalovirus (CMV) is a β-herpesvirus that causes apathogenic infection of up to 90% of individuals worldwide but maintains a high morbidity rate in immunocompromised patients and is the primary source (1%) of congenital abnormalities [94]. Typically, CMV establishes a latent infection in haematopoietic progenitor cells but despite the discovery of 12 functional miRNAs scattered throughout the viral genome [203, 204], none seem to be expressed during latent infection. While no animal models exist to investigate the roles of these miRNAs *in vivo*, a recent study conducted in human foreskin fibroblasts revealed that intergenic RNAs located within a 15 kb segment of the genome function as a host-directed microRNA decay element [205]. This decay element directed specific degradation of host miR-17 and miR-20a and resulted in accelerated virus production during lytic infection. Interestingly, the same two host miRNAs, encoded within the miR-17/92 cluster, are down-regulated in PBMCs from HIV-infected donors resulting in similarly enhanced viral replication (see section '*HIV-mediated perturbations of host miRNA profiles*'). Furthermore, miRNAs encoded within the mouse CMV genome have been linked to replication in salivary glands [206], providing additional data for the role of CMV miRNAs in lytic but not latent infection.

### *γ-Herpesviruses*

The first viral-encoded miRNAs were identified in B cells latently infected with Epstein-Barr Virus (EBV), a member of the γ-herpesvirus sub-family [207]. EBV maintains a life-long latent persistence in humans and is linked with numerous lymphocyte proliferative disorders including Burkitt's lymphoma, Hodgkin's lymphoma and NK/T cell lymphoma [208]. Another γ-herpesvirus that infects B cells is Kaposi's sarcoma-associated herpesvirus (KSHV), which causes similar proliferative disorders as well as its namesake, Kaposi's sarcoma. Both EBV and KSHV are directly linked to human cancers [209] and while the complex interactions between host and viral proteins have been largely characterized, a blossoming appreciation of EBV/KSHV-encoded miRNAs has developed. Both of these viruses encode distinct miRNAs in terms of sequence, but share a similar genomic organization with a clustering of miRNAs in latency associated regions. Specifically, EBV encodes 25 miRNAs organized into the BART (22 miRNAs) and BHRF1 (3 miRNAs) clusters, while all 12 KSHV miRNAs are located within one latency cluster [210].

Both EBV and KSHV utilize their miRNAs in immune evasion, by disrupting signaling between infected B cells and surrounding immune cells, as well as to regulate apoptosis and the host cell cycle. EBV-encoded miRNAs target the IFN-inducible T-cell attracting chemokine CXCL-11, which is an NK cell and Th1 lymphocyte ligand, thereby disrupting NK-mediated cytotoxicity of infected B cells [211]. Similarly, NK cells and CD8+ T lymphocytes generate a cytotoxic response through binding of their NKG2D receptors to the major histocompatibility complex class I-related chain B protein (MICB) expressed on infected B cells. EBV miRNAs BART1, BART3, BART5 and BART9 as well as KSHV miRNA-K12-7 all target MICB transcripts, resulting in decreased MICB expression and, consequently, reduced NKG2D-mediated cell death [212]. The observation that MICB mRNA is targeted by multiple miRNAs from different herpesvirus members suggests that MICB is a key cellular obstacle during infection. Several additional EBV miRNA targets have been identified but functional studies to elucidate the mechanisms of immune evasion are lacking [208]. For example, host PDE7A, which is required for proliferation of NK cells [213], is a target of EBV miRNA-BART1 and miRNA-BART3-3p, but the effects on the viral life cycle remain unknown. Similarly, SP100, a component of promyelocytic leukemia-nuclear bodies that function as part of the cellular antiviral response, is silenced by EBV miRNA-BART1-5p but the mechanisms have not been defined [213].

Viral pathogens that initiate latent infections have evolved mechanisms to control cellular apoptosis and cell cycling as a means to ensure persistence. Both EBV and KSHV utilize encoded miRNAs to target pro- and anti-apoptotic host proteins as well as cell cycle arrest factors to promote long-term viral infection. The pro-apoptotic protein Bcl-2-associated factor (BCLAF1) is targeted by KSHV miRNAs-K12-5, -9, -10b and -3 [214], while the pro-apoptotic modulator BCL2L11 (Bim) is down-regulated by numerous KSHV and EBV-encoded miRNAs [215, 216]. Caspase 3, a pivotal regulator of cellular apoptosis, is a direct target of KSHV miRNAs [217] and various tumor suppressors, including p53 and PUMA (BBC3), are targeted by EBV and KSHV miRNAs [218, 219]. Interestingly, KSHV miRNA-K12-1 down-regulates the cyclin-dependent kinase inhibitor protein p21 to prevent cell-cycle arrest and ensure availability of host factors in latently infected cells [220]. Furthermore, the KSHV LANA protein has been shown to inhibit the TGFβ-type II receptor, a key component of the antiproliferation pathway, by mediating epigenetic silencing as a means to regulate host cell cycling [221]. In an analogous manner, EBV controls cell cycling by upregulating host miR-34a, which is required for cellular proliferation [222]. Overall, the use of herpesvirus-encoded miRNAs as opposed to immunogenic viral proteins provides these pathogens with a highly effective mechanism to control apoptosis during latent infection while minimizing exposure to innate immunity.

While herpesviruses generally maintain a latent state of infection, they do switch to a lytic phase at some point in their lifecycle. The control of the latent to lytic state is critical and seems to be mediated by viral-encoded miRNAs that target both host and herpesvirus mRNAs. EBV miRNA-BART6 targets Dicer and fine-tunes the levels of this protein to control infection through all the latent (I, II, III) and lytic stages [223]. During latency, BART6-mediated reduction in Dicer results in suppression of genes that facilitate lytic replication and BART6-specific antagomirs shift EBV back into a lytic phase [224]. EBV also maintains latency by expressing miRNA-BHRF1 that targets host p53 [218], as well as BART2 that has complete sequence complementarity to BALF5 mRNA encoding the viral DNA polymerase [225]. Upon lytic reactivation, this suppression is relieved and EBV shifts out of a latent phase. KSHV miRNA-K1 targets the NFκB inhibitor IκBα to ensure activated NFκB and viral latency [226]. Similarly, KSHV miRNA-K12-11 targets IKKε resulting in decreased interferon signaling and ongoing latency [227]. Intriguingly, KSHV miRNA-K12-4-5 represses host Rbl2, a DNA methyltransferase (DNMT) inhibitor, to control the latency to lytic phase shift [228]. Suppression of Rbl2 caused activation of DNMTs, loss of DNA CpG methylation and subsequent silencing of various cellular genes.

### **Influenza**

Influenza A viruses are a worldwide cause of acute respiratory disease and can cause substantial morbidity and mortality each year [229-231], especially in the young and elderly. Treatment or prophylaxis with licensed antivirals curtails influenza morbidity with >80% efficacy during inter-pandemic influenza periods [232]. Influenza is an Orthomyxovirus with a genome that contains 8 negative-sense, single-stranded RNA segments that encode up to 11 proteins [233]. Owing to their segmented genome, influenza virus segments can reassort in species such as pigs that are susceptible to both human and avian influenza virus strains.

Host miRNA expression (*in vitro* and *in vivo*) is significantly deregulated following challenge with a wide variety of influenza viruses from human, avian and swine origin. Early computational studies predicted that host-encoded miRNAs interact with genome segments of both swine and human origin influenza A viruses [234]. miRNA seed sequences incorporated into influenza A virus to generate live attenuated vaccines showed complete attenuation, a significant (>2 log) reduction in mortality and a diversified antibody response to viral challenge, suggesting that miRNA-based attenuation of influenza viruses can be an effective strategy for vaccine development [235]. Chicken lungs and trachea differentially express 377 and 149 miRNAs relative to non-infected animals following low pathogenic H5N3 avian influenza virus infection. Gene ontology analysis of induced miRNAs showed that multiple categories involved in virus regulation and immune response were significantly enriched, suggesting that these miRNAs potentially regulate the host response to infection [236].

Whole genome miRNA expression profiles of mice infected with recombinant pandemic 1918 influenza virus (r1918) and a non-lethal seasonal influenza strain (A/Texas/36/91) identified miR-200a and miR-223 as significantly deregulated miRNAs that were predicted to regulate pathways associated with immune response and cell death. Expression profiles of numerous genes predicted to be regulated by these miRNAs showed significant reciprocal expression in mice infected with r1918 and A/Texas/36/91, suggesting that these genes are likely targets for the deregulated miRNAs [237]. Not only are host miRNAs deregulated by influenza infection, but host miRNAs miR-323, -491 and


Infection of human lung epithelial cells by influenza A/WSN/33 and A/Udorn/72 induces expression of miR-7, -132, -146a, -187, -200c and miR-1275, and 26 innate immunity-associated genes (including IRAK1 and MAPK3) are targeted by these miRNAs [240]. A recent study showed that miR-29a induced in A549 lung epithelial cells negatively regulates the expression of DNA methyl transferase 3a and 3b to induce the expression of cyclooxygenase 2 (COX2) and prostaglandin E2 (PGE2) [241]. miR-29c has also been shown to regulate BCL2L2 RNA that induces cellular apoptosis during infection [242]. In a recent study, we identified host proteases that are essential for influenza virus replication and showed that miR-106b and miR-124\* regulate expression of ADAMTS7, DPP3 and MST1, while miR-106b\* was shown to regulate PRSS12 expression [243]. miR-451 inhibition was recently shown to induce the expression of pro-inflammatory cytokines IL-6, TNF, CCL5/RANTES and CCL3/MIP1α in primary murine dendritic cells upon influenza infection by inhibiting transcription factor YWHAZ/14-3-3ζ as well as two negative regulators of cytokine production, FOXO3 and ZFP36/Tristetraprolin [244]. Deep sequencing approaches to identify small RNAs differentially expressed in broiler chickens upon influenza A challenge identified gga-miR-206, gga-miR-451 and gga-miR-146a, which were subsequently shown to regulate expression of ARL11, CHMP2B, POU1F1, PDHB and HIF1AN [245].

Infection of A549 cells with highly pathogenic avian influenza A/H5N1 identified miR-21\*, miR-100\*, miR-141, -574-3p, -1274a and -1274b as highly induced miRNAs. miR-141 was found to regulate the expression of pro-inflammatory cytokine TGF-β [246]. Swine challenged with influenza H1N2 significantly induced the expression of miR-15a, -21, -146, -206, -223 and -451 in addition to several key cytokines [247]. miR-146a induced in A549 cells also regulates miR-146a promoter activity and knockdown of this non-coding RNA increases viral replication [248]. We recently showed that miR-149\* regulates NEK8, miR-548d regulates MAP3K1 and miR-1228 and miR-138 regulate CDK13, all of which are kinases that are crucial for influenza virus replication [249]. Overall these studies suggest that influenza virus infection in human, avian and swine deregulates expression of multiple miRNA species to regulate the host anti-viral response.

## **CONCLUSIONS**

Clearly host-encoded miRNAs have a vital function in regulating the immune response to infection, and tissue-specific miRnomes that change temporally provide cells with an expansive repertoire of RNA-based 'tools' to rapidly and specifically fine-tune gene expression. Similarly, pathogens (and viruses in particular) have evolved both their own encoded miRNAs as well as mechanisms to subvert host miRNA-mediated immunity. While there is a clear need to improve the identification of miRNA targets, especially when considering seed sequence alignment alone, the overall number of potential binding sites has been drastically downscaled with an improved understanding of RNA secondary structure and its impact on miRNA target accessibility. Furthermore, as deep sequencing methods capable of detecting extremely low copies of short sequences continue to improve, and more studies are conducted in physiologically relevant primary cells, the true nature of the complex interplay occurring in the cellular noncoding RNA space during infection will be revealed.

### **ACKNOWLEDGEMENTS**

Declared none.

### **CONFLICT OF INTEREST**

The authors confirm that the contents of this chapter present no conflict of interest.

### **REFERENCES**













© 2014 The Author(s). Published by Bentham Science Publisher. This is an open access chapter published under CC BY 4.0 https://creativecommons.org/licenses/by/4.0/legalcode

# **RNAi Screening in Cells of the Immune System: Challenges and Opportunities**

**Sinu P. John<sup>1</sup>, Michael Freeley<sup>2</sup>, Aideen Long<sup>2</sup> and Iain D.C. Fraser<sup>1,\*</sup>**

*<sup>1</sup>Signaling Systems Unit, Laboratory of Systems Biology, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD 20892, USA and <sup>2</sup>Department of Clinical Medicine, Trinity College Dublin, Dublin, Ireland*

**Abstract:** RNAi screening and the use of small silencing RNAs for specific gene knockdown has revolutionized basic science and translational medicine, both through the discovery of novel gene function and as a means to perturb disease-causing genes for therapeutic intervention. Availability of genome-wide RNAi libraries has made it possible to screen, in an unbiased manner, for all genes involved in any cellular process. This promises a more comprehensive understanding of complex cellular response networks, a fundamental goal in the emerging field of systems biology. Despite the obvious potential of this technology, cells of the immune system pose certain challenges for application of large scale RNAi screening, particularly in balancing the efficient delivery of silencing RNA while avoiding non-specific immune responses to the introduced nucleic acid. However, recent advancements in RNAi technology, improvements in delivery methods and the development of robust screening assays have made this technology more accessible to immunologists. Consequently, several examples of successful application of RNAi screening at both genome and sub-genome scales in immune cells are emerging, and are significantly advancing our knowledge of immune cell function. In this chapter, we outline the major challenges of using large scale RNAi screening in hematopoietic cells and describe different methodologies and assays that have been adopted for screening, with an emphasis on how these published studies have advanced our understanding of the immune system in health and disease. We conclude with a discussion of future opportunities and screening approaches that will realize the potential of RNAi screening in immune cells.

**Keywords:** Assay design, electroporation, Hematopoietic cells, immune cells, innate immune response, nucleofection, RNAi screening, siRNA delivery, viral shRNA.

### **INTRODUCTION**

Increased understanding of small RNA biology following the discovery of RNA Interference (RNAi) has led to the development of large-scale libraries that allow researchers to rapidly screen for novel targets relevant to their area of interest. RNAi takes advantage of the endogenous microRNA pathway, permitting the silencing of mRNA transcripts by introduction of a short interfering RNA (siRNA) complementary to the target gene messenger RNA. Large-scale application of this technology using the robotic platforms originally developed for small molecule screens now permits genome-wide forward genetic screens in a wide range of mammalian cell types [1-3]. Accordingly, RNAi screening and the use of siRNAs for specific gene perturbation are revolutionizing both basic science and translational medicine, as they permit rapid and unbiased discovery of novel gene function and the manipulation of disease-causing genes for therapeutic intervention.

**<sup>\*</sup>Corresponding author Iain D.C. Fraser:** Signaling Systems Unit, Laboratory of Systems Biology, National Institute of Allergy and Infectious Diseases, National Institutes of Health, 9000 Rockville Pike, Bldg. 4, Rm. 109A, MSC-0421, Bethesda, MD 20892, USA; Tel: 301-443-5998; Fax: 301-480-5725; E-mail: fraseri@niaid.nih.gov

A wide range of cell types derived from hematopoietic stem cell precursors orchestrates the mammalian immune system. Hematopoietic development generates two branches of immune cells, a myeloid and a lymphoid branch (Fig. **1**). The lymphoid branch gives rise to B and T lymphocytes and natural killer (NK) cells, while the myeloid branch differentiates to all other blood cells including neutrophils and macrophages [4, 5]. Dendritic cells (DCs), another immune cell type, have been reported to develop from both myeloid and lymphoid lineages [6]. The application of genome-wide siRNA screening in hematopoietic cells has the potential to provide significant insight into mechanisms of immune cell activation, host-pathogen interactions, inflammation and tumor biology. Uncovering new components in immune cell function may also reveal novel targets for immuno-suppression, anti-viral and anti-cancer therapies.

However, since the key to successful siRNA screening is the efficient delivery of siRNA into target cells, the subsequent uptake by the RNA Induced Silencing Complex (RISC), and the degradation of the target gene mRNA, hematopoietic cells present some significant technical challenges in terms of siRNA delivery. Moreover, since certain immune cell types are primed to recognize and respond to pathogen-derived nucleic acids, researchers planning RNAi screening studies in immune cells must be particularly vigilant to ensure that RNA delivery protocols do not induce a non-specific interferon (IFN) response to the RNAi reagents.

In this chapter, we discuss the challenges of siRNA screening in cells of the immune system, outline available methodology for RNAi effector delivery and assay design, highlight examples where this technology has been used successfully for immune cell studies, and discuss future opportunities for the use of RNAi screening to address immunological questions.


**Figure 1: Applications and methodologies of RNAi screening in hematopoietic cells.** Differentiated cell types of hematopoietic origin are shown diagrammatically (on left) with potential applications of RNAi screening for various cellular functions (on right). Already reported RNAi screens for specific functions are given in parentheses, with the respective references and si/shRNA delivery method used.

### **CHALLENGES FOR RNAi SCREENING IN IMMUNE SYSTEM CELLS**

Successful perturbation of target genes in RNAi screening depends on the efficient delivery of siRNA into the cell type under study. Transfection of siRNA using lipid-based reagents is the most popular delivery method and has been used successfully in a wide range of non-immune adherent mammalian cell types such as HeLa, U2OS and HEK293 cells [7-9]. However, many hematopoietic cell types are non-adherent in culture, and are often less tractable due to the low percentage of transfection achieved in these cells with currently available protocols. While lipid-based transfection of siRNA remains the most convenient approach to siRNA delivery whenever possible, viral vector mediated delivery of short hairpin RNAs (shRNA) offers an alternative to lipid-mediated methods [10, 11]. However, compared to siRNAs, shRNAs have higher variability in terms of silencing efficiency [1], and viral-mediated transduction may also increase the off-target effects due to higher cellular concentrations of shRNA [12, 13].

The possibility of a non-specific cellular response to the silencing RNA must also be considered, particularly in cells of the innate immune system. While the IFN response to double stranded RNA (dsRNA) was long established as an anti-viral defense strategy, the seminal studies with siRNA in mammalian cells suggested a size threshold of 19-21 nt under which dsRNA could promote gene silencing while avoiding an IFN-induced global inhibition of protein expression [14]. While this rule also appeared initially to hold true in immune cells, it was later discovered that certain immunostimulatory motifs in the small dsRNA could stimulate IFN responses that are particularly potent in certain innate immune cell types [15-19]. While all mammalian cells express cytosolic sensors responsive to dsRNA, such as retinoic acid-inducible gene-I (RIG-I), phagocytic immune cells, such as DCs and macrophages, are also primed to respond to pathogen-derived nucleic acids in endosomal compartments through a number of different toll-like receptors (TLR3, TLR7, TLR8 and TLR9) [20, 21]. The higher sensitivity of certain immune cells to dsRNA can therefore present an additional challenge in attempting to specifically perturb individual genes with siRNA while avoiding IFN induction. For example, TLR7/8 in human monocytes and PBMC have been shown to be activated by both single stranded as well as double stranded siRNAs, with single stranded RNA being comparatively more immuno-potent [22, 23]. This can result in the production of immune cytokines including IFNs, which in turn induce the production of antiviral proteins to inhibit protein synthesis [24]. Recent studies have also suggested that certain cationic lipids used for nucleic acid transfection can non-specifically activate cells and that certain immune cell types may be particularly sensitive to these effects [25]. 
It is thus vital for any RNAi screening endeavor in immune cells, particularly in cells expressing high levels of the described nucleic acid sensors, to carefully evaluate delivery protocols to identify conditions that avoid any significant non-specific responses to either the transfection reagents or the nucleic acid.

As with any siRNA/shRNA mediated gene knockdown, off-target effects are also a major challenge in siRNA/shRNA based high throughput screens in hematopoietic cells, particularly through microRNA (miRNA)-like targeting of the 3'UTR of unintended target genes by the seed sequence of the siRNAs [26, 27]. Strategies to evaluate seed-based off-target effects are covered elsewhere in this volume, so we will not discuss them in detail here. However, we would emphasize that while statistical and experimental validation of individual hits from the screen may help to minimize off-target effects, more retrospective quality control on commercially available siRNAs may be needed to remove library siRNAs with particularly promiscuous seed sequences [28, 29].
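One simple form of the retrospective quality control mentioned above is to group the siRNAs of a library or hit list by their seed sequence and flag seeds shared by several reagents. The sketch below is illustrative only: it assumes the seed is taken as nucleotides 2-8 of the guide (antisense) strand, the function names are hypothetical, and the guide sequences are invented rather than drawn from any commercial library.

```python
# Minimal sketch: flag siRNAs that share a seed sequence.
# Assumption: seed = guide-strand nucleotides 2-8 (7-mer); other
# definitions (e.g. 2-7) are also in use and can be set via `length`.
from collections import defaultdict

def seed_of(guide: str, length: int = 7) -> str:
    """Return the seed region of a guide strand in the RNA alphabet."""
    rna = guide.upper().replace("T", "U")
    return rna[1:1 + length]

def shared_seeds(library: dict[str, str], min_count: int = 2) -> dict[str, list[str]]:
    """Map each seed carried by at least `min_count` siRNAs to their names."""
    groups: dict[str, list[str]] = defaultdict(list)
    for name, guide in library.items():
        groups[seed_of(guide)].append(name)
    return {seed: sorted(names) for seed, names in groups.items()
            if len(names) >= min_count}

# Hypothetical guide strands for three siRNAs against three different genes.
toy_library = {
    "siGeneA_1": "UGAGGUAGCCAUUGCAUGC",
    "siGeneB_2": "AGAGGUAGAAUCCGGAUUA",
    "siGeneC_1": "UCCGAUUAGGCAUCGAUAA",
}
print(shared_seeds(toy_library))
```

If several screen hits directed against unrelated genes fall into the same seed group, a seed-mediated off-target effect becomes a plausible explanation for the shared phenotype and those reagents merit additional validation.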

## **APPROACHES TO RNAi SCREENING IN IMMUNE SYSTEM CELLS**

Successful siRNA screening depends not only on the efficient delivery of siRNA into the cells of choice, but also on the development of a specific and sensitive assay to reliably identify an altered phenotype in response to the silencing of genes in a high throughput format. The sections below first discuss the advantages and disadvantages of various dsRNA delivery methods (Fig. **2**), and then highlight considerations that should be taken into account for assay design.
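As a hedged illustration of what a "specific and sensitive" assay can mean in practice, the sketch below computes the Z'-factor, a statistic widely used in high throughput screening to summarize the separation between positive and negative control wells. This metric is not discussed in the chapter itself; its use here, the function name and the control values are assumptions made purely for illustration.

```python
# Minimal sketch: Z'-factor from control wells on a screening plate.
# Z' = 1 - 3 * (sd_pos + sd_neg) / |mean_pos - mean_neg|
# Values approaching 1 indicate well-separated controls.
from statistics import mean, stdev

def z_prime(positive: list[float], negative: list[float]) -> float:
    """Return the Z'-factor for two sets of control measurements."""
    separation = abs(mean(positive) - mean(negative))
    if separation == 0:
        raise ValueError("control means are identical; Z' is undefined")
    return 1.0 - 3.0 * (stdev(positive) + stdev(negative)) / separation

# Invented readouts: non-targeting siRNA wells vs. positive-control siRNA wells.
non_targeting = [100.0, 102.0, 98.0]
positive_ctrl = [10.0, 12.0, 8.0]
print(round(z_prime(non_targeting, positive_ctrl), 3))
```

Evaluating such a statistic during delivery optimization, in the same plate format used for the primary screen, gives an early indication of whether knockdown-dependent phenotypes will be distinguishable from well-to-well noise.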

## **Silencing Reagent Delivery**

Common practice for siRNA delivery into cells uses either lipid-based transfection or viral mediated transduction. However, in 'hard to transfect' cells, such as primary cells and many hematopoietic cell lines, other alternatives have also been adopted to improve the delivery of siRNA, and these are also discussed below.

## *Lipid-Based Transfection*

Lipid-based transfection, the most widely used method of siRNA delivery, is often not easily transferable to primary hematopoietic cells and cell lines because of the lower efficiency of siRNA delivery in these cells. The advantage of this approach is that in a reverse-transfection format, where plates can be prepared and stored with a prearrayed genome-wide library, it is the most straightforward approach to conduct and interpret a large-scale screen. Although not practical with suspension cell types, it can be used with adherent innate immune cells such as macrophages or DCs if considerable effort is dedicated to delivery optimization in the plate format being used for the primary screening assay. Accordingly, lipid-mediated transfection has been reported for a screen in the mouse macrophage cell line RAW 264.7 [30]. Among the various commercially available lipids tested, only Lipofectamine LTX, Lipofectamine RNAiMAX and HiPerFect were able to deliver sufficient siRNA to induce significant silencing in this study [30]. Another lipid-based transfection screen has been reported in the mouse macrophage cell line J774.1 for host factors regulating *Mycobacterium tuberculosis* infection [31].


**Figure 2: Available methods for RNA interference in immune cells.** Schematic representation of commonly used methods for shRNA and siRNA delivery and screening workflows are shown. **A**. Viral mediated shRNA knockdown can utilize several different types of viral vector for shRNA expression, such as lentiviral, retroviral or adenoviral vectors. **B**. Arrayed RNAi screening can be used for siRNA (lipid-mediated) or shRNA (lipid-mediated or viral-mediated) delivery. **C**. Nucleofection/Electroporation is used to deliver silencing RNA (most commonly siRNA but can also be plasmid-expressed shRNA) through application of an electric field, which can be performed currently in 96-well format. **D**. Modified siRNAs or Accell delivery use modified siRNAs which can be delivered without a transfection reagent into otherwise hard to transfect cells.

In order to obtain prolonged knockdown, this group used two successive rounds of transfection into the same cells [31]. A siRNA screen with lipid-mediated siRNA delivery has also been conducted in the human macrophage cell line THP1, again for identification of host factors involved in the response to *Mycobacterium tuberculosis* [32], while lipid-based siRNA transfection has also been used for screening in the Jurkat T cell line [33]. Despite these few published examples, achieving robust transfection efficiency of siRNA into hematopoietic cells using any commercially available lipid remains a significant challenge. However, it should be noted that if the target cells used in a screen stably express a reporter such as luciferase or green fluorescent protein (GFP) (see 'Screen assay design' section below), this can provide a convenient target for optimizing siRNA delivery protocols. The ability to assay such reporters directly in the screening assay plate format provides two advantages. First, it allows measurement of protein rather than mRNA knockdown (the most common validation method for siRNA delivery), providing a more direct assessment of the required endpoint needed for an effective screen. Second, the ability to run the delivery optimization assays in 96- or 384-well format allows for a more extensive matrix of experimental conditions that improves the chance of identifying an optimal siRNA delivery protocol.
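To make the idea of a "matrix of experimental conditions" concrete, the following sketch enumerates every combination of a few delivery parameters and assigns them, with replicates, to the wells of a microplate. The parameter names, the numeric values and the row-by-row well assignment are hypothetical choices for illustration; a real optimization would use reagent-specific ranges and would usually randomize or interleave controls.

```python
# Minimal sketch: lay out a full-factorial delivery optimization matrix
# on a 96-well (8 x 12) or 384-well (16 x 24) plate.
from itertools import product
import string


def plate_wells(rows=8, cols=12):
    """Return well names in row-major order, e.g. A1, A2, ..., H12."""
    return [
        f"{string.ascii_uppercase[r]}{c + 1}"
        for r in range(rows)
        for c in range(cols)
    ]


def layout_conditions(lipid_ul, sirna_nm, cells_per_well,
                      replicates=2, rows=8, cols=12):
    """Map each well to a (lipid volume, siRNA conc., cell number) tuple.

    All three parameter lists are hypothetical example inputs.
    """
    conditions = list(product(lipid_ul, sirna_nm, cells_per_well))
    wells = plate_wells(rows, cols)
    needed = len(conditions) * replicates
    if needed > len(wells):
        raise ValueError(
            f"{needed} wells needed but the plate has only {len(wells)}"
        )
    layout = {}
    well_iter = iter(wells)
    for condition in conditions:
        for _ in range(replicates):
            layout[next(well_iter)] = condition
    return layout


# 3 lipid volumes x 4 siRNA concentrations x 4 cell densities, in duplicate,
# fills a 96-well plate exactly.
example_layout = layout_conditions(
    lipid_ul=[0.1, 0.2, 0.4],
    sirna_nm=[10, 25, 50, 100],
    cells_per_well=[5000, 10000, 20000, 40000],
)
```

Moving the same design to 384-well format (`rows=16, cols=24`) leaves room for four times as many conditions or replicates, which is the practical benefit described above.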

### *Electroporation/Nucleofection*

An alternative to lipid-mediated transfection for siRNA delivery is electroporation or nucleofection, which has been shown to provide higher nucleic acid transfection efficiencies in many primary cell types and cell lines of hematopoietic origin [34-39]. Electroporation involves applying an electric field pulse to induce transient cell permeability *via* the formation of microscopic pores in the plasma membrane through which nucleic acids can traverse. If the electric field pulse is optimized carefully, most electroporated cells can recover to function normally [40]. Nucleofection® is an electroporation-based procedure consisting of a proprietary device, cell type-specific solutions and pre-set programs for the delivery of nucleotides into a wide variety of cell types, including 'hard-to-transfect' cells. A disadvantage of siRNA delivery *via* electroporation/nucleofection is its low throughput, although advances have been made to electroporate/nucleofect hematopoietic-derived cells in 96-well format [41-43]. It should be noted that using this approach, electroporation of primary T cells required a comparatively higher concentration of siRNA (10x more than for HeLa cells) for efficient gene silencing [44]. The requirement for a relatively high concentration of siRNA, and often also for proprietary delivery buffers for optimal nucleic acid uptake and cell survival, can make this method considerably more expensive on a genome-wide scale. However, a potential advantage of this approach that can reduce the cost per assay is that larger numbers of cells (typically up to 1 × 10<sup>6</sup> cells per sample, depending on cell type) can be electroporated with each siRNA and then distributed into multiple plates to be used for numerous different downstream assays. This could potentially allow for more comprehensive biological readouts from a single screen.
Although the instrumentation for high throughput screening in 384-well format is yet to be developed, the well-established reliability of this technique in immune cells suggests this is a promising area for future screening applications.

### *Viral Mediated Transduction*

Retroviral, adenoviral and lentiviral transduction provide alternative methods of exogenous interfering RNA expression in hematopoietic cell models where transient transfection is inefficient. Initial shRNA designs employed plasmid or viral-based expression of RNA polymerase III-driven stem-loop (hairpin) structures of approximately 50-60 nucleotides resembling pre-miRNA, the endogenous substrate of the Dicer ribonuclease [45]. More recent studies have demonstrated that shRNAs can be more efficiently processed through the host miRNA processing pathway if they are transcribed by RNA polymerase II in the context of the larger 200-400 nucleotide endogenous primary miRNA transcripts that are first processed by the Drosha ribonuclease [46]. Although transient expression of the above types of shRNA designs from plasmid templates can be employed in some easily tractable cell types, this shRNA expression approach is more widely used for screens adopting viral delivery systems such as adenoviral, retroviral and lentiviral infection that can be applied to almost any cell type [8, 47].

However, a caveat with this approach is that the shRNA may compete with endogenous miRNAs for the small RNA processing machinery described above, and there have also been reports that viral driven shRNA expression can trigger an IFN response [48-50]. In this regard, designing the shRNA in the context of the primary miRNA transcript as described above and expressing the transcripts at lower levels from weaker pol II promoters appears to be the best approach, showing reduced toxicity in mouse studies and providing a promising platform for the stable knockdown of genes with minimum cytotoxicity [51].

Another practical issue is that the generation of individual viruses for all genes in the genome could be too expensive and labor-intensive for many research groups. While there are examples of arrayed high throughput lentiviral screens conducted in hematopoietic cells [52], many virus-mediated shRNA screening studies address the large-scale virus production challenge by adopting a pooling strategy using mixed viral preparations containing shRNAs targeted against many genes. Cells are infected to achieve, on average, one stably integrated shRNA per cell, followed by an assay that permits selection for loss or gain of a particular phenotype. To then identify a specific shRNA that causes a phenotypic change, RNA is isolated and over- or under-representation of particular shRNA hairpin sequences (or unique shRNA construct-embedded bar codes) is determined by deep sequencing or microarray [53, 54].
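The consequence of choosing a particular average number of integrations per cell can be explored with a simple Poisson model. The sketch below assumes that integrations per cell are Poisson-distributed with a mean equal to the multiplicity of infection (MOI); this is a common simplification rather than something stated in this chapter, and the function names are illustrative.

```python
# Minimal sketch: expected fractions of cells with 0, 1 or >1 integrated
# shRNAs, assuming integrations per cell ~ Poisson(MOI).
import math


def poisson_pmf(k, mean):
    """Probability of exactly k events for a Poisson distribution."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def infection_fractions(moi):
    """Return the expected composition of a transduced cell population."""
    if moi <= 0:
        raise ValueError("moi must be positive")
    uninfected = poisson_pmf(0, moi)
    single = poisson_pmf(1, moi)
    multiple = 1.0 - uninfected - single
    return {
        "uninfected": uninfected,
        "single": single,
        "multiple": multiple,
        # Fraction of transduced cells that carry exactly one shRNA.
        "single_among_infected": single / (1.0 - uninfected),
    }


low_moi = infection_fractions(0.3)
unit_moi = infection_fractions(1.0)
```

Under this model, lowering the MOI increases the share of transduced cells that carry a single hairpin, at the cost of leaving more cells untransduced, which is one reason pooled screens are often combined with selection for the integrated vector.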


A notable advantage of shRNA-based screening is that viral vectors with inducible promoters can be used to tightly regulate gene knockdown [53, 55]. This is particularly useful in circumstances where genes essential for cell viability need to be depleted acutely over a short time frame. Another useful application of this platform is the ability to use cell-type specific promoters to selectively express the required inducible response elements and deplete a target gene or genes only in a particular immune cell subset. Selective promoters for many immune subsets are well established, such as the myeloid promoters lysM, csf1r, CD11c, CD68, macrophage SRA, and CD11b, lck for T cell expression, CD19 for B cells, Mcpt5 for mast cells and Ncr1 for NK cells. Although retroviral and adenoviral vectors are appropriate for most cell lines and some primary cells, such as activated lymphocytes, efficient delivery of shRNA to non-dividing cells, including immature DCs, is better achieved with lentiviral systems [56].

## *Modified siRNA (Accell)*

Accell siRNA delivery is a relatively recent technology developed by Dharmacon (part of GE Healthcare) for use in less tractable cells. This technology uses modified siRNA for efficient uptake by cells without a transfection reagent. In a recently published study, the efficiency of siRNA delivery into primary cells was 85% (range, 71–97%) with Accell delivery compared to 38% (range, 23–65%) with nucleofection [57]. This technique has been optimized for use in 96-well format and has the potential for development in 384-well format for high throughput transfection of hematopoietic cells. However, to the best of our knowledge, no high throughput siRNA screen has been reported so far with this method. A caveat with this technique is the requirement to incubate cells in serum-free media for a minimum of 48 h during the siRNA delivery and, depending upon the cell type, tolerance to the serum-free conditions may vary. While there are few reports testing the tolerance of immune cells to serum-free conditions in culture, Accell siRNA delivery can also be achieved in low serum conditions down to 2.5%, which could broaden the applicability of this delivery method. It should also be noted that the concentration of siRNA required for efficient delivery using Accell technology is higher than for conventional lipid-based delivery.

### **Screen Assay Design**

Various types of assays have been adapted for RNAi screening in hematopoietic cells. Selection of an optimal assay depends on the nature of the cells, whether they are suspension or adherent and also on the biological question under consideration. For genome-wide screens, the amenability of an assay to high throughput microwell formats is a primary consideration. Here we describe several types of assays researchers have adopted for RNAi screening in immune cells.

## *Viability and Cell Survival Assays*

Due to their routine use in evaluating cancer cell survival, cell viability assays are one of the most highly validated and straightforward screen readouts. The most widely used cell viability assays measure the energy metabolite ATP [58], and these assays also have important applications as a secondary readout for assays that require cell number-based normalization. Viability assays have immunological applications as a primary readout outside of the cancer field, for example in assaying pathogen infections that typically lead to cell death or necrosis, or in the study of apoptotic signaling pathways that lead to the negative selection of self-reactive lymphocytes. As an example, a FADD-deficient Jurkat cell line, I42, that undergoes programmed necrosis in response to tumor necrosis factor-alpha (TNF) stimulation, has been used to screen for kinases important in this pathway [59].

## *Endpoint Reporter Assays*

Genetic reporter systems provide a convenient and well-established approach to studying signaling and subsequent gene expression by RNAi screening. Most applications of such assays adopt a sensitive endpoint readout of an easily assayed reporter such as luciferase, driven by a sequence derived from a signal-responsive promoter. In Jurkat T cells, for example, a stably integrated reporter was created with Gaussia luciferase expression under the control of 8xNFκB sites, and used to screen for signaling components involved in activation of NF-κB by either TNFα or anti-CD3 [43]. In another novel reporter assay designed specifically for pooled shRNA screens, a macrophage cell line was created with the TNF promoter driving a lethal diphtheria toxin A (DT-A) cassette, which promoted cell death in response to TLR stimulation. Thus, upon infection with a pooled shRNA library, cells transduced with shRNAs inhibiting genes necessary for TLR-dependent proinflammatory signals would be positively selected and the causative shRNAs could be identified by deep sequencing [60].

## *High Content Imaging and Cell-Based Fluorescence Assays*

Advances in high throughput microscopy platforms have enabled the development of increasingly complex cellular imaging and fluorescence-based screening assays. In addition to the simultaneous detection of multiple readouts using spectrally distinct fluorescent proteins or conjugated antibodies, changes in cell shape and morphology can also be rapidly profiled using advanced image analysis software. Such 'high content' imaging can be especially advantageous in RNAi screening as it can allow the detection of multiple assay readouts in a single screen, significantly increasing the amount of data that can be collected to gain a more comprehensive understanding of the effects of gene perturbation.

The use of multiple fluorescent channels can also permit differential staining to screen for effectors of key immune processes. For example, an image-based phagocytosis assay was developed in mouse macrophage J774A.1 cells using different fluorescent dyes to distinguish external and internalized opsonized particles. This assay was used to screen for key phagocytosis effectors among a panel of 20 GTPases [61].

Another image-based fluorescence assay adapted for RNAi screening in immune cells is the Förster resonance energy transfer (FRET) assay. Using a combination of fluorescent proteins, these assays can effectively monitor changes in protein-protein interactions or in the activities of signaling enzymes and effectors that drive immune responses. For example, the small G protein Cdc42 regulates changes in cell morphology, motility and vesicular transport, and its activity can be monitored with the FRET biosensor Raichu-Cdc42, which changes conformation in response to Cdc42 activating stimuli. This has been successfully used for RNAi screening in YTS NK cells, where Cdc42 activity was used as a readout downstream of activating and inhibitory NK receptors that drive the morphological changes essential for immune surveillance and cytotoxicity [62].

Although not microscopy-based, flow cytometry assays provide another fluorescence-based readout that is often adaptable to high throughput RNAi screening applications, and these assays can be particularly useful in non-adherent immune cell types, which are less amenable to microscopy-based high content assays. This long-established technology provides the advantage of rapid collection of data from a large number of single cells and is particularly well suited to assays that can provide a robust change in a fluorescent readout. It has been widely adopted for RNAi screening in immune cells to study host-pathogen interactions, immune activation and cell signaling. In one such study, GFP-expressing *F. tularensis subsp. holarctica* was used in a genome-wide siRNA screen for genes mediating resistance to *F. tularensis* infection in the THP1 macrophage-like cell line. Hits were scored by decreased levels of GFP in cells 24 hr after infection with the labeled bacteria [63].
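One common way to turn a per-well fluorescence readout into hit calls is a plate-wise robust z-score. The sketch below is a generic illustration of that idea and is not the scoring method of the cited study; the input values, the well labels and the threshold of -2 are hypothetical.

```python
# Minimal sketch: robust z-scores (median / MAD) for a per-well readout,
# flagging wells whose signal is strongly decreased.
import statistics


def robust_z_scores(values):
    """Return a robust z-score for each well.

    values maps a well or siRNA label to its fluorescence readout.
    The factor 1.4826 scales the MAD to match the standard deviation
    of normally distributed data.
    """
    readouts = list(values.values())
    median = statistics.median(readouts)
    mad = statistics.median(abs(v - median) for v in readouts)
    if mad == 0:
        raise ValueError("MAD is zero; robust z-scores are undefined")
    scale = 1.4826 * mad
    return {label: (v - median) / scale for label, v in values.items()}


def call_hits(values, threshold=-2.0):
    """Return labels whose robust z-score is at or below the threshold."""
    scores = robust_z_scores(values)
    return sorted(label for label, z in scores.items() if z <= threshold)


# Made-up mean GFP values for six wells.
example_gfp = {
    "siCtrl1": 100, "siCtrl2": 104, "siCtrl3": 96,
    "siA": 98, "siB": 102, "siHit": 40,
}
example_hits = call_hits(example_gfp)
```

Median-based statistics are used here because a handful of strong hits or failed wells would otherwise inflate a mean and standard deviation computed across the plate.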

### *Secreted Effector Assays*

There are many well-established methods to assay for effectors secreted from immune cells or released by degranulation. Enzyme-linked immunosorbent assays (ELISA) are widely used for secreted cytokine quantification, and more recently developed bead-based assays can permit the simultaneous evaluation of multiple cytokines in a single sample of cell supernatant. The advantage of adopting these assays as RNAi screen readouts is that they are already well validated in microplate formats, and the collection of the secreted factors from cell supernatants can permit additional cell-based assays to be run from the cells left behind in the same well. A potential drawback with these assays on a genome-wide scale is the expense of the antibodies and reagents to run a large-scale screen. Examples of ELISA-based assays used in RNAi screens include interleukin (IL)-2 production in response to antigen presentation by DCs [64], and IL-1β production in response to *E. coli* infection of THP-1 macrophages [65]. A bead-based antibody detection assay for IFN-γ was used to screen for genes that promote target cell resistance to human NK cell mediated lysis [66]. Enzymatic activities of secreted proteins can also be used for developing RNAi screening assays in immune cells. For example, mast cell degranulation releases various enzymes that can be assayed by colorimetric detection. One such study screened for phosphatases involved in IgE-mediated mast cell degranulation using β-hexosaminidase activity as a readout [67].

### *Migration Assays*

Transwell migration assays have been developed for studying immune cell migration and chemotaxis. While these assays do not lend themselves well to a high-throughput arrayed assay format, they are better suited to pooled shRNA screens as cells from a transduced population that show an aberrant migration phenotype can be easily isolated. A screen of this type has been run in RAW264.7 cells to screen for genes influencing chemotaxis towards the complement component C5a [68], and also in SupT1 lymphoma T cells to identify genes that influenced migration towards CXCL12 [69].

## *Colony Forming Assays*

When studying host cell infection by pathogens such as bacteria or viruses, pathogen growth or replication can be measured by colony-forming unit (CFU) or focus-forming assays. Although not normally considered practical in a high throughput format, this type of assay has been used for a genome-wide RNAi screen to identify host factors involved in the response to *Mycobacterium tuberculosis* infection [31].

### *Microarray and Next Generation Sequencing Assays*

Although they are not widely used as a primary assay for evaluating cell responses in RNAi screens, microarrays and next generation sequencing have important applications in identifying enriched or depleted shRNAs (and hence the genes) responsible for phenotypic effects in pooled shRNA screens. For example, microarray analysis has been used to identify enriched shRNAs after human immunodeficiency virus (HIV) infection of Jurkat cells [70].
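The core calculation behind identifying enriched or depleted shRNAs is a comparison of each hairpin's relative abundance before and after selection. The following sketch computes a log2 fold change from read (or probe) counts; the pseudocount, the normalization to total counts and all names and numbers are assumptions chosen for illustration, and dedicated statistical packages would normally be used for real screens.

```python
# Minimal sketch: per-hairpin log2 fold change between a reference
# (pre-selection) and a selected population in a pooled shRNA screen.
import math


def log2_fold_changes(reference, selected, pseudocount=1.0):
    """Return log2(selected fraction / reference fraction) per hairpin.

    reference and selected map hairpin IDs to raw counts. A pseudocount
    is added to every hairpin so that zero counts do not produce
    infinite values.
    """
    hairpins = sorted(set(reference) | set(selected))
    ref = {h: reference.get(h, 0) + pseudocount for h in hairpins}
    sel = {h: selected.get(h, 0) + pseudocount for h in hairpins}
    ref_total = sum(ref.values())
    sel_total = sum(sel.values())
    return {
        h: math.log2((sel[h] / sel_total) / (ref[h] / ref_total))
        for h in hairpins
    }


# Made-up counts: sh1 is enriched, sh3 and sh4 are depleted.
example_reference = {"sh1": 99, "sh2": 99, "sh3": 99, "sh4": 99}
example_selected = {"sh1": 199, "sh2": 99, "sh3": 49, "sh4": 49}
example_lfc = log2_fold_changes(example_reference, example_selected)
```

Positive values indicate hairpins that became over-represented under selection and negative values indicate depletion; several independent hairpins against the same gene moving in the same direction gives more confidence than any single hairpin.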

## **INVESTIGATION OF IMMUNE CELL FUNCTION USING RNAi SCREENING**

Despite the challenges facing researchers wishing to adopt RNAi screening approaches in hematopoietic cells, this technology has been used successfully in numerous studies in a variety of immune cell types. While we do not have the scope to reference all of these studies in this review, here we highlight several examples organized by the target cells in which the studies were performed.

### **T Cells**

T cells are a vital arm of the adaptive immune response and provide surveillance against micro-organisms, pathogens and tumors [71]. T cells arise from common lymphoid progenitors in the bone marrow (Fig. **1**) and are transported to the thymus where some will eventually develop into CD4+ T cells and CD8+ T cells. In response to activation by antigen-presenting cells (APCs), CD4+ T cells proliferate and differentiate into helper T cell subtypes (*e.g.* Th1, Th2 and Th17) that secrete distinct cytokines that influence both innate and adaptive immunity, while CD8+ T cells proliferate and differentiate into cytotoxic CD8+ T cells that mediate direct target-cell killing. A lineage of CD4+ T cells known as regulatory T cells (Tregs) inhibits the function of T cells and other immune cell types. T cell differentiation also results in the formation of long-lived memory T cells that are rapidly mobilized should the host encounter the same pathogens in the future. T cell activation and differentiation take place in secondary lymphoid organs and are followed by trafficking of these cells from the bloodstream into inflamed target tissues such as the skin or gastrointestinal tract where they mediate effector function. The capacity of T cells to migrate and infiltrate into tissues, however, is also a major contributing factor in the development of autoimmunity [72], allergy [73] and graft rejection [74]. Understanding the genes and proteins that regulate T cell activation and trafficking into tissues may therefore reveal novel immunosuppressive or anti-inflammatory targets. T cells (and monocytes/macrophages) are also the primary site of infection by HIV [75]. Since HIV is dependent on host factors for both viral entry and replication inside T cells, elucidating such host factors may provide novel therapies in the continuing fight against this virus.
In addition, targeting malignant T cells *via* silencing of cell survival genes may provide novel therapeutic targets for T cell leukemias and lymphomas. Here we outline siRNA screening studies that have led to the elucidation of new components that regulate T cell signaling, cytokine production, migration and viral-host interactions.

### *T Cell Activation and Function*

A number of RNAi screens performed in human T cell lines or primary T cells have revealed new molecular components that influence T cell activation and cytokine production. For example, the heterogeneous ribonucleoprotein hnRNPLL was identified as a key splicing factor that regulated alternative splicing of CD45, a transmembrane protein tyrosine phosphatase, in the JSL1 T cell line following T cell activation [76]. In primary CD4<sup>+</sup> T cells, a lentiviral screen targeting ~1000 kinases and phosphatases identified 55 genes (such as the tyrosine kinases FLT-3 and ZAP-70) that positively or negatively regulated cytokine production in Th1 T cells (IFN-γ), Th2 T cells (IL-13) and Treg cells (IL-10) [77]. In a separate study, the same group recently used this lentiviral shRNA library to identify factors that regulate the expression of CD46, a type 1 membrane protein required for T cell activation and function, in primary human CD4<sup>+</sup> T cells [78]. Two members of the serine/threonine kinase G protein-coupled receptor kinase (GRK) family, GRK2 and GRK3, influenced the expression of CD46, as knockdown of these kinases inhibited the expression of this membrane receptor [78]. Brechmann *et al.* [43] recently screened a siRNA library targeting 298 known or putative phosphatase genes to identify novel regulators of NF-κB signaling in the Jurkat leukemic T cell line using nucleofection in combination with a 96-well shuttle system. The protein phosphatase 4 regulatory subunit 1 (PP4R1) was identified as a negative regulator of NF-κB activation in this study and characterized further [43].

Screening of siRNA libraries has also led to the identification of new components that regulate T cell migration and trafficking. A nucleofection-based small-scale siRNA screen of Rho guanosine triphosphatases in CCRF-CEM T cells identified RhoA as a crucial mediator of trans-endothelial migration (the process by which T cells migrate through endothelial cells that line blood vessel walls into peripheral tissues or secondary lymphoid organs) [79]. In a separate study, a lentiviral shRNA screen targeting 300 genes encoding predominantly kinases and phosphatases was carried out in SupT1 lymphoma T cells to identify genes that influenced migration towards the CXCL12 chemokine in Transwell chambers [69]. Eleven genes, including ZAP-70 and three members of the synaptotagmin family of proteins that regulate vesicle fusion (synaptotagmins 2, 5 and 7), were confirmed as being positive or negative regulators of CXCL12-mediated chemotaxis following a secondary screening assay [69]. The identification of synaptotagmins from this screen led the authors to examine in more detail the role of synaptotagmin-7 in leukocyte migration. Leukocytes derived from synaptotagmin-7 knockout mice migrated less efficiently *in vitro* in comparison to wild-type leukocytes. Furthermore, synaptotagmin-7 deficient leukocytes were impaired in their ability to migrate *in vivo* in an inflammatory mouse model of gout [69].

Screening of siRNA libraries in T cell lines has also identified genes that regulate distinct forms of cell death and pathways implicated in T cell malignancies. A variant of the Jurkat T cell line that undergoes programmed necrosis in response to stimulation with the cytokine TNF was used to screen a siRNA library targeting 691 human kinase genes [59]. Ten distinct genes that regulated TNF-induced necrosis were identified in this screen and one of these, RIP3, was characterized further [59]. In another study, a lentiviral shRNA screen targeting ~9500 genes was carried out in Jurkat T cells to identify genes whose inhibition rendered T cells resistant to Fas-induced apoptosis, a pathway that functions in immune cell homeostasis [54]. As well as identifying genes with well-established roles in this pathway, this study identified ARID1A (a SWI/SNF chromatin remodeling complex component) and CBX1 (a chromatin silencing protein) as critical regulators of Fas-mediated apoptosis [54]. Further screens carried out in this study revealed genes that were essential for the proliferation of Jurkat T cells, SupT1 T cells and ten other cancer cell lines [54]. The molecular events that contribute to T cell acute lymphoblastic leukemia (T-ALL) [80] and to the survival of T-ALL cells [81] have also been elucidated further using siRNA screening.

## *T Cells and HIV Host-Pathogen Interaction*

HIV encodes nine viral genes, is the causative agent of AIDS and infects T cells, monocytes/macrophages and other non-immune cell types including cells of the central nervous system [75]. The CD4 receptor that is expressed on the surface of these host cells serves as the primary site of viral attachment and entry, while chemokine receptors such as CXCR4 and CCR5 also serve as co-receptors for viral entry at later stages [75]. Once inside the cell, viral replication and integration into the host genome are dependent on both viral and host cell genes. In 2008, three genome-wide siRNA screens identified hundreds of host cell genes that were crucial for HIV replication and infectivity [82-84]. Surprisingly, very little overlap in the 'hits' or host cell genes required for viral replication was observed between the three screens. This was attributed to differences in the experimental procedures and assay read-outs of the individual studies [85]. A fourth study that utilized a whole genome siRNA library to identify intrinsic resistance factors to HIV replication was recently reported [86]. A notable caveat with all of these studies was that HeLa-derived cervical cancer cells expressing exogenous CD4 were used as the host cells for HIV infection in three of the studies [82, 84, 86], while an HIV isolate that lacked the viral envelope protein required for infection was pseudotyped with the Vesicular Stomatitis Virus Glycoprotein to allow viral entry into CD4-negative HEK293T cells in the other study [83]. Hence these cells were not representative of the physiological cell types infected by the virus. In contrast, a genome-wide RNAi screen has recently been performed in a T cell line to identify host cell factors required for HIV infection. Yeung *et al*. [70] used the Jurkat leukemic T cell line that expresses endogenous CD4 to screen an shRNA library targeting 54,509 human transcripts to identify host cell proteins contributing to HIV replication. Interestingly, 252 genes were identified from this screen and although most of them did not overlap with the genes identified from previous screens, many of the genes mapped to common cellular pathways [70]. An shRNA screen targeting 622 human kinases and 180 human phosphatases was also recently performed in Jurkat T cells to identify genes required for HIV infection and replication [87].
Fourteen new genes were identified in this screen and implicated in steps of the HIV life cycle that preceded viral integration into the host genome (*i.e.* viral entry, viral uncoating or viral transcription) [75]. In theory, the identification of kinases and phosphatases that influence the life cycle of HIV may provide novel anti-viral therapeutic targets, since kinases and phosphatases are generally 'druggable' with small molecule inhibitors. Such inhibitors may provide alternative strategies to block viral replication in T cells, which are known to be 'hard to transfect' with agents such as siRNAs. Although current reports on HIV screens in Jurkat T cells and other cell lines provide valuable insight into the cellular pathways involved in the complex interplay between the virus and host proteins, identification of host factors in non-transformed primary T cells will provide a better understanding of the host-virus interaction that occurs *in vivo* and may provide better therapeutic opportunities to target HIV.
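The two observations made above, low gene-level overlap between screens but convergence on common pathways, can each be expressed as a simple set comparison. The sketch below is a hypothetical illustration: the screen names, gene symbols and gene-to-pathway mapping are invented placeholders and do not reproduce data from any of the cited studies.

```python
# Minimal sketch: compare hit lists from several screens at the gene
# level and at the pathway level using the Jaccard index.
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_overlap(hit_lists):
    """Return shared members and Jaccard index for every pair of screens."""
    result = {}
    for name_a, name_b in combinations(sorted(hit_lists), 2):
        hits_a, hits_b = hit_lists[name_a], hit_lists[name_b]
        result[(name_a, name_b)] = {
            "shared": sorted(hits_a & hits_b),
            "jaccard": jaccard(hits_a, hits_b),
        }
    return result


def to_pathways(hit_lists, gene_to_pathway):
    """Replace each gene by its pathway; unmapped genes are dropped."""
    return {
        screen: {gene_to_pathway[g] for g in hits if g in gene_to_pathway}
        for screen, hits in hit_lists.items()
    }


# Placeholder data, not real screen results.
example_hits = {
    "screen_A": {"GENE1", "GENE2", "GENE3"},
    "screen_B": {"GENE3", "GENE4"},
    "screen_C": {"GENE5"},
}
example_map = {
    "GENE1": "pathway_X", "GENE2": "pathway_Y", "GENE3": "pathway_Y",
    "GENE4": "pathway_X", "GENE5": "pathway_X",
}
gene_level = pairwise_overlap(example_hits)
pathway_level = pairwise_overlap(to_pathways(example_hits, example_map))
```

In this toy data set, screens A and C share no genes yet both contain a hit in the same placeholder pathway, mirroring the pattern of weak gene-level but stronger pathway-level agreement described for the HIV screens.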

### **B Cells**

B cells, like T cells, develop from common lymphoid progenitors in the bone marrow (Fig. **1**). B cells recognize soluble antigen *via* the B cell receptor (BCR) and differentiate into antibody-producing plasma cells and memory cells [88]. Antibodies promote antigen clearance *via* neutralization, opsonization or phagocytosis by other immune cells. Most B cell immune responses require the 'help' of T cells in the form of co-stimulatory receptors or cytokines for optimal proliferation and antibody production. In the presence of T cell 'help', B cells genetically modify the type and affinity of antibody produced by processes known as class switching and somatic hypermutation, respectively. Class switching involves structural changes and permits some antibodies to cross certain tissues and allows the antibody to be recognized by specific cell types. Somatic hypermutation enables more potent antibodies to be developed against the antigen during the immune response. Class switching and somatic hypermutation occur in distinct areas of secondary lymphoid organs known as germinal centers.

The vast majority of siRNA screens performed in B cells have focused on the identification of genes required for survival and pathogenesis of malignant B cell leukemias and lymphomas. The discovery that silencing the expression of these genes results in cell death reveals exciting new opportunities to target these malignant cells. B cells are prone to malignant transformation because the genetic re-arrangements that lead to BCR diversification and antibody production may promote constitutive activation of signaling pathways leading to cellular proliferation and oncogenesis [89]. B cell malignancies are numerous and are categorized according to clinical, histological and genetic features (The World Health Organization Classification of Tumours of Hematopoietic and Lymphoid Tissues, Fourth Edition; 2008). Regulators of cell survival in malignant B cells have been revealed by siRNA screens in cell lines or primary cells derived from diffuse large B cell lymphoma [53, 90-93], Burkitt's lymphoma [94, 95], follicular lymphoma [96], multiple myeloma [66, 97-100], chronic lymphocytic leukemia [101], mantle cell lymphoma [101] and acute lymphoblastic leukemia [42]. An siRNA screen in chronic lymphocytic leukemia cells also revealed molecules that perturbed T cell/APC interaction and thus potential anti-tumor surveillance [102].

While the majority of siRNA screens to date have been performed *in vitro*, advances have been made in adapting the siRNA/shRNA screen format to *in vivo* systems. The *Eμ-Myc;Arf*-/- B cell lymphoma mouse model is one such system, in which transfer of B cell lymphoma cells into mice results in malignancy. B cell lymphoma cells were transduced with a lentiviral shRNA library targeting 1000 genes with known or suspected roles in cancer and injected into mice to identify shRNAs that were depleted or enriched *in vivo* during tumorigenesis [103]. shRNAs targeting dynamic actin reorganization and cell adhesion genes such as Rac2, CrkL and Twinfilin were depleted *in vivo*, with subsequent studies demonstrating that these genes influenced tumor cell mobilization and metastasis [103]. A similar *ex vivo* study was conducted using *Eμ-Myc* hematopoietic stem and progenitor cells (HSPCs) to identify shRNAs capable of accelerating lymphoma. This study identified a number of tumor suppressors (*Sfrp1, Numb*, *Mek1, Angiopoietin*) and components of the DNA damage response machinery (such as Rad17) as being important for lymphoma development [104].
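The depletion or enrichment readout of a pooled screen of this kind can be illustrated with a short calculation. The sketch below is not the analysis used in [103]; it assumes hypothetical read counts per shRNA before injection and after tumor outgrowth, normalizes each sample to reads per million, and calls an shRNA depleted or enriched from its log2 fold change. The shRNA labels, counts, pseudocount and cutoff are all invented for illustration.

```python
import math


def log2_fold_changes(pre, post, pseudocount=1.0):
    """Return log2(post/pre) per shRNA after scaling each sample to reads per million."""
    pre_total = sum(pre.values())
    post_total = sum(post.values())
    changes = {}
    for name, pre_reads in pre.items():
        pre_rpm = pre_reads / pre_total * 1e6
        post_rpm = post.get(name, 0) / post_total * 1e6
        # The pseudocount avoids log(0) when an shRNA drops out completely.
        changes[name] = math.log2((post_rpm + pseudocount) / (pre_rpm + pseudocount))
    return changes


def classify(lfc, cutoff=1.0):
    """Label an shRNA using a hypothetical two-fold cutoff (|log2 fold change| >= 1)."""
    if lfc <= -cutoff:
        return "depleted"
    if lfc >= cutoff:
        return "enriched"
    return "unchanged"


# Hypothetical read counts: input library vs. recovered tumor cells.
pre_counts = {"shRNA_A": 500, "shRNA_B": 500, "shRNA_C": 500, "shRNA_D": 500}
post_counts = {"shRNA_A": 50, "shRNA_B": 500, "shRNA_C": 2000, "shRNA_D": 450}

calls = {
    name: classify(lfc)
    for name, lfc in log2_fold_changes(pre_counts, post_counts).items()
}
print(calls)
```

In practice several shRNAs per gene and replicate animals would be needed before a gene-level call is made; the sketch only shows the per-shRNA arithmetic.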

## **Dendritic Cells**

DCs can be derived from both lymphoid and myeloid progenitors during hematopoiesis (Fig. **1**) [105], and they perform pivotal functions in host defense and antigen-presentation in innate and adaptive immune responses [106]. Application of siRNA screening technology in these cells has been primarily in the areas of antigen presentation, DC migration and host-pathogen interaction.

## *Antigen Presentation*

DCs collect antigens and present them to CD4 T cells on major histocompatibility complex (MHC) class II [107]. Researchers have used alternative cell types with DC characteristics to study proteins and networks involved in MHC-II expression and peptide loading. One such study performed a genome-wide siRNA screen in the MelJuso cell line followed by validation in human primary monocyte-derived DCs [108]. Although MelJuso cells are not hematopoietic, they express many immune-specific genes involved in regulation of MHC-II transport in a similar manner to DCs. This study identified nine proteins involved in transcriptional regulation of MHC-II; five of these regulate the expression levels of the class II major histocompatibility complex transactivator (CIITA), and the remaining genes modulate MHC-II expression without affecting CIITA levels [108]. In addition, they were able to identify six genes involved in MHC-II distribution in DCs. Further analysis identified one of the candidates, ARL14/ARF7, as a MYO1E receptor on the MHC-II compartment required for actin-based transport control [108].

Another property of DCs is their ability to present exogenous antigens on MHC class I, a function termed cross-presentation, and this is crucial for the generation of cytotoxic T cell immunity in response to antigens associated with viral infection, tumorigenesis and DNA vaccination [109]. The role of Rab GTPases, key regulators of membrane trafficking, in cross-presentation has been studied using lentiviral shRNA-mediated knockdown of 57 Rab GTPases in the mouse DC2.4 cell line [64]. Upon establishing stable shRNA expression, the DC2.4 cells were infected with BL21 *E. coli* and antigen presentation to the B3Z T-cell hybridoma was measured by IL-2 ELISA. This identified 12 GTPases involved in antigen cross-presentation, and further experiments showed that internalized MHC class I molecules accumulated in Rab3b/3c-positive recycling endosomes, implicating these vesicles in the process of cross-presentation of exogenous antigens [64].

Another lentiviral shRNA screen was done in mouse DCs with the goal of identifying signaling molecules that regulate cross-presentation [52]. This screen used an arrayed shRNA library in 96-well format containing 3450 shRNAs against 691 kinases and phosphatases. The shRNA-expressing DCs were challenged with ovalbumin (OVA) peptide-expressing yeast to induce antigen presentation to CD8+ T cells from OT-1 mice, and T cell proliferation was assayed by a thymidine incorporation assay. This screen unveiled several new regulators of antigen cross-presentation. For example, Acvr1c, also known as Alk7, a type I receptor for the TGF-β family of signaling molecules, was identified as a novel negative regulator [52].

### *DC Migration*

Upon encountering antigens in peripheral tissues, DCs must localize to secondary lymphoid organs such as the spleen and lymph nodes to present the antigens to T cells. An shRNA screen has been conducted to identify the Rho guanine nucleotide exchange factors (GEFs) involved in the actin cytoskeleton signaling mechanisms controlling DC migration [68]. This screen used the macrophage-like RAW264.7 cell line, with hits subsequently validated in primary DCs. Delivery was achieved through lipid-based transfection of plasmid-expressed shRNAs targeting 38 Rho GEFs. The shRNA vector was modified by incorporating a GFP-luciferase fusion protein expression cassette. At 48 h post-transfection, transmigration of the cells was assayed in 24-well transwell plates and quantitated by luciferase expression levels. From an initial set of 6 hits showing more than 50% inhibition of cell migration, Arhgef5 was found to be a key protein essential for the transmigration of DCs. Further *in vitro* studies showed a direct interaction of Arhgef5 with G12, a key effector in G-protein-mediated chemotactic signaling to the actin-cytoskeletal regulator RhoA. MIP1-induced chemotaxis of immature DCs and migration of DCs from skin to lymph node were shown to be impaired in Arhgef5 knockout mice [68], thus identifying Arhgef5 as a significant regulator of RhoA-dependent immature DC migration.
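The hit criterion mentioned above (more than 50% inhibition of migration) can be written out as a small calculation. This is only a sketch under stated assumptions: the luciferase values, the GEF labels and the use of a single non-targeting control as the reference are hypothetical and are not taken from [68].

```python
def percent_inhibition(signal, control_signal):
    """Percent reduction of the migration readout relative to the control shRNA."""
    if control_signal <= 0:
        raise ValueError("control signal must be positive")
    return (1.0 - signal / control_signal) * 100.0


def select_hits(signals, control_signal, threshold=50.0):
    """Return the shRNA targets whose inhibition exceeds the threshold, sorted by name."""
    return sorted(
        target
        for target, signal in signals.items()
        if percent_inhibition(signal, control_signal) > threshold
    )


# Hypothetical luciferase readings (arbitrary units) of transmigrated cells.
control = 1000.0
luciferase = {"GEF_1": 400.0, "GEF_2": 900.0, "GEF_3": 499.0}

hits = select_hits(luciferase, control)
print(hits)
```

A real screen would average replicate wells and normalize for transfection efficiency before applying any threshold.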

## *DCs and Host-Pathogen Interactions*

DCs also have important innate immune functions in the host response to pathogen infection. The 96-well format nucleofection approach described above has been employed to study the role of the RIG-I RNA-sensing pathway during Newcastle disease virus (NDV) infection of DCs. While not run as a screen for novel pathway components, this study confirmed a role for RIG-I in NDV-driven IFNβ induction, and provides a basis for the application of this siRNA delivery approach for large-scale siRNA screens in primary DCs [110].

RNAi screening has also been employed in DCs to study the cellular response network mediating differential responses to TLR stimuli. Using gene expression arrays to implicate genes involved in regulating the DC transcriptional network, candidate regulators were tested by lentiviral-based shRNA perturbation in primary BMDCs [111]. From a screen of 125 genes, it was concluded that on average, about 14 regulators activate a target gene and about 5 regulators repress it. This study also identified novel regulators of transcription in response to TLR stimuli [111]. A similar approach was undertaken to study the signaling components involved in the TLR response in BMDCs [112]. In combination with transcriptional profiling, small molecule-based perturbation and phosphoproteomics, lentiviral-mediated shRNA knockdown identified novel signaling regulators involved in TLR signaling in DCs. Of particular note, Polo-like kinases 2 and 4 activated a novel signaling branch comprising several proteins, including Tnfaip2, a gene previously associated with autoimmune diseases [112].

## **Monocytes and Macrophages**

Monocytes play a vital role in immune system function, linking inflammatory responses and the defense against pathogens to the adaptive immune response. Circulating monocytes serve as a reservoir for the renewal of both macrophages and antigen-presenting DCs [113]. Macrophages are derived from parent monocytes in the blood and are highly phagocytic cells that play a pivotal role as first responders to invading pathogens and also in maintaining tissue homeostasis through scavenging and clearance of dead and damaged host cells [114]. In response to various stimuli, macrophages are activated and acquire specialized functional phenotypes. Activated macrophages are generally classified into the following two types: classically activated macrophages (M1), which promote inflammation, and alternatively activated macrophages (M2), which inhibit inflammation but promote wound repair and tissue remodeling [113]. Although macrophages are involved in host defense against microorganisms, these cells are often a primary site for replication of bacteria, protozoans and viruses [115, 116]. Recognizing the broad range of functions performed by monocytes and macrophages, RNAi screens have been used to study their immune system function in several areas.

## *Immune Activation and Cytokine Response*

Interleukin-1 is a key inflammatory mediator produced by monocytes and macrophages in response to bacterial infection. A lentiviral shRNA screen has been conducted in THP-1 monocytes to identify components of the splicing machinery, among a set of 425 genes, regulating the expression and secretion of IL-1 in response to *E. coli* infection [65]. THP-1 cells were infected and selected for stable shRNA expression, challenged with bacteria, and the level of secreted IL-1 protein was measured by ELISA. This study identified 10 factors whose perturbation decreased the level of IL-1, and 20 factors whose depletion led to increased cytokine secretion, the latter set including SFRS3, a member of the SR protein family required for constitutive pre-mRNA splicing and regulation of alternate splice site selection [65, 117].

TNF is another major inflammatory cytokine released by macrophages in response to a broad range of infectious stimuli. Using a novel reporter assay (described in 'Endpoint reporter assays' sub-section above), RAW264.7 macrophage cells were screened with a randomly generated lentiviral library of shRNA sequences to implicate genes required for TLR pathway induced TNF expression. Using a diphtheria toxin reporter, this assay positively selected for cells expressing shRNAs that blocked the TLR2 and TLR4-mediated proinflammatory TNF response [60]. While the authors reported identification of enriched shRNAs by this method, they did not identify putative gene targets with a complete match to the shRNA sequences, possibly due to miRNA-like targeting through partial complementarity in the shRNA seed sequence.

## *Phagocytosis*

Phagocytosis by macrophages is orchestrated by coordinated movement of the actin cytoskeleton that involves Rho GTPase signaling [118]. A small-scale siRNA screen in the J774.1 mouse macrophage cell line attempted to identify the relevant GTPases (among a target set of 20) controlling phagocytosis through the Fc receptor (FcR) and complement receptor 3 (CR3) [61]. After lipid-mediated transfection of siRNA in 24-well plates, cells were challenged with either IgG-opsonized (for FcR) or C3bi-opsonized (for CR3) sheep red blood cells (RBCs), and phagocytosis levels were assayed by immunofluorescence microscopy. Cdc42 and Rac2 were shown to be the primary effectors of FcR-mediated phagocytosis, RhoA primarily mediated CR3-mediated phagocytosis, while RhoG was required for both [61].

### *Host-Pathogen Interaction*

During innate and adaptive immune responses to viral and intracellular bacterial infection, macrophages are activated by IFN-γ produced by NK cells and by CD4 Th1 and CD8 cytotoxic T cells. IFN-γ can inhibit the intracellular replication of *Francisella tularensis* in both human-derived macrophages and in mice. A lentiviral shRNA screen has been conducted to identify host factors conferring resistance to *F. tularensis*. In this screen, THP-1 cells were transduced with a genome-wide lentiviral library containing 50,000 shRNAs. At 9-10 days after lentiviral infection and selection with puromycin, the cells were differentiated to a macrophage-like state with PMA and treated with IFN-γ prior to infection with fluorescently labeled *F. tularensis* for 2 h. At 24 h post-infection, the cells with higher fluorescence were collected by FACS and the shRNAs expressed in these cells were detected by genomic DNA sequencing. Across five independent experiments, genes that appeared in more than one screen were studied further. Eight of the validated proteins (TNFRSF9, SERPINI1, SERPINA7, HLA-DRB1, ATG5, ATG16L1, PLEK2 and PLS1) were also required for *Listeria monocytogenes* resistance, implicating these genes in broader host mechanisms of IFN-γ restriction of bacterial replication [63].
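Selecting genes that recur across independent screens, as described above, amounts to a simple tally. The following is a minimal sketch rather than the procedure reported in [63]; the gene labels and the threshold of two screens are hypothetical.

```python
from collections import Counter


def recurrent_genes(screens, min_screens=2):
    """Return genes recovered in at least `min_screens` independent screens."""
    # set() ensures a gene is counted once per screen even if several
    # shRNAs against it were recovered in that screen.
    counts = Counter(gene for screen in screens for gene in set(screen))
    return sorted(gene for gene, n in counts.items() if n >= min_screens)


# Hypothetical gene lists recovered from five independent sorted populations.
screens = [
    ["GENE_1", "GENE_2", "GENE_2"],
    ["GENE_2", "GENE_3"],
    ["GENE_4"],
    ["GENE_1", "GENE_5"],
    ["GENE_6"],
]

candidates = recurrent_genes(screens)
print(candidates)  # ['GENE_1', 'GENE_2']
```

Raising `min_screens` trades sensitivity for confidence; the appropriate value depends on library coverage and sorting stringency.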

Another focus of siRNA screening in monocytes and macrophages has been to identify host factors influencing *Mycobacterium tuberculosis* (Mtb) infection. These studies have great potential in drug development efforts to alleviate the continuing global health burden of tuberculosis. A focused screen of 744 kinases and 288 phosphatases for their influence on Mtb replication was carried out in murine J774.1 macrophages [31]. Two rounds of lipid-based transfection 48 h apart were implemented to obtain prolonged silencing, and primary screen hits were validated with different siRNAs from alternative vendors. Among a validated group of 41 hits, the TGF-β receptor isoforms TGFβRI and TGFβRII showed particular promise, as in addition to their siRNA perturbation phenotype, they showed marked elevation in expression in response to Mtb infection in microarray experiments. The TGF-β pathway inhibitor, D4476, showed marked inhibition of intracellular bacterial replication and also eliminated Mtb from infected mice [31]. This study emphasizes the potential of siRNA screens in identifying drug targets for pathogen control in hematopoietic cells.

A comparable Mtb screen was conducted on a genome-wide scale in THP-1 cells using the same virulent H37Rv strain [32]. This screen used pooled siRNA targeting 18,174 human genes, and after several rounds of hit validation, 270 genes were identified whose knockdown led to a reduction in intracellular Mtb load, while knockdown of 5 genes led to an elevated bacterial load. Since Mtb exhibits a broad spectrum of genotypic and phenotypic variation, the authors rescreened the validated hits against different field isolates of Mtb and noted limited overlap in the host factor requirement across Mtb strains [32]. However, pathway and protein interaction network analysis of the specific hits for each isolate suggested a greater degree of overlap at the pathway level, a pattern also observed in influenza and HIV screens [119, 120].
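The contrast between gene-level and pathway-level overlap can be made concrete with a toy comparison. The sketch below assumes a hypothetical gene-to-pathway annotation and two hypothetical hit lists; it uses the Jaccard index as one possible overlap measure and is not the network analysis performed in [32].

```python
def jaccard(a, b):
    """Jaccard index: size of the intersection divided by size of the union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical annotation mapping each gene to a single pathway.
gene_to_pathway = {
    "GENE_A": "autophagy",
    "GENE_B": "autophagy",
    "GENE_C": "lipid metabolism",
    "GENE_D": "lipid metabolism",
    "GENE_E": "calcium signaling",
}

# Hypothetical validated hits for two bacterial isolates.
hits_isolate_1 = {"GENE_A", "GENE_C", "GENE_E"}
hits_isolate_2 = {"GENE_B", "GENE_D", "GENE_E"}

gene_overlap = jaccard(hits_isolate_1, hits_isolate_2)

pathways_1 = {gene_to_pathway[g] for g in hits_isolate_1}
pathways_2 = {gene_to_pathway[g] for g in hits_isolate_2}
pathway_overlap = jaccard(pathways_1, pathways_2)

print(gene_overlap, pathway_overlap)  # 0.2 1.0
```

Here only one of five genes is shared, yet both hit lists map onto the same three pathways, which is the pattern the text describes.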

## *Tumor-Associated Macrophage Biology*

Tumor-associated macrophages (TAMs) have an important role in facilitating tumor outgrowth through anti-inflammatory, immunosuppressive, pro-angiogenic and pro-metastatic properties [121]. It would thus be of great clinical importance to identify the pathways that lead to generation of TAM-like macrophages, and this has been recently addressed with an adenoviral shRNA screen in a human macrophage precursor, the CD14+ peripheral blood mononuclear cell [122]. Using 8495 adenoviral shRNA constructs targeting 2825 genes, shRNA-expressing cells were stimulated to form TAM-like cells by co-culture with MCF-7 breast cancer cells, and assayed for IL-10 secretion (a key cytokine produced by TAM-like macrophages). Five proteins, IL-4 receptor alpha (IL-4RA, IL-4R), cannabinoid receptor 2 (CNR2, CB2), 3-hydroxy-3-methylglutaryl coenzyme-A (HMG-CoA) reductase (HMGCR), and Bruton's tyrosine kinase (BTK), were identified as regulators of IL-10 and/or IL-16 production and possible targets for control of TAM function [122].

## **NK Cells**

NK cells are large granular leukocytes that play a pivotal role in host defense against viruses and tumors [51]. Although large-scale RNAi screens in NK cells have not yet been reported, smaller scale screens have been carried out. As described above (sub-section 'High content imaging and cell-based fluorescence assays'), a FRET-based assay was developed in YTS NK cells to measure activity of the Rho GTPase Cdc42, which plays important roles in NK cell morphology, vesicular transport and motility [123, 124]. Known binding partners of Cdc42 were screened by siRNA electroporation for their effect on Cdc42 activity upon challenge with 721.221 NK activating cells [62]. This identified the guanine nucleotide exchange factors RhoGEF6 and RhoGEF7, the kinase Akt, and the p85α subunit of phosphoinositide 3-kinase (PI3K) as essential for Cdc42 activation [62].

An shRNA screen was also conducted on IM9 cells, a multiple myeloma cell line, to identify molecular pathways that modulate tumor cell susceptibility to NK cell-mediated killing [66]. IM9 cells were transduced with a lentiviral shRNA library targeting 476 kinases, 180 phosphatases, and 372 genes representing tumor suppressors, DNA binding proteins, and modification enzymes. The transduced cells were incubated with NKL effector cells, and release of IFN-γ from NKL cells was assayed as a measure of productive NKL-IM9 cell interaction. Of 83 hits identified in the screen, 66 were kinases, 4 were phosphatases and 12 were from the 372 gene set with non-kinase function. Genes in the MAPK pathway were highly represented with 15 genes, and two members of the JAK kinase family, JAK1 and JAK2, were shown to increase the susceptibility of tumor cells to NK-mediated lysis [66].

## **Mast Cells**

Mast cells trigger allergic reactions and IgE-associated immune responses. Degranulation and release of inflammatory mediators from mast cells are initiated upon aggregation of FcεRI on these cells [125, 126]. To understand the role of phosphatases in IgE-mediated mast cell activation, an siRNA screen has been reported using a library targeting all 198 mouse phosphatase genes in the mouse mast cell line MMC-1 [43]. Using IgE-Ag-induced mast cell degranulation as a functional readout, 10 phosphatases were found to enhance and 7 to inhibit FcεRI-induced degranulation. Among the top hits identified were subunits of calcineurin, where siRNA knockdown of both the Ppp3r1 regulatory subunit and the Ppp3cc enzymatic subunit inhibited degranulation [43].

## **FUTURE OPPORTUNITIES AND CLOSING REMARKS**

The studies we have described emphasize that, while immune cells often present an experimentally challenging cell system, RNAi screening can be successfully implemented in hematopoietic cells to discover important new biology and provide insight to cellular mechanisms of immunological disease. Technological advances in si/shRNA delivery and screening assay formats now permit large-scale screens to be conducted in primary immune cells, which were once considered intractable to genetic screening. The screens we have described here have generated a wealth of new knowledge on genes required for numerous aspects of immune cell function, including innate and adaptive immune cell activation, migration and trafficking, cell death and clonal selection, hematopoietic cell malignancies and host-pathogen interactions. Furthermore, these reported studies only scratch the surface of potential applications and opportunities for RNAi screens in other cell types of hematopoietic origin (Fig. **2**). Also, with further improvements in screening and delivery methods, we expect continued application of this technology will lead to many further insights to immune cell function in health and disease. Areas that could benefit especially from this technology are studies of immune cells *in vivo* in adoptive transfer models that were not previously amenable to large-scale unbiased screening approaches. For example, the ability to deliver immune cell subsets transduced with pooled shRNA libraries, and to leverage the increasing capacity of next-generation sequencing technology to identify the shRNAs expressed by small numbers of cells that show a specific phenotype *in vivo* holds great promise. Another area of potential technical development and new opportunities will be in the targeted delivery of siRNA to specific cell types *in vivo*. For example, targeted delivery of siRNAs into T cells *in vivo* is now feasible with peptide/protein-antibody fusion proteins [127, 128], integrin-targeted and stabilized nanoparticles [129, 130] and aptamers [131, 132], and it is anticipated that these novel reagents may also be useful for *in vivo* delivery of siRNAs into other immune cell types.

While the described screens have produced valuable insight, a major advantage of siRNA screening that has not yet realized its full potential in the immunology field is its ability to screen the entire genome in an unbiased manner. The majority of immune cell screens published to date have targeted subsets of genes, in many cases focusing specifically on a gene family already known to have a central role in a particular process. While this can be useful in highlighting selective usage of functionally related proteins in certain processes (such as the differential usage of GTPases in FcR-mediated and CR3-mediated phagocytosis in macrophages), these studies ultimately miss the opportunity to identify novel regulators of the process under study, and the potential to reach a comprehensive understanding of all genes required in a given cellular response. While admittedly faster and cheaper, these selective screens remain biased towards known 'canonical' gene families, and the field will only begin to realize the full potential of RNAi with an increased commitment to genome-wide screening. At the same time, research groups planning on undertaking siRNA screens should be fully aware of the potential pitfalls and difficulties that may arise when performing and interpreting large-scale or genome-wide screens [133]. These may include issues of screen reproducibility, as demonstrated by two whole genome siRNA screens performed by the same research group within six months of each other where notable intra- and inter-screen variability was reported [134], and the lack of overlapping hits in screens performed by different research groups studying HIV and influenza.
Given recent concerns regarding whole genome RNAi screening studies as a method to identify potential drug targets [135] and reports from industry citing lack of reproducibility of primary data from the literature [136, 137], it is imperative that all steps of the screening process, including siRNA delivery, assay design, identification/confirmation of 'hits' and secondary screening assays, are robust and validated if results from siRNA screens are to be exploited for therapeutic purposes.

Another area where screening data is potentially compromised is in the use of sub-optimal target cells for screening. A prime example has been in several large-scale studies to identify host factors involved in the response to HIV and influenza infection, where use of a variety of 'easily transfectable' cell types has led to the identification of highly non-overlapping hit lists [70, 119, 120]. While siRNA delivery to more physiological target cells can be challenging, it is feasible with a commitment to rigorous protocol development, and we would argue is significantly more valuable in identifying physiologically relevant targets. While the above-mentioned screens have succeeded in identifying overlapping classes of host cellular pathways and processes involved in the response to important viral infections, they have been less useful in identifying specific candidates for therapeutic development. This question also applies to screens conducted in malignant B cell lines, where further work will be required to determine what proportion of hits has a significant role in primary tumor B cells, which are genetically heterogeneous in nature and may have varied oncogene dependence.

This brings up the challenge of translating primary data from siRNA screens into potential drug targets, and the question of whether small molecule inhibitors of screen hits, or siRNAs themselves, hold the most promise as drug candidates. The complexity of many diseases and the mutation frequency of pathogens suggest a combinatory approach may be most productive, and thus the identification of multiple potential gene targets in large-scale siRNA screens may be particularly attractive in guiding approaches to such combined therapies. For example, elucidation of the mechanism of HIV viral entry and replication using several approaches, including RNAi, has led to trials of a combinatory anti-viral therapy using hematopoietic stem cells transduced with a lentiviral vector encoding an shRNA targeting HIV tat/rev, in combination with a ribozyme targeting CCR5 (a co-receptor for HIV entry) and an RNA decoy to the HIV TAR sequence [138].

We predict there will be numerous future opportunities to exploit RNAi screening and the new knowledge it provides for the benefit of human health in the immunological field, particularly through identification of anti-viral and anti-inflammatory therapies, targeting of malignant blood cancers, identification of host factors regulating bacterial diseases, elucidation of regulatory steps in protective immunity and identification of molecular mechanisms underlying immune disorders. It is anticipated that increased understanding of RNAi in mammalian cells and the continual improvements made in siRNA screening will see this technology realizing its full potential for the discovery of novel gene function and for therapeutic purposes.

### **ACKNOWLEDGEMENTS**

This work was generously supported by the National Institute of Allergy and Infectious Diseases Division of Intramural Research (S.P.J. and I.D.C.F.), and by a European Union Marie Curie Transfer of Knowledge grant (M.F. and A.L.).

### **CONFLICT OF INTEREST**

The authors confirm that this chapter's contents present no conflict of interest.

### **REFERENCES**


© 2014 The Author(s). Published by Bentham Science Publisher. This is an open access chapter published under CC BY 4.0 https://creativecommons.org/licenses/by/4.0/legalcode

# **siRNA Microarray-Based Genomic Screening**

**Yong-Jun Kwon, HiChul Kim, Jin Y. Kim, Namyoul Kim, Jinyeoung Heo, TaeKyu Lee, Michael A.E. Hansen and Veronica Soloveva\***

*Institute Pasteur Korea, 696 Sampyeong-dong, Bundang-gu, Gyeonggi-do, 463- 400, South Korea* 

**Abstract:** The following chapter describes a miniaturized array-based platform for RNAi screening. The PhenomicID platform combines high density spotted microarray technology, high content imaging of the cells, and computational algorithms for image and data analysis. This platform provides an efficient and cost effective way to use reverse genetic tools for analysis of mammalian gene functions on a genome-wide scale. The miniaturization of this process allows for experimental complexity and applications not previously feasible when performed in a well-based format. Our team has employed this technology to identify functional genes involved in the progression of infectious diseases, in cell-based infectious models for HIV, Chagas, Dengue virus, Chikungunya virus, and Influenza virus. Importantly, the power of miniaturized technologies is not limited to screening of RNAi libraries, but can allow performance of complex experiments, such as combining drug and siRNA treatments to identify drug targets. In addition, the flexibility of the described platform allows researchers to profile multiple primary cell lines in search of essential genes. In this chapter we will discuss practical guidance for developing microarray-based genome-wide library siRNA screening and its applications.

**Keywords:** Assay optimization, data analysis, image analysis, infectious disease, microarray printing, reverse transfection, siRNA screening, ultra-high throughput.

### **INTRODUCTION**

### **Reverse Genetic Approach**

Roughly a decade ago, two important findings in the areas of genetics and molecular biology were published by multiple groups. The first was a draft sequence of the human genome [1]. The second was the description of RNA interference (RNAi), an endogenous pathway that uses small double-stranded oligonucleotides (referred to as small interfering RNAs, siRNAs) to mediate post-transcriptional gene silencing in a broad variety of organisms [2-4]. These key discoveries brought forth a broad spectrum of new ideas and resulted in the development of tools not only for basic research but also for diagnostic applications and drug discovery.

**<sup>#</sup> Corresponding author Veronica Soloveva:** Institute Pasteur Korea, 696 Sampyeong-dong, Bundang-gu, Gyeonggi-do, 463-400, South Korea; E-mail: VSoloveva@yahoo.com

As a tool, RNAi can now be applied to address a variety of biological questions [5]. The ability of siRNA to reduce or "knock down" levels of a targeted mRNA (and thus protein) in model biological systems has enabled the technology to become a reliable method for therapeutic target identification in screens aimed at identifying regulatory components of various biological pathways [6]. Depending upon the question being addressed, screening can be divided into two approaches. In the first, a defined set of target genes known to be operating in a specific biological pathway is tested. This type of study is typically limited to testing a well-defined collection of genes or gene families such as kinases, phosphatases, or GPCRs [7, 8]. The second approach, which is the focus of this review, is represented by genome-scale RNAi screening. This is typically performed using a phenotypic readout with the goal of comprehensively identifying genes that were not previously known to participate in the phenotype under investigation.

With the publication of various draft genome sequences, libraries comprised of different sets of silencing reagents (*e.g.*, dsRNA, siRNA, shRNA) were developed to target the open reading frames in both model biological systems and the human genome [9-11].

To date, the most conventional uses of RNAi screens in drug discovery have been to identify genes that encode new druggable candidates [12]. The clear benefit of RNAi-based technology in drug discovery is the ability of siRNAs to mimic the inhibitory effect of a small molecule compound of interest on the target protein. Such strategies have been employed in screening campaigns for a variety of disease models including cancer [13, 14], inflammation [15, 16], and neurological (CNS) diseases [17]. RNAi screens have also been used to identify cellular host factors implicated in viral infectious diseases [18] including HIV [19-23], HCV [24-26], Influenza [27-29], WNV and Dengue [29] as well as non-viral parasitic pathogens [30].

Interestingly, while the initial RNAi screens identified new sets of genes related to the phenotypes under investigation, they also broadened our understanding of issues associated with the technology. For instance, screens performed in independent labs and designed to identify critical host factors for HIV [31] and Influenza [32, 33] revealed substantial variability in the data and only a small overlap in the lists of essential genes. This analysis drew attention to the importance of the assay details and screen design, the selection of efficient and specific silencing reagents, and the incorporation of rigorous validation procedures to confirm primary screen hits and reduce false positives [5, 12, 34, 35].

Detailed studies of RNAi reagents facilitated our understanding of the complexities associated with performing reverse genetics with siRNAs. One of the major issues is tied to off-target effects (OTEs), where the introduced siRNA modulates the expression of unintended gene targets [36-38]. Studies have shown that the most general mechanism associated with OTEs is that siRNA "seed" sequences (nucleotides 2-7) induce a regulatory effect similar to that of endogenous microRNAs [37]. Due to the tolerance of single nucleotide mismatches in this process, a single siRNA can potentially target hundreds of unrelated sequences [39, 40]. Separately, researchers have observed siRNA-mediated activation of Toll-Like Receptors (TLRs) and other cellular signaling pathways, facilitated by the presence of GU-rich motifs in introduced oligonucleotides [41, 42]. Due to these and related issues, improving siRNA design became a major goal of reagent-producing groups.

The transient nature of siRNA-mediated effects and their ease of delivery into cells allow flexibility in assay design and facilitate the targeting of genes in a range of cell lines. However, potential cell toxicity associated with the lipid transfection reagents used to introduce siRNAs into the intracellular space, and the high copy number of siRNAs required for target silencing, can produce false results [43-45]. To address these issues, academic and commercial organizations have optimized both the sequence and chemical modifications of siRNA, thereby minimizing microRNA-like off-target effects and increasing overall molecular stability. Simultaneously, chemical modifications have been identified that enable lipid-independent delivery of siRNAs [44, 46, 47].

Despite these efforts, RNAi screens appear to be inherently more variable and produce a broader range of hits than small molecule screens. Birmingham and colleagues [48] performed a statistical analysis comparing results from 13 RNAi screens and 19 small molecule HTS campaigns, and found that RNAi screens have, on average, a two-fold smaller assay window and twice the median coefficient of variation (26.5% for RNAi screens *vs* 13.4% for small molecule HTS). This observation emphasizes the importance of screen optimization to minimize variability and false positive results. One way to enhance the reproducibility of screening results is to run a statistically significant number of replicates in conjunction with multiple relevant experimental controls [5, 34, 47]. To this end, we have developed stringent approaches for designing and performing screens on, and analyzing data from, high-density live cell microarrays.
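The two statistics compared above can be illustrated with a minimal sketch. It assumes the coefficient of variation is computed as the sample standard deviation divided by the mean, and that the assay window is expressed as the ratio of the mean negative-control signal to the mean positive-control signal; the exact definitions used in [48] may differ. The function names and the replicate readings are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the CV in percent: sample standard deviation / mean * 100."""
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero; CV is undefined")
    return stdev(values) / abs(m) * 100.0


def assay_window(negative_controls, positive_controls):
    """Ratio of mean negative-control to mean positive-control signal.

    This ratio is one common convention, assumed here for illustration.
    """
    return mean(negative_controls) / mean(positive_controls)


# Hypothetical replicate readings in arbitrary intensity units.
negative = [100.0, 110.0, 90.0, 105.0, 95.0]   # e.g. scrambled siRNA
positive = [30.0, 35.0, 25.0, 32.0, 28.0]      # e.g. siRNA against the marker

cv_negative = coefficient_of_variation(negative)
window = assay_window(negative, positive)
print(f"CV of negative controls: {cv_negative:.1f}%")
print(f"Assay window (ratio): {window:.2f}")
```

A smaller window combined with a larger CV means the control populations overlap more, which is why replicates and controls matter more in RNAi screens than in small molecule screens.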

## **High Density Live Cell Microarrays**

High density spot microarrays carrying cDNAs, beads, oligonucleotides, proteins, tissue samples, antibodies and chemical compounds have been used by several groups in both research and diagnostic settings (see, for example, Schena *et al.* [49]). Microarrays allow thousands of individual experiments to be performed on multiple samples under well-controlled, identical conditions (as opposed to the limited number associated with 96, 384, and 1536 well formats). The uniformity of the platform conditions enhances the reproducibility and statistical reliability of individual observations.

There are many techniques described in the literature for producing arrays with the characteristics desired for the materials under study [50]. The first application of microarrays for introducing genetic material into cells was reported by Ziauddin and Sabatini in 2001 [51]. One hundred ninety-two plasmids expressing different cDNAs were combined with gelatin for cell adhesion and arrayed as 120-150 µm spots on glass slides. When cells were mixed with transfection reagent and seeded on top of the slides, they adhered as clusters in the area of the printed spots and showed good uptake of plasmid material and expression of plasmid-encoded proteins in the cells localized on the printed spots. This work demonstrated the throughput and flexibility of microarray formats for phenotypic cell-based assays, expanding the platform from its traditional use in DNA/cDNA hybridization experiments [52].

Despite the advantages mentioned above, microarray technology has its caveats. Although the reagents, including the siRNA libraries, and microarray printers of all kinds are available commercially, the optimization of all processes associated with library management, array fabrication, and data acquisition and analysis needs to be planned thoroughly and requires additional investment. Much effort is needed to tune each step of the screening process before it becomes efficient and reliable. This is probably one of the strongest reasons why only a few groups have established cell-based array fabrication and reported successful results in this format using cDNA [53, 54] or siRNA [55-59, and our group 23, 30].

## **DEVELOPMENT OF SIRNA MICROARRAY TECHNOLOGY FOR CELLULAR TRANSFECTION**

Because the validation of candidate genes takes longer than the screening (similar to screening of small molecules) it is essential for the screening process to be rapid and to provide reliable data for analysis.

### **The Work Flow Process for Experiments with siRNA Microarrays**

The overall flow of the process for microarray preparation and handling is similar to the protocols described by Erfle H. *et al.* 2007 [55] and is illustrated in Fig. **1**.

The initial step involves preparation of the siRNA library for printing by reformatting the siRNA material from the original 96-well plate format into 384-well plates. This first miniaturization step is the main reagent-saving step because only a small fraction of the total volume in each well is used to print hundreds of spot replicates or copies. The reagent mixture includes siRNA, transfection reagents and a labeled siRNA-red tracer for downstream visualization of the printed spots. We use the Genomic Solution Omnigrid 100 high-throughput pin-tool based printer to deliver the mixture onto slides in well-defined spots. In all, 3,888 spots (including positive and negative controls) are positioned on one slide. This allows the complete siRNA library targeting 18,000 human genes to be printed on five slides. Slides are dried at 23°C at 50% humidity and can be stored in desiccated conditions for at least 1 year without significant loss of gene knockdown efficiency.

Once experiments commence, slides are placed in 4-chamber culture plates and seeded with cells contained in ~5-8 ml of appropriate media (per slide). It has previously been shown that synthetic siRNA printed or plated in this fashion can be taken up by cells and produce a knock-down phenotype within 48-72 hours post plating [55, 56]. Subsequent phenotypic effects of transfection can be visualized using immuno-staining or marker expression (*e.g.*, GFP) and depend upon the half-life of the protein and its regulation in the cell line of choice. Following fixation (4% PFA) and staining, slides are placed in an MDS ImageXpress Ultra confocal plate reader for image acquisition. In our hands, spot localization is achieved by detecting a labeled siRNA (*e.g.* siGLO, Dharmacon, part of GE Healthcare) at 561 nm. Another channel (635 nm) is used in conjunction with DRAQ5 staining to identify the position of nuclei. Finally, a third channel (488 nm) is available to detect fluorophores (such as FITC or GFP) introduced by immuno-staining, cellular dyes or expression of recombinant markers (Fig. **2**).


**Figure 1: Process flow for cellular microarray-based siRNA experiments**. Our work-flow comprises the following steps: 1) the siRNAs are mixed with the transfection reagent and Red-siRNA tracer, 2) the mixture is printed on the glass slide, 3) cells are seeded on the surface of the slide for reverse transfection, 4) cells are incubated, 5) the phenotype is detected by immunoassay, 6) images of the whole slide are acquired using a plate reader (MDS ImageXpress Ultra), and 7) the images and data are analyzed using proprietary and commercial software.


**Figure 2: Images of microarray spots.** Images were taken in 3 channels: (A) 561 nm for detection of siGLO-Red tracer mixed with transfection reagents and siRNA to visualize the spot. (B) Deep red at 635 nm for detection of DRAQ5 staining of nuclear material. (C) In this example, 488 nm for detection of Alexa 488 signal associated with immuno-staining of endogenous p65 (NFkB)-RelA. (D) Integration of all 3 images. Scale bar = 100 µm.

Image acquisition with a 10x objective generally takes an hour for each slide. During this time, the entire slide is scanned, collecting a total of 408 individual images. An image montage is then created using MetaXpress software and transferred to a server where a proprietary IPK algorithm (IM) is used to analyze the images. These images can also be analyzed by MDS MetaXpress or other alternative software packages. The corresponding meta-data generated by the algorithm is then organized by a custom-designed IM database created for IPK for data analysis. Alternatively, data analysis can also be performed using commercial software such as Excel, GraphPad Prism or Matlab for more complex multi-parametric analysis like PCA and multivariate analysis.

When beginning a new project or assay, there are several critical factors that need to be thoroughly tested to assess the compatibility of microarray technology with screening goals and to optimize conditions for microarray-based screening. The predominant steps that need to be validated include testing cell line compatibility with the microarray surface, the efficiency at which siRNAs are reverse-transfected into cells, optimization of array printing, and confirmation of the reliability and reproducibility of data. Below, the discussion is focused on the specific details involved in those four steps of screening preparation.

### **Cellular Adherence and Uniformity of Culture on Microarray Surface**

The properties of the surface upon which the cells have been seeded play a role in both cell growth and the development of signaling networks. Glass, which is usually selected for microscopic experiments due to its excellent optical properties, combines a hydrophobic surface with a very high stiffness (10,000 times harder than any mammalian tissue), making it unsuitable for the direct seeding and culturing of mammalian cells. However, several approaches have been adopted to facilitate cell culture on glass surfaces. These include coating surfaces with defined synthetic molecules free of any animal products or with extracellular matrix (ECM) proteins. Epoxide, Gamma Amino Propyl Silane (GAPS) and, more recently, MAS, MAS-GP, and aminopropylsilane (APS) are synthetic coatings that are particularly valuable under the harsh conditions employed during *in situ* hybridization experiments but have been tested for cell culture as well [for more information, check the websites of Corning; BD; IBIDI; Matsunami Glass Ind.].

Materials that include collagen I, collagen IV, fibronectin, gelatin, laminin, matrigel matrix, poly-D-Lysine (PDL), and Poly-L-Lysine (PLL) mimic the *in vivo* cellular environment and can be readily applied in typical laboratories to any plastic or glass surface. However, for optimal consistency and quality control, it is best to purchase pre-coated cell culture ware from vendors who can ensure optimal surface quality and coating homogeneity.

We have successfully used PDL, PLL, collagen and MAS coatings to promote cell seeding for reverse transfection experiments on glass slides.

With regard to the slides employed in screening, we typically use traditional tissue culture slides (26x76 mm, 0.9-1.2 mm thick) and N1 glass cover slips (24x60 mm, 0.16-0.19 mm thick). The thickness of the cover slips allows positioning on the reader in a direct orientation. In contrast, the 0.9-1.2 mm thick slides need to be mounted with cover glass and inverted for scanning by the ImageXpress Ultra. However, working with the thin cover slips is complicated by their fragility, so in cases where this platform is employed, we use a custom-designed slide holder for printing the arrays and a slide chamber to perform immunostaining and image acquisition. Regular 1 mm thick microscope slides pretreated with different coating reagents do not require any special holders and are readily available from a variety of vendors.

As the printed spots containing the siRNAs are arrayed across the complete surface of the slide, it is critical that cells uniformly cover the entire surface. Any gaps or empty spaces will result in an underrepresentation of gene replicates and potentially misleading results. Assessment of the quality of the cell layer can be done by visual inspection (Fig. **3**) or by image analysis of selected picture-frames (discussed later, Fig. **14**).

After cells are confirmed to be compatible with culture conditions on the slide, the next step is to test how well the cells can be transfected. Because of the usual variation in cellular susceptibility to transfection, this test is often performed in wells prior to testing the cells cultured on the slide. However, the efficiency of transfection with the reagent mixture used for siRNA printing also depends very much on the slide coating. If reagents applied to the spot do not adhere to the slide coating, the material may easily diffuse from spot to spot across the slide.

**Figure 3: Visual assessment of uniformity for the cell monolayer on arrays**. (**A**) The cells were seeded on glass slides in 4 well culture dishes and cultured for 48 h. Immuno-staining of endogenous p65 was used as a marker for cells (green color). Quality assessment of the cellular monolayer can be done by visual inspection. (**B**) Uniform cell layer (green color) on commercially available PLL-coated slide and (**C**) distorted cell layer due to the poor quality of coating.

### **Reverse Transfection of Cells on a Microarray**

Transfection is the process by which nucleic acids (NAs) are introduced into cells. There is a wide range of methods by which transfection can be achieved; the most popular for RNAi screening involves treating cells with siRNAs encapsulated in cationic lipids [57, 58].

Lipid-mediated delivery of NAs can be achieved by one of two protocols: forward-transfection (FT) and reverse-transfection (RT) (Fig. **4**). For microarray-based RNAi screening, reverse transfection, where siRNA-lipid formulations spotted on slides at high densities are then overlaid with cells, is the method of choice [59]. Regardless of whether RT takes place in a well or microarray format, critical parameters include testing of the amenability of the cell line to reverse transfection, identifying preferred lipid transfection reagents and the optimal siRNA-to-lipid ratio. At the same time, transfection on the slides has its own unique set of complications, including the immobilization of nano-liter amounts of reagents in a tightly defined area and averting migration of siRNAs from one area (*i.e.*, spot) to another during submerging of arrays in cell culture media.

**Figure 4: Schematic illustration of forward and reverse transfection.** (i) Forward transfection and (ii) reverse transfection

## **Forward and Reverse Transfection in Plate Wells**

To select the best method for our experiments, we usually first compare forward and reverse transfection in the wells of 96- or 384-well plates. Fig. **5** (**A** and **B**) depicts such an experiment using four different transfection reagents, DharmaFECT 1-4 (DF1-4; Dharmacon, part of GE Healthcare), for transfection of siRNA against the p65 protein into HeLa cells. Several wells were assessed in parallel, using QPCR to measure the decrease in the amount of mRNA and immunostaining to determine how the decrease in mRNA expression affects protein detection within the cells. Our results for p65 and other genes showed that the decrease in mRNA is detectable at the level of the protein at 48-72 hours of incubation. Interestingly, both forward and reverse transfection give comparable results in terms of mRNA and protein level reduction, about 60%-80% (Fig. **5A** and **B**).

For each microarray-based screening project, we first test in wells the efficiency of reverse transfection of siRNA targeting cellular genes like p65, as well as "control" genes related to pathways or cellular mechanisms involved in the assay. This test demonstrates that the reverse transfection efficiency is acceptable in the selected cell line, and shows the percentage knockdown of protein expression that can be achieved for already known control genes. In well-based tests, we generally use three concentrations of siRNA ranging from 1-20 nM in 100 µl of the final media dispensed in wells of 96-well plates. To achieve this, volumes of siRNA and transfection reagent are mixed and incubated for 20 min prior to dilution in culture media (see Table **2**). If these results are acceptable, yielding >50% knockdown of the signal (*e.g.*, transcript levels) with no significant cytotoxic effect determined by the cell/nuclei count during image analysis, and observed in 2-3 independent experiments, then tests with microarrays can proceed.
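The acceptance rule described above can be sketched as a simple check. This is a hypothetical illustration, not the authors' software: the function names, the dictionary keys, and the viability cutoff of 0.8 (nuclei count relative to control) are assumptions introduced here, since the text states only ">50% knockdown", "no significant cytotoxic effect", and "2-3 independent experiments".

```python
def percent_knockdown(target_signal, control_signal):
    """Percent reduction of signal relative to the non-targeting control."""
    if control_signal <= 0:
        raise ValueError("control signal must be positive")
    return (1.0 - target_signal / control_signal) * 100.0


def ready_for_microarray(experiments, min_knockdown=50.0,
                         min_viability=0.8, min_experiments=2):
    """Return True when every independent experiment passes both criteria.

    Each experiment is a dict with keys 'target_signal', 'control_signal',
    'target_nuclei' and 'control_nuclei'. min_viability is an assumed cutoff
    standing in for "no significant cytotoxic effect".
    """
    if len(experiments) < min_experiments:
        return False
    for exp in experiments:
        knockdown = percent_knockdown(exp["target_signal"], exp["control_signal"])
        viability = exp["target_nuclei"] / exp["control_nuclei"]
        if knockdown <= min_knockdown or viability < min_viability:
            return False
    return True


# Hypothetical readings from two independent well-based experiments.
experiments = [
    {"target_signal": 30.0, "control_signal": 100.0,
     "target_nuclei": 950, "control_nuclei": 1000},
    {"target_signal": 40.0, "control_signal": 100.0,
     "target_nuclei": 900, "control_nuclei": 1000},
]
ready = ready_for_microarray(experiments)
print("Proceed to microarray tests:", ready)
```

Requiring every experiment to pass, rather than a majority, is a conservative choice made for this sketch.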

**Figure 5: (A) Forward transfection and (B) Reverse transfection of HeLa cells with p65 siRNA.** (i) QPCR for p65 mRNA in transfected cells, with parallel wells containing cells treated for immuno-staining and imaging; (ii) quantification of the immunostaining signal analyzed using IM algorithm, normalized and plotted as % of cells expressing p65. DF 1-4 are DharmaFECT 1-4.

### **MICROARRAY PRINTING**

Before discussing the details of assay optimization for transfection on microarrays, we will briefly describe the array printing itself.

### **Approaches to Printing**

An ideal printing system should be able to create uniform, dense arrays of nano-quantities of probes, while maintaining integrity and purity of reagents by preventing contamination. Methods for printing or spotting nano- and pico-liter quantities of different biological materials on solid substrate can be roughly divided into two major categories: contact and non-contact printing. Contact printing includes pin-printing, micro-stamping and nano-tip printing. Contact printing instruments have solid or split pins that come in direct contact with the surface of the slide. Even though this technique has not progressed much over the earlier printers devised by Brown and colleagues, it has proved reliable over many years for oligonucleotide microarray production for a variety of experiments including work with siRNA molecules for transfection into live cells, as shown in Table **1**. The printers with pin-tools are manufactured by Bio-Rad Labs; Genetix; Invatis, and other companies.

The non-contact printing approach includes photochemistry-based printing, electro- or electrochemistry printing, laser writing and droplet-dispensing Inkjet technology. Inkjet technology has become popular during recent years with the development of piezo-based systems that allow the precise, quantifiable and reproducible dispensing of small volumes containing fragile biological samples such as protein molecules [60]. Such printers are produced by companies such as Abbis, Arrayjet, Biofluidix, Gesim, Perkin Elmer, Scienion, and Shimadzu.

We use the Genomic Solution Omnigrid 100 high-throughput pin-tool based printer, designed to simultaneously hold 100 1" x 3" microscope slides. Our system employs stealth solid pins (SMP9, Telechem International, USA) to create spots having a nominal 250 µm diameter separated by a distance of 150-200 µm (Fig. **6**). Using images of red spots, we determined the size of the spots arrayed on PLL-coated slides to be around 215±10 µm.

Many factors can influence the size and quality of the spot, including:





Due to the fluctuations of those factors during printing of the siRNA transfection material, spots can be missed (although on average less than 1% of the spots are missing), can have irregular shapes, or can contain less material than desired. The use of the siGLO tracer helps to assess the presence and quality of each printed spot.
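One way such a tracer-based check could be automated is sketched below. This is an assumption-laden illustration rather than the authors' procedure: the function name, the idea of comparing each spot's tracer intensity to a multiple of the slide background, and the factor of 2 are all hypothetical choices.

```python
def find_missing_spots(tracer_intensities, background, min_ratio=2.0):
    """Flag spots whose tracer signal is below min_ratio * background.

    Returns the indices of the flagged spots and the fraction of the array
    they represent. min_ratio is an assumed cutoff, to be tuned per scanner.
    """
    if not tracer_intensities:
        raise ValueError("no spots supplied")
    threshold = background * min_ratio
    missing = [i for i, value in enumerate(tracer_intensities) if value < threshold]
    return missing, len(missing) / len(tracer_intensities)


# Hypothetical red-channel intensities for four printed spots.
missing, fraction = find_missing_spots([500.0, 480.0, 90.0, 510.0], background=100.0)
print("Missing spot indices:", missing)
print(f"Fraction missing: {fraction:.2%}")
```

A slide whose missing fraction is far above the roughly 1% reported in the text would be a candidate for reprinting.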

Another challenge associated with high-density spot arrays is the relatively small number of cells associated with each spot area [67]. Reducing the size of the spots can become a critical issue because fewer cells attach to the smaller area. One study reported 50±3 and 150±8 PC-3 cells in spot sizes of 200 µm and 400 µm, respectively, using a technique that allowed cells to adhere only to the spot area [56]. In work that utilized HEK293 cells, another study reported that 30-80 cells were counted on a spot of 120-150 µm [51], while in a separate study, 100-400 cells were associated with a much larger spot of 600 µm [53].
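To compare these reports on a common scale, cell counts can be converted to densities. The sketch below assumes circular spots and uses the PC-3 counts quoted above for 200 µm and 400 µm spots [56]; the function name is illustrative.

```python
import math


def cells_per_mm2(cell_count, spot_diameter_um):
    """Cell density on a circular spot, in cells per square millimetre."""
    radius_mm = spot_diameter_um / 2.0 / 1000.0
    return cell_count / (math.pi * radius_mm ** 2)


# PC-3 counts reported for 200 um and 400 um spots [56].
density_200 = cells_per_mm2(50, 200)
density_400 = cells_per_mm2(150, 400)
print(f"200 um spot: {density_200:.0f} cells/mm^2")
print(f"400 um spot: {density_400:.0f} cells/mm^2")
```

Because spot area grows with the square of the diameter, halving the diameter leaves roughly a quarter of the area available for cells, which is why small spots quickly run into low cell numbers.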


**Figure 6: Dimensions of transfection mixture spots printed on a microarray.** The image of a microarray was taken on ImageXpress Ultra at 10x magnification in the Red channel (to detect Red-siGLO); the dimensions are summarized in the drawing. Fluorescently labeled siRNAs were printed on a PLL-coated slide. The size of spots and their variability was measured by MetaXpress software; average diameter of spots ~215 ± 10 µm (error is standard deviation, calculated in Graphpad Prism, n=3).

**Figure 7: Number of cells adhered on the spot.** (**A**) Composite image of PLL-coated microarrays with different cell lines seeded at 10<sup>6</sup> cells/slide and cultured for 48h before fixation and immunostaining of p65 and nuclei staining. Green: p65 antibody-Alexa 488 labeling, Red: siGLO-siRNA tracer spots, Blue: DRAQ5 staining of nuclei. Scale bar is 100 µm. (**B**) Quantification of cell number per spot based on a nuclei count by MetaMorph algorithm. Average number for 3 spots is plotted and error bar represents standard deviation.

Cell number will also vary depending on confluence of cell cultures. We counted the number of cells per spot for several cell lines typically used in the laboratory, and found that average number was about 300 cells/spot (after plating about 10<sup>6</sup> cells/slide) for A375, HeLa and U2OS (Fig. **7**). EKVX and Huh7 showed fewer cells on the spot, with HEK293 being the most aggressively growing and reaching as many as 600 cells per spot.

The most important issue associated with microarrays is the efficiency of transfection on the spot, which varies with the type of cell line. Some cell lines such as PC-9, TC-7, some primary cells or even immortalized cell lines known to have reasonable efficiencies in well-based assays were transfected at unexpectedly low efficiency on microarrays. In some cases transfection could be optimized by switching the transfection reagent but we also found that different coatings increase transfection efficiency for some cell lines. For instance A375 shows better transfection on MAS-coated slides with a slightly increased concentration of sucrose (100 mM) and 0.09% gelatin compared with the PLL coating with 25 mM sucrose and 0.06% gelatin.

## **Optimization of Reverse Transfection on Microarrays**


**Table 2:** Comparison of transfection mixture components for reverse transfection in wells or on arrays

A large variety of lipid-based transfection reagents can be used in conjunction with test/control siRNAs (*e.g.*, targeting p65) to create test slides for determining the best reagents for any cell line of interest. We typically print slides comprised of control siRNA formulated with Effectene (Qiagen), MetafectenePro (Biontex), Lipofectamine 2000 (Invitrogen), Lipofectamine RNAiMax (Invitrogen) and DharmaFECTs (Dharmacon, part of GE Healthcare) on a single test microarray. In general, these reagents show little toxicity under conditions that provide highly efficient transfection. In our analysis, which regularly employs p65-targeting siRNA and uses the average p65 signal intensity to assess transfection efficiency, this panel array (with few exceptions) confirms the optimal reagents that provide acceptable levels of transfection (50%).

As the most expensive components of microarray-based screening, the transfection reagent and siRNA need to be titrated to select the minimal concentrations required for efficient knockdown. Fig. **8** shows typical titration optimization results for HeLa cells using Lipofectamine (**8A**) and siRNA anti-p65 (Fig. **8B**). The 2.5-3.0 µM of siRNA and 2.5-3 µl of Lipofectamine (Table **2**) were selected as optimal for this type of experimental conditions.
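The selection rule behind such a titration (the lowest dose that still gives efficient knockdown) can be written as a small helper. The sketch below is illustrative only: the function name and the 50% cutoff are assumptions, the siRNA concentrations are those listed for Fig. 8B, and the knockdown percentages are invented placeholder values, not measured data.

```python
def minimal_effective_dose(titration, min_knockdown=50.0):
    """Return the smallest dose whose knockdown meets the cutoff, else None.

    titration maps dose -> percent knockdown relative to control.
    """
    effective = [dose for dose, knockdown in titration.items()
                 if knockdown >= min_knockdown]
    return min(effective) if effective else None


# Doses (uM) follow the Fig. 8B series; knockdown values are placeholders.
sirna_titration = {0.0: 0.0, 0.13: 10.0, 1.25: 35.0, 2.5: 65.0, 12.5: 70.0}
chosen_dose = minimal_effective_dose(sirna_titration)
print("Lowest siRNA concentration meeting the cutoff (uM):", chosen_dose)
```

In practice the toxicity readout (nuclei count per spot) would be checked at the chosen dose as well, as panel (ii) of Fig. 8 does.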

**Figure 8: Titration of (A) Lipofectamine and (B) anti-p65 siRNA for reverse transfection on microarrays.** Composite images of cells on microarrays. Green represents p65 antibody labeling. Red identifies siGLO-siRNA tracer spots. Blue represents DRAQ5 staining of nuclei. (A) Lipofectamine was tested with different volumes of 0.5-3 μl using protocol described in Table **1**. (B) anti-p65 siRNA was titered at concentrations of 0, 0.13, 1.25, 2.5 and 12.5 μM. For both A and B: (i) Overlay of images from 3 channels to show the spots, 9 (A) and 4 (B) replicates (spots) for each condition; (ii) the overlay of images from two channels to show presence of the blue nuclei (cells) in area of spots to assess the absence (or presence) of possible toxicity; (iii) quantification of the average intensity signal for the cells on the spots (n=9 (A) and n=4 (B) replicates).


It is worth noting the differences in siRNA concentrations required for knockdown in the two (well-based and microarray-based) formats (Table **2**). For reverse transfection of anti-p65 siRNA, we found that a 10-20 nM concentration of siRNA in wells gave a similar knockdown effect to 2.5-3 μM of siRNA in the mixture printed on the slide. Clearly, given the minuscule amount of mixture used for printing, the total amount of siRNA stock is significantly smaller than that used in well-based experiments. That said, the amounts and kinetics of siRNA delivery in microarray platforms are difficult to quantify. In addition to smaller amounts of material, another advantage of the microarray-based approach is that unused portions of the mixture can be stored at -20°C and reused weeks later without loss of efficiency in transfection experiments (data not shown).

The timeline for siRNA-mediated gene knockdown on microarrays was found to be very similar to well-based reverse-transfection. On average, an incubation period of 48h-72h induces the best knockdown results (Fig. **9**).

**Figure 9: Time window for incubations on microarray.** HeLa cells were plated on siRNA microarrays (anti-p65 siRNA and scrambled siRNA printed) and cultured for 24-96 hours. The average signal for p65 immuno-staining in the cells on the spots is analyzed and presented as % of intensity compared to scrambled siRNA negative control. The error bar is SD, N=4 spots.

Protocols for microarray printing have mentioned two other essential components of the transfection mixture: gelatin and sucrose [51, 55, 56, 67, 69, and 70]. Fjeldbo *et al.* 2008 [69] reported that gelatin helped the transfection mixture to transfer from the surface to cells, and that sucrose stabilizes the transfection complexes and acts as a critical preservative for storage of the printed slides. This work also reported testing of different concentrations of sucrose and gelatin for transfection of GFP-expressing cDNA using the X-tremeGENE (Roche) transfection reagent on Ultra-GAPS coated slides (Corning). As we were developing a system to use Lipofectamine or RNAiMax, in conjunction with PLL- or MAS-coated slides, we tested the effect of different concentrations of sucrose and gelatin under our conditions (Fig. **10**).

**Figure 10: Combined titration of gelatin and sucrose.** HeLa cells (1x10<sup>6</sup> cells per slide) were plated on microarray with p65 siRNA and cultured for 48 hours. Cells were fixed and stained with anti-p65 antibodies and DRAQ5. Green: p65 antibody-Alexa 488 labeling, Red: siGLO-siRNA tracer spots, Blue: DRAQ5 staining of nuclei. (i) composite image from 3 channels, 4 replicates (spots) for each condition; (ii) composite image from two channels to show presence of the blue nuclei (cells) in area of spots to assess for possible toxicity; (iii) zoom-in view of the section with optimal conditions of 37.5 mM sucrose in combination with 0.06% gelatin.

Fig. **10** shows the combined titration of sucrose and gelatin for transfection of anti-Rel-A (p65) siRNA into HeLa cells on PLL-coated slides. As was expected, both gelatin and sucrose proved essential for successful transfection of siRNA. However, the highest concentrations of sucrose (75 mM) and gelatin (0.125%) caused a significant change in the viscosity of the mixture, resulting in diffusion of reagents, washing-off of the clear borders of the spots, and reduction in adherence of the reagent mixture to the surface during the process of air-drying of the printed slides. The combination of 37.5 mM sucrose and 0.06% gelatin was optimal for our experimental conditions. These results are quite similar to previous observations and indicate the importance of correctly calibrating the range of concentrations for gelatin as well as sucrose [68, 71].

Once the transfection reagent mixture is optimized and ready for printing, the management of the printing process for the large genomic siRNA library (18,000 to 20,000 samples) is the next critical step.

## **Printing the Human Genome siRNA Library on Microarrays**

To print the library for easy-to-transfect cell lines, the siRNA transfection solution is prepared as described in Table **2**. The Dharmacon "ON-TARGETplus" siRNA library in "SMARTpool" format consists of 4 different sequences pooled together for each gene. This helps to reduce the number of off-target hits and increases the chance of achieving efficient knockdown for the average gene in the genome. 2 μL of a 20 μM solution of each siRNA from the library is transferred into each well of a 384-well plate. 6 μL of 20 μM RED siGLO is then combined with 2 μL of 0.3 M sucrose dissolved in OptiMEM media, and 2 μL of RNAse-free water. 3 μL of RNAiMax is added to each well, mixed thoroughly, and then incubated for 20 min at RT before addition of 5 μL of 0.2% (w/v) gelatin.
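The final concentrations in a printing well follow from simple dilution arithmetic. The sketch below assumes that all of the volumes listed above end up in the same well and that the RNAiMax volume is 3 μL; the protocol as written does not state the final volume explicitly, so the 20 μL total is derived under that reading, and the variable and function names are illustrative.

```python
def final_concentration(stock_concentration, stock_volume_ul, total_volume_ul):
    """Concentration after dilution: C_final = C_stock * V_stock / V_total."""
    return stock_concentration * stock_volume_ul / total_volume_ul


# Volumes per well in microlitres, read directly from the protocol text.
volumes_ul = {
    "siRNA (20 uM)": 2.0,
    "RED siGLO (20 uM)": 6.0,
    "sucrose (0.3 M)": 2.0,
    "RNAse-free water": 2.0,
    "RNAiMax": 3.0,
    "gelatin (0.2% w/v)": 5.0,
}
total_ul = sum(volumes_ul.values())

sirna_um = final_concentration(20.0, volumes_ul["siRNA (20 uM)"], total_ul)
sucrose_mm = final_concentration(300.0, volumes_ul["sucrose (0.3 M)"], total_ul)
gelatin_pct = final_concentration(0.2, volumes_ul["gelatin (0.2% w/v)"], total_ul)

print(f"Total volume: {total_ul:.0f} uL")
print(f"siRNA: {sirna_um:.2f} uM, sucrose: {sucrose_mm:.1f} mM, "
      f"gelatin: {gelatin_pct:.3f}% (w/v)")
```

If the volumes in a given protocol are combined differently, only the `volumes_ul` dictionary needs to change; the dilution formula stays the same.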

The optimized spot dimensions (Fig. **6**) allow us to position 3,888 spots (108 columns by 36 rows) on a single glass slide. Since cross-contamination could occur if the transfection reagent, or cells containing the transfection reagent, cross from one spot to another, multiple tests were done to ensure that this does not occur. One of those tests is presented in Fig. **11**, where spots with scrambled (non-targeting) siRNA were printed around the spot with anti-p65 siRNA. When the image of fixed and stained HeLa cells was assessed for p65 expression, signal reduction was observed only in spots containing anti-p65 siRNA (signal reduced to ~20-40% of the control signal). Other regions of the slide (*e.g.*, other spots and areas between spots) exhibited normal p65 expression, suggesting that spot-to-spot cross-contamination was not occurring (Fig. **11B**).

With such a high density of spots, the library of 18,000 pooled siRNAs can be distributed on as few as five slides. We also added several controls to each slide. These include 52 equally distributed negative control spots containing scrambled siRNA, along with an equivalent number of positive control spots containing pools of siRNA targeting p65 (Fig. **15**). We also printed anti-GFP siRNA as a positive control for cases where a recombinant cell line containing a GFP expression marker is used. Printing was done at 22-25 °C and 55-65% humidity in an enclosed, custom-built HEPA-filtered clean chamber.
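The slide count follows directly from the grid dimensions and the number of control spots. A short worked calculation, using only figures quoted in the text (the variable names are illustrative, and optional anti-GFP control spots are not counted):

```python
import math

COLUMNS, ROWS = 108, 36   # spot grid per slide
NEG_CONTROLS = 52         # scrambled siRNA spots per slide
POS_CONTROLS = 52         # anti-p65 siRNA spots per slide
LIBRARY_SIZE = 18_000     # pooled siRNAs in the library

spots_per_slide = COLUMNS * ROWS
sample_spots = spots_per_slide - NEG_CONTROLS - POS_CONTROLS
slides_needed = math.ceil(LIBRARY_SIZE / sample_spots)
spare_spots = slides_needed * sample_spots - LIBRARY_SIZE

print(spots_per_slide, sample_spots, slides_needed, spare_spots)
```

With 3,784 sample positions per slide, five slides hold the whole library and leave several hundred positions free for additional project-specific controls.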

One of the advantages of an array-based format of an siRNA library is that it eliminates the problems associated with multiple freezing and thawing of plates with aqueous siRNA solutions. After preparation of the siRNA plates in the desired format (we use 384-well plates), plates are used for direct printing of as many as 200-400 copies of the siRNA genome library. The remaining reagents on a plate can be sealed and stored at -20 °C (or -80 °C) for several days and re-used for QC arrays or additional test runs. The arrays can be stored under desiccated conditions for up to 1 year.

**Figure 11: Test for cross contamination between spots.** (**A**) To test for cross-spot contamination, anti-p65 siRNAs (center) were flanked by spots containing scrambled siRNA. After a 48h exposure, HeLa cells were fixed and stained with anti-p65 antibody and DRAQ5. Color indications: Green: anti-p65 antibody labeling, Red: siGLO-siRNA tracer spots, Blue: DRAQ5 staining of nuclei. (i) Spot visualization in red channel only; (ii) green channel for p65 signal; (iii) composite image from 3 channels. Scale bar = 100µm. (**B**) The presence of cross contamination was assessed by scanning the slide, measuring intensity of p65 signal, and presenting it as the percentage of p65 expression across the images. There is a significant decrease in p65 signal in spots that contain p65 siRNA, down to 20-40% of total signal, while spots containing scrambled (non-targeting) siRNAs exhibited normal levels of p65 expression.

### *Microarray Image Analysis and Library Annotation*

There are various considerations for analyzing visually addressable microarrays designed for cultured cellular monolayers. The main difference between the high-density spot microarrays and the same experiments done in wells is that the spots first have to be localized precisely enough so that image analysis can be applied to cells positioned exactly above the spot. Therefore, the first and the most important tasks in image analysis are spot fitting and spot localization.

Typically, the process of spot localization is based on prior knowledge of the spot "map", which is usually the number of spots along the x and y axes of the slide. More accurate information, such as the size of the spots and the actual distance between them (scale information), removes further degrees of freedom during spot detection, and algorithms have been proposed for resolving this issue [72-75].

A predecessor of the microarray technology described here, cDNA hybridization microarrays, used visually addressable, precise spot localization in conjunction with many approaches for spot detection and grid-fitting [72, 76]. Our computational group modified those approaches to address critical problems associated with spot detection and sub-spot fitting for our microarrays [60]. Like all the algorithms used in the analysis, spot fitting should be fast. It should also be precise, because the arrayed spots do not always have the expected shape and signal intensity, or may be missing entirely. Precision is critical for statistical analysis of data obtained from the cells growing on the spots. For our group, spot detection is achieved by visualizing a fluorescently labeled tracer which has been added to each sample. Specifically, we use a double-stranded siRNA tracer, covalently labeled with the DY547 fluorophore (siGLO-Red with excitation maximum of 557 nm and emission maximum of 570 nm, Dharmacon, part of GE Healthcare). Images acquired at 560 nm (red channel) are then used for spot identification and correction of grid position. Fig. **12** shows an example of how addition of the fluorescent tracer allows both the localization of spots and the identification of missing spots. A total of 408 individual images (34 columns x 12 rows) of cells on a microarray are usually taken with an MDS ImageXpress Ultra at 10x magnification by scanning the entire surface of a single slide. Those images are stored in a database and stitched into a single montage by our Image Mining (IM) program for spot fitting and image analysis. The resulting array containing the genomic library comprises 108 columns by 36 rows of spots. This type of accurate spot fitting allows not only image analysis of relevant cells but also correct annotation of spots with information regarding targeted genes.
The annotation file generated by the Genomic Solution Omnigrid printer then transfers the information from the plate-based format into the array format after the spots have been found (Fig. **12B**).
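The idea of fitting a known spot "map" to the detected tracer signal, and flagging positions where no tracer was found, can be sketched as follows. This is not the authors' Image Mining algorithm: it assumes a perfectly regular, unrotated grid, and the function names, the pitch of 500 units, and the tolerance are hypothetical values chosen only for the toy example.

```python
from math import hypot


def expected_centers(origin, pitch, columns, rows):
    """Ideal spot centres (x, y), keyed by (column, row), for a regular grid."""
    x0, y0 = origin
    return {
        (c, r): (x0 + c * pitch, y0 + r * pitch)
        for r in range(rows)
        for c in range(columns)
    }


def fit_spots(detected, origin, pitch, columns, rows, tolerance):
    """Assign detected tracer centroids to grid positions; list missing spots."""
    found, missing = {}, []
    for key, (ex, ey) in expected_centers(origin, pitch, columns, rows).items():
        # Nearest detected centroid to the expected position, if any exist.
        nearest = min(
            detected,
            key=lambda p: hypot(p[0] - ex, p[1] - ey),
            default=None,
        )
        if nearest is not None and hypot(nearest[0] - ex, nearest[1] - ey) <= tolerance:
            found[key] = nearest
        else:
            missing.append(key)
    return found, missing


# Toy 3 x 2 grid; the spot at column 1, row 1 was not printed.
detected = [(2, -3), (498, 4), (1003, 1), (-5, 502), (1001, 497)]
found, missing = fit_spots(
    detected, origin=(0, 0), pitch=500, columns=3, rows=2, tolerance=50
)
print("missing spots:", missing)
```

Because each grid key is a (column, row) pair, the same key can be used to look up the gene in an annotation table, which is how a fitted grid ties image data back to the library.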

**Figure 12: Spot grid fitting and annotation.** (**A**) The images of spots on a microarray acquired on red channel only. The grid is fitted to a portion of the array and the missing spots are identified as solid green squares. The grid is associated with an annotation file that assigns information about a gene to the corresponding spot. (In this case, a spot associated with the gene ARF-1 is identified). (**B**) The Genomic Solution Omnigrid printer converts the plate-based format of the genomic library into an array-based format.

Another channel is used to detect the cell nuclei and select only those localized in the "spot" area. The area of image defined by the so called "nuclei mask" defines the cell spaces associated with those nuclei and is used for further expansion and analysis of the signals in other channels (Fig. **13**). Because of the limited number of channels we can use for microarrays imaging, we use DRAQ5 (excitation 598 nm and emission 646 nm) for nuclei staining. The third channel, as mentioned before, is typically used to image signals from immunostaining or detection of fluorescent proteins expressed in cells. This signal is collected in the area designed around the defined nuclei or masked area. The specific design of those algorithms depends on the type of cellular response interrogated in this particular assay. The intensity of the signal collected within the masked area in the green channel generates data with which to calibrate the siRNA effect on cells in a particular spot. The other two channels (blue and green) collect data for multi-parametric analysis of complex phenotypic events.
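The selection of nuclei that fall inside a spot, followed by a per-spot readout of the signal collected around those nuclei, can be illustrated with a small sketch. It assumes a circular spot area and per-nucleus signal values that have already been measured; the field names (`x`, `y`, `signal`) and function names are hypothetical and do not reflect the authors' software.

```python
from math import hypot
from statistics import mean


def nuclei_in_spot(nuclei, spot_center, spot_radius):
    """Keep nuclei whose centroid lies inside the (assumed circular) spot area."""
    cx, cy = spot_center
    return [n for n in nuclei if hypot(n["x"] - cx, n["y"] - cy) <= spot_radius]


def spot_readout(nuclei, spot_center, spot_radius):
    """Cell count and mean signal for the cells selected on one spot."""
    selected = nuclei_in_spot(nuclei, spot_center, spot_radius)
    if not selected:
        return {"cell_count": 0, "mean_signal": None}
    return {
        "cell_count": len(selected),
        "mean_signal": mean(n["signal"] for n in selected),
    }


nuclei = [
    {"x": 10, "y": 12, "signal": 40.0},    # inside the spot
    {"x": -20, "y": 5, "signal": 60.0},    # inside the spot
    {"x": 300, "y": 10, "signal": 200.0},  # outside the spot, ignored
]
readout = spot_readout(nuclei, spot_center=(0, 0), spot_radius=100)
print(readout)
```

Returning the cell count alongside the mean signal matters in practice, since spots with very few cells give unreliable averages and may need to be excluded.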

**Figure 13: Image analysis of cells on a microarray.** (i) Composite image of 3 channels; (ii) Individual channels: red - siRNA tracer signal, blue - DRAQ5 nuclear signal, green - GFP cell/pathway signal; (iii) Segmentation, or selection and localization of spot area (red), nuclei (blue) and GFP signal associated with nuclei (green). Multicolored nuclei represent masks for cells located in the defined spot area and thus selected for analysis. The population of cells transfected by siRNA but not expressing the marker protein can be clearly seen by the absence of green color but presence of blue nuclei in locations within the spot area.

Typically, each new assay requires specific design of "plug-in" software for analyzing images. In laboratories with no access to support groups designing custom algorithms, commercially available software such as Metamorph (MDS) and Columbus (PE), and open-source software such as CellProfiler [77], DetectTiff [78], and ImageJ [79], can be used for automated identification and quantification of cellular phenotypes and for image analysis.

### *Validation of Microarray–Based Assays*

Assay validation is a critical step in any type of high-throughput screening. In the case of microarray-based format screening it is essential due to the variability associated with transient effects of siRNA, the small number of cells analyzed in each spot, and the significant number of samples (3,888 in our case) positioned on each array.

We have developed a multi-step validation process to confirm the feasibility of assays for screening. One step is the test for uniformity of the cell layer on the microarray and uniformity of the immunostaining signal across the slide. We addressed the uniformity of cell distribution on the microarrays earlier in Fig. **3**. Here we describe the image-based approach to analyzing the quality of the cell layer. This can be done by analyzing images taken by a plate reader along a cross-section of the slide. Fig. **14** illustrates the general design of the test. Data from the reader are collected for several consecutive independent experiments (or slides) and analyzed for statistical deviations. This analysis can help to optimize the cell seeding density that provides a homogeneous monolayer of cells within 48 hours. The analysis of images also confirms optimal conditions for immunostaining, including volumes of antibody solutions that provide homogeneous staining without edge effects and other artifacts. This test is also used to confirm uniformity of viral or parasitic infection throughout the whole slide when the assay involves infectious agents. As an example, Fig. **14** shows the titration of GFP-expressing Chikungunya virus detected in cells after 48h infection. Such optimizations were used in functional genome screening to identify host factors involved in HIV infection of human HeLa cells [23], *Trypanosoma cruzi* parasitic infection of U2OS cells [30], and H1N1 influenza virus infection.
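One simple way to put a number on the uniformity of a signal along the slide is the coefficient of variation (CV) of the field readouts. The chapter does not state which statistic was used, so the sketch below is an assumption; the intensity values and the 10% threshold are hypothetical and serve only to show the calculation.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """CV (%) = population standard deviation / mean * 100."""
    return 100.0 * pstdev(values) / mean(values)


# Hypothetical mean staining intensity for six fields along one slide.
field_intensity = [980, 1010, 1005, 995, 1020, 990]
cv = coefficient_of_variation(field_intensity)
is_uniform = cv < 10.0  # illustrative threshold, not taken from the chapter

print(f"CV = {cv:.2f}%, uniform: {is_uniform}")
```

The same calculation can be repeated per slide and per experiment to compare seeding densities or staining volumes.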

It is helpful for assay validation to be able to test positive and negative controls positioned throughout the whole slide. The difference between positive and negative controls should be statistically significant and reproducible in 3 independent experiments.
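The separation between the two control populations can be checked with a two-sample t statistic. The sketch below uses Welch's variant, which does not assume equal variances, rather than the classical Student test named in the Fig. 15 caption; the intensity values are hypothetical, and the function name `welch_t` is illustrative.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical per-spot p65 intensities (arbitrary units).
scrambled = [100, 96, 104, 98, 102, 101, 99, 100]
anti_p65 = [30, 35, 28, 33, 31, 29, 34, 32]

t_stat, dof = welch_t(scrambled, anti_p65)
print(f"t = {t_stat:.1f}, df = {dof:.1f}")
```

Converting the statistic to a P value requires the t distribution, for which a statistics package is normally used; repeating the comparison on each of the three independent experiments then addresses the reproducibility requirement.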

**Figure 14: Assessment of uniformity for cell monolayer on arrays.** The endogenous p65 was used as a marker for the immunostaining procedure (green color). The quality of the cellular monolayer was assessed by quantification of the images of the cells on selected fields of the slide. (i) The slide was scanned by a plate reader; rows of images of the immunostaining signal (ii) and the red spot tracer (iii) could be used for quantification of the signal throughout the slide. Only one set of fields (i) was selected along the length of the slide to generate the data, represented by the bar graph (iv).

**Figure 15: Positive and negative controls on slides printed with portions of the library.** (**A**) Scrambled negative controls (yellow circles) and p65 siRNA (red circles) are printed throughout the slides to assess variability on different portions of the microarray. Shown is a composite image of the 3 channels for the mosaic composition of all the images for the whole scanned slide. The zoomed-in views of several individual images illustrate the control spots in different locations on the slide. (**B**) Scatter plot of average intensity for each control spot. The statistically significant difference with P<0.001 in Student t-test for the average values is indicated by asterisks (n = 52 spots each for p65 and for scrambled control spots).

We arrayed at least 52 scrambled siRNA spots (negative controls) throughout the length of each slide containing the genomic library (Fig. **15**). This figure also illustrates the statistical significance of the difference between the two control populations of data points on the array (Fig. **15B**). As mentioned previously, we use p65 as a positive control, and this standard test allows us to validate steps for complex assays without wasting precious reagents or viruses.

Unfortunately, it is impossible to prepare specific positive controls on genomic arrays for all possible types of assays. For projects that use GFP as a reporter fluorescent protein, we added 52 spots of anti-GFP siRNA. For other projects, we create separate new control arrays with specific positive controls among large populations of scrambled siRNAs. These arrays are added to the five genome arrays for quality controls during screening. If there are no known genes that could work as positive controls in a given project, we focus our attention on the quality of signal and noise for the scrambled non-targeting siRNA.

### *Data Analysis*

Having a large population of scrambled or non-targeting siRNA samples provides a powerful tool for working with the data from the screening campaign. The data derived from the scrambled siRNA population are used in computational methods for normalization of the data across different conditions, such as between slides, between different batches of slides, or even between different experiments. The genome library set has 5 slides covering 18,000 individual genes and a total of 300 scrambled siRNA samples. Usually, to ensure statistical significance of the observed events, we conduct screening with six or more copies of the genome library, which results in a total of more than 1800 negative controls. This is a large enough population to differentiate a specific change of the signal from noise and to reveal any artifacts. The assumption is that the negative controls give results similar to those of any genes that are not involved in the modification of the observed cellular phenotype. For example, the knockdown of p65 did not play a role in the early steps of viral infection and replication, as measured by immunostaining for NA protein after cells were infected with influenza (H1N1). Fig. **16** shows the distribution of results of the influenza assay for scrambled spots compared to anti-p65 siRNA spots, 300 copies per genome. The observations labeled in black show the difference (distance) between the multivariate phenotypic readouts of p65 and a control (scrambled) reference population in blue. We see that there is no significant difference between the two sets, which means that the p65 knockdown is indistinguishable from the scrambled results.
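One common way to use the scrambled population for normalization is to express every spot relative to the scrambled controls printed on the same slide. The sketch below divides by the per-slide scrambled median; this particular choice, the data layout, and the gene labels are assumptions made for illustration, not the normalization method used by the authors.

```python
from statistics import median


def normalize_to_scrambled(slides):
    """Express each spot as a fraction of its own slide's scrambled median."""
    normalized = {}
    for slide_id, slide in slides.items():
        baseline = median(slide["scrambled"])
        normalized[slide_id] = {
            gene: value / baseline for gene, value in slide["samples"].items()
        }
    return normalized


# Two hypothetical replicate slides with different overall signal levels.
slides = {
    "replicate_1": {
        "scrambled": [100, 110, 90],
        "samples": {"GENE_A": 50, "GENE_B": 100},
    },
    "replicate_2": {
        "scrambled": [200, 190, 210],
        "samples": {"GENE_A": 100, "GENE_B": 200},
    },
}
norm = normalize_to_scrambled(slides)
print(norm)
```

After normalization the two replicates agree even though the raw signal on the second slide is twice as bright, which is the kind of slide-to-slide difference the scrambled population is meant to absorb.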

**Figure 16: Scatter plot of negative control data.** Metadata collected from cell image analysis for scrambled siRNA spots (Blue) and p65 siRNA spots (Black).

For typical functional genome screening, we calculate the distance of each gene from the scrambled population using Matlab applications. Fig. **17A** shows the total population of 17,753 (in this case) P values for each gene from a human genome screen using an influenza assay. Scrambled samples represent a compact population with P=0.999. The simple cutoff of P<0.01 was used to select 209 hits that differed significantly from noise in the sample population (Fig. **17B**).
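The principle of scoring each gene against the scrambled population and applying a P value cutoff can be shown with a deliberately simplified example. The authors' analysis is multivariate and was done in Matlab; the sketch below instead assumes a single readout and a normal fit to the scrambled controls, and all values and gene labels are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def p_values(samples, scrambled):
    """Two-sided P value of each readout under a normal fit to the controls."""
    null = NormalDist(mean(scrambled), stdev(scrambled))
    result = {}
    for gene, value in samples.items():
        z = (value - null.mean) / null.stdev
        result[gene] = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return result


def select_hits(pvals, cutoff=0.01):
    """Genes whose P value falls below the cutoff, in alphabetical order."""
    return sorted(g for g, p in pvals.items() if p < cutoff)


scrambled = [100, 96, 104, 98, 102, 101, 99, 100]
samples = {"GENE_A": 60.0, "GENE_B": 101.0, "GENE_C": 140.0}

pvals = p_values(samples, scrambled)
hits = select_hits(pvals)
print(hits)
```

A two-sided test is used here so that both decreases and increases of the signal relative to the scrambled controls can be selected.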

To ensure reproducibility of the obtained results, we run at least five, but usually more than six, replicates of the library, as was mentioned previously [5, 11]. To make hit selection more robust, we also used an additional score to reflect how many copies of the same gene give data with a P value below the cutoff. If we have a total of six replicates, then at least four copies of the gene should be present in the analysis and satisfy the cut-off criteria for the gene to be selected as a hit. In addition, the advantage of spot microarrays is that we know whether the spot was present on the array (see Fig. **12A**) and can determine how many copies were missing. The total number of hits is usually variable and depends on the quality of screening. In some cases, such as the HIV screen by Genovesio *et al.* 2011 [23], screening requires more complex data analysis.
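The replicate rule described above, at least four copies of a gene passing the cutoff out of six, with missing spots tracked separately, can be written as a short sketch. The function names and example P values are hypothetical; `None` is used here to mark a replicate whose spot was found to be missing during grid fitting.

```python
def replicate_score(pvalues, cutoff=0.01):
    """Count replicates present and replicates below the cutoff."""
    present = [p for p in pvalues if p is not None]  # None = missing spot
    passing = sum(1 for p in present if p < cutoff)
    return {"present": len(present), "passing": passing}


def is_hit(pvalues, cutoff=0.01, min_passing=4):
    """A gene is a hit when enough of its replicates pass the cutoff."""
    return replicate_score(pvalues, cutoff)["passing"] >= min_passing


screen = {
    "GENE_A": [0.001, 0.004, None, 0.002, 0.008, 0.03],  # 4 of 5 present pass
    "GENE_B": [0.001, 0.2, 0.5, None, None, 0.003],      # only 2 pass
}
hits = [gene for gene, pvals in screen.items() if is_hit(pvals)]
print(hits)
```

Keeping the count of present replicates alongside the passing count makes it possible to distinguish a gene that failed the rule from one that simply had too many missing spots to be judged.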

Methods of computational data analysis are now commonly used for normalization of the screening data and selecting hits. A variety of software programs can be used for statistical analysis of microarray data, such as the open source Bioconductor [80], CellHTS (which is designed for analysis of cell-based siRNA screens), and more general applications such as SpotFire (IDBS). For more multi-variant and multi-parametric analysis, the open source R language is a good solution for statistical computing and graphics (http://www.r-project.org/). There are several good reviews of statistical approaches developed for the analysis of RNAi screening data [81-83]. Having the support of a statistician experienced in screening routines could be essential in case of complicated assays.

**Figure 17: Scatter plot of screening data.** (**A**) P-values for all samples in screen (blue). (**B**) Selected hits (blue), P<0.01 for the scrambled controls (red).

## **APPLICATION OF siRNA MICROARRAYS IN SCREENING OF INFECTIOUS DISEASE MODELS**

We have used siRNA microarray technology to identify genes involved in specific disease progression and host gene-pathogen interactions for viral infections such as HIV [23], influenza, Chikungunya, and Dengue fever, as well as parasitic diseases like Chagas [30]. Here we describe details of the screening process with the siRNA microarray technology to identify genes involved in HIV infection and replication [23]. Study of these critical host factors is important for improving the fundamental understanding of HIV-host interactions and for developing novel anti-HIV therapeutics. The work-flow process for microarray-based whole genome siRNA library screening includes the following steps: establishing and validating the cellular model; assay development in the microtiter plate format; assay adaptation to the microarray format; screening; hit selection and confirmation; and deconvolution of selected hits.

### **Assay Development and Genome Wide Screening for HIV Infectious Model**

First we established the most robust cellular model for the phenotypic assay. Among the available human cell lines and types of assays capable of detecting HIV, we chose an assay which provides the highest level of infection in cells that can be efficiently transfected on the microarrays. HeLa CD4+ LTR-GFP cells, developed in Peter Somers laboratory at IPK, have these characteristics and recapitulate early steps in HIV infection, which enables TAT-driven transactivation of stably integrated GFP.

Based on previous studies of HIV host factors [19, 20, 22], we chose the CD4 receptor, which is essential for viral entry into the cell, as a positive control for viral infection. We first optimized conditions for HIV infection and siRNA transfection in the plate format to achieve the best assay window, which ranged from >90% HIV infectivity to <20% HIV expression when cells were transfected with CD4 siRNA (Fig. **18**).

Using these optimized conditions, we validated the assay on a small-scale siRNA microarray containing printed spots of scrambled and CD4 siRNAs. After 24 h of reverse transfection, cells on the array were infected with HIV-1 at a multiplicity of infection (MOI) of 0.14 for an additional 48 h. As observed in the well-based assay, HIV infection was significantly repressed in cells transfected with a CD4 siRNA (Fig. **19**). The uniform distribution of infected cells throughout the entire slide confirmed that the HIV assay adapts well to the siRNA microarray system.

**Figure 18: The cellular HIV assay in plate format.** (**A**) HeLa CD4+ LTR-GFP cells were seeded in 96-well plates and incubated overnight. The next day, cells were transfected with siRNA (scrambled control and anti-CD4) for 24 hours, then infected with HIV-IIIB. After an additional 48 hour incubation, cells were fixed and stained for nuclei detection with DRAQ5. The GFP expression associated with virus decreased in the presence of anti-CD4 siRNA. Scale bar is 50 µm. (**B**) GFP signal in all cells was analyzed using an image analysis algorithm and data were normalized as % of infected cells; \*\*\*\* significance of difference was assessed using Student t-test (n=6, p<0.001).

**Figure 19: HIV infection assay was adapted to siRNA microarrays.** HeLa LTR-GFP cells were seeded on a test microarray containing scrambled and anti-CD4 siRNA and incubated for 24 hours. Cells were infected with HIV-1 (MOI 0.14) and after 48 hours treated in the same way as in the well-based assay. (Green: HIV infection, Red: siRNA for spot detection, Blue: nuclei). Scale bar = 100µm.

Five replicates of the siGENOME whole genome library (20,000 SMARTpool siRNAs; Dharmacon, part of GE Healthcare), each printed on seven slides, were screened in one experiment with a total of 35 slides. If this level of screening were performed in 384-well plates, it would require that 360 plates be screened in one experiment. Once siRNA microarray images were acquired and each spot localized and identified, HIV infection was independently analyzed in cells growing on each siRNA spot. Fifteen imaging parameters were used in data analysis, including measurements such as cell number, area of relative cell distribution, size of individual cells, syncytium formation, and intensity of GFP signal. The multi-parametric data analysis was performed such that it allowed selection of hits with parameters similar to those collected using the CD4 siRNA (control). This work demonstrated the advantage of using different approaches for hit selection, based on the position of the hits relative to controls. In this case, we selected 56 genes with Wilks' Lambda = 0.66 (a value under 0.9 indicates that this distribution is different from the distribution of scrambled controls [23]).

### **Confirmation and Deconvolution of Screening Results**

The 56 hits selected from the primary screening were confirmed using plate-based assays that, in addition to detecting GFP expression activated by HIV, also directly detected the presence of p24 HIV proteins [84]. To verify the significance of the screening results, it was essential to demonstrate that depletion of the candidate genes in LTR-GFP HeLa cells blocks HIV replication in a similar way to the replication block induced by CD4 knockdown. To test this, we selected RNASEH2A, MED28 and JMY from the hit list and used CD4 as a control. Cells were transfected with siRNA for 24 hours and infected with HIV for 48 hours. Viral replication was measured by the appearance of GFP expression as well as p24 protein expression (Fig. **20 A**, **B**, **C**). Depletion of RNASEH2A, MED28 and JMY in HeLa cells strongly inhibited HIV infection, as detected by quantification of GFP expression and immunostaining of p24 protein expression (Fig. **20A**). Both biomarkers gave comparable results in cells transfected with RNASEH2A siRNA, which showed an ~80% reduction of HIV infection. Cells transfected with siRNA against MED28 and JMY had somewhat higher levels of HIV infection, at ~50% (Fig. **20B**, **C**).

To demonstrate that the observed effect is mediated by the reduction of the targeted genes and not due to off-target effects, the level of mRNA of the selected genes was measured using RT-PCR (Fig. **21**). The validity of the effect produced by RNASEH2A was confirmed in Jurkat (T lymphocyte) cells, a more physiologically relevant model of HIV infection. This model was also used for deconvolution of pooled siRNA. Together, these results confirmed that hits selected in microarray-based screening campaigns can identify host genes, such as RNASEH2A, MED28 and JMY, that are essential for HIV infection in HeLa cells.

**Figure 20: RNASEH2A, MED28 and JMY are required for HIV infection.** (**A**) 3 channel images of cells transfected with siRNA against RNASEH2A, JMY, MED28 and CD4 (positive control). Green: LTR-GFP expression, Red: p24 expression, Blue: nuclei stained with DRAQ5. Composite image shows the overlay of 3 images. Scale bar is 40µm. (**B**) Quantification of p24 expression; (**C**) Quantification of LTR-GFP expression. These values are expressed as intensity/pixel/cell and plotted as a bar graph, n=3, standard deviation shown as error bar.

**Figure 21: RT-PCR of RNASEH2A, MED28 and JMY in control samples and depleted cells.** RT-PCR results show the level of RNASEH2A, MED28 and JMY mRNA reduction by individual siRNA. Cells were transfected with each indicated siRNA for 72 hours, and cDNA was prepared and tested by RT-PCR.

### **SUMMARY**

In this chapter, we discussed the technical details associated with the screening of phenotypic effects produced by the transfection of human siRNA whole genome libraries into cells plated as a monolayer on microarrays. This technology is complex, but enables a variety of applications that are otherwise unfeasible or prohibitively labor-intensive when using commercially available plate-based approaches. The technology becomes even more useful for phenotypic target-free applications in drug discovery that require drug target deconvolution, as well as for the search for new targets and pathways involved in drug responses. By virtue of their high throughput and economy of scale, future applications of such microarrays will support complex parallel experiments with multiple samples, such as primary cells from patients or multiple cell lines, and the testing of chemical compounds in search of new mechanisms of action and cellular pathways by screening for siRNA-drug interactions.

### **ACKNOWLEDGEMENTS**

We would like to acknowledge the contributions of former and current IPK members who were involved in development of the approaches and techniques discussed in this chapter, Dr. Augusto Genovesio and Dr. Neil Emans. We would like to address a special acknowledgement to Dr. Ulf Nehrbass, Dr. Michel Luizzi, and Dr. Natalie DeWitt for their critical review and suggestions. This work was supported by the Bio & Medical Technology Development Program of the National Research Foundation (NRF) and the National Research Foundation of Korea (NRF) grant funded by the Korea government (MEST) (No.2012-00011 and No. 2012M3A9B6055466), Gyeonggi-do and KISTI.

### **CONFLICT OF INTEREST**

The authors confirm that this chapter's contents have no conflict of interest.

### **REFERENCES**



© 2014 The Author(s). Published by Bentham Science Publisher. This is an open access chapter published under CC BY 4.0 https://creativecommons.org/licenses/by/4.0/legalcode

# **CHAPTER 9**

# **A Three-Dimensional Spheroid Cell Culture Model for Robust High-Throughput RNA Interference Screens**

**Geoffrey Bartholomeusz<sup>1,\*</sup> and Arvind Rao<sup>2</sup>**

*<sup>1</sup> Department of Experimental Therapeutics, Division of Cancer Medicine, The University of Texas M.D. Anderson Cancer Center, Houston, TX 77054, USA and <sup>2</sup> Department of Bioinformatics and Computational Biology, Division of Quantitative Sciences, The University of Texas M.D. Anderson Cancer Center, Houston, TX 77054, USA*

**Abstract:** The sequencing of the human genome and the discovery that synthetic siRNAs of 19 to 22 nucleotides could silence genes led to the development of siRNA libraries capable of targeting all known genes within a genome. The emergence of high-throughput genetic screens represents a powerful unbiased approach for identifying new targets and may fundamentally change biological research by increasing the speed with which disease mechanisms and potential drug targets can be identified. High-throughput RNAi screens are typically performed using two-dimensional monolayer cell culture models due to ease, convenience, and high cell viability. Although conventional two-dimensional cell culture systems have improved our understanding of basic cell biology, the morphology and physiology of cells grown as monolayers in dish cultures differ substantially from the morphology and physiology of cells grown *in vivo* within a complex three-dimensional microenvironment. There is now a growing realization that 3D cell culture models are superior in biological studies. Three-dimensional cell culture models can boost the physiological relevance of cell-based assays and advance the quantitative modeling of biological systems, from cells to organisms. These models exhibit a high degree of structural complexity and homeostasis, analogous to the complexity and homeostasis of tissues and organs. In this chapter we discuss 3D cell culture models and describe a three-dimensional spheroid cell culture system and the standard operating procedure for its successful use in high-throughput RNAi screens.

**Keywords:** 3D cell culture, high content image analysis, high throughput screening, matrix-free nanoculture plates, RNAi, spheroid cell culture.

### **INTRODUCTION**

RNA interference (RNAi) emerged on the global stage in 1998 when Fire and colleagues demonstrated the ability of double-stranded RNA to reduce gene expression in *C. elegans* [1]. Soon after, it became apparent that the RNAi pathway is conserved in eukaryotic cells and can be exploited to specifically target and cleave the mRNA using a complementary, short double-stranded RNA molecule [2-4].

**<sup>\*</sup>Corresponding author Geoffrey Bartholomeusz:** Department of Experimental Therapeutics, Division of Cancer Medicine, The University of Texas M.D.Anderson Cancer Center, Houston, TX.77054, USA; Tel: 713-792-4158; Fax: 713-745-1710; E-mail: gbarthol@mdanderson.org

In the endogenous RNAi pathway, long precursor RNA sequences with hairpin structures are processed by a ribonuclease known as Dicer into small interfering RNAs (siRNAs). The RNA-induced silencing complex (RISC) binds to the siRNA, and then components of the RISC separate the two siRNA strands. The antisense siRNA strand functions as the template allowing the RISC to bind to and cleave a complementary mRNA that is then rapidly degraded.

The realization that siRNA can be produced synthetically and delivered into mammalian cells, thus circumventing Dicer mechanics [5], opened the way for use of RNAi in a wide range of biological studies. Theoretically, with appropriately designed siRNA, researchers can silence any gene, giving RNAi a broader therapeutic potential than that of small-molecule drugs. Now that the entire human genome has been sequenced, we can develop siRNA to target and silence most human genes, and we can develop loss-of-function genetic screens using distinct siRNA libraries. These genetic screens represent a powerful unbiased approach for identifying new targets and may accelerate biological research aimed at understanding disease mechanisms and discovering potential drug targets.

RNAi screens will be invaluable for unraveling signaling pathways and will permit us to delineate networks and map pathways of gene products within cells much more rapidly and in greater detail than was previously possible. RNAi screens will also allow us to identify functional genetic differences between cell lines of diverse origin, revealing how common signaling pathways are differentially regulated in different normal and cancerous tissues. To realize the full potential of RNAi technology, however, one must adhere to a stringent set of guidelines to ensure that the data generated from these screens are reliable.

## **2D MONOLAYER CELL CULTURE MODELS FOR RNAi SCREENS**

Most high-throughput screens of RNAi libraries have been performed using two-dimensional monolayer cell culture models. The preferred platforms in these screens have been microwell plates, tissue culture flasks, and Petri dishes. The benefits of a two-dimensional cell culture model are ease, convenience, and high cell viability.

Conventional two-dimensional cell culture systems have improved our understanding of basic cell biology. However, the morphology and physiology of cells grown as monolayers in dish cultures differ substantially from the morphology and physiology of cells grown *in vivo* within a complex three-dimensional microenvironment [6]. Therefore, growing cells as monolayers compromises fundamental investigations not only in cell and developmental biology but also in clinical research [7-11].

## **THREE-DIMENSIONAL SPHEROID CELL CULTURE MODELS FOR RNAi SCREENS**

The intrinsic limitations of two-dimensional monolayer cell culture models have prompted the development of three-dimensional cell culture models that more closely recapitulate the complex three-dimensional microenvironment associated with normal tissues and tumors. We are now beginning to realize that three-dimensional cell culture models can improve the physiological relevance of cell-based assays and advance the quantitative modeling of biological systems, from cells to organisms, because three-dimensional models exhibit a high degree of structural complexity and homeostasis, analogous to the complexity and homeostasis of tissues and organs [7, 12, 13]. Today, three-dimensional cell culture models are gaining in popularity and are being used in a broad range of cell biology studies, including studies of tumor biology [8].

Several platforms for three-dimensional cell culture models have been developed in an attempt to simulate the heterogeneous tumor microenvironment [14-17]. These platforms include scaffolds, hydrogels, hollow fiber cultures, hanging drop models, collagen gel models, microfluidic channel-based models, and multicellular spheroid cell culture models. Of these models, the multicellular spheroid model devised by Sutherland *et al*. [18] is one of the best characterized and most widely used.

Multicellular spheroid models take advantage of the natural tendency of cells to aggregate into microscale spherical clusters. These models closely simulate the pathophysiological milieu of solid tumors [16, 19-21] and are providing new insights into tumor biology as well as differentiation, tissue organization, and homeostasis [16]. Multicellular spheroids are of an intermediate complexity between *in vivo* tumors and two-dimensional monolayer cell cultures and close to that of experimental tumors in mice and natural tumors in humans. Multicellular spheroid systems exhibit oxygen, pH, and nutrient gradients resulting in a necrotic area in the center of the spheroid similar to that observed in tumors (Fig. **1**). Cells located in the periphery of the spheroid are actively cycling, whereas cells located in the center of the spheroid become quiescent and eventually die *via* apoptosis or necrosis to form a secondary necrotic core [22]. Because of their concentric arrangement of heterogeneous cell populations and their growth pattern, spheroids are similar to solid tumors *in vivo* at their initial, avascular stages; not-yet-vascularized micrometastatic foci; or intercapillary tumor microregions with high proliferative activity close to the capillaries, quiescent cells at an intermediate distance, and necrotic areas farther from the capillaries [20, 21, 23]. Multicellular spheroid models are therefore considered valid *in vitro* tumor models, highlighting their potential in cancer research and treatment [19-21, 24-26]. Due to their structure and composition, spheroids are a useful model to study novel hypoxic markers, targeted therapy, multicellular-mediated drug resistance, and heavy ion irradiation. They have applications in biotechnology and are straightforward to employ in high-throughput screens [27, 28].

**Figure 1:** Similarity between (A) tumor and (B) multicellular spheroid.

There is great interest in the use of reliable and robust three-dimensional spheroid-based assays for high-throughput RNAi screening for target identification and cell signaling studies in cancer and other diseases. Kunz-Schughart and colleagues [11] have defined four important factors supporting the use of multicellular spheroids for high-throughput screens. First, multicellular spheroids re-establish morphological, functional, and mass transport features of the corresponding tissue *in vivo*. In particular, tumor cells in multicellular spheroids restore a differentiation pattern similar to that observed *in vivo* and maintain this pattern for several weeks in culture. Second, as noted above, multicellular spheroids mimic many characteristics of avascular tumor nodules, micrometastases, or intervascular regions of large solid tumors with regard to both tumor growth kinetics and the pathophysiological micromilieu. Third, the well-defined symmetry of multicellular spheroids allows comparison of structure to function by taking advantage of the microenvironmental gradients that affect spheroid morphology and are also spatially correlated with changes in cellular physiology. Fourth, multicellular spheroids are amenable to coculture with different cell types, including both tumor cells and normal cells such as stromal fibroblasts, endothelial cells, or cells of the hematopoietic/immune system. In addition, cells within three-dimensional multicellular spheroids deposit extensive amounts of their own extracellular matrix [29] and form a complex three-dimensional network of cell-to-matrix and cell-to-cell interactions similar to that of tumors *in vivo* [30].

As is the case for RNAi screens performed utilizing two-dimensional monolayer cell culture models, achieving maximum benefits from an RNAi screen utilizing three-dimensional spheroid cell culture models depends on selection and optimization of an appropriate phenotypic assay and a relevant biological context. In designing RNAi screens utilizing three-dimensional spheroid cell culture models, it is important to consider the differences in growth characteristics between cells growing in a two-dimensional monolayer cell culture and cells growing in a three-dimensional spheroid cell culture. There are 3 main phases in the development of spheroids [31]. During the first phase, the cells migrate and aggregate into microspheroids. The links that bind these cells are weak, and the microspheroids are therefore fragile. In the second phase, the number of cells increases, and the general spheroid structure is reinforced. Intercellular links such as desmosomes or gap junctions appear and make the spheroid more compact. During this phase, 2 cell layers gradually appear. The external layer is proliferative because it is in contact with the nutritive medium. The internal layer is essentially nonproliferative because the farther cells are from the surface of the spheroid, the less oxygen and nutrients are available. Simultaneously, pH, osmolarity, and production of catabolites evolve as in tumors. These deficits induce a central necrosis that occurs when the spheroid diameter reaches 200 to 500 µm. The third phase in the development of a spheroid is characterized by a decline in growth rate until it reaches a plateau.

Another important consideration in designing RNAi screens utilizing three-dimensional spheroid cell culture models is the need to achieve a high efficiency of transfection. To achieve transfection efficiency in a three-dimensional spheroid cell culture model similar to that routinely achieved in two-dimensional monolayer cell culture models, it is imperative to select a matrix-free platform to eliminate any factors that might interfere with delivery of the siRNA/lipid complex into the cells.

### **MD ANDERSON 3D SPHEROID CELL CULTURE MODEL**

Here, we describe a three-dimensional spheroid cell culture model that has been developed at The University of Texas MD Anderson Cancer Center. The platform used is a matrix-free nanoculture plate (SCIVAX Corp., Japan) comprising nanostructured scaffolds (Fig. **2**). When cells are cultured on nanoculture plates, they migrate and aggregate spontaneously into multicellular spheroids (Fig. **3B**). An important feature of this model is that once cells are dispensed into the plates, they remain as a monolayer for 24 to 48 hours before migrating and aggregating. This feature permits successful transfection of cells by the reverse transfection protocols routinely used for high-throughput RNAi screens utilizing two-dimensional monolayer cell culture models. One major difference between our three-dimensional spheroid model and two-dimensional monolayer models is that more cells are required in our three-dimensional model because a large number of cells are required to promote the formation of multicellular spheroids. When we perform screens using 96-well plates, we typically dispense 5000 to 10,000 cells per well for two-dimensional monolayer models and 20,000 to 30,000 cells per well for our three-dimensional spheroid model. Thus, reagents have to be adjusted accordingly.

**Figure 2**: Matrix-free surface nanoculture NCP plate.

**Figure 3:** (A) HeLa cells cultured in 2D monolayer, (B) HeLa cells cultured in NCP plate.

### **Assay Optimization**

The most important element of a successful RNAi screen is a robust and specific assay. Development of the assay is usually the most time-consuming aspect, requiring repetition and careful attention to detail while parameters are fine-tuned and optimized. Failure to optimize all key parameters will result in a substandard screen.

One important parameter for a successful high-throughput RNAi screen is the selection of positive and negative controls to achieve high signal with the positive controls and low noise with the negative controls. The positive and negative controls yield important information about the reproducibility, robustness, and ease of the assay. In addition to selecting assay-specific controls, we have developed a standard approach to selecting the most appropriate positive and negative controls to identify the optimum transfection conditions with minimal lipid toxicity. Using a read-out of cell viability, we test a panel of positive and negative controls against a panel of eight lipid reagents (Fig. **4**). Our positive controls (siPLK1, siCOP2, siKIF11) (Dharmacon, part of GE Healthcare) lead to cell death when the corresponding targeted genes are silenced. To control for sequence-independent off-target effects, we test a panel of nine non-targeting siRNAs. These siRNAs, used as our negative controls, are designed to avoid targeting any transcript expressed in a chosen sample and theoretically should have no effect on cell viability, which is determined quantitatively by ATP production, as measured by the activity of ATP-dependent luciferase [32, 33], or qualitatively by determining the morphology of the spheroids by imaging (Fig. **5**).

An inherent problem with siRNA technology is the observation that, in cases of low sequence complementarity between the siRNA and a gene, the siRNA binds to the 3'-untranslated region of this secondary gene, bringing about a microRNA-like effect. This phenomenon, referred to as the "off target effect", also occurs even with non-targeting siRNA, as has been observed in our laboratory in the case of SCR1 silencing the EGFR gene and SCR2 silencing the luciferase gene. The inability to eliminate "off target" effects by the use of non-targeting negative controls results in negative control-dependent reduction of viability in some cases. Therefore, for each new assay or cell combination, several candidate molecules should be tested to empirically evaluate whether the chosen negative control accurately reflects the undisturbed cellular baseline. We use a panel of nine non-targeting siRNAs (siSCR1, siSCR2, siSCR3, siSCR4, siSCR5, siOTP1, siOTP2, siOTP3, and siOTP4) (Dharmacon, part of GE Healthcare) and select the negative controls that have negligible effects on viability. To optimize transfection and minimize variation it is beneficial to use statistical measures of data 'quality' such as the Z-factor. The Z-factor is a calculation that considers the dynamic range of the assay as well as data variability based on the measured values for internal positive and negative controls to produce a 'quality' score. Assays with a good Z-factor (0.5 to 1) provide a strong basis for a siRNA screen. The determination of the Z-factor will be based on differential toxicity (cell viability) induced by known toxic siRNAs (positive controls; viability less than 30%) and non-toxic non-targeting siRNA (negative controls; viability greater than 80%).
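The Z-factor calculation described above can be sketched in a few lines. The chapter does not state the formula, so the sketch below assumes the commonly used control-based definition (often written Z'), namely one minus three times the summed standard deviations of the two controls divided by the absolute difference of their means. The function names and all viability readings are hypothetical and serve only to illustrate the arithmetic.

```python
from statistics import mean, stdev

def z_factor(positive, negative):
    """Control-based Z-factor: 1 - 3*(sd_pos + sd_neg) / |mean_pos - mean_neg|."""
    if len(positive) < 2 or len(negative) < 2:
        raise ValueError("need at least two wells per control group")
    separation = abs(mean(positive) - mean(negative))
    if separation == 0:
        raise ValueError("control means are identical; Z-factor is undefined")
    return 1.0 - 3.0 * (stdev(positive) + stdev(negative)) / separation

def controls_in_range(positive, negative, max_pos=30.0, min_neg=80.0):
    """Check the viability windows quoted in the text (<30% and >80%)."""
    return mean(positive) < max_pos and mean(negative) > min_neg

# Hypothetical percent-viability readings, for illustration only.
toxic_wells = [12.0, 15.0, 10.0, 14.0, 11.0, 13.0]         # e.g. a toxic control
nontargeting_wells = [92.0, 95.0, 90.0, 94.0, 91.0, 93.0]  # e.g. a non-targeting control

z = z_factor(toxic_wells, nontargeting_wells)
usable = z >= 0.5 and controls_in_range(toxic_wells, nontargeting_wells)
print(f"Z-factor = {z:.2f}, usable for screening: {usable}")
```

In practice the same calculation would be repeated for each lipid reagent and control pair on the assay development plate, and the condition with the highest score and acceptable control viabilities carried forward.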

**Figure 4**: Assay development plate for the identification of optimum transfection conditions with minimum toxicity.

**Figure 5:** Cells transfected with nontargeting siRNA (OTP4) and targeting siRNA (siPLK1).

### **Applications of the 3D Spheroid Cell Culture Model**

Three dimensional cell culture systems expand the utility of high-throughput screening and are especially suitable for interrogating basic cellular processes common to tumors.

The simplest cell-based assay is one in which the phenotypes of a homogeneous population of cells are averaged across each well in a microtiter plate. For example, an ATP-dependent luciferase can be used to assess ATP production of the cells, which is correlated to cell viability [32, 33]. The ratio of ATP level to ADP level, although routinely used to determine viability, can also serve as an indicator of the cells' metabolic rate. The heterogeneous growth rate of cells within spheroids might have an effect on the interpretation of this assay: cells located deeper within the spheroid are quiescent and have reduced metabolic activity, and this assay might misidentify such cells as nonviable. An alternate approach is the use of more automated methods for assessment of spheroid morphology based on a microscope image analyzer.

Another cell-based assay that we routinely perform using our three dimensional spheroid cell culture model is the reporter assay. This assay is simple, robust, and reproducible. We use a cell line that stably expresses a fusion construct comprising the promoter of the gene of interest fused to luciferase. Promoter activation of the gene is determined by measuring the levels of luciferase expression. The plates are initially scanned to determine the average volume of spheroids per well. Next, luciferin is added to the medium in each well, and the plates are incubated for approximately 5 minutes to permit the luciferin to diffuse into the spheroids. The plates are then imaged to determine the intensity of luminescence per well, and this value is quantitated (Fig. **6**).

**Figure 7:** Hypoxia status of spheroid.

We have also used our three dimensional spheroid cell culture model for other assays with image-based readouts. For such assays, multiple images of cells are captured for each well, and the spheroids in each well are scored in terms of phenotypic descriptors, including degree of hypoxia, volume, and morphology. A hypoxic probe, Lox-1 (SCIVAX Inc., Japan), was used to detect the hypoxic environment of the spheroid. Lox-1 is a phosphorescence light emitting iridium complex that permeates cell membranes. Phosphorescence of Lox-1 is quenched by oxygen, but increased in response to low oxygen levels and detected as red fluorescence.

Spheroids are incubated with the probe for approximately 18 hours to permit sufficient diffusion of the probe into the spheroids. When the probe is exposed to light at its excitation wavelength, the probe emits red fluorescence. This reaction is quenched in the presence of oxygen. Spheroids with hypoxic inner cores emit intense red fluorescence, and any change in the level of oxygen in spheroids can be detected by a corresponding change in the fluorescence intensity (Fig. **7**). This assay has been successfully used in a high-throughput screen to identify proteins necessary to maintain the integrity of the spheroid architecture (personal communication).

### **Analysis of Images Generated Using the Three Dimensional Spheroid Cell Culture Model**

The data generated from the assays described above can be quantified and analyzed by methods used to analyze data generated from two dimensional monolayer RNAi screens.

**Figure 8:** Changes in spheroid morphology.

One significant difference in data analysis between high-throughput screens performed using three dimensional spheroid cell culture models and those performed using two dimensional monolayer cell culture models is spheroid morphology. In high-throughput screens using the three dimensional spheroid cell culture model, we observe a range of morphologies among the transfected cells (Fig. **8**). The methods used to analyze morphological data obtained from high-throughput primary screens using three dimensional spheroid cell culture models depend on the type of assay and the screening mode. These qualitative morphological data, which are more common in three dimensional spheroid screens, are analyzed differently from quantitative cell-based data. The analytical methods developed to analyze the morphological data are described below.

Automated analysis of the large volume of images acquired is perhaps the biggest bottleneck in the interpretation of high-throughput RNAi screens. The large number of images generated from systematic genome-wide knockdown necessitates the development of a reliable informatics infrastructure capable of handling large data volumes and facilitating integration of image data with other types of cross-modal genomic data (gene expression, methylation, copy number, protein abundance). Arguably, the cost of analysis (imaging and/or genomics analysis) is a significant if not major component of the economic costs associated with running large-scale phenotypic screens in the high-throughput RNAi setting.

**Figure 9**: Workflow for high content analysis in a 3D setting.

Image analysis in the context of three dimensional culture screens can occur in either of 2 main scenarios: (i) analysis of two dimensional images of spheroids in three dimensional culture and (ii) analysis of three dimensional images of spheroids (z-stacks). Of course, higher-dimensional scenarios can also be envisioned (*e.g.*, two dimension + time, three dimension + time, three dimension + time + spectra). However, we will focus in this discussion on the 2 main cases above. In the RNAi setting, each well represents a different genetic perturbation, so each well's image represents the phenotype resulting from that gene's knockdown. These cellular phenotypes can be measured for each cell in the image field or for the entire field (*i.e.*, the entire cell population in the well). Although field-level image processing is prevalent [34], most high content workflows pursue processing of single cells within each image field. A typical workflow for high content analysis in the three dimensional setting is shown in Fig. **9**. The steps in this workflow are described below.

*Image Preprocessing:* Image data acquired from a microscope are initially preprocessed to correct for possible inhomogeneities in illumination and contrast. The next important step, which influences almost all remaining downstream steps, is cell segmentation. Segmentation serves to segregate cellular regions from background. Cell segmentation is a field unto itself, and many sophisticated algorithms are available for 2D and 3D segmentation. However, because 3D imaging technologies have emerged only recently, there is still ample room for the development of segmentation and feature extraction algorithms in the 3D domain [35, 36].
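As a minimal illustration of these two steps, the sketch below applies a flat-field correction followed by a global intensity threshold to a tiny synthetic 2D image. The chapter does not name specific algorithms; Otsu's method is used here only as one familiar thresholding choice, and the illumination map, pixel values, and function names are all assumptions made for the example. Real screens would rely on the dedicated packages listed later in this section.

```python
def flat_field_correct(image, illumination):
    """Rescale each pixel by the relative illumination at its position."""
    n_pixels = len(illumination) * len(illumination[0])
    mean_illum = sum(sum(row) for row in illumination) / n_pixels
    return [[pix * mean_illum / illum for pix, illum in zip(img_row, ill_row)]
            for img_row, ill_row in zip(image, illumination)]

def otsu_threshold(image, levels=256):
    """Gray level that maximizes the between-class variance (Otsu's method)."""
    hist = [0] * levels
    for row in image:
        for pix in row:
            hist[min(levels - 1, max(0, round(pix)))] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_level, best_var = 0, -1.0
    n_bg, sum_bg = 0, 0
    for level in range(levels):
        n_bg += hist[level]
        sum_bg += level * hist[level]
        n_fg = total - n_bg
        if n_bg == 0 or n_fg == 0:
            continue  # one class is empty, so no split at this level
        diff = sum_bg / n_bg - (sum_all - sum_bg) / n_fg
        between_var = n_bg * n_fg * diff * diff
        if between_var > best_var:
            best_level, best_var = level, between_var
    return best_level

def segment(image, threshold):
    """Binary mask: True where a pixel is brighter than the threshold."""
    return [[pix > threshold for pix in row] for row in image]

# Synthetic example: a bright 2x2 "spheroid" on a dark background,
# imaged under uneven illumination (all values hypothetical).
true_signal = [[10, 10, 10, 10],
               [10, 200, 200, 10],
               [10, 200, 200, 10],
               [10, 10, 10, 10]]
illumination = [[0.5, 1.0, 1.5, 1.0] for _ in range(4)]
observed = [[s * i for s, i in zip(s_row, i_row)]
            for s_row, i_row in zip(true_signal, illumination)]

corrected = flat_field_correct(observed, illumination)
mask = segment(corrected, otsu_threshold(corrected))
foreground_pixels = sum(sum(row) for row in mask)
```

The mask produced here is the input to the feature extraction step: every downstream measurement inherits any error made at segmentation, which is why this step is singled out in the text.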

*Feature Extraction:* In the context of the screening scenarios listed above, some of the most common biological phenotypes require analyzing cells for a combination of morphological and texture features. Morphological features in general refer to measurements such as circularity, area, and shape, while texture features refer to measurements of gray-level intensity distribution of each cell image [37, 38]. The resulting feature-data matrix consists of genes along rows and cell phenotypic features along columns, *i.e.*, each row represents the feature measurements of a segmented blob (single cell or clumps of cells).
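To make the distinction between morphological and texture features concrete, the following sketch computes area, perimeter, and circularity from a binary mask, together with the mean and standard deviation of gray-level intensity inside the mask. It assumes the common definition of circularity, 4πA/P², and estimates the perimeter by counting exposed pixel edges, which overestimates the true boundary of round objects. The function name, the mask, and the intensity values are hypothetical.

```python
import math
from statistics import mean, pstdev

def blob_features(mask, image):
    """Morphological and simple texture features for one segmented blob."""
    rows, cols = len(mask), len(mask[0])
    area, perimeter, intensities = 0, 0, []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            area += 1
            intensities.append(image[r][c])
            # Count each pixel edge that faces background or the image border.
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                inside = 0 <= nr < rows and 0 <= nc < cols
                if not inside or not mask[nr][nc]:
                    perimeter += 1
    if area == 0:
        raise ValueError("mask contains no foreground pixels")
    return {
        "area": area,
        "perimeter": perimeter,
        "circularity": 4.0 * math.pi * area / perimeter ** 2,
        "mean_intensity": mean(intensities),
        "intensity_sd": pstdev(intensities),
    }

# Hypothetical 6x6 field containing a 4x4 square blob.
mask = [[1 <= r <= 4 and 1 <= c <= 4 for c in range(6)] for r in range(6)]
image = [[10 * (r + c) for c in range(6)] for r in range(6)]

features = blob_features(mask, image)
feature_row = [features[name] for name in
               ("area", "perimeter", "circularity", "mean_intensity", "intensity_sd")]
```

Each call yields one row of the feature-data matrix described above; stacking the rows for all blobs or wells gives the matrix passed to clustering or classification.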

*Clustering/Classification of Phenotypes:* Clustering algorithms are routinely used to group image features into clusters. Each cluster defines a group of wells (genes from the RNAi screen) whose image measurements are most similar. A visual examination of the recovered clusters may reveal phenotypes of interest [39].
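A small sketch of this idea follows, using k-means as one possible clustering algorithm; the chapter does not specify which algorithm is used. The well names, the two features (circularity and relative volume), and the deterministic initialization from the first k wells are assumptions made to keep the example short and reproducible.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means; centroids start at the first k points (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical per-well features: (circularity, relative spheroid volume).
wells = {
    "siGENE_A": (0.90, 1.00),
    "siGENE_B": (0.40, 0.35),
    "siGENE_C": (0.88, 0.95),
    "siGENE_D": (0.42, 0.30),
    "siGENE_E": (0.91, 1.05),
}

labels, centroids = kmeans(list(wells.values()), k=2)
clusters = {}
for gene, label in zip(wells, labels):
    clusters.setdefault(label, []).append(gene)
print(clusters)
```

In a real screen the features would normally be scaled to comparable ranges first, and the number of clusters would be chosen by inspection or a validity index rather than fixed in advance.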

On the other hand, classification algorithms can be used when a human expert identifies the phenotypes of interest *a priori*. These phenotypes are referred to as "labels". Such labels can be provided to a classification algorithm and used to find image features (measurements) that are associated with a phenotype of interest. Once a reliable classifier has been constructed, it can be used to classify previously unseen images into one of the desired phenotype labels [38].
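The supervised case can be illustrated with a nearest-centroid classifier, chosen here only because it is short enough to write out in full; it is not the method used by the authors. The labels "compact" and "disrupted", the feature values, and the function names are hypothetical.

```python
import math
from collections import defaultdict

def train_nearest_centroid(features, labels):
    """Average the feature vectors of each expert-assigned label."""
    grouped = defaultdict(list)
    for vector, label in zip(features, labels):
        grouped[label].append(vector)
    return {label: [sum(col) / len(vectors) for col in zip(*vectors)]
            for label, vectors in grouped.items()}

def classify(model, vector):
    """Assign the label whose centroid lies closest to the new measurement."""
    return min(model, key=lambda label: math.dist(vector, model[label]))

# Hypothetical training wells scored by an expert:
# features are (circularity, relative spheroid volume).
training_features = [(0.90, 1.00), (0.88, 0.95), (0.40, 0.35), (0.42, 0.30)]
training_labels = ["compact", "compact", "disrupted", "disrupted"]

model = train_nearest_centroid(training_features, training_labels)
prediction_a = classify(model, (0.85, 0.90))   # previously unseen well
prediction_b = classify(model, (0.35, 0.40))   # previously unseen well
print(prediction_a, prediction_b)
```

Before a classifier of any kind is applied to a full screen, its reliability would be checked on labelled images held out from training.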


*Screen Interpretation:* As mentioned above, clustering over image phenotypes identifies groups of genes whose knockdown results in similar phenotypes. Functional interpretation of these gene groups is necessary to relate these phenotypes to the biology of the underlying process. Similarly, classification algorithms can identify genes whose images correspond to a specified label. Genes identified using either the clustering or classification approach can then be mined for functional behavior *via* gene ontology [39], network construction, or pathway analysis (www.ingenuity.com, [40]).

*Tools for 3D analysis:* As mentioned earlier, there is significant research space for the development of 3D image analysis tools. There are several software packages with algorithms for 2D image analysis, and their 3D counterparts are gradually emerging. Multiple tools for 3D image analysis are now available, ranging from commercial tools like Volocity (PerkinElmer, www.perkinelmer.com), Bitplane (www.bitplane.com), InCell Developer (www.ge.com), Pipeline Pilot (www.accelrys.com), and Definiens (www.definiens.com) to open-source tools like ImageJ (http://rsb.info.nih.gov/ij/), CellClassifier (http://acc.ethz.ch/), OMERO (http://www.openmicroscopy.org/site/products/omero), CellProfiler (http://www.cellprofiler.org/), and V3D (http://vaa3d.org). There is continued interest in the development of algorithms that could simplify 3D image analysis.

## **Hit Validation**

Even though off-target effects can be reduced by carefully selecting target sequences and generating siRNA libraries in which each gene is targeted by a pool of siRNAs, the phenomenon highlights the need to validate each selected target. Validation is performed under conditions identical to those selected for the primary screen. During the validation step, the pooled siRNA targeting each selected hit is deconvoluted and each individual siRNA within the pool is tested separately. If two or more of the siRNAs within a set lead to the desired outcome, the candidate hit is deemed validated. This is followed by Western blot analysis to determine expression levels of the targeted protein and qRT-PCR to confirm gene knockdown.
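The deconvolution rule can be written as a simple tally. In the sketch below, the gene names, the pool size of four siRNAs, and the per-siRNA outcomes are hypothetical; only the "two or more" threshold comes from the text.

```python
def validate_hits(deconvolution, min_active=2):
    """Mark a gene validated when at least `min_active` individual siRNAs score."""
    return {gene: sum(bool(hit) for hit in per_sirna.values()) >= min_active
            for gene, per_sirna in deconvolution.items()}

# Hypothetical deconvolution results: True means the individual siRNA
# reproduced the phenotype seen with the pool in the primary screen.
deconvolution = {
    "GENE_A": {"siRNA_1": True, "siRNA_2": True, "siRNA_3": False, "siRNA_4": True},
    "GENE_B": {"siRNA_1": True, "siRNA_2": False, "siRNA_3": False, "siRNA_4": False},
}

validated = validate_hits(deconvolution)
print(validated)
```

Genes that pass this tally would then move on to the Western blot and qRT-PCR confirmation described above.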

The multicellular spheroids generated in nanoculture plates are heterogeneous in size, and this feature has to be considered in the analysis of morphologies. A method was recently described for generating individual, uniformly sized spheroids in hanging drops [16, 41]. We modified this hanging drop technique to validate our targets. Utilizing the modified hanging drop technique, we tested each of the deconvoluted siRNAs directed against the same transcript to validate the identified phenotype. Having two or more sequences elicit the same phenotype in each of the hanging drops is good proof of target specificity. An example of a hit validated for its ability to alter the hypoxic environment of the three dimensional spheroid is shown in Fig. **10**.

**Figure 10:** Hypoxia status of spheroid utilizing hanging drop model.

### **FUTURE DIRECTIONS**

Because of their physiological relevance, three dimensional spheroid cell culture models have the potential to become an integral research tool in cell biology. Our biggest challenge in future studies will be to devise creative experiments that can capitalize on the benefits of working in three dimensions. The three dimensional spheroid cell culture model has been critical in furthering our understanding of the microenvironmental regulation of tumor cell physiology [20, 42-45]. In addition, this model has contributed considerably to our knowledge of cellular response to a variety of treatments. Our three dimensional spheroid cell culture model developed on the nanoculture plates allows us to use high-throughput RNAi screens to identify targets regulating each of the growth phases associated with this model. In some respects, formation of spheroids is similar to the formation and growth of metastatic loci. Thus, three dimensional spheroid cell culture models have the potential to be more predictive and physiologically relevant than two dimensional monolayer models and will ultimately improve our understanding of tumor progression and improve the success rate of cancer drug candidates in preclinical studies and clinical trials.


### **ACKNOWLEDGEMENTS**

Declared None.

### **CONFLICT OF INTEREST**

The authors confirm that this chapter's contents have no conflicts of interest.

### **REFERENCES**




© 2014 The Author(s). Published by Bentham Science Publisher. This is an open access chapter published under CC BY 4.0 https://creativecommons.org/licenses/by/4.0/legalcode

# **CHAPTER 10**

# **The Use of RNAi Technology in the Development of High Performance Bioproduction Cell Lines**

**Weilin Wu<sup>1</sup>, Sabine van der Sanden<sup>2</sup>, Paula Brooks<sup>1</sup>, Jon M. Karpilow<sup>3,\*</sup>, Steven Oberste<sup>2</sup>, Ralph A. Tripp<sup>1</sup>**

*<sup>1</sup> University of Georgia, College of Veterinary Medicine, Dept. of Infectious Diseases, 111 Carlton Street, Athens, GA 30602, USA; <sup>2</sup> Centers For Disease Control and Prevention, 1600 Clifton Road, NE, Atlanta, GA, 30333, USA and <sup>3</sup> Thermo Fisher Scientific, 2650 Crescent Dr., Suite 202, Lafayette, CO 80026, USA* 

**Abstract:** Vaccines have proven to be an effective means to protect communities from a range of human and agricultural pathogens. Unfortunately, costs associated with the development and manufacturing of vaccines often prevent some of the neediest populations from receiving and distributing these essential prophylactics. Advances in molecular and synthetic biology represent potential low cost solutions for enhancing bioproduction. In the following chapter, we describe a program in which RNA Interference (RNAi) has been successfully employed to identify gene modulation events that enhance poliovirus production in vaccine manufacturing cell lines. Transition of this technology into stable production lines promises to increase overall vaccine manufacturing capabilities – thereby making these essential, life-saving therapeutics available at an affordable cost.

**Keywords:** Cell line engineering, host-pathogen interactions, polio, RNA interference, vaccine.

### **INTRODUCTION**

Since its discovery, applications of RNAi technology have primarily been in the areas of research and drug discovery. In these venues, small interfering RNAs (siRNA) delivered individually or as pools targeting discrete genes have successfully been used to down-regulate (silence) target transcripts. When combined with diligent validation, RNAi technology has enabled researchers to accurately assess the contributions that individual genes make to pathways, cellular phenotypes, and disease.

Early adopters initiated attempts to transition RNAi technology into applied fields. Primary among these pursuits was the application of RNAi to combat human disease [1-5]. As was the case in previous attempts to develop nucleic acid-based therapies (*e.g.*, antisense, ribozymes), issues surrounding stability, specificity, and the single greatest hurdle, tissue-specific delivery, quickly challenged the application of this new technology in all but the most obvious of target tissues (*e.g.*, skin and ocular) [6-8]. While these hurdles have slowed the advancement of RNAi-based therapeutics, recent tangential movement of the technology into bioprocessing promises to greatly enhance the performance of current production platforms.

**<sup>\*</sup>Corresponding author Jon Karpilow:** GE Healthcare, Lafayette, CO., Thermo Fisher Scientific, 2650 Crescent Dr., Suite 202, Lafayette, CO 80026, USA; Tel: 720-890-5142; E-mail: jon.karpilow@thermofisher.com

### **BIOPRODUCTION AND RNAi TECHNOLOGY**

The bioproduction industry uses a wide range of prokaryotic and eukaryotic organisms to produce a variety of biomolecules including therapeutic antibodies, enzymes, hormones, and vaccines. The classic biomanufacturing workflow includes both upstream and downstream processes involved in creation and selection of cell lines expressing the biomolecule of interest, expansion of clones into small or large biofermentation reactors, removal of cells and cell debris, and purification of the biomolecule of interest from other constituents present in the media and/or cells. Upstream improvements in biomanufacturing include optimization of expression constructs, clonal selection, media formulation, and cell culture. Downstream optimization frequently focuses on improving processes tied to biomolecule purification.

A relatively small collection of well-characterized eukaryotic cell lines is currently employed in the biopharmaceutical industry. At present, nearly 80% of all biotherapeutic molecule production employs Chinese Hamster Ovary (CHO) cells, with additional production platforms employing PerC6 and African Green Monkey Kidney (Vero) cells. Vaccine manufacturing utilizes a wider array of cell types including but not limited to human (MRC-5 for rubella, varicella, rabies, and diphtheria; Wi-38 for adenovirus), avian (primary chicken embryo fibroblasts for measles, mumps), primate (Vero cells for rotavirus, smallpox, influenza, and polio), and canine (MDCK cells for influenza) cell lines (see http://www.actip.org/pages/vaccinestable.html). In general, these platforms have been sufficiently characterized to accommodate large scale biofermentation. Still, a number of challenges hinder production, including 1) suboptimal production levels of the desired biomolecule or vaccine, 2) absence of desired (or presence of undesired) post-translational modifications (PTMs), 3) aggregation or degradation (and subsequent loss) of the desired biomolecule, and 4) contamination of desired bioproducts with proteins derived from the host cell. As the cost of goods (COGS) associated with bioproduction can reach as high as 17% of total expenditures, methods that facilitate upstream and downstream processes should be rapidly adopted to increase overall efficiency and minimize manufacturing costs.

RNAi technology can be applied to address many of the issues currently plaguing bioproduction. In the work described below, a genome-wide RNAi screen was performed as a first step in cell line engineering, enabling the identification of gene modulation events that greatly enhance the production of poliovirus in a vaccine production cell line. In the course of this screen, multiple genes with previously undisclosed anti-viral functions were identified. Elimination of individual gene functions greatly enhanced live poliovirus replication, in some cases increasing yields more than ten-fold.

The approach described herein has several advantages. Unlike other cell engineering efforts that build upon previous knowledge in the fields of immunology and virology, workflows that include broad up-front screening efforts have the potential of identifying novel host contributions that previously were unrecognized as "anti-viral". These discoveries can be rapidly transitioned into existing bioproduction cell lines through any number of mechanisms including gene editing by zinc finger nucleases, TALENs, and meganucleases or gene product inhibition by small molecules. Thus, unlike efforts to identify completely novel cell platforms which require lengthy regulatory approval, discoveries identified here can be rapidly transitioned into the industry. As the approach dramatically improves cellular platforms, and enables a reduction in overall production costs, the technologies hold the opportunity to greatly expand access to reagents that are essential to human and animal health.

## **A BRIEF HISTORY OF POLIO**

The poliovirus is a small single stranded positive-sense RNA virus belonging to the picornavirus family. In a small fraction of cases, infection by the fecal-oral route leads to poliomyelitis, a debilitating paralysis resulting from infection, replication, and subsequent death of motor neurons in the CNS. Humans are the only known carriers of polio and while the annual number of paralytic cases previously ranged into the hundreds of thousands, widespread vaccination efforts have successfully reduced the number of cases by greater than 99% worldwide [9].

Despite the achievements in reducing the overall incidence of polio, it remains in a long list of vaccine-preventable diseases that would greatly benefit from the development of a new high-production vaccine manufacturing cell line. Currently over a billion doses of polio vaccine are produced annually to prevent outbreaks across the globe. Two forms of the vaccine are currently distributed: the inactivated poliovirus vaccine (IPV) developed by Jonas Salk in 1952 and the oral poliovirus vaccine (OPV), comprised of attenuated viruses, developed by Albert Sabin in 1957 [10]. For the last 4 decades, OPV has been the centerpiece in controlling poliomyelitis in developing countries. The vaccine is relatively inexpensive and provides strong immunity under ideal conditions. Unfortunately, the overall effectiveness of OPV is significantly diminished in settings where a collection of factors, including poor sanitation, the prevalence of diarrheal illnesses, immunosuppression, and the pervasiveness of rival (competing) enteric viruses, counter its overall immunogenicity [11]. The fact that in rare cases the attenuated Sabin viruses used in OPV revert to a neurovirulent form, capable of triggering vaccine-associated paralytic poliomyelitis (VAPP), makes disease control and eradication (by OPV alone) even less attainable. Formalin-inactivated IPV is not subject to these limitations. Unfortunately, due to the costs and competing priorities in world health, expanded IPV use is not currently feasible. Identification of new technologies that can reduce per-dose IPV costs is viewed as one of the best mechanisms to eradicate polio.

Advances in synthetic biology and cell line engineering represent potential low cost solutions for enhancing bioproduction. Work by Jay Keasling and colleagues has recently accelerated the production of artemisinic acid, a precursor of the anti-malaria therapeutic artemisinin, by creating a novel biosynthetic pathway in a single organism [12]. Outside the field of therapeutics, collaborations between academic (U. Wisconsin) and industrial (DuPont) partners have led to the generation of synthetic organisms that greatly increase the efficiency of 1,3-propanediol (1,3-PDO) production, a chemical used in the manufacturing of commercial goods [13]. Similar improvements are achievable in the field of vaccine production. Large- and small-scale RNAi screens designed to identify host-pathogen interactions for human immunodeficiency virus (HIV), West Nile virus (WNV), respiratory syncytial virus (RSV), and influenza virus consistently identified rare gene knockdown events that enhance viral replication [14-17]. While these studies were not performed with the intent of enhancing vaccine manufacturing, a repeat of this work using viruses and cell lines that are relevant to the vaccine industry could significantly impact global public health.

## **RNAi SCREENING TO IDENTIFY GENE KD EVENTS THAT ENHANCE POLIOVIRUS PRODUCTION**

### **The Workflow**

The poliovirus vaccine cell line engineering program (Fig. **1**) includes two central workflows: target identification (Phase I) and stable cell line development (Phase II). Given the large number of hits typically identified in host-pathogen RNAi screens and the enormous cost burdens associated with stable cell line development, a critical goal of the Phase I workflow was to reduce the list of genes being considered for Phase II to a feasible, cost-sensitive number. To accomplish this, Phase I incorporated multiple screening and validation filters. In addition to the primary screen, Phase I studies included 1) validation steps to ensure the identity of primary screen gene targets, 2) proof-of-principle in a vaccine production cell line, and 3) conversion of primary screen ELISA results into concrete increases in live virus titer. Furthermore, Phase I included testing with multiple poliovirus serotypes and a study to assess the effects of gene knockdown on viral immunogenicity. Additional work, which included pathway analysis and multi-gene knockdown tests, was performed to identify gene silencing combinations that further enhanced viral production. Phase II focuses on the creation of stable cell lines using one or more of the currently available gene editing technologies (*e.g.*, zinc finger nucleases, TALENs). Given the costs associated with this aspect of the program, pre-clinical studies that employ more cost-effective technologies such as recombinant adeno-associated virus gene editing (Horizon Discovery) are being considered.

**Figure 1:** Schematic describing the overall workflow of the polio virus vaccine cell line engineering program. Primary screen was performed in HEp-2C cells. The hits identified from primary screen were then validated in Vero P cells.

### **Reagents and Assays**

### *Cell Line and Virus Selection*

While the prior art in the field pointed to clear opportunities for vaccine cell line engineering, these studies simultaneously highlighted the importance of cell line and virus selection. As has been noted in previous publications, three separate HIV host-pathogen RNAi screens identified distinctly disparate hit lists [18]. The discrepancies between these studies are attributed (in part) to differences in cell lines, virus genotypes, assay endpoints, and reagents. While it is true that meta-analysis of the existing data sets identified a more consistent collection of cellular pathways and networks, the take-home message in the context of this program was simple: to enhance the odds of identifying gene knockdown events that increased *vaccine production*, the virus and cell lines employed in current vaccine manufacturing had to be incorporated into the primary screen and validation processes. This proved challenging: while our group had access to both the viruses (Sabin 1, 2, 3) and cell line (Vero, African Green Monkey (AGM) kidney) currently employed in vaccine manufacturing, the silencing reagents available for a genome-wide RNAi screen were limited to the human, mouse, and rat genomes (*i.e.*, non-AGM). To further complicate matters, sequence information for Vero was not publicly available, thus preventing a prescreen bioinformatics study to assess the compatibility of human siRNA collections with the AGM genome. Given these circumstances, we were forced to choose the lesser of two evils: screen in Vero cells and take the risk that (in some cases) the human siRNA collections would not adequately silence relevant AGM genes due to siRNA-target gene mismatches (false negatives), or screen in human cells and accept the consequence that some hits identified in human cells would not reproduce in the Vero platform (false positives).

In the end, we chose the latter of the two alternatives (screen in human cells) for two reasons. First, performing the primary screen in human cells would allow us to carry out the first step of validation, pool deconvolution, with a high level of confidence. Secondly, we assumed that any hit that was 1) identified in the primary screen, and 2) confirmed by deconvolution, but 3) failed to reproduce in Vero cells, could be investigated downstream by cloning, sequencing, and (if necessary) redesign of siRNA to the Vero ortholog. Based on this reasoning, the HEp-2C cell line was selected for the primary screen. HEp-2C cells can be efficiently modulated with the commercially available RNAi reagents targeting the human genome and are effectively transfected using conventional lipid delivery reagents. Equally important, HEp-2C cells support poliovirus replication in time frames that are compatible with high-throughput screening [19].
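The siRNA-target mismatch concern described above can be illustrated with a short sketch. The code below is not part of the program's workflow; it simply slides a human siRNA target site along an ortholog transcript and reports the best alignment, which is the kind of check that would become possible once a Vero ortholog had been cloned and sequenced. The function name and both sequences are hypothetical and chosen only for illustration.

```python
def min_mismatches(target_site: str, transcript: str) -> tuple[int, int]:
    """Slide target_site along transcript (both written as mRNA sense strand).

    Returns (fewest mismatches found, 0-based start position of that window).
    """
    site = target_site.upper()
    mrna = transcript.upper()
    if len(mrna) < len(site):
        raise ValueError("transcript is shorter than the target site")

    best_count, best_start = len(site) + 1, -1
    for start in range(len(mrna) - len(site) + 1):
        window = mrna[start:start + len(site)]
        count = sum(a != b for a, b in zip(site, window))
        if count < best_count:  # keep the first, best-matching window
            best_count, best_start = count, start
    return best_count, best_start


# Hypothetical 19-nt human target site and a made-up AGM ortholog fragment
# that differs from it at a single position.
human_site = "GCAUCGAUUCGGAUACCUA"
agm_transcript = "AAGG" + "GCAUCGAUUCGGAUGCCUA" + "CCUU"

mismatches, position = min_mismatches(human_site, agm_transcript)
print(f"best match at position {position} with {mismatches} mismatch(es)")
```

A site that aligns with zero mismatches would be expected to behave as it does in human cells, whereas a site with one or more mismatches would be a candidate for redesign against the Vero sequence.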

The three poliovirus vaccine (Sabin) strains are LSc/2ab (serotype 1), P712 (serotype 2), and Leon (serotype 3). Sabin 1 has fifty-seven nucleotide substitutions that distinguish it from the parental virus. Similarly, Sabin 2 and Sabin 3 have two and ten nucleotide substitutions (respectively) that distinguish them from the parental strains. For our studies, the Sabin 2 virus was chosen for the primary screen while Sabin 1 and Sabin 3 were reserved for testing with hits that passed the initial validation studies (deconvolution and live virus assays in Vero cells). After selecting a virus and cell line for the primary screen, quantities sufficient for both the primary screen and downstream validation studies were generated and banked so the same lots could be used for the entire screen.

### *Gene Silencing and Transfection Reagents*

Previous RNAi screens have demonstrated that siRNAs can induce off-target effects through a seed-mediated process [20, 21]. Given the time and costs that can be lost to pursuing false positives, and the importance of rapidly transitioning our discoveries into a new high-value polio vaccine cell line, the ON-TARGETplus siRNA library (GE Healthcare) was chosen for the primary screen. The ON-TARGETplus siRNA collection is a chemically modified genome-wide siRNA library that contains over 72,000 siRNAs targeting greater than 18,200 protein-encoding genes of the human genome. Employed as pools (4 siRNAs per target gene), each reagent contains a specificity-enhancing chemical modification pattern that is based on the collaborative work between GE Healthcare and Rosetta Inpharmatics [22]. Gene expression profiling studies show that the ON-TARGETplus modifications greatly reduce the number of off-targeted genes. As such, application of this technology to the polio screen ensured a high degree of specificity without compromising the gene knockdown capabilities of the molecule.

In the context of the primary screen (Fig. **2A**), multiple controls were included in each 96-well screening plate to further enhance the quality of, and confidence in, the results. These include 1) a non-targeting control having little or no effect on viral replication or cell viability (ON-TARGETplus Non-Targeting Control Pools, GE Healthcare, Cat. No. D-001810-10-50), and 2) a custom pool of siRNAs targeting the poliovirus VP1 & 3D genes to effectively decrease viral production (GE Healthcare). A third control, the TOX Transfection Control (GE Healthcare, Cat. No. D-001500-01-05), which induced cell death, was included as a method to evaluate transfection efficiency. Finally, a mock (untreated) control was included in each plate to provide a consistent baseline (Fig. **2B**). Unfortunately, the described screen did not have a positive control *i.e.*, a gene KD event that increased poliovirus production.

**Figure 2:** Schematic showing (**A**) overall workflow, and (**B**) plate format for the primary screen. Negative control (NT), positive control (POS) and mock control (MOC) were included. TOX Transfection Reagent was utilized to monitor the transfection efficiency for the entire screen.
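As a rough illustration of how a control-bearing 96-well plate map might be generated, the sketch below assigns the four control types named in the Figure 2 legend (NT, POS, TOX, MOC) to the outer two columns and fills the remaining wells with siRNA pools. The well positions of the controls, the number of control wells per plate, and all function names are assumptions made for this example; the actual layout used in the screen is the one shown in Fig. **2B**.

```python
import math
from itertools import product

ROWS = "ABCDEFGH"
COLUMNS = range(1, 13)

# Assumed layout: columns 1 and 12 carry controls, two rows per control type.
CONTROL_COLUMNS = {1, 12}
CONTROL_BY_ROW = {
    "A": "NT", "B": "NT",    # non-targeting control
    "C": "POS", "D": "POS",  # siRNA pool against poliovirus VP1 & 3D
    "E": "TOX", "F": "TOX",  # transfection (cell death) control
    "G": "MOC", "H": "MOC",  # mock (untreated) control
}


def build_layout(pool_ids):
    """Map well names (A1..H12) to a control label or an siRNA pool id."""
    pools = iter(pool_ids)
    layout = {}
    for row, col in product(ROWS, COLUMNS):
        well = f"{row}{col}"
        if col in CONTROL_COLUMNS:
            layout[well] = CONTROL_BY_ROW[row]
        else:
            layout[well] = next(pools, "EMPTY")
    return layout


def plates_needed(n_pools, sample_wells_per_plate=80):
    """Number of plates required when each plate holds 80 sample wells."""
    return math.ceil(n_pools / sample_wells_per_plate)


layout = build_layout([f"pool{i}" for i in range(80)])
print(layout["A1"], layout["A2"], layout["H11"], layout["H12"])
print(plates_needed(18_200), "plates per replicate under these assumptions")
```

Under this assumed layout, each plate carries 80 siRNA pools, so a library of roughly 18,200 pools would occupy 228 plates per replicate.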

To optimize delivery of the gene silencing reagents, a battery of four lipid cocktails (GE Healthcare, DharmaFECT 1-4) was tested with both HEp-2C and Vero cells to identify a reagent that provided the greatest level of siRNA delivery with negligible effects on cell viability. Based on these criteria, DharmaFECT 4 was employed in both the screening and validation studies.

Finally, in order to provide further consistency in our results, individual lots of media, supplements, and buffer were purchased for both the screen and validation studies (GE Healthcare, HyClone Reagents). The screen also employed the Nunc Edge Plate. The Edge plate design incorporates a well that encircles the perimeter of the plate and can be filled with sterile water or media. This design provides an evaporative buffer zone which minimizes edge effects frequently observed in high throughput screens. Importantly, the Nunc Edge Plate follows the American National Standards Institute (ANSI) footprint and is therefore amenable to the types of automation employed in genome-wide RNAi screening.

### *Assays*

As stated previously, one of the goals of the program was to rapidly reduce the number of Phase I hits into a small, refined list of Phase II candidates. To achieve this, no fewer than five separate assays were employed over the course of the primary screen and validation studies to enrich for hits that induced a narrow set of highly desired phenotypes. The primary screen (Fig. **3**) relied upon a polio-specific ELISA that detected the Sabin-2 poliovirus "D-antigen" and incorporated a mouse monoclonal antibody (HYB294-06, Thermo Scientific/Pierce). It is worth noting that special efforts were made to ensure that hits identified in the primary screen were relevant to the vaccine cell line program. Previously published host-pathogen RNAi screens have identified hundreds of host genes that play a role in viral replication. In most cases, these screens have utilized ELISA or reporter expression constructs as the primary tool for hit identification. Given the exhaustive and expensive work that is required to validate hits in our program and the absolute need for primary screen hits to convert into increases in live virus titers, a preliminary study was performed to understand how primary ELISA screen results related to live virus titers. Specifically, a small collection of ~40 hits 1) identified from screens of the kinase, protease, and ubiquitin libraries, and 2) having ELISA absorbance values ranging from 1.5-5.0 were tested in the CCID50 (limiting dilution) live virus assay. From these studies we observed that most of the hits having absorbance values ranging from 1.5-2.7 showed consistent but modest increases in live virus titers (~2-3 fold). In contrast, hits having SD values of 3.0 (or greater) led to more significant increases in viral production (5-30x). Given the importance of identifying gene KD events that greatly increased live virus production we set the minimum absorbance cutoff for the primary screen at 3.0, thereby enhancing the likelihood of identifying hits that would provide value in vaccine manufacturing.
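The hit-calling step can be sketched in a few lines. The text refers to the 3.0 cutoff variously as an absorbance value, an SD value, and (in the Summary of Results) a Z-score; the example below assumes the last interpretation and, as a further assumption, normalizes each well against the non-targeting control wells on the same plate. The function names and all absorbance readings are hypothetical.

```python
from statistics import mean, stdev


def z_scores(sample_signal, control_signal):
    """Express each sample well as SDs above the mean of the control wells."""
    mu = mean(control_signal)
    sigma = stdev(control_signal)  # sample SD; needs at least two controls
    if sigma == 0:
        raise ValueError("control wells show no variation")
    return {gene: (value - mu) / sigma for gene, value in sample_signal.items()}


def call_hits(scores, cutoff=3.0):
    """Return genes whose score meets or exceeds the cutoff, sorted by name."""
    return sorted(gene for gene, z in scores.items() if z >= cutoff)


# Made-up ELISA absorbance readings from one plate.
non_targeting = [0.48, 0.50, 0.52, 0.50]
samples = {"GENE_A": 0.51, "GENE_B": 0.60, "GENE_C": 0.54}

scores = z_scores(samples, non_targeting)
hits = call_hits(scores)
print(hits)
```

In this toy plate only GENE_B clears the cutoff; GENE_C falls in the range that, according to the preliminary study described above, tended to give only modest gains in live virus titer.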

The polio-specific ELISA used in the primary screen was also used in the first step of our validation work, siRNA pool deconvolution. Deconvolution is a critical component of RNAi screening. As mentioned above, previous studies by multiple researchers have shown that siRNA-mediated knockdown can generate false-positive phenotypes. The mechanism behind these "off-target" effects is now well understood and one strategy for minimizing false-positives involves demonstrating that two or more siRNAs (targeting different sites on the same gene) induce the same phenotype (*e.g.* an increase in viral titer). As the primary ELISA screen was performed using pools of four separate siRNAs targeting different regions of each gene, the follow-up deconvolution study tested each of the four siRNAs individually along with as many as four additional siRNAs derived from a separate library, siGENOME (GE Healthcare).
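The deconvolution rule described above (two or more independent siRNAs reproducing the phenotype) reduces to a simple count per gene. The sketch below assumes that each individual siRNA is scored on the same scale and against the same cutoff as the primary screen, which the text does not state explicitly; the gene names and scores are invented.

```python
def confirmed_by_deconvolution(sirna_scores, cutoff=3.0, min_active=2):
    """Flag genes for which at least min_active individual siRNAs pass the cutoff.

    sirna_scores maps a gene name to the scores of its individual siRNAs
    (four from the pool, plus any additional siRNAs from a second library).
    """
    return {
        gene: sum(score >= cutoff for score in scores) >= min_active
        for gene, scores in sirna_scores.items()
    }


# Hypothetical per-siRNA scores for three primary-screen hits.
example = {
    "GENE_A": [4.1, 3.6, 1.2, 0.8],            # two active siRNAs: confirmed
    "GENE_B": [5.0, 0.9, 1.1, 0.4],            # one active siRNA: off-target risk
    "GENE_C": [3.2, 3.0, 3.9, 2.8, 3.5, 1.0],  # includes extra siRNAs
}

result = confirmed_by_deconvolution(example)
print(result)
```

A gene such as GENE_B, supported by a single siRNA, is not necessarily a false positive, but it carries a higher risk that the phenotype arises from an off-target effect.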

The ELISA used in the primary screen and deconvolution studies detects the native "D-antigen" conformation of the viral proteins regardless of whether they are associated with a live virus, an inactive virus, or a suspended viral protein.


Since the IPV vaccine is derived from complete viral particles, two additional assays, a CCID50 assay and a plaque assay, were added to the validation studies to quantitate the effects of gene knockdown on live virus particle production. Importantly, both of these assays incorporated a Vero cell line in the overall workflow. In doing so, this aspect of the validation provides a second essential piece of data that is required for the success of the program: the demonstration that the increased viral production (observed in HEp-2C cells) is similarly observed in a cell line employed in current poliovirus vaccine production.

**Figure 3:** Diagram providing details of the polio ELISA used in the primary screen of the cell line engineering program. An in-house optimized ELISA protocol was utilized for primary screen in HEp-2C cells and validation in Vero cells.

A further step in validation involved testing how the hits identified in the primary (Sabin 2) screen affected other Sabin serotypes. As mentioned above, the current OPV vaccines incorporate three attenuated Sabin poliovirus strains (Sabin 1, 2, and 3). In contrast, current IPV vaccines utilize three distinct wildtype strains (Note: several groups in the Netherlands, China and Japan are working to develop IPV vaccines based on the Sabin strains). Given the diverse makeup of the current vaccines, the highest priority hits would be those that positively influenced titers of a wide collection of polioviruses. To assess the effects of gene knockdown events on Sabin 1 and Sabin 3 viruses, ELISA and plaque assays were incorporated. ELISAs to detect Sabin 1 or Sabin 3 used the same assay configuration as the primary screen but substituted serotype-specific antibodies unique to the virus under investigation (NBP1-05101 (Novus Biologicals) for Sabin poliovirus type 1; HYB 300-06 (Thermo Scientific, Pierce Products) for Sabin poliovirus type 3). Plaque assays incorporated Vero cells to assess the effect of gene knockdown on live virus production in a vaccine manufacturing cell line.

Lastly, an important aspect of our studies focused on antigen equivalency. Prior to these studies, it was unknown whether virus grown in siRNA-treated cells would be antigenically equivalent to those grown in unmodified cells. To assess the antigenic similarities between poliovirus grown in unmodified *vs.* siRNA-treated cells, a microneutralization assay was performed with Sabin 2 viruses from Vero cells transfected with siRNAs against selected genes and a pool of human sera collected from individuals previously exposed to poliovirus vaccine. In a 96-well format, 100 CCID50 of Sabin 2 viruses from selected cell supernatants were combined with two-fold serial dilutions of the anti-polio serum, starting with a 1:8 dilution up to 1:1024. Sabin 2 viruses from cells not transfected with any siRNA were included as a control. Viruses and serum were incubated for 3 hours after which HEp-2C cells were added. After 5 days of incubation at 37°C, in 5% CO2, cells were stained with crystal violet and endpoint serum neutralization titers calculated by the Kärber formula [23].
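To make the endpoint calculation concrete, the sketch below implements the commonly used Spearman-Kärber form for a two-fold dilution series starting at 1:8. It is offered as an illustration of the approach rather than as the exact calculation cited in [23]; the number of replicate wells per dilution is not given in the text and is assumed here to be two, and the well counts are invented.

```python
import math


def karber_log2_titer(protected, wells_per_dilution,
                      start_reciprocal=8.0, fold=2.0):
    """Log2 of the 50% endpoint neutralization titer (Spearman-Kärber form).

    protected: number of wells protected from cytopathic effect at each
               serum dilution, ordered from 1:start_reciprocal upward.
    """
    if any(not 0 <= p <= wells_per_dilution for p in protected):
        raise ValueError("protected counts must lie within 0..wells_per_dilution")
    step = math.log2(fold)  # log2 spacing between dilutions
    proportion_sum = sum(p / wells_per_dilution for p in protected)
    return math.log2(start_reciprocal) - step / 2 + step * proportion_sum


# Eight two-fold dilutions (1:8 to 1:1024), two wells each (assumed).
# Full protection through 1:64, one of two wells at 1:128, none beyond.
wells_protected = [2, 2, 2, 2, 1, 0, 0, 0]

log2_titer = karber_log2_titer(wells_protected, wells_per_dilution=2)
print(f"log2 titer = {log2_titer:.1f}, i.e. 1:{2 ** log2_titer:.0f}")
```

Antigenic equivalence would then be judged by comparing the titer obtained against virus from siRNA-treated cells with the titer obtained against virus from untransfected control cells.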

### **Summary of Results**

As stated previously, the goal of the program was to rapidly identify gene knockdown events that significantly enhanced poliovirus production in a cell line currently employed in vaccine manufacturing. Fig. **4** shows the overall summary of results from our screening and validation studies. Using the relatively stringent absorbance cutoff for our primary ELISA, only 124 hits were identified in the primary screen. From this collection, 76 genes were found to have two or more siRNAs that induced the desired "increase-in-virus-production" phenotype. While the remaining 48 hits may represent true modulators of poliovirus replication, the observation that only a single siRNA increased viral protein expression amplified the risk that the observed changes were unrelated to target gene knockdown (*i.e.*, a false positive resulting from off-target effects). Overall, the primary screen and deconvolution cutoffs allowed us to quickly reduce the number of genes being considered for further validation to a very manageable 0.4% of the whole genome.

Follow-up CCID50 and plaque assay validation studies further reduced the list of genes that we would consider for Phase II. Using the cutoff of ≥5x increase in viral titer, fewer than 20 candidate genes out of the 76 under consideration performed to this level in both assays (Fig. **5**). The significant loss of potential candidates at this stage can be attributed to at least two factors. First, while the initial ELISA cutoff of Z-score ≥ 3.0 enhanced our chances of identifying relevant genes, this filter was not perfect. Secondly, the CCID50 and plaque assays are the first instances where the Vero cell line was included in our validation studies. Though there is a high likelihood that siRNA designed to target genes of the human genome would also silence AGM genes, the frequency of siRNA-target mRNA mismatches (and therefore the relative efficiency of gene silencing) is unknown. These two factors in combination likely influenced the number of hits that were considered in subsequent rounds of validation.
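The ≥5x filter applied across both live virus assays can be expressed as follows. This sketch assumes titers are recorded as log10 values and that fold increase is calculated relative to a non-targeting control measured in the same assay; the function names and titers are hypothetical.

```python
def fold_increase(log10_treated, log10_control):
    """Convert a difference in log10 titers into a fold change."""
    return 10 ** (log10_treated - log10_control)


def phase2_candidates(titers, control, cutoff=5.0):
    """Keep genes whose knockdown raises titer >= cutoff-fold in BOTH assays."""
    keep = []
    for gene, assays in titers.items():
        folds = [fold_increase(assays[name], control[name])
                 for name in ("ccid50", "plaque")]
        if all(f >= cutoff for f in folds):
            keep.append(gene)
    return sorted(keep)


# Made-up log10 titers in Vero cells.
control_titers = {"ccid50": 7.0, "plaque": 6.8}
knockdown_titers = {
    "GENE_A": {"ccid50": 8.0, "plaque": 7.7},  # ~10x and ~8x
    "GENE_B": {"ccid50": 7.9, "plaque": 7.2},  # ~8x and ~2.5x
    "GENE_C": {"ccid50": 7.3, "plaque": 7.1},  # ~2x in both
}

candidates = phase2_candidates(knockdown_titers, control_titers)
print(candidates)
```

Requiring agreement between two independent live virus assays is stricter than either assay alone, which is consistent with the sharp drop in candidate number at this stage.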

**Figure 4:** Details of the cell types, assays, and reagents used in Phase I of the polio vaccine cell line engineering project. The "Hits" column gives the number of genes that passed the criteria at each stage. The 124 hits identified in the primary screen in HEp-2C cells were first subjected to a deconvolution study, followed by validation in Vero cells using the CCID50 assay, plaque assay, and antigen equivalence test.

Surprisingly, the antigen equivalency assay, which tested whether virus produced in siRNA-modified cells was recognized by sera taken from previous vaccine recipients (data not shown), did not provide a relevant filter for moving forward into Phase II studies. In all of the cases tested, the virus produced in cells where candidate genes had been silenced was recognized by patient sera (standard reference serum, CDC). In contrast, the Sabin 1/Sabin 3 testing provided a strong Phase II candidate filter. Only half of the "top 20" candidates that increased Sabin 2 production by 5x or more had similar effects on Sabin 1 and Sabin 3 serotypes. This finding supported previous work in the field that suggested small changes in cell lines, viral strains, and/or assays can lead to substantial differences in the effects of gene silencing. Overall, the combination of the primary screen ELISA together with multiple secondary validation assays allowed us to reduce the total number of Phase II candidates to approximately 10 genes, or roughly 0.05% of the genes targeted by the original library.
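The attrition across Phase I can be summarized numerically. The sketch below takes the library's stated size of roughly 18,200 targeted genes as the denominator (an assumption, since the library is described as targeting "greater than" that number) and uses 20 and 10 as stand-ins for "fewer than 20" and "approximately 10"; the stage labels are ours.

```python
GENES_IN_LIBRARY = 18_200  # approximate; the text says "greater than 18,200"

# (stage, genes remaining). The last two counts are approximations.
funnel = [
    ("primary ELISA screen", 124),
    ("pool deconvolution (>=2 siRNAs)", 76),
    ("CCID50 and plaque assays (>=5x)", 20),
    ("Sabin 1 / Sabin 3 breadth", 10),
]


def percent_of_library(n_genes, library_size=GENES_IN_LIBRARY):
    """Share of the screened genes that remain, as a percentage."""
    return 100.0 * n_genes / library_size


for stage, n_genes in funnel:
    print(f"{stage:<34} {n_genes:>4} genes  {percent_of_library(n_genes):.3f}%")
```

On these figures, the 76 deconvolution-confirmed genes correspond to about 0.4% of the library, and the final set of approximately 10 candidates corresponds to about 0.05%.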

**Figure 5:** Graph shows the effect of single gene knockdown events on live virus titer (CCID50 assay) in the Vero cell line for the top candidates. The hits shown here represent siRNA knockdown events which greatly enhanced poliovirus (Sabin 2) replication in Vero cells. X axis: gene number.

### **Moving Forward**

While there is still much that needs to be done to understand the complexities and balances of cellular physiology, recent advances in gene sequencing, synthesis, and modulation have opened an array of new possibilities in synthetic biology and cell line engineering. The knowledge and techniques currently available to researchers should allow significant improvements in bioproduction. As shown here, the combination of high throughput screening and thorough validation can lead to the identification of multiple gene modulation events that significantly enhance poliovirus production. These discoveries can be transitioned into novel engineered cell lines that greatly enhance vaccine production worldwide. While the gene targets identified in this program are likely virus and cell line specific, expansion of this program to other virus-cell line combinations can identify additional host-encoded genes that (upon silencing) should enhance vaccine production. Adoption of such modified cell lines by the vaccine manufacturing industry could significantly reduce human disease and play a broad role in safeguarding agricultural stocks.

### **ACKNOWLEDGEMENTS**

We would like to thank Naomi Dybdahl-Sissoko (CDC), and Jason O'Donnell (UGA) for excellent technical assistance over the course of this project.

This work is supported by The Bill and Melinda Gates Foundation and GE Healthcare.

### **CONFLICT OF INTEREST**

J. Karpilow is employed by Thermo Fisher Scientific. Some of the materials described are products sold by Thermo Fisher Scientific.

### **REFERENCES**




© 2014 The Author(s). Published by Bentham Science Publisher. This is an open access chapter published under CC BY 4.0 https://creativecommons.org/licenses/by/4.0/legalcode

# **CHAPTER 11**

# **RNAi Screening to Facilitate Drug Repurposing**

**Olivia Perwitasari and Ralph A. Tripp\***

*University of Georgia, College of Veterinary Medicine, Department of Infectious Diseases, 111 Carlton Street, Athens, GA 30602, USA*

**Abstract:** Drug discovery is strangled by extraordinarily time-consuming and costly processes associated with high failure rates. In the United States, less than 5% of drug candidates that enter drug testing will be approved by the Food and Drug Administration (FDA) and offered for clinical use. An emerging solution to overcome this bottleneck in new drug development is to find new uses for presently available drugs, a practice known as drug repurposing. In this chapter, a general overview of drug repurposing is presented, along with screening methods that have yielded successful outcomes. Emphasis is placed on utilizing RNA interference (RNAi) screening to identify druggable genes that can be targeted by drug repurposing.

**Keywords:** Antiviral, drug repurposing, host-pathogen, proviral.

### **WHY REPURPOSE EXISTING DRUGS?**

Drug development is an extremely time- and resource-intensive process. An average drug development program takes 10-15 years to complete and costs a billion dollars or more for the final product to reach the clinic [1-3]. These figures do not include the time and money invested in the basic research that precedes in-depth clinical studies. Despite the incredible investment associated with drug development, there is a >95% failure rate for compounds that enter the pipeline, with roughly one FDA-approved drug emerging for every 10,000 compounds (illustrated in Fig. **1**). To compensate for the high failure rate there are currently 2,668 drugs in clinical trials, including new compounds, existing compounds for new indications, and new formulations (http://www.clinicaltrials.gov). Despite efforts by the National Institutes of Health (NIH) and FDA to promote new drug discovery, only 39 new drugs were approved in 2012, the second highest number of approvals by the FDA [4]. The high failure rate is especially disconcerting for development of therapeutics for emerging diseases such as influenza, where swift development is critical, and for rare or neglected diseases such as malaria, where funding and demand are limited [5].

**<sup>\*</sup>Corresponding author Ralph A. Tripp:** University of Georgia, College of Veterinary Medicine, Department of Infectious Diseases, 111 Carlton Street, Athens, GA 30302, USA; Tel: 706-542-1557; E-mail: ratripp@uga.edu

**Figure 1:** Drug development process. The traditional drug development process is shown on the left side of the figure (adapted from http://www.ncats.nih.gov/research/reengineering/process.html) and is compared to drug repurposing on the right side of the figure.

One alternative that could facilitate rapid drug development and accessibility to new therapeutics involves repurposing existing or previously approved drugs. This process, which is also termed drug rescuing, repositioning, or reprofiling, makes use of existing drugs that have been successfully employed to treat unrelated diseases, or that failed FDA approval in the later phases of testing but passed the early safety trials [1, 6]. Drug repurposing takes advantage of the often non-specific and occasionally multifunctional nature of biology. For example, repurposing can result from the promiscuous (off-targeting) nature of small molecules, *i.e.* a compound's ability to bind two or more different gene products that play roles in separate and unrelated pathways or diseases. Alternatively, successful repurposing can be the consequence of a single targeted gene product participating in two or more biological or disease pathways. In this instance, a drug binding a single gene product can (fortuitously) be used to address two unrelated diseases. By repurposing drugs, compounds that have already passed the initial Phase I clinical safety trial for one particular affliction can directly enter Phase II or III trials for a second, unrelated indication. These drugs can be available for clinical use in approximately two years, as compared to the 10-15 year period required for standard drug development (Fig. **1**). Importantly, this accelerated path trims approximately 40% of the costs associated with standard drug development [1, 2].
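The time and cost figures quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it uses the 10-15 year and approximately two-year timelines and the approximately 40% cost saving given in the text, and it assumes a round baseline cost of one billion dollars (the text states "a billion dollars or more"). The function and constant names are ours.

```python
def percent_reduction(standard: float, repurposed: float) -> float:
    """Return the percentage saved when `repurposed` replaces `standard`."""
    return 100.0 * (standard - repurposed) / standard


# Figures quoted in the text.
STANDARD_YEARS = (10.0, 15.0)   # standard development timeline, low and high
REPURPOSED_YEARS = 2.0          # approximate repurposing timeline
COST_SAVING_FRACTION = 0.40     # approximate fraction of cost trimmed

# Assumption for illustration: a round baseline of 1 billion USD.
BASELINE_COST_BILLION_USD = 1.0

time_saved_percent = [
    percent_reduction(years, REPURPOSED_YEARS) for years in STANDARD_YEARS
]
repurposed_cost = BASELINE_COST_BILLION_USD * (1.0 - COST_SAVING_FRACTION)

if __name__ == "__main__":
    low, high = time_saved_percent
    print(f"Time saved: {low:.0f}% to {high:.0f}%")
    print(f"Repurposed cost: about {repurposed_cost:.1f} billion USD")
```

With a 10-year baseline the time saving is 80%; with a 15-year baseline it is closer to 87%.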

Despite the stated advantages of drug repurposing, there are several challenges to pursuing this approach (see Table **1**). One of these hurdles is the limited access researchers have to databases of compounds available for repurposing. Most pharmaceutical companies keep confidential lists of compounds that fail to transition into the clinic, and these lists are not typically accessible to outside entities [7]. Furthermore, only a few of the publicly accessible databases are searchable by genes or pathways. This shortcoming can thwart the scientific community's ability to match relevant small molecules to gene targets identified in, *e.g.*, a genome-wide screen.



Additional hurdles to adopting drug repurposing strategies come in the form of issues related to funding and intellectual property (IP). In the area of infectious disease, only 1% of newly approved drugs are for treatment of neglected diseases, with the majority of this effort focused on therapeutics for malaria [8]. Given this overwhelming emphasis and the limited opportunity for profit, obtaining funding to repurpose drugs for treatment of additional rare and neglected diseases is an uphill battle. Questions of ownership and freedom-to-operate represent a separate potential stumbling block for drug repurposing. To obtain government approval for new drugs in the United States, applicants must file either a *New Drug Application* (NDA) or an *Abbreviated New Drug Application* (ANDA). In general, therapeutic molecules are protected under patents submitted during the course of drug development and have a validity period of twenty years plus up to sixty additional months of patent protection (Hatch-Waxman Act) to compensate drug manufacturers for the lengthy process associated with FDA approval. When a patent expires, a new application for a generic version can be submitted to the FDA as an ANDA.

The composition-of-matter claims for any particular drug are generally covered by an active pharmaceutical ingredient (API) patent and are complemented with additional patents or claims covering inventions related to formulations, methods of delivery, and treatment regimes. API patents are generally submitted very early in the new drug development process, so it is likely that they cover only a brief window of time following the drug's availability on the market. From the perspective of drug repurposing, new patents can cover a repurposed drug if they contain new ingredients, formulations, delivery methods, drug combinations, and (importantly in the context of this discussion) new methods of use, such as new indications. Such repurposed drugs can be developed from a previously shelved API or from an existing drug available on the market. That said, in many cases previously shelved therapeutics or therapeutic candidates have little (or no) remaining patent protection. While this makes these entities available for immediate drug studies and repurposing, the ability to protect any new application can be limited.

## **DRUG REPURPOSING INITIATIVES AND RESOURCES**

To combat the challenges described above, several initiatives and organizations, including the Drugs for Neglected Diseases Initiative, the Bill and Melinda Gates Foundation, USAID's Neglected Tropical Diseases program, and others, have encouraged development of new therapeutics through public-private partnerships [9]. One noteworthy example of how these consortia can bridge basic research and therapeutic drug development is demonstrated by the efforts to combat HIV infection. By the mid-1980s, AIDS was a full-blown pandemic with no known treatments. Prior to the outbreak, the small molecule drug azidothymidine (AZT) had been developed as a treatment for cancer but was later abandoned due to the absence of efficacy during clinical trials. A partnership involving the National Cancer Institute (NCI), the Burroughs-Wellcome Company (now GlaxoSmithKline), and Duke University identified AZT as an effective therapeutic against HIV, launching the drug as the first FDA-approved treatment for the disease in 1987 [10, 11]. Remarkably, this effort took just over two years to move from the initial demonstration of AZT's anti-HIV property to its FDA approval.


Another notable initiative to promote partnerships involving industry and researchers in academic and government settings is the "Discovering New Therapeutic Uses for Existing Molecules" program, launched in 2012 by the NIH's National Center for Advancing Translational Sciences (NCATS) [12, 13]. The main goal of the NCATS program is to bypass bottlenecks in drug development by allowing rapid testing and transition of existing drugs to target known diseases. Through this program, pharmaceutical companies (Abbott Laboratories, AstraZeneca, Bristol-Myers Squibb Company, Eli Lilly and Company, GlaxoSmithKline, Janssen Pharmaceutical Research & Development, L.L.C., Pfizer, and Sanofi) provide academic researchers with access to compound collections for repurposing. In addition, the program provides \$20 million in funds for rapid access to multi-year research grants that focus on drug repurposing. To streamline legal and administrative processes and to guide the handling of intellectual property throughout the course of this partnership program, a memorandum of understanding was drafted between the NIH and industry partners. Templates for confidential disclosure and collaborative research agreements are also provided and can be viewed on the NCATS website [13]. The UK's Medical Research Council (MRC) launched a similar program with AstraZeneca in late 2011, allowing access to 22 compounds for academic research [12].

An additional resource for drug repurposing is the NIH Chemical Genomics Center (NCGC) Pharmaceutical Collection, a publicly accessible database of small molecules that have previously been accepted for human use [14]. This database contains ~2,400 of the approximately 2,750 drugs currently approved for use in the United States (FDA), European Union (EMA), Japan (NHI), and Canada (HC) (http://tripod.nih.gov/npc/). Other compound libraries that may aid drug repurposing include the library developed by the National Institute of Neurological Disorders and Stroke (NINDS), which contains 1,040 compounds, and the Johns Hopkins Clinical Compound Library (JHCCL), which contains 1,500 compounds [1, 15-17].

## **DRUG REPURPOSING THROUGH SERENDIPITY**

AZT's repurposing resulted from an extensive collaborative effort involving governmental, academic, public, and private institutions. Several other successfully repurposed drugs were the product of serendipitous discoveries of a drug's off-target effects. One notable example is Sildenafil, or compound UK-92,480, which was developed by Pfizer in the late 1980s. Sildenafil targets phosphodiesterase-5 (PDE-5) and was originally intended for treatment of hypertension and angina [18]. In the course of clinical trials it was observed that treatment resulted in an unexpected "off-target" side effect (erectile-enhancing) in male volunteers [19, 20]. This led to a dramatic shift in Pfizer's marketing strategy for Sildenafil, which now targeted male patients afflicted with erectile dysfunction (ED). Under the trade name Viagra®, Sildenafil was approved as a medication for ED in 1998. This preceded its approval in 2005 for pulmonary arterial hypertension, an indication closer to its original purpose, under the trade name Revatio® [19]. An additional example of serendipitous drug repurposing is minoxidil (Rogaine®) which, like Sildenafil, was developed to treat hypertension but displayed an off-target effect that promoted hair growth.

Despite these success stories, relying on serendipity for drug repurposing is ill-advised. In the following section, we describe how high-throughput screening, specifically genome-wide RNAi screening, can facilitate drug repurposing.

## **RNAi SCREENING FOR THE IDENTIFICATION OF DRUG TARGETS AND DRUG REPURPOSING**

RNAi is an endogenous post-transcriptional gene-regulating system that utilizes small, non-coding RNAs to silence or knock down gene expression. As reviewed in greater detail elsewhere in this book, the RNAi pathway has been utilized by researchers to investigate the contribution of a wide range of genes to cell and developmental biology. Individual synthetic RNAs or pools of them, referred to as small interfering RNAs (siRNAs), can be designed to specifically target unique messenger RNAs (mRNAs) and knock down gene function for windows of time that are compatible with a range of tissue culture-based assay platforms. At the same time, RNAi libraries containing collections of siRNAs targeting thousands of genes have been developed. When these libraries are combined with appropriate medium- to high-throughput automation systems, researchers can quickly identify subsets of genes involved in particular phenotypic responses, including host-pathogen responses. Examples of such technology are the siGENOME and ON-TARGETplus siRNA libraries developed by Dharmacon (part of GE Healthcare). Of particular interest to drug repurposing efforts are the druggable libraries within these collections, which contain reagents targeting approximately 7,500 genes. These "druggable genome" libraries comprise targets whose protein products contain functionally relevant secondary and/or tertiary structures that are predicted to be accessible to pharmacological inhibitors. By screening this subset of the genome, a closer connection between the RNAi screening reagent and (eventual) small molecule discovery is achieved [21]. Importantly, these libraries are available in multiple formats, including pooled reagents (SMARTpool; a mixture of four siRNAs per target gene) or individual siRNAs, targeting human, mouse, or rat genes.

A general workflow that describes how RNAi screening can facilitate drug repurposing is shown in Fig. **2**. In the first step, cells are transfected with siRNA reagent(s) using conditions that are optimized for the cell line.

**Figure 2:** Workflow of siRNA screening to identify druggable targets for drug repurposing.

Following a 48-72 hour period to allow for maximum RNAi-mediated gene knockdown, the cultures are treated as called for by the experimental protocol, *e.g.* infection with the pathogen of interest. Cultures are then assessed at desired time points post-treatment using an appropriate endpoint assay(s) to determine the contribution that each gene makes to a particular process or disease. Hits derived from the primary screen are frequently funneled through a series of validation assays designed to minimize false positives and provide additional details regarding the contribution of the gene to the phenotype of interest. If pools of siRNAs were used in the primary screen, deconvolution, a process in which each siRNA making up the pool is tested individually, often follows to demonstrate that multiple individual reagents targeting the same gene induce the same phenotype. Additional validation often includes testing in complementary cell culture systems that incorporate different cell types, viruses, assay endpoints, and more. Gene targets that pass stringent validation studies are then investigated using various bioinformatic tools to gain insights into the gene's contribution in the context of the greater cellular physiology. These studies often draw on gene ontology (GO) analysis, pathway mapping, and interaction databases (GeneGo MetaCore™, Ingenuity Pathway Analysis, ToppCluster) to further appreciate the contribution and pathway interactions of hits identified by the screen [22, 23]. Potential therapeutic targets, including key pathway regulators or nodes, can be identified from these *in silico* analyses and further evaluated for their druggability.
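The hit-calling and deconvolution steps described above can be sketched in a few lines. This is a minimal illustration and not the analysis used in any particular screen: the set-wide z-score, the cutoff of -2, the rule that at least two of the four individual siRNAs must reproduce the phenotype, and all gene names and readout values are assumptions chosen for the example.

```python
from statistics import mean, stdev


def z_scores(readouts: dict) -> dict:
    """Convert raw per-gene readouts into z-scores across the whole set."""
    values = list(readouts.values())
    mu, sigma = mean(values), stdev(values)
    return {gene: (value - mu) / sigma for gene, value in readouts.items()}


def primary_hits(scores: dict, cutoff: float = -2.0) -> list:
    """Genes whose knockdown lowers the readout to or beyond the cutoff."""
    return sorted(gene for gene, z in scores.items() if z <= cutoff)


def passes_deconvolution(single_sirna_z: list, cutoff: float = -2.0,
                         min_confirming: int = 2) -> bool:
    """True if enough individual siRNAs from a pool reproduce the phenotype."""
    confirming = sum(1 for z in single_sirna_z if z <= cutoff)
    return confirming >= min_confirming


# Hypothetical primary-screen readouts (e.g., percent infected cells).
pool_readouts = {
    "GENE_01": 100, "GENE_02": 98, "GENE_03": 102, "GENE_04": 101,
    "GENE_05": 99, "GENE_06": 100, "GENE_07": 103, "GENE_08": 97,
    "GENE_09": 100, "GENE_10": 40,
}
hits = primary_hits(z_scores(pool_readouts))

# Hypothetical z-scores for the four individual siRNAs of one pooled hit.
confirmed = passes_deconvolution([-2.5, -2.1, -0.3, -1.9])
```

In practice, screens often use robust statistics (median and median absolute deviation) and plate-level normalization; the plain mean and standard deviation are used here only to keep the sketch short.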

Once important regulators have been identified, researchers can mine public and/or private databases to identify drugs targeting hits identified during the screen. Publicly available databases that contain both gene targets and small molecule effectors include PROMISCUOUS (http://bioinformatics.charite.de/promiscuous) [17], ChemSpider (http://www.chemspider.com) [24], DrugBank (http://www.drugbank.ca) [16], PubChem (http://pubchem.ncbi.nlm.nih.gov), ChEMBL (https://www.ebi.ac.uk/chembl), and the Clinician's Pocket Drug Reference (Scut manual). These platforms include anywhere from thousands to millions of compound references and in many cases provide interfaces that allow searches for targets, structures, and sources of each of the small molecules. Candidate small molecules identified through this sort of intensive database mining can be assessed (*in vitro* and *in vivo*) for relevance to any particular disease and (when appropriate) advanced toward clinical trials in an accelerated fashion.
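The database-mining step amounts to joining a list of validated gene hits against a table of compound-target annotations. The sketch below shows that join on a small in-memory table. It does not call any of the databases listed above, and the record layout, the function name, and the placeholder gene `GENE_X` are assumptions for illustration. The two records are taken from examples discussed in this chapter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrugRecord:
    """One compound-target annotation (hypothetical record layout)."""
    name: str
    target: str
    approved_indication: str


# Hand-entered table; a real workflow would export this from a database.
LOCAL_TABLE = [
    DrugRecord("probenecid", "OAT3", "gout and other hyperuricemic disorders"),
    DrugRecord("sildenafil", "PDE-5", "erectile dysfunction"),
]


def candidates_for(hits: list, table: list) -> dict:
    """Map each validated hit to the known drugs annotated against it."""
    by_target: dict = {}
    for record in table:
        by_target.setdefault(record.target, []).append(record)
    # Hits with no annotated compound map to an empty list.
    return {gene: by_target.get(gene, []) for gene in hits}


# GENE_X is a placeholder for a hit with no known inhibitor.
result = candidates_for(["OAT3", "GENE_X"], LOCAL_TABLE)

if __name__ == "__main__":
    for gene, drugs in result.items():
        names = ", ".join(d.name for d in drugs) or "no annotated compound"
        print(f"{gene}: {names}")
```

Hits that map to an empty list are the ones the text describes as not "druggable" in the classic sense, at least with respect to the table being searched.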

It is worth noting that despite the extensive use of RNAi technology in biological research over the last decade [25], there are a limited number of examples of drugs being repurposed based on RNAi screens. Several factors appear to contribute to this. First, many of the hits identified by RNAi screens are not considered "druggable" in the classic sense. As screeners focus on RNAi collections that silence known drug targets, this hurdle should be overcome. Separately, the workflow that combines RNAi screening with drug repurposing has not yet been widely adopted by industrial or academic-industrial consortia. This may be due to the additional efforts required for RNAi screening (*e.g.*, siRNA transfection, intensive validation studies) as compared to small molecule screening. As the technology evolves to further simplify the workflow (*e.g.*, self-delivering siRNAs), we expect these challenges to be minimized. Other obstacles that are common to nearly all drug development investigations will need to be addressed to bridge the gap between gene target identification and therapeutically approved molecules. These include understanding dosing, delivery routes, bioavailability, and applicability to a range of cell types and/or pathogen strains. Despite these (common) hurdles, we believe the combination of RNAi screening and drug repurposing has the potential to greatly reduce the time and financial burden required to bring a drug to market.

## **CASE STUDY: USING RNAi SCREENING TO IDENTIFY PRO-INFLUENZA HOST FACTORS AND REPURPOSING THE PROTOTYPICAL OAT INHIBITOR, PROBENECID, FOR ANTI-INFLUENZA A THERAPEUTICS**

Influenza represents a global public health challenge due to its extraordinary pandemic potential. In the United States, over a quarter of a million hospitalizations and up to 49,000 fatalities are reported annually because of seasonal influenza infections [26-28]. Several influenza therapeutics are available, including the neuraminidase (NA) inhibitors zanamivir (Relenza®) and oseltamivir (Tamiflu®) and the M2 ion channel inhibitors amantadine and rimantadine [29-31]. These therapeutics are directed toward viral proteins and therefore set in motion a range of complex selective pressures for drug resistance. Unfortunately, despite the fact that drug resistance is increasing in circulating and pandemic influenza virus populations, only a small number of innovative drugs are advancing toward FDA approval [32], emphasizing the need to support programs that adopt new drug discovery approaches with greater potential to identify novel classes of drug targets.

Both focused and genome-wide RNA interference screens have been performed to uncover cellular features required for influenza A virus replication [33-41]. In the course of infection, host factors can either suppress virus replication (antiviral genes, such as factors associated with the host immune response) or be hijacked by the virus to support its replication (proviral genes). Targeting proviral host factors represents a potentially innovative therapeutic approach that is refractory to the growing issue of drug resistance. In contrast to the viral genome, which is prone to a high mutation rate and therefore more capable of responding to selection pressures, normal host cells exhibit a relatively low mutation rate, making them less capable of compensating for the presence of, *e.g.*, a small molecule inhibitor targeting a host gene. As such, when a host-encoded proviral function is targeted, either by RNAi reagents or small molecules, the lost function cannot be recovered by mutagenesis of the viral genome, thereby forcing the pathogen to undergo a much more dramatic evolution to compensate. While it is conceivable that viruses could adapt to become less dependent upon specific host functions (*e.g.*, by shifting their reliance onto a host gene that encodes a redundant function), these more radical changes in the viral lifecycle are considered more difficult to achieve than the single nucleotide changes often observed in viral genes that have circumvented the effects of small molecule therapeutics.

Adopting a therapeutic strategy that targets host genes is not without its own set of challenges. Targeting host-encoded proviral genes may lead to increased cellular toxicity, which would require candidate inhibitors to undergo extensive safety profile studies. Repurposing available inhibitors with known safety profiles should minimize this concern. And while the majority of antiviral therapeutics are still focused on disrupting the function of viral targets, several inhibitors targeting host factors are currently being explored for anti-influenza A therapeutics. A list of current viral and host targets for influenza A virus therapeutics is included in Table **2**.

By performing an RNAi screen that employed the SMARTpool siGENOME drug target library, a recent study identified the organic anion transporter-3 (OAT3) as a pro-influenza A host factor [41]. Although OAT3 was not among the top proviral hits in this screen (z-score of -0.854), its potential as an anti-influenza therapeutic target was evaluated due to the availability of a safe and reliable inhibitor. The classic OAT inhibitor probenecid is generally recommended as a medication for gout and other hyperuricemic disorders [42, 43]. In the study, probenecid was shown to reduce growth of multiple influenza A virus subtypes *in vitro* ([41], also shown in Fig. **3**). Importantly, probenecid also limited influenza A virus infection *in vivo* when provided prophylactically or therapeutically, notably against the recent pandemic influenza A virus strain A/California/04/09, and led to a reduction of influenza A-associated mortality and morbidity in infected mice [41]. Probenecid has also been shown to maintain plasma levels of the active metabolite of oseltamivir, and for this reason it has been suggested that the two drugs be administered in combination [44-46]. The study also demonstrated the utility of oseltamivir-probenecid combinatorial treatments and ascribed the efficacy of this approach to probenecid's anti-influenza property and its role in prolonging plasma oseltamivir levels. Overall, the program demonstrates how i) RNAi screening can identify a new collection of targets for anti-influenza A treatment, and ii) probenecid and other approved medications can be repurposed for treatment of viral diseases.
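The reasoning behind selecting OAT3 can be expressed as a simple triage rule: keep hits that score in the proviral direction and for which an approved inhibitor is known, then rank them by z-score. The sketch below is an illustration under stated assumptions and is not the procedure used in the cited study. Only the OAT3 z-score and its inhibitor come from the text; the other genes, scores, and drug names are hypothetical, and a negative z-score is taken here to indicate reduced infection upon knockdown.

```python
from typing import NamedTuple, Optional


class Hit(NamedTuple):
    gene: str
    z_score: float
    approved_inhibitor: Optional[str]  # None when no approved drug is known


def triage(hits: list, z_cutoff: float = 0.0) -> list:
    """Keep proviral hits that have an approved inhibitor, strongest first."""
    actionable = [
        h for h in hits if h.z_score < z_cutoff and h.approved_inhibitor
    ]
    return sorted(actionable, key=lambda h: h.z_score)


screen_hits = [
    Hit("GENE_A", -2.9, None),            # hypothetical strong hit, no drug
    Hit("GENE_B", -1.7, None),            # hypothetical hit, no drug
    Hit("OAT3", -0.854, "probenecid"),    # values from the text
    Hit("GENE_C", 0.6, "drug_c"),         # hypothetical, wrong direction
]

shortlist = triage(screen_hits)

if __name__ == "__main__":
    for hit in shortlist:
        print(f"{hit.gene}: z = {hit.z_score}, drug = {hit.approved_inhibitor}")
```

Under this rule a modest hit with a well-characterized inhibitor is carried forward ahead of stronger hits that have no known drug, which mirrors the choice described above.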


### **Table 2:** List of viral and host targets for anti-influenza A therapeutics


Influenza virus proteins: M2, proton channel; NA, neuraminidase; PA, polymerase acidic protein; HA, hemagglutinin; NS1, nonstructural protein 1; NP, nucleoprotein. Host proteins: ERK, extracellular signal-regulated kinases; NFκB, nuclear factor kappa-light-chain-enhancer of activated B cells (κB); IκB, inhibitor of κB; IKK, IκB kinase; OAT3, organic anion transporter 3; CRM1, chromosome region maintenance 1; CDC25B, cell division cycle 25 homolog B.


**Figure 3:** The OAT inhibitor, probenecid, limits influenza A virus infection *in vitro* in a dose-dependent fashion. A549 human alveolar epithelial cells were exposed to increasing levels of probenecid for twenty-four hours before infection with influenza A/WSN/33(H1N1) (WSN) or A/New Caledonia/20/99(H1N1) (New Caledonia) virus at MOI = 0.001 or 0.05, respectively. At 24 or 48 hours post-infection, samples were fixed and stained for influenza A virus nucleoprotein (NP; green) and nuclei (DAPI; blue), and images were captured using the Cellomics ArrayScan automated immunofluorescence microscopy system (Thermo Scientific). The number of NP-positive cells was quantified and graphed (bottom). Graphs represent results from six replicate experiments and error bars denote standard error of the mean. \*, p<0.05; \*\*, p<0.005.
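The summary statistics behind a plot of this kind are straightforward to reproduce. The sketch below computes a mean and standard error of the mean for six replicate counts and converts a multiplicity of infection (MOI) into the number of infectious units needed per well. The replicate counts and the cell number per well are hypothetical; only the number of replicates (six) and the MOI values come from the caption.

```python
from math import sqrt
from statistics import mean, stdev


def standard_error(values: list) -> float:
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return stdev(values) / sqrt(len(values))


def infectious_units_needed(moi: float, cells_per_well: int) -> float:
    """MOI is infectious units per cell, so units needed = MOI x cells."""
    return moi * cells_per_well


# Hypothetical NP-positive cell counts from six replicate wells.
replicates = [10, 12, 11, 9, 13, 11]
replicate_mean = mean(replicates)
replicate_sem = standard_error(replicates)

# Hypothetical seeding density; MOI values are those given in the caption.
CELLS_PER_WELL = 20_000
units_wsn = infectious_units_needed(0.001, CELLS_PER_WELL)
units_new_caledonia = infectious_units_needed(0.05, CELLS_PER_WELL)

if __name__ == "__main__":
    print(f"mean = {replicate_mean}, SEM = {replicate_sem:.3f}")
    print(f"WSN: {units_wsn:.0f} infectious units per well")
    print(f"New Caledonia: {units_new_caledonia:.0f} infectious units per well")
```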

## **OTHER APPROACHES TO DRUG REPURPOSING**

Methods other than RNAi screening have been used to identify compounds for drug repurposing. As mentioned above, screens that employ small molecule repurposing libraries have been broadly used to identify compounds that possess the desired biological activities. However, because this approach does not automatically identify the gene(s) being targeted by the molecule, it does not easily lend itself to downstream molecule optimization. Among the repurposed compounds listed in Table **3**, pyrvinium pamoate (tuberculosis and protozoal infections), tiagabine (Huntington's disease), and ceftriaxone (amyotrophic lateral sclerosis) were identified through NINDS library screens, while itraconazole (angiogenesis inhibitor), mycophenolic acid (lupus erythematosus), closantel (onchocerciasis), glafenine (anti-tumor), and digoxin (anticancer) were identified using screens of the JHCCL library (as reviewed in [5]). Separate and distinct from wet-lab screening, researchers have used *in silico* target predictions based on structural bioinformatics and drug target network analyses compiled from genomics, proteomics, metabolomics, and biomarker profiling [47-49] to repurpose drugs toward targets of interest.


### **Table 3:** List of repurposed drugs



AIDS, acquired immune deficiency syndrome; HIV, human immunodeficiency virus; NSAID, nonsteroidal anti-inflammatory drug.

### **SUMMARY**

The high cost, failure rate, and lengthy development time associated with standard drug development processes limit the number of new drugs available for clinical use. The NIH and other organizations have recently aimed to stimulate a more rapid drug development process by encouraging drug repurposing, a strategy that aims to find new uses for existing drugs which 1) are currently approved for treatment of unrelated diseases, or 2) failed to show adequate efficacy for the original application but passed the FDA's initial Phase I clinical trials. Repurposing drugs that have well-established pharmacokinetics and toxicity profiles allows for a more rapid deployment of these molecules against new diseases. This process can potentially reduce overall drug development times by as much as 80% and costs by up to 40%. Thus, the drug repurposing strategy can facilitate rapid development and availability of desperately needed therapeutics, especially for emerging diseases such as pandemic influenza, drug-resistant pathogens, or diseases with limited or no available therapeutics.


There are many approaches for repurposing available drugs, including *in silico* prediction methods, high-throughput small molecule screens, and RNAi screening. An example of an RNAi screen that led to a potentially valuable step in drug discovery is a recent anti-influenza host-pathogen screen, which identified probenecid, a drug currently available for treatment of gout and other hyperuricemic disorders, as a potential antiviral therapeutic. Thus, despite the workflow complexities associated with RNAi screening, this method represents a flexible, target-oriented platform for drug repurposing.

### **ACKNOWLEDGEMENTS**

Declared None.

### **CONFLICT OF INTEREST**

The authors confirm that the contents of this chapter present no conflict of interest.

### **REFERENCES**



## **Author Index**

## *A*

Anderson, Sarah B., 58-78

### *B*

Bakre, Abhijeet, 107-143

Banos, Michael S., 40-57

Barichievy, Samantha, 107-143

Bartholomeusz, Geoffrey, 215-231

Bean, Andrew G., 79-106

Beijersbergen, Roderick L., 58-78

Birmingham, Amanda, 3-20, 40-57

Boutros, Michael, 40-57

Brooks, Paula, 232-246

## *D*

Doran, Timothy J., 79-106

## *F*

Fraser, Iain D.C., 144-177

Freeley, Michael, 144-177

## *H*

Hansen, Michael A. E., 178-214

Heo, Jinyeoung, 178-214

## *I*

Izzard, Leonard H., 79-106

### *J*

Jenkins, Kristie A., 79-106

John, Sinu P., 144-177

Johnston, Sean M., 21-39

### *K*

Karpilow, Jon M., 232-246

Kaufmann, Andreas, 3-20

Kim, HiChul, 178-214

Kim, JinYeop, 178-214

Kim, Namyoul, 178-214

Kozak, Karol, 3-20

Kwon, Yong-Jun, 178-214

### *L*

Lee, TaeKyu, 178-214

Long, Aideen, 144-177

Lowenthal, John W., 79-106

### *O*

Oberste, M. Steven, 232-246

### *P*

Perwitasari, Olivia, 247-265

### *R*

Rao, Arvind, 215-231

## *S*

Schmidt, Esther E., 40-57

Shamu, Caroline E., 21-39, 40-57

Simpson, Kaylene J., 58-78

Smith, Anja, 58-78

Smith, Jennifer A., 21-39, 40-57

Soloveva, Veronica, 178-214

Stambas, John, 79-106

Stewart, Cameron R., 79-106

## *T*

Tizard, Mark L., 79-106

Tompkins, S. Mark, 79-106

Tripp, Ralph A., 232-246, 247-265

### *V*

van der Sanden, Sabine, 232-246

Vermeulen, Annaleen, 58-78

### *W*

Wu, Weilin, 232-246

**Ralph A. Tripp & Jon M. Karpilow (Eds) © 2014 The Author(s). Published by Bentham Science Publishers**


## **Subject Index**

### *0-9*

3D cell culture 217-20, 224-6, 228-9

## *A*

Assay development, Pooled shRNA screen 63-7

Assay optimization, 3D cell culture 221-2

Assay, Hypoxia 224-5

Assays, High-content 153-4

Assays, Microarray-based 201-3

Assays, Migration 155, 162-3

Assays, Reporter 149-50, 153

Assays, Secreted effector 155

Assays, Viability 153, 222, 223-4

## *B*

B cells, Screens in 160

## *C*

CaM kinase 85-6

Cell line, Selection of 184, 186, 206, 237

Clinical trials, RNAi 93-6, 119, 125

Coatomer proteins 84-5

Controls, Location of 24-6, 49, 196, 201-3

Controls, Selection of 23, 63, 181, 203, 221-2, 238

Core facility setup 29-31

## *D*

Data analysis 13-4, 68-9, 87, 180-1, 184, 203-5, 208, 225-8

Data annotation 44-8, 50-1

Data comparison 43-44

Data curation 47-8

Data repositories 48-53

Data standardization 44-6

Delivery, Accell siRNA 152

Delivery, Therapeutic agent 90

Dendritic cells, Antigen presentation screens in 161-2

Dendritic cells, Host-pathogen screens in 163

Dendritic cells, Migration screens in 162-3

Drug repurposing 247-62

Drug target databases 254

## *E*

Emerging infectious diseases (EIDs) 80

### *H*

Hepatitis C infection, miRNAs involved in 113, 118-9

Herpesvirus infection, miRNAs involved in 127-30

Hit confirmation 11, 208-9, 243-4

Hit identification 67-9

Hit validation 69-71, 228-9, 236, 239-42

HIV infection in T cells 120, 124, 157-9

HIV infection, Host factors involved in 83-5, 119-24, 158-9

HIV infection, miRNAs involved in 121-5

## *I*

Image analysis, 3D culture 228

Image analysis, Microarray 197-201

Influenza infection, Host factors involved in 82-6, 130-1, 256-8

Influenza infection, miRNAs involved in 130-1

Innate immune response 7-8, 110-3, 147

Interferon-inducible trans-membrane proteins (IFITMs) 83

## *L*

Laboratory Automation 21-38

Library Format 22-6

Library management 22-9

Library reannotation 11-12

Library storage 26-9

LIMS (Laboratory Information Management Systems) 48-9

Livestock, Disease resistant 96-100

### *M*

Macrophages, Host-pathogen screens in 165-6

Macrophages, Immune activation of 164

Mast cells, Screens in 167

Microarray printing 181-3, 189-91, 196-7

Microarray, Cell culture on 184-7, 190-5

Microarray, siRNA high-density 181-3, 196-7

miRNA detection 114-5

miRNA inhibitors 115-7

miRNA mimics 115-7

miRNA pathway, Endogenous 108-13

miRNA target identification 117

Monocytes, Immune activation of 164

Mycobacterium tuberculosis (Mtb) 165-6

## *N*

Next Generation Sequencing (NGS) for pooled screens 67-8

NK cells, Screens in 154-5, 166-7


### *O*

Off-target effects, Identifying 11-4

Off-target effects, Prevention of 7-11

Ontology, Data 44-6

## *P*

Phagocytosis 164-5

Plates, Matrix-free nanoculture 220

Polio vaccines 235

Poliovirus 234, 237-8

## *R*

Rab6 GTPase 85

Reverse Transfection 31-3, 148, 185-7, 192-4, 220

RNAi mechanism 4-6, 108-13

Robotics 29-38

RSV infection, miRNAs involved in 126

## *S*

Seed sequence 5-6, 9-10, 12-5

shRNA libraries 59-62

shRNA pool uniformity 62

shRNA silencing efficiency 61, 68

shRNA, pooled screening 63-71

Slides, siRNA on 182-4, 188-92, 196-7

siRNA Electroporation/Nucleofection 38, 149, 150

siRNA transfection 28-33, 36-7, 146-9, 182-8, 192-4, 220-1, 239

Spheroid cell culture 217-20, 224-6, 228-9

## *T*

T cells, HIV infection in 120, 124, 157-9

T cells, Screens in 150, 157-9

Transfection, siRNA 28-33, 36-7, 146-9, 182-8, 192-4, 220-1, 239

Tumor-associated macrophages 166

## *V*

Vaccine production 233-4

Viral host factor screens 82-7

Viral transduction 63-66, 147, 151-2

## *Z*

Zoonotic viruses 80
