Science Behind the Scenes: Multiplexed CRISPR and sgRNA Arrays with the Howard Salis Lab
Over the last few years, thousands of studies have employed CRISPR/Cas systems to edit, or transcriptionally regulate, individual genetic targets. But a new study has taken CRISPR to soaring heights.
CRISPR/Cas is remarkably simple in principle: a protein, usually Cas9, binds to an RNA molecule, such as a single guide RNA (sgRNA), which has a sequence complementary to a target site in the genome. When the Cas9:sgRNA complex binds to its target site, it cleaves the target DNA. Mutating specific amino acids in Cas9 abolishes its DNA cleavage activity, converting the protein into a transcriptional repressor (called dCas9).
Though many research groups have explored methods to increase the number of sgRNAs that can be expressed at once in vivo, it has been a difficult challenge, in part because sgRNAs contain highly repetitive elements. One part of the sgRNA, called the ‘handle’, is a 42-nucleotide strand of RNA that physically associates with Cas9. Unfortunately, most DNA synthesis manufacturers are unable to synthesize constructs that repeat these elements many times, which limits the number of sgRNAs that can be assembled and expressed in living organisms.
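To make the repetition problem concrete, here is a minimal Python sketch that lists every substring of a chosen length occurring more than once in a DNA sequence. The function name `find_repeats`, the length cutoff of 12, and the toy sequences (including the placeholder "handle") are assumptions made for illustration; they are not real Cas9 handle sequences or any synthesis provider's actual screening rules.

```python
from collections import defaultdict


def find_repeats(seq, k):
    """Return {kmer: [start positions]} for every k-mer that occurs more than once."""
    seq = seq.upper()
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    # Keep only the k-mers seen at two or more positions.
    return {kmer: starts for kmer, starts in positions.items() if len(starts) > 1}


# Toy array: two made-up spacers, each followed by the same placeholder handle.
handle = "GTTTAAGAGCTATGCTG"  # hypothetical, NOT a real Cas9 handle
array = "ACGTACGGTCATTGCAAGCT" + handle + "TTGACCGATAGCTAGGCTAA" + handle

repeats = find_repeats(array, k=12)
for kmer, starts in sorted(repeats.items(), key=lambda item: item[1]):
    print(kmer, starts)
```

Reusing one handle in every sgRNA means each added sgRNA contributes another full copy of the same repeats, which is why the repeat count grows with the size of the array.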
In a new study, published in Nature Biotechnology, researchers from Penn State University have devised a method that enables 22 distinct sgRNAs to be expressed at once in bacterial cells. The solution? Design and characterize hundreds of non-repetitive genetic parts, including new sgRNA handles, that maintain their function but can actually be synthesized by DNA manufacturers.
I sat down with Alex Reis and Sean Halper (joint first authors) and Howard Salis (corresponding author and Associate Professor at Penn State University) to learn more about multiplexed CRISPR, how nonrepetitive parts are designed, and their plans for the future.
This interview with Alex Reis, Sean Halper and Professor Howard Salis on “Simultaneous repression of multiple bacterial genes using nonrepetitive extra-long sgRNA arrays”, published in Nature Biotechnology, has been edited for clarity. Words in parentheses are my own.
Niko McCarty: Can you tell me a bit about the inspiration behind this study? What was the impetus that made you look at the CRISPR multiplexing field and say, “I bet we can improve the number of sgRNAs expressed at once in living cells?”
Alex Reis: Well, five years ago, Sean Halper (co-first author) and I took a graduate-level course with Professor Howard Salis, and we were exploring different ideas for scalable genetic circuit design. We kept going back to CRISPR because it is a scalable system; all you need to do to build complex CRISPR-based genetic circuits is express one protein regulator (Cas9), and a whole bunch of single-guide RNA regulators (sgRNAs). That was a very powerful idea to us, and we really wanted to scale that up. So this project was motivated from the application side: the desire to build larger genetic circuits.
Professor Howard Salis: When looking at this long DNA sequence or genetic circuit that we had designed for this class project, we basically saw that there were quite a few long regions of repetitive DNA. And if you were to copy-paste that sequence into an order form for any gene synthesis provider, it would immediately tell you, “We can’t make this. It’s too long, too repetitive.” So we knew that this was going to be a challenge for the CRISPR field as groups try to multiplex sgRNAs. If you redesign the whole system so that there is no more repetitive DNA, you would be able to build it easier, assemble it faster, and you would be able to express a lot more CRISPR regulators simultaneously.
Niko: Can you walk me through the key advancements from the paper, especially the things that you set out to do and what you accomplished?
Alex: After we identified that repetitive DNA was going to be a key bottleneck in cloning sgRNA arrays, we decided that the first step would be to identify and characterize non-repetitive parts for both genetic expression and the sgRNA handles themselves. The first thing we did was to design and characterize non-repetitive promoters and non-repetitive terminators. But a key challenge that we faced was to identify non-repetitive handle sequences for sgRNAs. What sequences will enable handle sequence variants to still bind to the Cas9 or dCas9 protein?
To design these non-repetitive sgRNA handles, we carried out multiple rounds of design, build, and test cycles and imposed specific constraints. In the first design round, the constraint was purely structural – we told our algorithm that the sgRNA had to fold into a structure that could be recognized by Cas9. After that round, we applied a machine learning technique called linear discriminant analysis to identify which mutations would cause handle failure. With that, we identified two nucleotides in the sgRNA handle, G43 and G52, that, when mutated, would abolish handle function. After iterating through these processes a few times, we ultimately characterized 28 highly functional, non-repetitive handle variants. And these handles work equally well for Cas9 and dCas9.
Grace Vezeau, another author on the paper, ran a bunch of cleavage assays to verify, measure and quantify how well these different non-repetitive sgRNA handles were able to load up into Cas9 and cleave DNA.
Niko: After you verified these non-repetitive sgRNA handles, you then used them for three different engineering applications. Can you walk me through those?
Sean Halper: We built three different ELSAs (extra-long sgRNA arrays), the longest of which contained 22 distinct sgRNAs. We wanted to come up with some applications that would show the power of scaling up the number of sgRNAs using nonrepetitive handles in E. coli. Our first proof-of-concept was to aerobically produce succinate using a knockdown of six different genes. At first, when we targeted these six genes, it didn’t work. We troubleshot the problem, and found that we had to increase the expression of dCas9, after which we saw a 1000-fold knockdown on some of the genes that we were targeting. This incidentally also showed that, once you start expressing many sgRNAs at once, you need to have enough Cas9 or dCas9 to handle that many simultaneous RNA regulators.
In a second example, we used an ELSA to target different amino acid biosynthesis pathways. We really wanted to see if we could use CRISPRi knockdowns to impose auxotrophy-like behavior. For the third example, we knocked down different stress response genes to explore how a broad spectrum perturbation would affect the behavior and response of E. coli.
Howard: Part of this effort was also to develop algorithms that allow us to design DNA sequences that can be readily synthesized by commercial service providers. Some of these ELSAs have over 20 promoters, 20 terminators, and so forth. Terminators can form hairpins and may contain palindromic sequences, however, so if you ask a gene synthesis provider to synthesize any old DNA with lots and lots of hairpins, they’re going to balk at you. But if you design the system correctly, if you draw from a large enough pool or toolbox of genetic parts, and you arrange those genetic parts just right, you can meet your target metrics for what can be synthesized. As long as your DNA sequence is within those target metrics, then these companies can actually deliver the DNA fragments to you. By the end of this project, we were able to synthesize 33 DNA fragments up to 3 kilobases each, all containing ELSAs, with about a 90% success rate and a turnaround time of about five days.
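As an illustration of the hairpin problem Howard describes, the following Python sketch scans a sequence for short inverted repeats, meaning a stretch of bases followed a few bases later by its own reverse complement, which is the pattern that lets a strand fold back into a stem-loop. The function names, the stem length of 6, the loop range of 3 to 10 bases, and the toy sequence are all assumptions chosen for the example. It is not the Salis lab's design algorithm, and real synthesis providers apply their own, more detailed criteria.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_hairpin_stems(seq, stem=6, min_loop=3, max_loop=10):
    """Return (left_start, right_start) pairs where a stem-length segment is
    followed, after a loop of min_loop..max_loop bases, by its reverse complement."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - stem + 1):
        target = reverse_complement(seq[i:i + stem])
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop
            # A slice running past the end is shorter than `stem`, so it never matches.
            if seq[j:j + stem] == target:
                hits.append((i, j))
    return hits


# Toy terminator-like sequence: a 6-base stem, a 4-base loop, then the matching stem.
toy = "AT" + "GCCCGC" + "TTCG" + "GCGGGC" + "TTTT"
print(find_hairpin_stems(toy))
```

A design tool could count such hits across a whole array and reject or rearrange parts until the count falls under whatever threshold a given provider accepts.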
Niko: Do you have any plans for designing non-repetitive ribozymes or cleavage sites, which may enable you to express many sgRNAs from a single promoter?
Howard: Let me just start off by saying that we started this project four or five years ago, and we have made some important advancements since then. Another graduate student in our group, Ayaan Hossain, developed an algorithm called the “Non-Repetitive Parts Calculator”, which formalizes how you can go about designing very large toolboxes of non-repetitive genetic parts. With this algorithm, we’ve been able to design, construct and characterize huge toolboxes of non-repetitive parts, including 4300 non-repetitive E. coli promoters, 1917 non-repetitive yeast promoters, at least 600 non-repetitive ribozymes with near wildtype cleavage activities, and about 2000 non-repetitive Cas9 handles.
So, is it possible to design many more non-repetitive parts? It is absolutely possible. We know that for sure. Theoretically, there are about 100,000 non-repetitive sgRNA handles out there for Cas9. We clearly haven’t characterized 100,000 yet, we’ve only characterized 2000, but that kind of gives you an order of magnitude for the possibilities. Now, it should be possible to arrange all these genetic parts in an array and build ELSAs that are about 500,000 bases long, which is smaller than many yeast chromosomes that labs have already built. So it’s possible to build these very long sgRNA arrays, and there are many applications for them across industrial metabolic engineering and in the biomedical space.
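The idea of a toolbox of mutually non-repetitive parts can be sketched in a few lines of Python. The example below uses a simple greedy filter: it walks through candidate parts in order and keeps a part only if it shares no substring longer than a chosen limit with any part already kept, and contains no such repeat within itself. This greedy approach, the function name `select_nonrepetitive`, the `max_repeat` parameter, and the toy sequences are assumptions for illustration only. It should not be read as the method used by the Non-Repetitive Parts Calculator, and it ignores reverse-complement matches.

```python
def select_nonrepetitive(candidates, max_repeat):
    """Greedily keep parts so that no substring longer than max_repeat is shared
    between two kept parts or repeated inside a single kept part."""
    k = max_repeat + 1  # any shared substring of length k breaks the limit
    used_kmers = set()
    kept = []
    for part in candidates:
        seq = part.upper()
        kmer_list = [seq[i:i + k] for i in range(len(seq) - k + 1)]
        kmer_set = set(kmer_list)
        has_internal_repeat = len(kmer_set) != len(kmer_list)
        if not has_internal_repeat and kmer_set.isdisjoint(used_kmers):
            kept.append(part)
            used_kmers |= kmer_set
    return kept


# Toy candidates: the second shares the 6-mer "ACGTAC" with the first.
candidates = ["ACGTACGGTC", "TTACGTACAA", "GGATCCTTAG"]
print(select_nonrepetitive(candidates, max_repeat=5))
```

Because the filter is greedy, the result depends on candidate order, so a different ordering may keep a larger set. It is meant only to show what "non-repetitive" means as a constraint on a parts library.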
Niko: And what about the different authors on the paper? Were there specific skillsets brought by individuals?
Alex: Sean, myself and Phillip Clauer, a former undergraduate, did the bulk of the cloning and characterization of the parts, but most of the lab pitched in and helped out. Daniel Cetnar helped with RNA level characterization, including a lot of the early RT-qPCR on the CRISPRi knockdowns.
Sean: Ayaan Hossain was really helpful in terms of helping us expand our non-repetitive part libraries for the promoters and terminators especially, as well as helping with some of the machine learning analysis. But it was definitely a collaborative effort over the last five years.
Niko: What are your plans for after graduation?
Sean: I actually defended my PhD just a couple of weeks ago. I’m part of the SMART Scholarship for Service program, which is a fellowship with the Department of Defense. Once I wrap up here, I plan on beginning work soon with my sponsoring facility, the Army Research Lab in Adelphi, Maryland.
Alex: I’m wrapping up a project or two and then will hopefully graduate and move on to the next thing. I love synthetic biology, so I am looking at postdocs along those lines. I’m also thinking about some entrepreneurial aspects that I could pursue.
Niko: This study is so appealing to me, in part, because of its collaborative nature. It seems like most people in the group helped out – can you tell me a bit about that?
Howard: Well, we have a very relaxed environment. While some people in the synthetic biology field have groups with 30-40 people, our group has fewer than 10. This means that everyone knows everyone else, and we all help each other. I intentionally set up my lab so that new people come in, and they receive training not just from myself, but from other graduate students and postdocs. Because of this, many students feel the obligation to pay it forward and help out other people. If you’re really good at something, and you can carry out some set of experiments quickly, then you should help out your colleagues in the lab. In our group, a lot of sharing goes on, and that’s what makes work like this possible.
Howard Salis is an Associate Professor of Biological and Chemical Engineering and Synthetic Biology at Penn State University. Research in the Salis laboratory focuses on the development of rational design methods for engineering synthetic biological systems – metabolic pathways, genetic circuits, and genomes.
Sean Halper is a graduate student at Penn State University, and co-first author on this study. He recently defended his PhD in Chemical Engineering.
Alex Reis is a graduate student at Penn State University, and co-first author on this study.