Sanger seq analysis – finding correct clones

We do a lot of molecular cloning in the lab. Sarah has been working on a great protocols.io page dedicated to writing up the entire process, which I’ll link to once it’s completely set. But once you’ve successfully extracted DNA from individual transformants, a key step in the molecular cloning pipeline is finding the clonal DNA prep that has the intended recombinant DNA in it. We use Sanger sequencing to identify those clones. That is what this instructional post will be about.

First, gather all of your data. I have a plasmid map folder on the lab google drive, where all of the physical .gb files for each unique construct are stored. Create a new folder (named after your .gb file), and *COPY* all of the relevant Sanger sequencing traces (these are .ab1 files) into that folder. That will help keep everything organized. *PREFIX* the Sanger sequencing files with the clone id (usually “A”, “B”, or “C”, etc). This will make things easier to interpret down the road.
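If you have a lot of traces, the copy-and-prefix step can be scripted. Here is a minimal Python sketch of that idea, not an established lab tool: the function name `copy_traces`, the example path, and the way traces are mapped to clone IDs are all assumptions made for illustration.

```python
import shutil
from pathlib import Path


def copy_traces(gb_file, traces_by_clone):
    """Copy .ab1 traces into a folder named after the .gb file, prefixed by clone ID.

    gb_file: path to the plasmid map (hypothetical example: "Plasmids/G123.gb").
    traces_by_clone: dict mapping a clone ID ("A", "B", ...) to a list of .ab1 paths.
    Returns the list of newly created file paths.
    """
    gb_path = Path(gb_file)
    # The new folder sits next to the .gb file and shares its name (minus ".gb").
    dest_dir = gb_path.with_suffix("")
    dest_dir.mkdir(parents=True, exist_ok=True)

    copied = []
    for clone_id, traces in sorted(traces_by_clone.items()):
        for trace in traces:
            trace = Path(trace)
            if trace.suffix.lower() != ".ab1":
                continue  # skip anything that is not a Sanger trace
            dest = dest_dir / f"{clone_id}_{trace.name}"
            shutil.copy2(trace, dest)  # copy, never move, so the originals stay put
            copied.append(dest)
    return copied
```

Calling something like `copy_traces("Plasmids/G123.gb", {"A": ["run1.ab1"], "B": ["run2.ab1"]})` would then leave you with `Plasmids/G123/A_run1.ab1` and `Plasmids/G123/B_run2.ab1`, with the original files untouched.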

You presumably already have this plasmid map on Benchling, since that’s likely where you designed the plasmid, so open that up. If you don’t already have it on there, then import it. Once open, click on the “alignment” button on the right hand side.

If this is your first time trying to align things to this plasmid, then your only option will be to “Create new alignment”; press this button.

Next you’ll get to another screen which will allow you to add in your ab1 files. Click the choose files button, go into the new folder you made in the “Plasmid” directory of the lab google drive, and import the ab1 files.

If you’ve successfully added the ab1 files, then the screen should look like this:

The default settings are fine for most things, so you can go ahead and hit the “create alignment” button. It will take a few seconds, but at the end you should get a new screen that looks like this.

The above screen is showing you your template at the top, as well as the Sanger seq peaks for each of your Sanger runs. If you’re trying to screen miniprep clones to see which prep might have the right construct, then go to the part of the map / alignment that spans the cloning junction. For example, in the construct above, I had shuttled in iRFP670 in place of mCherry in the parental construct, so anything that now has iRFP670 in place of mCherry is an intended construct. Looks like I went 3/3 this time.

This is a good chance to look for any discrepancies, which are signified by red tick marks in the bottom visualization. I’ve now moved the zoomed portion to that area, and as you can see, the top two clones seem to have an extra G in the sequence triggering these red lines. This is where a bit of experience / intuition is important. Since this is toward the end of the sequencing reaction and the peaks are getting really broad (compare to the crisp peaks in the prior image), the peak-calling program was having a hard time with this stretch of multiple Gs, thus calling it 4 Gs instead of 3. I’m not concerned about this at all, since it’s more of an artifact of the sequencing rather than a legit mutation in the construct. If we were to sequence this area again with a primer that was closer, I’m almost sure all of the constructs will show they only have 3 Gs.
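That intuition can be written down as a rough rule of thumb. The Python sketch below is only a heuristic under my own assumptions (the function names, the run length of 3, and the "last quarter of the read" cutoff are all made up for illustration). It flags a discrepancy as a probable sequencing artifact when it sits in a run of identical bases and falls late in the read. It does not replace looking at the peaks yourself.

```python
def homopolymer_run_length(seq, pos):
    """Length of the run of identical bases in seq that contains index pos."""
    base = seq[pos]
    start = pos
    while start > 0 and seq[start - 1] == base:
        start -= 1
    end = pos
    while end < len(seq) - 1 and seq[end + 1] == base:
        end += 1
    return end - start + 1


def likely_artifact(read, pos, min_run=3, late_fraction=0.75):
    """Heuristic only: True when the discrepancy at index pos is inside a
    homopolymer run of at least min_run bases AND lies in the late part of
    the read, where peaks tend to be broad. Both thresholds are assumptions."""
    in_run = homopolymer_run_length(read, pos) >= min_run
    is_late = pos >= late_fraction * len(read)
    return in_run and is_late
```

A hit from `likely_artifact` is a reason to re-sequence that stretch with a closer primer, not proof that the construct is fine.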

Since all three constructs seem to have the right insertion without any seemingly legit errors (yet), I typically choose one clone and move forward with fully confirming it. All things being equal, I make my life simpler by choosing the clone that is earliest in the alphabet (so A in this case). That said, if all three looked fine but Clone C had by far the highest quality sequencing (say, the Clone A trace only went 400nt while the Clone C trace went 800+), I’d instead go with Clone C.
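For what it’s worth, here is that decision rule as a small Python sketch. The function name `choose_clone` and the "at least twice as long" cutoff are my own assumptions, with the cutoff picked to match the 400nt vs 800+ example above. It assumes you only pass in clones that already look correct at the junction.

```python
def choose_clone(read_lengths, fold_better=2.0):
    """Pick a clone to carry forward.

    read_lengths: dict mapping clone ID -> usable read length in nt,
    restricted to clones that already look correct.
    Default is the clone earliest in the alphabet; switch only when another
    clone's read is at least fold_better times longer (assumed cutoff).
    """
    default = min(read_lengths)  # earliest in the alphabet
    best = max(read_lengths, key=read_lengths.get)  # longest read
    if read_lengths[best] >= fold_better * read_lengths[default]:
        return best
    return default
```

With `{"A": 400, "B": 450, "C": 800}` this returns Clone C, while `{"A": 700, "B": 650, "C": 800}` sticks with Clone A.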

Next is sequencing the rest of the open reading frame. If you already chose a sequencing primer, then you’ve probably already attached a bunch of primers onto this map. But let’s pretend you haven’t already done this. To do this, go to the “primer” button and hit “attach new primers”, as below.

You may already have a primer folder selected. In that case, you’re all set to go. Otherwise, you’ll have to select the most recent primer folder. Click the “Add locations” area, open the triangle for the folder that says “O_Primers” and select the most recent folder (since this is currently August, this would be “20200811”). If selected, it should show up in the window, like the below picture.

Hit “Find binding sites” and it will come up with a long list of primers *that we already have* that are located in your construct.
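To give a feel for what a binding-site search is doing, here is a simplified Python sketch. I don’t know how Benchling implements this, so treat it purely as an illustration: it only finds exact matches on a linear template, checks both strands, and the function names and the primer names in the usage note are hypothetical.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence containing only A, C, G, T."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[b] for b in reversed(seq.upper()))


def find_binding_sites(template, primers):
    """Return (name, strand, start) for every exact primer match.

    template: top-strand sequence, treated as linear (no wrap-around).
    primers: dict mapping primer name -> primer sequence written 5' to 3'.
    start is the 0-based top-strand index of the leftmost matched base.
    """
    template = template.upper()
    hits = []
    for name, primer in primers.items():
        primer = primer.upper()
        # "+" primers read along the top strand; "-" primers anneal to it,
        # so we search for their reverse complement instead.
        for strand, query in (("+", primer), ("-", reverse_complement(primer))):
            start = template.find(query)
            while start != -1:
                hits.append((name, strand, start))
                start = template.find(query, start + 1)
    return hits
```

For example, `find_binding_sites("ATGGTGAGCAAGGGCGAGGAGCTGTTCACC", {"P_fwd": "GTGAGCAAGG", "P_rev": "GGTGAACAGCTC"})` reports `P_fwd` on the "+" strand and `P_rev` on the "-" strand. A real plasmid is circular and real primers tolerate some mismatches, neither of which this sketch handles.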

This part gets kind of confusing. To actually use these primers, you’ll first have to hit the top left box, located on the header row of the big table of primers. You should then see a check mark next to every primer.

Once you do that, hit the “Attach selected primers” button that’s in green at the top right of the screen.

Once you do that, all of the primers that were previously listed but colored white should now be colored green. NOW you’re free to switch back to your alignment.

Once you switch back to your alignment, the top should have a big yellow box that says “Out of Sync” and have a blue button that says “Realign”. Hit the realign button, which will call up one of your earlier screens, and you just have to hit realign once more.

Now all of the attached primers should show up in the top row of the alignment window. Since I have so many overlapping primers, this eats up a bunch of the space on the screen (you can turn off the primers if needed by clicking on the arrow next to “Template” and clicking off the box next to Primers). Still, now you can move around the map and find primers that will be able to sequence different parts of the plasmid you have not sequenced yet. In this case, I’ll likely just move forward with one clone (such as Clone A), and I’ll start sequencing other parts of the ORF, such as the remaining parts of mCherry which I can likely get with the primer KAM1042.

Congrats, you’re now a Sanger sequencing analysis master.

Flowjo Analysis of GFP positive cells

We do a lot of flow cytometry in the lab. Inevitably, what ends up being the most practical tool for analysis of flow cytometry data is FlowJo. While I’ve been using FlowJo for a long time, I realize it isn’t super intuitive and people new to the lab may struggle with it at first. Thus, here’s a short set of instructions for using it to do a basic process, such as determining what percentage of live cells are also GFP positive.

Obviously, if you don’t have FlowJo yet, then download it from the website. Next, log into FlowJo Portal. I’m obviously not going to share my login and password here; ask someone in the lab or consult the lab google docs.

Once logged in, you’ll be starting with a blank analysis workspace, as below.

Before you start dragging in samples, I find it useful to make a group for the specific set of samples you may want to analyze. Thus, I hit the “Create Group” button and type in the name of the group I’ll be analyzing.

Now that the group is made, I select it, and then drag the new sample files into it, like below:

Now to actually start analyzing the flow data. Start by choosing a representative sample (e.g. the first sample), and double clicking on it. By default, a scatterplot should show up. Set it so forward scatter (FSC-A) is on the X-axis, and side scatter (SSC-A) is on the Y-axis. Since we’re mostly using HEK cells, that means the main thing we will be doing in this screen is gating for the population of cells while excluding debris (small FSC-A but high SSC-A). Thus, make a gate like this:

Once you have made that gate, you’ll want to keep it constant between samples. Thus, right click on the “Live” population in the workspace and hit “Copy to Group”. Once you do that, the population should now be in bold, with the same text color as the group name.

Next is doublet gating. So the live cell population will already be enriched for singlets, but having a second “doublet gating” step will make it that much more pure. Here is the best description of doublet gating I’ve seen to date. To do this, make a scatterplot where FSC-A is on the X-axis, and FSC-H is on the Y-axis. Then only gate the cells directly on the diagonal, thus excluding those that have more FSC-A relative to FSC-H. Name these “Singlets”.

And like before, copy this to the group.
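If it helps to think about the singlet gate numerically, here is a rough Python sketch of the same idea. In FlowJo you draw the gate by hand, so this is only an approximation under my own assumptions: the function name `singlet_gate`, the 15% tolerance, and the use of the median FSC-H / FSC-A ratio as a stand-in for the diagonal are all illustrative, and every event is assumed to have FSC-A greater than zero.

```python
from statistics import median


def singlet_gate(events, tolerance=0.15):
    """Keep events that sit near the FSC-H vs FSC-A diagonal.

    events: list of dicts with "FSC-A" and "FSC-H" keys (FSC-A must be > 0).
    The diagonal is estimated as the median FSC-H / FSC-A ratio. Doublets
    have more area relative to height, so their ratio falls below it; events
    more than `tolerance` below the median ratio are dropped.
    """
    ratios = [e["FSC-H"] / e["FSC-A"] for e in events]
    diagonal = median(ratios)
    cutoff = diagonal * (1 - tolerance)
    return [e for e, r in zip(events, ratios) if r >= cutoff]
```

Note this only trims events below the diagonal, which is where doublets show up. A hand-drawn gate would also clip the occasional stray event above it.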

Next is actually setting up the analysis for the response variable we were looking to measure. In this case, it’s GFP positivity, captured by the BL1-A detector. While this can be done in histogram format, I generally also do this with a scatterplot, since it allows me to see even small numbers of events (which would be smashed against the bottom of the plot if it were a smoothed histogram). Of course, a scatterplot needs a second axis, so I just used mCherry fluorescence (or the lack of it, since these were just normal 293T cells), captured by the YL2-A detector.

And of course copy that to the group as well (you should know how to do this by now). Lastly, the easiest way to output this data is to hit the Table Editor button near the top of the screen to open up a new window. Once in this window, select the populations / statistics you want to include from the main workspace, and drag them into the table editor, so you have something that looks like this.

Some of those statistics aren’t what we’re looking for. For example, I find it much more informative to have the singlets show total count, rather than Freq of parent. To do this, double click on that row, and select the statistic you want to include.

And you should now have something that looks like this:

With the settings fixed, you can hit the “Create Table” button at the top of the main workspace. This will make a new window, holding the table you wanted. To actually use this data elsewhere (such as with R), export it into a csv format which can be easily imported by other programs.

FYI, if you followed everything exactly up to here, you should only have 2 data columns and not 3. I had simplified some things, but forgot to update this last image so it’s now no longer 100% right (though the general idea is still correct).
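Once you have the csv, reading it into another program is straightforward. Here is a minimal Python sketch using only the standard library. The column headers and the numbers in the usage note are invented for illustration, so check the header row of your own export and pass in whatever names it actually uses.

```python
import csv
import io


def parse_table(csv_text, sample_col, count_col, freq_col):
    """Parse an exported table into {sample name: (singlet count, % GFP positive)}.

    csv_text: the full contents of the exported csv file as a string.
    sample_col, count_col, freq_col: header names as they appear in YOUR
    export (the ones used in the usage note are hypothetical).
    """
    results = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        results[row[sample_col]] = (int(row[count_col]), float(row[freq_col]))
    return results
```

With a made-up two-data-column export such as `"Sample,Singlets Count,GFP Freq\nwell_A1.fcs,9500,12.5\n"`, calling `parse_table(text, "Sample", "Singlets Count", "GFP Freq")` gives one entry per sample. If your export carries extra summary rows, drop them before converting to numbers.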

Congratulations. You are now a FlowJo master.