PhD Student Rotations

It’s PhD student rotation season again at CWRU, so I figured I may as well put this post on the lab website to 1) inform any prospective PhD students who may be perusing the lab website, and 2) remind myself of the things I like to bring up before people rotate.

  1. If you’re interested in rotating, we should definitely schedule a meeting so I can get a sense of your background and interests and tailor the rotation appropriately (and screen out people who are likely to be really poor fits; see point 3 below). It will also give me the opportunity to talk through some of the other points listed below.
  2. Rotations are suuuuper short here (generally 4 to 6 weeks). Thus, there is ZERO expectation on my end to get any “publication quality” experiments done. My main goal is to make sure you’re familiar with some of the bread-and-butter methods in the lab (eg. molecular cloning, landing-pad-centric tissue culture, script-based data analysis). Failed experiments are fine, since they give us the opportunity to talk about the data and troubleshoot together. The main thing I’ll be looking for is how well we’re able to communicate and work together, since that’s arguably the most important thing we can learn from a rotation, and the best predictor of how good a dissertation work environment the lab would be for that specific individual.
  3. There isn’t really any prerequisite experience for rotation students. Yes, it would be helpful if you know how to pipet, have done some basic tissue culture work of any kind, and have designed and interpreted some experiments before. Being housed in a wet-lab department, I have very little expectation of computational experience. That said, wet-lab people who have zero interest in learning computational biology and data analysis are probably not great fits, since all projects in the lab will always have hefty data analysis components. Conversely, computation-only people with zero interest (and maybe zero experience) in wet-lab research are also likely bad fits, since all projects in the lab will also always have hefty wet-lab components.
  4. The lab is pretty interdisciplinary. Like, some people work on virology, while other people work on proteins related to clinical genetics. Thus, you’ll have to be generally interested in science / biology to enjoy your time here. In contrast, if you only care about subject XXXX or subject YYYY and nothing else, then lab meetings are going to be really boring to you. There’s always talk about (practical) statistics, molecular biology, cell engineering, assay development, and high throughput sequencing; thus, if you’re into those things at some level, then you’re probably fine!
  5. There are three very different options in terms of dissertation projects. There are some “ready-to-go” project ideas, where I’ve already crafted a grant application very clearly explaining the project scope. There are also some projects where I’ve played around a bit with some ideas / preliminary data, but it’s not really clearly written out anywhere and things will need to be hashed out. Both of these types of projects should be listed in this “Research Directions” network graph. Then again, there are probably some really great projects that I haven’t thought of yet, that A) are in line with the student’s interests, and B) can be tackled with the techniques / perspectives that the lab is good at. If it’s a decent idea that has links between cell culture assays, cell engineering, genetics, proteins, cell biology, and pathological consequences, I’m sure I’ll find it interesting and get on board. This last option carries the highest potential risk, but also the highest possible reward for the student (at least from a training-for-independent-thinking perspective).
  6. Rotation projects don’t have to be on the same topic as potential thesis projects. In my opinion, it’s oftentimes best to separate them, since potential thesis projects likely don’t have any DNA constructs made for them already, so working on one means only doing (likely failed) cloning during the rotation, which is no fun and not particularly informative.
  7. I’ll only ever take one student any given year. So while it’s not a competition, some people who may want to join may not be able to. Something to keep in mind!

Ordering oligos at CWRU

Here’s a price comparison I did back in 2019 (presumably still correct?). But in short, per nt price was cheapest through ThermoFisher.

Thus, we’ve been almost exclusively buying oligos from them, with $7,220 spent (as of June 2022) since our first orders starting December 2019.

Here’s how the histogram of oligo costs has shaped up.

But, well, don’t order degenerate-nucleotide oligos from them, as they’ll likely be T-biased.

If anyone sees anything better on campus, let me know!

Consistent Plasmidsaurus sequencing miscalls

As I noted in this Twitter exchange, plasmid nanopore sequencing via Plasmidsaurus is great, but not perfect. For example, there seem to be some “Achilles heel” sequences, where nanopore reproducibly (like 100% of the time with different plasmid submissions) miscalls certain parts of our plasmids. How do we know they’re miscalls? Because the Sanger traces of the same exact plasmids show the expected sequence very clearly. Here are two that we commonly see:

A single deleted C nucleotide in the beginning of our IRES sequence:

A phantom T>C base miscall that incorrectly tells us we have a W566R nonsynonymous change in every single one of our human ACE2 constructs.

Both are related to C repeats, but there are plenty of other C repeats in the plasmids we submit and it’s ALWAYS these sequences that give Plasmidsaurus problems. Once I figured this out, it’s really NBD, since I know to ignore these changes, although it did inform our current molecular biology workflow in the lab of 1) Screen colony minipreps via Sanger -> 2) Sequence candidate good constructs with Plasmidsaurus / nanopore -> 3) Sanger to resolve unexpected discrepancies between the expected / intended and Plasmidsaurus sequences.
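The triage in step 3 could be sketched roughly as follows. This is only an illustration, not the lab’s actual tooling: the `Discrepancy` record, the `KNOWN_MISCALLS` set, and the labels used for each change are hypothetical names, and the two entries simply restate the recurring miscalls described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Discrepancy:
    """One difference between the expected and nanopore sequence (hypothetical record)."""
    feature: str  # e.g. "IRES" or "ACE2"
    change: str   # e.g. "delC" or "W566R" (labels are illustrative)


# Recurring nanopore miscalls that Sanger has already ruled out (from the text above).
KNOWN_MISCALLS = {
    Discrepancy("IRES", "delC"),
    Discrepancy("ACE2", "W566R"),
}


def triage(discrepancies):
    """Split discrepancies into ones to ignore and ones to resolve by Sanger."""
    ignore = [d for d in discrepancies if d in KNOWN_MISCALLS]
    resolve = [d for d in discrepancies if d not in KNOWN_MISCALLS]
    return ignore, resolve
```

Anything that lands in the second list is what would go back for a Sanger read.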

Command line BLAST

One of the pseudo-projects in the lab requires looking for a particular peptide motif in genomic data. While small scale searches can be done using the web interface, the idea is to do this in a pretty comprehensive / high throughput manner, so shifting to the command line makes sense for this work. I last did this back in 2018 for some preliminary studies, so I’m going to have to re-install the software on my new computer and re-run some of those analyses. I figure I’ll write down my notes as I re-do this, so that I (and others) can use this post as a reference.

Installing BLAST+

The instructions on how to download the program can be found here. I’m on a mac, so I downloaded “ncbi-blast-2.13.0+.dmg” and double clicked and ran the package installer.

Assuming it’s been correctly installed, writing the command …

blastp -task blastp-short -query <(echo -e ">Name\nAAWLIEKGVASAEE") -db nr -remote -outfmt 1

… into the terminal should actually reveal some BLAST-specific output, rather than throw an error.
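For a quicker check that does not wait on a remote search, something like the sketch below may be handy. It only looks for the program on the PATH and asks it for its version; the function name `blast_version` is made up for this example, and it assumes the installer put `blastp` somewhere on the PATH.

```python
import shutil
import subprocess


def blast_version(executable="blastp"):
    """Return the version text printed by a BLAST+ program, or None if it is not on PATH."""
    path = shutil.which(executable)
    if path is None:
        return None
    # BLAST+ programs print their version and exit when given -version.
    result = subprocess.run([path, "-version"], capture_output=True, text=True)
    return result.stdout.strip()


if __name__ == "__main__":
    print(blast_version() or "blastp not found on PATH")
```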

Running protein motif-specific blast searches

Type the following into your terminal:

psiblast -phi_pattern PHI-Blast_2A_pattern.txt -db nr -remote -query <(echo -e ">Name\nGATNFSLLKQAGDVEENPGP") -max_hsps 1 -max_target_seqs 10000 -out phi_blast_output.csv -outfmt 10

Note: The above command will require having a text file specifying the pattern constraint (“PHI-Blast_2A_pattern.txt” above), which can be found here. This should yield a 25 KB csv output file, like so.
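The `<(echo -e ...)` part of the command is process substitution, which works in bash and zsh but not in every shell. If the search ever gets wrapped in a script, one option is to write the query to a temporary FASTA file first. The sketch below does that with the same flags as the command above; the function names are hypothetical, and it assumes `psiblast` is on the PATH.

```python
import subprocess
import tempfile


def build_phi_blast_cmd(query_fasta, pattern_file, out_csv):
    """Assemble the PHI-BLAST command from the post as an argument list."""
    return [
        "psiblast",
        "-phi_pattern", pattern_file,
        "-db", "nr", "-remote",
        "-query", query_fasta,
        "-max_hsps", "1",
        "-max_target_seqs", "10000",
        "-out", out_csv,
        "-outfmt", "10",
    ]


def run_phi_blast(name, sequence, pattern_file, out_csv):
    """Write the query to a temporary FASTA file, then run psiblast on it."""
    # delete=False keeps the file around after the with-block so psiblast can read it.
    with tempfile.NamedTemporaryFile("w", suffix=".fasta", delete=False) as handle:
        handle.write(f">{name}\n{sequence}\n")
    subprocess.run(build_phi_blast_cmd(handle.name, pattern_file, out_csv), check=True)
```

For example, `run_phi_blast("Name", "GATNFSLLKQAGDVEENPGP", "PHI-Blast_2A_pattern.txt", "phi_blast_output.csv")` would mirror the command above.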

Extracting just the accession numbers

I don’t remember if there are other BLAST+ outputs that give you the full hit sequence. If so, the method I ended up taking back in 2018 would seem to be unnecessarily roundabout. But, until I figure that out, I’ll follow the old method. As you can see in the aforementioned output format, it doesn’t output the hit protein sequence, and instead just gives the accession number. Thus, the next step is using the accession number to actually figure out the protein sequence. To do this, we’ll use Entrez Direct. To install Entrez Direct, follow the instructions here. Briefly, type the following into the terminal:

sh -c "$(curl -fsSL

In order to complete the configuration process, execute the following:

echo "source ~/.bash_profile" >> $HOME/.bashrc
echo "export PATH=\${PATH}:\${HOME}/edirect" >> $HOME/.bash_profile

OK, now that it’s installed, here’s how I’ve used it:

First, the output file above has more info than the accession number. To have it pare down to only the accession number, I used this script, which can be run by entering the following into the terminal, assuming you have the previous output csv file somewhere in the directory with the script (can even be in other folders within that directory):


This will create a file called “3A_prot_accession_list_complete.txt” (example output file here) which will be the unique-ified list of accession numbers to give to Entrez Direct. (Uniquifying is important if you have multiple .csv outputs you wanted to compile into a single master list).
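For anyone without the script on hand, the paring-down step could look something like the sketch below. It is not the script linked above, just a guess at the same idea. It assumes the csv files use the default `-outfmt 10` column order, where the subject (hit) ID is the second column, and the function names are made up for this example.

```python
import csv
from pathlib import Path


def collect_accessions(top_dir, subject_column=1):
    """Return sorted unique subject IDs from every .csv file under top_dir."""
    accessions = set()
    # rglob also walks subfolders, so outputs can live anywhere below top_dir.
    for csv_path in Path(top_dir).rglob("*.csv"):
        with open(csv_path, newline="") as handle:
            for row in csv.reader(handle):
                if len(row) > subject_column and row[subject_column].strip():
                    accessions.add(row[subject_column].strip())
    return sorted(accessions)


def write_accessions(accessions, out_path):
    """Write one accession per line."""
    Path(out_path).write_text("".join(f"{acc}\n" for acc in accessions))
```

Calling `write_accessions(collect_accessions("."), "3A_prot_accession_list_complete.txt")` from the top-level directory would produce a list with the same file name as above.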

This can be fed into Entrez Direct using this shell script, which you can run by typing in:


You should now have an output file called “4A_prot_fasta.txt” with the resulting protein sequences in fasta format, like so.
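As a rough stand-in for that shell script (not a copy of it), the fetch step could be sketched like this. It calls Entrez Direct’s `efetch` on the protein database in batches; the batch size of 100 and the function names are assumptions for this example, and it needs `efetch` on the PATH.

```python
import subprocess


def batches(items, size):
    """Yield consecutive chunks of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fetch_fasta(accession_file, out_file, batch_size=100):
    """Fetch protein FASTA records for each accession via Entrez Direct's efetch."""
    with open(accession_file) as handle:
        accessions = [line.strip() for line in handle if line.strip()]
    with open(out_file, "w") as out:
        for chunk in batches(accessions, batch_size):
            # efetch accepts a comma-separated list of IDs.
            subprocess.run(
                ["efetch", "-db", "protein", "-id", ",".join(chunk), "-format", "fasta"],
                stdout=out,
                check=True,
            )
```

Batching keeps each request to a manageable number of IDs, which seems prudent when the accession list runs into the thousands.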

Now you can search for your desired sequence (in its full protein context) within the resulting file.
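One way to do that search is sketched below. The regular expression in the usage note is a hypothetical stand-in chosen to match the end of the query peptide used above; it is not the pattern from the PHI-BLAST pattern file, and the function names are made up for this example.

```python
import re


def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif line:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def find_motif(path, pattern):
    """Yield (header, start, matched_text) for every regex match in each sequence."""
    regex = re.compile(pattern)
    for header, sequence in read_fasta(path):
        for match in regex.finditer(sequence):
            # match.start() is a 0-based position within the full protein sequence.
            yield header, match.start(), match.group()
```

For instance, `list(find_motif("4A_prot_fasta.txt", "D.E.NPGP"))` would list every protein in the file containing that illustrative pattern, along with where in the protein it sits.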

To be continued…

Are there other steps in this process related to this project? Sure. Like what do you do with all of these full sequences containing the hits? Well, that’s beyond the scope of this post.

ODs on the spec and nanodrop

So there are two ways to measure bacterial culture ODs in the lab. The first is to use the nearby ~ $10,000 ThermoFisher Nanodrop One (no cuvette option). The second option is to use a relatively cheaply made cuvette-based spectrophotometer I bought off of Amazon for ~ $100. To make it clear, this comparison is not a statement about the value of a Nanodrop (though I will say that having an instrument like a Nanodrop is essentially a must in a mol biol lab). The question is more practical: if the Nanodrop is already being used by someone, and waiting would get in the way of some bacterial speccing timepoints, can a $100 piece of equipment relieve that conflict? That seems especially plausible for bacterial cultures, where volume isn’t really an issue and the measurement is simply the reading at 600 nm, not even requiring some algebra to make a conversion to more practical units (like ng/uL for DNA).

So to do this comparison, over a number of independent instances, I took the same bacterial culture and put 1 mL into a cuvette and ran it on the old spec, and took 2 uL and put it on the Nanodrop pedestal and measured there. I made a table of the results, and graphed it in the plot below.

So the readings on the two instruments certainly correlate (that’s good), although it’s not an exact 1:1 relationship. In fact, the Nanodrop gave numbers roughly 1.5 times higher than the spec. But if the two instruments give two different readings, then the question becomes “which is right?”
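To put a number on that relationship from the paired readings, a simple slope estimate is probably enough. The sketch below fits a line forced through zero (an assumption, on the idea that a blank should read 0 on both instruments) and uses the rough 1.5x factor from above as a default; the function names are made up for this example.

```python
def slope_through_origin(spec_ods, nanodrop_ods):
    """Least-squares slope k for nanodrop ~ k * spec, with the line forced through zero."""
    numerator = sum(s * n for s, n in zip(spec_ods, nanodrop_ods))
    denominator = sum(s * s for s in spec_ods)
    return numerator / denominator


def nanodrop_to_spec(nanodrop_od, slope=1.5):
    """Convert a Nanodrop reading to the approximate cuvette-spec equivalent."""
    return nanodrop_od / slope
```

With a slope of 1.5, a Nanodrop reading of 0.6 would correspond to roughly 0.4 on the cuvette spec.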

And to that, I essentially say there is no right answer. Each is a proxy for bacterial cell density (ie. billions of bacteria / mL), but there’s no “absolute” information encoded in the OD number that tells us that specifically for our bacteria, and we’d still have to come up with a conversion factor either way (ie. my doing limiting dilutions of specc’d cultures and counting colonies), and once we have that, both will be right within that context. Sure, it would be nice if we had a method that was the most in line with whatever ODs are described by various papers in the literature, but who knows what they used (recent papers may be using ODs from the Nanodrop [with some perhaps using the cuvette option but many others not], while the older publications certainly didn’t have one and instead likely used some old-school form of spec). But even that’s going to be heterogeneous, and will only give limited information anyway.
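The conversion factor mentioned above comes down to some simple arithmetic once colonies have been counted. Here is a minimal sketch, with function names made up for this example and no real lab data behind it:

```python
def cfu_per_ml(colonies, dilution_factor, volume_plated_ml):
    """Colony-forming units per mL of the original culture.

    dilution_factor is the total fold-dilution before plating (e.g. 1e6 for a
    million-fold dilution); volume_plated_ml is the volume spread on the plate.
    """
    return colonies * dilution_factor / volume_plated_ml


def cfu_per_ml_per_od(cfu_ml, od600):
    """Instrument-specific conversion factor: CFU/mL represented by one OD unit."""
    return cfu_ml / od600
```

As a purely hypothetical example, 50 colonies from 0.5 mL of a million-fold dilution works out to 1e8 CFU/mL; dividing by the OD read on a given instrument gives that instrument’s own conversion factor, which is why either instrument can be “right” once calibrated.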

Well, good record-keeping to the rescue. We’ve transformed the positive control plasmid enough times to sample a range of various ODs just by chance, to see if certain bacterial ODs correlate with transformation efficiency. And boy, there’s been a whole lot of nothing there so far (which is actually quite notable; see below).

(FYI: I don’t remember which instrument I used to measure the OD A600 readings. Probably mostly the old spec, tho).

So yea, I’ve generally used cultures with ODs at the time of collection between 0.1 and 0.45, and they’ve collectively given me transformation rates of ~ 20,000 using our standard “positive control” plasmid. So there seems to be a pretty wide window of workable ODs. But generally speaking, I see no issue with having a culture of 0.1 to 0.4 OD as measured with either machine for use with chemical transformation.

Setting up a hybrid lab meeting

Due to both child-care and pandemic reasons, our originally 100% in-person lab meeting was 100% remote for a time, and for the last few months has been 100% hybrid. For overall accessibility reasons, I’ll likely have hybrid remain the default option, and only skip setting up the Zoom when it’s clear absolutely everybody is going to attend in person. Over time, I think I’ve better learned how I should be setting up the hybrid lab meeting, so I figured I’d write down the steps here so I can remember (and anybody else can follow them if they’re setting things up).

Standard flexible format (ie. Nobody needs presenter mode)
Here, the laptop plugged into the projector is providing the sights and sounds of the conference room, but is simply serving as a “viewer” of the slides in Zoom. Here, I’ll assume this is being done with the common lab laptop (Kenny’s old laptop from 2019), although anyone’s laptop should work.

  1. Plug the 360 degree camera into the common lab laptop. If you also want to use a different external microphone (ie. if you don’t want to use the microphone associated with the 360 degree camera), then plug that in now too.
  2. Log into the lab meeting on Zoom. Confirm that the right camera and microphone are selected. Make sure the sound is up to the maximum, and that this computer remains unmuted.
  3. Using the USB-C adapter, hook up the laptop to either the projector or the wheeled TV. The adapter will allow for connecting to the projector with the existing VGA cord, or the TV with an HDMI connection.
  4. Make sure the Zoom screen is showing on the projector / TV screen.
  5. To actually use this setup, the idea will be that 1) anybody with their own computer can log into the lab meeting Zoom and share their desktop or window with the presentation (powerpoint file or google slides, for example), or 2) if it’s someone without their own computer, they can use Kenny’s or Anna’s computer to screen share (assuming the presentation file is somewhere easily accessible, like the “Lab_meetings” directory of the lab Google Drive).
    Note: While these computers are being used to share the slides, all sound (input and output) should be happening from the common lab computer connected to the projection device.

One speaker / Longer-form talk where someone needs presenter mode
The main difference here is that the laptop presenting the slides has to be plugged into the projector / monitor, and is thus not simply a “viewer” in the Zoom call. Here, I’ll assume this is being done with the common lab laptop.

  1. Plug the 360 degree camera into the common lab laptop. If you also want to use a different external microphone (ie. if you don’t want to use the microphone associated with the 360 degree camera), then plug that in now too.
  2. Log into the lab meeting on Zoom. Confirm that the right camera and microphone are selected. Make sure the sound is up to the maximum, and that this computer remains unmuted.
  3. Open the presentation file. Windows may get much harder to navigate once connected to the second screen, so you may as well get everything set up beforehand.
  4. Using the USB-C adapter, hook up the laptop to either the projector or the wheeled TV. The adapter will allow for connecting to the projector with the existing VGA cord, or the TV with an HDMI connection.
  5. Now, go to Zoom and hit “Share screen”. Probably makes sense to choose the screen with the presentation on it, though it doesn’t really matter at this point since you can adjust it later.
  6. Once the screen is sharing, go to the slide / presentation software you’re using. Assuming it’s Powerpoint, then hit “presenter view”.
  7. If the wrong screen is showing on the projector / TV monitor, then hit swap displays in the Zoom panel until it does.
  8. Now, if the wrong screen is being shared on Zoom (ie. people in the Zoom call are saying they’re seeing your presenter view), then hit the “Share screen” button again and choose the correct screen to cast.

That should do it!

COVID cases at CWRU

I’ve been keeping track of what the COVID situation has been like at Case since they first started posting the data every week, back in the fall of 2020. Whenever the cases seem to be higher than usual, I’ve been messaging the below graph out to my group, so they can be informed and make the best risk assessments about their activities on campus.

Anyway, figured other people may be interested in this information too, and I’m getting kind of tired of sending the same exact message out like the last four weeks, so I figured I’d just post the plot here so people can see the current stats.

As of writing this (first week of May), cases have been the highest they’ve ever been, although at least almost everyone should be vaccinated and perhaps even boosted. Still, would certainly be nice to see that number come down some…

Where lab funds go

As you can tell from the above graph, the people in the lab (including me) are by far its most costly resource, accounting for the majority of all lab expenditures. Thus, while there are other important reasons, there’s always this very “bottom line” reason for me wanting to minimize how much personnel time and effort is wasted by confusion and mismanagement!

Some Expected Yields

Here is some real-world data describing the yields we can expect from some routine lab procedures or services.

Obviously this is about how much total plasmid DNA we get from the kit we use in the lab.
And this is the pretty wide range of reads we’ve gotten from submitting plasmids to Plasmidsaurus.

Oh, and this is a good one:

How well my determination of flask “confluency” actually correlated with cell counts. I mean, sure, there must be some error being imparted by the actual measurement of the cells when counting, but I think we all know it’s mostly that my estimate really isn’t precisely informative.