PyMOL Basic Tutorial

Alright, CWRU students: today we will talk about installing PyMOL and using it for some basic analysis of protein structure.

1) INSTALLATION

First, download PyMOL. CWRU has a subscription, so go to this link and read the license agreement. If you agree, press “I agree”. Then, enter your CWRU SSO info to be logged into the institutional software page. Hit any of the logos in the “PyMOL AxPyMOL v2.4” section (they all redirect to the same place).

As of 10/20/20, they do some weird web-store thing (never seen this before). You’ll have to add the version for whatever platform / OS you’re on, and hit add to cart, and then on the next pop up, hit “Check Out”. It’s confusing, but it’s all free, so this seems to just be a formality. You’ll get an email confirmation, but this email confirmation really doesn’t do much.

You will get to a screen that looks like this though:

First, while you’re still on the above screen, make sure you hit the “Important Notice” link. This will tell you the information you’ll need later for registering. The important notice screen will look like this (just without the blacked out parts).

Now that you know that info, go back to the previous screen (with the big “Download” button), add the version for the platform / OS you’re on, and hit download. You’ll then get to another screen where you’ll have to hit download again.

Now you’re on the registration screen. Type in the info from earlier, and type in your Case email address. You’ll get an email at your school email address with the download link.

I whited out some of the token link, so you’ll have a link that looks much longer than above.

Follow that link and you’ll get to another page that looks like this:

Once you type in your case email address again, you should FINALLY get to the download page, which will look like this:

There are a lot of links there, but the most important one is the very first link, under “Download PyMOL”. Hitting that should bring you to this page:

You’ll need to do two things here. First, before you forget, download the license file. This is the button on the bottom right. Following the instructions will allow you to download a “pymol-license.lic” file. I like to put it into the same Application folder, so it’s easy to find. As shown in the picture provided by the website, you’ll need this file when you first open up PyMOL on your computer.

Next, download the program itself, based on your platform (again, I’m on a Mac). Once downloaded, follow the steps for installing PyMOL (for Macs, that means opening up the disk image, and then dragging the PyMOL application into your computer’s Applications folder). Double click to open the program, and it will ask about activation. Click the “Browse for License File” button and select the license file you just downloaded. Voilà! You have successfully navigated the gauntlet of website complexity to achieve your goal. A world of exploring protein structures awaits.

2) USAGE

Now that it’s downloaded and properly linked to the license, you should get a nice blank screen like this.

I could explain what everything is, but it’s easier to just get started. If you’re on an internet connection, the easiest way to load protein structures is to use the “fetch” command. Essentially 99% of the protein structures you’ll ever want to see will be deposited at the RCSB Protein Data Bank. Go to that website, type in your favorite protein, and see what hits you have. For the purposes of today, I’ll just use a structure I’ve stared at many times over the last 5 years: that of the tumor suppressor protein PTEN. I have the code memorized, and it’s “1d5r”. So, type “fetch 1d5r” (PyMOL commands are lowercase) into either the bottom “PyMOL>_” prompt area, or the top “PyMOL>” prompt area. Both spaces work and are largely redundant, although the top area actually records what your previous commands were, so I have a slight preference for this area. You should now see the below protein pop up in your PyMOL window.

From here, it’s easiest if you have a two-button mouse connected. If you do, you can left-click, hold, and drag to rotate the protein. You can also right-click, hold, and drag to zoom in and out. And finally, you can hold down either option or command (if you’re on a Mac) and left-click to move the entire molecule around on the screen. If you have a scroll wheel, you may notice that turning it makes part of the protein appear or disappear. This is called “clipping” and will be useful in certain cases where you need just the right picture of part of the protein, but we won’t be getting into this for today.

On the other hand, you may be using a laptop without a two-button mouse. While slightly less ideal, you can still do everything pretty easily as long as you go to the top menu bar, click and hover over “Mouse”, and click on “1 Button Viewing Mode”. Now, if you go back to the PyMOL window, you can still rotate the object by clicking and dragging on your trackpad, but you zoom by holding down control while clicking and dragging, and you move the entire molecule around (relative to your frame of view) by holding down option before clicking and dragging.

The default is a “cartoon” view, which shows the backbone of the peptide, as well as secondary structure with alpha helices shown as helices, and beta strands shown as those flat arrows. I find cartoon views to be the most generally useful, so we’ll keep it like that for now. The small red plusses are water molecules the authors have in their model. I find these to be kind of annoying, so I like to hide them in the beginning, only bringing them back in much later when we’re considering how various side-chains may be interacting with water in the medium. To do this, I just type in “hide all” which makes everything disappear, and then “show cartoon”. Now all of the extra molecules aside from the peptide backbone should be gone. In the case of PTEN, there’s a small tartrate molecule that was bound to the active site. It’s worth making this visible. The easiest way to select it is to click the “S” button in the bottom right menubar…

This will call up the sequence at the top of the protein viewer window, below the command-line area. The tartrate molecule is after the entire PTEN sequence, but before the waters. Scroll there and click on “TLA” to select it.

The sequence pops up, and defaults to the very beginning (starting with chain A).
After scrolling to the right, you can see the TLA substrate mimic in the sequence list.

This will make a new selection called “sele”. Now you can tell the program to make a visual representation of the selected “TLA” molecule. One option is to type in “show sticks, sele”. This will be pretty subtle. On the other hand, if you want to be able to see the atoms of the TLA molecule from pretty far, you can instead type “show spheres, sele”. The molecule will then look like this:

Nice. From here, feel free to explore. If you want to select particular residues, it’s helpful to use the “resi” keyword. For example, if we wanted to select residues 124, 129, 130, and 38, you can type in “select important_residues, resi 124+129+130+38”. There will now be a new selection called “important_residues”. To show where these are, you can choose to show these as spheres also (“show spheres, important_residues”), and color them a different color to make them easier to see, such as by typing “color yellow, important_residues”. You’ll have something that looks like this.
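If you find yourself highlighting lots of different residue sets, it can help to generate the command text rather than typing the “+”-separated list by hand. Here’s a minimal sketch in plain Python (it does not talk to PyMOL at all; the function names are my own made-up helpers) that just builds the same three commands as above, which you can then paste into the PyMOL prompt.

```python
def resi_selection(name, residues):
    """Build a PyMOL 'select' command string for a list of residue numbers.

    Hypothetical helper: it only formats text, it does not call PyMOL.
    """
    # Inside a resi expression, multiple residue numbers are joined with '+'.
    resi_expr = "+".join(str(r) for r in residues)
    return f"select {name}, resi {resi_expr}"


def highlight_commands(name, residues, color="yellow"):
    """Return the select / show / color commands from the text, in order."""
    return [
        resi_selection(name, residues),
        f"show spheres, {name}",
        f"color {color}, {name}",
    ]


for line in highlight_commands("important_residues", [124, 129, 130, 38]):
    print(line)
# select important_residues, resi 124+129+130+38
# show spheres, important_residues
# color yellow, important_residues
```

Note that the selection name can’t contain spaces, which is why it’s “important_residues” with an underscore everywhere.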

3) OUTPUTS

So you’ve done all of your exploratory analyses, and feel like making an image to put into a presentation. The first thing to consider is the background. It defaults to black, but white is usually better for most presentations / paper figures. You can change this either by selecting Display/Background/White from the top menu, or by simply typing in “set bg_rgb, white”.

While you can get a decent picture even with this, you can get much nicer images if you tell the program to render the image first. You can do this by clicking the “Draw/Ray” button at the top right corner of the screen and filling out the values you want, or you can tell it to do its ray-tracing / rendering from the command line by typing something like “ray 900”. After a little bit of waiting, you’ll get a rendered image.

Now you can go to “File/Export image as/PNG…” and export your image file to one of your computer directories. You’ve made a simple, high quality figure. Hurrah! You may want to save your PyMOL analysis file by going to “File/Save Session As…”. That way, if you want to get back to this step without starting from scratch, all you need to do is open this session file again and you’re back to the same spot.

Hopefully that was a useful introduction to some basic operations in PyMOL. There’s a ton more you can do with the program, but I’ll leave it here for now.
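If you end up remaking the same figure a lot, the typed commands from this section can be collected into a plain-text PyMOL script. Below is a rough sketch in Python that writes such a file. The function names, the script file name, and the “figure.png” output name are all placeholders I made up, and the script assumes the structure is already loaded and oriented the way you want. It uses the “png” command as the typed equivalent of the File/Export menu item.

```python
from pathlib import Path


def figure_script(width=900, png_name="figure.png"):
    """Return the text of a small PyMOL script that renders and saves an image.

    Assumes a structure is already loaded in the PyMOL session.
    """
    lines = [
        "hide all",            # clear waters and everything else
        "show cartoon",        # bring back just the cartoon view
        "set bg_rgb, white",   # white background for slides / papers
        f"ray {width}",        # ray-trace at the requested width in pixels
        f"png {png_name}",     # save the rendered image
    ]
    return "\n".join(lines) + "\n"


def write_figure_script(path, **kwargs):
    """Write the script to disk and return its path (hypothetical helper)."""
    path = Path(path)
    path.write_text(figure_script(**kwargs))
    return path


if __name__ == "__main__":
    out = write_figure_script("make_figure.pml", width=900)
    print(out.read_text())
```

With the file saved, typing “@make_figure.pml” at the PyMOL prompt (from the folder containing the file) should run the commands one after another.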

4) ADDENDUMS

So the above are the most basic things to get you started, but as I interact with trainees here (and see what we need to do for their work), I’ll amend this page with other more specific operations.

A) Show the protein surface.
Cartoon diagrams are great for understanding how the protein is structured, but they don’t give you a sense of what the surface of the protein looks like. That’s where showing the protein surface comes in handy. You can do this quite simply by typing in “show surface, all” (replacing “all” with whatever object name there is). This will create something like this:

The reason for the blue, red, and yellow is that the default coloring scheme in PyMOL is to color carbon atoms green, nitrogen atoms blue, oxygen atoms red, and sulfur atoms yellow. If that’s too distracting, you can just color everything green by typing in “color green, all”.

But what if you want to see both the protein surface as well as the underlying secondary structure? One approach is to make the surface representation semi-transparent. You can do this by typing something like “set transparency, 0.75”.

B) Selecting atoms near another selection of atoms
This can be a pretty useful feature. For example, you may be curious which residues of a protein are near a substrate mimic molecule. Or maybe the structure shows a protein-protein interaction, and you’re trying to figure out which residues in protein A are in close contact with protein B. Below is a third example, which is figuring out which atoms of a protein are pretty close to water molecules on the surface of the protein.

As described on this page, you can use the “around” selection operator for this. Here’s a series of commands to show the atoms of the protein as spheres, color them all green, and then select only the atoms near water and color them blue:
“show spheres, 1d5r and not resn HOH”
“color green, 1d5r and not resn HOH”
“select near_water, (resn HOH) around 3.5”
“color blue, near_water”
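To make it clearer what “around 3.5” is doing, here’s a toy sketch of the same idea in plain Python: keep every atom within the cutoff distance of any atom in the starting selection, but leave out the starting selection itself (which is why the waters don’t turn blue). The atom names and coordinates are invented for illustration and are not from 1d5r, and treating a distance exactly equal to the cutoff as “near” is my assumption rather than something I’ve checked against PyMOL.

```python
import math


def around(atoms, selection_ids, cutoff):
    """Return ids of atoms within `cutoff` of any selected atom.

    Atoms in the starting selection are excluded from the result,
    mirroring how "(resn HOH) around 3.5" leaves out the waters.
    """
    selected = [xyz for atom_id, xyz in atoms.items() if atom_id in selection_ids]
    near = set()
    for atom_id, xyz in atoms.items():
        if atom_id in selection_ids:
            continue  # skip the waters themselves
        if any(math.dist(xyz, s) <= cutoff for s in selected):
            near.add(atom_id)
    return near


# Made-up coordinates (in Angstroms) for one water and three protein atoms.
atoms = {
    "HOH_O": (0.0, 0.0, 0.0),
    "LYS_NZ": (2.8, 0.0, 0.0),
    "ASP_OD1": (0.0, 3.0, 0.0),
    "LEU_CD1": (6.0, 0.0, 0.0),
}

near_water = around(atoms, {"HOH_O"}, 3.5)
print(sorted(near_water))  # ['ASP_OD1', 'LYS_NZ']
```

One thing to keep in mind with the real command: the selection as written picks up any non-water atom near a water, so ligand atoms (like the tartrate) can end up in “near_water” too.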