Since form follows function, visualizing protein structures is vital for understanding biological complexity. There are several ways to produce images that are not just beautiful but also address specific research questions and help to elucidate protein function. Here I want to talk briefly about the program PyMOL, which was originally created by Warren Lyford DeLano in 2000. My current project deals with the single-molecule characterization of bacterial translesion DNA polymerases. The polymerase I work with the most is Pol II, so I am very interested in picturing this molecule in a way that lets me understand how it interacts with its DNA substrate (polymerases replicate DNA). Translesion polymerases such as Pol II can bind not only regular DNA but also damaged DNA, which prevents the entire replication process from stalling, something that would otherwise lead to cell death. But why is Pol II DNA damage tolerant?

Now we are at the point where structural information about the protein is required. In this case it comes from a very elegant and interesting crystallization study of Pol II by Wang and Yang (1). Even though the authors do a great job of visualizing their findings, it can be helpful to do this yourself, for example to create a different perspective or even a short video that shows the protein from different angles. You might also want to highlight certain amino acid residues in a specific color and thereby emphasize the importance of particular functional protein domains. All of this can be done elegantly in PyMOL. For my purposes I created the following view of Pol II containing a DNA helix with a tetrahydrofuran (THF) lesion, which cannot be processed by a regular polymerase.


For protein structure visualization the Protein Data Bank is the place to go. Just search for the protein structure you are interested in (hopefully it exists) and download the so-called PDB file which contains all the 3D data that is necessary to visualize the protein. Here I used the PDB file 3K5M which contains information on Pol II bound to a THF lesion DNA.
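
If you would rather fetch the file from a script than click through the website, here is a minimal Python sketch that builds the download address. The URL pattern is my assumption about how the RCSB site serves files, and pdb_download_url is a hypothetical helper name, not part of PyMOL.

```python
def pdb_download_url(pdb_id: str) -> str:
    """Return the (assumed) RCSB download URL for a four-character PDB ID."""
    pdb_id = pdb_id.strip().upper()
    # PDB IDs such as 3K5M are four alphanumeric characters.
    if len(pdb_id) != 4 or not pdb_id.isalnum():
        raise ValueError(f"not a valid 4-character PDB ID: {pdb_id!r}")
    return f"https://files.rcsb.org/download/{pdb_id}.pdb"


print(pdb_download_url("3k5m"))
```

Inside PyMOL itself, the command "fetch 3k5m" should achieve the same thing without leaving the program.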

I assume you have downloaded PyMOL by now and know how to load a PDB file into it. What I like about PyMOL is its command line, which allows you to rapidly change what you see and what you do with your protein. The downside is that you need to know the syntax and memorize the important commands, because looking them up all the time is time-consuming. Pages 17 to 37 of the PyMOL user guide are a great introduction to the most essential commands. I used the following command sequence to produce the picture above. It covers only the basics, but it can take you quite far.

Step 1: Know your protein domains. Pol II has five domains in total. I gave each of them a distinctive color. The N-terminal domain stretches from residues 1-146 and 366-388, the Exo domain from 147-365, and so on… You have to figure out from the literature which domains your protein has and where they are located. The following command is very important and lets you select your domains (here the example for the N-terminal domain):

PyMOL> select nterminal, resi 1-146+366-388

This command will name the indicated residues “nterminal”. Later you can use “nterminal” to address the entire domain instantaneously. Specify the residues of all your domains and give them a descriptive name. On the right side of the screen you can then see an overview of all your domains.
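
Typing the residue ranges for five domains by hand is error-prone, so here is a small Python sketch that generates the select commands from a table. It is only an illustration: the table holds the two domains whose ranges are given above, the name "exo" is my own label, and the helper names are hypothetical. The ranges are joined with "+", which is how PyMOL separates items in a residue list.

```python
# Residue ranges taken from the text; the remaining three domains
# would have to be added from the literature.
DOMAINS = {
    "nterminal": [(1, 146), (366, 388)],
    "exo": [(147, 365)],  # "exo" is an illustrative selection name
}


def resi_expression(ranges):
    """Join (start, end) pairs into a residue list like 'resi 1-146+366-388'."""
    return "resi " + "+".join(f"{start}-{end}" for start, end in ranges)


def select_command(name, ranges):
    """Build one PyMOL 'select' command line for a named domain."""
    return f"select {name}, {resi_expression(ranges)}"


for name, ranges in DOMAINS.items():
    print(select_command(name, ranges))
```

The printed lines can be pasted into the PyMOL command line one after the other.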

Step 2: Know how to hide. Proteins can be very confusing. Use the following commands to first hide everything and then only selectively display what you want to see in a style that you like.

PyMOL> hide all

PyMOL> show cartoon, nterminal

I personally like the style called “cartoon”, which displays alpha-helices and beta-sheets. But please also play around with the following styles: ellipsoids, lines, ribbon, dashes, mesh, volume, sticks, and many more… In case you become confused or make a mistake, just use “hide all” or “hide cartoon, nterminal” to get rid of your confusion. Now choose a fitting style for all your domains and do not worry about the colors yet.

Step 3: Colors are nice. Now it is time to further organize your residues not only by displaying style, but also by color. This command is very easy and works with all major colors like green, yellow, purple, blue, red, orange and so on…

PyMOL> color yellow, nterminal

So go ahead and color each of your domains in a clear, distinctive way.
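
Once you have more than a couple of domains, steps 2 and 3 become repetitive. The sketch below is one way to generate the hide, show and color commands in one go. The yellow for the N-terminal domain is from my example above, while the red for "exo" and the function name styling_script are assumptions made up for this illustration.

```python
# Selection name -> color. Only the yellow N-terminal domain comes from the
# text; the second entry is a made-up example.
COLORS = {
    "nterminal": "yellow",
    "exo": "red",
}


def styling_script(colors, representation="cartoon"):
    """Return PyMOL commands that hide everything, then show and color each selection."""
    lines = ["hide all"]
    for name, color in colors.items():
        lines.append(f"show {representation}, {name}")
        lines.append(f"color {color}, {name}")
    return "\n".join(lines)


print(styling_script(COLORS))
```

If you save the output to a text file such as style.pml, you should be able to run it inside PyMOL with "@style.pml", provided the selections from step 1 already exist.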

Step 4: Size does matter. The “cartoon” view is very handy for seeing the context in which residues are located without the confusion of all the side chains. However, a real protein is much bulkier, and electrostatic forces and hydrophobic interactions determine which parts of a protein are actually accessible to ligands or, in the case of Pol II, to DNA. The accessibility can be modeled in PyMOL by an algorithm that displays the surface available to water molecules.

PyMOL> show surface, nterminal

PyMOL> set transparency, 0.6

Use these commands for each of your domains and play around with the transparency (from 0 to 1) so that the surface depiction does not become overwhelming and the cartoon residue structure is still visible. The combination of two or more different styles at the same time such as “cartoon” and “surface” in this case is actually one of the strongest features of PyMOL!

Step 5: Make it nice. Most people want to create protein structure visualizations for presentations or publications. Knowing how to produce a clean, high-quality image is therefore very important.

PyMOL> bg_color white

PyMOL> ray

These commands set the background color to white, which is the most convenient choice for most applications. “Ray” creates a sharper (read: nicer) depiction of your protein. Bear in mind that the effects of the “ray” command are lost as soon as you modify your protein or the view again. So only use “ray” at the finish line. Or just use it as many times as you want.

Step 6: Hold on to good things. In order to save your work, go to the “File” menu located at the top of one of the PyMOL windows. There are a couple of options for saving your work. The most important is “Save Session as…” because it allows you to return to the current state of your project. But you are probably also interested in just saving the current view of your protein (“Save Image as…”). If you want another viewing angle, just turn your protein as desired and save again. A PNG image will be created that can be used in papers or presentations.
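
The same steps can also be typed as commands, which is handy when a journal asks for a particular print size. The following sketch converts a size in inches into pixels and produces the matching commands. The 4 x 3 inch figure at 300 dpi and the file name "pol2" are assumptions chosen for the example, and export_commands is a hypothetical helper.

```python
def pixels(inches: float, dpi: int) -> int:
    """Convert a print dimension in inches to a pixel count at the given dpi."""
    return round(inches * dpi)


def export_commands(basename, width_in, height_in, dpi=300):
    """Return PyMOL commands that ray-trace, save a PNG and save the session."""
    width, height = pixels(width_in, dpi), pixels(height_in, dpi)
    return [
        "bg_color white",
        f"ray {width}, {height}",          # ray-trace at the target pixel size
        f"png {basename}.png, dpi={dpi}",  # write the ray-traced image
        f"save {basename}.pse",            # keep the session for later edits
    ]


for line in export_commands("pol2", 4, 3):
    print(line)
```

For a 4 x 3 inch figure at 300 dpi this asks “ray” for an image of 1200 by 900 pixels.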

Step 7: Let’s move it. You can also make a video out of individual PNG pictures that show your protein from different angles. For Pol II the result looks like this. Luckily you do not have to turn your protein 60 times, save every view and then put it all together as a movie. PyMOL does it for you.

PyMOL> mset 1 x60

PyMOL> util.mrock 1,60,180

PyMOL> mplay

PyMOL> mstop

Use these commands to make and test-view a movie of 60 frames in which your protein structure rocks back and forth through an angle of 180 degrees. If it gets boring after a while, use “mstop” to stop. In order to save, go to the “File” menu again and choose “Save Movie as…”. Every major media player should now be able to display your moving protein.
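
If you want to try other frame counts or angles, a small sketch like the one below generates the movie commands and estimates how long the result will play. The frame rate of 30 frames per second is an assumption about PyMOL's default setting, so check it in your own installation. Both function names are hypothetical.

```python
def rock_movie_commands(frames=60, angle=180):
    """Return the PyMOL commands for a rocking movie with the given length and angle."""
    return [
        f"mset 1 x{frames}",               # repeat state 1 for every movie frame
        f"util.mrock 1,{frames},{angle}",  # rock the view across these frames
        "mplay",
    ]


def duration_seconds(frames, fps=30.0):
    """Estimate playback time; fps=30 is an assumed default frame rate."""
    return frames / fps


for line in rock_movie_commands():
    print(line)
print(duration_seconds(60), "seconds")
```

Under that assumption, 60 frames play for about two seconds, so a longer and smoother movie simply needs more frames.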

By now you probably still have a number of unanswered questions. But there is relief. People who know much more about PyMOL have created a very convenient FAQ page which contains the answers to most questions that beginners have or that are just good to know.

And now go ahead and use structural biology research for your own purposes with PyMOL!

(1) Wang F., Yang W., Structural Insight into Translesion Synthesis by DNA Pol II, Cell 139, 1279-1289, 2009.