Hello Basel!

August 30, 2013

Stay tuned for my next exciting project!

Designing a poster

August 1, 2013

Presentation time. In order to attract a few people to my talk I designed this poster with the freely available GIMP software. It takes a while, but the possibilities that GIMP has to offer are astonishing. The software is great for creative outbursts. And wouldn’t it be nice if scientific posters could become more appealing to the eye in the future? The Biology Department of the University of North Carolina at Chapel Hill is already quite good at it: http://www.flickr.com/photos/biologyposters/.


Crossing 45° North

April 13, 2013


Partly painted walls – Boston


Enjoying some liquid sugar – Boston


Winter swimming (no pictures) – Somewhere in the woods


The Atlantic – Rockport


Freeclimbing or something like that – Rockport


Gone fishing – The Atlantic


Mont Royal – Montréal




A house in Montréal


A factory in Montréal


More houses in Montréal (it’s a beautiful city though, photos are selective)




Montréal at night (on top of Mont Royal)


Parking lot






Québec and its frozen river/part of the sea


See above


White Mountains hiking


Since form follows function, the visualization of protein structures is vital for understanding biological complexity. There are several ways of producing images that are not just beautiful, but also address specific research questions and help to elucidate protein function. Here I briefly want to talk about the program PyMOL, which was originally created by Warren Lyford DeLano in 2000. My current project deals with the single-molecule characterization of bacterial translesion DNA polymerases. The polymerase I work with the most is Pol II, so it is very interesting for me to picture this molecule in a way that lets me understand how Pol II interacts with its DNA substrate (polymerases replicate DNA). However, translesion polymerases such as Pol II can bind not only regular DNA but also damaged DNA, which prevents a stalling of the entire replication process that would otherwise lead to cell death. But why is Pol II DNA damage tolerant?

Now we are at the point where structural protein information is required. In this case the information was provided by Wang and Yang in a very elegant and interesting crystallization study of Pol II (1). Even though the authors do a great job of visualizing their findings, it can be helpful to do this yourself, for example to create a different perspective or even a short video that shows the protein from different angles. In addition you might want to highlight certain amino acid residues with a specific color and thereby point out the importance of certain functional protein domains. All this can be done elegantly in PyMOL. For my purposes I created the following view of Pol II containing a DNA helix with a tetrahydrofuran (THF) lesion, which cannot be processed by a regular polymerase.


For protein structure visualization the Protein Data Bank is the place to go. Just search for the protein structure you are interested in (hopefully it exists) and download the so-called PDB file which contains all the 3D data that is necessary to visualize the protein. Here I used the PDB file 3K5M which contains information on Pol II bound to a THF lesion DNA.

I assume you have downloaded PyMOL by now and know how to load a PDB file into it. What I like about PyMOL is its command line, which allows you to rapidly change what you see and what you want to achieve with your protein. The downside is that you need to know the syntax of the commands and also need to memorize the important ones, because looking them up all the time is time consuming. The PyMOL user guide on pages 17 to 37 is a great introduction to the most essential commands. I used the following command sequence to produce the picture above. This is only the foundation, but it can bring you quite far.

Step 1: Know your protein domains. Pol II has five domains in total. I gave a distinctive color to each one of them. The N-terminal domain stretches from residues 1-146 and 366-388, the Exo domain from 147-365, and so on… From the literature you must figure out yourself which domains your protein has and where they are located. The following command is very important and lets you select your domains (here the example for the N-terminal domain):

PyMOL> select nterminal, resi 1-146+366-388

This command will name the indicated residues “nterminal”. Later you can use “nterminal” to address the entire domain instantaneously. Specify the residues of all your domains and give them a descriptive name. On the right side of the screen you can then see an overview of all your domains.

Step 2: Know how to hide. Proteins can be very confusing. Use the following commands to first hide everything and then only selectively display what you want to see in a style that you like.

PyMOL> hide all

PyMOL> show cartoon, nterminal

I personally like the alpha-helix and beta-sheet displaying style called “cartoon”. But please also play around with the following styles:  ellipsoids, lines, ribbon, dashes, mesh, volume, sticks, and many more… In case you become confused or made a mistake just use “hide all” or “hide cartoon, nterminal” to get rid of your confusion. Now choose a fitting style for all your domains and do not worry about the colors yet.

Step 3: Colors are nice. Now it is time to further organize your residues not only by displaying style, but also by color. This command is very easy and works with all major colors like green, yellow, purple, blue, red, orange and so on…

PyMOL> color yellow, nterminal

So go ahead and color each of your domains in a clear, distinctive way.
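Steps 1 to 3 repeat the same three commands for every domain, so they can be generated instead of typed. Below is a minimal sketch in plain Python that only builds the command text and does not talk to PyMOL itself. Only the two domains whose residues are given above are listed; the name “exo”, both colors, and the helper function names are my own assumptions for illustration.

```python
# Domain table: only the two domains whose residue ranges are given in the
# text are listed. The name "exo" and the colours are illustrative assumptions.
DOMAINS = {
    "nterminal": {"residues": ["1-146", "366-388"], "color": "yellow"},
    "exo": {"residues": ["147-365"], "color": "orange"},
}


def domain_commands(name, residues, color, style="cartoon"):
    """Return the PyMOL command lines that select, show and colour one domain."""
    # Several residue ranges inside one PyMOL selection are joined with "+".
    selection = "resi " + "+".join(residues)
    return [
        f"select {name}, {selection}",
        f"show {style}, {name}",
        f"color {color}, {name}",
    ]


def build_script(domains):
    """Build the full command list: hide everything, then style each domain."""
    lines = ["hide all"]
    for name, spec in domains.items():
        lines.extend(domain_commands(name, spec["residues"], spec["color"]))
    return lines


if __name__ == "__main__":
    print("\n".join(build_script(DOMAINS)))
```

The printed lines can be pasted into the PyMOL command line one by one, or saved to a text file and run as a script.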

Step 4: Size does matter. The “cartoon” view is very handy to see in which contexts residues are located without the confusion of all the side chains. However, a real protein is much bigger and the electrostatic forces and hydrophobic interactions determine which part of a protein is actually accessible by ligands or in the case of Pol II by DNA. The accessibility can be modeled in PyMOL by an algorithm displaying the surface that is available to water molecules.

PyMOL> show surface, nterminal

PyMOL> set transparency, 0.6

Use these commands for each of your domains and play around with the transparency (from 0 to 1) so that the surface depiction does not become overwhelming and the cartoon residue structure is still visible. The combination of two or more different styles at the same time such as “cartoon” and “surface” in this case is actually one of the strongest features of PyMOL!

Step 5: Make it nice. Most people’s aim is to create protein structure visualizations for presentation or publications. How to achieve a high-quality and clean file is therefore very important.

PyMOL> bg_color white

PyMOL> ray

These commands set the background color to white which is most convenient for most applications. “Ray” creates a sharper (read: nicer) depiction of your protein. It is important to bear in mind that the “ray” command effects are lost once you perform other modifications to your protein. So only use “ray” at the finish line. Or just use it as many times as you want.

Step 6: Hold on to good things. In order to save your work go to the “File” menu located on the top of one of the PyMOL windows. There are a couple of options how to save your work. Most important is “Save Session as…” because it allows you to go back to your current state of the project. But you are probably also interested in just saving the current view of your protein (“Save Image as…”). If you want another view angle just turn your protein as desired and save again. A PNG image will be created that can be used in papers or presentations.

Step 7: Let’s move it. You can also make a video out of individual PNG pictures that show your protein from different angles. For Pol II the result looks like this. Luckily you do not have to turn your protein 60 times, save every view, and then put it all together as a movie. PyMOL does it for you.

PyMOL> mset 1 x60

PyMOL> util.mrock 1,60,180

PyMOL> mplay

PyMOL> mstop

Use these commands to make and test-view a movie of 60 frames that lets your protein structure rotate in a 180 degree space. If it gets boring after a while use “mstop” to stop. In order to save, go to the “File” menu again and choose “Save Movie as…”. Every major media player should now be able to display your moving protein.

By now you probably still have a number of unanswered questions. But there is relief. People who know much more about PyMOL have created a very convenient FAQ page which contains the answers to most questions that beginners have or that are just good to know.

And now go ahead and use structural biology research for your own purposes with PyMOL!

(1) Wang F., Yang W., Structural Insight into Translesion Synthesis by DNA Pol II, Cell 139, 1279-1289, 2009.

This little article attempts to introduce the research that professor Charlotte Hemelrijk and her group “Behavioural Ecology and Self-organization” at the University of Groningen perform on understanding the complexity of bird flocks and fish schools. In the context of the Honours College course “Leadership or Not in Animal Societies” the question was addressed whether a single leader is necessary to organize the complex behaviour that one can observe in large and moving fish schools or bird flocks.

Is a leader required to guide the instant movement of thousands of birds or fish into one direction in order to find food or escape an enemy?

Or is some kind of intrinsic property that emerges from the school or flock sufficient to explain the observed behaviour?

In the following I will describe how the estimation of movement parameters, computer simulations, and the careful observation of fish schools in nature led to a robust model that can explain why no leader is necessary to coordinate the movement of fish schools.

For fish it is very attractive to organize in schools because locating spawning areas, finding food, access to mates, protection from predation and hydrodynamic effects all improve for the individual fish. In order to understand how complex school behaviour evolves it first became necessary to describe the formation of a school in general. It has been hypothesized that collective movements, as they occur in a school, are characterized by directional and temporal coordination. This coordination might only become possible if individuals mutually influence each other depending on their distance to other members of the school (Huth and Wissel, 1992). Fig. 1 shows which effects the distance from one fish to another has on its movement according to this hypothesis. These parameters were then used in a computer simulation in order to test whether the modelled schooling behaviour resembled natural schooling behaviour. Interestingly, the parameters seemed to be sufficient to describe the natural behaviour (Fig. 1 (C)) that had been noted earlier (Partridge, 1981).

Fig. 1: Hypothesized effects of the presence of a second fish on the movement of the first fish depending on their distance to each other. (A) and (B) display how a first fish reacts when a second fish is present in four different proximity zones (based on Huth and Wissel, 1992). (C) shows how the modelled fish schooling behaviour (bottom) clearly resembles the observed behaviour in reality (top) (based on Partridge, 1981).
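To make the idea of distance-dependent reactions concrete, here is a minimal Python sketch that maps the distance to a neighbour onto one of four reactions. The zone radii are made-up numbers, and treating the outermost zone as “out of range” is my own simplification; neither is taken from Huth and Wissel (1992).

```python
# Hypothetical zone radii in body lengths; NOT taken from the cited papers.
R_AVOID = 0.5
R_ALIGN = 2.0
R_ATTRACT = 5.0


def reaction(distance):
    """Return how a fish reacts to a neighbour at the given distance."""
    if distance < 0:
        raise ValueError("distance must be non-negative")
    if distance < R_AVOID:
        return "avoid"         # too close: turn away to prevent a collision
    if distance < R_ALIGN:
        return "align"         # swim parallel to the neighbour
    if distance < R_ATTRACT:
        return "attract"       # turn towards the neighbour
    return "out_of_range"      # assumption: the neighbour no longer has an effect


if __name__ == "__main__":
    for d in (0.2, 1.0, 3.0, 8.0):
        print(d, reaction(d))
```

The point of the sketch is only that a single number, the distance, decides between avoidance, alignment, and attraction.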

Based on the parameters attraction, alignment, and avoidance described above, further studies were able to link the behaviour of individual fish to distinct school shapes. However, the researchers needed to introduce the factor “speed” into their model in order to be able to reproduce observations in nature (Kunz and Hemelrijk, 2003 and Hemelrijk and Hildenbrandt, 2008). The researchers found that through coordination and collision avoidance a transition to an oblong shaped (with respect to movement direction) school occurred as a function of speed. In other words, a fish school reduces its width and increases its length at increasing velocities because this enables the individual fish to avoid collisions. Later, Hemelrijk and colleagues were able to show that the conclusion drawn from their model also holds true for fish schools in real-life experiments (Hemelrijk et al., 2010).

In order to narrow the gap to the experimental observations, the researchers also introduced a factor to describe the effect that the number of neighbours has on the movement decisions of the fish. They hypothesized the existence of two mechanisms that could govern the movement of individual fish when surrounded by more than one neighbouring fish. Fig. 2 schematically depicts both hypotheses and their respective outcomes in a computer model when applied over a number of “decision cycles”. The first model assumes that the movement of an individual fish (3.) that has at least two neighbours (1. and 2.) in its field of view is largely the result of taking the average path between both fish. The second model was assumed to be more realistic because the fish (3.) would have a priority direction that largely depends on factors such as the distance to its neighbours. A computer simulation of both models, however, resulted in a surprising outcome. After a number of cycles the priority model had led to a disturbed school pattern that is never observed in nature. In contrast, the average direction model resulted in an accurate reproduction of field observations. The researchers therefore assumed that the priority direction effect is probably averaged out in a large school because there are many and changing neighbours. The final result is an average directional movement.

Fig. 2: Two models and their simulation results that take neighbouring fish into account during the movement decision process of an individual fish. (A) The average direction model assumes that the movement of fish 3. is the result of averaging between the directions of its neighbours. The priority direction model assumes that fish 3. decides to follow the closest neighbour. (B) Simulations of both models resulted in a dispersed fish distribution in the case of the priority model (right) and a more realistic ordered fish distribution for the average model (left).

The findings presented in Fig. 2 and the fundamental work presented in Fig. 1 therefore show that the interactions of individual fish can give rise to an emergent property, such as the coordinated behaviour of a large group, without requiring a leader. It is important to note that the individual perceptions of the fish within and on the edges of the school are vital for the coordination. This means that the final direction of the fish school is probably to a large extent based on the “decisions” that fish on the edges of the school make. These movement “decisions” might be based on knowledge and experience, but also on motivational factors such as hunger. Whether these factors can drive the behaviour of individual fish and therefore the movement of the whole school still remains to be elucidated.

The computational tools of theoretical biology therefore seem to be a good approach to describe complex behavioural patterns in animal groups. Other research projects of the Hemelrijk group used similar parameter-simulation approaches to describe the behaviour of bird flocks and their internal dynamics (Hildenbrandt et al., 2010). In bird flocks, too, no real leader is necessary; instead the individual movement decisions of birds in the context of their neighbours seem to govern the movement of the flock as a whole. These studies can also help to improve the understanding of how large groups of humans act in situations of panic and fear, when rational decisions might be overruled by movement decisions that are based on the individual context.
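The two decision rules can be written down in a few lines. The following Python sketch is my own simplified reading of the models in Fig. 2, not the implementation used by the researchers: each neighbour is a hypothetical (distance, heading) pair, the average model takes the mean heading, and the priority model copies the heading of the closest neighbour.

```python
import math


def average_direction(neighbours):
    """Heading (radians) obtained by averaging the headings of all neighbours.

    Each neighbour is a (distance, heading) pair. Headings are averaged as
    unit vectors so that angles close to +pi and -pi are handled correctly.
    """
    if not neighbours:
        raise ValueError("at least one neighbour is required")
    x = sum(math.cos(heading) for _, heading in neighbours)
    y = sum(math.sin(heading) for _, heading in neighbours)
    return math.atan2(y, x)


def priority_direction(neighbours):
    """Heading (radians) of the single closest neighbour."""
    if not neighbours:
        raise ValueError("at least one neighbour is required")
    _, heading = min(neighbours, key=lambda pair: pair[0])
    return heading


if __name__ == "__main__":
    # Fish 1 is close and swims along the x-axis, fish 2 is further away
    # and swims along the y-axis (toy numbers).
    neighbours = [(1.0, 0.0), (2.0, math.pi / 2)]
    print(math.degrees(average_direction(neighbours)))   # about 45 degrees
    print(math.degrees(priority_direction(neighbours)))  # 0 degrees
```

In the toy example the average rule steers fish 3. between its two neighbours, while the priority rule ignores the more distant fish completely.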


Hemelrijk, C.K., and Hildenbrandt, H. (2008). Self-Organized Shape and Frontal Density of Fish Schools. Ethology 114, 245–254.

Hemelrijk, C.K., Hildenbrandt, H., Reinders, J., and Stamhuis, E.J. (2010). Emergence of Oblong School Shape: Models and Empirical Data of Fish. Ethology 116, 1099–1112.

Hildenbrandt, H., Carere, C., and Hemelrijk, C.K. (2010). Self-organized aerial displays of thousands of starlings: a model. Behavioral Ecology 21, 1349–1359.

Huth, A., and Wissel, C. (1992). The simulation of the movement of fish schools. Journal of Theoretical Biology 156, 365–385.

Kunz, H., and Hemelrijk, C.K. (2003). Artificial fish schools: collective effects of school size, body size, and body form. Artif. Life 9, 237–253.

Partridge, B.L. (1981). Internal dynamics and the interrelations of fish in schools. J. Comp. Physiol. 144, 313–325.

Yes, also bacteria seem to have an immune system. Bacteria frequently become attacked by phages and viruses, so they need protection too!

Here I want to briefly introduce the CRISPR/Cas system, which is a very interesting type of bacterial defense system that uses former viral DNA sequences to guide bacterial DNA endonucleases to cellular targets where viral DNA is present. It has long been hypothesized that CRISPR/Cas could be used for biotechnological non-invasive genome editing, and recently a number of breakthroughs concerning this application have been described. In the following I would especially like to discuss three papers from the journal Science that, in my opinion, have been groundbreaking in paving the way for real future applications of CRISPR/Cas and have also helped to understand the molecular basis of this system. Here I will concentrate on the type II CRISPR/Cas system (of three in total). In general a bacterial immune response against a viral invader can be split up into three phases: adaptation, expression, and interference. Fig. 1 schematically shows how the type II CRISPR/Cas system is currently assumed to work during the expression and interference phases.


Fig. 1: Schematic depiction of the type II CRISPR/Cas system present in bacteria to destroy invading DNA originating from viruses. The CRISPR/Cas gene cluster contains previously obtained sequence information about foreign DNA (colored triangles) which are separated by repeats (black rectangles). Upon induction this information is transcribed into pre-crRNA together with the expression of the Cas9 protein and so-called tracrRNA which serves as a universal linker to connect crRNA with Cas9. In the following steps tracrRNA and pre-crRNA are cleaved to smaller sizes at least twice. Now the crRNA-Cas9-tracrRNA complex is able to bind foreign DNA at a homologous site termed “protospacer” which is followed by a second, but very short identifier called PAM (protospacer adjacent motif). Once stable binding has been achieved Cas9 seems to cleave the invading DNA and thereby induces double-stranded breaks that inhibit the expression of viral genes. Scheme created by myself, based on (1) Supplementary Fig. 1.
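The “protospacer followed by a PAM” rule is easy to illustrate with a short sequence search. The Python sketch below is only an illustration under assumptions that are not stated in the text above: a PAM of the form NGG (any base followed by two guanines, as known for the Cas9 of Streptococcus pyogenes), a protospacer length of 20 bases, and a search on the given strand only.

```python
def find_targets(dna, protospacer_len=20):
    """Return (start, protospacer, pam) for each site followed by an NGG PAM.

    Assumptions: the PAM is NGG, the protospacer has a fixed length, and only
    the strand that is passed in is searched.
    """
    dna = dna.upper()
    hits = []
    # The last possible start leaves room for the protospacer plus 3 PAM bases.
    for start in range(len(dna) - protospacer_len - 2):
        pam_start = start + protospacer_len
        pam = dna[pam_start:pam_start + 3]
        if pam[1:] == "GG":
            hits.append((start, dna[start:pam_start], pam))
    return hits


if __name__ == "__main__":
    # Made-up toy sequence: 20 bases, then the PAM "TGG", then two more bases.
    toy = "ACGTACGTACGTACGTACGT" + "TGG" + "CC"
    for start, protospacer, pam in find_targets(toy):
        print(start, protospacer, pam)
```

A real target search would also have to check the opposite strand and possible off-target sites elsewhere in the genome.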

Even though Fig. 1 depicts some of the molecular details that occur during a CRISPR/Cas mediated response there is probably more to the system. Especially for the adaptive phase, which governs the incorporation of DNA fragments into the bacterial genome, important functional key features are still unidentified. Soon after the first functional properties of the CRISPR/Cas system became evident in the late 1980s researchers began to hypothesize about the biotechnological usability of this defense system. Since then the expression and interference phases have been studied very extensively, and especially during the last couple of months different researchers have gained some exciting insights with regard to an actual application. Emmanuelle Charpentier and coworkers described in a proof-of-principle study how a fused and custom made crRNA-tracrRNA can be applied to target sequences of interest in a DNA plasmid and in addition identified the two Cas9 protein domains that are responsible for the double-stranded target DNA cleavage (Fig. 2) (1). Partially based on Charpentier’s groundbreaking work, in a second and third paper published last month, researchers from the Massachusetts Institute of Technology and Harvard Medical School describe an exciting approach to silence entire gene loci in mouse and human cellular DNA. The key to successfully targeting specific sequences in eukaryotic cells seems to have been the co-delivery of an expression vector including both pre-crRNA sequences and the Cas9 genes (2). This approach also makes use of the chimeric crRNA-tracrRNA hybrid (Fig. 2) that mimics the naturally occurring crRNA:tracrRNA duplex described by Charpentier and colleagues (1). In a parallel study targeting rates of 4 to 25% under different conditions are described. In addition 40% of all human exons are identified by a bioinformatic approach as being potentially available for CRISPR/Cas silencing. Cloning these target sequences into a 200 base pair format for the first time allowed the creation of a library describing potential target sites in the human genome (3).

Fig. 2: Schematic depiction of the naturally occurring crRNA:tracrRNA duplex which in conjunction with Cas9 is able to cleave viral DNA in a target specific manner (top). By creating a linker loop that fuses two functional RNA sequences it became possible to engineer a crRNA-tracrRNA chimera that, based on experimental evidence, has the same functionality as its natural counterpart and has been proven to be very effective for genome targeting purposes in eukaryotic organisms (bottom). Based on (1) Fig. 5.

I consider the CRISPR/Cas system extremely interesting because of its seemingly simple nature, consisting of only a cleavage protein, a target, and a target-identifier. Whether this is the whole story remains to be seen, but some of the most important functional elements now seem to have been identified on a molecular level. The construction and usability of an artificial linker which connects RNAs and endonuclease demonstrates how far knowledge has progressed. Or in Richard Feynman’s famous words: “What I cannot create, I do not understand”. We might not understand it completely, but at least we can “create” an important part of it. I am confident that CRISPR/Cas will play a very important role in bacterial/industrial biotechnology and also genome editing in the future because it dramatically decreases the challenges that accompany the silencing of eukaryotic genes in their native context. By further advancing knowledge and consequently the technology it might even become possible to use the system for genome editing by achieving controlled, sticky-end-like double-stranded breaks. This would enable the ligation of desired sequences into eukaryotic genes.

It would not be the first time that a giant leap across the phylogenetic divide is made. The bacterial Taq polymerase used during PCR every day around the globe is just one example of how small the world can be at the molecular level. Darwin would have enjoyed it.


(1) Jinek M. et al., A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity, Science 2012 Aug 17;337(6096):816-21.

(2) Cong L., et al., Multiplex Genome Engineering Using CRISPR/Cas Systems, Science. 2013 Jan 3 [electronic publication ahead of print].

(3) Mali P., et al., RNA-Guided Human Genome Engineering via Cas9, Science. 2013 Jan 3 [electronic publication ahead of print].

Random impressions

February 7, 2013

No biology today, just photos from Boston. Click here for a little soundtrack and click the individual photos for higher resolution.


Getting visual

December 9, 2012

My current project explained in a short video. For biological science beginners and only in German. My apologies for these restrictions!

Check the video HERE.

Seeing is believing…

November 30, 2012

…and understanding!

Most cellular processes are ingeniously and non-intuitively orchestrated. In addition they contain a large number of components. Understanding all this can pose a challenge. Not surprisingly research experience over the last decades has shown that it actually is a challenge! When ignoring crystal structures, the bulk molecular knowledge about cellular processes comes from indirect evidence such as blots, gels and other traditional techniques. Would it not be interesting to observe processes (1) in vivo, (2) at high resolution and magnification, and (3) in real-time? Probably not many biologists will disagree with this. A major challenge towards this aim, however, is the nature and chemical structure of the cell surface/membrane. From an optical perspective this lipid bilayer structure can be viewed as a so-called “opaque surface”. First of all this means that it is non-transparent and scatters light (Fig. 1). Looking through a sample therefore becomes impossible because the incoming light is so distributed all over the place that it is hard to attach any meaningful information to it.

Fig. 1: Plane waves that hit a rough surface are scattered instead of being directly reflected. The resulting random speckles lead to a distorted image of the object in the human eye (Source: http://physicsworld.com/cws/article/news/2007/aug/15/opaque-lens-focuses-light).

In the past, techniques have been developed to extract information from opaque layers which at least let through a small amount of direct beams. Here, however, I want to present a new approach which has recently been published by a team of Dutch and Italian researchers who managed to retrieve images through a completely opaque surface.

Two layers of material were used for this study. The first layer is the opaque layer which scatters the light and does not allow a direct observation of a 50 μm sized object on the second layer (Fig. 2 (A)). In this case the object to be imaged was the Greek letter pi made out of fluorescent polymers. The two layers are 6 mm apart, which is a large distance considering the size of the object. As depicted in Fig. 2 (B) a laser with 532 nm wavelength is directed at the first opaque layer and serves as the light source. Due to scattering of the light that passes the first layer and due to scattering of the light which fluoresces back through the first layer, it becomes impossible to identify which object is hiding behind the layer (Fig. 2 (C)). It is, however, possible to measure the overall amount of light that originates from the fluorescent object. The researchers now assumed that all the information that one would need in order to reconstruct the image is already contained in this recorded light. Due to scattering and speckling this information is disorganized and cannot be read out by conventional means (such as our eyes and brain). Of course the key element of the study was to develop an algorithm and a technical procedure that were able to extract meaningful information from the chaotic light mix (Fig. 2 (D)). In the following I will explain this procedure in a bit more detail. Since I am NOT a physicist I omitted some of the details, but I am sure that I am mentioning enough facts to understand the technique.


Fig. 2: Experimental design involving an opaque first layer and a second layer with a fluorescent object that could be retrieved by an algorithm involving scanning over different angles and autocorrelating the resulting information. For details see text (Source: Press release “Looking through an opaque material” by the University of Twente, the Netherlands).

Four variables are essential for being able to recalculate the nature of the object behind the opaque layer. The first is the object’s fluorescent response O, which roughly translates to the fluorescence intensity at a given point in space. The second is the speckle intensity S, which describes the amount of light speckle formation, also at a given point in space. The third is the angle θ at which the laser shines through the first layer. The fourth is the measured overall light intensity I.

During the course of the study, the angle of incident light was slightly varied. By iteratively changing the laser angle and measuring the overall intensity I each time, it became possible to calculate correlations between all four factors. The interesting part is that these correlations are the foundation for organizing the information which is hidden in the scattered fluorescent light and which was previously thought to be totally random.

The first step in order to achieve this was the discovery of the relation between the incident laser angle and the measured fluorescent intensity I. Interestingly, the speckle intensity S and the object’s response O remained largely unchanged under one set laser angle. This enabled the researchers to use nine different angle scans, which yielded enough information to autocorrelate S and O. Spatial information could now be extracted from the previously randomly distributed intensity, because the relationship of how S, O, and θ influence I was known.
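Why an autocorrelation helps can be tried out with a toy calculation. The Python sketch below is a one-dimensional illustration under my own simplifying assumptions, not the authors’ algorithm: changing the angle θ simply shifts a fixed speckle pattern S across the object O, and the measured intensity is I(θ) = Σ O(x)·S(x − θ). The numbers are made up. Under these assumptions the autocorrelation of I equals the autocorrelation of O blurred by the autocorrelation of S.

```python
def circular_cross_correlation(a, b):
    """Return c with c[shift] = sum over x of a[x] * b[(x - shift) % n]."""
    n = len(a)
    return [sum(a[x] * b[(x - shift) % n] for x in range(n)) for shift in range(n)]


def autocorrelation(a):
    """Circular autocorrelation of a list of numbers."""
    return circular_cross_correlation(a, a)


def circular_convolution(a, b):
    """Return c with c[k] = sum over x of a[x] * b[(k - x) % n]."""
    n = len(a)
    return [sum(a[x] * b[(k - x) % n] for x in range(n)) for k in range(n)]


# Toy data (made up): a sparse fluorescent object and an irregular speckle pattern.
obj = [0, 3, 0, 0, 1, 0, 0, 0]
speckle = [2, 0, 1, 5, 0, 3, 1, 0]

# Total fluorescence recorded for each shift theta of the speckle pattern.
intensity = circular_cross_correlation(obj, speckle)

# Autocorrelation of the measurement versus the blurred object autocorrelation.
lhs = autocorrelation(intensity)
rhs = circular_convolution(autocorrelation(obj), autocorrelation(speckle))

if __name__ == "__main__":
    print(lhs == rhs)
```

If the speckle autocorrelation is sharply peaked, the blur is small and the autocorrelation of the measured intensity comes close to the autocorrelation of the hidden object, which is the kind of spatial information the paragraph above refers to.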

In the same paper the authors also demonstrate the use of this technique for imaging a more complex biological sample hidden behind an opaque layer. However, the amount of detail of such a sample is much greater than the amount of detail of a relatively simple pi letter and reconstruction becomes much more calculation intensive. In order to solve this issue the resolution had to be decreased by increasing the size of the speckle spots, thereby lowering the amount of incoming information. Despite these practical limitations the researchers have clearly demonstrated that it is possible to noninvasively image through an opaque layer, such as a cell membrane. I am sure that this discovery has great potential for molecular and cell biological in vivo studies. This potential is enhanced even more by the possibility of increasing resolution by decreasing speckle spot sizes and by introducing 3D imaging by measuring speckle patterns in an additional direction.

Article: Jacopo Bertolotti, Elbert G. van Putten, Christian Blum, Ad Lagendijk, Willem L. Vos, Allard P. Mosk, Non-invasive imaging through opaque scattering layers, Nature 491(7423), 232–234, 2012.

Since November 1st I have been doing research on the mechanisms of polymerase exchange under DNA damage conditions in the laboratory of Dr. Joseph Loparo at Harvard Medical School in Boston, USA.

In case you do not belong to the small group of people who intuitively know what my project entails, this page intends to make it clearer.

Below you see a photograph taken out of my window at night. It symbolically stands for the need to regulate complex processes which are required to achieve a certain task. As with traffic, a constant “flow” is also important for DNA replication. Every cell contains the same DNA so replication events are constantly occurring during life. However, there are certain “stop-lights” involved which come into play under certain conditions. During rush-hour traffic lights are controlled in a different manner than at 3 o’clock at night (at least this is desired).


In a living cell it’s almost always rush-hour. Still, many checks lead to a surprisingly perfect flow with very few mistakes occurring. Despite this, external “stress factors” can distort the flow. Imagine the driver on the right would not care about his or her red sign. The following accident would require the set-up of a detour around the site of the accident so that a minimum of traffic flow can be guaranteed. However, this detour will make the overall traffic situation less efficient (especially during rush-hour!!) and everything will take way more time than needed. But at least a total breakdown can be prevented.

The sun’s UV light leads to similar accidents in the flow of DNA replication, because it changes the regular DNA structure and thereby causes the formation of so-called “roadblocks”. The regular enzyme that catalyzes DNA replication, called Polymerase III, cannot replicate across these roadblocks. Polymerase III normally guarantees high-fidelity DNA replication, meaning that there is an extremely low copy error rate. A problem related to this high accuracy is that Polymerase III is not very tolerant of DNA damage. So in order to prevent a total DNA replication breakdown a “detour” enzyme comes in (for example Polymerase II or IV). These enzymes are built differently and are more tolerant of DNA roadblocks. However, they slow down the whole replication process and can lead to even more errors when used for too long.

This is why the use of Polymerases II and IV needs to be tightly regulated. Despite the importance of this process, not a lot is currently known about it. And this is what my work will focus on.

The following explanations might become a bit more technical, but I will try to keep that to a minimum. To recap your knowledge of the general DNA replication process I want to introduce Fig. 1. It depicts the protein complex that is necessary and sufficient for DNA replication in Escherichia coli bacteria. Together these proteins constitute the replisome.

Fig. 1: Schematic overview of the replisome. DNA helicase unwinds the double-stranded DNA. Two single DNA strands arise, one in 5’-3’ and the other one in 3’-5’ direction. Because the DNA polymerase complex is only able to synthesize in the 5’-3’ direction in a continuous fashion, the inverse direction needs to be synthesized in small bits called Okazaki fragments. The strands are therefore termed “leading” and “lagging” strand, respectively. Primase attaches primers to the DNA so that replication can be repeatedly initiated on the lagging strand. In E. coli the two DNA polymerases are tethered to the helicase by the γ clamp-loader complex, which is especially required to load the β clamps through which the αεθ DNA polymerase subunit attaches to the single DNA strand. Single strand binding proteins (SSBs) ensure that the single strand does not coil up and remains accessible for the lagging strand polymerase (based on: van Oijen, Loparo, Annual Reviews Biochemistry, 2010).

Back to the switching between the polymerases under DNA damage conditions: This process is termed translesion synthesis response (TLS) because the DNA roadblock (lesion) is bridged. A few aspects of TLS are already known, but in order to understand what they entail it is important that you have a good grasp of Fig. 1.

It has been shown by O’Donnell and coworkers (Molecular Cell, 2005) and Sutton and colleagues (PNAS, 2009) that the β clamp, which tethers the αεθ Polymerase III subunits to the DNA, plays an essential role during TLS. Fig. 2 explains why this is the case.

Fig. 2: Structure of the DNA β clamp with its rim and cleft contact sites for Polymerase IV. Within the structure of the β clamp two hydrophobic clefts have been identified. These seem to be able to accommodate certain amino acid residues of the so-called “little finger” domain of Polymerase IV. This essential key-lock mechanism is supported by additional rim contacts (Loparo, based on Bunting et al., EMBO, 2003).

Therefore the β clamp seems to serve as the basis for polymerase exchange during TLS.

The next question would be: How can we study the dynamical changes between Polymerase III and the other polymerases such as Polymerase II or IV?

In Dr. Loparo’s laboratory several different methods have been developed to observe these changes at the single-molecule level. The focus of my project will lie especially on the TLS polymerase II (Pol II) and how it is able to replace the replicative polymerase III (Pol III). Pol II is special with regard to its high fidelity and its proofreading capability, provided by its 3’-5’ exonuclease. These features are not present in the other translesion polymerases and make Pol II an interesting object to study. There are also indications that the Pol III – Pol II exchange might work differently than the Pol III – Pol IV exchange. Single-molecule biology only works when methodology originating from physics is combined with biologically relevant questions and classical biochemical techniques.
Traditional biochemical studies have produced almost all of our current knowledge of DNA replication and continue to do so. However, bulk measurements seem to average out the dynamic states which are so essential for protein function. Single-molecule and fluorescence methods are a powerful means to study the functional trajectories of proteins while they are at work. Studying the formation of (replisome) complexes is a very interesting application and has been demonstrated to be successful. The development of procedures to study the mechanisms and stoichiometry of polymerase exchange very locally and quantitatively will be central to this project.
Fluorescent organic molecules (dyes) and inorganic nanocrystals (quantum dots) will be used to label proteins and DNA. Innovative laser microscopy techniques such as Total Internal Reflection Fluorescence (TIRF) and Förster Resonance Energy Transfer (FRET) microscopy can then be applied to image and quantify the associations of the labelled single molecules.

Central to these approaches is a single-molecule primer extension assay. It uses a microfluidic flow cell in which a DNA molecule is attached to a glass surface and is then stretched by a laminar buffer flow (Fig. 3, top). Extending this linear molecule and observing the change with fluorescence microscopy techniques allows conclusions to be drawn about polymerase behaviour and dynamics (Fig. 3, bottom).

Fig. 3: The principle of a laminar flow cell which can be used to determine the single-molecule dynamics of polymerases (top) and an example of the fluorescence patterns that can consequently be observed by dark field microscopy (bottom). See the text for more details (based on: Tanner & van Oijen, Methods in Enzymology, 2010).

Based on these data (DNA length extension and time) it now becomes possible to actually observe polymerase switching. Previous studies have shown that Pol III is much faster in synthesizing long DNA molecules than other polymerases. By determining the difference between polymerase extension speeds it therefore becomes possible to determine which polymerase is active. Furthermore, it has been determined that a single Pol III synthesizes about 900 base pairs before it leaves the replication fork and a new Pol III comes in. Between these events a small pause always occurs. The length of the pause depends on the polymerase concentration. This is logical: at a higher polymerase concentration a switch can occur faster because there are simply more molecules available in a certain location. By measuring a number of single-molecule trajectories under different conditions and with different polymerase types it becomes feasible to characterize polymerase dynamics. Fig. 4 shows the result of a plot where two different polymerases were tested. Because the pause length and the reaction speed are known it is possible to distinguish between Pol III (fast) and Pol IV (slow).

Fig. 4: Time and DNA length info from a flow-stretching assay helps to identify switching dynamics of polymerases if synthesis speed and pausing behaviour are known (see also Fig. 3).
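
For readers who like to see the idea in code, here is a minimal sketch of how a trajectory of time and DNA length could be split into fast, slow and paused stretches. It is not the analysis used in the lab. The function names, the two rate thresholds and the example numbers are all made up for illustration, and reading “fast” as Pol III and “slow” as Pol IV only makes sense if the real synthesis speeds are known beforehand.

```python
# Sketch: label stretches of a single-molecule trajectory by synthesis rate.
# Thresholds and example data are illustrative assumptions, not measured values.

def label_intervals(times_s, lengths_bp, pause_max=5.0, fast_min=100.0):
    """Return (start, end, rate_bp_per_s, label) for each pair of neighbouring points.

    pause_max: rates at or below this value (bp/s) count as a pause (hypothetical).
    fast_min:  rates at or above this value (bp/s) count as "fast" (hypothetical).
    Negative rates (e.g. from noise) also fall into the "pause" class.
    """
    if len(times_s) != len(lengths_bp):
        raise ValueError("times_s and lengths_bp must have the same length")
    intervals = []
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        if dt <= 0:
            raise ValueError("time points must be strictly increasing")
        rate = (lengths_bp[i] - lengths_bp[i - 1]) / dt
        if rate <= pause_max:
            label = "pause"
        elif rate >= fast_min:
            label = "fast"
        else:
            label = "slow"
        intervals.append((times_s[i - 1], times_s[i], rate, label))
    return intervals


def merge_runs(intervals):
    """Merge neighbouring intervals that share a label into (start, end, label) runs."""
    runs = []
    for start, end, _rate, label in intervals:
        if runs and runs[-1][2] == label:
            runs[-1] = (runs[-1][0], end, label)
        else:
            runs.append((start, end, label))
    return runs


# Made-up trajectory: fast synthesis, then a pause, then slow synthesis.
times = [0, 1, 2, 3, 4, 5, 6, 7]                    # seconds
lengths = [0, 300, 600, 600, 600, 610, 620, 630]    # base pairs

runs = merge_runs(label_intervals(times, lengths))
for run in runs:
    print(run)
```

With real data the trajectories are noisy, so one would probably smooth them or fit line segments before comparing slopes. The simple point-to-point rate is used here only to keep the sketch short.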

Most of the approaches described above and the knowledge derived from them are only valid for Pol III/Pol IV switching. The situation might be different for Pol III/Pol II switching. Elucidating Pol II switching behaviour with the assay described above will be a main aspect of my project.

Please stay tuned for updates!