One simple lifehack for your lab book

I’ve seen people use many different systems for documenting their lab work: a traditional hardcover notebook, sheets of paper in ring binders, electronic notebooks, post-it notes stapled together, even permanent marker directly on the lab bench…

Personally I like to have a plain lab notebook that I write in by hand. It’s a calming ritual, at the end of the day, to think over what I’ve done and to paste in gel images and the like. But beyond the haptic pleasures of writing and amateur arts-and-crafts, paper notebooks have a feature that digital notebooks cannot replace.

One number to rule them all…

Paper notebooks are linear. What I do is number the pages in my lab notebook sequentially, and continue the numbers even when I start a new volume. I don’t start a new book for each new project, and I keep everything in strict chronological order. If your lab has pre-numbered books, that’s fine too: just prefix the page numbers with the volume number.

You might ask: isn’t that just how page numbers work? Where’s the hack? Here’s my secret: I use the page numbers to label everything associated with the experiment on a given page, from tubes and vials to computer files. That has simplified many things in the lab for me.

If I find a tube in the freezer, I can always be confident that I know where it comes from. If I find some imaging files on the server, I can find which slides they come from and which experiment they were associated with.

Say goodbye to tube-labelling woes

It’s also much easier to label lab vials and slides with this method. There’s not much space for labelling on most labware. I’ve seen people struggle to squeeze in their initials, the date, and a description of contents in the one square centimeter of usable surface on a typical Eppendorf tube.

All I need to write is the lab book page number (I’ve reached three digits now, but don’t think I would go beyond four in my career) and the tube’s own number, max four to five characters. All other details go into the lab book. The tube has a unique identifier within the context of my lab space and documentation, and there’s only one place I have to look. No fancy relational databases, barcodes, or QR codes required.
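
The same identifier can be reused for file names on the server. Here is a minimal Python sketch of the idea; the exact format (a dash between page and tube number, an optional volume prefix followed by a dot) and the function names are my own assumptions for illustration, so adapt them to whatever you actually write on your tubes.

```python
import re

# Hypothetical label format: "[volume.]page-item", e.g. "142-3" or "2.57-12".
LABEL_RE = re.compile(r"^(?:(?P<volume>\d+)\.)?(?P<page>\d+)-(?P<item>\d+)$")


def make_label(page, item, volume=None):
    """Build a label from a notebook page number and a tube/item number."""
    prefix = f"{volume}." if volume is not None else ""
    return f"{prefix}{page}-{item}"


def parse_label(label):
    """Split a label back into volume (or None), page, and item numbers."""
    match = LABEL_RE.match(label)
    if match is None:
        raise ValueError(f"not a valid label: {label!r}")
    return {
        key: int(value) if value is not None else None
        for key, value in match.groupdict().items()
    }
```

A file from the experiment on page 142 could then be named something like `make_label(142, 3) + "_confocal.tif"`, which points straight back to the notebook page.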

Why this works for me

What is clear is that some self-discipline is needed to keep things coherent. Lab work rarely goes in one straight line, and most people have several projects going at the same time. Some people tout digital lab books as the ultimate solution, because you don’t have to keep them linear but can create separate work spaces, attach digital files, edit them collaboratively, etc.

In the lab, projects and experimental plans continually branch and fork; the only thing that constantly moves in a single linear direction is TIME. The main attraction of digital notebooks – that they allow you to branch and fork your documentation as your projects develop – is also their biggest drawback. It creates more complexity, when as a scientist my work would often benefit from pruning back such complexity aggressively.

On the other hand, a good digital content management system is a huge help with organizing projects and collaborations. My institution uses Confluence and I think it works well for most of what we need to do. By no means am I suggesting that you should only use a paper notebook. Optimized protocols, written reports, and freezer content lists – things that can change and be shuffled around – go into the CMS, but the main record of experimental work goes into the paper notebook.

Tips for keeping a sane notebook

  • Choose a hardcover notebook with sturdy binding, no flimsy ring-bound books!
  • Start new entries on the recto page (the right hand side of two facing pages) so it’s easier to flip through and browse. Leaving the facing verso page blank is useful for adding notes or cross-references afterwards (but NOT new results).
  • Underline or highlight the headings so you can skim your records quickly.
  • Use glue, not tape, to paste things in. Cheap tape can often come loose or discolor, especially with the type of glossy paper that is used for photo printers.
  • Don’t put off writing in your notebook. Even if you are in a hurry, scribble down what you can and leave a blank space so that you can come back and write it out properly later.
  • Choose a book with lightly colored rules or squares, otherwise they will show up too dark in scans or photocopies.
  • Scan each volume of your notebook as a digital backup when you are done.

I don’t always follow all my own tips, but this is the system that I’ve been keeping to since my PhD, and it has worked well for me. Hope that these tips will be helpful to lab newbies who are considering their options.

Quick guide to parallelisation tools for bioinformatics

Many computational tasks in bioinformatics are “embarrassingly parallel”, that is to say, they can be easily sped up by simply splitting one long job into smaller parts that can be run at the same time on different processors, because the individual jobs do not need to communicate with each other.

When I started out in bioinformatics, I knew that some software tools had parallelization built in, e.g. whenever you see a command line option like “-CPUs” or “--threads”. When I started writing my own scripts, though, I wanted to learn how to speed things up for myself.

Here are a few tools that I’ve found useful for this, from the simplest, which can be run on a single workstation, to more complex ones that are suitable for a cluster.
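
As a minimal illustration of an embarrassingly parallel job on a single workstation, here is a Python sketch using the standard library’s concurrent.futures module. The GC-content function and the toy sequences are made up for illustration; any function that processes each input independently of the others would fit the same pattern.

```python
from concurrent.futures import ProcessPoolExecutor


def gc_content(seq):
    """Fraction of G and C bases in one sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_content_parallel(seqs, workers=2, chunksize=1):
    """Compute GC content for each sequence in separate worker processes.

    Each sequence is handled independently, so the workers never need to
    communicate with each other. Results come back in input order.
    """
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(gc_content, seqs, chunksize=chunksize))


if __name__ == "__main__":
    # Toy input; in practice this would be read from a sequence file.
    sequences = ["ATGC", "GGCC", "ATAT"]
    print(gc_content_parallel(sequences))
```

The `if __name__ == "__main__":` guard matters on platforms where worker processes are started by re-importing the script. For very short tasks like this one, the overhead of starting processes outweighs the gain, so this only pays off when each unit of work is reasonably heavy.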


Symbionts as rechargeable energy and nutrient “batteries”

I recently saw a new paper on Stentor ciliates whose symbiotic algae store carbon internally in the form of starch granules. The authors spent five years trying to get this species of Stentor into culture: admirable effort and persistence!

The study reminded me of another symbiotic system, the Paracatenula flatworms and their endosymbiotic sulfur bacteria Candidatus Riegeria (Alphaproteobacteria: Rhodospirillaceae). These bacteria store carbon in the form of polyhydroxyalkanoate (lipid) inclusions, which actually occupy quite a large proportion of the bacteria’s cell volume (see Figure 3 here; the red blobs are the lipid inclusions stained with a fluorescent dye). This was research that I was involved in as a graduate student; the project was led by Oliver Jäckle and supervised by Harald Gruber-Vodicka.

One striking feature observed by microscopy in the Stentor algal symbionts was the so-called “Maltese cross” pattern seen in starch granules when they were viewed under polarized light. In the flatworm symbionts, we observed a faint dark band in each of the lipid inclusions (stained with Nile Red dye) when imaged with confocal laser scanning microscopy. We figured that this was some kind of optical artefact because it didn’t appear to correspond to anything we saw in the electron micrographs, and the band was always oriented in the same direction in all lipid inclusions in the same image. It’s a minor point, and probably not very interesting to the biology, but it was an odd kind of parallel that intrigued me when I saw the Stentor paper.

The more obvious parallel, though, is the ecophysiological interpretation of this carbon-storage trait. Many algae and bacteria have storage inclusions for carbon (e.g. lipids, starches) and energy. Some of the carbon-storage polymers do double duty as energy reserves, so these functions can’t be neatly isolated from each other in real life. In both Stentor ciliates and the Paracatenula worms, the storage happens in an endosymbiont. Because they are physically contained inside the host organisms, the symbionts can only take up nutrients that first pass through the host along the way, and the storage and release of these substances has an effect on both host and symbiont fitness.

The starch storage in algal symbionts of Stentor was interpreted as a survival strategy for nutrient poor (oligotrophic) conditions in the freshwater lake and pond habitats where they were found. The symbiotic bacteria in Paracatenula worms are also able to fix CO2 to make biomass, but they can be limited by sources of chemical energy that fuel this process, which may have patchy availability in the environments where they are found. So for both these organisms, it makes sense to store up for a rainy day, and to tap into the physiological capabilities of their respective symbionts to play the role of this “rechargeable battery” for them.

That said, maybe this interpretation is too neat and simple. Algae can photosynthesize after all, and CO2 is not a limiting factor, so why would they need to store starch? On the other hand, oxygenic photosynthesis may be stressful for the host because it produces reactive oxygen species as a waste product, so the algae may rely more heavily on organic nutrients (mixotrophy) than we think. Still many puzzles to chew over…

How I write peer reviews

As a scientist, writing peer reviews of other scientists’ papers is part of the job. It’s not remunerated financially (nonscientists are often surprised to hear), and reviews are traditionally confidential so no one but the authors and editor will ever see the amazing review that you poured time and energy into, but it’s a form of community service. After all, we ask others to review our papers so we should pay this service forward.

I’m not an expert on peer review and still have much to learn, but to sort out my own thoughts, here are some notes on how I approach reviewing. The basic principle I try to adhere to is to treat others as I would wish to be treated myself, i.e. to write a review that I would appreciate if I were on the receiving end, even if I don’t agree with the judgement calls. So this is as much about what I don’t like in reviews that I’ve received, as it is about the reviews that I have written.


Qt error over remote login

When working from home, I was using Jupyter Notebook over a remote connection, and wanted to plot some graphics with ete3 (http://etetoolkit.org), a Python library for phylogenetics. I was stumped by the following error message:

qt.qpa.screen: QXcbConnection: Could not connect to display

The notebook had worked before, when I was sitting at my work computer in the office, so why wasn’t it working remotely?

Googling around led to these GitHub issues. The fix appears to be to set an environment variable QT_QPA_PLATFORM to offscreen.

Within a Python notebook this can be done with:

import os
os.environ['QT_QPA_PLATFORM'] = 'offscreen'

Alternatively this can be set in Bash, but I prefer to have it in the notebook so that I won’t be stumped the next time I run the code in there.
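
As a rough sketch, the same fix can be wrapped in a small helper that only switches to the offscreen platform when no display is available. The helper name and the DISPLAY check are my own assumptions, not part of ete3 or Qt. Qt reads the variable when it starts up, so it is safest to call the helper before importing the plotting library.

```python
import os


def use_offscreen_qt_if_headless(environ=os.environ):
    """Set QT_QPA_PLATFORM=offscreen when no X display seems to be available.

    Assumption: an empty or missing DISPLAY variable means there is no
    display to connect to. An existing QT_QPA_PLATFORM setting is kept.
    Returns the resulting value of QT_QPA_PLATFORM (or None if unset).
    """
    if not environ.get("DISPLAY"):
        environ.setdefault("QT_QPA_PLATFORM", "offscreen")
    return environ.get("QT_QPA_PLATFORM")


# Call this first, then import and use the plotting library as usual.
use_offscreen_qt_if_headless()
```

Using `setdefault` means a value that was already exported in the shell is left alone.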

Tips for GNU Screen

Now that I’m doing most of my work remotely by logging in to my work computers from home, I am using GNU Screen routinely to help manage terminal sessions on the command line, and to protect login sessions from dropped connections.

(Alternatives to Screen include tmux (terminal multiplexer), which also lets you open multiple windows as tiles on a single screen.)

However, the default settings for Screen are quite bare, and it’s easy to forget how many Screen sessions you have open. Here are a few tweaks and settings that I find make the experience of using Screen much smoother and more efficient.


Combining filters/gates in openCyto and flowCore

openCyto is an R Bioconductor package that implements automated gating methods for flow cytometry data. It is built on top of the flowCore package for importing and working with flow cytometry data in R.

One useful feature of flowCore is combining different gates with Boolean logic (manual section 7.4). However, I was having trouble figuring out two things: (1) how to use Boolean logic to combine gates defined on openCyto GatingSet objects, and (2) how to apply these combined gates back onto the GatingSet object.

Watching misinformation spread

I recently received a forwarded message that’s been doing the rounds on WhatsApp. It concerns a recently published study on the coronavirus, and I was asked by the sender whether the forwarded message was true or not. Short answer: It’s a mixture of half-truths and straight-up false claims.

In this post, I’ll compare the chain message to what the original scientific article says. This is a useful exercise in seeing how these dubious messages come to be, and why they might compel people to forward them.


Enabling PDF conversion on ImageMagick

ImageMagick is a software suite for processing and converting image files. What’s great about it is that it’s free to use, open-source, and supports a huge variety of image formats. I’ve used it most often as a command line tool to convert and resize large numbers of images.

I recently found that I could no longer convert PDF files with ImageMagick, even though I had done so many times in the past. It turns out that this functionality had been turned off by default because of a security vulnerability in the underlying PDF engine, which has since been fixed. (See this thread on Stack Overflow.)

One way to override this default is to directly edit the ImageMagick policy file. On a Linux system, this is typically installed at /etc/ImageMagick-6/policy.xml. The file itself contains explanatory notes about how it is formatted and used.

The following line controls behavior for PDF files:

<policy domain="coder" rights="none" pattern="PDF" />

To change the rights, modify it to:

<policy domain="coder" rights="read|write" pattern="PDF" />

You will probably need sudo rights to make this change.

Then you should be able to read and write PDF files again with ImageMagick commands, such as convert.
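
To check what the installed policy currently says before (or after) editing it, something like the following Python sketch could be used. It relies only on the standard library. The function name is hypothetical, the path is the typical Linux location mentioned above, and it only looks for coder entries whose pattern is exactly PDF, so entries written differently (for example as a glob covering several formats) would be missed.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Typical location on Linux, as mentioned above; may differ on your system.
POLICY_PATH = Path("/etc/ImageMagick-6/policy.xml")


def coder_rights(policy_xml, pattern="PDF"):
    """Return the 'rights' values of all coder policies matching `pattern`.

    `policy_xml` is the text of a policy file. An empty list means there
    is no coder entry whose pattern is exactly `pattern`.
    """
    root = ET.fromstring(policy_xml)
    return [
        policy.get("rights")
        for policy in root.iter("policy")
        if policy.get("domain") == "coder" and policy.get("pattern") == pattern
    ]


if __name__ == "__main__":
    if POLICY_PATH.exists():
        print(coder_rights(POLICY_PATH.read_text(encoding="utf-8")))
    else:
        print(f"No policy file found at {POLICY_PATH}")
```

Reading the file does not normally need sudo rights; only writing the change does.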