Publishing ‘dirty’ data
Posted by Katherine Brown, on 22 May 2012
How much does it matter that the images we publish are neat and tidy? It’s a question I’ve been dealing with over the past couple of weeks, and I wanted to share some thoughts. Here at Development, as at many journals, we check all figures before publication to try to identify potentially inappropriate image manipulation. Whenever we do come across a figure that doesn’t comply with our guidelines on image processing, we contact the authors to ask for clarification, request that they provide us with the original data – so we can check that nothing fraudulent is going on – and often also ask that the final figure be changed to properly represent the original data. I’m happy to say that problems are few and far between, and that those issues I have come across in the short time I’ve been here have been more a case of beautification than of fraud. But is it okay for authors to ‘clean up’ their images with Photoshop paintbrush tools or the like: not touching the data itself, but rather getting rid of specks of dust or extraneous bits of tissue that are there on the slide?
The images shown here don’t come from any paper, but have been kindly provided by a researcher to illustrate what I’m talking about.

This is a Drosophila wing disc, where clones of cells are marked with GFP, and the entire disc stained with phalloidin in red. Very often in preps like this, you get bits of irrelevant tissue associated with the disc on the slide. But this one looks very clean, right? Wrong. Here’s the original version – you can see that there’s a piece of trachea, stained red, off the left side of the wing disc.

So, thinking that this bit of extraneous tissue is problematic, the researchers have taken the simple solution of photoshopping it out: something that’s very clearly revealed by the standard checks we run on our figures, as shown here.

I’ve seen a few of these cases recently, and in each, the aim of the authors was to ensure that the images were easily interpreted, and that readers weren’t diverted from the data by the extraneous bits of stuff. This may seem innocent, but it could be the first step on a dangerous slope, at the bottom of which lie the clearly fraudulent activities of deleting the bits of data that don’t fit our hypothesis, or making up data that do. Journal guidelines are (or at least should be) pretty unambiguous, and the case above falls foul of this statement taken from our Guide to Authors: “Unacceptable manipulations include the addition, alteration or removal of a particular feature of an image, and splicing of multiple images to suggest they represent a single field in a micrograph or gel.” So while it may seem innocuous, it’s not permitted. Nor is it, at least to my mind, in any way necessary: are we really that easily distracted? Does that little bit of trachea really stop us from seeing the clones in the wing disc? It’s been pointed out to me that the image above could have simply been re-cropped to remove the offending tissue, and if it’s okay to do that, why isn’t it okay to selectively black out those parts of the panel? That’s a reasonable point, and selective cropping is an issue to which I’m not convinced there is a straightforward answer. But I’m guided by the basic principle that the presented data should accurately reflect what you saw down the microscope or on the blot or whatever, and that what may seem irrelevant to you (a higher molecular weight ‘background’ band on your Western) might actually be important to someone else (“Oooh look – this might be a post-translational modification of my protein”).
I well remember from my time in the lab the agony of discovering that the perfect picture was ‘ruined’ by a bit of fluff to the side of the embryo, or because the vibratome knife had left streaks across the section. And then spending hours re-mounting or re-sectioning to avoid these imperfections. But we all know that science can be an inherently messy endeavour: cells don’t grow in neat rows, and Western blots often give us background bands. So why do we need to hide this when it comes to publication? Of course, it’s vital that the data are clearly presented and understood, but what’s most important is that they accurately represent the experiment, and there’s a danger of losing sight of this in the desire for a beautiful image.
Initiatives like publishing all the uncropped blots that have gone into making the figures in a paper (as pioneered by Nature Cell Biology) are aimed at addressing this issue: by all means show only the relevant bit of the blot in the main figure, but for those interested in the (literally) bigger picture, the whole thing – warts and all – is available. But it can be a pain to find and assemble these files, and we don’t want to make publishing harder than it already is – although there’s a school of thought that says if you can’t lay your hands on the original data, you need to be better at archiving it in the first place!
So what do the Node readers think? Have you been tempted to ‘prettify’ your data for publication, or have you actually done it? Are our guidelines clear enough on what you can and can’t do? Do you support initiatives to make the raw data available to the reader, or is it all too much of a hassle? We’d really love your input on what kind of requests or demands a journal should make in terms of data presentation, so please answer the poll below (it’s completely anonymous!) and give us your feedback in the comments section.
Katherine Brown is the Executive Editor of Development


