the community site for and by developmental biologists

6 thoughts on “Publishing ‘dirty’ data”

  1. What’s the difference between the deliberately blackened out part and the 4 triangles of black created by cropping? Both show black in place of off-subject parts of the image. I’m not sure then how image C is any better than image B by the standard, “Unacceptable manipulations include the addition, alteration or removal of a particular feature of an image.” Perhaps close cropping should be allowed, but raw data should be provided in a supplement so that anyone can see the cropping?

  2. You’re absolutely right and I did allude to this – there’s no real difference between selective cropping and the blacking out shown here. Except that, from a practical point of view, we can tell when someone’s deleted something within the field of view shown, but not necessarily when they’ve deleted something by cropping. Showing the raw data as supplementary material would be a great solution, but would you be willing to go to all the hassle of assembling that file?

    But then of course, you can go one step further: who’s to say that you didn’t set your field of view in the microscope selectively?!

    I guess if this is the main kind of image manipulation we’re seeing, we’re on the lucky side: it’s very much beautification not fraud. But I’m struck by how much pressure people apparently feel they’re under to produce ‘beautiful’ images, rather than worrying mainly about scientific content. At Development, we’re proud of our reputation for beautiful figures and images, but if that creates a pressure that pushes people towards inappropriate data manipulation, then we should be worried…

  3. While I agree that image manipulation is not an acceptable practice, the issue of cropping is a very challenging one. As to archiving raw data and making it available to others, I point you to The Cell: An Image Library-CCDB. We accept submissions of images from the public, so why not start referencing entries in The Cell much as nucleic acid sequences were first referenced in GenBank? Put the pretty pictures in the journals (not manipulated, but possibly cropped) but put all the image data in its raw form into The Cell for future analysis by others.

    1. Thanks for the tip David: we’re actually looking into archiving resources at the moment, and while The Cell Image Library wouldn’t be appropriate for all our images, it might be a good place for certain data types. But would you be able to deal with the increase in volume if multiple journals started asking their authors to deposit raw data with you?

      And this still doesn’t get round the ‘hassle factor’ of having to find and upload all this data! Perhaps I’m being unfair to authors, but I fear that many would find it frustrating if – at the point of acceptance – they were asked to go back and find all the original data to make it available for the community: as a Supplementary file with the journal or as archived data in your or other databases. You’re right – we routinely demand this for sequences, microarray data and the like, but can or should we do the same for all other data types?

    2. Katherine,
      We would definitely be able to deal with the increase in volume. In fact, that is one of our goals, to be the central repository for microscopy imaging data.

      As to the hassle factor, there are a couple of options. First, if the community moves towards the GenBank model, images would be deposited with The Cell prior to publication and receive an accession number. This accession number would be required by journals, and the images (data) would be stored at The Cell, with only illustrative images shown in the journals.

      Another option is for images to be deposited with The Cell at the time of article submission (while everything is current for the researcher) and held during an embargo period and then released to the general public through The Cell.

      I believe many of the funding agencies are looking to ensure that the research that they fund is made publicly available and open and we can serve that purpose at The Cell.

      As to your last question – should we? – I think it is very shortsighted of the community to not think that techniques and technology will continue to develop and that the analysis of this image data will also develop further. By not storing and cataloging this data now we are missing a great opportunity. Just as evolutionary studies of proteins could not have occurred until we started to amass all that data in one database, who knows what opportunities we are missing by not amassing all this image data in one place.

  4. Not sure if having everything in one spot will help, to be honest. One always has to build on the honesty of other people in the community. Not sure if (one more) huge database will help.
    I think one should just stay as critical as possible and demand additional, well-controlled experiments when something looks funny.

