A primer or two in collegiality and mutual benefit
Posted by Heather, on 2 November 2010
Community resources are usually only as good as the people who use them are numerous and virtuous.
Despite my best intentions, there are not enough incentives out there for me to spend my time validating and then manually entering, into the fantastically useful dbSNP database, the human SNPs that I’ve found while sequencing various candidate genes for diseases. However, with the advent of high-throughput sequencing and the possibility of large-scale genome annotation, I don’t think that my lack of participation makes much of a difference.
It is otherwise with respect to designing and validating primers for PCR. Oh, and if you ever have to teach a trainee about PCR, have a go at the fabulous teaching resources provided by the Cold Spring Harbor Laboratory. Like this pretty video I can’t seem to embed, but you can look at here.
Anyhow, I wanted to draw your attention to RTPrimerDB. It’s been around a number of years, and has been the object of three readily accessible publications in Nucleic Acids Research as a community resource.
About half of the primers are for human gene expression assays of various types, but there are, as of today, more than 800 primers for mouse PCRs of various ilks. I found some to my liking today.
So, in my laboratory, I keep a spreadsheet with a tab for human, a tab for mouse and a tab for chicken. Into this I have added, somewhat indiscriminately and in the order in which they arrive, primers for genomic DNA or cDNA amplification or both, specifying whether they are intended for quantitative or end-point PCR. Actually, to be brutally honest, I haven’t developed any primers for qRT-PCR for the chicken.
I write on a regular basis to authors who are not among those who increasingly do include their primer sequences in their article submissions, because I am under the misguided illusion that I will save time by using assays that have already been validated by someone else.
Woe is me when I presume such a thing. Housekeeping genes as standards are particularly notorious, but many is the time when I have blindly ordered primers according to publications and then been surprised by inefficient amplification under the more-or-less specified conditions, or by a poor melting curve, only to find that even in silico they shouldn’t have worked.
So, here’s to saving a little time and checking in silico.
The first suggestion is to check that your genomic DNA primers will amplify what you expect. For this, I enjoy using the simple PCR module on UCSC’s (wonderful) Genome Browser. You check that you are using the right organism, and sometimes the right “build” – that is, the right version of the genome sequence against which to check, and the rest is self-explanatory. Sometimes it is also nice to double-check primers that span exons, in case you do get some genomic amplification because of contamination, and to see what the expected size would be if that happened.
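To make concrete what such a check does, here is a toy sketch in Python. It is not how the UCSC tool works internally (that is an assumption-free statement: I simply don't reproduce it here); it only looks for exact matches of both primers in a template string you supply and reports the expected product size. The function names are made up for illustration.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def in_silico_pcr(template: str, forward: str, reverse: str):
    """Return (start, size) of the first exact-match amplicon, or None.

    Both primers are written 5'->3'. The forward primer must match the
    template strand as given; the reverse primer anneals to the opposite
    strand, so its reverse complement is searched for downstream.
    Only the first forward-primer site is considered, and no mismatches
    are tolerated: this is a toy, not a specificity check.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None  # forward primer not found
    rev_site = reverse_complement(reverse)
    end = template.find(rev_site, start + len(forward))
    if end == -1:
        return None  # reverse primer site not found downstream
    return start, end + len(rev_site) - start
```

With an invented 38-base template in which the forward primer sits at position 4 and the reverse primer site ends 30 bases later, `in_silico_pcr(template, forward, reverse)` gives `(4, 30)`, that is, a 30-bp product. Run the same primers against a genomic sequence and a cDNA sequence and you can compare the two expected sizes.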
The second is to make use of Primer-BLAST. You know this resource, or should if you don’t yet:
Primer-BLAST was developed at NCBI to help users make primers that are specific to the input PCR template. It uses Primer3 to design PCR primers and then submits them to BLAST search against [a] user-selected database. The blast results are then automatically analyzed to avoid primer pairs (all combinations including forward-reverse primer pair, forward-forward as well as reverse-reverse pairs) that can cause amplification of targets other than the input template.
So, what I didn’t know, but which is perfectly lovely, is the following: Primer-BLAST can check those published primer pairs for you without specifying their target.
That is, you skip the whole first section about PCR Template, and go right to Primer Parameters > Use my own forward primer (and reverse, natch). You don’t have to play around with anything about length or melting temperature, but you scroll right down to Primer Pair Specificity Checking Parameters.
I only change Organism if needed. The field auto-completes, but you need to give it a little time to come up with suggestions when you start typing, e.g. Mus musculus. When I have had doubts as to whether authors who carried out xenotransplantations actually posted amplification sequences for their host or for their donor species, I use the link “Add more organisms” and away we go. Leaving RefSeq RNA (refseq_rna) for the Database is usually fine for checking RT-PCR primers, but there are other options, like “nr” if you want to play it safe (but it takes slightly longer, of course).
For example, let’s say I want to know for sure what part of the Xenopus Chd7 protein was used to make a recombinant peptide to immunize rabbits and develop the polyclonal antibody used in this publication. I don’t specify whether I want Xenopus laevis or tropicalis, but stick with the genus only.
As a result, I find out that the 30-bp primers provided in the Methods section amplify, perfectly and exclusively, a 549-bp fragment of “>NM_001091800.1 Xenopus laevis chromodomain helicase DNA binding protein 7 (chd7), mRNA” – and by following the link, I can figure out which part of the protein it would be. I also know the predicted melting temperatures, which give me an idea of conditions, and I can see alternative amplifications either in other species (Xenopus tropicalis, with a single nucleotide difference in each primer) or in other parts of the genome (just try checking standard primers used against Gapdh sometime). I’ve often seen such single nucleotide differences, which can mean the difference between a PCR that works and one that doesn’t. Typos do happen. Another thing that happened to me today was that I noticed that, of the two housekeeping gene primer pairs provided in a publication, both genes were supposedly amplified with the same primer pair – a simple cut/paste error. While waiting for a response from the authors, who would perhaps have to ask a postdoc long gone from the lab, one can easily find out which gene it is.
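Both kinds of slip, the single nucleotide difference and the pasted-twice primer pair, are easy to catch mechanically once the sequences are text on a computer. A minimal sketch in Python, with hypothetical function names and no real primer sequences, assuming you have already lined up a primer against the site it is supposed to match:

```python
from __future__ import annotations

from collections import defaultdict


def mismatch_positions(primer: str, site: str) -> list[int]:
    """Return the 0-based positions where a primer and a target site differ."""
    if len(primer) != len(site):
        raise ValueError("primer and site must have the same length")
    pairs = zip(primer.upper(), site.upper())
    return [i for i, (p, s) in enumerate(pairs) if p != s]


def genes_sharing_a_pair(assays: dict[str, tuple[str, str]]) -> list[list[str]]:
    """Group gene names whose (forward, reverse) primer pairs are identical.

    `assays` maps a gene name to its published (forward, reverse) primers.
    Any group returned here is a likely cut/paste error in the primer table.
    """
    by_pair = defaultdict(list)
    for gene, (fwd, rev) in assays.items():
        by_pair[(fwd.upper(), rev.upper())].append(gene)
    return [genes for genes in by_pair.values() if len(genes) > 1]
```

This only tells you that something is off, not which gene the pair really belongs to; for that, the Primer-BLAST check described above is still the thing to do.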
Then of course, you can make up your mind as to whether you really want to order primers on the basis of confidence in such a lab’s ability to optimize all the parameters. But that is another story for another day.
Meanwhile, I would be eager to find out the following:
1. How do you keep track of PCR primers in your lab, for any use including that of making templates for in vitro RNA probe transcription?
2. Do you annotate as to whether or not they work, or more subtly, the conditions tried for optimization?
3. Do you systematically ensure that primers appear somewhere with publications that use them, either as cited references, tables or supplementary material, or online on your lab website?
4. If you work in a non-murine, non-human, non-Arabidopsis, non-Oryza kind of model organism, are there other similar public resources of PCR primers? Because aside from these, hardly any other organisms are represented in RTPrimerDB. There are something like 5 pairs for Drosophila or Danio, for example.
Ooh. I see there is a “Poll” tab in my author possibilities, but I don’t know how to use it further. Meanwhile, comments and discussion will be much appreciated.
I can’t even remember how we kept track of primers in the lab. There may have been a database (we definitely had a database for constructs, and any primers used to create the constructs were listed in there.) I also definitely had a binder with all the info on the primers I designed/ordered/used, and for RT-PCR that contained test results including optimal temperatures etc.
Also, I played with the new polls feature before I went on vacation, but I forgot the details. It’s mostly straightforward how to create one from the sidebar menu, but to add it into the post you had to take note of a number and I forgot where that was, so I’ll have to go through it again. It only does multiple choice (not yes/no), so if you can think of a question with multiple answers to choose from, that can be made into a poll. E-mail me question + answers and I can try to add a poll if you want.
So, Eva, what would it take to convince you to enter/upload those validated, tested RT-PCR primers in something like RTPrimerDB?
Because my point is, no matter how good an idea these sorts of things seem to be, it’s rare that we want to devote more than a fleeting amount of time at any one sitting to the commonwealth, without some sort of incentive. I can’t imagine what the incentive would be for you to do that, unless somehow it was something like plasmids, where a person who used them can at least say thank you in the publication, or use the words “generously provided by” etc.
Thing is, I’m not sure that me testing them in *my* system would help someone else. They still have to test if it works for them.
And no incentive indeed. That’s usually the problem with sharing anything… Although there have been some interesting discussions lately (including at Science Online London) about credit/incentive for uploading data, and primers would probably fit that same bill. The idea that’s being thrown around is that if you can actually say on something like a grant application that you shared primers/data, it would eventually *matter*.
Thanks for this post. I did my post-doc in a zebrafish lab, and it was pretty much every person for herself in terms of keeping track of primers for amplifying genes. We were more communally organized when it came to the primers used to amplify the SSLPs important for mapping mutants. More than a few years ago, many of the microsatellites in the zebrafish genome were characterized, and now nearly all of the SSLP primer sequences can be found at ZIRC (Zebrafish International Resource Center) or on the Ensembl genebuild.
Our lab keeps a relatively up-to-date database of all the primers used for mapping, but the only info is the sequence and the genomic location. It would be a great idea to include a column for efficiency of amplification, etc.
To keep track of my personal oligo collection (generally used for amplifying genes/partial genes from cDNA or from plasmids), I use a notebook dedicated to primers which lists the oligo sequence, its use, and sometimes the conditions for optimum amplification. I always think that I should transfer this to a database, but then I am deterred by the mind-numbing work of typing all those sequences. Maybe if I had started out keeping an Excel spreadsheet, instead of a paper notebook… I would feel differently…
I appreciate the pointers to the ZIRC, which I didn’t know about, and of course to Ensembl.org, which I did.
It’s not bad already to have the database the way you do, but wouldn’t it be better for labs to put that sort of thing on their websites at least, on their open lab notebook wikis even better?
I only ever type sequences myself – I am too prone to copy errors if I write them out by hand, and I make ample use of cut-and-paste. So spreadsheets and text documents are mandatory.
There are two interesting topics here. First, how is it that one lab publishes results with some primers and then another lab tries to replicate those results without success? It is assumed that every result in a publication should be independently replicable by any scientist, and this implies that the publication should contain all the necessary information. In the case of PCR, the publication should include the sequences of the primers used, the PCR protocol (template concentration, primer concentration, etc.) and the number of cycles used in every test. In human genetics, it is common to see that info. But in other disciplines, including developmental biology, it is not so common. Researchers should not assume that “every scientist knows how to do a PCR”, because that’s not the point. The point is the whole idea of a publication.
The second issue here is the behaviour of the research community. Some journals establish in their author guidelines that authors agree to share reagents, vectors, primers, etc. with other researchers on request. But in reality, once the paper is published, some authors refuse to share those reagents. Some authors also do not publish detailed protocols, and when a fellow researcher emails them, they don’t answer. That is annoying, and journals should have mechanisms to punish authors who do not respect the guidelines. In the most extreme cases, the journal might even retract the paper, especially when authors systematically refuse to share detailed protocols and reagents, because such behaviour indicates that something is wrong, or that the results are not easily reproduced, which goes against the idea of a scientific report.
As a developmental biologist by training but working in human genetics, I can tell you that it’s not instinctive for many of my colleagues to include even primer sequence information. Conditions are, like for immunohistochemistry, only really necessary if they are not standard, though a Tm can certainly save time. I don’t have a problem with these shortcuts.
I do have a problem with malicious tampering, that is, incomplete disclosure. It is easy enough to chalk things up to a typo, but I have seen a great many “mistakes” and omissions of necessary information, which make me think that, on occasion, someone in the author list wants to control who does what with the information and oblige a scientist to write to them and reveal their interest. Perhaps as a potential competitor?
Anyhow, I firmly believe that when a “corresponding author” does not correspond, the journal is under some responsibility to intervene. I doubt this would ever go to the point of retraction, but it would probably be effective for a representative of the journal to point out that continued behavior of the sort would preclude any other submissions to that journal, and that reputations get made and broken very easily.
We always publish primer sets for any PCR-based assay we use in a paper. This is important for others to be able to repeat your work.
In ancient times (1980’s-1990’s), for our first thousand oligos made, we used to scrupulously keep a log book of each oligonucleotide sequence, its extent of purity, and details on its use. Since automated synthesis has become much cheaper and we usually make many oligos at once, we now just keep the quality control page from the prep, which includes the sequence and basic information about the oligo. The quality control pages are kept in a lab resource binder. Each oligo is still given a lab number, in the binder, and stored in special boxes in our -80°C freezer. We have over 2500 oligos in the lab. So many are made and the cost has gone down, such that it is no longer economical to enter and check the sequence of each one in a database.
Thanks for your thoughts, Ken. So, we do sort of the same: we of course keep such binders with all the sequences in them. However, I still think that is not the most efficient solution.
a. what if the binder is destroyed somehow? (Well, perhaps you can pull the sequences out of the supplier again, to a certain extent, and out of your publications, for the ones that were published, and kudos for your policy!)
b. Searching for information is not easy in such a binder – or your earlier log book – and at some point the primer sequences actually were text strings on a computer, to place the order. Why stay analog when it is possible to paste them into a digital file, and later search by keyword or sequence, as well as to store that precious resource in more than one location?
(I’ve been through a lab fire before.)
Now, for everything from the past, I totally understand why no one wants to go back and whip their primer data into standard shape. But prospectively? Or for re-orders?
And, my point is, wouldn’t it be even MORE fantastic if there were a way to upload such files, perhaps standardized in format, to a central resource for the entire community? That is, no more work than the original entry to place the primer order itself, and then all one would have to do is check off a “validated” kind of box, and off it goes… and the resource could automatically calculate what the primers amplify, and one could add metadata if wanted.
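To show how little it would take, here is a minimal sketch in Python of what one such standardized entry and a keyword-or-sequence search could look like. The field names are my own invention, not the format of RTPrimerDB or any other existing resource.

```python
from __future__ import annotations

import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class PrimerRecord:
    """One primer pair; every field name here is hypothetical."""
    name: str
    organism: str
    target: str
    forward: str
    reverse: str
    assay: str          # e.g. "qPCR" or "end-point"
    validated: bool     # the "validated" check box
    notes: str = ""     # optional metadata, such as conditions tried


def to_csv(records: list[PrimerRecord]) -> str:
    """Serialize records to CSV text with one header row."""
    buffer = io.StringIO()
    header = [f.name for f in fields(PrimerRecord)]
    writer = csv.DictWriter(buffer, fieldnames=header)
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buffer.getvalue()


def search(records: list[PrimerRecord], query: str) -> list[PrimerRecord]:
    """Return records in which any field contains the query (case-insensitive)."""
    q = query.lower()
    return [
        r for r in records
        if any(q in str(value).lower() for value in asdict(r).values())
    ]
```

The same `search` works for a gene name or for a stretch of sequence, which is exactly what a binder cannot do, and the CSV text can be saved in as many places as you like.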
A girl can dream.
Thanks for the post. I’ve worked on two non-model systems (with unsequenced genomes) where every gene has had to be amplified with degenerate primers. Usually, everyone in the lab has their own list of the primers that they’ve used. Then, if a particular amplified gene is published, the degenerate primers should be included in that publication. Realistically though, many additional genes (especially from non-model organisms) are amplified that are never published, so it would be great to have some way of describing those. Some of the non-model organisms that people work on have such a small community that I doubt a primer database exists. What would be very useful (and might already exist) is a database of all degenerate primers used, what organisms people have used them in, and perhaps also what degenerate primers were designed but did not produce a product.