Quantitative PCR (qPCR), microarrays and RNA sequencing are all valuable assays for in-depth gene expression analysis. But which one should you choose for your next big experiment? Quite frankly, it mostly depends on the goals of the project, your budget, and the organism of interest.
The method of choice for nucleic acid (DNA, RNA) quantification in most areas of molecular biology is real-time PCR, also called quantitative PCR (qPCR). The method's name derives from the fact that the amplification of DNA by polymerase chain reaction (PCR) is monitored in real time. Unlike conventional PCR, it is a quantitative method, meaning that it enables the determination of the amounts (relative or absolute) of amplified DNA in samples. In conventional PCR, by contrast, amplified DNA can only be detected after the amplification has been carried out (end-point detection).
New to qPCR? No problem, we have all been there. Read our extensive article on Real Time PCR (qPCR) Technology Basics.
Choose qPCR for your experiment if you have to analyze the expression of a few genes with a known sequence – let’s say a maximum of 30. Why? It has the widest dynamic range, the lowest quantification limits and the least biased results in comparison to microarrays or RNA-seq. Additionally, the amount of starting material can be very low, and running 30 reactions per sample will still be cheaper than either alternative. To ensure the reliability and repeatability of your research, follow the MIQE guidelines (Minimum Information for Publication of Quantitative Real-Time PCR Experiments), which show you how to include controls and check PCR efficiency, among other key elements. qPCR is the gold standard for expression analysis and will likely have to be used to confirm results even if you do choose to go for microarrays or RNA-seq.
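Checking PCR efficiency usually means running a dilution series and fitting a standard curve of Cq against log10 of the input quantity. The sketch below is only an illustration of that calculation, using the widely used relation E = 10^(-1/slope) - 1. The function name and the dilution values are made up for the example and are not real measurements.

```python
def pcr_efficiency(log10_quantities, cq_values):
    """Fit Cq = slope * log10(quantity) + intercept by least squares
    and return (slope, efficiency), with efficiency as a fraction (1.0 = 100%)."""
    n = len(log10_quantities)
    if n != len(cq_values) or n < 3:
        raise ValueError("need at least three paired (quantity, Cq) points")
    mean_x = sum(log10_quantities) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_quantities)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_quantities, cq_values))
    slope = sxy / sxx
    # A slope near -3.32 corresponds to a doubling of product every cycle.
    efficiency = 10 ** (-1.0 / slope) - 1.0
    return slope, efficiency


# Hypothetical 10-fold dilution series (illustrative numbers only).
log10_copies = [5, 4, 3, 2, 1]
cq = [15.0, 18.32, 21.64, 24.96, 28.28]

slope, eff = pcr_efficiency(log10_copies, cq)
print(f"slope = {slope:.2f}, efficiency = {eff * 100:.1f}%")
```

With real data the points will not sit exactly on a line, so it is worth looking at the fit quality as well as the slope before trusting the efficiency estimate.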
DNA microarrays (or DNA chips) are collections of microscopic spots of DNA probes fixed on a chip, designed to examine the expression levels of a large number of different genes simultaneously. While microarrays were a hot topic roughly a decade ago, they are rapidly being replaced by sequencing technologies. That is not to say they bring nothing to the table. In fact, microarrays remain a reasonable pick for genotyping-related purposes and do still provide certain benefits.
Consider going for microarrays if you are looking for an affordable and robust solution, especially if you do not know which genes you want to analyze, or if you want to perform a whole-transcriptome differential expression (DE) analysis and have a good reference sequence for your organism. Good bioinformatics and statistics practices for microarrays are well established, and easy-to-use free software packages and pipelines exist (e.g. oneChannelGUI, dChip, Chipster, Orange) that can do more or less all the work. All the data, together with the analyzed results, can easily fit on a normal-sized hard drive (or even a USB stick), and the analysis can be run on your laptop.
The downsides of microarrays are the low dynamic range of the assay and the need for known sequences from which to design the probes. Since the arrival of next generation sequencing (NGS), microarrays have also fallen slightly “out of fashion”.
In comparison to microarrays, RNA-sequencing (or RNA-seq for short) enables you to look at differential expression over a much broader dynamic range, to examine sequence variations (SNPs, insertions, deletions) and even to discover new genes or alternative splice variants, all from just one dataset. Bear in mind, though, that RNA-seq is still more expensive than microarrays and presents a bigger challenge in the planning stage.
Firstly, you’ll have to decide which technology to use (Illumina, SOLiD, Ion Torrent, PacBio etc., or a combination of these), what kind of library preparation to go for (strand-specific or not, barcoded or not, amplified by PCR or not, rRNA depletion or poly(A) selection with oligo(dT) beads) and what kind of sequencing you want (read length, single or paired end). And it does not stop there: you also have to decide how many reads you want to sequence. Is 100x transcriptome coverage enough? It may not be if you want to analyze weakly expressed genes.
When you finally get the data, you will have to decide how to analyze it. There are already some good practices available (the Tuxedo protocol for RNA-seq and GATK for SNP calling), but because the so-called “wet lab” methods advance so rapidly, pipelines quickly become obsolete, slow, or simply stop working.
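To get a feel for what a coverage target means in reads, a back-of-the-envelope estimate is: reads ≈ coverage × transcriptome size / bases per read (or per read pair). The sketch below is a rough illustration only. The 50 Mb transcriptome size and 100 bp read length are hypothetical, and the formula assumes reads are spread evenly over all transcripts, which is exactly the assumption that fails for weakly expressed genes.

```python
import math


def fragments_needed(transcriptome_bp, coverage, read_length, paired_end=False):
    """Estimate how many reads (single-end) or read pairs (paired-end)
    give the requested average coverage, assuming perfectly even sampling."""
    if min(transcriptome_bp, coverage, read_length) <= 0:
        raise ValueError("all inputs must be positive")
    # A paired-end fragment contributes two reads' worth of bases.
    bases_per_fragment = read_length * (2 if paired_end else 1)
    return math.ceil(transcriptome_bp * coverage / bases_per_fragment)


# Hypothetical example: 50 Mb transcriptome, 100x average coverage, 100 bp reads.
single = fragments_needed(50_000_000, 100, 100)
paired = fragments_needed(50_000_000, 100, 100, paired_end=True)
print(f"single-end reads: {single:,}")
print(f"paired-end read pairs: {paired:,}")
```

Because a handful of highly expressed genes typically soak up a large share of the reads, treat a number like this as a lower bound and add depth if rare transcripts matter to you.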
Want to perform RNA-seq bioinformatics yourself and do not want to pay a pile of money for software suites such as CLC Genomics? Then prepare to say goodbye to nice graphical interfaces. Most of the software is Linux-based, the scripts come in different programming languages, and the diverse file formats mean you will need more than a single tool for a complete analysis. Even when all you wish to obtain are fold changes from a single RNA-seq experiment, you’ll have to choose among many software options, each claiming to perform best, for each of the 5 standard steps (adapter trimming, filtering, alignment/mapping, counting, normalization and statistical analysis) – see the list of short read alignment programs. The fold changes and the number of DE genes will differ depending on the decisions you make.
Hiring a bioinformatician or a statistician adds further costs on top of the library prep and sequencing expenses. Also, consider that the files you get out of the sequencers are massive (a few GB per sample), so be sure to check whether you need more storage space and computing power.
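To see why the normalization choice alone can change a fold change, here is a deliberately simplified sketch of the last steps: counts-per-million (CPM) scaling followed by a log2 fold change. The gene names, counts and pseudocount are invented for illustration. This is not a substitute for a proper statistical analysis with replicates; it only shows how the arithmetic behaves.

```python
import math


def counts_per_million(counts):
    """Scale raw read counts so that each sample sums to one million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {gene: c * 1_000_000 / total for gene, c in counts.items()}


def log2_fold_changes(control, treated, pseudocount=1.0):
    """log2(treated / control) on CPM values for genes present in both samples.
    The pseudocount (an arbitrary choice here) avoids division by zero."""
    cpm_c = counts_per_million(control)
    cpm_t = counts_per_million(treated)
    return {
        gene: math.log2((cpm_t[gene] + pseudocount) / (cpm_c[gene] + pseudocount))
        for gene in cpm_c
        if gene in cpm_t
    }


# Made-up counts: the treated sample was sequenced twice as deeply.
control = {"geneA": 100, "geneB": 400, "geneC": 500}
treated = {"geneA": 400, "geneB": 800, "geneC": 800}

lfc = log2_fold_changes(control, treated)
for gene, value in lfc.items():
    print(f"{gene}: log2FC = {value:+.2f}")
```

Note that geneB has twice the raw count in the treated sample yet comes out unchanged once sequencing depth is accounted for. A different normalization method or pseudocount would shift these values again, which is the point made above.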
RNA-seq makes sense if you want to find DE genes in an organism with a huge, non-sequenced genome. If you are working with an organism with a small genome, such as a bacterium, sequence its genome first! Lacking a reference genome means you will have to assemble the transcriptome de novo, for which you will need a lot of RAM and some serious computing power if you do not want to wait for ages.
Again, you will have at least 5 of “the best” programs or pipelines to choose from. This means you can end up with several versions of the transcriptome assembly, from which you have to pick the best reference. Some guidelines for assessing transcriptome assembly quality already exist, but they may not hold true for all organisms and datasets, so be cautious. The problem is that no parameter really guarantees that the transcripts you are interested in have been assembled correctly. When it comes to interpreting RNA-seq results, transcriptome assemblies lack the confidence you get with a good-quality reference sequence. Although RNA-seq is an invaluable tool for studying gene expression and variation, make sure you carefully plan your experiments and estimate the costs before diving into it.
Is there an off-the-bat answer to which assay you should choose for your next gene expression experiment?
Most likely not. But there are a couple of simple questions you can answer that will help steer you in the right direction, like “How many genes am I analyzing?” or “What is my experiment budget?” or even “Am I properly trained to carry out this assay?” Answer as many of these questions as you can, then come back to this guide to find your ideal pick!
Marko Petek, PhD, is a bioinformatician at the National Institute of Biology, Ljubljana, Slovenia.