The strengths of microarrays and MPSS appear to be complementary, and the degree to which microarray and MPSS data correlate is valuable information for researchers involved in gene expression studies. In theory, either technology could provide genome-wide coverage of a transcriptome. In practice, our data show that neither Affymetrix nor MPSS alone covers the transcriptome of the LNCaP and C4-2 cell lines, as evidenced by the detection of certain genes by one technology but not the other. Previous single-technique studies of LNCaP and C4-2 gene expression [27–29] have therefore likely captured only parts of their transcriptomes. Our merged Affymetrix and MPSS data contain 11,010 genes for the LNCaP transcriptome and 10,667 genes for the C4-2 transcriptome; we believe that these numbers represent a reasonably complete profile of the genes expressed by these cells within the sensitivity range of the technologies. However, a comparison of the Affymetrix and MPSS data revealed a somewhat surprising finding: the expression of thousands of genes was not corroborated by both technologies. In the LNCaP transcriptome, 28.9% of the genes were detected only by Affymetrix and 10.6% only by MPSS. In the C4-2 transcriptome, 38.6% of the genes were detected only by Affymetrix and 9.5% only by MPSS. Overall, the Affymetrix signals are correlated with MPSS tpm, and both detection processes likely contribute to the variability in this correlation. At high tpm, however, the signal strength increases more slowly with respect to tpm. This flattening of the curve suggests that ProbeSet signals may saturate for highly expressed transcripts, as previously observed by James et al. Such saturation does not pose a problem for an experimental design such as ours, which focuses on the presence and absence of particular transcripts.
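The technology-specific percentages above are fractions of a merged gene list. As a minimal sketch of that bookkeeping (not the pipeline used in this study), the following Python function takes the sets of genes detected by each technology and reports the share of the merged set found by only one of them. The function name and the toy gene identifiers are hypothetical and serve only as illustration.

```python
def detection_breakdown(affy_genes, mpss_genes):
    """Summarize how a merged gene list splits between two technologies.

    affy_genes, mpss_genes: iterables of gene identifiers detected by
    each technology (hypothetical inputs, for illustration only).
    """
    affy, mpss = set(affy_genes), set(mpss_genes)
    merged = affy | mpss  # union = merged transcriptome
    if not merged:
        raise ValueError("no genes detected by either technology")
    n = len(merged)
    return {
        "merged": n,
        "affy_only_pct": 100.0 * len(affy - mpss) / n,
        "mpss_only_pct": 100.0 * len(mpss - affy) / n,
        "both_pct": 100.0 * len(affy & mpss) / n,
    }


# Toy example with made-up identifiers (not data from this study).
toy = detection_breakdown({"g1", "g2", "g3"}, {"g3", "g4"})
print(toy)
```

The three percentages always sum to 100, because every gene in the merged set is detected by Affymetrix only, by MPSS only, or by both.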
The GCOS "detection P-value" is recommended by Affymetrix to assess the presence or absence of a gene in an experiment. When the Affymetrix detection call is related to the MPSS tpm value, our data show that, of the genes called "absent" by GCOS, greater than 90% also have a tpm of zero, which indicates that the two technologies have a similar level of low-end sensitivity. However, of the genes called "present" by GCOS, only 45% also have a tpm greater than 10, and 39% have zero MPSS tpm. Many of these zeros are likely due to the failure of MPSS to measure certain splice forms of some GeneIDs, particularly those lacking Dpn II sites. Given the low correlation between Affymetrix "present" calls and MPSS tpm, the usefulness of this relationship as an absolute means of comparing data sets is limited.
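The comparison above amounts to a cross-tabulation of detection calls against binned tpm values. The sketch below shows one way such a table could be computed; it is an illustration under stated assumptions, not the analysis code behind the reported figures. The bin boundaries follow the text (zero tpm, and tpm greater than 10), with the intermediate "1-10" bin added here as an assumption. The function names and the toy records are hypothetical.

```python
from collections import Counter

TPM_BINS = ("0", "1-10", ">10")


def tpm_bin(tpm):
    """Assign an MPSS tpm value to a bin (the middle bin is an assumption)."""
    if tpm == 0:
        return "0"
    if tpm <= 10:
        return "1-10"
    return ">10"


def call_vs_tpm(records):
    """Percentage of genes in each tpm bin, per detection call.

    records: iterable of (call, tpm) pairs, one per gene, where call is
    e.g. "present" or "absent" (hypothetical input format).
    """
    records = list(records)  # allow generators; we iterate twice
    pair_counts = Counter((call, tpm_bin(tpm)) for call, tpm in records)
    call_totals = Counter(call for call, _ in records)
    return {
        call: {b: 100.0 * pair_counts[(call, b)] / total for b in TPM_BINS}
        for call, total in call_totals.items()
    }


# Toy records (made-up values, not data from this study).
toy_records = [
    ("present", 0), ("present", 5), ("present", 25), ("present", 40),
    ("absent", 0), ("absent", 0),
]
print(call_vs_tpm(toy_records))
```

Normalizing within each call, rather than over all genes, matches how the percentages are stated in the text: each figure is a fraction of the genes that received a given call.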
Because of the detection limits of the Affymetrix and MPSS technologies, further analysis of the genes "unique" to LNCaP and C4-2 cells was necessary. We used RT-PCR to determine the presence or absence of "unique" transcripts in the LNCaP and C4-2 cell lines. Of the 33 genes that we analyzed from the C4-2 "unique" list, 21% were verified by RT-PCR to be unique to C4-2 cells relative to LNCaP. Of the 66 genes assayed from the LNCaP "unique" list, 6% were verified by RT-PCR to be unique to LNCaP cells. In one case, our RT-PCR verification appears to validate MPSS signals as low as 1 tpm. The gene PHLDA1 had a C4-2 Affymetrix signal of 105 and was detected in both C4-2 and LNCaP RNA by RT-PCR [see Additional file 2]. An interesting aspect of the RT-PCR verification was the detection of many transcripts that were "absent" as determined by GCOS and had zero tpm. Qualitatively, it appears that the majority (89%) of the "absent" and zero-tpm genes detected by RT-PCR in both cell lines are actually differentially expressed. Therefore, it may be more appropriate to interpret an Affymetrix "absent" call or an MPSS zero as a failure of the technology to detect the transcript rather than as evidence of its absence.
The biological reason for comparing LNCaP and C4-2 was to identify genes associated with cancer progression. C4-2 is a more malignant progeny of LNCaP produced through an in vivo process involving interaction between LNCaP and human bone stromal cells. Unlike LNCaP, C4-2 has metastatic potential and is hormone insensitive. We postulated that C4-2 genes were likely to be found in advanced cancers. The strongest candidate was carbonic anhydrase 1 (CA1), whose transcript expression was restricted to metastases, with a possible increase in bone. Our data suggest that CA1 expression is related to the progression of prostate cancer from tissue-localized disease, where the gene is not expressed, to metastasis, where the gene is present. The expression data also suggest that clones expressing CA1 are selected in bone metastasis. The expression pattern of the other tested genes was less notable in this regard. PRG-3, an enzymatically inactive member of the recently described plasticity-related gene family of lipid phosphate phosphatases, is present in normal prostate but appears to be expressed at a lower level in primary tumor. This pattern suggests that PRG-3 may be expressed in basal cells in normal glands, as these cells are missing in tumor glands. Expression of caspase recruitment domain family member 14 (CARD14) appeared to be elevated in primary cancer but reduced in lymph node and bone metastasis. CARD14 has been shown to interact with the apoptosis activator BCL10 and to activate NF-κB. The increased expression of CARD14 may facilitate the activation of NF-κB in prostate cancer. Ephrin receptor A7 (EPHA7) is a member of a large class of cell-cell communication receptor-ligand pairs; its expression was detected in benign tissue and primary tumor but not in lymph node or bone metastasis. The expression of EPHA7 has been observed to be elevated in liver tumors, decreased in colon tumors, and unchanged in lung or kidney tumors.
It is interesting to note that increased expression of another ephrin receptor, EPHA2, has been demonstrated in prostatic intraepithelial neoplasia and shown to be associated with neoplastic transformation. Finally, ETS translocation variant 1 (ETV1), a member of the ETS transcription factor family, was expressed in normal tissue and primary tumor, was not expressed in lymph node metastasis (although it is an LNCaP "unique" gene), and was potentially elevated in bone metastasis. Greater expression of ETV1 may promote an aggressive metastatic phenotype through its recently documented activation of human telomerase reverse transcriptase.