The expression level of small non‐coding RNAs derived from the first exon of protein‐coding genes is predictive of cancer status

Athanasios Zovoilis, Andrew J Mungall, Richard Moore, Richard Varhol, Andy Chu, Tina Wong, Marco Marra, Steven JM Jones

Author Affiliations

  Athanasios Zovoilis (1,2), Andrew J Mungall (1), Richard Moore (1), Richard Varhol (1), Andy Chu (1), Tina Wong (1), Marco Marra (1,3) and Steven JM Jones (*,1,3,4)

  1. BC Cancer Agency Genome Sciences Centre, Vancouver, BC, Canada
  2. Department of Molecular Biology, Harvard Medical School, Boston, MA, USA
  3. Department of Medical Genetics, University of British Columbia, Vancouver, BC, Canada
  4. Department of Molecular Biology and Biochemistry, Simon Fraser University, Vancouver, BC, Canada

  * Corresponding author. Tel: +1 604 877 6083; Fax: +1 604 876 3561; E‐mail: sjones{at}

  AZ and SJMJ contributed to conception and design; AJM, RM, RV, AC, TW, MM and SJMJ were involved in generation of mapped sequenced reads; AZ performed data analysis; AZ and SJMJ carried out data interpretation and wrote the manuscript.


Small non‐coding RNAs (smRNAs) are known to be significantly enriched near the transcriptional start sites of genes. However, the functional relevance of these smRNAs remains unclear, and they have not been associated with human disease. Within The Cancer Genome Atlas (TCGA) project, we have generated small RNA datasets for many tumor types. In prior cancer studies, these RNAs have been regarded as transcriptional “noise” because of their apparently chaotic distribution. In contrast, we demonstrate their striking potential to distinguish efficiently between cancer and normal tissues and to classify patients with cancer into subgroups with distinct survival outcomes. This potential to predict cancer status is restricted to a subset of these smRNAs that is encoded within the first exon of genes, is highly enriched within CpG islands and is negatively correlated with DNA methylation levels. Thus, our data show that genome‐wide changes in the expression levels of small non‐coding RNAs within first exons are associated with cancer.


The expression of small non‐coding RNAs encoded within the first exon of genes can be used to efficiently identify cancer samples and classify patients into subgroups with different survival outcomes. This pan‐cancer association is the first link between these RNAs and disease.

  • Exon 1 small non‐coding RNAs (smRNAs) can distinguish between cancer and normal tissues.

  • The prediction potential of exon 1 smRNAs differs from that of other smRNAs around transcriptional start sites (TSS).

  • smRNA locations around TSS are conserved between different individuals.

  • smRNA locations are enriched within CpG islands, and smRNA levels are negatively correlated with DNA methylation.


  • The authors declare that they have no conflict of interest.

  • Received September 4, 2013.
  • Revision received January 13, 2014.
  • Accepted January 14, 2014.