Variant Calling, Annotation, and Filtering in Breast Tumor RNA Sequencing Datasets

SNIC 2018/6-34


SNIC Small Compute

Principal Investigator:

Christian Brueffer


Lunds universitet

Start Date:


End Date:


Primary Classification:

30203: Cancer and Oncology



Breast cancer is the most common cancer in women. Tumors are caused by germline and somatic mutations that deregulate cellular processes and thereby allow unregulated growth. The Sweden Cancerome Analysis Network - Breast (SCAN-B) initiative was launched in 2010 to prospectively collect breast cancer biomaterial for molecular research and to develop, validate, and implement new biomarkers for improved breast cancer care (Saal et al., Genome Medicine 2015). DNA sequencing, either deep high-throughput sequencing or Sanger sequencing, is the gold standard for mutation detection. In the SCAN-B project, however, each patient tumor is currently analyzed by RNA sequencing. Variant calling from RNA-seq data is limited to expressed regions of the genome and is complicated by the complexity of the transcriptome, including splice variants. The overall aim of this project (PI: Lao Saal) is to determine the mutational portraits of breast cancer within the SCAN-B project and to relate this information to tumor phenotypes and to patient and clinical features such as survival. For this purpose we have generated variant calls from a large set of breast tumors. In this Small Project, we intend to annotate our variant calls using information from a variety of databases and to develop filters that eliminate false positive calls from our dataset.
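
To illustrate what filtering annotated variant calls could look like, the following is a minimal sketch and not the project's actual pipeline. The `VariantCall` record, the `passes_filters` function, the annotation field `population_af`, and all thresholds are hypothetical and chosen only for illustration. The example coordinates are made up.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class VariantCall:
    """Hypothetical, simplified record for one annotated variant call."""
    chrom: str
    pos: int
    ref: str
    alt: str
    depth: int                              # total reads covering the site
    alt_reads: int                          # reads supporting the alternate allele
    population_af: Optional[float] = None   # allele frequency from a population
                                            # database; None if not listed


def passes_filters(
    v: VariantCall,
    min_depth: int = 10,              # assumed threshold, not from the source
    min_alt_reads: int = 3,           # assumed threshold, not from the source
    min_vaf: float = 0.05,            # assumed threshold, not from the source
    max_population_af: float = 0.01,  # assumed threshold, not from the source
) -> bool:
    """Return True if the call survives all illustrative filters."""
    # Poorly covered sites give unreliable calls.
    if v.depth <= 0 or v.depth < min_depth:
        return False
    # Require a minimum number of supporting reads.
    if v.alt_reads < min_alt_reads:
        return False
    # Require a minimum variant allele fraction.
    if v.alt_reads / v.depth < min_vaf:
        return False
    # Variants common in the population are treated here as unlikely
    # to be tumor-specific mutations.
    if v.population_af is not None and v.population_af > max_population_af:
        return False
    return True


def filter_calls(calls: List[VariantCall]) -> List[VariantCall]:
    """Keep only the calls that pass the default filters."""
    return [v for v in calls if passes_filters(v)]


# Made-up example calls.
example_calls = [
    VariantCall("chr1", 1000, "C", "T", depth=120, alt_reads=48),
    VariantCall("chr1", 2000, "G", "A", depth=6, alt_reads=3),
    VariantCall("chr2", 3000, "A", "G", depth=80, alt_reads=40, population_af=0.2),
    VariantCall("chr3", 4000, "T", "C", depth=200, alt_reads=4),
]
kept = filter_calls(example_calls)
```

In this sketch only the first example call is kept: the second fails the depth filter, the third is common in the assumed population database, and the fourth has too low an allele fraction. Real filters for RNA-seq calls would likely need further criteria, for example handling of splice junctions, which are beyond this illustration.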