Evaluation of a hidden Markov model implementation on KNL

Dnr:

SNIC 2016/1-443

Type:

SNAC Medium

Principal Investigator:

Carl Nettelblad

Affiliation:

Uppsala universitet

Start Date:

2017-02-21

End Date:

2018-04-01

Primary Classification:

10203: Bioinformatics (Computational Biology) (applications to be 10610)

Secondary Classification:

10105: Computational Mathematics

Tertiary Classification:

10106: Probability Theory and Statistics

Webpage:

http://www.it.uu.se/research/project/genomics

Abstract

The cnF2freq codebase ( https://github.com/cnettel/cnF2freq ) infers genotypes and haplotypes through iterative training of hidden Markov models. Each individual maps naturally onto a separate thread. The total dataset is limited in size, and many parameters are shared between at least some of the threads; these are read-only within an iteration and updated between iterations. Computing the state vector combines medium-length vector operations (e.g. on the order of 64 element-wise multiplications) with custom logic. This project will investigate how well the Intel compiler (ICC) compiles the existing codebase for Knights Landing (KNL), given that some aspects of the code seem well suited to the architecture, and will study attempts to rearrange the algorithm to better exploit the specific memory and core properties of KNL.
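The kind of medium-length vector operation described above can be illustrated with a minimal C++ sketch. It is not taken from cnF2freq: the names `kStates`, `StateVec`, `update_state` and `update_all`, the fixed length of 64, and the rescaling step are all assumptions made for illustration. The sketch multiplies each individual's state vector element-wise by a shared, read-only weight vector and rescales the result so that it sums to 1.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed state-vector length; the abstract only says "on the order of 64".
constexpr std::size_t kStates = 64;
using StateVec = std::array<double, kStates>;

// Hypothetical single update step for one individual: element-wise product
// of the state vector with shared, read-only weights, then rescaling so the
// entries sum to 1. Returns the sum before rescaling.
inline double update_state(StateVec& state, const StateVec& shared_weights) {
    double sum = 0.0;
    // Fixed trip count and contiguous data: a loop a compiler can vectorize.
    for (std::size_t i = 0; i < kStates; ++i) {
        state[i] *= shared_weights[i];
        sum += state[i];
    }
    assert(sum > 0.0);  // a zero vector cannot be rescaled
    const double inv = 1.0 / sum;
    for (std::size_t i = 0; i < kStates; ++i) {
        state[i] *= inv;
    }
    return sum;
}

// Apply the step to every individual. The weights are only read, so the
// iterations are independent; this outer loop is where one thread per
// individual could be used (left serial here to keep the sketch portable).
inline void update_all(std::vector<StateVec>& individuals,
                       const StateVec& shared_weights) {
    for (StateVec& state : individuals) {
        update_state(state, shared_weights);
    }
}
```

Because each inner loop has a compile-time trip count and no dependence between elements other than the sum reduction, it is the sort of code an auto-vectorizing compiler would be expected to handle well; the custom logic mentioned in the abstract is deliberately left out of this sketch.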