Distributed machine learning algorithms for natural language processing

Dnr:

SNIC 2016/2-33

Type:

SNAC Small

Principal Investigator:

Richard Johansson

Affiliation:

Göteborgs universitet

Start Date:

2016-11-23

End Date:

2017-12-01

Primary Classification:

10208: Language Technology (Computational Linguistics)

Webpage:

https://spraakbanken.gu.se/corpsem

Abstract

This pilot project will investigate the feasibility of parallelizing our existing machine learning algorithms for statistical natural language processing. To this end, we will build distributed Java and Python applications on top of libraries for numerical and distributed processing. The pilot will run on CPUs, but we plan to move to a medium-scale project using GPUs later.