SNIC SUPR
Safe Reinforcement Learning and Decision Making
Dnr:

SNIC 2019/4-39

Type:

SNIC Small Compute

Principal Investigator:

Hannes Eriksson

Affiliation:

Chalmers tekniska högskola

Start Date:

2019-03-27

End Date:

2020-04-01

Primary Classification:

10207: Computer Vision and Robotics (Autonomous Systems)

Webpage:

Allocation

Abstract

In my PhD project we investigate different methods for safe autonomous driving. One approach we are currently pursuing is safe reinforcement learning for decision making. If granted access to the computing resources, we would use them for this and subsequent projects. The abstract of the current work is included below:

We develop a framework for interacting with uncertain environments in reinforcement learning (RL) by leveraging preferences in the form of utility functions. In this framework, the preference for risk can be tuned by varying a parameter $\beta$, and the resulting behavior can be risk-averse, risk-neutral or risk-taking depending on the choice of $\beta$. We evaluate our framework both on learning problems with a known underlying model $\mu$ and in experiments with model uncertainty. We measure and control for \emph{epistemic} risk using dynamic programming (DP) and policy gradient-based algorithms. The resulting risk-averse behavior is then compared with that of the optimal risk-neutral policy in environments with epistemic risk.
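To illustrate the kind of risk-sensitive decision making the abstract describes, the sketch below uses an exponential utility function over returns sampled from a posterior over models (epistemic uncertainty). The abstract does not specify the exact utility or sign convention, so the function `exp_utility` and the toy two-action setup here are illustrative assumptions, not the project's actual formulation.

```python
# Hypothetical sketch: risk-sensitive action selection under epistemic
# uncertainty, assuming the common exponential utility
#   U_beta(x) = (1 - exp(-beta * x)) / beta
# (beta > 0 risk-averse, beta -> 0 risk-neutral; this choice is an
# assumption, not taken from the abstract).
import numpy as np

def exp_utility(returns, beta):
    """Mean exponential utility of a sample of returns."""
    returns = np.asarray(returns, dtype=float)
    if abs(beta) < 1e-8:                      # risk-neutral limit is the mean
        return float(np.mean(returns))
    return float(np.mean((1.0 - np.exp(-beta * returns)) / beta))

def select_action(sampled_returns, beta):
    """sampled_returns[a] holds returns of action a under models drawn
    from a posterior over MDPs; pick the action maximizing utility."""
    utilities = {a: exp_utility(r, beta) for a, r in sampled_returns.items()}
    return max(utilities, key=utilities.get)

# Toy example: 'safe' has low variance, 'risky' a higher mean but far
# higher variance across sampled models.
rng = np.random.default_rng(0)
returns = {
    "safe": rng.normal(1.0, 0.1, 1000),
    "risky": rng.normal(1.2, 2.0, 1000),
}
print(select_action(returns, beta=0.0))   # risk-neutral: favors the higher mean
print(select_action(returns, beta=5.0))   # risk-averse: favors low variance
```

Varying `beta` reproduces the behavior spectrum mentioned in the abstract: near zero the agent simply maximizes expected return, while large positive values penalize the spread of outcomes across plausible models.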