Blog – Stanford ILIAD

Oct 6, 2018

Batch-Active Preference-Based Learning of Reward Functions

by Erdem Bıyık
In this post, we discuss an efficient approach to reward learning. Focusing on preference-based learning, we show how batch-active methods can achieve sample efficiency together with computational efficiency. We analyze in practice the tradeoff between informativeness and diversity among the elements of a batch, and propose several methods that strike a good balance between the two. Lastly, we showcase our methods on several simulators and in usability studies.
