Learning Task Informed Abstractions

Xiang Fu*, Ge Yang*, Pulkit Agrawal, Tommi Jaakkola

ICML 2021

[paper] [code] [bibtex]

Video

Abstract

Current model-based reinforcement learning methods struggle when operating from complex visual scenes due to their inability to prioritize task-relevant features. To mitigate this problem, we propose learning Task Informed Abstractions (TIA) that explicitly separate reward-correlated visual features from distractors. To learn TIA, we introduce the formalism of the Task Informed MDP (TiMDP), realized by training two models that learn visual features via cooperative reconstruction, while one model is adversarially dissociated from the reward signal. Empirical evaluation shows that TIA leads to significant performance gains over state-of-the-art methods on many visual control tasks where natural and unconstrained visual distractions pose a formidable challenge.
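To make the objective structure concrete, here is a minimal sketch of the two-stream training signal described above: a task stream and a distractor stream cooperatively reconstruct the observation, the task stream predicts reward, and the distractor stream is pushed away from reward information via a min-max game with a separate adversary. All module names, sizes, and the flattened-MLP setup are illustrative assumptions; the actual TIA implementation (see the code link) builds on recurrent latent dynamics models, not the toy encoders shown here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

OBS_DIM, LATENT = 64 * 64 * 3, 32  # toy sizes: flattened 64x64 RGB frames


def mlp(i, o):
    return nn.Sequential(nn.Linear(i, 256), nn.ReLU(), nn.Linear(256, o))


task_enc = mlp(OBS_DIM, LATENT)       # task stream: reward-correlated features
dist_enc = mlp(OBS_DIM, LATENT)       # distractor stream: reward-free features
joint_dec = mlp(2 * LATENT, OBS_DIM)  # cooperative reconstruction from both streams
reward_head = mlp(LATENT, 1)          # predicts reward from the task latent
adv_head = mlp(LATENT, 1)             # adversary: tries to read reward from the distractor latent

model_params = (list(task_enc.parameters()) + list(dist_enc.parameters())
                + list(joint_dec.parameters()) + list(reward_head.parameters()))
model_opt = torch.optim.Adam(model_params, lr=1e-3)
adv_opt = torch.optim.Adam(adv_head.parameters(), lr=1e-3)


def train_step(obs, reward, adv_weight=1.0):
    """One illustrative update; obs is (B, OBS_DIM), reward is (B, 1)."""
    s_task, s_dist = task_enc(obs), dist_enc(obs)

    # (1) Cooperative reconstruction: both streams jointly explain the pixels.
    recon_loss = F.mse_loss(joint_dec(torch.cat([s_task, s_dist], -1)), obs)
    # (2) The task stream must predict reward.
    reward_loss = F.mse_loss(reward_head(s_task), reward)
    # (3) Adversarial dissociation: train the distractor encoder so the
    #     (separately optimized) adversary CANNOT predict reward from it.
    dissoc_loss = -F.mse_loss(adv_head(s_dist), reward)

    model_opt.zero_grad()
    (recon_loss + reward_loss + adv_weight * dissoc_loss).backward()
    model_opt.step()

    # Adversary update on the detached distractor latent (min-max game).
    adv_loss = F.mse_loss(adv_head(dist_enc(obs).detach()), reward)
    adv_opt.zero_grad()
    adv_loss.backward()
    adv_opt.step()
```

Because the adversary is optimized with its own objective against a detached latent, the distractor encoder faces a continually improving reward predictor, which drives reward information out of the distractor stream while the joint decoder keeps both streams anchored to the pixels.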

Qualitative Results

[Image grid: 3 rows × 5 columns of reconstructions; rows and columns as described in the caption below]
Top to Bottom: Cheetah Run, Hopper Hop, ManyWorld. Left to Right: (1) Raw Observation (2) Dreamer Reconstruction (3) TIA Joint Reconstruction (4) TIA Task Model Reconstruction (5) TIA Distractor Model Reconstruction. Dreamer does not distinguish task-relevant from task-irrelevant information and mixes the two in its state representation. In DMC domains, Dreamer tries to reconstruct background videos that it memorized during training. In contrast, TIA disentangles the state space into a task stream and a distractor stream through the reward signal, and is able to reconstruct the task-relevant information much better than vanilla Dreamer.