Can Humans Be out of the Loop?

Details

Title: Can Humans Be out of the Loop?
Author(s): Junzhe Zhang, Elias Bareinboim

What

Shows that, for some tasks, an agent that does not take human advice into account will always end up with a sub-optimal policy.

Why

Important to characterize problems where humans in the loop can improve results.

How

Use of causal models.

TLDR

The paper looks at the case of human instructors advising an automated agent after it has learned a model of the environment. Here, the agent and the human have different perceptual capabilities but share the same reward and transition functions.

The agent exploits human-AI interaction via counterfactual reasoning, and the paper shows that such a counterfactual agent always dominates, in terms of cumulative reward, standard agents that do not use human advice.
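In my own notation (not necessarily the paper's symbols), with X the agent's action, Y the reward, and H the human's advised action, the two behaviours can be written roughly as:

```latex
% Rough sketch in my notation: X = agent's action, Y = reward,
% H = the human's advised action, treated as the agent's "intuition".
% Experimental agent: best action by interventional value alone.
\[
  x_{\mathrm{exp}} \;=\; \arg\max_{x}\; \mathbb{E}\big[\, Y \mid do(X = x) \,\big]
\]
% Counterfactual agent: conditions on the advice before intervening.
\[
  x_{\mathrm{ctf}}(h) \;=\; \arg\max_{x}\; \mathbb{E}\big[\, Y_{X = x} \mid H = h \,\big]
\]
```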

The authors use Dynamic Structural Causal Models (DSCMs), which they describe as a direct translation of POMDPs into causal language.
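A minimal sketch of how I picture that translation, in my own notation (the paper's exact formulation may differ): each POMDP component becomes a structural equation driven by exogenous noise.

```latex
% My own sketch of a POMDP written as structural equations (notation is mine):
\begin{align*}
  S_{t+1} &= f_S(S_t, X_t, U^S_t)   && \text{transition}\\
  O_t     &= f_O(S_t, U^O_t)        && \text{agent's observation}\\
  Y_t     &= f_Y(S_t, X_t, U^Y_t)   && \text{reward}\\
  X_t     &= \pi(O_t, \dots)        && \text{agent's decision rule}
\end{align*}
```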

The main theorems compare an experimental agent (the standard autonomous agent) with a counterfactual agent, one that takes the human's advice as its intended action (also called its intuition) and adjusts this action via counterfactual reasoning whenever a better decision is available. The optimal policy of the counterfactual agent dominates that of the experimental agent, and the two coincide if the human's decision is affected only by the observable variables.
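A quick way to see why the dominance direction is unsurprising (my own sketch, not the paper's proof): maximizing separately for each value of the advice can never do worse than a single maximization averaged over it.

```latex
% Sketch of the dominance direction (mine, not the paper's proof):
\[
  \max_{x}\, \mathbb{E}[Y_x]
  \;=\; \max_{x} \sum_{h} P(H = h)\, \mathbb{E}[Y_x \mid H = h]
  \;\le\; \sum_{h} P(H = h)\, \max_{x}\, \mathbb{E}[Y_x \mid H = h],
\]
% with equality when the same action is optimal for every value of h,
% e.g. when H carries no information beyond what the agent already observes.
```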

The authors then show a trade-off between optimality and autonomy: full autonomy is often desired, but the agent can achieve better performance by using human advice. In this regard, they define a hybrid agent as one that switches between the experimental and counterfactual modes, subject to a budget constraint on the available human advice.
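A toy sketch of the switching logic as I understand it (mine, not the paper's algorithm; the class name, the epsilon-greedy exploration, and the empirical per-advice averages are all my own choices):

```python
# Toy sketch of a budget-constrained hybrid agent (not the paper's algorithm).
# While the advice budget lasts it acts in a counterfactual-style mode, ranking
# actions by per-(advice, action) reward averages; afterwards it falls back to
# ordinary per-action ("experimental") averages.
import random
from collections import defaultdict

class HybridAgent:
    def __init__(self, actions, advice_budget, epsilon=0.1):
        self.actions = list(actions)
        self.budget = advice_budget
        self.epsilon = epsilon
        self.exp_sum = defaultdict(float)   # reward sums per action (experimental)
        self.exp_cnt = defaultdict(int)
        self.ctf_sum = defaultdict(float)   # reward sums per (advice, action) pair
        self.ctf_cnt = defaultdict(int)
        self._last_advice = None            # advice consumed for the pending action

    def act(self, human_advice=None):
        """Counterfactual mode while the advice budget lasts, else experimental."""
        self._last_advice = None
        if human_advice is not None and self.budget > 0:
            self.budget -= 1
            self._last_advice = human_advice
        if random.random() < self.epsilon:            # keep exploring in both modes
            return random.choice(self.actions)
        if self._last_advice is not None:             # counterfactual mode: rank
            h = self._last_advice                     # actions given advice H = h
            return max(self.actions,
                       key=lambda x: self.ctf_sum[(h, x)] / max(self.ctf_cnt[(h, x)], 1))
        return max(self.actions,                      # experimental mode
                   key=lambda x: self.exp_sum[x] / max(self.exp_cnt[x], 1))

    def update(self, action, reward):
        self.exp_sum[action] += reward
        self.exp_cnt[action] += 1
        if self._last_advice is not None:
            self.ctf_sum[(self._last_advice, action)] += reward
            self.ctf_cnt[(self._last_advice, action)] += 1
```

Usage would be something like agent.act(human_advice=h) followed by agent.update(action, reward) each round. The per-(advice, action) averages are only a crude empirical proxy for the counterfactual quantity E[Y_x | H = h]; the point is just that the agent spends its limited advice budget in the counterfactual mode and is purely experimental afterwards.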

The experiments section details three cases:

  1. Offline planning.
  2. Online learning, a post-learning phase with an incomplete model of the environment.
  3. Hybrid planning, where there is a constraint on the available human advice.

And?

Thoughts

The paper neatly formalizes human-AI systems within the causal framework; I need to learn more about causality.

Author: Nazaal

Created: 2022-03-13 Sun 21:44
