introduction

QUERY-REDUCTION NETWORKS FOR QUESTION ANSWERING

submitted to ICLR-2017

Minjoon Seo (1), Sewon Min (3), Ali Farhadi (1,2) & Hannaneh Hajishirzi (1)
(1) University of Washington, (2) Allen Institute for Artificial Intelligence, (3) Seoul National University
{minjoon, ali, hannaneh}@cs.washington.edu, shmsw25@snu.ac.kr

Target:

Question answering that requires reasoning over multiple facts, e.g.:

Frogs eat insects.
Flies are insects.
Do frogs eat flies?

“standard attention mechanisms are insensitive to the time step (memory address) of the sentences when accessing them”

model

Query-Reduction Network (QRN)

  • considers the context sentences as a sequence of state-changing triggers
  • reduces the original query to an easier-to-answer query

Model Overview:

  1. Input module: encode each context sentence into $x_t \in \mathbb{R}^d$ and the question into $q \in \mathbb{R}^d$
  2. QRN layers: reduce the query over the sentences and produce the answer vector $\hat y \in \mathbb{R}^{d}$
  3. Output module: decode $\hat y$ into a natural-language answer


QRN Layer:

A variant of an RNN unit (similar to GRU), but the update gate and the candidate state depend only on the current inputs $x_t$ and $q_t$, not on the previous hidden state; this is what makes the parallelization below possible.

  • update gate: $z_t = \sigma\left(\mathbf{w}^{(z)\top}(x_t \odot q_t) + b^{(z)}\right)$
  • reduce query: $\tilde h_t = \tanh\left(W^{(\tilde h)}[x_t; q_t] + b^{(\tilde h)}\right)$
  • hidden: $h_t = z_t\,\tilde h_t + (1 - z_t)\,h_{t-1}$, with $h_0 = 0$

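A minimal numpy sketch of one QRN unit step, following the equations above (parameter names `w_z`, `b_z`, `W_h`, `b_h` are placeholders, not the authors' code):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def qrn_step(x_t, q_t, h_prev, params):
    """One QRN unit step: scalar update gate, candidate reduced query, hidden update."""
    w_z, b_z, W_h, b_h = params                                  # w_z: (d,), W_h: (d, 2d)
    z_t = sigmoid(w_z @ (x_t * q_t) + b_z)                       # update gate in (0, 1)
    h_tilde = np.tanh(W_h @ np.concatenate([x_t, q_t]) + b_h)    # reduced-query candidate
    return z_t * h_tilde + (1.0 - z_t) * h_prev                  # h_t; h_0 is the zero vector
```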

Stacked Layer:

  • next layer q: $q_t^{k+1} = h_t^{k}$
  • next layer q (bidirectional): $q_t^{k+1} = \overrightarrow{h}_t^{\,k} + \overleftarrow{h}_t^{\,k}$
  • next layer x: $x_t^{k+1} = x_t^{k}$ (the sentence vectors are shared across layers)

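A sketch of one (optionally bidirectional) layer built on `qrn_step` above; `Q[t]` is the query fed to time step $t$, which for the first layer is just the encoded question repeated:

```python
def qrn_layer(X, Q, fwd_params, bwd_params=None):
    """X: (T, d) sentence vectors, Q: (T, d) per-step queries for this layer.
    Returns (T, d) reduced queries, which become the next layer's Q (X is reused)."""
    T, d = X.shape
    fwd = np.zeros((T, d))
    h = np.zeros(d)
    for t in range(T):                      # forward direction
        h = qrn_step(X[t], Q[t], h, fwd_params)
        fwd[t] = h
    if bwd_params is None:
        return fwd
    bwd = np.zeros((T, d))
    h = np.zeros(d)
    for t in reversed(range(T)):            # backward direction
        h = qrn_step(X[t], Q[t], h, bwd_params)
        bwd[t] = h
    return fwd + bwd                        # bidirectional: sum of the two directions
```

Stacking is then just `Q2 = qrn_layer(X, Q1, ...)`, `Q3 = qrn_layer(X, Q2, ...)`, and the answer vector $\hat y$ is read from the top layer's hidden state at the last time step.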

Extension:

Reset gate: $r_t = \sigma\left(\mathbf{w}^{(r)\top}(x_t \odot q_t) + b^{(r)}\right)$

  • Hidden: $h_t = z_t\,r_t\,\tilde h_t + (1 - z_t)\,h_{t-1}$ (the reset gate can nullify an unhelpful candidate)

Vector gates: both the update gate and the reset gate are replaced with vectors, i.e. the weight vectors $\mathbf{w}^{(z)}, \mathbf{w}^{(r)}$ become $d \times d$ matrices and the gating is applied element-wise.
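A sketch of the extended step, reusing `sigmoid` from above; with vector gates the gate weights become matrices `W_z`, `W_r` (placeholder names):

```python
def qrn_step_gated(x_t, q_t, h_prev, params):
    """QRN step with reset gate and vector gates (W_z, W_r: (d, d); W_h: (d, 2d))."""
    W_z, b_z, W_r, b_r, W_h, b_h = params
    z_t = sigmoid(W_z @ (x_t * q_t) + b_z)              # element-wise update gate
    r_t = sigmoid(W_r @ (x_t * q_t) + b_r)              # element-wise reset gate
    h_tilde = np.tanh(W_h @ np.concatenate([x_t, q_t]) + b_h)
    return z_t * r_t * h_tilde + (1.0 - z_t) * h_prev   # reset can zero out a candidate
```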

Parallelization

unrolled hidden states: $h_t = \sum_{i=1}^{t} \left(\prod_{j=i+1}^{t} (1 - z_j)\right) z_i\,\tilde h_i$, which follows because $z_t$ and $\tilde h_t$ do not depend on $h_{t-1}$


Vectorized (for all time steps $t$ at once): the nested products are cumulative products over time, so every $h_t$ can be computed with matrix and element-wise operations, with no sequential dependence across time, unlike GRU/LSTM.

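A numpy sketch of the parallel computation with scalar gates, plus a check against the sequential recurrence; a real implementation would more likely work in log space than divide cumulative products:

```python
import numpy as np

def qrn_parallel(z, H_tilde):
    """All hidden states at once: h_t = sum_i (prod_{j=i+1..t} (1 - z_j)) z_i h~_i.
    z: (T,) update gates, H_tilde: (T, d) candidate reduced queries."""
    cp = np.concatenate([[1.0], np.cumprod(1.0 - z)])   # cp[k] = prod_{j<k} (1 - z_j)
    decay = np.tril(cp[1:, None] / cp[None, 1:])        # decay[t, i] = prod_{i<j<=t} (1 - z_j)
    return decay @ (z[:, None] * H_tilde)               # (T, d)

# sanity check against the sequential update h_t = z_t h~_t + (1 - z_t) h_{t-1}
rng = np.random.default_rng(0)
z, H_tilde = rng.uniform(0.1, 0.9, 5), rng.normal(size=(5, 4))
h, seq = np.zeros(4), []
for t in range(5):
    h = z[t] * H_tilde[t] + (1 - z[t]) * h
    seq.append(h)
assert np.allclose(np.stack(seq), qrn_parallel(z, H_tilde))
```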

implementation

input module: Position Encoder (Sukhbaatar et al., 2015, End-To-End Memory Networks)
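A sketch of the Position Encoding scheme, assuming the word embeddings of a sentence are already looked up ($J$ words, $d$ dimensions):

```python
import numpy as np

def position_encode(word_vectors):
    """Sentence vector = sum_j l_j * w_j with position weights
    l[j, k] = (1 - j/J) - (k/d) * (1 - 2j/J), for j = 1..J, k = 1..d."""
    word_vectors = np.asarray(word_vectors)         # (J, d) word embeddings
    J, d = word_vectors.shape
    j = np.arange(1, J + 1)[:, None]                # word positions
    k = np.arange(1, d + 1)[None, :]                # embedding dimensions
    l = (1 - j / J) - (k / d) * (1 - 2 * j / J)     # (J, d) position weights
    return (l * word_vectors).sum(axis=0)           # (d,) sentence encoding
```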

output module:

  • story-based QA: single-layer V-class softmax over the answer vocabulary (multi-word answers are included as single classes)
  • dialog: RNN decoder (Cho et al., 2014), without recurrent hidden states or attention
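A sketch of the story-based QA output, a single softmax layer over the V answer classes (`W_out`, `b_out` are placeholder names):

```python
def qa_output(y_hat, W_out, b_out):
    """y_hat: (d,) answer vector from the top QRN layer; W_out: (V, d), b_out: (V,)."""
    logits = W_out @ y_hat + b_out
    probs = np.exp(logits - logits.max())   # numerically stable softmax
    probs /= probs.sum()
    return int(np.argmax(probs))            # index of the predicted answer class
```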

result

(screenshots of the paper's result tables for the story-based QA and dialog tasks)