Coupling Distributed and Symbolic Execution for Natural Language Queries

Lili Mou,1∗ Zhengdong Lu,2 Hang Li,3 Zhi Jin1

  1. Key Laboratory of High Confidence Software Technologies (Peking University), MoE; Institute of Software, Peking University, China
  2. DeeplyCurious.ai
  3. Noah’s Ark Lab, Huawei Technologies

doublepower.mou@gmail.com, luz@DeeplyCurious.ai HangLi.HL@huawei.com, zhijin@sei.pku.edu.cn

Introduction

Applications of using natural language to query database:

  • Generative QA [Yin et al. 2016a]
  • Human-Computer conversation [Wen et al. 2016]
  • Table Query [This work]

Methods:

  • Semantic Parsing (language to logic forms)
    • [Long et al. 2016; Pasupat and Liang, 2016] (Further Progress over [Pasupat and Liang 2015])
    • Seq2seq: supervision of groundtruth logic forms, weak supervision of denotations are not enough
      • [Dong and Lapata, 2016] seq2tree
      • [Xiao et al. 2016] DSP
    • Neural Semantic Parsing (NAMPI)
      • [Yin et al. 2016b] Neural enquirer (basic of this work)
        • lack of explicit interpretation
      • [Neelakantan et al. 2016] neural programmer, ICLR-2016
        • symbolic operations only for numeric tables, not for string matching
        • exponential number of combinatorial states
      • [Liang et al. 2016] neural symbolic machines
        • REINFORCE is sensitive to initial policy

This work: Combine symbolic and neural methods.

Distributed Enquirer

Similar to neural enquirer of [Yin et al. 2016b].

QQ20161213-1.png

QQ20161213-2.png

Query Encoder: Bi-LSTM of the question

Table Encoder: any cell $c = MLP([e_{cell}; e_{f_name}])$

Executor

Executor is a sequence of execution step, results of each step: $p^t_f, r^t$

multiple rows may be selected thus the last is sigmoid rather than softmax.

At the last step, softmax over all the table cells.

Symbolic Executor

Defined Operations:

QQ20161213-0.png

A Jordan-type RNN (no input) [Jordan 1997]:

further improved using query reduction network

Training whithout step-by-step supervision is non-trivial. (RL trial-and-error)

Coupling Distributed and Symbolic Execution

Combined:

  1. using the intermediate execution result of the distributed enquirer to pretrain the symbolic enquirer for an initial policy
  2. use REINFORCE algorithm to improve the policy for symbolic execution

QQ20161213-5.png

Binary reward R indicating whether the final denotation matches the groundtruth.

loss of a policy, actions are sampled from the current distribution:

partial derivatives:

Results

Synthetic dataset from [Yin et al. 2016b], with given groundtruth denotation and execution actions.

QQ20161213-3.png

QQ20161213-4.png