Bilateral Multi-Perspective Matching for Natural Language Sentences
Zhiguo Wang, Wael Hamza, Radu Florian
IBM T.J. Watson Research Center, 1101 Kitchawan Rd, Yorktown Heights, NY 10598
{zhigwang,whamza,raduf}@us.ibm.com

Background

Sentence Matching

  • paraphrase identification
  • natural language inference (entailment recognition)
  • question-answer pair matching and candidate answer ranking

Two types of NN structures:

  • Siamese architecture: a single shared encoder embeds each sentence, and matching happens only between the final sentence vectors, so there is no interaction between the two sequences
  • matching-aggregation (Wang and Jiang, 2016): the two sentences are matched first (word by word), and the matching results are then aggregated for the final classification

Problems with prior matching-aggregation models:

  • matching is only word-by-word, with no other granularity
  • matching is performed in a single direction only (e.g. P against Q but not Q against P)

(note: the reasoning for why these are problems is not that convincing)

Hence the proposed model:

  1. BiLSTM encoding
  2. bilateral matching: P against Q (each time step of P is matched against all time steps of Q) and vice versa
  3. BiLSTM aggregation
  4. fully connected layers for prediction

(a code sketch of this pipeline appears under Model below)

Task

Each example is a triple (P, Q, y): the two sentences plus a label (e.g. two questions and a binary paraphrase label).

Model

[figure: overall BiMPM architecture with its five layers]

  • word representation: pre-trained GloVe embedding concatenated with a character-composed embedding (the characters of each word fed through an LSTM)
  • BiLSTM for context representation
  • multi-perspective matching
  • BiLSTM for aggregation
  • 2-layer feed-forward network for prediction
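A minimal PyTorch sketch of these five layers, assuming batch-first tensors throughout; `multi_perspective_match` stands in for the matching layer sketched in the next section, and all names and default shapes here are my assumptions, not the authors' code:

```python
import torch
import torch.nn as nn

class BiMPMSketch(nn.Module):
    """Skeleton of the five layers above (illustrative, not the released model)."""

    def __init__(self, vocab_size, char_vocab_size, word_dim=300, char_dim=20,
                 char_hidden=50, hidden=100, l=20, num_classes=2):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, word_dim)  # init from GloVe
        self.word_emb.weight.requires_grad = False          # embeddings stay frozen
        self.char_emb = nn.Embedding(char_vocab_size, char_dim)
        self.char_lstm = nn.LSTM(char_dim, char_hidden, batch_first=True)
        self.context = nn.LSTM(word_dim + char_hidden, hidden,
                               batch_first=True, bidirectional=True)
        # 4 strategies x 2 matching directions, each an l-dim vector per time step
        self.aggregate = nn.LSTM(8 * l, hidden, batch_first=True, bidirectional=True)
        self.ffn = nn.Sequential(nn.Linear(4 * hidden, hidden), nn.Tanh(),
                                 nn.Linear(hidden, num_classes))

    def embed(self, words, chars):
        # chars: (batch, seq_len, word_len) -> one char-composed vector per word
        b, s, w = chars.shape
        _, (h, _) = self.char_lstm(self.char_emb(chars.view(b * s, w)))
        char_vec = h[-1].view(b, s, -1)
        return torch.cat([self.word_emb(words), char_vec], dim=-1)

    def forward(self, p_words, p_chars, q_words, q_chars):
        p, _ = self.context(self.embed(p_words, p_chars))
        q, _ = self.context(self.embed(q_words, q_chars))
        m_p = multi_perspective_match(p, q)  # hypothetical: P matched against Q
        m_q = multi_perspective_match(q, p)  # and the other direction
        a_p, _ = self.aggregate(m_p)
        a_q, _ = self.aggregate(m_q)
        # simplified: last aggregation time step of each sentence as its vector
        v = torch.cat([a_p[:, -1, :], a_q[:, -1, :]], dim=-1)
        return self.ffn(v)
```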

Multi-perspective Matching

Define a multi-perspective matching function (a separately weighted cosine per perspective): $m = f_m(v_1, v_2; W)$ with $W \in \mathbb{R}^{l \times d}$, where the $k$-th element is $m_k = \text{cosine}(W_k \circ v_1, W_k \circ v_2)$ ($W_k$ is the $k$-th row of $W$ and $\circ$ is element-wise multiplication).

The results of all four matching strategies below (each applied in both directions) are concatenated at every time step.
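A direct PyTorch rendering of $f_m$, reusing the imports above (tensor shapes are my assumptions):

```python
import torch.nn.functional as F

def f_m(v1, v2, W):
    """Multi-perspective cosine: m_k = cosine(W_k o v1, W_k o v2).

    v1, v2: (batch, d) vectors; W: (l, d), one row per perspective.
    Returns (batch, l), one weighted cosine per perspective.
    """
    p1 = v1.unsqueeze(1) * W  # broadcast to (batch, l, d)
    p2 = v2.unsqueeze(1) * W
    return F.cosine_similarity(p1, p2, dim=-1)
```

Each row of W re-weights the dimensions of both vectors before the cosine, so different perspectives can focus on different parts of the representation.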

Full-Matching

Each time step of P is matched against the final contextual embedding of Q, in both matching directions: $\overrightarrow{m}^{full}_i = f_m(\overrightarrow{h}^p_i, \overrightarrow{h}^q_N; W^1)$, and analogously with Q's first backward state for the backward direction.
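A sketch of the forward direction on top of the `f_m` above (the backward direction would pass Q's first backward state instead):

```python
def full_match(p, q_last, W):
    """p: (batch, M, d) contextual states of P; q_last: (batch, d) final state of Q.

    Every time step of P is matched against the same single vector of Q.
    """
    return torch.stack([f_m(p[:, i, :], q_last, W)
                        for i in range(p.shape[1])], dim=1)  # (batch, M, l)
```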

Maxpooling-Matching

Each time step of P is matched against every time step of Q, followed by an element-wise max over Q's time steps: $\overrightarrow{m}^{max}_i = \max_{j \in (1 \dots N)} f_m(\overrightarrow{h}^p_i, \overrightarrow{h}^q_j; W^3)$.
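A corresponding sketch, again assuming the `f_m` above:

```python
def maxpool_match(p, q, W):
    """Match each p_i against every q_j, then take an element-wise max over j."""
    out = []
    for i in range(p.shape[1]):
        sims = torch.stack([f_m(p[:, i, :], q[:, j, :], W)
                            for j in range(q.shape[1])], dim=1)  # (batch, N, l)
        out.append(sims.max(dim=1).values)  # per-perspective max over Q's steps
    return torch.stack(out, dim=1)          # (batch, M, l)
```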

Attentive-Matching

Each time step of P is matched against an attentive representation of Q: with $\alpha_{i,j} = \text{cosine}(\overrightarrow{h}^p_i, \overrightarrow{h}^q_j)$ as weights, take the weighted mean $\overrightarrow{h}^{mean}_i = \sum_j \alpha_{i,j} \overrightarrow{h}^q_j / \sum_j \alpha_{i,j}$ and compute $\overrightarrow{m}^{att}_i = f_m(\overrightarrow{h}^p_i, \overrightarrow{h}^{mean}_i; W^5)$.
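A sketch; note the weights are raw cosine similarities, not softmax scores:

```python
def attentive_match(p, q, W, eps=1e-8):
    """Match each p_i against a cosine-weighted mean of all of Q's states."""
    # alpha[b, i, j] = cosine(p_i, q_j) for every pair of time steps
    alpha = F.cosine_similarity(p.unsqueeze(2), q.unsqueeze(1), dim=-1)
    q_mean = alpha @ q / (alpha.sum(dim=-1, keepdim=True) + eps)  # (batch, M, d)
    return torch.stack([f_m(p[:, i, :], q_mean[:, i, :], W)
                        for i in range(p.shape[1])], dim=1)       # (batch, M, l)
```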

Max-Attentive-Matching

Same as attentive matching, except that instead of the weighted sum, the single Q embedding with the highest cosine similarity is used as the match target.
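The same attention weights, but hard instead of soft (a sketch):

```python
def max_attentive_match(p, q, W):
    """Use the single most similar q_j (highest cosine) as the match target."""
    alpha = F.cosine_similarity(p.unsqueeze(2), q.unsqueeze(1), dim=-1)  # (batch, M, N)
    idx = alpha.argmax(dim=-1, keepdim=True)                             # (batch, M, 1)
    q_best = torch.gather(q, 1, idx.expand(-1, -1, q.shape[-1]))         # (batch, M, d)
    return torch.stack([f_m(p[:, i, :], q_best[:, i, :], W)
                        for i in range(p.shape[1])], dim=1)
```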

Experiments

settings:

  • character embeddings are 20-d; the char-LSTM composes them into a 50-d per-word representation
  • all BiLSTM hidden layers are 100-d
  • number of perspectives l = 20
  • dropout ratio = 0.1, learning rate = 0.001
  • pre-trained word embeddings are not updated during training
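The same settings as a config dict, for concreteness (key names are mine):

```python
config = {
    "char_emb_dim": 20,      # per-character embedding size
    "char_word_dim": 50,     # char-LSTM output per word
    "hidden_dim": 100,       # every BiLSTM hidden size
    "num_perspectives": 20,  # l
    "dropout": 0.1,
    "learning_rate": 0.001,
    "train_word_emb": False, # GloVe vectors stay frozen
}
```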

Effect of the number of perspectives (on the paraphrase task), l = 1, 5, 10, 15, 20:

[figure: accuracy as a function of the number of perspectives l]

Ablation of the two matching directions and the four matching strategies (on the paraphrase task):

[table: ablation results for matching directions and strategies]

Paraphrase identification (Quora Question Pairs, ~400k pairs):

[table: accuracy on Quora Question Pairs]

Textual entailment (SNLI):

[table: accuracy on SNLI]

Answer selection (WikiQA):

[table: MAP/MRR on WikiQA]