Kwiatkowski, Zettlemoyer, Goldwater, Steedman 2011

Lexical Generalization in CCG Grammar Induction for Semantic Parsing

tags

EMNLP 2011, GeoQuery, ATIS

factored lexicons

lexical item -> (lexeme, template) pair

  • lexeme: (word span, [constant1, constant2])
  • template:

maximal factor

all the constants of the logical form h are included in the lexeme.
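A minimal sketch of the factoring, assuming a lexeme is a (word span, constants) pair and a template is a function that fills the constants into a category. The class names, the example template, and the `next_to` constant are illustrative assumptions, not taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Lexeme:
    words: Tuple[str, ...]       # word span
    constants: Tuple[str, ...]   # logical constants paired with the span


@dataclass(frozen=True)
class LexicalItem:
    words: Tuple[str, ...]
    category: str                # CCG syntactic category
    logical_form: str            # written as a plain string for simplicity


# A template maps a lexeme to a full lexical item by filling constant slots.
Template = Callable[[Lexeme], LexicalItem]


def transitive_verb_template(lexeme: Lexeme) -> LexicalItem:
    # Hypothetical template with a single constant slot.
    (pred,) = lexeme.constants
    return LexicalItem(
        lexeme.words,
        r"(S\NP)/NP",
        f"lambda x. lambda y. {pred}(y, x)",
    )


# Maximal factoring: every constant of the logical form lives in the lexeme,
# so the template itself contains no constants and can be reused.
borders = Lexeme(("borders",), ("next_to",))
item = transitive_verb_template(borders)
print(item.logical_form)
```

Because the template holds no constants, the same function can be applied to any other lexeme with one constant, which is where the lexical generalization comes from.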

partial factor

Kwiatkowski, Choi, Artzi, Zettlemoyer, 2013

Scaling Semantic Parsers with On-the-fly Ontology Matching

tags

EMNLP 2013, Freebase QA, GeoQuery

ontological mismatch problem

First, the GeoQuery and ATIS datasets are small: they have few predicates and few distinct utterances, so learning a parsing model is comparatively easy.

If a database has more predicates, and can therefore answer more questions in theory, the number of possible utterances grows even further.

Worse, new utterances can in principle express more predicates than the schema provides, because the database schema is fixed and supports only a limited set of predicates.

parsing

convert to underspecified LF

  • a predefined set of 56 lexical categories (WordNet)
  • 49 domain-independent lexical items (English only)
  • underspecified constants are type placeholders

ontological matching

list of operators:

  • collapsing operator
  • expansion operator
  • constant matching

operators:

QQ20160921-0@2x.png

parsing example

QQ20160921-1@2x.png

inference

CKY-style chart parser, threshold pruning, …

and ranking:

learning

find correct parses and wrong parses, then update the parameters
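The update above can be sketched as a simplified perceptron-style step. This is an assumption about the general shape, not the paper's exact update rule. The feature dictionaries, the `lr` parameter, and the function names are hypothetical.

```python
def score(weights, feats):
    """Linear score of a sparse feature dict under the current weights."""
    return sum(weights.get(k, 0.0) * v for k, v in feats.items())


def perceptron_update(weights, candidates, lr=1.0):
    """candidates: list of (features: dict, is_correct: bool) pairs.

    Returns a new weight dict; the input is left unchanged.
    """
    good = [f for f, ok in candidates if ok]
    bad = [f for f, ok in candidates if not ok]
    if not good or not bad:
        return weights
    best_good = max(good, key=lambda f: score(weights, f))
    # Wrong parses that score at least as high as the best correct parse.
    violators = [f for f in bad
                 if score(weights, f) >= score(weights, best_good)]
    if not violators:
        return weights
    new = dict(weights)
    # Move towards the correct parse ...
    for k, v in best_good.items():
        new[k] = new.get(k, 0.0) + lr * v
    # ... and away from the average of the violating wrong parses.
    for f in violators:
        for k, v in f.items():
            new[k] = new.get(k, 0.0) - lr * v / len(violators)
    return new


w0 = {}
cands = [({"a": 1.0}, True), ({"b": 1.0}, False)]
w1 = perceptron_update(w0, cands)
w2 = perceptron_update(w1, cands)
```

After the first step the correct parse outscores the wrong one, so the second call leaves the weights as they are.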

QQ20160921-2@2x.png

Goldwasser et al. 2011

Confidence Driven Unsupervised Semantic Parsing

tags

ACL 2011, unsupervised

idea

if a non-random model produces a prediction pattern multiple times, this is likely to indicate an underlying phenomenon in the data.

confidence

output structures which fall close to the center of mass of these statistics will receive a high confidence score.

confidence-driven

using confidence-driven, EM-like learning significantly improves the model compared with using only the prediction score

learning

QQ20160927-2@2x.png

confidence choice

translation model

  • unigram
  • bigram

structural proportion

  • Prop(x, z): ratio of the number of predicates in z to the number of words in x
  • AvProp(S): average of Prop over the pairs in the set S
  • PropScore(S, (x,z)) = AvProp(S) - Prop(x, z)
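The three quantities above can be computed directly. This is a minimal sketch, assuming an utterance x is a list of words, a logical form z is a list of predicate symbols, and Prop is predicates divided by words. The toy data is made up for illustration.

```python
def prop(x_words, z_preds):
    """Prop(x, z): number of predicates in z per word in x."""
    return len(z_preds) / len(x_words)


def av_prop(S):
    """AvProp(S): mean of Prop over a set S of (x, z) pairs."""
    return sum(prop(x, z) for x, z in S) / len(S)


def prop_score(S, pair):
    """PropScore(S, (x, z)) = AvProp(S) - Prop(x, z)."""
    x, z = pair
    return av_prop(S) - prop(x, z)


# Toy set of previously predicted (utterance, logical form) pairs.
S = [
    (["capital", "of", "texas", "?"], ["capital", "texas"]),  # Prop = 0.5
    (["texas", "rivers"], ["state", "river"]),                # Prop = 1.0
]
candidate = (["which", "rivers", "are", "long"], ["river"])   # Prop = 0.25
print(prop_score(S, candidate))
```

A score far from zero means the candidate's predicate-to-word ratio deviates from what the model usually produces, which is the signal used to flag it as low confidence.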

combined

use the latter approach (structural proportion) to filter out unlikely candidates, then rank the remaining ones using the former approaches (translation model)
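The filter-then-rank combination above can be sketched as follows. The dictionary keys, the `max_deviation` threshold, and `top_k` are hypothetical names. The notes do not say how the cut-off is chosen.

```python
def select_confident(candidates, max_deviation, top_k):
    """Filter by structural proportion, then rank by translation score.

    candidates: list of dicts with keys 'id', 'prop_score', 'trans_score'.
    """
    # Step 1: drop candidates whose proportion deviates too much from average.
    kept = [c for c in candidates if abs(c["prop_score"]) <= max_deviation]
    # Step 2: rank the survivors by translation-model confidence, best first.
    kept.sort(key=lambda c: c["trans_score"], reverse=True)
    return kept[:top_k]


pool = [
    {"id": "p1", "prop_score": 0.05, "trans_score": 0.40},
    {"id": "p2", "prop_score": 0.60, "trans_score": 0.95},  # filtered out
    {"id": "p3", "prop_score": -0.10, "trans_score": 0.70},
]
chosen = select_confident(pool, max_deviation=0.2, top_k=2)
print([c["id"] for c in chosen])
```

Note that `p2` has the highest translation score but never reaches the ranking step, because the structural filter runs first.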

Poon, 2013

Grounded Unsupervised Semantic Parsing

tags

ACL 2013, unsupervised, database schema, ATIS

contents

characteristics:

  • leverage database schema
  • start from a dependency parse, and add states for mismatches between the dependency parse and the semantic parse
  • semantics that need no training: dates and times, logical connectives, numerics
  • superlatives are applied to the most restricted case

assign nodes and edges in a dependency parse to various states.

  • these states directly come from database schema
  • for NL/parse-MR mismatch, add more states (Raising / Sinking)
  • devise lexical triggers from DB values; DASH (Pantel et al. 2009) is used to get additional word pairs

grounded-unsupervised-parse.png

Parsing:

  • inference using tree-Viterbi and the inside-outside algorithm
  • weights learned from feature-rich EM
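The tree-Viterbi step can be sketched as max-product dynamic programming over a dependency tree. This is a generic illustration, assuming additive node and edge scores. The state names and score tables are made up and are not the paper's actual states or features.

```python
def tree_viterbi(children, root, states, node_score, edge_score):
    """Return (best total score, {node: state}) for a rooted tree."""
    best = {}  # best[(node, s)]: best score of node's subtree given state s
    back = {}  # back[(node, s)]: chosen state for each child

    def visit(node):
        kids = children.get(node, [])
        for child in kids:
            visit(child)  # bottom-up: children first
        for s in states:
            total = node_score(node, s)
            choice = {}
            for child in kids:
                cs = max(states,
                         key=lambda t: best[(child, t)] + edge_score(s, t))
                total += best[(child, cs)] + edge_score(s, cs)
                choice[child] = cs
            best[(node, s)] = total
            back[(node, s)] = choice

    def decode(node, s, out):
        out[node] = s
        for child, cs in back[(node, s)].items():
            decode(child, cs, out)

    visit(root)
    root_state = max(states, key=lambda s: best[(root, s)])
    assignment = {}
    decode(root, root_state, assignment)
    return best[(root, root_state)], assignment


# Toy dependency tree: "flight" is the head of "boston".
children = {"flight": ["boston"]}
states = ["E:flight", "V:city", "NULL"]  # hypothetical state names
node_tab = {("flight", "E:flight"): 2.0, ("boston", "V:city"): 1.5}
edge_tab = {("E:flight", "V:city"): 1.0}

total, assignment = tree_viterbi(
    children, "flight", states,
    lambda n, s: node_tab.get((n, s), 0.0),
    lambda p, c: edge_tab.get((p, c), 0.0),
)
print(total, assignment)
```

Replacing each `max` with a sum (in log space, a log-sum-exp) turns the same upward pass into the inside computation needed for the expectations in EM.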