• stacking
  • multi-view learning
  • pretraining

transition-based method: [figure omitted]



uni-/bi-gram embeddings over a length-5 context window
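A minimal sketch of what these window features look like, assuming a symmetric length-5 window (two positions of context on each side) and a `<pad>` symbol at the boundaries; the names and padding scheme are illustrative, not from the notes:

```python
# Hypothetical sketch: uni-/bi-gram features for each position in a
# length-5 context window. PAD and the window size are assumptions.
PAD = "<pad>"

def ngram_features(chars, window=5):
    """Return (unigrams, bigrams) feature lists for each position."""
    half = window // 2
    padded = [PAD] * half + list(chars) + [PAD] * half
    feats = []
    for i in range(len(chars)):
        ctx = padded[i:i + window]                   # 5 unigrams
        unis = list(ctx)
        bis = [a + b for a, b in zip(ctx, ctx[1:])]  # 4 bigrams
        feats.append((unis, bis))
    return feats
```

Each position then contributes 5 unigram and 4 bigram embeddings, which are concatenated as the network input.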


beam search: keep the gold action sequence in the beam, rather than simply optimizing for the highest-scoring sequence
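This is the "early update" idea: decoding is cut short and an update is triggered as soon as the gold prefix falls out of the beam. A sketch under assumed interfaces (`score_action`, `next_actions`, `apply_action` are illustrative stand-ins for the model and transition system):

```python
# Hypothetical sketch of beam search with early update: track whether the
# gold action prefix survives in the beam at every step.
def beam_search_early_update(init_state, gold_actions, score_action,
                             next_actions, apply_action, beam_size=4):
    """Return (beam, fell_off_at): fell_off_at is the step index at which
    the gold prefix left the beam, or None if it survived decoding."""
    beam = [(0.0, init_state, [])]                  # (score, state, actions)
    for t, _ in enumerate(gold_actions):
        expanded = []
        for score, state, hist in beam:
            for a in next_actions(state):
                expanded.append((score + score_action(state, a),
                                 apply_action(state, a), hist + [a]))
        expanded.sort(key=lambda x: x[0], reverse=True)
        beam = expanded[:beam_size]
        gold_prefix = gold_actions[:t + 1]
        if not any(hist == gold_prefix for _, _, hist in beam):
            return beam, t                          # early update point
    return beam, None
```

The update at step `t` contrasts the gold prefix against the beam contents at the point of divergence, instead of waiting for a full (possibly garbage) decode.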

NMT: the encoder does most of the work; decoding is not really a hard structured-search problem


  • distillation
  • multitask
    • adversarial


A Neural Probabilistic Structured-Prediction Model for Transition-Based Dependency Parsing

Neural probabilistic parsers are attractive for their capability of automatic feature combination and small data sizes. A transition-based greedy neural parser has given better accuracies over its linear counterpart. We propose a neural probabilistic structured-prediction model for transition-based dependency parsing, which integrates search and learning. Beam search is used for decoding, and contrastive learning is performed for maximizing the sentence-level log-likelihood. In standard Penn Treebank experiments, the structured neural parser achieves a 1.8% accuracy improvement upon a competitive greedy neural parser baseline, giving performance comparable to the best linear parser.
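The contrastive sentence-level objective can be sketched as follows: the gold sequence's score is normalized against the candidates surviving in the beam, approximating the partition function. This is my reading of the abstract, not the paper's exact formulation; the scores are stand-ins for the network's outputs:

```python
# Hypothetical sketch: sentence-level log-likelihood with the partition
# function approximated by the beam candidates plus the gold sequence.
import math

def contrastive_log_likelihood(gold_score, beam_scores):
    """log p(gold) ~= gold_score - log sum_{y in beam + gold} exp(score(y))."""
    all_scores = beam_scores + [gold_score]
    m = max(all_scores)                              # for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in all_scores))
    return gold_score - log_z
```

Maximizing this pushes the gold sequence's score up relative to the beam's incorrect candidates, which is the "integrates search and learning" part: the negatives come from the decoder itself.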

Syntactic Processing Using the Generalized Perceptron and Beam Search

Neural Word Segmentation with Rich Pretraining

Neural Network for Heterogeneous Annotations