Vanilla NMT (seq2seq) relies on only the encoder's last hidden state to represent the whole input sequence.
Attention instead uses a **weighted combination of input states**, like a memory reference where the access is soft.
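To make the soft memory reference concrete, here is a minimal sketch of dot-product attention in NumPy; the names (`soft_attention`, `query`, `states`) are illustrative, not taken from any particular paper's API.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def soft_attention(query, states):
    """Soft memory read: score every input state against the query,
    normalize the scores to weights, and return the weighted combination."""
    scores = states @ query    # (T,) one relevance score per input state
    weights = softmax(scores)  # soft "address": non-negative, sums to 1
    context = weights @ states # (d,) weighted combination of input states
    return context, weights

# toy usage: 4 input states of dimension 3
states = np.random.randn(4, 3)
query = np.random.randn(3)
context, weights = soft_attention(query, states)
```

Unlike a hard lookup, every state contributes a little, so the whole read stays differentiable and trainable by backpropagation.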
Show, Attend and Tell for image captioning
Grammar as a Foreign Language for syntactic parsing
End-to-End Memory Networks allow the network to read the same input sequence multiple times before producing an output, updating the memory contents at each step.
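As a rough sketch of that multi-hop reading, assuming a fixed memory matrix and a controller state that absorbs each read, which is one common reading of "updating at each step" (the names `multi_hop_read`, `memory`, `query` are illustrative):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def multi_hop_read(memory, query, hops=3):
    """Read the same memory several times ("hops"), folding each soft
    read back into the controller state before the next read."""
    for _ in range(hops):
        weights = softmax(memory @ query)  # content-based soft addressing
        read = weights @ memory            # soft read over all memory rows
        query = query + read               # carry the read into the next hop
    return query                           # final state, used to produce the output
```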
Neural Turing Machines use a similar form of memory mechanism, but with a more sophisticated type of addressing that uses both content-based (as here) and location-based addressing, allowing the network to learn addressing patterns to execute simple computer programs, like sorting algorithms.
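A simplified sketch of that two-part addressing, assuming cosine-similarity content addressing, gating against the previous step's weights, and a circular shift for location addressing (the NTM paper also applies a final sharpening step, omitted here; all names are illustrative):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def ntm_address(memory, key, beta, prev_w, g, shift):
    """Combine content-based and location-based addressing over memory rows."""
    # content-based: cosine similarity of the key to each memory row
    sims = memory @ key / (np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + 1e-8)
    w_content = softmax(beta * sims)    # beta sharpens the focus
    # interpolate with last step's weights (lets the head ignore content)
    w_gated = g * w_content + (1 - g) * prev_w
    # location-based: circular convolution with a shift distribution
    w = np.zeros_like(w_gated)
    for offset, p in shift.items():     # e.g. {-1: 0.1, 0: 0.8, 1: 0.1}
        w += p * np.roll(w_gated, offset)
    return w
```

Shifting the weights by relative offsets is what lets the network learn iterative access patterns, such as stepping through consecutive memory slots, the kind of behavior a sorting-style program needs.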
A likely future direction is a clearer distinction between memory and attention mechanisms, perhaps along the lines of Reinforcement Learning Neural Turing Machines, which try to learn memory access patterns to deal with external interfaces.