  • CS224n, lec 10, NMT & Seq2Seq Attn

    1990s-2010s: Statistical Machine Translation

    • Core idea: Learn a probabilistic model from data

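    • Find the best English sentence $y$ given the French sentence $x$: $\arg\max_y P(y \mid x) = \arg\max_y P(x \mid y)\, P(y)$, by Bayes' rule.

    • $P(y)$ is the language model (LM); $P(x \mid y)$ is the translation model.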

    • Collect statistics from an aligned parallel corpus

    • Rules!
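    The Bayes-rule decomposition in the bullets above can be sketched as a tiny scoring function. This is only an illustration: the candidate sentences and every probability in the two tables are made up, and the names `translation_model`, `language_model`, and `best_translation` are hypothetical. A real SMT system would estimate these quantities from a parallel corpus.

```python
import math

# Hypothetical toy probabilities, invented purely for illustration.
# translation_model[y] stands in for P(x | y): how likely the French
# sentence x is, given the English candidate y.
translation_model = {
    "the cat sleeps": 0.30,
    "cat the sleeps": 0.30,
    "the cat is sleeping": 0.20,
}

# language_model[y] stands in for P(y): how fluent the candidate is.
language_model = {
    "the cat sleeps": 0.020,
    "cat the sleeps": 0.0001,
    "the cat is sleeping": 0.015,
}


def noisy_channel_score(y):
    """Return log P(x | y) + log P(y) for an English candidate y."""
    return math.log(translation_model[y]) + math.log(language_model[y])


def best_translation(candidates):
    """argmax over candidates of P(x | y) * P(y), computed in log space."""
    return max(candidates, key=noisy_channel_score)


best = best_translation(list(translation_model))
print(best)
```

    The two word orders are equally likely under the toy translation model, so it is the language model term that rules out the disfluent candidate.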

    NMT

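    NMT is a single end-to-end model: an encoder-decoder (seq2seq) network whose decoder acts as a conditional language model, with beam search used to find a high-scoring translation. Compared with SMT, its output is more fluent and it needs less hand-crafted engineering, but it is less interpretable and harder to control (for example, it is hard to add rules).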
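    Beam search, mentioned above, can be sketched in a few lines. This is a minimal sketch under strong assumptions: `TOY_MODEL` is a hypothetical lookup table standing in for the decoder, all of its probabilities are invented, and there is no length normalization. A real NMT decoder would compute the next-token distribution with a neural network conditioned on the source sentence.

```python
import math

# Hypothetical stand-in for a decoder: maps the prefix generated so far to a
# distribution over the next token. All numbers are invented for illustration.
TOY_MODEL = {
    (): {"he": 0.6, "I": 0.4},
    ("he",): {"hit": 0.5, "struck": 0.5},
    ("I",): {"was": 0.9, "got": 0.1},
    ("he", "hit"): {"</s>": 1.0},
    ("he", "struck"): {"</s>": 1.0},
    ("I", "was"): {"</s>": 1.0},
    ("I", "got"): {"</s>": 1.0},
}


def next_log_probs(prefix):
    """Log-probabilities of each possible next token after the prefix."""
    return {tok: math.log(p) for tok, p in TOY_MODEL[tuple(prefix)].items()}


def beam_search(beam_size, max_len=10, eos="</s>"):
    """Keep the beam_size best partial hypotheses at every step."""
    beams = [((), 0.0)]  # (tokens, cumulative log-probability)
    finished = []
    for _ in range(max_len):
        # Expand every live hypothesis by every possible next token.
        candidates = []
        for tokens, score in beams:
            for tok, logp in next_log_probs(tokens).items():
                candidates.append((tokens + (tok,), score + logp))
        # Keep only the top beam_size expansions.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score in candidates[:beam_size]:
            if tokens[-1] == eos:
                finished.append((tokens, score))
            else:
                beams.append((tokens, score))
        if not beams:
            break
    return max(finished, key=lambda c: c[1])


greedy_tokens, greedy_score = beam_search(beam_size=1)
beam_tokens, beam_score = beam_search(beam_size=2)
print(greedy_tokens, beam_tokens)
```

    With `beam_size=1` the search is greedy and commits to the locally best first word; with `beam_size=2` it keeps the second-best first word alive and ends up with a higher-probability full sentence.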

    Evaluation

    BLEU compares a machine-written translation with one or several human-written translations and computes a similarity score based on n-gram precision (plus a penalty for translations that are too short). Good but imperfect.
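    The n-gram precision at the heart of BLEU can be sketched as follows. This covers only the clipped ("modified") precision for a single n; full BLEU also combines several n-gram orders and applies a brevity penalty, which are left out here. The function names and example sentences are illustrative choices, not part of the lecture.

```python
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, references, n):
    """Clipped n-gram precision of a candidate against one or more references."""
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    # For each n-gram, the largest count seen in any single reference.
    max_ref_counts = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref_counts[gram] = max(max_ref_counts[gram], count)
    # A candidate n-gram is credited at most as often as it appears in a reference.
    clipped = sum(min(count, max_ref_counts[gram])
                  for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


reference = "the cat is on the mat".split()
candidate = "the cat sat on the mat".split()
degenerate = "the the the the the the the".split()

p1 = modified_precision(candidate, [reference], 1)
p2 = modified_precision(candidate, [reference], 2)
p_degenerate = modified_precision(degenerate, [reference], 1)
print(p1, p2, p_degenerate)
```

    The clipping is what stops the degenerate candidate from scoring perfectly just by repeating a word that occurs in the reference.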

    ATTENTION!

    Seq2Seq models are good, but they have to encode the whole source sentence in a single state, which becomes an information bottleneck. We need a structural improvement to tackle this problem: attention.

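    Concretely, attention is an extra layer that, at each decoder step, scores every encoder hidden state by taking its dot product with the current decoder hidden state (the decoder's initial state at the first step), normalizes the scores with a softmax to get weights, and sums the encoder hidden states using those weights. The resulting attention output is concatenated with the decoder hidden state, and the concatenation is used to predict the next word, as in the plain seq2seq model.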

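    Formally, given encoder hidden states $h_1, \dots, h_N \in \mathbb{R}^h$ and the decoder hidden state $s_t \in \mathbb{R}^h$ at step $t$, we have

    $$e^t = [s_t^\top h_1, \dots, s_t^\top h_N] \in \mathbb{R}^N$$

    $$\alpha^t = \mathrm{softmax}(e^t) \in \mathbb{R}^N$$

    $$a_t = \sum_{i=1}^{N} \alpha^t_i h_i \in \mathbb{R}^h$$

    $$[a_t; s_t] \in \mathbb{R}^{2h}$$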
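    The attention computation described above can be sketched in plain Python for a single decoder step. This is a minimal sketch, assuming dot-product scoring and hidden states stored as lists of floats; the function names and the toy vectors are hypothetical, and a real model would use batched tensor operations.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def dot_product_attention(decoder_state, encoder_states):
    """One attention step: scores, weights, weighted sum, concatenation."""
    # Score each encoder hidden state against the decoder hidden state.
    scores = [dot(decoder_state, h) for h in encoder_states]
    # Normalize the scores into a probability distribution.
    weights = softmax(scores)
    # Weighted sum of the encoder hidden states (the attention output).
    dim = len(encoder_states[0])
    context = [sum(w * h[i] for w, h in zip(weights, encoder_states))
               for i in range(dim)]
    # Concatenate the attention output with the decoder hidden state.
    return weights, context, context + decoder_state


# Toy values, invented for illustration.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [0.0, 2.0]
weights, context, combined = dot_product_attention(decoder_state, encoder_states)
print(weights, context, combined)
```

    The encoder states that point in the same direction as the decoder state receive most of the weight, and the concatenated vector has twice the hidden size.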

    The attention mechanism has several advantages:

    1. solves the information bottleneck problem

    2. helps with the vanishing gradient problem: it adds a highway-like shortcut path for gradients to reach distant encoder states

    3. provides some interpretability (remember image captioning?): the attention weights show what the decoder is focusing on at each step. Another example is that we get an alignment of the parallel corpus for free! Very cool!
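    The "free alignment" point can be illustrated by reading an alignment off an attention matrix. This is a sketch under assumptions: the matrix below is invented, not produced by a trained model, and taking the argmax of each row is just one simple way to turn soft weights into hard word pairs. `hard_alignment` is a hypothetical helper name.

```python
def hard_alignment(attention, source_tokens, target_tokens):
    """Pair each target word with the source word it attends to most."""
    pairs = []
    for t, row in enumerate(attention):
        # Index of the largest attention weight in this target word's row.
        j = max(range(len(row)), key=row.__getitem__)
        pairs.append((target_tokens[t], source_tokens[j]))
    return pairs


source_tokens = ["le", "chat", "dort"]
target_tokens = ["the", "cat", "sleeps"]

# Rows are target words, columns are source words; each row sums to 1.
# These weights are made up for illustration.
attention = [
    [0.80, 0.10, 0.10],
    [0.10, 0.85, 0.05],
    [0.05, 0.15, 0.80],
]

alignment = hard_alignment(attention, source_tokens, target_tokens)
print(alignment)
```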

      Next time: more attention! I am super excited!

  • Original post: https://www.cnblogs.com/ichn/p/8909282.html