  • [NLP] The Annotated Transformer: Code Fixes

    1. RuntimeError: "exp" not implemented for 'torch.LongTensor'

    class PositionalEncoding(nn.Module)

    div_term = torch.exp(torch.arange(0., d_model, 2) *
                                 -(math.log(10000.0) / d_model))

    Change “0” to “0.” so that torch.arange returns a float tensor.

    Otherwise you get the error: RuntimeError: "exp" not implemented for 'torch.LongTensor'

    2. RuntimeError: expected type torch.FloatTensor but got torch.LongTensor

    class PositionalEncoding(nn.Module)

    position = torch.arange(0., max_len).unsqueeze(1)

    Change “0” to “0.” here as well.

    Otherwise the following line fails:

    pe[:, 0::2] = torch.sin(position * div_term)
    RuntimeError: expected type torch.FloatTensor but got torch.LongTensor
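Fixes 1 and 2 both come down to the dtype that torch.arange returns. The following is a minimal sketch of that behaviour, assuming small illustrative sizes (d_model = 8, max_len = 5) that are not taken from the original post. Passing dtype=torch.float32 to torch.arange should work equally well as writing “0.”.

```python
import math
import torch

d_model, max_len = 8, 5  # illustrative sizes only (assumption)

# Integer arguments give an int64 (Long) tensor; a float argument gives float32.
int_range = torch.arange(0, d_model, 2)
float_range = torch.arange(0., d_model, 2)
print(int_range.dtype, float_range.dtype)  # torch.int64 torch.float32

# With float ranges, the sinusoidal table is filled without dtype errors.
position = torch.arange(0., max_len).unsqueeze(1)                    # (max_len, 1)
div_term = torch.exp(float_range * -(math.log(10000.0) / d_model))   # (d_model / 2,)
pe = torch.zeros(max_len, d_model)
pe[:, 0::2] = torch.sin(position * div_term)  # even columns
pe[:, 1::2] = torch.cos(position * div_term)  # odd columns
```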

    3. UserWarning: nn.init.xavier_uniform is now deprecated in favor of nn.init.xavier_uniform_.

    def make_model

    nn.init.xavier_uniform_(p)

    Change “nn.init.xavier_uniform(p)” to “nn.init.xavier_uniform_(p)”.

    Otherwise you get the warning: UserWarning: nn.init.xavier_uniform is now deprecated in favor of nn.init.xavier_uniform_.
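As a rough illustration of the in-place initializer, the sketch below applies nn.init.xavier_uniform_ to a small nn.Linear layer. The layer is a hypothetical stand-in for the full Transformer, and the p.dim() > 1 guard is included because Xavier initialization is only defined for tensors with at least two dimensions.

```python
import math
import torch.nn as nn

model = nn.Linear(4, 3)  # hypothetical stand-in for the full model

for p in model.parameters():
    if p.dim() > 1:  # skip 1-D biases
        nn.init.xavier_uniform_(p)  # trailing underscore: modifies p in place

# Xavier-uniform samples lie in [-bound, bound], bound = sqrt(6 / (fan_in + fan_out)).
bound = math.sqrt(6.0 / (4 + 3))
max_abs_weight = model.weight.abs().max().item()
```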

    4. UserWarning: size_average and reduce args will be deprecated, please use reduction='sum' instead.

    class LabelSmoothing

    self.criterion = nn.KLDivLoss(reduction='sum')

    Change “self.criterion = nn.KLDivLoss(size_average=False)” to “self.criterion = nn.KLDivLoss(reduction='sum')”.

    Otherwise you get the warning: UserWarning: size_average and reduce args will be deprecated, please use reduction='sum' instead.
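To make the effect of reduction='sum' concrete, here is a small sketch that compares nn.KLDivLoss against the summed element-wise formula. The probability values are made up for illustration; note that nn.KLDivLoss expects log-probabilities as its first argument.

```python
import torch
import torch.nn as nn

criterion = nn.KLDivLoss(reduction='sum')

probs = torch.tensor([[0.25, 0.25, 0.5]])    # illustrative model output
target = torch.tensor([[0.5, 0.25, 0.25]])   # illustrative target distribution

loss = criterion(probs.log(), target)

# Same quantity written out: sum over all elements of target * (log target - log probs).
manual = (target * (target.log() - probs.log())).sum()
```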

    5. IndexError: invalid index of a 0-dim tensor. Use tensor.item() to convert a 0-dim tensor to a Python number

    class SimpleLossCompute

    return loss.item() * norm

    Change “loss.data[0]” to “loss.item()”.

    Otherwise you get the error: IndexError: invalid index of a 0-dim tensor. Use tensor.item() to convert a 0-dim tensor to a Python number
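A brief sketch of what .item() does, using a made-up scalar in place of a real loss: a loss is a 0-dim tensor, so it has no index 0, and .item() extracts it as a plain Python number.

```python
import torch

loss = torch.tensor(2.5)  # stand-in for the scalar returned by a criterion
print(loss.dim())         # 0

value = loss.item()       # plain Python float, detached from the graph
print(type(value), value)
```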

    6. floating point exception (core dumped)

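    Running “A First Example” as-is crashes with: floating point exception (core dumped)

    A suggested fix for reference: https://github.com/harvardnlp/annotated-transformer/issues/26. It modifies the run_epoch function to convert the counters to NumPy values, using .detach().numpy() or simply .numpy().

    That still failed when I tried it. In the end, tensors on the GPU have to be moved to the CPU first and only then converted to NumPy.

    Below is my adjusted code, which runs correctly: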

    def run_epoch(data_iter, model, loss_compute, epoch = 0):
        "Standard Training and Logging Function"
        start = time.time()
        total_tokens = 0
        total_loss = 0
        tokens = 0
        for i, batch in enumerate(data_iter):
            out = model.forward(batch.src, batch.trg, batch.src_mask, batch.trg_mask)
            loss = loss_compute(out, batch.trg_y, batch.ntokens)

            # Move to the CPU before converting to NumPy.
            total_loss += loss.detach().cpu().numpy()
            total_tokens += batch.ntokens.cpu().numpy()
            tokens += batch.ntokens.cpu().numpy()
            if i % 50 == 1:
                elapsed = time.time() - start
                print("Epoch Step: %d Loss: %f Tokens per Sec: %f" % (i, loss.detach().cpu().numpy() / batch.ntokens.cpu().numpy(), tokens / elapsed))
                start = time.time()
                tokens = 0
        return total_loss / total_tokens
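The conversion chain used in run_epoch can be tried in isolation. This is a minimal sketch with made-up values (a loss of 30.0 over 12 tokens); it picks the GPU if one is available and otherwise runs on the CPU, where .cpu() is simply a no-op. It assumes NumPy is installed alongside PyTorch.

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

loss = torch.tensor(30.0, device=device, requires_grad=True)  # illustrative loss
ntokens = torch.tensor(12, device=device)                     # illustrative token count

# detach() drops the autograd graph, cpu() copies off the GPU, numpy() converts.
loss_np = loss.detach().cpu().numpy()
ntokens_np = ntokens.cpu().numpy()

per_token = float(loss_np) / int(ntokens_np)
print(per_token)  # 2.5
```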

    7. All loss values are integers

    class SimpleLossCompute

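    When running “A First Example”, every loss value printed was an integer, which is odd. Testing showed that the cause is the return value in class SimpleLossCompute: norm is an int tensor, so although loss.item() is a float, the result of return loss.item() * norm is still an int tensor.

    Fix: convert norm to float before multiplying: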

    return loss.item() * norm.float()
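The explicit cast can be checked with a small sketch; the values below are made up, with norm standing in for batch.ntokens. On recent PyTorch releases, type promotion may already yield a float result without the cast, so the integer truncation described above may not reproduce there, but the explicit .float() is harmless and makes the intent clear.

```python
import torch

loss = torch.tensor(2.5)  # illustrative 0-dim float loss
norm = torch.tensor(3)    # integer token count, standing in for batch.ntokens
print(norm.dtype)         # torch.int64

scaled = loss.item() * norm.float()  # cast first, then multiply
print(scaled.dtype)       # torch.float32
```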

  • Original post: https://www.cnblogs.com/shiyublog/p/10909009.html