[NLP] The Annotated Transformer: Code Fixes

1. RuntimeError: "exp" not implemented for 'torch.LongTensor'

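In class PositionalEncoding(nn.Module):
div_term = torch.exp(torch.arange(0., d_model, 2) * -(math.log(10000.0) / d_model))
Change "0" to "0." so that torch.arange returns a float tensor rather than a LongTensor.

Otherwise the following error is raised: RuntimeError: "exp" not implemented for 'torch.LongTensor'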
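The dtype difference behind this fix can be checked directly. The following is a minimal sketch, assuming a recent PyTorch install; d_model = 8 is a hypothetical value chosen only for illustration.

```python
import math
import torch

d_model = 8  # hypothetical small model dimension, for illustration only

# Integer arguments produce an integer (int64, i.e. "Long") tensor.
int_steps = torch.arange(0, d_model, 2)
# A floating-point start value produces a floating-point tensor.
float_steps = torch.arange(0., d_model, 2)

print(int_steps.dtype)
print(float_steps.dtype)

# torch.exp is applied to the float tensor, as in the corrected line.
div_term = torch.exp(float_steps * -(math.log(10000.0) / d_model))
print(div_term)
```

The first entry of div_term is exp(0) = 1, and the remaining entries decay toward 1/10000 as the index grows.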

2. RuntimeError: expected type torch.FloatTensor but got torch.LongTensor

In class PositionalEncoding(nn.Module):
position = torch.arange(0., max_len).unsqueeze(1)
Change "0" to "0." here as well.

Otherwise the following line raises an error:

pe[:, 0::2] = torch.sin(position * div_term)
RuntimeError: expected type torch.FloatTensor but got torch.LongTensor
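With both fixes applied, the positional-encoding table can be built as in the sketch below. The sizes max_len = 10 and d_model = 8 are hypothetical, and the cosine line is taken from the usual sinusoidal scheme rather than from the snippet quoted above.

```python
import math
import torch

max_len, d_model = 10, 8  # hypothetical sizes, for illustration only

pe = torch.zeros(max_len, d_model)

# Both tensors are created with a float start value, so they are float tensors.
position = torch.arange(0., max_len).unsqueeze(1)  # shape (max_len, 1)
div_term = torch.exp(torch.arange(0., d_model, 2) * -(math.log(10000.0) / d_model))  # shape (d_model / 2,)

# Broadcasting (max_len, 1) * (d_model / 2,) gives (max_len, d_model / 2).
pe[:, 0::2] = torch.sin(position * div_term)  # even columns
pe[:, 1::2] = torch.cos(position * div_term)  # odd columns

print(pe.shape)
```

Because position and div_term share a float dtype, the product can be passed to torch.sin and assigned into the float tensor pe without a type mismatch.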

3. UserWarning: nn.init.xavier_uniform is now deprecated in favor of nn.init.xavier_uniform_.

In def make_model:
nn.init.xavier_uniform_(p)
Change "nn.init.xavier_uniform(p)" to "nn.init.xavier_uniform_(p)".

Otherwise the following warning is shown: UserWarning: nn.init.xavier_uniform is now deprecated in favor of nn.init.xavier_uniform_.
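As a rough illustration of the in-place initializer, the sketch below applies it to a small nn.Linear layer, which is a hypothetical stand-in for the full Transformer built in make_model. The trailing underscore marks the function as modifying its argument in place.

```python
import math
import torch.nn as nn

model = nn.Linear(4, 3)  # hypothetical stand-in for the Transformer model

# Initialize only parameters with more than one dimension (weight matrices);
# 1-dim parameters such as biases are left untouched.
for p in model.parameters():
    if p.dim() > 1:
        nn.init.xavier_uniform_(p)

# Xavier/Glorot uniform samples from U(-bound, bound) with this bound (gain = 1).
bound = math.sqrt(6.0 / (4 + 3))
print(model.weight.abs().max().item() <= bound)
```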

4. UserWarning: size_average and reduce args will be deprecated, please use reduction='sum' instead.

In class LabelSmoothing:
self.criterion = nn.KLDivLoss(reduction='sum')
Change "self.criterion = nn.KLDivLoss(size_average=False)" to "self.criterion = nn.KLDivLoss(reduction='sum')".

Otherwise the following warning is shown: UserWarning: size_average and reduce args will be deprecated, please use reduction='sum' instead.
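To see what reduction='sum' computes, the sketch below compares the criterion against a manual sum. The input shapes and the uniform target distribution are assumptions made for illustration; they are not the smoothed targets that LabelSmoothing constructs.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

criterion = nn.KLDivLoss(reduction='sum')

# nn.KLDivLoss expects log-probabilities as input and probabilities as target.
log_probs = torch.log_softmax(torch.randn(2, 5), dim=-1)
target = torch.full((2, 5), 0.2)  # hypothetical uniform target distribution

loss = criterion(log_probs, target)

# With reduction='sum', the element-wise terms are summed without averaging.
manual = (target * (target.log() - log_probs)).sum()
print(loss.item(), manual.item())
```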

5. IndexError: invalid index of a 0-dim tensor. Use tensor.item() to convert a 0-dim tensor to a Python number

In class SimpleLossCompute:
return loss.item() * norm
Change "loss.data[0]" to "loss.item()".

Otherwise the following error is raised: IndexError: invalid index of a 0-dim tensor. Use tensor.item() to convert a 0-dim tensor to a Python number
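A scalar loss is a 0-dim tensor, so it has no element 0 to index; item() extracts the value as a plain Python number instead. A minimal sketch, with a made-up loss value:

```python
import torch

loss = torch.tensor(2.5)  # hypothetical scalar loss, a 0-dim tensor

print(loss.dim())    # number of dimensions of the scalar tensor
value = loss.item()  # convert to a plain Python float
print(type(value), value)
```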

6. floating point exception (core dumped)

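Running "A First Example" as-is fails with: floating point exception (core dumped)

Reference fix: https://github.com/harvardnlp/annotated-transformer/issues/26. That approach modifies the run_epoch function so that the accumulated counts are converted to NumPy values, using .detach().numpy() or simply .numpy().

However, it still failed when I tried it. In the end, tensors on the GPU had to be moved to the CPU first and only then converted to NumPy.

Below is my adjusted code, which runs correctly: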
import time

def run_epoch(data_iter, model, loss_compute, epoch=0):
    "Standard Training and Logging Function"
    start = time.time()
    total_tokens = 0
    total_loss = 0
    tokens = 0
    for i, batch in enumerate(data_iter):
        out = model.forward(batch.src, batch.trg, batch.src_mask, batch.trg_mask)
        loss = loss_compute(out, batch.trg_y, batch.ntokens)

        # Move tensors to the CPU before converting them to NumPy values.
        total_loss += loss.detach().cpu().numpy()
        total_tokens += batch.ntokens.cpu().numpy()
        tokens += batch.ntokens.cpu().numpy()
        if i % 50 == 1:
            elapsed = time.time() - start
            print("Epoch Step: %d Loss: %f Tokens per Sec: %f" %
                  (i, loss.detach().cpu().numpy() / batch.ntokens.cpu().numpy(), tokens / elapsed))
            start = time.time()
            tokens = 0
    return total_loss / total_tokens

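The conversion chain used in run_epoch can be tried in isolation. The sketch below assumes a made-up scalar loss and falls back to the CPU when no GPU is available; detach() drops the autograd history, cpu() copies the value to host memory (a no-op for CPU tensors), and numpy() then returns a NumPy array.

```python
import torch

# Use the GPU when one is available; otherwise the same code runs on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical scalar loss that is part of an autograd graph.
weights = torch.ones(3, device=device, requires_grad=True)
loss = (weights * 2).sum()

# detach -> cpu -> numpy, in that order.
loss_np = loss.detach().cpu().numpy()
print(type(loss_np).__name__, float(loss_np))
```

Calling numpy() directly on this loss would fail, because the tensor requires grad and may live on the GPU.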
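7. All loss values are integers

In class SimpleLossCompute:

When running "A First Example", every loss value printed was an integer, which was odd. Testing showed that the cause is the return value in class SimpleLossCompute: norm is an int tensor, so although loss.item() is a float, the result of return loss.item() * norm is still an int tensor.

Fix: convert norm to float before multiplying:
return loss.item() * norm.float()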
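The effect of the explicit cast can be checked with the sketch below, which uses made-up values for loss and norm. Note that, as far as I know, recent PyTorch releases apply type promotion and may already return a float tensor for a Python float times an int tensor; the explicit .float() makes the result independent of that behavior.

```python
import torch

loss = torch.tensor(0.75)  # hypothetical per-token loss
norm = torch.tensor(12)    # hypothetical token count, an int64 tensor

print(norm.dtype)

# Casting norm to float first guarantees a floating-point product.
scaled = loss.item() * norm.float()
print(scaled.dtype, scaled.item())
```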

Original article: https://www.cnblogs.com/shiyublog/p/10909009.html
