  • Experiment log, part 1

    1. Using tagger with the wikipedia-pubmed-and-PMC-w2v word vectors

    Loading pretrained embeddings from ../.local/lib/python3.5/site-packages/neuroner/data/word_vectors/wikipedia-pubmed-and-PMC-w2v.txt...
    WARNING: 5443657 invalid lines
    Loaded 0 pretrained embeddings.
    0 / 18309 (0.0000%) words have been initialized with pretrained embeddings.
    0 found directly, 0 after lowercasing, 0 after lowercasing + zero.
    Compiling...

    Problem: every line of the word-vector file is counted as invalid, so no embeddings are loaded.

    2. Using tagger with the PMC-w2v word vectors

    Loading pretrained embeddings from ./dataset/PMC-w2v.txt...
    WARNING: 2515687 invalid lines
    Loaded 0 pretrained embeddings.
    0 / 18141 (0.0000%) words have been initialized with pretrained embeddings.
    0 found directly, 0 after lowercasing, 0 after lowercasing + zero.
    Compiling...

    Same problem: the word vectors still fail to load.
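One way to investigate a failure like this is to check how many fields the lines of the embedding file actually contain. The following is a minimal sketch, assuming the file is in plain-text word2vec format (one token per line followed by its vector components, separated by whitespace). The function names `field_count_histogram` and `guess_word_dim` are hypothetical helpers, not part of tagger.

```python
from collections import Counter


def field_count_histogram(lines, sample=1000):
    """Count how many whitespace-separated fields each sampled line has."""
    counts = Counter()
    for i, line in enumerate(lines):
        if i >= sample:
            break
        counts[len(line.rstrip().split())] += 1
    return counts


def guess_word_dim(lines, sample=1000):
    """Guess the vector dimension as (most common field count - 1).

    The "- 1" accounts for the token itself, which is assumed to be
    the first field on each line. Returns None for empty input.
    """
    counts = field_count_histogram(lines, sample)
    if not counts:
        return None
    most_common_fields, _ = counts.most_common(1)[0]
    return most_common_fields - 1


if __name__ == "__main__":
    # Toy data standing in for a real embedding file: a header line
    # followed by three 4-dimensional vectors.
    toy = [
        "3 4\n",
        "the 0.1 0.2 0.3 0.4\n",
        "of 0.5 0.6 0.7 0.8\n",
        "cell 0.9 1.0 1.1 1.2\n",
    ]
    print(guess_word_dim(toy))
```

For a real file, the same function can be called on an open file object, for example `guess_word_dim(open("./dataset/PMC-w2v.txt", encoding="utf-8"))`. Only the first `sample` lines are read, so this stays fast even for files with millions of lines.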

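    Fix: found the cause. The dimension of the vectors in the file differs from the default word dimension, so the dimension has to be specified explicitly with --word_dim 200. The embeddings then load: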

    Found 10407 unique words (115614 in total)
    Loading pretrained embeddings from ./dataset/PMC-w2v.txt...
    Found 80 unique characters
    Found 9 unique named entity tags
    4595 / 4598 / 4840 sentences in train / dev / test.
    Saving the mappings to disk...

    Loading pretrained embeddings from ./dataset/PMC-w2v.txt...
    WARNING: 1 invalid lines
    Loaded 2515686 pretrained embeddings.
    17963 / 18141 (99.0188%) words have been initialized with pretrained embeddings.
    17876 found directly, 46 after lowercasing, 41 after lowercasing + zero.
    Compiling...
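The logs are consistent with a loader that accepts a line only when it has exactly `word_dim + 1` fields (the token plus its vector) and counts every other line as invalid. The sketch below illustrates that behaviour under this assumption. It is not tagger's actual code, and `load_pretrained` is a hypothetical name. The toy data uses 4-dimensional vectors in place of the 200-dimensional ones.

```python
def load_pretrained(lines, word_dim):
    """Load vectors whose line has exactly word_dim + 1 fields.

    Returns (embeddings, invalid_count). This mirrors the assumed
    validity check only; it is an illustration, not the real loader.
    """
    pretrained = {}
    invalid = 0
    for line in lines:
        fields = line.rstrip().split()
        if len(fields) == word_dim + 1:
            pretrained[fields[0]] = [float(x) for x in fields[1:]]
        else:
            invalid += 1
    return pretrained, invalid


if __name__ == "__main__":
    # Toy file: a "count dim" header line plus three 4-dimensional vectors.
    toy = [
        "3 4\n",
        "the 0.1 0.2 0.3 0.4\n",
        "of 0.5 0.6 0.7 0.8\n",
        "cell 0.9 1.0 1.1 1.2\n",
    ]

    # Wrong dimension: every line is rejected, as in the failing runs.
    emb, invalid = load_pretrained(toy, word_dim=3)
    print(len(emb), invalid)

    # Matching dimension: only the header line is rejected.
    emb, invalid = load_pretrained(toy, word_dim=4)
    print(len(emb), invalid)
```

Under the same assumption, the single invalid line left in the successful run could be the header line that word2vec text files usually start with (vocabulary size and dimension), since 2515686 loaded vectors plus 1 invalid line equals the 2515687 invalid lines reported before the fix.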

    Training currently uses the CDR dataset from Att.

    3. Using tagger with the chemdner_pubmed_drug.word2vec_model_token4_d50 word vectors

  • Original article: https://www.cnblogs.com/BlueBlueSea/p/10724243.html