  • Word2vec multithreading (TensorFlow)

    # Start one training thread per concurrent step.
    workers = []
    for _ in xrange(opts.concurrent_steps):
        t = threading.Thread(target=self._train_thread_body)
        t.start()
        workers.append(t)

       

       

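    word2vec.py uses multiple threads.

    Python multithreading is commonly regarded as effectively single-threaded: because CPython's memory management is not thread-safe, the GIL (global interpreter lock) lets only one thread execute Python bytecode at a time.

    Here, however, the real work happens inside C++ code, which runs without holding the GIL, so the threads still deliver genuine parallelism.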
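
    A minimal sketch of that effect, not taken from word2vec.py: time.sleep stands in for a native call that releases the GIL, and the helper names blocking_call and run_in_threads are made up for illustration. If the threads were truly serialized, four 0.2 s calls would take about 0.8 s; because the GIL is released while waiting, they overlap.

```python
import threading
import time


def blocking_call(seconds):
    # time.sleep is implemented in C and releases the GIL while it waits.
    # It is only a stand-in (an assumption) for a long-running C/C++ call.
    time.sleep(seconds)


def run_in_threads(num_threads, seconds):
    """Run blocking_call in num_threads threads and return the wall time."""
    threads = [threading.Thread(target=blocking_call, args=(seconds,))
               for _ in range(num_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = run_in_threads(4, 0.2)
    print("4 threads x 0.2 s took %.2f s" % elapsed)
```

    Replacing the sleep with a pure-Python loop would make the same threads take turns holding the GIL, which is the case the "effectively single-threaded" rule of thumb describes.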

       

    Inside the Word2vec skipgram operator class, conflicts between threads accessing its state are resolved with a lock (mutex):

    // All mutable RNG and position state is guarded by mu_.
    mutex mu_;
    random::PhiloxRandom philox_ GUARDED_BY(mu_);
    random::SimplePhilox rng_ GUARDED_BY(mu_);
    int32 current_epoch_ GUARDED_BY(mu_) = -1;
    int64 total_words_processed_ GUARDED_BY(mu_) = 0;
    int32 example_pos_ GUARDED_BY(mu_);
    int32 label_pos_ GUARDED_BY(mu_);
    int32 label_limit_ GUARDED_BY(mu_);

       

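    My impression is that, because of the lock, the operator itself still executes serially, one thread at a time.

    The batch computation that follows it runs in parallel.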
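
    The split between a locked cursor and unlocked per-batch work can be sketched in plain Python. This is an illustration under assumptions, not the operator's code: BatchCursor, worker, and sum_of_squares are hypothetical names, and a sum of squares stands in for the real batch computation. In pure Python the GIL still limits actual parallelism, so the sketch shows only the locking structure.

```python
import threading


class BatchCursor:
    """Hypothetical stand-in for the op's mutex-guarded position state."""

    def __init__(self, data, batch_size):
        self._data = data
        self._batch_size = batch_size
        self._pos = 0                  # guarded by self._lock
        self._lock = threading.Lock()

    def next_batch(self):
        # Only one thread at a time may advance the cursor.
        with self._lock:
            start = self._pos
            self._pos = min(start + self._batch_size, len(self._data))
            end = self._pos
        return self._data[start:end]   # empty once the data is exhausted


def worker(cursor, results, results_lock):
    while True:
        batch = cursor.next_batch()
        if not batch:
            break
        # Per-batch work runs outside the cursor lock,
        # so it is not serialized by that lock.
        partial = sum(x * x for x in batch)
        with results_lock:
            results.append(partial)


def sum_of_squares(data, batch_size=4, num_threads=3):
    cursor = BatchCursor(data, batch_size)
    results = []
    results_lock = threading.Lock()
    threads = [threading.Thread(target=worker,
                                args=(cursor, results, results_lock))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results)
```

    Each batch is handed out exactly once because the cursor only moves while the lock is held; everything after next_batch returns is free to overlap with other threads.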

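    def _train_thread_body(self):
        # Remember the epoch at which this thread started.
        initial_epoch, = self._session.run([self._epoch])
        while True:
            # Run one training step and read the current epoch.
            _, epoch = self._session.run([self._train, self._epoch])
            if epoch != initial_epoch:
                break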
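
    The "run until the epoch changes" pattern can be tried without TensorFlow. In the sketch below, EpochCounter is a hypothetical stand-in for the session and the skipgram op's epoch counter, and train_one_epoch is a made-up driver. One deliberate difference from the code above: the starting epoch is read once before the threads start rather than inside each thread, which keeps the sketch deterministic.

```python
import threading


class EpochCounter:
    """Hypothetical stand-in for the session: one step consumes one batch."""

    def __init__(self, batches_per_epoch):
        self._batches_per_epoch = batches_per_epoch
        self._steps = 0                # guarded by self._lock
        self._lock = threading.Lock()

    def current_epoch(self):
        with self._lock:
            return self._steps // self._batches_per_epoch

    def train_step(self):
        # Consume one batch and report the epoch after this step.
        with self._lock:
            self._steps += 1
            return self._steps // self._batches_per_epoch


def train_thread_body(counter, initial_epoch):
    # Keep stepping until the shared epoch counter moves past initial_epoch.
    while True:
        epoch = counter.train_step()
        if epoch != initial_epoch:
            break


def train_one_epoch(counter, concurrent_steps):
    initial_epoch = counter.current_epoch()
    workers = []
    for _ in range(concurrent_steps):
        t = threading.Thread(target=train_thread_body,
                             args=(counter, initial_epoch))
        t.start()
        workers.append(t)
    for t in workers:
        t.join()                       # wait for every worker to see the new epoch
    return counter.current_epoch()
```

    Once one thread crosses the epoch boundary, every other thread takes at most one more step before it sees the new epoch and exits, so the overshoot is bounded by the number of threads.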

       

    (words, counts, words_per_epoch, self._epoch, self._words, examples,
     labels) = word2vec.skipgram(filename=opts.train_data,
                                 batch_size=opts.batch_size,
                                 window_size=opts.window_size,
                                 min_count=opts.min_count,
                                 subsample=opts.subsample)

       

       

       

    The threading lock only affects Python code. If your thread is waiting for disk I/O or if it is calling C functions (e.g. via math library) you can ignore the GIL.

    You may be able to use the async pattern to get around threading limits. Can you supply more information about what your program actually does?

    I have issues with the technical accuracy of the video linked. David Beazley has done many well respected talks about the GIL at various Pycons. You can find them on pyvideo.org.

       

    From <https://www.reddit.com/r/Python/comments/3s0vg9/is_my_multithreaded_python_program_doomed/>

       

       

  • Original post: https://www.cnblogs.com/rocketfan/p/5052243.html