  • GPU Basic Concepts Explained

    1 multiprocessor <-> 1 instruction unit <-> 8 processors <-> executes one warp at a time <-> 32 threads

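    Each processor handles one thread per clock cycle, so a 32-thread warp running on the 8 processors of one multiprocessor needs 32 / 8 = 4 clock cycles to complete one instruction.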
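
    The arithmetic above can be written as a small host-side C++ helper. This is only a sketch of the counting argument, not a hardware model; the function name `warp_issue_cycles` and its parameters are hypothetical and not part of any CUDA API.

```cpp
#include <cassert>

// Clock cycles needed to push one instruction through a whole warp when each
// processor handles one thread per cycle. Rounds up for a partial last group.
// Illustrative names only: nothing here comes from the CUDA toolkit.
int warp_issue_cycles(int threads_per_warp, int processors_per_multiprocessor) {
    assert(threads_per_warp > 0 && processors_per_multiprocessor > 0);
    return (threads_per_warp + processors_per_multiprocessor - 1)
           / processors_per_multiprocessor;
}
```

    With the figures used in this post, `warp_issue_cycles(32, 8)` evaluates to 4.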

    One multiprocessor can host several blocks at the same time, but a given block runs entirely on a single multiprocessor.

      

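    Each block has its own shared memory, which is divided into 16 banks, exactly enough for one half-warp of 16 threads (a warp has 32 threads). When several threads of a half-warp access different addresses in the same bank, a bank conflict occurs and the accesses are queued and served one after another. That is sequential execution, no different from a CPU, and the advantage of the GPU is lost, so avoid this situation as far as possible. When the threads all read the same address, the bank broadcasts the word instead and nothing is serialized.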

    A processor can handle only one thread at a time; the GPU is faster than a CPU because it has many processors.

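    Coalescing: global memory transactions are issued per half-warp (16 threads). For a coalesced access to 4-byte words, the address used by the first thread must be a multiple of 64 bytes, and the other 15 threads of the half-warp must access the following words in order, one word per thread. Only then can the half-warp be served by one batched global memory transaction; otherwise each thread issues its own global memory transaction, which is very inefficient.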
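
    The coalescing rule above can be checked on the host with a few lines of C++. This is a minimal sketch that assumes 4-byte words and the strict rule stated in this post (aligned base, thread k reads word k); the names `is_coalesced` and `strided_addresses` are hypothetical helpers, and newer GPUs apply more relaxed rules than this model.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kHalfWarp = 16;    // threads per half-warp
constexpr std::uint64_t kWordBytes = 4;  // assumed element size in bytes

// True if the 16 byte addresses follow the rule described above: the first
// address is a multiple of 64 bytes and thread k accesses the k-th next word.
bool is_coalesced(const std::array<std::uint64_t, kHalfWarp>& addr) {
    if (addr[0] % (kHalfWarp * kWordBytes) != 0) return false;
    for (std::size_t k = 1; k < kHalfWarp; ++k) {
        if (addr[k] != addr[0] + k * kWordBytes) return false;
    }
    return true;
}

// Builds the addresses of a half-warp where thread k reads base + k * stride.
// The model only covers word-aligned accesses, hence the assertion.
std::array<std::uint64_t, kHalfWarp> strided_addresses(std::uint64_t base,
                                                       std::uint64_t stride_bytes) {
    assert(base % kWordBytes == 0 && stride_bytes % kWordBytes == 0);
    std::array<std::uint64_t, kHalfWarp> addr{};
    for (std::size_t k = 0; k < kHalfWarp; ++k) addr[k] = base + k * stride_bytes;
    return addr;
}
```

    Under these assumptions, a half-warp starting at byte 128 with a 4-byte stride is coalesced, while the same pattern starting at byte 132, or using an 8-byte stride, is not.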

    • Each active thread is allocated some registers for the entire lifetime of the thread
    • Each active block is allocated shared memory for the entire lifetime of the block
     
    Shared memory:
    • Very fast on-chip memory
    • Can be used to avoid non-coalesced global memory accesses
    • Can be used to reduce global memory accesses
    • Shared memory is organized into 16 banks, where successive 4-byte words are assigned to successive banks
    • A memory load or store of n addresses by a half-warp that spans n distinct memory banks can be serviced simultaneously
    • If multiple addresses map to the same memory bank, the accesses are serialized
    • If multiple requests target the same memory address, a broadcast occurs
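
    The bank rules listed above can be sketched as a small host-side C++ model. It assumes 16 banks of 4-byte words and counts requests for the same word only once to represent the broadcast case; real hardware handles broadcasts in a more restricted way, so treat the result as an estimate. The names `bank_of`, `serialized_passes` and `word_stride_offsets` are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>

constexpr std::size_t kBanks = 16;       // shared memory banks
constexpr std::size_t kHalfWarp = 16;    // threads per half-warp
constexpr std::uint32_t kWordBytes = 4;  // successive 4-byte words map to successive banks

// Bank that holds the word at the given byte offset into shared memory.
std::size_t bank_of(std::uint32_t byte_offset) {
    return (byte_offset / kWordBytes) % kBanks;
}

// Estimated number of serialized passes for one half-warp: the largest number
// of distinct words requested from any single bank (1 means conflict-free).
std::size_t serialized_passes(const std::array<std::uint32_t, kHalfWarp>& byte_offsets) {
    std::array<std::set<std::uint32_t>, kBanks> words_per_bank;
    for (std::uint32_t off : byte_offsets) {
        assert(off % kWordBytes == 0);  // the model covers word-aligned accesses only
        words_per_bank[bank_of(off)].insert(off / kWordBytes);
    }
    std::size_t worst = 1;
    for (const auto& words : words_per_bank) worst = std::max(worst, words.size());
    return worst;
}

// Offsets of a half-warp where thread k accesses word k * stride_words.
std::array<std::uint32_t, kHalfWarp> word_stride_offsets(std::uint32_t stride_words) {
    std::array<std::uint32_t, kHalfWarp> offsets{};
    for (std::size_t k = 0; k < kHalfWarp; ++k) {
        offsets[k] = static_cast<std::uint32_t>(k) * stride_words * kWordBytes;
    }
    return offsets;
}
```

    In this model a stride of 1 word needs a single pass, a stride of 2 words needs 2 passes, a stride of 16 words puts every thread in bank 0 and needs 16 passes, and a stride of 0 (all threads reading the same word) needs 1 pass because of the broadcast.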
     
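    Threads in the same block share that block's shared memory, and only threads in the same block can synchronize with each other through __syncthreads(). That is the scope of shared memory. Global memory is visible to individual threads but has no notion of a block: it does not group threads into blocks so that the threads of one block can cooperate.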
     
    In device code, a variable declared without any memory-space qualifier normally resides in register memory.
     
     
  • Original source: https://www.cnblogs.com/qingsunny/p/3402732.html