  • CUDA SDK VolumeRender Analysis (3)

    This post looks at how the CUDA SDK sample works together with OpenGL.

    The key points when using OpenGL from CUDA are:

    1. Interoperability with OpenGL requires that the CUDA device be specified with cudaGLSetGLDevice() before any other runtime calls.
    2. Register a resource with CUDA before mapping it. A resource only needs to be registered once.
    3. Once registered, a resource must be mapped with cudaGraphicsMapResources() before CUDA accesses it and unmapped with cudaGraphicsUnmapResources() afterwards.
    4. A mapped resource can be read from or written to by kernels using the device memory address returned by cudaGraphicsResourceGetMappedPointer()
      for buffers and cudaGraphicsSubResourceGetMappedArray() for CUDA arrays.
    5. Do NOT access a resource through OpenGL or Direct3D while it is mapped to CUDA; doing so produces undefined results.

    Overall pseudocode

      set_OpenGL_device();
      register_resources();

      while( is_running )
      {
          map_resource();
          resource_pointer *pointer = get_mapped_pointer();
          process_using_cuda( pointer );
          unmap_resource();
          do_normal_rendering();
      }
      unregister_resources();
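The ordering rules in the pseudocode above can be illustrated without a GPU. The following is a minimal host-only sketch, not the CUDA API: `MockResource`, `readFromGraphicsSide()` and `runOneFrame()` are hypothetical names invented here, and the class only models the rule that a resource is mapped before CUDA touches it, unmapped afterwards, and never accessed from the graphics side while mapped.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical host-only stand-in for a registered graphics resource.
// It is NOT the CUDA API; it only models the map/unmap ordering rules.
class MockResource {
public:
    explicit MockResource(std::size_t bytes) : storage_(bytes, 0) {}

    // Plays the role of map_resource() + get_mapped_pointer().
    std::uint8_t* map() {
        if (mapped_) throw std::logic_error("resource is already mapped");
        mapped_ = true;
        return storage_.data();
    }

    // Plays the role of unmap_resource().
    void unmap() {
        if (!mapped_) throw std::logic_error("resource is not mapped");
        mapped_ = false;
    }

    // Stands in for OpenGL reading the buffer; rejected while mapped.
    std::uint8_t readFromGraphicsSide(std::size_t i) const {
        if (mapped_) throw std::logic_error("graphics access while mapped");
        return storage_.at(i);
    }

    bool isMapped() const { return mapped_; }

private:
    std::vector<std::uint8_t> storage_;
    bool mapped_ = false;
};

// One iteration of the loop: map, compute, unmap, then render.
inline std::uint8_t runOneFrame(MockResource& res, std::uint8_t value) {
    std::uint8_t* p = res.map();          // map_resource + get_mapped_pointer
    p[0] = value;                         // process_using_cuda(pointer)
    res.unmap();                          // unmap_resource
    return res.readFromGraphicsSide(0);   // do_normal_rendering
}
```

In the real sample the same roles are played by cudaGraphicsMapResources(), cudaGraphicsResourceGetMappedPointer() and cudaGraphicsUnmapResources(). Unlike this mock, the real API does not detect graphics access to a mapped resource; the result is simply undefined.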

    Selecting the device

      // sets device as the current device for the calling host thread.
      extern __host__ cudaError_t CUDARTAPI cudaGLSetGLDevice(int device);

    In this sample the call is wrapped in chooseCudaDevice(), which automatically selects the highest-performance device.
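The post does not say how "highest-performance" is judged. As a rough sketch, assuming throughput is estimated as multiprocessor count times clock rate (a heuristic chosen here for illustration, not necessarily what the sample does), the selection might look like this. `DeviceInfo` and `pickFastestDevice()` are hypothetical names. In a real program the numbers would come from cudaGetDeviceProperties().

```cpp
#include <cstddef>
#include <vector>

// Hypothetical summary of one device; a real program would fill this
// from the properties reported by the CUDA runtime.
struct DeviceInfo {
    int multiprocessorCount;
    int clockRateKHz;
};

// Returns the index of the device with the largest
// multiprocessorCount * clockRateKHz product, or -1 for an empty list.
// This scoring rule is an assumption made for illustration.
inline int pickFastestDevice(const std::vector<DeviceInfo>& devices) {
    int best = -1;
    long long bestScore = -1;
    for (std::size_t i = 0; i < devices.size(); ++i) {
        const long long score =
            static_cast<long long>(devices[i].multiprocessorCount) *
            devices[i].clockRateKHz;
        if (score > bestScore) {
            bestScore = score;
            best = static_cast<int>(i);
        }
    }
    return best;
}
```

The returned index is what would then be passed to cudaGLSetGLDevice().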

    Creating and registering the resource

    The resource used here is a pixel buffer object (PBO), that is, a buffer object that stores pixel data. The initPixelBuffer() function creates the PBO and registers it with CUDA.

      // OpenGL pixel buffer object
      GLuint pbo = 0;

      // create pixel buffer object for display
      glGenBuffersARB(1, &pbo);
      glBindBufferARB(GL_PIXEL_UNPACK_BUFFER_ARB, pbo);
      glBufferDataARB(GL_PIXEL_UNPACK_BUFFER_ARB, width*height*sizeof(GLubyte)*4, 0, GL_STREAM_DRAW_ARB);
      glBindBufferARB(GL_PIXEL_UNPACK_BUFFER_ARB, 0);

      // register this buffer object with CUDA
      cutilSafeCall(cudaGraphicsGLRegisterBuffer(&cuda_pbo_resource, pbo, cudaGraphicsMapFlagsWriteDiscard));
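The size passed to glBufferDataARB() above is width*height*sizeof(GLubyte)*4, i.e. four one-byte channels (RGBA) per pixel. A small host-side sketch of that calculation follows. `pboSizeBytes()` is a hypothetical helper, and the overflow check is an addition that the sample itself does not perform.

```cpp
#include <cstddef>
#include <limits>
#include <stdexcept>

// Bytes needed for a width x height image with 4 one-byte channels (RGBA).
// Throws if the product would not fit in std::size_t.
inline std::size_t pboSizeBytes(std::size_t width, std::size_t height) {
    const std::size_t bytesPerPixel = 4;  // R, G, B, A: one byte each
    const std::size_t maxValue = std::numeric_limits<std::size_t>::max();
    if (width != 0 && height > maxValue / bytesPerPixel / width) {
        throw std::overflow_error("PBO size does not fit in size_t");
    }
    return width * height * bytesPerPixel;
}
```

For a 512 x 512 window this gives 1,048,576 bytes (1 MiB), which is also the byte count the later cudaMemset() call clears.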

    Using the resource

      // CUDA Graphics Resource (to transfer PBO)
      struct cudaGraphicsResource *cuda_pbo_resource;

      // render image using CUDA
      void render()
      {
        // Copy inverse view matrix to const device memory
        copyInvViewMatrix(invViewMatrix, sizeof(float4)*3);

        // Map graphics resources for access by CUDA
        uint *d_output;
        cutilSafeCall(cudaGraphicsMapResources(1, &cuda_pbo_resource, 0));

        // Get CUDA device pointer
        size_t num_bytes;
        cutilSafeCall(cudaGraphicsResourceGetMappedPointer((void **)&d_output, &num_bytes,
                         cuda_pbo_resource));

        // clear image
        cutilSafeCall(cudaMemset(d_output, 0, width*height*4));

        // call CUDA kernel, writing results to PBO
        render_kernel(gridSize, blockSize, d_output, width, height, density, brightness, transferOffset, transferScale);

        // unmap so that OpenGL may use the PBO again
        cutilSafeCall(cudaGraphicsUnmapResources(1, &cuda_pbo_resource, 0));
      }

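    cutilSafeCall() is a helper from the CUDA utility library (cutil) that logs errors. The d_render() kernel, which render_kernel() launches, writes the computed colors to the PBO. d_output is the pointer obtained by mapping, through which CUDA accesses the PBO's memory.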
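Because d_output is a uint pointer and the PBO is later uploaded as GL_RGBA / GL_UNSIGNED_BYTE, each pixel has to end up as four 8-bit channels packed into one 32-bit word. The sketch below shows one way to do that packing on the host. `saturate()` and `packRgba8()` are hypothetical names, and the layout (red in the lowest byte) assumes a little-endian machine, where it matches the R, G, B, A byte order OpenGL expects.

```cpp
#include <algorithm>
#include <cstdint>

// Clamp a channel value to the range [0, 1].
inline float saturate(float v) {
    return std::min(1.0f, std::max(0.0f, v));
}

// Pack four float channels in [0, 1] into one 32-bit word:
// red in bits 0-7, green in 8-15, blue in 16-23, alpha in 24-31.
inline std::uint32_t packRgba8(float r, float g, float b, float a) {
    const std::uint32_t R = static_cast<std::uint32_t>(saturate(r) * 255.0f);
    const std::uint32_t G = static_cast<std::uint32_t>(saturate(g) * 255.0f);
    const std::uint32_t B = static_cast<std::uint32_t>(saturate(b) * 255.0f);
    const std::uint32_t A = static_cast<std::uint32_t>(saturate(a) * 255.0f);
    return (A << 24) | (B << 16) | (G << 8) | R;
}
```

A kernel thread handling pixel (x, y) would then store such a word at d_output[y * width + x], which is why the buffer holds width*height words of 4 bytes each.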

    Display

      // display results using OpenGL
      void display()
      {
          // use OpenGL to build view matrix
          BuildViewMatrix();

          // fill the PBO with pixels using CUDA
          render();

          // display results
          glClear(GL_COLOR_BUFFER_BIT);

          // draw image from PBO
          glPixelStorei(GL_UNPACK_ALIGNMENT, 1);

          // copy from pbo to texture
          glBindBufferARB(GL_PIXEL_UNPACK_BUFFER_ARB, pbo);
          glBindTexture(GL_TEXTURE_2D, tex);
          glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, width, height, GL_RGBA, GL_UNSIGNED_BYTE, 0);
          glBindBufferARB(GL_PIXEL_UNPACK_BUFFER_ARB, 0);

          // draw textured quad
          glEnable(GL_TEXTURE_2D);
          glBegin(GL_QUADS);
          glTexCoord2f(0, 0); glVertex2f(0, 0);
          glTexCoord2f(1, 0); glVertex2f(1, 0);
          glTexCoord2f(1, 1); glVertex2f(1, 1);
          glTexCoord2f(0, 1); glVertex2f(0, 1);
          glEnd();

          glDisable(GL_TEXTURE_2D);
          glBindTexture(GL_TEXTURE_2D, 0);

          glutSwapBuffers();
          glutReportErrors();

          cutilCheckError(cutStopTimer(timer));

          computeFPS();
      }

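    render() is the function shown above that writes to the PBO. display() is the display callback registered with glutDisplayFunc(), so it covers the entire rendering process. To keep the listing short, some statistics gathering and other non-essential handling have been omitted.

    As we can see, every visual effect is produced by CUDA through volume rendering; in the end OpenGL merely pastes the result onto the viewport as an image. Two small details are worth noting. First, glPixelStorei() changes the alignment unit for pixel data (the original post links to a detailed description). Second, for how to copy from the PBO to a texture, Song Ho's OpenGL tutorial explains it very clearly, so I will not repeat it here.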
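To make the glPixelStorei(GL_UNPACK_ALIGNMENT, 1) detail concrete: the unpack alignment controls how many bytes OpenGL assumes each source row occupies, rounding the row length up to a multiple of the alignment. The following is a minimal sketch for one-byte components. `unpackRowStride()` is a hypothetical helper, not an OpenGL call.

```cpp
#include <cstddef>

// Bytes from the start of one row to the start of the next, as OpenGL
// would assume for tightly packed one-byte components and the given
// GL_UNPACK_ALIGNMENT value (1, 2, 4 or 8).
inline std::size_t unpackRowStride(std::size_t width,
                                   std::size_t bytesPerPixel,
                                   std::size_t alignment) {
    const std::size_t rowBytes = width * bytesPerPixel;
    // Round rowBytes up to the next multiple of alignment.
    return (rowBytes + alignment - 1) / alignment * alignment;
}
```

With 4-byte RGBA pixels every row length is already a multiple of 4, so for this sample an alignment of 1 and the default of 4 give the same stride. The setting matters for formats such as 3-byte RGB, where a 5-pixel row is 15 bytes with alignment 1 but is read as 16 bytes with alignment 4.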

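    I hope these posts have given you some new insight into volume rendering and CUDA; discussion is welcome. Next time I plan to analyze some of the detailed techniques used in this sample.

    References: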

    CUDA C Programming Guide

  • Original article: https://www.cnblogs.com/hucn/p/2109969.html