  • Load/store actions in Vulkan & OGLES: the solutions

    An earlier blog post covered bandwidth on Metal.

    This post is mainly about the solutions on Vulkan and OGLES.

    https://www.khronos.org/registry/vulkan/specs/1.1-extensions/man/html/VkAttachmentDescription.html

    typedef struct VkAttachmentDescription {
        VkAttachmentDescriptionFlags    flags;
        VkFormat                        format;
        VkSampleCountFlagBits           samples;
        VkAttachmentLoadOp              loadOp;
        VkAttachmentStoreOp             storeOp;
        VkAttachmentLoadOp              stencilLoadOp;
        VkAttachmentStoreOp             stencilStoreOp;
        VkImageLayout                   initialLayout;
        VkImageLayout                   finalLayout;
    } VkAttachmentDescription;

    So both load/store actions and programmable blending have a corresponding implementation on Vulkan.

    (Vulkan can even control the stencil load/store ops separately...)
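
    As a minimal sketch of how the struct above carries the load/store actions, assuming the Vulkan headers are installed: the two helpers below fill in one colour attachment and one depth/stencil attachment. The function names are made up for illustration and do not come from Unity or any other engine.

```c
#include <vulkan/vulkan.h>

/* Colour attachment: cleared at the start of the pass, written back at the end. */
VkAttachmentDescription make_color_attachment(VkFormat format)
{
    VkAttachmentDescription desc = {0};
    desc.format         = format;
    desc.samples        = VK_SAMPLE_COUNT_1_BIT;
    desc.loadOp         = VK_ATTACHMENT_LOAD_OP_CLEAR;
    desc.storeOp        = VK_ATTACHMENT_STORE_OP_STORE;
    /* A colour format has no stencil aspect, so these two are ignored. */
    desc.stencilLoadOp  = VK_ATTACHMENT_LOAD_OP_DONT_CARE;
    desc.stencilStoreOp = VK_ATTACHMENT_STORE_OP_DONT_CARE;
    desc.initialLayout  = VK_IMAGE_LAYOUT_UNDEFINED;
    desc.finalLayout    = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL;
    return desc;
}

/* Depth/stencil attachment that is only needed inside the pass.
 * For a depth/stencil format, loadOp/storeOp apply to the depth aspect
 * and stencilLoadOp/stencilStoreOp apply to the stencil aspect. */
VkAttachmentDescription make_transient_depth_stencil(VkFormat format)
{
    VkAttachmentDescription desc = {0};
    desc.format         = format;
    desc.samples        = VK_SAMPLE_COUNT_1_BIT;
    desc.loadOp         = VK_ATTACHMENT_LOAD_OP_CLEAR;      /* depth: no read from memory */
    desc.storeOp        = VK_ATTACHMENT_STORE_OP_DONT_CARE; /* depth: no write back */
    desc.stencilLoadOp  = VK_ATTACHMENT_LOAD_OP_CLEAR;
    desc.stencilStoreOp = VK_ATTACHMENT_STORE_OP_DONT_CARE;
    desc.initialLayout  = VK_IMAGE_LAYOUT_UNDEFINED;
    desc.finalLayout    = VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL;
    return desc;
}
```

    The returned structs would go into the pAttachments array of a VkRenderPassCreateInfo; which ops are right depends on whether a later pass actually reads the attachment.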

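    So in Unity both of these should be supported cross-platform, with no need for per-platform handling.

    What comes next is implementing those Metal 2 features on Vulkan...

    That part will most likely have to be written per platform, because the Metal keywords have no counterpart on other platforms.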

    =======================================================================================

    https://developer.arm.com/solutions/graphics/developer-guides/understanding-render-passes/how-render-passes-work

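    The OpenGL counterpart of the approach above

    OGLES switches render passes implicitly; the current pass ends when any of the following happens: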

    • The application calls glBindFramebuffer() to change the GL_FRAMEBUFFER or GL_DRAW_FRAMEBUFFER target.
    • The application calls glFramebufferTexture*() or glFramebufferRenderbuffer() to change the attachments of the currently bound draw framebuffer object when the drawing is queued.
    • The application calls eglSwapBuffers() to signal the end of a frame.
    • The application calls glFlush() or glFinish() to explicitly flush any queued rendering.
    • The application creates a glFenceSync() for some rendering in the current render pass and then calls glClientWaitSync() to wait on the completion of that work, or an equivalent behavior with a query object.

    Vulkan and Metal switch passes explicitly.

    This means avoiding:

    • Reading in older framebuffer values at the start of a render pass if they are going to be overdrawn.
    • Writing out values at the end of each render pass which are transient and are only needed for the duration of that render pass.

    These two steps are exactly load and store.

    Mali GPU

    Avoid traffic between tile memory and DDR.

    All of this traffic counts as bandwidth; besides it, bandwidth also includes texture sampling.
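
    To get a feel for how much traffic a single load or store costs, here is a rough back-of-the-envelope sketch. It assumes an uncompressed, single-sampled attachment and ignores framebuffer compression and caches, so treat the numbers as an upper-bound estimate rather than what a Mali GPU actually moves. The function name is made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes moved between tile memory and DRAM for one attachment in one
 * render pass: each load reads the whole surface once, each store
 * writes it once. */
uint64_t attachment_traffic_bytes(uint32_t width, uint32_t height,
                                  uint32_t bytes_per_pixel,
                                  uint32_t loads, uint32_t stores)
{
    assert(bytes_per_pixel > 0);
    uint64_t surface = (uint64_t)width * height * bytes_per_pixel;
    return surface * ((uint64_t)loads + stores);
}
```

    For a 1920x1080 RGBA8 target (4 bytes per pixel), one load plus one store is 16,588,800 bytes per pass, roughly 995 MB/s at 60 frames per second. Dropping the load (clear or don't-care at pass start) halves that to 8,294,400 bytes per pass.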

    https://developer.arm.com/solutions/graphics/developer-guides/understanding-render-passes/efficient-render-passes

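    1. Avoid switching away from a render pass and then switching back to it.

    2. Merge work wherever possible so that it shares one render pass.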
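
    The two rules above can be illustrated with a deliberately simplified model: treat a frame as a list of draw calls, each tagged with the framebuffer it renders to, and count a new render pass every time the target changes. This is only a sketch of the idea; a real driver also splits passes on flushes, attachment changes and syncs, and the function name here is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified model: draw_targets[i] is the framebuffer id that the i-th
 * draw call renders to. A new render pass starts at the first draw and
 * whenever the target differs from the previous draw's target. */
size_t count_render_passes(const unsigned *draw_targets, size_t n)
{
    assert(draw_targets != NULL || n == 0);
    size_t passes = 0;
    for (size_t i = 0; i < n; ++i) {
        if (i == 0 || draw_targets[i] != draw_targets[i - 1])
            ++passes;
    }
    return passes;
}
```

    Drawing to targets in the order 1, 2, 1 gives three passes, and target 1 has to be stored and then loaded again. Reordering the same draws to 1, 1, 2 gives two passes with no reload.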

    OGLES:

    Load

    • glClear()
    • glClearBuffer*()
    • glInvalidateFramebuffer()

    Caution: Only the start of tile clear is free. Calling glClear() or glClearBuffer*() after the first draw call in a render pass is not free, and this results in a per-fragment clear shader.

    For Vulkan, set the loadOp for each attachment to either of:

    • VK_ATTACHMENT_LOAD_OP_CLEAR
    • VK_ATTACHMENT_LOAD_OP_DONT_CARE

    Caution: If you call vkCmdClear*() commands to clear an attachment, or manually use a shader to write a constant color, it results in a per-fragment clear shader. To benefit from the fast fixed-function tile initialization, it is much more efficient to use the render pass loadOp operations.

    As long as it does not affect the final result, on Mali an invalidate operation is preferable to a clear operation.

    Store

    Use glInvalidateFramebuffer for this.

    For OpenGL ES, you can notify the driver that an attachment is transient by marking the content as invalid using a call to glInvalidateFramebuffer() as the last draw call in the render pass.

    Note: If you write applications using OpenGL ES 2.0, you must use glDiscardFramebufferEXT() from the EXT_discard_framebuffer extension.

    Can this be used for depth? The accepted attachments are GL_COLOR_ATTACHMENTi, GL_DEPTH_ATTACHMENT, GL_STENCIL_ATTACHMENT, and/or GL_DEPTH_STENCIL_ATTACHMENT, so yes, it can handle depth.

    So that settles the solution.

    For Vulkan, set the storeOp for each transient attachment to VK_ATTACHMENT_STORE_OP_DONT_CARE. For more efficiency, the application can even avoid allocating physical backing memory for transient attachments by allocating the backing memory using VK_MEMORY_PROPERTY_LAZILY_ALLOCATED_BIT and constructing the VkImage with VK_IMAGE_USAGE_TRANSIENT_ATTACHMENT_BIT.

    This is the Vulkan counterpart of memoryless in Metal.

    The advice for depth-stencil on Vulkan and Metal is not to bundle the two together. We really did run into this problem: only a mask was needed, yet depth got enabled along with it. That situation should be avoided.
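
    A minimal sketch of the transient-attachment setup described above, assuming the Vulkan headers are installed. The helper names are made up for illustration; the flags and struct fields are the standard Vulkan ones. Device creation, vkCreateImage and vkAllocateMemory are left out.

```c
#include <stdint.h>
#include <vulkan/vulkan.h>

/* Create-info for a depth/stencil image that is only ever used as a
 * transient attachment inside a render pass. */
VkImageCreateInfo make_transient_depth_image_info(VkFormat format,
                                                  uint32_t width,
                                                  uint32_t height)
{
    VkImageCreateInfo info = {0};
    info.sType         = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO;
    info.imageType     = VK_IMAGE_TYPE_2D;
    info.format        = format;
    info.extent.width  = width;
    info.extent.height = height;
    info.extent.depth  = 1;
    info.mipLevels     = 1;
    info.arrayLayers   = 1;
    info.samples       = VK_SAMPLE_COUNT_1_BIT;
    info.tiling        = VK_IMAGE_TILING_OPTIMAL;
    info.usage         = VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT |
                         VK_IMAGE_USAGE_TRANSIENT_ATTACHMENT_BIT;
    info.sharingMode   = VK_SHARING_MODE_EXCLUSIVE;
    info.initialLayout = VK_IMAGE_LAYOUT_UNDEFINED;
    return info;
}

/* Returns the index of a lazily allocated memory type permitted by
 * type_bits (VkMemoryRequirements::memoryTypeBits), or UINT32_MAX if
 * the device exposes none. */
uint32_t find_lazy_memory_type(const VkPhysicalDeviceMemoryProperties *props,
                               uint32_t type_bits)
{
    for (uint32_t i = 0; i < props->memoryTypeCount; ++i) {
        int allowed = (type_bits & (1u << i)) != 0;
        int lazy = (props->memoryTypes[i].propertyFlags &
                    VK_MEMORY_PROPERTY_LAZILY_ALLOCATED_BIT) != 0;
        if (allowed && lazy)
            return i;
    }
    return UINT32_MAX;
}
```

    Not every device exposes a lazily allocated memory type, so when find_lazy_memory_type returns UINT32_MAX the caller would need to fall back to an ordinary device-local type.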

     -----------------

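        EXT_discard_framebuffer
    Unity supports this.
    ===================
    I looked at the Unity code: load/store actions are supported on OGLES there as well.
    So the next step is just to test whether they take effect under MSAA.
    On Metal they do not take effect under MSAA; there is a bug that needs fixing.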
    https://www.cnblogs.com/minggoddess/p/11447389.html
  • Original article: https://www.cnblogs.com/minggoddess/p/11236547.html