  • Shadows in Unity (4): collect shadows

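    This is the last article in the Shadows in Unity series. The previous article covered shadow receiving; this one returns to a step we skipped earlier, the shadow collection pass used with the screen space shadow map.

    Shadow collection

    Last of all, let's look at what happens for a directional light when a screen space shadow map is used: Unity runs an extra shadow collection pass. The Frame Debugger shows it:

    Unity uses a shader named Internal-ScreenSpaceShadows for this pass. Fortunately the shader source is available and can be downloaded from the Unity website. The whole shader is a little over 400 lines, so it is not reproduced in full here; only the parts that need analysis are quoted.

    First, note that the shader has 4 subshaders, two for hard shadows and two for soft shadows. Within each pair, one variant reconstructs the position from vertex shader data and the other does the inverse projection in the pixel shader: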

    // ----------------------------------------------------------------------------------------
    // Subshader for hard shadows:
    // Just collect shadows into the buffer. Used on pre-SM3 GPUs and when hard shadows are picked.
    
    SubShader {
        Tags{ "ShadowmapFilter" = "HardShadow" }
        Pass {
            ZWrite Off ZTest Always Cull Off
    
            CGPROGRAM
            #pragma vertex vert
            #pragma fragment frag_hard
            #pragma multi_compile_shadowcollector
    
            inline float3 computeCameraSpacePosFromDepth(v2f i)
            {
                return computeCameraSpacePosFromDepthAndVSInfo(i);
            }
            ENDCG
        }
    }
    
    // ----------------------------------------------------------------------------------------
    // Subshader for hard shadows:
    // Just collect shadows into the buffer. Used on pre-SM3 GPUs and when hard shadows are picked.
    // This version does inv projection at the PS level, slower and less precise however more general.
    
    SubShader {
        Tags{ "ShadowmapFilter" = "HardShadow_FORCE_INV_PROJECTION_IN_PS" }
        Pass{
            ZWrite Off ZTest Always Cull Off
    
            CGPROGRAM
            #pragma vertex vert
            #pragma fragment frag_hard
            #pragma multi_compile_shadowcollector
    
            inline float3 computeCameraSpacePosFromDepth(v2f i)
            {
                return computeCameraSpacePosFromDepthAndInvProjMat(i);
            }
            ENDCG
        }
    }
    
    // ----------------------------------------------------------------------------------------
    // Subshader that does soft PCF filtering while collecting shadows.
    // Requires SM3 GPU.
    
    Subshader {
        Tags {"ShadowmapFilter" = "PCF_SOFT"}
        Pass {
            ZWrite Off ZTest Always Cull Off
    
            CGPROGRAM
            #pragma vertex vert
            #pragma fragment frag_pcfSoft
            #pragma multi_compile_shadowcollector
            #pragma target 3.0
    
            inline float3 computeCameraSpacePosFromDepth(v2f i)
            {
                return computeCameraSpacePosFromDepthAndVSInfo(i);
            }
            ENDCG
        }
    }
    
    // ----------------------------------------------------------------------------------------
    // Subshader that does soft PCF filtering while collecting shadows.
    // Requires SM3 GPU.
    // This version does inv projection at the PS level, slower and less precise however more general.
    
    Subshader{
        Tags{ "ShadowmapFilter" = "PCF_SOFT_FORCE_INV_PROJECTION_IN_PS" }
        Pass{
            ZWrite Off ZTest Always Cull Off
    
            CGPROGRAM
            #pragma vertex vert
            #pragma fragment frag_pcfSoft
            #pragma multi_compile_shadowcollector
            #pragma target 3.0
    
            inline float3 computeCameraSpacePosFromDepth(v2f i)
            {
                return computeCameraSpacePosFromDepthAndInvProjMat(i);
            }
            ENDCG
        }
    }
    
    vertex shader

    As you can see, all 4 subshaders share the same vertex shader. Here is the key part of its implementation:

    struct appdata {
        float4 vertex : POSITION;
        float2 texcoord : TEXCOORD0;
        float3 ray : TEXCOORD1;
    };
    
    v2f vert (appdata v)
    {
        v2f o;
        float4 clipPos;
    
        clipPos = UnityObjectToClipPos(v.vertex);
    
        o.pos = clipPos;
        o.uv.xy = v.texcoord;
    
        // unity_CameraInvProjection at the PS level.
        o.uv.zw = ComputeNonStereoScreenPos(clipPos);
    
        // Perspective case
        o.ray = v.ray;
    
        // To compute view space position from Z buffer for orthographic case,
        // we need different code than for perspective case. We want to avoid
        // doing matrix multiply in the pixel shader: less operations, and less
        // constant registers used. Particularly with constant registers, having
        // unity_CameraInvProjection in the pixel shader would push the PS over SM2.0
        // limits.
        clipPos.y *= _ProjectionParams.x;
        float3 orthoPosNear = mul(unity_CameraInvProjection, float4(clipPos.x,clipPos.y,-1,1)).xyz;
        float3 orthoPosFar  = mul(unity_CameraInvProjection, float4(clipPos.x,clipPos.y, 1,1)).xyz;
        orthoPosNear.z *= -1;
        orthoPosFar.z *= -1;
        o.orthoPosNear = orthoPosNear;
        o.orthoPosFar = orthoPosFar;
    
        return o;
    }
    

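    v2f holds the data passed to the fragment shader. pos is the position in the clip space of the current camera, and uv stores the vertex's texture coordinates (xy) and screen coordinates (zw). The ray member is something new. When the main camera is a perspective camera (note that the current camera and the main camera are not the same thing), it is the ray from the camera position to a corner of the far clip plane; in other words, these are the 4 rays that span the camera frustum. Put that way, it sounds as if only 4 vertices are passed to the vertex shader, and that is indeed the case.

    How can we verify this? Unity itself does not provide a tool for inspecting the vertex data passed to the vertex shader, but we can capture a frame with RenderDoc and look there:

    There are 6 indices, which is exactly two triangles, i.e. 4 vertices. The input positions are simply the 4 corners of the screen, and TEXCOORD1 is the ray mentioned above. How is the ray computed? First, our Camera component has an FOV of 60 degrees, a near clip plane of 0.3 and a far clip plane of 1000:

    The z component of the ray follows immediately: it is the far clip distance, 1000. To get x and y we also need the camera's aspect ratio, which the Camera component does not show. We can dig it out of the Frame Debugger by checking the size of the depth texture the camera renders:

    That gives the aspect ratio directly: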

    \[ aspect = \dfrac{w}{h} = \dfrac{1141}{529} \]

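    Since the FOV is vertical, we first compute the half-height of the far clip plane, where \(\theta\) is half of the FOV, i.e. 30 degrees: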

    \[ h = z_f \cdot \tan\theta = 1000 \times \dfrac{\sqrt 3}{3} = 577.35 \]

    and then the half-width of the far clip plane:

    \[ w = h \cdot aspect = 577.35 \times \dfrac{1141}{529} = 1245.29 \]

    These values match the ray captured in RenderDoc exactly.
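
    The arithmetic above can be replayed in a few lines of Python. This is only a sketch: far_plane_corner is a hypothetical helper, and the inputs (60 degree vertical FOV, far plane at 1000, 1141x529 depth texture) are the numbers quoted in the text. It returns one corner with all components positive; the other three corners differ only in the signs of x and y.

```python
import math

def far_plane_corner(fov_y_deg, aspect, far):
    """Top-right corner of the far clip plane in view space, with z kept positive."""
    half_height = far * math.tan(math.radians(fov_y_deg) / 2.0)
    half_width = half_height * aspect
    return (half_width, half_height, far)

# Numbers quoted in the text: vertical FOV 60, far plane 1000, 1141x529 depth texture.
corner = far_plane_corner(60.0, 1141 / 529, 1000.0)
print(tuple(round(c, 2) for c in corner))
```

    The x and y components come out as the half-width and half-height derived above.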

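    RenderDoc also shows the coordinates of clipPos:

    Comparing with the Frame Debugger, the camera matrix used when this shader runs is:

    It is just a simple orthographic projection. Its purpose is to place the transformed vertices exactly on the frustum boundary in x and y (\(-w \leq x \leq w, -w \leq y \leq w\)), which makes the later calculations easier.

    Next, clipPos.y is multiplied by _ProjectionParams.x, which indicates whether the projection matrix has been flipped: 1 means not flipped, -1 means flipped. It exists mainly to handle platform differences when a camera renders into a render texture. As is well known, DirectX's uv coordinate system has its origin at the top-left corner with v pointing down, whereas OpenGL's has its origin at the bottom-left corner with v pointing up:

    By flipping the projection matrix, rendering into a render texture does not need to care about the platform difference and can simply follow the OpenGL convention from 0 to 1. Multiplying clipPos.y by this value manually serves the same purpose: it makes sure clipPos.y follows the OpenGL convention.

    Next comes the inverse projection, which transforms the clip-space points into camera space to get their positions on the near and far clip planes. The code uses unity_CameraInvProjection rather than the inverse of UNITY_MATRIX_P, which means platform details are ignored and the inverse projection follows the OpenGL convention directly. Also, the camera here is the main camera, not the current camera. Since clipPos has already been adjusted, the result can be computed directly.

    Finally, the z components of orthoPosNear and orthoPosFar are multiplied by -1. Under the OpenGL convention, z in camera space is negative in front of the camera, so negating it yields a positive z that is easier to work with later.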
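
    To make the orthographic branch concrete, here is a small Python sketch of what the inverse projection does to the clip-space points (x, y, -1) and (x, y, 1). It assumes a symmetric OpenGL-style orthographic projection; ortho_inverse_project, the half extents 8 and 5, and the lerp helper are illustrative choices, not part of the shader.

```python
def ortho_inverse_project(clip_x, clip_y, clip_z, half_w, half_h, near, far):
    """Map an OpenGL-style clip-space point (w = 1) back to view space for a
    symmetric orthographic projection, then flip z so it is positive."""
    view_x = clip_x * half_w
    view_y = clip_y * half_h
    # OpenGL ortho: clip_z = -2 / (far - near) * view_z - (far + near) / (far - near)
    view_z = -(clip_z * (far - near) + (far + near)) / 2.0
    return (view_x, view_y, -view_z)  # same flip as "orthoPosNear.z *= -1"

def lerp(a, b, t):
    """Component-wise linear interpolation between two tuples."""
    return tuple(x + (y - x) * t for x, y in zip(a, b))

near, far = 0.3, 1000.0
ortho_pos_near = ortho_inverse_project(1.0, 1.0, -1.0, 8.0, 5.0, near, far)
ortho_pos_far = ortho_inverse_project(1.0, 1.0, 1.0, 8.0, 5.0, near, far)
mid = lerp(ortho_pos_near, ortho_pos_far, 0.5)
print(ortho_pos_near, ortho_pos_far, mid)
```

    Clip z of -1 lands on the near plane and +1 on the far plane, so a depth value in 0..1 can interpolate linearly between the two positions.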

    frag_hard

    The fragment shader has two implementations, frag_hard and frag_pcfSoft, depending on whether soft shadows are enabled. Let's look at frag_hard first:

    fixed4 frag_hard (v2f i) : SV_Target
    {
        float4 wpos;
        float3 vpos;
    
        vpos = computeCameraSpacePosFromDepth(i);
        wpos = mul (unity_CameraToWorld, float4(vpos,1));
    
        fixed4 cascadeWeights = GET_CASCADE_WEIGHTS (wpos, vpos.z);
        float4 shadowCoord = GET_SHADOW_COORDINATES(wpos, cascadeWeights);
    
        //1 tap hard shadow
        fixed shadow = UNITY_SAMPLE_SHADOW(_ShadowMapTexture, shadowCoord);
        shadow = lerp(_LightShadowData.r, 1.0, shadow);
    
        fixed4 res = shadow;
        return res;
    }
    

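    The code is straightforward, so instead of walking through it line by line, let's look at the functions it uses. computeCameraSpacePosFromDepth, as the name suggests, computes the camera-space position of the current pixel. It has two implementations: one applies the inverse projection with the unity_CameraInvProjection matrix, the other computes the pixel's depth in camera space and interpolates by hand: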

    /**
    * Get camera space coord from depth and inv projection matrices
    */
    inline float3 computeCameraSpacePosFromDepthAndInvProjMat(v2f i)
    {
        float zdepth = SAMPLE_DEPTH_TEXTURE(_CameraDepthTexture, i.uv.xy);
    
        #if defined(UNITY_REVERSED_Z)
            zdepth = 1 - zdepth;
        #endif
    
        // View position calculation for oblique clipped projection case.
        // this will not be as precise nor as fast as the other method
        // (which computes it from interpolated ray & depth) but will work
        // with funky projections.
        float4 clipPos = float4(i.uv.zw, zdepth, 1.0);
        clipPos.xyz = 2.0f * clipPos.xyz - 1.0f;
        float4 camPos = mul(unity_CameraInvProjection, clipPos);
        camPos.xyz /= camPos.w;
        camPos.z *= -1;
        return camPos.xyz;
    }
    
    /**
    * Get camera space coord from depth and info from VS
    */
    inline float3 computeCameraSpacePosFromDepthAndVSInfo(v2f i)
    {
        float zdepth = SAMPLE_DEPTH_TEXTURE(_CameraDepthTexture, i.uv.xy);
    
        // 0..1 linear depth, 0 at camera, 1 at far plane.
        float depth = lerp(Linear01Depth(zdepth), zdepth, unity_OrthoParams.w);
    #if defined(UNITY_REVERSED_Z)
        zdepth = 1 - zdepth;
    #endif
    
        // view position calculation for perspective & ortho cases
        float3 vposPersp = i.ray * depth;
        float3 vposOrtho = lerp(i.orthoPosNear, i.orthoPosFar, zdepth);
        // pick the perspective or ortho position as needed
        float3 camPos = lerp(vposPersp, vposOrtho, unity_OrthoParams.w);
        return camPos.xyz;
    }
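
    The two reconstruction paths can be checked against each other outside the shader. The Python sketch below covers only the perspective case with a non-reversed depth buffer and an OpenGL-style projection; all function names are hypothetical, and linear01_depth mirrors the usual form of Linear01Depth under those assumptions.

```python
import math

def project_to_screen(p_view, fov_y_deg, aspect, near, far):
    """Project a view-space point (OpenGL convention, z < 0 in front of the
    camera) and return (u, v, zbuf), each in 0..1, non-reversed depth."""
    x, y, z = p_view
    t = math.tan(math.radians(fov_y_deg) / 2.0)
    d = -z
    ndc_x = x / (aspect * t * d)
    ndc_y = y / (t * d)
    ndc_z = (far + near) / (far - near) - 2.0 * far * near / ((far - near) * d)
    return (ndc_x * 0.5 + 0.5, ndc_y * 0.5 + 0.5, ndc_z * 0.5 + 0.5)

def linear01_depth(zbuf, near, far):
    """0 at the camera, 1 at the far plane, for a non-reversed depth buffer."""
    return 1.0 / ((1.0 - far / near) * zbuf + far / near)

def view_pos_from_ray(u, v, zbuf, fov_y_deg, aspect, near, far):
    """'VS info' path: scale the interpolated far-plane ray by linear depth."""
    half_h = far * math.tan(math.radians(fov_y_deg) / 2.0)
    half_w = half_h * aspect
    ray = ((2.0 * u - 1.0) * half_w, (2.0 * v - 1.0) * half_h, far)
    depth01 = linear01_depth(zbuf, near, far)
    return tuple(c * depth01 for c in ray)

def view_pos_from_inv_proj(u, v, zbuf, fov_y_deg, aspect, near, far):
    """'Inverse projection' path: unproject the clip-space point, divide by w."""
    t = math.tan(math.radians(fov_y_deg) / 2.0)
    ndc = (2.0 * u - 1.0, 2.0 * v - 1.0, 2.0 * zbuf - 1.0)
    c = -(far + near) / (far - near)
    d = -2.0 * far * near / (far - near)
    # Closed-form inverse of the OpenGL perspective matrix applied to (ndc, 1).
    cam = (ndc[0] * aspect * t, ndc[1] * t, -1.0, (ndc[2] + c) / d)
    x, y, z = (cam[0] / cam[3], cam[1] / cam[3], cam[2] / cam[3])
    return (x, y, -z)  # same flip as "camPos.z *= -1"

point = (3.0, -2.0, -50.0)
u, v, zbuf = project_to_screen(point, 60.0, 1141 / 529, 0.3, 1000.0)
print(view_pos_from_ray(u, v, zbuf, 60.0, 1141 / 529, 0.3, 1000.0))
print(view_pos_from_inv_proj(u, v, zbuf, 60.0, 1141 / 529, 0.3, 1000.0))
```

    Both paths recover the original x and y and a positive z equal to the distance along the view axis. The ray path needs only one multiply per pixel, while the inverse projection path needs a matrix multiply and a divide but does not depend on the frustum being a regular one.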
    

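    The GET_CASCADE_WEIGHTS macro also has two implementations, depending on whether the SHADOWS_SPLIT_SPHERES keyword is defined. The keyword is controlled by the Shadow Projection option in Quality Settings mentioned in an earlier article; it is enabled only when the option is set to Stable Fit. Let's start with the case where the keyword is not defined, i.e. Close Fit:

    As the figure shows, Close Fit splits the view frustum by distance from the camera and computes a bounding box for each sub-frustum; that bounding box is the frustum of the light camera.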

    /**
     * Gets the cascade weights based on the world position of the fragment.
     * Returns a float4 with only one component set that corresponds to the appropriate cascade.
     */
    inline fixed4 getCascadeWeights(float3 wpos, float z)
    {
        fixed4 zNear = float4( z >= _LightSplitsNear );
        fixed4 zFar = float4( z < _LightSplitsFar );
        fixed4 weights = zNear * zFar;
        return weights;
    }
    

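    The function takes two parameters: wpos is the world-space position of the fragment, and z is its depth in camera space. Here a larger z means farther from the camera. The code simply determines which cascade the position falls in. The Frame Debugger confirms the parameter values:

    The main camera's near clip plane is at 0.3 and its far clip plane at 1000, but the shadow distance is set to 200, meaning there are no shadows beyond 200, so the shadow distance can be treated as the far clip distance here. The cascade splits are set to 6.7%, 13.3%, 26.7% and 53.3%, which are the relative sizes of the 4 sub-frustums. _LightSplitsNear and _LightSplitsFar then hold the near and far clip distances of the 4 sub-frustums, computed as follows: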

    \[ n_i = \left\{ \begin{aligned} &N, & i = 0 \\ &f_{i-1}, & i > 0 \end{aligned} \right. \qquad f_i = n_i + (F - N) \cdot p \]

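    Here N and F are the near and far clip distances and p is the split ratio of cascade i. Plugging in the numbers gives the values of _LightSplitsNear and _LightSplitsFar.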
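
    As a worked instance of the formula, the Python sketch below plugs in the values from the text (N = 0.3, F = 200, and the four split ratios) and then mimics getCascadeWeights. The helper names are hypothetical, and the printed distances are what the formula yields, not values read back from Unity.

```python
def cascade_splits(near, far, ratios):
    """Per-cascade near/far distances using n_0 = N, n_i = f_(i-1),
    f_i = n_i + (F - N) * p_i."""
    nears, fars = [], []
    n = near
    for p in ratios:
        f = n + (far - near) * p
        nears.append(n)
        fars.append(f)
        n = f
    return nears, fars

def get_cascade_weights(z, nears, fars):
    """Python stand-in for getCascadeWeights: 1.0 for the cascade containing z."""
    return [float(n <= z < f) for n, f in zip(nears, fars)]

# Values from the text: near 0.3, shadow distance 200 used as the far plane.
nears, fars = cascade_splits(0.3, 200.0, [0.067, 0.133, 0.267, 0.533])
print([round(f, 2) for f in fars])
print(get_cascade_weights(20.0, nears, fars))
```

    A depth of 20 falls between the first and second split distances, so only the second weight is set. A depth beyond 200 yields the zero vector.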

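    Now for Stable Fit, which splits the view frustum using bounding spheres:

    As mentioned in an earlier article, shadows in Stable Fit mode are not affected by the position and rotation of the camera itself. Comparing the two figures, the light camera's frustum changes size with the light direction under Close Fit, but not under Stable Fit. A light frustum that changes size means the effective resolution of the generated shadow map changes, which produces an artifact known as shadow edge shimmering.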

    /**
     * Gets the cascade weights based on the world position of the fragment and the positions of the split spheres for each cascade.
     * Returns a float4 with only one component set that corresponds to the appropriate cascade.
     */
    inline fixed4 getCascadeWeights_splitSpheres(float3 wpos)
    {
        float3 fromCenter0 = wpos.xyz - unity_ShadowSplitSpheres[0].xyz;
        float3 fromCenter1 = wpos.xyz - unity_ShadowSplitSpheres[1].xyz;
        float3 fromCenter2 = wpos.xyz - unity_ShadowSplitSpheres[2].xyz;
        float3 fromCenter3 = wpos.xyz - unity_ShadowSplitSpheres[3].xyz;
        float4 distances2 = float4(dot(fromCenter0,fromCenter0), dot(fromCenter1,fromCenter1), dot(fromCenter2,fromCenter2), dot(fromCenter3,fromCenter3));
        fixed4 weights = float4(distances2 < unity_ShadowSplitSqRadii);
        weights.yzw = saturate(weights.yzw - weights.xyz);
        return weights;
    }
    

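    The function computes the squared distance from the current point to the center of each of the 4 bounding spheres. unity_ShadowSplitSpheres holds the world-space centers of the 4 spheres, and unity_ShadowSplitSqRadii their squared radii. Because the spheres can overlap, a point may lie in an intersection; Unity handles this case so that the point is assigned to a single sphere. Again, the Frame Debugger shows the details: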
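
    The overlap handling is easier to see with numbers. The Python sketch below mirrors the logic of getCascadeWeights_splitSpheres; the sphere centers and radii are made-up values chosen so that each sphere contains the previous one along the z axis, which is an assumption for illustration only.

```python
def cascade_weights_split_spheres(wpos, spheres):
    """spheres: list of 4 (center_xyz, radius) pairs with hypothetical values."""
    inside = []
    for center, radius in spheres:
        dist2 = sum((p - c) ** 2 for p, c in zip(wpos, center))
        inside.append(1.0 if dist2 < radius * radius else 0.0)
    # weights.yzw = saturate(weights.yzw - weights.xyz)
    weights = [inside[0]]
    for i in range(1, 4):
        weights.append(min(1.0, max(0.0, inside[i] - inside[i - 1])))
    return weights

spheres = [
    ((0.0, 0.0, 5.0), 7.0),
    ((0.0, 0.0, 15.0), 18.0),
    ((0.0, 0.0, 40.0), 45.0),
    ((0.0, 0.0, 90.0), 100.0),
]
print(cascade_weights_split_spheres((0.0, 0.0, 10.0), spheres))
print(cascade_weights_split_spheres((0.0, 0.0, 20.0), spheres))
```

    The first point lies inside all four spheres and is assigned to the first one; the second lies inside the last three and is assigned to the second. Subtracting the previous sphere's flag is what removes the duplicates.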

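    Moving on, the GET_SHADOW_COORDINATES macro also has two implementations, depending on whether the SHADOWS_SINGLE_CASCADE keyword is defined. The keyword reflects the number of cascades chosen in Quality Settings; it is defined when the setting is No Cascades. The function for that case is very simple: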

    /**
     * Same as the getShadowCoord; but optimized for single cascade
     */
    inline float4 getShadowCoord_SingleCascade( float4 wpos )
    {
        return float4( mul (unity_WorldToShadow[0], wpos).xyz, 0);
    }
    

    When cascades are used, the function is defined as follows:

    /**
     * Returns the shadowmap coordinates for the given fragment based on the world position and z-depth.
     * These coordinates belong to the shadowmap atlas that contains the maps for all cascades.
     */
    inline float4 getShadowCoord( float4 wpos, fixed4 cascadeWeights )
    {
        float3 sc0 = mul (unity_WorldToShadow[0], wpos).xyz;
        float3 sc1 = mul (unity_WorldToShadow[1], wpos).xyz;
        float3 sc2 = mul (unity_WorldToShadow[2], wpos).xyz;
        float3 sc3 = mul (unity_WorldToShadow[3], wpos).xyz;
        float4 shadowMapCoordinate = float4(sc0 * cascadeWeights[0] + sc1 * cascadeWeights[1] + sc2 * cascadeWeights[2] + sc3 * cascadeWeights[3], 1);
    #if defined(UNITY_REVERSED_Z)
        float  noCascadeWeights = 1 - dot(cascadeWeights, float4(1, 1, 1, 1));
        shadowMapCoordinate.z += noCascadeWeights;
    #endif
        return shadowMapCoordinate;
    }
    

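    cascadeWeights is a four-component vector in which at most one component is 1. When it is the zero vector, i.e. the point is not covered by any cascade and receives no shadow, the computed z component of shadowMapCoordinate is 0. The reversed-Z case needs special handling here: the code adds noCascadeWeights, which is 1 in that situation, to z.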
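
    The selection by weights and the reversed-Z fix-up can be sketched in Python as follows. The per-cascade coordinates stand in for mul(unity_WorldToShadow[i], wpos).xyz and are arbitrary numbers; get_shadow_coord is a hypothetical stand-in, not the shader function itself.

```python
def get_shadow_coord(coords_per_cascade, weights, reversed_z):
    """coords_per_cascade: 4 (x, y, z) tuples, one per cascade.
    weights: cascade weights with at most one component set to 1."""
    xyz = [
        sum(coord[k] * w for coord, w in zip(coords_per_cascade, weights))
        for k in range(3)
    ]
    if reversed_z:
        no_cascade = 1.0 - sum(weights)  # 1 only when no cascade was selected
        xyz[2] += no_cascade
    return (xyz[0], xyz[1], xyz[2], 1.0)

coords = [(0.1, 0.2, 0.3), (0.6, 0.2, 0.4), (0.1, 0.7, 0.5), (0.6, 0.7, 0.6)]
print(get_shadow_coord(coords, [0.0, 1.0, 0.0, 0.0], reversed_z=True))
print(get_shadow_coord(coords, [0.0, 0.0, 0.0, 0.0], reversed_z=True))
```

    With one weight set, the weighted sum just picks that cascade's coordinate. With all weights at zero, the sum collapses to (0, 0, 0) and only the reversed-Z branch moves z to 1.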

    frag_pcfSoft

    That leaves only the soft shadow code:

    /**
     *  Soft Shadow (SM 3.0)
     */
    fixed4 frag_pcfSoft(v2f i) : SV_Target
    {
        float4 wpos;
        float3 vpos;
    
        vpos = computeCameraSpacePosFromDepth(i);
        // sample the cascade the pixel belongs to
        wpos = mul(unity_CameraToWorld, float4(vpos,1));
        
        fixed4 cascadeWeights = GET_CASCADE_WEIGHTS(wpos, vpos.z);
        float4 coord = GET_SHADOW_COORDINATES(wpos, cascadeWeights);
    
        float3 receiverPlaneDepthBias = 0.0;
    #ifdef UNITY_USE_RECEIVER_PLANE_BIAS
        // Receiver plane depth bias: need to calculate it based on shadow coordinate
        // as it would be in first cascade; otherwise derivatives
        // at cascade boundaries will be all wrong. So compute
        // it from cascade 0 UV, and scale based on which cascade we're in.
        float3 coordCascade0 = getShadowCoord_SingleCascade(wpos);
        float biasMultiply = dot(cascadeWeights,unity_ShadowCascadeScales);
        receiverPlaneDepthBias = UnityGetReceiverPlaneDepthBias(coordCascade0.xyz, biasMultiply);
    #endif
    
    #if defined(SHADER_API_MOBILE)
        half shadow = UnitySampleShadowmap_PCF5x5(coord, receiverPlaneDepthBias);
    #else
        half shadow = UnitySampleShadowmap_PCF7x7(coord, receiverPlaneDepthBias);
    #endif
        shadow = lerp(_LightShadowData.r, 1.0f, shadow);
    
        // Blend between shadow cascades if enabled
        //
        // Not working yet with split spheres, and no need when 1 cascade
    #if UNITY_USE_CASCADE_BLENDING && !defined(SHADOWS_SPLIT_SPHERES) && !defined(SHADOWS_SINGLE_CASCADE)
        half4 z4 = (float4(vpos.z,vpos.z,vpos.z,vpos.z) - _LightSplitsNear) / (_LightSplitsFar - _LightSplitsNear);
        half alpha = dot(z4 * cascadeWeights, half4(1,1,1,1));
    
        UNITY_BRANCH
            if (alpha > 1 - UNITY_CASCADE_BLEND_DISTANCE)
            {
                // get alpha to 0..1 range over the blend distance
                alpha = (alpha - (1 - UNITY_CASCADE_BLEND_DISTANCE)) / UNITY_CASCADE_BLEND_DISTANCE;
    
                // sample next cascade
                cascadeWeights = fixed4(0, cascadeWeights.xyz);
                coord = GET_SHADOW_COORDINATES(wpos, cascadeWeights);
    
    #ifdef UNITY_USE_RECEIVER_PLANE_BIAS
                biasMultiply = dot(cascadeWeights,unity_ShadowCascadeScales);
                receiverPlaneDepthBias = UnityGetReceiverPlaneDepthBias(coordCascade0.xyz, biasMultiply);
    #endif
    
                half shadowNextCascade = UnitySampleShadowmap_PCF3x3(coord, receiverPlaneDepthBias);
                shadowNextCascade = lerp(_LightShadowData.r, 1.0f, shadowNextCascade);
                shadow = lerp(shadow, shadowNextCascade, alpha);
            }
    #endif
    
        return shadow;
    }
    

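    The code is largely self-explanatory. As with hard shadows, it first computes the camera-space and world-space positions of the current pixel, then transforms into shadow space to get the shadow map coordinates. If the receiver plane bias is enabled, it calls UnityGetReceiverPlaneDepthBias, covered in an earlier article, to compute the bias, and then samples the shadow map: PCF5x5 on mobile platforms, PCF7x7 otherwise. Both work like UnitySampleShadowmap_PCF3x3 from the earlier article: an isosceles triangle is used for area coverage to compute the weight of each sample. The 5x5 version takes 9 samples with a triangle whose base is 5 texels and whose height is 2.5 texels; the 7x7 version takes 16 samples with a triangle whose base is 7 texels and whose height is 3.5 texels.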
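
    The area-coverage idea can be illustrated in one dimension. The Python sketch below is not Unity's implementation; it is a simplified model that integrates a triangle of the stated base and height over each texel column, and the function names are hypothetical. It shows that a 5-texel triangle with a sub-texel offset touches 6 texels per axis and a 7-texel triangle touches 8, which would be consistent with 3x3 = 9 and 4x4 = 16 fetches if neighbouring texels are fetched in pairs (an assumption on my part).

```python
def tent_cdf(x, radius):
    """Area under the tent max(0, radius - |t|) for t up to x."""
    if x <= -radius:
        return 0.0
    if x <= 0.0:
        return (x + radius) ** 2 / 2.0
    if x < radius:
        return radius * radius - (radius - x) ** 2 / 2.0
    return radius * radius

def tent_texel_weights(offset, width):
    """Normalized coverage of each texel column by an isosceles triangle whose
    base is `width` texels and whose apex sits `offset` texels away from the
    center of texel 0. Texel k spans [k - 0.5, k + 0.5]."""
    radius = width / 2.0
    reach = int(radius + 1.0)
    area = radius * radius  # 0.5 * base * height with height = width / 2
    weights = {}
    for k in range(-reach, reach + 1):
        lo, hi = k - 0.5 - offset, k + 0.5 - offset
        w = (tent_cdf(hi, radius) - tent_cdf(lo, radius)) / area
        if w > 0.0:
            weights[k] = w
    return weights

print(tent_texel_weights(0.0, 5))
print(len(tent_texel_weights(0.25, 5)), len(tent_texel_weights(0.25, 7)))
```

    With the apex centered on a texel, the 5-texel triangle gives weights that peak in the middle and fall off symmetrically; shifting the apex by a fraction of a texel brings one more column into play.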

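    Because the current pixel may lie near the boundary between two cascades, the sampled result can be blended once more. Shifting the cascadeWeights vector right by one component gives the weights of the next cascade; these are used to compute the pixel's shadow map coordinates again and sample the shadow map, this time with PCF3x3. The two results are then blended. The blend weight comes from the pixel's normalized distance from the near plane of its own cascade, remapped to 0..1 over the final UNITY_CASCADE_BLEND_DISTANCE fraction of that cascade.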
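
    The blend factor computation can be traced with a small Python sketch. The split distances are made-up values, the 0.1 blend distance is an assumed value for UNITY_CASCADE_BLEND_DISTANCE, and blend_factor is a hypothetical helper that mirrors the arithmetic in frag_pcfSoft.

```python
BLEND_DISTANCE = 0.1  # assumed value for UNITY_CASCADE_BLEND_DISTANCE

def blend_factor(z, nears, fars, weights, blend=BLEND_DISTANCE):
    """Return (alpha, next_weights), or (None, weights) when no blend is needed."""
    z4 = [(z - n) / (f - n) for n, f in zip(nears, fars)]
    alpha = sum(a * w for a, w in zip(z4, weights))
    if alpha > 1.0 - blend:
        # Remap alpha to 0..1 over the blend distance.
        alpha = (alpha - (1.0 - blend)) / blend
        next_weights = [0.0] + list(weights[:3])  # fixed4(0, cascadeWeights.xyz)
        return alpha, next_weights
    return None, list(weights)

def lerp(a, b, t):
    return a + (b - a) * t

# Hypothetical split distances for 4 cascades.
nears = [0.0, 10.0, 30.0, 70.0]
fars = [10.0, 30.0, 70.0, 150.0]

alpha, next_weights = blend_factor(29.0, nears, fars, [0.0, 1.0, 0.0, 0.0])
shadow = lerp(0.2, 1.0, alpha)  # 0.2 and 1.0 stand in for the two sampled values
print(alpha, next_weights, shadow)
```

    A depth of 29 sits 95% of the way through the second cascade, inside the last 10%, so blending kicks in about halfway and the third cascade's weights are used for the second sample. A depth of 15 would return None and skip the extra sample.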

  • Original article: https://www.cnblogs.com/back-to-the-past/p/15130735.html