  • MD5 Algorithm Explained

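    The previous post gave a bird's-eye view of cryptographic algorithms. This one focuses on how MD5 is implemented.

    MD5 stands for Message-Digest Algorithm 5 and is used to check the integrity of a message. The algorithm was published in 1992 (as RFC 1321), when I was only a few years old. Since everyone knows MD5 and it shows up in programs all the time, I won't introduce it further and will just say a few words about its designer. He is Ronald Rivest, an American cryptographer who had earlier co-invented the asymmetric-key RSA algorithm with Adi Shamir and Leonard Adleman. The three shared the 2002 Turing Award for that breakthrough and its importance to information security.

    With that out of the way, let's go through the steps of the algorithm together with the source code: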

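    1. Padding

    MD5 first pads the message so that its bit length is congruent to 448 modulo 512. Padding is always performed, even when the length already satisfies that condition. The bit length is therefore extended to N*512+448, where N is a non-negative integer (N may be zero).

    To be clear, the bit length is simply the number of bits. The string "wbq", for example, is stored in three bytes of 8 bits each, so its bit length is 24.

    In mathematical terms: let M be the bit length after padding. Processing can continue only when M % 512 == 448, or equivalently M = N*512+448 with N >= 0.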

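    Padding works as follows:

    1) Append a single 1 bit to the message, then as many 0 bits as needed, stopping as soon as the condition above is met.

    2) Append to that result the length of the message before padding (in bits), written as a 64-bit binary value. If that length needs more than 64 bits, only the low 64 bits are used.

    After these two steps, M = N*512+448+64 = (N+1)*512, so the length is an exact multiple of 512. This is what the later processing steps require of the message length.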
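    The arithmetic above can be checked with a small sketch. It is written as a C# top-level program, and `PaddedByteLength` is a hypothetical helper that does not appear in the original code. It works in bytes rather than bits, assuming the message is a whole number of bytes.

```csharp
using System;
using System.Text;

// Hypothetical helper (not part of the article's code): total length in bytes
// after both padding steps, for a message of n bytes.
static int PaddedByteLength(int n)
{
    // n + 1 (the byte holding the 1 bit) + 8 (the 64-bit length field),
    // rounded up to the next multiple of 64 bytes (512 bits).
    return ((n + 8) / 64 + 1) * 64;
}

int bitLength = Encoding.ASCII.GetBytes("wbq").Length * 8;
Console.WriteLine(bitLength);              // 24
Console.WriteLine(PaddedByteLength(3));    // 64
Console.WriteLine(PaddedByteLength(55));   // 64
Console.WriteLine(PaddedByteLength(56));   // 128
Console.WriteLine(PaddedByteLength(64));   // 128
```

    A 56-byte message already has a bit length of 448, yet it still grows to 128 bytes, because padding is never skipped.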

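    After both steps the message has the following layout:

    original message  |  1 bit followed by 0 bits  |  64-bit length

    The final 64 bits (8 bytes) hold the bit length of the original message.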

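            // Needs: using System; using System.Collections;
            private static UInt32[] MD5_Append(byte[] input)
            {
                int n = input.Length;   // original length in bytes
                int m = n % 64;         // bytes in the last, partial 64-byte block
                int zeros;              // number of 0x00 padding bytes
                int size;               // total padded length in bytes
                if (m < 56)
                {
                    zeros = 55 - m;
                    size = n - m + 64;
                }
                else
                {
                    // m >= 56 (including m == 56): padding is mandatory, and the
                    // 0x80 byte plus the 8-byte length no longer fit in this
                    // block, so the padding runs on into the next block.
                    zeros = 63 - m + 56;
                    size = n + 64 - m + 64;
                }

                ArrayList bs = new ArrayList(input);
                bs.Add((byte)0x80); // 0x80 = binary 10000000: the 1 bit plus seven 0 bits
                for (int i = 0; i < zeros; i++)
                {
                    bs.Add((byte)0);
                }

                // Append the original bit length as 8 bytes, low-order byte first.
                UInt64 N = (UInt64)n * 8;
                for (int i = 0; i < 8; i++)
                {
                    bs.Add((byte)((N >> (8 * i)) & 0xFF));
                }
                byte[] ts = (byte[])bs.ToArray(typeof(byte));

                /* Decodes input (byte[]) into output (UInt32[]). Assumes len is
                 * a multiple of 4.
                 */
                UInt32[] output = new UInt32[size / 4];
                for (Int64 i = 0, j = 0; i < size; j++, i += 4)
                {
                    output[j] = (UInt32)(ts[i] | ts[i + 1] << 8 | ts[i + 2] << 16 | ts[i + 3] << 24);
                }
                return output;
            }

    Notes: how many zeros are added, and how? The value m = n % 64 is the number of bytes in the last, partial block. In the m < 56 branch, why is it 55 - m rather than 56 - m? Here 56 - m is the number of bytes still missing. One of them is the byte that carries the 1 bit, so the number of zero bytes is 56 - m - 1 = 55 - m. How is the new length, size, computed? It is the original length + the byte with the 1 bit + the zero bytes + the final 64-bit length field, which gives size = n - m + 64:

    size = n + 1 + (55 - m) + 8 = n - m + 64

    Note: all of these calculations are in bytes.

    The else branch follows the same reasoning. When m >= 56, the 0x80 byte and the length field no longer fit in the current block, so the zeros run up to byte 56 of the next block: zeros = (63 - m) + 56 and size = n + 1 + (63 - m) + 56 + 8 = n - m + 128. The case m == 56 belongs to this branch as well. The length is already congruent to 448 bits modulo 512, but padding must still be performed, so a full 64 bytes of padding are added. The code then converts the original bit length N into 8 bytes, low-order byte first, and appends them to the array. The last loop splits the padded message into groups of four bytes. Each group becomes a UInt32, an unsigned 32-bit value, with the first byte of the group as the low-order byte.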
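    To make the result of padding concrete, here is a minimal sketch that pads the 3-byte message "wbq" and inspects a few bytes of the resulting block. `PadMessage` is a hypothetical helper written for this illustration. It uses a fixed-size array instead of an ArrayList and always appends the 0x80 byte.

```csharp
using System;
using System.Text;

// Hypothetical helper (illustration only): returns the padded message bytes.
static byte[] PadMessage(byte[] input)
{
    int n = input.Length;
    int size = ((n + 8) / 64 + 1) * 64;   // total padded length in bytes
    byte[] padded = new byte[size];       // new arrays start out zero-filled
    Array.Copy(input, padded, n);
    padded[n] = 0x80;                     // the single 1 bit; zeros follow
    ulong bits = (ulong)n * 8;            // original length in bits
    for (int i = 0; i < 8; i++)
    {
        // The 64-bit length goes in the last 8 bytes, low-order byte first.
        padded[size - 8 + i] = (byte)(bits >> (8 * i));
    }
    return padded;
}

byte[] wbq = PadMessage(Encoding.ASCII.GetBytes("wbq"));
Console.WriteLine(wbq.Length);             // 64
Console.WriteLine(wbq[3].ToString("x2"));  // 80
Console.WriteLine(wbq[56]);                // 24
```

    Byte 3 holds the 1 bit, bytes 4 to 55 are zero, and byte 56 is the low byte of the bit length 24.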

    2. Initialize the variables

            private static void MD5_Init()
            {
                A = 0x67452301;  // bytes in memory (little-endian): 01 23 45 67
                B = 0xefcdab89;  // bytes in memory (little-endian): 89 ab cd ef
                C = 0x98badcfe;  // bytes in memory (little-endian): fe dc ba 98
                D = 0x10325476;  // bytes in memory (little-endian): 76 54 32 10
            }

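    Note: little-endian byte order is used here. So what are big-endian and little-endian?

    As an example, take the number 0x12 34 56 78 and look at how it is laid out in memory.

    1) Big-endian: the high-order byte is stored at the lowest memory address and the low-order byte at the highest. (Big-endian is actually the more intuitive of the two.)

    low address --------------------> high address
    0x12  |  0x34  |  0x56  |  0x78

    2) Little-endian: the low-order byte is stored at the lowest memory address and the high-order byte at the highest.

    low address --------------------> high address
    0x78  |  0x56  |  0x34  |  0x12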
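    The little-endian layout can be reproduced with explicit shifts, independent of the byte order of the machine running the code. This is only an illustrative sketch, and the variable names are made up for the example.

```csharp
using System;

uint a = 0x67452301;

// Split the word into bytes, low-order byte first (little-endian order).
byte[] little =
{
    (byte)(a & 0xFF),
    (byte)((a >> 8) & 0xFF),
    (byte)((a >> 16) & 0xFF),
    (byte)((a >> 24) & 0xFF),
};
Console.WriteLine(BitConverter.ToString(little)); // 01-23-45-67

// Reassemble the word from the bytes, the same way the padding code packs
// four bytes into a UInt32.
uint back = (uint)(little[0] | little[1] << 8 | little[2] << 16 | little[3] << 24);
Console.WriteLine(back == a); // True
```

    This shows why the initial value 0x67452301 is described as 01 23 45 67 in memory.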

    3. Process the message blocks

            private static UInt32[] MD5_Trasform(UInt32[] x)
            {
                UInt32 a, b, c, d;
    
                for (int k = 0; k < x.Length; k += 16)
                {
                    a = A;
                    b = B;
                    c = C;
                    d = D;
    
                    /* Round 1 */
                    FF(ref a, b, c, d, x[k + 0], S11, 0xd76aa478); /* 1 */
                    FF(ref d, a, b, c, x[k + 1], S12, 0xe8c7b756); /* 2 */
                    FF(ref c, d, a, b, x[k + 2], S13, 0x242070db); /* 3 */
                    FF(ref b, c, d, a, x[k + 3], S14, 0xc1bdceee); /* 4 */
                    FF(ref a, b, c, d, x[k + 4], S11, 0xf57c0faf); /* 5 */
                    FF(ref d, a, b, c, x[k + 5], S12, 0x4787c62a); /* 6 */
                    FF(ref c, d, a, b, x[k + 6], S13, 0xa8304613); /* 7 */
                    FF(ref b, c, d, a, x[k + 7], S14, 0xfd469501); /* 8 */
                    FF(ref a, b, c, d, x[k + 8], S11, 0x698098d8); /* 9 */
                    FF(ref d, a, b, c, x[k + 9], S12, 0x8b44f7af); /* 10 */
                    FF(ref c, d, a, b, x[k + 10], S13, 0xffff5bb1); /* 11 */
                    FF(ref b, c, d, a, x[k + 11], S14, 0x895cd7be); /* 12 */
                    FF(ref a, b, c, d, x[k + 12], S11, 0x6b901122); /* 13 */
                    FF(ref d, a, b, c, x[k + 13], S12, 0xfd987193); /* 14 */
                    FF(ref c, d, a, b, x[k + 14], S13, 0xa679438e); /* 15 */
                    FF(ref b, c, d, a, x[k + 15], S14, 0x49b40821); /* 16 */
    
                    /* Round 2 */
                    GG(ref a, b, c, d, x[k + 1], S21, 0xf61e2562); /* 17 */
                    GG(ref d, a, b, c, x[k + 6], S22, 0xc040b340); /* 18 */
                    GG(ref c, d, a, b, x[k + 11], S23, 0x265e5a51); /* 19 */
                    GG(ref b, c, d, a, x[k + 0], S24, 0xe9b6c7aa); /* 20 */
                    GG(ref a, b, c, d, x[k + 5], S21, 0xd62f105d); /* 21 */
                    GG(ref d, a, b, c, x[k + 10], S22, 0x2441453); /* 22 */
                    GG(ref c, d, a, b, x[k + 15], S23, 0xd8a1e681); /* 23 */
                    GG(ref b, c, d, a, x[k + 4], S24, 0xe7d3fbc8); /* 24 */
                    GG(ref a, b, c, d, x[k + 9], S21, 0x21e1cde6); /* 25 */
                    GG(ref d, a, b, c, x[k + 14], S22, 0xc33707d6); /* 26 */
                    GG(ref c, d, a, b, x[k + 3], S23, 0xf4d50d87); /* 27 */
                    GG(ref b, c, d, a, x[k + 8], S24, 0x455a14ed); /* 28 */
                    GG(ref a, b, c, d, x[k + 13], S21, 0xa9e3e905); /* 29 */
                    GG(ref d, a, b, c, x[k + 2], S22, 0xfcefa3f8); /* 30 */
                    GG(ref c, d, a, b, x[k + 7], S23, 0x676f02d9); /* 31 */
                    GG(ref b, c, d, a, x[k + 12], S24, 0x8d2a4c8a); /* 32 */
    
                    /* Round 3 */
                    HH(ref a, b, c, d, x[k + 5], S31, 0xfffa3942); /* 33 */
                    HH(ref d, a, b, c, x[k + 8], S32, 0x8771f681); /* 34 */
                    HH(ref c, d, a, b, x[k + 11], S33, 0x6d9d6122); /* 35 */
                    HH(ref b, c, d, a, x[k + 14], S34, 0xfde5380c); /* 36 */
                    HH(ref a, b, c, d, x[k + 1], S31, 0xa4beea44); /* 37 */
                    HH(ref d, a, b, c, x[k + 4], S32, 0x4bdecfa9); /* 38 */
                    HH(ref c, d, a, b, x[k + 7], S33, 0xf6bb4b60); /* 39 */
                    HH(ref b, c, d, a, x[k + 10], S34, 0xbebfbc70); /* 40 */
                    HH(ref a, b, c, d, x[k + 13], S31, 0x289b7ec6); /* 41 */
                    HH(ref d, a, b, c, x[k + 0], S32, 0xeaa127fa); /* 42 */
                    HH(ref c, d, a, b, x[k + 3], S33, 0xd4ef3085); /* 43 */
                    HH(ref b, c, d, a, x[k + 6], S34, 0x4881d05); /* 44 */
                    HH(ref a, b, c, d, x[k + 9], S31, 0xd9d4d039); /* 45 */
                    HH(ref d, a, b, c, x[k + 12], S32, 0xe6db99e5); /* 46 */
                    HH(ref c, d, a, b, x[k + 15], S33, 0x1fa27cf8); /* 47 */
                    HH(ref b, c, d, a, x[k + 2], S34, 0xc4ac5665); /* 48 */
    
                    /* Round 4 */
                    II(ref a, b, c, d, x[k + 0], S41, 0xf4292244); /* 49 */
                    II(ref d, a, b, c, x[k + 7], S42, 0x432aff97); /* 50 */
                    II(ref c, d, a, b, x[k + 14], S43, 0xab9423a7); /* 51 */
                    II(ref b, c, d, a, x[k + 5], S44, 0xfc93a039); /* 52 */
                    II(ref a, b, c, d, x[k + 12], S41, 0x655b59c3); /* 53 */
                    II(ref d, a, b, c, x[k + 3], S42, 0x8f0ccc92); /* 54 */
                    II(ref c, d, a, b, x[k + 10], S43, 0xffeff47d); /* 55 */
                    II(ref b, c, d, a, x[k + 1], S44, 0x85845dd1); /* 56 */
                    II(ref a, b, c, d, x[k + 8], S41, 0x6fa87e4f); /* 57 */
                    II(ref d, a, b, c, x[k + 15], S42, 0xfe2ce6e0); /* 58 */
                    II(ref c, d, a, b, x[k + 6], S43, 0xa3014314); /* 59 */
                    II(ref b, c, d, a, x[k + 13], S44, 0x4e0811a1); /* 60 */
                    II(ref a, b, c, d, x[k + 4], S41, 0xf7537e82); /* 61 */
                    II(ref d, a, b, c, x[k + 11], S42, 0xbd3af235); /* 62 */
                    II(ref c, d, a, b, x[k + 2], S43, 0x2ad7d2bb); /* 63 */
                    II(ref b, c, d, a, x[k + 9], S44, 0xeb86d391); /* 64 */
    
                    A += a;
                    B += b;
                    C += c;
                    D += d;
                }
                return new UInt32[] { A, B, C, D };
            }

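    Each block goes through 64 steps, organized as four rounds of 16 steps, with FF, GG, HH and II as the step functions. As the code shows, every 16 numbers (UInt32 values, 512 bits in total) form one block. That is the core of the algorithm. The main method of the program follows: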

            public static byte[] MD5Array(byte[] input)
            {
                MD5_Init();
                UInt32[] block = MD5_Append(input);
                UInt32[] bits = MD5_Trasform(block);
    
                /* Encodes bits (UInt32[]) into output (byte[]). Assumes len is
                 * a multiple of 4.
                     */
                byte[] output = new byte[bits.Length * 4];
                for (int i = 0, j = 0; i < bits.Length; i++, j += 4)
                {
                    output[j] = (byte)(bits[i] & 0xff);
                    output[j + 1] = (byte)((bits[i] >> 8) & 0xff);
                    output[j + 2] = (byte)((bits[i] >> 16) & 0xff);
                    output[j + 3] = (byte)((bits[i] >> 24) & 0xff);
                }
                return output;
            }

    Concatenating the bytes of output gives the MD5 value. Pass output to the following method to get it as a hex string:

          public static string ArrayToHexString(byte[] array, bool uppercase)
            {
                string hexString = "";
                string format = "x2";
                if (uppercase)
                {
                    format = "X2";
                }
                foreach (byte b in array)
                {
                    hexString += b.ToString(format);
                }
                return hexString;
            }
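    One way to sanity-check a hand-written implementation like this is to compare its output with the framework's built-in `System.Security.Cryptography.MD5` on known inputs. The sketch below only shows the reference side. `ReferenceMd5Hex` is a hypothetical helper name, and the expected digests are the test vectors listed in RFC 1321.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: MD5 of an ASCII string via the framework, as lowercase hex.
static string ReferenceMd5Hex(string text)
{
    using (MD5 md5 = MD5.Create())
    {
        byte[] hash = md5.ComputeHash(Encoding.ASCII.GetBytes(text));
        return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
    }
}

Console.WriteLine(ReferenceMd5Hex(""));    // d41d8cd98f00b204e9800998ecf8427e
Console.WriteLine(ReferenceMd5Hex("abc")); // 900150983cd24fb0d6963f7d28e17f72
```

    For the same input bytes, ArrayToHexString(MD5Array(bytes), false) should return the same string.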

    Appendix: constants and basic functions:

           //static state variables
            private static UInt32 A;
            private static UInt32 B;
            private static UInt32 C;
            private static UInt32 D;
    
            #region Constants
    
            //number of bits to rotate in transforming
            private const int S11 = 7;
            private const int S12 = 12;
            private const int S13 = 17;
            private const int S14 = 22;
            private const int S21 = 5;
            private const int S22 = 9;
            private const int S23 = 14;
            private const int S24 = 20;
            private const int S31 = 4;
            private const int S32 = 11;
            private const int S33 = 16;
            private const int S34 = 23;
            private const int S41 = 6;
            private const int S42 = 10;
            private const int S43 = 15;
            private const int S44 = 21;
    
            #endregion
    
            #region Basic functions
    
            /* F, G, H and I are basic MD5 functions.
             * The four nonlinear functions:
             * 
             * F(X,Y,Z) = (X&Y)|((~X)&Z)
             * G(X,Y,Z) = (X&Z)|(Y&(~Z))
             * H(X,Y,Z) = X^Y^Z
             * I(X,Y,Z) = Y^(X|(~Z))
             * 
             * (& AND, | OR, ~ NOT, ^ XOR)
             */
             */
            private static uint F(UInt32 x, UInt32 y, UInt32 z)
            {
                return (x & y) | ((~x) & z);
            }
            private static uint G(UInt32 x, UInt32 y, UInt32 z)
            {
                return (x & z) | (y & (~z));
            }
            private static uint H(UInt32 x, UInt32 y, UInt32 z)
            {
                return x ^ y ^ z;   
            }
            private static uint I(UInt32 x, UInt32 y, UInt32 z)
            {
                return y ^ (x | (~z));
            }
    
            /* FF, GG, HH, and II transformations for rounds 1, 2, 3, and 4.
             * Rotation is separate from addition to prevent recomputation.
             */
            private static void FF(ref UInt32 a, UInt32 b, UInt32 c, UInt32 d, UInt32 mj, int s, UInt32 ti)
            {
                a = a + F(b, c, d) + mj + ti;
                a = a << s | a >> (32 - s);
                a += b;
            }
            private static void GG(ref UInt32 a, UInt32 b, UInt32 c, UInt32 d, UInt32 mj, int s, UInt32 ti)
            {
                a = a + G(b, c, d) + mj + ti;
                a = a << s | a >> (32 - s);
                a += b;
            }
            private static void HH(ref UInt32 a, UInt32 b, UInt32 c, UInt32 d, UInt32 mj, int s, UInt32 ti)
            {
                a = a + H(b, c, d) + mj + ti;
                a = a << s | a >> (32 - s);
                a += b;
            }
            private static void II(ref UInt32 a, UInt32 b, UInt32 c, UInt32 d, UInt32 mj, int s, UInt32 ti)
            {
                a = a + I(b, c, d) + mj + ti;
                a = a << s | a >> (32 - s);
                a += b;
            }
    
            #endregion
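    A useful way to read F and G is as bitwise selectors. F takes each bit from y where x has a 1 and from z where x has a 0. G does the same with z as the selector, choosing between x and y. The sketch below redefines the two functions locally so that it stands alone, and the input values are arbitrary ones chosen to make the selection visible.

```csharp
using System;

static uint F(uint x, uint y, uint z) { return (x & y) | (~x & z); }
static uint G(uint x, uint y, uint z) { return (x & z) | (y & ~z); }

// x = 0xFFFF0000 selects the high half of y and the low half of z.
Console.WriteLine(F(0xFFFF0000, 0x12345678, 0x9ABCDEF0).ToString("x8")); // 1234def0

// z = 0xFFFF0000 selects the high half of x and the low half of y.
Console.WriteLine(G(0x12345678, 0x9ABCDEF0, 0xFFFF0000).ToString("x8")); // 1234def0
```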

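    Notes:

    Let Mj denote the j-th 32-bit word of the current message block (j from 0 to 15). The constant ti is the integer part of 4294967296*abs( sin(i) ), where i runs from 1 to 64 and is measured in radians. (4294967296 = 2 to the power of 32)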
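    The formula for ti can be evaluated directly. In this sketch `T` is a hypothetical helper name, and double-precision arithmetic is assumed to be accurate enough for the conversion. The results can be compared with the constants in the first two FF calls of Round 1 and the last II call of Round 4.

```csharp
using System;

// Hypothetical helper: integer part of 2^32 * |sin(i)|, with i in radians.
static uint T(int i)
{
    return (uint)(4294967296.0 * Math.Abs(Math.Sin(i)));
}

Console.WriteLine(T(1).ToString("x8"));   // d76aa478
Console.WriteLine(T(2).ToString("x8"));   // e8c7b756
Console.WriteLine(T(64).ToString("x8"));  // eb86d391
```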

    Now define:

    FF(a ,b ,c ,d ,Mj ,s ,ti ) as the operation a = b + ( (a + F(b,c,d) + Mj + ti) << s)

    GG(a ,b ,c ,d ,Mj ,s ,ti ) as the operation a = b + ( (a + G(b,c,d) + Mj + ti) << s)

    HH(a ,b ,c ,d ,Mj ,s ,ti) as the operation a = b + ( (a + H(b,c,d) + Mj + ti) << s)

    II(a ,b ,c ,d ,Mj ,s ,ti) as the operation a = b + ( (a + I(b,c,d) + Mj + ti) << s)

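    Note: in the definitions above, "<<" stands for a circular left shift (a rotation), not a plain left shift. In C# the operator itself is only a plain left shift, so the functions implement the rotation themselves. The second line of the FF function reads:

     a = a << s | a >> (32 - s);

    It shifts left, shifts right, and combines the two results with a bitwise OR. The left shift fills the vacated bits on the right with 0, and the right shift fills the vacated bits on the left with 0, so together they produce a circular left shift. Picture a line segment joined end to end into a loop: move along the loop, then cut it at some point, and that point becomes the new start and end.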
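    The difference between a plain shift and a rotation is easy to see on a value whose top bit is set. The sketch below defines a local `RotateLeft` helper for illustration. It also compares the result with `System.Numerics.BitOperations.RotateLeft`, which is available from .NET Core 3.0 onward.

```csharp
using System;
using System.Numerics;

// Illustrative helper: rotate a 32-bit value left by s bits.
// Intended for 0 < s < 32, which covers every MD5 shift amount (4 to 23).
static uint RotateLeft(uint value, int s)
{
    return (value << s) | (value >> (32 - s));
}

uint v = 0x80000001;
Console.WriteLine((v << 1).ToString("x8"));          // 00000002 (top bit is lost)
Console.WriteLine(RotateLeft(v, 1).ToString("x8"));  // 00000003 (top bit wraps around)
Console.WriteLine(RotateLeft(v, 7) == BitOperations.RotateLeft(v, 7)); // True
```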

    Summary: compared with other cryptographic algorithms, MD5 is a fairly simple one. Every algorithm is worth working through and studying carefully.

  • Original article: https://www.cnblogs.com/wangqiang3311/p/14945925.html