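First, one concept needs to be understood:
MAC (message authentication code)
A message authentication code (a keyed hash function) is a verification mechanism used by two communicating parties in cryptography. It is a tool for guaranteeing the integrity of message data.
The HMAC construction was proposed by M. Bellare and co-authors. Its security depends on the underlying hash function, which is why it is also called a keyed hash function. A message authentication code is a value derived from a key and the message (via its digest),
and it can be used for data-origin authentication and integrity checking.
Signed cookies are built on exactly this principle.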
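As a minimal sketch of the idea, the snippet below computes a MAC over a message and checks it on the receiving side. The key, the message, the helper names `make_tag` and `verify`, and the choice of SHA-256 are illustrative assumptions and are not taken from bottle.

```python
import hashlib
import hmac

SECRET_KEY = b"shared-secret"  # hypothetical key known to both parties


def make_tag(message: bytes, key: bytes = SECRET_KEY) -> bytes:
    """Return the MAC (tag) of message under key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(message: bytes, tag: bytes, key: bytes = SECRET_KEY) -> bool:
    """Recompute the tag and compare it with the received one."""
    return hmac.compare_digest(make_tag(message, key), tag)


tag = make_tag(b"account=alice")
print(verify(b"account=alice", tag))    # True: message is unchanged
print(verify(b"account=mallory", tag))  # False: message was altered
```

Anyone without the key cannot produce a valid tag for a modified message, which is the property a signed cookie relies on.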
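Data that is sent over the wire is usually base64-encoded. For background on base64 encoding, see this link:
blog.xiayf.cn/2016/01/24/base64-encoding/
In short: base64 is not encryption. It is a data encoding whose purpose is to make data conform to the requirements of a transport protocol by turning binary data into text.
Python provides two modules that implement it (base64 and the lower-level binascii). The basic usage of base64 is as follows: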
>>> import base64
>>> s = b'asdasdas'
>>> s_64 = base64.b64encode(s)
>>> s_64
b'YXNkYXNkYXM='
>>> base64.b64decode(s_64)
b'asdasdas'
>>>
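For MACs, Python has the hmac module. Basic usage is shown below. digestmod is passed explicitly because Python 3.8 and later no longer fall back to MD5, and the object is bound to h so that it does not shadow the module:
>>> import hmac
>>> hmac.new(b'slat', digestmod='md5')
<hmac.HMAC object at 0x0000000004110B70>
>>> h = hmac.new(b'slat', digestmod='md5')
>>> h.update(b'asdas')
>>> h.digest()
b'\xe8\xb6\x11\x9dj Y\x06I\x1f[\x06\xeb\xeb\xf3'
>>>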
Documentation (output of help(hmac.HMAC)):
class HMAC(builtins.object)
 |  RFC 2104 HMAC class.  Also complies with RFC 4231.
 |
 |  This supports the API for Cryptographic Hash Functions (PEP 247).
 |
 |  Methods defined here:
 |
 |  __init__(self, key, msg=None, digestmod=None)
 |      Create a new HMAC object.
 |
 |      key:       key for the keyed hash object.
 |      msg:       Initial input for the hash, if provided.
 |      digestmod: A module supporting PEP 247.  *OR*
 |                 A hashlib constructor returning a new hash object. *OR*
 |                 A hash name suitable for hashlib.new().
 |                 Defaults to hashlib.md5.
 |                 Implicit default to hashlib.md5 is deprecated and will be
 |                 removed in Python 3.6.
 |
 |      Note: key and msg must be a bytes or bytearray objects.
 |
 |  copy(self)
 |      Return a separate copy of this hashing object.
 |
 |      An update to this copy won't affect the original object.
 |
 |  digest(self)
 |      Return the hash value of this hashing object.
 |
 |      This returns a string containing 8-bit data.  The object is
 |      not altered in any way by this function; you can continue
 |      updating the object after calling this function.
 |
 |  hexdigest(self)
 |      Like digest(), but returns a string of hexadecimal digits instead.
 |
 |  update(self, msg)
 |      Update this hashing object with the string msg.
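The methods listed in the documentation can be tried out with a short sketch like the one below. The key, the messages and the use of SHA-256 are illustrative assumptions. It shows that feeding data through update() in pieces gives the same result as passing it all at once, and that copy() yields an independent object.

```python
import hashlib
import hmac

# Feed the message in two pieces.
h = hmac.new(b"key", digestmod=hashlib.sha256)
h.update(b"hello ")
h.update(b"world")

# Feed the same message in one go.
one_shot = hmac.new(b"key", b"hello world", hashlib.sha256)
print(h.digest() == one_shot.digest())      # True

# hexdigest() is the hexadecimal form of digest().
print(h.hexdigest() == h.digest().hex())    # True

# Updating a copy does not affect the original object.
c = h.copy()
c.update(b"!")
print(c.digest() == h.digest())             # False
```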
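With the basics understood, let's read the bottle framework source code for setting a signed cookie and for reading its value back.
Setting a signed cookie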
first_bottle.py
response.set_cookie('account',username,secret='salt')
def set_cookie(self, name, value, secret=None, **options):
    if not self._cookies:
        self._cookies = SimpleCookie()

    if secret:
        value = touni(cookie_encode((name, value), secret))
    elif not isinstance(value, basestring):
        raise TypeError('Secret key missing for non-string Cookie.')

    if len(value) > 4096: raise ValueError('Cookie value to long.')
    self._cookies[name] = value

    for key, value in options.items():
        if key == 'max_age':
            if isinstance(value, timedelta):
                value = value.seconds + value.days * 24 * 3600
        if key == 'expires':
            if isinstance(value, (datedate, datetime)):
                value = value.timetuple()
            elif isinstance(value, (int, float)):
                value = time.gmtime(value)
            value = time.strftime("%a, %d %b %Y %H:%M:%S GMT", value)
        self._cookies[name][key.replace('_', '-')] = value
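self._cookies defaults to None. SimpleCookie inherits from BaseCookie, and BaseCookie inherits from dict, so for now self._cookies can be thought of as a dictionary.
The `if secret` check means that when a secret key is supplied, the value is replaced by the signed encoding of (name, value). That branch relies on the following helpers: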
def tob(s, enc='utf8'):
    return s.encode(enc) if isinstance(s, unicode) else bytes(s)

def touni(s, enc='utf8', err='strict'):
    return s.decode(enc, err) if isinstance(s, bytes) else unicode(s)

tonat = touni if py3k else tob
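The touni function returns a str. On Python 3, bottle aliases unicode to str, so unicode(s) is equivalent to str(s). tob does the opposite and returns bytes.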
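A rough Python 3 equivalent of the two helpers, written with str in place of the unicode alias, might look like this. It is a sketch for experimentation, not bottle's own code.

```python
def tob(s, enc="utf8"):
    """Return s as bytes (Python 3 sketch of bottle's helper)."""
    return s.encode(enc) if isinstance(s, str) else bytes(s)


def touni(s, enc="utf8", err="strict"):
    """Return s as str (Python 3 sketch of bottle's helper)."""
    return s.decode(enc, err) if isinstance(s, bytes) else str(s)


print(tob("salt"))     # b'salt'
print(tob(b"salt"))    # b'salt' (bytes pass through unchanged)
print(touni(b"salt"))  # salt
print(touni(42))       # 42 (non-bytes values go through str())
```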
def cookie_encode(data, key):
    ''' Encode and sign a pickle-able object. Return a (byte) string '''
    msg = base64.b64encode(pickle.dumps(data, -1))
    sig = base64.b64encode(hmac.new(tob(key), msg).digest())
    return tob('!') + sig + tob('?') + msg
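pickle.dumps serializes the data and returns bytes, which are then base64-encoded to form msg. sig is the HMAC of msg computed with the secret key (a keyed hash, not encryption), also base64-encoded.
Finally msg (the message) and sig (the signature) are concatenated as '!' + sig + '?' + msg. Note that msg here is the tuple (name, value), so it contains both the cookie's key and its value.
With that, the signed cookie is set.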
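To see the resulting layout, here is a self-contained sketch of the same encoding for Python 3. It passes MD5 explicitly as an assumption, to mirror the old implicit default of hmac.new. The cookie name, value and key are made up for illustration.

```python
import base64
import hashlib
import hmac
import pickle


def cookie_encode(data, key):
    """Sketch of the encoding: b'!' + base64(HMAC(msg)) + b'?' + msg."""
    msg = base64.b64encode(pickle.dumps(data, -1))
    sig = base64.b64encode(hmac.new(key.encode("utf8"), msg, hashlib.md5).digest())
    return b"!" + sig + b"?" + msg


token = cookie_encode(("account", "alice"), "salt")

# Neither '!' nor '?' is in the base64 alphabet, so the split is unambiguous.
sig, msg = token[1:].split(b"?", 1)
print(token[:1])                            # b'!'
print(len(sig))                             # 24 (16-byte MD5 digest in base64)
print(pickle.loads(base64.b64decode(msg)))  # ('account', 'alice')
```

The message part is only encoded, not hidden: anyone can decode it, but changing it invalidates the signature.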
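Once setting a signed cookie is understood, reading its value back is fairly simple.
The rough idea: take the cookie value and split it on '?' into the message and sig. Compute the HMAC of the message with the secret to get a new sig, and compare it with the sig that was split off. If they match, the cookie has not been tampered with, so the message is base64-decoded
and passed to pickle.loads, which returns the original tuple. Element [1] of that tuple is the value.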
def cookie_decode(data, key):
    ''' Verify and decode an encoded string. Return an object or None.'''
    data = tob(data)
    if cookie_is_encoded(data):
        sig, msg = data.split(tob('?'), 1)
        if _lscmp(sig[1:], base64.b64encode(hmac.new(tob(key), msg).digest())):
            return pickle.loads(base64.b64decode(msg))
Because set_cookie put an exclamation mark (!) at the front of the string, the signature is taken as the slice sig[1:].
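The verification step can be exercised end to end with a sketch like the following. It is not bottle's code: the names `sign`, `encode` and `decode` are hypothetical, MD5 is passed explicitly as an assumption, and the standard library's hmac.compare_digest is swapped in for bottle's own comparison helper.

```python
import base64
import hashlib
import hmac
import pickle


def sign(msg, key):
    """Return the base64-encoded HMAC of msg."""
    return base64.b64encode(hmac.new(key, msg, hashlib.md5).digest())


def encode(data, key):
    msg = base64.b64encode(pickle.dumps(data, -1))
    return b"!" + sign(msg, key) + b"?" + msg


def decode(token, key):
    """Return the original object, or None if the token is malformed or altered."""
    if not (token.startswith(b"!") and b"?" in token):
        return None
    sig, msg = token.split(b"?", 1)
    # sig still carries the leading '!', hence sig[1:].
    if hmac.compare_digest(sig[1:], sign(msg, key)):
        return pickle.loads(base64.b64decode(msg))
    return None


key = b"salt"
token = encode(("account", "alice"), key)
print(decode(token, key))           # ('account', 'alice')

# Keep the old signature but swap in a different message.
forged_msg = base64.b64encode(pickle.dumps(("account", "admin"), -1))
forged = token.split(b"?", 1)[0] + b"?" + forged_msg
print(decode(forged, key))          # None
print(decode(token, b"wrong-key"))  # None
```

Only tokens whose signature verifies are ever passed to pickle.loads, which matters because unpickling untrusted data is unsafe.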
def _lscmp(a, b):
    ''' Compares two strings in a cryptographically safe way:
        Runtime is not affected by length of common prefix. '''
    return not sum(0 if x==y else 1 for x, y in zip(a, b)) and len(a) == len(b)
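The function above compares the strings character by character. Each pair contributes 0 if the two characters are equal and 1 otherwise. If every compared pair matches, sum() returns 0. If any pair differs, the sum is greater than 0. Because zip() stops at the end of the shorter string, a sum of 0 alone is not enough, so the lengths must also be equal for the strings to count as identical.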
Test code
>>> a='asdas'
>>> unicode(a)
Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    unicode(a)
NameError: name 'unicode' is not defined
>>> b='asd'
>>> (0 if x==y else 1 for x,y in zip(a,b))
<generator object <genexpr> at 0x0000000003170200>
>>> sum((0 if x==y else 1 for x,y in zip(a,b)))
0
>>> s=zip(a,b)
>>> s
<zip object at 0x0000000003147948>
>>> for i in s:
...     print(i)
...
('a', 'a')
('s', 's')
('d', 'd')
Why not simply compare the strings with a == b?