  • Python Bytecode and Interpreter Study Notes

    References: http://blog.jobbole.com/55327/

    http://blog.jobbole.com/56300/

    http://blog.jobbole.com/56761/

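    1. What happens internally when you run a command at the interactive prompt

    When you press Return, Python performs four steps: lexing, parsing, compiling, and interpreting. Lexing breaks the line you just typed into tokens (translator's note: identifiers, keywords, numbers, operators, and so on). The parser takes those tokens and builds a structure that represents the relationships between them (in this case, an abstract syntax tree). The compiler then takes the abstract syntax tree and turns it into one or more code objects. Finally, the interpreter takes each code object and executes the code it represents.

    Every command we type goes through these four steps before it is executed.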
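
    The four stages can be approximated from Python itself. The following is a minimal sketch, assuming Python 3 and using the standard-library tokenize and ast modules as stand-ins for the interpreter's internal lexer and parser; the source line and the "<example>" filename are made up for illustration.

```python
import ast
import io
import tokenize

source = "x = 1 + 2\n"

# 1. Lexing: split the source text into tokens.
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
# Keep only tokens with visible text (drops NEWLINE and ENDMARKER).
token_strings = [tok.string for tok in tokens if tok.string.strip()]

# 2. Parsing: build an abstract syntax tree from the source.
tree = ast.parse(source, mode="exec")

# 3. Compiling: turn the AST into a code object.
code = compile(tree, filename="<example>", mode="exec")

# 4. Interpreting: execute the code object in a fresh namespace.
namespace = {}
exec(code, namespace)

print(token_strings)
print(type(tree).__name__, type(code).__name__)
print(namespace["x"])
```

    Note that ast.parse does its own tokenizing internally; the separate tokenize call is there only to make the first stage visible.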

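    2. Function objects

    Objects are the basic building blocks of object-oriented theory. In many languages, functions can also be treated as objects: in JavaScript, and in functional programming languages such as Haskell and OCaml, a function is simply a special kind of value.

    If a function is an object, then it can be manipulated like any other object: allocated, freed, copied, assigned, and so on.

    "Functions are first-class objects" means that a function is an object like any other, just like a list or, say, an instance of MyObject. Since foo is an object, we can use it without calling it (in other words, foo and foo() are very different things). We can pass foo as an argument to another function, or bind it to a new name (other_function = foo). With functions this flexible, anything is possible!

    A function object is also distinct from a function call. A call is a dynamic process, whereas the function object is a static entity: you can apply operations to it, and which operations are available depends on its type. In object-oriented terms, you can use whatever interface (methods) the object provides.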
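
    The difference between foo and foo() can be shown in a few lines. This is a small sketch (Python 3 syntax, though it also runs on Python 2); apply_twice is a hypothetical helper introduced only for illustration.

```python
def foo(a):
    x = 3
    return x + a


def apply_twice(func, value):
    """Call func on value, then call it again on the result."""
    return func(func(value))


# Binding a new name does not call the function; both names
# refer to the same function object.
other_function = foo

result_direct = foo(1)               # 3 + 1
result_alias = other_function(1)     # same object, same result
result_passed = apply_twice(foo, 1)  # foo(foo(1)) == foo(4)

print(other_function is foo, result_direct, result_alias, result_passed)
```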

    Since a function is an object, what members does a function object have?

    >>> dir
    <built-in function dir>
    >>> dir(dir)
    ['__call__', '__class__', '__cmp__', '__delattr__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__le__', '__lt__', '__module__', '__name__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__self__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__']
    >>> dir(dir.func_code)
    
    Traceback (most recent call last):
      File "<pyshell#2>", line 1, in <module>
        dir(dir.func_code)
    AttributeError: 'builtin_function_or_method' object has no attribute 'func_code'
    >>> def foo(a):
    	x = 3
    	return x + a
    
    >>> foo
    <function foo at 0x0000000002E8F128>
    >>> dir(foo)
    ['__call__', '__class__', '__closure__', '__code__', '__defaults__', '__delattr__', '__dict__', '__doc__', '__format__', '__get__', '__getattribute__', '__globals__', '__hash__', '__init__', '__module__', '__name__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'func_closure', 'func_code', 'func_defaults', 'func_dict', 'func_doc', 'func_globals', 'func_name']
    >>> 
    

    The built-in function dir is documented as follows:

    dir([object])

    Without arguments, return the list of names in the current local scope. With an argument, attempt to return a list of valid attributes for that object.

    If the object has a method named __dir__(), this method will be called and must return the list of attributes. This allows objects that implement a custom __getattr__() or __getattribute__() function to customize the way dir() reports their attributes.

    If the object does not provide __dir__(), the function tries its best to gather information from the object’s __dict__ attribute, if defined, and from its type object. The resulting list is not necessarily complete, and may be inaccurate when the object has a custom __getattr__().

    The default dir() mechanism behaves differently with different types of objects, as it attempts to produce the most relevant, rather than complete, information:

    • If the object is a module object, the list contains the names of the module’s attributes.
    • If the object is a type or class object, the list contains the names of its attributes, and recursively of the attributes of its bases.
    • Otherwise, the list contains the object’s attributes’ names, the names of its class’s attributes, and recursively of the attributes of its class’s base classes.

    The resulting list is sorted alphabetically.
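
    The __dir__ hook described above can be tried out with a small class. This is a sketch, assuming Python 3; Proxy is a hypothetical class that forwards attribute lookups to a wrapped dictionary and reports the dictionary's keys through __dir__.

```python
class Proxy(object):
    """Hypothetical wrapper that exposes dict keys as attributes."""

    def __init__(self, data):
        self._data = data

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        try:
            return self._data[name]
        except KeyError:
            raise AttributeError(name)

    def __dir__(self):
        # dir() calls this and sorts whatever it returns.
        return list(self._data.keys()) + ["_data"]


p = Proxy({"beta": 2, "alpha": 1})
names = dir(p)
print(names)
print(p.alpha, p.beta)
```

    Because dir() sorts the result, the order in which __dir__ returns the names does not matter.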

      

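    The built-in function help is just as important: it displays the help text for built-in functions.

    First, let's see which modules the current Python process has loaded.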

    >>> import sys
    >>> for i in sys.modules.keys():
    ...     print "%20s:\t%s\n" % (i, sys.modules[i])
    ...     print "*"*100
    ...
    
                copy_reg:	<module 'copy_reg' from '/usr/lib/python2.7/copy_reg.pyc'>
    
    ****************************************************************************************************
             sre_compile:	<module 'sre_compile' from '/usr/lib/python2.7/sre_compile.pyc'>
    
    ****************************************************************************************************
                    _sre:	<module '_sre' (built-in)>
    
    ****************************************************************************************************
               encodings:	<module 'encodings' from '/usr/lib/python2.7/encodings/__init__.pyc'>
    
    ****************************************************************************************************
                    site:	<module 'site' from '/usr/lib/python2.7/site.pyc'>
    
    ****************************************************************************************************
             __builtin__:	<module '__builtin__' (built-in)>
    
    ****************************************************************************************************
               sysconfig:	<module 'sysconfig' from '/usr/lib/python2.7/sysconfig.pyc'>
    
    ****************************************************************************************************
                __main__:	<module '__main__' (built-in)>
    
    ****************************************************************************************************
     encodings.encodings:	None
    
    ****************************************************************************************************
                     abc:	<module 'abc' from '/usr/lib/python2.7/abc.pyc'>
    
    ****************************************************************************************************
               posixpath:	<module 'posixpath' from '/usr/lib/python2.7/posixpath.pyc'>
    
    ****************************************************************************************************
             _weakrefset:	<module '_weakrefset' from '/usr/lib/python2.7/_weakrefset.pyc'>
    
    ****************************************************************************************************
                   errno:	<module 'errno' (built-in)>
    
    ****************************************************************************************************
        encodings.codecs:	None
    
    ****************************************************************************************************
           sre_constants:	<module 'sre_constants' from '/usr/lib/python2.7/sre_constants.pyc'>
    
    ****************************************************************************************************
                      re:	<module 're' from '/usr/lib/python2.7/re.pyc'>
    
    ****************************************************************************************************
                 _abcoll:	<module '_abcoll' from '/usr/lib/python2.7/_abcoll.pyc'>
    
    ****************************************************************************************************
                   types:	<module 'types' from '/usr/lib/python2.7/types.pyc'>
    
    ****************************************************************************************************
                 _codecs:	<module '_codecs' (built-in)>
    
    ****************************************************************************************************
    encodings.__builtin__:	None
    
    ****************************************************************************************************
               _warnings:	<module '_warnings' (built-in)>
    
    ****************************************************************************************************
             genericpath:	<module 'genericpath' from '/usr/lib/python2.7/genericpath.pyc'>
    
    ****************************************************************************************************
                    stat:	<module 'stat' from '/usr/lib/python2.7/stat.pyc'>
    
    ****************************************************************************************************
               zipimport:	<module 'zipimport' (built-in)>
    
    ****************************************************************************************************
          _sysconfigdata:	<module '_sysconfigdata' from '/usr/lib/python2.7/_sysconfigdata.pyc'>
    
    ****************************************************************************************************
                warnings:	<module 'warnings' from '/usr/lib/python2.7/warnings.pyc'>
    
    ****************************************************************************************************
                UserDict:	<module 'UserDict' from '/usr/lib/python2.7/UserDict.pyc'>
    
    ****************************************************************************************************
         encodings.utf_8:	<module 'encodings.utf_8' from '/usr/lib/python2.7/encodings/utf_8.pyc'>
    
    ****************************************************************************************************
                     sys:	<module 'sys' (built-in)>
    
    ****************************************************************************************************
                  codecs:	<module 'codecs' from '/usr/lib/python2.7/codecs.pyc'>
    
    ****************************************************************************************************
                readline:	<module 'readline' from '/usr/lib/python2.7/lib-dynload/readline.i386-linux-gnu.so'>
    
    ****************************************************************************************************
       _sysconfigdata_nd:	<module '_sysconfigdata_nd' from '/usr/lib/python2.7/plat-i386-linux-gnu/_sysconfigdata_nd.pyc'>
    
    ****************************************************************************************************
                 os.path:	<module 'posixpath' from '/usr/lib/python2.7/posixpath.pyc'>
    
    ****************************************************************************************************
           sitecustomize:	<module 'sitecustomize' from '/usr/lib/python2.7/sitecustomize.pyc'>
    
    ****************************************************************************************************
                  signal:	<module 'signal' (built-in)>
    
    ****************************************************************************************************
               traceback:	<module 'traceback' from '/usr/lib/python2.7/traceback.pyc'>
    
    ****************************************************************************************************
               linecache:	<module 'linecache' from '/usr/lib/python2.7/linecache.pyc'>
    
    ****************************************************************************************************
                   posix:	<module 'posix' (built-in)>
    
    ****************************************************************************************************
       encodings.aliases:	<module 'encodings.aliases' from '/usr/lib/python2.7/encodings/aliases.pyc'>
    
    ****************************************************************************************************
              exceptions:	<module 'exceptions' (built-in)>
    
    ****************************************************************************************************
               sre_parse:	<module 'sre_parse' from '/usr/lib/python2.7/sre_parse.pyc'>
    
    ****************************************************************************************************
                      os:	<module 'os' from '/usr/lib/python2.7/os.pyc'>
    
    ****************************************************************************************************
                _weakref:	<module '_weakref' (built-in)>
    
    ****************************************************************************************************
    

      

    The following code lists the members of the __builtin__ module:

    >>> num = 0
    >>> for i in dir(sys.modules["__builtin__"]):
    ...     print "%20s	" % i,
    ...     num += 1
    ...     if num == 5:
    ...             print ""
    ...             num = 0
    ... 
         ArithmeticError	      AssertionError	      AttributeError	       BaseException	         BufferError	
            BytesWarning	  DeprecationWarning	            EOFError	            Ellipsis	    EnvironmentError	
               Exception	               False	  FloatingPointError	       FutureWarning	       GeneratorExit	
                 IOError	         ImportError	       ImportWarning	    IndentationError	          IndexError	
                KeyError	   KeyboardInterrupt	         LookupError	         MemoryError	           NameError	
                    None	      NotImplemented	 NotImplementedError	             OSError	       OverflowError	
    PendingDeprecationWarning	      ReferenceError	        RuntimeError	      RuntimeWarning	       StandardError	
           StopIteration	         SyntaxError	       SyntaxWarning	         SystemError	          SystemExit	
                TabError	                True	           TypeError	   UnboundLocalError	  UnicodeDecodeError	
      UnicodeEncodeError	        UnicodeError	UnicodeTranslateError	      UnicodeWarning	         UserWarning	
              ValueError	             Warning	   ZeroDivisionError	                   _	           __debug__	
                 __doc__	          __import__	            __name__	         __package__	                 abs	
                     all	                 any	               apply	          basestring	                 bin	
                    bool	              buffer	           bytearray	               bytes	            callable	
                     chr	         classmethod	                 cmp	              coerce	             compile	
                 complex	           copyright	             credits	             delattr	                dict	
                     dir	              divmod	           enumerate	                eval	            execfile	
                    exit	                file	              filter	               float	              format	
               frozenset	             getattr	             globals	             hasattr	                hash	
                    help	                 hex	                  id	               input	                 int	
                  intern	          isinstance	          issubclass	                iter	                 len	
                 license	                list	              locals	                long	                 map	
                     max	          memoryview	                 min	                next	              object	
                     oct	                open	                 ord	                 pow	               print	
                property	                quit	               range	           raw_input	              reduce	
                  reload	                repr	            reversed	               round	                 set	
                 setattr	               slice	              sorted	        staticmethod	                 str	
                     sum	               super	               tuple	                type	              unichr	
                 unicode	                vars	              xrange	                 zip	>>> 
    

      

      

    3. How the dir built-in is implemented

    In /Python-2.7.8/Objects/object.c:

    /* Implementation of dir() -- if obj is NULL, returns the names in the current
       (local) scope.  Otherwise, performs introspection of the object: returns a
       sorted list of attribute names (supposedly) accessible from the object
    */
    PyObject *
    PyObject_Dir(PyObject *obj)
    {
        PyObject * result;

        if (obj == NULL)
            /* no object -- introspect the locals */
            result = _dir_locals();
        else
            /* object -- introspect the object */
            result = _dir_object(obj);

        assert(result == NULL || PyList_Check(result));

        if (result != NULL && PyList_Sort(result) != 0) {
            /* sorting the list failed */
            Py_DECREF(result);
            result = NULL;
        }

        return result;
    }
    

      

    As you can see, this matches the description given by help(dir) almost exactly.
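
    The two branches of PyObject_Dir (no argument versus an object) and the final sort can be observed from Python. This is a small sketch, assuming Python 3; show_locals is a hypothetical function whose local names are made up for illustration.

```python
import sys


def show_locals():
    # With no argument, dir() reports the names in the local scope.
    second = 2
    first = 1
    return dir()


local_names = show_locals()

# With an argument, dir() introspects the object and returns a sorted list.
module_names = dir(sys)
is_list = isinstance(module_names, list)
is_sorted = module_names == sorted(module_names)

print(local_names)
print(is_list, is_sorted)
```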

    >>> def foo(a):
    ...     if a > x:
    ...             return a/1024
    ...     else:
    ...             return a
    ... 
    >>> type(foo)
    <type 'function'>
    >>> dir(foo)
    ['__call__', '__class__', '__closure__', '__code__', '__defaults__', '__delattr__', '__dict__', '__doc__', '__format__', '__get__', '__getattribute__', '__globals__', '__hash__', '__init__', '__module__', '__name__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'func_closure', 'func_code', 'func_defaults', 'func_dict', 'func_doc', 'func_globals', 'func_name']
    >>> foo.__call__
    <method-wrapper '__call__' of function object at 0xb7420df4>
    >>> foo.__str__
    <method-wrapper '__str__' of function object at 0xb7420df4>
    >>> foo
    <function foo at 0xb7420df4>
    >>> foo.func_closure
    >>> type(foo.func_closure)
    <type 'NoneType'>
    >>> type(foo.func_code)
    <type 'code'>
    >>> foo.func_code
    <code object foo at 0xb7409d10, file "<stdin>", line 1>
    >>> dir(foo.func_code)
    ['__class__', '__cmp__', '__delattr__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__le__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'co_argcount', 'co_cellvars', 'co_code', 'co_consts', 'co_filename', 'co_firstlineno', 'co_flags', 'co_freevars', 'co_lnotab', 'co_name', 'co_names', 'co_nlocals', 'co_stacksize', 'co_varnames']
    >>> foo.func_code.co_argcount
    1
    >>> foo.func_code.co_cellvars
    ()
    >>> foo.func_code.co_code
    '|\x00\x00t\x00\x00k\x04\x00r\x14\x00|\x00\x00d\x01\x00\x15S|\x00\x00Sd\x00\x00S'
    >>> foo.func_code.co_consts
    (None, 1024)
    >>> foo.func_code.co_filename
    '<stdin>'
    >>> foo.func_code.co_firstlineno
    1
    >>> foo.func_code.co_flags
    67
    >>> foo.func_code.co_freevars
    ()
    >>> foo.func_code.co_lnotab
    '\x00\x01\x0c\x01\x08\x02'
    >>> foo.func_code.co_name
    'foo'
    >>> foo.func_code.co_names
    ('x',)
    >>> foo.func_code.co_nlocals
    1
    >>> foo.func_code.co_stacksize
    2
    >>> foo.func_code.co_varnames
    ('a',)
    >>> 
    

      

    Of these, foo.func_code.co_code is the Python bytecode itself.

    Help on built-in function ord in module __builtin__:
    
    ord(...)
        ord(c) -> integer
        
        Return the integer ordinal of a one-character string.
    

      

    >>> [ord(i) for i in foo.func_code.co_code]
    [124, 0, 0, 116, 0, 0, 107, 4, 0, 114, 20, 0, 124, 0, 0, 100, 1, 0, 21, 83, 124, 0, 0, 83, 100, 0, 0, 83]
    

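    These are the bytes that make up the Python bytecode. The interpreter loops over them, looks up the instruction for each byte, and executes it. Note that the bytecode itself contains no Python objects and no references to objects, only numeric opcodes and their numeric arguments.

    If you want to know what the bytecode means, you can open the CPython interpreter source (ceval.c) and look up what 100 means, what 1 means, what 0 means, and so on. The numeric opcode values themselves are defined in Include/opcode.h.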
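
    As a rough illustration of that lookup, the sketch below decodes the byte list shown above by hand. The opcode names in OPNAMES are the ones that appear in the dis output later in this article; HAVE_ARGUMENT = 90 and the two-byte little-endian argument are details of the CPython 2.7 bytecode format and do not apply to the wordcode used by Python 3.6 and later. The decode helper is hypothetical, and the script itself runs on Python 3.

```python
# Byte values copied from the co_code listing above (CPython 2.7 format).
CODE = [124, 0, 0, 116, 0, 0, 107, 4, 0, 114, 20, 0, 124, 0, 0,
        100, 1, 0, 21, 83, 124, 0, 0, 83, 100, 0, 0, 83]

# Only the opcodes that occur in this function.
OPNAMES = {
    21: "BINARY_DIVIDE",
    83: "RETURN_VALUE",
    100: "LOAD_CONST",
    107: "COMPARE_OP",
    114: "POP_JUMP_IF_FALSE",
    116: "LOAD_GLOBAL",
    124: "LOAD_FAST",
}

# In CPython 2.7, opcodes >= 90 are followed by a 2-byte argument.
HAVE_ARGUMENT = 90


def decode(code):
    """Return a list of (offset, opcode name, argument or None)."""
    instructions = []
    i = 0
    while i < len(code):
        offset = i
        op = code[i]
        i += 1
        arg = None
        if op >= HAVE_ARGUMENT:
            # Low byte first, then high byte.
            arg = code[i] | (code[i + 1] << 8)
            i += 2
        instructions.append((offset, OPNAMES[op], arg))
    return instructions


instructions = decode(CODE)
for offset, name, arg in instructions:
    print("%4d %-20s %s" % (offset, name, "" if arg is None else arg))
```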

    >>> import dis
    >>> dir(dis)
    ['EXTENDED_ARG', 'HAVE_ARGUMENT', '__all__', '__builtins__', '__doc__', '__file__', '__name__', '__package__', '_have_code', '_test', 'cmp_op', 'dis', 'disassemble', 'disassemble_string', 'disco', 'distb', 'findlabels', 'findlinestarts', 'hascompare', 'hasconst', 'hasfree', 'hasjabs', 'hasjrel', 'haslocal', 'hasname', 'opmap', 'opname', 'sys', 'types']
    >>> type(dis.dis)
    <type 'function'>
    >>> dir(dis.dis)
    ['__call__', '__class__', '__closure__', '__code__', '__defaults__', '__delattr__', '__dict__', '__doc__', '__format__', '__get__', '__getattribute__', '__globals__', '__hash__', '__init__', '__module__', '__name__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'func_closure', 'func_code', 'func_defaults', 'func_dict', 'func_doc', 'func_globals', 'func_name']
    >>> [ord(i) for i in dis.dis.func_code.co_code]
    [124, 0, 0, 100, 1, 0, 107, 8, 0, 114, 23, 0, 116, 1, 0, 131, 0, 0, 1, 100, 1, 0, 83, 116, 2, 0, 124, 0, 0, 116, 3, 0, 106, 4, 0, 131, 2, 0, 114, 53, 0, 124, 0, 0, 106, 5, 0, 125, 0, 0, 110, 0, 0, 116, 6, 0, 124, 0, 0, 100, 2, 0, 131, 2, 0, 114, 80, 0, 124, 0, 0, 106, 7, 0, 125, 0, 0, 110, 0, 0, 116, 6, 0, 124, 0, 0, 100, 3, 0, 131, 2, 0, 114, 107, 0, 124, 0, 0, 106, 8, 0, 125, 0, 0, 110, 0, 0, 116, 6, 0, 124, 0, 0, 100, 4, 0, 131, 2, 0, 114, 246, 0, 124, 0, 0, 106, 9, 0, 106, 10, 0, 131, 0, 0, 125, 1, 0, 124, 1, 0, 106, 11, 0, 131, 0, 0, 1, 120, 174, 0, 124, 1, 0, 68, 93, 85, 0, 92, 2, 0, 125, 2, 0, 125, 3, 0, 116, 2, 0, 124, 3, 0, 116, 12, 0, 131, 2, 0, 114, 154, 0, 100, 5, 0, 124, 2, 0, 22, 71, 72, 121, 14, 0, 116, 13, 0, 124, 3, 0, 131, 1, 0, 1, 87, 110, 28, 0, 4, 116, 14, 0, 107, 10, 0, 114, 234, 0, 1, 125, 4, 0, 1, 100, 6, 0, 71, 124, 4, 0, 71, 72, 110, 1, 0, 88, 72, 113, 154, 0, 113, 154, 0, 87, 110, 78, 0, 116, 6, 0, 124, 0, 0, 100, 7, 0, 131, 2, 0, 114, 18, 1, 116, 15, 0, 124, 0, 0, 131, 1, 0, 1, 110, 50, 0, 116, 2, 0, 124, 0, 0, 116, 16, 0, 131, 2, 0, 114, 46, 1, 116, 17, 0, 124, 0, 0, 131, 1, 0, 1, 110, 22, 0, 116, 14, 0, 100, 8, 0, 116, 18, 0, 124, 0, 0, 131, 1, 0, 106, 19, 0, 22, 130, 2, 0, 100, 1, 0, 83]
    

      

    >>> dir(dis.dis.func_code)
    ['__class__', '__cmp__', '__delattr__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__le__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'co_argcount', 'co_cellvars', 'co_code', 'co_consts', 'co_filename', 'co_firstlineno', 'co_flags', 'co_freevars', 'co_lnotab', 'co_name', 'co_names', 'co_nlocals', 'co_stacksize', 'co_varnames']
    >>> dis.dis.func_code.co_filename
    '/usr/lib/python2.7/dis.py'
    >>> dis.dis.func_code.co_consts
    ('Disassemble classes, methods, functions, or code.\n\n    With no argument, disassemble the last traceback.\n\n    ', None, 'im_func', 'func_code', '__dict__', 'Disassembly of %s:', 'Sorry:', 'co_code', "don't know how to disassemble %s objects")
    >>> dis.dis.func_code.co_names
    ('None', 'distb', 'isinstance', 'types', 'InstanceType', '__class__', 'hasattr', 'im_func', 'func_code', '__dict__', 'items', 'sort', '_have_code', 'dis', 'TypeError', 'disassemble', 'str', 'disassemble_string', 'type', '__name__')
    >>> dis.dis.func_code.co_varnames
    ('x', 'items', 'name', 'x1', 'msg')
    >>> dis.dis.func_code.co_stacksize
    6
    >>> dis.dis.func_code.co_nlocals
    5
    

      

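    In fact, dis.dis itself is nothing more than a sequence of bytecode: the Python interpreter executes it, and that is how it does its job.

    Now let's use dis.dis to disassemble the bytecode.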

    >>> dis.dis(foo.func_code.co_code)
              0 LOAD_FAST           0 (0)
              3 LOAD_GLOBAL         0 (0)
              6 COMPARE_OP          4 (>)
              9 POP_JUMP_IF_FALSE    20
             12 LOAD_FAST           0 (0)
             15 LOAD_CONST          1 (1)
             18 BINARY_DIVIDE  
             19 RETURN_VALUE   
        >>   20 LOAD_FAST           0 (0)
             23 RETURN_VALUE   
             24 LOAD_CONST          0 (0)
             27 RETURN_VALUE  
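
    The values in parentheses above are just the raw arguments repeated, because a bare bytecode string carries none of the code object's tables (co_varnames, co_names, co_consts), so dis has nothing to resolve the indices against. Passing the function or its code object instead lets dis resolve them. The following is a sketch, assuming Python 3, where func_code is spelled __code__ and the opcode set differs from 2.7, so the exact listing will not match the one above.

```python
import dis


def foo(a):
    if a > x:  # x is a global that is never defined; foo is not called here
        return a / 1024
    else:
        return a


code = foo.__code__

# The tables that a raw bytecode string lacks.
print(code.co_varnames, code.co_names, code.co_consts)

# With the full code object, each instruction's argument is resolved
# to the name or constant it refers to (see the argval field).
instructions = list(dis.get_instructions(foo))
for ins in instructions:
    print(ins.offset, ins.opname, ins.argval)
```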
    

      

      

  • Original article: https://www.cnblogs.com/long123king/p/3837686.html