zoukankan      html  css  js  c++  java
  • implementation of python list

    http://www.laurentluce.com/posts/python-list-implementation/


    Python list implementation

    March 10, 2011

    This post describes the CPython implementation of the list object. CPython is the most used Python implementation.

    Lists in Python are powerful and it is interesting to see how they are implemented internally.

    Following is a simple Python script appending some integers to a list and printing them.

    01 >>> l = []
    02 >>> l.append(1)
    03 >>> l.append(2)
    04 >>> l.append(3)
    05 >>> l
    06 [1, 2, 3]
    07 >>> for in l:
    08 ...   print e
    09 ...
    10 1
    11 2
    12 3

    As you can see, lists are iterable.

    List object C structure

    A list object in CPython is represented by the following C structure. ob_item is a list of pointers to the list elements. allocated is the number of slots allocated in memory.

    1 typedef struct {
    2     PyObject_VAR_HEAD
    3     PyObject **ob_item;
    4     Py_ssize_t allocated;
    5 } PyListObject;

    List initialization

    Let’s look at what happens when we initialize an empty list. e.g. l = [].

    1 arguments: size of the list = 0
    2 returns: list object = []
    3 PyListNew:
    4     nbytes = size * size of global Python object = 0
    5     allocate new list object
    6     allocate list of pointers (ob_item) of size nbytes = 0
    7     clear ob_item
    8     set list's allocated var to 0 = 0 slots
    9     return list object

    It is important to notice the difference between allocated slots and the size of the list. The size of a list is the same as len(l). The number of allocated slots is what has been allocated in memory. Often, you will see that allocated can be greater than size. This is to avoid needing calling realloc each time a new elements is appended to the list. We will see more about that later.

    Append

    We append an integer to the list: l.append(1). What happens? The internal C function app1() is called:

    1 arguments: list object, new element
    2 returns: 0 if OK, -1 if not
    3 app1:
    4     n = size of list
    5     call list_resize() to resize the list to size n+1 = 0 + 1 = 1
    6     list[n] = list[0] = new element
    7     return 0

    Let’s look at list_resize(). It over-allocates memory to avoid calling list_resize too many time. The growth pattern of the list is: 0, 4, 8, 16, 25, 35, 46, 58, 72, 88, …

    1 arguments: list object, new size
    2 returns: 0 if OK, -1 if not
    3 list_resize:
    4     new_allocated = (newsize >> 3) + (newsize < 9 ? 3 : 6) = 3
    5     new_allocated += newsize = 3 + 1 = 4
    6     resize ob_item (list of pointers) to size new_allocated
    7     return 0

    4 slots are now allocated to contain elements and the first one is the integer 1. You can see on the following diagram that l[0] points to the integer object that we just appended. The dashed squares represent the slots allocated but not used yet.

    Append operation complexity is O(1).

    Python lists

    We continue by adding one more element: l.append(2). list_resize is called with n+1 = 2 but because the allocated size is 4, there is no need to allocate more memory. Same thing happens when we add 2 more integers: l.append(3), l.append(4). The following diagram shows what we have so far.

    Python lists

    Insert

    Let’s insert a new integer (5) at position 1: l.insert(1,5) and look at what happens internally. ins1() is called:

    1 arguments: list object, where, new element
    2 returns: 0 if OK, -1 if not
    3 ins1:
    4     resize list to size n+1 = 5 -> 4 more slots will be allocated
    5     starting at the last element up to the offset where, right shift each element
    6     set new element at offset where
    7     return 0

    Python lists

    The dashed squares represent the slots allocated but not used yet. Here, 8 slots are allocated but the size or length of the list is only 5.

    Insert operation complexity is O(n).

    Pop

    When you pop the last element: l.pop(), listpop() is called. list_resize is called inside listpop() and if the new size is less than half of the allocated size then the list is shrunk.

    1 arguments: list object
    2 returns: element popped
    3 listpop:
    4     if list empty:
    5         return null
    6     resize list with size 5 - 1 = 4. 4 is not less than 8/2 so no shrinkage
    7     set list object size to 4
    8     return last element

    Pop operation complexity is O(1).

    Python lists

    You can observe that slot 4 still points to the integer but the important thing is the size of the list which is now 4.

    Let’s pop one more element. In list_resize(), size – 1 = 4 – 1 = 3 is less than half of the allocated slots so the list is shrunk to 6 slots and the new size of the list is now 3.

    You can observe that slot 3 and 4 still point to some integers but the important thing is the size of the list which is now 3.

    Python lists

    Remove

    Python list object has a method to remove a specific element: l.remove(5). listremove() is called.

    1 arguments: list object, element to remove
    2 returns none if OK, null if not
    3 listremove:
    4     loop through each list element:
    5         if correct element:
    6             slice list between element's slot and element's slot + 1
    7             return none
    8     return null

    To slice the list and remove the element, list_ass_slice() is called and it is interesting to see how it works. Here, low offset is 1 and high offset is 2 as we are removing the element 5 at position 1.

    1 arguments: list object, low offset, high offset
    2 returns: 0 if OK
    3 list_ass_slice:
    4     copy integer 5 to recycle list to dereference it
    5     shift elements from slot 2 to slot 1
    6     resize list to 5 slots
    7     return 0

    Remove operation complexity is O(n).

    Python lists

    That’s it for now. I hope you enjoyed the article. Please write a comment if you have any feedback. If you need help with a project written in Python or with building a new web service, I am available as a freelancer: LinkedIn profile. Follow me on Twitter @laurentluce.

    tags: 
    posted in Uncategorized by Laurent Luce

    Follow comments via the RSS Feed | Leave a comment | Trackback URL

  • 相关阅读:
    Flume案例:目录文件到HDFS
    Flume案例:本地文件到HDFS
    12c多租户架构下部署GoldenGate12c
    ogg_sqlserver_oracle_winerver2008
    ogg异构oracle-mysql
    搜索引擎 反向索引,为什么称之为反向索引?
    ORACLE 11G 之DATAGUARD搭建逻辑standby
    Java并发锁控制API详解
    Java中的锁分类与使用
    Spring Cloud入门(含案例介绍及源码)
  • 原文地址:https://www.cnblogs.com/threef/p/3278749.html
Copyright © 2011-2022 走看看