Nutshell

Thoughts come and go, words stay eternal

28 Aug 2022

[python internal] Implementation of Type List

Abstract

  1. The only way to shrink list without delete element is making a copy of the old one, like l = l[:len(l)] or a = l.copy()
  2. Large list might has lots of allocated but unused space
  3. Deleting element has an ability of shrinking list, so codes like l.remove(i) for i in range(0, len(l) will skip some elements and cause index out of range exception

Metrics

Operation Average Case Amortized Worst Case Increase Space Inplace Resize
copy O(n) O(n) O(n) N I
append O(1) O(1) O(1) Y I|-
clear O(n) O(n) - Y D
count O(n) O(n) - N -
extend O(k) O(k) O(k) Y I|-
index O(n) O(n) - N -
insert O(n) O(n) O(1) Y I|-
pop O(n) O(n) Y D|-
remove O(n) O(n) Y D|-
reverse O(n) O(n) Y -
sort O(n log n) O(n log n) - Y -
__add__ O(n + k) O(n + k) O(n + k) N I
__iadd__ O(k) O(k) O(k) Y I-
__mul__ O(n*k) O(n*k) O(n*k) N I
__imul__ O(n*(k-1)) O(n*(k-1)) O(n*(k-1)) Y I-

Comment:

  1. Resize:
    • I: realloc memory (allocate more memory)
    • -: not allocated memory
    • D: deallocated memory
    • N: malloc memory ( new create)
  2. Increase Space: the number of elements that added after call the function, not the total number, and not the memory size
    • pop(k) / remove(x): use memmove and free delete element, memmove will copy l[k+1:] to temporary memory area and then copy into l[k:]
    • pop(k): O(1) ?, it use memmove to insert / delete element when k not the last index in list, performance depend on memmove
    • remove(x): O(n), need to found the index of x in list first, then using memmove to delete element
    • clear(): O(n), need to traverse all objects then free every one, release memory immediately
  3. Get Memory Usage
    1. sys.getsizeof(object): object’s size + garbage collector overhead (notmally, 16)
    2. object.__sizeof__(): object’s size

Source Code

background:

  1. python-3.10.6

Shrinking list


shrinking list by deleting elements:

  1. function remove(x) and pop(k) , when i == 0, it has bad performance, and it’s better to create new one
// Python-3.10.6/Objects/listobject.c
//
// algorithm about how list resize, include expansion and shrink

static int
list_resize(PyListObject *self, Py_ssize_t newsize)
{

	...
	
    if (allocated >= newsize && newsize >= (allocated >> 1)) {
        assert(self->ob_item != NULL || newsize == 0);
        Py_SET_SIZE(self, newsize);
        return 0;
    }

    /* This over-allocates proportional to the list size, making room
     * for additional growth.  The over-allocation is mild, but is
     * enough to give linear-time amortized behavior over a long
     * sequence of appends() in the presence of a poorly-performing
     * system realloc().
     * Add padding to make the allocated size multiple of 4.
     * The growth pattern is:  0, 4, 8, 16, 24, 32, 40, 52, 64, 76, ...
     * Note: new_allocated won't overflow because the largest possible value
     *       is PY_SSIZE_T_MAX * (9 / 8) + 6 which always fits in a size_t.
     */
    new_allocated = ((size_t)newsize + (newsize >> 3) + 6) & ~(size_t)3;
    /* Do not overallocate if the new size is closer to overallocated size
     * than to the old size.
     */
    if (newsize - Py_SIZE(self) > (Py_ssize_t)(new_allocated - newsize))
        new_allocated = ((size_t)newsize + 3) & ~(size_t)3;

    if (newsize == 0)
        new_allocated = 0;
    num_allocated_bytes = new_allocated * sizeof(PyObject *);
    items = (PyObject **)PyMem_Realloc(self->ob_item, num_allocated_bytes);
    if (items == NULL) {
        PyErr_NoMemory();
        return -1;
    }
    self->ob_item = items;
    Py_SET_SIZE(self, newsize);
    self->allocated = new_allocated;
    return 0;
}

Notice:

  1. list.remove(x) / list.pop(k) internal implementation
  2. l[i:j] = a (a is list object) internal implementation
// Python-3.10.6/Objects/listobject.c
//

static int
list_ass_slice(PyListObject *a, Py_ssize_t ilow, Py_ssize_t ihigh, PyObject *v)
{

	...
	
    PyObject *recycle_on_stack[8];
    PyObject **recycle = recycle_on_stack; /* will allocate more if needed */
    PyObject **item;
    PyObject **vitem = NULL;
    PyObject *v_as_SF = NULL; /* PySequence_Fast(v) */
    Py_ssize_t n; /* # of elements in replacement list */
    Py_ssize_t norig; /* # of elements in list getting replaced */
    Py_ssize_t d; /* Change in size */
    Py_ssize_t k;
    size_t s;
    int result = -1;            /* guilty until proved innocent */
#define b ((PyListObject *)v)
    if (v == NULL)
        n = 0;
    else {
        if (a == b) {
            /* Special case "a[i:j] = a" -- copy b first */
            v = list_slice(b, 0, Py_SIZE(b));
            if (v == NULL)
                return result;
            result = list_ass_slice(a, ilow, ihigh, v);
            Py_DECREF(v);
            return result;
        }
        v_as_SF = PySequence_Fast(v, "can only assign an iterable");
        if(v_as_SF == NULL)
            goto Error;
        n = PySequence_Fast_GET_SIZE(v_as_SF);
        vitem = PySequence_Fast_ITEMS(v_as_SF);
    }
    if (ilow < 0)
        ilow = 0;
    else if (ilow > Py_SIZE(a))
        ilow = Py_SIZE(a);

    if (ihigh < ilow)
        ihigh = ilow;
    else if (ihigh > Py_SIZE(a))
        ihigh = Py_SIZE(a);

    norig = ihigh - ilow;
    assert(norig >= 0);
    d = n - norig;

    if (Py_SIZE(a) + d == 0) {
        Py_XDECREF(v_as_SF);
        return _list_clear(a);
    }
    item = a->ob_item;
    /* recycle the items that we are about to remove */
    s = norig * sizeof(PyObject *);
    /* If norig == 0, item might be NULL, in which case we may not memcpy from it. */
    if (s) {
        if (s > sizeof(recycle_on_stack)) {
            recycle = (PyObject **)PyMem_Malloc(s);
            if (recycle == NULL) {
                PyErr_NoMemory();
                goto Error;
            }
        }
        memcpy(recycle, &item[ilow], s);
    }

    if (d < 0) { /* Delete -d items */
        Py_ssize_t tail;
        tail = (Py_SIZE(a) - ihigh) * sizeof(PyObject *);
        memmove(&item[ihigh+d], &item[ihigh], tail);
        if (list_resize(a, Py_SIZE(a) + d) < 0) {
            memmove(&item[ihigh], &item[ihigh+d], tail);
            memcpy(&item[ilow], recycle, s);
            goto Error;
        }
        item = a->ob_item;
    }
    else if (d > 0) { /* Insert d items */
        k = Py_SIZE(a);
        if (list_resize(a, k+d) < 0)
            goto Error;
        item = a->ob_item;
        memmove(&item[ihigh+d], &item[ihigh],
            (k - ihigh)*sizeof(PyObject *));
    }
    for (k = 0; k < n; k++, ilow++) {
        PyObject *w = vitem[k];
        Py_XINCREF(w);
        item[ilow] = w;
    }
    for (k = norig - 1; k >= 0; --k)
        Py_XDECREF(recycle[k]);
    result = 0;
 Error:
    if (recycle != recycle_on_stack)
        PyMem_Free(recycle);
    Py_XDECREF(v_as_SF);
    return result;
#undef b
}

>>>
>>> # shrinking list by deleting elements
>>>
>>> l = [1]
>>> l.append(2)
>>> sys.getsizeof(l)
120
>>> l.pop(len(l)-1)
2
>>> sys.getsizeof(l)
88
>>>
>>> # comparing with creating new one
>>>
>>> l = l[:len(l)]
>>> sys.getsizeof(l)
64
>>>

shrinking list by created new one:

  1. list_preallocate_exact call by [].extend() or l += l1
  2. list_new_prealloc call by creating new ones, like l = [1,2,3] or l = [1,2] + [3]
  3. list_new_prealloc and list_preallocate_exact only allocated difference size when the length of list is odd
// Python-3.10.6/Objects/listobject.c

static int
list_preallocate_exact(PyListObject *self, Py_ssize_t size)
{
    assert(self->ob_item == NULL);
    assert(size > 0);

    /* Since the Python memory allocator has granularity of 16 bytes on 64-bit
     * platforms (8 on 32-bit), there is no benefit of allocating space for
     * the odd number of items, and there is no drawback of rounding the
     * allocated size up to the nearest even number.
     */
    size = (size + 1) & ~(size_t)1;
    PyObject **items = PyMem_New(PyObject*, size);
    if (items == NULL) {
        PyErr_NoMemory();
        return -1;
    }
    self->ob_item = items;
    self->allocated = size;
    return 0;
}

// Python-3.10.6/Objects/listobject.c

static PyObject *
list_new_prealloc(Py_ssize_t size)
{
    assert(size > 0);
    PyListObject *op = (PyListObject *) PyList_New(0);
    if (op == NULL) {
        return NULL;
    }
    assert(op->ob_item == NULL);
    op->ob_item = PyMem_New(PyObject *, size);
    if (op->ob_item == NULL) {
        Py_DECREF(op);
        return PyErr_NoMemory();
    }
    op->allocated = size;
    return (PyObject *) op;
}

Test: Python List Resize

>>> # list_preallocate_exact
>>> l = []
>>> l.extend([1,2,3])
>>> sys.getsizeof(l)
88
>>> l.__sizeof__()
72
>>> l = l[:len(l)]
>>> l.__sizeof__()
64
>>> sys.getsizeof(l)
80
>>>
>>> # list_new_prealloc
>>> l = [1,2,3]
>>> sys.getsizeof(l)
88
>>> l.__sizeof__()
72
>>> l = l[:len(l)]
>>> sys.getsizeof(l)
80
>>> 
>>> 
>>> # only diff when the length of list is odd
>>> 
>>> # list_preallocate_exact
>>> l = []
>>> l.extend([1,2,3,4,5,6])
>>> sys.getsizeof(l)
104
>>> l.__sizeof__()
88
>>> l = l[:len(l)]
>>> sys.getsizeof(l)
104
>>> l.__sizeof__()
88
>>>
>>> # list_new_prealloc
>>> l = [1,2,3,4,5,6]
>>> sys.getsizeof(l)
104
>>> l.__sizeof__()
88
>>> l = l[:len(l)]
>>> sys.getsizeof(l)
104
>>> l.__sizeof__()
88
>>>

Simulation: How List Delete Element:

nums = [1,2,3,4,5,6]
i = 0
for j in range(len(nums)): # all of done by memmove
    if nums[j] != 4:
        nums[i] = nums[j]
        i = i + 1
    nums = nums[:i]

Real World Python

Common Bug in Range Delete Element

# P. IndexError: list index out of range
nums = [1,2,3,4,5,6]
for i in range(len(nums)):
    if nums[i] == 9:
        nums.pop(i)


# P. skip check some element
nums = [1,2,3,4,4,5,6]
# output: [1,2,3,4,5,6]
for v in nums:
    if v == 4:
        nums.remove(v)

Remove Element in Range Safely

# remove element in range safely
nums = [1,2,3,4,5,6]
i = 0
while i < len(nums):
    if nums[i] == 4:
        nums.pop(i)
    else:
        i = i + 1

# best way to remove element in range
nums = [1,2,3,4,5,6]
for i in range(len(nums) - 1, -1, -1):
    if nums[i] == 4:
        nums.pop(i)

# P. too many Load / Store ops cause bad performance (> 0.5x slower)
nums = [1,2,3,4,5,6]
i = 0
for j in range(len(nums)):
    if nums[j] != 4:
        nums[i] = nums[j]
        i = i + 1
    # result is nums[:i]

FAQ

Q: Difference between list.clear() and l = []
A: Two of them are free memory by setting item as empty list [], but list.clear() will looping and free all PyObjects, and memory will be relased immedialy (when function call), l = [] is making ListObject unreachable and let it recycle by GC in background.


Q: Difference between listobject (list type) and arraymodule (array type) ?
A: Arrays are sequence types and behave very much like lists, except that the type of objects stored in them is constrained. Also, they use different resize algorithm, and ArrayModule require less memory than ListObject when has same number of objects.

References

  1. dis - Disassembler for Python bytecode
  2. python - sys.getsizeof()
  3. python - TimeComplexity
  4. python - PerformanceTips
  5. linux - realloc
  6. linux - memmove
  7. minilibc - the asm simulation of memmove
  8. python3 - array
  9. web - Compiler Explorer