In python, we have more than one data structure type, to save items in an ordered way. Why do we need different types and how it is different? When to use them?. In this blog we will look into all these answers, and have an understanding of each of its the commonalities and need of two different data structure types of List and Tuples. Also how the memory is managed for both of these types.
Note: Below points are referenced for CPython Implementation
Common Between List and Tuple
Both are sequence data type
Each item stored in a list can be of any data type
Elements can be accessed by indexing and slicing
Both allow duplicates
Let's see how each are different with examples.
List
Below are some highly talked notes about Lists.
Mutable in Nature
Sortable
Syntax: list(<values>) or [<values>]
Example:
Definition
Define and usage of the list. As said list can be used to save any different types of object. Like below.
Source:
Output:
<class 'list'>
[1, 2, 'three', 4, 5]
Slicing
Sample ways to access the contents from the list.
Source:
Output:
[1, 2]
[4, 5]
Mutable
We can edit the values in the list.
Source:
Output:
<class 'list'>
[100, 2, 'three', 4, 5]
Memory Allocation
Now we have come to see the crux of this blog, how the memory is managed while storing the items in the list. First we will see how much currently the memory is allocated, later how each time the size changes when new items are allocated.
Let's check much currently the memory is allocated.
Source:
96
Common function to see the how much memory is allocated before and after values append.
Please closely observe the size and memory address of the list before and post update.
Note: In the below scenario the memory address didn't change, but it will always not be the case
Output:
address: 140509666477824
before size of list: 96
updated list: [100, 2, 'three', 4, 5, 6]
address remains the same: 140509666477824
after size of list: 128
address: 140509666477824
before size of list: 128
updated list: [100, 2, 'three', 4, 5, 6, 7]
address remains the same: 140509666477824
after size of list: 128
address: 140509666477824
before size of list: 128
updated list: [100, 2, 'three', 4, 5, 6, 7, 8]
address remains the same: 140509666477824
after size of list: 128
address: 140509666477824
before size of list: 128
updated list: [100, 2, 'three', 4, 5, 6, 7, 8, 9]
address remains the same: 140509666477824
after size of list: 128
address: 140509666477824
before size of list: 128
updated list: [100, 2, 'three', 4, 5, 6, 7, 8, 9, 10]
address remains the same: 140509666477824
after size of list: 192
address: 140509666477824
before size of list: 192
updated list: [100, 2, 'three', 4, 5, 6, 7, 8, 9, 10, 11]
address remains the same: 140509666477824
after size of list: 192
address: 140509666477824
before size of list: 192
updated list: [100, 2, 'three', 4, 5, 6, 7, 8, 9, 10, 11, 12]
address remains the same: 140509666477824
after size of list: 192
As you could see the size of list expanded from 96 to 128, but for next couple of items it didn't change whereas it stayed there for sometime. And then the size got expanded to 192. The reason in Cpython the memory is "preallocated" in chunks before hand. The reason is it avoids making frequent heavy system calls. As well if you see the over allocation is not static its mild and linear.
The reallocation happens to extend the current memory needed.
So when you have huge array in need, and the realloc is not having so much space. It will go ahead and create new memory and copy, this will be an very expensive operation.
To avoid this we can preallocate required memory.
Please find the below the code snippet of C implementation of List. And also the pictorial representation.
C Source - from CPython:
![]() |
Figure 1 : Memory allocation in list Credits : https://www.laurentluce.com/images/blog/list/list_insert.png |
Empty List
When an empty list is created and how its address is pointed. It will always pointing to different address. And it will hold preallocated memory as well.
Source:
Output:
A and B are not mapped to same address
address of a: 140509666702080
address of b: 140509666510912
A size of list: 56
B size of list: 56
Removal and Insertion
When we perform removal, the allocated memory will shrink without changing the address of the variable. While performing insert, the allocated memory will expand and the address might get changed as well.
Source:
Output:
address before change: 140509666477824
Remove element from list three
Size of list after removal: 192
Remove element from list 4
Size of list after removal: 192
Remove element from list 5
Size of list after removal: 192
Remove element from list 6
Size of list after removal: 192
Remove element from list 7
Size of list after removal: 136
[100, 2, 8, 9, 10, 11, 12]
address after change: 140509666477824
Source:
Output:
address before change: 140509666477824
[100, 2, 8, 30, 9, 10, 11, 12]
address after change: 140509666477824
Sortable
While performing the sort operation, the address of the list doesn't get changed before and after sort operation.
Source:
Output:
# Note : even after sort the address remains the same
address before change: 140509666477824
[2, 8, 9, 10, 11, 12, 30, 100]
address after change: 140509666477824
List sort vs sorted function
.sort() and sorted() can provide exactly the sort order you need if you use them properly with both the reverse and key optional keyword arguments. Both have very different characteristics when it comes to output and in-place modifications.
Syntax: .sort() and sorted()
Example: .sort()
Example: sorted()
User List
User list is similar to user string, instead of string uses list class implementation to edit lists features. It acts as a wrapper class around the List objects.
Syntax: collections.UserList([list])
Example:
List comprehension
List comprehension offers a shorter syntax when you want to create a new list based on the values of an existing list. List comprehensions are faster than loops, but avoid to use when nested for readability
Syntax: newlist = [expression for item in iterable if condition == True]
Example:
Performance
Append – O(1)
a = [1, 2, 3]
b = [4, 5, 6]
a.append(b)
a = [1, 2, 3, [4, 5, 6]]
Extend – O(k)
a = [1, 2, 3]
b = [4, 5, 6]
a.extend(b)
a = [1, 2, 3, 4, 5, 6]
Pop last element - O(1)
Pop anywhere else - O(n) worst case
Insert - O(n) worst case
will insert the element in the passed index
Index - O(1)
Slice - O(k)
in Operator – O(n)
find the element in the array
Sort - O(nlogn) - tim sort is used
n = number of elements
k = k'th index
1 = order of 1
Tuples
Let's observe how the tuples are defined, and how it differs in allocation of memory compared to list.
Immutable in nature - Once defined the values cannot be changed.
Tuple consumes less memory as compared to the list
Syntax: tuple(<values>) or (<values>)
Example:
Definition
Quick example to define a tuple.
Source:
Output:
<class 'tuple'>
('one', 'two', 'three')
Slicing
Accessing tuple is as same as list.
Source:
Output:
('one', 'two')
three
Immutable
Tuples are immutable in nature, saying we cannot change the values of the created tuple.
Changing the single value
Find the error while trying to change the value of the tuple
Source:
Output:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-13-a780f574006b> in <module>
1 # tuple are immutable
----> 2 tuple1[0] = "cherry"
3 print(type(tuple1))
4 print(tuple1)
TypeError: 'tuple' object does not support item assignment
Replacing tuple to new tuple
We can override the existing tuple to new tuple, the address will also be overridden.
Source:
Output:
address before the change: 140509666846336
<class 'tuple'>
(1, 2, 3, 4)
address after the change: 140509667383792
Changing the list inside tuple
We know that the tuple can as well hold any values. Now we have tried to save list inside tuple. And try editing its value. Can we edit ? Will it change the list ??
Lets see the answer below:
Source:
t = (1, 2, [10, 20, 30])print(id(t), t)t[2] += [40, 50]
Output:
140509666702784 (1, 2, [10, 20, 30])
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-15-cc42d1fec8cf> in <module>
1 t = (1, 2, [10, 20, 30])
2 print(id(t), t)
----> 3 t[2] += [40, 50]
TypeError: 'tuple' object does not support item assignment
It has clearly thrown error, so it should have not updated the values as well.
Source:print(id(t), t)Output: 140509666702784 (1, 2, [10, 20, 30, 40, 50])
But if you see the values would be appended. This is an edge case where python behaves strangely. So, putting mutable items in tuples is not a good idea.
Memory Allocation
Check the memory allocated, it uses only required memory. They are not over-allocated as they are not resizable.
Source:print(sys.getsizeof(tuple1))Output: 72
Reuse Memory
To reduce memory fragmentation and speed up allocations, Python reuses old tuples.
If a tuple no longer needed and has less than 20 items.
Instead of deleting it permanently Python moves it to a free list and later uses it.
Note: Though its always not certain, the same memory will be used.
Source:
Output:140509665739520
140509665739520
Empty Tuple
When creating two empty tuple it will be point same address space.
Empty tuple acts as a singleton, that is, there is always only one tuple with a length of zero.
When creating an empty tuple Python points to already preallocated one.
In such way that any empty tuple has the same address in the memory.
This is possible because tuples are immutable and sometimes saves a lot of memory.
Source:
Output:A and B are not mapped to same address
address of a: 140509600395328
address of b: 140509600395328
A size of list: 40
B size of list: 40
Removal and Insertion
We cannot update the existing tuple but we can create new tuple with it, it will be copied into new address.
Source:
Output:140509667635968 ('one', 'two', 'three')
140509666902208 ('1', '2', '3')
140509666693664 ('one', 'two', 'three', '1', '2', '3')
Sort
Yes as said tuples are immutable, we cannot implicitly sort tuple.
But we can make use of sorted function to sort tuple.
Note: It will return as list.
Source:
Output:sorted_tuple1 = sorted(tuple1)print(id(sorted_tuple1), type(sorted_tuple1), sorted_tuple1)
140509667589312 <class 'list'> ['one', 'three', 'two']
Named Tuple
The namedtuple and normal type uses exactly same amount of memory. Because the field names are stored in the class
So we can either use tuple or named tuple, it will be the same. And named tuple will increase the readability of the program as well. Bonus!!
Named Tuples are not changeable.
Syntax: namedtuple(<name>, <values>)
Example:
Memory Allocation Source:
Output:City(name='Chennai', country='India', status='Red')
Size of named tuple: 64
('Chennai', 'India', 'Red')
Size of normaltuple tuple: 64
Performance
Index - O(1)
Slice - O(k)
in Operator - O(n)
n = number of elements
k = k'th index
1 = order of 1
Note: As Tuples are immutable it has only limited features
Conclusion
Use list when:
When the collection needs to be changed constantly.
Working with arguments.
Returning 2 or more items from a function.
Iterating over dictionary's key-value pairs.
Using string formatting.
Lists are more complex to implement, preferring tuple saves memory and time (list uses 3000+ lines of code while tuple needs only 1000+ lines of C code)
Reference
https://www.laurentluce.com/posts/python-list-implementation/https://github.com/python/cpython/blob/master/Objects/listobject.c
Fluent Python, 2nd Edition [Book] from O'Rielly
https://cdn.educba.com/academy/wp-content/uploads/2019/06/Python-Tuple-vs-List.jpg


No comments:
Post a Comment