Collections ¶
In variables, we introduced data types like int, float, and bool.
However, these types can only represent single, simple values.
In many cases, we need more flexible ways to store, access, and manipulate data.
This brings us to the topic of collections.
Collections allow us to work with groups of data. One analogy is to think of scalar types as single eggs, while collections are like egg cartons: structures that hold multiple eggs together. This document covers the core collection types: lists, tuples, and dictionaries. Each solves a related but distinct need.
Lists ¶
Lists are the simplest, most versatile, and most widely used ordered collection type.
They can store more than one type of data, they have a specific order, and their contents can change.
For example, we can create a list of things I have in my pocket.
We instruct Python to create a list by surrounding data with square brackets: [ ].
in_my_pocket = [21, False, None, "apple", 3.14]
print(in_my_pocket)
[21, False, None, 'apple', 3.14]
We do not have to start with data inside.
We can also create an empty list with just [ ].
in_my_other_pocket = []
print(in_my_other_pocket)
[]
Indexing ¶
One of the most fundamental things you do with a list is get individual values.
Because the order of the items in the list matters, I can refer to an item using its position in the list.
In programming languages, this position is called the index; the first index is 0, then 1, then 2, and so on.
Let's print the first value.
print(in_my_pocket[0])
21
You may have noticed that we are using [] both to create the list and to index it.
Python tells the two apart by context: brackets that directly follow a value, such as the variable in in_my_pocket[0], index that value, while brackets that stand on their own create a new list.
Writing in_my_pocket [0] with a space before the bracket also works, but style guides such as PEP 8 discourage it.
While we can count from the start of the list to the end, we can also count backwards.
To get the last element of a list, we can use the index -1.
print(in_my_pocket[-1])
3.14
A table of the available elements is shown below.
Element | Index from start | Index from end |
---|---|---|
21 | 0 | -5 |
False | 1 | -4 |
None | 2 | -3 |
"apple" | 3 | -2 |
3.14 | 4 | -1 |
Either of these indices would get you the same element.
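As a small sketch of that equivalence, the snippet below reads the same element three ways. The variable names from_start, from_end, and converted are only illustrative; the point is that subtracting the list's length from a positive index gives its negative counterpart.

```python
in_my_pocket = [21, False, None, "apple", 3.14]
n = len(in_my_pocket)

from_start = in_my_pocket[3]     # counting from the start
from_end = in_my_pocket[-2]      # counting from the end
converted = in_my_pocket[3 - n]  # 3 - 5 == -2, so this is the same slot again

print(from_start, from_end, converted)  # apple apple apple
```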
Length ¶
Another important point is that you cannot use an index outside the bounds of the list.
For example, we cannot ask for the sixth element of in_my_pocket since there are only five items.
We can get the length of a list by using the len function.
print(len(in_my_pocket))
5
This means that there are 5 elements, and the largest (positive) index I can use is one less than that: 4.
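To make the bounds concrete, here is a minimal sketch of what happens at and just past the last valid index. The try/except block is only there so the example keeps running after the error; last_valid_index and message are illustrative names.

```python
in_my_pocket = [21, False, None, "apple", 3.14]

# The last valid positive index is always len(...) - 1.
last_valid_index = len(in_my_pocket) - 1
print(in_my_pocket[last_valid_index])  # 3.14

# One step further is out of bounds, and Python raises IndexError.
try:
    in_my_pocket[len(in_my_pocket)]
except IndexError as error:
    message = str(error)
    print("IndexError:", message)  # IndexError: list index out of range
```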
Slicing ¶
While accessing individual list elements is useful, we often want to extract a subsection, or slice, of a list's items. Slicing copies a portion of a list, trimming items from one end or both, which lets us break large lists cleanly into usable parts.
In Python, the colon : is used for slicing with the general syntax of start:stop:step.
- start: The index at which the slice begins (inclusive). If omitted or None, it defaults to the beginning of the sequence.
- stop: The index at which the slice ends (exclusive). If omitted or None, it defaults to the end of the sequence.
- step: The step size, or the number of positions to advance between selected elements. If omitted or None, it defaults to 1.
First, let us print the full list like we normally have been doing.
print(in_my_pocket)
[21, False, None, 'apple', 3.14]
We get the same result if we slice in_my_pocket without specifying start, stop, or step.
print(in_my_pocket[::])
[21, False, None, 'apple', 3.14]
Remember that None is often used to specify the absence of a value, so it should give us the same result as [::].
print(in_my_pocket[None:None:None])
[21, False, None, 'apple', 3.14]
Yup!
This gives us the same slice of the list in_my_pocket.
Now, if I want only the first two elements of the list, I tell Python to stop slicing at (and not include) index 2.
print(in_my_pocket[:2])
[21, False]
I can also tell Python I want the elements at indices 2 and 3.
print(in_my_pocket[2:4])
[None, 'apple']
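The examples so far leave step at its default, so here is a short sketch of a few other combinations, including a negative step. It also checks the claim that a slice is a copy: changing the slice should leave the original list alone. The variable names are illustrative.

```python
in_my_pocket = [21, False, None, "apple", 3.14]

every_other = in_my_pocket[::2]     # start and stop omitted, step of 2
last_two = in_my_pocket[-2:]        # negative start counts from the end
reversed_copy = in_my_pocket[::-1]  # a negative step walks backwards

print(every_other)    # [21, None, 3.14]
print(last_two)       # ['apple', 3.14]
print(reversed_copy)  # [3.14, 'apple', None, False, 21]

# A slice is a new list, so modifying it does not touch the original.
first_two = in_my_pocket[:2]
first_two[0] = "changed"
print(first_two)         # ['changed', False]
print(in_my_pocket[:2])  # [21, False]
```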
Lists are mutable, so we can also replace an element by assigning a new value to its index.
in_my_pocket = [21, False, None, "apple", 3.14]
print(in_my_pocket)
in_my_pocket[2] = "wallet"
print(in_my_pocket)
[21, False, None, 'apple', 3.14]
[21, False, 'wallet', 'apple', 3.14]
This works the same way as memory allocation for variables.
Python goes to where the third element is located in memory, throws out None, and puts "wallet" there instead.
Append ¶
However, what if I want to add an element to my list and not remove anything?
We can do this with append.
For example, let's say I keep everything in my pocket that is already there, but then I want to add my keys.
print(in_my_pocket)
print(len(in_my_pocket))
in_my_pocket.append("keys")
print(in_my_pocket)
print(len(in_my_pocket))
[21, False, 'wallet', 'apple', 3.14]
5
[21, False, 'wallet', 'apple', 3.14, 'keys']
6
Now our list has an additional element and a new length of 6.
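One detail that often trips people up: append changes the list in place and returns None, so its result should not be assigned back to the variable. A minimal sketch, with result as an illustrative name:

```python
in_my_pocket = [21, False, "wallet", "apple", 3.14]

# append modifies the existing list and hands back None.
result = in_my_pocket.append("keys")

print(result)        # None
print(in_my_pocket)  # [21, False, 'wallet', 'apple', 3.14, 'keys']
```

Writing in_my_pocket = in_my_pocket.append("keys") would therefore replace the list with None.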
Delete ¶
Just like we can add things to a list, we can also remove them with the del keyword.
print(in_my_pocket)
print(len(in_my_pocket))
del in_my_pocket[3]
print(in_my_pocket)
print(len(in_my_pocket))
[21, False, 'wallet', 'apple', 3.14, 'keys']
6
[21, False, 'wallet', 3.14, 'keys']
5
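Deleting an element also shifts every later element one position toward the front, so the same index can refer to a different value afterwards. The sketch below reads index 3 before and after the deletion; before and after are illustrative names.

```python
in_my_pocket = [21, False, "wallet", "apple", 3.14, "keys"]

before = in_my_pocket[3]  # 'apple' sits at index 3
del in_my_pocket[3]
after = in_my_pocket[3]   # 3.14 has moved into index 3

print(before, after)      # apple 3.14
print(in_my_pocket[-1])   # keys is still last, now at index 4
```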
Tuples ¶
A tuple is just like a list except that its contents cannot change, meaning I cannot mutate it (i.e., replace, append, or delete). We create one with parentheses instead of square brackets.
in_my_closed_pocket = (21, False, None, "apple", 3.14)
print(in_my_closed_pocket)
(21, False, None, 'apple', 3.14)
print(in_my_closed_pocket[3])
apple
in_my_closed_pocket.append("wallet")
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[16], line 1
----> 1 in_my_closed_pocket.append("wallet")

AttributeError: 'tuple' object has no attribute 'append'
del in_my_closed_pocket[3]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[17], line 1
----> 1 del in_my_closed_pocket[3]

TypeError: 'tuple' object doesn't support item deletion
So why would you want to use a tuple instead of a list?
- If you need a collection of items that should not be changed or modified throughout the program, using a tuple provides immutability. This prevents accidental modifications and ensures data consistency.
- Tuples are generally more memory-efficient than lists because their size is fixed when they are created. If your data does not need to change, using a tuple can save memory and may perform slightly better.
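Both points can be checked with a short sketch. The names as_list and as_tuple are illustrative, and the size comparison assumes CPython, the standard Python implementation; exact byte counts vary by version and platform, so the example prints them rather than promising specific numbers.

```python
import sys

as_list = [21, False, None, "apple", 3.14]
as_tuple = (21, False, None, "apple", 3.14)

# Replacing an element works for the list...
as_list[2] = "wallet"

# ...but raises TypeError for the tuple, which stays unchanged.
try:
    as_tuple[2] = "wallet"
except TypeError as error:
    print("TypeError:", error)

# sys.getsizeof reports the size of the container itself in bytes,
# not the size of the objects stored inside it.
print(sys.getsizeof(as_list), sys.getsizeof(as_tuple))
```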
Dictionaries ¶
Lists allow us to store ordered collections of data that can be flexibly accessed via indices. However, we frequently need a different access pattern: looking up values by a descriptive key rather than a numerical index. Python provides this with dictionaries.
Dictionaries provide a flexible mapping of unique keys to associated values, much as a real-world dictionary maps words to definitions. We define a dictionary with braces, using a colon to separate each key from its value.
person_favorites = {
"color": "blue",
"food": ["Chinese", "Thai", "American"],
"number": 32,
}
print(person_favorites)
{'color': 'blue', 'food': ['Chinese', 'Thai', 'American'], 'number': 32}
Dictionaries have some key capabilities:
- Store mappings of objects for easy retrieval by descriptive keys;
- Fast lookups, even for large data sets;
- Keys can use many immutable types: strings, numbers, tuples;
- Values can be any Python object;
- Extensible structure allowing easy growth.
print(person_favorites.keys())
dict_keys(['color', 'food', 'number'])
print(person_favorites["color"])
print(person_favorites["food"])
print(person_favorites["number"])
blue
['Chinese', 'Thai', 'American']
32
person_favorites["color"] = "red"
print(person_favorites)
{'color': 'red', 'food': ['Chinese', 'Thai', 'American'], 'number': 32}
person_favorites["city"] = "Pittsburgh"
print(person_favorites)
{'color': 'red', 'food': ['Chinese', 'Thai', 'American'], 'number': 32, 'city': 'Pittsburgh'}
person_favorites["food"][2] = "Italian"
print(person_favorites)
{'color': 'red', 'food': ['Chinese', 'Thai', 'Italian'], 'number': 32, 'city': 'Pittsburgh'}
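The examples above always look up keys that exist. As a hedged sketch of what happens otherwise, the snippet below asks for a missing key, shows two ways to avoid the error, and uses a tuple as a key. The "movie" key and the seat_owner dictionary are made up for illustration.

```python
person_favorites = {"color": "red", "number": 32}

# Asking for a missing key with [] raises KeyError.
try:
    person_favorites["movie"]
except KeyError as error:
    print("Missing key:", error)

# `in` checks whether a key exists, and .get() returns a fallback
# value instead of raising an error.
has_color = "color" in person_favorites
movie = person_favorites.get("movie", "unknown")
print(has_color, movie)  # True unknown

# Tuples are immutable, so they can be used as keys.
seat_owner = {(1, "A"): "Ada", (1, "B"): "Grace"}
print(seat_owner[(1, "B")])  # Grace
```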
Iterables ¶
You may hear people use "iterable" and "sequence" interchangeably. They are not the same! An iterable is any object that can be iterated over, meaning you can go through its data one element at a time. Sequences, on the other hand, can be iterated over and also let you retrieve specific values by index. Every sequence is an iterable, but not every iterable is a sequence. This distinction is not really important for this course, but it becomes crucial for type hints and efficient algorithm design.
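One way to see the difference is with the abstract base classes in the standard library's collections.abc module. In this sketch, a list passes both checks, while a dictionary is iterable but is not a sequence, because it is accessed by key rather than by position. The names list_checks and dict_checks are illustrative.

```python
from collections.abc import Iterable, Sequence

in_my_pocket = [21, False, None, "apple", 3.14]
person_favorites = {"color": "blue", "number": 32}

# Each tuple holds (is it iterable?, is it a sequence?).
list_checks = (isinstance(in_my_pocket, Iterable), isinstance(in_my_pocket, Sequence))
dict_checks = (isinstance(person_favorites, Iterable), isinstance(person_favorites, Sequence))

print(list_checks)  # (True, True)
print(dict_checks)  # (True, False)

# Iterating over a dictionary visits its keys one at a time.
for key in person_favorites:
    print(key)
```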