Choosing the Right Python Data Structure: List, Set, Dictionary, Tuple Comparison

Choosing the Right Python Data Structure: List, Set, Dictionary, Tuple Comparison

List vs. Set vs. Dictionary vs. Tuple in Python: A Comprehensive Guide

Hi there! Hope you are doing well. Unsure when to use Lists, Sets, Dictionaries, or Tuples in Python? This blog post cuts through the confusion, explaining each of the above Python data structures, its storage management, and performance i.e. time complexity. We will also cover a few most common and essential FAQs on each of these Python data structures.

Table of Contents

Python Data Structure Intro

Data Structures are fundamentals of any programming language around which a program is built. Data structures are essentially organized collections of data in Python. It’s a way to store and manage data efficiently, allowing you to access, modify, and manipulate information in a structured way. There are 4 types of built-in Python data structures. Lists, Sets, Dictionaries, and Tuples.

They can hold various data types like integers, strings, floats, or even other data structures. Some data structures are mutable, meaning their contents can be changed after creation (e.g., lists). Others are immutable, meaning their contents cannot be modified after creation (e.g., tuples). Some data structures maintain the order in which elements are added (e.g., lists). Others are unordered, meaning the order doesn’t necessarily reflect the order of addition (e.g., sets).

Lists

1. What is Python List

Python List is an ordered, mutable collection of items enclosed in square brackets []. A list can hold data types like integers, strings, and objects. It can also have duplicate items. The Python list can also hold different data types at the same time. A list can be nested, which means can have other lists and any objects. 

2. How List is stored in Python

In Python, lists are stored in memory as arrays of references to the elements they contain. This means that a Python list doesn’t directly store the actual elements. Instead, it stores references (or pointers) to the memory locations where the elements are stored. 

For example, if you have a list of integers [1, 2, 3], each element in the list is a reference pointing to the memory locations where the integers 1, 2, and 3 are stored.

This structure allows lists to hold elements of different data types and to efficiently handle variable-sized elements.

Internally, a Python list is implemented as a dynamic array. This means that the list starts with a small allocated memory block that can hold a certain number of elements (e.g., 4 or 8).

As elements are added to the list and it exceeds its current capacity, Python dynamically reallocates memory to accommodate more elements. This resizing process typically involves doubling the capacity of the array.

3. List Performance (Time Complexity)

  • Accessing an element by index: O(1) – fastest among other Python data structure types, because elements are stored in contiguous memory locations. You can jump directly to the desired element’s position.
  • Inserting or deleting an element at the end: O(1) – is fast usually, as minimal operations are needed to update the internal pointer.
  • Inserting or deleting an element at the beginning or middle: O(n) – In the worst case, Inserting or removing elements in the middle of the list can be slow, because it requires shifting other elements in the list to maintain order.
  • Searching for an element: O(n) – linear search of an element by value in worst case, requires iterating through the entire list,  which can be slow for large lists.
  • Sorting:
    • average and best case -> O(n log n) 
    • Worst case-> O(n^2) – the sorting time grows proportionally to the logarithm of the list size (n) multiplied by n. This is the time complexity for most sorting algorithms like:
    • Timesort (average and best-case time complexity O(n log n))
    • Merge sort (guaranteed time complexity of O(n log n) for all cases)
    • Heap Sort (average and worst-case time complexity of O(n log n).)
    • Quick Sort (average O(n log n), the worst-case scenario can be O(n^2))
    • Selection Sort (worst-case time complexity of O(n^2))
    • Bubble Sort (worst-case time complexity of O(n^2))

Sets

1. What is Python Set

Python Set is an unordered, mutable (i.e. can remove or add items) collection of immutable items (like, string, int, float, tuple) enclosed in curly brackets {}. Set stores unique items, so no duplicate items are allowed in the set.

Python sets can hold data types like integers, strings, boolean, and tuples. A set can be nested, with immutable elements/items, like a tuple, but a set can’t have a set as an item since the set is mutable. 

2. How Set is stored in Python

Sets in Python employ a data structure called a hash table for storage, unlike lists, which use a contiguous array of pointers. It’s essentially an array of buckets that hold elements and their corresponding hash values. When we add an element to a set, using a built-in hash function, Python calculates its hash value. This process results in converting the item data into a unique integer value.

The hash value is then used to determine the bucket location within the hash table where the element should reside. Ideally, each element would have its unique bucket. 

However, collisions can occur when two different elements generate the same hash value. In such cases, Python employs collision resolution strategies to store these elements efficiently. A common strategy is chaining, where collided elements are linked together in a separate data structure attached to the bucket. Thus the operations on the hash table are somewhat similar to the Linked List.

3. Set Performance (Time Complexity)

  • Adding or removing elements: O(1) (average case) 
  • Checking for membership (element in the set) or search: O(1) (average case) – Checking if an element exists in a set, is the fastest among other data types in Python, due to efficient hash table lookup

NOTE: if a hash collision happens the time complexity can degrade to O(n) – (worst case).

Dictionaries

1. What is the Python Dictionary

Python Dictionary is a collection of key-value pairs that are stored in an organized manner {“key”: ”value”}. Each item in the dictionary has a unique key and its corresponding value. Since Python 3.7, items in the dictionary are ordered, while before that, they were unordered.

The dictionary is a mutable data structure, meaning that its elements can be added, removed, or updated. Keys must be immutable data types such as strings, numbers, or tuples. This ensures that the keys cannot be changed once they are assigned, which leads to efficient lookups and prevents conflicts within the dictionary.

Python dictionary values can hold almost all data types, including integers, strings, boolean, tuples, sets, and other dictionaries.

2. How is a Dictionary stored in Python?

Dictionaries in Python are stored using a data structure called a hash table, similar to sets. Each key is mapped to its corresponding value using a hash function. This function converts the key’s data into a unique integer value (hash value).

The hash value is then used to determine the location in the hash table, known as a bucket, where the key-value pair should be stored. In an ideal scenario, each key-value pair should have its unique bucket.

However, collisions can occur when different keys generate the same hash value. Python employs collision resolution strategies, like chaining, to store these colliding key-value pairs within the same bucket.

Dictionaries can dynamically grow or shrink as needed. The hash table can be resized during rehashing to accommodate more key-value pairs. Rehashing involves creating a new, larger hash table and redistributing the key-value pairs based on their recalculated hash values.

3. Dictionary Performance (Time Complexity)

  • Inserting (add) and Deletion (remove): O(1) – on average, is quite fast due to efficient hash tables for storage.
  • Lookup (get) / search: O(1) – Retrieving a value associated with a specific key is another fast operation with an average-case time complexity of O(1). The hash function again guides the search to the appropriate bucket, and Python can quickly determine if the key exists and return its corresponding value.

NOTE: Same as set, if a hash collision happens the time complexity can degrade to O(n) – (worst case).

Tuples

1. What is Python Tuples

Python Tuples are ordered, immutable collection items enclosed in rounded brackets (). Unlike Python set that is unordered and mutable. Tuples can hold elements of different data types (e.g., strings, integers, floats) within the same tuple even data structures like lists, sets, dictionaries, and tuple itself. Tuple allows duplicate items.

2. How Tuples is stored in Python

Tuples are a data type in Python that store their elements directly in a contiguous block of memory. This means that all the elements within the tuple are physically placed side-by-side in memory, similar to how an array works in other programming languages. When you create a tuple, Python allocates a single block of memory to store all the elements together. The size of this block is determined by the combined size of all the elements within the tuple.

Since tuples are immutable, the elements within them cannot be changed once the tuple is created. This immutability simplifies memory management and allows for a more efficient storage approach. Python doesn’t need to worry about potential modifications to element sizes or pointers.

Due to the contiguous allocation and fixed size, Python can efficiently calculate the memory address of any element within the tuple based on its index. This enables fast direct access to elements using indexing.

3. Tuple Performance (Time Complexity)

  • Accessing an element by index: O(1)
  • Searching for an element: O(n) – worst case and best case O(1)

NOTE: Since tuples are immutable, insertion and deletion operations are not supported.

FAQs

List FAQ:

1. When to use Lists?

Lists are versatile and widely used for storing collections of items that need to be ordered and might change over time.

Common use cases include storing shopping lists, student grades, or sequences of instructions.

2. How do I create and access elements in a list?

You can create lists using square brackets []. Elements are separated by commas within the brackets.

my_list = [1, "apple", 3.14]
first_element = my_list[0]  # first_element will be 1 (accessing by index)

3. Ways to join/merge two or more Lists?

Consider two list1 = [1, 2, 3] and list2 = [4, 5, 6]

You can use the + operator to concatenate two lists. This creates a new list containing all elements from both lists.

list1 = [1, 2, 3]
list2 = [4, 5, 6]
joined_list = list1 + list2

The extend() method is used to add elements from another list to an existing list. This modifies the original list. 

list1 = [1, 2, 3]
list2 = [4, 5, 6]
list1.extend(list2)

Using the * operator, this method is a new addition to list concatenation and works only in Python 3.6+. Any no. of lists can be concatenated and returned in a new list using this operator. 

list1 = [1, 2, 3]
list2 = [4, 5, 6]
joined_list = [*list1, *list2]

List comprehension allows you to create a new list by iterating over elements of existing lists and adding them to the new list.

list1 = [1, 2, 3]
list2 = [4, 5, 6]
joined_list = [item for sublist in [list1, list2] for item in sublist]

Set FAQ:

1. When to use Sets?

Sets are ideal for situations where you need to store unique elements, check for membership quickly, or remove duplicates from a collection.

Common use cases include storing unique IDs, checking for available letters in a Scrabble game, or removing duplicates from a list of words.

2. How do I create and check for membership in a set?

You can create sets using curly braces {}. Elements are separated by commas within the braces.

my_set = {1, "apple", 3.14, "apple"}  # Duplicate "apple" will be ignored

if "banana" in my_set:
    print("banana is in the set")
else:
    print("banana is not in the set")

3. Ways to join/merge two sets?

Consider sets, set1 = {1, 2, 3},  set2 = {3, 4, 5} and set3 = {5, 6, 7}

Use of the union() method, returns a new set containing all elements from the original set(s) as well as elements from other sets.

set1 = {1, 2, 3}
set2 = {3, 4, 5}
merged_set = set1.union(set2)

You can also use the union() method with multiple sets:

set1 = {1, 2, 3}
set2 = {3, 4, 5}
set3 = {5, 6, 7}
merged_set = set1.union(set2, set3)

The | operator performs the union operation between sets, creating a new set with elements from both sets.

set1 = {1, 2, 3}
set2 = {3, 4, 5}
set3 = {5, 6, 7}
merged_set = set1 | set2
merged_set = set1 | set2 | set3

Dictionary FAQ

1. When to use Dictionaries?

Dictionaries are powerful for storing and retrieving data based on unique keys.

Common use cases include storing user information (name, age, etc.), phonebook entries (name and phone number), or word definitions in a dictionary app.

2. How do I create and access elements in a dictionary?

You can create dictionaries using curly braces {}. Keys and values are separated by colons, and key-value pairs are separated by commas.

my_dict = {"name": "Alice", "age": 30, "city": "New York"}
name = my_dict["name"]  # name will be "Alice" (accessing by key)

3. Ways to join/merge two or more dictionaries?

Consider dictionaries: dict1 = {‘a’: 1, ‘b’: 2}, dict2 = {‘c’: 3, ‘d’: 4} and dict3 = {‘e’: 5, ‘f’: 6}

You can use the update() method to merge the key-value pairs from one dictionary into another. If there are common keys, the values from the updating dictionary overwrite the values in the target dictionary.

dict1 = {'a': 1, 'b': 2}
dict2 = {'c': 3, 'd': 4}
dict1.update(dict2)

Dictionary unpacking allows you to merge dictionaries in a single expression. This method is useful when you want to merge dictionaries without modifying the original dictionaries.

merged_dict = {**dict1, **dict2}

You can also merge multiple dictionaries using dictionary unpacking:

dict1 = {'a': 1, 'b': 2}
dict2 = {'c': 3, 'd': 4}
dict3 = {'e': 5, 'f': 6}
merged_dict = {**dict1, **dict2, **dict3}

Tuple FAQ:

1. When to use Tuples?

Tuples are useful for representing fixed data that shouldn’t be changed, like coordinates (x, y) or multiple return values from a function.

Since they are immutable, they can be used as dictionary keys where the key itself needs to remain unchanged.

2. How do I create and access elements in a tuple?

You can create tuples using parentheses (). Elements are separated by commas within the parentheses.

my_tuple = (1, "apple", 3.14)
first_element = my_tuple[0]  # first_element will be 1 (accessing by index)

3. Ways to join/merge two or more tuples?

Consider tuples: tuple1 = (1, 2, 3), tuple2 = (4, 5, 6) and  tuple3 = (7, 8, 9)

You can use the + operator to concatenate two or more tuples, creating a new tuple with elements from all the tuples.

tuple1 = (1, 2, 3)
tuple2 = (4, 5, 6)
merged_tuple = tuple1 + tuple2

You can also concatenate multiple tuples together using the + operator:

tuple1 = (1, 2, 3)
tuple2 = (4, 5, 6)
tuple3 = (7, 8, 9)
merged_tuple = tuple1 + tuple2 + tuple3

You can use tuple unpacking and packing, allowing you to merge tuples into a new tuple. This method is particularly useful when you have multiple tuples to merge.

tuple1 = (1, 2, 3)
tuple2 = (4, 5, 6)
merged_tuple = (*tuple1, *tuple2)

You can merge multiple tuples using tuple unpacking and packing in a single line:

tuple1 = (1, 2, 3)
tuple2 = (4, 5, 6)
tuple3 = (7, 8, 9)
merged_tuple = (*tuple1, *tuple2, *tuple3)

Summary

Selecting the appropriate data structure can significantly impact the efficiency of your Python code. By comprehending the advantages and disadvantages of Lists, Sets, Dictionaries, and Tuples, you can make knowledgeable decisions that will enhance your program’s performance.

Now that you have a fundamental understanding, try incorporating these data structures into your Python projects! If you have any queries, please leave a comment below.

Still uncertain about which data structure to use? No worries! You can contact me to discuss your particular situation, and I will happily assist you in selecting the most appropriate one.

Hope I have made it easy to understand Python Data structures like Lists, Sets, Diciotny, and Tuples. If you like this article and think it was easy to understand and might help someone you know, do share it with them. Thank You! See you soon.

If you have any questions or comments feel free to reach me at.

Checkout out other Python concept covered in Python Tutorial