What is a Linked List?
Introduction
At its most generic: a list is a data structure for a collection of items. Don’t be put off by the term data structure. It just means a useful way of grouping the data of your application. Before we consider what a linked list is, let’s explore lists a little.
Typically in programming, you need to group items together so you can perform the same task on them. For example, your program might be a Pac-man clone. You need to store the monster objects in a list so you can update them in bulk: move towards the player, redraw them on the screen, etc.
Arrays
What would be the easiest way to hold these monsters? Let’s begin by looking at the simplest list of all: the data structure called an Array. An array is the simplest way to group information and common to most programming languages. An array is pretty simple in Python and looks like the following:
# Python Array or List
monsters = ["Inky", "Blinky", "Pinky", "Clyde"]
# Print the last item.
print(monsters[-1])
Strictly speaking, Python doesn't give you an array out of the box. The array above is dynamic and I could add or remove monsters using functions like append()
or pop()
. A typical array, in older languages like C or Java, is static. This means that when the array is created, the programmer has to assign it a size. Behind the scenes, Python is doing this to give you the convenience of a dynamic array. However, this comes with some overhead in terms of memory and speed.
You might say, why can’t I use an array for my Pac-man clone? There are only four monsters and the array won't resize. Imagine you wanted to change the number of monsters? Or what about the list of power pills that Pac-man devours? An array might not be the best choice here if you are trying to maximize your game's efficiency.
Linked Lists
Let's look at an alternative data structure: the linked list.
A linked list is a chain of Nodes. Think of a node as the placeholder for the item in your collection: the monster in your game or whatever else you need to store. In fact, a node is a very simple data structure of two parts: the placeholder and a reference to the next node.
In the image above, the question mark represents the item being stored (the value) and the arrow is a reference to the next node.
Let's code this in Python:
# Python version of Linked List Node.
class Node:
def __init__(self, value):
self.value = value
self.next = None # Pointer starts out pointing at nothing
Note: By default, the reference or pointer is set to None
. This means there is nothing else in the list. Essentially, a single node is a list of just itself.
Let's imagine our array of Pac-man monsters as a linked list:
Notice how each node contains the name of the monster and a reference to the next node?
"Clyde", as the last node, points to nothing to indicate the end of the list.
Let's try this in Python:
# The Monsters
node_1 = Node("Inky")
node_2 = Node("Blinky")
node_3 = Node("Pinky")
node_4 = Node("Clyde")
# Link them to each other
node_1.next = node_2
node_2.next = node_3
node_3.next = node_4
# Traverse the nodes, printing each ones value
head = node_1
while (head != None):
print(head.value)
head = head.next
If you copy and run the code, you should have each monster's name output to the console. To accomplish this, the while
loop uses a head node to traverse the list. The loop knows to stop by checking the next
property of each node. A None
value, the default, marks the end of the list.
Conclusion: A Linked List Example
Now that we understand a basic linked list, we can see an advantage they have over typical arrays. As long as I have a reference to the last item in the list, I can add a new item without incurring any extra memory cost. If you have a large list, this may well be worth it. There is a potential drawback however. For example, there is no way to access a node by an index as we saw with the array example.
Let's conclude with a final scenario: A linked list question that might come up in a coding job interview. You might be asked: "How do I detect a loop in a linked list?"
Let's expand our monsters example and add Pac-man to the mix:
# Add Pac-man after Clyde - He's going to chase Blinky!
node_5 = Node("Pac-man")
node_4.next = node_5
node_5.next = node_2 # Uh-oh!
What happens when we traverse this list? We're going to be stuck in an infinite loop. Every time we get to Pac-man, instead of finding a None
, we are sent back to another node. Before you look at the answer below, think about how you might detect and prevent these loops.
The answer involves using a Set. A set is another data structure for holding a collection of objects. The useful part here is that a set doesn't allow for duplicate values. By storing each node's next
value here, we can check for loops in the list.
# Check linked list for loops, i.e. nodes pointing to previous nodes.
def hasLoop(head):
# Store all values in a set to disallow duplicates.
allValues = set()
while(head != None):
if (head in allValues):
# Found an existing node. There's a loop
return True
allValues.add(head)
head = head.next
# There was no duplicate.
return False
# Code to check for loops
head = node_1
print("Does list contain a loop? " + str(hasLoop(node_1)))