3 min read

Anagrams

Anagrams

Anagrams have been part of our literary history throughout the centuries. Let's learn more about what they are, how they are used and how we can tell if two string are anagrams of each other.

The Problem

Given two strings s and t, determine if t is an anagram of s.

Example

  • s = listen
  • t = silent

Some Background

First, to define what is an anagram more formally, "An anagram is a word or phrase formed by rearranging the characters of a different word or phrase."[1]
Anagrams are found since ancient history and can be found all the way back in ancient Greece. It was used all across Europe in the Middle Age and was very popular among the users of Latin.

Uses of Anagrams

Anagrams have been used in multiple ways across the ages, in serious or not so much situations:

  • Pseudonym creation
  • Parody, criticism or satire
    • George Bush = He bugs Gore
  • Character naming[2]
    • William Shakespeare’s “Hamlet” is an anagram and directly inspired by the legendary Scandinavian prince “Amleth”.
    • In Harry Potter, J.K. Rowling uses the anagram “I am Lord Voldemort” / “Tom Marvolo Riddle,” as the two different identities of the series' villain.
  • Ciphers/Cryptography
    • Ciphers using permutations and/or transpositions can be broken using anagramming.

The Solution

Below is the code for our solution, in Python. Let's take a look at the code bits by bits and try to undersatand what is happening there:

def isAnagram(s: str, t: str) -> bool:
    if len(s) == len(t):
        s_map = {}
        t_map = {}
        for l in s:
            if l in s_map:
                s_map[l] += 1
            else:
                s_map[l] = 0
        for l in t:
            if l in t_map:
                t_map[l] += 1
            else:
                t_map[l] = 0

        for l in s_map:
            if l not in t_map or s_map[l] != t_map[l]:
                return False
    else:
        return False
    return True

You can find the code above in our Github repository

Let's start with the first if statement:

if len(s) == len(t):

We know from the definition that Angrams are a rearrangement of the original word, which means that if t is an anagram of s then the two words should be the same length. If they are not we can by definition determine that s and t are not anagrams.

for l in s:
    if l in s_map:
        s_map[l] += 1
    else:
        s_map[l] = 0
for l in t:
    if l in t_map:
        t_map[l] += 1
    else:
        t_map[l] = 0

Next we create a list that maps a letter to the number of times it appears in each word.

for l in s_map:
    if l not in t_map or s_map[l] != t_map[l]:
        return False

Lastly, we check if any of the characters in s is not present in t and if the numbers of apperances of each letter is different. If any of the above cases is true than we know that t is not an anagram of s. Otherwise we can know for sure that they are anagrams, as they are the same length and contain the same characters the same number of times.

Algorithm Analysis

For a quick evaluation of the solution above let's look at it's complexity (number of steps it takes) and memory usage.

If you need to need to review the concepts of Algorithm Analysis, checkout our series on the subject here.

Complexity

We need to find what are the most costly parts of our algorithm to calculate its complexity.
Our algorithm loops through each string once and in the end through the list of unique characters in s. Considering that the list of unique characters in s can only be less than or equal to the total number of characters in s, we can assume that the operation that will take the longest is looping through either s or t. That means that our algorithm will be O(n) where n is directly proportional to the size of the strings.

Memory

Our biggest usage of memory comes from creating the mapping of unique characters in each string with their respective frequency. Our memory usage is also directly proportional to the number of characters in strings s and t, so we can say that our memory usage is O(n).

Linear complexity and memory usage is not too bad, but could we have done better? Let us know in the comments on on Twitter: what's your best solution?


  1. https://en.wikipedia.org/wiki/Anagram ↩︎

  2. https://literarydevices.net/anagram/ ↩︎