Anagrams
Anagrams have been part of our literary history throughout the centuries. Let's learn more about what they are, how they are used and how we can tell if two string are anagrams of each other.
The Problem
Given two strings s and t, determine if t is an anagram of s.
Example
s
= listent
= silent
Some Background
First, to define what is an anagram more formally, "An anagram is a word or phrase formed by rearranging the characters of a different word or phrase."[1]
Anagrams are found since ancient history and can be found all the way back in ancient Greece. It was used all across Europe in the Middle Age and was very popular among the users of Latin.
Uses of Anagrams
Anagrams have been used in multiple ways across the ages, in serious or not so much situations:
- Pseudonym creation
- Parody, criticism or satire
- George Bush = He bugs Gore
- Character naming[2]
- William Shakespeare’s “Hamlet” is an anagram and directly inspired by the legendary Scandinavian prince “Amleth”.
- In Harry Potter, J.K. Rowling uses the anagram “I am Lord Voldemort” / “Tom Marvolo Riddle,” as the two different identities of the series' villain.
- Ciphers/Cryptography
- Ciphers using permutations and/or transpositions can be broken using anagramming.
The Solution
Below is the code for our solution, in Python. Let's take a look at the code bits by bits and try to undersatand what is happening there:
def isAnagram(s: str, t: str) -> bool:
if len(s) == len(t):
s_map = {}
t_map = {}
for l in s:
if l in s_map:
s_map[l] += 1
else:
s_map[l] = 0
for l in t:
if l in t_map:
t_map[l] += 1
else:
t_map[l] = 0
for l in s_map:
if l not in t_map or s_map[l] != t_map[l]:
return False
else:
return False
return True
You can find the code above in our Github repository
Let's start with the first if
statement:
if len(s) == len(t):
We know from the definition that Angrams are a rearrangement of the original word, which means that if t
is an anagram of s
then the two words should be the same length. If they are not we can by definition determine that s
and t
are not anagrams.
for l in s:
if l in s_map:
s_map[l] += 1
else:
s_map[l] = 0
for l in t:
if l in t_map:
t_map[l] += 1
else:
t_map[l] = 0
Next we create a list that maps a letter to the number of times it appears in each word.
for l in s_map:
if l not in t_map or s_map[l] != t_map[l]:
return False
Lastly, we check if any of the characters in s
is not present in t
and if the numbers of apperances of each letter is different. If any of the above cases is true than we know that t
is not an anagram of s
. Otherwise we can know for sure that they are anagrams, as they are the same length and contain the same characters the same number of times.
Algorithm Analysis
For a quick evaluation of the solution above let's look at it's complexity (number of steps it takes) and memory usage.
If you need to need to review the concepts of Algorithm Analysis, checkout our series on the subject here.
Complexity
We need to find what are the most costly parts of our algorithm to calculate its complexity.
Our algorithm loops through each string once and in the end through the list of unique characters in s
. Considering that the list of unique characters in s
can only be less than or equal to the total number of characters in s
, we can assume that the operation that will take the longest is looping through either s
or t
. That means that our algorithm will be O(n) where n is directly proportional to the size of the strings.
Memory
Our biggest usage of memory comes from creating the mapping of unique characters in each string with their respective frequency. Our memory usage is also directly proportional to the number of characters in strings s
and t
, so we can say that our memory usage is O(n).
Linear complexity and memory usage is not too bad, but could we have done better? Let us know in the comments on on Twitter: what's your best solution?