All Unique Characters
The Problem
Even though this is a very basic problem it touches many important points for anyone trying to understand in a deeper level how to deal with strings. The basics of this problem will create the foundational thinking to solve many more complex string problems.
The question says:
"Implement an algorithm to determine if a string has all unique characters. What if you cannot use additional data structures?"
The first thing you need to consider (or ask you interviewer) is which character set (encoding) are you going to be dealing with: ASCII? Unicode?
For our solutions here we will assume that the encoding is ASCII characters.
The next consideration is: Can we use extra space? (another data structure to help us with the task).
The Solutions
Our first solution will make use of an extra data structure, the array chars
as seen below
func hasUniqueChars(str string) bool {
chars := make([]int, 256)
if len(str) > 256 {
return false
}
for _, char := range str {
asciiCode := int(char)
chars[asciiCode] += 1
if chars[asciiCode] > 1 {
return false
}
}
return true
}
You might be asking what the 256
means there. The ASCII character set has a total of 256 characters. So we will be creating a slice of ints that will represent the whole character set and we will be updating that data structure with the count of each character at its respective position, making it very easy to tell when a character appears more than once.
Also knowing the size of your character set helps with another optimization: If you know that you have at most 256 characters, any string longer than 256 characters inevitably will have at least one repeated character and we can stop the algorithm right away.
In case you cannot use an auxiliary data structure, the simplest solution is to use nested for
loops:
func hasUniqueCharsNoAuxStruct(str string) bool {
for idx, char1 := range str {
for _, char2 := range str[idx+1:] {
if char1 == char2 {
return false
}
}
}
return true
}
In this case we will be looping through the characters twice:
- We will take the first character and save it in char1
- Then we take the following character and save it in char2
Then we can just compare if the same characters appears more than once in the string. As we will be checking until the very end of string, at the end of the internal loop, for any index of the outer loop, we can guarantee that the character at position idx
has only appeared once. Otherwise the outer loop would have returned false.
Evaluating the Solutions
I'm not going to discuss the details of algorithm analysis in this post. This will come at a later video/post.
However looking at our first solution, you can see that we only go through the string once, meaning that the speed of our solution is bound to the length of the string. We say that the algorithm is O(n) where n is the length of the string. Also we can see that we are using constant space and that the space is directly related to the size of the character set, it doesn't really matter if all those characters are present in our string or not.
For our second solution, we are not using any extra structures, which means that this solution is leaner in terms of space used. However we are looping several times through the string to check the characters:
Lets say we have the string test
For the first character, t
, we will be going through e,s,t
.
For the second character, e
, we will be going through s,t
and etc.
That gives us 'n' internal loops of lenght: n-1, n-2,...,1
Based on this you can intuitively see that the function of the speed here is related to n*n, meaning a quadratic (polynomial) function. We will say that this algorithm is O(n²).
You can find the above solutions at our Problems repository.
The homework
In the YouTube video I left an exercise for the viewers: Try to come up with a solution that is faster than O(n²) without using any extra data structures.
The solution is very simple:
- Sort the string in place, using an algorithm like Quicksort.
- Go through the string once, checking if the present character is the same as the next one.
You're done! We'll say that this algorithms speed will be O(n*logn) as Quicksort takes that time to sort a string in place.
Disclaimer: Strings are immutable in Go. That means that strictly speaking we cannot implement a solution that will not use extra space. This is the same for other languages with immutable strings like Python, Java and C# and Javascript. To be able to implement these operations in place you'll need to use a language like C,C++, Ruby or PHP.
A trick you could actually do in Go is to retrieve the string as a byte array, manipulate the byte array(slice) in place,as that is possible, and when outputting the values cast the byte array to string.