Today’s challenge, taken from https://leetcode.com/problems/longest-substring-without-repeating-characters/, requires us to write a program to determine the length of the longest substring of a given string. The substring should not have any repeating characters.
For instance, suppose the given string is 'abcbabababab'
. The longest substring is 'abc'
or 'cba'
. Hence, the length is 3.
If the given string is 'pwwkew'
, the longest substring is 'wke'
or 'kew'
. Hence, the length is 3.
Write a function to do the above.
Hint
We can declare a string called longest
to store the longest substring. With that, we can then iterate through each character in the given string and append the character to longest
, as long as the character is not already in longest
.
For instance, suppose the given string is 'abcacefgh'
. As we iterate through 'abcacefgh'
, we’ll append 'a'
, 'b'
and 'c'
to longest
.
However, what happens when we reach the 4th character – 'a'
? As 'a'
is already in longest
, what should we do? Similarly, what happens when we reach the 5th character 'c'
?
Think about it and try to code the function.
Solution
def lengthOfLongestSubstring(s):
longest = []
tempHighest = 0
for i in range(len(s)):
if s[i] not in longest:
longest.append(s[i])
elif s[i] == longest[0]:
del longest[0]
longest.append(s[i])
else:
if tempHighest < len(longest):
tempHighest = len(longest)
#start a new substring
longest = longest[longest.index(s[i])+1:]
longest.append(s[i])
return max(tempHighest, len(longest))
Run Through
Here, we declare a function called lengthOfLongestSubstring()
with one parameter s
.
Within the function, we declare an empty list called longest
and a variable called tempHighest
.
longest
is used to store the required substring extracted from s
as we iterate through it.
tempHighest
, on the other hand, is used to store the length of the longest substring we found.
Next, we use a for
loop to iterate through s
.
There are three cases to consider.
Suppose the given string is 'abcacefgh'
.
The first case is when the current letter is not in longest
. When that happens, the if
condition (if s[i] not in longest
) evaluates to True
and the if
-block is executed (i.e. we append the current letter to longest
).
In our example, as we iterate through s
, 'a'
, 'b'
and 'c'
are not in longest
. Hence, we append them to longest
. After the 3rd iteration, longest = 'abc'
.
Next, we have the second case where the current letter is in longest
and is the first letter in longest
. When that happens, the elif
condition (s[i] == longest[0]
) evaluates to True
and the elif
block is executed. The statements in the elif block do
the following:
- The first letter in
longest
is removed using thedel
keyword. - The current letter is then appended to
longest
.
In our example, the 4th letter 'a'
is in longest
and is the first letter in longest
. Hence, we remove 'a'
from the front of longest
and append 'a'
to the end.
longest
thus becomes 'bca'
.
Last but not least, we have the third case where the current letter is in longest
but is not the first letter. When that happens, we do the following:
We first update the value of tempHighest
using the if
-statement below:
if tempHighest < len(longest):
tempHighest = len(longest)
Next, we use the built-in function index()
to find the index of the current letter in longest
(longest.index(s[i])
).
We then remove all characters from the start of longest
to where the current letter is found using the statement
longest = longest[longest.index(s[i])+1:]
The slice notation [longest.index(s[i])+1:]
indicates that we want to extract a substring starting from the element after longest.index(s[i])
to the end of longest
.
Finally, we append the current letter to longest
.
longest.append(s[i])
With reference to our example, when we reach the 5th letter in 'abcacefgh'
, we realize that 'c'
is in longest
(which stores 'bca'
at the moment) but is not the first letter. Hence, the else
block is executed.
We first update tempHighest
to 3
(which is the length of longest
). Next, we use longest.index(s[i])
to find the index of 'c'
in longest
. As the resulting substring cannot have two 'c'
s in it, we need to remove 'c'
(and any letter before it) from longest
.
longest.index(s[i])
gives us the value 1
as 'c'
is at index 1.
The slice notation [longest.index(s[i])+1: ]
thus becomes [2: ]
, which means we want to get elements from index 2 up to the end of longest
. That gives us the string 'a'
.
We then update the value of longest
to this new substring and append 'c'
to the end of it.
longest
thus becomes 'ac'
.
This process is repeated for the rest of the letters in 'abcacefgh'
.
As 'e'
, 'f'
, 'g'
and 'h'
are not in longest
, they get appended to longest
.
By the time we reach the end of s
, longest = 'acefgh'
.
Once we finish iterating through s
, we return either the value of tempHighest
(which is 3
at the moment) or the length of longest
(which is 6
), depending on which is greater.
In our example, as 6 is greater than 3, we return 6.