Python Programming Challenge 10: Longest Substring Without Repeating Characters


Today’s challenge, taken from https://leetcode.com/problems/longest-substring-without-repeating-characters/, requires us to write a program to determine the length of the longest substring of a given string. The substring should not have any repeating characters.

For instance, suppose the given string is 'abcbabababab'. The longest substring is 'abc' or 'cba'. Hence, the length is 3.

If the given string is 'pwwkew', the longest substring is 'wke' or 'kew'. Hence, the length is 3.

Write a function to do the above.

Hint

We can declare a string called longest to store the longest substring. With that, we can then iterate through each character in the given string and append the character to longest, as long as the character is not already in longest.

For instance, suppose the given string is 'abcacefgh'. As we iterate through 'abcacefgh', we’ll append 'a', 'b' and 'c' to longest.

However, what happens when we reach the 4th character – 'a'? As 'a' is already in longest, what should we do? Similarly, what happens when we reach the 5th character 'c'?

Think about it and try to code the function.

Solution

def lengthOfLongestSubstring(s):
    
    longest = []
    tempHighest = 0

    for i in range(len(s)):
        if s[i] not in longest:
            longest.append(s[i])
        elif s[i] == longest[0]:
            del longest[0]
            longest.append(s[i])
        else:
            if tempHighest < len(longest):
                tempHighest = len(longest)
            #start a new substring
            longest = longest[longest.index(s[i])+1:]
            longest.append(s[i])
    return max(tempHighest, len(longest))

Run Through

Here, we declare a function called lengthOfLongestSubstring() with one parameter s.

Within the function, we declare an empty list called longest and a variable called tempHighest.

longest is used to store the required substring extracted from s as we iterate through it.

tempHighest, on the other hand, is used to store the length of the longest substring we found.

Next, we use a for loop to iterate through s.

There are three cases to consider.

Suppose the given string is 'abcacefgh'.

The first case is when the current letter is not in longest. When that happens, the if condition (if s[i] not in longest) evaluates to True and the if-block is executed (i.e. we append the current letter to longest).

In our example, as we iterate through s, 'a', 'b' and 'c' are not in longest. Hence, we append them to longest. After the 3rd iteration, longest = 'abc'.

Next, we have the second case where the current letter is in longest and is the first letter in longest. When that happens, the elif condition (s[i] == longest[0]) evaluates to True and the elif block is executed. The statements in the elif block do the following:

  • The first letter in longest is removed using the del keyword.
  • The current letter is then appended to longest.

In our example, the 4th letter 'a' is in longest and is the first letter in longest. Hence, we remove 'a' from the front of longest and append 'a' to the end.

longest thus becomes 'bca'.

Last but not least, we have the third case where the current letter is in longest but is not the first letter. When that happens, we do the following:

We first update the value of tempHighest using the if-statement below:

if tempHighest < len(longest):
    tempHighest = len(longest)

Next, we use the built-in function index() to find the index of the current letter in longest (longest.index(s[i])).

We then remove all characters from the start of longest to where the current letter is found using the statement

longest = longest[longest.index(s[i])+1:]

The slice notation [longest.index(s[i])+1:] indicates that we want to extract a substring starting from the element after longest.index(s[i]) to the end of longest.

Finally, we append the current letter to longest.

longest.append(s[i])

With reference to our example, when we reach the 5th letter in 'abcacefgh', we realize that 'c' is in longest (which stores 'bca' at the moment) but is not the first letter. Hence, the else block is executed.

We first update tempHighest to 3 (which is the length of longest). Next, we use longest.index(s[i]) to find the index of 'c' in longest. As the resulting substring cannot have two 'c's in it, we need to remove 'c' (and any letter before it) from longest.

longest.index(s[i]) gives us the value 1 as 'c' is at index 1.

The slice notation [longest.index(s[i])+1: ] thus becomes [2: ], which means we want to get elements from index 2 up to the end of longest. That gives us the string 'a'.

We then update the value of longest to this new substring and append 'c' to the end of it.

longest thus becomes 'ac'.

This process is repeated for the rest of the letters in 'abcacefgh'.

As 'e', 'f', 'g' and 'h' are not in longest, they get appended to longest.

By the time we reach the end of s, longest = 'acefgh'.

Once we finish iterating through s, we return either the value of tempHighest (which is 3 at the moment) or the length of longest (which is 6), depending on which is greater.

In our example, as 6 is greater than 3, we return 6.

Written by Jamie | Last Updated January 20, 2020

Recent Posts