strStr() is a function that takes two strings as input, haystack and needle, and returns the index of the first occurrence of needle in haystack, or -1 if needle is not present in haystack.

For example, if haystack = "hello world" and needle = "world", then the function should return 6, because "world" first appears at index 6 in "hello world". If needle = "goodbye", then the function should return -1, because "goodbye" does not appear in "hello world".

This problem is often used in programming interviews because it requires the use of string manipulation techniques and searching algorithms. There are many ways to solve this problem, but some common approaches include using a sliding window or using built-in string functions in the programming language being used.

Implement strStr() Solutions

Solution 1: Brute Force

One straightforward solution to the strStr() problem is to use a brute force approach, which involves checking each possible substring of haystack that has the same length as needle. If a substring is found that matches needle, then its starting index in haystack is returned. If no matching substring is found, then -1 is returned. Here’s some pseudocode to illustrate this approach:

function strStr(haystack, needle):
    n = length(haystack)
    m = length(needle)
    for i = 0 to n - m:
        if haystack[i:i+m] == needle:
            return i
    return -1

The time complexity of this solution is O((n – m + 1) * m), where n is the length of haystack and m is the length of needle. This is because we are checking each substring of length m in haystack, and there are n - m + 1 such substrings. Comparing two strings of length m takes O(m) time. Therefore, the overall time complexity of this solution is O(nm).

Solution 2: Knuth-Morris-Pratt (KMP) Algorithm

The KMP algorithm is a more advanced solution to the strStr() problem that uses a pre-processing step to build a pattern-matching table based on the needle string. This table allows us to skip certain characters in the haystack string during the matching process, which improves performance. Here’s some pseudocode to illustrate this approach:

function strStr(haystack, needle):
    n = length(haystack)
    m = length(needle)
    if m == 0:
        return 0
    prefix = compute_prefix(needle)
    j = 0
    for i = 0 to n - 1:
        while j > 0 and haystack[i] != needle[j]:
            j = prefix[j-1]
        if haystack[i] == needle[j]:
            j += 1
        if j == m:
            return i - m + 1
    return -1

function compute_prefix(needle):
    m = length(needle)
    prefix = [0] * m
    j = 0
    for i = 1 to m - 1:
        while j > 0 and needle[i] != needle[j]:
            j = prefix[j-1]
        if needle[i] == needle[j]:
            j += 1
        prefix[i] = j
    return prefix

The time complexity of this solution is O(n + m), where n is the length of haystack and m is the length of needle. The pre-processing step takes O(m) time to build the prefix table, and the main loop of the algorithm takes O(n) time to scan through the haystack string. The inner while loop in the main loop executes at most m times, so it does not contribute significantly to the overall time complexity.

See also  Longest Palindromic Substring

Solution 3: Rabin-Karp Algorithm

The Rabin-Karp algorithm is another advanced solution to the strStr() problem that uses a rolling hash function to compute the hash values of substrings of haystack and needle. This allows us to compare substrings more efficiently, because we can first compare their hash values before comparing the actual strings. Here’s some pseudocode to illustrate this approach:

function strStr(haystack, needle):
    n = length(haystack)
    m = length(needle)
    if m == 0:
        return 0
    h = pow(26, m-1) % modulus
    needle_hash = compute_hash(needle, h)
    haystack_hash = compute_hash(haystack[:m], h)
    for i = 0 to n - m:
        if needle_hash == haystack_hash and needle == haystack[i:i+m]:
            return i
        if i < n - m:
            haystack_hash = recalculate_hash(haystack[i], haystack[i+m], haystack_hash, h)
    return -1

function compute_hash(string, h):
    hash_value = 0
    for c in string:
        hash_value = (hash_value * 26 + ord(c) - ord('a')) % modulus
    return hash_value

function recalculate_hash(old_char, new_char, old_hash, h):
    new_hash = (old_hash - (ord(old_char) - ord('a')) * h) % modulus
    new_hash = (new_hash * 26 + ord(new_char) - ord('a')) % modulus
    return new_hash

The time complexity of this solution is O(n + m), where n is the length of haystack and m is the length of needle. The algorithm computes the hash values of needle and the first substring of length m in haystack in O(m) time, and then scans through haystack one character at a time, updating the hash value of each substring in O(1) time using the rolling hash function. The comparison of substrings takes O(m) time, but is only performed when their hash values match.

The Rabin-Karp algorithm has the same time complexity as the KMP algorithm, but maybe faster in practice for certain inputs because it can avoid some comparisons by using hash values. However, it requires more memory than the KMP algorithm to store the hash values and modulus value.

See also  Leetcode 463 Island Perimeter Solved

Which approach is better and why?

It depends on the specific use case and input data. The brute force solution has the worst time complexity and may be too slow for large inputs. The KMP algorithm has a better worst-case time complexity than the brute force solution, but requires extra memory to store the prefix table.

The Rabin-Karp algorithm has the same time complexity as the KMP algorithm, but may be faster in practice for certain inputs. Therefore, it’s important to consider the specific constraints and requirements of the problem when choosing a solution.

The KMP algorithm is a good solution to the strStr() problem because it has a better worst-case time complexity than the brute force solution, and it is relatively easy to understand and implement.

Implement strStr() in Python using KMP Algorithm:

def strStr(haystack: str, needle: str) -> int:
    n = len(haystack)
    m = len(needle)
    if m == 0:
        return 0
    prefix = compute_prefix(needle)
    j = 0
    for i in range(n):
        while j > 0 and haystack[i] != needle[j]:
            j = prefix[j-1]
        if haystack[i] == needle[j]:
            j += 1
        if j == m:
            return i - m + 1
    return -1

def compute_prefix(needle: str) -> List[int]:
    m = len(needle)
    prefix = [0] * m
    j = 0
    for i in range(1, m):
        while j > 0 and needle[i] != needle[j]:
            j = prefix[j-1]
        if needle[i] == needle[j]:
            j += 1
        prefix[i] = j
    return prefix

Implement strStr() in Ruby using KMP Algorithm

def str_str(haystack, needle)
    n = haystack.length
    m = needle.length
    return 0 if m == 0
    prefix = compute_prefix(needle)
    j = 0
    n.times do |i|
        while j > 0 && haystack[i] != needle[j]
            j = prefix[j-1]
        end
        if haystack[i] == needle[j]
            j += 1
        end
        if j == m
            return i - m + 1
        end
    end
    return -1
end

def compute_prefix(needle)
    m = needle.length
    prefix = Array.new(m, 0)
    j = 0
    (1...m).each do |i|
        while j > 0 && needle[i] != needle[j]
            j = prefix[j-1]
        end
        if needle[i] == needle[j]
            j += 1
        end
        prefix[i] = j
    end
    return prefix
end

Implement strStr() in JavaScript using KMP Algorithm

function strStr(haystack, needle) {
    const n = haystack.length;
    const m = needle.length;
    if (m == 0) {
        return 0;
    }
    const prefix = computePrefix(needle);
    let j = 0;
    for (let i = 0; i < n; i++) {
        while (j > 0 && haystack[i] != needle[j]) {
            j = prefix[j-1];
        }
        if (haystack[i] == needle[j]) {
            j++;
        }
        if (j == m) {
            return i - m + 1;
        }
    }
    return -1;
}

function computePrefix(needle) {
    const m = needle.length;
    const prefix = Array(m).fill(0);
    let j = 0;
    for (let i = 1; i < m; i++) {
        while (j > 0 && needle[i] != needle[j]) {
            j = prefix[j-1];
        }
        if (needle[i] == needle[j]) {
            j++;
        }
        prefix[i] = j;
    }
    return prefix;
}

Implement strStr() in Java using KMP Algorithm

public int strStr(String haystack, String needle) {
    int n = haystack.length();
    int m = needle.length();
    if (m == 0) {
        return 0;
    }
    int[] prefix = computePrefix(needle);
    int j = 0;
    for (int i = 0; i < n; i++) {
        while (j > 0 && haystack.charAt(i) != needle.charAt(j)) {
            j = prefix[j-1];
        }
        if (haystack.charAt(i) == needle.charAt(j)) {
            j++;
        }
        if (j == m) {
            return i - m + 1;
        }
    }
    return -1;
}

public int[] computePrefix(String needle) {
    int m = needle.length();
    int[] prefix = new int[m];
    int j = 0;
    for (int i = 1; i < m; i++) {
        while (j > 0 && needle.charAt(i) != needle.charAt(j)) {
            j = prefix[j-1];
        }
        if (needle.charAt(i) == needle.charAt(j)) {
            j++;
        }
        prefix[i] = j;
    }
    return prefix;
}

In each of these implementations, we first compute the prefix table using the compute_prefix() function, which takes O(m) time. Then, we scan through the haystack string and try to match needle using the prefix table, which takes O(n) time in the worst case. Therefore, the overall time complexity of the algorithm is O(m + n).