This article is participating in the “Java Theme Month – Java Brush questions punch card”, see the activity link for details

Topic describes

This is LeetCode 30. String all the words together, Hard.

Given a string s and some words of the same length.

Find the starting position of the substring in S that happens to be formed by concatenating all the words in words.

Note that the substring must match exactly the words in words, without any other characters in the middle, but you do not need to consider the sequence in which words are concatenated.

Example 1:

Input: s = "barfoothefoobarman", words = ["foo","bar"] Output: [0,9] Explanation: Substrings starting from index 0 and 9 are "barfoo" and "foobar", respectively. The order of the outputs is not important, [9,0] is also a valid answer.Copy the code

Example 2:

Input: s = "wordgoodGoodGoodBestWord ", words = ["word","good","best","word"] output: []Copy the code

Naive hash table

Let n be the length of the string s, m the length of the array words (the number of words), and w the length of a single word.

Since each word in words has a fixed length, and the string we are looking for can only contain exactly all words, the length of all the target substrings we are looking for is M ∗wm * wm∗ W.

So an intuitive way to think about it is:

  1. Use hash tablesmaprecordwordsThe number of occurrences of each word in
  2. The enumerationsEach character in the
    m w m * w
    The substringsub
  3. Use hash tablescur statisticalsubNumber of occurrences per word (everywLength as a word)
  4. To comparecurmapWhether or not the same

Note: In Step 3, if you find that sub contains words that do not appear, you can prune them directly.

The prune uses a labeled continue statement directly back to the outer loop.

Code:

class Solution {
    public List<Integer> findSubstring(String s, String[] words) {
        List<Integer> ans = new ArrayList<>();
        if (words.length == 0) return ans;

        int n = s.length(), m = words.length, w = words[0].length();

        Map<String, Integer> map = new HashMap<>();
        for (String word : words) {
            map.put(word, map.getOrDefault(word, 0) + 1);
        }

        out:for (int i = 0; i + m * w <= n; i++) {
            Map<String, Integer> cur = new HashMap<>();
            String sub = s.substring(i, i + m * w);
            for (int j = 0; j < sub.length(); j += w) {
                String item = sub.substring(j, j + w);
                if(! map.containsKey(item))continue out;
                cur.put(item, cur.getOrDefault(item, 0) + 1);
            }
            if (cmp(cur, map)) ans.add(i);
        }

        return ans;
    }
    boolean cmp(Map<String, Integer> m1, Map<String, Integer> m2) {
        if(m1.size() ! = m2.size())return false;
        for (String k1 : m1.keySet()) {
            if(! m2.containsKey(k1) || ! m1.get(k1).equals(m2.get(k1)))return false;
        }
        for (String k2 : m2.keySet()) {
            if(! m1.containsKey(k2) || ! m1.get(k2).equals(m2.get(k2)))return false;
        }
        return true; }}Copy the code
  • Time complexity: YeswordsThe words in are stored in a hash table with the complexity of
    O ( m ) O(m)
    ; Then the first level loops through the enumerationsEach character in is the starting point and the complexity is
    O ( n ) O(n)
    ; In the loopsubDivided intomThree words were counted and enumeratedm - 1Zero subscripts, and the complexity is zero
    O ( m ) O(m)
    ; The length of each string isw. The overall complexity is
    O ( n m w ) O(n * m * w)
  • Space complexity: O(m∗ W)O(M * W)O(m∗w)

Sliding window & hash table

In fact, we can optimize the process of enumerating the starting points.

We can classify the starting points according to the remainder of the current subscript and word length, so that we do not need to create new hash tables and word statistics frequently.

Code:

class Solution {
    public List<Integer> findSubstring(String s, String[] words) {
        List<Integer> ans = new ArrayList<>();
        if (words.length == 0) return ans;

        int n = s.length(), m = words.length, w = words[0].length();

        // Count the number of occurrences of each target word in words
        Map<String, Integer> map = new HashMap<>();
        for (String word : words) {
            map.put(word, map.getOrDefault(word, 0) + 1);
        }

        for (int i = 0; i < w; i++) {
            // Build a map of the current substring and count the occurrence times of "each target word" in the current substring
            Map<String, Integer> curMap = new HashMap<>();
            // The sliding window size is fixed to m * w
            // Each time the next word is added to the cur, the previous word is moved out
            for (int j = i; j + w <= n; j += w) {   
                String cur = s.substring(j, j + w);
                if (j >= i + (m * w)) {
                    int idx = j - m * w;
                    String prev = s.substring(idx, idx + w);
                    if (curMap.get(prev) == 1) {
                        curMap.remove(prev);
                    } else {
                        curMap.put(prev, curMap.get(prev) - 1);
                    }
                }
                curMap.put(cur, curMap.getOrDefault(cur, 0) + 1);
                // If the map of the current substring is the same as the map of words, it means that the current substring contains "all target words", and the start subscript is the result set
                if (map.containsKey(cur) && curMap.get(cur).equals(map.get(cur)) && cmp(map, curMap)) {
                    ans.add(j - (m - 1) * w); }}}return ans;
    }
    // Compare two maps to see if they are the same
    boolean cmp(Map<String, Integer> m1, Map<String, Integer> m2) {
        if(m1.size() ! = m2.size())return false;
        for (String k1 : m1.keySet()) {
            if(! m2.containsKey(k1) || ! m1.get(k1).equals(m2.get(k1)))return false;
        }
        for (String k2 : m2.keySet()) {
            if(! m1.containsKey(k2) || ! m1.get(k2).equals(m2.get(k2)))return false;
        }
        return true; }}Copy the code
  • Time complexity: YeswordsThe words in are stored in a hash table with the complexity of
    O ( m ) O(m)
    ; It then enumerates the results of the mod, with the complexity of
    O ( w ) O(w)
    ; Maximum processing per loopnThe value is a string of length and the complexity is
    O ( n ) O(n)
    . The overall complexity is
    O ( m + w n ) O(m + w * n)
  • Space complexity: O(m∗ W)O(M * W)O(m∗w)

The last

This is the 30th article in our “Brush through LeetCode” series, which started on 2021/01/01. There are 1916 questions in LeetCode as of the start date, some of which have lock questions. We will finish all the questions without lock first.

In this series of articles, in addition to explaining how to solve the problem, I will also present the most concise code possible. If a general solution is involved, the corresponding code template will also be used.

In order to facilitate the students to debug and submit the code on the computer, I set up the relevant warehouse: github.com/SharingSour…

In the repository, you’ll see links to the series of articles, the corresponding code for the series of articles, the LeetCode source links, and other preferred solutions.