This series is my summary of data structure and algorithm when I was looking for a job many years ago. There are basic parts and classic interview questions of various companies. It was first published on CSDN. Is sorted out for a series of friends in need of reference, if there is a mistake, welcome to correct. The full code address for this series is here.

0 overview

Binary search is a simple algorithm in itself, but because of its simplicity, it is more prone to writing errors. Even in the early days of binary lookup algorithms, there were bugs (spillover bugs) that weren’t fixed until decades later (see Programming Pearls). This paper intends to summarize the binary search algorithm and analyze and summarize the problems derived from binary search. If there are any mistakes, please correct them. The full code for this article is here.

1 binary search basis

I believe we all know the basic algorithm of binary search, as shown below, this is the binary search algorithm code:

Int l = 0, u = n-1; int l = 0, u = n-1;while(l <= u) { int m = l + (u - l) / 2; // Same as (l+u) / 2, here to prevent overflowif (t > a[m])
            l = m + 1;
        else if (t < a[m])
            u = m - 1;
        else
            return m;
    }
    return -(l+1);
}
Copy the code

The idea is to start in the middle of the array and eliminate half of the data at a time in order (lgN). It depends on the fact that the array is ordered. If t exists in an array, return t’s position in the array; Otherwise, it does not exist and returns -(l+1).

Now, I need to explain why instead of returning -1 when t is not in the array, I return -(l+1). First we can look at the value of L, and if the lookup fails, l is exactly where T should be inserted into the array.

For example, given that the ordered array a={1, 3, 4, 7, 8}, then if t=0, it is clear that t is not in the array, the binary search algorithm will eventually make l=0 > u=-1 exit the loop; If t=9, then t is also not in the array, and finally l=5 > u=4 exits the loop. If t=5, the loop ends with l=3 > u=2. Therefore, in some algorithms, such as DHT (consistent hashing), this return value is needed to insert the newly added node into the appropriate position, and it is also used in the NlgN algorithm for the longest increasing subsequence, see the post on longest increasing subsequence algorithm.

The other little point is that the reason why you return -(l+1) instead of just returning -l is because l might be 0, and if you return -l you can’t tell if you return 0 normally or if you don’t get it.

2 Locate the first occurrence of a number in an ordered array

Now consider a slightly more complicated problem. If there are repeated numbers in an ordered array, such as the array a={1, 2, 3, 3, 5, 7, 8}, you need to find where 3 first occurs. This is where 3 first appears at position 2. This problem has a good analysis in chapter 9 of Programming Abet, which is directly used here. The essence of the algorithm lies in the clever design of the loop invariant, the code is as follows:

/** ** binarySearchFirst(int a[], int n, int t) {int l = -1, u = n;while(l + 1 ! U) = {/ * loop invariant a [l] < t < = a [u] && l < u * / int m = l + (u - l) / 2; // Same as (l+u) / 2if (t > a[m])
            l = m;
        else
            u = m;
    }
    /*assert: l+1=u && a[l]<t<=a[u]*/
    int p = u;
    if(p>=n || a[p]! =t) p = -1;return p;
}
Copy the code

Algorithm analysis: Set two nonexistent elements A [-1] and a[n], so that a[-1] < t <= a[n], but we will not access these two elements, because (L +u)/2 > L =-1, (L +u)/2 < u=n. The cyclic invariant is la[l] &&t <=a[u]. The loop must exit with l+1=u and a[l] < t <= a[u]. The value of u after the loop exits is the possible position of t, and its range is [0, n]. If t is in the array, the first position of t is p=u, and if not, p=-1 is set to return. This algorithm solves more complex problems, but it is more efficient than the original version of binary search because it requires only one comparison per loop, compared with two comparisons that the previous program typically required.

For example, if a={1, 2, 3, 3, 5, 7, 8}, if t=3, we can get p=u=2, if t=4, a[3]

=n, such as t=9, u=7, and p=-1. In particular, l=-1 and u=n cannot be written as l=0 and u=n-1. Although these two values are not accessible, changing them to the latter will cause binary lookup to fail and the first number will not be accessed. If a={1, 2, 3, 4, 5} is used to search for 1, the initial setting of l=0 and u=n-1 will cause the search to fail.

What if you want to find the last position of a number in an array? In fact, this is similar to the above algorithm, slightly change the above algorithm can be, the code is as follows:

/** * binarySearchLast(int a[], int n, int t) {int l = -1, u = n;while(l + 1 ! U) = {/ * loop invariant, a [l] < = t < a [u] * / int m = l + (u - l) / 2;if (t >= a[m])
            l = m;
        else
            u = m;
    }
    /*assert: l+1 = u && a[l] <= t < a[u]*/
    int p = l;
    if(p<=-1 || a[p]! =t) p = -1;return p;
}
Copy the code

Of course, there is also a way to query the first and last occurrence of the number of code in a program, just need to modify the original binary lookup slightly, the code is as follows:

/** * int searchFirstandLast (int a[], int n, int t, int firstFlag) {int l = 0; int u = n - 1;while(l <= u) {
        int m = l + (u - l) / 2;
        if(a[m] == t) {// if (a[m] == t)if(firstFlag) {// Query the location of the first occurrenceif(m ! = 0 && a[m-1] ! = t)return m;
                else if(m == 0)
                    return 0;
                else
                    u = m - 1;
            } else{// Query the location of the last occurrenceif(m ! = n-1 && a[m+1] ! = t)return m;
                else if(m == n-1)
                    return n-1;
                elsel = m + 1; }}else if(a[m] < t)
            l = m + 1;
        else
            u = m - 1;
    }

    return- 1; }Copy the code

3 Rotate array elements to find problems

The title

Moving the first elements of an ordered array to the end of the array is called rotation. For example, array {3, 4, 5, 1, 2} is a rotation of {1, 2, 3, 4, 5}. Now, given the rotated array and a number, we don’t know how many bits we rotated, we’re asked to come up with an algorithm that calculates the index of the given number in the array, and returns -1 if the number is not found. The number of searches cannot exceed N.

Analysis of the

As you can see from the problem, the rotated array is completely unordered, but its front and back parts are partially ordered. Binary lookup can therefore be used to solve the problem.

Solution 1: two binary searches

First determine the array split point, that is, the array on both sides of the split point is ordered. For example, the array is split at position 2, with the first part {3,4,5} in order and the second part {1,2} in order. Then use binary search for both parts. The code is as follows:

*/ int binarySearchRotateTwice(int a[], int n, int t) {int p = findRotatePosition(a, n); // Find the rotation positionif (p == -1)
        returnbinarySearchFirst(a, n, t); Int left = binarySearchFirst(a, p+1, t); // Find the left halfif (left != -1)
        returnleft; Int right = binarySearchFirst(a+p+1, n-p-1, t); // If the left part is not found, find the right partif (right == -1)
        return- 1;returnright+p+1; Int findRotatePosition(int a[], int n) {int I; // findRotatePosition(int a[], int n) {int I;for (i = 0; i < n-1; i++) {
        if (a[i+1] < a[i])
            return i;
    }
    return- 1; }Copy the code

Solution 2: a binary search

Binary search algorithm has two key points: 1) array order; 2) Determine whether the next binary search should be carried out in the first half of the interval or the second half of the interval according to the size relationship between the middle element of the current interval and T.

After careful analysis of the problem, it can be found that at least one to the left ([l, m]) and one to the right ([m, u]) of M is ordered each time m is solved in terms of L and u. A [m] is compared with a[L] and a[u] respectively to determine which segment is ordered.

  • If the left-hand side is ordered, ift<a[m] && t>a[l],u=m-1; In other cases,l =m+1;
  • If the right-hand side is ordered, ift> a[m] && t<a[u]l=m+1; In other cases,u =m-1; The code is as follows:
/** */ int binarySearchRotateOnce(int a[], int n, int t) {int l = 0, u = n-1;while (l <= u) {
        int m = l + (u-l) / 2;
        if (t == a[m])
            return m;
        if(a[m] >= a[l]) {// Array left semi-orderedif (t >= a[l] && t < a[m])
                u = m - 1;
            else
                l = m + 1;
        } else{// The right half of the array is sortedif (t > a[m] && t <= a[u])
                l = m + 1;
            elseu = m - 1; }}return- 1; }Copy the code

The resources

  • Programming Abecas 2nd edition Chapter 9
  • Rotate the binary lookup of the array