Introduction

Binary Search is quite easy to understand conceptually. Basically, it splits the search space into two halves and only keeps the half that probably has the search target and throws away the other half that would not possibly have the answer. In this manner, we reduce the search space to half the size at every step, until we find the target. Binary Search helps us reduce the search time from linear O(n) to logarithmic O(log n). But when it comes to implementation, it's rather difficult to write a bug-free code in just a few minutes.

Some of the most common problems include:

A rather common misunderstanding of binary search is that people often think this technique could only be used in simple scenarios like "Given a sorted array, find a specific value in it". As a matter of fact, it can be applied to much more complicated situations.

Most Generalized Binary Search

Suppose we have a search space. It could be an array, a range, etc. Usually, it's sorted in ascending order. For most tasks, we can transform the requirement into the following generalized form:

Minimize k, s.t. condition(k) is True

The following code is the most generalized binary search template:

int condition(vector<int> arr, int mid){
             //write the condition here   
}
int binary_search(vector<int> arr){
    int left = min(search_space);
    int right = max(search_space);// could be [0,n], [1,n], etc
    while(left<right){
        mid = (left + right)/2;
        if(condition(arr,mid))right = mid;
        else left = mid +1;
    }
    return left;
}

What's really nice of this template is that, for most of the binary search problems, we only need to modify three parts after copy-pasting this template, and never need to worry about corner cases and bugs in code anymore:

Below I'll show you guys how to apply this powerful template to many LeetCode problems.

Basic Application

  1. Suppose you are a product manager and currently leading a team to develop a new product. Since each version is developed based on the previous version, all the versions after a bad version are also bad. Suppose you have n versions [1, 2, ..., n] and you want to find out the first bad one, which causes all the following ones to be bad. You are given an API bool isBadVersion(version) which will return whether the version is bad.

    Example:

    Given n = 5, and version = 4 is the first bad version.
    
    call isBadVersion(3) -> false
    call isBadVersion(5) -> true
    call isBadVersion(4) -> true
    
    Then 4 is the first bad version.
    
    

    First, we initialize left = 1 and right = n to include all possible values. Then we notice that we don't even need to design the condition function. It's already given by the isBadVersion API. Finding the first bad version is equivalent to finding the minimal k satisfying isBadVersion(k) is True. Our template can fit in very nicely:

    
    int firstBadVersion(vector<int> arr, int n){
        int left = 1;
        int right = n;
        while(left<right){
            mid = (left + right)/2;
            if(isBadVersion(arr,mid))right = mid;
            else left = mid +1;
        }
        return left;
    }