Introduction to the

Bucket Sort, also known as box Sort, is the main idea: the elements in the set to be sorted in the same value domain are stored in the same Bucket, that is, the set is divided into multiple regions according to the element value characteristics, then the buckets formed after splitting are in an ordered state from the value domain. If the elements in each bucket are sorted, the set of elements in all buckets is sorted.

Bucket sort is an extended version of counting sort, which can be thought of as each bucket stores the same elements, whereas each bucket stores a range of elements. In bucket sorting, ensure that the elements are evenly distributed. Otherwise, bucket sorting fails when all data is concentrated in the same bucket.

Algorithm implementation steps

  1. According to the range of difference between the largest element and the smallest element in the set to be sorted and the mapping rule, the number of buckets to be applied is determined.
  2. Iterate through the sorting sequence, placing each element in the corresponding bucket;
  3. Sort buckets that are not empty;
  4. Access buckets in sequence, and put the elements in the bucket back to the corresponding positions in the original sequence to complete the sorting.

Python code implementation

# bucket_sort code implementation

from typing import List

def bucket_sort(arr:List[int]):
    """ Bucket sort ""
    min_num = min(arr)
    max_num = max(arr)
    # Bucket size
    bucket_range = (max_num-min_num) / len(arr)
    # bucket array
    count_list = [ [] for i in range(len(arr) + 1)]
    Fill in the bucket array
    for i in arr:
        count_list[int((i-min_num)//bucket_range)].append(i)
    arr.clear()
    In this case, we call sorted directly for bucket sorting
    for i in count_list:
        for j in sorted(i):
            arr.append(j)
Copy the code
# Test data

if __name__ == '__main__':
    import random
    random.seed(54)
    arr = [random.randint(0.100) for _ in range(10)]
    print("Raw data:", arr)
    bucket_sort(arr)
    print("Bucket sort result:", arr)
Copy the code
# output resultRaw data: [17.56.71.38.61.62.48.28.57.42] bucket sort result: [17.28.38.42.48.56.57.61.62.71]
Copy the code

The demo

Algorithm analysis

  • Time complexity

    Best case: the input sequence is sorted, and the time complexity of insertion sort is inIn the best case, the time complexity is.

    Worst case: the size of the sequence to be sorted is, which is divided intoOne bucket, need to be carried outSecond loop, put each element into the corresponding bucket;Second loop, sorting the data in each bucket (on average, each bucket hasElement).

    If the quicksort algorithm is used to sort, the average time complexity of a single sort is, the worst time complexity is. The bucket sorting process is inserted in the form of linked list, so the time complexity of the whole bucket sorting is:


    whenIs, the time complexity is the lowest; whenIs, the time complexity is the highest.

    So the worst time complexity is:.

    Average: The average time complexity is.

  • Spatial complexity

    Buckets need to be created during bucket sortingExtra space for a bucket, as wellSo the space complexity of bucket sort is.

  • The stability of

    The stability of bucket sorting depends on the algorithm used in bucket sorting. If it is a queue, it can ensure that the relative positions of the same elements taken out and put back remain the same, that is, the sorting is stable, but if it is implemented by stack, the sorting must be unstable. Bucket sorting is a stable sorting algorithm because bucket sorting can be stable.

  • Comprehensive evaluation of

    Time complexity (average) Time complexity (best) Time complexity (worst) Spatial complexity The sorting way The stability of
    out-place stable

Contact us

Personal blog: www.bling2.cn/

Github address: github.com/lb971216008…

Zhihu column: zhuanlan.zhihu.com/Use-Python-…

Small column: xiaozhuanlan.com/Use-Python-…

Blog Park: www.cnblogs.com/Use-Python-…