shifted array search

🏠

This is an interesting algorithm i very recently encountered. Finding the first element in a sorted and shifted array. Let's begin with a definitive refresher on binary search.

Now let's think of all the configurations:

11 2 3 4 5 6 7 8
28 1 2 3 4 5 6 7
37 8 1 2 3 4 5 6
46 7 8 1 2 3 4 5
55 6 7 8 1 2 3 4
64 5 6 7 8 1 2 3
73 4 5 6 7 8 1 2
82 3 4 5 6 7 8 1

What's the crucial pattern above, which we can look for? Spotted it? There's only one item in the list with a neighbour greater than itself, or, vice versa, only 1 item who's right neighbour is smaller than itself.

iterative

 1def find_smallest_in_shifted_array(A):
 2  l = 0
 3  r = len(A)-1
 4  while l <= r:
 5    m = (l + r) // 2
 6    if A[m-1] > A[m]:
 7      return m
 8    if A[m] < A[r]:
 9      r = m - 1
10    else:
11      l = m + 1
12    
13  return 0
14
15print(find_smallest_in_shifted_array([*range(5, 10)]+[*range(0, 5)]))

The key is A[m-1] > A[m] because since the data is sorted, the rule for sorted data remains: an entry is always greater than its left neighbour. That's just how sorted data works.

The only time this is no longer true, is when the element being considered is the first one. Because we get a jump. Look carefully at the values above, the only time the left element is greater than the right, is at ... 8 1 ... at this remains true for all configurations.

The second key is if A[m] < A[r] then we move the right pointer to the left. This also makes intuitive sense: if the right pointer is greater than the middle, then surely the smallest number is to the left. Check all the configurations, to see if you can convince yourself this rule remains true throughout.

recursive

 1def find_smallest_in_shifted_array(A, l=0, r=-1):
 2  if r == -1:
 3    r = len(A)-1
 4  m = (l + r) // 2
 5  if A[m-1] > A[m]:
 6    return m
 7  if A[r] < A[m]:
 8    return find_smallest_in_shifted_array(A, m+1, r)
 9  else:
10    return find_smallest_in_shifted_array(A, l, m-1)
11  return 0
12  
13print(find_smallest_in_shifted_array([*range(5, 10)]+[*range(0, 5)]))

See also:

binary search