There are 2 sorted arrays A and B of size n each. Problem is to find the median of the array obtained after merging the above 2 arrays in time complexity of O(log(n)).

Since given two arrays A and B are sorted array, we can check whether an element A[i] is the median in constant time. Suppose that the median is A[i]. Since the array is sorted, it is greater than exactly i − 1 values in array A. Then if it is the median, it is also greater than exactly j =  (n) − (i − 1) elements in B (since middle elment, n = (i-1) + j). This also implies that B[j] ≤ A[i] ≤ B[j + 1].

If A[i] is not the median, then depending on whether A[i] is greater or less than B[j] and B[j + 1], you know that A[i] is either greater than or less than the median. Thus we can binary search for A[i] in θ(lg n) worst-case time.

Algorithm to find the median

  1. Calculate the medians m1 and m2 of the input arrays A
    and B respectively.
  2. If m1 and m2 both are equal then we are done, return m1 (or m2)
  3. If m1 is greater than m2, then median is present in one
    of the below two subarrays.

    1. From first element of A to m1
    2. From m2 to last element of B
  4. If m2 is greater than m1, then median is present in one
    of the below two subarrays.

    1. From m1 to last element of A
    2. From first element of B to m2
  5. Repeat the above process until size of both the subarrays
    becomes 2.
  6. If size of the two arrays is 2 then use below formula to get
    the median.
    Median = (max(A[0], B[0]) + min(A[1], B[1]))/2

The key to understanding this algorithm is to look at what you’re eliminating at each step. If m1 > m2, we know that m1 is larger than or equal to the first half of A, also m1 is greater than half the elements in the merged array. It doesn’t tell us where in relation to the merged median m1 is though. All we really know about the relation between A and the merged median is that we can eliminate everything greater than m1 (and less than m2 from ar2). The median of the merged list is somewhere in what remains.

Let us call the median of the merged array m. Since the first halves of A and B come before m1. We have that m1 => m, this means that no values larger than m1 need to be considered in A. So we only need to look in the first half of A. Similarly, since the second halves of A and B come after m1, we have that m2 <= m, this means that no values smaller than m2 need to be considered and we only need to look in the second half of B.