Explain Maximum subarray problem. How it is resolved by divide and conquer algorithm. Also analyze your algorithm.

276views

written 3.1 years ago by

binitamayekar ★ 6.7k

• modified 3.1 years ago

Maximum Subarray Problem

The Maximum Subarray Problem finds out the series of contiguous elements with the maximum sum in any given array.
If all the elements in an array are positive then it is easy, to find the sum of all the elements of the array and it has the largest sum over any other subarrays you can make out from that array.
But the problem gets more tricky when some of the elements are negative then the subarray whose sum of the elements is largest over the sum of the elements of any other subarrays of that element can place anywhere in the array.
Lets's see examples this will give more clarity about the maximum subarray problem.

1] Input - Array [] = [3, 5, 1, 7, 9]

Output - 25

All array elements are non-negative. So the maximum subarray sum would be the sum of the entire array.

2] Input - Array [] = [2, -4, 1, 9, -6, 7, -3]

Output - 11

The subarray [1, 9, -6, 7] has the maximum sum.

3] Input - Array [] = [-4, 5, 7, -6, 10, -15, 3]

Output - 16

The subarray [5, 7, -6, 10] has the maximum sum.

How Divide-and-Conquer Resolve this Maximum Subarray Problem

Suppose we want to calculate the maximum sub-array sum of the array X[l … r], where l and r are the left and right end of the input array.
Let's divide the array into two equal subarrays by calculating the mid-index: X[l ... mid] and X [mid + 1 ... r]. Then the sub-array X[i … j] with maximum sum must lie in one of the following places:
- Entirely in the left sub-array X[l…mid], where l ≤ i ≤ j ≤ mid.
- Entirely in the right sub-array X[mid+1…r], where mid+1 ≤ i ≤ j ≤ r
- Subarray crossing the mid-point, so that low ≤ i ≤ mid ≤ j ≤ r. This includes some rightmost elements of the left subarray and some leftmost elements of the right subarray.
Then recursively find the maximum sub-array sum of X[l…mid] and X[mid + 1…r] because these two sub-problems are smaller instances of the same problem of finding a maximum sub-array sum.

int leftMaxSum = findMaxSubarraySum (X, l, mid)
int rightMaxSum = findMaxSubarraySum (X, mid + 1, r)

Then calculate the maximum sub-array that crosses the mid-point and takes the maximum of all three possibilities.

int crossingMaxSum = maxCrossingSum(X, l, mid, r)
return max(leftMaxSum, rightMaxSum, crossingMaxSum)

Base case:

when l == r, there is only one element in the array, and we can directly return the max subarray sum as X[l] or X[r].
This would be the smallest version of the problem, where recursion will return the value directly.

Pseudocode:

int getMaxSubarraySum (int X[], int l, int r)
{
    if (l == r)
        return X[l]
    else
    {
        int mid = l + (r - l)/2
        int leftMaxSum = getMaxSubarraySum (X, l, mid)
        int rightMaxSum = getMaxSubarraySum (X, mid+1, r)
        int crossingMaxSum = maxCrossingSum(X, l, mid, r)
        return max (leftMaxSum, rightMaxSum, crossingMaxSum)
   }
}

Main Idea behind the Solution

Sub-array crossing the mid-point comprises two sub-arrays: X[i…mid] and X[mid + 1…j].
So we need to find the max sub-array sum from X[i…mid] and X[mid + 1…j] and then combine them.

The idea works as follows:

Step 1: find the maximum contiguous sum of the left half X[l…mid].

Initialize a variable sum to store the continuous sum from mid to some index i.
Initialize a variable maxLeftSum to store the max sum found so far.
Since this subarray must contain X[mid].
So run a loop starting from index i = mid to l, calculate the contiguous sum and store it in the variable sum.
Whenever sum > maxLeftSum, we update maxLeftSum with sum.

int sum = 0
int maxLeftSum = INT_MIN
for(int i = mid; i <= l; i = i - 1)
{
  sum = sum + X[i]
  if (sum > maxLeftSum)            
      maxLeftSum = sum
}

Step 2: find the maximum contiguous sum of the right half X[mid + 1 ... r].

Use the same variable sum and initialize it with 0.
Initialize a variable maxRightSum to store the max sum found so far.
Now run a loop starting from the index j = mid + 1 to r, to calculate the continuous sum and store it in the variable sum.
Whenever sum > maxRightSum, update maxRightSum with sum.

sum = 0
int maxRightSum = INT_MIN
for(int i = mid + 1; i <= r; i = i + 1)
{
  sum = sum + X[i]
  if (sum > maxRightSum)
      maxRightSum = sum
}

Step 3: Finally, to get the maximum contiguous subarray crossing the mid, we return the sum of variables maxLeftSum and maxRightSum.

Complete Pseudocode to find max subarray crossing the mid:

int maxCrossingSum (int X[], int l, int mid, int r)
{
    int sum = 0
    int maxLeftSum = INT_MIN
    for(int i = mid; i <= l; i = i - 1)
    {
        sum = sum + X[i]
        if (sum > maxLeftSum)
            maxLeftSum = sum
    }

    sum = 0
    int maxRightSum = INT_MIN
    for(i = mid + 1; i <= r; i = i + 1)
    {
        sum = sum + X[i]
        if (sum > maxRightSum)
            maxRightSum = sum
    }
    return (maxLeftSum + maxRightSum)
}

Time-Complexity:

Let's assume that the input size n is a power of 2, and $T(n)$ is the time complexity of finding the max subarray.
To calculate the overall time complexity, add the time complexities of the divide, conquer, and combine parts.
- For the base case (n = 1), the algorithm takes constant time. $T(1) = O(1)$ . The recursive case occurs when n > 1.
- The time complexity of the Divide part = $O(1)$ because there is only a need to calculate the mid to divide the array into two halves.
- The input size of each subproblem is n/2, need to spend $T(n/2)$ time solving each. Here, it solves two subproblems of size n/2, therefore the overall time complexity of the Conquer part = $2T(n/2)$ .
- To find the max subarray crossing the mid-value, run two separate single loops in the opposite direction. In the worst case, each loop runs n/2 times. Hence, the time complexity of the Combined part = $O(n)$ .

The General Recurrence Relation:

$T(n) = O(1)$ , if n = 1

$T(n) = 2 T(n/2) + O(n)$ , if n > 1

Therefore, overall Time Complexity = $O(nlogn)$

Space Complexity:

Space complexity is equal to the size of the recursion call stack, which depends on the height of the recursion tree.
At each stage, the input size decreases by 2, so the height of the recursion tree will be $O(logn)$ .
Therefore, Space Complexity = $O(logn)$ .

ADD COMMENT EDIT