Subarray Inversions

We have an array A of n integers that we’re planning on sorting. Specifically, we want to know how close the array is to sorted. In order to achieve this, we introduce the idea of an inversion. An inversion in array is a pair of two indices i and jsuch that i < j and A[i] > A[j]. If our array contains one inversion, we need only swap A[i]and A[j] in order to sort the array. An array that contains 0 inversions is by definition sorted.

Problem: Given an array A of n integers, find the sum of the number of inversions in all subarrays of length k. To clarify, one must determine the number of inversions in each of the n – k + 1 subarrays of length k and add them together.

Input: The first line contains two space separated integers n and k. The next line contains a sequence of n space separated integers where the ith integer in the sequence is A[i].

Examples:

Input : arr[] = {1 2 3 4 5 6}
        k = 2
Output : 0

Input : arr[] = {1 6 7 2}
        k = 3
Output : 2
There are two subarrays of size 3,
{1, 6, 7} and {6, 7, 2}. Count of
inversions in first subarray is 0
and count of inversions in second
subarray is 2. So sum is 0 + 2 = 2

Input :  8 4
12 3 14 8 15 1 4 22
Output : 14

Input :  10 5
15 51 44 44 76 50 29 88 48 50
Output : 25

[1] Naive Approach

This problem seems fairly trivial at first, and we can easily implement a naive algorithm to brute force the solution. We simply create a window of length k and roll the window along A, adding the number of inversions in the window at each iteration. To find the number of inversions, the simplest approach is to use two for loops to consider all pairs of elements and increment a counter if the pair forms an inversion.

This approach is very easy to implement, but is it efficient? Let’s analyze the algorithm. The outermost loop runs n – k + 1 times, once for each k-subarray of A. At each of these iterations, we find the number of inversions in the window. To do this, we consider element 1 and elements 2, …, n, then element 2 and elements 3, …, n until element n – 1 and element n. Effectively, we’re performing n + (n – 1) + (n – 2) + … + 1 = n(n + 1)/2operations. Thus, our algorithm performs approximately (n – k + 1)(n)(n + 1)/2operations which is O(n^3 – kn^2). Since k can range from 1 to n, the worst case performance for this algorithm is O(n^3) when k = n/2. We can do better!

public class Subarray_Inversions {
    // Inversion counting method, slides window of [start,
    // end] across array
    static int inversion_count(int n, int k, int[] a)
    {
        int count = 0;
        for (int start = 0; start < n - k + 1; start++) {
            int end = start + k;
            count += bubble_count(a, start, end);
        }
        return count;
    }
    // Helper function, counts number of inversions via
    // bubble sort loop
    public static int bubble_count(int[] arr, int start, int end)
    {
        int count = 0;
        for (int i = start; i < end; i++) {
            for (int j = i + 1; j < end; j++) {
                if (arr[i] > arr[j]) {
                    count++;
                }
            }
        }
        return count;
    }
    public static void main(String[] args)
    {
        int n = 10;
        int[] arr = { 15, 51, 44, 44, 76, 50, 29, 88, 48, 50 };
        int k = 5;
        long result = inversion_count(n, k, arr);
        System.out.println(result);
    }
}

Output:

 25

[2] Mergesort-based Implementation

One optimization we can make is improving our inefficient, quadratic-time inversion counting method. One approach could involve using a mergesort-based method as outlined in this article. Since this runs in O(nlogn), our overall runtime is reduced to O(n^2logn), which is better, but still won’t be able to handle cases for which n = 10^6 for instance.

import java.util.*;
public class Subarray_Inversions {
    // Inversion counting method, slides window of [start,
    // end] across array
    static int inversion_count(int n, int k, int[] a)
    {
        int count = 0;
        for (int start = 0; start < n - k + 1; start++) {
            int[] sub_array = new int[k];
            for (int i = start; i < start + k; i++) {
                sub_array[i - start] = a[i];
            }
            count += subarray_inversion_count(sub_array);
        }
        return count;
    }
    // Counts number of inversions when merging
    public static long merge_inversion_count(int[] arr,
                                int[] left, int[] right)
    {
        int i = 0, j = 0, count = 0;
        while (i < left.length || j < right.length) {
            if (i == left.length) {
                arr[i + j] = right[j];
                j++;
            } else if (j == right.length) {
                arr[i + j] = left[i];
                i++;
            } else if (left[i] <= right[j]) {
                arr[i + j] = left[i];
                i++;
            } else {
                arr[i + j] = right[j];
                count += left.length - i;
                j++;
            }
        }
        return count;
    }
    // Divide and conquer approach -- splits array and counts
    // inversions via merge method
    public static long subarray_inversion_count(int[] arr)
    {
        if (arr.length < 2)
            return 0;
        int m = (arr.length + 1) / 2;
        int left[] = Arrays.copyOfRange(arr, 0, m);
        int right[] = Arrays.copyOfRange(arr, m, arr.length);
        return subarray_inversion_count(left) +
               subarray_inversion_count(right) +
               merge_inversion_count(arr, left, right);
    }
    public static void main(String[] args)
    {
        int n = 10;
        int[] arr = { 15, 51, 44, 44, 76, 50, 29, 88, 48, 50 };
        int k = 5;
        long result = inversion_count(n, k, arr);
        System.out.println(result);
    }
}

Output:

 25

[3] Overlapping Subarrays Implementation

Let’s revisit our overall approach. We’re looking at the window [0, k) and finding the number of inversions, then we look at [1, k+1). There’s a significant overlap in this range: we’ve already counted the number of inversions in [1, k) during the first iteration, and now we’re counting them again! Instead of counting inversions, let’s count the change in inversions from one window to the next. Effectively, shifting the window is just adding one element to the head of the window and removing an element from its tail — the body of the window remains the same. Checking for inversions among internal elements would be redundant; all we need to do is add the number of inversions induced by the new element and subtract the number of inversions induced by the removed element. We now only need to count the number of inversions in the first window, which we can do in O(klogk) time, and for each of the n – k additional windows, we simply perform one iteration through the k elements in the array to find the change in the number of inversions. Our overall runtime is now O(k(n – k) + klogk) = O(nk – k)which is still worst case O(n^2).

import java.util.*;
public class Subarray_Inversions {
    // Inversion counting method, slides window of [start,
    // end] across array
    static long inversion_count(int n, int m, int[] arr)
    {
        long count = 0;
        count += subarray_inversion_count_initial(Arrays.copyOfRange(arr, 0, m));
        long subarray_count = subarray_inversion_count_initial(Arrays.copyOfRange(arr, 1, m));
        for (int start = 1; start <= n - m; start++) {
            int end = start + m - 1;
            long[] ans = subarray_inversion_count(arr, start, end, subarray_count);
            count += ans[0];
            subarray_count = ans[1];
        }
        return count;
    }
    // start >=1; find inversion in interval [start, end)
    public static long[] subarray_inversion_count(int[] arr, int start,
                                          int end, long subarray_count)
    {
        int new_element = arr[end];
        long count = subarray_count;
        for (int i = start; i < end; i++) {
            if (new_element < arr[i])
                count++;
        }
        long totalSum = count;
        int last_element = arr[start];
        for (int i = start + 1; i <= end; i++) {
            if (last_element > arr[i])
                count--;
        }
        long[] ans = { totalSum, count };
        return ans;
    }
    // Counts number of inversions when merging
    public static long merge_inversion_count(int[] arr, int[] left,
                                                       int[] right)
    {
        int i = 0, j = 0, count = 0;
        while (i < left.length || j < right.length) {
            if (i == left.length) {
                arr[i + j] = right[j];
                j++;
            } else if (j == right.length) {
                arr[i + j] = left[i];
                i++;
            } else if (left[i] <= right[j]) {
                arr[i + j] = left[i];
                i++;
            } else {
                arr[i + j] = right[j];
                count += left.length - i;
                j++;
            }
        }
        return count;
    }
    // Divide and conquer approach -- splits array and counts
    // inversions via merge method
    public static long subarray_inversion_count_initial(int[] arr)
    {
        if (arr.length < 2)
            return 0;
        int m = (arr.length + 1) / 2;
        int left[] = Arrays.copyOfRange(arr, 0, m);
        int right[] = Arrays.copyOfRange(arr, m, arr.length);
        return subarray_inversion_count_initial(left) +
               subarray_inversion_count_initial(right) +
               merge_inversion_count(arr, left, right);
    }
    public static void main(String[] args) throws Exception
    {
        int n = 10;
        int[] arr = { 15, 51, 44, 44, 76, 50, 29, 88, 48, 50 };
        int k = 5;
        long result = inversion_count(n, k, arr);
        System.out.println(result);
    }
}

Output:

 25

[4] Binary Indexed Tree Implementation

Iterating over each window seems inevitable, so the bottleneck here appears to be the way we handle the windows. We know that consecutive windows overlap significantly, so all we need to know is the number of elements greater than the newly added element and number of elements less than the newly removed element. Many times, algorithms that perform the same or similar operation(s) repeatedly can be improved by the use of a more robust data structure. In our case, we’re looking for a dynamic data structure that can efficiently answer queries about an element’s relative position when sorted. We can use a self-balancing binary tree to actually maintain a sorted list, but insertion/removal takes logarithmic time. We can do this in constant time using a Binary Indexed Tree, or Fenwick Tree.

A binary indexed tree is a tree represented in the form of an array. It uses clever bit manipulation to compute the cumulative sum of elements very efficiently. We can call the function update(index, val) function to add val to BIT[index] and all of the ancestors of index. The function read(index) returns the sum of the values stored at BIT[index]and all of the ancestors of index in the tree. Thus, calling read(index) returns the cumulative sum of all elements in BIT less than or equal to index. Instead of storing values, if we simply store 1, we can use read(index + 1) to determine the number of elements less than index. Now, we can construct a binary indexed tree by inserting the elements (updating) of the first window. For subsequent windows, we can remove the trailing element by calling update(tail_element, -1) and add the new element with update(head_element, 1). Since this is a tree, the longest possible root-node path is O(logk), Thus, we achieve an optimal runtime of O(nlogk + klogk) = O(nlogk)!

Or do we…? Remember, binary indexed trees allocate memory for every possible value in the range [0, max_element], so this requires O(max_element) time and space. For very sparse arrays, this can be quite costly. Instead, we can define a hash function to . We can do this because we’re only concerned about inversions — as long as we keep the relative magnitude the same (i.e. A[i] <= A ==> A[hash(i)] <= A[hash(j)]), our solution will still be correct. Thus, we can map all the elements in A to the set {0, 1, 2, …, n}, yielding a guaranteed runtime of O(nlogk).

import java.util.*;
public class Subarray_Inversions {
    // Declare binary indexed tree with global scope
    static BIT bit;
    // first window, counts first k elements and creates
    // BIT
    static long inversionCountBIT1(int[] arr, int start,
                                               int end)
    {
        bit = new BIT(arr.length);
        long count = 0;
        for (int index = start; index >= end; index--) {
            count += bit.read(arr[index]);
            bit.update(arr[index], 1);
        }
        return count;
    }
    // subsequent windows, removes previous element, adds
    // new element, and increments change in inversions
    static long inversionCountBIT2(int[] arr, int start,
                                       int end, long val)
    {
        bit.update(arr[start + 1], -1); // remove trailing element
        // find number of elements in range [start, end-1]
        // greater than first
        int numGreaterThanFirst = start - end - bit.read(arr[start + 1] + 1);
        long count = val + bit.read(arr[end]) - numGreaterThanFirst;
        bit.update(arr[end], 1); // adds leading element
        return count;
    }
    // Main method to count inversions in size k subarrays
    public static long inversionCount(int n, int k, int[] arr)
    {
        bit = new BIT(n);
        HashMap<Integer, Integer> freq = new HashMap<Integer, Integer>();
        int[] asort = arr.clone();
        // Map elements from [A[0]...A[n-1]] to [1...n]
        Arrays.sort(asort);
        int index = 0;
        int current = 1;
        for (int i = 0; i < n; i++) {
            if (!freq.containsKey(asort[i])) {
                freq.put(asort[i], current);
                current++;
            }
        }
        for (int i = 0; i < n; i++) {
            arr[i] = freq.get(arr[i]);
        }
        long count = 0;
        long val = 0;
        //[start - end] ==> start - end = k+1
        for (int start = n - 1; start >= k - 1; start--) {
            int end = start - k + 1;
            if (start == n - 1) { // First pass
                val = inversionCountBIT1(arr, n - 1, n - k);
            } else { // subsequent passes
                val = inversionCountBIT2(arr, start, end, val);
            }
            count += val;
        }
        return count;
    }
    public static void main(String[] args) throws Exception
    {
        int n = 10;
        int[] arr = { 15, 51, 44, 44, 76, 50, 29, 88, 48, 50 };
        int k = 5;
        long result = inversionCount(n, k, arr);
        System.out.println(result);
    }
    // Implementation of Binary Indexed Tree
    static class BIT {
        int[] tree;
        int maxVal;
    public BIT(int N)
        {
            tree = new int[N + 1];
            maxVal = N;
        }
        // Updates BIT, starting with element at index
        // and all ancestors
        void update(int index, int val)
        {
            while (index <= maxVal) {
                tree[index] += val;
                index += (index & -index);
            }
        }
        // Returns the cumulative frequency of index
        // starting with element at index and all ancestors
        int read(int index)
        {
            --index;
            int cumulative_sum = 0;
            while (index > 0) {
                cumulative_sum += tree[index];
                index -= (index & -index);
            }
            return cumulative_sum;
        }
    };
}

Output:

 25


Disclaimer: This does not belong to TechCodeBit, its an article taken from the below
source and credits.
source and credits: http://www.geeksforgeeks.org
We have built the accelerating growth-oriented website for budding engineers and aspiring job holders of technology companies such as Google, Facebook, and Amazon
If you would like to study our free courses you can join us at

http://www.techcodebit.com. #techcodebit #google #microsoft #facebook #interview portal #jobplacements
#technicalguide

rakesh

Leave a Reply

Your email address will not be published. Required fields are marked *

Skip to toolbar