014 — Reverse Pairs¶

Source¶

Original source: LeetCode 493 — Reverse Pairs
Platform: LeetCode 493
Topic: segment tree and fenwick tree
Difficulty: Hard
Pattern: BIT / merge-sort counting
Basic idea inherited from: #10 (count inversions), #13 (count of smaller after self)

1. Problem in simple words¶

একটা array a দেওয়া। একটা reverse pair হলো এমন জোড়া (i, j) যেখানে i < j এবং a[i] > 2 · a[j]। মোট কতগুলো reverse pair আছে গোনো। এটা inversion (#10)-এরই কাছাকাছি, কিন্তু শর্তটা a[i] > a[j] নয় — a[i] > 2·a[j]।

a = [1, 3, 2, 3, 1]

reverse pairs: (3,1) at idx (1,4),  (2,1) — না, 2 > 2*1=2 মিথ্যা,
               (3,1) at idx (3,4)
                                       ->  মোট 2

2. Which basic idea does this inherit from?¶

এটা সরাসরি #10 (count inversions) আর #13 (count of smaller after self)-এর উপর দাঁড়িয়ে। একই reverse-sweep + frequency-BIT কাঠামো; শুধু threshold বদলে 2·a[j] হয়েছে, যা একটু সাবধানি query সীমা দাবি করে।

3. What is the hidden math or logic?¶

লুকানো logic: array ডান-থেকে-বাঁ হাঁটো; প্রতিটা a[i]-এর জন্য জানতে চাও তার ডানে কয়টা a[j] আছে যাতে a[i] > 2·a[j], মানে a[j] < a[i] / 2। তাই frequency table-এ "value < a[i]/2 কয়টা ইতিমধ্যে দেখা"-র একটা prefix-count চাই। সব যোগ করলেই মোট।

4. Which data structure is needed?¶

একটা Fenwick tree (BIT) value-দের rank-এ indexed। কিন্তু query threshold a[i]/2 original value-গুলোর মধ্যে নাও থাকতে পারে — তাই compression-এ আমরা original value আর threshold দুটোই rank করি, নয়তো bisect দিয়ে threshold-এর rank খুঁজি। Value বড় (10^9 পর্যন্ত), তাই compression বাধ্যতামূলক।

5. Why this data structure fits¶

প্রশ্নটা একটা prefix-count over a frequency-of-values table — "ডানে কয়টা value < a[i]/2"। BIT এই prefix-count + point-increment O(log n)-এ দেয়। threshold value আলাদা বলে আমরা sorted distinct value-দের উপর bisect করে rank বের করি, তারপর prefix_sum। merge sort-ও সমান্তরাল পথ।

6. Real-life analogy¶

ভাবো একদল মানুষ এক এক করে ঘরে ঢুকছে, সবার হাতে টাকার অঙ্ক। শেষ থেকে দেখলে — প্রতিটা নতুন মানুষের জন্য জানতে চাও "এর পরে যারা ঢুকেছে (board-এ আছে), তাদের কয়জনের অঙ্ক আমার অর্ধেকেরও কম"। প্রতিবার সবাইকে না গুনে, অঙ্ক অনুযায়ী একটা tally রাখো; নতুন মানুষে শুধু my/2-এর নিচের অংশটা গোনো।

7. Visual explanation¶

a = [1, 3, 2, 3, 1]
sorted distinct values: [1, 2, 3]   (compression-এর basis)

ডান -> বাঁ; প্রতিটা x-এ আগে "ডানে কয়টা value < x/2" গোনো, পরে x mark করো:
  a[4]=1: x/2=0.5, value<0.5 কয়টা? = 0      -> +0;  mark 1
  a[3]=3: x/2=1.5, value<1.5 কয়টা? = 1 (1)  -> +1;  mark 3
  a[2]=2: x/2=1.0, value<1.0 কয়টা? = 0      -> +0;  mark 2
  a[1]=3: x/2=1.5, value<1.5 কয়টা? = 1 (1)  -> +1;  mark 3
  a[0]=1: x/2=0.5, value<0.5 কয়টা? = 0      -> +0;  mark 1
                                       total = 2

8. Example input and output¶

input:  a = [1, 3, 2, 3, 1]
output: 2

input:  a = [2, 4, 3, 5, 1]
output: 3            ((4,1),(3,1),(5,1) — 4>2,3>2 না... নিচে যাচাই হবে)

input:  a = [5, 4, 3, 2, 1]
output: 4            (5>2*2,5>2*1,4>2*1,3>2*1)

9. Brute force thinking¶

সরল চিন্তা: প্রতিজোড়া (i, j), i < j, দেখো a[i] > 2·a[j] কি না।

for i in range(n):
    for j in range(i+1, n):
        if a[i] > 2 * a[j]:
            count += 1

নির্ভুল, এটাই reference।

10. Why brute force is slow¶

O(n^2) জোড়া। n = 5·10^4 হলেও 2.5·10^9 — TLE। BIT (বা merge-sort) counting এটাকে O(n log n)-এ নামায়।

11. Key observation¶

মূল observation: শর্তটা a[j] < a[i] / 2 — মানে এটাও একটা threshold prefix-count over values, ঠিক #13-এর মতো, শুধু threshold a[i]/2। threshold original value নাও হতে পারে, তাই sorted distinct value-দের উপর bisect করে "কতগুলো distinct value < a[i]/2" বের করি।

12. Optimized intuition¶

Compression: vals = sorted(set(a))। ডান-থেকে-বাঁ: প্রতিটা a[i]-এ bisect_left(vals, a[i]/2) দেয় কতগুলো distinct value strictly < a[i]/2 — সেটাই BIT-এ query করার prefix length k, তাই prefix_sum(k)। তারপর a[i]-এর rank-এ add(rank+1, 1) (1-indexed)। Float এড়াতে 2*a[j] < a[i] integer-এ রাখা যায়, কিন্তু এখানে a[i]/2-এ bisect নিরাপদ (নিচে integer-safe version)।

13. Thinking tweak¶

মোচড়: inversion-এর a[i] > a[j]-কে general a[i] > c·a[j] ভাবো — তখন query সীমা একটা derived threshold, যা value-list-এ নাও থাকতে পারে; bisect দিয়ে তার rank খোঁজো। এই "threshold rank খুঁজে prefix-count" trick scaled-comparison counting-এর চাবি।

14. Step-by-step dry run¶

a = [1, 3, 2, 3, 1], vals = [1, 2, 3], BIT খালি (1-indexed, size 3)। ডান → বাঁ:

a[4]=1: threshold 0.5; bisect_left(vals, 0.5)=0 → prefix_sum(0)=0 → +0; mark 1 (add(1,1))।
a[3]=3: threshold 1.5; bisect_left(vals, 1.5)=1 → prefix_sum(1)=1 (value 1 marked) → +1; mark 3 (add(3,1))।
a[2]=2: threshold 1.0; bisect_left(vals, 1.0)=0 → prefix_sum(0)=0 → +0; mark 2 (add(2,1))।
a[1]=3: threshold 1.5; bisect_left(vals, 1.5)=1 → prefix_sum(1)=1 (value 1) → +1; mark 3 (add(3,1))।
a[0]=1: threshold 0.5; prefix_sum(0)=0 → +0; mark 1。
মোট = 0+1+0+1+0 = 2। ✓

15. Python solution¶

from bisect import bisect_left


class Fenwick:
    """1-indexed BIT, value-দের frequency table। index মানে VALUE-র rank।
    add(i, 1): value-rank i দেখলাম; prefix_sum(i): rank 1..i মোট গোনা।
    দুটোই O(log n)।"""

    def __init__(self, n):
        self.n = n
        self.tree = [0] * (n + 1)

    def add(self, i, delta):
        while i <= self.n:
            self.tree[i] += delta
            i += i & (-i)

    def prefix_sum(self, i):
        s = 0
        while i > 0:
            s += self.tree[i]
            i -= i & (-i)
        return s


def reverse_pairs_bit(a):
    n = len(a)
    if n == 0:
        return 0
    vals = sorted(set(a))                       # compression basis
    bit = Fenwick(len(vals))
    total = 0
    for i in range(n - 1, -1, -1):             # ডান -> বাঁ
        # ডানে কয়টা a[j] আছে যাতে 2*a[j] < a[i]  <=>  a[j] < a[i]/2
        # integer-safe: কতগুলো distinct value v সন্তুষ্ট করে 2*v < a[i]
        # bisect_left দিয়ে threshold-এর insertion point বের করি
        lo, hi = 0, len(vals)                   # ব্যবহার করি না; bisect-key trick নিচে
        # 2*v < a[i]  <=>  v < a[i]/2 ; integer key: ছোট v যেগুলোর 2v < a[i]
        # সমান্তরালে: bisect_left(vals_times2_sorted, a[i]) দিলে strictly-ছোট গোনা
        k = bisect_left(vals, a[i] / 2)        # distinct value strictly < a[i]/2
        total += bit.prefix_sum(k)             # ওই k-টা rank ইতিমধ্যে কয়বার দেখা
        r = bisect_left(vals, a[i]) + 1        # a[i]-র 1-indexed rank
        bit.add(r, 1)
    return total


def reverse_pairs_merge(a):
    # merge sort, merge-এর আগে cross reverse-pair গোনে
    def sort_count(arr):
        if len(arr) <= 1:
            return arr, 0
        mid = len(arr) // 2
        left, cl = sort_count(arr[:mid])
        right, cr = sort_count(arr[mid:])
        # cross count: left-এর প্রতিটা x-এর জন্য right-এ কয়টা y যাতে x > 2*y
        cm = 0
        j = 0
        for x in left:
            while j < len(right) and x > 2 * right[j]:
                j += 1
            cm += j
        # merge
        merged = []
        i = j2 = 0
        while i < len(left) and j2 < len(right):
            if left[i] <= right[j2]:
                merged.append(left[i]); i += 1
            else:
                merged.append(right[j2]); j2 += 1
        merged.extend(left[i:])
        merged.extend(right[j2:])
        return merged, cl + cr + cm

    return sort_count(a)[1]


def brute(a):
    # obviously-correct O(n^2) reference
    total = 0
    for i in range(len(a)):
        for j in range(i + 1, len(a)):
            if a[i] > 2 * a[j]:
                total += 1
    return total


# ---- tests ----
a = [1, 3, 2, 3, 1]
assert reverse_pairs_bit(a) == 2
assert reverse_pairs_merge(a) == 2
assert brute(a) == 2

assert reverse_pairs_bit([5, 4, 3, 2, 1]) == brute([5, 4, 3, 2, 1])
assert reverse_pairs_bit([1, 2, 3, 4, 5]) == 0          # increasing
assert reverse_pairs_bit([]) == 0                       # empty
assert reverse_pairs_bit([7]) == 0                      # single
assert reverse_pairs_bit([2, 4, 3, 5, 1]) == brute([2, 4, 3, 5, 1])

# তিন পদ্ধতি random array-তে সবসময় একমত (duplicate, negative-সহ)
import random
for _ in range(400):
    arr = [random.randint(-8, 8) for _ in range(random.randint(0, 14))]
    expected = brute(arr)
    assert reverse_pairs_bit(arr) == expected
    assert reverse_pairs_merge(arr) == expected

print("সব test pass!")

16. Line-by-line code explanation¶

Fenwick — value-frequency BIT, index = rank।
vals — sorted distinct value, compression-এর basis; threshold এদের মধ্যে নাও থাকতে পারে বলে bisect।
k = bisect_left(vals, a[i] / 2) — কতগুলো distinct value strictly < a[i]/2; এটাই query করার prefix length।
r = bisect_left(vals, a[i]) + 1 — a[i]-র 1-indexed rank, mark করতে।
reverse_pairs_merge — merge-এর আগে two-pointer-এ cross count (x > 2*right[j]), তারপর সাধারণ merge।
brute — O(n^2), নির্ভুল reference।

17. Output walkthrough¶

[1,3,2,3,1]-এ BIT দেয় 2 (section 14), merge sort-ও 2, brute-ও 2। তারপর 400-টা random array-তে — duplicate আর negative-সহ — তিন পদ্ধতি সবসময় একমত। শেষে print: সব test pass!।

18. Time complexity¶

দুই পদ্ধতিই O(n log n): BIT-এ প্রতি element-এ bisect + prefix_sum + add, প্রতিটা O(log n); merge sort O(n log n) (cross count two-pointer O(n) per merge)।

19. Space complexity¶

O(n): BIT array + vals (BIT পথ); merge sort-এ recursion + temporary list।

20. Common mistakes¶

threshold-কে value-list-এ আছে ধরে নেওয়া — a[i]/2 প্রায়ই distinct value-গুলোর বাইরে; bisect দিয়ে rank বের করতেই হবে।
merge step-এ cross-count আর merge একসাথে গুলিয়ে ফেলা — দুটো আলাদা pass (নয়তো pointer reset ভুল)।
strict vs non-strict: শর্ত > (a[i] > 2·a[j]), তাই bisect_left(..., a[i]/2) — strictly ছোট value।
compression বাদ — value 10^9 হলে BIT অসম্ভব বড়।

21. Which basic problem this inherits from¶

ভিত্তি: #10 (inversions) + #13 (count of smaller after self) — একই reverse-sweep + frequency-BIT, শুধু threshold scaled (2·a[j])। আরো গভীরে ../fenwick-tree.md section 9।

22. Similar harder problems¶

Count of Range Sum (prefix-sum-এর উপর BIT + threshold) — https://leetcode.com/problems/count-of-range-sum/
Count of Smaller Numbers After Self (একই কাঠামো, threshold = a[i]) — https://leetcode.com/problems/count-of-smaller-numbers-after-self/ (#13)
Number of Pairs Satisfying Inequality — https://leetcode.com/problems/number-of-pairs-satisfying-inequality/

23. Pattern learned¶

BIT / merge-sort counting with a scaled threshold: scaled-comparison জোড়া গোনা = reverse-sweep + frequency-BIT, কিন্তু query সীমা একটা derived threshold (a[i]/2) যা bisect দিয়ে rank-এ অনুবাদ করতে হয়। merge sort এর সমান্তরাল পথ।

24. Final summary¶

ভবিষ্যতের আমাকে: reverse pairs = inversion-এর scaled রূপ (a[i] > 2·a[j])। reverse-sweep + frequency-BIT, কিন্তু threshold a[i]/2 value-list-এ না থাকলে bisect-এ তার rank খুঁজে prefix_sum। "derived threshold-এর rank খোঁজা" — এই একটা বাড়তি ধাপই #13 থেকে এটাকে আলাদা করে।

মনে রাখার নিয়ম (legal): সব নিজের ভাষায় লেখা; problem statement copy করা হয়নি; official link শুধু attribution-এর জন্য; "asked by Google/Amazon" দাবি নেই — এটা একটা common interview pattern।