🎯 Subarrays With Exactly `K` Distinct Integers #992

This code solves the classic “Subarrays with Exactly K Distinct Integers” problem (LeetCode 992). You’re using the very powerful pattern:

🧠 exactly(K) = atMost(K) - atMost(K - 1)

Let’s turn this into clean, structured notes you can drop straight into your DSA repo: problem, intuition, brute force, optimal approach, full code, tests, complexity, variations, FAQs… with a bit of personality sprinkled in 😄

🧾 Problem Statement

Given an integer array nums and an integer k, return the number of contiguous subarrays that contain exactly k distinct integers.

Example

nums = [1, 2, 1, 2, 3], k = 2

Valid subarrays with exactly 2 distinct:
[1, 2]
[1, 2, 1]
[2, 1, 2]
[1, 2]
Total = 4

Return 4.

🧠 High-Level Intuition

Counting exactly k distinct elements is tricky directly… but counting at most k is much easier with sliding window.

Key idea:

Let:
- F(k) = number of subarrays with at most k distinct integers.
Then:
- Subarrays with exactly k distinct = F(k) - F(k-1).

Because:

Any subarray with exactly k distinct is counted in F(k),
But not counted in F(k-1) (since it has too many distinct elements),
So subtracting removes all subarrays with fewer than k distinct, leaving only those with exactly k.

So the whole problem reduces to:

Efficiently compute atMostK(nums, k)
Efficiently compute atMostK(nums, k-1)
Answer = difference.

🐢 Brute Force Approach

Idea

Check every subarray nums[i..j]:

For each i from 0 to n-1:
- Keep a frequency map from i to each j ≥ i.
- Count distinct elements in nums[i..j].
- If distinct == k, increment answer.

Complexity

There are O(n²) subarrays.
Maintaining distinct count per (i,j) naive → O(n) each.
Total: O(n³) naive, or O(n²) with careful reuse of counts.

Way too slow for large n (like 10⁵). Time to bring in the sliding window superheroes. 🦸

⚡ Optimal Approach — Sliding Window for “At Most K Distinct”

Core concept

We’ll create a helper:

int atMostK(vector<int>& nums, int K)

that returns the number of subarrays with at most K distinct integers.

We use a sliding window [left..right] with:

unordered_map<int, int> freq to count occurrences of each integer
K (we treat it as “allowed distinct remaining” and decrement/increment as needed)
For each right, we extend window and if we exceed K distinct, we shrink from the left until valid again.

For each position right, once the window [left..right] has at most K distinct numbers, the number of valid subarrays ending at right is:

right - left + 1

Why? Because all subarrays:

[left, right]
[left+1, right]
[left+2, right]
…
[right, right]

have at most K distinct numbers (since they are smaller windows inside a valid window).

We add these to a running count.

🧩 Walking Through `atMostK`

Here’s your helper function, annotated:

int atMostK(vector<int>& nums, int K) {
    unordered_map<int, int> freq;
    int left = 0, count = 0;

    for (int right = 0; right < nums.size(); right++) {
        // When we see a new distinct number
        if (freq[nums[right]] == 0) {
            K--;  // we now use one of our allowed distinct slots
        }
        freq[nums[right]]++;

        // If we've used more than K distinct numbers, shrink from the left
        while (K < 0) {
            freq[nums[left]]--;
            if (freq[nums[left]] == 0) {
                K++;  // we freed up a distinct number
            }
            left++;
        }

        // All subarrays ending at 'right' with starting index in [left..right] are valid
        count += (right - left + 1);
    }

    return count;
}

Important subtle point

K is passed by value into atMostK (copied), so modifications inside atMostK don’t affect the original k in subarraysWithKDistinct. This is crucial and totally safe.

Within atMostK, we interpret K as “how many more distinct integers we’re allowed to introduce into the current window.”

🧮 Putting It Together: `exactly K = atMost(k) - atMost(k-1)`

Your main function:

int subarraysWithKDistinct(vector<int>& nums, int k) {
    return atMostK(nums, k) - atMostK(nums, k - 1);
}

This uses the identity:

# exactly k = # at most k − # at most (k − 1)

Done. Clean. Beautifully mathy. 🧡

✅ Polished Full Code (with tiny improvements)

#include <bits/stdc++.h>
using namespace std;

class Solution {
public:
    long long atMostK(const vector<int>& nums, int K) {
        unordered_map<int, int> freq;
        int left = 0;
        long long count = 0;

        for (int right = 0; right < (int)nums.size(); ++right) {
            if (freq[nums[right]] == 0) {
                K--; // we are using one more distinct integer
            }
            freq[nums[right]]++;

            // If we have more than K distinct, shrink window from left
            while (K < 0) {
                freq[nums[left]]--;
                if (freq[nums[left]] == 0) {
                    K++; // freed one distinct type
                }
                left++;
            }

            // Number of subarrays ending at right with at most K distinct
            count += (right - left + 1);
        }

        return count;
    }

    int subarraysWithKDistinct(vector<int>& nums, int k) {
        long long ans = atMostK(nums, k) - atMostK(nums, k - 1);
        return (int)ans;
    }
};

🔍 Example Walkthrough

Let’s walk the classic example:

nums = [1, 2, 1, 2, 3], k = 2

Compute atMostK(nums, 2):
- We count all subarrays with at most 2 distinct integers.
Compute atMostK(nums, 1):
- We count all subarrays with at most 1 distinct integer.
Subtract them:
- Those with exactly 2 distinct = atMost(2) − atMost(1).

I won’t expand every step here (would be long), but if you want, we can do a fully tabulated dry run later.

⏱ Complexity Analysis

Time:
- atMostK is O(n) (each element is visited at most twice: once by right, once by left).
- We call it twice → O(2n) → O(n) overall.
Space:
- unordered_map<int,int> stores at most k or number of distinct elements in nums, so O(min(n, distinct(nums))).
- Overall: O(n) in the worst case, but often much less.

🧪 Test Cases

1. Basic

nums = {1,2,1,2,3}, k = 2 -> 4
nums = {1,2,1,3,4}, k = 3 -> ?  // try it by hand

2. Edge Cases

nums = {1,1,1,1}, k = 1 -> 10 (all subarrays)
nums = {1,2,3},   k = 1 -> 3 (each single element)
nums = {1,2,3},   k = 3 -> 1 (whole array)
nums = {},        k = 1 -> 0

3. More Interesting

nums = {1,2,1,3,4}, k = 2
// Subarrays with exactly 2 distinct: [1,2], [1,2,1], [2,1], [1,3], [3,4] etc.

⚠️ Common Pitfalls

Forgetting the K < 0 condition in the inner while loop (must be strictly < 0, not <=).
Confusing exactly K with at most K — remember to subtract atMost(k-1).
Not handling large counts (use long long in heavy test scenarios).
Not realizing K is passed by value, and being worried that it’s being “mutated globally” — it’s local to each atMostK call.

🔁 Variations & Related Problems

This pattern is super reusable — MCU-level reusable 😂

Strings:
- “Substrings with Exactly K Distinct Characters” — same logic but with char instead of int.
Arrays of other types:
- Just change unordered_map<int,int> to unordered_map<whatever,int>.
At most K distinct but no “exactly”:
- Directly return atMostK(nums, k).
Longest subarray with at most K distinct:
- Similar sliding window, but tracking max length instead of count.

💡 Tips for Interviews

Explicitly state the identity:

“I’ll compute the number of subarrays with at most K distinct, call it F(K). Then answer = F(K) − F(K−1).”
Explain the sliding window invariant:

“At every step, the window [left..right] contains at most K distinct integers.”
Mention amortized O(n) time:

“Both pointers move forward only and never move back.”
If asked about optimization:
- Point out unordered_map is average O(1); could use map but that’s O(log n)`.

❓ FAQs

Q: Why treat K as mutable inside atMostK? A: It’s a neat trick: instead of maintaining a separate distinctCount, we let K represent how many more distinct integers we’re allowed to include. When we add a new distinct, we do K--; when we remove one completely, we do K++.

Q: Could I implement atMostK by storing a distinctCount variable instead? A: Absolutely. You’d keep distinctCount, increment it when a new key is introduced, decrement when a key’s freq hits 0, and check while (distinctCount > K).

Q: Does this work if nums has negative numbers or large values? A: Yes — values don’t matter for distinct count. unordered_map<int,int> handles any int.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

🎯 Subarrays With Exactly `K` Distinct Integers #992

🧾 Problem Statement

Example

🧠 High-Level Intuition

🐢 Brute Force Approach

Idea

Complexity

⚡ Optimal Approach — Sliding Window for “At Most K Distinct”

Core concept

🧩 Walking Through `atMostK`

Important subtle point

🧮 Putting It Together: `exactly K = atMost(k) - atMost(k-1)`

✅ Polished Full Code (with tiny improvements)

🔍 Example Walkthrough

⏱ Complexity Analysis

🧪 Test Cases

1. Basic

2. Edge Cases

3. More Interesting

⚠️ Common Pitfalls

🔁 Variations & Related Problems

💡 Tips for Interviews

❓ FAQs

FilesExpand file tree

02.SubstringWithKDiffInt.md

Latest commit

History

02.SubstringWithKDiffInt.md

File metadata and controls

🎯 Subarrays With Exactly K Distinct Integers #992

🧾 Problem Statement

Example

🧠 High-Level Intuition

🐢 Brute Force Approach

Idea

Complexity

⚡ Optimal Approach — Sliding Window for “At Most K Distinct”

Core concept

🧩 Walking Through atMostK

Important subtle point

🧮 Putting It Together: exactly K = atMost(k) - atMost(k-1)

✅ Polished Full Code (with tiny improvements)

🔍 Example Walkthrough

⏱ Complexity Analysis

🧪 Test Cases

1. Basic

2. Edge Cases

3. More Interesting

⚠️ Common Pitfalls

🔁 Variations & Related Problems

💡 Tips for Interviews

❓ FAQs

🎯 Subarrays With Exactly `K` Distinct Integers #992

🧩 Walking Through `atMostK`

🧮 Putting It Together: `exactly K = atMost(k) - atMost(k-1)`