🎯 Subarrays With Exactly K Distinct Integers #992
This code solves the classic “Subarrays with Exactly K Distinct Integers” problem (LeetCode 992). You’re using the very powerful pattern:
🧠
exactly(K) = atMost(K) - atMost(K - 1)
Let’s turn this into clean, structured notes you can drop straight into your DSA repo: problem, intuition, brute force, optimal approach, full code, tests, complexity, variations, FAQs… with a bit of personality sprinkled in 😄
Given an integer array nums and an integer k, return the number of contiguous subarrays that contain exactly k distinct integers.
nums = [1, 2, 1, 2, 3], k = 2
Valid subarrays with exactly 2 distinct:
[1, 2]
[1, 2, 1]
[2, 1, 2]
[1, 2]
Total = 4Return 4.
Counting exactly k distinct elements is tricky directly… but counting at most k is much easier with sliding window.
Key idea:
-
Let:
F(k)= number of subarrays with at mostkdistinct integers.
-
Then:
- Subarrays with exactly
kdistinct =F(k) - F(k-1).
- Subarrays with exactly
Because:
- Any subarray with exactly
kdistinct is counted inF(k), - But not counted in
F(k-1)(since it has too many distinct elements), - So subtracting removes all subarrays with fewer than
kdistinct, leaving only those with exactlyk.
So the whole problem reduces to:
- Efficiently compute
atMostK(nums, k) - Efficiently compute
atMostK(nums, k-1) - Answer = difference.
Check every subarray nums[i..j]:
-
For each
ifrom0ton-1:- Keep a frequency map from
ito eachj ≥ i. - Count distinct elements in
nums[i..j]. - If distinct ==
k, increment answer.
- Keep a frequency map from
- There are O(n²) subarrays.
- Maintaining distinct count per
(i,j)naive → O(n) each. - Total: O(n³) naive, or O(n²) with careful reuse of counts.
Way too slow for large n (like 10⁵). Time to bring in the sliding window superheroes. 🦸
We’ll create a helper:
int atMostK(vector<int>& nums, int K)that returns the number of subarrays with at most K distinct integers.
We use a sliding window [left..right] with:
unordered_map<int, int> freqto count occurrences of each integerK(we treat it as “allowed distinct remaining” and decrement/increment as needed)- For each
right, we extend window and if we exceedKdistinct, we shrink from the left until valid again.
For each position right, once the window [left..right] has at most K distinct numbers, the number of valid subarrays ending at right is:
right - left + 1
Why? Because all subarrays:
[left, right][left+1, right][left+2, right]- …
[right, right]
have at most K distinct numbers (since they are smaller windows inside a valid window).
We add these to a running count.
Here’s your helper function, annotated:
int atMostK(vector<int>& nums, int K) {
unordered_map<int, int> freq;
int left = 0, count = 0;
for (int right = 0; right < nums.size(); right++) {
// When we see a new distinct number
if (freq[nums[right]] == 0) {
K--; // we now use one of our allowed distinct slots
}
freq[nums[right]]++;
// If we've used more than K distinct numbers, shrink from the left
while (K < 0) {
freq[nums[left]]--;
if (freq[nums[left]] == 0) {
K++; // we freed up a distinct number
}
left++;
}
// All subarrays ending at 'right' with starting index in [left..right] are valid
count += (right - left + 1);
}
return count;
}K is passed by value into atMostK (copied), so modifications inside atMostK don’t affect the original k in subarraysWithKDistinct. This is crucial and totally safe.
Within atMostK, we interpret K as “how many more distinct integers we’re allowed to introduce into the current window.”
Your main function:
int subarraysWithKDistinct(vector<int>& nums, int k) {
return atMostK(nums, k) - atMostK(nums, k - 1);
}This uses the identity:
# exactly k=# at most k−# at most (k − 1)
Done. Clean. Beautifully mathy. 🧡
#include <bits/stdc++.h>
using namespace std;
class Solution {
public:
long long atMostK(const vector<int>& nums, int K) {
unordered_map<int, int> freq;
int left = 0;
long long count = 0;
for (int right = 0; right < (int)nums.size(); ++right) {
if (freq[nums[right]] == 0) {
K--; // we are using one more distinct integer
}
freq[nums[right]]++;
// If we have more than K distinct, shrink window from left
while (K < 0) {
freq[nums[left]]--;
if (freq[nums[left]] == 0) {
K++; // freed one distinct type
}
left++;
}
// Number of subarrays ending at right with at most K distinct
count += (right - left + 1);
}
return count;
}
int subarraysWithKDistinct(vector<int>& nums, int k) {
long long ans = atMostK(nums, k) - atMostK(nums, k - 1);
return (int)ans;
}
};Let’s walk the classic example:
nums = [1, 2, 1, 2, 3], k = 2
-
Compute
atMostK(nums, 2):- We count all subarrays with at most 2 distinct integers.
-
Compute
atMostK(nums, 1):- We count all subarrays with at most 1 distinct integer.
-
Subtract them:
- Those with exactly 2 distinct = atMost(2) − atMost(1).
I won’t expand every step here (would be long), but if you want, we can do a fully tabulated dry run later.
-
Time:
atMostKis O(n) (each element is visited at most twice: once byright, once byleft).- We call it twice → O(2n) → O(n) overall.
-
Space:
unordered_map<int,int>stores at mostkor number of distinct elements innums, so O(min(n, distinct(nums))).- Overall: O(n) in the worst case, but often much less.
nums = {1,2,1,2,3}, k = 2 -> 4
nums = {1,2,1,3,4}, k = 3 -> ? // try it by handnums = {1,1,1,1}, k = 1 -> 10 (all subarrays)
nums = {1,2,3}, k = 1 -> 3 (each single element)
nums = {1,2,3}, k = 3 -> 1 (whole array)
nums = {}, k = 1 -> 0nums = {1,2,1,3,4}, k = 2
// Subarrays with exactly 2 distinct: [1,2], [1,2,1], [2,1], [1,3], [3,4] etc.- Forgetting the
K < 0condition in the inner while loop (must be strictly < 0, not <=). - Confusing exactly K with at most K — remember to subtract
atMost(k-1). - Not handling large counts (use
long longin heavy test scenarios). - Not realizing
Kis passed by value, and being worried that it’s being “mutated globally” — it’s local to eachatMostKcall.
This pattern is super reusable — MCU-level reusable 😂
-
Strings:
- “Substrings with Exactly K Distinct Characters” — same logic but with
charinstead ofint.
- “Substrings with Exactly K Distinct Characters” — same logic but with
-
Arrays of other types:
- Just change
unordered_map<int,int>tounordered_map<whatever,int>.
- Just change
-
At most K distinct but no “exactly”:
- Directly return
atMostK(nums, k).
- Directly return
-
Longest subarray with at most K distinct:
- Similar sliding window, but tracking max length instead of count.
-
Explicitly state the identity:
“I’ll compute the number of subarrays with at most K distinct, call it F(K). Then answer = F(K) − F(K−1).”
-
Explain the sliding window invariant:
“At every step, the window [left..right] contains at most K distinct integers.”
-
Mention amortized O(n) time:
“Both pointers move forward only and never move back.”
-
If asked about optimization:
- Point out
unordered_mapis average O(1); could usemapbut that’s O(log n)`.
- Point out
Q: Why treat K as mutable inside atMostK?
A: It’s a neat trick: instead of maintaining a separate distinctCount, we let K represent how many more distinct integers we’re allowed to include. When we add a new distinct, we do K--; when we remove one completely, we do K++.
Q: Could I implement atMostK by storing a distinctCount variable instead?
A: Absolutely. You’d keep distinctCount, increment it when a new key is introduced, decrement when a key’s freq hits 0, and check while (distinctCount > K).
Q: Does this work if nums has negative numbers or large values?
A: Yes — values don’t matter for distinct count. unordered_map<int,int> handles any int.