Input: array nums (may contain duplicates).
Output: list of all unique subsets (each subset is a list of numbers). Order of subsets doesn’t matter. Each subset may be returned in any order, but there must be no duplicate subsets in the result.
Example:
nums = [1,2,2]
Output (any order): [ [], [1], [2], [1,2], [2,2], [1,2,2] ]
If nums had no duplicates, generating all subsets is straightforward: at index i either include nums[i] or exclude it (2 choices per element) → 2^n subsets.
With duplicates, the same subset can be produced in multiple ways if we treat identical values as distinct by position. To avoid duplicates we:
- Sort
numsfirst — equal numbers are adjacent. - During recursion at a given level (same
ind), skip anynums[i]that equalsnums[i-1](andi > ind). That prevents choosing the second occurrence as the first element at this recursion level — thus avoiding duplicate subsets.
This produces subsets in non-decreasing order (because we move left-to-right through the sorted array) and ensures uniqueness.
- Enumerate all
2^nsubsets (e.g., using bitmasks), then use asetof vectors (or serialize subsets) to remove duplicates. - Costly: generating
2^nfor n up to 20–30 is okay, but deduping via hashing/sets has added overhead. Sorting each subset for canonical form and hashing is extra work. Backtracking with skip is simpler and more efficient.
Key rules:
-
Sort
nums. -
Recurse with an index
ind, and maintain current subsetds. -
At a recursion level, loop
ifromindton-1:- If
i > ind && nums[i] == nums[i-1],continue;(skip duplicate choices at same level). - Include
nums[i]→ds.push_back(nums[i]), recurse withi+1, then backtrack (pop_back()).
- If
-
Also include the path where you don't pick any more elements — typically by pushing
dswhen you reach or consider each level. Two common patterns:- Push
dsat every recursive invocation (gives subsets as you go). - Or push only at base case
ind == n. Both produce all subsets; pushing at each node gives an iterative-like order.
- Push
Either way, the duplicate-skip rule prevents duplicate subsets.
#include <bits/stdc++.h>
using namespace std;
class Solution {
public:
vector<vector<int>> subsetsWithDup(vector<int>& nums) {
vector<vector<int>> res;
vector<int> ds;
sort(nums.begin(), nums.end());
backtrack(0, nums, ds, res);
return res;
}
private:
void backtrack(int ind, const vector<int>& nums, vector<int>& ds, vector<vector<int>>& res) {
// record current subset (we record at every node so we include partial subsets)
res.push_back(ds);
for (int i = ind; i < (int)nums.size(); ++i) {
// skip duplicates at the same recursion level
if (i > ind && nums[i] == nums[i - 1]) continue;
ds.push_back(nums[i]);
backtrack(i + 1, nums, ds, res); // move to next index; each element used at most once
ds.pop_back();
}
}
};Notes:
- We
push_back(ds)at the start ofbacktrack, so the empty subset is included and every extension is recorded in DFS order. i > ind && nums[i] == nums[i-1]ensures duplicates at the same recursion depth are skipped.- Passing
numsbyconst&avoids copies.
nums = [1,2,2] → after sorting [1,2,2]:
-
Start
ind=0, ds=[]→ record[]. -
i=0: pick1→ds=[1]→ record[1].-
recurse
ind=1:-
i=1: pick2→ds=[1,2]→ record[1,2].- recurse
ind=2:i=2: pick2→ds=[1,2,2]→ record[1,2,2]. - backtrack (pop last 2)
- recurse
-
i=2:i>indandnums[2] == nums[1]→ skip (prevents[1,2]duplication at this level)
-
-
-
backtrack,
i=1at top-level: pick2→ds=[2]→ record[2].- recurse
ind=2:i=2: pick2→ds=[2,2]→ record[2,2].
- recurse
-
i=2at top-level:i>ind && nums[2] == nums[1]→ skip.
Result: [[], [1], [1,2], [1,2,2], [2], [2,2]] — unique subsets only.
-
Time complexity:
O(2^n * n)in worst case. There are up to2^nsubsets (n = nums.size()), and copying/pushing a subset into results takes up toO(n)time. Duplicate-skip prunes many branches when duplicates exist. -
Space complexity:
- Recursion depth
O(n)(stack + current subsetds). - Output size can be
O(2^n * n)in worst case (dominates memory usage). - Additional overhead: sorting is
O(n log n).
- Recursion depth
-
Simple with duplicates
- Input:
[1,2,2]Output (order may vary):[[], [1], [2], [1,2], [2,2], [1,2,2]]
- Input:
-
All elements same
- Input:
[2,2,2]Output:[[], [2], [2,2], [2,2,2]]
- Input:
-
No duplicates
- Input:
[1,2,3]Output: all2^3 = 8subsets.
- Input:
-
Empty input
- Input:
[]→ Output:[[]]
- Input:
-
Larger mix
- Input:
[1,2,2,3]— verify duplicates handled and all unique subsets produced.
- Input:
- Sort first. This is the single most important step for handling duplicates easily.
- Skip duplicates only at the same recursion level using the
i > ind && nums[i] == nums[i-1]guard. - Record subsets at every recursion call (push current
ds), or push only at base-case — both work. Recording at every node yields the empty set first and then progressively larger subsets. - Use
ds.reserve(nums.size())if you want modest micro-optimizations. - If you want subsets sorted lexicographically, sort
numsand then the generation order will be deterministic (DFS order), but you may need to post-process if strict lexicographic order is required.
-
Subsets (no duplicates) — identical approach without duplicate-skip.
-
Subsets II iterative — iterative approach that uses previous size to avoid duplicates:
- Maintain
startindex for each unique value group and only extend subsets formed before this value's first appearance.
- Maintain
-
Combination Sum / Permutations / Partition problems — similar backtracking skeletons but different constraints/choices.
-
Generate unique permutations with duplicates — uses similar sort + skip rules but with different loop structure.
Q: Why do we skip when i > ind && nums[i] == nums[i-1]?
A: Because choosing
nums[i]at this recursion depth would create the same subsets as choosing the previous identical valuenums[i-1]. We only want the first occurrence at this level to avoid duplicates.
Q: Can we avoid sorting?
A: You can avoid sorting by using a hash-set to deduplicate final results, but that is usually slower and more memory-heavy. Sorting plus skip is succinct and efficient.
Q: Where to push the current subset into results — start or base case?
A: Both are valid:
- Push at every call records current partial subset (includes empty set).
- Push at base case (ind==n) records when you've considered all elements. Either yields a complete power set when combined with recursive exploration. I prefer pushing at every call for clarity.
Q: Does the algorithm handle negative numbers?
A: Yes— duplicates logic is independent of sign. Sorting works for negatives as well. No pruning based on value is used here.
Q: Complexity concerns for large n?
A: The number of subsets grows exponentially; practical
nfor exhaustive subset generation is typically ≤ 20–25. For largern, you need problem-specific constraints or different objectives (e.g., count only, find subset with property, etc.).