问题
Remove all duplicates from a string and select the lexicographical smallest string possible. For example, the string cbacdcbc would return acdb, not adcb.
So this has a relatively simple solution if we don't have to select the string that's lexicographical smallest, but considering that fact, I'm not sure how to come to an efficient solution. Here's what I have so far:
string removeDuplicateLetters(string s)
{
vector<bool> v(26,0);
for(int i = 0; i < s.size(); i++) {
v[s[i]-'a'] = 1;
}
string ss = "";
for(int i = 0; i < s.size(); i++) {
if(v[s[i]-'a']) {
ss += s[i];
v[s[i]-'a'] = 0;
}
}
return ss;
}
回答1:
Algorithm
- Check which letters are present in the input string:
a,b,c,d
. - Find the first
a
that has all ofb,c,d
after it.
Or if that's not possible, find the firstb
that has all ofa,c,d
after it.
Or if that's not possible, find the firstc
that has all ofa,b,d
after it.
Or if that's not possible, find the firstd
. - Discard the start of the input string up to the selected character.
- Repeat from step 2 with the remaining characters to be found.
Code example
(in Javascript; my C++ is rusty). It creates a bit pattern chars
to store which characters are still to be found, and an array after
of bit patterns to store which characters are still available after each position.
function smallestString(input) {
var chars = 0, after = [];
for (var i = input.length - 1; i >= 0; i--) {
chars |= 1 << (input.charCodeAt(i) - 97);
after[i] = chars;
}
var result = "", start = 0, pos;
while (chars) {
for (var i = 0; i < 26; i++) {
if (chars & (1 << i)) {
pos = input.indexOf(String.fromCharCode(97 + i), start);
if (chars == (chars & after[pos])) {
result += String.fromCharCode(97 + i);
chars -= 1 << i;
break;
}
}
}
start = pos + 1;
}
return result;
}
document.write(smallestString("cbacdcbc") + "<BR>");
document.write(smallestString("thequickbrownfoxjumpsoverthelazydog"));
回答2:
Sketch of an algorithm.
Pass over the string, build a map of how many times each character appears, and the position of the rightmost (and possibly the only) occurrence of every character.
Find the smallest character that can appear in the first position. To do this, go left to right, noting the smallest character encountered; stop when you hit the rightmost occurrence of any character. Remove all characters that precede the smallest one, and all other copies of the smallest one; update the map accordingly.
Repeat starting from the character that immediately follows the smallest one from step #2.
Can terminate early once all counters in the map reach 1. Removal of other copies can be combined with normal iteration (just mark characters to be removed with 0 in the map-of-counters, skip them over in the normal search, remove them when removing the prefix).
This algorithm is quadratic in the worst case, at least in the size of the alphabet (the worst case is abc...zabc...
; the algorithm examines half the string for each character, only to decide to keep it). I think this can be fixed by keeping track not just of the smallest, but also second smallest and third smallest and so on, in a kind of priority queue structure (details are left as n exercise for the reader).
回答3:
m69's javascript in c++:
string smallestString(string input) {
int chars = 0;
int after[sizeof(input)];
for (int i = input.length() - 1; i >= 0; i--) {
chars |= 1 << (input[i] - 97);
after[i] = chars;
}
string result = "";
int start = 0, pos;
while (chars) {
for (int i = 0; i < 26; i++) {
if (chars & (1 << i)) {
pos = input.find('a' + i, start);
if (chars == (chars & after[pos])) {
result += 'a' + i;
chars -= 1 << i;
break;
}
}
}
start = pos + 1;
}
return result;
}
回答4:
I find this approach simple.
Firstly find the count of each character.
input: s
vector<int> cnt(26);
int n=s.size();
for(int i=0;i<n;i++) cnt[s[i]-'a']++;
Have a visited vector, vector<bool> visit(26);
string ans="";
for(int i=0;i<n;i++){
int t=s[i]-'a';
cnt[t]--;
if(visit[t]) continue;
while(ans.size()>0 && s[i]<ans.back() && cnt[ans.back()-'a']>0){
visit[ans.back()-'a']=false;
ans.pop_back();
}
ans.push_back(s[i]);
visit[t]=true;
}
return ans;
The time complexity is O(n)
来源:https://stackoverflow.com/questions/34476853/select-lexicographical-smallest-string-after-duplicates-removed