Page 1 of 3

### 10745 - Dominant Strings

Posted: Sun Oct 17, 2004 12:53 pm
Hello,

We tried solving the problem by reading the strings one after the other and maintaining a list of the currently nondominated strings. When a new string is read, the list is scanned and every member of the list that is dominated by the new string is removed from the list. If some of the strings in the list dominates the new one the scan stops and we read the next string. At the end of the scan, if none of the strings in the list dominates the new string, it is added to the list. However, this solution timed out. Could some of you post an idea for a faster solution and running time analysis if possible?

Greetings

Posted: Sun Oct 17, 2004 1:59 pm
Hint: There are fewer strings one given string can dominate than total number of strings.

Posted: Sun Oct 17, 2004 3:15 pm
I solve it in this way:
Read all the string.
Sort the strings in some way such that if string B < string A, string B never dominates string A.
Then check for all pair (string B, string A) if string A dominates string B.

Posted: Sun Oct 17, 2004 4:13 pm
Hi,

.. : 10x, this is beutiful. And of course you have to look at all adjacent pairs, otherwsie you'll time out.

marian: I still don't get your hint.

Posted: Sun Oct 17, 2004 5:24 pm
..: Aren't you checking O(n^2) pairs in the worst case (ie. no dominating strings)? Unless I didn't get your algorithm.

BiK: I was just saying that one string of length K can dominate at most 2^K-2 strings. Since K<=10, one string can dominate at most 1022 strings. So it's cheaper to generate all possible strings, given string can dominate and check if it is one of the given strings. If so, flag that string as non-dominating.

I used trie to represent input strings, and used recursive procedure to generate all dominated strings. So the complexity is O(N*2^K), which is better that O(N^2), because 2^K < N, in the worst case.

Posted: Sun Oct 17, 2004 5:32 pm
marian wrote:..: Aren't you checking O(n^2) pairs in the worst case (ie. no dominating strings)? Unless I didn't get your algorithm.
Yes, but enough to get AC
marian wrote: I used trie to represent input strings, and used recursive procedure to generate all dominated strings. So the complexity is O(N*2^K), which is better that O(N^2), because 2^K < N, in the worst case.
Your alogrithm is much more interesting than mine.
But how can you check if the generated string is in the input quickly?
There are totally (N*2^K) generated string, are you generate all these strings and then sort them for checking?~

Posted: Sun Oct 17, 2004 7:26 pm
marian: Thanks for your explanation. As far as I understand, you look at the strings as multisets of characters. And in your trie you do not distinguish between strings that give rise to the same multiset. Is that correct? For example you can sort the string lexicographically. And for a given multiset you keep the indeces of the input strings that produce the given multiset. So I assume that given an input string you generate the multisets dominated by that input string and mark them in the trie as non-dominating. And at the end you traverse your trie to find the multisets that are dominating. This is how I imagine your algorithm working. Am I correct?

..: I tried your approach and it timed out. If you got accepted, I think that it is exceptional and this is not the idea of the solution. Do you use pure C? I'm currios how you managed to get under the time limit with an O(N^2) solution.

Posted: Sun Oct 17, 2004 7:51 pm
BiK wrote: ..: I tried your approach and it timed out. If you got accepted, I think that it is exceptional and this is not the idea of the solution. Do you use pure C? I'm currios how you managed to get under the time limit with an O(N^2) solution.
I used pure C. I tell more about how I optimize the algorithm.
For each string, I will store how many different kind of chars and the occurance of each kind in the string. So string A dominates string B only if number of kind in A >= number of kind in B.
Also, for string A, B, C, if I know string C is dominated by B, I will mark the string C dominated and won't check again if any other string A dominates C.

The above rules will not decrease the complexity, but it decreases the practical number of opertaions. If I remove these 2 rules of my AC code, the runtime will grow from 4s to 8s.

Posted: Sun Oct 17, 2004 8:22 pm
BiK wrote:marian: Thanks for your explanation. As far as I understand, you look at the strings as multisets of characters. And in your trie you do not distinguish between strings that give rise to the same multiset. Is that correct?
Exactly.
BiK wrote:For example you can sort the string lexicographically. And for a given multiset you keep the indeces of the input strings that produce the given multiset.
In fact, I'm keeping the opposite - for every string I keep the index of apropriate multiset. In the trie, I have indices of multisets.
BiK wrote:So I assume that given an input string you generate the multisets dominated by that input string and mark them in the trie as non-dominating. And at the end you traverse your trie to find the multisets that are dominating.
Almost, only I'm not traversing trie at the end, but flagging non-dominating multiset indices to some array.

Also, to optimize this solution a bit, I'm traversing a trie as I generate dominated strings (I have node pointer as a parameter of my recursive function). This way, I can reduce the number of generated strings, because I can stop generating, when the needed edge is not in the trie.

Marian

Posted: Sun Oct 17, 2004 11:55 pm
to BiK
i think that the brute force
O(nn) approach mentioned by
Lawrence (aka ..) can easily get
accepted with the current input
and time limit. I used "pure" c
(no asm,register etc.) and O(nn).
csaba noszaly

Posted: Mon Oct 18, 2004 1:39 pm
think that the brute force O(nn) approach mentioned by Lawrence (aka ..) ...
Who is Lawrence?

Posted: Mon Oct 18, 2004 2:21 pm
Me

### Bah

Posted: Tue Oct 19, 2004 2:09 am
I, on the other hand, don't think that tries are the best data structure for this problem. Any data structure used for this problem has a construction penalty, especially once you break the 1 s barrier. I think an exhaustive search with good short circuiting heuristics goes best here combined with cheap to construct data structures.

What kept me inspired was the fact that the former first place implementation was written in PASCAL. Surely, I could not let this stand, to the embarassment of C++ programmers, everywhere .

Michael Goldshteyn
( 10745 - 0.129 s )

P.S. for those that still do not understand the problem statement, here is a simplified version:

Given strings A and B. B dominates A if BOTH conditions below are satisfied:

- Every distinct character in A also appears in B
- Any character that appears multiple times in A must appear at least that many times in B

So, a simple way of saying this is that a string B dominates a string A if A can be constructed using some or all of the characters from B.

Here are examples where the first string is dominated by the second:

abc bac (in fact, each dominates the other, so neither is a dominant string)
abc dcba
aabcc aaabccd
abccc aaabccddcc

Examples where the first string is NOT dominated by the second:

abc abdefg