Page 1 of 1

909 - The BitPack Data Compression Problem

Posted: Wed Nov 01, 2006 10:55 pm
by Abednego
I seem to be the only one who can't solve this problem. The 128 byte is useless, right? There is never a reason to use it.

Posted: Sun Nov 12, 2006 4:21 pm
by viniciusweb
I have some (basic) questions regarding 909:

- Is there a maximum input size?
- The input and output must be made using stdin / stdout, right? Because the problem says "The output file will consist of a file with compressed content."

(i'm new to this site, and as far as i know we're only allowed to use the standard input and output)

Posted: Sun Nov 12, 2006 6:13 pm
by rio
- Is there a maximum input size?
There is no discription about the input size. So you should not buf all the inputs.
I took a way which reading one chracter at once with getchar().
- The input and output must be made using stdin / stdout, right?
Yes, it's true for every problems here.

Posted: Thu Nov 23, 2006 11:54 am
by sclo
judging by the runtimes of the accepted solutions, the input should be fairly small.
I don't know why I can't get AC on this problem. I just did DP, so it should be right. I never output 128. There might be problem with the judge I/O.
I also wrote another program to do the data decompression and checked that it can always recover my input.
Here's the main part of my code:

Code: Select all

int F[maxn],P[maxn],Q[maxn],s[maxn],n;
void print(int i) {
  if(i==n) return;
  if(Q[i]==0) {
    putchar(P[i]+128);
    putchar(s[i]);
    print(i+P[i]+1);
  }
  else {
    putchar(P[i]);
    FOR(k,i,i+P[i]) putchar(s[k]);
    print(i+P[i]+1);
  }
}
int f(int i) {
  if(i==n) return 0;
  if(F[i]==-1) {
    int j;
    F[i] = inf;
    for(j=1; j<n && j<=127 && s[i+j]==s[i]; j++)
      if(f(i+j+1)+2<F[i])
        F[i]=f(i+j+1)+2,P[i]=j,Q[i]=0;
    for(j=0; j<n && j<=127; j++)
      if(f(i+j+1)+j+2<F[i])
        F[i]=f(i+j+1)+j+2,P[i]=j,Q[i]=1;
  }
  return F[i];
}
int main(){
  int c;
  n=0;
  while(1) {
    c=getchar();
    if(c==EOF) {
      break;
    }
    s[n++]=c;
  }
  assert(n>0 && n<=maxn);
  memset(F,-1,sizeof(F));
  f(0);
  print(0);
}
After fixing the I/O problems, I still can't get AC :(

Let's compare our WA submissions

Posted: Mon Dec 04, 2006 12:52 pm
by avm
I'm a third person who can't solved this problem.
I wrote two independent DP solutions on C++ and Java and both programs
received WA.
There is only 1 test in input, right?
Could anybody verifies my output?
Could anybody post part AC code which shows how
correctly handle input,output?

Test cases generator (1.in,2.in, ..., 9.in)

Code: Select all

#include <stdio.h>
#define F(x) freopen(x ".in","w",stdout)
void p(const char *s,int count)
{
while(--count >= 0) printf("%s",s);
}
int main(void)
{
int i,k;
unsigned int x;
char z[256];
F("1");p("122",42);fclose(stdout);
F("2");p("122",43);fclose(stdout);
F("3");p("1333",32);fclose(stdout);
F("4");p("aabb",32);fclose(stdout);
F("5");p("aabbc",26);fclose(stdout);
F("6");p("aacbb",26);fclose(stdout);
F("7");p("ab",64);p("22",1);p("ab",64);fclose(stdout);
F("8");for(i=1;i<=128;i++) {z[0]='a'+(i&1);z[1]=0;p(z,i);} fclose(stdout);
F("9");
x = 239;
for(k=1;k<=10000;k++) for(i='a';i<='z';i++)
  {
  z[0]=i;
  z[1]=0;
  x = 1664525 * x + 1013904223;
  p(z,(x % 3) + 1);
  }
fclose(stdout);
return 0;
}
Output files sizes

Code: Select all

1 127 bytes
2 130 bytes
3 128 bytes
4 128 bytes
5 131 bytes
6 131 bytes
7 260 bytes
8 256
9 476264 bytes

Posted: Mon Dec 04, 2006 3:49 pm
by rio
I've got these output file size with my AC code

Code: Select all

1  168 
2  172
3  128
4  128
5  156
6  156
7  260
8  256
9  490914

Posted: Mon Dec 04, 2006 4:29 pm
by avm
Thanks, rio
In right solution, for input with length <= 128, output can't be greater than l+1,
Because we can just copy input.
Size of 1.in is 126.
So your output for 1.in is wrong.
So I think that judge output is wrong with probability about 98%.

Posted: Mon Dec 04, 2006 5:04 pm
by rio
You're right, I didn't notice that point. My code seems to be wrong.

But I think the judge I/0 is not wrong (but week). Here's my guess.
I think that there is no case that my code outputs the wrong answer (like 1.in) in the judge I/O.
And there is a case that my code outputs the correct answer and your code doesn't.

Otherwise, I can't explain how all the otherguys got accepted.
I can't think that they all done the same mistake as me.

So I was just lucky. There was no critical case for me.
----
Sory for my poor English.

Posted: Mon Dec 04, 2006 6:05 pm
by avm
I wrote third absolutely wrong solution on C but it received AC.

Code: Select all

#include <stdio.h>
unsigned char a[256];
int n=0;
void f(unsigned char *a,int n)
{
int i;
if (n < 1) return;
putchar(n-1);
for(i=0;i<n;i++) putchar(a[i]);
}
int main()
{
n=0;
for(;;)
  {
  int l,c = getchar();if (c == EOF) break;
  a[n++] = (unsigned char) c;
  if (n > 1 && a[n-1] == a[n-2])
    {
    f(a,n-2);
    l = 129;
    for(;;)
      {
      c = getchar();if (c == EOF) break;
      if (c != a[n-1])
        {
        ungetc(c,stdin);
        break;
        }
      if (++l == 255) break;
      }
    putchar(l);
    putchar(a[n-1]);
    n = 0;
    }
  if (n == 129)
    {
    f(a,128);
    a[0] = a[128];
    n = 1;
    }
  }
f(a,n);
return 0;
}

Posted: Mon Dec 04, 2006 6:14 pm
by rio
wow. Its a strange problem.

Posted: Tue Dec 05, 2006 12:16 am
by little joey
My accepted program also uses this "too simple" approach to give non-optimal results, so the judge must be wrong (with better than 98% chance, I think).

Posted: Sun May 06, 2007 10:54 pm
by viniciusweb
A few months ago I got my code accepted, than for some reason the problem appeared in my "tried but not solved" list, so I submitted again and got WA.

I rewrote the code and got AC. My output to the test cases avm posted is:

Code: Select all

1. 127
2. 131
3. 128
4. 129
5. 132
6. 132
7. 261
8. 256
9. 490572

Posted: Sun May 27, 2007 7:33 pm
by ..
viniciusweb wrote: I rewrote the code and got AC. My output to the test cases avm posted is:

Code: Select all

1. 127
2. 131
3. 128
4. 129
5. 132
6. 132
7. 261
8. 256
9. 490572
My AC program gets the exact output as avm, so I think your program is still not optimal.