385 - DNA Translation

All about problems in Volume 3. If there is a thread about your problem, please use it. If not, create one with its number in the subject.

Moderator: Board moderators

Post Reply
User avatar
SeaViperz
New poster
Posts: 4
Joined: Mon Dec 30, 2002 9:44 am

385 - DNA Translation

Post by SeaViperz » Sun May 11, 2003 9:45 am

Hi all,

Can someone please explain why my program is wrong? It works correctly only for the last two test datas.
[cpp]
/*@BEGIN_OF_SOURCE_CODE*/
#include <stdio.h>
#include <string.h>
int len;
char strand[256]={'\0'};
char *seq[4][4][4]={
{ {"Phe","Phe","Leu","Leu"},{"Ser","Ser","Ser","Ser"},{"Tyr","Tyr","END","END"},{"Cys","Cys","END","Trp"} },
{ {"Leu","Leu","Leu","Leu"},{"Pro","Pro","Pro","Pro"},{"His","His","Gln","Gln"},{"Arg","Arg","Arg","Arg"} },
{ {"Ile","Ile","Ile","Met"},{"Thr","Thr","Thr","Thr"},{"Asn","Asn","Lys","Lys"},{"Ser","Ser","Arg","Arg"} },
{ {"Val","Val","Val","Val"},{"Ala","Ala","Ala","Ala"},{"Asp","Asp","Glu","Glu"},{"Gly","Gly","Gly","Gly"} }
};
int val(char a){
switch(a){
case 'T': return 0;
case 'C': return 1;
case 'A': return 2;
case 'G': return 3;
}
return 0;
}
bool check(){
int i;
char codon[4]={'\0'},amino[4]={'\0'},out[350]={'\0'};
bool start=false,firstcodon=false;
for(i=0;i<len;i++){
codon[i%3]=strand;
if(i%3==2){
strcpy(amino,seq[val(codon[0])][val(codon[1])][val(codon[2])]);
if(strcmp(amino,"Met")==0) start=true;
else if(strcmp(amino,"END")==0) break;
if(start){
if(firstcodon){
if(out[0]!='\0') strcat(out,"-");
strcat(out,amino);
}
firstcodon=true;
}
}
}
if(start) printf("%s\n",out);
return start;
}
void reverse(){
int i;
char a;
for(i=len/2;i>=0;i--){
a=strand;
strand=strand[len-i-1];
strand[len-i-1]=a;
}
return;
}
void complement(){
int i;
for(i=0;i<len;i++){
switch(strand){
case 'T': strand='A'; break;
case 'C': strand='G'; break;
case 'A': strand='T'; break;
case 'G': strand='C'; break;
}
}
return;
}
int main(){
while(gets(strand)){
if(strand[0]=='*') break;
len=strlen(strand);
complement();
if(!check()){
complement();
if(!check()){
reverse();
if(!check()){
complement();
if(!check()) printf("*** No translatable DNA found ***\n");
}
}
}
}
return 0;
}
/*@END_OF_SOURCE_CODE*/
[/cpp]

Thanks a lot. :D
*****SeaViperz*****
Draw your fangs and let's duel.
For my poison serves forever as my fuel!

Stormreaver
New poster
Posts: 1
Joined: Tue Sep 16, 2003 1:36 pm
Contact:

Post by Stormreaver » Tue Sep 16, 2003 1:39 pm

I assume you've already solved this problem or given up, but the immediate proiblem I see is that you don't consider protein substrings of the strand that is not preceeded by a number of characters divisible by three.

For example ATGATGTGA is a valid protein, but so is GATGATGTGA.

md6nguyen
New poster
Posts: 2
Joined: Wed Dec 03, 2003 9:50 pm

385 - DNA Translation WA

Post by md6nguyen » Thu Mar 04, 2004 7:31 pm

Hi, I got a WA in 385 (DNA Translation) and cannot find out why. My algorithm is straightforward, I go forward and if I cannot find a valid primary or complentary strand then I go backward.

Can someone give me a test case to show that my program is wrong?

/* @JUDGE_ID: 39439PK 385 C */

#include <stdio.h>

#define NUM_BASES 4
#define MAX_LEN 256
#define CODON_LEN 3

#define ph "Phe"
#define le "Leu"
#define se "Ser"
#define ty "Tyr"
#define cy "Cys"
#define tr "Trp"
#define pr "Pro"
#define hi "His"
#define gl "Gln"
#define ar "Arg"
#define il "Ile"
#define me "Met"
#define th "Thr"
#define as "Asn"
#define ly "Lys"
#define va "Val"
#define al "Ala"
#define ap "Asp"
#define gu "Glu"
#define gy "Gly"

/* Amino Acid table lookup */
char *aminoAcid[NUM_BASES][NUM_BASES][NUM_BASES] = {
{{ph,ph,le,le}, {se,se,se,se}, {ty,ty,"",""},{cy,cy,"",tr}},
{{le,le,le,le}, {pr,pr,pr,pr}, {hi,hi,gl,gl},{ar,ar,ar,ar}},
{{il,il,il,me}, {th,th,th,th}, {as,as,ly,ly},{se,se,ar,ar}},
{{va,va,va,va}, {al,al,al,al}, {ap,ap,gu,gu},{gy,gy,gy,gy}},
};

/* index to Amino acid table */
#define U 0
#define C 1
#define A 2
#define G 3

int findStartCodon(char *s) {
int n = CODON_LEN;
return strncmp(s,"TAC",n)==0 || strncmp(s,"ATG",n)==0;
}

int findStopCodon(char *s, int isPrimary) {
int n = CODON_LEN;
if (isPrimary) {
return strncmp(s,"ATT",n)==0 || strncmp(s,"ATC",n)==0 ||
strncmp(s,"ACT",n)==0;
}
else {
return strncmp(s,"TAA",n)==0 || strncmp(s,"TAG",n)==0 ||
strncmp(s,"TGA",n)==0;
}
}

struct {
char *seq[MAX_LEN/CODON_LEN];
int len;
} proteinSeq;

int getline(char s[], int lim) {
int c, i;
i = 0;
while (--lim > 0 && (c=getchar()) != EOF && c != '\n') {
s[i++] = c;
}
s = '\0';
if (c == EOF) return -1;
return i;
}

void printProteinSeq() {
int i;
for (i = 0; i < proteinSeq.len; i++) {
printf("%s", proteinSeq.seq);
if (i < proteinSeq.len-1) printf("-");
}
printf("\n");
}

int baseConvert(char base, int isPrimary) {
switch(base) {
case 'A':
return isPrimary ? U : A;
case 'T':
return isPrimary ? A : U;
case 'G':
return isPrimary ? C : G;
case 'C':
return isPrimary ? G : C;
}
}

/* Generate Protein sequence for a forward order DNA strand */
int processForward(char *s) {
int i, j, sLen = strlen(s);
int idx[CODON_LEN]; /* index to aminoAcid lookup */
char *amino;
int isPrimary;

proteinSeq.len = 0; /* Number of amino acids in protein */
if (sLen <= CODON_LEN) return 0;
isPrimary = s[0] == 'T' ? 1 : 0;
for (i = CODON_LEN; i <= sLen-CODON_LEN; i += CODON_LEN) {
if (findStopCodon(&s, isPrimary)) break;
for (j = 0; j < CODON_LEN; j++) {
idx[j] = baseConvert(s[i+j], isPrimary);
}
amino = aminoAcid[idx[0]][idx[1]][idx[2]];
if (strlen(amino)==0) return 0;
proteinSeq.seq[proteinSeq.len++] = amino;
}
if (proteinSeq.len == 0 || i > sLen-CODON_LEN) {
return 0;
}
else {
printProteinSeq();
return 1;
}
}

int main() {
int i, sLen;
int goBackward = 0;

char DNAStrand[MAX_LEN];

char *s, *start, *end, c;

/* Read DNA strand */
while (getline(DNAStrand, MAX_LEN) != -1) {
if (strcmp(DNAStrand, "*") == 0) break;
s = DNAStrand;
goBackward = 0;
while (*s != '\0') {
if ( findStartCodon(s) && processForward(s) ) {
break;
}
s++;
if (*s == '\0' && !goBackward) {
goBackward = !goBackward;

/* Reverse DNA strand */
for (start=DNAStrand, end=s-1; start < end; start++, end--) {
c = *start;
*start = *end;
*end = c;
}
s = DNAStrand;
}
}
if (*s=='\0') printf("*** No translatable DNA found ***\n");

}

}

d91-lek
New poster
Posts: 22
Joined: Thu Sep 16, 2004 2:25 am
Location: KTH, Stockholm
Contact:

Indeed, me too WA

Post by d91-lek » Thu Sep 16, 2004 2:35 am

Test cases my way too, please!

[c]
/* @JUDGE_ID: 49998ME 385 C "Brute force" */

/* Work in DNA, not mRNA. */


#include <stdio.h>
#include <string.h>


char *strrev(char *str)
{
char *b = str, *e = str + strlen(str) - 1;
char tmp;
while(b < e) {
tmp = *e;
*e = *b;
*b = tmp;
b++;
e--;
}
return str;
}

/* {T, C, A, G}
A <-> T, C <-> G
10 <-> 00, 01 <-> 11
dna ^= 2
*/
char *dnacomplement(char *dna)
{
char *c = dna;
while(*c) {
switch(*c) {
case 'T': *c = 'A'; break;
case 'A': *c = 'T'; break;
case 'C': *c = 'G'; break;
case 'G': *c = 'C'; break;
};
c++;
}
return dna;
}


enum basecode {T = 0, C = 1, A = 2, G = 3};
/* dnacode[1:st][3:rd][2:nd] */
char dnacode[4][4][4][4] = {{{{"Phe"},{"Ser"},{"Tyr"},{"Cys"}},
{{"Phe"},{"Ser"},{"Tyr"},{"Cys"}},
{{"Leu"},{"Ser"},{"---"},{"---"}},
{{"Leu"},{"Ser"},{"---"},{"Trp"}}},
{{{"Leu"},{"Pro"},{"His"},{"Arg"}},
{{"Leu"},{"Pro"},{"His"},{"Arg"}},
{{"Leu"},{"Pro"},{"Gln"},{"Arg"}},
{{"Leu"},{"Pro"},{"Gln"},{"Arg"}}},
{{{"Ile"},{"Thr"},{"Asn"},{"Ser"}},
{{"Ile"},{"Thr"},{"Asn"},{"Ser"}},
{{"Ile"},{"Thr"},{"Lys"},{"Arg"}},
{{"Met"},{"Thr"},{"Lys"},{"Arg"}}},
{{{"Val"},{"Ala"},{"Asp"},{"Gly"}},
{{"Val"},{"Ala"},{"Asp"},{"Gly"}},
{{"Val"},{"Ala"},{"Glu"},{"Gly"}},
{{"Val"},{"Ala"},{"Glu"},{"Gly"}}}};



int synthesize(char* dna) {
char *base = strstr(dna, "ATG");
enum basecode bc[3];
int bcidx = 0;
char *aminoacid, protein[(4*255)/3+1];
char *pend = protein;

if(!base)
return 0;


base += 3; /* ATG */
while(*base) {
switch(*base) {
case 'T': bc[bcidx] = T; break;
case 'A': bc[bcidx] = A; break;
case 'C': bc[bcidx] = C; break;
case 'G': bc[bcidx] = G; break;
};
if(bcidx == 2){
aminoacid = dnacode[bc[0]][bc[2]][bc[1]];
if(*aminoacid == '-'){
*(--pend) = ""[0];
printf("%s\n", protein);
return 1;
}
*(pend++) = *(aminoacid++);
*(pend++) = *(aminoacid++);
*(pend++) = *(aminoacid);
*(pend++) = '-';
bcidx = 0;
} else
bcidx++;
base++;
}

return 0;
}

main()
{
char dna[256];

while(scanf("%s", dna), dna[0]!='*')
if(!synthesize(dna)
&& !synthesize(strrev(dna))
&& !synthesize(dnacomplement(dna))
&& !synthesize(strrev(dna)))
printf("*** No translatable DNA found ***\n");
}

[/c]

User avatar
Gigi's Baby
New poster
Posts: 19
Joined: Thu May 02, 2002 5:47 am
Location: Zhongshan (Sun Yat-sen) University

Post by Gigi's Baby » Fri Sep 17, 2004 3:26 pm

To d91-lek:
I have seen your code. You just find the first "ATG" in the string as the head of the
strand , but maybe the head of the strand is not the first "ATG", so you should try all the
"ATG" as the head.

d91-lek
New poster
Posts: 22
Joined: Thu Sep 16, 2004 2:25 am
Location: KTH, Stockholm
Contact:

Accepted

Post by d91-lek » Fri Sep 17, 2004 3:53 pm

To Gigi's Baby:
The first ATG is the start of the strand. All codes in dna are meaningful
so the only thing stopping protein synthesizing once you have started is
strand end or a stop code. Admittedly, there could be more proteins in
the strand, but not according to the problem specification.

Actually my fault was ...ATGTGA... - valid but empty strands. The output
for these should be *** No translatable DNA found ***. Besides that, I
did *(--pend) = '\0', writing before the string when no protein was found.
With no string-end the last protein was printed (in the best case).

The test case I needed was:

<a valid protein>
ATGTGA
*

OK, over and out. I'll be at 179.

User avatar
Gigi's Baby
New poster
Posts: 19
Joined: Thu May 02, 2002 5:47 am
Location: Zhongshan (Sun Yat-sen) University

Post by Gigi's Baby » Fri Sep 17, 2004 4:11 pm

:)
I didn't read the problem description carefully.
My AC code tried all the ATG :oops:

d91-lek
New poster
Posts: 22
Joined: Thu Sep 16, 2004 2:25 am
Location: KTH, Stockholm
Contact:

Post by d91-lek » Sun Sep 19, 2004 2:32 pm

Please don't be sorry because you
are generous and I am cheap. :wink:

trulo17
Learning poster
Posts: 62
Joined: Mon Jan 24, 2005 6:12 am
Location: Lima,Peru

385 DNA Translation

Post by trulo17 » Mon Feb 19, 2007 3:00 am

could you help me understand the statement? It's not very clear, how can i know if the given string is the original or complemented dna, or the reversed form of them? Moreover, if more than one of them generates protein? how can i know what to output? thx in advance

Robert Gerbicz
Experienced poster
Posts: 196
Joined: Wed May 02, 2007 10:12 pm
Location: Hungary, Pest county, Halasztelek
Contact:

Post by Robert Gerbicz » Sun Feb 03, 2008 10:25 pm

I get only WA. What is the correct output for the above posted tricky:

Code: Select all

ATGTGA
*
I would appreciate more (critical) input/output!

brianfry713
Guru
Posts: 5947
Joined: Thu Sep 01, 2011 9:09 am
Location: San Jose, CA, USA

Re: 385 DNA Translation

Post by brianfry713 » Thu Jan 12, 2012 12:49 am

You need to test all 4 possibilities. The given DNA strand may be either the primary or the complementary DNA strand, and it may appear in either forward or reverse order.

This problem has a correction program that will accept any possible protein, so print any of them. For example in the test input:
TTAATACGACATAATTAT

The complement:
AAUUAUGCUGUAUUAAUA
Has the protein:
Leu-Tyr

The reverse complement:
AUAAUUAUGUCGUAUUAA
Has the protein:
Ser-Tyr

Either answer should get you AC.
Check input and AC output for thousands of problems on uDebug!

brianfry713
Guru
Posts: 5947
Joined: Thu Sep 01, 2011 9:09 am
Location: San Jose, CA, USA

Re: 385 - DNA Translation WA

Post by brianfry713 » Thu Jan 12, 2012 12:51 am

My AC program prints "*** No translatable DNA found ***" for any sequence of less than 9 characters.
Check input and AC output for thousands of problems on uDebug!

Post Reply

Return to “Volume 3 (300-399)”