
Thread: Help with AWK script


I've been working on this for some time and, being a complete awk novice, I'm a bit stumped at the moment.

The goal is to join a number of tables based on the first column of data. I want something similar to the bash 'join' command, but I have 60 tables to join. The other caveat is that each table may have a different number of rows, depending on the results of upstream processing, and the final number of rows needs to equal that of the table with the most rows. The final rule is that I only need the first and second columns of data, and I need to change the header of the second column in order to identify which original table contributed that column to the final output.

Here is an example of the input tables (I'm attaching a couple to play around with):

table 1
code:
amp             median  width
ampl387948860   45      173
ampl414952472   294     172
ampl391478722   37.5    158
ampl392076156   31      169
ampl535375693   12      176
ampl396629974   249     177
ampl408948551   386     129
ampl413431337   444     174
ampl401922916   297     75
ampl405468071   2       172
ampl408440700   0       173
table 2
code:
amp             median  width
ampl387948860   101     173
ampl391478722   74      158
ampl392076156   50      169
ampl396629974   483     177
ampl401922916   451     75
ampl405468071   4       172
ampl408440700   0       173
Ultimately, I want to have one large table, with the first column holding the "ampl..." IDs from the subtables, and each following column holding either the data from a subtable, or a placeholder if that subtable has no data corresponding to the value in column 1:

code:
amp             subtable1       subtable2
ampl387948860   45              101
ampl414952472   294             ---
ampl391478722   37.5            74
ampl392076156   31              50
ampl535375693   12              ---
ampl396629974   249             483
ampl408948551   386             ---
ampl413431337   444             ---
ampl401922916   297             451
ampl405468071   2               4
ampl408440700   0               0
So far I've come up with the following:

code:
awk '{ head=gensub(/median/,substr(FILENAME,1,20),1,$2); } $0 {arr[$1]=arr[$1] "\t" head } END { for(i in arr) print i, arr[i] }' *.tsv
I thought this was working until I realized it is not padding the null entries with a tab, but rather pushing the values over to the left-most column. So, the header no longer corresponds to the correct table. I tried to add an if statement in there to substitute "---" if there is no data in the field ($2), but I can't seem to get it to work right:

code:
awk '{ head=gensub(/median/,substr(FILENAME,1,20),1,$2); } $0 {arr[$1]=arr[$1] "\t" head } END { for(i in arr) { if(i == 0) print "---", arr[i]; else print i, arr[i] }  }' *.tsv
Does anyone have a suggestion on how to get this to work correctly?

code:
awk '
    { row[$1]=$1; col[FILENAME]=FILENAME; val[$1":"FILENAME]=$2; }

    END { for(x in row)
          {
            printf("%s", x);
            for(y in col)
            {
              # print the stored value, or the placeholder if this file had no such row
              printf("\t%s", ((val[x":"y]=="")?"---":val[x":"y]) );
            }
            printf("\n");
          }
        }' *.tsv
output:
code:
amp	median	median
ampl414952472	---	294
ampl408948551	---	386
ampl396629974	483	249
ampl405468071	4	2
ampl391478722	74	37.5
ampl401922916	451	297
ampl408440700	0	0
ampl535375693	---	12
ampl413431337	---	444
ampl392076156	50	31
ampl387948860	101	45
It's a proof of concept; the column headers still need fixing.
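
One possible way to fix the headers, as a rough sketch rather than a definitive solution: treat the first line of each file as its header, label each column with the file name minus a ".tsv" suffix (an assumption; the substr(FILENAME,1,20) idea from the original attempt would work just as well), and keep files and row keys in indexed arrays so the output order does not depend on awk's unspecified "for (x in array)" order. The file names subtable1.tsv and subtable2.tsv and the shortened sample data are made up for illustration.

```shell
#!/bin/sh
# Two small sample tables (a subset of the rows posted above; file names are hypothetical).
dir=$(mktemp -d)
printf 'amp\tmedian\twidth\nampl387948860\t45\t173\nampl414952472\t294\t172\nampl408440700\t0\t173\n' > "$dir/subtable1.tsv"
printf 'amp\tmedian\twidth\nampl387948860\t101\t173\nampl408440700\t0\t173\n' > "$dir/subtable2.tsv"

merged=$(cd "$dir" && awk '
NF == 0 { next }                   # ignore blank lines
FNR == 1 {                         # first line of each file is assumed to be its header
    files[++nfiles] = FILENAME     # remember files in command-line order
    name = FILENAME
    sub(/\.tsv$/, "", name)        # column label = file name without the .tsv suffix
    label[FILENAME] = name
    next
}
!($1 in seen) {                    # remember row keys in first-seen order
    seen[$1] = 1
    keys[++nkeys] = $1
}
{ val[$1, FILENAME] = $2 }         # keep only column 2, keyed by (row key, file)
END {
    printf "amp"
    for (f = 1; f <= nfiles; f++) printf "\t%s", label[files[f]]
    printf "\n"
    for (k = 1; k <= nkeys; k++) {
        printf "%s", keys[k]
        for (f = 1; f <= nfiles; f++) {
            # "in" tests for the entry without creating it
            if ((keys[k], files[f]) in val) printf "\t%s", val[keys[k], files[f]]
            else printf "\t---"
        }
        printf "\n"
    }
}' subtable1.tsv subtable2.tsv)

printf '%s\n' "$merged"
rm -rf "$dir"
```

With the two sample files this should print a header line of amp, subtable1, subtable2, followed by the three ampl rows in their original order, with "---" in the subtable2 column for ampl414952472. It avoids gensub, so it should also run under mawk and not only gawk. For the real data, the awk program can be run against *.tsv instead of the two named files.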

