How to split an awk field correctly
I have a file (test.bed) that looks like this (it might not be tab-separated):

    chr1 10002 10116 id=1;frame=0;strand=+; 0 +
    chr1 10116 10122 id=2;frame=0;strand=+; 0 +
    chr1 10122 10128 id=3;frame=0;strand=+; 0 +
    chr1 10128 10134 id=4;frame=0;strand=+; 0 +
    chr1 10134 10140 id=5;frame=0;strand=+; 0 +
    chr1 10140 10146 id=6;frame=0;strand=+; 0 +
    chr1 10146 10182 id=7;frame=0;strand=+; 0 +
    chr1 10182 10188 id=8;frame=0;strand=+; 0 +
    chr1 10188 10194 id=9;frame=0;strand=+; 0 +
    chr1 10194 10200 id=10;frame=0;strand=+; 0 +

I want to produce the following output (which should be tab-separated):

    chr1 10002 10116 id=1 0 +
    chr1 10116 10122 id=2 0 +
    chr1 10122 10128 id=3 0 +
    chr1 10128 10134 id=4 0 +
    chr1 10134 10140 id=5 0 +
    chr1 10140 10146 id=6 0 +
    chr1 10146 10182 id=7 0 +
    chr1 10182 10188 id=8 0 +
    chr1 10188 10194 id=9 0 +
    chr1 10194 10200 id=10 0 +

I have tried the following code:

    awk 'OFS="\t" split ($0, a, ";"){print a[1],$5,$6}' test.bed

but I get:

    chr1 10002 10116 id=1 40 4+
    chr1 10116 10122 id=2 40 4+
    chr1 10122 10128 id=3 40 4+
    chr1 10128 10134 id=4 40 4+
    chr1 10134 10140 id=5 40 4+
    chr1 10140 10146 id=6 40 4+
    chr1 10146 10182 id=7 40 4+
    chr1 10182 10188 id=8 40 4+
    chr1 10188 10194 id=9 40 4+
    chr1 10194 10200 id=10 40 4+

What am I doing wrong? Somehow the number '4' is added to the last 2 fields. I thought the number '4' might have something to do with splitting in the 4th field; however, I tried producing a similar file where the 3rd field is split, and still got the number '4' added to the last 2 fields. I am rather new to 'awk', so I guess there is an error in my syntax. Any help is appreciated.
If you set the field separator to whitespace or semicolons, you won't have to handle the splitting yourself:

    $ awk '{print $1,$2,$3,$4,$8,$9}' FS='[[:space:]]+|;' OFS='\t' file
    chr1 10002 10116 id=1 0 +
    chr1 10116 10122 id=2 0 +
    chr1 10122 10128 id=3 0 +
    chr1 10128 10134 id=4 0 +
    chr1 10134 10140 id=5 0 +
    chr1 10140 10146 id=6 0 +
    chr1 10146 10182 id=7 0 +
    chr1 10182 10188 id=8 0 +
    chr1 10188 10194 id=9 0 +
    chr1 10194 10200 id=10 0 +

As for what you are doing wrong in:
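To see why the command above prints $8 and $9 rather than $7 and $8, it can help to dump every field with its index. This is a minimal sketch, assuming a single space-separated sample record held in a hypothetical shell variable named line:

```shell
# One record copied from the question; the variable name "line" and the
# single-space separators are assumptions made for this illustration.
line='chr1 10002 10116 id=1;frame=0;strand=+; 0 +'

# Split on runs of whitespace OR a single semicolon, then print each
# field in brackets together with its index.
fields=$(printf '%s\n' "$line" |
  awk -F '[[:space:]]+|;' '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }')

printf '%s\n' "$fields"
```

The last semicolon is immediately followed by a space, and each of the two is a separator in its own right, so an empty field 7 sits between them. That pushes the score and strand columns to fields 8 and 9.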
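
    awk 'OFS="\t" split ($0, a, ";"){print a[1],$5,$6}'

- The syntax of awk is condition{block}. Setting the value of OFS and splitting are not conditional, so those statements should be inside the block. As written, the condition is parsed as a single assignment: OFS is set to "\t" concatenated with the return value of split() (4, the number of pieces), which is where the extra '4' comes from.
- However, you don't need to set the value of OFS on every line; it should be initialized once. You can do this using the -v option, in a BEGIN block, or after the script.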
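The stray '4' can be reproduced in isolation. The sketch below, which again assumes one sample record in a hypothetical variable named line, prints the return value of split() and then the value that OFS ends up with when the assignment is used as the condition:

```shell
# Sample record from the question; "line" is a name chosen for this sketch.
line='chr1 10002 10116 id=1;frame=0;strand=+; 0 +'

# split() returns the number of pieces it produced.
pieces=$(printf '%s\n' "$line" | awk '{ print split($0, a, ";") }')
echo "$pieces"

# Concatenation binds tighter than assignment, so this condition assigns
# "\t" followed by the return value of split() to OFS. The tab is replaced
# with a visible marker before printing.
ofs=$(printf '%s\n' "$line" |
  awk 'OFS = "\t" split($0, a, ";") { s = OFS; gsub(/\t/, "<TAB>", s); print s }')
echo "$ofs"
```

The first command prints 4 and the second prints <TAB>4, so every comma in the print statement inserts a tab followed by the digit 4.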
Valid alternatives:

    $ awk -v OFS='\t' '{split($0,a,";");print a[1],$5,$6}' file
    $ awk 'BEGIN{OFS="\t"}{split($0,a,";");print a[1],$5,$6}' file
    $ awk '{split ($0,a,";");print a[1],$5,$6}' OFS='\t' file
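One caveat with the split()-based commands: a[1] holds the first four columns exactly as they appear in the input, so if the file is space-separated those columns stay space-separated in the output. A possible variation, not taken from the answer above, is to trim the fourth field in place and print each field separately. This sketch assumes the default whitespace field splitting and a hypothetical variable named line:

```shell
# Sample record from the question; "line" is a name chosen for this sketch.
line='chr1 10002 10116 id=1;frame=0;strand=+; 0 +'

# With the default FS, $4 is "id=1;frame=0;strand=+;". Remove everything
# from the first semicolon onward, then print all six fields joined by OFS.
out=$(printf '%s\n' "$line" |
  awk -v OFS='\t' '{ sub(/;.*/, "", $4); print $1, $2, $3, $4, $5, $6 }')

printf '%s\n' "$out"
```

Because every column is passed to print as its own argument, each one is joined by a tab regardless of how the input was delimited.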