awk - Data partitioning by columns
I have a big matrix of 50 rows and 1.5M columns. Of these 1.5M columns, the first 2 are my headers.
I am trying to divide my data by columns into small pieces. For example, each small set will be 50 lines and 100 columns. But each small data set must have the first 2 columns mentioned above as headers.
I tried:
    awk '{print $1"\t"$2"\t"}' test | cut -f 3-10
    awk '{print $1"\t"$2"\t"}' test | cut -f 11-20
    ...
or
    cut -f 1-2 | cut -f 3-10 test
    cut -f 1-2 | cut -f 11-20 test
    ...
But none of the above is working.
Is there an efficient way of doing this?
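A note on why the attempts above fail: in the first one, awk prints only columns 1 and 2, so the following cut -f 3-10 has nothing left to select. In the second one, the first cut has no input file and the second cut reads test directly, ignoring the pipe. cut does accept a comma-separated field list, so the keys and a slice can be taken in one call. The loop below is only a sketch under assumptions: the data is tab-separated, and sample.tsv and the part_N.tsv output names are hypothetical stand-ins for the real file and pieces.

```shell
# Stand-in for the real file: 2 rows, 2 key columns, 10 data columns, tab-separated.
printf 'key1\tkey2\t1\t2\t3\t4\t5\t6\t7\t8\t9\tten\n' > sample.tsv
printf 'key1\tkey2\tone2\ttwo2\tthree2\tfour2\tfive2\tsix2\tseven2\teight2\tnine2\tten2\n' >> sample.tsv

size=4      # data columns per piece (100 for the real data)
total=12    # total number of columns, keys included

start=3     # first data column
part=1      # index of the piece being written
while [ "$start" -le "$total" ]; do
    end=$((start + size - 1))
    if [ "$end" -gt "$total" ]; then
        end=$total
    fi
    # Fields 1-2 are the keys; start-end is the current slice of data columns.
    cut -f "1-2,${start}-${end}" sample.tsv > "part_${part}.tsv"
    start=$((end + 1))
    part=$((part + 1))
done
```

This reads the whole file once per piece, so for 1.5M columns it is simple rather than efficient.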
One way with awk. I don't know if it (awk) can handle such a big number of columns, but give it a try. It uses the modulus operator to cut the line after every specific number of columns.
    awk '{
        ## Print header of first line.
        printf "%s%s%s%s", $1, FS, $2, FS

        ## Count number of columns printed, from 0 to 100.
        count = 0

        ## Traverse every column but the first two keys.
        for ( i = 3; i <= NF; i++ ) {

            ## Print header again when counted 100 columns.
            if ( count != 0 && count % 100 == 0 ) {
                printf "%s%s%s%s%s", ORS, $1, FS, $2, FS
            }

            ## Print current column and count it.
            printf "%s%s", $i, FS
            ++count
        }

        ## Separator between splits.
        print ORS
    }
    ' infile
I've tested it with 2 lines and 4 columns instead of 100. Here is the test file:
    key1 key2 1 2 3 4 5 6 7 8 9 ten
    key1 key2 one2 two2 three2 four2 five2 six2 seven2 eight2 nine2 ten2
And it results in:
    key1 key2 1 2 3 4
    key1 key2 5 6 7 8
    key1 key2 9 ten

    key1 key2 one2 two2 three2 four2
    key1 key2 five2 six2 seven2 eight2
    key1 key2 nine2 ten2
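The output above keeps all the pieces in one stream, grouped by input row. If each piece should instead end up in its own file (every file holding all rows, the two key columns and one slice of data columns), a variation along these lines might work. It is a sketch under assumptions rather than a tested solution for the full data: fields are taken to be tab-separated, the chunk_N.txt names and the size variable are made up for illustration, and each file is appended to and closed right away so that awk never has thousands of files open at once.

```shell
# Sample input from the test above, tab-separated (assumed).
printf 'key1\tkey2\t1\t2\t3\t4\t5\t6\t7\t8\t9\tten\n' > infile
printf 'key1\tkey2\tone2\ttwo2\tthree2\tfour2\tfive2\tsix2\tseven2\teight2\tnine2\tten2\n' >> infile

# Remove pieces from an earlier run, because the awk script appends.
rm -f chunk_*.txt

# size is the number of data columns per piece (4 here, 100 for the real data).
awk -F'\t' -v OFS='\t' -v size=4 '{
    for ( i = 3; i <= NF; i += size ) {
        ## Hypothetical output name: chunk_1.txt, chunk_2.txt, ...
        out = sprintf( "chunk_%d.txt", (i - 3) / size + 1 )

        ## Start each output row with the two key columns.
        line = $1 OFS $2

        ## Add up to size data columns, stopping at the end of the row.
        for ( j = i; j < i + size && j <= NF; j++ ) {
            line = line OFS $j
        }

        ## Append the row, then close, so only one output file is open at a time.
        print line >> out
        close( out )
    }
}' infile
```

With the sample file this writes chunk_1.txt, chunk_2.txt and chunk_3.txt, each with two rows. Opening and closing a file for every row of every piece costs time, but the input is read only once.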