awk - Data partitioning by columns
I have a big matrix of 50 rows and 1.5M columns. Of these 1.5M columns, the first 2 are my headers.
I am trying to divide my data by columns into small pieces. For example, each small set would have 50 lines and 100 columns. But each small dataset must have the first 2 columns mentioned above as its headers.
I tried
    awk '{print $1"\t"$2"\t"}' test | cut -f 3-10
    awk '{print $1"\t"$2"\t"}' test | cut -f 11-20
    ...

or

    cut -f 1-2 | cut -f 3-10 test
    cut -f 1-2 | cut -f 11-20 test
    ...

but none of the above is working.
Is there an efficient way of doing this?
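A note on why the attempts above print nothing useful: in the first pair, awk has already thrown away everything except columns 1 and 2, so `cut -f 3-10` has nothing left to select. In the second pair, `cut -f 1-2` is given no file and its output is ignored by the next `cut`, which reads `test` directly. `cut -f` accepts a comma-separated list of fields and ranges, so the key columns and one slice can be requested in a single call. The loop below is only a sketch of that idea. The temporary directory, the `part_N.tsv` output names and the variables `size`, `start`, `end` and `part` are made up for illustration, and it assumes a tab-separated file. It rereads the input once per piece, so it may be slow on the real 1.5M-column data.

```shell
#!/bin/sh
# Sketch only: all file and variable names below are illustrative assumptions.
cutdir=$(mktemp -d)

# Small tab-separated sample: 2 key columns followed by 10 data columns.
printf 'key1\tkey2\t1\t2\t3\t4\t5\t6\t7\t8\t9\tten\n' > "$cutdir/test"
printf 'key1\tkey2\tone2\ttwo2\tthree2\tfour2\tfive2\tsix2\tseven2\teight2\tnine2\tten2\n' >> "$cutdir/test"

size=4   # data columns per piece (100 in the question)

# Number of tab-separated columns, taken from the first line.
ncols=$(head -n 1 "$cutdir/test" | awk -F '\t' '{ print NF }')

part=1
start=3
while [ "$start" -le "$ncols" ]; do
    end=$((start + size - 1))
    # Fields 1-2 are the keys; start-end is the current slice.
    # cut skips field numbers past the end of the line, so the last
    # piece may simply be narrower than the others.
    cut -f "1-2,${start}-${end}" "$cutdir/test" > "$cutdir/part_${part}.tsv"
    part=$((part + 1))
    start=$((end + 1))
done
```

With the 12-column sample this should leave three files in the temporary directory, the last one holding only the two keys plus columns 11 and 12.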
One way is with awk. I don't know whether awk can handle such a big number of columns, but give it a try. It uses the modulus operator to cut the line after each specific number of columns.
    awk '
        {
            ## Print the header (the two key columns) at the start of the line.
            printf "%s%s%s%s", $1, FS, $2, FS

            ## Count the number of columns printed, from 0 to 100.
            count = 0

            ## Traverse every column except the first two keys.
            for ( i = 3; i <= NF; i++ ) {

                ## Print the header again once 100 columns have been counted.
                if ( count != 0 && count % 100 == 0 ) {
                    printf "%s%s%s%s%s", ORS, $1, FS, $2, FS
                }

                ## Print the current column and count it.
                printf "%s%s", $i, FS
                ++count
            }

            ## Separator between splits.
            print ORS
        }
    ' infile

I've tested it with two lines and 4 columns instead of 100. Here is the test file:
    key1 key2 1 2 3 4 5 6 7 8 9 ten
    key1 key2 one2 two2 three2 four2 five2 six2 seven2 eight2 nine2 ten2

And it results in:

    key1 key2 1 2 3 4
    key1 key2 5 6 7 8
    key1 key2 9 ten

    key1 key2 one2 two2 three2 four2
    key1 key2 five2 six2 seven2 eight2
    key1 key2 nine2 ten2
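The script above prints every piece to standard output. If each piece should end up in its own file, as the question seems to want, the same modulus idea can be pointed at output files instead. This is a rough sketch rather than a tested solution for the full 1.5M-column data. The `chunk_N.tsv` names and the `size` and `prefix` variables are invented here, and it assumes tab-separated input. It appends to each file and closes it right after writing, which should keep awk from running into the limit on simultaneously open files.

```shell
#!/bin/sh
# Sketch only: all file and variable names below are illustrative assumptions.
workdir=$(mktemp -d)

# Small tab-separated sample: 2 key columns followed by 10 data columns.
printf 'key1\tkey2\t1\t2\t3\t4\t5\t6\t7\t8\t9\tten\n' > "$workdir/infile"
printf 'key1\tkey2\tone2\ttwo2\tthree2\tfour2\tfive2\tsix2\tseven2\teight2\tnine2\tten2\n' >> "$workdir/infile"

# size = data columns per chunk (100 in the question, 4 for this test).
# Output is appended, so remove old chunk files before running it again.
awk -F '\t' -v OFS='\t' -v size=4 -v prefix="$workdir/chunk_" '
{
    chunk = 0
    for (i = 3; i <= NF; i++) {
        # Every "size" data columns, finish the previous chunk line
        # and start a new one with the two key columns.
        if ((i - 3) % size == 0) {
            if (chunk > 0) {
                printf "\n" >> out
                close(out)
            }
            chunk++
            out = prefix chunk ".tsv"
            printf "%s%s%s", $1, OFS, $2 >> out
        }
        # Append the current data column to the current chunk file.
        printf "%s%s", OFS, $i >> out
    }
    # Finish the last chunk line of this input row.
    if (chunk > 0) {
        printf "\n" >> out
        close(out)
    }
}' "$workdir/infile"
```

For the two-line test file, each chunk file should contain one line per input row, so `chunk_3.tsv` would hold `key1 key2 9 ten` and `key1 key2 nine2 ten2` (tab-separated).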