Monday, December 1, 2014

How to concatenate two files side by side

To concatenate two files side by side, use the paste command.
The -d '\t' option sets the delimiter to a tab. Tab is also paste's default delimiter, so the option can be omitted.

The following command pastes two files side by side with a tab as the delimiter, then uses cut to keep columns 1, 2 and 4 (column 3 is the label repeated from file2) and writes the result to conc.txt:

paste -d'\t' file1 file2 | cut -f1,2,4 > conc.txt
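To try the same pipeline without touching real data, here is a minimal sketch on two tiny made-up files. The names demo_a.txt, demo_b.txt and demo_out.txt and the values in them are assumptions used only for illustration.

```shell
# Two small tab-separated files: label<TAB>value (made-up data).
printf 'alpha\t1.5\nbeta\t2.5\n' > demo_a.txt
printf 'alpha\t3.0\nbeta\t4.0\n' > demo_b.txt

# paste joins line N of demo_a.txt with line N of demo_b.txt, giving
# four columns: label, value, label, value.
# cut -f1,2,4 keeps the first label and both values.
paste -d'\t' demo_a.txt demo_b.txt | cut -f1,2,4 > demo_out.txt

cat demo_out.txt
```

Each line of demo_out.txt should then hold one label followed by the value from each file.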

E.g.

file1

atp-binding     2.3281287823771484
blocked amino end       7.743558776167471
compositionally biased region:Poly-Lys  4.604155374887082
compositionally biased region:Ser-rich  3.331241830065359
disease mutation        1.9403512039170954
disulfide bond  1.674089840106158
disulfide bond  1.6242758946817313
domain:PH       4.978121581497109
endoplasmic reticulum   2.331394040136443
extracellular matrix    3.941396444854259
glycoprotein    1.5948598745418259
glycosylation site:N-linked (GlcNAc...) 1.5429886170985712
GO:0000166~nucleotide binding   1.6648241884322064
GO:0001882~nucleoside binding   1.8304477780284232

file2 

atp-binding     1.7131363573311138
blocked amino end       11.728395061728394
compositionally biased region:Poly-Lys  3.4320987654320985
compositionally biased region:Ser-rich  4.290123456790123
disease mutation        5.4453262786596115
disulfide bond  1.9547325102880657
disulfide bond  1.4300411522633742
domain:PH       3.3000949667616335
endoplasmic reticulum   2.2805212620027433
extracellular matrix    3.9094650205761314
glycoprotein    1.9108059370231654
glycosylation site:N-linked (GlcNAc...) 1.4389233954451346
GO:0000166~nucleotide binding   1.465270684371808
GO:0001882~nucleoside binding   1.6341991341991342

conc.txt

atp-binding     2.3281287823771484      1.7131363573311138
blocked amino end       7.743558776167471       11.728395061728394
compositionally biased region:Poly-Lys  4.604155374887082       3.4320987654320985
compositionally biased region:Ser-rich  3.331241830065359       4.290123456790123
disease mutation        1.9403512039170954      5.4453262786596115
disulfide bond  1.674089840106158       1.9547325102880657
disulfide bond  1.6242758946817313      1.4300411522633742
domain:PH       4.978121581497109       3.3000949667616335
endoplasmic reticulum   2.331394040136443       2.2805212620027433
extracellular matrix    3.941396444854259       3.9094650205761314
glycoprotein    1.5948598745418259      1.9108059370231654
glycosylation site:N-linked (GlcNAc...) 1.5429886170985712      1.4389233954451346
GO:0000166~nucleotide binding   1.6648241884322064      1.465270684371808
GO:0001882~nucleoside binding   1.8304477780284232      1.6341991341991342
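
paste pairs lines purely by position, so the result above is only meaningful because both files list the same labels in the same order. One possible way to check that before dropping column 3 is sketched below. It assumes tab-separated input, and the file names chk1.txt and chk2.txt and their shortened values are made up for illustration.

```shell
# Two small tab-separated files with matching labels (illustrative values).
printf 'atp-binding\t2.33\ndomain:PH\t4.98\n' > chk1.txt
printf 'atp-binding\t1.71\ndomain:PH\t3.30\n' > chk2.txt

# Count the lines where the label from the first file (column 1)
# differs from the label from the second file (column 3).
mismatches=$(paste chk1.txt chk2.txt | awk -F'\t' '$1 != $3 { n++ } END { print n + 0 }')

echo "$mismatches"
```

A count of 0 means the labels line up row by row. Anything else suggests the files should be sorted or matched by label before pasting.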
