Monday, December 7, 2015

How to compare two files with grep


Let's say that you have two lists of entries, e.g. two lists of genes, that you want to compare.
To compare the two lists you can use a simple grep command in Unix with three flags: -f (obtain the patterns from a file), -x (select only those matches that match the whole line) and -F (interpret the patterns as a set of newline-separated fixed strings). With the -x flag, a match has to cover the complete line in the second file and not just part of it. Without -x, a pattern that is merely contained in some longer string in the second file would also be reported, whereas here we want full matches only.
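To make the effect of -x concrete, here is a minimal sketch on two toy lists. The file names list_a and list_b and the entry LPLR are made up for illustration and are not part of the data used in this post.

```shell
# Hypothetical toy files, created in a temporary directory for illustration.
tmpdir=$(mktemp -d)
printf 'LPL\nABCA1\n' > "$tmpdir/list_a"
printf 'LPL\nLPLR\nABCA1\nintergenic\n' > "$tmpdir/list_b"

# Without -x, the fixed string "LPL" also matches the longer line "LPLR".
partial=$(grep -F -f "$tmpdir/list_a" "$tmpdir/list_b")

# With -x, only lines identical to an entry in list_a are reported.
exact=$(grep -F -x -f "$tmpdir/list_a" "$tmpdir/list_b")

printf 'without -x:\n%s\n\nwith -x:\n%s\n' "$partial" "$exact"

# Clean up the temporary files.
rm -r "$tmpdir"
```

Without -x the made-up entry LPLR is reported because it contains LPL; with -x it is dropped and only the exact entries remain.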

In the example below, the entries of the first file are used as patterns and searched for in the second file:

mpjanic@valkyr:~/REBUTTAL$ head extrinsic_cardiomiopathy_disease_ontology_cut
ABCA1
ADH1B
ADIPOQ
ADM
ALDH2
ALOX5
ALOX5AP
ANGPT1
APOA1
APOA4

mpjanic@valkyr:~/REBUTTAL$ head coronary_artery_disease_gwas_cut 
Reported Gene(s)
intergenic
PHACTR1
LIPA
PDGFD
intergenic
KIAA1462
PHACTR1
intergenic
intergenic

mpjanic@valkyr:~/REBUTTAL$ grep -F -x -f extrinsic_cardiomiopathy_disease_ontology_cut coronary_artery_disease_gwas_cut 
SH2B3
SORT1
SORT1
LPL
SH2B3
CXCL12
SH2B3
SORT1
TRIB1
LIPG
CETP
CETP
CETP
LIPC
LPL
ABCA1
LCAT
LIPC
LPL
ESR1

...
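Note that the output above lists some genes more than once (for example SORT1 and CETP), because grep prints every matching line of the second file. If you only need each shared entry once, one option is to pipe the result through sort -u. A minimal sketch on toy data follows; the file names list_a and list_b are again hypothetical.

```shell
# Hypothetical toy files; list_b contains repeated entries, like the GWAS list.
tmpdir=$(mktemp -d)
printf 'SORT1\nLPL\nCETP\n' > "$tmpdir/list_a"
printf 'SORT1\nintergenic\nLPL\nSORT1\nPHACTR1\nLPL\n' > "$tmpdir/list_b"

# Keep each shared entry once, sorted alphabetically.
shared_unique=$(grep -F -x -f "$tmpdir/list_a" "$tmpdir/list_b" | sort -u)

# Count the distinct shared entries (tr strips padding that some wc versions add).
shared_count=$(printf '%s\n' "$shared_unique" | wc -l | tr -d ' ')

printf '%s\n' "$shared_unique"
echo "distinct shared entries: $shared_count"

# Clean up the temporary files.
rm -r "$tmpdir"
```

On the real files, the same idea would amount to appending | sort -u to the grep command shown above.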
