Friday, March 11, 2016

Randomly select N lines from file

To quickly select N random lines from a file, use sort with the -R option to put the lines in random order, then pipe the result to head to print the first N lines.


mpjanic@zoran:~/AHR-TCF21$ cat tmp
S100A14
SBNO1
MTVR2
MTVR2
MTVR2
MTVR2
CLRN3
RBX1
ILVBL
NPPC
LINC00939
LINC00939
LINC00939
PCSK1N
MIR7974
NT5E
NT5E
NT5E
PTMS
ARID1B
mpjanic@zoran:~/AHR-TCF21$ sort -R tmp | head -n 5
CLRN3
NPPC
LINC00939
LINC00939
LINC00939

In this case the output contains the same entry three times because that entry is repeated in the input file. sort -R orders lines by a random hash, so identical lines always end up next to each other. To select only unique entries, pipe the output through uniq before head.

mpjanic@zoran:~/AHR-TCF21$ sort -R tmp | uniq | head -n 5
NPPC
NT5E
ILVBL
S100A14
RBX1
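
As an alternative to the sort -R approach above, here is a minimal sketch using shuf, assuming GNU coreutils is installed. The file name genes.txt and its contents are made up for illustration. Unlike sort -R, shuf does not keep identical lines together, so the duplicates are removed with sort -u before sampling.

```shell
# Sample input with duplicates (genes.txt is a hypothetical file name)
printf '%s\n' S100A14 SBNO1 MTVR2 MTVR2 CLRN3 NT5E NT5E PTMS > genes.txt

# 3 random lines; a duplicated entry can appear more than once
shuf -n 3 genes.txt

# 3 random distinct entries: deduplicate first, then sample
sort -u genes.txt | shuf -n 3
```

With plain shuf, an entry that occurs several times in the file is more likely to be picked. Deduplicating first gives every distinct entry the same chance.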
