Sunday, February 28, 2016
How to remove all characters except ATCG> from a fasta file
If you have a FASTA file that you want to clean up completely, i.e. strip every character except A, T, C, G and >, you can do it with a single command-line filter.
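A minimal sketch of such a cleanup, assuming a POSIX `tr` is available. This is not necessarily the command originally used in this post, and `example.fasta` and `cleaned.fasta` are hypothetical file names:

```shell
# Hypothetical sample input; use your own FASTA file instead.
printf '>seq1 sample\nATCGNNATCG\natcg-ATTA\n' > example.fasta

# -c complements the set, -d deletes: every character that is NOT
# A, T, C, G, '>' or a newline is removed. Keeping '\n' in the set
# preserves the line structure of the file.
tr -cd 'ATCG>\n' < example.fasta > cleaned.fasta

cat cleaned.fasta
```

Note that the filter is case-sensitive, so lowercase bases such as `atcg` are deleted as well. If they should be kept, convert the file to uppercase first or add `atcg` to the set.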
The filter can leave blank lines behind wherever a line contained none of the kept characters. To remove them, pipe its output to sed.
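As a sketch of that pipeline, assuming the character filter is `tr` (an assumption, since any filter that writes to standard output would do) and using hypothetical file names:

```shell
# Hypothetical sample input with a file header and a digits-only line.
printf 'File header: 2016\n>seq1\nATCG NN\n\n12345\nGGCC\n' > example.fasta

# First delete everything except A, T, C, G, '>' and newlines,
# then let sed delete the lines that ended up empty.
tr -cd 'ATCG>\n' < example.fasta | sed '/^$/d' > cleaned.fasta

cat cleaned.fasta
```

The sed expression `/^$/d` matches lines with nothing between start and end and deletes them. Lines that still hold at least one kept character pass through unchanged.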
This procedure can still leave stray A, T, G and C letters from the text you wanted to remove. In my case, letters from the file header at the beginning of the file were not removed and had to be cleaned up manually, and G, A and > were left behind from the description lines.
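One possible way to avoid the leftovers from description lines, sketched here under the assumption that every description line starts with > and should be kept as it is, is to clean only the other lines. This is a variant, not the procedure described above, and the file names are hypothetical:

```shell
# Hypothetical sample input; the description line contains G and A.
printf '>seq1 Gene A\nATCGNNATCG\n\natcg-ATTA\n' > example.fasta

# On lines that do NOT start with '>', delete everything except A, T, C, G.
# Then delete any line that is empty.
sed -e '/^>/!s/[^ATCG]//g' -e '/^$/d' example.fasta > cleaned.fasta

cat cleaned.fasta
```

Free text that does not start with >, such as a file header at the top of the file, is still treated as sequence here, so any A, T, C or G it contains would remain and need manual cleanup.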