Wednesday, April 11, 2012

How to predict cellular localization of a protein - PSort and Wolf PSort prediction tools

If you have an amino-acid sequence and need to predict the cellular localization of the protein that contains it, you can use the PSort prediction algorithm.

It combines several prediction methods that scan the sequence for stretches of amino acids that may act as localization signals in the cell. Based on this analysis, it predicts the subcellular compartment in which the protein is most likely located.


PSort incorporates several kinds of information about the amino-acid sequence into its prediction. For example, it detects the presence of: a signal sequence and its cleavage site (PSG score, McGeoch's method, and GvH score, von Heijne's method); transmembrane domains (ALOM score, Klein et al.'s method); membrane topology, i.e. whether the N- or C-terminus is cytoplasmic (Hartmann et al.); a mitochondrial targeting sequence (MITDISC and Gavel scores); nuclear localization signals (NUCDISC); ER retention signals such as the KDEL motif; SKL1 and SKL2 peroxisomal targeting signals; VAC, a possible vacuolar targeting motif; RNA-binding motifs; lipid anchors, i.e. the NMYR N-myristoylation motif and the prenylation motif; memYQRL, a motif for transport from the cell surface to the Golgi; tyrosines in the tail; a dileucine motif in the tail; PROSITE DNA-binding motifs; PROSITE ribosomal protein motifs; and coiled-coil regions.
At the end, PSort gives an estimate, expressed as a percentage, of the probability that the protein is localized in each cellular compartment.


For an example query sequence, PSort gives the following output:
55.6 %: extracellular, including cell wall
22.2 %: nuclear
22.2 %: mitochondrial
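
If you want to process such a summary programmatically, a minimal Python sketch could look like the one below. It assumes the output has been saved as plain text in exactly the "percentage %: compartment" form shown above; the function name parse_psort_summary is hypothetical and is not part of PSort itself.

```python
import re

def parse_psort_summary(text):
    """Parse lines such as '55.6 %: extracellular, including cell wall'.

    Returns a list of (compartment, percent) tuples, highest percentage first.
    Lines that do not match the assumed format are ignored.
    """
    pattern = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*%\s*:\s*(.+?)\s*$")
    results = []
    for line in text.splitlines():
        match = pattern.match(line)
        if match:
            results.append((match.group(2), float(match.group(1))))
    # sort() is stable, so compartments with equal scores keep their input order
    results.sort(key=lambda item: item[1], reverse=True)
    return results

example = """55.6 %: extracellular, including cell wall
22.2 %: nuclear
22.2 %: mitochondrial"""

ranked = parse_psort_summary(example)
best_compartment, best_percent = ranked[0]
print(best_compartment, best_percent)
```

The first entry of the ranked list is the most probable compartment, here the extracellular one.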

Wolf PSort is a useful extension of the PSort algorithm. It uses the PSort parameters and their scores to search the UniProt (and Gene Ontology) protein databases for proteins whose PSort scores are similar to those of the query protein. The search excludes highly homologous protein sequences, because they will likely have the same scores as the query sequence and could be found with a tool such as BLAST anyway. Instead, Wolf PSort returns mainly non-homologous proteins that have localization motifs similar to those of the query sequence. Wolf PSort also reports a sequence with 100% identity to the query sequence.

For the example sequence above, Wolf PSort gave 32 so-called nearest neighbors: proteins with PSort scores similar to those of the query, but with low sequence identity to it (9-22%). It also returned a hit with 100% identity, which may be the protein the query sequence actually comes from. All of these proteins have extracellular localization, which gives additional support for the prediction that the query protein is localized outside of the cell.
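
The nearest-neighbor idea described above can be illustrated with a small Python sketch. This is a toy illustration, not the actual Wolf PSort implementation: the feature vectors, identity values, protein names, the Euclidean distance and the 0.9 identity cut-off are all made-up assumptions chosen only to show the principle.

```python
import math
from collections import Counter

def nearest_neighbors(query_features, database, k=3, max_identity=0.9):
    """Return the k database proteins whose feature scores are closest to the query.

    Proteins whose sequence identity to the query exceeds max_identity are skipped,
    mimicking the exclusion of highly similar sequences (threshold is an assumption).
    """
    candidates = [p for p in database if p["identity"] <= max_identity]
    candidates.sort(key=lambda p: math.dist(query_features, p["features"]))
    return candidates[:k]

def vote(neighbors):
    """Turn the neighbors' known localizations into percentages."""
    counts = Counter(p["location"] for p in neighbors)
    total = sum(counts.values())
    return {location: 100.0 * n / total for location, n in counts.items()}

# Hypothetical scores for three localization features of the query protein
query = (0.8, 0.1, 0.2)

# Hypothetical database entries; "identity" is the sequence identity to the query
database = [
    {"name": "A", "features": (0.8, 0.1, 0.2), "identity": 1.00, "location": "extracellular"},
    {"name": "B", "features": (0.7, 0.2, 0.2), "identity": 0.15, "location": "extracellular"},
    {"name": "C", "features": (0.9, 0.1, 0.4), "identity": 0.22, "location": "extracellular"},
    {"name": "D", "features": (0.1, 0.9, 0.5), "identity": 0.10, "location": "nuclear"},
    {"name": "E", "features": (0.6, 0.3, 0.1), "identity": 0.09, "location": "extracellular"},
]

neighbors = nearest_neighbors(query, database)
print([p["name"] for p in neighbors])
print(vote(neighbors))
```

In this toy data set the identical protein A is filtered out, and the remaining closest proteins all share one localization, which is the kind of agreement that supports the prediction for the query.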

Link to the Wolf PSort papers:
