101 things you (maybe) didn’t know about MacVector: #14 – How To Align Non-Overlapping Protein Fragments Against A Parent Protein

The classic algorithm for aligning multiple protein sequences is ClustalW. It normally does a great job of aligning related DNA and protein sequences and can handle thousands of sequences if required. However, it struggles when you align non-overlapping segments of DNA or protein against a full-length parental sequence. This is because the basic algorithm compares each sequence against every other sequence in a pairwise manner before generating the multiple alignment. As a result, the segments are forced to overlap one another, even though they do not align at all well. For example, here is a ClustalW alignment of a 120-residue parental protein sequence against four non-overlapping 30-residue segments of itself:

PoorClustalWAlignment.png

Here you can see that Segment1 aligns nicely to the Full Length sequence, but the other segments are poorly aligned and all overlap Segment1. With DNA sequences, the better way to approach this type of alignment is to use the Analyze | Align To Reference function, which not only aligns segmented DNA sequences correctly but can also "flip" sequences that align to the minus strand. However, that is not an option for protein sequences. Luckily, there is a solution: choose Analyze | Align Multiple Sequences Using | Muscle (or hold down the Align toolbar button and select Muscle from the popup menu). Then, in the Muscle setup dialog, set Profile: to PAM200 and set Diagonals: to On:

MuscleDialog.png
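If you also run the standalone MUSCLE program outside MacVector, the dialog settings above appear to correspond to command-line flags in MUSCLE 3.x. The following is a minimal Python sketch that only builds such a command line without running it. The mapping from MacVector's dialog options to these flags is an assumption, the file names are hypothetical, and newer MUSCLE releases (version 5 onwards) use a different command-line syntax.

```python
# Assumed mapping from the dialog's Profile choices to MUSCLE 3.x flags.
PROFILE_FLAGS = {
    "PAM200": "-sp",           # sum-of-pairs profile score using PAM200
    "VTML240": "-sv",          # sum-of-pairs profile score using VTML240
    "log-expectation": "-le",  # MUSCLE's default profile score
}


def muscle_command(in_fasta, out_fasta, profile="PAM200", diagonals=True):
    """Build (but do not run) a MUSCLE 3.x command line as a list of arguments."""
    cmd = ["muscle", "-in", in_fasta, "-out", out_fasta, PROFILE_FLAGS[profile]]
    if diagonals:
        cmd.append("-diags")  # corresponds to Diagonals: On
    return cmd


# Hypothetical input and output file names.
print(" ".join(muscle_command("fragments.fasta", "aligned.fasta")))
```

The function returns a list so that it could be passed directly to `subprocess.run` on a machine where a MUSCLE 3.x binary is installed.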

You can also choose VTML240 as the Profile, but this approach will not work with the default log-expectation Profile. When you click OK, the alignment is recalculated and the display refreshes to show the correctly aligned sequences:

AfterMuscleAlignment.png
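To make the expected result concrete, here is a minimal Python sketch (not part of MacVector, and not how Muscle works internally) that builds the same kind of test case: a 120-residue parent split into four non-overlapping 30-residue segments, each padded with gaps so that it sits under its exact match. The sequence is made up for illustration, and the simple substring search only works because the fragments are exact copies of the parent.

```python
def place_fragments(parent, fragments):
    """Pad each fragment with gaps so it sits under its exact match in parent."""
    rows = {"FullLength": parent}
    for name, frag in fragments.items():
        start = parent.find(frag)  # exact match only; no mismatches or indels
        if start < 0:
            raise ValueError(f"{name} is not an exact substring of the parent")
        trailing = len(parent) - start - len(frag)
        rows[name] = "-" * start + frag + "-" * trailing
    return rows


# Hypothetical 120-residue parent protein, written as four 30-residue chunks.
parent = (
    "MKTAYIAKQRQISFVKSHFSRQLEERLGLI"
    "EVQAPILSRVGDGTQDNLSGAEKAVQVKVK"
    "ALPDAQFEVVHSLAKWKRQTLGQHDFSAGE"
    "GLYTHMKALRPDEDRLSPLHSVYVDQWDWE"
)

# Four non-overlapping 30-residue segments of the parent.
segments = {f"Segment{i + 1}": parent[i * 30:(i + 1) * 30] for i in range(4)}

rows = place_fragments(parent, segments)
for name, row in rows.items():
    print(f"{name:<11}{row}")
```

Each segment row is the same length as the parent and every alignment column is covered by exactly one segment, which is the layout a correct alignment of these fragments should show.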

This is an article in a long-running series of tips to help you get the most out of MacVector. If you want to be notified every time a new tip is published, follow us @MacVector on Twitter (or check the feed for the hashtag #101MacVectorTips) or like us on Facebook.
