An overview of Sequence Assembly in MacVector

MacVector has had sequence assembly functionality for a long time. Many moons ago MacVector came with an OS 9-only tool called AssemblyLIGN. Although useful, it was fairly limited and needed a complete replacement, so about five years ago Assembler was released, which is a much more modern tool for sequence assembly. In addition to Assembler, MacVector also has built-in assembly functionality. With the next release of MacVector (version 11, due out in the summer) we will be introducing support for the assembly of short read data, so I think it’s an appropriate time to review the existing sequence assembly tools.

MacVector’s main base of users is lab molecular biologists, so the built-in tools are designed around their needs and focus on the sequencing of known sequences. These needs are served by a function called Align to Reference, which lets you take trace files and assemble them against a template sequence. It’s an ideal tool if you are doing small-scale sequencing projects: for example, confirming a site-directed mutagenesis experiment, checking that a construct you’ve just spent months building has the insert in the correct order, or confirming the sequence of a cloned PCR fragment.

However, the functionality of this tool does not stop there. It’s also an excellent tool for SNP analysis, with some special tools that let you easily spot where your reads differ from the original template sequence. Align to Reference is also designed for aligning mRNA/cDNA sequences against genomic templates so that you can find intron/exon junctions. Another use is aligning EST sequences against a genomic template, which shows coding regions very nicely.
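
To make the idea of spotting differences from a template concrete, here is a minimal Python sketch. It is not how Align to Reference works internally; it simply assumes you already have a read aligned to the reference as two equal-length strings with '-' marking gaps. The function name `find_differences` is made up for this illustration.

```python
def find_differences(aligned_ref, aligned_read):
    """List every position where an aligned read differs from the reference.

    Assumptions (illustrative only, not MacVector's algorithm):
      - both strings are the same length and already aligned
      - '-' marks an alignment gap
      - 'N' (an uncalled base) is never reported as a difference
    Positions are 1-based and count reference bases only, so an insertion
    in the read is reported at the position of the preceding reference base.
    """
    if len(aligned_ref) != len(aligned_read):
        raise ValueError("aligned sequences must be the same length")

    differences = []
    ref_pos = 0
    for ref_base, read_base in zip(aligned_ref.upper(), aligned_read.upper()):
        if ref_base != "-":
            ref_pos += 1  # only real reference bases advance the coordinate
        if ref_base != read_base and "N" not in (ref_base, read_base):
            differences.append((ref_pos, ref_base, read_base))
    return differences


if __name__ == "__main__":
    # A single substitution at reference position 4 (T in the template, A in the read).
    for pos, ref_base, read_base in find_differences("ACGTACGT", "ACGAACGT"):
        print(f"position {pos}: {ref_base} -> {read_base}")
```

A real tool would also weigh the quality of the base call at each position before flagging a difference, which this sketch leaves out.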

Then we come back to Assembler. This is intended for full-scale de novo sequencing projects. Although available as a separate product, Assembler integrates completely into MacVector, and in fact allows you to run standard MacVector sequence analysis jobs directly against a contig as if it were a single sequence. That’s great for designing primers to amplify part of your contig, for example. Assembler uses the Phred, Phrap and Cross Match algorithms to assemble traces into contigs. It shows full quality scores for the reads and the aligned contigs. Although it performs automatic vector trimming, you can also run a library of your usual cloning vectors against traces before assembling them to improve accuracy. Assembler will also basecall traces with higher accuracy than many sequencing machines themselves can achieve. Incidentally, the trial version of MacVector also includes the Assembler plugin. Just go to FILE – NEW – ASSEMBLY PROJECT to start Assembler.
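
If you have not met Phred quality scores before, the standard definition is Q = -10 × log10(P), where P is the estimated probability that a base call is wrong. The short Python sketch below only illustrates that relationship; the function names are my own and nothing here reflects how Assembler stores or displays scores.

```python
import math


def phred_to_error_prob(quality):
    """Convert a Phred quality score Q to an error probability P = 10^(-Q/10)."""
    return 10 ** (-quality / 10)


def error_prob_to_phred(prob):
    """Convert an error probability P (0 < P <= 1) to a Phred score Q = -10*log10(P)."""
    if not 0 < prob <= 1:
        raise ValueError("probability must be in the range (0, 1]")
    return -10 * math.log10(prob)


def expected_errors(qualities):
    """Sum the per-base error probabilities to estimate miscalled bases in a read."""
    return sum(phred_to_error_prob(q) for q in qualities)


if __name__ == "__main__":
    # Q20 corresponds to a 1 in 100 chance of a wrong call, Q30 to 1 in 1000.
    for q in (10, 20, 30, 40):
        print(f"Q{q}: error probability {phred_to_error_prob(q):.4f}")
```

In other words, every 10 points of quality means a tenfold drop in the chance that the base call is wrong.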

I’ll write about the next release of Assembler and its improvements soon.
