MacVector are at the ASCB – EMBO 2017 meeting in Philadelphia

The MacVector team will be at the ASCB/EMBO 2017 meeting this coming Sunday.

The conference starts on Sunday the 3rd of December and runs until Tuesday the 5th. It’s being held at the Pennsylvania Convention Center.

Our booth is 1002. If you are in Philadelphia for the conference please pop along and say “hi!”. We enjoy meeting long-time users and learning how they use MacVector. You might be able to teach us some new tricks! We also enjoy meeting new users and demonstrating just how MacVector can make life easier for you in the lab! Come learn about our new Scan for… Missing Features tool. We’ll show you how you can go from a completely blank sequence to a beautifully annotated plasmid map in just a few steps. We’ll also be demoing other features, both new and old.

Booth hours are 9:30 am to 4:00 pm on Sunday, Monday and Tuesday.

Scan for Missing Features workflow v2

Follow the conference on Twitter with the hashtag #ASCBEMBO2017. Follow the MacVector team on Twitter for tips, tutorials and more.

Posted in Uncategorized | Tagged , | Comments closed

Downloading hits from the MacVector BLAST Map results tab

MacVector’s BLAST Map results tab (added in MacVector 15.5) is a unique interface for examining the annotations around hits to a query sequence. Each pane in the display represents a High Scoring Segment Pair, as seen in the BLAST Aligned Sequence tab. At the lower left corner of each pane is a Download button – when you click on it, a popup menu appears with several options.


The major options are to either download the entire genome, remembering this might be many Mbp in size, or to just download the displayed fragment. In general, MacVector displays the High Scoring Segment Pair region of the sequence, plus 2kb on either side to provide context, and that is what will be downloaded. To reduce the number of mouse-clicks when scrolling through a lot of results, Download to Last Folder places the downloaded sequence in the last folder that you selected (or your Downloads folder if you’ve never selected one before).
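As a rough illustration of the "displayed fragment" described above, the following Python sketch computes a window covering a High Scoring Segment Pair plus 2 kb of context on either side, clamped to the ends of the database sequence. The function name and the 1-based inclusive coordinate convention are assumptions made for this example; it is not MacVector's own code.

```python
def context_window(hsp_start, hsp_end, seq_length, flank=2000):
    """Return (start, end) of the HSP plus `flank` bases of context on each side.

    Coordinates are 1-based and inclusive (an assumption for this sketch).
    The window is clamped so it never runs off either end of the sequence.
    """
    if hsp_start > hsp_end:
        # Minus-strand hits may be reported with reversed coordinates
        hsp_start, hsp_end = hsp_end, hsp_start
    start = max(1, hsp_start - flank)
    end = min(seq_length, hsp_end + flank)
    return start, end


# A hit at 10,500..11,200 on a 5 Mbp genome
print(context_window(10_500, 11_200, 5_000_000))  # (8500, 13200)
# A hit near the start of the sequence is clamped at position 1
print(context_window(300, 900, 5_000_000))  # (1, 2900)
```

The clamping is why a hit close to either end of a database sequence would yield a fragment shorter than the hit length plus 4 kb.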

Posted in Tips | Tagged , , | Comments closed

Turning on/off the Scan For Missing Features and Scan for ORFs tools

If you’re running MacVector 15.5 or later then you will have noticed extra features annotated on your sequences. These come from the Scan for ORFs tool (added in MacVector 15.5) and the Scan For Missing Features tool (added in MacVector 16), which automatically scan every DNA sequence window for open reading frames and missing annotations and display the results in the Map tab.

These tools are very useful, especially if you frequently receive blank sequences from colleagues, or you want to quickly show coding frames in unannotated trace files you receive from your sequencing service. However, if you are working with richly annotated sequences, you may find the extra features distracting.

It’s easy to turn these off (and back on again). Both tools are controlled in the MacVector | Preferences -> DNA Map pane, along with the Scan For Restriction Sites settings. MacVector will remember your choice for all new sequences you open, until you change it again.

The appearance of the ORFs can be controlled using the Options | Default Symbols menu item.

Future releases of MacVector will see more automatic annotation tools, so if there’s something you’d like to see please email MacVector Support.

Posted in Uncategorized | Tagged , , , | Comments closed

Controlling The Automatic ORF Display in MacVector 15.5

MacVector automatically scans every DNA sequence window for open reading frames and displays the results in the Map tab.


The settings for this are controlled in the MacVector | Preferences -> DNA Map pane, along with the automatic Show restriction sites settings.


The Minimum Number of Codons setting is fairly obvious. The “5’ ends are starts” and “3’ ends are stops” settings apply to linear sequences: they display ORFs coming into the sequence that do not have a start codon, and ORFs that read out of the sequence without a stop codon. The “Codons after stops are starts” setting is useful if you just want to know the locations of the longest possible ORFs in a sequence, irrespective of the existence of a start codon. Finally, the “Suppress annotated CDS ORFs” check box prevents the display of ORFs that are already annotated. However, if a potentially longer ORF is present, e.g. there is another ATG codon upstream, that ORF will still be displayed. You can read more details here.
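To make the interplay of these settings more concrete, here is a minimal Python sketch of an ORF scan with a minimum codon count and an option to treat the ends of a linear sequence as starts and stops. It is an illustration only, not MacVector's algorithm: the function name, the `ends_are_boundaries` parameter, the forward-strand-only scan and the ATG-only start codon are all assumptions made for this example.

```python
STOPS = {"TAA", "TAG", "TGA"}
START = "ATG"


def find_orfs(seq, min_codons=50, ends_are_boundaries=False):
    """Find ORFs on the forward strand in all three reading frames.

    Returns (start, end, frame) tuples with 0-based, end-exclusive
    coordinates; `end` includes the stop codon when one is present.
    If `ends_are_boundaries` is True, an ORF may begin at the 5' end
    without a start codon, or run off the 3' end without a stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        # With open ends, each frame starts "inside" a potential ORF
        orf_start = frame if ends_are_boundaries else None
        pos = frame
        while pos + 3 <= len(seq):
            codon = seq[pos:pos + 3]
            if orf_start is None and codon == START:
                orf_start = pos
            elif codon in STOPS:
                # Codon count excludes the stop codon itself
                if orf_start is not None and (pos - orf_start) // 3 >= min_codons:
                    orfs.append((orf_start, pos + 3, frame))
                orf_start = None
            pos += 3
        # ORF still open at the 3' end: keep it only if ends count as stops
        if ends_are_boundaries and orf_start is not None:
            if (pos - orf_start) // 3 >= min_codons:
                orfs.append((orf_start, pos, frame))
    return orfs


demo = "CCATGAAATTTGGGTAACC"
print(find_orfs(demo, min_codons=3))  # [(2, 17, 2)]
print(find_orfs(demo, min_codons=3, ends_are_boundaries=True))  # [(1, 19, 1), (2, 17, 2)]
```

In the second call, the extra hit in frame 1 has neither a start nor a stop codon; it is reported only because the sequence ends are treated as boundaries.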

The appearance of the ORFs can be controlled using the Options | Default Symbols menu item.

Posted in Tips | Tagged , , , | Comments closed

Use the BLAST Map to better identify blast hits

With the advent of cheap Next Generation Sequencing technologies, there has been an explosion of whole genome sequences deposited in BLAST databases. One consequence of this is that, particularly for sequences of bacterial origin, most of the significant hits are to entire genomes. The classic BLAST results show the sequence alignments, but give no indication of what features are present on the database sequence at the alignment location. MacVector has a new unique BLAST Map result tab that shows the actual annotations around the alignment location for each high scoring segment pair.


If a sequence is highly annotated, such that not all of the features are visible, simply holding the mouse over the appropriate pane will expand the pane to display all of the features. If you have already annotated (or partially annotated) your query sequence, those features will appear in the grey background “overlay” pane – that stays locked to the top of the window while the hits can be scrolled underneath it.

Posted in Tips | Tagged , , | Comments closed

MacVector’s Scan For Missing Features tool makes beautiful plasmid maps easy!

Scan for… Missing Features: Sequences are automatically scanned and missing features displayed. A simple right-click converts them to a permanent feature. Even blank sequences will be displayed fully annotated with common features. You can even add your own proprietary features.

Posted in Tips | Tagged , , , , | Comments closed

101 things you (maybe) didn’t know about MacVector: #51 – Rapid assembly of genomes with Velvet and SPAdes

Not so long ago, to assemble even a small genome with Next Generation Sequencing data required an array of clustered computers and a lot of patience. But improvements in algorithms and hardware mean that it is now realistic to assemble bacterial genomes, or even smaller eukaryotic genomes using MacVector on a modest laptop machine.

MacVector 16 introduced a new assembler, SPAdes, to add to the existing Velvet and Phrap assemblers. Here’s a quick summary of the pros and cons of each assembler:

phrap: slow and does a poor job with most NGS data. But it handles long sequences extremely well as it’s tuned for Sanger ABI-type reads. The best choice if you have just a few long reads or one or two consensus sequences resulting from the other algorithms.

velvet: fast with moderate RAM requirements. You can literally assemble smaller genomes (e.g. Mycoplasma) in less than two minutes. However, it takes a bit of playing with the parameters to get optimal assemblies.

SPAdes: much slower than velvet, but uses a lot less RAM with larger genomes/data sets. Typically works “out of the box” with fewer tweaks required to generate longer contigs than velvet.

Let’s take a look at some performance data that illustrates this with a variety of bacterial genomes and input data, to get a better understanding of how to use these algorithms. For this we will compare just velvet and SPAdes, as it is not generally appropriate to use phrap for large NGS datasets. The table below lists the timings, memory usage and summary of results for a variety of different bacterial genomes with NGS data from Illumina HiSeq and MiSeq machines. This data was generated on a four-year-old MacBook Pro with a 2.7 GHz Intel Core i7 processor and 16 GB of RAM, i.e. a very modest machine by today’s standards.

AssemblyTimings

First, note how fast some of these assemblies complete. Velvet can assemble a small Mycoplasma genome from just under half a million MiSeq reads in a little more than a minute. Even a large ~7 Mbp Streptomyces sp. genome was assembled by Velvet in under an hour. While SPAdes is slower to complete, it uses significantly less RAM with the larger datasets and was able to complete the assembly of a larger B. subtilis MiSeq data set where Velvet ran out of memory and crawled to a halt.

As is usual with bacterial genome assemblies, neither Velvet nor SPAdes is able to generate a single contig representing the entire genome in these tests. This is due to the presence of repeat sequences (typically rRNA operons and insertion sequences), which prevent the assembly algorithms from knowing in which order to join the contigs together. This can be solved by including additional long-insert reads in the assembly, and we’ll explore some strategies for merging contigs in a future blog post.

This is an article in a long running series of tips to help you get the most out of MacVector. If you want to get notified every time a new tip gets published, follow us @MacVector on twitter (or check the feed for the hashtag #101MacVectorTips) or like us on Facebook.


Posted in 101 Tips | Tagged , , | Comments closed

MacVector 16: Our latest release takes automatic sequence annotation to a whole new level

MacVector 16, our latest release, makes beautiful plasmid maps easier, and accurate de novo assembly achievable on your own desktop.

Scan for Missing Features workflow v2

Scan for… Missing Features: Sequences are automatically scanned and missing features displayed. A simple right-click converts them to a permanent feature. Even blank sequences will be displayed fully annotated with common features. You can even add your own proprietary features.

Unified Feature Editor: A new editor combines the GenBank data and graphical appearance into a single dialog for simpler editing.

Batch Translation using Applescript: Produce individual protein sequences, or FASTA files with all translated CDS regions from a folder of DNA/mRNA/cDNA sequences.

SPAdes: de novo assembly: This new algorithm offers support for mixed assemblies (e.g. Illumina, PacBio and Oxford Nanopore in the same assembly), has a smaller RAM footprint than Velvet and typically produces longer contigs.

How to upgrade to MacVector 16.0

If you have active maintenance and are running MacVector 15.5.4 or later then you should have been notified about the new release. However, due to installer changes the preferred upgrade path is to directly download the installer of MacVector 16.0. To install this version, you must have a maintenance contract that was active on 1st September, 2017. You must also be running MacVector 15.5.4 and OS X 10.7 or later.

If you have an older version of MacVector then download the trial and request an upgrade quote.

Even if you have downloaded the trial in the past, downloading a new trial will give you a fresh 21 days to evaluate MacVector.

When a trial license expires it becomes MacVector Free. So if you decide against upgrading then you can just delete the trial license and easily go back to your current version. It’s risk free as MacVector files are backwards compatible.

Posted in Releases | Tagged , , , | Comments closed

Testing pairs of primers with MacVector

MacVector’s Primer Design tools make it easy to test your primers. You can insert them directly from MacVector’s Primer Database or copy and paste them.

Posted in Tips | Tagged , , | Comments closed

Reference assembly with MacVector and Assembler

MacVector has a plugin module called Assembler that integrates directly into the main package and provides sequence assembly functionality. Assembler was designed from the ground up to be easy to use and allow users to easily manage the large amount of data that sequencing generates nowadays.

The Assembler interface is built around the Assembly Project window, which gives great control over managing your reads and gives you easy access to many different assembly tools. To bring data in you just drag and drop it from Finder into a new Assembly Project window.

Assembler uses many popular third party tools/algorithms that are widely respected and published.

De Novo assembly

  • Velvet
  • Phred/Phrap
  • SPAdes

Reference Assembly

  • Bowtie2

Assembler also uses many other popular tools for variant calling. Assemblies are stored in the BAM file format which is the most widely used format. That means your data is not locked away in a proprietary format.

Assembler accepts reference sequences in many different formats (since it’s integrated within MacVector itself it’s trivial to directly download fully annotated references from the NCBI’s GenBank database).

Sequencing reads must be in FASTQ format, although Assembler will also import already assembled reads in BAM format.
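For readers unfamiliar with FASTQ, each read is stored as four lines: a header starting with “@”, the sequence, a separator line starting with “+”, and a quality string of the same length as the sequence. The Python sketch below reads that simple four-line layout. It is only an illustration of the file format, has nothing to do with how Assembler itself parses files, and the function name is hypothetical; it also assumes sequences are not wrapped across multiple lines.

```python
import io


def read_fastq(handle):
    """Yield (name, sequence, quality) tuples from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        if not header.strip():
            continue  # tolerate blank lines between records
        seq = handle.readline().strip()
        plus = handle.readline()
        qual = handle.readline().strip()
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield header[1:].strip(), seq, qual


# Two made-up reads, purely for demonstration
example = io.StringIO("@read1\nACGTAC\n+\nIIIIHH\n@read2\nTTGA\n+\nII#I\n")
for name, seq, qual in read_fastq(example):
    print(name, len(seq), seq)
```

A check like this can be a quick way to confirm that a file from a sequencing service really is FASTQ before dragging it into an Assembly Project window.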

The best way to evaluate Assembler is to try it out. Download a 21 day trial version.

Posted in Techniques | Tagged , , , , | Comments closed