General musings from the MacVector team about sequence analysis, molecular biology, the Mac in general and of course your favorite sequence analysis app for the Mac!

Use Database | Auto-Annotate Sequence to annotate prokaryotic genomes

The continuing advances in Next Generation Sequencing have made it relatively inexpensive to sequence prokaryotic genomes. Many scientists are embarking on large projects to sequence multiple related genomes. These might be clinical isolates of the same species exhibiting different pathogenic properties, environmental isolates from different sites, or a study over time of the changes in microbial genomes from specific locations. Once you have your sequence, the definitive source of annotation is the NCBI Prokaryotic Annotation Pipeline. However, to have that run on your sequence, you must submit the sequence to the NCBI. This is not always ideal – perhaps you are still working on resolving repeat sequences for your genome, you don’t want to wait for it to be published, or you don’t want to go through the hassle of a formal submission for many variant sequences. MacVector to the rescue!

First, you need to download existing similar genome sequences – open the Database | Online Search for Keywords (Entrez) browser and search for [name of your organism] “+” “complete genome”. Assuming you are working with a reasonably common organism, you might find a few (or a lot of) hits. Select those you are interested in and click on the To Disk button, selecting a suitable target folder to save the downloaded genomes into.
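If you prefer to script the search, the same kind of query can be sent to NCBI's public E-utilities. The sketch below only builds an ESearch URL and does not fetch anything. The function name, the choice of the nuccore database and the [Organism]/[Title] field tags are illustrative assumptions, not what the MacVector Entrez browser actually sends.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_genome_search_url(organism, max_hits=50):
    """Return an ESearch URL for complete genomes of `organism`.

    Hypothetical helper: the query layout is an assumption chosen to
    mirror the "[organism] + complete genome" search described above.
    """
    term = f'"{organism}"[Organism] AND "complete genome"[Title]'
    params = {"db": "nuccore", "term": term, "retmax": max_hits}
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_genome_search_url("Campylobacter jejuni"))
```

Opening the printed URL in a browser returns an XML list of matching record IDs, which you could then download with EFetch or simply look up in the Entrez browser.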

Now take your unannotated genome sequence and invoke Database | Auto-Annotate Sequence. Select the folder containing your genomes as the target and press OK. On a 2.7 GHz laptop, scanning a 1.8 Mbp Campylobacter jejuni genome against 25 related C. jejuni genomes (average 5,000 features per genome) takes around 10 minutes with the default parameters, resulting in a fully annotated genome.


There’s an upcoming “Genome Comparison” tool in MacVector 17 (due late summer) that lets you directly compare the features of two related genomes based on DNA or (for CDS features) protein sequences, and reports all of the identities, similarities, differences and missing features. For the example above, the tool confirmed that no features were missing compared to the NCBI-annotated genome; the only differences were in a few CDS features where mutations created or removed stop codons.

Posted in Tips, Tutorials

Automatically displaying open reading frames with MacVector’s Scan For Open Reading Frames tool

As well as the annotations shown by the Scan For Missing Features tool, you may have noticed extra CDS features displayed on your sequences. These come from the Scan For Open Reading Frames tool, which automatically scans every DNA sequence window for open reading frames and displays the results in the Map tab.
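To make the idea of an ORF scan concrete, here is a minimal Python sketch. It is not MacVector's algorithm: it only looks at the three forward frames, treats ATG as the sole start codon, and uses a hypothetical `min_codons` length cut-off.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=100):
    """Return (start, end, frame) tuples for forward-strand ORFs.

    start is 0-based, end is exclusive and includes the stop codon,
    frame is 1, 2 or 3. An ORF runs from the first ATG in a frame to
    the next in-frame stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(dna) - 2, 3):
            codon = dna[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos  # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                # Count codons before the stop and apply the length filter.
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3, frame + 1))
                start = None  # look for the next ATG in this frame
    return orfs
```

A fuller version would also scan the reverse complement and allow alternative start codons, which matters for prokaryotic sequences.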

It’s very useful, especially if you frequently receive blank sequences from colleagues, or you want to quickly show coding frames in unannotated trace files you receive from your sequencing service. However, if you are working with richly annotated sequences, you may find these distracting.

It’s easy to turn off (and back on again). The tool is controlled in the MacVector | Preferences | Scan DNA pane, along with restriction enzyme sites and missing features. MacVector will remember this setting for all new sequences you open, until you turn it back on again.


The appearance of the ORFs can be controlled using the Options | Default Symbols menu item.

(Note: prior to MacVector 16 this tool was called Automatic ORF Annotation)

Future releases of MacVector will see more automatic annotation and visualization tools. For example, MacVector 17 will show primer binding sites from your Primer Library. So if there’s something you’d like to see, please email MacVector Support.

Posted in Tips

Pasting tabular data from MacVector into Microsoft Excel

There are a number of MacVector analyses that generate tabular text output. Examples include the Protein Analysis Toolbox List output, the Raw Data tab of the ABI chromatogram document window, the Matrix tab of the multiple sequence alignment document windows and the Coverage tab of the Bowtie contig editor. Each of these is actually composed of tab-delimited columns, meaning that you can copy the entire table and paste the data into Microsoft Excel for additional analysis.

However, over the years, the behavior of Microsoft Excel has changed when it comes to dealing with copied tab-delimited text data. For example, Microsoft Excel 2008 would directly accept this data – you could copy from MacVector, switch to Excel and choose Edit | Paste and each item would be pasted into an individual cell. However, the current version of Microsoft Excel 365 (version 15.41) behaves differently – if you Edit | Paste, all of the data on each line gets pasted into the first column. To get around this, simply choose Edit | Paste Special... There is just a single option for the data on the clipboard (“Text”), but when you click OK, the data gets pasted into individual cells just as you would want.
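Because the copied text is plain tab-delimited data, you can also process it outside Excel. The following Python sketch uses only the standard library to split such text into rows and re-emit it as CSV; the function names and the sample column headings are made up for illustration.

```python
import csv
import io


def tsv_to_rows(text):
    """Split tab-delimited text into a list of rows (lists of cell strings)."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    return [row for row in reader if row]  # skip blank lines


def rows_to_csv(rows):
    """Serialize rows as comma-separated text that any spreadsheet can open."""
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


if __name__ == "__main__":
    copied = "Feature\tReads\ngeneA\t120\ngeneB\t45\n"  # hypothetical table
    print(rows_to_csv(tsv_to_rows(copied)))
```

Saving the result with a .csv extension gives a file that opens with one value per cell, whichever version of Excel you have.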


Posted in Tips

Using MacVector’s Agarose Gel tool to design a digest to screen minipreps after a ligation.

How to design a digest to screen minipreps after a ligation.


Posted in Techniques, Tips

MacVector and macOS Mojave

Apple released macOS Mojave today (Tuesday 25th September).

Over the past few months we have performed preliminary testing of MacVector 16.0.9 on various pre-release builds of macOS Mojave, and all appears to be well.

So although we cannot state that MacVector is compatible until it is officially released, we can say that we think all is OK.

Incidentally, over the past few releases of MacVector we have made a significant effort to future-proof MacVector for upcoming releases of macOS.

Please do note that MacVector 16.0.9 will not support Dark Mode. However, it is planned for the next release, MacVector 17 (out later this year).

For older versions of MacVector you can check compatibility on our website. The results for macOS Mojave are not yet published, but it is likely that they will follow the ones for macOS High Sierra 10.13.4.

Please note that we generally recommend waiting at least a few weeks after a new macOS release before upgrading.

Posted in Uncategorized

Free MacVector Teaching licenses

Did you know that MacVector, Inc. offers FREE licenses for teaching purposes?

If you are in the process of either preparing your lab for the upcoming year, or preparing for the courses you need to teach, then don’t forget that MacVector makes a great teaching tool. We offer free teaching licenses for anyone with an active, up-to-date license. All we need for your teaching license is the course name and number, start and stop date of the course, and the number of seats needed. We also have many great resource materials that will help you or your students learn MacVector.

Posted in General

NOW CANCELLED MacVector are at the NIH Research Festival

NIH Research Festival Tent show September 13-14, 2018

Due to the forecast storm the Tent Show has been postponed. As soon as we know the rescheduled date we’ll update.

We’re at the NIH Research Festival this week. Please drop by our booth on Thursday or Friday, when the big, white exhibitors tent is open. We’re at Booth #520 in parking lot 10H.

We enjoy the tent show and look forward to meeting NIH MacVector users both new and old, and anybody who is interested in learning to use the easiest to use sequence analysis application for the Mac. We’re happy to just have a chat, but we’ll also be able to give you a preview of our next release, MacVector 17.0, which will be out soon (automatic Gibson assembly, Compare Genomes, and much more!). We’ll have some goodies too – our popular mousepads with the IUPAC Genetic Code as well as some Pop sockets for your cell phone.

The exhibit hours are:

  • Thurs. Sept 13 from 9:30 to 15:30
  • Fri. Sept 14 from 9:30 to 14:30

Don’t forget that the NIH host a picnic. Food from several restaurants will be sold at a discount to both researchers and exhibitors.

    Posted in Meetings

    MacVector and future releases of macOS

    The next OS release for the Mac will arrive later this year. macOS Mojave is a significant OS upgrade and includes a lot of new features, especially when compared with the differences between OS X El Capitan, macOS Sierra and macOS High Sierra.

    Our developers have been hard at work ensuring that when macOS Mojave is released, MacVector 16 and the following release (MacVector 17) will be fully compatible.

    As usual for older versions you can check compatibility on our website. The results for macOS Mojave are not yet published, but it is likely that they will follow the ones for macOS High Sierra 10.13.4.

    Recent older versions are likely to work fine. Over the past few versions we have been striving to future-proof MacVector for new versions of macOS.

    However, for older versions of MacVector there will be issues. Apple do make significant changes between releases. One issue that you will see with MacVector 13.5 and earlier versions is regular dialogs stating that this version of MacVector will not work with the next release of macOS.

    A more subtle version of this dialog started appearing with the release of macOS High Sierra 10.13.4, when Apple added warnings to all 32-bit applications.

    Here at MacVector we always strive to stay ahead of the game. When Apple recommended (many years ago) that all applications should be 64-bit, we immediately started working to move MacVector to be fully 64-bit, resulting in the release of MacVector 14 in 2015.

    Please do note that although our developers develop and run MacVector on Apple’s public and developer betas of macOS, we really do not recommend that users run MacVector for any real workflows on beta releases, as you may encounter unexpected issues.

    Posted in Development

    RNASeq Expression Analysis with MacVector and Assembler

    If you have the Assembler module, MacVector can align millions of NGS reads from RNASeq experiments against large genomes and generate a coverage table displaying the relative expression levels of every gene in a genome. The key to this functionality is that you must have a reference genome with genes annotated as CDS or gene features – then you can use Bowtie to rapidly assemble reads from fastq files against the genome and generate a report listing the coverage for each feature. The steps to do this are:

  • Use File | New | Assembly Project to create a new project.
  • Click on the Add Ref toolbar button to add a suitable annotated reference genome.
  • Click on the Add Reads toolbar item to add your RNASeq reads.
  • Click on the Bowtie toolbar button to assemble the reads against the reference genome.
  • Double-click on the resulting reference contig item and switch to the Coverage tab.
  • The Coverage tab shows the per-feature coverage table.

    The table lists each CDS and gene feature, along with the number of reads that aligned to it and the Reads Per Kilobase of transcript per Million mapped reads (RPKM) and Transcripts Per Million (TPM) values, which can be used to compare expression between genes and runs.
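For reference, the standard definitions of RPKM and TPM can be sketched in a few lines of Python. This is an illustration of the usual formulas only; how MacVector counts the total number of mapped reads is not described here, so `total_mapped_reads` is passed in as an explicit assumption.

```python
def rpkm(read_count, length_bp, total_mapped_reads):
    """Reads Per Kilobase of transcript per Million mapped reads."""
    return read_count * 1e9 / (length_bp * total_mapped_reads)


def tpm(read_counts, lengths_bp):
    """Transcripts Per Million for every feature in one run.

    Each count is first normalized by feature length in kilobases,
    then the rates are scaled so that they sum to one million.
    """
    rates = [count / (length / 1000) for count, length in zip(read_counts, lengths_bp)]
    total = sum(rates)
    if total == 0:
        return [0.0 for _ in rates]  # no reads mapped to any feature
    return [rate / total * 1e6 for rate in rates]


if __name__ == "__main__":
    counts = [100, 400, 50]       # hypothetical reads per feature
    lengths = [1000, 2000, 500]   # hypothetical feature lengths in bp
    print([round(rpkm(c, l, sum(counts)), 1) for c, l in zip(counts, lengths)])
    print([round(v, 1) for v in tpm(counts, lengths)])
```

Because TPM values always sum to one million within a run, they are often the more convenient choice when comparing the same gene between runs.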

    There is an RNASeq Expression Analysis Tutorial available that describes this functionality in more detail. If you are running MacVector 16.0.4 or later you will find this in the /Applications/MacVector/Documentation/ folder. Otherwise you can download the tutorial and sample dataset. Although this is a small dataset designed for rapid analysis, you can use this approach to align 50 million+ reads against the entire human transcriptome on a modest (16 GB RAM) MacBook Pro.

    Not sure if you have Assembler? Choose MacVector | About MacVector. If the screen that appears says “MacVector with Assembler, Pro Edition” then you have it. If not, you can sign up for a fully functional 21 day trial version.

    Posted in Techniques, Tips

    How to split large fastq files for more manageable assemblies

    We’ve previously discussed how important it is to use an appropriate number of fastq reads from an NGS experiment to obtain the results you are looking for. Using too many reads can confuse assembly algorithms: with massive coverage, background errors in the reads lead to more mis-assemblies. In addition, large numbers of reads can significantly impact CPU performance, memory usage, and even disk usage. At MacVector we have coded a simple utility that will split large fastq files into smaller chunks. It’s completely free to download and should work on all versions of macOS/Mac OS X.

    Download the SplitFastqFile utility

    You run it by simply dropping fastq files onto the application and following the prompts. When complete, you’ll see the split files together in a folder.
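If you would rather script the same kind of split, here is a minimal Python sketch. It is not the SplitFastqFile utility's implementation: it assumes uncompressed fastq files with exactly four lines per read, and the function name and `_partNNN` output naming are hypothetical.

```python
import os
from itertools import islice


def split_fastq(path, reads_per_chunk, out_dir):
    """Split a fastq file into files of at most `reads_per_chunk` reads.

    Returns the list of chunk paths in the order they were written.
    Assumes four lines per read (no wrapped sequence lines).
    """
    os.makedirs(out_dir, exist_ok=True)
    base = os.path.splitext(os.path.basename(path))[0]
    written = []
    with open(path) as src:
        part = 1
        while True:
            # Read the next block of whole records (4 lines each).
            lines = list(islice(src, 4 * reads_per_chunk))
            if not lines:
                break
            out_path = os.path.join(out_dir, f"{base}_part{part:03d}.fastq")
            with open(out_path, "w") as dst:
                dst.writelines(lines)
            written.append(out_path)
            part += 1
    return written
```

Each chunk is held in memory while it is written, so pick a `reads_per_chunk` value that comfortably fits in RAM.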




    Posted in Techniques, Tips