Weekly Tip: Use Hash Value = 12 for speedy genome comparisons with Create Dot Plot

MacVector’s Analyze | Create Dot Plot function can compare entire genomes very quickly, giving an overall view of similarity (large inversions and duplications) while still letting you “drill down” to the residue level to see individual SNPs. One of the keys to ensuring the calculations complete in a reasonable length of time is to set the Hash Value to a large number, typically 11 or 12. For example, to compare two E. coli genomes (~4.6 Mbp) these settings are a good start.


On a typical laptop, the calculation takes just a few seconds to run with these settings, yet the resulting plot clearly shows the well-documented inversion in E. coli strain W3110 relative to MG1655 due to a recombination between the rrnB and rrnE rRNA gene clusters.
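To see why a large Hash Value keeps genome-scale comparisons fast, here is a minimal Python sketch of word-based (k-mer) matching, the general idea behind hashed dot plots. It is an illustration only, not MacVector's implementation: the function name `kmer_hits` is hypothetical, and it looks at the forward strand only.

```python
from collections import defaultdict


def kmer_hits(seq_a: str, seq_b: str, k: int = 12) -> list[tuple[int, int]]:
    """Return (i, j) pairs where the k-mer at seq_a[i] equals the k-mer at seq_b[j].

    Each pair is one candidate dot on the plot. A longer k means far fewer
    chance matches, so there is less to store and draw for large genomes.
    """
    # Index every k-mer of the first sequence by its start position.
    index = defaultdict(list)
    for i in range(len(seq_a) - k + 1):
        index[seq_a[i:i + k]].append(i)

    # Look up each k-mer of the second sequence in that index.
    hits = []
    for j in range(len(seq_b) - k + 1):
        for i in index.get(seq_b[j:j + k], ()):
            hits.append((i, j))
    return hits


if __name__ == "__main__":
    a = "ACGTACGTTTGACCA"
    b = "GGACGTACGTTT"
    # A short k is used here only because the toy sequences are tiny.
    print(kmer_hits(a, b, k=8))
```

Runs of hits along a diagonal correspond to a conserved region. An inversion would show up as a diagonal running the other way once the reverse complement is also indexed, which this sketch leaves out.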


Posted in Tips

MacVector’s Primer Database – Importing primers from Excel

Many molecular biologists keep lists of their primer sequences in Excel or some other spreadsheet tool. Previously MacVector had a separate utility that allowed you to import primers kept in spreadsheet format into a Primer Database for direct use within MacVector. With the release of MacVector 18.2 we have integrated this functionality within MacVector.

CDC primers

Rather than importing the file directly, you first open the CSV file in TextEdit, then copy and paste its contents into MacVector:

  • Prepare your data in an Excel or Numbers spreadsheet with three columns – “Name”, “Sequence”, “Comment”.
  • Export the data (or Save As…) in Tab Separated Values (TSV) format or Comma Separated Values (CSV) format.
  • Open the file with TextEdit and select all the rows of text.
  • Choose Edit | Copy.
  • Switch to MacVector and select File | New From Clipboard.
  • You can also open an existing Primer Database file and paste the new entries into it.
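If your primers live in a script or a LIMS export rather than a spreadsheet, the same three-column text can be generated directly. The sketch below uses only Python's standard `csv` module. The function name, the example primers and the decision to write a header row are assumptions for illustration, so check the result against what MacVector accepts before relying on it.

```python
import csv
import io


def primers_to_tsv(primers, include_header=True):
    """Build tab-separated text with Name, Sequence and Comment columns.

    `primers` is an iterable of (name, sequence, comment) tuples.
    Writing a header row is an assumption; pass include_header=False
    if only the data rows should be pasted.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    if include_header:
        writer.writerow(["Name", "Sequence", "Comment"])
    for name, sequence, comment in primers:
        # Normalise the sequence to upper case with no surrounding spaces.
        writer.writerow([name, sequence.strip().upper(), comment])
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical primers, for illustration only.
    example = [
        ("demo_F", "acgtacgtacgtacgtacgt", "forward demo primer"),
        ("demo_R", "TTGACCATGCAAGTCCATGA", "reverse demo primer"),
    ]
    print(primers_to_tsv(example), end="")
```

The printed text can be copied and used with File | New From Clipboard as described in the steps above.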

Remember that as well as scanning sequences to look for your primers, you can also automatically display primer binding sites with any sequence that you open.

Here’s an overview of all the primer workflows in MacVector.

ScanDNA primers 1

Posted in Releases, Tips

MacVector 18.2 is out! …and ready for macOS Monterey

MacVector 18.2


We are very pleased to announce that MacVector 18.2 is available to download. MacVector 18.2 is a Universal Binary that runs natively on both Apple Silicon and Intel Macs. It is fully supported on macOS Sierra (10.12) to macOS Monterey (12).

MV18 2 Monterey

New features:

Align to Reference Enhancements

  • The Align to Reference alignment algorithm has been overhauled to do a much better job handling larger numbers of gaps in the alignment between a reference sequence and a read.
  • The alignment algorithm has been further optimized for speed. In addition, the Sensitivity setting can now be lower due to the enhanced consecutive gap detection, which also speeds up calculations.
  • When aligning ABI chromatogram data, or plain sequences, the Map tab now graphically displays the “trimmed” regions at either end of the sequences.

MV18 2TrimmingEnds

  • There is a new Remove Gaps context-sensitive (right-click) menu option that deletes residues in reads that correspond to a gap in the consensus sequence.

MV18 2 CloseGaps

Context Sensitive Hamburger Menus

Where a window has context-sensitive menus available, the view now contains a “hamburger” button (three parallel horizontal lines) that displays the same context-sensitive menu when clicked.

Importing of Primer Databases in TSV or CSV Format

You can import primer data into a MacVector Primer Database (.nsub) file from an Excel or CSV formatted file. This functionality replaces the old Primer Converter utility.

CDC primers

Miscellaneous Enhancements

To reduce clutter in the Assembly Project window toolbar, all of the assembly algorithms have been consolidated into a single Assemble toolbar button with a dropdown menu.


How to update to MacVector 18.2.

If you have an active license, then you will be prompted to automatically update within the next few days.
You can also download the installer and do it manually now.

Posted in Uncategorized

Compare a pair of genomes

In recent years there has been an explosion of whole-genome sequencing projects. One common question coming out of this is:

“Exactly what are the genetic differences between my sequenced organism and another related strain?”

MacVector to the rescue! MacVector’s Compare Genomes By Feature… tool lets you see the differences between two annotated genomes in fine detail.

CompareGenomes 1

The algorithm takes every annotated feature from the source genome and looks for the presence of that feature in the comparison genome based on sequence similarity. CDS features are even translated so that the predicted amino acid sequences are compared. The results are then tabulated to show identical, closely related, and weakly related features in separate tabs, with additional tabs for features that are completely missing and a “details” tab that shows the low-level alignment details for any matching pair of features. Hot-links in the result tabs let you quickly scroll the parent sequences to any individual feature of interest.

How to compare a pair of genomes

This tool compares two related annotated genomes (or smaller sequences) and lists, in spreadsheet form, the identical, similar and weakly similar features along with any missing features.

  • Open the pair of sequences you want to compare.
  • Choose the feature types you want to compare and the target sequence.

CompareGenomes dialog

  • Click OK.
  • When the job has completed, you will be presented with the Filter dialog. Normally the defaults will be suitable. However, if the genomes are very similar you may want to increase the Similarity Threshold.
  • Click OK.

CompareGenomesFilter

    A results window will appear with the following tabs:

    Identical, Similar, Weak, Missing, Details, Plot, Context


    The first three tabs refer to the similarity of a feature between the two genomes. The boundaries between them are set by the threshold in the previous dialog.

    Identical lists all of the features that are perfectly conserved between the two genomes based on sequence identity, even if the names and qualifiers are different. CDS features are translated and the amino acid sequences compared, so there may be silent mutation differences in the encoding DNA sequences.

    Similar shows matches that are not identical but match or exceed the Similarity Threshold.

    Weak lists all the remaining matches that exceeded our initial search criteria but were not sufficiently similar to be included on the Similar tab.

    Missing lists features that are completely absent in the second sequence.

    Details displays feature alignments when you click on a hotlink in the first three tabs.

    Plot shows a dot plot of your pair of sequences so you can visualise the relationship between the pair of genomes.

    Context shows the alignment between the pair of genomes.
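The way matches fall into the first four tabs can be sketched as a simple decision rule. This is an illustration only, not MacVector's code: the function name and both numeric defaults are hypothetical, and the score is assumed to be a percent identity (amino acid identity for CDS features, as described above).

```python
from typing import Optional


def classify_match(score: Optional[float],
                   similarity_threshold: float = 90.0,
                   search_cutoff: float = 50.0) -> str:
    """Assign the best match for a feature to a results tab.

    score                -- percent identity of the best match, or None if
                            nothing was found (assumed representation)
    similarity_threshold -- stands in for the Filter dialog's Similarity
                            Threshold (the default here is hypothetical)
    search_cutoff        -- stands in for the initial search criteria
                            (the default here is hypothetical)
    """
    if score is None or score < search_cutoff:
        return "Missing"      # no match passed the initial search
    if score >= 100.0:
        return "Identical"    # perfectly conserved
    if score >= similarity_threshold:
        return "Similar"      # meets or exceeds the threshold
    return "Weak"             # matched, but below the threshold


if __name__ == "__main__":
    for s in (100.0, 97.5, 72.0, None):
        print(s, "->", classify_match(s))
```

Raising `similarity_threshold` moves borderline matches from Similar to Weak, which is why a higher threshold is useful when the two genomes are very close.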


    The format of the results tabs

    For the first three tabs the format is similar. The first five columns are the “name”, type, start, stop and strand of the feature in the parent sequence, i.e. the sequence that you had frontmost when you invoked the search. The “name” is the label that appears in the Map tab for the feature. By default, for CDS features, this is the /gene= qualifier, but it can be configured for an individual feature or for all features of a type. The rightmost columns provide the same information for the feature(s) that matched on the target genome, except that there is an extra Match Score column. This displays the DNA identity score for each pair of features along with (in brackets) the identity score for the predicted amino acid translation of CDS features, given the current default genetic code.

    Note that features that are duplicated in the target genome will show additional matches.

    Note that when multiple matches are found, if one of them has a 100% match, all of the matching features are shown in the match list, even if they do not also have 100% identity. This approach ensures that you are always aware of duplicated/pseudogenes with significant but non-identical matches.
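The two numbers in the Match Score column can differ because a silent mutation changes the DNA but not the protein. The short Python sketch below makes that concrete for two already-aligned, equal-length CDS fragments. It assumes the standard genetic code and a simple ungapped comparison; MacVector uses real alignments and the currently selected genetic code, so treat this purely as an illustration with hypothetical function names.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string codon by codon (incomplete trailing codon ignored)."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length strings."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


if __name__ == "__main__":
    source = "ATGGCTAAA"   # Met-Ala-Lys
    target = "ATGGCCAAA"   # GCT -> GCC is a silent change (still Ala)
    dna_score = percent_identity(source, target)
    protein_score = percent_identity(translate(source), translate(target))
    print(f"{dna_score:.1f} ({protein_score:.1f})")
```

A pair like this would count as identical at the protein level even though the DNA score is below 100%.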

    The display is highly interactive and you can click on any blue hotlink to view more information about it.

    For example, if you click on a hotlinked feature name in the first column, the parent sequence document is brought frontmost, switches to the Features tab, and scrolls to and highlights the corresponding feature. You can use this shortcut to quickly jump to any feature of interest.


    Alternatively, if you click on a gene name in the target genome columns, the window switches to the Details tab and shows the sequence alignment between the two features.

    Posted in Uncategorized

    MacVectorTip: Scan For… Missing Primers: Automatically display Primer Binding Sites on your sequences

    MacVector’s Scan DNA For… tool automatically displays restriction enzyme recognition sites, putative ORFs, CRISPR PAM sites and missing annotation in each DNA sequence that you open. It also displays primer binding sites from your own Primer Database. Here’s an example of a couple of primers displayed on the pET 47b LIC cloning vector on each side of the LIC cloning site. You can quickly annotate interesting binding sites with a simple right mouse click.

    ScanDNA primers 1

    The feature is controlled from the MacVector | Preferences | Scan DNA | Primers tab. By default, it uses the Primer Database.nsub file that you can find in the /Applications/MacVector/Subsequences/ folder, but you can point it to any .nsub file of your choosing.

    ScanDNA primers 2
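Conceptually, scanning for primer binding sites means looking for each primer on the top strand and for its reverse complement, which marks a site on the bottom strand. The Python sketch below shows that idea with exact matching only. It is not how MacVector reads .nsub files or scores partial matches; the function names and the toy primer are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def _find_all(haystack: str, needle: str):
    """Yield every start index of needle in haystack, overlaps included."""
    start = haystack.find(needle)
    while start != -1:
        yield start
        start = haystack.find(needle, start + 1)


def find_primer_sites(sequence: str, primers: dict) -> list:
    """Return sorted (name, start, end, strand) tuples for exact primer matches.

    Coordinates are 0-based with an exclusive end. "+" means the primer
    itself appears in the sequence; "-" means its reverse complement does.
    """
    sequence = sequence.upper()
    sites = []
    for name, primer in primers.items():
        primer = primer.upper()
        for start in _find_all(sequence, primer):
            sites.append((name, start, start + len(primer), "+"))
        for start in _find_all(sequence, reverse_complement(primer)):
            sites.append((name, start, start + len(primer), "-"))
    return sorted(sites, key=lambda s: (s[1], s[0]))


if __name__ == "__main__":
    template = "GGGATTACACCCCTGTAATCGG"
    print(find_primer_sites(template, {"P1": "GATTACA"}))
```

A primer pair flanking a cloning site would appear as one "+" hit upstream and one "-" hit downstream.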

    Posted in Tips

    MacVectorTip: Using the Align to Reference Shading and Trimming toolbar buttons

    MacVector’s Align to Reference Editor and the Contig Editor in Assembly Projects have two useful functions for visualizing assemblies. The Shading button turns on background coloring of the residues in the upper pane, based on quality values (these can be from Sanger reads or from NGS reads). The scale ranges from dark red for poor quality, through white for “OK” quality (phred score 20 for an individual read), to dark green for high quality (phred score 40 or above). Edited residues appear blue. Coloring can be toggled on and off using the Shading button. The Trimmed button controls the visibility of trimmed (or “clipped”) residues. These are residues that do not align to the reference sequence, so they are greyed out and are not included in consensus calculations. You can hide those residues completely by turning off the Trimmed button.
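For a feel of what the shading scale means, recall that a phred score Q corresponds to an error probability of 10^(-Q/10), so Q20 is a 1 in 100 chance of a wrong base call and Q40 is 1 in 10,000. The Python sketch below pairs that formula with a red, white and green gradient. The RGB values and the linear interpolation are assumptions chosen for illustration, not MacVector's actual colors.

```python
def phred_to_error_probability(q: float) -> float:
    """Probability that a base call with phred score q is wrong."""
    return 10 ** (-q / 10)


# Hypothetical anchor colors for the gradient (R, G, B).
DARK_RED = (139, 0, 0)
WHITE = (255, 255, 255)
DARK_GREEN = (0, 100, 0)
OK_SCORE = 20     # "OK" quality maps to white
HIGH_SCORE = 40   # this score and above maps to dark green


def _blend(c1, c2, t):
    """Linear interpolation between two RGB colors, t in [0, 1]."""
    return tuple(round(a + (b - a) * t) for a, b in zip(c1, c2))


def shade(q: float):
    """Map a phred score to a background color on the assumed gradient."""
    q = max(0, min(q, HIGH_SCORE))   # clamp to the displayed range
    if q <= OK_SCORE:
        return _blend(DARK_RED, WHITE, q / OK_SCORE)
    return _blend(WHITE, DARK_GREEN, (q - OK_SCORE) / (HIGH_SCORE - OK_SCORE))


if __name__ == "__main__":
    for q in (5, 20, 30, 40):
        print(q, f"{phred_to_error_probability(q):.4%}", shade(q))
```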

    Here is an alignment with Shading and Trimmed both turned ON. Note the greyed out residues in the outlined areas:


    Now the same alignment with Shading and Trimmed turned off:


    Don’t forget the Dots button too. It replaces every residue that matches the consensus with a “.” to make mismatches easier to see.
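The effect of the Dots button is easy to picture with a couple of lines of Python. This is a sketch of the display idea only, assuming the read and consensus are already aligned and the same length; the function name is hypothetical.

```python
def dots_view(read: str, consensus: str) -> str:
    """Show '.' where the read agrees with the consensus, else the read's residue."""
    return "".join("." if r == c else r for r, c in zip(read, consensus))


if __name__ == "__main__":
    consensus = "ACGATGCA"
    read = "ACGTTGCA"
    print(consensus)
    print(dots_view(read, consensus))
```

Only the mismatching residue remains visible, which is what makes SNPs stand out in a deep alignment.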

    Posted in Tips

    Importing SnapGene files into MacVector

    MacVector will directly import SnapGene DNA files. You just need to use File | Open or double-click the file.

    This is very useful when downloading plasmid sequences from the wonderful Addgene plasmid repository.

    When you import a SnapGene file the appearance will be very similar. The colors of features will be the same as the original. However, some aspects of the display differ between the two applications. For example, MacVector has multiple levels (up to six) outside and inside a plasmid and will always try to place features so that no feature overlaps another. SnapGene, in contrast, always places features on the same two levels, so features sometimes overlap.

    Here’s a plasmid sequence downloaded from the Addgene website in SnapGene format. It’s been opened directly in MacVector by double-clicking the file.


    This sequence was opened using the MacVector defaults. However, MacVector’s graphics are highly customizable and you can adjust the graphical settings to display the plasmid exactly as you want. For example you may prefer features to be displayed on just those two levels instead of being distributed over the multiple levels as per the default settings.

    MacVector will directly import many file formats, including common sequence formats such as GenBank, FASTA and FASTQ, plus files from software packages such as Sequencher (Projects) and Serial Cloner.

    Posted in Uncategorized

    Designing primers and documenting In-Fusion Cloning with MacVector

    The In-Fusion Cloning kits from Takara allow you to perform ligase-free cloning of PCR products into vectors in as little as 15 minutes.

    You can use MacVector’s Gibson Cloning/Ligase Independent tool to design primers for In-Fusion cloning workflows. The In-Fusion kits need a 15 nt overlap between the ends of a fragment and the ends of a linearized vector. The reaction uses a 3’ exonuclease to create single-stranded overhangs (Gibson Assembly uses a 5’ exonuclease, but otherwise is very similar).

    Here are the basic steps:

    1. Use File | New | Gibson/Ligase-Independent Assembly… and select the second option in the dialog. This creates your Ligase Independent Cloning (LIC) project.

       In fusion 1

    2. Next you want to prepare your vector. There are a few ways you might do this in the lab:
      1. You might have linearized a vector with one or more restriction enzymes. If it’s a single enzyme, just open the vector sequence and drag the name of the enzyme from the Map tab of the vector and drop it onto the LIC project window. If you are using two enzymes, select both (use [shift] to select the second one), making sure you have the bulk of the vector selected so that you include the markers and the replication origin. Then choose Edit | Copy, switch to the LIC Project window and choose Edit | Paste.
         Here, we’ve taken the major EcoRI – HindIII fragment from pUC19.

         In Fusion 2

         For this approach click on the outlined button and select the No Primer option.

      2. You may want to create a vector backbone using PCR. In this case, you want to select and copy the exact DNA sequence you want to appear in the final construct. In the example below we’ve taken the sequence from the stop codon of the lacZ alpha gene in pUC19 to the ATG start codon of lacZ alpha, as if we were going to create a fusion protein starting with that ATG. In this case, we want MacVector to calculate an Automatic Primer for each end (click the outlined button and change to Automatic Primer). As a second fragment has not yet been entered, MacVector simply generates a primer to match the other end of the vector backbone fragment.

         In fusion 3

      3. The third way might be if you are using a pre-prepared linearized vector from a company.
    3. Next you need to prepare your insert. Basically, just select the exact piece of DNA you want to insert, copy, then paste into the Project. Here we chose a GalK gene to insert:

       In Fusion 4

       The default value for overlaps is 8. Click on the Prefs toolbar button to increase the minimum to 15 as recommended for In-Fusion cloning.

       In fusion 5

    4. Finally you need to circularize the resulting product to have your vector with insert. Click the Assemble button in the toolbar and you will get your In-Fusion product in a new window.

    In Fusion 6

    The new sequence will have /FRAG features in the Features tab showing how the molecule was constructed.

    In Fusion 7
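To make the 15 nt overlap requirement concrete, here is a small Python sketch of how an In-Fusion style primer pair is put together: each primer carries 15 nt copied from the adjacent vector end, followed by an insert-specific annealing region. This is a simplified illustration and not MacVector's Automatic Primer algorithm. The function names are hypothetical, and the fixed 20 nt annealing length is an assumption; a real design would choose it from melting temperature.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def infusion_primers(vector_upstream: str, vector_downstream: str, insert: str,
                     overlap: int = 15, anneal: int = 20):
    """Return (forward, reverse) primers, both written 5' to 3'.

    vector_upstream   -- top-strand vector sequence ending at the insertion point
    vector_downstream -- top-strand vector sequence starting after the insertion point
    insert            -- top-strand sequence to be inserted
    overlap           -- vector-derived tail length (15 nt for In-Fusion)
    anneal            -- insert-specific length (fixed here as an assumption)
    """
    if len(vector_upstream) < overlap or len(vector_downstream) < overlap:
        raise ValueError("vector ends are shorter than the requested overlap")
    if len(insert) < anneal:
        raise ValueError("insert is shorter than the annealing length")

    # Forward primer: end of the upstream vector arm + start of the insert.
    forward = vector_upstream[-overlap:].upper() + insert[:anneal].upper()
    # Reverse primer: reverse complement of (end of insert + start of downstream arm).
    reverse = reverse_complement(insert[-anneal:] + vector_downstream[:overlap])
    return forward, reverse


if __name__ == "__main__":
    # Toy sequences for illustration only.
    upstream = "CCCCCGGTACCCGGGGATCCTCTAGA"
    downstream = "GTCGACCTGCAGGCATGCAAGCTT"
    insert = "ATGACTGCTAAAGCTGAAGGTCCTCGTAATGGTTAA"
    fwd, rev = infusion_primers(upstream, downstream, insert)
    print("forward:", fwd)
    print("reverse:", rev)
```

After PCR with such a pair, both ends of the product share 15 nt with the linearized vector, which is what the In-Fusion reaction needs to join them.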

    Posted in Techniques

    MacVectorTip: Using tabbed sequence windows in MacVector

    One of the lesser-known features of macOS is the ability to keep all open documents of an application in tabs. Tabs were initially introduced for the Finder in OS X Mavericks, and macOS Sierra extended them to the document windows of supported applications. MacVector has supported tabs since their introduction; however, by default the Tab Bar is turned off.

    To view the Tab Bar in MacVector, use:



    However, to control the behaviour when you open new documents you need to use the main system preferences dialog:


    When set to ALWAYS, every new document you open in any supported application will open in a new tab. If you prefer multiple windows, you can drag a tab out of the window to open it in a new window. However, you may prefer some windows to be tabbed and others to always open separately. So if you want a particular sequence in a separate window, drag the tab out of the window or use:


    Please note that MacVector’s Results windows are always tabbed irrespective of the SYSTEM setting. However, you can always drag a results tab out of a window to open in a new window.

    Posted in Uncategorized

    MacVector Tip: a complex subsequence pattern example.

    MacVector’s Subsequence tools allow you to search for motifs in both protein and DNA sequences. As well as a library of existing subsequence files, such as promoters and transcription factor binding sites, you can keep a library of your own subsequence patterns. A subsequence library holds multiple patterns in a single file, and a search looks for matches to all of the subsequences in that file.

    We recently had an interesting and tricky question on how to search for a protein motif where one of the amino acids was one of four different residues and the second and/or third amino acids were one of two amino acids.

    Looking for ambiguous residues is relatively easy. You just surround the amino acids with parentheses and it will match one of those. For example (MV) would match either methionine or valine at that position.

    However, the second part of the motif is trickier. MacVector’s Subsequence search tool lets a pattern have multiple parts joined with AND or OR, but it has no single “and/or” operator. You can, however, use OR to combine two parts that together cover the requirement. Here’s how this was done.

    Our example peptide/motif we are looking for has ten amino acids. The amino acids are as follows:

    • The first position is one of four residues: arginine, lysine, aspartic acid or glutamic acid (RKDE).
    • The second and third positions are where one or both are tryptophan, tyrosine or methionine (WYM).
    • A string of any six amino acids.
    • The tenth position can be alanine, isoleucine, leucine, methionine, phenylalanine, valine, proline or glycine (AILMFVPG).

    So let’s take that motif position by position and build our subsequence.

    • Position 1 – (RKDE) – would match any amino acid of those four.

    • Positions 2 and 3 – You cannot specify “one or both” directly. But you can specify that one or the other matches by using two parts joined with OR, with X (Xaa) matching any amino acid at the other position. Because X also matches W, Y and M, the two parts together cover the case where both positions match.

      • (RKDE)(WYM)X requires W, Y or M at the second position and matches any amino acid at the third.

      • (RKDE)X(WYM) matches any amino acid at the second position and requires W, Y or M at the third.

    • Positions 4 to 9 – Use X for each of these six positions: XXXXXX.

    • Position 10 – (AILMFVPG).

    So our full set of matches will be:

    ComplexSubsequenceMatches 1

    Here’s how this can be entered in the Subsequence Editor:

    ComplexSubsequenceMatches 2

    The Editing Subsequences help topic covers this.
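For readers who think in regular expressions, the same ten-residue motif can be written as a single pattern, which is a handy way to sanity-check the logic outside MacVector. The Python sketch below is an equivalent expression of the rules listed above, not MacVector's subsequence syntax. `[A-Z]` stands in for X (any amino acid), and the function name is hypothetical.

```python
import re

# Position 1: R, K, D or E
# Positions 2-3: (W/Y/M then anything) OR (anything then W/Y/M)
# Positions 4-9: any six residues
# Position 10: A, I, L, M, F, V, P or G
MOTIF = r"[RKDE](?:[WYM][A-Z]|[A-Z][WYM])[A-Z]{6}[AILMFVPG]"

# The lookahead lets overlapping occurrences be reported as well.
_SCANNER = re.compile(r"(?=(" + MOTIF + r"))")


def find_motifs(protein: str):
    """Return (start, matched_peptide) for every occurrence, 0-based."""
    return [(m.start(), m.group(1)) for m in _SCANNER.finditer(protein.upper())]


if __name__ == "__main__":
    for peptide in ("GGRWAAAAAAAAGG", "KAYGGGGGGV", "RAAGGGGGGL"):
        print(peptide, find_motifs(peptide))
```

The alternation in the middle of the pattern plays the same role as the two OR-ed subsequence parts: either branch alone handles one position, and together they also accept the case where both positions are W, Y or M.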

    Posted in Tips