Identifying transposon insertion sites from multiplexed NGS data

Transposon mutagenesis is a common approach for investigating gene function in bacterial genomes: you select for clones in which a transposon insertion has generated a specific phenotype, then sequence the entire genome of each clone by NGS to identify the insertion site. To lower the cost of such experiments, it is common to pool several individual genomes into each NGS sample and then run appropriate sequence analysis to identify the genes disrupted by the transposition events.

There is a new Transposon Insertion Analysis Tutorial that describes how to perform this analysis using MacVector with Assembler. To follow along, you can download sample data. The basic strategy is to use MacVector’s Align to Folder functionality to pull out all pairs of reads that contain transposon sequences, then align those reads to the genome to identify the end points of the transposon insertion site.
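The idea behind that strategy can be pictured with a small Python sketch. This is not MacVector's implementation: the function names, the `tn_end` argument (the last few bases of the transposon, oriented so that genomic sequence follows it) and the exact-match lookup are all assumptions made for illustration. A real analysis would use a proper aligner that tolerates mismatches and handles both genome strands.

```python
def reverse_complement(seq):
    """Reverse-complement an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_junctions(reads, tn_end, genome, flank_len=20):
    """Return sorted 0-based genome positions where a read crosses from
    the transposon end into genomic sequence.

    Hypothetical helper: exact matching only, forward genome strand only.
    """
    sites = set()
    for read in reads:
        # A read may have been sequenced from either strand, so try both.
        for seq in (read, reverse_complement(read)):
            idx = seq.find(tn_end)
            if idx == -1:
                continue
            start = idx + len(tn_end)
            flank = seq[start:start + flank_len]
            if len(flank) < flank_len:
                continue  # not enough genomic sequence after the junction
            pos = genome.find(flank)
            if pos != -1:
                sites.add(pos)  # first genomic base after the transposon
    return sorted(sites)
```

Reads that contain the transposon end but too little flanking sequence are skipped, which is one reason why retrieving both reads of a pair is useful: the mate often lies entirely in the genome.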


The tutorial goes into detail, describing several approaches you can use to identify the insertion locations, along with shortcuts and suggestions on how to rapidly annotate the insertion sites on the complete genome. While the tutorial does use MacVector with Assembler for parts of the analysis, you can accomplish the same end result using plain MacVector.

Posted in Techniques, Tutorials

Human Transcriptome RNA-Seq Analysis Using MacVector

With MacVector Pro and Assembler you can use Bowtie to perform RNA-Seq analyses using NGS data. The interface even has specialized output tabs listing the coverage information and statistics for each annotated CDS and gene feature on the genome. You can download a short tutorial and a sample dataset that illustrate the analysis workflow using a small (1.6 Mbp) prokaryotic genome.

What surprises many people is that the combination of MacVector and modest Macintosh hardware can perform this analysis on human data. There are limitations: it is not currently practical to use the entire genome because of memory and processing constraints, but it is possible to run an analysis against the known Human Transcriptome, the latest version of which can be downloaded from the GENCODE database. There is a new RNA-Seq Human Transcriptome Analysis Tutorial that describes the basic procedure in detail, along with sample data that can be downloaded. The end result is a table that can be copied and pasted into Microsoft Excel for additional analysis.


Simple DNA sequence assembly on a Mac with MacVector with Assembler.

MacVector has a software plugin called Assembler that integrates directly into the DNA sequence analysis toolkit and provides DNA sequence assembly functionality. Dealing with sequencing reads has never been easier.

MacVector includes no fewer than five different assemblers, all just a few mouse clicks away from your sequencing reads. Phrap assembles Sanger sequencing reads or existing contigs, while there are three separate NGS de novo assemblers: Velvet for short read datasets, Flye for Nanopore and PacBio long reads, and SPAdes for mixed assemblies. For reference assembly, Bowtie2 can map millions of sequencing reads against genomic reference sequences and is ideal for RNA-Seq gene expression analysis too.

Assembler is tightly integrated into MacVector. It’s easy to bring sequencing reads into MacVector, and it’s just as easy to directly design primers for a contig, run BLAST searches on a contig, and much more, right from your desktop!

Posted in Techniques, Tutorials

Use a right-click in the Editor tab to see if your contig can be circularized

MacVector incorporates no fewer than three different de novo assemblers: Phrap, Velvet and SPAdes. While all are great assemblers, each with its own specific advantages, none of them will generate a circular sequence from input reads. However, MacVector also includes a tool to help you with this. If you are assembling reads representing plasmid sequences, or if you are closing gaps in a circular genome, you can find out if a contig can be circularized by double-clicking on it in the Assembly Project and then right-clicking* in the Contig Editor to bring up a context-sensitive menu.


The algorithm looks for a perfect overlap of at least 20 bases between the two ends of the contig. If no overlap exists, the menu item is greyed out and reads “Cannot Circularize Consensus”. Otherwise it indicates the length of the overlap. If you select the menu item, a new sequence window opens containing the circularized consensus of the contig, with all gaps removed.
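The end-overlap check can be sketched in a few lines of Python. This is an illustration of the idea rather than MacVector's code: the function name is hypothetical, and two details are assumptions, namely that gap characters (`-`) are stripped before the ends are compared and that the longest qualifying overlap is the one used.

```python
def circularize(consensus, min_overlap=20):
    """Return (circular_sequence, overlap_length), or None if the two
    ends do not share a perfect overlap of at least min_overlap bases.

    Assumptions: gaps are removed first; the longest overlap wins.
    """
    seq = consensus.replace("-", "")
    # Try the longest candidate overlap first, down to the minimum.
    for k in range(len(seq) // 2, min_overlap - 1, -1):
        if seq[:k] == seq[-k:]:
            # Drop the duplicated copy at the 3' end so the junction
            # appears only once in the circular sequence.
            return seq[:-k], k
    return None
```

Comparing every candidate length is quadratic in the worst case, which is fine for plasmid-sized contigs but would need a smarter string-matching approach for whole chromosomes.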

*To right click with a trackpad hold down [CTRL] and click once or tap with two fingers. MacVector has many “right click” menus with extra functionality.

Not sure if you have Assembler? Choose MacVector | About MacVector. If the screen that appears says “MacVector with Assembler, Pro Edition” then you have it. If not, you can sign up for a fully functional 21-day trial version.

Posted in Techniques, Tips

Import Multi-Sequence GenBank Files into an Assembly Project for easy access to Features

There are many genomes in the GenBank database that cannot be downloaded as single annotated sequences. These might be large multi-chromosome eukaryotic genomes but, increasingly, they are partially sequenced bacterial chromosomes where the major contigs have been annotated using the NCBI annotation pipeline. Typically, when you encounter these, there are options to download annotated versions as multi-sequence GenBank-formatted files. MacVector can open any file containing multiple sequences as either a Multiple Sequence Alignment document or as individual Sequence documents, but neither is optimal if you have more than a handful of sequences in the file. However, if you use MacVector with Assembler, you can import these sequences into a project using the Add Ref toolbar button. The individual sequences are then listed in the project window and, if you double-click on one, the complete annotated sequence opens.


This is a great way to view and/or sort collections of annotated sequences in GenBank format, something that cannot be done directly through the Apple Finder. Once a sequence is opened, you can Export… it in another format if you wish.
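For readers curious about what a multi-sequence GenBank file looks like to a program, here is a minimal Python sketch. It relies only on two facts about the flat-file format: each record starts with a `LOCUS` line and ends with a line containing just `//`. The function names are hypothetical, and this says nothing about how MacVector itself reads such files.

```python
def split_genbank(text):
    """Split the text of a multi-record GenBank file into a list of
    record strings, using the '//' terminator line."""
    records, current = [], []
    for line in text.splitlines():
        if not current and not line.strip():
            continue  # skip blank lines between records
        current.append(line)
        if line.strip() == "//":
            records.append("\n".join(current))
            current = []
    return records  # an unterminated trailing record is ignored


def locus_name(record):
    """Return the name from a record's LOCUS line, or None if absent."""
    for line in record.splitlines():
        if line.startswith("LOCUS"):
            fields = line.split()
            return fields[1] if len(fields) > 1 else None
    return None
```

Listing `locus_name(r)` for each record gives a quick table of contents for the file, similar in spirit to the list shown in the project window.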

Posted in Techniques, Tips

Opening GenBank or FASTA files with multiple sequences as individual sequences

Many sequence formats can contain multiple concatenated sequence entries. For example, FASTA and GenBank are two formats capable of storing multiple individual sequences in a single file.
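To make "multiple concatenated entries" concrete, here is a small Python sketch that splits the text of a multi-sequence FASTA file into individual records. It is only an illustration of the file layout (a `>` header line followed by one or more sequence lines); the function name is hypothetical and unrelated to MacVector's own importer.

```python
def split_fasta(text):
    """Return a list of (name, sequence) tuples from multi-FASTA text.

    The name is the first whitespace-delimited word after '>'.
    """
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            words = line[1:].split()
            name, chunks = (words[0] if words else ""), []
        elif name is not None:
            chunks.append(line)  # sequence may wrap over several lines
    if name is not None:
        records.append((name, "".join(chunks)))
    return records
```

Each tuple corresponds to what would become one sequence window if the file were opened as individual sequences.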

By default, MacVector treats such files as alignments and opens them in the Multiple Sequence Alignment editor. Most users who want to open such a file do want to see an alignment. Additionally, if the default behaviour were to open as individual sequences, then accidentally clicking on a large alignment would result in many hundreds of individual sequence windows opening up on your desktop (do remember that holding down the OPTION key and clicking on the close button will close all open sequences).

If you need to open such a sequence file as individual sequences, then there’s a simple option that you need to check in the FILE | OPEN dialog. This behaviour has not changed for quite some time. However, a few versions back the appearance of the dialog changed, due to a change in Apple’s guidelines on file dialogs. Whereas the older dialog had an obvious way to see this dropdown menu, now all you see is a small OPTIONS button in the bottom left hand corner.


To open multiple sequence files as individual files you need to check an option in the FILE | OPEN dialog.

  • Click FILE | OPEN
  • In the dialog click OPTIONS (bottom left corner)
  • Click OPEN

    Posted in Tips

    Optimizing Align To Folder Parameters for use with NGS Data

    You can use the Database | Align To Folder function to scan large FASTA or FASTQ files containing NGS data to find and retrieve just those reads that match a specific target sequence. The search is aware of paired-end reads, so when you retrieve hits, both reads of a pair are saved into a pair of FASTA or FASTQ files, even if only one of them matched the query sequence. This is a great way of finding sequencing reads to extend any short sequence. For optimum performance, set up the Align To Folder search like this.


    Set the Search Folder to the location of your data and select the Folder contains paired-end reads checkbox if you are working with paired data. For speed, make sure Hash Value is set to the maximum (currently 12) and use a large Scores to Keep value to make sure you can retrieve all the hits. Finally, use the DNA identity with penalties matrix to optimize the search so that only very close matches are reported.

    On a moderate machine (e.g. a three-year-old 2.7 GHz i7 MacBook Pro), a search of 20 million 90 nt reads with a 500 bp search sequence might take from 45 minutes to two hours, depending on the number of hits encountered. To retrieve the hits, select the numbered rows in the Folder Description List results tab and choose the Database | Retrieve to File… menu item. If you used paired-end data, two files will be produced, –1.fastq and –2.fastq, that you can then use as input to Assembler for SPAdes or Velvet assembly, or to Analyze | Align To Reference for more detailed reference alignment analysis.
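The pair-aware retrieval described above can be pictured with a short Python sketch. It is a deliberately simplified stand-in, not the Align To Folder algorithm: it assumes the Hash Value behaves like a word (k-mer) size, it counts any single shared exact 12-mer as a hit, and it ignores the scoring matrix entirely. All function names are hypothetical.

```python
def reverse_complement(seq):
    """Reverse-complement an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def kmers(seq, k):
    """Return the set of all k-length words in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def retrieve_pairs(pairs, target, k=12):
    """Keep every (name, read1, read2) tuple in which at least one mate
    shares an exact k-mer with the target on either strand."""
    index = kmers(target, k) | kmers(reverse_complement(target), k)
    hits = []
    for name, read1, read2 in pairs:
        if any(not kmers(r, k).isdisjoint(index) for r in (read1, read2)):
            hits.append((name, read1, read2))  # both mates are kept
    return hits
```

Keeping the non-matching mate is what makes the retrieved reads useful for extension: that mate typically lies beyond the end of the target sequence.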


    Posted in Techniques

    MacVector and macOS Catalina

    The next OS release for the Mac will arrive during September. macOS Catalina is a major OS release and includes many new features.

    As usual our developers have been hard at work ensuring that when macOS Catalina is released, MacVector 17 will be fully compatible.

    For older versions you can check compatibility on our website as soon as macOS Catalina is officially released.

    For versions of MacVector released over the past few years it is likely that they will work fine. For these we have been striving to future proof MacVector for new versions of macOS.

    However, for older versions of MacVector there will be issues. More significantly MacVector 13.5 and all older versions will not run. The MacVector application icon will be displayed with a “stop sign” indicating it will not run.

    This is due to Apple moving fully to a 64-bit operating system.

    Starting with the release of macOS High Sierra 10.13.4, any 32-bit application would periodically display warning dialogs, for example “the application is not optimised for this release and needs to be updated” and “This application will not work on future macOS releases”. However, the application would still run. With the release of macOS Catalina this migration is complete and 32-bit applications will no longer run.

    Here at MacVector we always strive to stay ahead of the game. When Apple recommended (many years ago) that all applications should be 64-bit, we immediately started working to move MacVector to be fully 64-bit, resulting in the release of MacVector 14 in 2015.

    Please note that although our developers develop and run MacVector on Apple’s public and developer betas of macOS, we do not recommend that users run MacVector for any real workflows on beta releases. You may encounter unexpected issues.

    Posted in Releases

    Upgrade to the most feature packed version of MacVector yet and install on multiple Macs with a 50% discount.

    Personal licenses are ideal for using on a single Mac, but not if you have multiple Macs or want to install on a shared lab computer as well as your personal Mac. So why not upgrade to a standard license of MacVector Pro 17 that you can share among a group of users in the same research group?

    Even better, if you take your laptop home, or away to a conference, you can keep working in MacVector and the license back in the lab will still work!

    During August we have a 50% discount on either exchanging a current personal license to a standard license, or upgrading an older personal license to a standard one of MacVector 17.

    MacVector 17 is one of the most feature-packed updates that we have ever released. Highlights include an interactive Restriction Enzyme Picker, a unique genome comparison tool and a tool to help you design and document Gibson Assembly and LIC workflows. MacVector 17 has support for Dark Mode, helps identify genome sequencing errors, automatically displays primer binding sites, maps multiple sequencing datasets against a single reference and has numerous performance improvements.

    How to upgrade a personal license to a standard license.

    Personal Licenses are locked to a single Mac. They are best suited to a single person using MacVector.

    Standard licenses may be installed on ALL Macs in a lab with one concurrent user. Standard licenses use Bonjour to check if the serial number is in use. If there is no network they just work.

    To upgrade, request a quote or email for more information.

    Posted in Releases

    The MacVector Team will be at ASM Microbe 2019 in San Francisco.

    We’re at ASM Microbe 2019 in San Francisco from Friday 21st June until Sunday 23rd.

    The show is finally back on the West Coast at the Moscone Convention Center after 6 years on the East Coast and New Orleans.

    This year the exhibit hall hours are unchanged from last year’s show:

  • Friday, June 21st – 10:30 AM – 5:00 PM
  • Saturday, June 22nd – 10:30 AM – 5:00 PM
  • Sunday, June 23rd – 10:00 AM – 4:00 PM

    We’re on booth 1731.

    Please do drop by. We’ll be demoing our latest release, MacVector 17.

    If you’ve never used MacVector before, or you are a power user, then please drop by and say hello. We always enjoy meeting users and we guarantee we can teach you something new, and hopefully you’ll be able to teach us something new too!

    See you in San Francisco!

    The Twitter hashtag looks to be #ASMMICROBE2019.

    Posted in General, Meetings

    What can MacVector do for my lab?

    Here’s what MacVector can do for your lab.

    Comparing sequences

    Whatever type of alignment your sequence needs, there’s a tool in MacVector.


    CRISPR Indel Analysis: Identify insertions and deletions following CRISPR editing of a target.

    Compare Genomes: Compares two related annotated genomes to identify identical, similar and weakly similar features.

    Sequence assembly of NGS data against a reference genome or compare your sequencing against your new construct.

    Coverage Tab: Compare different datasets assembled against the same reference sequence with expression level comparison.

    Translated Multiple Sequence Alignments: Align DNA sequences based on their translations.


    Visualise shared aligned domains: Protein alignments display domains that are conserved between sequences.

    Align proteins against a reference: great for comparing known proteins against an unknown one.

    Auto Annotation of common plasmid features to blank sequences.

    InterProScan: Scan proteins for functional domains against many databases.


    How Do I?: new menu shows common workflows with step by step guides. Every tool has a link to a video tutorial.


    Design Cloning workflows

    As simple as dragging a fragment to a cloning vector.

    Flexible Cloning: Subclone with restriction enzymes, Gibson cloning, Gateway and more.

    Cloning history: Every step is documented.

    Agarose Gel: run out digested sequences. Easily identify site(s) to differentiate successful clones.

    Restriction Enzyme Picker: easy identification of useful enzyme cut sites.


    Gibson/Ligase-Independent Assemblies: Automatic primer design and assembly for Gibson Assembly or Ligase Independent Cloning workflows.

    Primer Design

    Design primers with ease.

    QuickTest Primer changes primer design. Hairpin? Nudge your primer until it goes.

    Add tails to your primers with silent restriction sites/mismatches and view reading frame changes.

    Quickly design pairs of primers: click a region to get the best primer pairs to amplify it.

    Scan For… Missing Primers: utilize the power of the Primer Database to display binding sites from your lab’s primer collection.


    Posted in Releases