MacVectorTip: Identifying, Selecting and Assembling NGS reads with a variant genotype

When analyzing/assembling/aligning NGS data, there are many scenarios where you might want to separate out the reads representing different genotypes or variant sequences. MacVector makes this very easy. Take a reference sequence and choose Analyze | Align to Reference. Now click the Add Seqs button and select and add your NGS data files. NOTE: if your reference represents just a subset of the data in the NGS files, you might want to first filter the data using Align to Folder.

Here we see an Align to Reference window in which about half the reads have obvious SNPs compared to the reference. Note that the Dots toolbar button is toggled on to help emphasize the mismatches.


To select all of the reads that contain the SNP, first select a few residues around that SNP, as shown above. This helps ignore the occasional “bad” sequence, though, for most purposes, you can just select the one residue. Then right-click ([ctrl]-click) and choose Select Overlapping Reads Containing Selected Sequence from the context sensitive menu. This selects every read that aligns at that location with the G at that position. Finally, right-click and choose Select Matching Pairs. Now you have the mate-pairs of the SNP reads selected and you can save all the selected reads using the right-click Export Selected Reads as FastA/Q option.

If your sequence has multiple SNPs/genotypes/repeats, you can always then choose the right-click Delete Selected Reads option to remove those reads and start again on another set.
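For readers who want to see the logic behind this kind of selection outside the GUI, here is a minimal Python sketch. It is not MacVector's implementation: the `Read` record, the `/1` and `/2` mate suffixes and the ungapped alignment are all simplifying assumptions made for illustration.

```python
from typing import NamedTuple

class Read(NamedTuple):
    name: str   # hypothetical naming: mates share a name apart from a /1 or /2 suffix
    start: int  # 0-based reference position of the first aligned base
    seq: str    # aligned bases, assumed ungapped to keep the sketch short

def reads_with_variant(reads, pos, variant):
    """Return reads that cover reference position pos and carry `variant` there."""
    hits = []
    for r in reads:
        offset = pos - r.start
        # Skip reads that do not span the whole selected region.
        if offset < 0 or offset + len(variant) > len(r.seq):
            continue
        if r.seq[offset:offset + len(variant)] == variant:
            hits.append(r)
    return hits

def with_mates(selected, reads):
    """Return the selected reads plus their mates, matched on the shared name stem."""
    wanted = {r.name.rsplit("/", 1)[0] for r in selected}
    return [r for r in reads if r.name.rsplit("/", 1)[0] in wanted]

reads = [
    Read("pair1/1", 0, "ACGTGACC"),
    Read("pair1/2", 40, "TTGACCAA"),
    Read("pair2/1", 2, "GTAACCGT"),
    Read("pair2/2", 50, "GGCATTAC"),
]
snp_reads = reads_with_variant(reads, 4, "G")   # reads with a G at position 4
to_export = with_mates(snp_reads, reads)        # the same reads plus their mates
```

Passing a longer `variant` string mirrors the advice above about selecting a few residues around the SNP, so that an occasional bad read is less likely to match by chance.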

Posted in Tips | Tagged , | Comments closed

MacVectorTip: Trimming trace files by quality

Many of our users are familiar with the ability of Sequencher to semi-automatically trim poor quality sequences from the ends of Sanger ABI reads. Although it is generally not necessary to do this in MacVector because most of the algorithms can automatically handle poor quality data, there are times when it can be beneficial. So MacVector has a Quality Trimming function that removes residues from the ends of Sanger reads that fall below a configurable quality threshold. You can invoke this in either the Align to Reference or Assembly Project windows by clicking on a new Qual Trim toolbar button.


This opens a setup dialog letting you determine how the reads should be trimmed.


The trimmed residues are normally shown greyed out.


However, they can be completely hidden by clicking on the Trimmed toggle toolbar button.
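The idea of end trimming by quality can be sketched in a few lines of Python. This is only an illustration of the general approach, assuming per-residue Phred-style scores and a simple "strip from each end until a good residue is found" rule; the options in MacVector's setup dialog may work differently.

```python
def trim_by_quality(quals, threshold=20):
    """Return (start, end) such that quals[start:end] begins and ends with
    residues at or above the threshold. Interior low scores are left alone."""
    start = 0
    end = len(quals)
    while start < end and quals[start] < threshold:
        start += 1
    while end > start and quals[end - 1] < threshold:
        end -= 1
    return start, end

# Hypothetical read and quality scores, made up for the example.
seq = "NNACGTACGTNN"
quals = [5, 8, 30, 35, 40, 12, 38, 40, 33, 25, 9, 4]
start, end = trim_by_quality(quals, threshold=20)
trimmed = seq[start:end]
```

Returning the coordinates rather than a cut-down string keeps the original data intact, which matches the behaviour described above where trimmed residues are greyed out or hidden rather than deleted.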



MacVectorTip: How to copy a specific short amino acid translation of a sequence

There can be times when you are messing around with open reading frames, inserting residues to change frames to try to get the perfect CDS fusion. The MacVector single sequence Editor will show the translations (click and hold on the “Display” toolbar button), but if you select and copy, only the DNA sequence (with any overlapping features) will be copied to the clipboard. If you need to copy a specific translation of a sequence, here’s how to do it: select the region you are interested in, then invoke Analyze | Translation… Then select the “Display text view with translation” option, set the Number of Frames to 3 or 6 and click OK.


From the result window, you can select the text of the amino acid sequence you are interested in, copy it, and then create a new sequence document (File | New From Clipboard) or paste it into an external application.
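If you just want to check what the three forward frames of a short region look like, the same translation can be sketched in Python. This uses the standard genetic code only and is not tied to MacVector in any way; the example region is made up.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(dna, frame=0):
    """Translate dna starting at offset `frame` (0, 1 or 2).
    Codons containing ambiguous bases are shown as X; stops are shown as *."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(frame, len(dna) - 2, 3)
    )

region = "ATGGCCAAATAG"
frames = {f + 1: translate(region, f) for f in range(3)}
```

For six frames, the same function can be applied to the reverse complement of the region.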

Posted in Uncategorized | Tagged , | Comments closed

Simulating DNA electrophoresis in agarose gels using MacVector’s Agarose Gel tool

MacVector has an Agarose Gel interface which allows you to view photo-realistic recreations of restriction digests of linear and circular DNA molecules. The gels look so realistic that users have had a hard time telling photos of their own digests from the simulation in MacVector. When you first use the new tool and compare it to your real gels, you’ll see what they mean!

The Agarose Gel tool uses published algorithms for determining the migration of DNA fragments. The resulting pattern should be as accurate as can be estimated for DNA molecules, with the exception that there is currently no algorithm that will accurately determine the migration of uncut plasmids through a gel [1, 2].

The tool makes designing digests for checking constructs very easy and quick. For example, to check the orientation of a cloned gene, drag a restriction site in that gene to the gel window to view the correct band pattern. Then for comparison, repeat the ligation in the incorrect orientation and drag the site again. You can add a lane for an empty vector too. You can even print out the gel to take into the darkroom as a guide for cutting out a band.


The default settings for the agarose gel display in MacVector generate a reasonably photorealistic simulation of an agarose gel, although the intensity of small bands has been increased quite significantly so that they show up a little more obviously and crisply than they might in real life (there’s also no smear from the loading buffer in the wells!). You can select different sets of markers, or define your own. As you mouse over different bands, the status bar updates with details of the fragment(s) under the pointer. The percentage of agarose in the gel and the relative run time can be adjusted, as well as the realism of the display. You can optionally view the sizes of the fragments directly on the gel.

The advantages of being able to visualise a gel before you run it are multiple. For example, you can identify which band you need to cut out of a gel before going into the darkroom, or work out which restriction enzymes are ideal for cutting your mini preps to produce different banding patterns between a correctly ligated fragment, a reversed one and an empty vector.

How to simulate an Agarose Gel for your DNA sequences.

For a single digest

  • Open your sequence and switch to the MAP tab.
  • Drag a restriction site from the Map tab and drop on the Gel window

For a double digest

  • Open your sequence and switch to the MAP tab.
  • Select one restriction site, hold down SHIFT and select a second restriction site.
  • Drag the sites from the Map tab and drop on the Gel window.

To change the gel marker

  • Click the ADD MARKER toolbar button.
  • Choose a new DNA marker.

To remove a lane

  • Select the lane.
  • Drag and drop the lane outside of the gel window.
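The fragment sizes behind a single or double digest lane are simple to work out once the cut positions are known. The Python sketch below illustrates the arithmetic only; it does not model migration, and the 5000 bp molecule and cut positions are hypothetical values chosen for the example.

```python
def digest_fragments(length, cuts, circular=True):
    """Return fragment sizes (largest first) for a molecule of `length` bp
    cut at the given positions. For a double digest, pass the combined
    cut positions of both enzymes."""
    cuts = sorted(set(cuts))
    if not cuts:
        return [length]  # uncut molecule: one piece of full length
    if circular:
        # Fragments between neighbouring cuts, plus the one spanning the origin.
        frags = [b - a for a, b in zip(cuts, cuts[1:])]
        frags.append(length - cuts[-1] + cuts[0])
    else:
        edges = [0] + cuts + [length]
        frags = [b - a for a, b in zip(edges, edges[1:])]
    return sorted(frags, reverse=True)

# Hypothetical 5000 bp plasmid cut at two positions.
circular_lane = digest_fragments(5000, [500, 2000], circular=True)
linear_lane = digest_fragments(5000, [500, 2000], circular=False)
```

Note that a circular molecule cut n times gives n fragments, while the same molecule in linear form gives n + 1, which is why the two lanes differ.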

  1. Electrophoresis. 2002 Aug;23(16):2710–9. “DNA electrophoresis in agarose gels: effects of field and gel concentration on the exponential dependence of reciprocal mobility on DNA length” Randolph L Rill, Afshin Beheshti, David H Van Winkle. PMID: 12210176. DOI: 10.1002/1522-2683(200208)23:16<2710::AID-ELPS2710>3.0.CO;2-0
  2. Electrophoresis. 2002 Jan;23(1):15–9. “DNA electrophoresis in agarose gels: a simple relation describing the length dependence of mobility” David H Van Winkle, Afshin Beheshti, Randolph L Rill. PMID: 11824615. DOI: 10.1002/1522-2683(200201)23:1<15::AID-ELPS15>3.0.CO;2-L
Posted in Techniques | Tagged | Comments closed

MacVectorTip: How to find Restriction Enzymes that only cut outside of a specific region

One common cloning-related task is to ask MacVector to find restriction enzyme sites that cut in a molecule, but that do not cut in a specific region. For example, suppose you want to find restriction enzymes that cut pBR322 but that do not cut in the Tetracycline Resistance Gene. To do this, choose the Analyze | Restriction Enzyme... menu item and run a search using All Enzymes with your favorite .renz file (I used New England Biolabs.renz). When the search completes you’ll see a result dialog. Click on the With no cuts in checkbox, then click on the feature selection “note” icon to display a popup menu and select tetracycline resistance gene from the CDS sub-menu.


You can then see in the Restriction Map result tab that none of the displayed enzymes cut in the Tet gene.
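The filter being applied here is easy to express in code. The following Python sketch shows the idea on made-up data; the enzyme names, cut positions and region coordinates are placeholders, not real pBR322 values, and this is not how MacVector stores its results.

```python
def enzymes_cutting_only_outside(cut_sites, region_start, region_end):
    """Keep enzymes that cut the molecule at least once but never inside
    the inclusive range region_start..region_end."""
    keep = {}
    for enzyme, positions in cut_sites.items():
        cuts_inside = any(region_start <= p <= region_end for p in positions)
        if positions and not cuts_inside:
            keep[enzyme] = positions
    return keep

# Hypothetical search results: enzyme name -> cut positions.
cut_sites = {
    "EnzA": [120, 3400],   # one cut falls inside the protected region
    "EnzB": [2500],        # cuts only outside
    "EnzC": [],            # does not cut at all
    "EnzD": [1500, 4100],  # cuts only outside
}
outside_only = enzymes_cutting_only_outside(cut_sites, 86, 1276)
```

Non-cutters are dropped as well, since the task is to find enzymes that do cut the molecule, just not in the chosen feature.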


Posted in Techniques, Tips | Tagged , | Comments closed

MacVectorTip: Simulating mixed plasmid populations in agarose gels

We had a support call this week from somebody who believed, from their agarose gels, that they had a mixed population of plasmids from an experiment and wanted to document and determine the banding pattern using MacVector’s agarose gel simulation. You might come across this type of scenario if you have been making site-specific mutations and introducing new restriction sites into a vector where only some of the resulting plasmids might have acquired the extra site. You can simulate this in MacVector by creating concatenated fake plasmids containing the two variant plasmids.

We used pBR322 and introduced a fifth FspI site by changing the T at 2788 to a C, creating pBR322+. We then joined one copy of each plasmid by selecting the unique EcoRI site in pBR322, selecting Edit | Copy, then selecting the EcoRI site in pBR322+ and choosing Edit | Paste. We saved this molecule, with the two directly repeated plasmids, as “Hybrid-1+1”. Then we added a second copy of pBR322 into one of the EcoRI sites to create a molecule with two copies of pBR322 and one copy of pBR322+ (Hybrid-2+1). Finally we used File | New Agarose Gel to create an empty gel, then selected FspI sites from each of the four molecules and dragged them onto separate lanes in the agarose gel. (Note that by default MacVector only shows a maximum of 8 cuts per enzyme, so we increased this to unlimited in the RE Picker to ensure the FspI sites were displayed.)


In gel tracks 3 and 4 you can see the different banding patterns between pBR322 and pBR322+. Track 5 shows the banding pattern for a 50:50 mix of the two plasmids (note the non-equimolar intensity of the variant bands) and track 6 shows the difference in band intensity when the mutant plasmid represents just one third of the molecules.
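The reason the variant bands look fainter can be shown with a little arithmetic. The Python sketch below assumes that band intensity is roughly proportional to the mass of DNA in the band (fragment length multiplied by the number of copies), which is a common simplification and not necessarily the exact model MacVector uses. The fragment sizes are hypothetical, not the real FspI fragments of pBR322.

```python
from collections import Counter

def band_masses(digests):
    """digests is a list of (molar_ratio, fragment_lengths) pairs, one per
    plasmid in the mix. Returns {fragment length: relative mass in that band}."""
    masses = Counter()
    for ratio, fragments in digests:
        for length in fragments:
            masses[length] += ratio * length
    return dict(masses)

# Hypothetical digests: the extra site in the mutant splits the 3000 bp
# fragment into 1800 bp and 1200 bp; the 1000 bp fragment is shared.
wild_type = [3000, 1000]
mutant = [1800, 1200, 1000]

mix_1_to_1 = band_masses([(1, wild_type), (1, mutant)])
mix_2_to_1 = band_masses([(2, wild_type), (1, mutant)])
```

Dividing each mass by its fragment length gives the molar amount in the band: in the 1:1 mix the shared 1000 bp band holds two molar units, while each variant band holds only one.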


MacVectorTip: Designing Primers for Gibson Assembly

You can use MacVector to design primers for multi-fragment Gibson Assembly, and also generate the predicted recombinant DNA molecule resulting from the assembly. All you have to do to get started is choose the File->New->Gibson/Ligase-independent Assembly… menu item. From there, you can choose the type of assembly (it doesn’t have to be the usual 5’ exonuclease Gibson approach) and follow the instructions. If you are familiar with the New England Biolabs NEBuilder interface then you will love this. The algorithm is very similar, but you can add your annotated MacVector nucleic acid files and the final molecule will retain all of their features and annotations. In addition, the interface lets you view the translations around the junctions so you can be absolutely sure that your primer will create that perfect fusion protein.


(Note the gray CC residues that were inserted to ensure that the adjacent ATG start codon is maintained in-frame)
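To make the idea of an overlap primer with inserted junction residues concrete, here is a deliberately simplified Python sketch. It assumes fixed-length overlap and annealing regions and puts the whole overlap on the forward primer of the downstream fragment; real design tools, including MacVector's, typically choose lengths based on melting temperature, so treat the function, its parameters and the toy sequences as illustrative assumptions only.

```python
def junction_forward_primer(upstream, downstream, overlap=20, anneal=20, spacer=""):
    """Build a forward primer for the downstream fragment whose 5' tail
    matches the 3' end of the upstream fragment. `spacer` holds any extra
    residues inserted at the junction, for example to keep a fusion in frame."""
    tail = upstream[-overlap:]        # homology to the end of the upstream fragment
    binding = downstream[:anneal]     # part that anneals to the downstream template
    return tail + spacer + binding

def spacer_length_for_frame(upstream_coding_length):
    """Number of residues to insert so that the downstream ORF starts in frame
    with an upstream reading frame of the given length."""
    return (3 - upstream_coding_length % 3) % 3

# Toy sequences, far shorter than real fragments.
upstream = "AAAACCCCGGGG"
downstream = "ATGTTTTTT"
primer = junction_forward_primer(upstream, downstream, overlap=4, anneal=6, spacer="CC")
```

In this sketch, a two-residue spacer such as CC would be needed when the upstream reading frame ends one residue past a codon boundary, which is what `spacer_length_for_frame` computes.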

We do not currently have a dedicated tutorial for Gibson/Ligase-independent Assembly, but there is a useful section with examples you can follow in the Whats New in MacVector Workshop manual.

And don’t forget that if you prefer Ligase Independent Cloning or other similar techniques, the tool will do those too.


MacVectorTip: Sign up for an NCBI API key to speed up BLAST results

MacVector has a very cool BLAST Map results tab that displays the annotations surrounding any hits from the selected database. Particularly if you are working with prokaryotic sequences, where most BLAST hits these days are to genome sequences, this can really help you see the context of your hits and lets you download just the relevant region for additional analysis.


A few years ago the NCBI introduced a “throttling” procedure such that unregistered users are limited to three requests per second per IP address. While most users may not see a major difference with this, if you are sharing an institutional IP address with other MacVector users, you may find that the BLAST Map tab is slow to populate, and other NCBI-related functions may also be slow. You can get around this by registering for your own personal NCBI API key. Once you have a key, you can register it in MacVector using the MacVector | Preferences | Internet tab and pasting your API key into the Entrez Server edit box.


Once activated, your key will enable MacVector to submit up to 10 requests per second irrespective of other users at your institution, which will dramatically speed up population of the BLAST Map tab and other NCBI services, particularly in shared environments.
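If you also script your own NCBI lookups, the same key can be used there. The Python sketch below only builds an E-utilities request URL and works out the minimum spacing between requests; it makes no network call, the accession and key are placeholders, and it says nothing about how MacVector itself talks to the NCBI.

```python
from urllib.parse import urlencode

EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"

def efetch_url(db, uid, api_key=None, rettype="gb", retmode="text"):
    """Build an efetch URL, adding the api_key parameter when a key is given."""
    params = {"db": db, "id": uid, "rettype": rettype, "retmode": retmode}
    if api_key:
        params["api_key"] = api_key
    return EUTILS_BASE + "efetch.fcgi?" + urlencode(params)

def min_interval_seconds(api_key=None):
    """Smallest delay between requests that stays within the limits quoted
    above: 3 requests per second without a key, 10 per second with one."""
    return 1.0 / (10 if api_key else 3)

# "YOUR_KEY" is a placeholder; the accession is just an example identifier.
url = efetch_url("nuccore", "J01749", api_key="YOUR_KEY")
delay = min_interval_seconds("YOUR_KEY")
```

A script would sleep for at least `delay` seconds between successive requests to stay under the limit.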


MacVectorTip: displaying CRISPR PAM Sites on a sequence

CRISPR-Cas9 genetic editing mechanisms require a short (typically 20nt) RNA sequence complementary to a target site next to a Protospacer Adjacent Motif (PAM) sequence. The most commonly used Cas9 nuclease (SpCas9) from Streptococcus pyogenes recognizes the PAM sequence NGG. A function introduced a few years ago will search for PAM sequences in your sequence. There is a simple setup dialog that lets you control basic search characteristics, including choosing the specific type of nuclease to be used (we currently include 32 different characterized motifs in the default PAM file).


You can also search using all of the motifs in the source file, or just those selected. Potential target sites are shown graphically – here zoomed in to display the actual crRNA (lower case) and PAM sequence (upper case) of the potential target sites.
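As a rough illustration of what such a search involves for the NGG motif, here is a small Python sketch that scans both strands for a 20 nt target followed by NGG and reports it in the same lower case / upper case style. It handles only unambiguous A, C, G, T input and a single motif, and it is not MacVector's algorithm; the test sequence is made up.

```python
import re

def revcomp(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_ngg_targets(seq, guide_len=20):
    """Return (strand, start, site) tuples for every guide_len-nt target
    followed directly by an NGG PAM. Minus-strand starts are offsets into
    the reverse complement, not into the original sequence."""
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    hits = []
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        for m in pattern.finditer(s):
            hits.append((strand, m.start(), m.group(1).lower() + m.group(2)))
    return hits

# Made-up test sequence containing one target on the plus strand.
seq = "TT" + "ACGTACGTACGTACGTACGT" + "TGG" + "AA"
hits = find_ngg_targets(seq)
```

Other nucleases place the PAM on the other side of the target or use different motifs, so the pattern would need to change accordingly.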


A tabular output lets you view more details about the hits.


You can also configure MacVector to automatically scan all opened nucleic acid sequence documents for PAM sites using the MacVector | Preferences -> Scan DNA preference pane.


MacVectorTip: visualizing shared domains in a protein alignment

MacVector has a domain-outlining facility for multiple sequence alignments, letting you easily visualize the relationships between features in aligned protein sequences.

MacVector’s new multiple alignment file format retains the features/annotations from the sequences that are used to create the alignment. The colors of features from the individual sequence documents are used to outline the domains in the alignment. You can also create new domains and dynamically show/hide features in alignments.

Note that alignments created using versions of MacVector before 17.5 will not have this information and will need to be recreated.

Displaying shared domains:

  • Ensure your protein sequences are annotated. Tip: you can use DATABASE | INTERPROSCAN to quickly scan your proteins and annotate their domains.
  • Ensure the domains/features you are interested in are visible and set the Fill color to the color you would like to see in the alignment.
  • You can also control the visibility of domains/features using the floating feature palette seen in the EDITOR tab.
  • Use FILE | OPEN and select multiple protein sequences.
  • Click OPTIONS (bottom left hand corner) and choose OPEN MULTIPLE SEQUENCE FILE – AS MULTIPLE ALIGNMENT
  • Click OPEN
  • Now run the alignment by clicking ALIGN
  • In the EDITOR tab, use the MODE toolbar button to set the feature display to SHOW FEATURES.

In the Editor tab a new line will appear above each sequence displaying the extent and color of visible features.


When you switch to the Picture tab, you will see colored outlines around the shared domains.
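Outlining a domain in an alignment means converting the feature's coordinates in the ungapped sequence into alignment columns. The Python sketch below shows that conversion for one sequence; the gap character, the 1-based inclusive coordinates and the toy alignment row are assumptions for the example, and MacVector's internal handling may differ.

```python
def feature_to_columns(aligned_seq, start, end, gap="-"):
    """Map a feature spanning residues start..end (1-based, inclusive, counted
    on the ungapped sequence) to the alignment columns it occupies (1-based)."""
    residue = 0
    first = last = None
    for col, ch in enumerate(aligned_seq, start=1):
        if ch == gap:
            continue  # gaps take up a column but are not residues
        residue += 1
        if residue == start:
            first = col
        if residue == end:
            last = col
            break
    if first is None or last is None:
        raise ValueError("feature lies outside the sequence")
    return first, last

# Toy alignment row with gaps introduced by the alignment.
row = "MK--TAYIA-KQR"
domain_cols = feature_to_columns(row, 3, 6)     # residues T, A, Y, I
spanning_cols = feature_to_columns(row, 7, 8)   # residues A and K, across a gap
```

Because each sequence has its own gaps, the same domain can land on slightly different column ranges in different rows, which is why the outlines follow the individual sequences.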


Posted in Tips | Tagged , , | Comments closed