MacVectorTip: Simulating mixed plasmid populations in agarose gels

We had a support call this week from somebody who believed, based on their agarose gels, that an experiment had produced a mixed population of plasmids, and who wanted to document and predict the banding pattern using MacVector’s agarose gel simulation. You might come across this type of scenario if you have been making site-specific mutations that introduce new restriction sites into a vector, where only some of the resulting plasmids have acquired the extra site. You can simulate this in MacVector by creating concatenated fake plasmids containing the two variant plasmids.

We used pBR322 and introduced a fifth FspI site by changing the T at 2788 to a C, creating pBR322+. We then joined one copy of each plasmid by selecting the unique EcoRI site in pBR322, selecting Edit | Copy, then selecting the EcoRI site in pBR322+ and choosing Edit | Paste. We saved this molecule, which contains the two plasmids as direct repeats, as “Hybrid–1+1”. Then we added a second copy of pBR322 into one of the EcoRI sites to create a molecule with two copies of pBR322 and one copy of pBR322+ (“Hybrid–2+1”). Finally we used File | New Agarose Gel to create an empty gel, then selected the FspI sites from each of the four molecules and dragged them onto separate lanes in the agarose gel. (Note that by default MacVector only shows a maximum of 8 cuts per enzyme, so we increased this to unlimited in the RE Picker to ensure that all of the FspI sites were displayed.)

In gel tracks 3 and 4 you can see the different banding patterns between pBR322 and pBR322+. Track 5 shows the banding pattern for a 50:50 mix of the two plasmids (note the non-equimolar intensity of the variant bands) and track 6 shows the difference in band intensity when the mutant plasmid represents just one third of the molecules.
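
To see why the concatenation trick works, here is a minimal Python sketch of the same idea. It is not how MacVector computes its gels. The cut positions are placeholders chosen for illustration (they are not the real FspI map of pBR322), the plasmids are assumed to be joined at position 0, and band intensity is assumed to be proportional to DNA mass (fragment size times number of copies).

```python
from collections import Counter

PLASMID_LENGTH = 4361  # length of pBR322 in bp

# Placeholder cut positions for illustration only; NOT the real FspI map.
WT_SITES = [300, 1400, 1500, 3600]
MUTANT_SITES = sorted(WT_SITES + [2788])  # extra site from the point mutation


def digest_circular(length, sites):
    """Fragment sizes produced by cutting a circular molecule at every site."""
    cuts = sorted(sites)
    if not cuts:
        return [length]  # uncut circle
    return [
        (cuts[(i + 1) % len(cuts)] - cuts[i]) % length or length
        for i in range(len(cuts))
    ]


def concatenate(site_lists, unit_length=PLASMID_LENGTH):
    """Join plasmids head to tail into one circle; return (length, sites)."""
    sites = [
        offset * unit_length + site
        for offset, unit in enumerate(site_lists)
        for site in unit
    ]
    return unit_length * len(site_lists), sites


def band_masses(fragments):
    """Relative DNA mass per band (assumed: size times number of copies)."""
    counts = Counter(fragments)
    return {size: size * counts[size] for size in sorted(counts, reverse=True)}


wt = digest_circular(PLASMID_LENGTH, WT_SITES)
mutant = digest_circular(PLASMID_LENGTH, MUTANT_SITES)
hybrid_1_1 = digest_circular(*concatenate([WT_SITES, MUTANT_SITES]))
hybrid_2_1 = digest_circular(*concatenate([WT_SITES, WT_SITES, MUTANT_SITES]))

for name, fragments in [("Hybrid 1+1", hybrid_1_1), ("Hybrid 2+1", hybrid_2_1)]:
    print(name, band_masses(fragments))
```

Digesting the concatenated circle gives exactly the fragments you would get from digesting the individual plasmids and pooling them in the same ratio, which is why the variant bands appear fainter than the bands shared by both plasmids.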

Posted in Tips | Tagged , | Comments closed

MacVectorTip: Designing Primers for Gibson Assembly

You can use MacVector to design primers for multi-fragment Gibson Assembly, and also generate the predicted recombinant DNA molecule resulting from the assembly. All you have to do to get started is choose the File->New->Gibson/Ligase-independent Assembly… menu item. From there, you can choose the type of assembly (it doesn’t have to be the usual 5’ exonuclease Gibson approach) and follow the instructions. If you are familiar with the New England Biolabs NEBuilder interface then you will love this. The algorithm is very similar, but you can add your annotated MacVector nucleic acid files and the final molecule will retain all of their features and annotations. In addition, the interface lets you view the translations around the junctions so you can be absolutely sure that your primer will create that perfect fusion protein.

(Note the gray CC residues that were inserted to ensure that the adjacent ATG start codon is maintained in-frame)

We do not currently have a dedicated tutorial for Gibson/Ligase-independent Assembly, but there is a useful section with examples you can follow in the Whats New in MacVector Workshop manual.

And don’t forget that if you prefer Ligase Independent Cloning or other similar techniques, the tool handles those too.
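
If you are curious what a junction primer looks like, here is a minimal Python sketch of the general idea. It is not MacVector's or NEBuilder's algorithm: it uses fixed overlap and annealing lengths rather than melting-temperature targets, and the function names and default lengths are our own assumptions for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def junction_primers(upstream, downstream, overlap=20, anneal=20):
    """Primers for one junction where `upstream` is joined to `downstream`.

    Returns (forward primer for `downstream`, reverse primer for `upstream`).
    Each primer is a 5' tail homologous to the neighbouring fragment followed
    by a 3' region that anneals to the primer's own template.
    """
    upstream, downstream = upstream.upper(), downstream.upper()
    forward = upstream[-overlap:] + downstream[:anneal]
    reverse = reverse_complement(upstream[-anneal:] + downstream[:overlap])
    return forward, reverse


# Toy fragments, shortened so the result is easy to check by eye.
fwd, rev = junction_primers("ATGCATTACG", "TTGACCGGAA", overlap=4, anneal=5)
print(fwd)
print(rev)
```

In practice you would often put the whole overlap on just one of the two primers, or split it between them, and you would check the reading frame across the junction when building a fusion protein.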

MacVectorTip: Sign up for an NCBI API key to speed up BLAST results

MacVector has a very cool BLAST Map results tab that displays the annotations surrounding any hits from the selected database. This is particularly helpful if you are working with prokaryotic sequences, where most BLAST hits these days are to genome sequences: it lets you see the context of your hits and download just the relevant region for additional analysis.

A few years ago the NCBI introduced a “throttling” procedure such that unregistered users are limited to three requests per second per IP address. While most users may not see a major difference with this, if you are sharing an institutional IP address with other MacVector users, you may find that the BLAST Map tab is slow to populate, and other NCBI-related functions may also be slow. You can get around this by registering for your own personal NCBI API key. Once you have a key, you can register it in MacVector by opening the MacVector | Preferences | Internet tab and pasting your API key into the Entrez Server edit box.

Once activated, your key will enable MacVector to submit up to 10 requests per second irrespective of other users at your institution, which will dramatically speed up population of the BLAST Map tab and other NCBI services, particularly in shared environments.
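
For anyone scripting against the NCBI directly, the same limits apply. The Python sketch below shows how the key is passed as the `api_key` parameter of an E-utilities request and what the two limits mean for request spacing. It only builds a URL and makes no network call; the helper names and the placeholder key are ours, and this is not a description of MacVector's internal code.

```python
from urllib.parse import urlencode

EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def requests_per_second(api_key=None):
    """NCBI limit: 3 requests/second per IP without a key, 10 with one."""
    return 10 if api_key else 3


def min_interval(api_key=None):
    """Minimum spacing in seconds between requests to stay under the limit."""
    return 1.0 / requests_per_second(api_key)


def efetch_url(accession, api_key=None):
    """Build an EFetch URL for a GenBank flat file, adding the key if present."""
    params = {"db": "nuccore", "id": accession, "rettype": "gb", "retmode": "text"}
    if api_key:
        params["api_key"] = api_key
    return f"{EFETCH_URL}?{urlencode(params)}"


# "YOUR_API_KEY" is a placeholder; J01749 is used here as an example accession.
print(efetch_url("J01749", api_key="YOUR_API_KEY"))
print(min_interval(), min_interval("YOUR_API_KEY"))
```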

MacVectorTip: displaying CRISPR PAM Sites on a sequence

CRISPR-Cas9 genetic editing mechanisms require a short (typically 20 nt) RNA sequence complementary to a target site next to a Protospacer Adjacent Motif (PAM) sequence. The most commonly used Cas9 nuclease (SpCas9) from Streptococcus pyogenes recognizes the PAM sequence NGG. A function introduced a few years ago searches for PAM sequences in your sequence. There is a simple setup dialog that lets you control basic search characteristics, including choosing the specific type of nuclease to be used (we currently include 32 different characterized motifs in the default PAM file).
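
To make the search concrete, here is a minimal Python sketch that finds SpCas9-style sites (20 nt followed by NGG) on both strands. It is an illustration only, not MacVector's implementation: it handles a single hard-coded motif, ignores ambiguity codes in the target, and the function name and output layout are our own choices.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")
# The lookahead lets overlapping sites be reported: 20 nt protospacer, then NGG.
SITE = re.compile(r"(?=([ACGT]{20})([ACGT]GG))")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_spcas9_sites(seq):
    """Return (strand, start, protospacer, pam) tuples.

    `start` is 0-based on the forward strand and marks the leftmost base of
    the 23 nt site; protospacer and PAM are read 5'->3' on the hit strand.
    """
    seq = seq.upper()
    hits = [("+", m.start(), m.group(1), m.group(2)) for m in SITE.finditer(seq)]
    for m in SITE.finditer(reverse_complement(seq)):
        hits.append(("-", len(seq) - (m.start() + 23), m.group(1), m.group(2)))
    return hits


example = "ACGT" * 5 + "AGG" + "TTTT"
for hit in find_spcas9_sites(example):
    print(hit)
```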

You can also search using all of the motifs in the source file, or just those selected. Potential target sites are shown graphically – here zoomed in to display the actual crRNA (lower case) and PAM sequence (upper case) of the potential target sites.

A tabular output lets you view more details about the hits.

You can also configure MacVector to automatically scan all opened nucleic acid sequence documents for PAM sites using the MacVector | Preferences | Scan DNA preference pane.

MacVectorTip: visualizing shared domains in a protein alignment

MacVector has a domain-outlining facility for multiple sequence alignments, letting you easily visualize the relationships between features in aligned protein sequences.
MacVector’s new multiple alignment file format retains the features/annotations from the sequences that are used to create the alignment. The colors of features from the individual sequence documents are used to outline the domains in the alignment. You can also create new domains and dynamically show/hide features in alignments.

Note that alignments created using versions of MacVector before 17.5 will not have this information and will need to be recreated.

Displaying shared domains:

  • Ensure your protein sequences are annotated. Tip: you can use DATABASE | INTERPROSCAN to quickly scan your proteins and annotate their domains.
  • Ensure the domains/features you are interested in are visible and set the Fill color to the color you would like to see in the alignment.
  • You can also control the visibility of domains/features using the floating feature palette seen in the EDITOR tab.
  • Use FILE | OPEN and select multiple protein sequences.
  • Click OPTIONS (bottom left hand corner) and choose OPEN MULTIPLE SEQUENCE FILE – AS MULTIPLE ALIGNMENT
  • Click OPEN
  • Now run the alignment by clicking ALIGN
  • In the EDITOR tab, use the MODE toolbar button to set the feature display to SHOW FEATURES

In the Editor tab a new line will appear above each sequence displaying the extent and color of visible features.

When you switch to the Picture tab, you will see colored outlines around the shared domains.

MacVectorTip: correctly flagging PacBio and Oxford Nanopore datasets for assembly by Flye

MacVector 17.5 introduced Flye for assembly of PacBio and Oxford Nanopore reads.

Flye joins Phrap, Velvet and SPAdes for de novo sequence assembly, alongside Bowtie2 and Align To Reference for reference assembly.

Flye is an assembler algorithm tuned to assemble low quality long reads such as those produced by the new generation of single molecule sequencers. With typical bacterial genome assemblies it is fairly common to be able to assemble reads into a single full-length genome contig.

Because these longer reads tend to be very error prone, MacVector also includes an optional polishing step using Racon. Polishing is the technique of correcting sequencing errors by aligning reads against the contigs produced by the first run of the assembler. Additional rounds of polishing can further increase the accuracy of the resulting consensus.

It is important to tell MacVector what type of reads you are assembling before running Flye. Flye will be disabled unless your read files include at least one PacBio or Nanopore dataset.

You can set the read type by double-clicking on the Status item after importing the reads via the Add Reads toolbar button.

MacVectorTip: quality score visualization in sequence assemblies.

Quality scoring of Assemblies and Align to Reference alignments can be visualized directly on the sequence. Residues can be shaded according to their quality scores. These can be displayed anywhere quality values are available, including de novo and reference assemblies in Assembler and Align to Reference alignments.

A Shading toolbar button lets you turn on coloring based on the quality value assigned to each residue.

The intensity of the colors indicates the phred-based quality value of each residue.

  • For individual reads, this ranges from 0 (deep red) through 20 (white) to 40 or above (deep green). The consensus scale is doubled and ranges from 0 (deep red) through 40 (white) to 80 or above (deep green).
  • Gaps are always shown with a white background. As with earlier versions of MacVector, you can “mouse-over” a residue to view the numerical information in a tooltip.
  • Edited residues are always given a phred quality value of 99 and these residues are given a blue background.
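
As a reminder of what the numbers mean, a phred score Q corresponds to an error probability of 10^(-Q/10). The Python sketch below converts scores to probabilities and mimics the red-white-green scale described above. The exact RGB values and the linear interpolation are our assumptions for illustration, not MacVector's actual palette, and gaps and edited residues are not modeled.

```python
def error_probability(q):
    """Probability that a base call is wrong for phred quality q."""
    return 10 ** (-q / 10)


def shade(q, midpoint=20):
    """Map a phred score to an (r, g, b) background color.

    Assumed scale: 0 is red, `midpoint` is white, 2 * midpoint or above is
    green, with linear interpolation in between. Use midpoint=20 for reads
    and midpoint=40 for the doubled consensus scale.
    """
    q = max(0, min(q, 2 * midpoint))  # clamp to the displayed range
    if q <= midpoint:
        t = q / midpoint  # 0 -> red, 1 -> white
        return (255, round(255 * t), round(255 * t))
    t = (q - midpoint) / midpoint  # 0 -> white, 1 -> green
    return (round(255 * (1 - t)), 255, round(255 * (1 - t)))


for q in (0, 10, 20, 30, 40):
    print(q, error_probability(q), shade(q))
```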

MacVectorTip: Assembling Fungal Genomes using SPAdes

MacVector with Assembler can assemble bacterial genomes in just minutes on quite modest hardware. Currently MacVector has four de novo assembly tools (SPAdes, Velvet, Flye and Phrap).

But what of larger genomes? It is currently impractical to run de novo assemblies of human genomes on a low cost Mac, though RNA-Seq analyses against the human transcriptome are possible. However, here we performed some complete genome de novo assembly tests on a sample NGS experiment from Aspergillus fischeri (NCBI SRA SRR10092049). This data set consists of 2 x 18.2 million paired-end 150 nt Illumina HiSeq reads, representing 5.5 Gbp of sequence data. The assembly was run on a 2.9 GHz 6-core i9 MacBook Pro with 32 GB RAM, using SPAdes with 11 threads and slightly modified k-mer values.

The complete assembly, including the optional Bowtie reference assembly step, took just over 28 hours in total. Maximum RAM usage during assembly was about 24 GB. 1,314 contigs were generated with a total combined length of 31.38 Mbp, right in line with the reported genome size. The longest contig was 1,132,146 bp in length and the N50 was 350,491 bp.
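
For reference, the arithmetic behind these figures can be sketched in a few lines of Python. The read counts are the rounded values quoted above, so the results are approximate, and the `n50` helper uses the common definition (the length of the contig at which the running total of the longest contigs first reaches half of the assembly); this is our sketch, not necessarily how SPAdes or MacVector reports it.

```python
def n50(lengths):
    """Length of the contig at which the cumulative size of the longest
    contigs first reaches at least half of the total assembly size."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contigs supplied")


reads = 2 * 18_200_000          # paired-end reads (rounded figure from the text)
read_length = 150               # nt
total_bases = reads * read_length
assembly_size = 31_380_000      # bp, combined contig length
coverage = total_bases / assembly_size

print(f"{total_bases / 1e9:.2f} Gbp of reads, roughly {coverage:.0f}x coverage")
print(n50([10, 8, 5, 3, 2]))    # toy contig lengths, not the real assembly
```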

The information presented here gives you some idea of what is achievable on a modest machine using MacVector with Assembler. Assembly would be expected to be faster with a more restricted set of k-mer values, or on higher end machines such as the Mac Pro or Mac Studio, or on machines with much more RAM.

MacVectorTip: Viewing external database entries for features in a sequence.

Sequences, or regions of sequences, can be linked to external databases. The link can apply to an entire sequence entry, or to individual features, for example when annotation tools such as InterProScan are used to annotate proteins with domain or motif information. These links are very useful when you want to view more detailed or updated information. Within the GenBank specification, which MacVector uses extensively, an external database entry can be stored in a /DB_XREF qualifier, which allows the database entry to be viewed easily. The GenBank (and GenPept) specifications allow many different databases to be referenced using this qualifier.

In MacVector the original database entry can easily be viewed in a web browser by selecting, then right clicking (or holding down CTRL and left clicking) the feature entry in the Features tab and viewing the available DB_XREF entries. Selecting one will load it in your web browser.
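
If you ever need to pull these cross-references out of a flat file yourself, the qualifier value has the form "database:identifier". Here is a minimal Python sketch that extracts the pairs with a regular expression. The feature text and identifiers are made-up placeholders, and this is not how MacVector parses its files.

```python
import re

# Matches /db_xref="Database:Identifier" (qualifier name in either case).
DB_XREF = re.compile(r'/db_xref="([^":]+):([^"]+)"', re.IGNORECASE)


def parse_db_xrefs(feature_text):
    """Return a list of (database, identifier) pairs found in the text."""
    return DB_XREF.findall(feature_text)


# Illustrative feature block; the identifiers are placeholders, not real entries.
feature = '''
     Region          10..120
                     /region_name="Example domain"
                     /db_xref="InterPro:IPR000000"
                     /db_xref="CDD:000000"
'''

for database, identifier in parse_db_xrefs(feature):
    print(database, identifier)
```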

MacVectorTip: How to Customize Window Button Toolbars

Like many Mac applications, MacVector takes full advantage of the built-in ability to add, delete and rearrange the action buttons on window toolbars. To make these changes, right-click (or [ctrl]-click) in the gray space on any toolbar and a context-sensitive menu will appear. Choose Customize Toolbar and a dialog will be displayed with all of the buttons available for that tab, like this one for the Editor tab of the DNA Sequence Window.

Note that modifying the toolbar is a global change that affects all windows containing that tab. It is also specific to each document type, so you can have different sets of buttons on the Editor toolbar of the DNA, Protein, Trace/Chromatogram and MSA document windows, for example. Once modified, the changes persist until you either customize the toolbar again or reset your MacVector Preferences.
