MacVectorTip: Designing Primers for Gibson Assembly

You can use MacVector to design primers for multi-fragment Gibson Assembly, and also generate the predicted recombinant DNA molecule resulting from the assembly. All you have to do to get started is choose the File->New->Gibson/Ligase-independent Assembly… menu item. From there, you can choose the type of assembly (it doesn’t have to be the usual 5’ exonuclease Gibson approach) and follow the instructions. If you are familiar with the New England Biolabs NEBuilder interface then you will love this. The algorithm is very similar, but you can add your annotated MacVector nucleic acid files and the final molecule will retain all of their features and annotations. In addition, the interface lets you view the translations around the junctions so you can be absolutely sure that your primer will create that perfect fusion protein.


(Note the gray CC residues that were inserted to ensure that the adjacent ATG start codon is maintained in-frame)

We do not currently have a dedicated tutorial for Gibson/Ligase-independent Assembly, but there is a useful section with examples you can follow in the Whats New in MacVector Workshop manual.

And don’t forget: if you prefer Ligase Independent Cloning or similar techniques, the tool handles those too.
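To illustrate the general idea behind overlap primers, here is a minimal Python sketch. It is not MacVector's algorithm: the function names, the 20 nt overlap and the 20 nt annealing length are assumptions chosen for illustration, and no melting-temperature or junction-frame checks are performed.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, preserving case."""
    table = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(table)[::-1]


def gibson_primers(upstream: str, insert: str, downstream: str,
                   overlap: int = 20, anneal: int = 20) -> tuple[str, str]:
    """Build a primer pair that amplifies `insert` with overlap tails.

    Hypothetical helper: the 5' tails (lower case) copy the ends of the
    neighbouring fragments; the 3' ends (upper case) anneal to the insert.
    """
    # Forward primer: end of the upstream fragment + start of the insert.
    forward = upstream[-overlap:].lower() + insert[:anneal].upper()
    # Reverse primer: reverse complement of (end of insert + start of downstream).
    reverse = reverse_complement(insert[-anneal:].upper() + downstream[:overlap].lower())
    return forward, reverse


if __name__ == "__main__":
    fwd, rev = gibson_primers("AAAACCCC", "GGGGATATCTTT", "CACAGGGG", overlap=4, anneal=4)
    print(fwd, rev)
```

Writing the tails in lower case makes it easy to see which part of each primer anneals to the template and which part only supplies the overlap.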

Posted in Tips | Tagged , | Leave a comment

MacVectorTip: Sign up for an NCBI API key to speed up BLAST results

MacVector has a very cool BLAST Map results tab that displays the annotations surrounding any hits from the selected database. This is particularly useful if you are working with prokaryotic sequences, where most BLAST hits these days are to genome sequences: it helps you see the context of your hits and lets you download just the relevant region for additional analysis.


A few years ago the NCBI introduced a “throttling” procedure that limits unregistered users to three requests per second per IP address. Most users will not notice a major difference, but if you share an institutional IP address with other MacVector users, you may find that the BLAST Map tab is slow to populate and that other NCBI-related functions are also slow. You can get around this by registering for your own personal NCBI API key. Once you have a key, register it in MacVector by opening the MacVector | Preferences | Internet tab and pasting the key into the Entrez Server edit box.


Once activated, your key enables MacVector to submit up to 10 requests per second irrespective of other users at your institution, which can dramatically speed up population of the BLAST Map tab and other NCBI services, particularly in shared environments.


MacVectorTip: displaying CRISPR PAM Sites on a sequence

CRISPR-Cas9 gene editing requires a short (typically 20 nt) RNA sequence complementary to a target site that lies next to a Protospacer Adjacent Motif (PAM) sequence. The most commonly used Cas9 nuclease (SpCas9), from Streptococcus pyogenes, recognizes the PAM sequence NGG. A function introduced a few years ago searches your sequence for PAM sequences. A simple setup dialog lets you control basic search characteristics, including the specific type of nuclease to be used (the default PAM file currently includes 32 different characterized motifs).


You can also search using all of the motifs in the source file, or just those selected. Potential target sites are shown graphically; when zoomed in, the display shows the actual crRNA (lower case) and PAM sequence (upper case) of each potential target site.
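As a rough illustration of what an NGG search involves, the following Python sketch scans the forward strand only and reports each 20 nt target followed by its PAM, using the same lower case/upper case convention. It is a simplified stand-in, not MacVector's implementation: the function name is hypothetical, and the reverse strand, other nucleases and ambiguous bases are ignored.

```python
import re


def find_ngg_sites(seq: str, spacer_len: int = 20) -> list[tuple[int, str]]:
    """Return (start, site) pairs for forward-strand SpCas9 targets.

    `start` is the 0-based position of the target; `site` is the target
    in lower case followed by the NGG PAM in upper case.
    """
    seq = seq.upper()
    sites = []
    # A lookahead lets overlapping PAMs (e.g. in "GGGG") all be reported.
    for match in re.finditer(r"(?=([ACGT]GG))", seq):
        pam_start = match.start()
        if pam_start < spacer_len:
            continue  # not enough sequence upstream for a full target
        target = seq[pam_start - spacer_len:pam_start]
        sites.append((pam_start - spacer_len, target.lower() + match.group(1)))
    return sites


if __name__ == "__main__":
    for start, site in find_ngg_sites("A" * 20 + "TGGC"):
        print(start, site)
```

Searching the reverse strand would amount to running the same function on the reverse complement and converting the coordinates back.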


A tabular output lets you view more details about the hits.


You can also configure MacVector to automatically scan all opened nucleic acid sequence documents for PAM sites using the MacVector | Preferences | Scan DNA preference pane.


MacVectorTip: visualizing shared domains in a protein alignment

MacVector has a domain-outlining facility for multiple sequence alignments, letting you easily visualize the relationships between features in aligned protein sequences.
MacVector’s new multiple alignment file format retains the features/annotations from the sequences that are used to create the alignment. The colors of features from the individual sequence documents are used to outline the domains in the alignment. You can also create new domains and dynamically show/hide features in alignments.

Note that alignments created using versions of MacVector before 17.5 will not have this information and will need to be recreated.

Displaying shared domains:

  • Ensure your protein sequences are annotated and that the domains of interest are visible (you can use DATABASE | INTERPROSCAN to quickly scan and annotate domains to your proteins).
  • Use FILE | OPEN and select multiple protein sequences.
  • Click OPTIONS (bottom left-hand corner) and choose OPEN MULTIPLE SEQUENCE FILE – AS MULTIPLE ALIGNMENT
  • Click OPEN
  • Now run the alignment by clicking ALIGN
  • In the EDITOR tab, use the toolbar button to set the feature display MODE to SHOW FEATURES
  • A new line will appear above each sequence in the Editor tab, displaying the extent and color of visible features.


    When you switch to the Picture tab, you will see colored outlines around the shared domains.


    MacVectorTip: correctly flagging PacBio and Oxford Nanopore datasets for assembly by Flye

    MacVector 17.5 introduced Flye for assembly of PacBio and Oxford Nanopore reads.

    Flye joins Phrap, Velvet and SPAdes for de novo sequence assembly, alongside Bowtie2 and Align To Reference for reference assembly.

    Flye is an assembler algorithm tuned to assemble low quality long reads such as those produced by the new generation of single molecule sequencers. With typical bacterial genome assemblies it is fairly common to be able to assemble reads into a single full-length genome contig.

    Because these longer reads tend to be very error prone, MacVector also includes an optional polishing step using Racon. Polishing is the technique of correcting sequencing errors by aligning reads against the contigs produced by the first run of the assembler. Additional rounds of polishing can further increase the accuracy of the resulting consensus.

    It is important to tell MacVector what type of reads you are assembling before running Flye. Flye will be disabled unless your read files include at least one PacBio or Nanopore dataset.


    You can set the read type by double-clicking on the Status item after importing the reads via the Add Reads toolbar button.


    MacVectorTip: quality score visualization in sequence assemblies.

    Quality scores for Assemblies and Align to Reference alignments can be visualized directly on the sequence by shading residues according to their quality values. Shading is available anywhere quality values exist, including de novo and reference assemblies in Assembler and Align to Reference alignments.

    A Shading toolbar button lets you turn on coloring based on the quality value assigned to each residue.


    The intensity of the colors indicates the phred-based quality value of each residue.

  • For individual reads, this ranges from 0 (deep red) through 20 (white) to 40 or above (deep green). The consensus scale is doubled and ranges from 0 (deep red) through 40 (white) to 80 or above (deep green).
  • Gaps are always shown with a white background. As with earlier versions of MacVector, you can “mouse-over” a residue to view the numerical information in a tooltip.
  • Edited residues are always given a phred quality value of 99 and these residues are given a blue background.
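The color scale described above can be sketched as a simple interpolation. In this Python example the breakpoints (0, 20, 40 for reads; 0, 40, 80 for the consensus) come from the text, while the exact RGB values, the linear interpolation and the function names are assumptions made for illustration. The standard phred relationship between a quality value Q and the error probability, P = 10^(-Q/10), is included for reference.

```python
WHITE = (255, 255, 255)
DEEP_RED = (200, 0, 0)     # assumed RGB value
DEEP_GREEN = (0, 150, 0)   # assumed RGB value
BLUE = (80, 120, 255)      # assumed RGB value for edited residues


def error_probability(q: float) -> float:
    """Standard phred definition: probability that the base call is wrong."""
    return 10 ** (-q / 10)


def _blend(start, end, t):
    """Linear interpolation between two RGB colors, t in [0, 1]."""
    return tuple(round(a + (b - a) * t) for a, b in zip(start, end))


def quality_color(q: float, consensus: bool = False,
                  edited: bool = False, gap: bool = False):
    """Map a phred quality value to a background color."""
    if gap:
        return WHITE
    if edited:
        return BLUE
    mid = 40 if consensus else 20   # the consensus scale is doubled
    top = 2 * mid
    q = max(0, min(q, top))         # values above the top saturate at deep green
    if q <= mid:
        return _blend(DEEP_RED, WHITE, q / mid)
    return _blend(WHITE, DEEP_GREEN, (q - mid) / (top - mid))


if __name__ == "__main__":
    for q in (0, 10, 20, 30, 40):
        print(q, quality_color(q), error_probability(q))
```

With this mapping, a read residue at Q20 (a 1 in 100 error probability) is white, while the same value in the consensus is still on the red side of the scale.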


    MacVectorTip: Assembling Fungal Genomes using SPAdes

    MacVector with Assembler can assemble bacterial genomes in just minutes on quite modest hardware. Currently MacVector has four de novo assembly tools (SPAdes, Velvet, Flye and Phrap).

    But what of larger genomes? It is currently impractical to run de novo assemblies of human genomes on a low-cost Mac, though RNA-Seq analyses against the human transcriptome are possible. However, here we performed some complete genome de novo assembly tests on a sample NGS experiment from Aspergillus fischeri (NCBI SRA SRR10092049). This data set consists of 2 x 18.2 million paired-end 150 nt Illumina HiSeq reads, representing 5.5 Gbp of sequence data. The assembly was run on a 2.9 GHz 6-core i9 MacBook Pro with 32 GB RAM, using SPAdes with 11 threads and slightly modified K-MER values.

    The complete assembly, including the optional Bowtie reference assembly step, took just over 28 hours in total. Maximum RAM usage during assembly was about 24 GB. The assembly generated 1,314 contigs with a total combined length of 31.38 Mbp, right in line with the reported genome size. The longest contig was 1,132,146 bp in length and the N50 score was 350,491.
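For readers less familiar with these statistics, the short Python sketch below reproduces the data-volume arithmetic from the figures quoted above and shows one common way to compute N50 (the length of the contig at which the cumulative length of contigs, sorted from longest to shortest, first reaches half the total). The contig lengths in the demo are made up; only the read counts and assembly size come from the text.

```python
def n50(contig_lengths: list[int]) -> int:
    """Return the N50 of a set of contig lengths."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("no contigs supplied")


# Figures quoted in the text.
reads_per_file = 18_200_000      # 2 x 18.2 million paired-end reads
read_length = 150                # nt
assembly_length = 31_380_000     # 31.38 Mbp

total_bases = 2 * reads_per_file * read_length
coverage = total_bases / assembly_length

if __name__ == "__main__":
    print(f"total bases: {total_bases / 1e9:.2f} Gbp")
    print(f"approximate coverage: {coverage:.0f}x")
    print("toy N50:", n50([2, 3, 4, 5, 6]))  # illustrative lengths only
```

The total works out to 5.46 Gbp, consistent with the 5.5 Gbp quoted, which corresponds to roughly 174-fold coverage of a 31.38 Mbp assembly.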

    The information presented here gives you some idea of what is achievable on a modest machine using MacVector with Assembler. Assembly would be expected to be faster with a more restricted set of K-MER values, on higher-end machines such as the Mac Pro or Mac Studio, or with much more RAM.


    MacVectorTip: Viewing external database entries for features in a sequence.

    Sequences, or regions of sequences, can be linked to external databases. A link may apply to an entire sequence entry, or to an individual feature, for example when annotation tools such as InterProScan annotate proteins with domain or motif information. This is very useful when you want to view more detailed or updated information. Within the GenBank specification, which MacVector uses extensively, an external database entry can be stored in a /DB_XREF qualifier, which allows the database entry to be viewed easily. The GenBank (and GenPept) specification allows many different databases to be accessed using this qualifier.
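In a GenBank flat file the qualifier takes the form /db_xref="database:identifier". As a minimal sketch of how such entries can be pulled out of feature text, the Python example below uses a regular expression. The function name and the sample feature are illustrative assumptions, and no attempt is made to turn the identifiers into URLs.

```python
import re

# Matches /db_xref="Database:Identifier" (case-insensitive qualifier name).
DB_XREF = re.compile(r'/db_xref="([^":]+):([^"]+)"', re.IGNORECASE)


def parse_db_xrefs(feature_text: str) -> list[tuple[str, str]]:
    """Return (database, identifier) pairs found in a block of feature text."""
    return [(m.group(1), m.group(2)) for m in DB_XREF.finditer(feature_text)]


# Illustrative feature block, not taken from a real entry.
SAMPLE = '''
     CDS             1..300
                     /product="example protein"
                     /db_xref="InterPro:IPR000719"
                     /db_xref="UniProtKB/Swiss-Prot:P12345"
'''

if __name__ == "__main__":
    for database, identifier in parse_db_xrefs(SAMPLE):
        print(database, identifier)
```

Splitting on the first colon only keeps identifiers that themselves contain a colon intact.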


    In MacVector, the original database entry can easily be viewed in a web browser: select the feature entry in the Features tab, then right-click it (or hold down CTRL and left-click) to see the available DB_XREF entries. Selecting one will load it in your web browser.


    MacVectorTip: How to Customize Window Button Toolbars

    Like many Mac applications, MacVector takes full advantage of the built-in ability to add, delete and rearrange the action buttons on window toolbars. To make these changes, right-click (or [ctrl]-click) in the gray space on any toolbar and a context-sensitive menu will appear. Choose Customize Toolbar and a dialog will be displayed with all of the buttons available for that tab, for example the Editor tab of the DNA Sequence Window.


    Note that modifying the toolbar is a global change that affects all windows containing that tab. It is also specific to different document types, so you can have different sets of buttons on the Editor toolbar of the DNA, Protein, Trace/Chromatogram and MSA document windows for example. Once modified, the changes remain permanently until you either customize them again, or reset your MacVector Preferences.


    MacVectorTip: Understanding Color Groups

    You can align hundreds, or even thousands, of protein sequences within MacVector using three different alignment algorithms: ClustalW, MUSCLE or T-Coffee. Once they are aligned, you will see the familiar colorful display in the Editor tab.

    ColorGroups1

    But there’s more to this than pretty colors. The default Color Group in MacVector is one called “Chemical Type”. In this, glycine, leucine, isoleucine, valine and alanine are all considered to be of the same type, and thus are included in the same group. You can access the Color Group selector/editor by clicking on the Groups toolbar button (note you may need to enlarge the window to see this button). The selector is also accessible from the Prefs | Consensus pane.

    ColorGroups2

    You can change the selected Color Group from about 20 built-in groups using the Color By: dropdown menu. You can edit the groups or even create your own groups. When amino acids belong to the same group, MacVector considers them to be “similar”. This affects many related functions throughout the multiple alignment interface. One example in the first image is a consensus that is a dot – none of the residues individually exceed 51% (the default identity threshold), but all belong to the same Color Group. If you select a different Color Group scheme, not only will the Editor tab update with the new colors, but the consensus will change to reflect the new groups.
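The interplay between identity and group similarity can be sketched in a few lines of Python. This is an illustration of the idea, not MacVector's consensus algorithm: the function name, the group label, the use of a blank for "no consensus", and applying the same 51% threshold to group membership are all assumptions. Only the glycine/leucine/isoleucine/valine/alanine grouping comes from the text; the other groups are omitted.

```python
from collections import Counter

# Only this group is taken from the text; the label is hypothetical.
CHEMICAL_TYPE = {"aliphatic": set("GLIVA")}


def consensus_symbol(column: str, groups: dict[str, set[str]],
                     threshold: float = 0.51) -> str:
    """Return the consensus character for one alignment column.

    A residue letter means one residue reaches the identity threshold,
    "." means no single residue does but one group does, and " " means
    neither condition is met.
    """
    residues = [r for r in column.upper() if r != "-"]  # ignore gaps
    if not residues:
        return " "
    n = len(residues)
    residue, count = Counter(residues).most_common(1)[0]
    if count / n >= threshold:
        return residue
    group_of = {aa: name for name, members in groups.items() for aa in members}
    group_counts = Counter(group_of[r] for r in residues if r in group_of)
    if group_counts and group_counts.most_common(1)[0][1] / n >= threshold:
        return "."
    return " "


if __name__ == "__main__":
    for column in ("LLLLI", "LIVAG", "LKDEW"):
        print(column, repr(consensus_symbol(column, CHEMICAL_TYPE)))
```

Swapping in a different `groups` dictionary changes which columns earn a dot, which mirrors how selecting a different Color Group changes the consensus.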

    The Picture tab also handles similarities – you can shade and outline residues based on the currently selected color grouping scheme. This is controlled from the Prefs | Picture Shading tab.

    ColorGroups3

    Finally, the Text, Pairwise and Matrix tabs also respond to the currently selected Color Group to determine similarities. Prior to MacVector 18.2.5, these always used the ClustalW Default Groups similarity scheme, but now, for consistency, they honor the currently selected group.

    ColorGroups4
