General musings from the MacVector team about sequence analysis, molecular biology, the Mac in general and of course your favorite sequence analysis app for the Mac!
MacVector Pro now includes Assembler, a powerful sequence assembly plugin that brings sequence assembly directly to your desktop with the same user-friendly interface MacVector users have come to expect. Assembler simplifies the management, assembly, and analysis of all types of sequencing data.
Assembler’s extensive toolkit has always been seamlessly integrated into MacVector, but previously required an additional purchase. In response to the growing availability of sequencing data and to better serve our customers, we have made the decision to include Assembler at no extra cost for all active MacVector users starting in September 2024.
Assembler combines the Assembly Projects manager with a robust sequence assembly toolkit, offering features to handle sequencing reads from a wide range of platforms, from Sanger reads to PacBio and Nanopore long reads. With six de novo assemblers and two tools for mapping reads to a reference sequence, Assembler provides the necessary tools for efficient and accurate assembly. Users can even perform de novo assembly of genomes up to ~100 Mbp on a standard Mac laptop.
How to upgrade your license to MacVector with Assembler
Active users: Assembler will be added automatically on your next renewal date. You do not need to do anything, although you can contact MacVector Support now if you would prefer not to wait.
Assembler is fully integrated into MacVector and allows you to manage sequencing data with the familiar MacVector style. You can design primers directly on a contig or BLAST that contig to identify it.
Reference Sequence Assembly: Map millions of reads against genomes, transcriptomes or other reference sequences using Bowtie2.
Compare Genomes: Compare two related annotated genomes to see common or missing genes.
Coverage Tab: Compare expression levels across multiple assemblies.
Variant Calling: SNPs and indels are visualised on your assembly and supplied in VCF format.
Bacterial genome tools: Tools for finishing bacterial genomes, including circularizing genomes.
Easy to use interface. Navigating around your assemblies has never been easier. Display an entire contig in the graphical Map and select a read to zoom straight to that region. Click on a base in a contig and see coverage and variants.
Compare different assemblies. Map multiple datasets against the same reference, or repeat the same assembly with different options.
RNA-Seq analysis with read depth visualization and per gene coverage data (RPKM & TPM).
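For readers unfamiliar with the two expression measures mentioned above, here is a minimal sketch of the standard RPKM and TPM formulas. It is not MacVector's implementation: the function names and the example gene names are hypothetical, and it assumes that the total number of mapped reads equals the sum of the per-gene counts.

```python
def rpkm(counts, lengths_bp):
    """Reads Per Kilobase of transcript per Million mapped reads.

    Assumption: total mapped reads = sum of the per-gene read counts.
    """
    total_reads = sum(counts.values())
    return {
        gene: counts[gene] * 1e9 / (lengths_bp[gene] * total_reads)
        for gene in counts
    }


def tpm(counts, lengths_bp):
    """Transcripts Per Million: length-normalise first, then scale to 1e6."""
    rates = {gene: counts[gene] / lengths_bp[gene] for gene in counts}
    total_rate = sum(rates.values())
    return {gene: rate * 1e6 / total_rate for gene, rate in rates.items()}


# Hypothetical example data: read counts and gene lengths in base pairs.
counts = {"geneA": 500, "geneB": 1500}
lengths = {"geneA": 1000, "geneB": 3000}

rpkm_values = rpkm(counts, lengths)
tpm_values = tpm(counts, lengths)
```

The practical difference is that TPM values always sum to one million within a sample, which makes them easier to compare between samples than RPKM values.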
With MacVector/Assembler, there is no need to send your sequencing away or learn complicated software tools. You can analyze your own data with MacVector on your own Mac.
Apple released macOS Sequoia earlier this week (16th September 2024). As usual in the run up to a new macOS release, we have been testing MacVector on development builds of macOS Sequoia. We are happy to report that there are no issues and that MacVector 18.7 is fully supported on macOS Sequoia.
If you do come across any issues that we have missed (we do try, but there are always some bugs that hide until it’s too late!) please do contact support.
Compatibility of previous versions of MacVector
For versions of MacVector before MacVector 18.7, you can check compatibility in a table that we update after every new release of macOS and MacVector.
In short, MacVector 18.6.1 should work fine on macOS Sequoia, although it is not officially supported. For earlier versions of MacVector, many tools will not work because of last-minute changes that Apple made to macOS Sonoma last year. Our developers strive to future-proof MacVector, but when Apple makes significant changes, matters are sometimes out of our control!
The recently released MacVector 18.7 has a new History tab in the Single sequence editor that shows the editing history of your DNA sequences.
Since the introduction of MacVector’s Cloning Clipboard, all cloning actions (such as ligating a digested fragment into a vector) create a /FRAG feature that records the source of the ligated fragment, the restriction enzymes used to digest it (and any end treatment such as Klenow), and a timestamp. The OPTIMIZE CDS function, introduced in MacVector 18.6, similarly creates an EDIT feature that records details of the action.
The new History tab lists these “history” features along with the timestamp and the user who performed each operation, allowing you to track the edit history of your sequences.
Currently only a few edit operations store information in the History tab. Eventually, all sequence modifications will write out this information, allowing the full history of any construct to be determined and, in time, allowing a simple rollback of the construct to how it existed on a specific date. The History tab can be sorted by user and date as well as by the usual sort columns.
The following edit operations are currently shown in the History tab.
CDS Optimization: creates an /EDIT feature.
Cloning operations (such as ligating genes into a vector): create a /FRAG feature.
SOURCE annotations: These are a mandatory GenBank feature that summarizes the length of the sequence, the scientific name of the source organism, and the Taxon ID number. They can also include other information such as map location, strain, clone, tissue type, etc., if provided by the submitter. Note: Source features are a standard GenBank feature and do not include a timestamp or user.
Unlike all other annotations/features, EDIT and FRAG features are not standard GenBank nomenclature. However, if you need to export your sequence as “pure” GenBank, you can use FILE | EXPORT | Genbank Flat (strict). This removes all non-standard terms and exports a file that is 100% compliant with the GenBank specification.
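To illustrate what stripping non-standard feature keys from a GenBank flat file involves, here is a rough sketch. It is not how MacVector's strict export is implemented, which is not described here. It assumes the features are written with the keys "frag" and "edit" (the exact spelling and case in the file are assumptions), it relies on the standard flat-file layout in which a feature key starts in column 6, and it only removes whole features, not any non-standard qualifiers inside standard features. The function name is hypothetical.

```python
import re

# Assumed spelling of the MacVector-specific feature keys (compared in lower case).
NON_STANDARD_KEYS = {"frag", "edit"}

# In a GenBank feature table, a new feature starts with 5 spaces followed by the key.
FEATURE_START = re.compile(r"^ {5}(\S+)\s+\S")


def strip_features(genbank_text, drop_keys=NON_STANDARD_KEYS):
    """Return the flat-file text without features whose key is in drop_keys."""
    kept = []
    in_features = False
    dropping = False
    for line in genbank_text.splitlines():
        if line.startswith("FEATURES"):
            in_features, dropping = True, False
        elif in_features and line[:1] not in (" ", ""):
            # A new top-level keyword (for example ORIGIN) ends the feature table.
            in_features, dropping = False, False
        elif in_features:
            match = FEATURE_START.match(line)
            if match:
                # Qualifier lines do not match, so they inherit the current state.
                dropping = match.group(1).lower() in drop_keys
        if not dropping:
            kept.append(line)
    return "\n".join(kept) + "\n"
```

For real work the built-in strict export is the safer route, since it is designed to produce a fully compliant file.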
How to view the EDIT history of your sequence.
Perform a ligation of a digested fragment from another sequence
Perform a codon optimisation of a CDS feature
Switch to the HISTORY tab of your sequence.
MacVector 18.7 was released in July 2024 and introduces a History tab to track the construction of your expression vectors and clones. It also includes direct support for Codon Usage Tables, creating custom Codon Usage Tables and batch translation of CDS features. Additionally, MacVector 18.7 enhances Assembler’s toolkit by adding a new reference assembler for mapping PacBio and ONT sequencing reads to your reference sequences.
One change in MacVector 18.7 that will improve installation on multi-user Macs is that MacVector now stores restriction enzyme files in the user’s home folder by default. Because this is the user’s own home folder, it is always writeable, even if the user does not have Administrator access to the machine. Additionally, on a Mac shared by multiple users, editing these files will not affect other users’ restriction enzyme choices.
To access this folder from Finder:
In Finder, hold down the OPTION key and click on the GO menu in the menu bar.
Select LIBRARY and navigate to Application Support | MacVector | Restriction Enzymes.
If you have already copied restriction enzyme files to a different location, you will not be affected by this change and your original files will be left untouched. When MacVector starts for the first time, it creates this directory and populates it with the latest set of restriction enzymes shipped with MacVector. It then looks in the old /Applications/MacVector/Restriction Enzymes/ folder, and any files there with a newer time stamp than the default enzymes are copied to the new location, ensuring that no user edits to the data files are lost.
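The "copy if newer" rule described above can be sketched as follows. This is only an illustration of the logic, not MacVector's own code: the function name is hypothetical, and copying a file that exists only in the old folder is an assumption on our part.

```python
import shutil
from pathlib import Path


def migrate_enzyme_files(old_dir: Path, new_dir: Path) -> list[str]:
    """Copy files from old_dir that are newer than their counterparts in new_dir.

    Returns the names of the files that were copied.
    """
    copied = []
    if not old_dir.is_dir():
        return copied  # nothing to migrate
    new_dir.mkdir(parents=True, exist_ok=True)
    for old_file in sorted(old_dir.iterdir()):
        if not old_file.is_file():
            continue
        target = new_dir / old_file.name
        # Assumption: a file with no counterpart in new_dir is also copied.
        if not target.exists() or old_file.stat().st_mtime > target.stat().st_mtime:
            shutil.copy2(old_file, target)  # copy2 preserves the time stamp
            copied.append(old_file.name)
    return copied


# Example call using the folder locations mentioned in this post:
# migrate_enzyme_files(
#     Path("/Applications/MacVector/Restriction Enzymes"),
#     Path.home() / "Library/Application Support/MacVector/Restriction Enzymes",
# )
```

Comparing modification times means an untouched old file never overwrites a newer default, while a file you edited by hand still takes precedence.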
If the files are missing?
If any of the Restriction Enzyme tools report that no enzyme files can be found, then do one of the following:
In the RE tool click DEFAULTS and the location will be corrected.
Click SET ENZYME FILE and navigate to the new location.
We’ve just released another minor update to MacVector 18.7. Sorry to follow an update with another update so quickly, but we discovered two minor but annoying bugs.
You will be prompted to update over the next few days (unless you have turned off update notifications). However, you can also run MACVECTOR | CHECK FOR UPDATES… or download the new version directly.
Changes for MacVector 18.7.2
Bug fixes
Filtering fastq reads in minimap2 could sometimes cause a crash
The button to turn off automatic restriction site display was ignored and RE sites were always shown
You will be prompted to update over the next few days (unless you have turned off update notifications). However, you can also run MACVECTOR | CHECK FOR UPDATES… or download the new version directly.
Changes for MacVector 18.7.1
macOS Support
This release now supports all macOS releases from macOS 10.13 (High Sierra) through macOS 14.5 (Sonoma) and has also been tested on beta releases of the upcoming macOS Sequoia operating system.
Bug fixes
Unique and non-unique restriction sites now correctly get updated to the appropriate color when inserting or deleting sequence data.
The minimap2 consensus calculation now includes quality information.
The minimap2 input parameters dialog now provides more intuitive enablement/disablement of parameters when the preset parameters menu is used.
The History tab now reports ‘source’ features as well as ‘frag’ and ‘edit’.
If your license is not valid for MacVector 18.7 please contact support.
Our latest release, MacVector 18.7, has a new Codon Usage Table viewer. You can use this to generate your own codon usage table (CUT or .bias) files. You can use codon usage tables to optimize codon usage of CDS features for enhanced expression in a different organism. They can also be used in the Nucleic Acid Toolbox to predict protein coding ORFs.
You can import data directly from codon usage websites or generate your own tables by translating CDS features in one or more sequences.
The Codon Usage Table viewer displays the data in a standard text format with one row of data per codon, identical to codon usage output windows used in other MacVector translation functions.
Importing codon usage tables
Copy data from a source that matches the format above or the popular GCG format available on codon usage websites such as CUTG.
Select New From Clipboard to create a new CUT file in the Codon Usage Table viewer.
Save the table as a .bias file.
Generating custom Codon Usage Tables from a single sequence.
Open a sequence that has annotated CDS features.
Choose Analyze | Translate All CDS Features…
Toggle Create Codon Usage Table (.bias) to on and enter a suitable name.
Click OK.
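If you are curious what building a codon usage table from CDS features amounts to, the idea can be sketched in a few lines. This is a simplified illustration, not MacVector's algorithm: the function names and the two example sequences are hypothetical, and it assumes each CDS is already in frame, on the forward strand, and free of introns.

```python
from collections import Counter


def count_codons(cds: str) -> Counter:
    """Count in-frame codons in one CDS (assumed to start at codon position 1)."""
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore any trailing partial codon
    return Counter(seq[i:i + 3] for i in range(0, usable, 3))


def per_thousand(counts: Counter) -> dict:
    """Express each codon count as a frequency per 1000 codons."""
    total = sum(counts.values())
    return {codon: 1000 * n / total for codon, n in counts.items()}


# Accumulate counts over several CDS features (hypothetical toy sequences).
table = Counter()
for cds in ("ATGGCTGCTTAA", "ATGGCCTGA"):
    table.update(count_codons(cds))

frequencies = per_thousand(table)
```

Because the counts simply add up, the same accumulation works whether the CDS features come from one sequence or from many.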
Generating Codon Usage Tables from multiple sequences.
The CUT viewer has a toolbar button for the Translate All CDS Features in Folder function. You can invoke this multiple times and each new set of results will be added to the existing codon usage data. You can use this to slowly build up the codon usage information from a large sequence data set in multiple folder locations on your computer.
Start with one of the following two methods:
Choose File | New | Codon Usage Table (.bias) and click Transl.Folder
Run Database | Translate All CDS Features in Folder
Choose a folder of sequences you want to generate a CUT from.
The CUT viewer will be blocked until the job has finished.
Click Transl.Folder to repeat this procedure with multiple folders. At each stage the CUT will be updated.
Our latest release, MacVector 18.7, sees the addition of Minimap2 to Assembler’s sequencing toolkit. So if you have the Assembler module, you can now map noisy long-read data from Pacific Biosciences or Oxford Nanopore to one or more genomes. Minimap2 is a reference assembler similar to Bowtie2. However, whereas Bowtie2 excels at mapping “short reads” (500nt or less) to a reference, Minimap2 can handle very long reads such as those from Oxford Nanopore or Pacific Biosciences. Additionally, Minimap2 is significantly faster than Bowtie2, even with short reads.
Assembling reads against a reference with Minimap2
Create a new Assembly Project – File | New | Assembly Project
Click on the Add Ref toolbar button to add one or more reference sequences.
Then click on the Add Reads toolbar button and select one or more NGS data files.
While not essential, it is usually also a good idea to double-click on the Status column of each read to let MacVector know exactly what type of data you are analyzing.
Click on the Assemble toolbar button and select minimap2 from the menu.
The simplest option is to choose one of the presets in the resulting dialog that will tune the assembly parameters for your specific type of data.
Now run the alignment by clicking OK.
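For comparison, the steps above correspond roughly to running the standalone minimap2 command-line tool with one of its presets. The sketch below only builds the command line. How MacVector invokes its own copy of minimap2 is not described here, so the data-type labels, the function name and the file names are all hypothetical. The `-a`, `-x` and `-t` options and the preset names are those of the standalone minimap2 tool.

```python
import shlex

# Hypothetical data-type labels mapped to standalone minimap2 presets.
PRESETS = {
    "nanopore": "map-ont",      # Oxford Nanopore reads
    "pacbio_clr": "map-pb",     # PacBio continuous long reads
    "pacbio_hifi": "map-hifi",  # PacBio HiFi reads
    "short_reads": "sr",        # short reads
}


def minimap2_command(reference, reads, data_type, threads=4):
    """Build a minimap2 command line that writes SAM output (-a)."""
    preset = PRESETS[data_type]
    return ["minimap2", "-a", "-x", preset, "-t", str(threads), reference, *reads]


cmd = minimap2_command("reference.fasta", ["reads.fastq"], "nanopore")
print(shlex.join(cmd))
```

Choosing a preset sets several alignment parameters at once, which is why picking the one that matches your data type is the simplest way to get sensible results.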
MacVector 18.7 has just been released. If you are eligible for this release you will be prompted to upgrade, otherwise go to MACVECTOR | CHECK FOR UPDATES… and follow the prompts to be automatically upgraded. If your license is not eligible then why not upgrade?
Overview
MacVector 18.7 introduces a History tab to track the construction of your expression vectors and clones. It also includes new features for codon optimization, such as direct support for Codon Usage Tables (CUT/.bias) and the ability to create custom Codon Usage Tables files from your own sequences. Additionally, MacVector 18.7 enhances Assembler’s toolkit by adding a new reference assembler for mapping PacBio and ONT sequencing reads to your reference sequences.
MacVector 18.7 also makes it easier to store your own Restriction Enzyme files when multiple people use the same Mac (ideal for that shared lab Mac!).
As usual, there are also many enhancements to existing features, such as protein pI calculations.
MacVector 18.7 was developed on macOS Sonoma and is supported on macOS High Sierra to macOS Sonoma. It has also been tested on early development releases of macOS Sequoia and will be fully supported when Apple release it. MacVector 18.7 is a Universal Binary that will run on Apple Silicon Macs and Intel Macs.
Long-Read Reference Alignments using minimap2
The addition of Minimap2 allows Assembler to map noisy long-read data from Pacific Biosciences or Oxford Nanopore to one or more reference sequences. Minimap2 is similar to Bowtie2 but is optimized for long reads rather than short reads under 500 nucleotides. Minimap2 also excels at assembling short-read data and may even outperform Bowtie2 in certain situations.
Translate All CDS Features
You can now easily translate all CDS features in a sequence with the new menu option Analyze | Translate All CDS Features. This is useful for translating proteins in bacterial genomes or eukaryotic sequences. You can choose to display all translated proteins in FASTA format or create a codon usage table from the results.
Translate All CDS Features in Folder
There is a new Database | Translate All CDS Features in Folder menu option. It is similar to Translate All CDS Features, except that it takes a source folder, loads every sequence file in that folder, and translates each CDS feature that it finds, accumulating the results and offering the same result options as Translate All CDS Features. A Codon Usage Table viewer window is always created and displayed when you select this option.
Codon Usage Table Viewer
MacVector now includes a viewer for codon usage table (CUT/.bias) files. This displays the data in a standard text format with one row of data per codon, identical to codon usage output windows used in other MacVector translation functions.
You can import Codon Usage Tables from CUT files available on codon usage websites such as CUTG. You can also generate custom CUT files from a single sequence using the new Translate All CDS Features tool, or from multiple sequences using the new Translate All CDS Features in Folder tool. You can invoke the latter multiple times to slowly build up the codon usage information from a large sequence data set in multiple folders.
History Tab
There is a new tab in nucleic acid single sequence editors called History. This tab lists several MacVector-specific features relating to the editing history of the sequences, such as ‘frag’ and ‘edit’. It also includes Source annotations, which summarize the length of the sequence, the scientific name of the source organism, and the Taxon ID number. Apart from Source annotations, these features now contain additional information such as the date of the operation, the name of the user who performed it and additional sequence information. In the future, all MacVector sequence modifications will write out this information, allowing the full history of any construct to be determined and even allowing a simple reversion of the construct to how it existed on a specific date.
Change in Default Restriction Enzyme File Location
MacVector now saves restriction enzyme files in a new location (~/Library/Application Support/MacVector/Restriction Enzymes/) within a user’s home folder. This is always writeable, even without Administrator access. If you have already saved files in a different location, they will not be affected. MacVector will automatically populate the new directory with the latest restriction enzymes and update any user-edited files from the old location.
Miscellaneous Enhancements and Bug Fixes
The Align to Reference SNPs tab now also displays the percentage of each residue present in each heterozygote SNP.
The Align to Reference consensus calling threshold default has been raised to 70% so that heterozygous SNPs are more consistently reported on the consensus line.
A crash when repeating heterozygote analysis has been fixed.
Copied FASTA text data is now more reproducibly parsed as single sequence data by New From Clipboard.
The protein pI calculations have been modified to also report the pI ignoring Trp and Cys residues. This brings the results more in agreement with the popular ExPASy website.
A bug where the “blocking” for protein sequences was taking the DNA values has been fixed.
Exporting sequence data in the Sequin .tbl format now writes out the correct sequence for the minus strand of segmented features.