General musings from the MacVector team about sequence analysis, molecular biology, the Mac in general and of course your favorite sequence analysis app for the Mac!
The recently released MacVector 18.7 has a new History tab in the single sequence editor that shows the editing history of your DNA sequences.
Since the introduction of MacVector’s Cloning Clipboard, all cloning actions (such as ligating a digested fragment into a vector) create a /FRAG feature that records the source of the ligated fragment, the restriction enzymes used to digest it, any end treatment (such as Klenow), and a timestamp. Since MacVector 18.6, the OPTIMIZE CDS function likewise creates a new /EDIT feature that records details of the action.
The new History tab lists these “history” features along with the timestamp and the user who performed each operation, allowing you to track the edit history of your sequences.
Although currently there are only a few edit operations that store information in the History tab, eventually all sequence modifications will write out this information allowing the full history of any construct to be determined and eventually allowing a simple roll back of the construct to how it existed on a specific date. The history tab may be sorted according to the user and date as well as the usual sort columns.
The following edit operations are currently shown in the History tab.
CDS Optimization: creates an /EDIT feature.
Cloning operations, such as ligating genes into a vector: create a /FRAG feature.
SOURCE annotations: a mandatory GenBank feature that summarizes the length of the sequence, the scientific name of the source organism, and the Taxon ID number. It can also include other information such as map location, strain, clone, tissue type, etc., if provided by the submitter. Note: Source features are a standard GenBank feature and do not include a timestamp or user.
Unlike all other annotations/features, EDIT and FRAG features are not standard GenBank nomenclature. However, if you need to export your sequence as “pure” GenBank you can use FILE | EXPORT | Genbank Flat (strict). This removes all non-standard terms and exports a file that is 100% compliant with the GenBank specification.
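To illustrate what a strict export has to do, here is a minimal Python sketch that drops non-standard feature keys from a GenBank-style feature table. It is not MacVector's implementation: the function name, the lowercase key spellings "frag" and "edit", and the sample qualifiers are assumptions made for the example. It relies only on the standard flat-file layout, in which a feature key starts in column 6 and qualifier lines are indented further.

```python
# Assumed spellings of the MacVector-specific feature keys.
NON_STANDARD_KEYS = {"frag", "edit"}


def strip_non_standard_features(feature_lines, drop_keys=NON_STANDARD_KEYS):
    """Return the feature-table lines with the unwanted features removed.

    A feature starts on a line with exactly five leading spaces followed by
    the feature key; every following, more deeply indented line belongs to it.
    """
    kept = []
    dropping = False
    for line in feature_lines:
        is_key_line = (
            line.startswith("     ") and len(line) > 5 and not line[5].isspace()
        )
        if is_key_line:
            key = line.split()[0]
            dropping = key.lower() in drop_keys
        if not dropping:
            kept.append(line)
    return kept


if __name__ == "__main__":
    # Hypothetical feature table; the qualifiers are invented for illustration.
    table = [
        "     source          1..120",
        '                     /organism="synthetic construct"',
        "     frag            10..60",
        '                     /note="hypothetical ligation record"',
        "     CDS             20..50",
        '                     /gene="x"',
        "     edit            20..50",
    ]
    print("\n".join(strip_non_standard_features(table)))
```

In practice you would simply use the strict export menu option; the sketch only shows why the resulting file no longer contains the history features.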
How to view the EDIT history of your sequence.
Perform a ligation of a digested fragment from another sequence
Perform a codon optimisation of a CDS feature
Switch to the HISTORY tab of your sequence.
MacVector 18.7 was released in July 2024 and introduces a History tab to track the construction of your expression vectors and clones. It also includes direct support for Codon Usage Tables, creating custom Codon Usage Tables and batch translation of CDS features. Additionally, MacVector 18.7 enhances Assembler’s toolkit by adding a new reference assembler for mapping PacBio and ONT sequencing reads to your reference sequences.
One change in MacVector 18.7 that will improve installation on multi-user Macs is that, by default, MacVector now stores restriction enzyme files in the user’s home folder. Because it is the user’s home folder, it will always be writeable, even if the user does not have Administrator access to the machine. Additionally, on a Mac used by multiple users, editing these files will not affect other users’ restriction enzyme choices.
To access this folder from Finder:
In Finder, hold down the OPTION key and click on the GO menu in the toolbar
Select LIBRARY and navigate to Application Support | MacVector | Restriction Enzymes
If you have already copied restriction enzyme files to a different location, then you will not be affected by this change, and your original files will be unaffected. When MacVector starts for the first time, it will create and populate this directory with the latest set of restriction enzymes shipped with MacVector. It will then look in the old /Applications/MacVector/Restriction Enzymes/ folder and, if any files in there have a newer time stamp than the default enzymes, copy those files to the new location, ensuring no user edits to the data files are lost.
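The first-launch behaviour described above can be sketched in a few lines of Python. This is only an illustration of the "populate with defaults, then keep anything newer" rule, not MacVector's actual code. The function name and the three folder arguments are hypothetical, and copying old files that have no shipped counterpart is an assumption, since the text above does not say what happens to them.

```python
import shutil
from pathlib import Path


def migrate_enzyme_files(defaults_dir, old_dir, new_dir):
    """Populate new_dir with the shipped defaults, then carry over user edits.

    Returns the names of the files taken from the old location.
    """
    defaults_dir, old_dir, new_dir = Path(defaults_dir), Path(old_dir), Path(new_dir)
    new_dir.mkdir(parents=True, exist_ok=True)

    # Step 1: copy the shipped enzyme files (copy2 preserves time stamps).
    for src in defaults_dir.iterdir():
        if src.is_file():
            shutil.copy2(src, new_dir / src.name)

    # Step 2: prefer any old file that is newer than the shipped default.
    migrated = []
    if old_dir.is_dir():
        for old in sorted(old_dir.iterdir()):
            if not old.is_file():
                continue
            default = defaults_dir / old.name
            # Assumption: files with no shipped counterpart are user files too.
            is_newer = (
                not default.exists()
                or old.stat().st_mtime > default.stat().st_mtime
            )
            if is_newer:
                shutil.copy2(old, new_dir / old.name)
                migrated.append(old.name)
    return migrated
```

Comparing modification times is what lets an edited file win over the shipped copy while an untouched, older file is replaced by the latest default.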
We’ve just released another minor update to MacVector 18.7. Sorry to follow an update with another update so quickly but we discovered two minor, but annoying bugs.
You will be prompted to update over the next few days (unless you have turned off update notifications). However, you can also run MACVECTOR | CHECK FOR UPDATES… or download the new version directly.
Changes for MacVector 18.7.2
Bug fixes
Filtering fastq reads in minimap2 could sometimes cause a crash
The button to turn off automatic restriction site display was ignored and RE sites were always shown
Changes for MacVector 18.7.1
macOS Support
This release now supports all macOS releases from macOS 10.13 (High Sierra) through macOS 14.5 (Sonoma) and has also been tested on beta releases of the upcoming macOS Sequoia operating system.
Bug fixes
Unique and non-unique restriction sites now correctly get updated to the appropriate color when inserting or deleting sequence data.
The minimap2 consensus calculation now includes quality information.
The minimap2 input parameters dialog now provides more intuitive enablement/disablement of parameters when the preset parameters menu is used.
The History tab now reports ‘source’ features as well as ‘frag’ and ‘edit’.
If your license is not valid for MacVector 18.7 please contact support.
Our latest release, MacVector 18.7, has a new Codon Usage Table viewer. You can use this to generate your own codon usage table (CUT or .bias) files. You can use codon usage tables to optimize codon usage of CDS features for enhanced expression in a different organism. They can also be used in the Nucleic Acid Toolbox to predict protein coding ORFs.
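As a rough illustration of how a codon usage table can drive optimization, the following Python sketch replaces each codon with the most frequently used synonymous codon from a table of counts. This is the simplest possible strategy and is not necessarily what MacVector's Optimize CDS function does. The function name and the sample counts are invented for the example; only the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def optimize_cds(cds, usage):
    """Swap every codon for its most-used synonym according to `usage`.

    `usage` maps codon -> count in the target organism. When no synonym has
    a higher count, the original codon is kept.
    """
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("CDS length must be a multiple of three")
    optimized = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        aa = CODON_TO_AA[codon]
        synonyms = [c for c, a in CODON_TO_AA.items() if a == aa]
        # Highest count wins; ties are broken in favour of the original codon.
        best = max(synonyms, key=lambda c: (usage.get(c, 0), c == codon))
        optimized.append(best)
    return "".join(optimized)


if __name__ == "__main__":
    # Hypothetical counts for a handful of codons.
    usage = {"GGC": 10, "GGG": 1, "AAA": 5, "AAG": 2}
    print(optimize_cds("ATGGGGAAGTAA", usage))  # prints ATGGGCAAATAA
```

The protein sequence is unchanged because only synonymous codons are ever substituted.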
You can import data directly from codon usage websites or generate your own tables by translating CDS features in one or more sequences.
The Codon Usage Table viewer displays the data in a standard text format with one row of data per codon, identical to codon usage output windows used in other MacVector translation functions.
Importing codon usage tables
Copy data from a source that matches the format above, or that uses the popular GCG format available on codon usage websites such as CUTG.
Select New From Clipboard to create a new CUT file in the Codon Usage Table viewer.
Save the table as a .bias file.
Generating custom Codon Usage Tables from a single sequence.
Open a sequence that has annotated CDS features.
Analyze | Translate All CDS Features…
Toggle Create Codon Usage Table (.bias) to on and enter a suitable name.
Click OK.
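Conceptually, building a codon usage table from a CDS is just a matter of counting codons. The Python sketch below shows that idea under simple assumptions: the CDS is supplied as a plain forward-strand string, and the function names are invented for the example. It does not reproduce the .bias file format.

```python
from collections import Counter


def count_codons(cds):
    """Count each in-frame codon of a coding sequence (trailing bases ignored)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


def per_thousand(counts):
    """Convert raw codon counts to frequencies per 1000 codons."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: 1000.0 * n / total for codon, n in counts.items()}


if __name__ == "__main__":
    counts = count_codons("ATGGGCGGCAAATAA")
    for codon, freq in sorted(per_thousand(counts).items()):
        print(codon, counts[codon], round(freq, 1))
```

A real table would normally be built from many CDS features, since a single short gene gives very noisy frequencies.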
Generating Codon Usage Tables from multiple sequences.
The CUT viewer has a toolbar button for the Translate All CDS Features in Folder function. You can invoke this multiple times and each new set of results will be added to the existing codon usage data. You can use this to slowly build up the codon usage information from a large sequence data set in multiple folder locations on your computer.
Start by one of the following two methods:
File | New | Codon Usage Table (.bias) and click Transl.Folder
Run Database | Translate All CDS Features in Folder
Choose a folder of sequences you want to generate a CUT from.
The CUT viewer will be blocked until the job has finished.
Click Transl.Folder to repeat this procedure with multiple folders. At each stage the CUT will be updated.
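The incremental build-up described above amounts to keeping a running total and adding each new batch of results to it. Here is a minimal Python sketch of that idea; the class and method names are hypothetical, and each "folder" is represented simply as a list of CDS strings rather than real sequence files.

```python
from collections import Counter


class CodonUsageAccumulator:
    """Running codon counts that grow as more batches of CDS are added."""

    def __init__(self):
        self.counts = Counter()
        self.cds_seen = 0

    def add_cds(self, cds):
        """Add the in-frame codons of one coding sequence to the totals."""
        cds = cds.upper()
        usable = len(cds) - len(cds) % 3
        self.counts.update(cds[i:i + 3] for i in range(0, usable, 3))
        self.cds_seen += 1

    def add_batch(self, cds_sequences):
        """Add every CDS from one batch (for example, one folder)."""
        for cds in cds_sequences:
            self.add_cds(cds)


if __name__ == "__main__":
    table = CodonUsageAccumulator()
    table.add_batch(["ATGAAATAA"])      # first folder
    table.add_batch(["ATGAAGAAATGA"])   # second folder, added to the same totals
    print(table.cds_seen, dict(table.counts))
```

Because the totals are only ever added to, the order in which folders are processed does not affect the final table.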
Our latest release, MacVector 18.7, sees the addition of Minimap2 to Assembler’s sequencing toolkit. If you have the Assembler module, you can now map noisy long-read data from Pacific Biosciences or Oxford Nanopore to one or more genomes. Minimap2 is a reference assembler similar to Bowtie2. But whereas Bowtie2 excels at mapping “short reads” (500nt or less) to a reference, Minimap2 can also handle very long reads, such as Oxford Nanopore or Pacific Biosciences reads. Additionally, Minimap2 is significantly faster than Bowtie2, even with short reads.
Assembling reads against a reference with Minimap2
Create a new Assembly Project – File | New | Assembly Project
Click on the Add Ref toolbar button to add one or more reference sequences.
Then click on the Add Reads toolbar button and select one or more NGS data files.
While not essential, it is usually also a good idea to double-click on the Status column of each read to let MacVector know exactly what type of data you are analyzing.
Click on the Assemble toolbar button and select minimap2 from the menu.
The simplest option is to choose one of the presets in the resulting dialog that will tune the assembly parameters for your specific type of data.
Now run the alignment by clicking OK.
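For readers who also use the standalone minimap2 command-line tool, the Python sketch below shows roughly how choosing a preset maps to a command line. MacVector runs the alignment for you, so none of this is needed inside the application, and the presets offered in MacVector's dialog may not correspond one-to-one with these. The dictionary labels, function name and file names are hypothetical; the -a, -x and -t options and the preset names are those of the minimap2 command-line tool.

```python
import shlex

# Hypothetical labels mapped to minimap2's own preset names.
PRESETS = {
    "Oxford Nanopore": "map-ont",
    "PacBio CLR": "map-pb",
    "PacBio HiFi": "map-hifi",
    "Short reads": "sr",
}


def build_minimap2_command(reference, reads, data_type, threads=4):
    """Build an argument list that aligns reads to a reference as SAM output."""
    preset = PRESETS[data_type]
    return ["minimap2", "-a", "-x", preset, "-t", str(threads), reference, *reads]


if __name__ == "__main__":
    cmd = build_minimap2_command("reference.fasta", ["reads.fastq"], "Oxford Nanopore")
    # The SAM alignment is written to standard output when the command is run.
    print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids quoting problems if a file name contains spaces.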
MacVector 18.7 has just been released. If you are eligible for this release you will be prompted to upgrade, otherwise go to MACVECTOR | CHECK FOR UPDATES… and follow the prompts to be automatically upgraded. If your license is not eligible then why not upgrade?
Overview
MacVector 18.7 introduces a History tab to track the construction of your expression vectors and clones. It also includes new features for codon optimization, such as direct support for Codon Usage Tables (CUT/.bias) and the ability to create custom Codon Usage Tables files from your own sequences. Additionally, MacVector 18.7 enhances Assembler’s toolkit by adding a new reference assembler for mapping PacBio and ONT sequencing reads to your reference sequences.
MacVector 18.7 also makes it easier to store your own Restriction Enzyme files when multiple people use the same Mac (ideal for that shared lab Mac!).
And, as usual, there are a lot of enhancements to existing features, such as protein pI calculations.
MacVector 18.7 was developed on macOS Sonoma and is supported on macOS High Sierra to macOS Sonoma. It has also been tested on early development releases of macOS Sequoia and will be fully supported when Apple release it. MacVector 18.7 is a Universal Binary that will run on Apple Silicon Macs and Intel Macs.
Long-Read Reference Alignments using minimap2
The addition of Minimap2 allows Assembler to map noisy long-read data from Pacific Biosciences or Oxford Nanopore to one or more reference sequences. Minimap2 is similar to Bowtie2 but optimized for handling long reads instead of short reads under 500 nucleotides. Minimap2 also excels at assembling short read data and may even out-perform Bowtie2 in certain situations.
Translate All CDS Features
You can now easily translate all CDS features in a sequence with the new menu option Analyze | Translate All CDS Features. This is useful for translating proteins in bacterial genomes or eukaryotic sequences. You can choose to display all translated proteins in FASTA format or create a codon usage table from the results.
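The core of this operation, translating each CDS and writing the proteins out in FASTA format, can be sketched in Python as follows. This is a simplified illustration rather than MacVector's implementation: it assumes each CDS is already extracted as a forward-strand string, uses the standard genetic code only, and the function and record names are invented.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate a CDS, stopping at the first stop codon."""
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TO_AA[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def to_fasta(records, width=60):
    """Format (name, protein) pairs as FASTA text with wrapped sequence lines."""
    lines = []
    for name, protein in records:
        lines.append(">" + name)
        lines.extend(protein[i:i + width] for i in range(0, len(protein), width))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical CDS features keyed by name.
    cds_features = {"geneA": "ATGGCCAAATAA", "geneB": "ATGTGGTAG"}
    proteins = [(name, translate(seq)) for name, seq in cds_features.items()]
    print(to_fasta(proteins), end="")
```

Features on the minus strand or with multiple segments would need to be reverse-complemented and joined before translation, which the sketch leaves out.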
Translate All CDS Features in Folder
There is a new Database | Translate All CDS Features in Folder menu option that is similar to Translate All CDS Features except that it takes a source folder and then loads every sequence file in the folder and translates each CDS feature that it finds, accumulating the results and offering the same result options as Translate All CDS Features. A Codon Usage Table viewer window is always created and displayed when you select this option.
Codon Usage Table Viewer
MacVector now includes a viewer for codon usage tables (CUT/.bias) files. This displays the data in a standard text format with one row of data per codon, identical to codon usage output windows used in other MacVector translation functions.
You can import Codon Usage Tables from CUT files available on codon usage websites such as CUTG. You can also generate custom CUT files by using the new Translate All CDS Features from a single sequence or from multiple sequences using the new Translate All CDS Features in Folder tool. You can invoke this multiple times and slowly build up the codon usage information from a large sequence data set in multiple folders.
History Tab
There is a new tab in nucleic acid single sequence editors called History. This tab lists several MacVector-specific features relating to the editing history of the sequences such as ‘frag’ and ‘edit’. It also includes Source annotations which summarize the length of the sequence, scientific name of the source organism, and Taxon ID number. Apart from Source annotations these features now contain additional information such as the date of the operation, the name of the user who performed it and additional sequence information. In the future, all MacVector sequence modifications will write out this information, allowing the full history of any construct to be determined and even allowing a simple reversion of the construct to how it existed on a specific date.
Change in Default Restriction Enzyme File Location
MacVector now saves restriction enzyme files in a new location (~/Library/Application Support/MacVector/Restriction Enzymes/) within a user’s home folder. This is always writeable, even without Administrator access. If you have already saved files in a different location, they will not be affected. MacVector will automatically populate the new directory with the latest restriction enzymes and update any user-edited files from the old location.
Miscellaneous Enhancements and Bug Fixes
The Align to Reference SNPs tab now also displays the percentage of each residue present in each heterozygote SNP.
The Align to Reference consensus calling threshold default has been raised to 70% so that heterozygous SNPs are more consistently reported on the consensus line.
A crash when repeating heterozygote analysis has been fixed.
Copied FASTA text data is now more reproducibly parsed as single sequence data by New From Clipboard.
The protein pI calculations have been modified to also report the pI ignoring Trp and Cys residues. This brings the results more in agreement with the popular ExPASy website.
A bug where the “blocking” for protein sequences was taking the DNA values has been fixed.
Exporting sequence data in the Sequin .tbl format now writes out the correct sequence for the minus strand of segmented features.
Quality scoring of Assemblies and Align to Reference alignments can be visualized directly on the sequence. Residues can be shaded according to their quality scores. These can be displayed anywhere quality values are available, including de novo and reference assemblies in Assembler and Align to Reference alignments.
The intensity of the shading of residues indicates the phred-based quality value of each residue.
For individual reads, this ranges from 0 (deep red) through 20 (white) to 40 or above (deep green). The consensus scale is doubled and ranges from 0 (deep red) through 40 (white) to 80 or above (deep green).
Gaps are always shown with a white background. You can “mouse-over” a residue to view the numerical information in a tooltip.
Edited residues are always given a phred quality value of 99 and these residues are given a blue background.
Most assembly algorithms are quality score aware and better quality reads will take priority over lower quality sequences. However, edited residues will always override all other sequences/reads even if they are high quality. This also means that a single read with an edited sequence will take priority over many other reads with a different sequence and the edited residues in that single edited read will define the consensus. This is done as we assume that the user knows best!
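The shading scale described above can be expressed as a small function that maps a phred value to a background color. The Python sketch below is an approximation for illustration only: the exact RGB values for "deep red", "deep green" and blue, the linear interpolation between them, and the function name are all assumptions. Only the anchor points (0, 20 and 40 for reads, doubled for the consensus), the white gaps and the blue edited residues come from the description.

```python
# Assumed RGB values; the real colors used by MacVector may differ.
DEEP_RED = (200, 0, 0)
WHITE = (255, 255, 255)
DEEP_GREEN = (0, 140, 0)
BLUE = (120, 170, 255)


def _blend(start, end, t):
    """Linearly interpolate between two RGB colors, t in [0, 1]."""
    return tuple(round(a + (b - a) * t) for a, b in zip(start, end))


def shade_for_quality(quality, is_consensus=False, is_gap=False, is_edited=False):
    """Return an RGB background color for a residue's phred quality value."""
    if is_gap:
        return WHITE
    if is_edited:
        return BLUE  # edited residues carry a quality of 99
    midpoint = 40 if is_consensus else 20   # value shown as white
    ceiling = 2 * midpoint                  # value shown as deep green
    quality = max(0, min(quality, ceiling))
    if quality <= midpoint:
        return _blend(DEEP_RED, WHITE, quality / midpoint)
    return _blend(WHITE, DEEP_GREEN, (quality - midpoint) / midpoint)


if __name__ == "__main__":
    for q in (0, 10, 20, 30, 40, 60):
        print(q, shade_for_quality(q), shade_for_quality(q, is_consensus=True))
```

Doubling the scale for the consensus means a consensus value of 40 appears white, whereas the same value on an individual read is already deep green.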
Select the two sites for subcloning your targeted gene and click DIGEST.
Drag the digested fragment from the Cloning Clipboard to your vector.
Click LIGATE.
Create your agarose gel with the correct insert and a vector only lane.
Go to FILE > NEW > AGAROSE GEL
Open your sequence and switch to the MAP tab.
Drag a restriction site (or sites) that digest within the fragment and also in the vector to the Agarose Gel window.
Open up your original cloning vector.
Drag the same site(s) to the Agarose Gel window.
Undo the ligation, and repeat with the wrong orientation.
Switch back to the ligated sequence, and use UNDO to remove the ligated fragment.
Switch back to the Cloning Clipboard.
Drag the same digested fragment from the Cloning Clipboard to your vector.
Hold down [OPTION] and click LIGATE.
Switch to the MAP tab.
Drag the same site (or sites) that digest the fragment and the vector to the Agarose Gel window.
Now you will end up with an Agarose Gel with three lanes: A lane with empty vector, a lane with the insert in the correct orientation, and a lane with the insert in the wrong orientation. Now it’s easy to screen your minipreps, as you know the gel bands of a correct miniprep before you’ve even loaded it on the gel!
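The reason the three lanes differ can be worked through with a little arithmetic. The Python sketch below computes the expected band sizes for a circular plasmid under simplifying assumptions: the insert is placed at a single point in the vector without removing any vector sequence, cut sites are given as simple coordinates, and all names and numbers are hypothetical. MacVector calculates the real fragments for you from the actual sequences.

```python
def circular_fragments(length, cuts):
    """Fragment sizes from cutting a circular molecule at the given positions."""
    cuts = sorted(set(c % length for c in cuts))
    if not cuts:
        return [length]  # uncut plasmid
    sizes = [b - a for a, b in zip(cuts, cuts[1:])]
    sizes.append(length - cuts[-1] + cuts[0])  # fragment spanning the origin
    return sorted(sizes, reverse=True)


def screening_lanes(vector_len, vector_cuts, insert_at, insert_len, insert_cuts):
    """Expected bands for empty vector and for the insert in each orientation."""

    def recombinant(reverse):
        # Vector sites downstream of the insertion point shift by the insert length.
        shifted = [c if c < insert_at else c + insert_len for c in vector_cuts]
        # Sites inside the insert are mirrored when the insert is flipped.
        inside = [
            insert_at + (insert_len - c if reverse else c) for c in insert_cuts
        ]
        return circular_fragments(vector_len + insert_len, shifted + inside)

    return {
        "empty vector": circular_fragments(vector_len, vector_cuts),
        "correct orientation": recombinant(reverse=False),
        "wrong orientation": recombinant(reverse=True),
    }


if __name__ == "__main__":
    lanes = screening_lanes(
        vector_len=3000, vector_cuts=[500],
        insert_at=1000, insert_len=1200, insert_cuts=[200],
    )
    for lane, bands in lanes.items():
        print(lane, bands)
```

With these made-up numbers the correct orientation gives bands of 3500 and 700, while the wrong orientation gives 2700 and 1500, which is why an enzyme that cuts asymmetrically within the insert makes a good diagnostic digest.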
If the graphics in a nucleic acid sequence Map tab appear somewhat “washed out”, it is because the graphic items represent common features that MacVector has found that are not annotated on the sequence. For example, here are the Map and Features tabs of an unannotated cloning vector:
You can see a number of features on the Map tab, but the Features tab is completely empty. The graphics indicate common features that MacVector has identified that have not been annotated on the sequence. If you select one of the features in the Map tab and right-click (or [ctrl]-click) there is an option in the resulting context sensitive menu to Add CDS Feature. When that is selected, the feature takes on a bold appearance and a new annotation appears in the Features tab.
If you wish, you can select multiple missing features and then add them all with a right-click. Or you can select the Results | Missing Features tree view item in the floating Graphics Palette to select all missing features and then add with a right-click.
Note that the automatic display of missing features is controlled by the MacVector | Settings | Scan DNA tab. From there you can control how they are identified or even point the algorithm to your own folder of annotated sequences to be a source for the missing features.