Our latest release, MacVector 18.6, has a new tool that directly optimizes the codon usage of CDS features for enhanced expression in a different organism.
The new tool pulls together multiple tools into a one-step procedure: select a CDS feature in your nucleic acid sequence and run Analyze | Optimize Codon Usage for CDS… You will need to choose an appropriate codon usage table (.bias file) for the organism you want to express in.
Once the CDS has been optimized, a new feature is annotated on the sequence recording that the CDS was optimized, which algorithm was used, and which codon usage table was used. The annotation also shows which user made the modification, the date, and the sequence before and after the change.
How to optimize codon usage for a CDS feature
- Select a CDS feature in the Map or Features tab of a nucleic acid sequence.
- Choose Analyze | Optimize Codon Usage for CDS…
- Choose the codon usage table (.bias file) to use, along with the genetic code and the optimization algorithm.
- Either apply the results to the CDS feature or just view the proposed changes.
You will need a codon usage table (.bias) for the organism that the CDS will be expressed in. A number of common tables are shipped with MacVector, but we can generate new ones on request. A future release of MacVector will generate codon usage tables automatically.
Codon Usage optimization algorithms
MacVector provides four algorithms for optimizing codon usage.
- Most Frequently Used Codon – this simply uses the most commonly occurring codon for each amino acid. For example, if the most common Leu codon is CTC, all Leu codons will be CTC. This is probably only useful if you want to design a “best guess” primer and are willing to accept a certain failure rate. If you used it to optimize expression, the host would likely run out of that tRNA and you wouldn’t see optimal expression.
- Frequency Distribution – this selects a random codon for each amino acid, biased towards the most commonly used codons for that amino acid. Each time you run the algorithm, a different, random set of codons will be selected. If you were to generate a new DNA sequence over and over again, you would eventually build a collection of sequences whose average codon usage matches that of the .bias organism, but any individual reverse translation may randomly be quite different.
- Probability Distribution – this is probably the most powerful setting if you are interested in expression. Like Frequency Distribution, it chooses a random codon, biased towards the most frequently used codons for each amino acid. However, this version tries to ensure that the final DNA sequence has a codon usage profile that matches the codon usage of the selected .bias file as closely as possible. Because the overall codon usage in the DNA sequence is guaranteed to be as close as possible to the codon usage in the .bias organism, this should, in theory, give you the best chance of high expression. Again, you will get a different sequence each time you invoke this.
- Uniform Distribution – this ignores the usage of each codon and randomly assigns an appropriate codon for each amino acid. It’s similar to the default algorithm that uses ambiguities to create an “absolute” coding DNA, but here it just chooses a random codon with no regard for codon usage probability. Again, you will get a different sequence each time you invoke this.
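To make the difference between the four strategies concrete, here is a minimal Python sketch. It is not MacVector's implementation: the `USAGE` table holds made-up codon counts for just two amino acids, all function names are hypothetical, and the Probability Distribution variant assumes that "as close as possible" can be approximated by giving each codon a fixed quota (largest-remainder rounding) and then shuffling the positions.

```python
import random

# Made-up codon counts for two amino acids, standing in for a .bias table.
USAGE = {
    "L": {"CTC": 50, "CTG": 20, "TTG": 20, "CTT": 10},
    "K": {"AAA": 75, "AAG": 25},
}

def most_frequent(protein, usage):
    """Always pick the single most common codon for each amino acid."""
    return "".join(max(usage[aa], key=usage[aa].get) for aa in protein)

def frequency_distribution(protein, usage, rng):
    """Pick each codon independently, weighted by its usage."""
    picks = []
    for aa in protein:
        codons = list(usage[aa])
        weights = [usage[aa][c] for c in codons]
        picks.append(rng.choices(codons, weights=weights, k=1)[0])
    return "".join(picks)

def uniform_distribution(protein, usage, rng):
    """Pick any valid codon with equal probability, ignoring usage."""
    return "".join(rng.choice(list(usage[aa])) for aa in protein)

def probability_distribution(protein, usage, rng):
    """Assumed approach: fix per-codon quotas, then randomize placement."""
    positions = {}
    for i, aa in enumerate(protein):
        positions.setdefault(aa, []).append(i)
    result = [None] * len(protein)
    for aa, idxs in positions.items():
        n = len(idxs)
        total = sum(usage[aa].values())
        quotas = {c: n * f / total for c, f in usage[aa].items()}
        counts = {c: int(q) for c, q in quotas.items()}
        # Hand out any remaining slots to the largest fractional remainders.
        leftover = n - sum(counts.values())
        by_remainder = sorted(quotas, key=lambda c: quotas[c] - counts[c],
                              reverse=True)
        for c in by_remainder[:leftover]:
            counts[c] += 1
        pool = [c for c, k in counts.items() for _ in range(k)]
        rng.shuffle(pool)  # random placement, fixed overall composition
        for i, c in zip(idxs, pool):
            result[i] = c
    return "".join(result)

if __name__ == "__main__":
    rng = random.Random()
    protein = "L" * 10 + "K" * 4
    print(most_frequent(protein, USAGE))
    print(frequency_distribution(protein, USAGE, rng))
    print(probability_distribution(protein, USAGE, rng))
    print(uniform_distribution(protein, USAGE, rng))
```

In this sketch, only `most_frequent` returns the same DNA on every run. `frequency_distribution` matches the table only on average over many runs, whereas `probability_distribution` fixes the codon composition of each individual sequence and randomizes only where each codon lands.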
MacVector 18.6 was released in July 2023. This release adds one-click optimization of CDS coding regions, automatic phrap sub-project assembly, direct support of .csv/.tsv files for Primer Database, inclusion of graphical information in GenBank exports and numerous tweaks and improvements to many workflows.