101 things you (maybe) didn’t know about MacVector: #49 – Identifying CRISPR Indels

If you are screening a set of clones for changes after a CRISPR experiment, the MacVector Analyze | Align To Reference function is the approach to use. However, you may find that the default parameters are not ideal for this type of analysis: they are tuned for simple sequence confirmation experiments and strike a balance between speed and the sensitivity needed to accommodate larger (>5 nt) insertions/deletions.

If you are using MacVector 14.5.3 or earlier, the key to correctly handling larger deletions and insertions is to:

(a) Use the cDNA Alignment algorithm. This allows for unlimited length deletions in the reads, as long as there are sufficient matching residues on either side of the deletion to exceed the minimum match criteria.
(b) Increase the Sensitivity setting. This determines “how far ahead” the alignment algorithm looks. To handle insertions in the reads compared to the reference, it needs to be larger than the expected number of inserted residues. Typically, for CRISPR experiments, this is only one or two residues, but it can be larger, so setting Sensitivity to a larger value will handle those cases much better. However, you also need to adjust the X-Dropoff value to be at least Sensitivity multiplied by the Gap Penalty.
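The relationship between the three settings in (b) is simple arithmetic, and it can help to check it before typing values into the dialog. The sketch below is only an illustration: the function names are hypothetical, the numbers are made up rather than MacVector defaults, and nothing here calls MacVector itself.

```python
def min_x_dropoff(sensitivity: int, gap_penalty: int) -> int:
    """Smallest X-Dropoff that satisfies X-Dropoff >= Sensitivity * Gap Penalty."""
    if sensitivity < 0 or gap_penalty < 0:
        raise ValueError("sensitivity and gap_penalty must be non-negative")
    return sensitivity * gap_penalty


def settings_are_consistent(sensitivity: int, gap_penalty: int, x_dropoff: int) -> bool:
    """True if the chosen X-Dropoff is large enough for the chosen Sensitivity."""
    return x_dropoff >= min_x_dropoff(sensitivity, gap_penalty)


# Illustrative values only (assumptions, not MacVector defaults).
sensitivity = 20
gap_penalty = 3
print(min_x_dropoff(sensitivity, gap_penalty))
print(settings_are_consistent(sensitivity, gap_penalty, x_dropoff=50))
```

With these example numbers, an X-Dropoff of 50 would be too small for a Sensitivity of 20, so either X-Dropoff would need to go up or Sensitivity would need to come down.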

Here are some reasonable settings for MacVector 14.5.3, with the critical settings outlined in red:

[Screenshot: Align To Reference settings in MacVector 14.5.3]

Aligning with a high Sensitivity value may take some time compared to the normal defaults. For example, 1,000 nt ABI trace files might align against a ~10,000 nt Reference at about 1 Read per second on an average machine. Here is an example alignment of a set of reads with a variety of short insertions and deletions centered around a (fake) CRISPR target site:

[Screenshot: Align To Reference results in MacVector 14.5.3]

You can see that the largest insert in a Read was 7 nt, and this is reflected by 7 gaps inserted in the Reference. Most of the Reads have short deletions, with the deleted residues represented by the gap character (“-”), but some of the Reads with longer deletions are indicated by the “large gap” series of characters (“<- - - - ->”). This simply indicates that the Read was split into two separate segments during the alignment process.
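If you copy a gapped Reference/Read pair out as plain text, the same reading of the alignment can be done programmatically: a run of gaps in the Reference is an insertion in the Read, and a run of gaps in the Read is a deletion. The sketch below assumes two equal-length strings that use only the plain “-” gap character (it does not handle the “large gap” notation), and the function name and sequences are hypothetical examples rather than anything exported by MacVector.

```python
from itertools import groupby


def find_indels(reference: str, read: str, gap: str = "-"):
    """Return (kind, reference_position, length) for each indel run.

    reference_position is the 0-based ungapped Reference coordinate at which
    the insertion or deletion starts.
    """
    if len(reference) != len(read):
        raise ValueError("aligned strings must have the same length")

    def classify(column):
        ref_char, read_char = column
        if ref_char == gap and read_char != gap:
            return "insertion"   # extra residues in the Read
        if read_char == gap and ref_char != gap:
            return "deletion"    # residues missing from the Read
        return "match"           # aligned residues (or a shared gap column)

    events = []
    ref_pos = 0
    for kind, run in groupby(zip(reference, read), key=classify):
        columns = list(run)
        if kind != "match":
            events.append((kind, ref_pos, len(columns)))
        # Only non-gap Reference characters advance the Reference coordinate.
        ref_pos += sum(1 for ref_char, _ in columns if ref_char != gap)
    return events


# Made-up sequences for illustration.
reference = "ACGT--ACGTAC"
read      = "ACGTTTAC--AC"
print(find_indels(reference, read))
# [('insertion', 4, 2), ('deletion', 6, 2)]
```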

If you are using MacVector 15.0.1 or later, you will find that this interface has had some significant tweaks. First, a new CRISPR Indel Detection mode has been added to the Align To Reference settings dialog. This largely removes any need to adjust the individual settings:

[Screenshot: Align To Reference settings in MacVector 15.0, showing the CRISPR Indel Detection mode]

The second change is that a cleanup step has been added to the alignment algorithm to minimize the number of gapped segments in the final alignment. This dramatically cleans up the region around the indels. Compare the alignment below to the previous alignment generated by MacVector 14.5.3:

[Screenshot: Align To Reference results in MacVector 15.0]



This is an article in a long-running series of tips to help you get the most out of MacVector. If you want to be notified every time a new tip is published, follow us @MacVector on Twitter (or check the feed for the hashtag #101MacVectorTips) or like us on Facebook.
