101 things you (maybe) didn’t know about MacVector: #52 – Data mining to identify and analyze pangolin CoV-2 analogs to the human COVID-19 virus

One of the most underrated features in MacVector is the Database | Align to Folder function. You can use this as a more sensitive version of a local BLAST search to find sequences in a “database” that match a query sequence. But in this case the “database” is simply a collection of your own sequences, […]

Posted in 101 Tips, Tips

Identifying transposon insertion sites from multiplexed NGS data

Transposon mutagenesis is a common approach for investigating gene function in bacterial genomes: you select for clones in which a transposon insertion into the genome has generated a specific phenotype. You can then simply sequence the entire genome of each clone by NGS to identify the transposon insertion site. To lower the cost of such experiments, […]

Posted in Techniques, Tutorials

Optimizing Align To Folder Parameters for use with NGS Data

You can use the Database | Align To Folder function to scan large fasta or fastq files containing NGS data to find and retrieve just those reads that match a specific target sequence. The search is aware of paired-end reads, so when you retrieve hits, both reads of a pair will be saved into a […]

Posted in Techniques
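
The idea in the entry above, pulling out only the read pairs that match a target and keeping both mates together, can be illustrated with a very naive sketch. This is not MacVector's Align To Folder algorithm, which the excerpt does not describe; it is an exact shared k-mer screen, and the function names, the `(name, read1, read2)` tuple layout and the default `k` are all assumptions made for illustration.

```python
from typing import Iterable, Iterator, Set, Tuple

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all k-length substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def matching_pairs(
    pairs: Iterable[Tuple[str, str, str]], target: str, k: int = 12
) -> Iterator[Tuple[str, str, str]]:
    """Yield (name, read1, read2) when either mate shares a k-mer with target.

    Both strands of the target are indexed, and the whole pair is kept
    even if only one mate matches (hypothetical behaviour for this sketch).
    """
    target_kmers = kmers(target, k) | kmers(revcomp(target), k)
    for name, read1, read2 in pairs:
        if kmers(read1, k) & target_kmers or kmers(read2, k) & target_kmers:
            yield name, read1, read2
```

An exact k-mer screen misses reads with a mismatch in every window, so it is far less sensitive than a real alignment; it only shows the pair-aware filtering step.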

RNASeq Expression Analysis with MacVector and Assembler

If you have the Assembler module, MacVector can align millions of NGS reads from RNASeq experiments against large genomes and generate a coverage table displaying the relative expression levels of every gene in a genome. The key to this functionality is that you must have a reference genome with genes annotated as CDS or gene […]

Posted in Techniques, Tips

How to split large fastq files for more manageable assemblies

We’ve previously discussed how important it is to use an appropriate number of fastq reads from an NGS experiment to obtain the results you are looking for. Using too many reads can confuse algorithms, because massive coverage increases mis-assemblies caused by background errors in the reads. In […]

Posted in Techniques, Tips
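
For the fastq-splitting entry above, here is a minimal sketch of one way to cut a large file into smaller pieces outside MacVector. The excerpt is truncated before it describes the approach the post actually uses, so this is only an illustration; it assumes the common unwrapped layout of exactly four lines per read, and the function names are hypothetical.

```python
import io
from itertools import islice
from typing import Iterator, List, TextIO


def fastq_records(handle: TextIO) -> Iterator[List[str]]:
    """Yield FASTQ records as lists of 4 lines (assumes unwrapped FASTQ)."""
    while True:
        record = list(islice(handle, 4))
        if not record:
            return
        if len(record) != 4 or not record[0].startswith("@"):
            raise ValueError("malformed FASTQ record")
        yield record


def split_fastq(handle: TextIO, reads_per_chunk: int) -> Iterator[List[List[str]]]:
    """Yield successive chunks holding at most reads_per_chunk records."""
    if reads_per_chunk < 1:
        raise ValueError("reads_per_chunk must be at least 1")
    records = fastq_records(handle)
    while True:
        chunk = list(islice(records, reads_per_chunk))
        if not chunk:
            return
        yield chunk


# Usage: five tiny reads split into chunks of two.
example = "".join(f"@read{i}\nACGT\n+\nIIII\n" for i in range(5))
chunk_sizes = [len(chunk) for chunk in split_fastq(io.StringIO(example), 2)]
```

Each chunk can be written to its own file with `writelines`. For paired-end data, the two files need to be split with the same chunk size so mates stay in step.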

Balancing Velvet KMER and coverage

The Velvet assembly algorithm in MacVector is blazingly fast and generates excellent assemblies. However, you do have to be careful when assembling NGS data to be sure that the parameters you submit are appropriate for the data you are assembling in order to get optimal results. By far the most important parameter is the KMER […]

Posted in Tips
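
To make the link between KMER and coverage in the entry above concrete, the sketch below uses the relation given in the Velvet documentation, Ck = C * (L - k + 1) / L, where C is base coverage, L is read length and k is the k-mer length. The excerpt is truncated before it gives any recommended values, so the numbers here are purely illustrative and the function names are hypothetical.

```python
def base_coverage(n_reads: int, read_length: int, genome_size: int) -> float:
    """Average base coverage C = N * L / G."""
    return n_reads * read_length / genome_size


def kmer_coverage(base_cov: float, read_length: int, k: int) -> float:
    """k-mer coverage Ck = C * (L - k + 1) / L, as in the Velvet manual."""
    if k % 2 == 0:
        raise ValueError("Velvet expects an odd k-mer length")
    if not 0 < k <= read_length:
        raise ValueError("k must be between 1 and the read length")
    return base_cov * (read_length - k + 1) / read_length


# Illustrative numbers only: 1,000,000 reads of 100 nt on a 5 Mbp genome.
c = base_coverage(1_000_000, 100, 5_000_000)   # 20.0
ck = kmer_coverage(c, 100, 31)                 # 14.0
```

Raising k lowers the k-mer coverage for the same set of reads, which is why the two have to be balanced against each other.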

Use a right-click in the Editor tab to see if your contig can be circularized

MacVector 16 incorporates no fewer than THREE different de novo assemblers: phrap, velvet and SPAdes. While all are great assemblers, each with its own specific advantages, none of them will generate a circular sequence from input reads. However, MacVector 16 also includes a new feature to help you with this. If you are assembling […]

Posted in Tips

Assembling sequencing data with MacVector and Assembler

MacVector has a software plugin called Assembler that integrates directly into the DNA sequence analysis toolkit and provides DNA sequence assembly functionality. Dealing with sequencing reads has never been easier. MacVector includes no fewer than five different assemblers just a few mouse clicks away from your sequencing reads. Phrap assembles Sanger sequencing reads or existing […]

Posted in Techniques, Tips

Import Multi-Sequence GenBank Files into an Assembly Project for easy access to Features

There are many genomes in the GenBank database that cannot be downloaded as single annotated sequences. These might be large multi-chromosome eukaryotic genomes but, increasingly, they are partially sequenced bacterial chromosomes where the major contigs have been annotated using the NCBI annotation pipeline. Typically, when you encounter these, there are options to download annotated versions of these […]

Posted in Tips

RNASeq Expression Analysis with NGS data

If you have the Assembler module, MacVector can align millions of NGS reads from RNASeq experiments against large genomes and generate a coverage table displaying the relative expression levels of every gene in a genome. The key to this functionality is that you must have a reference genome with genes annotated as CDS or gene […]

Posted in Tutorials