When analyzing/assembling/aligning NGS data, there are many scenarios where you might want to separate out the reads representing different genotypes or variant sequences. MacVector makes this very easy. Take a reference sequence and choose Analyze->Align to Reference. Now click the Add Seqs button and select and add your NGS data files. NOTE: if your reference […]
MacVectorTip: Filtering NGS Data to retrieve reads matching a known sequence
So you just got your NGS reads back from that sequencing experiment and, wow, what a HUGE amount of data. Wouldn’t it be easier to handle if you could pare that down to just the gene/plasmid/sequence(s) you are interested in? MacVector to the rescue, as it can read and filter fasta/fastq files, even if they […]
Virtual Gene Cloning from NGS RNA-Seq Data
The NCBI Sequence Read Archive (SRA) database is a huge resource of Next Generation Sequencing experimental data. Many groups and laboratories deposit data here that they generated for their own specific projects, and that data can be mined for other, unrelated projects with a minimum of effort. MacVector contains a number of powerful tools that can […]
101 things you (maybe) didn’t know about MacVector: #52 – Data mining to identify and analyze pangolin CoV-2 analogs to the human COVID-19 virus
One of the most underrated features in MacVector is the Database | Align to Folder function. You can use this as a more sensitive version of a local BLAST search to find sequences in a “database” that match a query sequence. But in this case the “database” is simply a collection of your own sequences, […]
Identifying transposon insertion sites from multiplexed NGS data
Transposon mutagenesis is a common approach for investigating gene function in bacterial genomes by selecting for clones where insertion of the transposon into the genome has generated a specific phenotype. You can then simply sequence the entire genome of each clone by NGS to identify the transposon insertion site. To lower the cost of such experiments, […]
Optimizing Align To Folder Parameters for use with NGS Data
You can use the Database | Align To Folder function to scan large fasta or fastq files containing NGS data to find and retrieve just those reads that match a specific target sequence. The search is aware of paired-end reads, so when you retrieve hits, both reads of a pair will be saved into a […]
RNASeq Expression Analysis with MacVector and Assembler
If you have the Assembler module, MacVector can align millions of NGS reads from RNASeq experiments against large genomes and generate a coverage table displaying the relative expression levels of every gene in a genome. The key to this functionality is that you must have a reference genome with genes annotated as CDS or gene […]
How to split large fastq files for more manageable assemblies
We’ve previously discussed how important it is to use an appropriate number of fastq reads from an NGS experiment to obtain the results you are looking for. Using too many reads can confuse algorithms: with massive coverage, background errors in the reads lead to more mis-assemblies. In […]
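As a rough illustration of what splitting a fastq file involves, here is a minimal Python sketch. It is not MacVector's own method, and the function names `split_fastq` and `write_chunks` and the output file naming are hypothetical. It assumes the common layout of exactly four lines per read (no wrapped sequence lines).

```python
from itertools import islice


def split_fastq(lines, reads_per_chunk):
    """Yield lists of FASTQ lines, each holding at most reads_per_chunk reads.

    Assumption: every record is exactly 4 lines (header, sequence, '+', quality).
    """
    if reads_per_chunk < 1:
        raise ValueError("reads_per_chunk must be at least 1")
    iterator = iter(lines)
    while True:
        # Pull the next block of whole records from the input.
        chunk = list(islice(iterator, 4 * reads_per_chunk))
        if not chunk:
            return
        yield chunk


def write_chunks(in_path, out_prefix, reads_per_chunk):
    """Write each chunk to <out_prefix>_0001.fastq, <out_prefix>_0002.fastq, ...

    The naming scheme is a hypothetical choice for this sketch.
    """
    paths = []
    with open(in_path) as handle:
        chunks = split_fastq(handle, reads_per_chunk)
        for index, chunk in enumerate(chunks, start=1):
            path = f"{out_prefix}_{index:04d}.fastq"
            with open(path, "w") as out:
                out.writelines(chunk)
            paths.append(path)
    return paths
```

For paired-end data held in two files, splitting both files with the same `reads_per_chunk` keeps the pairs in step, provided the two files list the reads in the same order.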
Balancing Velvet KMER and coverage
The Velvet assembly algorithm in MacVector is blazingly fast and generates excellent assemblies. However, to get optimal results when assembling NGS data, you do have to be careful that the parameters you submit are appropriate for the data you are assembling. By far the most important parameter is the KMER […]
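For background, the Velvet documentation relates k-mer coverage to ordinary nucleotide coverage as Ck = C × (L − k + 1) / L, where C is the nucleotide coverage, L the read length and k the k-mer size. The Python sketch below simply evaluates that relation. The function name `kmer_coverage` is hypothetical, and the excerpt above does not say how MacVector presents this value.

```python
def kmer_coverage(nucleotide_coverage, read_length, k):
    """Return the k-mer coverage Ck = C * (L - k + 1) / L.

    C = nucleotide_coverage, L = read_length, k = k-mer size.
    """
    if not 1 <= k <= read_length:
        raise ValueError("k must be between 1 and the read length")
    if k % 2 == 0:
        # Velvet works with odd k-mer sizes; rejecting even values
        # outright is a choice made for this sketch.
        raise ValueError("k must be odd")
    return nucleotide_coverage * (read_length - k + 1) / read_length


# Example: 100x nucleotide coverage, 100 nt reads, k = 31.
print(kmer_coverage(100, 100, 31))  # prints 70.0
```

The example shows why the two settings have to be balanced: for a fixed set of reads, raising k lowers the effective k-mer coverage.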
Use a right-click in the Editor tab to see if your contig can be circularized
MacVector 16 incorporates no fewer than THREE different de novo assemblers: phrap, Velvet and SPAdes. All are great assemblers, each with its own specific advantages, but none of them will generate a circular sequence from input reads. However, MacVector 16 also includes a new feature to help you with this. If you are assembling […]