General musings from the MacVector team about sequence analysis, molecular biology, the Mac in general and of course your favorite sequence analysis app for the Mac!

Automatically displaying open reading frames with MacVector’s Scan For Open Reading Frames tool

Alongside the annotation added by Scan for.. Missing Features, you may have noticed extra CDS features displayed on your sequences. These come from the Scan for.. Open Reading Frames tool, which automatically scans every DNA sequence window for open reading frames and displays the results in the Map tab.

It’s very useful, especially if you frequently receive blank sequences from colleagues, or you want to quickly show coding frames in unannotated trace files you receive from your sequencing service. However, if you are working with richly annotated sequences, you may find the extra features distracting.
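
For readers curious about what an ORF scan involves, here is a minimal Python sketch. It is not MacVector's algorithm: it assumes an ORF runs from an ATG to the next in-frame stop codon, checks all six frames, and uses a hypothetical `min_codons` cutoff. The function names are ours, chosen for illustration.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 30) -> List[Tuple[str, int, int, int]]:
    """Return (strand, frame, start, end) for each ATG...stop ORF.

    start and end are 0-based, end-exclusive offsets on the scanned strand
    (for "-" they refer to the reverse-complemented sequence). The length
    check counts codons from the ATG up to, but not including, the stop.
    """
    orfs = []
    forward = seq.upper()
    for strand, s in (("+", forward), ("-", reverse_complement(forward))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first ATG opens a candidate ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, i + 3))
                    start = None  # look for the next ATG in this frame
    return orfs
```

Calling `find_orfs(sequence, min_codons=100)` on a plasmid sequence would list only the longer candidate coding regions, which is roughly the kind of filtering you would want before displaying results on a map.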

It’s easy to turn off (and back on again). The tool is controlled in the MacVector | Preferences | Scan DNA pane, along with restriction enzyme sites and missing features. MacVector will remember this setting for all new sequences you open, until you turn it back on again.

The appearance of the ORFs can be controlled using the Options | Default Symbols menu item.

(Note: prior to MacVector 16 this tool was called Automatic ORF Annotation)

Future releases of MacVector will see more automatic annotation and visualization tools. For example, MacVector 17 will show primer binding sites from your Primer Library. So if there’s something you’d like to see, please email MacVector Support.

Posted in Tips

Pasting tabular data from MacVector into Microsoft Excel

There are a number of MacVector analyses that generate tabular text output. Examples include the Protein Analysis Toolbox List output, the Raw Data tab of the ABI chromatogram document window, the Matrix tab of the multiple sequence alignment document windows and the Coverage tab of the Bowtie contig editor. Each of these is actually composed of tab delimited columns, meaning that you can copy the entire table and paste the data into Microsoft Excel for additional analysis.

However, over the years, the behavior of Microsoft Excel has changed when it comes to dealing with copied tab-delimited text data. For example, Microsoft Excel 2008 would directly accept this data – you could copy from MacVector, switch to Excel and choose Edit | Paste and each item would be pasted into an individual cell. However, the current version of Microsoft Excel 365 (version 15.41) behaves differently – if you Edit | Paste, all of the data on each line gets pasted into the first column. To get around this, simply choose Edit | Paste Special... There is just a single option for the data on the clipboard (“Text”), but when you click OK, the data gets pasted into individual cells just as you would want.
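
If you would rather process the copied table in a script than in Excel, the same tab-delimited structure is easy to parse. The sketch below is only an illustration using Python's standard `csv` module; the column headers in the usage note are hypothetical and not taken from any particular MacVector output.

```python
import csv
import io
from typing import List


def parse_tab_delimited(text: str) -> List[List[str]]:
    """Split tab-delimited text into rows of cells, skipping empty lines."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    return [row for row in reader if row]
```

For example, `parse_tab_delimited("Position\tCoverage\n1\t12\n")` gives one list per line, with each tab-separated item in its own cell, which is the same layout Paste Special produces in the spreadsheet.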

Posted in Tips

Using MacVector’s Agarose Gel tool to design a digest to screen minipreps after a ligation.

How to design a digest to screen minipreps after a ligation.

Posted in Techniques, Tips

MacVector and macOS Mojave

Apple released macOS Mojave today (Tuesday 25th September).

Over the past few months we have performed preliminary testing of MacVector 16.0.9 on various pre-release builds of macOS Mojave, and all appears to be well.

So although we cannot state that MacVector is compatible until it is officially released, we can say that we think all is OK.

Incidentally, over the past few releases we have made a significant effort to future-proof MacVector for upcoming releases of macOS.

Please note that MacVector 16.0.9 does not support Dark Mode. However, Dark Mode support is planned for the next release, MacVector 17 (out later this year).

For older versions of MacVector you can check compatibility on our website. The results for macOS Mojave are not yet published, but it is likely that they will follow the ones for macOS High Sierra 10.13.4.

Please note that we generally recommend waiting at least a few weeks after a new macOS release before upgrading.

Posted in Uncategorized

Free MacVector Teaching licenses

Did you know that MacVector, Inc. offers FREE licenses for teaching purposes?

If you are preparing your lab for the upcoming year, or preparing the courses you need to teach, then don’t forget that MacVector makes a great teaching tool. We offer free teaching licenses for anyone with an active, up-to-date license. All we need for your teaching license is the course name and number, the start and stop dates of the course, and the number of seats needed. We also have many great resource materials that will help you or your students learn MacVector.

Send my free teaching license →
Posted in General

NOW CANCELLED MacVector are at the NIH Research Festival

NIH Research Festival Tent show September 13-14, 2018

Due to the forecast storm the Tent Show has been postponed. As soon as we know the rescheduled date we’ll update.

We’re at the NIH Research Festival this week. Please drop by our booth on Thursday or Friday, when the big, white exhibitors’ tent is open. We’re at Booth #520 in parking lot 10H.

We enjoy the tent show and look forward to meeting NIH MacVector users both new and old, and anybody who is interested in learning to use the easiest to use sequence analysis application for the Mac. We’re happy to just have a chat, but we’ll also be able to give you a preview of our next release, MacVector 17.0, which will be out soon (automatic Gibson assembly, Compare Genomes, and much more!). We’ll have some goodies too – our popular mousepads with the IUPAC Genetic Code as well as some Pop sockets for your cell phone.

The exhibit hours are:

  • Thurs. Sept 13 from 9:30 to 15:30
  • Fri. Sept 14 from 9:30 to 14:30

Don’t forget that the NIH host a picnic, with food from several restaurants sold at a discount to both researchers and exhibitors.

    Posted in Meetings

    MacVector and future releases of macOS

    The next OS release for the Mac will arrive later this year. macOS Mojave is a significant OS upgrade and includes a lot of new features, especially when compared with the differences between OS X El Capitan, macOS Sierra and macOS High Sierra.

    Our developers have been hard at work ensuring that when macOS Mojave is released, MacVector 16 and the following release (MacVector 17) will be fully compatible.

    As usual for older versions you can check compatibility on our website. The results for macOS Mojave are not yet published, but it is likely that they will follow the ones for macOS High Sierra 10.13.4.

    Recent older versions are likely to work fine. Over the past few versions we have been striving to future-proof MacVector for new versions of macOS.

    However, for older versions of MacVector there will be issues. Apple do make significant changes between releases. One issue that you will see with MacVector 13.5 and earlier versions is regular dialogs stating that this version of MacVector will not work with the next release of macOS.

    A more subtle version of this dialog started appearing with the release of macOS High Sierra 10.13.4, when Apple added warnings to all 32 bit applications.

    Here at MacVector we always strive to stay ahead of the game. When Apple recommended (many years ago) that all applications should be 64 bit, we immediately started working to move MacVector to be fully 64 bit resulting in the release of MacVector 14 in 2015.

    Please note that although our developers develop and run MacVector on Apple’s public and developer betas of macOS, we do not recommend that users run MacVector for any real workflows on beta releases, as you may encounter unexpected issues.

    Posted in Development

    RNASeq Expression Analysis with MacVector and Assembler

    If you have the Assembler module, MacVector can align millions of NGS reads from RNASeq experiments against large genomes and generate a coverage table displaying the relative expression levels of every gene in a genome. The key to this functionality is that you must have a reference genome with genes annotated as CDS or gene features – then you can use Bowtie to rapidly assemble reads from fastq files against the genome and generate a report listing the coverage for each feature. The steps to do this are:

  • Use File | New | Assembly Project to create a new project.
  • Click on the Add Ref toolbar button to add a suitable annotated reference genome.
  • Click on the Add Reads toolbar item to add your RNASeq reads.
  • Click on the Bowtie toolbar button to assemble the reads against the reference genome.
  • Double-click on the resulting reference contig item and switch to the Coverage tab.
  • You will get a display similar to this.

    NewImage

    This lists each CDS and gene feature, along with the number of reads that aligned to each feature and the Reads Per Kilobase of transcript per Million mapped reads (RPKM) and Transcripts Per Million (TPM) values that can be used to compare expression between genes and runs.
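
To make the two normalizations concrete, here is a small Python sketch of the standard RPKM and TPM formulas. It is not taken from MacVector's code, and it assumes that the total number of mapped reads equals the sum of the per-feature counts unless you pass `total_mapped` explicitly. The function and parameter names are ours.

```python
from typing import List, Optional, Sequence


def rpkm(counts: Sequence[int], lengths_bp: Sequence[int],
         total_mapped: Optional[int] = None) -> List[float]:
    """Reads per kilobase of feature per million mapped reads."""
    total = sum(counts) if total_mapped is None else total_mapped
    if total == 0:
        raise ValueError("no mapped reads")
    # count / (length in kb) / (total reads in millions)
    return [c * 1e9 / (total * length) for c, length in zip(counts, lengths_bp)]


def tpm(counts: Sequence[int], lengths_bp: Sequence[int]) -> List[float]:
    """Transcripts per million: length-normalize first, then scale to 1e6."""
    rates = [c / length for c, length in zip(counts, lengths_bp)]
    total_rate = sum(rates)
    if total_rate == 0:
        raise ValueError("no mapped reads")
    return [r * 1e6 / total_rate for r in rates]
```

Because TPM values always sum to one million within a run, they are often the more convenient choice when comparing the same gene between runs, while RPKM depends on the total read count of each run.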

    There is an RNASeq Expression Analysis Tutorial available that describes this functionality in more detail. If you are running MacVector 16.0.4 or later you will find this in the /Applications/MacVector/Documentation/ folder. Otherwise you can download the tutorial and sample dataset. Although this is a small dataset designed for rapid analysis, you can use this approach to align 50 million+ reads against the entire human transcriptome on a modest (16 GB RAM) MacBook Pro.

    Not sure if you have Assembler? Choose MacVector | About MacVector. If the screen that appears says “MacVector with Assembler, Pro Edition” then you have it. If not, you can sign up for a fully functional 21 day trial version.

    Posted in Techniques, Tips

    How to split large fastq files for more manageable assemblies

    We’ve previously discussed how important it can be to use an appropriate number of fastq reads from an NGS experiment to obtain the results you are looking for. Using too many reads can confuse assembly algorithms: with massive coverage, background errors in the reads lead to more mis-assemblies. In addition, large numbers of reads can significantly impact CPU performance, memory usage, and even disk usage. At MacVector we have coded a simple utility that will split large fastq files into smaller chunks. It’s completely free to download and should work on all versions of macOS / Mac OS X.

    Download the SplitFastqFile utility

    You run it by simply dropping fastq files onto the application and following the prompts. When complete, you’ll see the split files in a folder, with naming similar to this.

    NewImage
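
If you prefer to script the split yourself, the idea can be sketched in a few lines of Python. This is not the SplitFastqFile utility and makes no claim about how it works: it assumes plain-text fastq with exactly four lines per read, and the `_partNNN` output naming is a hypothetical choice of ours.

```python
from itertools import islice
from pathlib import Path
from typing import List


def split_fastq(path: str, reads_per_chunk: int, out_dir: str) -> List[Path]:
    """Split a 4-line-per-read fastq file into files of reads_per_chunk reads."""
    lines_per_chunk = 4 * reads_per_chunk
    src = Path(path)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    with src.open("r") as handle:
        index = 0
        while True:
            chunk = islice(handle, lines_per_chunk)
            first = next(chunk, None)
            if first is None:  # no more reads in the input
                break
            index += 1
            target = out / f"{src.stem}_part{index:03d}.fastq"
            with target.open("w") as out_handle:
                out_handle.write(first)
                out_handle.writelines(chunk)  # stream the rest of this chunk
            written.append(target)
    return written
```

For paired-end data you would run this on both files with the same `reads_per_chunk`, so that matching chunks still contain matching read pairs.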

    Posted in Techniques, Tips

    Balancing Velvet KMER and coverage

    The Velvet assembly algorithm in MacVector is blazingly fast and generates excellent assemblies. However, you do have to be careful when assembling NGS data to be sure that the parameters you submit are appropriate for the data you are assembling in order to get optimal results. By far the most important parameter is the KMER value. If you are not getting good assemblies, this is the parameter you should change. Below are the results of varying the KMER value for an NGS assembly of a circular 8,859 bp plasmid using data acquired from an Illumina HiSeq machine. In this case, the data consisted of a pair of fastq files with a read length of 75 nt. The original files each contained 1,370,000 paired end reads. The table below shows the longest contig that resulted from using varying numbers of input reads, versus varying the KMER parameter in Velvet.

    MacVector has a feature in the contig editor that simplifies circularization of contigs with overlapping direct repeats at the ends. All the contigs in black could be circularized to generate an 8,859 bp plasmid. Those in red were not full length, or could not be circularized.

    • First, note that Velvet (like most assemblers) does not like a massive over-abundance of coverage. If you submit too many reads, it confuses the algorithm and you have to be very careful with your choice of KMER to get a good assembly.

    • Second, note that the more reads you submit, the higher the KMER needs to be to generate a complete contig.

    The take-home lesson from this is that, in general, you should tune the amount of data in your NGS set to be between 100x and 1,000x coverage, as that gives the most flexibility in your choice of KMER. You should start with a KMER that is ~70% of the average length of your reads (it has to be odd, so 51 in this case), then vary the KMER to see what impact that has. This holds true for bacterial genome assemblies as well as simple plasmids like this. Next week we will discuss a tool to help you break up large NGS data files into smaller segments to facilitate this analysis.
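
The arithmetic behind these rules of thumb can be sketched in Python. This is only an illustration: the function names are ours, coverage is estimated as total read bases divided by target length, and we assume "1,370,000 paired end reads" per file means 2 x 1,370,000 reads in total.

```python
import math


def starting_kmer(read_length: int, fraction: float = 0.7) -> int:
    """Largest odd integer not exceeding fraction * read_length."""
    k = int(read_length * fraction)
    if k % 2 == 0:
        k -= 1  # the KMER must be odd
    return k


def coverage(n_reads: int, read_length: int, target_length: int) -> float:
    """Approximate fold coverage: total read bases / target length."""
    return n_reads * read_length / target_length


def reads_for_coverage(target_cov: float, read_length: int, target_length: int) -> int:
    """Number of reads needed to reach roughly target_cov fold coverage."""
    return math.ceil(target_cov * target_length / read_length)


if __name__ == "__main__":
    print(starting_kmer(75))                     # starting KMER for 75 nt reads
    print(coverage(2 * 1_370_000, 75, 8859))     # coverage of the full dataset
    print(reads_for_coverage(100, 75, 8859))     # reads for ~100x
    print(reads_for_coverage(1000, 75, 8859))    # reads for ~1,000x
```

Under those assumptions the full dataset is well over 20,000x coverage of the plasmid, far above the 100x to 1,000x range suggested above, which is why subsampling the reads helps.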

    Posted in Tips