
How to split large fastq files for more manageable assemblies

We’ve previously discussed how important it is to use an appropriate number of fastq reads from an NGS experiment to obtain the results you are looking for. Using too many reads can confuse assembly algorithms: at very high coverage, background errors in the reads accumulate and increase the number of mis-assemblies. In addition, large numbers of reads can significantly impact CPU performance, memory usage, and even disk usage. At MacVector we have coded a simple utility that splits large fastq files into smaller chunks. It’s completely free to download and should work on all versions of macOS/Mac OS X.

Download the SplitFastqFile utility

You run it by simply dropping fastq files onto the application and following the prompts. When complete, you’ll see the split files in a folder, with naming similar to this.


[Image: example folder listing showing the split fastq files]
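If you are curious what this kind of splitting involves, or want to script it yourself, here is a minimal Python sketch of the general idea. It is not the SplitFastqFile source code. It assumes an uncompressed fastq file in which every read occupies exactly four lines, and the function name `split_fastq`, the `_split` output folder, and the `_part001` style file names are all hypothetical choices made for illustration.

```python
from pathlib import Path


def split_fastq(path, reads_per_chunk, out_dir=None):
    """Split a four-line-per-read fastq file into chunks of at most
    reads_per_chunk reads and return the list of files written."""
    if reads_per_chunk < 1:
        raise ValueError("reads_per_chunk must be at least 1")

    src = Path(path)
    # Hypothetical output location: a "<name>_split" folder next to the input.
    dest = Path(out_dir) if out_dir is not None else src.parent / (src.stem + "_split")
    dest.mkdir(parents=True, exist_ok=True)

    lines_per_chunk = 4 * reads_per_chunk  # header, sequence, '+', qualities
    written = []
    out = None
    line_count = 0
    try:
        with src.open("r") as handle:
            for line in handle:
                # Start a new chunk file every lines_per_chunk lines.
                if line_count % lines_per_chunk == 0:
                    if out is not None:
                        out.close()
                    # Hypothetical naming: <name>_part001.fastq, <name>_part002.fastq, ...
                    chunk_path = dest / f"{src.stem}_part{len(written) + 1:03d}{src.suffix}"
                    out = chunk_path.open("w")
                    written.append(chunk_path)
                out.write(line)
                line_count += 1
    finally:
        if out is not None:
            out.close()

    # Assumption check: a well-formed file has a multiple of four lines.
    if line_count % 4 != 0:
        raise ValueError("fastq file does not contain a whole number of 4-line reads")
    return written
```

Calling `split_fastq("reads.fastq", 250000)` would write chunks of up to 250,000 reads each into a `reads_split` folder. The sketch streams the input line by line, so memory use stays small regardless of file size. Paired-end data would need both files split with the same `reads_per_chunk` so that mates stay in matching chunks.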

Not sure if you have Assembler? Choose MacVector | About MacVector. If the screen that appears says “MacVector with Assembler, Pro Edition”, then you have it. If not, you can sign up for a fully functional 21-day trial version.
