How to split large fastq files for more manageable assemblies

We’ve previously discussed how important it is to use an appropriate number of fastq reads from an NGS experiment to get the results you are looking for. Too many reads can confuse assembly algorithms: at very high coverage, background errors in the reads lead to more mis-assemblies. Large numbers of reads can also significantly increase CPU time, memory usage, and even disk usage. At MacVector we have written a simple utility that splits large fastq files into smaller chunks. It’s completely free to download and should work on all versions of macOS / Mac OS X.

Download the SplitFastqFile utility

To run it, simply drop your fastq files onto the application and follow the prompts. When it finishes, you’ll find the split files together in a folder.
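
For readers who prefer a scripted approach, the same idea can be sketched in a few lines of Python. This is only an illustration of splitting by read count, not how SplitFastqFile itself works. The function name `split_fastq`, the `reads_per_chunk` parameter, and the `_partNNN` output naming are all assumptions made for this example, and it assumes uncompressed fastq files with the usual four lines per read.

```python
import itertools
from pathlib import Path


def split_fastq(path, reads_per_chunk, out_dir):
    """Split a fastq file into chunks of at most reads_per_chunk reads.

    Assumes four-line records (header, sequence, '+', qualities);
    multi-line or compressed fastq files are not handled.
    Returns the list of chunk paths that were written.
    """
    src = Path(path)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    with src.open("r") as handle:
        for index in itertools.count(1):
            # Each read occupies four lines, so a chunk is 4 * reads_per_chunk lines.
            chunk_lines = itertools.islice(handle, 4 * reads_per_chunk)
            first = next(chunk_lines, None)
            if first is None:
                break  # no more reads in the input
            # Hypothetical naming scheme, chosen only for this sketch.
            chunk_path = out / f"{src.stem}_part{index:03d}.fastq"
            line_count = 1
            with chunk_path.open("w") as out_handle:
                out_handle.write(first)
                for line in chunk_lines:
                    out_handle.write(line)
                    line_count += 1
            if line_count % 4 != 0:
                raise ValueError(f"{src} ends with an incomplete fastq record")
            written.append(chunk_path)
    return written
```

Calling `split_fastq("reads.fastq", 1_000_000, "reads_split")` (file and folder names here are placeholders) would write chunks of up to one million reads each into `reads_split`. Lines are streamed straight from input to output, so memory use stays low even for very large files.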

Not sure if you have Assembler? Choose MacVector | About MacVector. If the screen that appears says “MacVector with Assembler, Pro Edition”, then you have it. If not, you can sign up for a fully functional 21-day trial version.
