Tips for finding ORFs in your sequence

We’ve talked in previous tips about annotating open reading frames as CDS features. But what if your sequence has no annotated ORF? MacVector’s ANALYZE | OPEN READING FRAMES… tool will quickly find any ORFs it contains.

If you are new to this tool, a few of its options may initially prove confusing. These options modify how ORFs are detected, and they are intended to help you find ORFs when you may have only a partial fragment of a gene. So if you’ve just run an ORF analysis and found too many CDS regions, or if you cannot find an ORF and are not confident you have the full gene, then remember this email!

  • 3’ ENDS ARE STOPS treats the end of the sequence as the end of a coding region, even if no STOP codon is present. For each reading frame, it checks whether ending at the last full codon would give an ORF that is longer than the minimum number of codons.
  • 5’ ENDS ARE STARTS does the reverse: it assumes the start of the sequence is a START codon.
  • CODONS AFTER STOPS ARE STARTS assumes that a new ORF starts immediately after every STOP codon, even if no actual START codon is present.
  • In the two screenshots below, we have a sequence with a single long CDS region (in blue). With these two options turned on, you will see three ORFs on the three forward frames (top image). With them turned off, you see a single CDS (bottom image).
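
To make the effect of these options concrete, here is a rough Python sketch of a forward-frame ORF scan with three switches that mimic them. This is not MacVector’s actual algorithm: the function name find_orfs, its parameters, the use of ATG as the only START codon, and the "at least min_codons" length test are all assumptions made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code
START_CODONS = {"ATG"}               # assumption: ATG is the only start


def find_orfs(seq, min_codons=10,
              five_prime_ends_are_starts=False,
              three_prime_ends_are_stops=False,
              codons_after_stops_are_starts=False):
    """Scan the three forward frames of seq for ORFs.

    Hypothetical helper, not MacVector's implementation. Returns a list of
    (frame, start, end) tuples: frame is 1-3, start/end are 0-based
    half-open coordinates, and end includes the stop codon when present.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        n_codons = max(0, (len(seq) - frame) // 3)
        last_full = frame + 3 * n_codons  # end of the last complete codon

        # 5' ENDS ARE STARTS: treat the first codon of the frame as a start.
        open_at = frame if five_prime_ends_are_starts else None

        for i in range(frame, last_full, 3):
            codon = seq[i:i + 3]
            if codon in STOP_CODONS:
                # Close the current ORF if it is long enough (stop excluded).
                if open_at is not None and (i - open_at) // 3 >= min_codons:
                    orfs.append((frame + 1, open_at, i + 3))
                # CODONS AFTER STOPS ARE STARTS: reopen right after the stop.
                open_at = i + 3 if codons_after_stops_are_starts else None
            elif open_at is None and codon in START_CODONS:
                open_at = i

        # 3' ENDS ARE STOPS: treat the end of the sequence as a stop.
        if open_at is not None and three_prime_ends_are_stops:
            if (last_full - open_at) // 3 >= min_codons:
                orfs.append((frame + 1, open_at, last_full))
    return orfs


# A toy sequence with one complete CDS in frame 1.
cds = "ATG" + "GCC" * 12 + "TAA"
print(find_orfs(cds))
# [(1, 0, 42)]
print(find_orfs(cds, five_prime_ends_are_starts=True,
                three_prime_ends_are_stops=True))
# [(1, 0, 42), (2, 1, 40), (3, 2, 41)]
```

With both end options switched on, the toy sequence reports an ORF in every forward frame, because frames 2 and 3 happen to contain no STOP codon. With the options off, only the real CDS is reported. The sketch scans the forward strand only; a fuller version would also scan the reverse complement.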



    Don’t forget to annotate your CDS region once you’ve found it. Many tools in MacVector show additional information about a coding region if it is annotated as a CDS region. If you simply drag the ORF from the RESULTS window, it will automatically be annotated as a CDS region, and the annotation will include the coding region’s translation.

