Jaspar, MacVector & Subsequence searches

MacVector lets you find motifs, primers, transcription factor binding sites, or any other significant region with a consensus sequence in your sequence, using a feature called subsequence searches. This function lets you keep a library of sequence patterns for either nucleic acids or proteins. Patterns can be complex, because the function uses an expressive nomenclature (similar to Prosite’s) for defining them. Each pattern can have up to three distinct segments, separated by variable inter-segment regions, and you can control the overall similarity required for a match as well as define residues that must be 100% conserved. Although you can easily create subsequence files for your own regions of interest, MacVector also ships with a number of collections of interesting sites for both proteins and genes.
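To make the idea of segments, inter-segment gaps, overall similarity and conserved residues concrete, here is a minimal Python sketch of that style of search. It is an illustration only, not MacVector's implementation or file format: the function names (`segment_hits`, `find_sites`), the parameters and the example sequence are all hypothetical, and it handles just two DNA segments written with standard IUPAC ambiguity codes.

```python
# Illustrative sketch only; names and parameters are hypothetical,
# not MacVector's algorithm or subsequence file format.

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def segment_hits(window, pattern, required=frozenset()):
    """Count matching positions; return None if a required position mismatches."""
    hits = 0
    for i, (base, code) in enumerate(zip(window, pattern)):
        if base in IUPAC[code]:
            hits += 1
        elif i in required:
            return None  # a 100%-conserved position failed
    return hits


def find_sites(seq, seg1, seg2, gap_min, gap_max, min_similarity,
               required1=frozenset(), required2=frozenset()):
    """Return (start, gap, similarity) for every two-segment match."""
    seq = seq.upper()
    total = len(seg1) + len(seg2)
    results = []
    for start in range(len(seq) - len(seg1) + 1):
        h1 = segment_hits(seq[start:start + len(seg1)], seg1, required1)
        if h1 is None:
            continue
        # Try every allowed length of the inter-segment region.
        for gap in range(gap_min, gap_max + 1):
            s2 = start + len(seg1) + gap
            if s2 + len(seg2) > len(seq):
                break
            h2 = segment_hits(seq[s2:s2 + len(seg2)], seg2, required2)
            if h2 is None:
                continue
            similarity = (h1 + h2) / total
            if similarity >= min_similarity:
                results.append((start, gap, similarity))
    return results


if __name__ == "__main__":
    # Hypothetical example: a TATA-like segment, a 2-4 base gap, then CCAAT.
    for hit in find_sites("GGTATAAACCGCCAATGG", "TATAWA", "CCAAT", 2, 4, 0.9):
        print(hit)
```

Lowering `min_similarity` tolerates more mismatches across the whole pattern, while the `required1` and `required2` sets pin individual positions that must always match, which mirrors the two controls described above.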

In our recently completed survey, one common request was promoter analysis. MacVector includes subsequence files of transcription factor binding sites for analyzing promoter regions, and we have recently added the Jaspar transcription factor database. This open database is distributed as a set of profiles, which we have converted to a subsequence library. By its nature, searching with a regular expression match is less sensitive than a profile-based search. However, MacVector also has a profile search function, so if you are looking for a particular transcription factor site you can search with a profile from the original set of Jaspar profiles (in TRANSFAC profile format), which is also supplied. This function takes one or two profiles and searches a sequence for likely binding sites of those profiles.
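The difference between a pattern match and a profile search can be sketched in a few lines of Python. The example below is a generic position weight matrix scan, not MacVector's profile search: the count matrix is made up for illustration (it is not a real Jaspar profile), and the pseudocount, uniform background and threshold are assumptions chosen only to keep the example small.

```python
import math

# Made-up counts for a 4-base site (illustrative only, not a real Jaspar profile).
# Each list holds the count of that base at positions 1..4 of the site.
COUNTS = {
    "A": [8, 0, 0, 6],
    "C": [0, 0, 10, 1],
    "G": [1, 10, 0, 1],
    "T": [1, 0, 0, 2],
}


def log_odds(counts, pseudocount=0.5, background=0.25):
    """Convert a count matrix to per-position log2 odds scores."""
    width = len(counts["A"])
    matrix = []
    for i in range(width):
        column_total = sum(counts[b][i] for b in "ACGT") + 4 * pseudocount
        matrix.append({
            b: math.log2(((counts[b][i] + pseudocount) / column_total) / background)
            for b in "ACGT"
        })
    return matrix


def scan(seq, matrix, threshold):
    """Return (start, window, score) for every window scoring >= threshold."""
    seq = seq.upper()
    width = len(matrix)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows with ambiguous bases
        score = sum(column[base] for column, base in zip(matrix, window))
        if score >= threshold:
            hits.append((start, window, round(score, 2)))
    return hits


if __name__ == "__main__":
    pwm = log_odds(COUNTS)
    for hit in scan("TTAGCATT", pwm, threshold=5.0):
        print(hit)
```

Because every base at every position contributes a graded score, a profile can rank weak and strong sites against each other, whereas a consensus pattern only reports whether a window passes or fails.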

The most recent Jaspar subsequences and profiles were generated from the JASPAR_CORE 2010, All_Species redundant set of profiles. This is described on the Jaspar website:

“The JASPAR CORE database contains a curated, non-redundant set of profiles, derived from published collections of experimentally defined transcription factor binding sites for eukaryotes. The prime difference to similar resources (TRANSFAC, etc) consist of the open data access, non-redundancy and quality.”


The Jaspar subsequence files will be freely downloadable from our downloads page.
