General musings from the MacVector team about sequence analysis, molecular biology, the Mac in general and of course your favorite sequence analysis app for the Mac!

101 things you (maybe) didn’t know about MacVector: #29 – Option-Click to Close All Windows

If you are like me, you often find yourself with many, many windows open while using MacVector. Sometimes you just want to get rid of everything and start all over again with a different project. You can always just quit MacVector and start again, but there is an easier way:

Hold down the option key while clicking on the red close button of any window. All of the currently open windows will close. If there are any windows with unsaved changes, these will stay open and you will be prompted to save each open file;

OptionSave.png

You can perhaps see better how this works by looking at the MacVector File menu. Normally, if you open that menu, it will simply have the Close option;

FileClose.png

But if you hold down the option key, the menu item changes to Close All…

FileCloseAll-1.png

This is an article in a long running series of tips to help you get the most out of MacVector. If you want to get notified every time a new tip gets published, follow us @MacVector on twitter (or check the feed for the hashtag #101MacVectorTips) or like us on Facebook.

Posted in Uncategorized

101 things you (maybe) didn’t know about MacVector: #28 – Identifying Methylation Blocked Restriction Sites

A big thanks to Jeffrey Dvorin at Boston Children’s Hospital for this great suggestion.

Most common laboratory strains of E. coli contain a number of methylase enzymes that modify DNA residues, preventing certain restriction enzymes from cutting DNA isolated from those strains. The two most relevant enzymes are the Dam methylase that methylates the A in the sequence GATC and the Dcm methylase that methylates the second C in the sequences CCAGG and CCTGG. For a more detailed description of the Dam and Dcm methylases, check out this page on the New England Biolabs website.

Some enzymes are entirely blocked by this methylation. For example MboI recognizes the sequence GATC, the same as the Dam methylase recognition sequence, but is blocked from cutting if the A is methylated. Thus MboI does not cut DNA isolated from a Dam+ E. coli strain. However, Sau3A also recognizes GATC, but is unaffected by the methylation, so cuts Dam+ E. coli DNA completely.

The situation with other enzymes can be more complicated – ClaI recognizes the sequence ATCGAT, but will not cut if either of the A residues is methylated. That means that if the site is flanked by a 5′ G or a 3′ C residue (e.g. GATCGAT or ATCGATC) then one of the A’s will be methylated and the site will not be cleaved. However, other sites (e.g. TATCGATT or AATCGAT etc) do not contain Dam methylase sites and so will be cleaved normally.

MacVector does not have any direct support for identifying cleavage sites that are blocked by the Dam or Dcm methylases. However, thanks to a trick that Jeff Dvorin suggested, there is an easy way to display them on maps. The basic idea is that you modify the restriction enzyme files you normally use for RE searching to include customized methylation site entries. Let’s look at an example using the ClaI enzyme discussed above;

DAM-ClaI.renz.png

In the screenshot above, I’ve modified the default Common Enzymes.renz restriction enzyme file to include a new site called DAM-ClaI that is almost identical to the normal ClaI site except that it contains an additional 3′ C residue. Because DNA is double-stranded and MacVector searches both strands, this site will find both GATCGAT and ATCGATC. So now, if I search a DNA sequence with both of these enzymes selected, I can easily see which sites will be blocked by Dam methylase activity. Here’s an example – it’s obviously artificial as you would never normally see so many ClaI sites in such a short piece of DNA – but you can clearly see the ClaI sites that would be blocked as they have both ClaI and DAM-ClaI sites at the same position on the DNA strand;

ClaISampleSequence.png

Note that the two DAM-ClaI sites are ...GATCGAT... and ...ATCGATC... demonstrating that the single DAM-ClaI site in the .renz file does indeed identify both types of site.
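If you want to sanity-check the logic outside MacVector, here is a minimal Python sketch of the same idea. It is not how MacVector does the search; the function names are made up for illustration, and it assumes a linear sequence given as a plain top-strand string. Because both ATCGAT and GATC are palindromic, scanning the top strand alone is enough.

```python
import re

CLAI = "ATCGAT"  # ClaI recognition sequence (palindromic)


def find_sites(seq, site):
    """Return 0-based start positions of every (possibly overlapping) match."""
    return [m.start() for m in re.finditer(f"(?={site})", seq.upper())]


def classify_clai_sites(seq):
    """Map each ClaI site start to True if an overlapping GATC would block it."""
    seq = seq.upper()
    result = {}
    for start in find_sites(seq, CLAI):
        end = start + len(CLAI)
        # A 5' G gives GATCGAT and a 3' C gives ATCGATC. Either one creates a
        # Dam site (GATC) that includes one of the A residues of the ClaI site.
        blocked_5 = start > 0 and seq[start - 1] == "G"
        blocked_3 = end < len(seq) and seq[end] == "C"
        result[start] = blocked_5 or blocked_3
    return result
```

For the made-up sequence TTGATCGATTAAATCGATCTTTATCGATT, this reports the ClaI sites starting at (0-based) positions 3 and 12 as blocked and the one at position 22 as cleavable. Sites that span the origin of a circular sequence are not handled.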

So, we have a simple way of identifying sites that will be blocked by the Dam and Dcm methylases. All you have to do is individually enter the sites listed on the New England Biolabs page into each of your restriction enzyme files and you are done! What? Think that’s too much work? Of course we have a file to help you out! If you download this file from our website, we’ve already included all the methylation-sensitive restriction enzyme sites and labelled them based on their sensitivity to either the Dam or Dcm methylase. You do need to add all of the sites to your favorite enzyme files, but that is pretty simple;

(a) Use File | Open and locate and open the Methylation-Sensitive Restriction Enzymes.renz file.

(b) Use File | Open and locate and open your favorite restriction enzyme file (e.g. Common Enzymes.renz or New England Biolabs.renz).

(c) Bring Methylation-Sensitive Restriction Enzymes.renz to the front, choose Edit | Select All, followed by Edit | Copy.

(d) Switch to your target .renz file and choose Edit | Paste. The DAM and DCM enzymes will get pasted into the file. Save the file.

Now, whenever you use that file for searches, the methylation sensitive versions will show up in the results. However, note the following caveats;

(i) If you run a search using “Selected Enzymes” (and that is the default for the automatic search that is run whenever you view the Map tab of an open DNA sequence), you must also make sure any corresponding DAM- or DCM- enzymes are checked to be sure they will appear in the results.

(ii) If you limit the results of searches based on the number of cut sites, you may not see the results you expect. For example, in the above ClaI example there were 4 ClaI sites and 2 DAM-ClaI sites. If you set the filter to only show enzymes that cut twice or less you would see only the DAM-ClaI sites. Conversely, if you set the filter to show enzymes that cut between 3 and 5 times, you would see the ClaI sites, but not the DAM-ClaI sites.
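The second caveat is easier to see with the numbers written out. This is just a sketch of the arithmetic, with a hypothetical function name, not MacVector’s actual filter:

```python
def filter_by_cut_count(cut_counts, minimum, maximum):
    """Keep only the enzymes whose number of cut sites is within the range."""
    return {
        name: count
        for name, count in cut_counts.items()
        if minimum <= count <= maximum
    }


# Cut counts from the ClaI example above.
counts = {"ClaI": 4, "DAM-ClaI": 2}

twice_or_less = filter_by_cut_count(counts, 0, 2)   # only DAM-ClaI survives
three_to_five = filter_by_cut_count(counts, 3, 5)   # only ClaI survives
```

Either way, one half of the ClaI/DAM-ClaI pair drops out of the results, so you lose the side-by-side comparison that makes the blocked sites obvious.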

We’ll likely incorporate this functionality into a future version of MacVector with a simplified user interface. But for now, this is a great way to identify those pesky methylation blocked sites and works for any version of MacVector.

Posted in 101 Tips

101 things you (maybe) didn’t know about MacVector: #27 – Documenting Gibson Assemblies

We’ve had a few MacVector users ask us recently if MacVector can be used to create constructs using the increasingly popular “Gibson Assembly” method. For those who are not familiar with this method, it was first described in 2009 by Daniel Gibson and colleagues at the J. Craig Venter Institute. They showed that you can assemble multiple overlapping fragments in a single reaction to seamlessly join synthetic and natural genes to potentially create entire pathways or even genomes. Although they were originally using DNA fragments with relatively long overlaps (~450bp) to assemble large chunks of a genome, it has subsequently been shown that the reaction can be driven with overlaps as short as 20bp. This means that you can construct vectors from up to ten or more separate fragments (typically themselves generated by PCR with customized primers) in a single reaction, with no restriction enzymes required, as long as you design the fragments appropriately with unique 20+bp overlaps at each end. For more details on the technique there is a Wikipedia article and a University of Cambridge practical guide.

The key to emulating these constructs in MacVector is to create a “fake” restriction enzyme using the sequence of each overlap. You can then manipulate and join the overlapping ends using MacVector’s Cloning Clipboard functionality. Let’s walk through a simple example so you can see how this works. Note that to get this to work as described, you MUST be using at least MacVector 12.7.4. You can download the update here.

For this example I’ll start with two cassettes – the first is for a tetracycline resistance gene;

TetCassetteMap.png

and the second is for an ampicillin resistance gene;

AmpCassetteMap.png

You can see that the two cassettes share a 30bp overlap for which I’ve created a feature labelled as “overlapB” with a blue arrow head.

The first thing I’m going to do is create new fake restriction enzyme sites representing each of the three overlaps. While these could be created in a new “.renz” MacVector Restriction Enzyme file, I’m going to add them to the /Applications/MacVector 12.7/Restriction Enzymes/Common Enzymes.renz file. This is the default enzyme file that is used to display restriction enzyme cut locations in the Map tab of any opened DNA sequence. I can open the file in MacVector and then click on the Add button to drop down the enzyme editor sheet;

REEditorEmpty.png

Next I switch back to my tet cassette sequence and select and copy the sequence corresponding to “overlapA”. Because I’d already created a feature all I have to do is click on the overlapA feature in the Map tab and then choose Edit | Copy. I can then switch back to the enzyme editor window and paste the sequence into the upper pane;

EnzymeEditorFilled.png

Here you can see that by default, the little arrows above and below the sequence are positioned at each end – this means that MacVector is treating the sequence as if it created a long 30 nucleotide 5′ overhang. While in reality this is a recombination sequence, not a restriction enzyme site, we will see later that this overhang is extremely useful as it only lets us join fragments that share the same recombination sequence. I can name the “site” and then repeat with overlapB and overlapC.

Finally I make sure that all three sites have their checkboxes selected in the list view;

SelectedOverlapSites.png

The cassettes are automatically updated to show the new overlap “cut sites”;

TetCassetteShowingOverlaps.nucl ― Map.png

I can now click on the overlapA site, hold down the shift key, select the overlapB site, click on the Digest button and the “cleaved” fragment appears on the Cloning Clipboard – here shown with the amp cassette digested in the same way;

CloningClipboardAmpTet.png

I can then click on the right hand end of the tet cassette and drag a line – as I do that the compatible end of the amp cassette is shown with a black dot – incompatible ends are shown with a grey dot;

DragOverlapCloning.png

Finally, when I let go of the mouse over the target end, the two fragments are joined together;

CloningClipboardAmpTetJoined.png

Using this principle you can build up Gibson Assemblies from as many fragments as you like, then finally circularize the built-up vector when you have finished (assuming the last two ends are matching overlaps!). In addition, as you slowly build up a collection of overlap sequences in the restriction enzyme file, you can immediately identify those overlaps in any construct simply by opening the sequence and examining the Map view, or by running a full Analyze | Restriction Enzyme search using your edited .renz file.
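The joining rule the Cloning Clipboard enforces here (two ends are only compatible if they carry the same overlap) can be sketched in a few lines of Python. This is only an illustration of the principle, not MacVector’s implementation; the function names are hypothetical, and it assumes every fragment is supplied as a top-strand string in the orientation it will have in the final construct.

```python
def join_fragments(left, right, min_overlap=20):
    """Join two fragments whose adjoining ends share an identical overlap.

    Returns the joined sequence with the overlap kept once, or None if the
    3' end of `left` does not match the 5' end of `right`.
    """
    left, right = left.upper(), right.upper()
    longest = min(len(left), len(right))
    # Try the longest candidate overlap first, down to the minimum length.
    for size in range(longest, min_overlap - 1, -1):
        if left[-size:] == right[:size]:
            return left + right[size:]
    return None


def assemble(fragments, min_overlap=20):
    """Join fragments in the order given; fail if any junction has no overlap."""
    product = fragments[0]
    for fragment in fragments[1:]:
        joined = join_fragments(product, fragment, min_overlap)
        if joined is None:
            raise ValueError("adjacent fragments share no usable overlap")
        product = joined
    return product
```

The default of 20 for `min_overlap` reflects the 20bp lower limit mentioned above. Circularizing the final product and checking that each overlap is unique across the fragments are left out of this sketch.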

Posted in 101 Tips

MacVector 12.7 Training Workshop, Weatherall Institute for Molecular Medicine

When: Tuesday, 23rd of April 2013, 14:00 until 16:00
Where: WIMM Seminar Room, WIMM

Chris Lindley of MacVector, Inc. will be giving a workshop for both novice and advanced users of MacVector, reviewing both basic and advanced functions in MacVector. In particular, he will highlight the new functionality introduced over the last two years to MacVector. The format is very informal and participants are encouraged to ask questions and help direct the workshop towards areas of the most interest.

The workshop is open to all in the institute.

Please register for the workshop by emailing Nicki Gray at the WIMM directly.

Posted in Meetings, Tutorials

MacVector 12.7 Training Workshop at Emory University

When: Monday March 25, 1:00-3:00 pm
Where: Calhoun Auditorium, Clinic B building “Tunnel Level”

Dr. Kevin Kendall of MacVector, Inc. will be giving a workshop for both novice and advanced users of MacVector, reviewing both basic and advanced functions in MacVector. In particular, he will highlight the new functionality introduced over the last 2 years to MacVector. The format is very informal and participants are encouraged to ask questions and help direct the workshop towards areas of the most interest. Refreshments will be provided.

The workshop is open to faculty, students and researchers.

Please register for the workshop by emailing Rosemary Maratta.

Posted in Meetings, Tutorials

Using Blast and Entrez in MacVector with a proxy server

Blast and Entrez connect to the NCBI server using the normal “http” ports (exactly the same as if it was a web browser). If a web browser can access the NCBI’s server then MacVector should be able to.

The address for the Blast server is:

http://www.ncbi.nlm.nih.gov:80/blast/Blast.cgi

However, if you get the following error message then it’s possible that a proxy server is installed on your network and will need to be configured within MacVector.

Screenshot 08 02 2013 12 27

MacVector does not honour the system wide proxy settings that are configured in the System Preferences of OS X. This is due to the NCBI toolkit that MacVector uses to access Entrez and Blast. The only way that this can be configured in MacVector is to place the configuration for the proxy server settings in the following file:

~/Library/Preferences/ncbi.cnf

(That’s the Library folder in the user’s HOME directory and not the system Library folder).

The configuration is as follows, where “PROXY SERVER HOSTNAME” is the hostname or IP address of the proxy server.

[CONN]
HTTP_PROXY_HOST=PROXY SERVER HOSTNAME
HTTP_PROXY_PORT=3128

Incidentally, the Library folder is hidden by default in Lion and Mountain Lion. To view it, hold down the OPTION key and then click on the GO menu in Finder.

You will need to be running MacVector 12.6 or higher for this to work.
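If you need to roll this setting out to several Macs, the file can be written from a script. Here is a minimal Python sketch; the function name and the hostname in the usage note are made up for illustration, and it assumes that any existing ncbi.cnf is a plain INI-style file that Python’s configparser can read.

```python
import configparser
from pathlib import Path


def write_ncbi_proxy_config(path, host, port=3128):
    """Write (or update) the [CONN] proxy settings in an ncbi.cnf-style file."""
    config = configparser.ConfigParser()
    config.optionxform = str  # keep the upper-case key names as written
    path = Path(path)
    if path.exists():
        # Preserve any other settings already in the file.
        config.read(path)
    if not config.has_section("CONN"):
        config.add_section("CONN")
    config.set("CONN", "HTTP_PROXY_HOST", host)
    config.set("CONN", "HTTP_PROXY_PORT", str(port))
    with path.open("w") as handle:
        # Write KEY=value without spaces, matching the layout shown above.
        config.write(handle, space_around_delimiters=False)
    return path
```

A call might look like write_ncbi_proxy_config(Path.home() / "Library/Preferences/ncbi.cnf", "proxy.example.org"), with your own proxy hostname and port substituted. Note that comments in an existing file are not preserved when it is rewritten this way.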

Posted in Tips, Tutorials

101 things you (maybe) didn’t know about MacVector: #26 – Creating Features Automatically

Following on from my recent posts on manual and semi-manual creation of features, the next approach I want to discuss is a fully automated function for creating features. How often do you get sent plain sequences that have no features annotated, even though you know your favorite gene is on there, along with a few common cloning vector features like antibiotic resistance genes, multiple cloning sites and replication origins? Or perhaps you have found a vector on a manufacturer’s web site, but it has no annotations, perhaps just a PDF document listing the start and stop locations of the interesting features. Or maybe you’ve downloaded a sequence from Entrez and although it is fully annotated, you’d prefer to see the features colored to match your favored style? For any of these problems, MacVector has an Auto-Annotation function to simplify life.

The way this works relies upon you having a folder (or hierarchy of folders) containing previously annotated sequences. You can then open the plain sequence you have been sent and scan it against the folder(s) of annotated sequences. MacVector will examine each feature it finds and see if it is present on your input sequence. When it finds a match, it not only creates a new feature, but it also copies the graphical feature appearance information so that the new feature exactly matches the one in the folder. There are settings to allow a little fuzziness in the search to allow for a few mismatches or insertions/deletions. You can even control how MacVector handles matches to features that have already been annotated so that the existing annotation is left alone, but the graphical appearance is changed to match the feature in your folder.
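To make the idea concrete, here is a rough Python sketch of this kind of scan. It is not MacVector’s algorithm: the names are hypothetical, the “folder” is reduced to a dictionary of feature names and sequences, and the fuzziness only covers mismatches (no insertions or deletions).

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def mismatches(a, b):
    """Count the positions at which two equal-length strings differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def auto_annotate(target, library, max_mismatches=0):
    """Return (name, start, end, strand) for each library feature found.

    Positions are 1-based and inclusive. Both strands of a linear target
    are scanned; only substitutions are tolerated in this sketch.
    """
    target = target.upper()
    hits = []
    for name, feature_seq in library.items():
        feature_seq = feature_seq.upper()
        queries = (("+", feature_seq), ("-", reverse_complement(feature_seq)))
        for strand, query in queries:
            width = len(query)
            for start in range(len(target) - width + 1):
                window = target[start:start + width]
                if mismatches(window, query) <= max_mismatches:
                    hits.append((name, start + 1, start + width, strand))
    return hits
```

A real implementation would also carry over the graphical appearance of each matched feature and would need to avoid reporting a palindromic feature twice (once per strand), which this sketch does not attempt.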

If you don’t already have a collection of nicely curated and annotated sequences, you can start with some of the vectors that we supply. The folder /Applications/MacVector 12.7/Common Vectors/ has formatted vectors from a number of manufacturers (Invitrogen, New England Biolabs, Promega) along with common series of vectors (pUC, pBR, pACYC, pGEM, M13, pBluescript) and a starter folder of AnnotatedFragments containing a selection of common cloning vector elements.

Let’s walk through the auto-annotation of a cloning vector from Invitrogen. For this example, I’ve chosen pBLOCK-iT 3-DEST, a vector that is not included in MacVector’s /Common Vectors/Invitrogen/ vector collection. Here’s the Map of the vector from the Invitrogen web site;

https___tools.invitrogen.com_content_sfs_vectors_pblockitdest_map.pdf.png

The actual sequence can be copied from the site, but no annotations are included. Here you can see the empty Map of the sequence once I copied it, pasted into a new DNA sequence window and saved it;

pBLOCK-iT3-DEST.nucl ― Map.png

The next step is to invoke Database | Auto-Annotate Sequence;

Sequence Auto-Annotation.png

Here I’ve chosen the Invitrogen vectors folder (/Applications/MacVector 12.7/Common Vectors/Invitrogen/) as the search folder and I’ve set the Feature Modifications to Replace only graphics for existing features. In this case, the source sequence has no existing annotations, so it’s not particularly critical to choose this. However, if you were starting with a partially annotated sequence you could select this to make sure that none of your annotations were changed. Finally, we click OK to start the search.

When the search completes, a Summary sheet gives us information on how many features were scanned and how many were added to the vector;

Sequence Auto-Annotation Results.png

The Map updates to show all of the graphical annotations;

pBLOCK-iT3-DEST.nucl ― Map-1.png

Now, in this case, I got pretty lucky and all of the features in the PDF map have been found on the vector. If one or two were missing, it would be simple to manually add them using the feature editor I described a couple of blog posts ago. By then saving the newly annotated vector into my Invitrogen folder, I would then automatically get those new features included in any future auto-annotation search.

Posted in 101 Tips

101 things you (maybe) didn’t know about MacVector: #25 – Creating Features from Analysis Results

In my last post I described how you can quickly create and annotate features onto a DNA sequence, although the post was primarily aimed at users who are new to MacVector. In this post I’ll take a look at how you can quickly and easily annotate a DNA sequence with features based on the results of a MacVector analysis. For example, we’ll see that you can run an Open Reading Frame search, then click on the graphical object representing a found ORF, and annotate that as a CDS feature.

Most of these workflows require you to right-click in a window to bring up a context-sensitive menu. If you don’t have a two button mouse, you can accomplish the same thing by holding down the ctrl key and clicking the mouse button.

Restriction Enzyme Sites: After running a restriction enzyme analysis (Analyze | Restriction Enzyme Analysis…), you can right-click on a site in the Cut Site Map to automatically create a misc_feature feature at the cut site;

CreateREFeature.png

A new feature is automatically created and displayed at the appropriate location;

NewPstI Feature.png

And can also be seen in the Features tab;

NewPstFeatureInFeatureTab.png

While misc_feature is really the best feature type you should use for indicating a restriction enzyme cut site, there may be times when you would prefer to use a different feature type. If you hold down the option key, the menu item changes from “Create misc_feature Feature” to simply “Create Feature…“;

OptionClickCreateREFeature.png

After selecting the Create Feature… menu item, the normal Feature Editor pane appears, pre-filled with the location information and a suitable description, but now offering you the opportunity to change the feature type, or any other information you would like to have associated with the feature;

FeatureEditorFromRECreateFeature.png

Subsequence Search Matches: Very similar to restriction enzyme searches, you can annotate the results of subsequence searches, by default as misc_feature features. Unlike restriction enzyme searches, which return a specific single cut site, subsequence searches return the start and stop location where the subsequence matches, so the corresponding feature that is created has both a start and a stop location, and is also strand specific. The features are created using a Hollow Arrow feature type to indicate the start, stop and strand information (e.g. when the match is to the minus strand the arrow will point in the opposite direction). Here’s an example from a subsequence search of an .nsub file containing common sequencing primers against pUC19. In this case, I used the option key approach described above to set the feature type to primer_bind rather than the default misc_feature;

pUC19 Subsequence Map.png

Open Reading Frames: If you use the Analyze | Open Reading Frames menu option to find open reading frames on a DNA sequence, you can click on an ORF result feature to create a new CDS feature;

CreateCDSFeature.png

Primers and Predicted PCR Products: From the Analyze | Primers | Primer3 analysis results Map window, you can right-click on either an individual primer (orange in the image below) or on a predicted product (purple);

Primer3CreateFeature.png

Now the great thing about all these feature creation functions is that they always generate a totally GenBank compliant annotation. MacVector really does a good job of adhering to the GenBank standard for sequences and annotations. If you ever want to save a sequence in GenBank format, select the File | Save As… menu option then choose GenBank Text Format from the Format menu. Now, you can also save sequence files in a format ready for use by the NCBI Sequin program, but maybe that’s a post for another day.

Posted in 101 Tips

Season’s greetings from the MacVector team

There is no time more fitting to say “Thank You” and to wish you a Happy Holiday Season and a New Year full of health, happiness, and great results from your research.

Make sure you put down that pipette and relax, and if you’ve not updated to MacVector 12.7, make sure you take time over the holidays to treat yourself!

Posted in General, Releases

101 things you (maybe) didn’t know about MacVector: #24 – Creating Features

This is a tip primarily aimed at new users of MacVector, but may be of interest to anyone who wants to better understand the way MacVector handles features. MacVector can create wonderfully detailed graphical maps of a sequence, showing all the points of interest, restriction sites, open reading frames etc. However, each item to be displayed (with the exception of automatically generated restriction enzyme sites) must be annotated on the sequence as a feature. If you open any of the MacVector DNA sample files (e.g. pBR322 in the /Sample Files/ folder) you can see a list of all of the features in the Features tab.

pBR322 ― Features.png

MacVector considers any annotation that has a start and/or a stop location on the sequence to be a “Feature”. Other annotations that do not have start and stop locations (e.g. keywords, authors and publications) are displayed in the Annotations tab.

There are a number of ways to create a new feature in MacVector – the simplest is to click on the Create button on the toolbar;

FeatureCreateButton.png

This opens the main Feature Editor;

FeatureEditorEmpty.png

Initially, the start and stop locations are undefined, so you need to click on the plus button to open the Location Editor;

Location Editor.png

Type in the start and stop locations, then click OK. Back in the Feature Editor, choose a suitable Feature Keyword from the drop down menu. The list that appears is always the latest GenBank approved list of keywords. If you are annotating a protein coding sequence, be sure to use the CDS keyword. If you can’t find a suitable keyword in the list, then use one of the misc_XXX features.

There are two options for adding comments to a feature. The preferred approach (to be fully GenBank compliant) is to use the Qualifiers tab and click on the plus button to add a new qualifier;

FeatureEditorQualifiers.png

This opens the Qualifier Editor that lets you select any valid qualifier for the type of Feature you are creating.

Qualifier Editor.png

You can add as many qualifiers as you like – the tet CDS in pBR322 has seven different qualifiers representing different information regarding the coding region.

Alternatively, you can enter Free Form text for the feature description;

FeatureEditorFreeForm.png

In this case, the comments actually get assigned to a generic “/note=” qualifier. When you finally click OK, a new feature of that type is created and appears in the Features table and also in the graphical Map tab.
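For anyone curious what such a feature looks like in a GenBank flat file, here is a small Python sketch that formats a single feature-table entry. It is an illustration of the layout only, not MacVector’s export code: the function name is made up, long lines are not wrapped, and every qualifier value is written as a quoted string, which is not correct for every GenBank qualifier.

```python
def format_feature(key, start, stop, strand="+", qualifiers=None):
    """Format one feature as GenBank feature-table lines."""
    location = f"{start}..{stop}"
    if strand == "-":
        location = f"complement({location})"
    # The feature key starts in column 6 and the location in column 22.
    lines = [f"     {key:<16}{location}"]
    for name, value in (qualifiers or {}).items():
        # Qualifiers are indented to column 22 and written as /name="value".
        lines.append(f'{" " * 21}/{name}="{value}"')
    return "\n".join(lines)


# Illustrative values only: a CDS with its free-form text stored as /note.
entry = format_feature("CDS", 86, 1276, qualifiers={"note": "tetracycline resistance"})
```

Printing `entry` gives a CDS line followed by an indented /note= line, which is the shape the Free Form text ends up in.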

A far more convenient way to create a feature with a specified start and end point is to start with a selection in the Editor or Map tabs. If you then choose the Create button, the Feature Editor will be pre-loaded with the selection, saving the step of opening the Location Editor;

pBR322EditorWithSelection.png
FeatureEditorPreSelection.png

MacVector always remembers the last Feature Keyword you used, again saving time if you are creating a lot of new similar features.

In the next post, I’ll discuss some of the shortcuts in MacVector for quickly creating features from the results of various analysis functions.

Posted in 101 Tips