MacVector, Uniform Type Identifiers and File Extensions

Since the very beginning, Macintosh applications have been able to associate a document with an application, so that opening the document (by double-clicking it) launches the appropriate application. Historically, the Mac OS accomplished this by storing two additional pieces of information in the directory entry for the document: type and creator. The document type is a four-character identifier, such as ‘TEXT’ or ‘NUCL’, that identifies the format of the document. The document creator is another four-character identifier that is unique to an application; for example, MacVector uses the identifier ‘MVTR’. For the most part this has worked well as a Macintosh-specific solution, but a number of issues have arisen, such as what should happen when a document is on one floppy disk and the application is on another.
A greater problem has arisen with the increase in computer networking and the integration of Macintosh computers into mixed computing environments. These days, files often come from many different computers, using differing protocols to move the data between machines, and files that originate on other platforms do not carry Macintosh type and creator codes. To solve this problem, Apple Computer developed a technology known as Uniform Type Identifiers, or UTIs for short, that helps software developers associate documents with their application even in a multi-platform environment. UTIs allow documents created on a Macintosh that might use the old type and creator, files created on a PC that might use file extensions, and even files served up by a web server with a Multipurpose Internet Mail Extensions (MIME) type to all be associated with a specific application.
MacVector 10.5 uses UTIs and adds the common extensions and MIME types used by bioinformaticians to simplify the handling of sequence files obtained from different sources. A single UTI groups all of the tags that identify one format. For example, the UTI “public.plain-text” has a type of ‘TEXT’, an extension of .txt or .text and a MIME type of “text/plain”. While there are no official assignments for sequence files, there are a number of common conventions. For example, a UTI “biosequence.genbank” might have a type of ‘NUCL’, extensions of .gb, .gen or .gbk, and a MIME type of “chemical/seq-na-genbank” or “chemical/x-genbank”. The system is still not perfect. Many sequence formats are stored as simple text files with a .txt extension, so MacVector will typically not be the default application for such files. In these cases, launch MacVector and open the file directly from the File menu. MacVector will then examine the contents of the file to determine its “real” type.
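To make the idea concrete, here is a minimal Python sketch of how one identifier can gather a type code, extensions and MIME types, with a fallback that looks inside the file. It is only an illustration: the table UTI_TAGS and the functions uti_for_tags and sniff_sequence_format are hypothetical names, not MacVector or Apple APIs. “biosequence.genbank” is the illustrative identifier from the text above, and plain text is keyed by “public.plain-text”, the identifier under which Apple's system declarations list these tags. The content check assumes only that a GenBank flat-file record begins with a LOCUS line; it is not MacVector's actual detection logic.

```python
from typing import Optional

# Hypothetical registry: each identifier groups every tag that names one format.
# The GenBank entry mirrors the illustrative example in the text.
UTI_TAGS = {
    "public.plain-text": {
        "ostypes": {"TEXT"},
        "extensions": {"txt", "text"},
        "mime_types": {"text/plain"},
    },
    "biosequence.genbank": {
        "ostypes": {"NUCL"},
        "extensions": {"gb", "gen", "gbk"},
        "mime_types": {"chemical/seq-na-genbank", "chemical/x-genbank"},
    },
}


def uti_for_tags(ostype: Optional[str] = None,
                 extension: Optional[str] = None,
                 mime_type: Optional[str] = None) -> Optional[str]:
    """Return the first identifier whose tags match, or None.

    Tags are tried in a fixed order: type code, then MIME type, then extension.
    """
    candidates = [
        ("ostypes", ostype),
        ("mime_types", mime_type.lower() if mime_type else None),
        ("extensions", extension.lower().lstrip(".") if extension else None),
    ]
    for kind, value in candidates:
        if value is None:
            continue
        for uti, tags in UTI_TAGS.items():
            if value in tags[kind]:
                return uti
    return None


def sniff_sequence_format(text: str) -> Optional[str]:
    """Assumed heuristic: inspect the first non-blank line of the file contents."""
    for line in text.splitlines():
        if not line.strip():
            continue
        # A GenBank flat-file record starts with a LOCUS line.
        if line.startswith("LOCUS"):
            return "biosequence.genbank"
        return None
    return None
```

In this sketch, a GenBank record saved as sequence.txt resolves to plain text by its extension alone, which is why a look at the contents is still needed for such files.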
If you are using MacVector on a Macintosh in an environment where many of your colleagues are using PCs or Unix machines, or are downloading sequence data from the Internet, the support for UTIs in MacVector 10.5 should make it much easier to share data with your collaborators. You can find the list of UTIs used by MacVector in the knowledge base section of our web site.
