Sequence editors for biologists on Linux
Friday, 05 October 2007 09:10

I migrated a year ago from Windows to Ubuntu "Edgy Eft" Linux, which quickly became my main operating system. That is when I ran into the problem every migrant faces at some point: how do I replace the specific software I so badly need for work?

I was well accustomed (addicted?) to Vector NTI (developed by Invitrogen), the reference software for managing and editing DNA and protein sequences in the Windows environment. Here I write about a good alternative to Vector NTI for basic needs, one that works on Linux, Windows and Mac OS. Please let me introduce CLC Free Workbench.

Vector NTI is a nice piece of software for editing biological sequences. A free version with basic functionality is even offered to students. It only runs on Windows, and it was the last reason I kept Windows in a dual-boot configuration. I hardly need to stress how inefficient and annoying it becomes to reboot for one specific program.

A first workaround was offered by VMware Server, virtualization software that can run Windows from its own partition inside Linux. This let me move from one system to the other with a simple window switch... Sounds perfect?... Well, that was just no good! VMware has to boot and run Windows on the same resources as Linux, which slows both systems down, and giving a piece of software (wonderful software, by the way, but programmed by humans nonetheless) direct write permission to a raw partition is really hard on the nerves...

After several months of struggle, I dug a little further into an alternative that had not worked in my first trials because of my inexperience with Linux: CLC Free Workbench. This review is about this nice biological sequence editor developed by CLC Bio, which has nearly all the functions I need: a nice and easy graphical interface, a simple database to manage gene and plasmid sequences, and the basic molecular biology tools (ORF finding, restriction analysis, etc.).

CLC Free Workbench 4.0.2

CLC Free Workbench is a free, basic version of CLC Combined Workbench by CLC Bio. It was developed in Java and therefore runs on Windows, Mac OS X or any Linux system on which a Java runtime is installed.

Installation

Sun Java Development Kit installation

Because CLC Free Workbench is a Java program, the first thing to do is install Sun's Java packages on your Linux distribution (described here for Kubuntu "Gutsy Gibbon") before attempting anything else. This is easily done with the following command in a console:

$ sudo apt-get install sun-java6-jdk

A quick check that the Sun Java platform is activated by default is recommended:

$ sudo update-alternatives --config java

If it is not, the command prompts you to select it from the list of installed Java alternatives.
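For a non-interactive check, something like the following sketch may help. It only assumes that a `java` launcher may or may not be on the PATH; `report_java` is a hypothetical helper name, not part of any package.

```shell
# Hypothetical helper: print the first line of the active Java's version
# banner, or a short message when no java launcher is on the PATH.
report_java() {
  if command -v java >/dev/null 2>&1; then
    # java writes its version banner to stderr, hence the redirection
    java -version 2>&1 | head -n 1
  else
    echo "java not found on PATH"
  fi
}

report_java
```

If the banner does not mention the Sun runtime, rerun the update-alternatives command above and pick the Sun entry.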

CLC Free Workbench 4.0.2 installation

CLC Free Workbench may be downloaded from CLC Bio's website. The installer comes as a single .sh file, which must be made executable and then launched as root so that it can write to the system folders:

$ chmod a+x CLCFreeWorkbench_4_0_2.sh
$ sudo ./CLCFreeWorkbench_4_0_2.sh

A wizard guides the user through easy installation steps. Accept the invitation to start CLC Free Workbench in order to accept the license (you are still root). One small trap: do not start using the software right away, as everything would be saved in root's folder with root privileges. Close it and relaunch it later as a regular user, through the installed shortcuts or with the command:

$ clcfreewb4
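To avoid falling into the root trap by accident, a small guard can be put in front of the launcher. This is only a sketch: `is_root_uid` and `run_unprivileged` are hypothetical helper names, and the only thing taken from the installation is the `clcfreewb4` command.

```shell
# Hypothetical helper: succeed only when the given numeric user ID is root's.
is_root_uid() {
  [ "$1" -eq 0 ]
}

# Hypothetical wrapper: run the given command unless the caller is root.
run_unprivileged() {
  if is_root_uid "$(id -u)"; then
    echo "Refusing to run '$1' as root; switch to a regular user first." >&2
    return 1
  fi
  "$@"
}

# Example use (not executed here):
#   run_unprivileged clcfreewb4
```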


Functions

I'll detail here CLC Free Workbench's functions for DNA sequences. It also manages RNA and protein sequences, but offers no functions for those beyond what it offers for DNA.

Layout

DNA sequences may be displayed in two modes: a sequence and a map mode.

Figure: Sequence layout
Figure: Map layout

The software has a nice look and feel and offers many display parameters to improve readability to anyone's standards. My only disappointment was that linear DNA fragments may only be viewed as circular in map mode... Strange...

Restriction analyses

The main tool for molecular biologists will certainly be restriction analysis... which CLC Free Workbench definitely does well! It comes with a preloaded and quite comprehensive database of restriction enzymes, which may be sorted by supplier, overhang type, "palindromicity" and popularity. Custom lists may be saved to improve work efficiency. Restriction sites are shown in the display pane with their names and overhangs, while the total number of sites for each enzyme is listed in the settings pane.

Figure: Restriction analysis

I guess the only preferences missing are a conditional display based on the number of sites present in the sequence under consideration, and highlighting of unique cutters in the display pane.
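To make the idea concrete, here is a rough shell sketch of such a filter. It is not how CLC Free Workbench works internally. It is a plain text search on a linear sequence, which is enough for palindromic recognition sites such as the three used here. The helper names `find_sites` and `report_cutters` and the test sequence are made up for illustration.

```shell
# Hypothetical helper: print the 1-based start of every occurrence of a
# recognition site in a linear sequence (plain text search, case-insensitive).
find_sites() {
  awk -v seq="$1" -v site="$2" 'BEGIN {
    seq = toupper(seq); site = toupper(site)
    n = length(seq); m = length(site)
    for (i = 1; i + m - 1 <= n; i++)
      if (substr(seq, i, m) == site) print i
  }'
}

# Hypothetical helper: list the enzymes that cut, with their number of
# sites, and flag the unique cutters.
report_cutters() {
  seq=$1
  # Recognition sites of three common enzymes
  for entry in EcoRI:GAATTC BamHI:GGATCC HindIII:AAGCTT; do
    name=${entry%%:*}
    site=${entry#*:}
    count=$(find_sites "$seq" "$site" | wc -l | tr -d ' ')
    case $count in
      0) ;;                          # non-cutters are not displayed
      1) echo "$name 1 unique" ;;
      *) echo "$name $count" ;;
    esac
  done
}

report_cutters "ttGAATTCaaGGATCCccGAATTCgg"
```

On this made-up sequence, EcoRI is reported with two sites, BamHI is flagged as a unique cutter, and HindIII is left out because it does not cut.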

Alignments and phylogeny

Alignments and phylogenetic trees are two basic features for molecular biologists. CLC Free Workbench's main weakness might lie here. Only simple tools are offered, with few options. The parameters for alignments are the traditional "gap opening", "gap extension" and "gap closing" costs, with two levels of accuracy. The user can choose neither a specific algorithm nor a specific scoring matrix. In conclusion, CLC Free Workbench's alignment tool and the phylogenetic tree builder that depends on it suffice for basic analyses, but might not be suitable for more careful bioinformatic analyses.

Figure: Sequence alignment
Figure: Phylogenetic tree

ORF finding

The open reading frame (ORF) finding tool is also essential. It is included in CLC Free Workbench, and its output may be displayed in both layouts and saved in the "annotations" list of each sequence. The algorithm was updated in version 4.0.2 to correct a bug in ORF finding for circular DNA sequences.
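For readers unfamiliar with what an ORF finder does, here is a deliberately naive sketch. It is not CLC's algorithm. It assumes a linear sequence, scans only the three forward reading frames, and reports every stretch running from an ATG to the next in-frame stop codon (TAA, TAG or TGA). `find_orfs` is a hypothetical helper name.

```shell
# Hypothetical helper: naive ORF scan of a linear DNA sequence.
#   $1 = sequence, $2 = minimum length in codons, stop codon excluded (default 1)
# Prints one line per ORF: frame (1-3), start and end position (1-based,
# the end position includes the stop codon). Forward strand only.
find_orfs() {
  awk -v seq="$1" -v minlen="${2:-1}" 'BEGIN {
    seq = toupper(seq); n = length(seq)
    for (frame = 0; frame < 3; frame++) {
      start = 0                       # 0 means "not inside an ORF"
      for (i = frame + 1; i + 2 <= n; i += 3) {
        codon = substr(seq, i, 3)
        if (start == 0 && codon == "ATG") {
          start = i
        } else if (start > 0 && (codon == "TAA" || codon == "TAG" || codon == "TGA")) {
          if ((i - start) / 3 >= minlen) print frame + 1, start, i + 2
          start = 0
        }
      }
    }
  }'
}

find_orfs "ccATGaaatttTAAgg"
```

A real tool also has to scan the reverse complement and, for plasmids, follow ORFs across the origin of a circular sequence, which is exactly the kind of case the 4.0.2 fix addressed.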


Conclusion

I've now been working with CLC Free Workbench for six months and I am still happy with it. Although its functions are relatively basic, I found software that is free and matches my everyday needs as a molecular biologist.

CLC Combined Workbench

As I mentioned above, CLC Free Workbench is in fact a free teaser advertising CLC Bio's software suite, CLC Combined Workbench. Although I did not test it, the Combined Workbench contains functions for BLAST searches, primer design, chromatogram viewing, 3D molecular modeling and more... It seems really nice! I would definitely have purchased it if it sold for a few hundred dollars. However, the price is still prohibitive for me: 2250 USD for an academic license (4500 USD for a commercial license). The price may be lowered to 1500 USD (3000 USD for a commercial license) by purchasing only one of the three sets of tools (for DNA, RNA and protein sequences) that make up CLC Combined Workbench... Well, anyway, I need features from at least two of them...

Geneious

To my knowledge, no comparable free GUI biological sequence manager is available for Linux at present. Note that I specify "free" because another Java-based program exists, namely Geneious (developed by Biomatters Ltd). I first thought of writing a comparison between the two programs, but the free version of Geneious does not compare to CLC Free Workbench. It is merely a demo version with a 14-day trial, which is not suitable for the basic needs of a biologist. This may come from the different marketing strategies behind the two suites: while CLC Combined Workbench is relatively expensive, its free version is well equipped. The free Geneious version, on the other hand, is only a demo, but the full software is considerably less expensive (249 USD for students, 495 USD for academics and 995 USD for commercial users).

Comments
molgyk  - CLC free workbench 4.0.3   |2007-10-30 20:01:16
A new version (4.0.3) of CLC free workbench has been released mainly to fix some bugs. The installation procedure and the interface are still the same.
misha680  - Vector NTI runs very well in wine   |2007-12-13 07:17:33
Btw, thanks to some work myself and others have done in the previous year, Vector NTI actually runs very well in Wine, and this is definitely an alternative for people using Ubuntu. Basically, it's as simple as really just double-clicking on the installer if you have Wine installed (just "sudo aptitude install wine" first will do); if you'd like a little more of the tools menu (Web ordering) you'll have to dig around for mfc42.dll on google and (i) run "wine" with no parameters to create ~/.wine, (ii) copy mfc42.dll to ~/.wine/drive_c/windows/system32, and (iii) double-click on the installer. Only real big thing not working is online help. Hope this helps.
molgyk  - Thanks for the precision   |2007-12-13 13:33:11
I must say I last tried to run Vector NTi under Wine more than a year ago. At the time, it did not work so simply, and I have not checked for news since then. It's really good to hear that it may now be run under Linux! Thanks for the post.

A tiny question though: what version of Vector NTi runs now in Wine (if not all)?
misha680  - Which version?   |2008-01-15 00:27:00
Sorry didn't mention the version number. The latest that is downloadable from Invitrogen Vector NTI Advance 10 now works very well (perfectly except no online help index/searching, only context-based) on wine. I use it almost every day.
Geneious Developer  - Geneious doesn't only include a demo   |2008-04-08 07:03:20
When you download Geneious, it runs as Geneious Pro for 14 days and then reverts to Geneious Basic. You can continue to use Geneious Basic for as long as you want, so it's not just a demo as you say.

Also, unlike CLC the free version of Geneious will occasionally (about once in 8 days I think, but not in regular intervals) give you free use of the Pro features - it's called "Geneious Day".

Also, a new version 3.6.2 of Geneious is out and contains loads of new features and improvements.
molgyk  - Geneious   |2008-04-14 22:03:52
Thanks for the comment.

My post is becoming a bit old. Still, when I tested Geneious (I should have mentioned the version), I just thought the free "demo" version was not suitable for the simple everyday needs of a molecular biologist.

That said, I'll be glad to test the new features of Geneious soon!

Last Updated on Friday, 03 July 2009 21:47