Nick Crawford Evolution and more…

22Nov/090

Blue Collar Bioinformatics

Just wanted to recommend Blue Collar Bioinformatics, a slick blog with lots of useful bioinformatics scripts.  Everything is written in Python, and the full working source is typically available via Git.

28Oct/092

F$@%ing R: Adventures with Tcltk in OSX

I've got a bunch of RNA-seq reads I need to analyze, and for the most part I've been writing my own code to do the analysis.  However, a recent paper in Bioinformatics (Wang et al. 2009) describes a new R package for identifying differentially expressed genes in RNA-seq datasets.  R is a pretty straightforward language with a built-in installation system, so I should just have to type two lines of code...

source("http://bioconductor.org/biocLite.R")
biocLite("DEGseq")

Not so quick. When I run this code, R tells me it can't find the DEGseq library. A bit more poking around on the internets and I discover that there's an alternate download site:

source("http://bioinfo.au.tsinghua.edu.cn/software/degseq/DEGseqInstall.R")

But after installing some dependencies it also spits out a bunch of errors.  I compare the errors... Hmmm... Both installs appear to be dying on the tcl/tk install, but tcltk is a default R library.  I can see it right there in "/Library/Frameworks/R.framework/Resources/library".  Two hours later, and after trying a bunch of crap, I find this helpful website:

http://cran.r-project.org/bin/macosx/tools/

A quick and dirty install of tcltk-8.5.5-x11.dmg and now "library(tcltk)" works like a charm.  No errors.
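Since I do most of my scripting in Python anyway, here is a rough sketch of how that tcltk check could be automated before kicking off a long install. It assumes an `Rscript` executable is on the PATH; the function names `parse_r_logical` and `r_has_tcltk` are made up for this example and are not part of R or DEGseq. It simply asks R to try loading tcltk and reads back the TRUE/FALSE that R prints.

```python
import shutil
import subprocess


def parse_r_logical(output):
    """Interpret the last token R printed as a logical value.

    Returns True or False, or None when the output is anything else
    (empty, an error message, NA, ...).
    """
    tokens = output.split()
    last = tokens[-1] if tokens else ""
    if last == "TRUE":
        return True
    if last == "FALSE":
        return False
    return None


def r_has_tcltk(rscript="Rscript"):
    """Ask R whether the tcltk package loads.

    Returns True/False, or None if R could not be run at all.
    Assumption: `rscript` names an Rscript executable on the PATH.
    """
    if shutil.which(rscript) is None:
        return None  # R is not installed or not on the PATH
    # require() returns FALSE (with a warning) instead of stopping
    # when the package fails to load, so cat() always prints a logical.
    r_code = 'cat(suppressWarnings(require("tcltk", quietly = TRUE)))'
    result = subprocess.run(
        [rscript, "-e", r_code],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    return parse_r_logical(result.stdout)
```

Calling `r_has_tcltk()` before running the installer would have told me in a few seconds whether the Tcl/Tk frameworks were the problem. A `None` result just means the script couldn't get an answer out of R.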

I install DEGseq with the following set of commands:

source("http://bioconductor.org/biocLite.R")
biocLite("DEGseq")

Now, a day and a half later, I can see if it's useful. Woo.

Citations:

L Wang, Z Feng, X Wang, X Wang, X Zhang. DEGseq: an R package for identifying differentially expressed genes from RNA-seq data. Bioinformatics (2009)

14May/081

PYTHON Quick Links

I write a lot of code in the Python programming language.  I just gave a very brief overview to a friend who has to learn it this summer. In the course of this lesson, it occurred to me that a lot of the bioinformatics resources that I use every day are not collected in one place.  So I've listed a few of the most useful modules/packages below:

  • Biopython
    • This package lets you interface with NCBI, parse data files (e.g. FASTA, GenBank, BLAST output, etc.), run BLAST queries, run ClustalW, etc.
  • SciPy
    • N-dimensional array manipulation
  • Matplotlib
    • Graphing.
  • Python DB API
    • Database integration
  • Google App Engine
    • Free web hosting of Python CGI scripts.  It's in beta.
  • Django
    • Python Web Application Development package. It can be used in conjunction with Google App Engine.
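To give a flavor of how a couple of these pieces fit together, here is a minimal sketch of the Python DB API in action, using the standard library's `sqlite3` module as the database driver. The tiny `read_fasta` helper, the `seqs` table, and the example sequences are all made up for illustration; in real work Biopython's own FASTA parser would normally replace the hand-rolled one.

```python
import sqlite3


def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            # Use the first word of the header as the record name.
            name, chunks = (header[0] if header else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def load_sequences(conn, records):
    """Store (name, sequence) pairs via DB API parameter substitution."""
    cur = conn.cursor()
    cur.execute(
        "CREATE TABLE IF NOT EXISTS seqs (name TEXT PRIMARY KEY, seq TEXT)"
    )
    # The ? placeholders let the driver handle quoting safely.
    cur.executemany("INSERT INTO seqs (name, seq) VALUES (?, ?)", records)
    conn.commit()


def sequence_lengths(conn):
    """Return a list of (name, length) tuples, sorted by name."""
    cur = conn.cursor()
    cur.execute("SELECT name, LENGTH(seq) FROM seqs ORDER BY name")
    return cur.fetchall()


# Made-up example data, not from any real dataset.
EXAMPLE_FASTA = """>seq1 made-up example
ACGTACGT
ACG
>seq2
TTGA
"""

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
load_sequences(conn, read_fasta(EXAMPLE_FASTA.splitlines()))
lengths = sequence_lengths(conn)
print(lengths)
```

The nice thing about the DB API is that the connect/cursor/execute/fetch pattern stays much the same when you swap SQLite for another database, although the placeholder style (`?` here) depends on the driver.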

Here are a few additional sites that I find useful:

  • Python 2.5 Quick Reference
    • Both HTML and PDF versions are available for free!
  • TextMate
    • OS X text editor. It's not free, but there is a student discount available.
  • Forklift
    • OS X FTP program. Also not free, but reasonably priced.