Installing ABySS
To install ABySS on a system running an older version of gcc, use the following commands.
>./configure --enable-maxk=96 --disable-openmp \
CPPFLAGS=-I<path to google-sparsehash install>/include \
--prefix=<ABySS install directory>/
>make AM_CXXFLAGS=-Wall
>make install
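For reference, here is a minimal sketch of the configure step with the two placeholders filled in. The variable names and the default paths under $HOME/opt are made up for illustration, so point them at your own google-sparsehash and ABySS directories. The script only prints the command, so you can look it over before running it by hand.

```shell
#!/bin/sh
# Hypothetical locations (assumptions, not from the ABySS docs): adjust both.
SPARSEHASH_DIR="${SPARSEHASH_DIR:-$HOME/opt/sparsehash}"
ABYSS_PREFIX="${ABYSS_PREFIX:-$HOME/opt/abyss}"

# Print the configure command from the steps above with the paths filled in.
# Dry run only: nothing is configured or built here.
abyss_configure_cmd() {
    printf './configure --enable-maxk=96 --disable-openmp CPPFLAGS=-I%s/include --prefix=%s\n' \
        "$SPARSEHASH_DIR" "$ABYSS_PREFIX"
}

abyss_configure_cmd
```

Paths containing spaces would need extra quoting before the printed line can be pasted into a shell.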
Word by word in Terminal
One of the annoying things about the OS X Terminal is that to move word by word through a line of text you need to type 'esc-b' and 'esc-f'. This post on macromates, the TextMate blog, explains how to reset these keys. Enjoy.
MultiMarkdown
Since I started using GitHub in a serious way back in January, I've begun writing my documentation in the Markdown format that displays so nicely on GitHub. Markdown is essentially a parsing tool and a simple text syntax that allows the easy conversion of human-readable text to HTML. It's intuitive, it took less than 5 minutes to pick up, and it saves me a ton of time not writing HTML. However, its ease of use is tempered, a bit, by a lack of features. Although it is easy to create headers, lists, and code blocks (simple HTML stuff), it doesn't include the option to create tables, formatted mathematical formulas, citations, or bibliographies. Since I'm a scientist who wants to produce documents with these sorts of features, this is annoying.
Luckily, the Markdown syntax has recently been extended, in a project called MultiMarkdown, to include many of the aforementioned features. MultiMarkdown essentially merges the Markdown syntax with LaTeX which, if you haven't heard of it, is a rather inscrutable but extremely powerful text formatting language. It's popular in the CS and physics disciplines. LaTeX produces beautiful documents, but it's easy to spend a week or more adjusting the formatting and reading the documentation trying to figure out some of the more complicated features. MultiMarkdown looks like it will handle much of the more basic LaTeX formatting, but without the headache.
Vertebrate Zoology: Bi302
Welcome to Vertebrate Zoology lab (Bi302 Lab). I'll be using this portion of my website to post notes/slides and to answer your questions. Please make use of the comments. More to come.
40 Essential Tools and Resources to Visualize Data | FlowingData
This looks incredibly useful. I really need to sit down and learn Flash and Processing.
40 Essential Tools and Resources to Visualize Data | FlowingData.
Wallace’s Insect Collection Found!
Via the New York Times
The owner wanted a sum that far exceeded Mr. Heggestad’s budget — a colossal $600. “I was just out of law school, I had no money and no business buying it,” he said. But the owner was willing to take installments of $100 a month, and into Mr. Heggestad’s possession fell an incomparable scientific treasure.
The cabinet belonged to Alfred Russel Wallace, the English naturalist who conceived the idea of evolution through natural selection independently of Charles Darwin.
Wow!
Museum to Display Historic Cabinet That Belonged to Alfred Russel Wallace - NYTimes.com.
Blue Collar Bioinformatics
Just wanted to recommend Blue Collar Bioinformatics, a slick blog with lots of useful bioinformatics scripts. Everything is written in Python and the full working source is typically available on GitHub.
Targeted Sequencing Bags a Rare Disease
This looks neato. One of the first papers to use the Agilent Tech to do targeted re-sequencing. I can't wait to get my hands on a PDF.
The impressive economy of this paper is that they targeted (using Agilent chips) less than 30Mb of the human genome, which is less than 1%. They also worked with very few samples; only about 30 cases of Miller Syndrome have been reported in the literature. While I've expressed some reservations about "exome sequencing", this paper does illustrate why it can be very cost effective and my objections (perhaps not made clear enough before) is more a worry about being too restricted to "exomes" and less about targeting.
More @ Omics! Omics!
F$@%ing R: Adventures with Tcltk in OSX
I've got a bunch of RNA-seq reads I need to analyze, and for the most part I've been writing my own code to do the analysis. However, a recent paper in Bioinformatics (Wang et al. 2009) describes a new R package for the identification of differentially expressed genes in RNA-seq datasets. R is a pretty straightforward language with a built-in installation system, so I should just have to type two lines of code...
source("http://bioconductor.org/biocLite.R")
biocLite("DEGseq")
Not so quick. When I ran this code R tells me it can't find the DEGseq library. A bit more poking around on the internets and I discover that there's an alternate download site:
source("http://bioinfo.au.tsinghua.edu.cn/software/degseq/DEGseqInstall.R")
But after installing some dependencies it also spits out a bunch of errors. I compare the errors... Hmmm... Both installs appear to be dying on the tcl/tk install, but tcltk is a default R library. I can see it right there in "/Library/Frameworks/R.framework/Resources/library". Two hours later, and after trying a bunch of crap, I find this helpful website:
http://cran.r-project.org/bin/macosx/tools/
A quick and dirty install of the tcltk-8.5.5-x11.dmg and now "library(tcltk)" works like a charm. No errors.
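If you want to check for this problem before wasting two hours, something like the following sketch should do it. It asks R, from the shell, whether Tcl/Tk support is usable. The function name r_has_tcltk is my own invention, and I'm assuming Rscript is on your PATH. The R side uses the base function capabilities("tcltk").

```shell
#!/bin/sh
# Report whether the local R can use Tcl/Tk.
# Return status: 0 = available, 1 = not available, 2 = Rscript not on PATH.
# r_has_tcltk is a hypothetical helper name, not part of R or Bioconductor.
r_has_tcltk() {
    if ! command -v Rscript >/dev/null 2>&1; then
        echo "Rscript not found on PATH" >&2
        return 2
    fi
    # capabilities("tcltk") is TRUE when R has working Tcl/Tk support.
    Rscript -e 'quit(save = "no", status = if (capabilities("tcltk")) 0 else 1)' >/dev/null 2>&1
}

if r_has_tcltk; then
    echo "tcltk is available"
else
    echo "tcltk is not available (or R is not installed)"
fi
```

If it reports that tcltk is not available, installing the Tcl/Tk package from the tools page above is the fix that worked for me.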
I install DEGseq with the following set of commands:
source("http://bioconductor.org/biocLite.R")
biocLite("DEGseq")
Now, a day and a half later, I can see if it's useful. Woo.
Citations:
L Wang, Z Feng, X Wang, X Wang, X Zhang. DEGseq: an R package for identifying differentially expressed genes from RNA-seq data. Bioinformatics (2009)
