Explanation of the Bioinformatics Code

Update: April 11, 2007: I have a screencast talking about this code.

s = s.split(/\B/).to_a.collect!{|base| @@basecomplement[base]}.to_s

This statement says:

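1. s.split(/\B/): Take the sequence and split it at every non-word boundary. \B matches any position that is not a word boundary, such as the gap between two adjacent letters, so a sequence like acgt is split into its individual bases.

2. to_a: Convert to an array. (split already returns an array, so this call is redundant but harmless.)

3. collect!{|base| @@basecomplement[base]}.to_s: Take the array, go through each base, look up its complement, and replace the base with its complement. Then turn the array back into a string and assign that string to s. (In Ruby 1.8, Array#to_s concatenates the elements; from Ruby 1.9 onward it returns the inspect form, so join is needed there instead.)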

@@basecomplement = {'a' => 't', 'c' => 'g', 't' => 'a', 'g' => 'c'}

This is just Ruby's hash literal syntax. (You are mapping a onto its complement t, and so on.)
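To make the complement step concrete, here is a minimal sketch of it as a standalone method. The constant BASE_COMPLEMENT and the method name complement are my own stand-ins for illustration (the original keeps the hash in the class variable @@basecomplement), and I use join rather than to_s so it behaves the same on current Ruby versions.

```ruby
# Hypothetical constant standing in for the @@basecomplement class variable.
BASE_COMPLEMENT = { 'a' => 't', 'c' => 'g', 't' => 'a', 'g' => 'c' }.freeze

# Hypothetical helper name; not part of the original code.
def complement(seq)
  seq.split(/\B/)                              # "acgt" -> ["a", "c", "g", "t"]
     .collect { |base| BASE_COMPLEMENT[base] } # look up each base's complement
     .join                                     # glue the bases back into a String
end

puts complement('acgt') # prints "tgca"
```

If all you need is the character-for-character swap, String#tr does the same job in one call: 'acgt'.tr('acgt', 'tgca') also gives "tgca".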


Globally substitute t with u, turning the DNA sequence into its RNA equivalent.
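The line of code for this step is not shown here, so the following is only a sketch of what a global substitution looks like in Ruby, assuming a lowercase sequence string. The method name transcribe is hypothetical.

```ruby
# Hypothetical helper name; the original line for this step is not shown.
def transcribe(seq)
  # gsub replaces every occurrence, not just the first one.
  seq.gsub('t', 'u')
end

puts transcribe('atgcgatt') # prints "augcgauu"
```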

codons = Array.new
ending = @len - (@len % 3) - 1
0.step(ending,3){|i| codons.push(@seq[i..i+2])}

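The last expression says “Start from 0, end at ending, and step by 3 each time. At each iteration, add the three-base slice @seq[i..i+2] (that is, @seq[i], @seq[i+1], and @seq[i+2]) to the codons array.” Because ending is computed from @len minus its remainder mod 3, any leftover one or two bases at the end of the sequence are ignored.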
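Here is the same loop as a small self-contained sketch you can run on its own. The method name codons_of and its seq parameter are my own for illustration; the original works on the instance variables @seq and @len.

```ruby
# Hypothetical helper name; takes the sequence as an argument instead of using @seq.
def codons_of(seq)
  codons = []
  # Index of the last base that still belongs to a complete codon.
  ending = seq.length - (seq.length % 3) - 1
  # Visit 0, 3, 6, ... and push each three-base slice.
  0.step(ending, 3) { |i| codons.push(seq[i..i + 2]) }
  codons
end

p codons_of('augcgauu') # prints ["aug", "cga"]; the trailing "uu" is dropped
```

For a sequence shorter than three bases, ending comes out negative, the loop body never runs, and you get an empty array.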

Just a couple of other comments. In Ruby, a variable prefixed with @ is an instance variable, which belongs to one particular object. An identifier prefixed with @@ is a class variable, which is shared by the class and all of its instances. The rest of the code is pretty straightforward. Drop me a comment if anything is unclear.
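As a quick illustration of the difference, here is a small sketch. The class name Sequence and its accessors are hypothetical and only there to show where @ and @@ variables live; this is not the original class.

```ruby
# Hypothetical class for illustration only.
class Sequence
  # Class variable: one copy, shared by the class and every instance.
  @@basecomplement = { 'a' => 't', 'c' => 'g', 't' => 'a', 'g' => 'c' }

  attr_reader :seq, :len

  def initialize(seq)
    # Instance variables: each object gets its own @seq and @len.
    @seq = seq
    @len = seq.length
  end

  # Class-level reader so the shared hash can be inspected from outside.
  def self.basecomplement
    @@basecomplement
  end
end

first = Sequence.new('acgt')
second = Sequence.new('ggccaa')
puts first.len                    # prints 4
puts second.len                   # prints 6
puts Sequence.basecomplement['a'] # prints t
```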

(Start Legal Stuff): The disclaimers listed in the GPL (see my previous post on Bioinformatics) still apply. (End Legal Stuff)

