塩基配列からA,G,C,Tそれぞれの割合とGC含量を求める

塩基配列からA,G,C,Tそれぞれの割合とGC含量を求めてみましょう。

塩基配列からA,G,C,Tそれぞれの割合とGC含量を求める(Perl)

塩基配列からA,G,C,Tそれぞれの割合とGC含量をPerlのプログラムで求めてみましょう。

use strict;
use warnings;

my $sequence = 'ATGCAGCCCCGGGTACTCC';
my $length = length $sequence;

my $adenine_count  = 0;
my $guanine_count  = 0;
my $cytosine_count = 0;
my $thymine_count  = 0;

while ($sequence =~ /(.)/g) {
  my $char = $1;
  if ($char eq 'A') {
    $adenine_count++;
  }
  elsif ($char eq 'G') {
    $guanine_count++;
  }
  elsif ($char eq 'C') {
    $cytosine_count++;
  }
  elsif ($char eq 'T') {
    $thymine_count++;
  }
}

my $adenine_percent  = $adenine_count  / $length * 100;
my $guanine_percent  = $guanine_count  / $length * 100;
my $cytosine_percent = $cytosine_count / $length * 100;
my $thymine_percent  = $thymine_count  / $length * 100;

my $gc_content = ($guanine_count + $cytosine_count) / $length * 100;

print "Percentage of A: $adenine_percent%\n";
print "Percentage of G: $guanine_percent%\n";
print "Percentage of C: $cytosine_percent%\n";
print "Percentage of T: $thymine_percent%\n";

print "GC content: $gc_content%\n";

出力結果。

Percentage of A: 15.7894736842105%
Percentage of G: 26.3157894736842%
Percentage of C: 42.1052631578947%
Percentage of T: 15.7894736842105%
GCcontent: 68.4210526315789%

塩基配列からA,G,C,Tそれぞれの割合を求める。GC含量を求める。(SPVM)

塩基配列からA,G,C,Tそれぞれの割合とGC含量をSPVMのプログラムで求めてみましょう。

package Sequence {
  sub run : void () {
    my $sequence = "ATGCAGCCCCGGGTACTCC";
    my $length = length $sequence;

    my $adenine_count  = 0;
    my $guanine_count  = 0;
    my $cytosine_count = 0;
    my $thymine_count  = 0;

    for (my $i = 0; $i < $length; ++$i) {
      my $char = $sequence->[$i];
      if ($char == 'A') {
        $adenine_count++;
      }
      elsif ($char == 'G') {
        $guanine_count++;
      }
      elsif ($char == 'C') {
        $cytosine_count++;
      }
      elsif ($char == 'T') {
        $thymine_count++;
      }
    }

    my $adenine_percent  = 100.0 * $adenine_count  / $length;
    my $guanine_percent  = 100.0 * $guanine_count  / $length;
    my $cytosine_percent = 100.0 * $cytosine_count / $length;
    my $thymine_percent  = 100.0 * $thymine_count  / $length;

    my $gc_content = 100.0 * ($guanine_count + $cytosine_count) / $length;

    print "Percentage of A: $adenine_percent%\n";
    print "Percentage of G: $guanine_percent%\n";
    print "Percentage of C: $cytosine_percent%\n";
    print "Percentage of T: $thymine_percent%\n";

    print "GC content: $gc_content%\n";
  }
}

出力結果。

$ perl -I. -e 'use SPVM "Sequence"; Sequence->run;'
Percentage of A: 15.7895%
Percentage of G: 26.3158%
Percentage of C: 42.1053%
Percentage of T: 15.7895%
GC content: 68.4211%

塩基配列からA,G,C,Tそれぞれの割合を求める。GC含量を求める。(PerlからSPVM)

塩基配列からA,G,C,Tそれぞれの割合とGC含量をSPVMのプログラムで求めて、Perlから呼び出してみましょう。

# Sequence.spvm
package Sequence {
  sub percentages : double[] ($sequence : string) {
    my $length = length $sequence;

    my $adenine_count  = 0;
    my $guanine_count  = 0;
    my $cytosine_count = 0;
    my $thymine_count  = 0;

    for (my $i = 0; $i < $length; ++$i) {
      my $char = $sequence->[$i];
      if ($char == 'A') {
        $adenine_count++;
      }
      elsif ($char == 'G') {
        $guanine_count++;
      }
      elsif ($char == 'C') {
        $cytosine_count++;
      }
      elsif ($char == 'T') {
        $thymine_count++;
      }
    }

    my $adenine_percent  = 100.0 * $adenine_count  / $length;
    my $guanine_percent  = 100.0 * $guanine_count  / $length;
    my $cytosine_percent = 100.0 * $cytosine_count / $length;
    my $thymine_percent  = 100.0 * $thymine_count  / $length;

    my $gc_content = 100.0 * ($guanine_count + $cytosine_count) / $length;

    return [$adenine_percent, $guanine_percent, $cytosine_percent, $thymine_percent, $gc_content];
  }
}
# sequence.pl
use strict;
use warnings;
use SPVM 'Sequence';

my $sequence = 'ATGCAGCCCCGGGTACTCC';

my $spvm_array = Sequence->percentages($sequence);
my $percentages_ref = $spvm_array->to_elems;

my $adenine_percent  = $percentages_ref->[0];
my $guanine_percent  = $percentages_ref->[1];
my $cytosine_percent = $percentages_ref->[2];
my $thymine_percent  = $percentages_ref->[3];

my $gc_content = $percentages_ref->[4];

print "Percentage of A: $adenine_percent%\n";
print "Percentage of G: $guanine_percent%\n";
print "Percentage of C: $cytosine_percent%\n";
print "Percentage of T: $thymine_percent%\n";

print "GC content: $gc_content%\n";

出力結果。

$ perl -I. sequence.pl
Percentage of A: 15.7894736842105%
Percentage of G: 26.3157894736842%
Percentage of C: 42.1052631578947%
Percentage of T: 15.7894736842105%
GC content: 68.4210526315789%
Side Bar