subsequences!

I have a long string of letters, in this case DNA. My intention is to find particular start triplets that begin, and stop triplets that end, each subsequence. The substrings between these start and stop triplets (with the start and stop triplets inclusive) are then kept in an array.

For example, my $string = "ATGAAAGTGAAAGGGAAAGGGGTGAGTGGGGGCGGGTTGGGTATTGGTTGGAAATAA";
should produce the substrings below, stored in an array:

@whatever = ("ATGAAAGTGAAAGGGAAAGGGGTGAGTGGGGGCGGGTTGGGTATTGGTTGGAAATAA",
             "GTGAAAGGGAAAGGGGTGAGTGGGGGCGGGTTGGGTATTGGTTGGAAATAA",
             "ATTGGTTGGAAATAA");

This is the relevant part of my code, my best effort so far:

my (@startsRF1, @stopsRF1, @arrayOfORFs);

# Collect the offset of every start codon in reading frame 1.
# The lookahead consumes no text, so an out-of-frame hit cannot hide an in-frame codon.
while ($seq =~ m/(?=ATG|TTG|CTG|ATT|CTA|GTG)/gi) {
    my $matchPosition = pos($seq);
    if (($matchPosition % 3) == 0) {
        push(@startsRF1, $matchPosition);
    }
}

# Collect the offset just past every in-frame stop codon.
while ($seq =~ m/(?=TAG|TAA|TGA)/gi) {
    my $matchPosition = pos($seq) + 3;
    if (($matchPosition % 3) == 0) {
        push(@stopsRF1, $matchPosition);
    }
}

my $startPosition = 0;
my $stopPosition  = 0;

# Reverse both lists so that pop() returns the smallest offset first.
@startsRF1 = reverse(@startsRF1);
@stopsRF1  = reverse(@stopsRF1);

while (scalar(@startsRF1) > 0) {
    $startPosition = pop(@startsRF1);
    # Skip starts that fall inside the ORF already reported.
    if ($startPosition < $stopPosition) {
        next;
    }

    while (scalar(@stopsRF1) > 0) {
        $stopPosition = pop(@stopsRF1);
        if ($stopPosition > $startPosition) {
            # Keep everything from the start codon through the stop codon.
            my $ORFseq = substr($seq, $startPosition, $stopPosition - $startPosition);
            push(@arrayOfORFs, $ORFseq);
            last;
        }
    }
}

use strict;

my $string = "ATGAAAGTGAAAGGGAAAGGGGTGAGTGGGGGCGGGTTGGGTATTGGTTGGAAATAA";
my @array  = ();
my @starts = qw(ATG TTG CTG ATT CTA GTG);
my @stops  = qw(TAG TAA TGA);

for my $start (@starts) {
    for my $stop (@stops) {
        while ($string =~ m/$start(.*)$stop/g) {
            push @array, $start . $1 . $stop;
        }
    }
}

print join("\n", @array);
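
The nested-loop regex above is greedy and ignores the reading frame, so it may pair a start with a stop that sits in a different frame. As a rough alternative, here is a minimal sketch that walks the sequence one codon at a time. The name find_orfs and its $frame argument are made up for illustration, and it assumes that every in-frame start codon should be paired with the first in-frame stop codon after it. Under that assumption the example string gives five substrings rather than the three listed in the question, because GTG at offset 21 and TTG at offset 36 are also in-frame starts.

```perl
use strict;
use warnings;

# Hypothetical helper: returns every substring that begins at an in-frame
# start codon and ends at the first in-frame stop codon that follows it
# (start and stop codons included).
sub find_orfs {
    my ($seq, $frame) = @_;
    my %is_start = map { $_ => 1 } qw(ATG TTG CTG ATT CTA GTG);
    my %is_stop  = map { $_ => 1 } qw(TAG TAA TGA);

    my @open;    # offsets of start codons still waiting for a stop
    my @orfs;
    for (my $i = $frame; $i + 3 <= length $seq; $i += 3) {
        my $codon = uc substr($seq, $i, 3);
        if ($is_start{$codon}) {
            push @open, $i;
        }
        elsif ($is_stop{$codon}) {
            # Close every pending start with this stop codon.
            push @orfs, map { substr($seq, $_, $i + 3 - $_) } @open;
            @open = ();
        }
    }
    return @orfs;
}

my $string = "ATGAAAGTGAAAGGGAAAGGGGTGAGTGGGGGCGGGTTGGGTATTGGTTGGAAATAA";
my @orfs   = find_orfs($string, 0);
print "$_\n" for @orfs;
```

Calling find_orfs($string, 1) or find_orfs($string, 2) would scan the other two forward reading frames. If only the outermost substring per stop codon is wanted, keeping just the first element of @open before each stop is one way to do it.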
