Finding the Greatest Common Factor in Perl

Once again from Knuth’s Art of Computer Programming, one of his first examples of in his book of an easy algorithm to implement is euclid’s method for finding the greatest common factor between two numbers. The code below was developed from the pseudo-code presented in his book:


#!/usr/bin/perl

use strict;
use warnings;

=ignore

Euclid's algorithm for finding greatest common factor of 2 numbers

=cut


my $m = 225;
my $n = 10;

my $remainder;
for ( 1..$n ) {
    $remainder = divide($m, $n);
    if ( $remainder == 0 ) {
        print $n,"\n"; # Found the GCF, break out of the loop
        last;
    }
    else {
        $m = $n;
        $n = $remainder;
    }
}

sub divide {

    my $m = shift;
    my $n = shift;

    my $remainder = $m % $n;
    return $remainder;
}

 

Finding the Greatest Common Factor in Perl

The Knuth-Morris-Pratt Algorithm in Perl and using index() to match a string pattern.

In light of my recent post using a brute-force string searching algorithm, I decided to post an implementation of the Knuth-Morris-Pratt algorithm in Perl. This implementation is essentially straight from the pseudo-code found on the wiki.


#!/usr/bin/perl

use strict;
use warnings;
use Data::Dumper;

my $string = "ABC ABCDAB ABCDABCDABDE";

my $pattern = "ABCDABD";
my $table_array = build_table($pattern);

print "found pattern at index: "
. search_string( $string, $pattern, $table_array ),"\n";


sub search_string {

    my $string      = shift;
    my $pattern     = shift;
    my $table_array = shift;

    my $m = 0; # beginning of current match
    my $i = 0; # the position of the current character in pattern sought

    my @split_string = split(//, $string);
    my @split_pattern = split(//, $pattern);

    while ( $m + $i < scalar(@split_string) ) {
         if ( $split_pattern[$i] eq $split_string[ $m + $i] ) {
             if ( $i == scalar(@split_pattern) - 1 ) {   
                 return $m;             
             }             
         $i++;
         }
         else {
             if ( @{$table_array}[$i] > -1 ) {
                $m = $m + $i - @{$table_array}[$i];
                $i = @{$table_array}[$i];
            }
            else {
                $i = 0;
                $m++;
            }
        }

    }

    return length($string);

}

sub build_table {

    my $string = shift;

    my @split_string = split (//, $string);

    my $pos = 2;
    my $cnd = 0;

    @{$table_array}[0] = -1;
    @{$table_array}[1] = 0;

    while ( $pos < scalar(@split_string) ) {  
       if ( $split_string[$pos-1] eq $split_string[$cnd] ) {   
          $cnd++; 
          @{$table_array}[$pos] = $cnd;
          $pos++;
       } 
       elsif ( $cnd > 0 ) {
            $cnd = @{$table_array}[$cnd];
       }
       else {
            @{$table_array}[$pos] = 0;
            $pos++;
       }

    }

    return $table_array;
}

It should be noted, however, that the index() function in Perl uses the Boyer-Moore algorithm under the hood- so implementing a string searching function like the following may be an easier, and faster solution. It takes a pattern and string (to search within) as arguments.


use strict;
use warnings;
use Data::Dumper;

my $matches = occurrences('CGATGGTCG',
'TCGATGGTAAATACTGTGCGATGGTCGATGGTTCGATGGTCGATGGTCGGGACGATGGTGGGCGATGGTGCGATGGTTCGATGGTACGATGGTCGATGGTACGATGGTCAGGGCGATGGTTAACGCGATGGTGGCAGTCGATGGTTGCGATGGTTCGATGGTCCCGATGGTGCGACGATGGTATTCCGATGGTTCGATGGTCGATGGTACTGCGATGGTCGATGGTACATCGATGGTATCCGATGGTCGATGGTGGCGATGGTCGATGGTCGATGGTCGATGGTGTTATCGATGGTCCGATGGTCGATGGTTAGCGATGGTTATAGGTATCCCGATGGTCGATGGTCGATGGTTACGATGGTCCGATGGTCGATGGTCTTTGTCGATGGTTCGATGGTCGATGGTAACGATGGTCGATGGTTTGTCGATGGTCGCGATGGTCGCCGATGGTGCCGATGGTGGGTCGATGGTGCTCGATGGTCGATGGTCCGCGATGGTTGCGTCGATGGTCGATGGTCGATGGTGGACTCGATGGTCACGATGGTTTCTCGATGGTGGTTCCGATGGTCGATGGTGTCGATGGTACGCAAGTACAGATAGTGCGATGGTGAGGATAGTGCGATGGTAGCGATGGTCGCGATGGTCGATGGTTACTTGCCTGCGATGGTGTGTACGATGGTCGGAACGCCCGATGGTGACGATGGTCATGCGATGGTATTCAATTCGATGGTCTCCGGCCGAAGAAAGCGATGGTCCCAAGATGATCGATGGTCGATGGTCGATGGTGTCGATGGTCCGATGGTCCGTTTCGATGGTACTTCGATGGTTTGCGATGGTATATGTCGATGGTCCAACGATGGTGGTGCGATGGTCTGCGATGGTA');

print join(' ', @{$matches});
sub occurrences {

    my( $x, $y ) = @_;

    my $pos = 0;
    my $matches = 0;
    my @locations;

    while (1) {
        $pos = index($y, $x, $pos);
        last if($pos < 0);
        $matches++;
        $pos++;
        push @locations, $pos;
    }

    return \@locations;
}
The Knuth-Morris-Pratt Algorithm in Perl and using index() to match a string pattern.

Finding the Index and Number of Occurrences of a Pattern in a String

It is a common task in programming to search for patterns in a given string. This is especially true in the field of Bio Informatics. For various reasons, the programmer may want to know the location, or index, that the pattern started at within the String. The following code below is an example of how to find the index and also the number of occurrences of a given pattern in Perl.

sub pattern_count {
    my ($text, $pattern) = @_;

    my $pattern_len = length($pattern);
    my $count = 0;
    my $pos;
    for ( my $i = 0; $i < length($text); $i++) {
        if ( $pattern eq substr($text, $i, $pattern_len ) ) {
            $count += 1;
            $pos = $i + 1;
            print "found $pattern at $pos\n";
        }
    }
    return $count;

}

my $num_occurrences = pattern_count('GATATATGCATATACTT', 'ATAT');

print "The number of times the pattern occurred is : " . $num_occurrences,"\n";


The String to search within is the first argument to the method, and the desired pattern is the second argument given.

The difficult part about using a regular expression for this task is that sometimes the desired pattern may overlap itself in occurrences. Since using the global flag for a regular expression will not pick up the overlapping occurrences, this implementation of walking the string is used.

Finding the Index and Number of Occurrences of a Pattern in a String