Modification d'algo problème de valeur null
salut a tous j'essaie de modifié l'algo de levensthein afin de prendre en compte les transpositions (damerau-levensthein)
voici l'algo de damerau-levensthein qui prend en compte les transpositions
Code:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
|
int DamerauLevenshteinDistance(char str1[1..lenStr1], char str2[1..lenStr2])
// d is a table with lenStr1+1 rows and lenStr2+1 columns
declare int d[0..lenStr1, 0..lenStr2]
// i and j are used to iterate over str1 and str2
declare int i, j, cost
for i from 0 to lenStr1
d[i, 0] := i
for j from 0 to lenStr2
d[0, j] := j
for i from 1 to lenStr1
for j from 1 to lenStr2
if str1[i] = str2[j] then cost := 0
else cost := 1
d[i, j] := minimum(
d[i-1, j ] + 1, // deletion
d[i , j-1] + 1, // insertion
d[i-1, j-1] + cost // substitution
)
if(i > 1 and j > 1 and str1[i] = str2[j-1] and str1[i-1] = str2[j]) then
d[i, j] := minimum(
d[i, j],
d[i-2, j-2] + cost // transposition
)
return d[lenStr1, lenStr2] |
la différence avec l'algo basique c'est l'ajout d'un if
Code:
1 2 3 4 5
| if(i > 1 and j > 1 and str1[i] = str2[j-1] and str1[i-1] = str2[j]) then
d[i, j] := minimum(
d[i, j],
d[i-2, j-2] + cost // transposition
) |
Voici mon code
Code:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117
|
package Math::Levensthein;
use bytes;
sub Levensthein ($$$);
sub new ($);
sub min ($);
# ---------------------------------
sub new ($)
{
my $class = shift;
return bless {
}, $class;
}
# ---------------------------------
sub Levensthein ($$$)
{
my $self = shift;
# $s1 and $s2 are the two strings
# $len1 and $len2 are their respective lengths
#
my $s1 = shift;
my $s2 = shift;
my ($len1, $len2) = (length $s1, length $s2);
# If one of the strings is empty, the distance is the length
# of the other string
#
return $len2 if ($len1 == 0);
return $len1 if ($len2 == 0);
my %mat;
# Init the distance matrix
#
# The first row to 0..$len1
# The first column to 0..$len2
# The rest to 0
#
# The first row and column are initialized so to denote distance
# from the empty string
#
for (my $i = 0; $i <= $len1; ++$i)
{
for (my $j = 0; $j <= $len2; ++$j)
{
$mat{$i}{$j} = 0;
$mat{0}{$j} = $j;
}
$mat{$i}{0} = $i;
}
# Some char-by-char processing is ahead, so prepare
# array of chars from the strings
#
my @ar1 = split(//, $s1);
my @ar2 = split(//, $s2);
for (my $i = 1; $i <= $len1; ++$i)
{
for (my $j = 1; $j <= $len2; ++$j)
{
# Set the cost to 1 iff the ith char of $s1
# equals the jth of $s2
#
# Denotes a substitution cost. When the char are equal
# there is no need to substitute, so the cost is 0
#
my $cost = ($ar1[$i-1] eq $ar2[$j-1]) ? 0 : 1;
# Cell $mat{$i}{$j} equals the minimum of:
#
# - The cell immediately above plus 1
# - The cell immediately to the left plus 1
# - The cell diagonally above and to the left plus the cost
#
# We can either insert a new char, delete a char or
# substitute an existing char (with an associated cost)
#
$mat{$i}{$j} = min([$mat{$i-1}{$j} + 1,
$mat{$i}{$j-1} + 1,
$mat{$i-1}{$j-1} + $cost]);
}
}
# Finally, the Levenshtein distance equals the rightmost bottom cell
# of the matrix
#
# Note that $mat{$x}{$y} denotes the distance between the substrings
# 1..$x and 1..$y
#
return $mat{$len1}{$len2};
}
# minimal element of a list
#
sub min ($)
{
my @list = @{$_[0]};
my $min = $list[0];
foreach my $i (@list)
{
$min = $i if ($i < $min);
}
return $min;
}
1; |
auquelj'ai ajouter la fameuse condition
comme cela
Code:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
|
# Cell $mat{$i}{$j} equals the minimum of:
#
# - The cell immediately above plus 1
# - The cell immediately to the left plus 1
# - The cell diagonally above and to the left plus the cost
#
# We can either insert a new char, delete a char or
# substitute an existing char (with an associated cost)
#
$mat{$i}{$j} = min([$mat{$i-1}{$j} + 1,
$mat{$i}{$j-1} + 1,
$mat{$i-1}{$j-1} + $cost]);
if ($i > 1 and $j > 1 and $ar1[$i] eq $ar2[$j-1] and $ar1[$i-1] eq $ar2[$j])
{
$mat{$i}{$j} = min ([$mat{$i}{$j},
$mat{$i-2}{$j-2} + $cost]);
} |
mon problème c'est que perl me renvoie des erreurs sur les conditions eq
Citation:
$ar1[$i] eq $ar2[$j-1] and $ar1[$i-1] eq $ar2[$j]
il arrive que les certaines cases des tableaux soit vide 8O
J'ai beau chercher je ne trouve pas l'erreur parce que les cases vides c'est pas posible du fait que je split
Code:
1 2
| my @ar1 = split(//, $s1);
my @ar2 = split(//, $s2); |
merci de votre aide 8-)