Forum Moderators: coopster & phranque


Fun with regex, returning string other than pattern match

         

csdude55

8:35 pm on Sep 22, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I have a list of rows in MySQL that look like:

colA | colB
^csdude | 20200114
robzil+a | 20210811
^phran(k|que)$ | 20190420
co+pster | 20200901


Now I'm running a regex to see if my variable matches any of them, and if so then I want to return colB.

I'm currently doing it in a loop, like:

# use selectall_array to return rows as @arr

foreach $key (@arr) {
    ($colA, $colB) = @$key;

    if ($str =~ /$colA/) {
        $filter = $colB;
    }
}


As my list of rows grows, though, that's a slow way to do it. It should be a lot faster to combine all the rows into one pattern and match once:

# wrap it in ( ) so I can get $1
$pattern = '(';

foreach $key (@arr) {
    ($colA, $colB) = @$key;
    $pattern .= $colA . '|';
}

# swap the trailing | for the closing )
$pattern =~ s/\|$/)/;

# this returns the matched pattern
if ($str =~ /$pattern/) { return $1; }


But how do I return the code in $colB?

I tried building a hash keyed on the colA regex, but $1 holds the text that matched rather than the regex that matched it, so the lookup fails:

# wrap it in ( ) so I can get $1
$pattern = '(';

foreach $key (@arr) {
    ($colA, $colB) = @$key;

    $match{$colA} = $colB;
    $pattern .= $colA . '|';
}

# swap the trailing | for the closing )
$pattern =~ s/\|$/)/;

# assume $str = 'csdude55';
# $1 equals "csdude" instead of "^csdude"
if ($str =~ /$pattern/) {
    return $match{$1};
}
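
For later readers, here is a minimal, self-contained sketch of why that lookup comes back empty. It assumes the four sample rows from the table above are hard-coded in @arr rather than fetched from MySQL, and the names %match, @alternatives, $captured and $found are only illustrative.

```perl
use strict;
use warnings;

# Sample rows copied from the table at the top of the thread: [colA, colB]
my @arr = (
    ['^csdude',        '20200114'],
    ['robzil+a',       '20210811'],
    ['^phran(k|que)$', '20190420'],
    ['co+pster',       '20200901'],
);

my %match;
my @alternatives;
for my $row (@arr) {
    my ($colA, $colB) = @$row;
    $match{$colA} = $colB;          # keyed by the regex source, e.g. '^csdude'
    push @alternatives, $colA;
}

# join() puts "|" only between the entries, so there is no trailing "|" to fix
my $pattern = '(' . join('|', @alternatives) . ')';

my $str = 'csdude55';
my ($captured, $found);
if ($str =~ /$pattern/) {
    $captured = $1;                 # text taken from $str, not the regex source
    $found    = exists $match{$captured} ? 'yes' : 'no';
}

print "captured: $captured\n";      # prints "captured: csdude"
print "key found: $found\n";        # prints "key found: no"
```

The capture only ever contains characters from $str, so it cannot be used to look up a key like "^csdude".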

csdude55

9:16 pm on Sep 22, 2021 (gmt 0)




Might have answered my own question... what do you think about named groups?

foreach $key (@arr) {
    ($colA, $colB) = @$key;
    $pattern .= '(?<FILTER' . $colB . '>' . $colA . ')|';
}

# just remove the trailing |
$pattern =~ s/\|$//;

if ($str =~ /$pattern/) {
    # there has to be a better way to get the first key result?
    foreach (keys %+) {
        return $_;
    }
}
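
One possible tidier version, as a sketch only: collect the named groups in an array, join() them so there is no trailing | to strip, and pull the first key out with a list assignment instead of a loop. It assumes the sample rows from the first post are hard-coded in @arr, and matched_group() is a hypothetical name, not anything from the original script.

```perl
use strict;
use warnings;

# Sample rows copied from the table in the first post: [colA, colB]
my @arr = (
    ['^csdude',        '20200114'],
    ['robzil+a',       '20210811'],
    ['^phran(k|que)$', '20190420'],
    ['co+pster',       '20200901'],
);

# One named group per row; the name carries colB behind a FILTER prefix
my @alternatives;
for my $row (@arr) {
    my ($colA, $colB) = @$row;
    push @alternatives, "(?<FILTER$colB>$colA)";
}
my $pattern = join '|', @alternatives;
my $re      = qr/$pattern/;    # compile the combined pattern once

# Hypothetical helper: returns the name of the group that matched, or nothing
sub matched_group {
    my ($str) = @_;
    return unless $str =~ $re;
    my ($name) = keys %+;      # %+ lists only the named groups that captured
    return $name;
}

print matched_group('phranque'), "\n";    # prints "FILTER20190420"
```

The returned name still carries the FILTER prefix, exactly as in the loop version above.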

csdude55

11:47 pm on Sep 22, 2021 (gmt 0)




A little uglier, but faster because I eliminate the regex to remove the trailing | and the second foreach:

foreach $key (@arr) {
    ($colA, $colB) = @$key;
    $pattern .= '(?<FILTER' . $colB . '>' . $colA . ')';
    if ($key ne $arr[-1]) { $pattern .= '|'; }
}

if ($str =~ /$pattern/) {
    ($filter = (keys %+)[0]) =~ s/^FILTER//;
    return $filter;
}


I don't actually need $filter, but I don't know how to return here without it :-/
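
For anyone on Perl 5.14 or later, one way to return without the temporary variable is the non-destructive /r modifier on s///, which returns the changed copy instead of modifying its left-hand side. This is a sketch under assumptions: the sample rows from the first post are hard-coded in @arr, and filter_for() is a hypothetical stand-in for the function in the "require"d file.

```perl
use strict;
use warnings;
use 5.014;    # s///r needs Perl 5.14 or later

# Sample rows copied from the table in the first post: [colA, colB]
my @arr = (
    ['^csdude',        '20200114'],
    ['robzil+a',       '20210811'],
    ['^phran(k|que)$', '20190420'],
    ['co+pster',       '20200901'],
);

my @alternatives;
for my $row (@arr) {
    my ($colA, $colB) = @$row;
    push @alternatives, "(?<FILTER$colB>$colA)";
}
my $pattern = join '|', @alternatives;

# Hypothetical wrapper: returns colB for the row that matched, or nothing
sub filter_for {
    my ($str) = @_;
    return unless $str =~ /$pattern/;
    # /r hands back the stripped copy, so no $filter variable is needed
    return (keys %+)[0] =~ s/^FILTER//r;
}

print filter_for('coopster'), "\n";    # prints "20200901"
```

On older Perls, substr((keys %+)[0], 6) would do the same job, since the FILTER prefix is always six characters.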

lucy24

4:43 pm on Sep 23, 2021 (gmt 0)




"I don't know how to return here without it"
Where are you returning to? Aren't you in the same place anyway? With three consecutive posted variants, something may have been lost. Is the final if(blahblah) actually inside a function that gets called elsewhere?

csdude55

5:12 pm on Sep 23, 2021 (gmt 0)




"Is the final if(blahblah) actually inside a function that gets called elsewhere?"

Correct, I have this as part of a function in a separate file that's "require"d to the main script. Sorry, I should have said that in the beginning :-/

It's not a big deal or anything; I really just pointed that out in the last post for future readers who are trying to do the same thing.