PERL hash modification - Perl Server Side CGI Scripting forum at WebmasterWorld - WebmasterWorld

Forum Moderators: coopster & phranque

PERL hash modification

mrealty

12:39 am on Aug 3, 2009 (gmt 0)

10+ Year Member

I have this code:

my %file_hash;

while (<IN1>) {
next unless $_ =~ m/^\d/;
my ($key, $value) = ($_ =~ m/(\d+)(.*)/);
$file_hash{$key} = $value;
}
for my $key (keys %file_hash) {
print OUT3 $key, $file_hash{$key}, "\n";
}

It's been working, but I need the first and second values together to be the key, instead of only the first value. I tried switching it to this, but it doesn't work as intended:

my %file_hash;

while (<IN1>) {
next unless $_ =~ m/^\d/;
my ($key, $status, $value) = ($_ =~ m/(\d+)(\t[A-Z])(.*)/);
$file_hash{$key}{$status} = $value;
}

for my $key (keys %file_hash) {
print OUT3 $key, $file_hash{$key}, "\n";
}

Here is the input file. Any ideas? Instead of overwriting duplicate entries based on the first set of numbers alone, I need to overwrite duplicates only when both the first set of numbers and the {tab} letter that follows it match (those fields are tab-delimited below). Sometimes I have a duplicate number in the first field, but if the letter in the second field is different, I want to keep it as a unique entry.

306526 W 69900 1 07/31/2009
311018 W 137900 0 07/31/2009
311018 H 142500 0 07/31/2009
314306 W 146000 0 07/31/2009
315309 W 149900 0 07/31/2009
315671 W 150000 1 07/31/2009
305671 C 155900 1 07/31/2009
310708 W 199900 1 07/31/2009
310715 W 199900 0 07/31/2009
301625 W 322500 1 07/31/2009
313984 W 554900 1 07/31/2009
308064 W 895000 1 07/31/2009
314303 H 47000 0 07/31/2009
312911 H 53900 0 07/31/2009
314303 X 69300 0 07/31/2009
309245 H 88000 1 07/31/2009
308548 H 89000 0 07/31/2009
314389 H 90000 0 07/31/2009

[edited by: phranque at 8:34 am (utc) on Aug. 4, 2009]
[edit reason] disabled graphic smileys ;) [/edit]

perl_diver

2:58 am on Aug 3, 2009 (gmt 0)

10+ Year Member

maybe:

my %file_hash;
while (<DATA>) {
next unless $_ =~ m/^\d/;
chomp; # strip the trailing newline, otherwise the print below emits a blank line after each entry
my ($key, $status, $value) = split(/\s+/,$_,3);
$file_hash{"$key $status"} = $value;
}
for my $key (keys %file_hash) {
print "$key = $file_hash{$key}", "\n";
}
__DATA__
306526 W 69900 1 07/31/2009
311018 W 137900 0 07/31/2009
311018 H 142500 0 07/31/2009
314306 W 146000 0 07/31/2009
315309 W 149900 0 07/31/2009
315671 W 150000 1 07/31/2009
305671 C 155900 1 07/31/2009
310708 W 199900 1 07/31/2009
310715 W 199900 0 07/31/2009
301625 W 322500 1 07/31/2009
313984 W 554900 1 07/31/2009
308064 W 895000 1 07/31/2009
314303 H 47000 0 07/31/2009
312911 H 53900 0 07/31/2009
314303 X 69300 0 07/31/2009
309245 H 88000 1 07/31/2009
308548 H 89000 0 07/31/2009
314389 H 90000 0 07/31/2009

mattdw

1:49 pm on Aug 3, 2009 (gmt 0)

10+ Year Member

First try changing your regular expression match line to this:

my ($key, $status, $value) = ($_ =~ m/(\d+)\s+([A-Z])(.+)/);

Moving the tab out of the regex grouping prevents it from getting into the hash key (unless that's what you want).
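
To make the difference between the two patterns concrete, here is a minimal sketch that runs both against one line. The sample line and the variable names ($line, $status_a, $status_b and so on) are made up for illustration; only the two regexes come from the posts above.

```perl
use strict;
use warnings;

# Hypothetical sample line; fields are tab-delimited as in the input file.
my $line = "311018\tH\t142500\t0\t07/31/2009\n";

# Tab inside the capture group: the status comes back with a leading tab.
my ($key_a, $status_a) = $line =~ m/(\d+)(\t[A-Z])/;

# Whitespace outside the group: only the letter is captured.
my ($key_b, $status_b, $value_b) = $line =~ m/(\d+)\s+([A-Z])(.+)/;

printf "tab inside group:  status has length %d\n", length $status_a;
printf "tab outside group: status has length %d\n", length $status_b;
```

Note that with the second pattern $value_b still starts with the tab that followed the letter, and because . does not match a newline it carries no trailing newline.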

Then I think the main problem is in your final for loop. You are only iterating through the first level of hash keys, whose values are themselves references to the next level of hash, so printing them directly shows something like HASH(0x...) rather than your data. You need to iterate through both levels. Try something like this:

for my $key (keys %file_hash) {
foreach my $status (keys %{$file_hash{$key}}) {
print $key, "\t", $status, $file_hash{$key}{$status}, "\n";
}
}
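
For reference, the two-level approach can be sketched end to end as below. This is only an illustration under a few assumptions: the @lines array stands in for the IN1 filehandle, the last row is a made-up duplicate added to show the overwrite, and the keys are sorted just to make the output stable.

```perl
use strict;
use warnings;

# A few rows from the thread's input, tab-delimited (in-memory stand-in for IN1).
my @lines = (
    "311018\tW\t137900\t0\t07/31/2009\n",
    "311018\tH\t142500\t0\t07/31/2009\n",
    "314303\tH\t47000\t0\t07/31/2009\n",
    "314303\tX\t69300\t0\t07/31/2009\n",
    "314303\tX\t70000\t1\t07/31/2009\n",   # made-up duplicate of number + letter
);

my %file_hash;
for my $line (@lines) {
    next unless $line =~ m/^\d/;
    chomp(my $row = $line);
    my ($key, $status, $value) = split /\t/, $row, 3;
    $file_hash{$key}{$status} = $value;    # a later duplicate overwrites the earlier one
}

# The first-level value is a reference to the inner hash, so walk both levels.
my @out;
for my $key (sort keys %file_hash) {
    for my $status (sort keys %{ $file_hash{$key} }) {
        push @out, join("\t", $key, $status, $file_hash{$key}{$status});
    }
}
print "$_\n" for @out;
```

Four entries survive here: both letters for 311018, both letters for 314303, and only the later of the two 314303 X rows.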

mrealty

3:01 am on Aug 4, 2009 (gmt 0)

10+ Year Member

Well, the first suggestion gave me a good push in the right direction. I changed it to what's below, did a few test runs, and it works. I don't get what you are saying, mattdw. Is it not right the way it is now? The output seems to be fine. Thanks for taking a look at this, guys.

my %file_hash;

while (<IN1>) {
next unless $_ =~ m/^\d/;
my ($key, $status, $value) = split(/\s+/,$_,3);
$file_hash{"$key $status"} = $value;
}

for my $key (keys %file_hash) {
my ($one, $two) = split(' ', $key);
print OUT3 "$one\t$two\t$file_hash{$key}";
}

[edited by: phranque at 8:33 am (utc) on Aug. 4, 2009]
[edit reason] disabled graphic smileys ;) [/edit]

mattdw

9:09 pm on Aug 4, 2009 (gmt 0)

10+ Year Member

No, you can definitely do it that way. The way I was illustrating would allow you to separate the keys in a nested hash like $file_hash{$key}{$status} instead of having to join them into one key like $file_hash{"$key $status"}. The first option might be more flexible, but isn't necessary, depending on what you are trying to accomplish.
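
One way to see the flexibility difference is to ask "which letters exist for a given number?" against both layouts. The sketch below is illustrative only: the @rows data, the %nested and %joined names, and the lookup for 311018 are all assumptions, not part of anyone's script.

```perl
use strict;
use warnings;

# Hypothetical miniature data set: [number, letter, rest of line].
my @rows = (
    [ 311018, 'W', "137900\t0\t07/31/2009" ],
    [ 311018, 'H', "142500\t0\t07/31/2009" ],
    [ 314306, 'W', "146000\t0\t07/31/2009" ],
);

my (%nested, %joined);
for my $row (@rows) {
    my ($key, $status, $value) = @$row;
    $nested{$key}{$status}   = $value;   # two-level hash
    $joined{"$key\t$status"} = $value;   # single composite key
}

# Nested: the letters recorded for one number are a direct lookup.
my @nested_statuses = sort { $a cmp $b } keys %{ $nested{311018} };

# Joined: the same question needs a scan over every key.
my @joined_statuses =
    sort { $a cmp $b }
    map  { (split /\t/, $_)[1] }
    grep { /^311018\t/ } keys %joined;

print "nested: @nested_statuses\n";
print "joined: @joined_statuses\n";
```

Both layouts give the same answer. The nested form gets there without touching unrelated entries, while the joined form is simpler if all you ever do is dedupe and print.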

perl_diver

11:06 pm on Aug 4, 2009 (gmt 0)

10+ Year Member

You can save a little work by joining the key with a tab in the first place, so it doesn't have to be split again for output:

my %file_hash;
while (<IN1>) {
next unless $_ =~ m/^\d/;
my ($key, $status, $value) = split(/\s+/,$_,3);
$file_hash{"$key\t$status"} = $value;
}
for my $key (keys %file_hash) {
print OUT3 "$key\t$file_hash{$key}";
}
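
One thing the loop above does not control is output order, since keys returns entries in no particular order. If stable output matters, a sorted variant could look like the sketch below. It is a sketch under assumptions: in-memory filehandles ($in, $out) stand in for IN1 and OUT3, the fourth input row is a made-up duplicate, and the line is chomped so the newline is added explicitly on output.

```perl
use strict;
use warnings;

# In-memory stand-ins for the IN1 and OUT3 handles (the real script
# presumably opens files on disk).
my @lines = (
    "306526\tW\t69900\t1\t07/31/2009\n",
    "311018\tW\t137900\t0\t07/31/2009\n",
    "311018\tH\t142500\t0\t07/31/2009\n",
    "306526\tW\t71000\t1\t07/31/2009\n",   # made-up duplicate; the later row wins
);
my $input  = join '', @lines;
my $output = '';

open my $in,  '<', \$input  or die "cannot open input: $!";
open my $out, '>', \$output or die "cannot open output: $!";

my %file_hash;
while (my $line = <$in>) {
    next unless $line =~ m/^\d/;
    chomp $line;
    my ($key, $status, $value) = split /\t/, $line, 3;
    $file_hash{"$key\t$status"} = $value;
}

# Sort the composite keys so repeated runs produce the same file.
for my $key (sort keys %file_hash) {
    print {$out} "$key\t$file_hash{$key}\n";
}
close $out;
close $in;

print $output;
```

Sorting the joined key as a string orders by number first and letter second here only because every number has the same width; numbers of differing lengths would need a numeric comparison on the first field.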