I need to make a separate line for each category so each row will consist of one part number and one category. This will then be output as a table in PHP/MySQL. The parts will be called from the table according to the category. It will result in a parts lookup on a web site.
Example:
part number categories
21030 3005 3006 3007
Question: What is the Perl script for me to change the above to the following:
21030 3005
21030 3006
21030 3007
I know Perl won't work on Excel files. Should I export it as CSV or Tab separated? I know nothing about Perl, unfortunately, but I do remember that it is good at the above sort of thing, and want to save myself a week or more of effort.
Would PHP be a better tool for this job?
Regards,
Steve
I do this stuff all the time - to answer one q, clients will export their data in a delimited format of some sort, I throw together a script that reads it line by line and populates the DB. Perl or PHP can do this equally (as can any other dynamic language.) I prefer to have them export the data with a specific delimiter - not tab or comma, or quote qualified - for example, delimited by the pipe character:
field1¦field2¦field3¦field4
open (FILE, "exported_file.csv");
while (chomp($line = <FILE>)) {
@fields = split (/\¦/,$line);
@cats = split(/\s+/,$fields[2]); #this is your categories
$fields[2] = $cats[0]; # replace $fields[2] with the first one
$select = qq¦insert into db (field1,field2,field2,field4) values¦;
$select .= qq¦('$fields[0]','$fields','$fields[2]','$fields[3]');¦;
$sth = $dbh->prepare("$select;");
$rv = $sth->execute or die("Cannot insert data");
$sth->finish;
}
close(FILE);
[1][edited by: phranque at 7:59 pm (utc) on Sep. 19, 2008]
[edit reason] disabled smileys ;) [/edit]
field1: field2:
partno1 > categorya
partno1 > categoryb
partno1 > categoryc
partno2 > categoryc
partno3 > categoryt
partno3 > categoryr
etc.
Thanks!
Steve
I prefer to have them export the data with a specific delimiter - not tab or comma, or quote qualified
Always wise if the content may possibly contain one of those characters. I think you're safe in this case if you're just using numbers.
Steve to write you an accurate script you really need to tell us how your input file is formatted, and how you want it storing in the database. I'm assuming here that you already have a database of course?
PartNo ¦ CatNo
21030 ¦ 3005a.#*$!x 3006g.#*$!x z3007.2345
Instead of this:
PartNo ¦ CatNo
21030 ¦ 3005a.#*$!x
21030 ¦ 3006g.#*$!x
21030 ¦ z3007.2345
Each category number will be ten digits in length, containing any mixture of letters and/or numbers and/or a period, separated by a space. Of course, in Excel, I can do a search/replace to replace the space with a pipe.
Thanks,
Steve
[perl]my $parts;
foreach( @lines_in_csv) {
my ( $part, $category) = split( /,/);
push( @{ $parts -> {$part}}, $cat);
}
foreach my $partno ( keys( %$parts)) {
my $categories = join( ",", @{ $parts -> {$partno}});
## Now you can do something with $partno and $categories
## $partno is your original part number
## $categories is now a , separated list of all your categories
}
[/perl]
You could easily use rocknbil's code to push this into a db.
right now it is in Excel format, two columns. The final table will be just two columns
ok, try this:
[perl]my $parts;
foreach( @lines_in_csv) {
my ( $part, $categories) = split( /,/);
$parts -> {$part} = split( /\s/, $categories);
}
foreach my $partno ( keys( %$parts)) {
foreach my $category ( @{ $parts -> {$partno}}) {
## Now you can do something with $partno and $category
## $partno is your original part number
## $category will change each time singularly while $partno stays the same
}
}[/perl]
Very similar, I've modified the input loop so it still creates a list of categories, and the output loop so it'll give you each one separately.
Yes, I know this could all be done in one go but I like to simplify or it all gets too brackety! :D
my $parts;
foreach( @lines_in_csv) {
my ( $part, $categories) = split( /,/);
$parts -> {$part} = split( /\s/, $categories);
}
$parts->{$part} will equal the last value from the file for each $part. It will not be an array of all the categories associated with each part. Dabrowskis code posted earlier will work if the push assignment is corrected, $cat below should be $category:
my $parts;
foreach( @lines_in_csv) {
my ( $part, $category) = split( /,/);
push( @{ $parts -> {$part}}, $cat);
}
1. Export the Excel file as "exported_file.csv" in CSV format.
2. Open an HTML page and put the PHP tags on it.
3. Between the PHP tags, put:
my $parts;
foreach( @lines_in_csv) {
my ( $part, $category) = split( /,/);
push( @{ $parts -> {$part}}, $cat);
}
foreach my $partno ( keys( %$parts)) {
foreach my $category ( @{ $parts -> {$partno}}) {
}
}
4. I notice that it makes no mention of the file to open. The above script is all I need to have two columns of figures, one per field? This would save me a ton of time!
Thanks!
Steve
*StevenHu*, this code works, I have tested it, but as a standalone, not as a CGI script. *Perl_diver* is correct, it is not PHP. It cannot run inline with html. Ideally (I think) for your application you'd want an HTML upload form, so you could upload your CSV to the script, then the script would add it to the database.
[perl]use strict;
open( CSV, "part_info.csv") ¦¦ die "Open file failed!\n";
my @lines_in_csv = <CSV>;
close( CSV);
my $parts;
foreach( @lines_in_csv) {
my ( $part, $categories) = split( /,/);
$parts -> {$part} = [ split( /\s/, $categories)];
}
foreach my $partno ( sort keys( %$parts)) {
foreach my $category ( sort @{ $parts -> {$partno}}) {
## Now you can do something with $partno and $category
## $partno is your original part number
## $category will change each time singularly while $partno stays the same
print "$partno - $category\n";
}
}[/perl]
I've added a bit at the top that reads the CSV file called part_info.csv.
I have added a sorting command, so the output is in logical order, instead of Perl's 'how I feel like it' way. Also added a print line so you can see the output.
My test CSV contained:
part1,cat1 cat2 cat3
part2,cat4
part3,cat2 cat4 cat5
And the output is:
part1 - cat1
part1 - cat2
part1 - cat3
part2 - cat4
part3 - cat2
part3 - cat4
part3 - cat5
Please tell me I got it the right way around this time!
If you run it from a command prompt you'll see that it works.
OK, I downloaded ActivePerl and installed it and ran the script. I get the following error for each line:
Bareword found where operator expected at part_info.csv line 3108, near "6005j#*$!xx" <Missing operator before j#*$!xx?
6005j#*$!xx is a valid category code. Every category code is 10 characters long (no spaces) and is composed of letters and/or digits and/or a period.
Thanks,
Steve
[edited by: StevenHu at 2:59 pm (utc) on Sep. 19, 2008]
Bareword found where operator expected at part_info.csv line 3108, near "6005j#*$!xx" <Missing operator before j#*$!xx?
The thing that worries me about that is that it says part_info.csv, that should be the name of the script. Make sure your script is called something like part_info.pl.
Secondly it says line 3108, when the script only has about 25 lines.
You need to have 2 separate files, one .pl file which contains the script above, and one .csv with your data in.
The script will work for any characters in your codes EXCEPT a comma or a space.
When I tried it again, I get the error,
C:\Documents and Settings\shusting>perl categories.pl
Unrecognized character \xA6 in column 29 at categories.pl line 3.
That character is the first pipe symbol. Strange. (I had copied and pasted the script into Windows Notepad.)
I then typed over the symbol with my own keyboard and ran the script again. It printed the result to the screen. Looking better! Unfortunately, I could see only the last part of the 3000 plus lines.
I notice that the script removed the trailing zeroes (see following). 6015.318 should be 6015.31800 Can the script keep the zeroes?
==============================
LRP39255 - 6015.318
LRP39292 - 6015.318
LRP39294 - 6015.318
LRP39310 - 6015.31201
LRP39310 - 6015.31202
LRP39321 - 6015.318
LRP39390 - 6015.318
LRP39410 - 6015.31201
LRP39410 - 6015.31202
LRP39411 - 6015.31201
==============================
Occasionally there are gaps. The gaps appear to be the spaces between, or in front of, or behind, the categories. I can delete those lines.
==============================
SP59L -
SP59L -
SP59L - 7000.#*$!#*$!
SP59M -
SP59M -
SP59M - 7000.#*$!#*$!
SP59XL -
SP59XL -
SP59XL - 7000.#*$!#*$!
SP59XXL -
SP59XXL -
SP59XXL - 7000.#*$!#*$!
==============================
I'm looking through the Perl manual and can't find the script to write the results to a CSV file rather than to the screen. Can you help? It looks like we are almost there!
Thanks,
Steve
Yeah sorry, forgot to mention. Despite the fact that this is a coders website, they still cannot yet display one of the most common symbols properly. You will always have to replace this in any copied code.
"I notice that the script removed the trailing zeroes"
Nope, Excel removed the training zeros. You'll have to faff with it and get it to output all fields as text. It thinks it's a number, which it technically is I guess. You could set it to display to 5d.p. that would work too.
"Occasionally there are gaps"
My bad. This is caused when you have more than one space separating your categories, probably where they've been deleted at some point, or just not entered consistently.
Change this line:
[perl] $parts -> {$part} = [ split( /\s+/, $categories)]; [/perl]
I've added a + character in the regex, that should sort it.
"I'm looking through the Perl manual and can't find the script to write the results to a CSV file rather than to the screen"
The manual will only show you how to use commands, not how to put them together. You're expected to work that bit out for yourself!
ok, change the output loop like this to output to a new csv...
[perl]open( NEWCSV, ">new_part_info.csv") ¦¦ die "File open failed\n";
foreach my $partno ( sort keys( %$parts)) {
foreach my $category ( sort @{ $parts -> {$partno}}) {
print "$partno - $category\n";
print NEWCSV "$partno,$category\n";
}
}
close( NEWCSV);[/perl]
That should work, but rememeber to change the filename if you need to, and to change those pipes!
You can see, apart from adding a line to open a new file, all you need to do is add another print line, but telling it to send it to the new file instead of the screen.
[edited by: Dabrowski at 4:56 pm (utc) on Sep. 19, 2008]
On Windows the file extension is not important. You can name a perl file anything you want as long as you invoke it with perl:
c:\>perl myfile.foo
This still looks like it will not work:
my $parts;
foreach( @lines_in_csv) {
my ( $part, $categories) = split( /,/);
$parts -> {$part} = [ split( /\s/, $categories)];
}
You want to use push() to populate $parts->{$part} with all the values from the file. Your code will still redefine $parts->{$part} with a new value for each line in the file and the result will be that the last value in the file will be retained.
my $parts;
foreach( @lines_in_csv) {
my ( $part, $categories) = split( /,/);
push @{$parts->{$part}},$categories;
}
Or have I missed something by not reading all the posts in the thread?
On a side note, I can't beleive that this forum still inserts those silly double-pipes into messages to replace single pipes. Why can't that be fixed? Even if the files this forum uses are pipe delimited it should not be a problem.
You can name a perl file anything you want as long as you invoke it with perl
You're right of course, but this is a CGI forum and after all, assuming he's using IIS here, it will only execute a .pl file (yes I realise we're not using this in a CGI context). Plus if he's a beginner he should learn the right way.
This still looks like it will not work
ok, I think you made the same mistake I did, and got the input/output streams mixed up.
He has this:
partno,cat1 cat2 cat3
He wants this:
partno,cat1
partno,cat2
partno,cat3
You see the first time I was mistaken, I used push as I thought each category was on a different line.
So using the second method, first I split the line into part number, and categories as a string, then split the categories string into an array.
It really does work, honest! ;)
For some reason my posts' letters are turning into garbage. For instance, j#*$!xx should be j with 5 X's next to it.
That character is the first pipe symbol. Strange. (I had copied and pasted the script into Windows Notepad.)
It printed the result to the screen. Looking better! Unfortunately, I could see only the last part of the 3000 plus lines.
...
I'm looking through the Perl manual and can't find the script to write the results to a CSV file rather than to the screen.
Microsoft Windows XP [Version 5.1.2600]
(C) Copyright 1985-2001 Microsoft Corp.
Global symbol "$parts" requires explicit package name at categories.pl line 4.
Global symbol "$parts" requires explicit package name at categories.pl line 5.
Global symbol "@lines_in_csv" requires explicit package name at categories.pl li
ne 15.
Execution of categories.pl aborted due to compilation errors.
Here is the script:
use strict;
open( NEWCSV, ">new_part_info.csv") ¦¦ die "File open failed\n";
foreach my $partno ( sort keys( %$parts)) {
foreach my $category ( sort @{ $parts -> {$partno}}) {
print "$partno - $category\n";
print NEWCSV "$partno,$category\n";
}
}
close( NEWCSV);
my $parts;
foreach( @lines_in_csv) {
my ( $part, $categories) = split( /,/);
$parts -> { chomp( $part)} = [ split( /\s+/, chomp( $categories))];
}
foreach my $partno ( sort keys( %$parts)) {
foreach my $category ( sort @{ $parts -> {$partno}}) {
## Now you can do something with $partno and $category
## $partno is your original part number
## $category will change each time singularly while $partno stays the same
print "$partno - $category\n";
}
}
Why is it acting differently now?
Thanks,
Steve
Here's the full script again, with some additional commentary...
[perl]use strict;
## This bit reads in your CSV file.
open( CSV, "part_info.csv") ¦¦ die "Open file failed!\n";
my @lines_in_csv = <CSV>;
close( CSV);
## This bit reads all your part details, and makes a list of the categories
my $parts;
foreach( @lines_in_csv) {
my ( $part, $categories) = split( /,/);
$parts -> { chomp( $part)} = [ split( /\s+/, chomp( $categories))];
}
## This bit opens a new csv file
open( NEWCSV, ">new_part_info.csv") ¦¦ die "File open failed\n";
foreach my $partno ( sort keys( %$parts)) {
foreach my $category ( sort @{ $parts -> {$partno}}) {
## Here, we have $partno and $category, one by one
## So we output once to screen, and once into new file
print "$partno - $category\n";
print NEWCSV "$partno,$category\n";
}
}
close( NEWCSV);[/perl]
Why don't you tell us how your MySQL table is arranged? This script can just as easily create your table for you too?