You are to write a Perl program that analyses text files to obtain statistics on their content. The program should operate as follows:
1) When run, the program should check if an argument has been provided. If not, the program should prompt for, and accept input of, a filename from the keyboard.
2) The filename, either passed as an argument or input from the keyboard, should be checked to ensure it is in MS-DOS format. The filename part should be no longer than 8 characters and must begin with a letter or underscore character followed by up to 7 letters, digits or underscore characters. The file extension should be optional, but if given it should be ".TXT" (upper- or lowercase).
If no extension is given, ".TXT" should be added to the end of the filename. So, for example, if "testfile" is input as the filename, this should become "testfile.TXT". If "input.txt" is entered, this should remain unchanged.
3) If the filename provided is not of the correct format, the program should display a suitable error message and end at this point.
4) The program should then check to see if the file exists using the filename provided. If the file does not exist, a suitable error message should be displayed and the program should end at this point.
5) Next, if the file exists but the file is empty, again a suitable error message should be displayed and the program should end.
6) The file should be read and checked to display crude statistics on the number of characters, words, lines, sentences and paragraphs that are within the file.
Here is the code I have done so far and it doesn't seem to work. Can anybody see why?
#usr/bin/perl
use strict;
use warnings;
if ($#ARGV == -1) #no filename provided as a command line argument.
{
print("Please enter a filename: ");
$filename = <STDIN>;
chomp($filename);
}
else #got a filename as an argument.
{
$filename = $ARGV[0];
}
#perform the specified checks
#check if filename is valid, exit if not
if ($filename !~ m^/[a-z]{1,7}\.TXT$/i)
{
die("File format not valid\n");)
}
if ($filename !~ m/\.TXT$/i)
{
$filename .= ".TXT";
}
#check if filename is actual file, exit if it is.
if (-e $filename)
{
die("File does not exist\n");
}
#check if filename is empty, exit if it is.
if (-s $filename)
{
die("File is empty\n");
}
my $i = 0;
my $p = 1;
my $words = 0;
my $chars = 0;
open(READFILE, "<$data1.txt") or die "Can't open file '$filename: $!";
while (<READFILE>) {
chomp; #removes the input record Separator
$i = $.; #"$". is the input record line numbers, $i++ will also work
$p++ if (m/^$/); #count paragraphs
split (/\s+/); #split sentences into "words"
$words++ #count all characters except spaces and add to $chars
$chars += tr/ //c; #tr/ //c replaces everything in the string with itself, except spaces, and returns the number of such characters replaced
}
#display results
print "There are $i lines in $data1\n";
print "There are $p Paragraphs in $data1\n";
print "There are $words in $data1\n";
print "There are $chars in $data1\n";
close(READFILE);
[edited by: phranque at 8:21 pm (utc) on June 25, 2008]
It looks like you should have gotten some type of output from that script...
if ($filename !~ m^/[a-z]{1,7}\.TXT$/i)
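The ^ is the "starts with" anchor, so it needs to be on the other side of the /, inside the pattern. Written as m^ it makes Perl treat ^ as the pattern delimiter instead.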
if ($filename !~ /^[a-z]{1,7}\.TXT$/i)
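Note that even with the ^ moved, this pattern is stricter than the assignment: it insists on the extension, allows only letters, and stops at 7 characters. A rough sketch of a pattern closer to the spec follows. The sub name normalize_filename is made up for illustration and is not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the filename (with ".TXT" appended if it had
# no extension), or undef if the name does not fit the format in the assignment.
sub normalize_filename {
    my ($name) = @_;
    # first character: letter or underscore; then up to 7 letters, digits
    # or underscores; then an optional ".txt" in either case (/i)
    return unless $name =~ /\A([a-z_][a-z0-9_]{0,7})(\.txt)?\z/i;
    return defined $2 ? $name : "$name.TXT";
}

for my $try ("testfile", "input.txt", "9lives.txt", "waytoolongname") {
    my $result = normalize_filename($try);
    print "$try => ", (defined $result ? $result : "not valid"), "\n";
}
```

Doing the check and the ".TXT" append in one place also avoids rejecting a name like "testfile" before the extension has been added.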
Another,
open(READFILE, "<$data1.txt") or die "Can't open file '$filename: $!";
Huh? Ya' wanna run that by me again, what is "$data1.txt" and what does it have to do with $filename? :-)
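Since $data1 is never declared or set, use strict will refuse to compile that line. Presumably the intent was to open $filename. A minimal sketch, assuming that, using three-argument open and a lexical filehandle. The file sample.txt is a throwaway created only so the example runs on its own.

```perl
use strict;
use warnings;

my $filename = "sample.txt";    # stand-in for the validated filename

# create a small file so the example is self-contained
open(my $out, '>', $filename) or die "Can't create '$filename': $!";
print $out "one two\nthree\n";
close($out);

# three-argument open with a lexical filehandle, using $filename itself
open(my $in, '<', $filename) or die "Can't open file '$filename': $!";
my $lines = 0;
$lines++ while <$in>;           # read to end of file, counting lines
close($in);
unlink $filename;               # remove the throwaway file

print "Read $lines lines from $filename\n";
```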
Syntax error here also:
die("File format not valid\n");)
Should be:
die("File format not valid\n");
and here
$words++
Should be:
$words++;
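Even with the semicolon, $words++ only adds one per line, and the list returned by split is thrown away. One way to count words and non-space characters, as a rough sketch (the sample lines are invented; in the real script they would come from the file):

```perl
use strict;
use warnings;

my @sample = ("The quick brown fox.\n", "\n", "  Jumps over!\n");
my ($words, $chars) = (0, 0);

for my $line (@sample) {
    chomp(my $text = $line);
    # split ' ' skips leading whitespace, so an indented line
    # does not produce an empty first field
    my @fields = split ' ', $text;
    $words += @fields;              # array in numeric context is its element count
    $chars += ($text =~ tr/ //c);   # counts every character that is not a space
}

print "There are $words words\n";
print "There are $chars non-space characters\n";
```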
With those fixes I got it to at least run, but there are other things going on. This should get you started at least.
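Among the other things: $filename is never declared with my, which use strict rejects, and both file tests are the wrong way round. -e is true when the file exists and -s is true when it has a non-zero size, so the script dies on exactly the files it should accept. A sketch of the two tests done the other way round follows. The helper check_file and the demo file names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns an error message, or undef if the file is usable.
sub check_file {
    my ($filename) = @_;
    return "File does not exist" unless -e $filename;   # -e: true if the file exists
    return "File is empty" if -z $filename;             # -z: true if the size is zero
    return;
}

# create one empty and one non-empty file so the example is self-contained
my ($empty, $full) = ("empty_demo.txt", "full_demo.txt");
open(my $fh, '>', $empty) or die "Can't create '$empty': $!";
close($fh);
open($fh, '>', $full) or die "Can't create '$full': $!";
print $fh "hello\n";
close($fh);

my %result;
# "missing_demo.txt" is assumed not to exist in the current directory
for my $f ("missing_demo.txt", $empty, $full) {
    my $err = check_file($f);
    $result{$f} = defined $err ? $err : "ok";
    print "$f: $result{$f}\n";
}
unlink $empty, $full;   # clean up the demo files
```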
[edited by: phranque at 11:20 pm (utc) on Aug. 5, 2008]