Forum Moderators: bakedjake
As a matter of fact, my target output was for the web, so I used the free version of FusionCharts which is Flash, avoiding tools like Excel as much as possible.
Many PHP programmers might have a first reaction to read in the comma delimited file and process it in PHP code, or LAMPers might convert it so MySQL, but both processes would take quite some time to code when a single command line written in a few seconds can easily produce quality results in a fraction of the time.
What I had was a log file specifically tracking local access to some pages and I finally needed to collate and sort this log file into a format I could import directly into FusionCharts and display a simple graph.
My starting log comma delimited log file looked like this:
date time,country,state,city,IP,etc.,etc.,etc.
or as AWK would see it:
$1,$2,$3,$4,$5,etc.,etc.,etc.
My goal was to extract just the US data and sort the hits on individual States and output the results in FusionChart compatible XML.
Here's the one liner that processes the file into XML:
grep "United States" somelogfile.txt ¦ awk -F "," '{print ","$3}' ¦ sort ¦ uniq -c ¦ sort -nr ¦ awk -F "," '{printf "<set name=\"%s\" value=\"%d\"/>\n",$2,$1}' > chartdata.xml
NOTE: WebmasterWorld breaks the solid PIPE bar and replaces it with ¦ which you'll have to fix for this to work.
Let's dissect this to make some sense of it.
grep "United States" somelogfile.txt ¦
awk -F "," '{print ","$3}' ¦
The $3 is the third field, which in my file was the state.
This field ($3) would obviously change for your files or purposes but what comes next should work as-is for most applications.
Note that I added a comma "," in front of every state which will be used later to save some trouble.
sort ¦
uniq -c ¦
sort -nr ¦
Now sort the " count, state" organized data numerically and in reverse order so you have the file ordered from the most popular state down to the least.
awk -F "," '{printf "<set name=\"%s\" value=\"%d\"/>\n",$2,$1}' > chartdata.xml
This is the final XML output command using AWK to chop up the " count, state" data into $1 and $2 and then rewrite it in XML format.
Notice I used the "printf" in AWK to do a formatted output which removes all the spaces from the number field via the decimal format output of "%d". Had the spaces remained it would cause FusionGraphs to not like the data and it's a nice clean step.
The final output looks like this..
<set name="California" value="2608"/>
<set name="Florida" value="2601"/>
<set name="Texas" value="2205"/>
<set name="New York" value="1604"/>
<set name="Ohio" value="1408"/>
...etc
Now you just need to preface the XML data with the <graph...> element and append the file with the closing </graph> element and you're done.
Place those in files graphstart.txt and graphend.txt and cat them together
cat graphstart.txt chartdata.xml graphend.txt > mygraphdata.xml
Voila! There you have a nice XML file "mygraphdata.xml" ready to be graphed.
Of course you might need to manually specify the bar or column color codes to make it look pretty, but I can't answer all your problems in a single command line now can I?
:)
[edited by: encyclo at 5:53 pm (utc) on June 24, 2009]
[edit reason] fixed typo [/edit]