PerlFastLane


Perl In the Fast -lane

Ok, ok, sorry for the terrible title. But hey I have exciting news: I think I've finally come to understand the awesomeness of the perl -lane method of running perl oneliner command scripts.

Unfortunately this means goodbye, awk. I'm sorry, old buddy. It's been a long run, but it's time for me to move on.

In my job I often have to run scripts on thousands of remote systems via a parallel ssh execution tool. For example, I might need to confirm that the version of a particular configuration file is consistent across all the hosts. I can use my parallel execution tool to easily run a command on all 10,000 or so hosts and dump the results in a text file. The problem is I end up with a very long result file that looks like this:

>>> argle.example.com
command: /usr/sbin/db_update -check
db_update: Version available from dist:  1.15.130 (built: Mon Jan 31 02:32:12 2011)
db_update: Installed database version:   1.15.130 (built: Mon Jan 31 02:32:12 2011)
db_update: Installed database status:    OK (Matches dist version)
db_update: Installed database age:       30 days since db was built
>>> bargle.example.com
command: /usr/sbin/db_update -check
db_update: Version available from dist:  1.15.130 (built: Mon Jan 31 02:32:12 2011)
db_update: Installed database version:   1.15.130 (built: Mon Jan 31 02:32:12 2011)
db_update: Installed database status:    OK (Matches dist version)
db_update: Installed database age:       30 days since db was built

and so on for many pages. The annoyance is that I'm really only interested in the 'Installed database version' line. However, if I grep out just those lines, I lose the context of which host each result came from. Something more clever is needed.

In the past I've done this with the usual command-line tools like grep and awk. For example, I might run something like this as the command on each host:

echo -n "$(hostname): " && /usr/sbin/db_update -check | grep "Installed database version" | awk '{ print $5 }'

A slight improvement is to make awk do the work of grep too:

echo -n "$(hostname): " && /usr/sbin/db_update -check | awk '/Installed database version/ { print $5 }'

Both work just fine, but they seemed a little ugly, what with that awkward echo at the beginning to get the hostname. I finally decided to see if I could do something more clever with a perl oneliner.

I've always liked the idea of one line perl scripts but the different command-line arguments always tripped me up. How do you remember to use perl -ne vs. perl -pe for example?

Also, here's a big sticking point for me: how do you replace the smart line splitting in awk, which lets me do things like awk '{ print $5 }' in the above example?

I have to credit the Ksplice blog for finally making me understand perl autosplit mode. All I needed to do was run my command with perl -a to make perl split every line into the array @F, which I could then use in much the same way as the awk positional parameters $1, $2, etc. The one catch is that @F counts from zero, so awk's $1 is $F[0].

And here's a perl oneliner to do it:

/usr/sbin/db_update -check | perl -MSys::Hostname -lane 'print hostname.": ".$F[4] if /Installed database version/'

The key point here is the use of -a to autosplit the input line into the array @F. You can then use $F[4] just as you would $5 in awk '{ print $5 }' (the perl index is one lower because @F starts at zero). This whole thing is a general replacement for the standard awk 'find a line and output part of it' idiom: awk '/Installed database version/ { print $5 }'. The advantage is that integrating additional data (like the hostname) is quite simple. I realize that awk programmers out there can probably do this in an equally simple way. If so, please give me your solution in the comments!

http://www.catonmat.net/blog/perl-one-liners-explained-part-two


