Perl: useful tidbits from work

For a few years now, a large part of my job has depended on my writing and maintaining Perl scripts. I thought it would be wise, for my own sake, to note down some of the things I’ve learned along the way. I don’t claim these notes to contain optimal solutions and look forward to any comments they may spark.

Adding an “include” directory to a file

I wasn’t the one to come up with this exact format, and I feel like some of it is extra, however, it has worked and was proliferated throughout many files before I picked them up, so once I have the need to use it for myself, I’ll take the opportunity to figure out what’s really needed. It does show the flexibility of what can be added in the BEGIN block as is.

BEGIN
{
    use File::Spec::Functions 'catfile';
    use File::Basename 'dirname';
    push @INC, catfile(dirname($0), '../other_location/Perl_modules');
}

Generic Code, Project Specifics in a Hash

We had multiple projects which were very similar in what needed to be done via the Perl scripts. To keep the code from being hardcoded to a project, we used a Perl module Custom::Project, containing a hash with details about the products and their dependencies. For instance, one layout might be:

package Custom::Project;
...
  %Products = (
    SourceA_Product => {
      path        => "source_a",
      label_name  => "MYSOURCEA",
    },
    SourceB_Product => {
      path        => "source_b",
      label_name  => "MYSOURCEB",
    },
    Top_Product => {
      path        => "top",
      label_name  => "TOP",
      subproducts => ["SourceA_Product", "SourceB_Product"],
    },
  );

With access to this hash, if I want to apply a label, I can grab the information as long as I know what product I need the information for using something like $Custom::Project::Products{“SourceA_Product”}{label_name}. Or if I’d like to cycle through the subproducts of Top_Product, I can use the array directly by using @{$Custom::Project::Products{“Top_Product”}{subproducts}}

Precompile a Regex

Sometimes it can be useful to store a regex in a variable. In one instance I setup a regex in a variable because it had changed a couple times already, and it was part of a few steps of substitutions, so I wanted to pull out what I could. What I did was create a variable

my $to_be_replaced = qr/NUMBER_(\d)_REPLACEABLE/;

Then, later, in the substitution phase, I was able to do something like this:

$file_contents =~ s/$to_be_replaced/New_Name_$1/g;

Important here is that the number in the text to be replaced was captured and used in the replacement text.

Name both key and value when iterating over a hash

Instead of foreach and only naming a key, an alternative is to use

while ((my $key, my $value) = each %hash) {}

Passing values to subroutines

Previously, I had always used shift to pop the values passed to subroutines. There is an alternative which looks something like this

my ($value1, $value2, $value3) = @_;

Since Perl doesn’t require you to be strict about dictating what will be passed where, this makes a simple way to quickly write out the arguments passed through, though does require attention be paid when making changes. However, I feel it is more readable than using repetitive shifts.
A second point is not really specific to argument passing, but is useful to the cause. Obviously you cannot pass an array to a subroutine directly as the way the arguments are passed is fairly blatantly like an array based on my above note. Passing an array can be critical, though, and it’s not out of the question, it’s simply a matter of how you pass it. Instead of pushing the whole array in by passing it as @my_array, you can pass a reference to the array using \@my_array. Once the subroutine receives it, it will need to dereference what it receives, we’ll call it $my_array_ref in the subroutine, as such: @{$my_array_ref}. Note that this goes for returning information from a subroutine as well; you can return multiple parameters, and they can be references.

Execution Calls

There are a few different ways to execute something from within a Perl script. They all return slightly differently, meaning it’s important to choose the right one.

system($CMD); # Returns success or failure result
`$CMD`;       # Returns output of running command (the characters are backticks and this is effectively the same as qx/$CMD/; & qx{$CMD};)
exec($CMD);   # Does not return

I know there is also an option open(), but I’m not as familiar with it in terms of executing commands, instead of things like opening files. I will direct you to Perl HowTo if you wish for more detailed descriptions.

Saving a file with Unix newlines

This may not be an issue if you’re on a Unix or Linux machine, but if you’re on a Windows machine and the file you’re creating needs to have the Unix newline format, there’s an one extra step you can take to make it happen.

my $result = open(FILE, ">" . $filename) ? 0 : 1;
binmode FILE;

Parallel tasks

Sometimes it’s possible to perform tasks in parallel, and Perl has a few options for performing parallel tasks, but I’ve started to prefer threads (& threads::shared).

use threads;
use threads::shared;
...
my @threads;
foreach my $folder (@list)
{
  my $thr = threads->new(\&MyFunction, $arg1, $arg2, $arg3);
  push(@threads, $thr);
}
foreach (@threads)
{
  $result = $_->join;
  die "Failed with result $result\n" if ($result);
}
undef @threads;

I did at one point run across a problem due to the size of what was being built by several threads, but I was able to alleviate the problem by increase the stack size by replacing use threads; with:

use threads ('stack_size' => 4096 * 10);

Here 4096 is the page size, and I can’t recall if I ever determined the default stack size, but this size seemed to be sufficient.

Dialog Boxes

Although it’s nice to fully automate tasks, sometimes, there are things you cannot escape requiring the user to do. In one such case, I decided that command line prompting would not be sufficient, so I opted to present the user with a dialog box.

use Tk;
use Tk::DialogBox;
...
do {
  my $mw = MainWindow->new;
  $mw->withdraw();
  $dialog_response = $mw->messageBox(-title=>"My Dialog Box Title",
		  -message=>"Please answer Yes or No to my question here",
		  -type=>"YesNo"
		  );
} while ($dialog_response ne "Yes");