Perl: useful tidbits from work

For a few years now, a large part of my job has depended on my writing and maintaining Perl scripts. I thought it would be wise, for my own sake, to note down some of the things I’ve learned along the way. I don’t claim these notes to contain optimal solutions and look forward to any comments they may spark.

Adding an “include” directory to a file

I wasn’t the one to come up with this exact format, and I suspect some of it is unnecessary. It has worked, though, and it had spread through many files before I picked them up, so the next time I need it for myself I’ll take the opportunity to figure out what’s really required. As is, it does show how flexible a BEGIN block can be.

BEGIN
{
    use File::Spec::Functions 'catfile';
    use File::Basename 'dirname';
    push @INC, catfile(dirname($0), '../other_location/Perl_modules');
}
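
For comparison, here is a minimal sketch of a shorter form that should do the same job with the core FindBin and lib modules. It is not what the original scripts used, and the directory is the same placeholder as above. One difference to be aware of: use lib puts the directory at the front of @INC, whereas the push above adds it at the end.

```perl
use strict;
use warnings;
use FindBin qw($Bin);   # $Bin holds the directory containing the running script

# 'use lib' runs at compile time, so no explicit BEGIN block is needed.
use lib "$Bin/../other_location/Perl_modules";

# Show the search path to confirm the directory was added.
print "$_\n" for @INC;
```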

Generic Code, Project Specifics in a Hash

We had multiple projects which were very similar in what needed to be done via the Perl scripts. To keep the code from being hardcoded to a project, we used a Perl module Custom::Project, containing a hash with details about the products and their dependencies. For instance, one layout might be:

package Custom::Project;
...
  %Products = (
    SourceA_Product => {
      path        => "source_a",
      label_name  => "MYSOURCEA",
    },
    SourceB_Product => {
      path        => "source_b",
      label_name  => "MYSOURCEB",
    },
    Top_Product => {
      path        => "top",
      label_name  => "TOP",
      subproducts => ["SourceA_Product", "SourceB_Product"],
    },
  );

With access to this hash, I can look up what I need as long as I know the product name. To apply a label, for example, I can use something like $Custom::Project::Products{"SourceA_Product"}{label_name}. To cycle through the subproducts of Top_Product, I can dereference the array directly with @{$Custom::Project::Products{"Top_Product"}{subproducts}}.
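
The lookups described above can be sketched as follows. The package here is a trimmed stand-in for the real Custom::Project module, and subproduct_labels is a hypothetical helper name used only for illustration.

```perl
use strict;
use warnings;

# Stand-in for the Custom::Project module, trimmed to what the loop needs.
{
    package Custom::Project;
    our %Products = (
        SourceA_Product => { path => "source_a", label_name => "MYSOURCEA" },
        SourceB_Product => { path => "source_b", label_name => "MYSOURCEB" },
        Top_Product     => {
            path        => "top",
            label_name  => "TOP",
            subproducts => ["SourceA_Product", "SourceB_Product"],
        },
    );
}

# Return the label names of a product's subproducts, in declared order.
# (Hypothetical helper, not part of the original module.)
sub subproduct_labels {
    my ($product) = @_;
    return () unless exists $Custom::Project::Products{$product};
    my $subs = $Custom::Project::Products{$product}{subproducts} || [];
    return map { $Custom::Project::Products{$_}{label_name} } @{$subs};
}

print join(", ", subproduct_labels("Top_Product")), "\n";   # MYSOURCEA, MYSOURCEB
```

Because the generic code only ever goes through the hash, adding a product means editing the data and nothing else.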

Precompile a Regex

Sometimes it can be useful to store a regex in a variable. In one instance I set up a regex in a variable because it had already changed a couple of times and was used in several substitution steps, so I wanted to define it in one place. What I did was create a variable:

my $to_be_replaced = qr/NUMBER_(\d)_REPLACEABLE/;

Then, later, in the substitution phase, I was able to do something like this:

$file_contents =~ s/$to_be_replaced/New_Name_$1/g;

Important here is that the number in the text to be replaced was captured and used in the replacement text.
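
Put together as a runnable sketch, with a made-up input string standing in for the real file contents:

```perl
use strict;
use warnings;

# Compile the pattern once; the parentheses capture the digit.
my $to_be_replaced = qr/NUMBER_(\d)_REPLACEABLE/;

# Made-up sample text, not taken from the original files.
my $file_contents = "NUMBER_1_REPLACEABLE and NUMBER_7_REPLACEABLE";

# $1 in the replacement refers to the group captured inside the qr// pattern.
$file_contents =~ s/$to_be_replaced/New_Name_$1/g;

print "$file_contents\n";   # prints "New_Name_1 and New_Name_7"
```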

Name both key and value when iterating over a hash

Instead of foreach and only naming a key, an alternative is to use

while (my ($key, $value) = each %hash) {}
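
A small sketch of the loop in use, with made-up sample data. Note that each makes no promise about the order of the pairs, so the second half shows the sorted-keys alternative for when order matters.

```perl
use strict;
use warnings;

my %hash = (alpha => 1, beta => 2, gamma => 3);   # made-up sample data

# each returns one (key, value) pair per call, in no particular order.
my $total = 0;
while (my ($key, $value) = each %hash) {
    $total += $value;
}

# When the order matters, sorting the keys is the usual alternative.
my @lines = map { "$_=$hash{$_}" } sort keys %hash;
print "$_\n" for @lines;
print "total: $total\n";
```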

Passing values to subroutines

Previously, I had always used shift to pull the values passed to a subroutine off the argument list one at a time. An alternative looks something like this:

my ($value1, $value2, $value3) = @_;

Since Perl doesn’t require you to declare what will be passed where, this is a quick way to name all the arguments at once, though it does require attention when the argument list changes. I find it more readable than repeated shifts.

A second point is not really specific to argument passing, but it is related. Arguments arrive in a subroutine as one flat list (@_), so an array passed as @my_array is flattened into that list and its boundaries are lost. Passing an array intact can be critical, though, and it’s simply a matter of how you pass it: pass a reference with \@my_array instead. The subroutine then needs to dereference what it receives. If we call it $my_array_ref in the subroutine, that looks like @{$my_array_ref}. The same goes for returning information from a subroutine; you can return multiple values, and they can be references.
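
Here is a minimal sketch of both directions: an array passed in by reference and two arrays returned by reference. The subroutine split_by_limit and its data are invented for illustration.

```perl
use strict;
use warnings;

# Takes a scalar and an array reference; returns two array references.
sub split_by_limit {
    my ($limit, $numbers_ref) = @_;
    my (@small, @large);
    foreach my $n (@{$numbers_ref}) {      # dereference to walk the array
        if ($n <= $limit) { push @small, $n } else { push @large, $n }
    }
    return (\@small, \@large);
}

my @my_array = (4, 12, 7, 30);
my ($small_ref, $large_ref) = split_by_limit(10, \@my_array);

print "small: @{$small_ref}\n";   # small: 4 7
print "large: @{$large_ref}\n";   # large: 12 30
```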

Execution Calls

There are a few different ways to execute something from within a Perl script. They all return slightly differently, meaning it’s important to choose the right one.

system($CMD); # Returns the command's exit status (0 means success)
`$CMD`;       # Returns the command's output (backticks; equivalent to qx/$CMD/ or qx{$CMD})
exec($CMD);   # Replaces the running script with the command; does not return on success

I know open() is also an option, but I’m more familiar with it for opening files than for executing commands. I will direct you to Perl HowTo if you wish for more detailed descriptions.
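
To make the differences concrete, here is a sketch that runs the Perl interpreter itself ($^X) as the external command, purely so the example does not depend on any particular program being installed. It also shows the pipe form of open() for reading a command's output line by line. exec is left out because it would end the script.

```perl
use strict;
use warnings;

# system returns the raw wait status; shift right by 8 to get the exit code.
my $status    = system($^X, '-e', 'exit 3');
my $exit_code = $status >> 8;

# Backticks (qx) return whatever the command wrote to standard output.
my $output = qx{"$^X" -e "print 42"};

# open with '-|' reads the command's output through a filehandle instead.
open(my $pipe, '-|', $^X, '-e', 'print qq{line $_\n} for 1..2')
    or die "Cannot start command: $!";
my @lines = <$pipe>;
close($pipe);

print "exit code: $exit_code\n";
print "backticks: $output\n";
print "pipe read ", scalar(@lines), " lines\n";
```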

Saving a file with Unix newlines

This may not be an issue if you’re on a Unix or Linux machine, but if you’re on a Windows machine and the file you’re creating needs Unix newlines, there’s one extra step you can take to make it happen.

my $result = open(my $fh, '>', $filename) ? 0 : 1;  # 0 on success, 1 on failure
binmode $fh;  # write "\n" as-is instead of translating it to CRLF
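
A self-contained sketch of the effect, assuming a temporary file (via the core File::Temp module) in place of the real output file. It writes two lines with binmode on, then reads the raw bytes back and counts carriage returns.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# tempfile() stands in for the real output file.
my ($fh, $filename) = tempfile(UNLINK => 1);
binmode $fh;                      # turn off newline translation for this handle
print {$fh} "first line\nsecond line\n";
close($fh) or die "Cannot close $filename: $!";

# Read the raw bytes back and count carriage returns; there should be none.
open(my $in, '<:raw', $filename) or die "Cannot open $filename: $!";
my $bytes = do { local $/; <$in> };
close($in);

my $cr_count = () = $bytes =~ /\r/g;
print "carriage returns: $cr_count\n";
```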

Parallel tasks

Sometimes it’s possible to perform tasks in parallel, and Perl has a few options for performing parallel tasks, but I’ve started to prefer threads (& threads::shared).

use threads;
use threads::shared;
...
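my @threads;
foreach my $folder (@list)
{
  # Start one thread per folder
  my $thr = threads->new(\&MyFunction, $arg1, $arg2, $arg3);
  push(@threads, $thr);
}
foreach (@threads)
{
  # join waits for the thread to finish and hands back its return value;
  # any true (nonzero) value is treated as a failure
  my $result = $_->join;
  die "Failed with result $result\n" if ($result);
}
undef @threads;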
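
The snippet above loads threads::shared without showing it in use, so here is a minimal sketch of a shared counter. It assumes a Perl built with thread support; the worker body and folder names are invented for illustration.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $completed :shared = 0;          # counter visible to every thread

# Hypothetical worker: pretends to process a folder, returns 0 for success.
sub MyFunction {
    my ($folder) = @_;
    {
        lock($completed);           # released at the end of this block
        $completed++;
    }
    return 0;
}

my @list = ("folder_a", "folder_b", "folder_c");
my @threads;
foreach my $folder (@list) {
    my $thr = threads->create(\&MyFunction, $folder);
    push(@threads, $thr);
}

foreach my $thr (@threads) {
    my $result = $thr->join;        # waits for the thread, gives its return value
    die "Failed with result $result\n" if $result;
}

print "completed: $completed of ", scalar(@list), "\n";
```

Without the :shared attribute each thread would get its own copy of the variable, and the parent would still see 0 after the joins.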

I did at one point run into a problem due to the size of what was being built by several threads, but I was able to alleviate it by increasing the stack size, replacing use threads; with:

use threads ('stack_size' => 4096 * 10);

Here 4096 is the page size, and I can’t recall if I ever determined the default stack size, but this size seemed to be sufficient.

Dialog Boxes

Although it’s nice to fully automate tasks, sometimes there are steps that only the user can perform. In one such case, I decided that command line prompting would not be sufficient, so I opted to present the user with a dialog box.

use Tk;
use Tk::DialogBox;
...
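my $dialog_response;
do {
  my $mw = MainWindow->new;
  $mw->withdraw();   # hide the empty main window so only the dialog shows
  $dialog_response = $mw->messageBox(-title=>"My Dialog Box Title",
      -message=>"Please answer Yes or No to my question here",
      -type=>"YesNo"
      );
  $mw->destroy;      # clean up before possibly asking again
} while ($dialog_response ne "Yes");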

ClearCase: useful tidbits from work

Where I work, the primary version control system is ClearCase. It’s the first version control system I learned, so even with its challenges, it remains my reference point for learning others. It’s easy to become familiar with the basics, but I’ve learned a few tricks that I need so rarely that I end up looking them up again each time, hoping I can still find them. On top of that, many of the tasks we used to do manually have been rolled into scripts that govern building tasks, so less is left to human error. So here I will note some of the things I’ve found useful but don’t always remember without looking them up.
Two of my favorite references online are:
YoLinux
Phil for Humanity (all the articles labeled “ClearCase Support”)
Then you have the regular, old manual:
IBM Cleartool Manual

cleartool ls

We’ll start with something obvious that I don’t really have to look up; I just tend to need a few seconds before I remember that it will help me figure out why I’m looking at the wrong thing. If you’re in a directory and run cleartool ls, it lists the contents of the directory along with the configuration specification rule that is active on each item.

cleartool lsco

This, I found, was important for avoiding trouble when importing. lsco (or lscheckout), as may seem obvious, lists files that are checked out. What was useful to me was checking a specific branch, so I was using something like:

cleartool lsco -brtype BRANCH -recurse

Another useful checkout search is for all your own checkouts (via -me) in your current view (via -cview), in all VOBs (via -avobs):

cleartool lsco -me -avobs -cview

Find a branch or label

Need to know what labels or branches exist?

cleartool lstype -kind brtype -invob \my_vob

The command above lists branch types; use -kind lbtype to list labels instead. The -invob option can be very helpful if you might be in a different VOB.
Alternatively, if you want to check whether a single label or branch exists, another option is to ask for its description.

cleartool desc -s lbtype:MY_LABEL

Importing

We regularly imported source from a vendor, which we added on to, so it was important to keep things organized. This was our standard import call.

clearfsimport -nsetevent -recurse -mklabel <LABEL> <DIR_TO_IMPORT> <CC_LOCATION_TO_IMPORT_TO> 1> Logfileofyourchoice.txt 2>Errorlogfileofyourchoice.txt

Finding things

Just as an example, this will find all items on my_branch and as it finds each one, it will bring up a version tree for it. This can be useful in small cases, but is not advised when there will be many results.

cleartool find . -version "version(.../my_branch/LATEST)" -exe "cleartool lsvt -g %CLEARCASE_PN%"

(Note: the use of “.” is to indicate that the search should be in the current folder; you may replace this with a different path.)
Of course, alternatively, you can get just a list of all the files by replacing -exe and its parameters with simply -print. It can also be helpful sometimes to find files changed between two points, in which case, in the -version parameter block, adding && !version(.../other_branch/LATEST) is helpful. Note that -version isn’t terribly picky, so you can use branches or labels in the ().
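
Since most of these commands ended up in Perl scripts anyway, here is a small sketch that assembles the -print variant with the optional exclusion. It only builds and prints the command string; find_command is a hypothetical helper, and the branch names are the same placeholders used above.

```perl
use strict;
use warnings;

# Build a find command for versions on $on_branch, optionally excluding
# elements that also have a version on $not_on_branch.
sub find_command {
    my ($path, $on_branch, $not_on_branch) = @_;
    my $query = "version(.../$on_branch/LATEST)";
    $query .= " && !version(.../$not_on_branch/LATEST)" if defined $not_on_branch;
    return qq{cleartool find $path -version "$query" -print};
}

my $cmd = find_command(".", "my_branch", "other_branch");
print "$cmd\n";
# my @files = `$cmd`;   # uncomment where cleartool is available
```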

Doing something with what you find

In the previous example, the -exe told ClearCase to open the version tree for each file found. Another example is to label each file found:

cleartool find . -all -version "lbtype(<LABEL>)" -exe "cleartool mklabel <OTHER_LABEL> %CLEARCASE_XPN%"

Here we use the ClearCase special variable %CLEARCASE_XPN% (the version-extended path name, which would matter more if we changed “.” to a different folder) to indicate the version we’d like to apply a secondary label to.