{"id":39,"date":"2012-09-14T22:36:59","date_gmt":"2012-09-14T22:36:59","guid":{"rendered":"http:\/\/elene.dahners.com\/blog\/?p=39"},"modified":"2012-09-14T22:36:59","modified_gmt":"2012-09-14T22:36:59","slug":"perl-useful-tidbits-from-work","status":"publish","type":"post","link":"http:\/\/elene.dahners.com\/blog\/2012\/09\/14\/perl-useful-tidbits-from-work\/","title":{"rendered":"Perl: useful tidbits from work"},"content":{"rendered":"<p>For a few years now, a large part of my job has depended on writing and maintaining Perl scripts. I thought it would be wise, for my own sake, to note down some of the things I&#8217;ve learned along the way. I don&#8217;t claim these notes contain optimal solutions, and I look forward to any comments they may spark.<\/p>\n<h1>Adding an &#8220;include&#8221; directory to a file<\/h1>\n<p>I wasn&#8217;t the one to come up with this exact format, and I suspect some of it is unnecessary. However, it has worked and had already spread through many files before I picked them up, so the next time I need it for myself, I&#8217;ll take the opportunity to figure out what&#8217;s really needed. As is, it does show the flexibility of what can be added in the <code>BEGIN<\/code> block.<\/p>\n<pre>BEGIN\r\n{\r\n    use File::Spec::Functions 'catfile';\r\n    use File::Basename 'dirname';\r\n    push @INC, catfile(dirname($0), '..\/other_location\/Perl_modules');\r\n}<\/pre>\n<h1>Generic Code, Project Specifics in a Hash<\/h1>\n<p>We had multiple projects which were very similar in what needed to be done via the Perl scripts. To keep the code from being hardcoded to a project, we used a Perl module <code>Custom::Project<\/code>, containing a hash with details about the products and their dependencies.
For instance, one layout might be:<\/p>\n<pre>package Custom::Project;\r\n...\r\n  %Products = (\r\n    SourceA_Product =&gt; {\r\n      path        =&gt; \"source_a\",\r\n      label_name  =&gt; \"MYSOURCEA\",\r\n    },\r\n    SourceB_Product =&gt; {\r\n      path        =&gt; \"source_b\",\r\n      label_name  =&gt; \"MYSOURCEB\",\r\n    },\r\n    Top_Product =&gt; {\r\n      path        =&gt; \"top\",\r\n      label_name  =&gt; \"TOP\",\r\n      subproducts =&gt; [\"SourceA_Product\", \"SourceB_Product\"],\r\n    },\r\n  );<\/pre>\n<p>With access to this hash, if I want to apply a label, I can grab the information as long as I know which product I need it for, using something like <code>$Custom::Project::Products{\"SourceA_Product\"}{label_name}<\/code>. Or, if I&#8217;d like to cycle through the subproducts of <code>Top_Product<\/code>, I can use the array directly by dereferencing it: <code>@{$Custom::Project::Products{\"Top_Product\"}{subproducts}}<\/code>.<\/p>\n<h1>Precompile a Regex<\/h1>\n<p>Sometimes it can be useful to store a regex in a variable. In one instance I set up a regex in a variable because it had already changed a couple of times and it was used in several substitution steps, so I wanted to define it in one place. What I did was create a variable:<\/p>\n<pre>my $to_be_replaced = qr\/NUMBER_(\\d)_REPLACEABLE\/;<\/pre>\n<p>Then, later, in the substitution phase, I was able to do something like this:<\/p>\n<pre>$file_contents =~ s\/$to_be_replaced\/New_Name_$1\/g;<\/pre>\n<p>The important point here is that the digit in the text to be replaced is captured by the parentheses and reused in the replacement text as <code>$1<\/code>.<\/p>\n<h1>Name both key and value when iterating over a hash<\/h1>\n<p>Instead of using <code>foreach<\/code> and only naming a key, an alternative is to use <code>each<\/code>, which returns the key and the value together:<\/p>\n<pre>while (my ($key, $value) = each %hash) {}<\/pre>\n<h1>Passing values to subroutines<\/h1>\n<p>Previously, I had always used <code>shift<\/code> to pull off the values passed to subroutines one at a time.
There is an alternative which looks something like this:<\/p>\n<pre>my ($value1, $value2, $value3) = @_;<\/pre>\n<p>Since Perl doesn&#8217;t require you to declare what a subroutine will be passed, this is a quick way to name all the arguments at once, though it does require some care when the argument list changes. I find it more readable than repeated shifts.<br \/>\nA second point is not really specific to argument passing, but it is useful here. You cannot pass an array to a subroutine as a single unit: as the note above suggests, the arguments arrive as one flat list in <code>@_<\/code>, so an array passed as <code>@my_array<\/code> is flattened into its elements. Passing an array can be critical, though, and it is simply a matter of how you pass it. Instead of passing the whole array as <code>@my_array<\/code>, you can pass a reference to it using <code>\\@my_array<\/code>. The subroutine then needs to dereference what it receives (call it <code>$my_array_ref<\/code>) like this: <code>@{$my_array_ref}<\/code>. Note that this goes for returning information from a subroutine as well; you can return multiple values, and they can be references.<\/p>\n<h1>Execution Calls<\/h1>\n<p>There are a few different ways to execute something from within a Perl script. Each returns something different, so it&#8217;s important to choose the right one.<\/p>\n<pre>system($CMD); # Returns the command status (0 on success, non-zero on failure)\r\n`$CMD`;       # Returns the output of the command (these are backticks; effectively the same as qx\/$CMD\/; and qx{$CMD};)\r\nexec($CMD);   # Does not return, unless the command cannot be started<\/pre>\n<p>I know <code>open()<\/code> is also an option, but I&#8217;m less familiar with it for executing commands than for things like opening files.
I will direct you to <a title=\"Perl HowTo\" href=\"http:\/\/www.perlhowto.com\/executing_external_commands\" target=\"_blank\">Perl HowTo<\/a> if you wish for more detailed descriptions.<\/p>\n<h1>Saving a file with Unix newlines<\/h1>\n<p>This may not be an issue if you&#8217;re on a Unix or Linux machine, but if you&#8217;re on a Windows machine and the file you&#8217;re creating needs to have the Unix newline format, there&#8217;s one extra step you can take to make it happen: call <code>binmode<\/code> on the filehandle so that Perl does not translate <code>\\n<\/code> into <code>\\r\\n<\/code> when writing.<\/p>\n<pre>my $result = open(FILE, \"&gt;\" . $filename) ? 0 : 1;  # 0 on success, 1 on failure\r\nbinmode FILE;                                      # write bytes as-is, no newline translation<\/pre>\n<h1>Parallel tasks<\/h1>\n<p>Sometimes it&#8217;s possible to perform tasks in parallel. Perl has a few options for this, but I&#8217;ve started to prefer threads (&amp; threads::shared).<\/p>\n<pre>use threads;\r\nuse threads::shared;\r\n...\r\nmy @threads;\r\nforeach my $folder (@list)\r\n{\r\n  # Start one thread per folder\r\n  my $thr = threads-&gt;new(\\&amp;MyFunction, $arg1, $arg2, $arg3);\r\n  push(@threads, $thr);\r\n}\r\nforeach (@threads)\r\n{\r\n  # Wait for each thread; a true return value is treated as failure\r\n  my $result = $_-&gt;join;\r\n  die \"Failed with result $result\\n\" if ($result);\r\n}\r\nundef @threads;<\/pre>\n<p>I did at one point run across a problem due to the size of what was being built by several threads, but I was able to alleviate it by increasing the stack size, replacing <code>use threads;<\/code> with:<\/p>\n<pre>use threads ('stack_size' =&gt; 4096 * 10);<\/pre>\n<p>Here 4096 is the page size. I can&#8217;t recall if I ever determined the default stack size, but this size seemed to be sufficient.<\/p>\n<h1>Dialog Boxes<\/h1>\n<p>Although it&#8217;s nice to fully automate tasks, sometimes there are things you cannot avoid asking the user to do.
In one such case, I decided that command line prompting would not be sufficient, so I opted to present the user with a dialog box.<\/p>\n<pre>use Tk;\r\nuse Tk::DialogBox;\r\n...\r\ndo {\r\n  my $mw = MainWindow->new;\r\n  $mw->withdraw();\r\n  $dialog_response = $mw->messageBox(-title=>\"My Dialog Box Title\",\r\n\t\t  -message=>\"Please answer Yes or No to my question here\",\r\n\t\t  -type=>\"YesNo\"\r\n\t\t  );\r\n} while ($dialog_response ne \"Yes\");<\/pre>\n","protected":false},"excerpt":{"rendered":"<p>For a few years now, a large part of my job has depended on my writing and maintaining Perl scripts. I thought it would be wise, for my own sake, to note down some of the things I&#8217;ve learned along &hellip; <a href=\"http:\/\/elene.dahners.com\/blog\/2012\/09\/14\/perl-useful-tidbits-from-work\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[24,11],"tags":[42,25,26],"class_list":["post-39","post","type-post","status-publish","format-standard","hentry","category-perl","category-programming","tag-perl","tag-tips","tag-tricks"],"_links":{"self":[{"href":"http:\/\/elene.dahners.com\/blog\/wp-json\/wp\/v2\/posts\/39","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/elene.dahners.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/elene.dahners.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/elene.dahners.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/elene.dahners.com\/blog\/wp-json\/wp\/v2\/comments?post=39"}],"version-history":[{"count":35,"href":"http:\/\/elene.dahners.com\/blog\/wp-json\/wp\/v2\/posts\/39\/revisions"}],"predecessor-version":[{"id":102,"href":"http:\/\/elene.dahners.com\/blog\/wp-json\/wp\/v2\/posts\/39\/revisions\/102"}],"wp:atta
chment":[{"href":"http:\/\/elene.dahners.com\/blog\/wp-json\/wp\/v2\/media?parent=39"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/elene.dahners.com\/blog\/wp-json\/wp\/v2\/categories?post=39"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/elene.dahners.com\/blog\/wp-json\/wp\/v2\/tags?post=39"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}