Dean::Util - Utilities created by Dean Serenevy

# NAME

Dean::Util - Utilities created by Dean Serenevy

# SYNOPSIS

 use Dean::Util qw/map_pair nsign min_max/;
 ...

Then later, to remove dependence on Dean::Util

 perl -MDean::Util -we insert_Dean_Util_functions The/Module.pm

# DESCRIPTION

This is a set of utility functions for the perl programming language that I find myself rewriting frequently. Normally, putting functions into a module introduces a dependency on that module which can be a hassle in some situations. This is a "smart" module which is capable of replacing the use Dean::Util... line with the code for the requested functions. Thus, machines that have Dean::Util installed can use it as a module, but when requested, a (Dean::Util) dependency-free version of the file may be made.

# EXPORTED FUNCTIONS

## :utility - Using Dean::Util

### list_Dean_Util_functions

This function prints a column-formatted list of the functions included in the Dean::Util package.

### check_Dean_Util_functions

This function attempts to verify that the Dean/Util.pm is properly structured. This function is intended to be run only by people who make changes to the Dean/Util.pm file to check that their code is properly formatted for the module to parse.

### get_Dean_Util_code

Returns a hash ref with an entry of the following type for each function and variable defined in Dean::Util.

 name => { code    => '...',
           pod     => '...',
           depends => [ 'thing 1', 'thing 2', ... ]
         }

Some additional information may be included in each sub-hash for debugging purposes or internal use.

### insert_Dean_Util_functions

Replaces all occurrences of "use Dean::Util ...;" ("..." is everything up to first semi-colon, so don't use qw; ;) with the actual source code of the functions requested from Dean::Util. The original files are saved to a backup file which is just the original file name with a ~ appended. The list of files to modify is either included as a list of arguments or is read from @ARGV.

As in the function get_Dean_Util_function_string, the special symbols INCLUDE_POD and POD_ONLY may be used to indicate that all further inclusions (restricted to each individual "use" block) should include their POD documentation before the code, or exclude the code and only output the POD documentation. Example:

 use Dean::Util qw/max min INCLUDE_POD join_multi map_pair/;
 use Dean::Util qw/is_num is_int/;
 # ... later, possibly even after __END__
 use Dean::Util qw/POD_ONLY is_num is_int/;

Would include code and POD documentation for join_multi and map_pair. The code and POD documentation for is_num and is_int would be inserted separately.

Note: Multiple use Dean::Util inclusions may result in multiple subroutine definitions so don't use the same function twice unless they are in different scopes.

Once insert_Dean_Util_functions has been used to "export" a list of Dean::Util functions, this command will replace Dean::Util function blocks with more recent function versions, thus upgrading the exported script.

### remove_Dean_Util_functions

Once insert_Dean_Util_functions has been used to "export" a list of Dean::Util functions, this command can be used to remove them and restore the use Dean::Util line.

### get_Dean_Util_function_string

Returns the source code for the functions provided as arguments. If the argument list is empty, the function list is taken from @ARGV.

The special symbols INCLUDE_POD and POD_ONLY may be used to indicate that all further inclusions should include their POD documentation before the code, or exclude the code and only output the POD documentation. Example:

 get_Dean_Util_function_string qw/max min INCLUDE_POD join_multi map_pair/;

Would include the POD documentation for only join_multi and map_pair.

 get_Dean_Util_function_string qw/POD_ONLY format_cols/;

Would return just the POD documentation for format_cols.

# EXPORTABLE FUNCTIONS

## :numerical - Numerical Functions

### $pi

The string, pi, to 30 digits after the decimal.

### $e

The string, e, to 30 digits after the decimal.

### fmin

 fmin { block } @list
 fmin \&sub, @list

### fmin_dirty

 fmin_dirty { block } @list
 fmin_dirty \&sub, @list

### maximizer

 maximizer { block } @list
 maximizer \&sub, @list

Return the item of @list which yields the maximum value when evaluated by the given code. The code may be provided either as a subroutine reference or a code block. $_ will be set to each list entry and will also be passed in as the first (and only) argument. If the code returns any undefined or non-numeric values, perl will issue warnings.

### minimizer_dirty

 minimizer_dirty { block } @list
 minimizer_dirty \&sub, @list

Return the item of @list which yields the minimum value when evaluated by the code. The code may be either a subroutine reference or a code block. $_ will be set to each list entry and will also be passed in as the first (and only) argument. If the code returns any undefined or non-numeric values, they will be ignored and the corresponding list item will not be considered as a minimizer.

Note however that no filtering is performed on @list so undefined values will be passed to the subroutine as a normal element.

### maximizer_dirty

 maximizer_dirty { block } @list
 maximizer_dirty \&sub, @list

Return the item of @list which yields the maximum value when evaluated by the code. The code may be either a subroutine reference or a code block. $_ will be set to each list entry and will also be passed in as the first (and only) argument. If the code returns any undefined or non-numeric values, they will be ignored and the corresponding list item will not be considered as a maximizer. Note however that no filtering is performed on @list so undefined values will be passed to the subroutine as a normal element.

### ceil($)

If the argument is numeric, then returns the smallest integer which is greater than or equal to the given argument. Otherwise this function will spew warnings.

### ceil_dirty($)

If the argument is numeric, then returns the smallest integer which is greater than or equal to the given argument. Otherwise this function will return undef.

### floor($)

If the argument is numeric, then returns the largest integer which is less than or equal to the given argument. Otherwise this function spews warnings.

### floor_dirty($)

If the argument is numeric, then returns the largest integer which is less than or equal to the given argument. Otherwise this function returns undef.

### round

 round( $value )          # round to integer
 round( $value, 2 )       # round to even
 round( $value, "0.01" )  # round to cent

Round $value to a multiple of the second parameter. Applies the traditional algorithm, namely round( $value ) == int( $value + .5 ). Internal comparisons are performed at "string precision" to combat numerical precision problems. Thus, do not expect to be able to round to too many digits.

### unbiased_round

 unbiased_round( $value )          # round to integer
 unbiased_round( $value, 2 )       # round to even
 unbiased_round( $value, "0.01" )  # round to cent

An unbiased round removes the upward bias of the traditional rounding algorithm by rounding the midpoint value up sometimes and down other times. The convention is to round midpoint values to even multiples, and round all other values normally.

For example, unbiased_round( 2.5 ) == 2 since 2 is even, however unbiased_round( 1.5 ) == 2 as well since 2 is even.

This system can be extended to the generalized rounding algorithm:

 unbiased_round( 1, 2 ) == 0   # since 0 is an even multiple of 2
 unbiased_round( 3, 2 ) == 4   # since 4 is an even multiple of 2
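The midpoint rule above can be sketched as follows. This is a minimal illustration, not the Dean::Util source: the name unbiased_round_sketch is hypothetical, and it uses plain floating-point comparisons rather than the "string precision" comparisons mentioned for round, so it is only reliable when the midpoint is exactly representable.

```perl
use strict;
use warnings;
use POSIX qw(floor);

# Hypothetical helper: round $value to the nearest multiple of $unit,
# sending exact midpoints to the even multiple.
sub unbiased_round_sketch {
    my ($value, $unit) = @_;
    $unit = 1 unless defined $unit;

    my $scaled = $value / $unit;      # position measured in units
    my $lower  = floor($scaled);      # nearest multiple at or below
    my $frac   = $scaled - $lower;    # distance above that multiple

    my $multiple;
    if    ($frac < 0.5) { $multiple = $lower }
    elsif ($frac > 0.5) { $multiple = $lower + 1 }
    else                { $multiple = $lower % 2 == 0 ? $lower : $lower + 1 }

    return $multiple * $unit;
}

print unbiased_round_sketch(2.5),  "\n";   # 2
print unbiased_round_sketch(1.5),  "\n";   # 2
print unbiased_round_sketch(1, 2), "\n";   # 0
print unbiased_round_sketch(3, 2), "\n";   # 4
```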

### sum

Returns the sum of all numeric entries in a list. Undefined/non-numeric values cause warnings.

### product

Returns the product of all numeric entries in a list. Undefined/non-numeric values cause warnings.

### average

Returns the average over all entries in a list. Undefined or non-numeric entries will spew warnings.

### sum_dirty

Returns the sum of all numeric entries in a list. Undefined/non-numeric values are ignored.
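The difference between the plain and "_dirty" variants can be illustrated with a small sketch. It assumes that "numeric" means whatever Scalar::Util::looks_like_number accepts and that an empty list sums to 0; neither detail is confirmed by the text above, and sum_dirty_sketch is a hypothetical name.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical helper: sum only the entries that look numeric,
# silently skipping undef and non-numeric strings.
sub sum_dirty_sketch {
    my $total = 0;
    for my $x (@_) {
        next unless defined $x && looks_like_number($x);
        $total += $x;
    }
    return $total;
}

print sum_dirty_sketch(1, undef, "x", 2.5), "\n";   # 3.5
```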

### product_dirty

Returns the product of all numeric entries in a list. Undefined/non-numeric values are ignored.

### average_dirty

Returns the average over all entries in a list. Undefined or non-numeric entries contribute a 0 to the average.

### min_max

Returns a pair ($m,$M) which is the minimum and maximum numbers, respectively, in a list of values without looping over the list twice. Undefined or non-numeric values will cause warnings.
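A single-pass scan of this kind might look like the sketch below. The name min_max_sketch is hypothetical and the code is only an illustration of the "one loop" idea, not the module's implementation.

```perl
use strict;
use warnings;

# Hypothetical helper: track the running minimum and maximum in one loop.
sub min_max_sketch {
    return unless @_;
    my ($min, $max) = ($_[0], $_[0]);
    for my $x (@_) {
        $min = $x if $x < $min;
        $max = $x if $x > $max;
    }
    return ($min, $max);
}

my ($m, $M) = min_max_sketch(3, -1, 7, 2);
print "$m $M\n";   # -1 7
```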

### max_min

Returns a pair ($M,$m) which is the maximum and minimum numbers, respectively, in a list of values without looping over the list twice. Undefined or non-numeric values will cause warnings.

### min_max_dirty

Returns a pair ($m,$M) which is the minimum and maximum numbers, respectively, in a list of values without looping over the list twice. Undefined or non-numeric values are silently ignored.

### max_min_dirty

Returns a pair ($M,$m) which is the maximum and minimum numbers, respectively, in a list of values without looping over the list twice. Undefined or non-numeric values are silently ignored.

### sieve_of_eratosthenes

 my $sieve = sieve_of_eratosthenes( $n );
 sieve_of_eratosthenes( $m, $sieve );

Constructs a bit string $sieve using the Sieve of Eratosthenes so that:

 vec($sieve, $n, 1) == 1 iff $n is prime

If a sieve (or an undefined scalar) is provided as a second argument, it will be appended to.

Note: Since perl's length command deals only in bytes, this subroutine will round $n up to make sure that $sieve is correct to a whole number of bytes. In particular, you are guaranteed to be able to trust $sieve up to $n = 8 * length($sieve) - 1.

### is_prime

Determine primality. Constructs the Sieve of Eratosthenes to determine primality. The sieve is reused for each call to is_prime so scripts are encouraged to prepare the sieve by calling is_prime on a large number before making multiple calls to is_prime.

 # SLOW: takes 21.89 seconds
 @primes = grep is_prime($_), 1..400000;

 # FAST: takes 1.387 seconds
 @primes = reverse grep is_prime($_), reverse 1..400000;

This function may take some shortcuts if it can, so if you want to prepare the sieve, append the option "force_sieve":

 # SLOW:
 is_prime( 400000 ); # this test shortcuts since 400000 is even
 @primes = grep is_prime($_), 1..400000;

 # FAST:
 is_prime( 400000, force_sieve => 1 );
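For readers unfamiliar with vec, a bit-string sieve of the kind described for sieve_of_eratosthenes can be sketched as below. The name sieve_sketch is hypothetical; the sketch builds a fresh sieve only (it does not extend an existing one) and rounds the limit up to a whole byte, as the note on length describes.

```perl
use strict;
use warnings;

# Hypothetical helper: return a bit string whose bit $k is 1 iff $k is prime.
sub sieve_sketch {
    my ($n) = @_;
    my $limit = 8 * int($n / 8) + 7;          # last bit of the final byte
    my $sieve = '';
    vec($sieve, $_, 1) = 1 for 2 .. $limit;   # bits 0 and 1 stay 0
    for (my $i = 2; $i * $i <= $limit; $i++) {
        next unless vec($sieve, $i, 1);
        for (my $j = $i * $i; $j <= $limit; $j += $i) {
            vec($sieve, $j, 1) = 0;           # strike out multiples of $i
        }
    }
    return $sieve;
}

my $sieve = sieve_sketch(30);
print join(' ', grep { vec($sieve, $_, 1) } 0 .. 30), "\n";
# 2 3 5 7 11 13 17 19 23 29
```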

### base_hash

Given a base, this function returns a hash which may be used in future calls to the other base functions.

A base is described by:

 integer <= 36 (0-9 a-z)
 array ref     (list of symbols, length == base, index i == i, yes you get to define zero)
 string        (string of symbols, shortcut for [split //, $str])
 hash ref      (the output of a previous call to base_hash, this is silly in this case)

### base2base

 base2base( string, base, base )

String may be decimal. The following symbols are tried (in order) to be used as the punctuation between the integer and fraction part of the number:

 . , : ; _ | / \ - + '  "

Bases are described by:

 integer <= 36 (0-9 a-z)
 array ref     (list of symbols, length == base, index i == i, yes you get to define zero)
 string        (string of symbols, shortcut for [split //, $str])
 hash ref      (the output of base_hash)

### base2integer

 base2integer( string, base )

Convert a string to another base. The string may not be a decimal.

Base is described by:

 integer <= 36 (0-9 a-z)
 array ref     (list of symbols, length == base, index i == i, yes you get to define zero)
 string        (string of symbols, shortcut for [split //, $str])
 hash ref      (the output of base_hash or symbol => value pairs)

### base2decimal

 base2decimal( string, base )

String may be decimal. The following symbols are tried (in order) to be used as the punctuation between the integer and fraction part of the number:

 . , : ; _ | / \ - + '  "

Base is described by:

 integer <= 36 (0-9 a-z)
 array ref     (list of symbols, length == base, index i == i, yes you get to define zero)
 string        (string of symbols, shortcut for [split //, $str])
 hash ref      (the output of base_hash)

### decimal2base

 decimal2base( string, base )

String may be decimal. The following symbols are tried (in order) to be used as the punctuation between the integer and fraction part of the number:

 . , : ; _ | / \ - + '  "

Base is described by:

 integer <= 36 (0-9 a-z)
 array ref     (list of symbols, length == base, index i == i, yes you get to define zero)
 string        (string of symbols, shortcut for [split //, $str])
 hash ref      (the output of base_hash)

### factorial

 factorial( $n )

Returns $n! if $n is a non-negative integer.

### pct_change

 pct_change( $orig, $new )

Simply returns the percent change between the two values, ($new - $orig) / $orig. Exists solely because I don't like how the formula looks in a line of real code.

## :stat_prob - Statistical / Probability

### pascals_triangle

 pascals_triangle( Int $n )

Return the nth row of Pascal's triangle (rows are numbered starting at 0).
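One way to produce a single row without building the whole triangle is the multiplicative recurrence C(n,k) = C(n,k-1) * (n-k+1) / k. The sketch below is only an illustration under that approach; pascals_row_sketch is a hypothetical name and the module may compute the row differently.

```perl
use strict;
use warnings;

# Hypothetical helper: row $n of Pascal's triangle, row 0 being (1).
sub pascals_row_sketch {
    my ($n) = @_;
    my @row = (1);
    for my $k (1 .. $n) {
        # Each entry follows from the previous one; the division is exact.
        push @row, $row[-1] * ($n - $k + 1) / $k;
    }
    return @row;
}

print join(' ', pascals_row_sketch(4)), "\n";   # 1 4 6 4 1
```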

### random_binomial

 random_binomial( Int $n )

Return a random integer from 0 to n (inclusive) following a binomial distribution. This is only useful up to n == 8 * $Config{intsize} - 1.

### prob_model_invariants

 prob_model_invariants( \%model, %options )

The model is a hash with keys the outcomes and values the corresponding probabilities. At most one of the probabilities may be undefined in which case it will be computed automatically (as $1 - \sum p_i$) and added to your passed probability model.

### roll_dice

Roll n dice (default 1) and return the results. In scalar context, only the sum is returned. In list context, the individual rolls are returned as well as the final sum of the values (the sum is returned in the last position).

### randomize

Randomize a list of values. Essentially the Fisher-Yates shuffle code from perlfaq4 ("How do I shuffle an array randomly?"). If the array is passed by reference then it will be altered, otherwise a copy is made. Returns a new list or a reference to a list depending on context.
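For reference, a Fisher-Yates shuffle in the spirit of the perlfaq4 answer can be sketched as follows. This version always works on a copy and returns a list; the by-reference and context-sensitive behavior described above is not reproduced, and shuffle_sketch is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: return a shuffled copy of the arguments.
sub shuffle_sketch {
    my @deck = @_;
    for (my $i = $#deck; $i > 0; $i--) {
        my $j = int rand($i + 1);            # 0 <= $j <= $i
        @deck[$i, $j] = @deck[$j, $i];       # swap positions $i and $j
    }
    return @deck;
}

my @shuffled = shuffle_sketch(1 .. 10);
print "@shuffled\n";
```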

### one_var

 one_var( @data );
 one_var( \@data );
 one_var( \@data, $sorted );

Returns a hash (or hash reference if called in scalar context) of one-variable statistics on the input data. If the $sorted parameter is not defined then the data is assumed to be not sorted and the subroutine will make its own sorted copy of the data. If the $sorted parameter is defined but false, then the subroutine will sort @data in place (@data will be altered). If the $sorted parameter is true then the data will be assumed to be already sorted. The returned hash will have the following keys:

average
mean
x-bar

The average value of the data

sum
sum x

The summation of the data

sum_sq
sum x^2

The sum of the squares of the data

Svar
sample_variance

The sample variance, 1/(n-1) * sum( (x_i - average)^2 )

Sx
sample_standard_deviation

The sample standard deviation, sqrt( Svar )

variance
sigma_sq

The population variance, E( (X - E(X))^2 )

sigma
standard_deviation

The population standard deviation, sqrt( variance )

se
standard_error

The standard error of the mean, for computing confidence intervals

n

The number of measurements in the sample

min

The smallest data element

max

The largest data element

Q1

The first quartile computed using broken "Basic Math Course Method".

Q2
med
median

The sample median

Q3

The third quartile computed using broken "Basic Math Course Method".

char:sum
char:Sigma
char:sigma

The corresponding Unicode characters: "\x{2211}", "\x{03A3}", "\x{03C3}". Be warned that char:sum is a different symbol than char:Sigma and that the terminal that you are writing to will need to understand UTF-8 font encoding.

Note: the list only needs to be sorted to compute the quartiles, min, median, and max values. If you are not interested in these values then you can speed up the computation by providing $sorted with a true value (regardless of whether the data is sorted) and simply ignoring those values in the output.

### percentile

 percentile($p, @data)
 percentile($p, \@data)
 percentile($p, \@data, $sorted)
 percentile($p, \@data, %options)

Return the $p-th percentile using the weighted average at X_{(n+1)p} method (http://www.xycoon.com/method_2.htm). That is, the number such that approximately 100 * $p percent of the data values are less than or equal to the given value. If an array reference is given as well as a third true value, the data will be assumed to be already sorted. The following options are available.

sorted

Boolean value indicating whether the data are sorted already. If not, they will be sorted numerically.

method

One of "midpoint", "floor", "ceil", or "scaled". This controls what to do when a percentile divider is between two entries. The default behavior is "scaled", the returned percentile will be an appropriate linear combination of the neighboring entries. The "midpoint" method always returns the midpoint of the neighboring entries. Finally, the "floor" and "ceil" methods always return the lower or higher neighbor respectively.

The "method" also affects the return value when return => "index" is enabled.

return

Either "value" or "index". Affects whether we return the actual percentile value, or simply its index in the array.
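The default "scaled" behavior can be sketched as below, assuming $p is a fraction between 0 and 1 and that positions falling outside the data are clamped to the smallest or largest value. Both are assumptions on my part, percentile_sketch is a hypothetical name, and none of the options above are handled.

```perl
use strict;
use warnings;

# Hypothetical helper: weighted average at position (n+1)*p in the sorted data.
sub percentile_sketch {
    my ($p, @data) = @_;
    my @x   = sort { $a <=> $b } @data;
    my $pos = (@x + 1) * $p;          # 1-based, possibly fractional position
    my $k   = int $pos;               # entry at or below the position
    my $f   = $pos - $k;              # fractional distance to the next entry
    return $x[0]  if $k < 1;          # clamp below the first entry
    return $x[-1] if $k >= @x;        # clamp at or above the last entry
    return $x[$k - 1] + $f * ($x[$k] - $x[$k - 1]);
}

print percentile_sketch(0.5, 1 .. 9),     "\n";   # 5
print percentile_sketch(0.5, 4, 1, 3, 2), "\n";   # 2.5
```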

### correlation

 my $r = correlation( \@X, \@Y );
 my %I = correlation( \@X, \@Y );
 my $r = correlation( \@X, \@Y, %options );

Pearson product-moment correlation coefficient.

one_var_x
one_var_y

The result hash from one_var()

sd_x
sd_y
mean_x
mean_y

The sample standard deviation and mean of x and y.
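The coefficient itself can be computed directly from the deviations about the two means. The sketch below returns only r, not the additional hash keys listed above; pearson_sketch is a hypothetical name and the code assumes two equal-length arrays with non-zero variance.

```perl
use strict;
use warnings;

# Hypothetical helper: Pearson product-moment correlation of two array refs.
sub pearson_sketch {
    my ($x, $y) = @_;
    my $n = @$x;
    my ($mean_x, $mean_y) = (0, 0);
    $mean_x += $_ / $n for @$x;
    $mean_y += $_ / $n for @$y;

    my ($sxy, $sxx, $syy) = (0, 0, 0);
    for my $i (0 .. $n - 1) {
        my $dx = $x->[$i] - $mean_x;
        my $dy = $y->[$i] - $mean_y;
        $sxy += $dx * $dy;
        $sxx += $dx * $dx;
        $syy += $dy * $dy;
    }
    return $sxy / sqrt($sxx * $syy);
}

printf "%.3f\n", pearson_sketch([1, 2, 3, 4], [2, 4, 6, 8]);   # 1.000
printf "%.3f\n", pearson_sketch([1, 2, 3, 4], [8, 6, 4, 2]);   # -1.000
```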

### permutations

 permutations( $n );
 permutations( @list );    # 1 < @list !!
 permutations( \@list );

Return a list of all permutations of the given input list.

Note: This subroutine is slow and inefficient. If you want to use this for any real purpose then you should consider using Algorithm::Permute or Algorithm::FastPermute from CPAN.

### k_arrangements

 k_arrangements( \@list, $k );
 k_arrangements( $n, $k );

Return a list of all arrangements (sub-permutations) of the given input list of length $k. If $n and $k are both integers, then simply the number of $k arrangements is returned.

Note: This subroutine is slow and inefficient. If you want to use this for any real purpose then you should consider looking up an XS module on CPAN.

### arrangements

 arrangements( $n );
 arrangements( \@list );
 arrangements( \@list, $k );
 arrangements( $n, $k );
 arrangements( @list );  # @list > 2 !!!

Return a list of all arrangements (sub-permutations) of the given input list (regardless of length). If the list is provided as a reference and an integer $k is provided then the results will be restricted to length $k as in the k_arrangements subroutine.

Note: This subroutine is slow and inefficient. If you want to use this for any real purpose then you should consider looking up an XS module on CPAN.

### k_combinations

 k_combinations( \@list, $k );
 k_combinations( $n, $k );

Return a list of all combinations of the given input list of length $k.

Note: This subroutine is slow and inefficient. If you want to use this for any real purpose then you should consider looking up an XS module on CPAN.

### combinations

 combinations( $n );
 combinations( \@list );
 combinations( \@list, $k );
 combinations( $n, $k );
 combinations( @list );  # @list > 2 !!!

Return a list of all combinations of the given input list (regardless of length). If the list is provided as a reference and an integer $k is provided then the results will be restricted to length $k as in the k_combinations subroutine.

Note: This subroutine is slow and inefficient. If you want to use this for any real purpose then you should consider looking up an XS module on CPAN.

### npdf

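 npdf $x
 npdf $x, $mu
 npdf $x, $mu, $sigma

Compute the value of the probability density function at $x, assuming a normal distribution with mean $mu and standard deviation $sigma. (For a continuous distribution this is a density, not the probability P( X = $x ), which is zero.) $mu and $sigma are assumed to be 0 and 1 respectively if they are missing. $sigma must be positive.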
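The normal density has the closed form exp( -((x - mu)/sigma)^2 / 2 ) / ( sigma * sqrt(2 pi) ). A minimal sketch with the same defaults follows; npdf_sketch is a hypothetical name, and dying on a non-positive $sigma is my own choice rather than documented behavior.

```perl
use strict;
use warnings;

# Hypothetical helper: normal probability density at $x.
sub npdf_sketch {
    my ($x, $mu, $sigma) = @_;
    $mu    = 0 unless defined $mu;
    $sigma = 1 unless defined $sigma;
    die "sigma must be positive\n" unless $sigma > 0;

    my $pi = 4 * atan2(1, 1);
    my $z  = ($x - $mu) / $sigma;     # standardized distance from the mean
    return exp(-$z * $z / 2) / ($sigma * sqrt(2 * $pi));
}

printf "%.6f\n", npdf_sketch(0);   # 0.398942
```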

### ncdf

 ncdf $x
 ncdf $x, $mu
 ncdf $x, $mu, $sigma

Compute the probability P( X <= $x ) assuming a normal distribution with mean $mu and standard deviation $sigma. $mu and $sigma are assumed to be 0 and 1 respectively if they are missing. $sigma must be positive.

## :math - Mathematical Functions

### dotprod

 my $d = &dotprod(\@x, [1,2,3]);

Compute the dot product of two vectors.

### modular_inverse

 $inverse = modular_inverse( $x, $m );

Compute the inverse of $x in the group Z_m. The inverse will be within the set [0..$m-1].

Note: $x must be relatively prime to $m.
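A modular inverse is commonly obtained from the extended Euclidean algorithm: if alpha * x + beta * m == 1 then alpha mod m is the inverse. The sketch below takes that route; whether Dean::Util does the same is an assumption, and both function names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return ($alpha, $beta, $d) with
# $d == $alpha * $p + $beta * $q and $d == gcd($p, $q).
sub ext_gcd_sketch {
    my ($p, $q) = @_;
    my ($r0, $r1) = ($p, $q);
    my ($x0, $x1) = (1, 0);
    my ($y0, $y1) = (0, 1);
    while ($r1) {
        my $quot = int($r0 / $r1);
        ($r0, $r1) = ($r1, $r0 - $quot * $r1);
        ($x0, $x1) = ($x1, $x0 - $quot * $x1);
        ($y0, $y1) = ($y1, $y0 - $quot * $y1);
    }
    return ($x0, $y0, $r0);
}

# Hypothetical helper: inverse of $x modulo $m, in the range 0 .. $m-1.
sub modular_inverse_sketch {
    my ($x, $m) = @_;
    my ($alpha, undef, $d) = ext_gcd_sketch($x, $m);
    die "$x is not relatively prime to $m\n" unless $d == 1;
    return $alpha % $m;    # perl's % takes the sign of the right operand
}

print modular_inverse_sketch(3, 7),   "\n";   # 5
print modular_inverse_sketch(10, 17), "\n";   # 12
```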

### gcd

Compute the Greatest Common Divisor of a list of integers using the Euclidean algorithm.

### lcm

Compute the Least Common Multiple of a list of integers.
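Both functions reduce to repeated pairwise steps: gcd by Euclid's remainder loop, and lcm via lcm(a, b) = a / gcd(a, b) * b. The sketch below assumes non-zero integer inputs; the names are hypothetical and this is not the module's code.

```perl
use strict;
use warnings;

# Hypothetical helper: gcd of a list via the Euclidean algorithm.
sub gcd_sketch {
    my $g = abs(shift);
    for my $n (@_) {
        my $r = abs($n);
        ($g, $r) = ($r, $g % $r) while $r;   # Euclid's remainder step
    }
    return $g;
}

# Hypothetical helper: lcm of a list, folded pairwise.
sub lcm_sketch {
    my $l = abs(shift);
    for my $n (@_) {
        $l = $l / gcd_sketch($l, $n) * abs($n);
    }
    return $l;
}

print gcd_sketch(12, 18, 30), "\n";   # 6
print lcm_sketch(4, 6, 10),   "\n";   # 60
```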

### extended_euclidean_algorithm

 ($alpha, $beta, $d) = extended_euclidean_algorithm($a, $b)

For a pair of integers, a and b, perform the extended Euclidean algorithm to compute alpha, beta, and d such that:

 d = alpha * a + beta * b

In particular, if d = 1 then alpha = a^-1 mod b.

### frac

 my ($N, $D) = frac( $dec )

Convert a decimal to a fraction. Returns undef if number is not rationalizable (must have repeating decimals).

### ndiff(&;@)

 my $df = ndiff \&f;
 my $df = ndiff \&f, $x;

Perform numerical differentiation using the central difference formula.

 f'(a) \approx ( f(a+h) - f(a-h) ) / (2h)

If M \approx f(a) \approx f'''(c) for all c \in [a-h, a+h], then the total error (truncation plus round-off) is on the order of:

 error = M * (h^2/6 + eps/h)

where eps is the machine epsilon (eps = 2E-16 on 32-bit perl; (1 + 2E-16 != 1), however (1 + (2E-16)/2 == 1) ). Thus, error is minimized when h \approx \sqrt[3]{eps}. We choose h = 2**(-20) = 0.00000095367431640625.

Examples:

 sub f { $_[0]**2 }
 my $df = ndiff \&f;
 printf "%.5f | %.5f\n", f($_), $df->($_) for 0..10;

 say "f'(3) = ", ndiff(\&f, 3);

 $df = ndiff { $_ ** 2 };
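The central difference formula with the stated step h = 2**(-20) can be sketched as follows. The name ndiff_sketch is hypothetical; the block form and the $_ aliasing shown in the examples above are not reproduced, only the code-reference form.

```perl
use strict;
use warnings;

# Hypothetical helper: return the derivative at $x, or a derivative
# closure when $x is omitted.
sub ndiff_sketch {
    my ($f, $x) = @_;
    my $h  = 2**-20;
    my $df = sub {
        my ($at) = @_;
        return ($f->($at + $h) - $f->($at - $h)) / (2 * $h);
    };
    return defined $x ? $df->($x) : $df;
}

my $square = sub { $_[0]**2 };
printf "%.5f\n", ndiff_sketch($square, 3);     # 6.00000
my $d_square = ndiff_sketch($square);
printf "%.5f\n", $d_square->(5);               # 10.00000
```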

### Nintegrate

 Nintegrate { block } $a, $b, $n
 Nintegrate \&sub, $a, $b, $n

Integrate a function between two values using a composite Simpson's rule. The last argument $n is optional and specifies the number of intervals to divide the region into. The default is 1000. The function is assumed to be continuous with continuous derivatives up to order 4. $n should be even, but we adjust it if it is not. The error is given by,

 err = \frac{(b-a)^5}{180 n^4} f^{(4)}(x)

for some x in the interval (a,b).
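A composite Simpson's rule with the same default of 1000 intervals can be sketched as below. The name simpson_sketch is hypothetical and only the code-reference calling form is shown.

```perl
use strict;
use warnings;

# Hypothetical helper: integrate $f from $lo to $hi over $n intervals.
sub simpson_sketch {
    my ($f, $lo, $hi, $n) = @_;
    $n = 1000 unless defined $n;
    $n++ if $n % 2;                       # Simpson's rule needs an even count
    my $h   = ($hi - $lo) / $n;
    my $sum = $f->($lo) + $f->($hi);      # end points carry weight 1
    for my $i (1 .. $n - 1) {
        # Interior points alternate between weight 4 (odd) and 2 (even).
        $sum += ($i % 2 ? 4 : 2) * $f->($lo + $i * $h);
    }
    return $sum * $h / 3;
}

printf "%.6f\n", simpson_sketch(sub { $_[0]**2 }, 0, 3);   # 9.000000
```

Since the error term involves the fourth derivative, the rule is exact (up to rounding) for polynomials of degree three or less, which makes x**2 a convenient check.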

### interpolating_function

 interpolating_function \%function, $message, $nowarn

## :list - List Utilities

### binary_search(&@)

 binary_search { $_ > 4 } @sorted_nums;
 binary_search \&f, @sorted_nums;

Implements a binary search. Second argument must be an array (not a list) and must be sorted. Returns the index of the first element for which the function &f returns true. Returns undef if there is no such element. Function must return true for all elements larger than the desired element. To search for a particular element, the following must be done:

 my $i = binary_search { $_ >= 4 } @sorted_nums;
 $i = undef unless defined $i and $sorted_nums[$i] == 4;
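The "first index where the predicate becomes true" search can be sketched as below. To stay self-contained the sketch takes a code reference and an array reference instead of using the (&@) prototype; binary_search_sketch is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: index of the first element for which $pred is true,
# assuming $pred is false for a prefix of the array and true afterwards.
sub binary_search_sketch {
    my ($pred, $array) = @_;
    my ($lo, $hi) = (0, scalar @$array);      # answer lies in [$lo, $hi]
    while ($lo < $hi) {
        my $mid = int(($lo + $hi) / 2);
        local $_ = $array->[$mid];
        if ($pred->($_)) { $hi = $mid }       # first true is at or before $mid
        else             { $lo = $mid + 1 }   # first true is after $mid
    }
    return $lo < @$array ? $lo : undef;
}

my @sorted_nums = (1, 3, 4, 4, 7, 9);
my $i = binary_search_sketch(sub { $_ >= 4 }, \@sorted_nums);
print "$i\n";   # 2
```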

### text_sort

Natural sort with case folding and Unicode support. Mostly a direct use of Unicode::Collate with automatic binary string decoding (assumes UTF-8) and numerical substring extraction as in natural_sort.

Limitations:

 It doesn't "properly" sort negative numbers, non-fixed decimal values,
 nor integers larger than 10^24 ≈ 2^83.

### text_sort_by

 @sorted = text_sort_by { $_->title } @books;

Natural sort with case folding and Unicode support. Mostly a direct use of Unicode::Collate with automatic binary string decoding (assumes UTF-8) and numerical substring extraction as in natural_sort. Callback is called on each item and should return a string for comparison.

Limitations:

Necessarily, it does not "properly" sort negative numbers or non-fixed decimal values. It also can not sort integers larger than 10^24 ≈ 2^83.

### natural_sort

A "fast, flexible, stable sort" that sorts strings naturally (that is, numerical substrings are compared as numbers). Code lifted from tye on perlmonks: http://www.perlmonks.org/?node_id=442285

Limitations: http://www.perlmonks.org/?node_id=483466

 It doesn't "properly" sort negative numbers, non-fixed decimal values,
 nor integers larger than 2^32-1.

### natural_cmp

A fast, flexible, stable comparator that sorts strings naturally (that is, numerical substrings are compared as numbers). Code lifted from tye on perlmonks: http://www.perlmonks.org/?node_id=442285

Limitations: http://www.perlmonks.org/?node_id=483466

 It doesn't "properly" sort negative numbers, non-fixed decimal values,
 nor integers larger than 2^32-1.

### cartesian

 cartesian \@list1, \@list2, ...
 cartesian $n1, $n2, ...

Form the cartesian product of the elements in the lists. That is, all lists of the form [$e1, $e2, ... ] where $e1 comes from @list1, and so on. This function returns an array reference in scalar context, and a list in list context.

In the second form, the lists [1..$n1], [1..$n2], ... will be constructed, and the cartesian product of those lists will be computed. Note however, that the two forms can not be combined, you must either provide only arrays or only numbers.
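The array-reference form can be sketched by extending every partial tuple with every element of the next list. The sketch below always returns a list and does not handle the numeric form; cartesian_sketch is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: cartesian product of the given array refs.
sub cartesian_sketch {
    my @tuples = ([]);                    # start with one empty tuple
    for my $list (@_) {
        @tuples = map {
            my $prefix = $_;
            map { [@$prefix, $_] } @$list # append each element of $list
        } @tuples;
    }
    return @tuples;
}

my @pairs = cartesian_sketch([1, 2], ['a', 'b']);
print join(' ', map { join '', @$_ } @pairs), "\n";   # 1a 1b 2a 2b
```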

### transposed

 transposed \@LoL

Transpose the (possibly non-regular) list of lists @LoL. Returns a new list reference containing the objects in @LoL.

### flatten

 flatten @LoLoLoL

Will recursively run through each element of the input list and will return all components as a single large list. Lists may be arbitrarily nested and any objects which are not perl ARRAY's will be considered plain elements. The expansion is done depth-first. Returns a reference in scalar context, and the list of elements in list context.

Example:

 @y = flatten [1, 2, 3], [4, 5], [[6, 7], 8, 9];
 say "Hooray!" if "@y" eq "1 2 3 4 5 6 7 8 9";
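The depth-first expansion can be sketched with a short recursive map. This version only returns a list (the scalar-context reference is omitted) and flatten_sketch is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: expand nested ARRAY refs depth-first; anything that
# is not an ARRAY ref is kept as a plain element.
sub flatten_sketch {
    return map { ref $_ eq 'ARRAY' ? flatten_sketch(@$_) : $_ } @_;
}

my @y = flatten_sketch([1, 2, 3], [4, 5], [[6, 7], 8, 9]);
print "@y\n";   # 1 2 3 4 5 6 7 8 9
```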

### find_index

 find_index \&f, \@array
 find_index { BLOCK } \@array
 find_index { BLOCK } \@array, $start, $stop, $step

May be called with either a function or a block as the first argument. The function will then begin at $start (or zero) and then step by $step (or 1) until we reach $stop (or the end of the array).

$_ will be set to the current array entry which will also be passed to the function as its only argument. Thus you may use either $_ or $_[0] within your function. $start may be greater than $stop, in which case we will proceed backwards. In all cases the sign of $step will be adjusted if necessary so that we finish in finite time.

### find_index_with_memory

 find_index_with_memory \&f, \@array
 find_index_with_memory { BLOCK } \@array
 find_index_with_memory { BLOCK } \@array, $start, $stop, $step

May be called with either a function or a block as the first argument. The function will then begin at $start (or zero) and then step by $step (or 1) until we reach $stop (or the end of the array).

The function will set the caller's $a to the previous array entry and $b to the current array entry and will also pass the two entries to the function as its only arguments. Thus you may use either $a,$b or $_[0],$_[1] as the previous and current entries respectively.

Return the index of the first item of @list for which the code returns true. Code may be either a subroutine reference or a code block. $_ will be set to each list entry and will also be passed in as the first (and only) argument. You may pass @list by reference (which means that you must pass it by reference if it contains an array reference in its first entry). If you pass @list by reference and provide a third argument, then the third argument will be taken to be the first position that should be checked. In this case the returned index will still correspond correctly to a position in @list.

### bucketize

 my %buckets = bucketize { block } @list;
 my %buckets = bucketize \&tagger, @list;
 my $buckets = bucketize \&tagger, @list;

Partition items into buckets given a generic tagger. Returns hash ref in scalar context. Tagger should accept a single argument (or use $_) and should return a tag indicating the bucket to place the item in. Function is called in list context so that the following works as expected:

 %by_file_type = bucketize { /\.([^\.]+)$/ } @images;

Also note that values are given as bound aliases, so they can also be "cleverly" modified:

 # ("foo-bar", "foo-baz", "bip-bop")
 #  becomes: ( foo => ["bar","baz"], bip => ["bop"] )

### unique

 my $h = unique \@list;

Takes a list (or reference to an array) and returns a list of unique (up to stringification) objects in apparently random order. In scalar context, a histogram (hash with objects as keys, and counts as values) is returned.

Note: List::MoreUtils::uniq preserves the original order of the elements.

### lex_sort

 lex_sort @list_of_lists
 lex_sort sub{ }, @list_of_lists

Sort the lists lexicographically element-wise. The sorting subroutine may use the package variables $a and $b or may take two arguments, but need only worry about element-wise comparison. Example:

 lex_sort( [qw/abc ac a/], [qw/abc ab c d/], [qw/x y z/], [qw/abc ab c/] )
 # gives:
 # ( [qw/abc ab c/],
 #   [qw/abc ab c d/],
 #   [qw/abc ac a/],
 #   [qw/x y z/]
 # )

Similarly with numerical data using: sub{ $a <=> $b }

## :patterns - Tests and Patterns

### $_re_int

Pattern which matches an integer expression. Beware, this pattern allows whitespace in the string which perl may not allow when interpreting strings as numbers. You may need to remove all whitespace from strings which match this pattern.

### $_re_num

Pattern which matches a floating-point expression. Beware, this pattern allows whitespace in the string which perl may not allow when interpreting strings as numbers. You may need to remove all whitespace from strings which match this pattern.

### $_re_exp

Pattern which matches an exponent part (Ex: 2.3 e -10) of a floating-point expression. Beware, this pattern allows whitespace in the string which perl may not allow when interpreting strings as numbers. You may need to remove all whitespace from strings which match this pattern.

Returns a true value if the argument looks like a word. If no argument is provided, $_ is examined. Words do not have spaces and do not typically have punctuation, though hyphens "-" and underscores are allowed.

### $_re_image_ext

Pattern which matches image-type file name extensions. The list of extensions matched (case insensitive) are:

BMP CMYK CMYKA DCM DCX DIB DPS DPX EPI EPS EPS2 EPS3 EPSF EPSI EPT FAX FITS FPX G3 GIF GIF87 GRAY ICB ICM ICO ICON IPTC JBG JBIG JP2 JPC JPEG JPG MAP MIFF MNG MONO MPC MTV MVG OTB P7 PAL PALM PBM PCD PCDS PCL PCT PCX PDB PGM PICON PICT PIX PLASMA PNG PNM PPM PSD PTIF RAS RGB RGBA RLA RLE ROSE SGI SUN SVG TGA TIF TIFF UYVY VDA VICAR VID VIFF VST WBMP X XBM XC XCF XPM XV XWD YUV

Returns a true value if the argument looks like an image file. If no argument is provided, $_ is examined. The list of extensions matched (case insensitive) are:

BMP CMYK CMYKA DCM DCX DIB DPS DPX EPI EPS EPS2 EPS3 EPSF EPSI EPT FAX FITS FPX G3 GIF GIF87 GRAY ICB ICM ICO ICON IPTC JBG JBIG JP2 JPC JPEG JPG MAP MIFF MNG MONO MPC MTV MVG OTB P7 PAL PALM PBM PCD PCDS PCL PCT PCX PDB PGM PICON PICT PIX PLASMA PNG PNM PPM PSD PTIF RAS RGB RGBA RLA RLE ROSE SGI SUN SVG TGA TIF TIFF UYVY VDA VICAR VID VIFF VST WBMP X XBM XC XCF XPM XV XWD YUV

### readonly

Returns true if scalar argument is readonly. (Taken from Scalar::Util.)

### like_array

Returns true if the object can behave like an array. (This is just a nicer way to call UNIVERSAL::isa)

### like_hash

Returns true if the object can behave like a hash. (This is just a nicer way to call UNIVERSAL::isa)

### like_scalar

Returns true if the object can behave like a scalar. (This is just a nicer way to call UNIVERSAL::isa)

## :parse - General Interpreters / Parsers

### parse_debian_control_format

Parses text in Debian Control file format (http://www.debian.org/doc/debian-policy/ch-controlfields.html#s-controlsyntax). Returns an arrayref of records (one for each paragraph).

### parse_user_agent

WARNING: This function is out of date, naïve, and generally broken. See HTTP::BrowserDetect for a more up-to-date solution. I intend to eventually send some patches or provide a wrapper (HTTP::BrowserDetect::Practical(?)) to that module (for instance, it is my belief that agent strings like "msie" should be reported as a bot).

 my $hashref = parse_user_agent( $string );
 my %hash    = parse_user_agent( $string );

Given a user-agent string returns a hash containing the following fields. Fields which can not be determined are left undefined.

generic_os

Returns the generic operating system type: Windows, Mac, OS2, Linux, UNIX

os

Returns the specific operating system type: Windows Vista, Windows Server 2003, Windows XP, Windows 2000, Debian, ...

type

Note: For this field, we try to make our best guess at which class the agent string fits into.

program

Quasi-canonicalized program name: Internet Explorer, Netscape, Mozilla, Firefox, wget, ...

version

Our best guess at the program version

engine

The Browser's rendering engine: Gecko, KHTML, MSHTML, Presto (opera), WebCore (apple), custom (other custom engines)

engine-version

The version of the rendering engine

user-agent

The unmodified user-agent string

obsolete

If true, the agent appears to be an obsolete web browser

### str2hash

Parse a string into a hash using Text::Balanced::extract_delimited. This function recognizes perl 5 style hashes as well as the basic perl 6 adverbial form. Items missing a value will set the corresponding hash value to true.

Example:

 str2hash 'foo, bar => "Hmmm, a comma", :baz<23>, :!bip, quxx => Spaces are fine'

Parses to:

 { foo => 1,
   bar => 'Hmmm, a comma',
   baz => 23,
   bip => 0,
   quxx => 'Spaces are fine',
 }

Unfortunately, the adverbial form will behave strangely with embedded commas:

 str2hash ':baz<Well, how odd>'

becomes

 { ':baz<Well' => 1,
   'how odd>'  => 1,
 }

### unformat

WARNING: still quite experimental!

### commify

 my $val = commify(1234342.32);
 my $val = commify("%.2f", 1234342.3234234);

Insert commas into number. If passed two parameters, the first will be taken as a sprintf format string which will be applied to the value before commifying.
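The behavior described above can be approximated in a few lines. The following is a minimal sketch, not the module's actual code: the name commify_sketch is hypothetical, and it uses the well-known repeated-substitution idiom for grouping integer digits.

```perl
use strict;
use warnings;

# Hypothetical stand-in for commify; not the module's implementation.
sub commify_sketch {
    my $value = pop;    # the number is always the last argument
    # If a sprintf format was given first, apply it before adding commas.
    my $text = @_ ? sprintf($_[0], $value) : "$value";
    # Repeatedly split off the last three digits of the integer part.
    1 while $text =~ s/^([-+]?\d+)(\d{3})/$1,$2/;
    return $text;
}

print commify_sketch(1234342.32), "\n";
print commify_sketch("%.2f", 1234342.3234234), "\n";
```

Because the pattern is anchored at the start of the string and stops at the first non-digit, digits after the decimal point are left untouched.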

### rtf2txt

 rtf2txt( file => $filename_or_handle )
 rtf2txt( string => $rtf_text )
 rtf2txt( $existing_file )
 rtf2txt( $rtf_text )

### nicef

 nicef( $num, $digits )

Nicely formats sprintf("%.${digits}f", $num) by removing trailing 0's and unnecessary decimals. $digits defaults to 2.

### length2pt

Given a string like "4in" or "2ft - 7in", return the value as a number of points (72 points per inch). undef is returned if we can't parse the string. Recognized units:

 pt
 in, ft, mi
 km, m, cm, mm, nm

### uri_rel2abs

 my $url = uri_rel2abs( $path, $base )

Converts a path into an absolute path based at the given base unless the path is already absolute. Any file part of the base is ignored.

This subroutine should be a proper RFC 3986 URI parser, since it simply calls URI->new_abs. However, proper parsing pays a penalty in execution time. Compare the benchmarks between uri_rel2abs and uri_rel2abs_fast:

        Rate   URI  FAST
 URI   208/s    --  -93%
 FAST 3012/s 1350%    --

### uri_rel2abs_fast

 my $url = uri_rel2abs_fast( $path, $base )

Converts a path into an absolute path based at the given base unless the path is already absolute. Any file part of the base is ignored.

This subroutine is not and will likely never be a reasonable implementation of a proper rfc3986 uri parser. At the moment, however, it appears to be "good enough" for typical web address (http, ftp, mms, ...) handling. The uri_rel2abs function uses the URI module to properly produce an absolute uri, however at a significant speed cost.

        Rate   URI  FAST
 URI   208/s    --  -93%
 FAST 3012/s 1350%    --

### glob2regexp

Constructs a regular expression pattern (string) that matches the same patterns as the given glob. The pattern matches a whole string and is anchored using ^ and $ unless the glob ends with * in which case the trailing .*$ will be removed. Keep this in mind if you wish to capture the pattern matched by the glob.

Current capabilities:

Globby chars

 * match many chars; ? match one char

Escaping of globby chars

 \** matches '\*Hello', \\\** matches "\\*Hello"

Grouping constructs

 [abc] match a character, [^abc] don't match chars, {foo,bar} match options

Current restrictions:

The globby chars '*' and '?' may not appear within grouping constructs ('[]' and '{}').

Can't match grouping chars in groups: '[ab\]]' does not work.

### str($)

Returns string form of argument (forces string context) if it is defined, otherwise returns the empty string.

### replace_windows_characters

Replaces unsightly Extended Windows characters with reasonable ASCII equivalents.

 See: http://www.cs.tut.fi/~jkorpela/www/windows-chars.html
 See: http://search.cpan.org/~barbie/Text-Demoroniser
 (and probably a million other places)

### strip_space

Remove all space from the provided argument. If the argument is undefined, return the empty string.

### sign($)

Returns "+" or "-" depending on the sign of the argument.

### nsign($)

Returns "" or "-" depending on the sign of the argument.

### canonicalize_newlines

Replace CRLF, CR, LF with the Perl magic \n. Arguments are modified in-place. If no arguments are provided then $_ is altered instead. Any undefined arguments are ignored (though canonicalize_newlines(undef) will not alter $_).
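In-place modification of this kind relies on the fact that the elements of @_ alias the caller's variables. The following is a minimal sketch of that idea, assuming nothing about the module's real implementation; the name canonicalize_newlines_sketch is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical stand-in; illustrates in-place edits through @_ aliasing.
sub canonicalize_newlines_sketch {
    # Each element of @_ is an alias to the caller's variable,
    # so substituting on it changes the caller's data directly.
    for my $text (@_) {
        next unless defined $text;       # undefined arguments are skipped
        $text =~ s/\r\n|\r|\n/\n/g;      # CRLF is tried before a lone CR
    }
    # Only when called with no arguments at all, fall back to $_.
    s/\r\n|\r|\n/\n/g if !@_ and defined $_;
    return;
}

my $text = "one\r\ntwo\rthree\n";
canonicalize_newlines_sketch($text);
print $text;
```

Note that passing a single undef still counts as an argument, so $_ is left alone in that case, which matches the behavior described above.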

Replace CRLF, CR, LF with the Perl magic \n. Arguments are copied before canonicalization. If no arguments are provided then $_ is used instead. Any undefined arguments result in undefined output values.

### canonicalize_timeword

Transform reasonable (case-insensitive) abbreviations (or plural forms) of "second", "minute", "hour", "day", "week", "month", "year" into one of these canonical forms. Whitespace and numerical values are allowed at the beginning of the string and will be ignored (and not included in the return value).

NOTE: minutes are preferred over months, thus "m" will return "minute" rather than "month".

### qbash($)

Returns a string quoted for bash-like shells. The string must contain only printable characters or whitespace, otherwise the subroutine will die. The return value is an untainted string wrapped in single quotes ' that is ready (and safe) to pass to a shell.

A note on encoding:

If a string would be considered otherwise unquotable, an attempt will be made to interpret it as encoded UTF-8. If this is successful, then the string will be re-checked and, if acceptable, escaped and then re-encoded. If your expressions are in some other encoding, you will need to decode them yourself (and probably re-encode them before use).
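The core of single-quote shell quoting is small enough to sketch. The version below is an illustration only, under the assumption that the string is already known to be plain text; it omits the UTF-8 retry and the untainting described above, and the name qbash_sketch is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the quoting step only (no UTF-8 retry, no untainting).
sub qbash_sketch {
    my ($text) = @_;
    # Refuse anything that is not printable characters or whitespace.
    die "unquotable string\n"
        unless defined $text and $text =~ /\A[[:print:]\s]*\z/;
    # Nothing is special inside single quotes except the quote itself:
    # close the quoted string, add an escaped quote, then reopen it.
    $text =~ s/'/'\\''/g;
    return "'$text'";
}

print qbash_sketch("it's a test"), "\n";
```

The replacement turns each embedded quote into the four characters quote, backslash, quote, quote, which a POSIX-style shell reads back as a single literal quote.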

### stringify

 stringify( $thing, %options )

Stringifies Perl objects (SCALAR, HASH, or ARRAY based). Stringifies only a single object at a time, and accepts the options below. Note: CODE, GLOB, LVALUE, and Regexp references are not supported.

stringify_underlying_object

By default, overloaded stringification will be respected. Set this option to true to stringify the underlying object rather than use its overload function.

list_type

List which describes how lists are translated.

 DEFAULT: [ "[", ",", "]" ]

hash_type

List which describes how hashes are translated.

 DEFAULT: [ "{", "=>", ",", "}" ]

### simple_range2list

 simple_range2list @ranges

Expand "#,#..#,#-#,a..z,a-z,2:23,2:5:23,a:5:zz" strings to lists. Beginning and ending blocks may be anything matching [\w\.]+, though I'm not sure how well underscores will behave. Commas may separate multiple range chunks. A plain value v (numerical or non-numerical) will produce the range 1..v or 'a'..v.

If no step size is given, the standard perl .. is used to expand the range. Ranges with step sizes are incremented by the step size (may only be decimal valued if both start and end values are numerical) until the value exceeds the right hand value.

For integers, see also Set::IntSpan::Fast:

 $set->add_from_string(
     { sep => qr/(?:\s*,\s*|\s+)/, range => qr/(?:\.\.|\-|\:)/ },
     $string
 );

### canonicalize_filename

 canonicalize_filename $f;
 $new = canonicalize_filename $f;
 canonicalize_filename $f, %options;

Removes anything too exotic from the file name $f. In void context, $f is modified, otherwise, $f is left unaltered and the modified file name is returned. In all cases the canonicalized name will be untainted.

The following options will affect the behavior of this subroutine. The default values are shown:

replacement => ""

If a string value, invalid characters will be replaced with this value. If a hash reference then characters will be replaced by their corresponding values. Any values not present in the replacement hash will be replaced with the value in the 'DEFAULT' key (if present) or the empty string.

allow => 'print'

Must be one of 'print', 'basic', 'ascii', or a pattern matching A SINGLE legal character. The 'print' class will allow just about anything through that is not a control character including unicode characters and punctuation if your perl supports that. The 'basic' class should only allow characters that do not require escaping or quoting in a Linux shell (currently allows: \w-+.~%). The 'ascii' class permits regular printable windows and MacOS safe ascii (not unicode).

allow_subdirs => 1

If true, subdirectory separators will be allowed (uses File::Spec to determine volume and directory separators for your system).

squash_duplicates => 'dwim'

If false, each invalid character will be replaced separately. If the value is 'like' then, repeated illegal values are replaced by only a single replacement value. If the value is any true value other than 'dwim' then, consecutive illegal values (even if they do not match) will be replaced with the replacement value for the first illegal character in the substring. Finally, if the value is 'dwim' then a replacement hash will cause the "like" behavior and a replacement string will result in "true" behavior.

Example:

 %replace = ( replacement => { ':' => "-", " " => "+" } );
 # 'dwim' default using replacement hash: gives "foo-+bar"
 canonicalize_filename( "foo: bar", allow => 'basic', %replace );
 # 'dwim' default using replacement string: gives "foo-bar"
 canonicalize_filename( "foo: bar", allow => 'basic', replacement => "-" );

Trim leading/trailing whitespace. Trims $_ if no arguments provided. In void context, the arguments are altered, otherwise they are not changed and the trimmed values are returned.

## :time - Time Management

### now

If the floating option is passed, a DateTime object will be created with no time zone information. Otherwise, creates a DateTime object in the local time zone.

Keep in mind, time is difficult. If wall time in Eastern time zone (-0500) is "3:11 pm" and time() == 1298664681, then:

 |------------------+------------+----------------------------+-----------------------------|
 | Function         | $dt->epoch | RFC822                     | $dt->set_time_zone("+0300") |
 |------------------+------------+----------------------------+-----------------------------|
 | now()            | 1298664681 | 25 Feb 2011 15:11:21 -0500 | 25 Feb 2011 23:11:21 +0300  |
 | now(floating=>1) | 1298646681 | 25 Feb 2011 15:11:21 +0000 | 25 Feb 2011 15:11:21 +0300  |
 | DateTime->now    | 1298664681 | 25 Feb 2011 20:11:21 +0000 | 25 Feb 2011 23:11:21 +0300  |
 |------------------+------------+----------------------------+-----------------------------|

Think carefully about what exactly you want.

### ymd

Behaves like localtime in scalar context, but returns the date as "YYYY-MM-DD". Returns the components of that string in list context.

### ymd_hms

Behaves like localtime in scalar context, but returns the date as "YYYY-MM-DD HH:MM:SS". Returns the components of that string in list context. Hours are presented in 24 hour format.

### seconds2human

 seconds2human( seconds, start-unit, end-unit )

Convert an arbitrary number of seconds to a "nice" human-readable form. The second and third arguments are optional and specify the first and last time units presented (note: specifying a start unit rounds the precision of your result to the given unit). The resulting data are separated by the value of $". Units available are: seconds, minutes, hours, days, months, and years.

If the input seconds include a decimal portion, then the seconds value will be rounded to three places using the format "%.3f".

Example:

 seconds2human 99999999, 'd', 'mos.'   # gives: "38 months 17 days"
 local $" = ', ';
 seconds2human 99999999, 'm', 'hour'   # gives: "27777 hours, 46 minutes"

### seconds2hms

 seconds2hms $sec
 seconds2hms $sec, $sep

Convert an arbitrary number of seconds to a "hh:mm:ss" string. The "hh" portion of the string will always be at least two digits long (but may be more if more than 99 hours are represented by the given number of seconds).
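The conversion itself is plain integer arithmetic. The following is a minimal sketch for non-negative whole seconds; the name seconds2hms_sketch is hypothetical, and the ":" default separator is an assumption based on the "hh:mm:ss" form mentioned above.

```perl
use strict;
use warnings;

# Hypothetical stand-in; assumes non-negative whole seconds and a ":" default separator.
sub seconds2hms_sketch {
    my ($sec, $sep) = @_;
    $sep = ":" unless defined $sep;
    my $hours   = int($sec / 3600);
    my $minutes = int(($sec % 3600) / 60);
    my $seconds = $sec % 60;
    # %02d pads to at least two digits but lets larger hour counts grow.
    return sprintf "%02d%s%02d%s%02d", $hours, $sep, $minutes, $sep, $seconds;
}

print seconds2hms_sketch(3661), "\n";
print seconds2hms_sketch(360000), "\n";
```

Using a minimum field width rather than a fixed width is what allows the hours field to exceed two digits.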

### seconds2time

 seconds2time $sec
 seconds2time $sec, $pad
 seconds2time $sec, %options

 binmode $OUT, ":encoding(iso-8859-1)";
 print $OUT $_ while defined( $_ = <$IN> );

### touch

 touch @files;
 touch \MODE @files;

Create files using optional numeric mode (e.g: touch \0700, "foo"). If files exist, their atime and mtime will be updated to the current time.

### canonpath

Like canonpath command in File::Spec, but only works on unix filesystems (also cygwin if $^O eq 'cygwin'). However, it will clean up "/../" components whereas File::Spec->canonpath will not.

The code has been modified from File::Spec::Unix::canonpath in the PathTools package by Ken Williams.

### fmap

 my @foos = fmap { s/^FOO: (.*)/$_Util::fmap::file: '$1' line $./ } @files
 my @foos = fmap { s/^FOO: (.*)/$_Util::fmap::file: '$1' line $./ } \%options, @files

Transform files. Loop through the lines of each file and apply a function. Replace each line with the new value of $_. The current file name will be available in the variable $_Util::fmap::file and will be one of the entries in the file list given to the subroutine. Of course, the standard perl variable $. ($INPUT_LINE_NUMBER when use English; is in effect) will be available for your use as well.

In scalar or list context returns a hashref (or hash) of (filename => [ new contents ]) pairs. The values are arrayrefs containing the modified lines of each file.

In void context, alters files in-place, just like using perl -pi -e from the command line.

if_mode

File mode when reading the file (the default is simply "<").

of_mode

File mode when writing the file (the default is simply ">").

backup

If a single-character string (e.g., '~') or a string starting with a leading dot (e.g., '.bak'), it is appended to the filename as a backup suffix. Otherwise it is treated as the backup file name (e.g., 'old_foo'). The default is '~'.

### fgrep

 my @foos = fgrep { s/^FOO: (.*)/$_Util::fgrep::file: '$1' line $./ } @files
 my @foos = fgrep { s/^FOO: (.*)/$_Util::fgrep::file: '$1' line $./ } \"<:encoding(UTF-8)", @files

Grep files. Loop through the lines of each file and apply a function. If the function returns a true value then $_ (after the function application) will be appended to a list to be returned. The current file name will be available in the variable $_Util::fgrep::file and will be one of the entries in the file list given to the subroutine. Of course, the standard perl variable $. ($INPUT_LINE_NUMBER when use English; is in effect) will be available for your use as well.

In scalar context just the number of matches will be returned.

NOTE: If you want to chomp your lines note that the last line of a file may not contain a newline (or whatever $/ is) so use something like either of the following:

 my @foos = fgrep { chomp; /^FOO/ } @files;
 my @foos = fgrep { /^FOO/ and chomp || 1 } @files;

If a string reference $mode is provided as the first argument after the subroutine block it will be taken as the file mode (the default is simply "<").
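The calling convention described in this section (a block first, then an optional mode reference, then files) can be sketched with a (&@) prototype. This is an illustration only, not the module's code: fgrep_sketch is a hypothetical name, and the per-file package variable is left out.

```perl
use strict;
use warnings;

# Hypothetical stand-in; omits the current-file package variable of the real sub.
sub fgrep_sketch (&@) {
    my $test = shift;
    # An optional leading scalar reference carries the open mode.
    my $mode = ref($_[0]) eq 'SCALAR' ? ${ shift() } : "<";
    my @matches;
    for my $file (@_) {
        open my $in, $mode, $file or die "Cannot open $file: $!";
        local $_;
        while (<$in>) {
            # Keep $_ as it is after the block ran, so edits by the block are kept.
            push @matches, $_ if $test->();
        }
        close $in;
    }
    # List context: the matching lines; scalar context: how many matched.
    return wantarray ? @matches : scalar @matches;
}
```

A call then looks like `my @foos = fgrep_sketch { chomp; /^FOO/ } @files;`, with no comma after the block.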

### find

 #XXX: BUGS!
 Currently not entirely correct but getting better. Known bug:
 * -mindepth available but broken

 my @files = find [ '/' ], qw/-type f -name *.pm/;

File::Find using find(1) semantics. Currently supported find options are given below (descriptions taken from find(1)). Unlike find, this subroutine defaults to returning the list of matches rather than defaulting to the -print action. Tests are performed in the order specified so a failure early on will prevent further tests/actions from being performed. Note: this function will never be a full find2perl replacement.

-depth

Process each directory's contents before the directory itself.

-follow

Dereference symbolic links. This is the option that most closely follows find(1)'s behavior but is not a perfect match. In particular, a symbolic link which (if followed) would actually result in a circular reference will be processed by find(1), but not by this function.

NOTE: This option corresponds to the follow_fast option to File::Find

-follow_smart

Dereference symbolic links. Circular references (as well as links that would cause a circular reference) will be automatically removed (symbolic links will only appear if the "real" file would not have been found otherwise). Dangling symbolic links will be ignored.

NOTE: This option corresponds to the follow option to File::Find

-no_chdir

Sets corresponding File::Find option: Does not "chdir()" to each directory as it recurses. When true, the first argument to -wanted and -exec routines will be a full path. For example, when examining the file "/some/path/foo.ext" while doing find ["/some"] you will have:

 @_ = ($_ = '/some/path/foo.ext', '/some/path/', '/some/path/foo.ext', '/the/realpath/foo.ext')

-untaint

-untaint_pattern

Untaint directory names before "chdir()"'ing into them. Untaints using -untaint_pattern. -untaint_pattern defaults to qr|^([-+@\w./]+)$|. Your untaint pattern may be a string or pre-compiled (qr) pattern, but MUST capture the directory name to $1.

-maxdepth levels

Descend at most levels (a non-negative integer) levels of directories below the command line arguments. '-maxdepth 0' means only apply the tests and actions to the command line arguments.

-quiet

Disable "Permission denied" warnings for unreadable directories.

Tests

-iname pattern

Like -name, but the match is case insensitive. For example, the patterns 'fo*' and 'F??' match the file names 'Foo', 'FOO', 'foo', 'fOo', etc.

-iregex pattern

Like -regex, but the match is case insensitive.

-name pattern

Base of file name (the path with the leading directories removed) matches glob pattern (or regexp if passed as qr// compiled regexp). The metacharacters ('*', '?', and '[]') do not match a '.' at the start of the base name.

-regex pattern

File name matches regular expression pattern. This is a match on the whole path, not a search. For example, to match a file named './fubar3', you can use the regular expression '.*bar.' or '.*b.*3', but not 'b.*r3'.

-type char

File is of type "char":

 b      block (buffered) special
 c      character (unbuffered) special
 d      directory
 p      named pipe (FIFO)
 f      regular file
 l      symbolic link
 s      socket
 D      door (Solaris)

Actions

-wanted subroutine

-exec subroutine

Execute subroutine; The subroutine is executed in the directory containing the file and is passed three parameters: the file's name, the current directory (relative to the starting directory), the file's full path (relative to the starting directory). If the "-follow" option is provided then the "true" filename (all symbolic links resolved) will be provided as a fourth argument.

That is:

 @_ = ($_, $File::Find::dir, $File::Find::name, \%info);

For example, when examining the file "/some/path/foo.ext" while doing find ["/some"] you will have:

 @_ = ($_ = 'foo.ext', '/some/path', '/some/path/foo.ext', \%info)

Where

 %info = (
     path     => $_                    = "foo.ext",
     dir      => $File::Find::dir      = "/some/path/",
     name     => $File::Find::name     = "/some/path/foo.ext",
     fullname => $File::Find::fullname = "/the/realpath/foo.ext",
     top_dir  => "/some",        # current path being examined
     rel_dir  => "path",         # relative to top_dir
     rel_path => "path/foo.ext", # relative to top_dir
     filename => "foo.ext",      # even when -no_chdir
     basename => "foo",          # removes last extension only
     stat     => File::stat      # File::stat object
 );

If we call find ["D"] from "/foo",

 D/
 |-- bar
 `-- bip
     `-- baz.txt

Then %info will be:

 | path    | dir   | name          | fullname | top_dir | rel_dir | rel_path    | filename | basename |
 |---------+-------+---------------+----------+---------+---------+-------------+----------+----------|
 | .       | D     | D             | undef    | D       | .       | .           | .        | .        |
 | bar     | D     | D/bar         | undef    | D       | .       | bar         | bar      | bar      |
 | bip     | D     | D/bip         | undef    | D       | .       | bip         | bip      | bip      |
 | baz.txt | D/bip | D/bip/baz.txt | undef    | D       | bip     | bip/baz.txt | baz.txt  | baz      |

A "-wanted" subroutine will automatically set "$File::Find::prune" if the subroutine returns false. An "-exec" subroutine will do no such magic.

-print0

print the full file name on the standard output, followed by a null character. This allows file names that contain new-lines to be correctly interpreted by programs that process the find output.

-print

print the full file name on the standard output, followed by a newline.

-prune_all_failures

Discard and prune any files for which any test fails.

-prune_hidden

Discard and prune any hidden files. At the moment this means anything starting with '.' since I don't know how to detect "hidden" files on any systems other than Linux.

-prune_iname pattern

Like -prune_name, but the match is case insensitive. For example, the patterns 'fo*' and 'F??' match the file names 'Foo', 'FOO', 'foo', 'fOo', etc.

-prune_name pattern

Discard and prune any files where base of file name (the path with the leading directories removed) matches shell pattern pattern. The metacharacters ('*', '?', and '[]') do not match a '.' at the start of the base name.

-prune_rcs

Discard and prune any files or directories that look like they belong to a revision control system. At the moment this means any directories named: ".svn", "CVS", "blib", "{arch}", ".bzr", "_darcs", "RCS", "SCCS", ".git", ".pc"

-prune_backup

Discard and prune any files or directories that look like backups. This includes:

 * ends in "~" or ".bak"
 * matches "#*#" or ".#*"
 * matches "*.tmp" or ".tmp-[_a-zA-Z0-9]+"

-prune_regex pattern

Discard and prune any names matching the regular expression pattern. This is a match on the whole path, not a search. For example, to match a file named './fubar3', you can use the regular expression '.*bar.' or '.*b.*3', but not 'b.*r3'.

Main Limitations:

No grouping via (), no -or.

Returns true if first file is newer than second file. Also returns true if first file exists but second does not.

### lastline

 my $line = lastline $file;
 my $line = lastline \"<:encoding(UTF-8)", $file;

Returns the last line of a file. Includes a seek() optimization based on the lengths of the first several lines so that reading the last line of a large file should be reasonably efficient.

By default the input will not be decoded. Either provide an initial scalar reference containing the file mode (with proper encoding, for example \"<:encoding(UTF-8)") or decode the string before using it.

### fprint

 fprint $filename, @stuff
 fprint \$mode, $filename, @stuff

Prints stuff to the indicated filename. If a mode is provided (for example, \">:encoding(UTF-8)") then it will be used instead of the default mode (">").

### fprint_bu

 fprint_bu $filename, @stuff
 fprint_bu \$mode, $filename, @stuff

Prints stuff to the indicated filename, but backup filename (by appending a ~) first. If a mode is provided (for example, \">:encoding(UTF-8)") then it will be used instead of the default mode (">").

### fappend

 fappend $filename, @stuff
 fappend \$mode, $filename, @stuff

Append stuff to the indicated filename. If a mode is provided (for example, \">>:encoding(UTF-8)") then it will be used instead of the default mode (">>").

### fincrement

 fincrement $filename
 fincrement $filename, $amount
 fincrement $filename, pre => $pre, post => $post, layers => $perlio_layers
 fincrement $filename, $amount, pre => $pre, post => $post

Increments the number contained in $filename. On success, the new value is returned (Note: may be zero if $filename contained "-1"). On failure, undef is returned.

The amount to add to the file's value may be provided. If it is missing, then a value of one is assumed. The optional parameters $pre and $post specify strings to print to the file before and after the number. These strings default to the empty string and a single newline respectively.

Note: $filename must contain only a number (with possible whitespace), or must exactly contain the concatenation of $pre, number, and $post. If $filename does not exist, then it will be initialized to "0"

The "layers" option can be used to set the PerlIO layers for the opened files (for example layers => ":encoding(UTF-8)"). By default, no layers are applied.
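The read, add, and write-back cycle described in this section can be sketched as follows. This is a simplified illustration, not the module's code: fincrement_sketch is a hypothetical name, the pre, post, and layers options are left out, no locking is attempted, and treating a missing file as holding 0 before the increment is an assumption.

```perl
use strict;
use warnings;

# Hypothetical simplified stand-in: no pre/post strings, no layers, no locking.
sub fincrement_sketch {
    my ($filename, $amount) = @_;
    $amount = 1 unless defined $amount;
    my $value = 0;    # assumption: a missing file counts as 0
    if (-e $filename) {
        open my $in, "<", $filename or return;
        my $content = do { local $/; <$in> };    # slurp the whole file
        close $in;
        # The file must hold exactly one integer, optionally padded by whitespace.
        return unless defined $content and $content =~ /\A\s*(-?\d+)\s*\z/;
        $value = $1;
    }
    $value += $amount;
    open my $out, ">", $filename or return;
    print $out "$value\n";
    close $out or return;
    return $value;
}
```

Because the new value may legitimately be zero, callers should test the result with defined rather than for truth.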

### cat

 my $stuff = cat $file;
 my $stuff = cat \$mode, $file;

Read in the entirety of a file. If requested in list context, the lines are returned. In scalar context, the file is returned as one large string. If a string reference $mode is provided as a first argument it will be taken as the file mode (the default is "<").
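The context-dependent return can be sketched with wantarray. The following is a minimal illustration under the calling convention described above; cat_sketch is a hypothetical name and the code is not taken from the module.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a context-sensitive file slurp.
sub cat_sketch {
    # An optional leading scalar reference carries the open mode.
    my $mode = ref($_[0]) eq 'SCALAR' ? ${ shift() } : "<";
    my ($file) = @_;
    open my $in, $mode, $file or die "Cannot open $file: $!";
    if (wantarray) {
        my @lines = <$in>;    # list context: one element per line
        return @lines;
    }
    local $/;                 # undefined record separator: read everything at once
    my $content = <$in>;
    return $content;
}
```

For example, `my @lines = cat_sketch \"<:encoding(UTF-8)", $file;` would return decoded lines, while `my $text = cat_sketch $file;` returns one string.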

### bcat

Read in the entirety of a binary file. If requested in list context, the lines are returned. In scalar context, the file is returned as one large string.

### bu_open

 bu_open $file
 bu_open $fh, $file
 bu_open $fh, $file, "$file.bak"
 bu_open \$mode, $file
 bu_open \$mode, $fh, $file
 bu_open \$mode, $fh, $file, "$file.bak"

 ($writer, $reader) = bu_open \$mode, $file

Backup and open. The general idea is, if the file exists, rename it by appending a "~" to its name, then open the original name in write mode. This sub croaks if any operation fails. The backup file is created new so that the inode of the original file does not change.

If only a single string variable argument is given and the function is called in void context, then the requested file is backed up and opened, "upgrading" the given argument to a filehandle. Example:

 $file = "foo";
 bu_open $file;  # Note: bu_open "foo"; would be a fatal error
 print $file "Bar\n";

In scalar context, $file is unchanged and a write-only filehandle is returned. In list context, a filehandle for both the new file (write only) and the backup (read only) are returned. If a mode is provided as a SCALAR reference (for example, \">:encoding(UTF-8)") then it will be used instead of the default mode (">"). If two arguments are given, the first will be used to store the newly opened filehandle, and the second should hold the file name. Finally, the final argument (if provided) will be used for the backup file (rather than the $file argument with a "~" appended).

### catfile

Calls the File::Spec catfile and canonpath methods.

### realfile

Unnecessary! use Cwd::realpath

## :shell - Shell Operations

### safe_pipe

 safe_pipe [ options, ] command, input
 my $results = safe_pipe [ 'command', 'arg' ], @input;
 my @results = safe_pipe [ 'command', 'arg' ], @input;
 my $results = safe_pipe \%opt, [ 'command', 'arg' ], @input;

Pipe data to a shell command safely (without touching a command line) and retrieve the results. Notably, this is the situation that the IPC::Open2 documentation describes as dangerous (may block forever) when using open2. If process execution fails for any reason an error is thrown.

In void context, all command output will be directed to STDERR making this command almost equivalent to:

 my $pid = open my $F, "|-", 'command', 'arg' or die;
 print $F @input;
 close $F;
 waitpid( $pid, 0 );

Options:

capture_err

If true, STDERR will also be captured and included in returned results.

allow_error_exit

By default, this sub will verify that the command exited successfully (0 == $?) and throw an error if anything went wrong. Setting allow_error_exit to a true value will prevent this sub from examining the return value of the command.

Setting allow_error_exit to an array of allowed exit status will ignore only those (error) exit codes (code 0 will be considered a success).

Modified code from merlyn: http://www.perlmonks.org/index.pl?node_id=339092

Note: Input and output will not be encoded/decoded thus should be octets.

Note: locally alters $SIG{CHLD}

## :color - Color

### NOCOLOR

 NOCOLOR(__PACKAGE__) if !$opt{color};
 NOCOLOR()            if !$opt{color};

Replaces subroutines and package variables whose name matches one of the names in the :color_subs or :color_strings export tags with inert versions which do not insert any color sequences. Subroutines are replaced by the identity function and strings are replaced with the empty string. The default package is the caller's current package.

WARNING: This subroutine has no good way of knowing that the subroutines and variables that it finds are really color subroutines and variables. It does however check that subroutines have a '$' prototype and it only has access to package variables (those not declared by my). This combined with the fact that there are only so many things that a function called "BLUE" could reasonably do means that this should not generally be a problem.

SUBS affected:

 BOLD UNDERLINE DARK BLINK REVERSE CONCEALED STRIKE
 BLACK RED GREEN YELLOW BLUE MAGENTA CYAN WHITE
 GREY GRAY BRIGHT_RED BRIGHT_GREEN BRIGHT_YELLOW BRIGHT_BLUE BRIGHT_MAGENTA BRIGHT_CYAN
 ON_BLACK ON_RED ON_GREEN ON_YELLOW ON_BLUE ON_MAGENTA ON_CYAN ON_WHITE
 ON_GREY ON_GRAY ON_BRIGHT_RED ON_BRIGHT_GREEN ON_BRIGHT_YELLOW ON_BRIGHT_BLUE ON_BRIGHT_MAGENTA ON_BRIGHT_CYAN

SCALARS affected:

 $BOLD $BOLD_OFF $UNDERLINE $UNDERLINE_OFF $DARK $DARK_OFF $BLINK $BLINK_OFF $REVERSE $REVERSE_OFF
 $CONCEALED $CONCEALED_OFF $STRIKE $STRIKE_OFF
 $NORMAL $DEFAULT_FG $DEFAULT_BG
 $BLACK $RED $GREEN $YELLOW $BLUE $MAGENTA $CYAN $WHITE
 $GREY $GRAY $BRIGHT_RED $BRIGHT_GREEN $BRIGHT_YELLOW $BRIGHT_BLUE $BRIGHT_MAGENTA $BRIGHT_CYAN
 $ON_BLACK $ON_RED $ON_GREEN $ON_YELLOW $ON_BLUE $ON_MAGENTA $ON_CYAN $ON_WHITE
 $ON_GREY $ON_GRAY $ON_BRIGHT_RED $ON_BRIGHT_GREEN $ON_BRIGHT_YELLOW $ON_BRIGHT_BLUE $ON_BRIGHT_MAGENTA $ON_BRIGHT_CYAN

### hsl2rgb

 my $rgb    = hsl2rgb( $H, $S, $L );
 my @colors = hsl2rgb( @hsl_colors );

Convert HSL colors (triples from 0 to 1) to RGB colors (triples from 0 to 255).

### rainbow

 rainbow( $n );
 rainbow( $n, %colors_options );

Return a list of $n rainbow colors (ROYGBIV).

Any options supported by colors can be provided and will be passed along, including the n and colors options, so you probably don't want to include those options.

### wavelength2rgb

Convert a wavelength (a number between 380 nm and 780 nm) to a RGB triplet (0 ≤ x_i ≤ 1). Returns undef if given an out-of-range wavelength.

Formulas taken from Dan Bruton's color science page (http://www.midnightkite.com/color.html).

### BOLD($)

Make text bold

### DARK($)

Make text dark

### UNDERLINE($)

Make text underline

### BLINK($)

Make text blink

### REVERSE($)

Make text reverse

### CONCEALED($)

Make text concealed

### STRIKE($)

Strike-through text (rarely implemented)

### RED($)

Make text red

### YELLOW($)

Make text yellow

### BLUE($)

Make text blue

### MAGENTA($)

Make text magenta

### WHITE($)

Make text white

### GRAY($)

Make text gray

### BRIGHT_RED($)

Make text bright_red

### BRIGHT_GREEN($)

Make text bright_green

### BRIGHT_YELLOW($)

Make text bright_yellow

### BRIGHT_BLUE($)

Make text bright_blue

### BRIGHT_MAGENTA($)

Make text bright_magenta

### BRIGHT_CYAN($)

Make text bright_cyan

### ON_RED($)

Make text on_red

### ON_GREEN($)

Make text on_green

### ON_YELLOW($)

Make text on_yellow

### ON_BLUE($)

Make text on_blue

### ON_MAGENTA($)

Make text on_magenta

### ON_CYAN($)

Make text on_cyan

### ON_WHITE($)

Make text on_white

### ON_GREY($)

Make text on_grey

### ON_GRAY($)

Make text on_gray

### ON_BRIGHT_RED($)

Make text on_bright_red

### ON_BRIGHT_GREEN($)

Make text on_bright_green

### ON_BRIGHT_YELLOW($)

Make text on_bright_yellow

### ON_BRIGHT_BLUE($)

Make text on_bright_blue

### ON_BRIGHT_MAGENTA($)

Make text on_bright_magenta

### ON_BRIGHT_CYAN($)

Make text on_bright_cyan

## :color_strings - Color Strings

### $NORMAL

Undo all color modifications

### $DEFAULT_FG

Remove foreground coloring

### $BOLD

Make text bold

### $DARK

Make text dark

### $DARK_OFF

Undo make text dark

### $UNDERLINE

Make text underline

### $BLINK_OFF

Undo make text blink

### $REVERSE

Make text reverse

### $REVERSE_OFF

Undo make text reverse

### $CONCEALED

Make text concealed

### $CONCEALED_OFF

Undo make text concealed

### $STRIKE

Make text strike-through

### $BLACK

Make text black

### $GREEN

Make text green

### $BLUE

Make text blue

### $CYAN

Make text cyan

### $GREY

Make text grey

### $GRAY

Make text gray

### $BRIGHT_RED

Make text bright_red

### $BRIGHT_GREEN

Make text bright_green

### $BRIGHT_YELLOW

Make text bright_yellow

### $BRIGHT_BLUE

Make text bright_blue

### $BRIGHT_MAGENTA

Make text bright_magenta

### $BRIGHT_CYAN

Make text bright_cyan

### $ON_BLACK

Make text on_black

### $ON_RED

Make text on_red

### $ON_GREEN

Make text on_green

### $ON_YELLOW

Make text on_yellow

### $ON_BLUE

Make text on_blue

### $ON_MAGENTA

Make text on_magenta

### $ON_CYAN

Make text on_cyan

### $ON_WHITE

Make text on_white

### $ON_GREY

Make text on_grey

### $ON_GRAY

Make text on_gray

### $ON_BRIGHT_RED

Make text on_bright_red

### $ON_BRIGHT_GREEN

Make text on_bright_green

### $ON_BRIGHT_YELLOW

Make text on_bright_yellow

### $ON_BRIGHT_BLUE

Make text on_bright_blue

### $ON_BRIGHT_MAGENTA

Make text on_bright_magenta

### $ON_BRIGHT_CYAN

Make text on_bright_cyan

## :display - Display functions

### sprint_one_var

 binmode STDOUT, ":encoding(UTF-8)";
 print sprint_one_var scalar one_var \@data;

Returns a string describing data set, such as:

 N = 12: μ = 11.42, σ = 9.19 95% CI ( 6.11, 16.72) « 3.55, 6.70, 7.72, 12.05, 36.25 »

five_nr

Show five number summary (default true)

ci

Show 95% confidence interval (default false)

dpad

Width of digit (N) field (default 3)

pad

Width of floating point fields (default 5)

digits

Digits after decimal, use negative number to format with nicef instead of %f (default 2)

### mk_progressbar

Generates a progress subroutine. Sample usage might be (you provide the $items iterator and do_something sub or something equivalent):

 my $nr_items = $items->count;
 my $progress = mk_progressbar( total => $nr_items, countdown => 1 );
 print STDERR "Processing items ";
 while (my $item = $items->next) {
     $progress->($nr_items--);
     do_something($item);
 }
 $progress->(0);

With the above, your code now has a nice progress bar.

type

"bar", "dot", "percent", or "spinner". DEFAULT: bar

total

Number of items to process. Note: "total" and progress counts may be decimal. DEFAULT: 1

countdown

When true, progress sub expects value to decrease from total to 0 rather than increase from 0 to total. DEFAULT: undef (false)

format (percent type only)

sprintf format to display percentage. DEFAULT: "%.2f%%"

length (bar type only)

DEFAULT: 20

symbol (bar and dot types only)

DEFAULT: "*" for bar; "." for dot

symbols (spinner type only)

DEFAULT: [ qw( - \ | / ) ]

break (dot type only)

DEFAULT: 50

Print a newline after this many dots.

space (dot type only)

DEFAULT: 10

Print a space after this many dots.

fh

Output file handle. DEFAULT: STDERR

prefix

String to print before progress info.

suffix

String to print after progress info.
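
To make the closure idea concrete, here is a minimal sketch of how a "bar"-type progress sub could be generated. It is not this module's code: `make_bar_sketch` is a hypothetical name, only the total, length, symbol and countdown options are modelled, and the bar is returned as a string rather than printed to a filehandle.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a "bar"-type progress generator. The returned
# closure maps a progress value to the text of the bar.
sub make_bar_sketch {
    my %o = (total => 1, length => 20, symbol => "*", countdown => 0, @_);
    return sub {
        my ($value) = @_;
        # With countdown set, values run from total down to 0.
        my $done = $o{countdown} ? $o{total} - $value : $value;
        my $frac = $o{total} ? $done / $o{total} : 1;
        $frac = 0 if $frac < 0;
        $frac = 1 if $frac > 1;
        my $n = int($frac * $o{length} + 0.5);    # filled cells, rounded
        return "[" . ($o{symbol} x $n) . (" " x ($o{length} - $n)) . "]";
    };
}

my $bar = make_bar_sketch(total => 8, length => 10, countdown => 1);
print $bar->($_), "\n" for reverse 0 .. 8;
```

Keeping the options in the closure means the caller only has to pass the current count on each call, which is the calling convention shown in the sample usage above.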

### clprint

 my ($i, @mark) = (0, qw[ - \ | / ]);
 print "Working: ";
 for (@things) {
     clprint $mark[($i %= 4)++];
     # ... other stuff ...
 }
 clprint;

 clprint \$var, @stuff;
 clprint \*STDOUT, \$var, @stuff;

A CLearing print. Erases whatever was printed last time and prints the next thing. This subroutine is smart enough not to try to erase past a newline even if you are using the perl variables $, or $\. This subroutine makes use of the clength subroutine so that color escape sequences are properly measured.

Calling the subroutine with no arguments forgets the previously printed thing without erasing it from the screen.

If a GLOB or IO::* is given as a first parameter, then that will be used for output. The default is STDERR and is stored in the $_Util::clprint::out variable if you want to change it.

If a reference to a scalar is given then that variable will be used to store the text history. This allows for multiple clprint levels. (Though it is up to you to nest them properly.)
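
The erase-then-print idea can be sketched as follows. This is an assumption-laden illustration, not the module's implementation: `clearing_text_sketch` is a hypothetical helper, it measures text with plain `length` (so color escape sequences would be miscounted, unlike with clength), and it returns the characters to emit instead of printing them.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the characters a clearing print could emit.
# $$history holds the text printed last time (the "text history").
sub clearing_text_sketch {
    my ($history, $text) = @_;
    # Only erase back to the most recent newline, never past it.
    my ($tail) = $$history =~ /([^\n]*)\z/;
    my $n = length $tail;
    $$history = $text;
    # Back up, blank out the old text, back up again, then print the new text.
    return ("\b" x $n) . (" " x $n) . ("\b" x $n) . $text;
}

my $history = "";
for my $mark (qw[ - \ | / ]) {
    print STDERR clearing_text_sketch(\$history, $mark);
}
print STDERR clearing_text_sketch(\$history, "done\n");
```

Passing the history by reference mirrors the scalar-reference form described above: each caller can keep its own history variable.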

### sprint_hash

 sprint_hash $sep, %hash

Returns a string:

 "key" => "value"$sep"key" => "value"$sep...

If $sep is not provided (I.E., sprint_hash is called with an even number of arguments) $sep will default to $/ (typically "\n").
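
The odd/even argument rule can be sketched like this. `sprint_hash_sketch` is a hypothetical re-implementation for illustration only; in particular, it sorts the keys purely to make the output deterministic, which the module may not do.

```perl
use strict;
use warnings;

# Hypothetical re-implementation for illustration.
sub sprint_hash_sketch {
    # An odd number of arguments means the first one is the separator.
    my $sep  = @_ % 2 ? shift(@_) : $/;
    my %hash = @_;
    return join $sep, map { qq("$_" => "$hash{$_}") } sort keys %hash;
}

print sprint_hash_sketch(", ", one => 1, two => 2), "\n";
print sprint_hash_sketch(one => 1, two => 2), "\n";
```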

### print_hash

Prints the results of sprint_hash

### ctext

 ctext( $text, $width, "left" | "right" )

Center a string horizontally over a given width (both left and right sides are padded with space). An optional third parameter specifies whether to err to the left or to the right. The default is left, to put an extra space to the right if necessary. undef is returned if $width < length $text.
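
A minimal sketch of the centering rule, under the assumption that the text is plain (no color escape sequences) and with the hypothetical name `ctext_sketch`:

```perl
use strict;
use warnings;

# Hypothetical sketch of the centering rule for plain text.
sub ctext_sketch {
    my ($text, $width, $err) = @_;
    my $pad = $width - length $text;
    return undef if $pad < 0;
    my $small = int($pad / 2);
    my $large = $pad - $small;
    # Erring left keeps the text nearer the left edge, so any extra
    # space goes on the right.
    my ($l, $r) = (defined $err && $err eq "right")
        ? ($large, $small)
        : ($small, $large);
    return (" " x $l) . $text . (" " x $r);
}

print "[", ctext_sketch("ab", 5), "]\n";
print "[", ctext_sketch("ab", 5, "right"), "]\n";
```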

### lrtext

 lrtext( $left, $right, $width )

Return a string with enough space separating the $left and $right text so that the line fills the entire $width.

### text_wrap_paragraphs

Splits a string on multiple consecutive newlines and passes each chunk to text_wrap. Returns the resulting paragraphs as a list of paragraphs. This function takes the same arguments as text_wrap.

### text_wrap

Takes a string and wraps the text to be at most a certain width. Text is split at whitespace, and hyphens (though actual hyphenation is beyond the scope of my interest). Long words are placed on lines by themselves, all whitespace is canonicalized, and the resulting string does not have a trailing newline.

This function uses the non-core package Term::ReadKey. Available options:

width

Total width of the paragraph, including any indentation. The default is the width of the terminal or the value of $ENV{COLUMNS} or 80. If width is negative, then that value will be subtracted from whatever width is auto-detected.

indent

A per-line indentation amount. The default is zero.

fill

If true, spaces will be added to the END of each line to make them exactly the right width. You might want this if you are colorizing the background so that the background color extends the full width on each line. The default is false.

wrap_chars

A list of characters that we are allowed to wrap on. The default is [ '-', ' ' ].

### text_justify_paragraphs

Splits a string on multiple consecutive newlines and passes each chunk to text_justify. Returns the resulting paragraphs as a list of paragraphs. This function takes the same arguments as text_justify.

### text_justify

Takes a string and wraps the text to be exactly a certain width. Text is split at whitespace, and hyphens (though actual hyphenation is beyond the scope of my interest). Long words are placed on lines by themselves, all whitespace is canonicalized, and the resulting string does not have a trailing newline.

This function uses the non-core package Term::ReadKey. Available options:

width

Total width of the paragraph, including any indentation. The default is the width of the terminal or the value of $ENV{COLUMNS} or 80. If width is negative, then that value will be subtracted from whatever width is auto-detected.

indent

A per-line indentation amount. The default is zero.

justify_last

If true the last line of the paragraph will be justified also. The default is false.

fill

If true, spaces will be added to the END of each line to make them exactly the right width. You might want this if you are colorizing the background so that the background color extends the full width on each line. The default is true.

wrap_chars

A list of characters that we are allowed to wrap on. The default is [ '-', ' ' ].
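
The difference between wrapping and justifying comes down to how the spare width on each line is spent. The sketch below shows one way a single line could be padded to an exact width; `justify_line_sketch` is a hypothetical helper, and the choice to give extra spaces to the leftmost gaps first is an assumption, not necessarily what text_justify does.

```perl
use strict;
use warnings;

# Hypothetical helper: pad the gaps between words so the line is exactly
# $width characters wide; leftover spaces go to the leftmost gaps first.
sub justify_line_sketch {
    my ($width, @words) = @_;
    return $words[0] // "" if @words < 2;
    my $gaps  = @words - 1;
    my $chars = 0;
    $chars += length($_) for @words;
    my $spaces = $width - $chars;
    return join " ", @words if $spaces < $gaps;    # too long to justify
    my ($base, $extra) = (int($spaces / $gaps), $spaces % $gaps);
    my $line = "";
    for my $i (0 .. $#words) {
        $line .= $words[$i];
        $line .= " " x ($base + ($i < $extra ? 1 : 0)) if $i < $gaps;
    }
    return $line;
}

print justify_line_sketch(40, qw(all whitespace is canonicalized)), "|\n";
```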

### print_cols

Prints the results of format_cols

### format_cols

 format_cols \@array, %options

Format the given list of items into columns according to the given options. This function has a couple of improvements over Term::PrintCols. In particular, it has more options, and is capable of correctly formatting lists with embedded ANSI color codes. This function uses the non-core package Term::ReadKey if the total_width option is not specified. The layout algorithm was inspired by GNU ls.

Available options are:

align => alignment string

The alignment string is a word in the characters: l, r, c, standing for Left, Right, and Center respectively. These control the alignment of each column. The last character is repeated as many times as necessary for the number of columns used in the formatted table. For example an alignment string of "lc" would center all columns after the first. The default value is "l".

col_width => integer

The minimum allowed column width. This number will be used if there are no items longer than the given integer minus col_space (to allow for a space).

col_space => integer

The minimal amount of spacing to place between each column. The actual column spacing may be larger since this function expands the columns to occupy the total available width. This value defaults to 2.

col_join => string || array

String(s) used to join columns. This option overrides the col_space option. If more columns are used than elements available in the col_join array then the last element will be repeated for all subsequent column dividers.

enumerate => bool
enumerate_start => integer
enumerate_append => string

If enumerate is true, right aligned numbers will be prepended to each item. Enumeration will begin with enumerate_start (default 1), and the enumerate_append string will be appended to each number (default ". ").

indent => integer

Amount of indentation to include on the left side of each line. This number will be taken from the total_width option before the list is formatted. The default is no indentation.

max_cols => integer

The maximum number of columns to create. Sometimes it may be preferable to specify the maximum number of columns rather than the minimum column width.

cols => integer

The exact number of columns to create.

orientation => 'horizontal' | 'vertical'

Specify whether the columns are to be filled horizontally or vertically. For example, if the list of items is (1..9), then the resulting column layouts would be:

 horizontal:              vertical:
 1  2  3  4               1  4  6  8
 5  6  7  8               2  5  7  9
 9                        3

The default orientation is vertical.
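
The two fill orders in the table above can be reproduced with a short sketch. `layout_rows_sketch` is a hypothetical helper that takes a fixed column count; the real function also chooses the number of columns from the available width, which is not modelled here.

```perl
use strict;
use warnings;

# Hypothetical layout helper: returns a list of rows (array refs).
sub layout_rows_sketch {
    my ($orientation, $ncols, @items) = @_;
    my @rows;
    if ($orientation eq "horizontal") {
        # Fill each row left to right before starting the next one.
        push @rows, [ splice @items, 0, $ncols ] while @items;
        return @rows;
    }
    # Vertical: spread items over the columns as evenly as possible,
    # giving the leftmost columns one extra item when needed.
    my ($base, $extra) = (int(@items / $ncols), @items % $ncols);
    for my $col (0 .. $ncols - 1) {
        my $height = $base + ($col < $extra ? 1 : 0);
        $rows[$_][$col] = shift @items for 0 .. $height - 1;
    }
    return @rows;
}

print join("  ", @$_), "\n" for layout_rows_sketch("vertical", 4, 1 .. 9);
```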

total_width => integer

The total number of terminal columns to use. This option tries to find the correct width of the terminal first by using Term::ReadKey, then by examining $ENV{COLUMNS} and finally defaults to 80 characters. If the total_width is negative, then that value will be subtracted from whatever width is auto-detected.

max_width => integer

The maximum number of terminal columns to use. The default is to not constrain the "total_width".

fill_last_column => bool

If true, spaces will be added to the end of the last column in the same way that space is added to the end of all other columns. Otherwise, the last column will not be space padded on the right. The default is false, do not fill the right side of the last column.

uninitialized => 'warn' | 'die' | 'ignore'

Behavior upon finding uninitialized values. The default is 'warn'.

### histogram

 print histogram( \%data, %options )

Returns a text histogram. The data hash consists of id => frequency. The graph looks best if the id's are short and all approximately the same length. The following options may also be provided:

height

The height of the tallest bar of the histogram. DEFAULT: 10

key_order

Either an array containing the order in which to display the histogram data or the keyword 'sort'. DEFAULT: sort

Note: %data may contain more data than is requested in key_order. We will only create a histogram with the key_order data.

max_frequency

The largest frequency. You might want to provide this for two reasons. To provide a uniform scaling over multiple histograms or as an optimization (if you already have this value it would save us the work of recomputing it). By default we will compute it from the key data that we are actually displaying. DEFAULT: undef

bar_width

The width of each histogram bar. If undefined either 1 or the width of the widest label will be used (depending on the value of show_labels). DEFAULT: undef

bar_char

The character to use to draw the bars. DEFAULT: "*"

col_skip

The inter-column spacing. DEFAULT: 1

indent

Amount of indentation to include on the left side of each line. DEFAULT: 0

axis_overhang

The distance beyond the end bars that the histogram extends. DEFAULT: 2

show_axis

Print a horizontal bar beneath the histogram but above the labels. DEFAULT: true

show_labels

Print the labels centered under their respective bars. DEFAULT: true

## :input - Prompting and input

### get_boolean

 get_boolean { default => $d, true => $t, false => $f }, $input

Canonicalizes boolean input (with default). $input may be a string or a filehandle. If $input is omitted then one line of data is read from <STDIN>. This subroutine just tries to match either /^\s*[yYtT1]\w*\s*$/ or /^\s*[nNfF0]\w*\s*$/. Anything else causes the default value to be returned.

If the default value is not set (which is not the same as being set to undef) then the input will be returned as-is if it does not appear to be boolean or is empty or undefined. Using this subroutine in this way is somewhat fragile since something like "truck" or "Typhoon" will be canonicalized to "true" but "The Clash" will not. Thus, it is not very sophisticated in its distinction between boolean and non-boolean inputs.

### Yn

Returns "y" or "n", defaulting to "y", depending on the input. Argument may be a string, filehandle, or empty. If called with no arguments, then a single line is read from <>.

WARNING: This subroutine returns "y" if the input is the empty string or undefined.

### yN

Returns "y" or "n", defaulting to "n", depending on the input. Argument may be a string, filehandle, or empty. If called with no arguments, then a single line is read from <>.

This subroutine returns "n" if the input is the empty string or undefined.

### yn

Returns "y" or "n" if the input appears to be boolean. Argument may be a string, filehandle, or empty. If called with no arguments, then a single line is read from <>.

WARNING: The empty string and undef are not considered to be boolean and will not be canonicalized to "y" or "n".

### Tf

Returns "1" or "0", defaulting to "1", depending on the input. Argument may be a string, filehandle, or empty. If called with no arguments, then a single line is read from <>.

WARNING: This subroutine returns "1" if the input is the empty string or undefined.

### tF

Returns "1" or "0", defaulting to "0", depending on the input. Argument may be a string, filehandle, or empty. If called with no arguments, then a single line is read from <>.

This subroutine returns "0" if the input is the empty string or undefined.

### tf

Returns "1" or "0" if the input appears to be boolean. Argument may be a string, filehandle, or empty. If called with no arguments, then a single line is read from <>.

WARNING: The empty string and undef are not considered to be boolean and will not be canonicalized to "1" or "0".

### prompt

See Also: IO::Prompt(?)

 my $x = prompt();
 my $x = prompt( "prompt" );
 my $x = prompt( "prompt", "help string" );
 my $x = prompt( "prompt", { help hash } );

 my $x = prompt( "prompt", %options );
 my $x = prompt( "prompt", help string/hash, %options );

Prompt the user until valid input is received. The default prompt is '? '. The return value is the user input without the trailing newline. The provided help may be either a help string which will be printed to screen when the help command is given (see below) or may be a hash of command => "help string" pairs which will be used if help for a particular command is requested. The hash value corresponding to the empty hash key ("" => "General help") will be used for the general help response.

help

Declare the help message/hash explicitly.

default

Specify a default response which will be returned if the user provides no response. Specifying this option makes the value of "allow_empty" irrelevant. Value may be set globally by setting $_Util::prompt::default.

allowed

An expression like "help_command" which specifies the allowed input values. A list provides a list of all possible case insensitive inputs. A regular expression may capture a sub-portion of the input line and the captured portion will be used as a canonicalized value. Finally a subroutine is expected to return the canonicalized value of the input. The default is to allow any DEFINED input value. Value may be set globally by setting $_Util::prompt::allowed.

allow_empty

Boolean value which (if true) allows an empty response value. The default is false. Value may be set globally by setting $_Util::prompt::allow_empty.

help_command

A literal string, list of literals, regular expression pattern, or subroutine which determines whether the user has asked for help. If a help hash was provided then patterns should capture the requested command in $1 and subroutines should return the requested command (or undef if the input is not a request for help). The default help_command is '?'. Some valid examples:

 help_command => '?'
 help_command => ['?', 'h ', 'help ']
 help_command => qr/\?\s*(\w*)/
 help_command => sub { ($_[0] =~ /\?\s*(\w*)/) ? ($1 || "help_bar") : undef }

Value may be set globally by setting $_Util::prompt::help_command.

trim

A shortcut to set both "trim_leading" and "trim_trailing" to the same value.

trim_leading
trim_trailing

If true, any leading (resp. trailing) whitespace will be removed from the user's input prior to any processing by this subroutine. The default is true. Values may be set globally by setting $_Util::prompt::trim_leading and $_Util::prompt::trim_trailing.

input_filehandle

Specify the input filehandle. The default is STDIN. Value may be set globally by setting $_Util::prompt::input_filename.

output_filehandle

Specify the output filehandle. The default is STDOUT. Value may be set globally by setting $_Util::prompt::output_filename.

on_undef

Specify what to do when an undefined value is given as input. The following values are recognized:

 return     : causes "prompt" subroutine to immediately return undef
 make_empty : replaces the undefined value with the empty string and continues
 continue   : do nothing in particular ("default" and "allowed" will still apply)

Any other value will cause the script to croak with the value as the error message. The default value is "make_empty". Value may be set globally by setting $_Util::prompt::on_undef.

no_echo

If true, user's input is not echoed to the screen. Value may be set globally by setting $_Util::prompt::no_echo.
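
The three forms accepted by "allowed" (list, regular expression, subroutine) can be pictured with the sketch below. `canonicalize_sketch` is a hypothetical helper, not part of the module; returning the list's own spelling for a case-insensitive match, and the whole input when a pattern has no capture group, are assumptions made for the illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the canonical value, or undef when the
# input is not allowed.
sub canonicalize_sketch {
    my ($allowed, $input) = @_;
    return $input unless defined $allowed;        # default: any defined input
    if (ref $allowed eq 'ARRAY') {                # case-insensitive literals
        my ($hit) = grep { lc($_) eq lc($input) } @$allowed;
        return $hit;
    }
    if (ref $allowed eq 'Regexp') {               # captured part, else whole input
        return undef unless $input =~ $allowed;
        return defined $1 ? $1 : $input;
    }
    return $allowed->($input) if ref $allowed eq 'CODE';
    return lc($allowed) eq lc($input) ? $allowed : undef;    # single literal
}

print canonicalize_sketch([qw(yes no)], "YES"), "\n";
print canonicalize_sketch(qr/^\s*(\d+)\s*$/, " 42 "), "\n";
```

A prompt loop built on such a helper would simply ask again whenever the result is undefined.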

## :plot - Graphs and Plots

### plot_colors

 plot_colors( $n );
 plot_colors( $n, %colors_options );

Return a list of $n colors that are well suited to plotting. The colors are chosen to be visually distinct; however, if $n is large enough (more than 13) you will get a rainbow of colors.

Any options supported by colors can be provided and will be passed along, including the n and colors options, so you probably don't want to include those options.

### ps_barchart !!incomplete

 ps_barchart( \@data );
 ps_barchart( \@data, %options );
 ps_barchart( %data_and_options );

Generate a postscript barchart.

Examples:

 my @x = map { int(rand(20)) } 1..15;
 my @y = map { int(rand(20)) } 1..15;
 my @z = map { int(rand(20)) } 1..15;

 # A simple dynamic web graph:
 print "Content-Type: image/png\n\n", ps_barchart \@x;

 # Neighboring bars:
 ps_barchart file => "graph.png",
             data => [ foo => \@x, bar => \@y, baz => \@z ];

 # Stacked bars: ( [ [foo => \@x], [bar => \@y], ... ]  is also OK. )
 ps_barchart file    => "graph.gif", style => "stacked",
             xlabels => [qw/ay bee cee dee ee ef gee ach eye jay kay ell em en oh/],
             data    => [ foo => \@x, bar => \@y, baz => \@z ];

 # xlabels are dates, bars are already tiered ($x[$i] <= $y[$i] <= $z[$i] for all $i):
 ps_barchart file    => "graph.gif", style => "prestacked",
             xlabels => [qw/2005-01 2005-02 2005-03 2005-04 2005-05 2005-06 2005-07 2005-08/],
             timefmt => "%Y-%m", format => ["x %b %y", "y %g" ],
             data    => [ foo => \@x, bar => \@y, baz => \@z ];

##XXX: Alas, I still have to go through and make it be able to handle a proper histogram

## :image - Image Routines

### compile_latex

Compiles a LaTeX file. The following options are accepted.

latex

An integer specifying the number of times latex is to be run. Reasonable values are 1 (the default) or 2 (if your document has references which need to be resolved).

compiler

Arrayref containing compile command to use. Auto-chosen from latex, pdflatex, or perltex (each running in batch mode; perltex can handle either latex or pdflatex documents) by looking for pdftex option on \documentclass command line (may be in comment at end of line) or (uncommented) perltex \usepackage command.

pdftex

Set to true if latex compiler produces pdf documents rather than dvi documents.

print

1 or printer name. Will be printed using dvips.

dvips

1/0 creates a PostScript file.

dvipdf

1/0 creates a PDF file.

bibtex

1/0 runs BibTeX at the right time.

index

1/0 runs makeindex at the right time.

 compile_doc $file | $dir | [ paths ],
   output => [qw/ pdf ps /],

   compile => [qw/ latex1 bibtex makeindex ... latex /],

   # set to reasonable defaults for all known thinguns
   compile_bibtex_command => [ command prefix ],   # only reasonable for $file or $dir calls
   compile_bibtex_command => sub { passed file or dir or paths },

   # called with output of prev command in $_; returns true if need to call bibtex
   # called with arguments:
   #   ( command_chain  => [qw/ latex1 bibtex /],               # commands called so far (in order)
   #     command_output => { latex1 => ..., bibtex => ... } )   # output from commands (most recent call only)
   compile_bibtex_test    => sub { grep /\.bib$/, glob('*') },
   compile_makeindex_test => sub { my %o = @_; my $res = $o{command_output}{latex1} || $o{command_output}{latex}; $res =~ /run makeindex/ },

   # If true, test will be performed up to Int times and bibtex will be
   # called up to Int times
   compile_bibtex_multi => Int,

   convertto_pdf => [ ... list of preferred sources ... ],

   convert_dvi_pdf => [ ... command prefix, filename.dvi is appended ... ],
   convert_pdf_ps  => sub { called in chdir, given filename.pdf as arg, must produce filename.ps },
   ...

### tex2image

Given a string of LaTeX code, returns an image file as a "string". The following options may be provided after the LaTeX string. Also, all options available to compile_latex are accepted in this function.

file

Save output to the indicated file instead of returning the image as a string.

type

Specify the save file type. This should be a standard "file extension" for the desired output (E.g. "gif" or "png"). The default output is an EPS file. (The ImageMagick command "convert" must be available on your system for this option to succeed.)

A header string placed between \documentclass{article} and \begin{document}. Only useful if input tex code does not include \begin{document} or \documentclass{article}.

Note: \usepackage{color} and \pagestyle{empty} are always included if either \begin{document} or \documentclass{article} are missing in the provided LaTeX string.

convert_args

Additional arguments to pass to convert when making the image. By default this is ["-transparent", "white"].

X

Specify the X resolution (default 144)

Y

Specify the Y resolution (default 144)

color
pagecolor

Specify the color or page color. Each may be an RGB hex triplet ("#40036f", the "#" is required!), a LaTeX named color (red | green | blue | yellow | cyan | magenta | black | white and perhaps others depending on the DVI driver), a single number representing a gray value, an "r,g,b" triplet, or a "c,m,y,k" quadruple. All numbers are fractions between 0 and 1, inclusive. The default values are "black" and "white" respectively.
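
To show how the accepted color forms relate to the color models of the LaTeX color package, here is a hedged sketch. `latex_color_args_sketch` is a hypothetical helper that turns a spec into the arguments one could pass to \color or \pagecolor; how tex2image actually translates these values is not documented here.

```perl
use strict;
use warnings;

# Hypothetical helper: map a color spec to arguments for \color / \pagecolor.
sub latex_color_args_sketch {
    my ($spec) = @_;
    # "#rrggbb": scale each byte to a 0..1 value for the rgb model.
    if ($spec =~ /^#([0-9a-f]{2})([0-9a-f]{2})([0-9a-f]{2})$/i) {
        my @rgb = map { sprintf("%.3f", hex($_) / 255) } ($1, $2, $3);
        return "[rgb]{" . join(",", @rgb) . "}";
    }
    my @n = split /,/, $spec;
    my $all_numeric = !grep { !/^\s*(?:\d+\.?\d*|\.\d+)\s*$/ } @n;
    return "{" . $spec . "}" unless @n && $all_numeric;    # named color
    # The number of components selects the color model.
    my %model_for = (1 => "gray", 3 => "rgb", 4 => "cmyk");
    my $model = $model_for{ scalar @n } or return undef;
    s/\s+//g for @n;
    return "[" . $model . "]{" . join(",", @n) . "}";
}

print "\\color", latex_color_args_sketch($_), "\n"
    for "#40036f", "red", "0.5", "1,0,0", "0,0,0,1";
```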

### magic_convert !UNIMPLEMENTED

 magic_convert $file, %options
 magic_convert $old_file, $new_file, %options
 magic_convert \@files, %options
 magic_convert \@files, $dir, %options

Convert file types and resize images. Valid options are given below. Colors may be given as (X11) color names or RGB hex triplets.

format => $ext

Specify an output file format. This "option" is required for all invocation styles except for the second, where the output format will be guessed from the $new_file name if this option is not provided.

transparent => $color

If the target image type supports transparency, then the specified color will be made transparent during the conversion.

grow => $grow

A boolean value which, if true, indicates that the image should be enlarged in order to fit maximally into the specified resolution / size.

size => $WIDTHxHEIGHT or \@width_and_height

Specified either as a list of two elements or a string of the form "640x480", this option forces the image to fit within a box of the given size.

resolution => $value

For vector-based inputs the resolution will affect the resulting image size. Note that the "max_size" option will override this option under most circumstances.

intent => "icon" | "thumbnail" | "web" | "email" | "screen" | "print" | "hires"

A fuzzy way to set the "resolution" and "size" options to reasonable (by current technology standards) sizes. The "icon" intent will aim for an image of size 128x128. The "thumbnail" intent will limit the image to a 250x250 box. The "web" and "email" intents assume 640x480 screens, while the "screen" intent assumes a 1024x768 screen. The "print" intent assumes a 5"x5" image at 300 DPI and the "hires" intent assumes 5"x5" at 600 DPI.

## :LaTeX - LaTeX generating routines

### quotetex

Like quotemeta, but makes strings LaTeX safe. Replaces all LaTeX special characters with replacements which will correctly compile in LaTeX.
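
A minimal sketch of this kind of escaping follows. The name `quotetex_sketch` and the replacement table are assumptions for illustration; the module's own replacements may differ (for example in how it renders a backslash or tilde).

```perl
use strict;
use warnings;

# Hypothetical escaping table; the module's own replacements may differ.
my %tex_escape_sketch = (
    '\\' => '\\textbackslash{}',
    '~'  => '\\textasciitilde{}',
    '^'  => '\\textasciicircum{}',
    map { $_ => "\\$_" } ('{', '}', '$', '&', '#', '_', '%'),
);

sub quotetex_sketch {
    my ($s) = @_;
    # A single pass, so braces introduced by a replacement are not re-escaped.
    $s =~ s/([\\{}\$&#_%~^])/$tex_escape_sketch{$1}/g;
    return $s;
}

print quotetex_sketch('50% of $x_1 & {more}'), "\n";
```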

### tree2tex

 tree2tex \%tree, %options

Convert arbitrarily nested HoH's to the LaTeX code which will produce a tree diagram,

 # This,
 { A => { b => 1, c => 1 }, B => { f => { e => 1, f => 1 }, g => 1 } }

 # Becomes code that produces this,
   A -+- b
      |
      +- c

   B -+- f -+- e
      |     |
      |     +- f
      +- g

The leaf nodes may point to any value which is not a reference. You will need to \usepackage{pstricks,pst-node} for the code produced by this subroutine to function properly. Accepted options,

column_spacing

A LaTeX measurement for the amount of spacing to use between each column. This amount is placed before and after each column (using the LaTeX \tabcolsep variable) so should be half of the actual desired column spacing. The default is "1.5em".

row_stretch

A multiplier for the row stretch. Used to set the LaTeX multiplier \arraystretch. The default is 1.

tabular_format

The tree is built using the tabular environment. This option sets the format for the tabular. If the format is a single character then it will be duplicated for each level of your tree. Otherwise, you will need to make sure that you include enough columns for your diagram (one column for each level of the tree). The default is "l".

node_label_start

The starting node label. Useful if you are using alphabetic node labels elsewhere in your document. The default node labeling is "A".."Z","AA",... using the perl magic incrementer. Meaningful values for node_label_start are all-caps words, all-lower-case words, or numbers.

sort

Boolean value dictating whether we should sort the key values. The default is to sort tree nodes. Set this to false if you have a tied hash which will return keys in your desired order. Some modules which may help with this,

 Tie::Hash::Sorted                    - specify your own sort function
 Tie::IxHash or Tie::Hash::Indexed    - key order is insert order

use_leaf_values

If the leaves of your tree point to useful string values then you may specify use_leaf_values => 1 to have this subroutine use the leaf values as labels for the leaves rather than the leaf keys.

vertical

Boolean value which, if true, tells the subroutine to transpose the resulting tree. This has the effect of putting the root nodes across the top rather than down the left side.

Note: This still needs some work. In particular, when the matrix is transposed, the labels are not centered above their children.

nc

Node connection type. May be any LaTeX node connection type. Currently must be one of: line, Line, curve, arc, bar, diag, diagg, angle, angles, loop, circle. The default is "angles".

node_sep

A LaTeX measurement for the amount of spacing to place around each node. The default is "1ex".

## :html - HTML utilities

### uri

 my $uri = uri($base, @path_components, \%query_params );

### js_toggle

 js_toggle [ label1 => id1, label2 => id2, ... ], %options;
js_toggle [ label1 => [id1a, id1b, ...], label2 => id2, ... ], %options;
js_toggle [ [displayed_label1, hidden_label1], [id1a, id1b, ...], ... ], %options;

Constructs a list of html snippets that can be placed in a document that will switch on and off the indicated ids. Ids may be associated to multiple labels.

The options supported are given below.

Toggle buttons simply show and hide the corresponding IDs. Radio buttons always show the associated ids and hide all other ids. The default behavior is "toggle".

id_prefix => $prefix

Each label will be wrapped in a <span id="ID"> tag. The "ID" must be unique for each page. To ensure this, the function appends an integer to the "id_prefix" which is incremented as necessary (the incremented value is remembered between calls to js_toggle). The default prefix is "JST", but can be changed using this option.

reset_counter => $bool

If true, the counter used to ensure that ids are unique will be reset to zero at the end of the function call. This can be helpful if you want to include style information for the generated labels in a style sheet (though you could also wrap your label in a <span> tag before passing it to this function). The default is to not reset the counter.

visibility => $visibility

Indicates the initial visibility state of the items (only relevant if "hidden" labels are provided). $visibility may be a label or list of labels listing those labels which will be displayed when the page loads (you will need to manage the page styles to ensure this). Alternatively, $visibility may be "1", "0", or a list of "1"'s and "0"'s indicating which items (by position) will be visible. Specifying just "1" or "0" indicates that all or none of the objects will be initially visible. The default is to assume that the first item is visible and that all others are hidden.

#XXX: Ugh, this is confusingly worded! (and wrong!)

display => \@displaystyles

A list of display styles to be used for making objects visible. This will typically be "block" or "inline", but CSS 2 allows lots of things (list-item, table, ...). Any display styles left undefined will default to "block". The size of the displaystyles list should correspond to the length of the concatenated ids without removing duplicates. For example:

 js_toggle [ foo => ["A", "B"], bar => ["B", "C"], baz => "D" ],
     display => [qw/ block inline block block table/];

Thus, display styles may depend on the label the object is currently associated with.

use_functions => $bool

*** NOT IMPLEMENTED ***

The variable $_Util::js_toggle_functions includes a function "js_toggle_display" which can be used by this subroutine to decrease the amount of inline javascript. This can reduce bandwidth by quite a bit if this code is placed in an external file, or by a little bit if placed in the page <head> in a <script> block. If this option is set to a true value, then it will be assumed that the function "js_toggle_display" is available and it will be used. You will need to ensure that the code in $_Util::js_toggle_functions is inserted into the web page in the appropriate fashion.

### libxml_doc

 libxml_doc( $thing )
 libxml_doc( $thing, $parser )
 libxml_doc( $thing, type => '<TYPE>' )
 libxml_doc( $thing, $parser, type => '<TYPE>' )

Construct a XML::LibXML HTML document object easily and quietly. $thing can be a filename (or something that stringifies to a filename), a URL, or actual HTML. Alternatively, you can be specific and specify one of the three types: 'FILE', 'URL', or 'HTML'. The real benefit of this subroutine though is that all XML::LibXML error messages are discarded.

XML::LibXML and LWP::Simple will be automatically loaded if necessary.

## :sql - Database manipulation routines

### sql_hash_multi

Use DBIx::Simple: map_pairs { push @{$hash{$a}}, $b } $db->query($sql, @stuff)->flat;

 sql_hash_multi( $dbh, $sql, \@stuff, %options );

Prepares and executes a database request for database pairs returned by the query. Query should produce exactly two values per row. Return value is the hashref constructed from the row pairs with first column as keys and arrays of the second column as values. Hash values will be arrays even if only one result is returned for the corresponding key. @stuff is an optional list of placeholder substitutions (note: not optional when using the closure form).

options:

closure

If true, a sub ref is returned instead that will execute and fetch values using a prepared statement handle. When calling the closure, an array of parameters must be provided (even if empty).

 my $get_hash = sql_hash_multi( $dbh, $sql, closure => 1 );
 $hash = $get_hash->([]);
 $hash = $get_hash->([]);
 $hash = $get_hash->([]);

 # clean up statement handle when finished
 $get_hash->();
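
The hash-of-arrays return shape described above can be sketched in plain Perl without a database. This is only an illustration: `group_pairs` is a hypothetical helper, not part of Dean::Util, and the rows are made-up sample data standing in for a two-column query result.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Dean::Util): group [key, value] rows
# into a hash of arrays, keyed on the first column.
sub group_pairs {
    my @rows = @_;
    my %hash;
    push @{ $hash{ $_->[0] } }, $_->[1] for @rows;
    return \%hash;
}

# Made-up rows, shaped like the result of a two-column query.
my $result = group_pairs(
    [ fruit => "apple" ],
    [ fruit => "pear" ],
    [ veg   => "leek" ],
);

# Every value is an array ref, even when only one row matched the key.
print join( ",", @{ $result->{fruit} } ), "\n";    # apple,pear
print join( ",", @{ $result->{veg} } ),   "\n";    # leek
```

Keeping single-row keys as one-element arrays means callers never have to test whether a value is a scalar or an array ref.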

### sql_hash

Use DBIx::Simple: %hash = $db->query($sql, @stuff)->flat;

 sql_hash( $dbh, $sql, \@stuff, %options );

Prepares and executes a database request for database pairs returned by the query. Query should produce exactly two values per row. Return value is the hashref constructed for the row pairs. @stuff is an optional list of placeholder substitutions (note: not optional when using the closure form).

options:

closure

If true, a sub ref is returned instead that will execute and fetch values using a prepared statement handle. When calling the closure, an array of parameters must be provided (even if empty).

 my $get_hash = sql_hash( $dbh, $sql, closure => 1 );
 $hash = $get_hash->([]);
 $hash = $get_hash->([]);
 $hash = $get_hash->([]);

 # clean up statement handle when finished
 $get_hash->();

### sql_col

Use DBIx::Simple: @col = $db->query($sql, @stuff)->flat;

 sql_col( $dbh, $sql, \@stuff, %options );

Prepares and executes a database request for an entire table column returned by the query. Return value is an arrayref of zero or more values in the column. @stuff is an optional list of placeholder substitutions (note: not optional when using the closure form).

options:

closure

If true, a sub ref is returned instead that will execute and fetch values using a prepared statement handle. When calling the closure, an array of parameters must be provided (even if empty).

 my $get_col = sql_col( $dbh, $sql, closure => 1 );
 $col = $get_col->([]);
 $col = $get_col->([]);
 $col = $get_col->([]);

 # clean up statement handle when finished
 $get_col->();

### sql_all

Use DBIx::Simple: @hashes = $db->query($sql, @stuff)->hashes;

 sql_all( $dbh, $sql, \@stuff, %options );

Prepares and executes a database request for all table rows returned by the query. Return value is an arrayref of zero or more hashrefs describing the result. @stuff is an optional list of placeholder substitutions (note: not optional when using the closure form).

options:

closure

If true, a sub ref is returned instead that will execute and fetch values using a prepared statement handle. When calling the closure, an array of parameters must be provided (even if empty).

 my $get_em = sql_all( $dbh, $sql, closure => 1 );
 $rows = $get_em->([]);
 $rows = $get_em->([]);
 $rows = $get_em->([]);

 # clean up statement handle when finished
 $get_em->();

name

Used to set name canonicalization option in DBI fetcher. May be "lc" or "uc". Anything else returns keys in database case (depends on database).

### sql_one

Use DBIx::Simple: $result = $db->query($sql, @stuff); while ($row = $result->hash) { ... }

 sql_one( $dbh, $sql, \@stuff, %options );

Prepares and executes a database request for a single table row. Return value is a unique hashref describing the result row or undef if no results were found. @stuff is an optional list of placeholder substitutions (note: not optional when using the closure form).

options:

closure

If true, a sub ref is returned instead that will execute and fetch values using a prepared statement handle. When calling the closure, an array of parameters must be provided (even if empty).

 my $get_one = sql_one( $dbh, $sql, closure => 1 );
 $row = $get_one->([]);
 $row = $get_one->([]);
 $row = $get_one->([]);

 # clean up statement handle when finished
 $get_one->();

name

Used to set name canonicalization option in DBI fetcher. May be "lc" or "uc". Anything else returns keys in database case (depends on database).

### sql_value

Use DBIx::Simple: ($value) = $db->query($sql, @stuff)->list;

 sql_value( $dbh, $sql, \@stuff, %options );

Prepares and executes a database request for a single value. Return value is requested value. Sub returns an empty list (that is, no value) if no rows matched the query. @stuff is an optional list of placeholder substitutions (note: not optional when using the closure form). If more than one column is returned by the query, an array ref of the row (will be a copy of the DBI arrayref) will be returned.

options:

closure

If true, a sub ref is returned instead that will execute and fetch values using a prepared statement handle. When calling the closure, an array of parameters must be provided (even if empty).

 my $get_val = sql_value( $dbh, $sql, closure => 1 );
 $value = $get_val->([]);
 $value = $get_val->([]);
 $value = $get_val->([]);

 # clean up statement handle when finished (not really necessary here)
 $get_val->();

### sql_insert

Use DBIx::Simple: $db->insert($table, \%stuff);

 sql_insert( $dbh, $table, \%stuff, %options );

Prepares and executes a table insert. Return value is the result of the statement execution.

options:

closure

If true, a sub ref is returned instead that will perform inserts using a prepared statement handle. The data in the initial call will not be inserted so can simply be a "template" hash showing which columns will be needed.

 my $my_insert = sql_insert( $dbh, $table, \%stuff, closure => 1 );
 $my_insert->( \%stuff );
 $my_insert->( \%stuff1 );
 $my_insert->( \%stuff2 );

 # clean up statement handle when finished
 $my_insert->();

on_conflict

Algorithm to deal with conflicts. See: http://www.sqlite.org/lang_conflict.html

Support Matrix (partial support):

 ROLLBACK ABORT FAIL IGNORE REPLACE
 Pg Yes Yes Yes
 SQLite Yes Yes Yes Yes Yes

primary_key

Possibly needed for on_conflict => REPLACE by some drivers. Is scalar or arrayref of columns to use as primary key. If not provided, the DBI method primary_key will be used to try to determine the primary key columns. This option can be used to override or to avoid computation of the auto-determined values.

## :postscript - PostScript generating routines

### psplot_sub

This subroutine takes one argument, a subroutine reference followed by some or all of the following options. In scalar context the postscript code which draws the graph is returned. In list context, a hash reference which contains the actual option values used is also returned. This can be used to position other information around the graph.

at

Relative translation (an array ref, [dx, dy], specifying bottom left corner). Default: [0,0]

color

RGB triplet in percentages (0 <= percent <= 1). Default: [0,0,0]

intervals

Number of intervals to cut the region into. Default: 100

xscale

Length in points of a unit vector on the x-axis. Default: 1

xmin

Minimal x value. Default: -10

xmax

Maximal x value (will be set from width/xscale if not defined). Default: undef

yscale

Length in points of a unit vector on the y-axis (will be set from height/ymin/ymax if a height is provided). Default: 1

ymin

Minimal y value (will DWIM if not defined. Will chop the graph if set too high). Default: undef

ymax

Maximal y value (will DWIM if not defined. Will chop the graph if set too low). Default: undef

width

Width of the graph in points (72 points = 1 inch). Default: undef

height

Height of the graph in points (72 points = 1 inch). Default: undef

To create a postscript document, you will need a header something like the following:

 print <<HEADER;
 %!PS-Adobe-2.0
 %%Creator: $ENV{USER}
 %%Title: Raster Plot
 %%BoundingBox: -10 -10 500 500
 %%Magnification: 1.0000
 HEADER

 print psplot_sub( ... );
 print "\nshowpage\n";

### psplot_parametric_sub

This subroutine takes one argument, a subroutine reference followed by some or all of the following options. The provided subroutine should return a list of two values for a given input. In scalar context the postscript code which draws the graph is returned. In list context, a hash reference which contains the actual option values used is also returned. This can be used to position other information around the graph.

at

Relative translation (an array ref, [dx, dy], specifying bottom left corner). Default: [0,0]

color

RGB triplet in percentages (0 <= percent <= 1). Default: [0,0,0]

intervals

Number of intervals to cut the t-interval into. Default: 100

tmin

Minimum t value. Default: 0

tmax

Maximal t value. Default: 10

xscale

Length in points of a unit vector on the x-axis (will be set from width/xmin/xmax if a width is provided). Default: 1

xmin

Minimal x value (will DWIM if not defined. Will chop the graph if set too high). Default: undef

xmax

Maximal x value (will DWIM if not defined. Will chop the graph if set too low). Default: undef

yscale

Length in points of a unit vector on the y-axis (will be set from height/ymin/ymax if a height is provided). Default: 1

ymin

Minimal y value (will DWIM if not defined. Will chop the graph if set too high). Default: undef

ymax

Maximal y value (will DWIM if not defined. Will chop the graph if set too low). Default: undef

width

Width of the graph in points (72 points = 1 inch). Default: undef

height

Height of the graph in points (72 points = 1 inch). Default: undef

To create a postscript document, you will need a header something like the following:

 print <<HEADER;

Print the information @text to $ofstream if $info_level is greater than or equal to $level. Returns 1 if message was printed and 0 if it was not. $info_level defaults to $CALLER::INFO_LEVEL and $ofstream defaults to $CALLER::LOG if it is a GLOB or IO::* object. A newline will be appended to the last string of @text if it is not already present.

The default INFO_LEVEL is 0. The default LOG is STDERR.

NOTE: $INFO_LEVEL and $LOG must be package variables (declared with our or use vars) for this function to work correctly.

### DEBUG

 DEBUG( @text );

Calls info with a level of 0. Also prefixes each line of text with "DEBUG: ".

### INFO

 INFO( @text );

Calls info with a level of 1. Also prefixes each line of text with "INFO: ".

### NOTICE

 NOTICE( @text );

Calls info with a level of 2. Also prefixes each line of text with "NOTICE: ".

### WARNING

 WARNING( @text );

Calls info with a level of 3. Also prefixes each line of text with "WARNING: ".

### ERR

 ERR( @text );

Calls info with a level of 4. Also prefixes each line of text with "ERR: ".

### ERROR

 ERROR( @text );

Calls info with a level of 4. Also prefixes each line of text with "ERROR: ". (is an alias for ERR())

### CRIT

 CRIT( @text );

Calls info with a level of 5. Also prefixes each line of text with "CRIT: ".

### ALERT

 ALERT( @text );

Calls info with a level of 6. Also prefixes each line of text with "ALERT: ".

### EMERG

 EMERG( @text );

Calls info with a level of 7. Also prefixes each line of text with "EMERG: ".

## :system - System / sysadmin tools

### pidof

 my @progs = pidof $program;
 my @progs = pidof %opts;

Searches /proc for running programs matching the given name or options. Any option will be matched against the corresponding value via smartmatch, EXCEPT for the pid option, which must be an exact PID or array of PIDs.

program

Name of command (excludes path part).

command

Command - includes path if used in execution of program which makes this a bit unreliable if the command is started from a command prompt.

cmdline

Contents of /proc/$pid/cmdline, namely the command and command line arguments joined by NULL characters. Programs without command line arguments will immediately fail.

args

Matched against array of just the command line arguments $VALUE ~~ @args. Programs without command line arguments will immediately fail.

pid

PID or array of PIDs to examine.

user

User name

uid

User id

group

Group name

gid

Group id

### do_as

 do_as "username", sub { ... };
 do_as "username:groupname", sub { ... };

Locally change the effective user id and execute some code. Only works if current user is root!

Ensures that $ENV{USER} and $ENV{HOME} are set appropriately. Will eventually include options which will attempt to set up DISPLAY, DBUS, XAUTH, SSH_AGENT, GPG_AGENT, and other variables useful for running and connecting to existing X sessions, apps, and daemons of the user.

## :op - Core function extensions

### pmap(&@)

Parallel map. Applies function to each item in input list. Evaluation order is not defined, however, result array will be ordered as if the map were performed sequentially. Function is called in list context and may produce any list of items serializable by Storable.

 # Quickly convert a bunch of images to png:
 pmap { my $old = $_; s/\.[^.]+/.png/; system convert => $old => $_ } @images;

 # Result order matches input order
 use Time::HiRes qw/ sleep /;
 say join " ", pmap { sleep(my $sleep = rand); say "$_: Hello ($sleep)"; $_ } 0..9;

$_ is set to each value in turn, though note that $_ will be a copy, not an alias. Therefore modifications to $_ will not be preserved as they would be with the built-in map.

Overhead is reasonably small, but there is little reason to use this function if your tasks finish quickly. Rough "worst case" benchmarks (on Linux):

 $_Util::pmap::threads = 2;
 pmap { say "Hello" } 1..1;          # 0.028559 seconds
 pmap { say "Hello" } 1..1_000;      # 0.027256 seconds
 pmap { say "Hello" } 1..10_000;     # 0.067582 seconds
 pmap { say "Hello" } 1..100_000;    # 0.556916 seconds
 say "Hello" for 1..100_000;         # 0.011928 seconds

 @x = pmap { $_ + 1 } 1..1;          # 0.032267 seconds
 @x = pmap { $_ + 1 } 1..1_000;      # 0.030821 seconds
 @x = pmap { $_ + 1 } 1..10_000;     # 0.098850 seconds
 @x = pmap { $_ + 1 } 1..100_000;    # 0.660198 seconds
 @x = map  { $_ + 1 } 1..100_000;    # 0.024077 seconds

Optimizations:

Configuration:

The following variables can be used to control the thread objects used. Their default values are shown.

 %_Util::pmap::t_opts = (stack_size => 16*4096);
 $_Util::pmap::threads = sub { ... };

### subopts

 my %opt = subopts( \@_, OPTIONS )

"Parse" subroutine options into an options hash. Handles mixtures of positional and named parameters, default values, parameter validation, and other features. Parsing of the argument array is controlled by the following options:

positional => A
 my %opt = subopts( [1,2,3], positional => [qw/foo bar baz/] );
 # %opt = (foo => 1, bar => 2, baz => 3)

List of positional argument names.

p6_positional => A

Like positional, but processing stops at the first "known" key value. This allows for Perl 6-like flexi-parameters. This, however, is somewhat dangerous if data may match key names. For example:

 # Uh-oh: parses as ( date => "Jan 1", late => 0 )
 my %opt = subopts( ["Jan 1", "late"],
                    p6_positional => [qw/date note/],
                    allowed => ["late"],
                    validate => { late => "bool" }
                  );

defaults => Ho*

Default values. Values may be arbitrary objects. Subroutine values may be expanded if 'eval_defaults' option is provided.

required => A

Any required parameters must not be undefined. Defaults are processed before requiredness is considered so any required parameter with a valid default will never cause an error.

validate => Ho*

Hash of validators. Keys are parameter names, values are validators.

 Sub validators:
   arguments: $value, \%params_so_far
   return: BOOL | SCALAR_REF

 Regexp validators:
   not automatically anchored, be sure to anchor your patterns if you want that

untaint => BOOL | HoBOOL

If true, any parameters satisfying their corresponding validator will be untainted. Parameters without a validator will not be untainted.

allowed => A

Key names (in addition to 'required' key names) which may appear in the options.

sloppy_known => BOOL

By default, 'defaults' and 'validate' hashes are ignored when considering 'allowed' option keys (so that the same default and validate hashes may be used for multiple subs). Specifying this option will include their keys in the list of 'allowed' keys.

no_dups => BOOL

By default,

 %opt = subopts( [ foo => 1, foo => 2 ] )

will set $opt{foo} = 2. If 'no_dups' is true, this sub will throw an error at any duplicated key names.

eval_defaults => BOOL

If true, any default values which are sub refs will be executed and their return values used. Useful if a default value is expensive to compute. Default subs are called with two parameters: the key name and the current options hash. The options hash WILL contain ALL user-set parameters, but there are no guarantees about the order in which the defaults are expanded.

### modtime

Stupid! use Path::Class::File->stat->mtime

 print "File last modified: " . localtime( modtime($f) )

### SPLIT

Split an expression on a pattern ignoring split patterns within delimited text.

 SPLIT PATTERN, EXPR, LIMIT
 SPLIT PATTERN, EXPR
 SPLIT PATTERN
 SPLIT

Split PATTERN may be a string literal, qr// regular expression, or hashref containing splitting options. Beware, unlike Perl's split builtin, this function does not currently support captures in PATTERN. This may be fixed at some point in the future.

If EXPR is missing, a splitting subroutine is generated and returned.

 my $splitter = SPLIT;
 my $splitter = SPLIT qr/\s*,\s*/;
 my $splitter = SPLIT \%options;

 my @pieces = $splitter->( $text );
 my @pieces = $splitter->( $text, $limit );

 my @pieces = SPLIT qr/\s*,\s*/, $text;

 child  => CODE || $exec_string || \@exec_command
 error  => CODE || 'text to die by'
 ignore => BOOLEAN

BUGS: "ignore" option doesn't work (local SIGCHLD is useless). Not sure how to fix it.

Want:

 * no zombies
 * children to be killed when parent exits
 * to be able to fork from subs without globals (thus, open "|-" probably bad unless we keep them in a package var)

### gzdo

Works just like the builtin do command, but reads the file using the PerlIO gzip layer. Just like the builtin command, this function will search @INC and update %INC. However, in addition this function will also attempt to append a ".gz" extension and will read that file if it exists (or exists in @INC).

The following package variables modify the behavior of this subroutine:

$_Util::gzdo::gzip_layer_options

Defaults to "(autopop)", this string is appended to the open MODE. Set to the empty string to disable automatic file type checking.

$_Util::gzdo::gz_extension

Defaults to ".gz", setting this string affects the gzip extension that will be appended to the file name if necessary. Set to a false value to disable.

### hpush(\%@)

 hpush %hash, key1 => $value, key2 => $value2, ...

Add pairs to an existing hash. Keys already existing in %hash are overwritten.

### hdefaults(\%@)

 hdefaults %hash, key1 => $value, key2 => $value2, ...

Add pairs to an existing hash. Keys already defined in %hash are preserved.

### subhash

 my %shallow = subhash( \%orig, @keys );

Extract keys from a hash. Similar to:

 @shallow{@keys} = @orig{@keys};

But does not auto-vivify when key does not exist in %orig (and does not create key in %shallow).

### parse_date

 my $dt  = parse_date( $string, %opt );
 my $dt2 = parse_date( $dt1, %opt );

Parses a date and then converts it to a DateTime object. If the input is already a DateTime object, it will be CLONED and returned.

Some date formats specifically guaranteed by this function:

 2006:08:28 20:56:25     # Stored in exif date fields by my camera

floating

If true, time zone information will not be included in the DateTime object.

clone

Defaults to true. When true, DateTime objects passed to this function will be cloned before being returned.

### map_pairs(&@)

See also: List::MoreUtils pairwise or List::Util 1.33 pairmap

Applies a function passing two elements from the array at a time. That is, given a function &f and a list of inputs, x1, x2, ..., this function returns the list ( f(x1, x2), f(x3, x4), ... ). The function may be a code block and may take either two arguments or use $a and $b as in perl's sort function. If there are an odd number of elements in the list the last iteration will be called with undef as the second parameter ($b).

Example:

 @z = map_pairs { "$a:$b\n" } %hash;

Note: The return list will not be constructed if this function is called in void context. Therefore you are not a bad person if you do the following:

 map_pairs { print "$a:$b\n" } %hash;
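
The pairwise calling convention described above can be sketched as follows. This is a hypothetical re-implementation for illustration only (`my_map_pairs` is not part of Dean::Util and the real code may differ); it aliases the caller's $a and $b with the glob-localization idiom and also passes the pair as arguments.

```perl
use strict;
use warnings;

# Hypothetical sketch, not Dean::Util's implementation.
sub my_map_pairs(&@) {
    my ( $code, @list ) = @_;
    my $want = defined wantarray;    # false in void context
    my $pkg  = caller;
    my ( $x, $y );
    no strict 'refs';
    # Make the caller's $a and $b aliases for $x and $y while we run.
    local *{"${pkg}::a"} = \$x;
    local *{"${pkg}::b"} = \$y;
    my @out;
    while (@list) {
        ( $x, $y ) = splice @list, 0, 2;    # $y is undef for an odd tail
        if   ($want) { push @out, $code->( $x, $y ) }
        else         { $code->( $x, $y ) }
    }
    return @out;
}

my @pairs = my_map_pairs { "$a=$b" } ( one => 1, two => 2 );
print "@pairs\n";    # one=1 two=2
```

Skipping the result list in void context is what makes the side-effect-only usage shown above cheap.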

### map_pair(&&@)

Applies a pair of functions on a flat list of tuples. Given two functions, \&f and \&g, and a list of inputs, x1, x2, ..., this function returns the list ( f(x1), g(x2), f(x3), g(x4), ... ).
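
A minimal sketch of the alternating application, assuming each element is passed to the function as its only argument (how the real map_pair passes values is not stated here, so that is an assumption). `my_map_pair` is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical sketch, not Dean::Util's implementation.
sub my_map_pair(&&@) {
    my ( $f, $g, @list ) = @_;
    my @out;
    for my $i ( 0 .. $#list ) {
        # x1, x3, ... (even indexes) go to f; x2, x4, ... go to g.
        my $fn = $i % 2 ? $g : $f;
        push @out, $fn->( $list[$i] );
    }
    return @out;
}

my @out = my_map_pair( sub { uc $_[0] }, sub { $_[0] * 2 }, qw/a 1 b 2/ );
print "@out\n";    # A 2 B 4
```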

### deep_eq

Test if two complex (possibly circular) data structures are equal.

Solution based on code by Roy Johnson (http://www.perlmonks.org/?node_id=304250). Modified to match my indenting style and to fix some bugs in the original. I have also made it safe to use on blessed and circular objects.

### SYSTEM

Note: The Dean::Util::safe_pipe() is generally a better choice since that sub checks the exit status of the command.

Works like the perl system command except that any string expressions passed by reference will not be quoted. This allows, for example, pipes and redirects while still allowing safe escaping of arguments.

If the first argument to SYSTEM is a reference to the string "DEBUG" then the escaped command will be printed to STDERR before being executed.

EXAMPLES:

 # A truly silent mplayer
 SYSTEM "mplayer", "Movie Files/LOTR: trailer 2.mov", \">/dev/null", \"2>/dev/null";

 # multiple commands in one line.
 SYSTEM "echo", ";", \";", "echo", q/$( This is echoed properly )/;

 # See what exactly is happening.
 SYSTEM \"DEBUG", "echo", ";", \";", "echo", q/$( This is echoed properly )/;
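
The quoting rule (plain strings are escaped, string references pass through untouched) can be sketched as below. This is an assumption-laden illustration: `shell_quote` and `build_command` are hypothetical helpers, and the single-quote escaping shown targets a POSIX shell; SYSTEM's actual quoting may differ.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap a string in single quotes for a POSIX shell.
sub shell_quote {
    my ($s) = @_;
    $s =~ s/'/'\\''/g;    # close quote, escaped quote, reopen quote
    return "'$s'";
}

# Hypothetical helper: quote plain strings, pass string refs verbatim.
sub build_command {
    return join " ", map { ref($_) eq 'SCALAR' ? $$_ : shell_quote($_) } @_;
}

print build_command( "echo", ";", \";", "echo", q/$( not expanded )/ ), "\n";
# 'echo' ';' ; 'echo' '$( not expanded )'
```

Only the unquoted `;` reaches the shell as a command separator; the quoted one is ordinary data.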

### QX

Note: The Dean::Util::safe_pipe() is generally a better choice since that sub checks the exit status of the command.

Works like a combination of SYSTEM and the qx operator. It behaves like SYSTEM in that it is a subroutine which takes a list of string expressions that are quoted before being passed to the shell. Any string expressions passed by reference will not be quoted. This allows, for example, pipes and redirects while still allowing safe escaping of arguments. The return value is the STDOUT of the executed command, just like with the qx operator.

### EXEC

Works like a combination of SYSTEM and exec. It behaves like SYSTEM in that it is a subroutine which takes a list of string expressions that are quoted before being passed to the shell. Any string expressions passed by reference will not be quoted. This allows, for example, pipes and redirects while still allowing safe escaping of file names. This function never returns, just like with the exec function.

### SELECT

Works like perl's select function, but is instead given a string which is opened as a file and then selected. The special string '-' will not change the default output stream. An undefined or empty string will select "/dev/null" instead.

If a string reference $mode is provided as a first argument it will be taken as the file mode (the default is ">").

### EXISTS

 EXISTS $hash_ref, qw| key1 arbitrarily/deep/key |;
 EXISTS $hash_ref, @paths, { sep => $separator };

Safely test for deep key existence. Recursion happens by splitting on $separator ("/" by default), there is no means for escaping. Returns true only if all keys exist. Array refs are allowed if corresponding path components are numeric.

### HAS

 HAS $hash_ref, qw| key1 arbitrarily/deep/key |;
 HAS $hash_ref, @paths, { sep => $separator };

### SPRINTF

 SPRINTF $o, $fmt, $h1, $h2, ...

Format the data in the hashes $h1, $h2, ... into the format string $fmt given in the language specified in the option hash $o.

Example:

 $o = { a => [ s => "artist" ],
        t => [ s => "title", "name" ],
        N => sub { scalar localtime },
      };
 @songs = ( { artist => "Arlo Guthrie", title => "The Motorcycle Song" },
            { artist => "Cypress Hill", name => "Psycobetabuckdown" },
          );

 @formatted_songs = SPRINTF $o, "%20t - %a", @songs;

## :perl6 - Perl 6 functions

### smartmatch

Perl 5.010 pretty much killed the need for this...

 smartmatch( $X, $Y );

smartmatches $X ~~ $Y. Inspired by the Perl6 operator, but a complete deviation since this is not designed to be the deciding form of a switch statement. Primarily tests for things which are annoying to type out.

Returns 1 or '' as long as CODE is not one of the match variables. Returns undef if comparison is not possible.

Matches are commutative unless explicitly presented as otherwise

 str|num ~~   str|num      natural equality test, though see convert_string_to_regexp option
 str|num ~~   Regexp       natural pattern match

 ARRAY   ~~   Regexp       all(ARRAY) =~ Regexp
 HASH    ~~   Regexp       all(keys(HASH)) =~ Regexp

 undef   ~~   ARRAY        undef \in ARRAY
 num|str ~~   ARRAY        str \in @ARRAY

 HASH    ~~   num          keys(%HASH) == num
 ARRAY   ~~   num          @ARRAY == num

 ARRAY   ~~   ARRAY        @ARRAY <<~~>> @ARRAY  # test elements individually
 HASH    ~~   ARRAY        exists(@HASH{@ARRAY})
 HASH    ~~   HASH         have same keys

 CODE    ~~   Any          CODE( Any )
 ARRAY   ~~   CODE         CODE( all(ARRAY) )

 Any     ~~   CODE         reserved
 ARRAY   ~~   undef        reserved
 ARRAY   ~~   str          reserved

convert_string_to_regexp: 'left' | 'right' | Bool

Strings of the form: qr(...), qr/.../, qr(\W)...(\1), ... are automatically upgraded to regular expressions.

### zip

 zip \@x, \@y, ...

The Perl6 zip function (almost). Given a list of arrays, returns a list of the array elements "zipped" together, that is: $x[0], $y[0], ..., $x[1], $y[1], .... The lists need not be the same length; the shorter lists will simply be ignored after they run out.
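
A minimal sketch of this interleaving, reading "ignored after they run out" as "skipped" rather than padded with undef (that reading is an assumption). `my_zip` is a hypothetical name, not Dean::Util's code.

```perl
use strict;
use warnings;

# Hypothetical sketch, not Dean::Util's implementation.
sub my_zip {
    my @arrays = @_;
    my $max = 0;
    for my $arr (@arrays) { $max = @$arr if @$arr > $max }
    my @out;
    for my $i ( 0 .. $max - 1 ) {
        # Only take from arrays that still have an element at index $i.
        push @out, map { $_->[$i] } grep { $i < @$_ } @arrays;
    }
    return @out;
}

my @x = ( 1, 2, 3 );
my @y = qw/a b/;
print join( " ", my_zip( \@x, \@y ) ), "\n";    # 1 a 2 b 3
```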

### uniq

Takes a list (or reference to an array) and discards all but one of successive identical objects (up to stringification) from the list. In scalar context, an array reference is returned.

Note: This is different from the unique function which will remove all duplicates from the list.
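
The difference from a remove-all-duplicates function can be sketched like this. `my_uniq` is a hypothetical stand-in, not Dean::Util's code, and it does not consider undef elements.

```perl
use strict;
use warnings;

# Hypothetical sketch: drop an element only when it stringifies the same
# as the element immediately before it.
sub my_uniq {
    my @in = ( @_ == 1 && ref( $_[0] ) eq 'ARRAY' ) ? @{ $_[0] } : @_;
    my @out;
    for my $item (@in) {
        push @out, $item unless @out && "$out[-1]" eq "$item";
    }
    return wantarray ? @out : \@out;
}

# "a" survives twice because its two runs are not adjacent.
print join( " ", my_uniq(qw/a a b b a c c/) ), "\n";    # a b a c
```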

# TODO

range2list and list2range

convert #..#,#-#,a..z,a-z,2:23,2:5:23 strings to lists and back. split /,/ first.

A more general form "suggested" on PerlMonks (http://www.perlmonks.org/?node_id=427615):

 foo[01:100]bar-[fred,barney,wilma]

Though, shell syntax might be better (see bash(1) /EXPANSION):

 foo{001..100}bar-{fred,barney,wilma{1,2}}

 sub f { sqrt($_[0]) }
 print adaptive( \&f, 0, 1, 0.0005 ), $/;

 sub adaptive {
     my ($f, $a, $b, $eps) = @_;
     my $s1 = simp($f, $a, $b);
     my $s2 = simp2($f, $a, $b);
     my $err = abs($s2-$s1)/15;
     if ($err < $eps) {
         return $s2;
     } else {
         return adaptive($f, $a, ($a+$b)/2, $eps/2) + adaptive($f, ($a+$b)/2, $b, $eps/2);
     }
 }

 sub simp {
     my ($f, $a, $b) = @_;
     return ($b-$a)*($f->($a) + 4*$f->(($a+$b)/2) + $f->($b))/6;
 }

 sub simp2 {
     my ($f, $a, $b) = @_;
     return simp($f, $a, ($a+$b)/2) + simp($f, ($a+$b)/2, $b);
 }

# BUGS

No known bugs. If you find one, please report it via email.

# AUTHOR

 Dean Serenevy
 dean@cs.serenevy.net
 http://dean.serenevy.net

perl(1).