Dean::Util - Utilities created by Dean Serenevy



NAME

Dean::Util - Utilities created by Dean Serenevy


SYNOPSIS

 use Dean::Util qw/map_pair nsign min_max/;
 ...

Then later, to remove dependence on Dean::Util

 perl -MDean::Util -we insert_Dean_Util_functions The/Module.pm


DESCRIPTION

This is a set of utility functions for the Perl programming language that I find myself rewriting frequently. Normally, putting functions into a module introduces a dependency on that module, which can be a hassle in some situations. This is a "smart" module which is capable of replacing the use Dean::Util... line with the code for the requested functions. Thus, machines that have Dean::Util installed can use it as a module, but when requested, a (Dean::Util) dependency-free version of the file may be made.


EXPORTED FUNCTIONS

:utility - Using Dean::Util

list_Dean_Util_functions

This function prints a column-formatted list of the functions included in the Dean::Util package.

check_Dean_Util_functions

This function attempts to verify that the Dean/Util.pm is properly structured. This function is intended to be run only by people who make changes to the Dean/Util.pm file to check that their code is properly formatted for the module to parse.

get_Dean_Util_code

Returns a hash ref with an entry of the following type for each function and variable defined in Dean::Util.

 name => { code    => '...',
           pod     => '...',
           depends => [ 'thing 1', 'thing 2', ... ]
         }

Some additional information may be included in each sub-hash for debugging purposes or internal use.

insert_Dean_Util_functions

Replaces all occurrences of "use Dean::Util ...;" ("..." is everything up to first semi-colon, so don't use qw; ;) with the actual source code of the functions requested from Dean::Util. The original files are saved to a backup file which is just the original file name with a ~ appended. The list of files to modify is either included as a list of arguments or is read from @ARGV.

As in the function get_Dean_Util_function_string, the special symbols INCLUDE_POD and POD_ONLY may be used to indicate that all further inclusions (restricted to each individual "use" block) should include their POD documentation before the code, or exclude the code and only output the POD documentation. Example:

 use Dean::Util qw/max min INCLUDE_POD join_multi map_pair/;
 use Dean::Util qw/is_num is_int/;
 # ... later, possibly even after __END__
 use Dean::Util qw/POD_ONLY is_num is_int/;

Would include code and POD documentation for join_multi and map_pair. The code and POD documentation for is_num and is_int would be inserted separately.

Note: Multiple use Dean::Util inclusions may result in multiple subroutine definitions so don't use the same function twice unless they are in different scopes.

upgrade_Dean_Util_functions

Once insert_Dean_Util_functions has been used to "export" a list of Dean::Util functions, this command will replace Dean::Util function blocks with more recent function versions, thus upgrading the exported script.

remove_Dean_Util_functions

Once insert_Dean_Util_functions has been used to "export" a list of Dean::Util functions, this command can be used to remove them and restore the use Dean::Util line.

get_Dean_Util_function_string

Returns the source code for the functions provided as arguments. If the argument list is empty, the function list is taken from @ARGV.

The special symbols INCLUDE_POD and POD_ONLY may be used to indicate that all further inclusions should include their POD documentation before the code, or exclude the code and only output the POD documentation. Example:

 get_Dean_Util_function_string qw/max min INCLUDE_POD join_multi map_pair/;

Would include the POD documentation for only join_multi and map_pair.

 get_Dean_Util_function_string qw/POD_ONLY format_cols/;

Would return just the POD documentation for format_cols.


EXPORTABLE FUNCTIONS

:numerical - Numerical Functions

$pi

The string, pi, to 30 digits after the decimal.

$e

The string, e, to 30 digits after the decimal.

$sqrt2

The string, sqrt(2), to 30 digits after the decimal.

max

See also: List::Util max

Return the maximum number in a list of values. All arguments must be numeric; use max_dirty for untrusted or mixed data.

min

See also: List::Util min

Return the minimum number in a list of values. All arguments must be numeric; use min_dirty for untrusted or mixed data.

max_dirty

Return the maximum number in a list of values. This version of max should be used for untrusted data since undefined or non-numeric values are silently ignored rather than throwing errors.

min_dirty

Return the minimum number in a list of values. This version of min should be used for untrusted data since undefined or non-numeric values are silently ignored rather than throwing errors.
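
As a rough illustration of the "dirty" behaviour, here is a minimal sketch that skips undefined and non-numeric entries. The name max_dirty_sketch and the use of Scalar::Util::looks_like_number are assumptions made for this example; the module's own test for "numeric" may differ.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical stand-in for max_dirty: skip undef and non-numeric entries.
sub max_dirty_sketch {
    my $max;
    for my $x (@_) {
        next unless defined $x and looks_like_number($x);
        $max = $x if !defined $max or $x > $max;
    }
    return $max;    # undef when no numeric entry was seen
}

my @mixed = (3, undef, "apple", "7.5", -2);
print max_dirty_sketch(@mixed), "\n";
```

No warnings are issued for the undef and "apple" entries because they never reach the numeric comparison.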

fmax

 fmax { block } @list
 fmax \&sub, @list

Return the maximum function value given by evaluating the given code at each element of the list. The code may be either a subroutine reference or a code block. $_ will be set to each list entry and will also be passed in as the first (and only) argument. If the code returns any undefined or non-numeric values, perl will issue warnings.

fmin

 fmin { block } @list
 fmin \&sub, @list

Return the minimum function value given by evaluating the given code at each element of the list. The code may be either a subroutine reference or a code block. $_ will be set to each list entry and will also be passed in as the first (and only) argument. If the code returns any undefined or non-numeric values, perl will issue warnings.

fmax_dirty

 fmax_dirty { block } @list
 fmax_dirty \&sub, @list

Return the maximum function value given by evaluating the given code at each element of the list. The code may be either a subroutine reference or a code block. $_ will be set to each list entry and will also be passed in as the first (and only) argument. If the code returns any undefined or non-numeric values, they will be ignored.

fmin_dirty

 fmin_dirty { block } @list
 fmin_dirty \&sub, @list

Return the minimum function value given by evaluating the given code at each element of the list. The code may be either a subroutine reference or a code block. $_ will be set to each list entry and will also be passed in as the first (and only) argument. If the code returns any undefined or non-numeric values, they will be ignored.

minimizer

 minimizer { block } @list
 minimizer \&sub, @list

Return the item of @list which yields the minimum value when evaluated by the given code. The code may be provided either as a subroutine reference or a code block. $_ will be set to each list entry and will also be passed in as the first (and only) argument. If the code returns any undefined or non-numeric values, perl will issue warnings.

maximizer

 maximizer { block } @list
 maximizer \&sub, @list

Return the item of @list which yields the maximum value when evaluated by the given code. The code may be provided either as a subroutine reference or a code block. $_ will be set to each list entry and will also be passed in as the first (and only) argument. If the code returns any undefined or non-numeric values, perl will issue warnings.
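
The difference between fmin and minimizer (value versus item) can be sketched as follows. The name minimizer_sketch is hypothetical and this is only one plausible way to write such a function, not the module's source.

```perl
use strict;
use warnings;

# Hypothetical sketch of a minimizer: returns the list ITEM (not the value)
# at which the callback is smallest. The (&@) prototype allows block syntax.
sub minimizer_sketch (&@) {
    my $f = shift;
    my ($best, $best_val);
    for my $item (@_) {
        local $_ = $item;          # make the item visible as $_ ...
        my $val = $f->($item);     # ... and as the first argument
        ($best, $best_val) = ($item, $val)
            if !defined $best_val or $val < $best_val;
    }
    return $best;
}

# Which entry is closest to 10?
my $closest = minimizer_sketch { abs($_ - 10) } 3, 8, 14, 11;
print "$closest\n";
```

A maximizer would be the same loop with the comparison reversed.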

minimizer_dirty

 minimizer_dirty { block } @list
 minimizer_dirty \&sub, @list

Return the item of @list which yields the minimum value when evaluated by the code. The code may be either a subroutine reference or a code block. $_ will be set to each list entry and will also be passed in as the first (and only) argument. If the code returns any undefined or non-numeric values, they will be ignored and the corresponding list item will not be considered as a minimizer.

Note however that no filtering is performed on @list so undefined values will be passed to the subroutine as a normal element.

maximizer_dirty

 maximizer_dirty { block } @list
 maximizer_dirty \&sub, @list

Return the item of @list which yields the maximum value when evaluated by the code. The code may be either a subroutine reference or a code block. $_ will be set to each list entry and will also be passed in as the first (and only) argument. If the code returns any undefined or non-numeric values, they will be ignored and the corresponding list item will not be considered as a maximizer.

Note however that no filtering is performed on @list so undefined values will be passed to the subroutine as a normal element.

ceil($)

If the argument is numeric, then returns the smallest integer which is greater than or equal to the given argument. Otherwise this function will spew warnings.

See also: POSIX::ceil [identical functionality]

ceil_dirty($)

If the argument is numeric, then returns the smallest integer which is greater than or equal to the given argument. Otherwise this function will return undef.

floor($)

If the argument is numeric, then returns the largest integer which is less than or equal to the given argument. Otherwise this function spews warnings.

See also: POSIX::floor [identical functionality]

floor_dirty($)

If the argument is numeric, then returns the largest integer which is less than or equal to the given argument. Otherwise this function returns undef.

round

 round( $value )          # round to integer
 round( $value, 2 )       # round to even
 round( $value, "0.01" )  # round to cent

Round $value to multiple of second parameter. Applies traditional algorithm. Namely, round( $value ) == int($value + .5).

Internal comparisons are performed at "string precision" to combat numerical precision problems. Thus, do not expect to be able to round to too many digits.

unbiased_round

 unbiased_round( $value )          # round to integer
 unbiased_round( $value, 2 )       # round to even
 unbiased_round( $value, "0.01" )  # round to cent

An unbiased round removes the upward bias of the traditional rounding algorithm by rounding the midpoint value up sometimes and down other times. The convention is to round midpoint values to even multiples, and round all other values normally.

For example, unbiased_round( 2.5 ) == 2 since 2 is even, however unbiased_round( 1.5 ) == 2 as well since 2 is even.

This system can be extended to the generalized rounding algorithm:

 unbiased_round( 1, 2 ) == 0   # since 0 is an even multiple of 2
 unbiased_round( 3, 2 ) == 4   # since 4 is an even multiple of 2
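
The round-to-even rule can be sketched as below. The name unbiased_round_sketch is hypothetical, and this version uses plain floating-point comparisons rather than the "string precision" comparisons mentioned under round, so it is only reliable for values that are exactly representable.

```perl
use strict;
use warnings;
use POSIX qw(floor);

# Hypothetical sketch: round $value to the nearest multiple of $step,
# sending exact midpoints to the even multiple.
sub unbiased_round_sketch {
    my ($value, $step) = @_;
    $step = 1 unless defined $step;
    my $q     = $value / $step;
    my $lower = floor($q);
    my $diff  = $q - $lower;
    my $n = $diff < 0.5 ? $lower
          : $diff > 0.5 ? $lower + 1
          : ($lower % 2 == 0 ? $lower : $lower + 1);   # midpoint: pick even
    return $n * $step;
}

print unbiased_round_sketch($_), " " for 0.5, 1.5, 2.5, 3.5;
print "\n";
```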

sum

See also: List::Util sum

Returns the sum of all numeric entries in a list. Undefined/non-numeric values cause warnings.

product

See also: List::Util reduce

Returns the product of all numeric entries in a list. Undefined/non-numeric values cause warnings.

average

Returns the average over all entries in a list. Undefined or non-numeric entries will spew warnings.

sum_dirty

Returns the sum of all numeric entries in a list. Undefined/non-numeric values are ignored.

product_dirty

Returns the product of all numeric entries in a list. Undefined/non-numeric values are ignored.

average_dirty

Returns the average over all entries in a list. Undefined or non-numeric entries contribute a 0 to the average.

min_max

Returns a pair ($m, $M) which is the minimum and maximum numbers, respectively, in a list of values without looping over the list twice. Undefined or non-numeric values will cause warnings.

max_min

Returns a pair ($M, $m) which is the maximum and minimum numbers, respectively, in a list of values without looping over the list twice. Undefined or non-numeric values will cause warnings.

min_max_dirty

Returns a pair ($m, $M) which is the minimum and maximum numbers, respectively, in a list of values without looping over the list twice. Undefined or non-numeric values are silently ignored.

max_min_dirty

Returns a pair ($M, $m) which is the maximum and minimum numbers, respectively, in a list of values without looping over the list twice. Undefined or non-numeric values are silently ignored.

sieve_of_eratosthenes

 my $sieve = sieve_of_eratosthenes( $n );
 sieve_of_eratosthenes( $m, $sieve );

Constructs a bit string $sieve using the Sieve of Eratosthenes so that:

 vec($sieve, $n, 1) == 1   iff   $n is prime

If a sieve (or an undefined scalar) is provided as a second argument, it will be appended to.

Note: Since perl's length command deals only in bytes, this subroutine will round $n up to make sure that $sieve is correct to a whole number of bytes. In particular, you are guaranteed to be able to trust $sieve up to $n = 8 * length($sieve) - 1.
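
For readers unfamiliar with vec, the following is a minimal sketch of a bit-string sieve. The name sieve_sketch is hypothetical, and unlike the routine described above it neither extends an existing sieve nor rounds $n up to a whole byte (bits beyond $n are simply left at zero).

```perl
use strict;
use warnings;

# Hypothetical sketch: build a bit string with bit $i set iff $i is prime,
# for 0 <= $i <= $n.
sub sieve_sketch {
    my ($n) = @_;
    my $sieve = '';
    vec($sieve, $_, 1) = 1 for 2 .. $n;          # assume prime until struck
    for (my $p = 2; $p * $p <= $n; $p++) {
        next unless vec($sieve, $p, 1);
        for (my $k = $p * $p; $k <= $n; $k += $p) {
            vec($sieve, $k, 1) = 0;              # strike multiples of $p
        }
    }
    return $sieve;
}

my $sieve  = sieve_sketch(50);
my @primes = grep { vec($sieve, $_, 1) } 0 .. 50;
print "@primes\n";
```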

is_prime

Determine primality. Constructs the Sieve of Eratosthenes to determine primality. The sieve is reused for each call to is_prime so scripts are encouraged to prepare the sieve by calling is_prime on a large number before making multiple calls to is_prime.

 # SLOW: takes 21.89 seconds
 @primes = grep is_prime($_), 1..400000;
 # FAST: takes 1.387 seconds
 @primes = reverse grep is_prime($_), reverse 1..400000;

This function may take shortcuts when it can, so if you want to prepare the sieve, pass the option "force_sieve":

 # SLOW:
 is_prime( 400000 ); # this test shortcuts since 400000 is even
 @primes = grep is_prime($_), 1..400000;
 # FAST:
 is_prime( 400000, force_sieve => 1 );
 @primes = grep is_prime($_), 1..400000;

next_prime

 my $m = next_prime( $n )

Compute the next prime integer larger than $n.

base_hash

Given a base, this function returns a hash which may be used in future calls to the other base functions.

A base is described by:

 integer <= 36 (0-9 a-z)
 array ref     (list of symbols, length == base, index i == i, yes you get to define zero)
 string        (string of symbols, shortcut for [split //, $str])
 hash ref      (the output of a previous call to base_hash, this is silly in this case)

base2base

 base2base( string, base, base )

String may be decimal. The following symbols are tried (in order) to be used as the punctuation between the integer and fraction part of the number:

 . , : ; _ | / \ - + ' ` "

Bases are described by:

 integer <= 36 (0-9 a-z)
 array ref     (list of symbols, length == base, index i == i, yes you get to define zero)
 string        (string of symbols, shortcut for [split //, $str])
 hash ref      (the output of base_hash)

base2integer

 base2integer( string, base )

Convert a string written in the given base to an integer. The string may not contain a fractional part.

Base is described by:

 integer <= 36 (0-9 a-z)
 array ref     (list of symbols, length == base, index i == i, yes you get to define zero)
 string        (string of symbols, shortcut for [split //, $str])
 hash ref      (the output of base_hash or symbol => value pairs)
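
To make the "array ref" form of a base concrete, here is a minimal sketch that reads a string in a base given as a symbol list. The name base2integer_sketch is hypothetical and only the array-ref description is handled; it dies on unknown symbols, which is an assumption about error handling.

```perl
use strict;
use warnings;

# Hypothetical sketch: interpret $string as an integer written in the base
# described by a symbol list (index i is the digit with value i).
sub base2integer_sketch {
    my ($string, $symbols) = @_;
    my %value;
    @value{@$symbols} = 0 .. $#$symbols;      # symbol => digit value
    my $base = @$symbols;
    my $n = 0;
    for my $digit (split //, $string) {
        die "unknown symbol '$digit'\n" unless exists $value{$digit};
        $n = $n * $base + $value{$digit};
    }
    return $n;
}

print base2integer_sketch("ff", [0 .. 9, 'a' .. 'f']), "\n";
```

Because the caller supplies the symbols, any alphabet works, for example ['o', 'i'] for a binary notation where "o" plays the role of zero.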

base2decimal

 base2decimal( string, base )

String may be decimal. The following symbols are tried (in order) to be used as the punctuation between the integer and fraction part of the number:

 . , : ; _ | / \ - + ' ` "

Base is described by:

 integer <= 36 (0-9 a-z)
 array ref     (list of symbols, length == base, index i == i, yes you get to define zero)
 string        (string of symbols, shortcut for [split //, $str])
 hash ref      (the output of base_hash)

decimal2base

 decimal2base( string, base )

String may be decimal. The following symbols are tried (in order) to be used as the punctuation between the integer and fraction part of the number:

 . , : ; _ | / \ - + ' ` "

Base is described by:

 integer <= 36 (0-9 a-z)
 array ref     (list of symbols, length == base, index i == i, yes you get to define zero)
 string        (string of symbols, shortcut for [split //, $str])
 hash ref      (the output of base_hash)

factorial

 factorial( $n )

Returns $n! if $n is a non-negative integer.

pct_change

 pct_change( $orig, $new )

Simply returns the percent change between the two values ($new-$orig)/$orig. Exists solely because I don't like how the formula looks in a line of real code.

:stat_prob - Statistical / Probability

pascals_triangle

 pascals_triangle( Int $n )

Return the nth row of Pascal's triangle (starting at 0).

random_binomial

 random_binomial( Int $n )

Return a random integer from 0 to n (inclusive) following a binomial distribution. This is only useful up to n == 8 * $Config{intsize} - 1.

prob_model_invariants

 prob_model_invariants( \%model, %options )

The model is a hash with keys the outcomes and values the corresponding probabilities. At most one of the probabilities may be undefined in which case it will be computed automatically (as $1 - \sum p_i$) and added to your passed probability model.

roll_dice

Roll n dice (default 1) and return the results. In scalar context, only the sum is returned. In list context, the individual rolls are returned as well as the final sum of the values (the sum is returned in the last position).

randomize

See also: List::Util shuffle

Randomize a list of values. Essentially the Fisher-Yates shuffle code from perlfaq4 ("How do I shuffle an array randomly?"). If the array is passed by reference then it will be altered, otherwise a copy is made. Returns a new list or a reference to a list depending on context.
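
For reference, an in-place Fisher-Yates shuffle can be sketched as follows. The name shuffle_sketch is hypothetical, and the sketch handles only the by-reference case, without the context-dependent return described above.

```perl
use strict;
use warnings;

# Hypothetical sketch of an in-place Fisher-Yates shuffle (after perlfaq4).
sub shuffle_sketch {
    my ($array) = @_;
    for (my $i = $#$array; $i > 0; $i--) {
        my $j = int rand($i + 1);                 # 0 <= $j <= $i
        @$array[$i, $j] = @$array[$j, $i];        # swap
    }
    return $array;
}

my @deck = (1 .. 10);
shuffle_sketch(\@deck);
print "@deck\n";
```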

one_var

 one_var( @data );
 one_var( \@data );
 one_var( \@data, $sorted );

Returns a hash (or hash reference if called in scalar context) of one-variable statistics on the input data. If the $sorted parameter is not defined then the data is assumed to be not sorted and the subroutine will make its own sorted copy of the data. If the $sorted parameter is defined but false, then the subroutine will sort @data in place (@data will be altered). If the $sorted parameter is true then the data will be assumed to be already sorted. The returned hash will have the following keys:

average
mean
x-bar

The average value of the data

sum
sum x

The summation of the data

sum_sq
sum x^2

The sum of the squares of the data

Svar
sample_variance

The sample variance, 1/(n-1) * sum (x_i - average)^2

Sx
sample_standard_deviation

The sample standard deviation, sqrt( Svar )

variance
sigma_sq

The population variance, E( (X - E(X))^2 )

sigma
standard_deviation

The population standard deviation, sqrt( variance )

se
standard_error

The standard error of the mean, for computing confidence intervals

n

The number of measurements in the sample

min

The smallest data element

max

The largest data element

Q1

The first quartile computed using broken "Basic Math Course Method".

Q2
med
median

The sample median

Q3

The third quartile computed using broken "Basic Math Course Method".

char:sum
char:Sigma
char:sigma

The corresponding Unicode characters: "\x{2211}", "\x{03A3}", "\x{03C3}". Be warned that char:sum is a different symbol than char:Sigma and that the terminal that you are writing to will need to understand UTF-8 font encoding.

Note: the list only needs to be sorted to compute the quartiles, min, median, and max values. If you are not interested in these values then you can speed up the computation by providing $sorted with a true value (regardless of whether the data is sorted) and simply ignoring those values in the output.
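
The distinction between the sample and population quantities can be sketched as below. The name variance_sketch is hypothetical, and it is an assumption of this example that the "variance" key treats the data as the entire population (dividing by n rather than n-1).

```perl
use strict;
use warnings;

# Hypothetical sketch: sample vs. population variance of a data list.
# Requires at least two data points (the sample variance divides by n-1).
sub variance_sketch {
    my @x = @_;
    my $n = @x;
    my $mean = 0;
    $mean += $_ / $n for @x;
    my $ss = 0;
    $ss += ($_ - $mean) ** 2 for @x;          # sum of squared deviations
    return (
        mean     => $mean,
        Svar     => $ss / ($n - 1),           # sample variance
        variance => $ss / $n,                 # population variance (assumed)
    );
}

my %stat = variance_sketch(2, 4, 4, 4, 5, 5, 7, 9);
printf "mean=%g  Svar=%.4f  variance=%g\n", @stat{qw(mean Svar variance)};
```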

percentile

 percentile($p, @data)
 percentile($p, \@data)
 percentile($p, \@data, $sorted)
 percentile($p, \@data, %options)

Return the $p-th percentile using the weighted average at X_{(n+1)p} method (http://www.xycoon.com/method_2.htm). That is, the number such that approximately 100 * $p percent of the data values are less than or equal to it. If an array reference is given as well as a third true value, the data will be assumed to be already sorted. The following options are available:

sorted

Boolean value indicating whether the data are sorted already. If not, they will be sorted numerically.

method

One of "midpoint", "floor", "ceil", or "scaled". This controls what to do when a percentile divider is between two entries. The default behavior is "scaled", the returned percentile will be an appropriate linear combination of the neighboring entries. The "midpoint" method always returns the midpoint of the neighboring entries. Finally, the "floor" and "ceil" methods always return the lower or higher neighbor respectively.

The "method" also affects the return value when return => "index" is enabled.

return

Either "value" or "index". Affects whether we return the actual percentile value, or simply its index in the array.

correlation

 my $r = correlation( \@X, \@Y );
 my %I = correlation( \@X, \@Y );
 my $r = correlation( \@X, \@Y, %options );

Pearson product-moment correlation coefficient.

one_var_x
one_var_y

The result hash from one_var()

sd_x
sd_y
mean_x
mean_y

The sample standard deviation and mean of x and y.

permutations

 permutations( $n );
 permutations( @list );  # 1 < @list !!
 permutations( \@list );

Return a list of all permutations of the given input list.

Note: This subroutine is slow and inefficient. If you want to use this for any real purpose then you should consider using Algorithm::Permute or Algorithm::FastPermute from CPAN.

k_arrangements

 k_arrangements( \@list, $k );
 k_arrangements( $n, $k );

Return a list of all arrangements (sub-permutations) of the given input list of length $k. If $n and $k are both integers, then simply the number of $k arrangements is returned.

Note: This subroutine is slow and inefficient. If you want to use this for any real purpose then you should consider looking up an XS module on CPAN.

arrangements

 arrangements( $n );
 arrangements( \@list );
 arrangements( \@list, $k );
 arrangements( $n, $k );
 arrangements( @list );  # @list > 2 !!!

Return a list of all arrangements (sub-permutations) of the given input list (regardless of length). If the list is provided as a reference and an integer $k is provided then the results will be restricted to length $k as in the k_arrangements subroutine.

Note: This subroutine is slow and inefficient. If you want to use this for any real purpose then you should consider looking up an XS module on CPAN.

k_combinations

 k_combinations( \@list, $k );
 k_combinations( $n, $k );

Return a list of all combinations of the given input list of length $k.

Note: This subroutine is slow and inefficient. If you want to use this for any real purpose then you should consider looking up an XS module on CPAN.

combinations

 combinations( $n );
 combinations( \@list );
 combinations( \@list, $k );
 combinations( $n, $k );
 combinations( @list );  # @list > 2 !!!

Return a list of all combinations of the given input list (regardless of length). If the list is provided as a reference and an integer $k is provided then the results will be restricted to length $k as in the k_combinations subroutine.

Note: This subroutine is slow and inefficient. If you want to use this for any real purpose then you should consider looking up an XS module on CPAN.

npdf

 npdf $x
 npdf $x, $mu
 npdf $x, $mu, $sigma

Compute the value of the probability density function at $x assuming a normal distribution with mean $mu and standard deviation $sigma. $mu and $sigma are assumed to be 0 and 1 respectively if they are missing. $sigma must be positive.

ncdf

 ncdf $x
 ncdf $x, $mu
 ncdf $x, $mu, $sigma

Compute the probability P( X <= $x ) assuming a normal distribution with mean $mu and standard deviation $sigma. $mu and $sigma are assumed to be 0 and 1 respectively if they are missing. $sigma must be positive.

:math - Mathematical Functions

hypot

 my $h = hypot($x, $y);

See also: Math::Libm

Euclidean distance function: returns sqrt(x*x+y*y)

dotprod(\@\@)

 my $d = dotprod @x, @y;
 my $d = &dotprod(\@x, [1,2,3]);

Compute the dot product of two vectors

modular_inverse

 $inverse = modular_inverse( $x, $m );

Compute the inverse of $x in the group Z_m. The inverse will be within the set [0..$m-1].

Note: $x must be relatively prime to $m.

gcd

Compute the Greatest Common Divisor of a list of integers using the Euclidean algorithm.

lcm

Compute the Least Common Multiple of a list of integers.

extended_euclidean_algorithm

 ($alpha, $beta, $d) = extended_euclidean_algorithm($a, $b)

For a pair of integers, a and b, perform the extended Euclidean algorithm to compute alpha, beta, and d such that:

 d = alpha * a  +  beta * b

In particular, if d = 1 then alpha = a^-1 mod b.
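
The relationship between the extended Euclidean algorithm and the modular inverse can be sketched as follows. The name egcd_sketch is hypothetical; this is a standard iterative formulation, not necessarily the one used in the module.

```perl
use strict;
use warnings;

# Hypothetical sketch of the extended Euclidean algorithm (iterative form).
# Returns ($alpha, $beta, $d) with $d == $alpha * $x + $beta * $y.
sub egcd_sketch {
    my ($r0, $r1) = @_;
    my ($s0, $s1, $t0, $t1) = (1, 0, 0, 1);
    while ($r1 != 0) {
        my $q = int($r0 / $r1);
        ($r0, $r1) = ($r1, $r0 - $q * $r1);
        ($s0, $s1) = ($s1, $s0 - $q * $s1);
        ($t0, $t1) = ($t1, $t0 - $q * $t1);
    }
    return ($s0, $t0, $r0);
}

my ($alpha, $beta, $d) = egcd_sketch(7, 40);
my $inverse = $alpha % 40;      # reduce into [0..39]
print "$alpha * 7 + $beta * 40 = $d; inverse of 7 mod 40 is $inverse\n";
```

Reducing alpha with Perl's % operator (whose result takes the sign of the right operand) yields a representative in [0..$m-1], matching the range promised by modular_inverse.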

frac

 my ($N, $D) = frac( $dec )

Convert a decimal to a fraction. Returns undef if number is not rationalizable (must have repeating decimals).

ndiff(&;@)

 my $df = ndiff \&f;
 my $df = ndiff \&f, $x;

Perform numerical differentiation using the central difference formula.

 f'(a) \approx ( f(a+h) - f(a-h) ) / (2h)

If M \approx f(a) \approx f'''(c) for all c \in [a-h, a+h], then the total error (truncation plus round-off) is on the order of:

 error = M * (h^2/6 + eps/h)

where eps is the machine epsilon (eps = 2E-16 on 32-bit perl; (1 + 2E-16 != 1), however (1 + (2E-16)/2 == 1) ). Thus, error is minimized when h \approx \sqrt[3]{eps}. We choose h = 2**(-20) = 0.00000095367431640625.

Examples:

 sub f { $_[0]**2 }
 my $df = ndiff \&f;
 printf "%.5f  |  %.5f\n", f($_), $df->($_) for 0..10;
 say "f'(3) = ", ndiff(\&f, 3);
 $df = ndiff { $_ ** 2 };
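
The central difference formula itself is short enough to sketch directly. The name ndiff_sketch is hypothetical; it takes a code reference and returns a closure, using the same default h = 2**(-20) as stated above.

```perl
use strict;
use warnings;

# Hypothetical sketch of a central-difference derivative with h = 2**(-20).
sub ndiff_sketch {
    my ($f, $h) = @_;
    $h = 2**(-20) unless defined $h;
    return sub {
        my ($x) = @_;
        return ($f->($x + $h) - $f->($x - $h)) / (2 * $h);
    };
}

my $dsquare = ndiff_sketch(sub { $_[0] ** 2 });
printf "%.5f\n", $dsquare->(3);

my $dsin = ndiff_sketch(sub { sin $_[0] });
printf "%.5f\n", $dsin->(0);
```

Choosing h as a power of two means that a+h and a-h are often exactly representable, which avoids one source of round-off error.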

Nintegrate

 Nintegrate { block } $a, $b, $n
 Nintegrate \&sub, $a, $b, $n

Integrate a function between two values using a composite Simpson's rule. The last argument $n is optional and specifies the number of intervals to divide the region into. The default is 1000.

The function is assumed to be continuous with continuous derivatives up to order 4. $n should be even, but we adjust it if it is not. The error is given by,

             5
        (b-a)     (4)
 err = --------  f  ( x )
             4
        180 n

for some x in the interval (a,b).
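
A composite Simpson's rule along these lines can be sketched as below. The name simpson_sketch is hypothetical and it accepts only a code reference, not the block syntax shown above.

```perl
use strict;
use warnings;

# Hypothetical sketch of composite Simpson's rule on [$lo, $hi] with $n
# subintervals ($n is bumped to the next even number if it is odd).
sub simpson_sketch {
    my ($f, $lo, $hi, $n) = @_;
    $n = 1000 unless defined $n;
    $n++ if $n % 2;
    my $h   = ($hi - $lo) / $n;
    my $sum = $f->($lo) + $f->($hi);
    for my $i (1 .. $n - 1) {
        # interior points alternate weights 4, 2, 4, ..., 4
        $sum += ($i % 2 ? 4 : 2) * $f->($lo + $i * $h);
    }
    return $sum * $h / 3;
}

printf "%.6f\n", simpson_sketch(sub { $_[0] ** 3 }, 0, 2);
printf "%.6f\n", simpson_sketch(sub { sin $_[0] }, 0, 4 * atan2(1, 1));
```

Since the error term involves the fourth derivative, the rule is exact (up to round-off) for polynomials of degree three or less, which makes cubics a convenient sanity check.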

interpolating_function

 interpolating_function \%function, $message, $nowarn

Returns a perl subroutine which interpolates %function linearly using interpolate. $message is an optional message that will be used if an input value is given which is out of range of the interpolator.

interpolate

 interpolate $x, \%function, \@keys, $message, $nowarn

Perform an interpolation of the provided function at the point $x. The keys of the function need not be evenly spaced, the value is approximated linearly. The last two parameters are optional, @keys is a sorted list of the keys of the function and $message is used in the error message that is printed if $x is out of range of the interpolator.
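
Linear interpolation over a hash of x => y pairs can be sketched as follows. The name interpolate_sketch is hypothetical; it always sorts the keys itself and dies when $x is out of range, which is an assumption about the error behaviour rather than a description of the module.

```perl
use strict;
use warnings;

# Hypothetical sketch: linear interpolation of a function tabulated in a hash
# (x => y). Dies when $x lies outside the tabulated range.
sub interpolate_sketch {
    my ($x, $table) = @_;
    my @keys = sort { $a <=> $b } keys %$table;
    die "x = $x is out of range\n" if $x < $keys[0] or $x > $keys[-1];
    for my $i (1 .. $#keys) {
        my ($x0, $x1) = @keys[$i - 1, $i];
        next if $x > $x1;                        # not in this segment yet
        my ($y0, $y1) = @$table{$x0, $x1};
        return $y0 + ($y1 - $y0) * ($x - $x0) / ($x1 - $x0);
    }
    return $table->{$keys[0]};                   # single-point table
}

my %f = (0 => 0, 1 => 10, 4 => 40, 5 => 20);
print interpolate_sketch(4.5, \%f), "\n";
```

Passing a pre-sorted @keys, as the real interface allows, avoids re-sorting on every call.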

continuous_compounding

 continuous_compounding P => $P, r => $r, t => $t;
 continuous_compounding A => $A, P => $P, r => $r, t => $t, solve_for => $q;

Given any three of "A" (Accumulated balance), "P" (Principal balance), "r" (interest Rate), and "t" (Time to withdrawal), this function will return the fourth. If all four values are provided (presumably one of them will be undefined or contain garbage) then you must provide a "solve_for" key which points to one of "A", "P", "r", or "t". All values are case insensitive.
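
The underlying relation is the standard continuous compounding formula A = P * exp(r * t). A minimal sketch of solving it for the missing quantity follows; the name continuous_sketch is hypothetical, and it ignores the "solve_for" key and the case-insensitivity described above.

```perl
use strict;
use warnings;

# Hypothetical sketch based on A = P * exp(r * t); solves for whichever of
# A, P, r, t is missing from the argument hash.
sub continuous_sketch {
    my %v = @_;
    my ($A, $P, $r, $t) = @v{qw(A P r t)};
    return $P * exp($r * $t)  unless defined $A;
    return $A / exp($r * $t)  unless defined $P;
    return log($A / $P) / $t  unless defined $r;
    return log($A / $P) / $r;                      # t is the unknown
}

printf "%.2f\n", continuous_sketch(P => 1000, r => 0.05, t => 10);
printf "%.4f\n", continuous_sketch(A => 2000, P => 1000, r => 0.05);
```

The second call computes the doubling time at a 5% rate, that is, log(2) / 0.05.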

discrete_compounding

 discrete_compounding P => $P, r => $r, t => $t, n => $n;
 discrete_compounding A => $A, P => $P, r => $r, t => $t, n => $n, solve_for => $q;

Given "n" (Number of compoundings per year) and any three of "A" (Accumulated balance), "P" (Principal balance), "r" (interest Rate), and "t" (Time to withdrawal), this function will return the fourth. If all five values are provided (presumably one of them will be undefined or contain garbage) then you must provide a "solve_for" key which points to one of "A", "P", "r", or "t". All values are case insensitive.

savings_plan

 savings_plan pmt => $pmt, r => $r, t => $t, n => $n;
 savings_plan A => $A, pmt => $pmt, r => $r, t => $t, n => $n, solve_for => $q;

Given "n" (Number of deposits per year), "r" (interest Rate), and any two of "A" (Accumulated balance), "pmt" (Payment amount), and "t" (Time to withdrawal), this function will return the third. If all five values are provided (presumably one of them will be undefined or contain garbage) then you must provide a "solve_for" key which points to one of "A", "pmt", "r", or "t". All values are case insensitive.

loan_payment

 loan_payment pmt => $pmt, r => $r, t => $t, n => $n;
 loan_payment L => $L, pmt => $pmt, r => $r, t => $t, n => $n, solve_for => $q;

Given "n" (Number of payments per year), "r" (interest Rate), and any two of "L" (Loan amount), "pmt" (Payment amount), and "t" (Time to full payback), this function will return the third. If all five values are provided (presumably one of them will be undefined or contain garbage) then you must provide a "solve_for" key which points to one of "L", "pmt", "r", or "t". All values are case insensitive.

union

use Set::Object

 union( $L1, $L2, ... )

Return the list of (string) elements which appear in any of the given arrays. Objects are stringified, and the string values are returned. This may be upgraded to be smarter someday.

intersection

use Set::Object

 intersection( $L1, $L2, ... )

Return the list of (string) elements which appear in all of the given arrays. Objects are stringified, and the string values are returned. This may be upgraded to be smarter someday.

difference

use Set::Object

 difference( $L1, $L2, ... )

Return the list of (string) elements which appear in $L1 but not in any of the subsequent arrays. Objects are stringified, and the string values are returned.

:list - List Utilities

binary_search(&@)

 binary_search { $_ > 4 } @sorted_nums;
 binary_search \&f, @sorted_nums;

Implements a binary search. Second argument must be an array (not a list) and must be sorted. Returns the index of the first element for which the function &f returns true. Returns undef if there is no such element.

Function must return true for all elements larger than desired element. To search for a particular element, the following must be done:

 my $i = binary_search { $_ >= 4 } @sorted_nums;
 $i = undef unless defined $i and $sorted_nums[$i] == 4;
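
A "first index where the predicate becomes true" search can be sketched as below. The name binary_search_sketch is hypothetical, and it takes the array by explicit reference instead of relying on a prototype.

```perl
use strict;
use warnings;

# Hypothetical sketch: index of the first element for which the predicate is
# true, assuming the predicate is false, ..., false, true, ..., true over the
# sorted array. Returns undef if it is never true.
sub binary_search_sketch {
    my ($pred, $array) = @_;
    my ($lo, $hi) = (0, scalar @$array);     # answer lies in [$lo, $hi]
    while ($lo < $hi) {
        my $mid = int(($lo + $hi) / 2);
        local $_ = $array->[$mid];
        if ($pred->($_)) { $hi = $mid }
        else             { $lo = $mid + 1 }
    }
    return $lo < @$array ? $lo : undef;
}

my @sorted = (1, 3, 4, 4, 9, 12);
my $i = binary_search_sketch(sub { $_ >= 4 }, \@sorted);
print defined $i ? "first match at index $i\n" : "no match\n";
```

The monotonicity requirement is what lets each probe discard half of the remaining range.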

text_sort

Natural sort with case folding and Unicode support. Mostly a direct use of Unicode::Collate with automatic binary string decoding (assumes UTF-8) and numerical substring extraction as in natural_sort.

Limitations:

  It doesn't "properly" sort negative numbers, non-fixed decimal values,
  nor integers larger than 10^24 ≈ 2^83.

text_sort_by(&@)

  @sorted = text_sort_by { $_->title } @books;

Natural sort with case folding and Unicode support. Mostly a direct use of Unicode::Collate with automatic binary string decoding (assumes UTF-8) and numerical substring extraction as in natural_sort. Callback is called on each item and should return a string for comparison.

Limitations:

Necessarily, it does not "properly" sort negative numbers or non-fixed decimal values.

It also can not sort integers larger than 10^24 ≈ 2^83.

natural_sort

A "fast, flexible, stable sort" that sorts strings naturally (that is, numerical substrings are compared as numbers).

Code lifted from tye on perlmonks: http://www.perlmonks.org/?node_id=442285

Limitations: http://www.perlmonks.org/?node_id=483466

  It doesn't "properly" sort negative numbers, non-fixed decimal values,
  nor integers larger than 2^32-1.

natural_cmp

A fast, flexible, stable comparator that sorts strings naturally (that is, numerical substrings are compared as numbers).

Code lifted from tye on perlmonks: http://www.perlmonks.org/?node_id=442285

Limitations: http://www.perlmonks.org/?node_id=483466

  It doesn't "properly" sort negative numbers, non-fixed decimal values,
  nor integers larger than 2^32-1.

cartesian

 cartesian \@list1, \@list2, ...
 cartesian $n1, $n2, ...

Form the cartesian product of the elements in the lists. That is, all lists of the form [ $e1, $e2, ... ] where $e1 comes from @list1, and so on. This function returns an array reference in scalar context, and a list in list context.

In the second form, the lists [1..$n1], [1..$n2], ... will be constructed, and the cartesian product of those lists will be computed. Note however, that the two forms can not be combined, you must either provide only arrays or only numbers.
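
The first form can be sketched by extending partial tuples one list at a time. The name cartesian_sketch is hypothetical and the numeric ($n1, $n2, ...) form is not handled.

```perl
use strict;
use warnings;

# Hypothetical sketch: cartesian product of array references, built by
# extending every partial tuple with each element of the next list.
sub cartesian_sketch {
    my @tuples = ([]);
    for my $list (@_) {
        @tuples = map { my $t = $_; map { [@$t, $_] } @$list } @tuples;
    }
    return wantarray ? @tuples : \@tuples;
}

my @pairs = cartesian_sketch([1, 2], ['a', 'b', 'c']);
print join(' ', map { "(@$_)" } @pairs), "\n";
```

The number of tuples is the product of the list lengths, so the result grows quickly with the number of lists.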

transposed

 transposed \@LoL

Transpose the (possibly non-regular) list of lists @LoL. Returns a new list reference containing the objects in @LoL.

flatten

 flatten @LoLoLoL

Will recursively run through each element of the input list and will return all components as a single large list. Lists may be arbitrarily nested and any objects which are not perl ARRAYs will be considered plain elements. The expansion is done depth-first. Returns a reference in scalar context, and the list of elements in list context.

Example:

 @y = flatten [1, 2, 3], [4, 5], [[6, 7], 8, 9];
 say "Hooray!" if "@y" eq "1 2 3 4 5 6 7 8 9";
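The depth-first expansion can be sketched in a few lines. The name flatten_sketch is hypothetical and this is an illustration of the behavior described above, not the module's code.

```perl
use strict;
use warnings;

# Hypothetical sketch: depth-first expansion of nested array references.
# Anything that is not a plain ARRAY reference is kept as an element.
sub flatten_sketch {
    my @flat = map { ref($_) eq 'ARRAY' ? flatten_sketch(@$_) : $_ } @_;
    return wantarray ? @flat : \@flat;
}

my @y = flatten_sketch( [ 1, 2, 3 ], [ 4, 5 ], [ [ 6, 7 ], 8, 9 ] );
print "@y\n";    # 1 2 3 4 5 6 7 8 9
```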

find_index

 find_index \&f, \@array
 find_index { BLOCK } \@array
 find_index { BLOCK } \@array, $start, $stop, $step

May be called with either a function or a block as the first argument. The function will then begin at $start (or zero) and then step by $step (or 1) until we reach $stop (or the end of the array).

$_ will be set to the current array entry which will also be passed to the function as its only argument. Thus you may use either $_ or $_[0] within your function.

$start may be greater than $stop, in which case we will proceed backwards. In all cases the sign of $step will be adjusted if necessary so that we finish in finite time.
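The stepping rules can be sketched as follows. The name find_index_sketch is hypothetical, and two details are assumptions that the text above does not confirm: $stop is treated as inclusive, and an empty return (undef in scalar context) signals that nothing matched.

```perl
use strict;
use warnings;

# Hypothetical sketch of the stepping logic only (no block prototype).
# Assumptions: $stop is inclusive; nothing is returned on failure.
sub find_index_sketch {
    my ( $f, $array, $start, $stop, $step ) = @_;
    $start = 0        unless defined $start;
    $stop  = $#$array unless defined $stop;
    $step  = abs( $step || 1 );
    $step  = -$step if $start > $stop;    # walk backwards when needed
    for ( my $i = $start ; $step > 0 ? $i <= $stop : $i >= $stop ; $i += $step ) {
        local $_ = $array->[$i];          # available as $_ and as $_[0]
        return $i if $f->($_);
    }
    return;
}

my @data = ( 3, 8, 1, 9, 4 );
print find_index_sketch( sub { $_ > 5 }, \@data ), "\n";          # 1
print find_index_sketch( sub { $_ > 5 }, \@data, 4, 0 ), "\n";    # 3
```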

find_index_with_memory

 find_index_with_memory \&f, \@array
 find_index_with_memory { BLOCK } \@array
 find_index_with_memory { BLOCK } \@array, $start, $stop, $step

May be called with either a function or a block as the first argument. The function will then begin at $start (or zero) and then step by $step (or 1) until we reach $stop (or the end of the array).

The function will set the caller's $a to the previous array entry and $b to the current array entry and will also pass the two entries to the function as its only arguments. Thus you may use either $a, $b or $_[0], $_[1] as the previous and current entries respectively.

$start may be greater than $stop, in which case we will proceed backwards. In all cases the sign of $step will be adjusted if necessary so that we finish in finite time.

first

See also: List::Util first

 first \&sub, @list         # if @list is not list of arrays
 first { block }  @list     # if @list is not list of arrays
 first { block } \@list
 first { block } \@list, $start_pos

Return the first item of @list for which the code returns true. Code may be either a subroutine reference or a code block. $_ will be set to each list entry and will also be passed in as the first (and only) argument. You may pass @list by reference (which means that you must pass it by reference if it contains an array reference in its first entry). If you pass @list by reference and provide a third argument, then the third argument will be taken to be the first position that should be checked.

first_pos

See also: List::MoreUtils first_index

 first_pos \&sub, @list
 first_pos { block } @list
 first_pos { block } \@list, $start_pos

Return the index of the first item of @list for which the code returns true. Code may be either a subroutine reference or a code block. $_ will be set to each list entry and will also be passed in as the first (and only) argument. You may pass @list by reference (which means that you must pass it by reference if it contains an array reference in its first entry). If you pass @list by reference and provide a third argument, then the third argument will be taken to be the first position that should be checked. In this case the returned index will still correspond correctly to a position in @list.

bucketize

 my %buckets = bucketize { block } @list;
 my %buckets = bucketize \&tagger, @list;
 my $buckets = bucketize \&tagger, @list;

Partition items into buckets given a generic tagger. Returns hash ref in scalar context. Tagger should accept a single argument (or use $_) and should return a tag indicating the bucket to place the item in. Function is called in list context so that the following works as expected:

 %by_file_type = bucketize { /\.([^\.]+)$/ } @images;

Also note that values are given as bound aliases, so they can also be "cleverly" modified:

 # ("foo-bar", "foo-baz", "bip-bop")
 #  becomes: ( foo => ["bar","baz"], bip => ["bop"] )
 my %buckets = bucketize { s/^([^-]+)-//; $1 } @x;
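The reason the tagger is called in list context is easiest to see in code. Below is a minimal sketch; the name bucketize_sketch is hypothetical, and skipping items whose tag is undefined is an assumption rather than documented behavior.

```perl
use strict;
use warnings;

# Hypothetical sketch: group items by the tag a callback returns.
# Assumption: items for which the tagger returns nothing are skipped.
sub bucketize_sketch {
    my $tagger = shift;
    my %buckets;
    for (@_) {                         # $_ is aliased to each item
        my ($tag) = $tagger->($_);     # list context, so a regex match yields $1
        push @{ $buckets{$tag} }, $_ if defined $tag;
    }
    return wantarray ? %buckets : \%buckets;
}

my %by_ext = bucketize_sketch( sub { /\.([^.]+)$/ }, qw/a.png b.jpg c.png/ );
# %by_ext is ( png => ['a.png', 'c.png'], jpg => ['b.jpg'] )
```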

partition

See also: List::MoreUtils part

 ($true, $false) = partition { block } @list
 ($true, $false) = partition \&test_func, @list

Partitions a list into two lists based on the truth value of a subroutine or block. The return value is two array references, the first of which is the elements of the original list for which the function returned true, and the second are those elements for which the function returned false.
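A minimal sketch of the same idea, using a plain code reference instead of a block prototype (the name partition_sketch is hypothetical):

```perl
use strict;
use warnings;

# Hypothetical sketch: split a list by the truth value of a test.
sub partition_sketch {
    my $test = shift;
    my ( @true, @false );
    for (@_) {
        if   ( $test->($_) ) { push @true,  $_ }
        else                 { push @false, $_ }
    }
    return ( \@true, \@false );
}

my ( $even, $odd ) = partition_sketch( sub { $_ % 2 == 0 }, 1 .. 6 );
print "@$even | @$odd\n";    # 2 4 6 | 1 3 5
```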

even_positions

 @list_2 = even_positions @list_1;
 @list_2 = even_positions \@list_1;

Returns the elements of the list that have even indices. Argument may be list or arrayref, always returns a list of values.

odd_positions

 @list_2 = odd_positions @list_1;
 @list_2 = odd_positions \@list_1;

Returns the elements of the list that have odd indices. Argument may be list or arrayref, always returns a list of values.

suggestion_sort

 suggestion_sort \@list, \@preferred

Returns @list sorted by the order of the objects in @preferred. All elements are matched as strings and elements of @list that are not in @preferred are placed at the end of the resulting list in a way that preserves their original ordering within @list.

Notes: Undefined entries will be ignored. Only the first appearance of an element in the @preferred list will be considered. Repetitions in @list will be reduced to a single occurrence.
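The ordering rule and the notes above can be sketched as follows. The name suggestion_sort_sketch is hypothetical and this is an illustration, not the module's code.

```perl
use strict;
use warnings;

# Hypothetical sketch of the ordering rule described above.
sub suggestion_sort_sketch {
    my ( $list, $preferred ) = @_;
    my %in_list = map { $_ => 1 } grep { defined($_) } @$list;
    my %seen;
    # Preferred elements come first, but only if they occur in @list ...
    my @out = grep { defined($_) && $in_list{$_} && !$seen{$_}++ } @$preferred;
    # ... followed by the remaining elements in their original order.
    push @out, grep { defined($_) && !$seen{$_}++ } @$list;
    return @out;
}

my @sorted = suggestion_sort_sketch( [qw/c a d b a/], [qw/a b z/] );
print "@sorted\n";    # a b c d
```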

unique

See also: List::MoreUtils uniq

 my @u = unique @list;
 my @u = unique \@list;
 my $h = unique @list;
 my $h = unique \@list;

Takes a list (or reference to an array) and returns a list of unique (up to stringification) objects in apparently random order. In scalar context, a histogram (hash with objects as keys, and counts as values) is returned.

Note: List::MoreUtils::uniq preserves the original order of the elements.

lex_sort

 lex_sort @list_of_lists
 lex_sort sub{  }, @list_of_lists

Sort the lists lexicographically element-wise. The sorting subroutine may use the package variables $a and $b or may take two arguments, but need only worry about element-wise comparison.

Example:

 lex_sort( [qw/abc ac a/], [qw/abc ab c d/], [qw/x y z/], [qw/abc ab c/] )
 # gives:
 #  ( [qw/abc ab c/],
 #    [qw/abc ab c d/],
 #    [qw/abc ac a/],
 #    [qw/x y z/]
 #  )

Similarly with numerical data using: sub{ $a <=> $b }
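Element-wise lexicographic comparison can be sketched as below. The name lex_sort_sketch is hypothetical; for simplicity the sketch only passes the two elements as arguments to the comparator and does not set $a and $b for it.

```perl
use strict;
use warnings;

# Hypothetical sketch: compare lists element by element; when one list
# is a proper prefix of the other, the shorter list sorts first.
sub lex_sort_sketch {
    my $cmp = ref( $_[0] ) eq 'CODE' ? shift(@_) : sub { $_[0] cmp $_[1] };
    return sort {
        my ( $x, $y ) = ( $a, $b );
        my $c = 0;
        for my $i ( 0 .. ( @$x < @$y ? $#$x : $#$y ) ) {
            last if $c = $cmp->( $x->[$i], $y->[$i] );
        }
        $c || @$x <=> @$y;
    } @_;
}

my @sorted = lex_sort_sketch( [qw/abc ac a/], [qw/abc ab c d/], [qw/x y z/], [qw/abc ab c/] );
print join( " | ", map { "@$_" } @sorted ), "\n";
# abc ab c | abc ab c d | abc ac a | x y z
```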

:patterns - Tests and Patterns

$_re_int

Pattern which matches an integer expression. Beware, this pattern allows whitespace in the string which perl may not allow when interpreting strings as numbers. You may need to remove all whitespace from strings which match this pattern.

$_re_num

Pattern which matches a floating-point expression. Beware, this pattern allows whitespace in the string which perl may not allow when interpreting strings as numbers. You may need to remove all whitespace from strings which match this pattern.

$_re_exp

Pattern which matches an exponent part (Ex: 2.3 e -10) of a floating-point expression. Beware, this pattern allows whitespace in the string which perl may not allow when interpreting strings as numbers. You may need to remove all whitespace from strings which match this pattern.

$_re_wrd

Pattern which matches safe "word-like" data. This pattern does not match whitespace and most punctuation but does allow hyphens "-" and underscores.

is_int

Returns a true value if the argument looks like an integer expression. If no argument is provided, $_ is examined. Beware, this subroutine allows whitespace in the string which perl may not allow when interpreting strings as numbers. You may need to remove all whitespace from strings for which this subroutine returns true.

is_num

Returns a true value if the argument looks like a floating-point (or integer) expression. If no argument is provided, $_ is examined. Beware, this subroutine allows whitespace in the string which perl may not allow when interpreting strings as numbers. You may need to remove all whitespace from strings for which this subroutine returns true.

is_float

Returns a true value if the argument looks like a floating-point (or integer) expression. If no argument is provided, $_ is examined. Beware, this subroutine allows whitespace in the string which perl may not allow when interpreting strings as numbers. You may need to remove all whitespace from strings for which this subroutine returns true.

is_word

Returns a true value if the argument looks like a word. If no argument is provided, $_ is examined. Words do not have spaces and do not typically have punctuation, though hyphens "-" and underscores are allowed.

$_re_image_ext

Pattern which matches image-type file name extensions. The extensions matched (case insensitive) are:

BMP CMYK CMYKA DCM DCX DIB DPS DPX EPI EPS EPS2 EPS3 EPSF EPSI EPT FAX FITS FPX G3 GIF GIF87 GRAY ICB ICM ICO ICON IPTC JBG JBIG JP2 JPC JPEG JPG MAP MIFF MNG MONO MPC MTV MVG OTB P7 PAL PALM PBM PCD PCDS PCL PCT PCX PDB PGM PICON PICT PIX PLASMA PNG PNM PPM PSD PTIF RAS RGB RGBA RLA RLE ROSE SGI SUN SVG TGA TIF TIFF UYVY VDA VICAR VID VIFF VST WBMP X XBM XC XCF XPM XV XWD YUV

is_image_file

Returns a true value if the argument looks like an image file. If no argument is provided, $_ is examined. The extensions matched (case insensitive) are:

BMP CMYK CMYKA DCM DCX DIB DPS DPX EPI EPS EPS2 EPS3 EPSF EPSI EPT FAX FITS FPX G3 GIF GIF87 GRAY ICB ICM ICO ICON IPTC JBG JBIG JP2 JPC JPEG JPG MAP MIFF MNG MONO MPC MTV MVG OTB P7 PAL PALM PBM PCD PCDS PCL PCT PCX PDB PGM PICON PICT PIX PLASMA PNG PNM PPM PSD PTIF RAS RGB RGBA RLA RLE ROSE SGI SUN SVG TGA TIF TIFF UYVY VDA VICAR VID VIFF VST WBMP X XBM XC XCF XPM XV XWD YUV

readonly

Returns true if scalar argument is readonly. (Taken from Scalar::Util.)

like_array

Returns true if the object can behave like an array. (This is just a nicer way to call UNIVERSAL::isa)

like_hash

Returns true if the object can behave like a hash. (This is just a nicer way to call UNIVERSAL::isa)

like_scalar

Returns true if the object can behave like a scalar. (This is just a nicer way to call UNIVERSAL::isa)

:parse - General Interpreters / Parsers

parse_debian_control_format

Parses text in Debian Control file format (http://www.debian.org/doc/debian-policy/ch-controlfields.html#s-controlsyntax). Returns an arrayref of records (one for each paragraph).

parse_user_agent

WARNING: This function is out of date, naïve, and generally broken. See HTTP::BrowserDetect for a more up-to-date solution. I intend to eventually send some patches or provide a wrapper (HTTP::BrowserDetect::Practical(?)) to that module (for instance, it is my belief that agent strings like "msie" should be reported as a bot).

 my $hashref = parse_user_agent( $string );
 my %hash    = parse_user_agent( $string );

Given a user-agent string returns a hash containing the following fields. Fields which can not be determined are left undefined.

generic_os

Returns the generic operating system type: Windows, Mac, OS2, Linux, UNIX

os

Returns the specific operating system type: Windows Vista, Windows Server 2003, Windows XP, Windows 2000, Debian, ...

type

One of: browser, textbrowser, bot, downloader, mobile

Note: For this field, we try to make our best guess at which class the agent string fits into.

program

Quasi-canonicalized program name: Internet Explorer, Netscape, Mozilla, Firefox, wget, ...

version

Our best guess at the program version

engine

The Browser's rendering engine: Gecko, KHTML, MSHTML, Presto (opera), WebCore (apple), custom (other custom engines)

engine-version

The version of the rendering engine

user-agent

The unmodified user-agent string

obsolete

If true, the agent appears to be an obsolete web browser

str2hash

Parse a string into a hash using Text::Balanced::extract_delimited. This function recognizes perl 5 style hashes as well as the basic perl 6 adverbial form. Items missing a value will set the corresponding hash value to true.

Example:

 str2hash 'foo, bar => "Hmmm, a comma", :baz<23>, :!bip, quxx => Spaces are fine'

Parses to:

 { foo => 1,
   bar => 'Hmmm, a comma',
   baz => 23,
   bip => 0,
   quxx => 'Spaces are fine',
 }

Unfortunately, the adverbial form will behave strangely with embedded commas:

 str2hash ':baz<Well, how odd>'

becomes

 { ':baz<Well' => 1,
   'how odd>'  => 1,
 }

unformat

WARNING: still quite experimental!

 unformat $fmt, @strings
 unformat \%options, @strings

Attempts to reverse the actions of sprintf or other formatted output (for instance date formats or apache logs). The return value is a list of reports (see below) unless there was only a single input string to parse, in which case unformat may be safely called in scalar context.

format

The format string

as

Specify how to return the findings. By default just a list of matched components is returned; however, we can also return the following reports:

hash

A hash mapping conversions (or their corresponding names, if given) to their corresponding strings. BEWARE KEY COLLISION

  { ~conv, str, ~conv, str, ... }

list

The default, the return values are each an array of strings that could have been used to generate one of the input strings.

  [ str, str, ... ]

list_list

Each return value is an array of two arrays the first of which is the list of strings returned by the "list" option. The second is the conversion instructions giving each corresponding string.

  [ [ str, str, ... ],  [ conv, conv, ... ] ]

Note, in this case, each list of conversions is an array reference pointing to the same array, so altering one will alter them all.

pairs

Each return value is a flat array of pairs:

  [ conv, str, conv, str, ... ]

regex

Return a regular expression that will match the given pattern. In scalar context just the regular expression is returned. In list context the conversions will be returned also.

  ( regex, conv, conv, ... )

tuples

Each return value is an array of arrays each with two elements. First the conversion instruction and second the string that it matched.

  [ [conv, str], [conv, str], ... ]

In all cases except for the hash, the conversion instructions are the precise ones given in the format string, including any formatting options. For the hash, however, the conversions are the simplified two-character labels (E.g. "%s" instead of "% 35s").

Additionally, the escape '%%' is treated as a string literal '%' and will not appear in any of the report types. A "formatted percent" (for instance "%-05%") will pass through the conversions and will appear in the reports if you define a special conversion for it (since we define no standard conversion for this case).

conversion_aliases

A hash of aliases between conversion types. Use this to map your custom conversion (for instance from the date formatting commands) to standard perl conversions. Conversions of the form ( a => "s" ) will preserve formatting options while aliases that start with '%' ( Y => "%04d" ) will use the formatting options "04" rather than any options that may have appeared before the "Y". (Which would presumably cause "0035" to parse to 35.) Conversion aliases are searched before conversions or special conversions. One can also add aliases that include the conversion options to override other behavior ( '02Y' => '%02d', Y => 's' ).

special_conversions

A hash of conversions as in the conversions option but these conversions will be added to the list of standard conversions and will be consulted first should a standard conversion type appear in this listing.

conversions

A hash of conversions ( type => action ). Each "type" is simply the conversion type (E.g. the "s" in "%- 10s") and each action is a pattern that CAPTURES (preferably non-greedily) the conversion type (for instance (s => '(.*?)')). The action could also be a subroutine which accepts two arguments. First the formatting options and second the conversion type. For instance, a sub action for the "f" conversion type might convert its arguments (".1", "f") into the pattern '(\d+\.\d{1})'.

Be sure that all of your conversions produce a pattern that captures exactly one substring.

Specifying this option replaces the built-in conversions which attempt to reverse the standard perl conversions listed in the sprintf documentation.

conversion_map

If defined and a hash, then the conversions in the above reports will be transformed by this hash. Conversions will first be searched for in their full form (including formatting options) both with and without their leading '%', then searched for under only the conversion type (both with and without the '%'). Anything not appearing in the conversion map will be treated normally as described above.

conversion_pattern

 Default: '(%([^a-zA-Z%]*)([%a-zA-Z]))'

Should capture three strings. The entire conversion pattern, any formatting options that may be present, and the conversion type. The default pattern captures single character conversions as well as the '%' escape ("%%"). See also the "Limitations" below.

Limitations: format conversions are assumed to be one character long. That is, conversions like "%ld" will be interpreted as "%l". This can be fixed by altering the conversion_pattern but I don't have the need to be careful about it. If you code up a more careful parser and are willing to share, feel free to send it and I will add it in.

Also, no locale information is considered. sprintf considers the "LC_NUMERIC" value to affect how numbers are formatted. We do not make such considerations here.
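The core idea behind unformat (turn the format string into a regular expression with exactly one capture per conversion) can be shown with a much-reduced sketch. Everything here is hypothetical: the name unformat_sketch, the three-entry %conversion table, and the patterns in it are illustrations only, and none of the options described above are supported.

```perl
use strict;
use warnings;

# Hypothetical, much-reduced sketch. Each conversion type maps to a
# pattern that captures exactly one substring.
my %conversion = (
    s => '(.*?)',
    d => '\s*([-+]?\d+)',
    f => '\s*([-+]?\d*\.?\d+)',
);

sub unformat_sketch {
    my ( $fmt, $string ) = @_;
    my $re = '';
    # The capturing group makes split return the conversions as well.
    for my $part ( split /(%[^a-zA-Z%]*[a-zA-Z%])/, $fmt ) {
        if ( $part =~ /\A%[^a-zA-Z%]*([a-zA-Z%])\z/ ) {
            my $type = $1;
            if ( $type eq '%' ) { $re .= '%' }    # "%%" is a literal percent
            else {
                exists $conversion{$type} or die "no conversion for '$part'\n";
                $re .= $conversion{$type};
            }
        }
        else { $re .= quotemeta $part }           # literal text
    }
    my @found = $string =~ /\A$re\z/ or return;
    return @found;
}

my @parts = unformat_sketch( "%s - %d items (%.1f%%)", "apples - 12 items (37.5%)" );
print join( ", ", @parts ), "\n";    # apples, 12, 37.5
```

As with the default conversion_pattern, the sketch assumes single-character conversion types.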

:canonicalize - Canonicalization

decode_english

 $text = decode_english($text);

Ensures that text is a decoded string. First attempts to interpret as UTF-8 byte string, then as Windows-1252 byte string (which should accept everything).

For English text, this should usually do what you want.

decode_first

 $text = decode_first($text, @encodings);

Ensures that text is a decoded string. Attempts to decode using each of the given encodings (in order) until one succeeds. Dies if none of the encodings succeed.
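A minimal sketch of the "try each encoding in order" loop, built on the core Encode module. The name decode_first_sketch is hypothetical, and the test used to decide whether the text is already decoded (utf8::is_utf8) is an assumption, not necessarily what the module does.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical sketch, not the module's code. Assumption: a string that
# already carries perl's internal UTF-8 flag is treated as decoded.
sub decode_first_sketch {
    my ( $text, @encodings ) = @_;
    return $text if utf8::is_utf8($text);
    for my $encoding (@encodings) {
        my $bytes   = $text;    # decode() with a CHECK value may modify its input
        my $decoded = eval { Encode::decode( $encoding, $bytes, Encode::FB_CROAK() ) };
        return $decoded if defined $decoded;
    }
    die "Could not decode text using any of: @encodings\n";
}

# "\xE9" on its own is not valid UTF-8, so the second encoding is used.
my $text = decode_first_sketch( "caf\xE9", 'UTF-8', 'cp1252' );
```

With this sketch, decode_english would amount to calling decode_first_sketch( $text, 'UTF-8', 'cp1252' ).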

approx_date

Format date into one of the following depending on how far away the date is:

 tomorrow
 today
 yesterday
 Fri, 30 Sep 2011
 10 Sep 2011
 Mar 2011
 2009

nice_date

My preferred date format: '%a, %e %b %Y'. Also collapses spaces. Can override by setting $_Util::nice_date::format, but why would you want to?

nice_time

My preferred time format: '%l:%M%P'. Also collapses spaces. Can override by setting $_Util::nice_time::format, but why would you want to?

nice_datetime

My preferred datetime format: '%a, %e %b %Y %l:%M%P'. Also collapses spaces. Can override by setting $_Util::nice_datetime::format, but why would you want to?

commify

 my $val = commify(1234342.32);
 my $val = commify("%.2f", 1234342.3234234);

Insert commas into number. If passed two parameters, the first will be taken as a sprintf format string which will be applied to the value before commifying.
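A minimal sketch using the well-known substitution loop from the perl FAQ. The name commify_sketch is hypothetical and the module's own code may differ.

```perl
use strict;
use warnings;

# Hypothetical sketch: optionally format, then insert commas from the
# right of the integer part, three digits at a time.
sub commify_sketch {
    my $num = @_ > 1 ? sprintf( $_[0], $_[1] ) : $_[0];
    1 while $num =~ s/^([-+]?\d+)(\d{3})/$1,$2/;
    return $num;
}

print commify_sketch(1234342.32), "\n";                   # 1,234,342.32
print commify_sketch( "%.2f", 1234342.3234234 ), "\n";    # 1,234,342.32
```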

rtf2txt

 rtf2txt( file => $filename_or_handle )
 rtf2txt( string => $rtf_text )
 rtf2txt( $existing_file )
 rtf2txt( $rtf_text )

nicef

 nicef( $num, $digits )

Nicely formats sprintf("%.${digits}f", $num) by removing trailing 0's and unnecessary decimals. $digits defaults to 2.
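A minimal sketch of the described clean-up (the name nicef_sketch is hypothetical):

```perl
use strict;
use warnings;

# Hypothetical sketch: format to a fixed number of places, then strip
# trailing zeros and a dangling decimal point.
sub nicef_sketch {
    my ( $num, $digits ) = @_;
    $digits = 2 unless defined $digits;
    my $s = sprintf "%.${digits}f", $num;
    if ( $s =~ /\./ ) {
        $s =~ s/0+$//;
        $s =~ s/\.$//;
    }
    return $s;
}

print nicef_sketch(3.10), "\n";    # 3.1
print nicef_sketch(2),    "\n";    # 2
```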

length2pt

Given a string like "4in" or "2ft - 7in", return the value as a number of points (72 points per inch). undef is returned if we can't parse the string.

Recognized units:

 pt
 in, ft, mi
 km, m, cm, mm, nm

uri_rel2abs

 my $url = uri_rel2abs( $path, $base )

Converts a path into an absolute path based at the given base unless the path is already absolute. Any file part of the base is ignored.

This subroutine should be a proper rfc3986 uri parser as it simply calls URI->new_abs. However, proper parsing pays a penalty in execution time. Compare the benchmarks between uri_rel2abs and uri_rel2abs_fast:

        Rate   URI  FAST
 URI   208/s    --  -93%
 FAST 3012/s 1350%    --

uri_rel2abs_fast

 my $url = uri_rel2abs_fast( $path, $base )

Converts a path into an absolute path based at the given base unless the path is already absolute. Any file part of the base is ignored.

This subroutine is not and will likely never be a reasonable implementation of a proper rfc3986 uri parser. At the moment, however, it appears to be "good enough" for typical web address (http, ftp, mms, ...) handling.

The uri_rel2abs function uses the URI module to properly produce an absolute uri, however at a significant speed cost.

        Rate   URI  FAST
 URI   208/s    --  -93%
 FAST 3012/s 1350%    --

glob2regexp

Constructs a regular expression pattern (string) that matches the same patterns as the given glob. The pattern matches a whole string and is anchored using ^ and $ unless the glob ends with * in which case the trailing .*$ will be removed. Keep this in mind if you wish to capture the pattern matched by the glob.

Current capabilities:

Globby chars

* match many chars; ? match one char

Escaping of globby chars

\** matches '\*Hello', \\\** matches "\\*Hello"

Grouping constructs

[abc] match a character, [^abc] don't match chars, {foo,bar} match options

Current restrictions:

The globby chars '*' and '?' may not appear within grouping constructs ('[]' and '{}').
Can't match grouping chars in groups: '[ab\]]' does not work.
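The translation and the anchoring rule can be sketched for the simplest cases. The name glob2regexp_sketch is hypothetical; the sketch handles only '*', '?' and backslash escapes, and leaves out the grouping constructs ('[]' and '{}') entirely.

```perl
use strict;
use warnings;

# Hypothetical sketch covering only '*', '?' and backslash escapes.
sub glob2regexp_sketch {
    my ($glob) = @_;
    my @tokens = $glob =~ /(\\.|\\|\*|\?|[^\\*?]+)/gs;
    # A trailing '*' means the pattern is left unanchored at the end.
    my $open_ended = @tokens && $tokens[-1] eq '*';
    pop @tokens if $open_ended;
    my $re = '^';
    for my $tok (@tokens) {
        if    ( $tok eq '*' )         { $re .= '.*' }
        elsif ( $tok eq '?' )         { $re .= '.' }
        elsif ( $tok =~ /^\\(.)$/s )  { $re .= quotemeta $1 }    # escaped char
        else                          { $re .= quotemeta $tok }  # literal text
    }
    $re .= '$' unless $open_ended;
    return $re;
}

print glob2regexp_sketch('*.p?'),  "\n";    # ^.*\.p.$
print glob2regexp_sketch('img_*'), "\n";    # ^img_
```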

str($)

Returns string form of argument (forces string context) if it is defined, otherwise returns the empty string.

replace_windows_characters

Replaces unsightly Extended Windows characters with reasonable ASCII equivalents.

 See: http://www.cs.tut.fi/~jkorpela/www/windows-chars.html
 See: http://search.cpan.org/~barbie/Text-Demoroniser
 (and probably a million other places)

strip_space

Remove all space from the provided argument. If the argument is undefined, return the empty string.

sign($)

Returns "+" or "-" depending on the sign of the argument.

nsign($)

Returns "" or "-" depending on the sign of the argument.

canonicalize_newlines

Replace CRLF, CR, LF with the Perl magic \n. Arguments are modified in-place. If no arguments are provided then $_ is altered instead. Any undefined arguments are ignored (though canonicalize_newlines(undef) will not alter $_).

canonicalize_newlines_copy

Replace CRLF, CR, LF with the Perl magic \n. Arguments are copied before canonicalization. If no arguments are provided then $_ is used instead. Any undefined arguments result in undefined output values.

canonicalize_timeword

Transform reasonable (case-insensitive) abbreviations (or plural forms) of "second", "minute", "hour", "day", "week", "month", "year" into one of these canonical forms. Whitespace and numerical values are allowed at the beginning of the string and will be ignored (and not included in the return value).

NOTE: minutes are preferred over months, thus "m" will return "minute" rather than "month".

qbash($)

Returns a string quoted for bash-like shells. The string must contain only printable characters or whitespace, otherwise the subroutine will die. The return value is an untainted string wrapped in single quotes ' that is ready (and safe) to pass to a shell.

A note on encoding:

If a string would be considered otherwise unquotable, an attempt will be made to interpret it as encoded UTF-8. If this is successful, then the string will be re-checked and, if acceptable, escaped and then re-encoded. If your expressions are in some other encoding, you will need to decode them yourself (and probably re-encode them before use).
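The quoting itself is the standard single-quote trick: close the quote, emit an escaped quote, and reopen. A minimal sketch follows; the name qbash_sketch is hypothetical, and the sketch does not untaint its result or attempt the UTF-8 retry described above.

```perl
use strict;
use warnings;

# Hypothetical sketch: single-quote a string for a bash-like shell.
sub qbash_sketch {
    my ($s) = @_;
    die "unprintable characters in string\n" if $s =~ /[^[:print:]\s]/;
    $s =~ s/'/'\\''/g;    # ' becomes '\''
    return "'$s'";
}

print qbash_sketch("it's here"), "\n";    # 'it'\''s here'
```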

stringify

 stringify( $thing, %options )

Stringifies Perl objects (SCALAR, HASH, or ARRAY based). Stringifies only a single object at a time, and accepts the options below. Note: CODE, GLOB, LVALUE, and Regexp references are not supported.

stringify_underlying_object

By default, overloaded stringification will be respected. Set this option to true to stringify the underlying object rather than use its overload function.

list_type

List which describes how lists are translated.

 DEFAULT: [ "[", ",", "]" ]

hash_type

List which describes how hashes are translated.

 DEFAULT: [ "{", "=>", ",", "}" ]

simple_range2list

 simple_range2list @ranges

Expand "#,#..#,#-#,a..z,a-z,2:23,2:5:23,a:5:zz" strings to lists. Beginning and ending blocks may be anything matching [\w\.]+, though I'm not sure how well underscores will behave. Commas may separate multiple range chunks.

A plain value v (numerical or non-numerical) will produce the range 1..v or 'a'..v.

If no step size is given, the standard perl .. is used to expand the range.

Ranges with step sizes are incremented by the step size (may only be decimal valued if both start and end values are numerical) until the value exceeds the right hand value.

For integers, see also Set::IntSpan::Fast:

 $set->add_from_string(
   { sep => qr/(?:\s*,\s*|\s+)/, range => qr/(?:\.\.|\-|\:)/ },
   $string
 );

canonicalize_filename

 canonicalize_filename $f;
 $new = canonicalize_filename $f;
 canonicalize_filename $f, %options;

Removes anything too exotic from the file name $f. In void context, $f is modified, otherwise, $f is left unaltered and the modified file name is returned. In all cases the canonicalized name will be untainted. The following options will affect the behavior of this subroutine. The default values are shown:

replacement => ""

If a string value, invalid characters will be replaced with this value. If a hash reference then characters will be replaced by their corresponding values. Any values not present in the replacement hash will be replaced with the value in the 'DEFAULT' key (if present) or the empty string.

allow => 'print'

Must be one of 'print', 'basic', 'ascii', or a pattern matching A SINGLE legal character. The 'print' class will allow just about anything through that is not a control character including unicode characters and punctuation if your perl supports that. The 'basic' class should only allow characters that do not require escaping or quoting in a Linux shell (currently allows: \w-+.~%). The 'ascii' class permits regular printable windows and MacOS safe ascii (not unicode).

allow_subdirs => 1

If true, subdirectory separators will be allowed (uses File::Spec to determine volume and directory separators for your system).

squash_duplicates => 'dwim'

If false, each invalid character will be replaced separately. If the value is 'like', then repeated illegal values are replaced by only a single replacement value. If the value is any true value other than 'dwim', then consecutive illegal values (even if they do not match) will be replaced with the replacement value for the first illegal character in the substring. Finally, if the value is 'dwim', then a replacement hash will cause the "like" behavior and a replacement string will result in "true" behavior.

Example:

 %replace = ( replacement => { ':' => "-", " " => "+" } );
 # 'dwim' default using replacement hash: gives "foo-+bar"
 canonicalize_filename( "foo: bar", allow => 'basic', %replace );
 # 'dwim' default using replacement string: gives "foo-bar"
 canonicalize_filename( "foo: bar", allow => 'basic', replacement => "-" );

trim

Trim leading/trailing whitespace. Trims $_ if no arguments provided. In void context, the arguments are altered, otherwise they are not changed and the trimmed values are returned.

:time - Time Management

now

If the floating option is passed, a DateTime object will be created with no time zone information. Otherwise, creates a DateTime object in the local time zone.

Keep in mind, time is difficult. If wall time in Eastern time zone (-0500) is "3:11 pm" and time() == 1298664681, then:

 |------------------+------------+----------------------------+-----------------------------|
 | Function         | $dt->epoch | RFC822                     | $dt->set_time_zone("+0300") |
 |------------------+------------+----------------------------+-----------------------------|
 | now()            | 1298664681 | 25 Feb 2011 15:11:21 -0500 | 25 Feb 2011 23:11:21 +0300  |
 | now(floating=>1) | 1298646681 | 25 Feb 2011 15:11:21 +0000 | 25 Feb 2011 15:11:21 +0300  |
 | DateTime->now    | 1298664681 | 25 Feb 2011 20:11:21 +0000 | 25 Feb 2011 23:11:21 +0300  |
 |------------------+------------+----------------------------+-----------------------------|

Think carefully about what exactly you want.

ymd

Behaves like localtime in scalar context, but returns the date as "YYYY-MM-DD". Returns the components of that string in list context.

ymd_hms

Behaves like localtime in scalar context, but returns the date as "YYYY-MM-DD HH:MM:SS". Returns the components of that string in list context. Hours are presented in 24 hour format.

seconds2human

 seconds2human( seconds, start-unit, end-unit )

Convert an arbitrary number of seconds to a "nice" human-readable form. The second and third arguments are optional and specify the first and last time units presented (note that specifying a start unit rounds the precision of your result to the given unit). The resulting data are separated by the value of $". Units available are: seconds, minutes, hours, days, months, and years. If the input seconds include a decimal portion, then the seconds value will be rounded to three places using the format "%.3f".

Example:

 seconds2human 99999999, 'd', 'mos.'   # gives: "38 months 17 days"
 local $" = ', ';
 seconds2human 99999999, 'm', 'hour'   # gives: "27777 hours, 46 minutes"

seconds2hms

 seconds2hms $sec
 seconds2hms $sec, $sep

Convert an arbitrary number of seconds to a "hh:mm:ss" string. The "hh" portion of the string will always be at least two digits long (but may be more if more than 99 hours are represented by the given number of seconds).
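A minimal sketch of the conversion, assuming a non-negative whole number of seconds and that $sep is the separator placed between the fields (the name seconds2hms_sketch is hypothetical):

```perl
use strict;
use warnings;

# Hypothetical sketch. Assumptions: $sec is a non-negative integer and
# $sep (default ":") separates the three fields.
sub seconds2hms_sketch {
    my ( $sec, $sep ) = @_;
    $sep = ":" unless defined $sep;
    my $h = int( $sec / 3600 );
    my $m = int( ( $sec % 3600 ) / 60 );
    my $s = $sec % 60;
    return sprintf "%02d%s%02d%s%02d", $h, $sep, $m, $sep, $s;
}

print seconds2hms_sketch(3725), "\n";           # 01:02:05
print seconds2hms_sketch( 360000, "-" ), "\n";  # 100-00-00
```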

seconds2time

 seconds2time $sec
 seconds2time $sec, $pad
 seconds2time $sec, %options

Convert a number of seconds (from 0 to 86400) to a "h:mm AM/PM" string. If a second $pad parameter is given, that symbol will be used to force the hour portion to be precisely 2 characters wide (typical values are 0 and " "). You may also fully specify "pad", "AM", "PM", and "sep" (separator, default ":") options. The AM and PM strings should include a leading space if you want it.

human2seconds

Converts a human-written string of a timespan expressed in various abbreviations of seconds, minutes, hours, days, weeks, months, and years into an integer representing the same time span in seconds.

Subroutine dies if it is incapable of parsing the input string.

Examples:

 human2seconds "3 dys. 2hr 15m"   # 267300
 human2seconds "3q 2wk"           # dies: doesn't recognize 3q

%as_month

A hash containing mappings between various months and abbreviations to their full month names (all keys are lowercase):

  month => Month
  mon   => Month
  mon.  => Month
  ##    => Month
  #     => Month

Also includes 4 letter keys for September.

%as_month_number

A hash containing mappings between various months and abbreviations to their two digit month numbers (all keys are lowercase):

  month => ##
  mon   => ##
  mon.  => ##
  #     => ##

Also includes 4 letter keys for September.

:file_comp - File related computations

size_sum

Given a list of sizes (possibly negative) converts each entry to its corresponding number of bytes, sums the values and then converts the result back to a human readable size. Prefixes are computed base 2 (K = 1024, M = 1048576, ...).

Example:

 print size_sum qw/1.5MB -650kB -1253kB/;

size_sum_SI

DEPRECATED: size_sum now uses MB and MiB

Given a list of sizes (possibly negative) converts each entry to its corresponding number of bytes, sums the values and then converts the result back to a human readable size. Prefixes are treated as standard SI prefixes (K = 1000, M = 1000000, ...).

Example:

 print size_sum_SI qw/1.5MB -650kB -1253kB/;

size2bytes

Given a string like "4MB" or "3TiB - 400G", return the value as a number of bytes. undef is returned if we can't parse the string. Prefixes are computed base 2 (Ki = 1024, Mi = 1048576, ...) or using standard SI prefixes (K = 1000, M = 1000000).
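The distinction between the binary (Ki, Mi, ...) and SI (K, M, ...) prefixes can be sketched as below. The name size2bytes_sketch is hypothetical, and the accepted grammar (signed terms, prefixes K through T, an optional trailing B) is an assumption; the real parser may accept more.

```perl
use strict;
use warnings;

# Hypothetical sketch of the parsing idea, not the module's code.
my %power = ( K => 1, M => 2, G => 3, T => 4 );

sub size2bytes_sketch {
    my ($str) = @_;
    ( my $s = $str ) =~ s/\s+//g;    # "3TiB - 400G" becomes "3TiB-400G"
    return unless length $s;
    my $total = 0;
    while ( length $s ) {
        # One term: optional sign, number, optional prefix (K, Ki, ...), optional B
        $s =~ s/^([-+]?)(\d+(?:\.\d+)?)(?:([KMGT])(i?))?B?//i or return;
        my ( $sign, $num, $prefix, $binary ) = ( $1, $2, $3, $4 );
        my $base = $binary ? 1024 : 1000;
        my $mult = defined $prefix ? $base**$power{ uc($prefix) } : 1;
        $total += ( $sign eq '-' ? -1 : 1 ) * $num * $mult;
    }
    return $total;
}

print size2bytes_sketch("4MB"), "\n";            # 4000000
print size2bytes_sketch("3TiB - 400G"), "\n";    # 2898534883328
```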

size2bytes_2

Given a string like "4MB" or "3TB - 400G", return the value as a number of bytes. undef is returned if we can't parse the string. Prefixes are computed base 2 (K = 1024, M = 1048576, ...).

size2bytes_SI

DEPRECATED: size2bytes now uses MB and MiB

Given a string like "4MB" or "3TB - 400G", return the value as a number of bytes. undef is returned if we can't parse the string. Prefixes are treated as standard SI prefixes (K = 1000, M = 1000000, ...).

bytes2size

Print a human-readable string of the form 20.4MiB from the corresponding number of bytes (an integer). An optional second parameter specifies the minimal digits of accuracy (3 by default, giving 1.21 but 12.1). An optional third parameter specifies the minimum number of digits to keep after the decimal point (1 by default). Prefixes are computed base 2 (Ki = 1024, Mi = 1048576, ...).
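
One way the two digit parameters could interact is sketched below. This is an assumption about the behavior, not the module's code: bytes2size_sketch is a hypothetical name, it returns the string rather than printing it, and it stops at TiB.

```perl
use strict;
use warnings;

sub bytes2size_sketch {
    my ($bytes, $digits, $min_decimals) = @_;
    $digits       = 3 unless defined $digits;
    $min_decimals = 1 unless defined $min_decimals;
    my @units = qw(B KiB MiB GiB TiB);
    my ($value, $i) = ($bytes, 0);
    while (abs($value) >= 1024 && $i < $#units) {
        $value /= 1024;
        $i++;
    }
    return "${bytes}B" if $i == 0;
    # Spend the digit budget on the integer part first, the rest on decimals.
    my $decimals = $digits - length(int abs $value);
    $decimals = $min_decimals if $decimals < $min_decimals;
    return sprintf "%.*f%s", $decimals, $value, $units[$i];
}

print bytes2size_sketch(21_390_950), "\n";   # 20.4MiB
print bytes2size_sketch(1_269_760),  "\n";   # 1.21MiB
```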

bytes2size_SI

DEPRECATED: bytes2size now emits KiB, MiB, ...

Print a human-readable string of the form 20.4MB from the corresponding number of bytes (an integer). An optional second parameter specifies the minimal digits of accuracy (3 by default, giving 1.21 but 12.1). An optional third parameter specifies the minimum number of digits to keep after the decimal point (1 by default). Prefixes are treated as standard SI prefixes (K = 1000, M = 1000000, ...).

:file - File Operations

rofh

Read only filehandle

 my $fh = rofh $filename;
 my $fh = rofh \$mode, $filename;

Simply performs an open or croak with an appropriate message. If a string reference $mode is provided as a first argument it will be taken as the file mode (the default is "<").
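
The "open or croak with an optional mode reference" pattern shared by these filehandle helpers might look roughly like this. It is a minimal sketch, not the module's source; rofh_sketch and the error message text are made up for the example.

```perl
use strict;
use warnings;
use Carp qw(croak);

sub rofh_sketch {
    # A leading SCALAR reference, if present, carries the open mode.
    my $mode = ref($_[0]) eq 'SCALAR' ? ${ shift() } : '<';
    my ($filename) = @_;
    open my $fh, $mode, $filename
        or croak "Can't open '$filename' with mode '$mode': $!";
    return $fh;
}

# Usage: my $fh = rofh_sketch(\"<:encoding(UTF-8)", $filename);
```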

wofh

Write only filehandle

 my $fh = wofh $filename;
 my $fh = wofh \$mode, $filename;

Simply performs an open or croak with an appropriate message. If a string reference $mode is provided as a first argument it will be taken as the file mode (the default is ">").

rwfh

Read-write filehandle

 my $fh = rwfh $filename;
 my $fh = rwfh \$mode, $filename;

Simply performs an open or croak with an appropriate message. If a string reference $mode is provided as a first argument it will be taken as the file mode (the default is "+<").

rofhz

Read only compressed filehandle

 my $fh = rofhz $filename;
 my $fh = rofhz \$mode, $filename;

Simply performs an open or croak with an appropriate message. Requires perl compiled with PerlIO support (perl 5.8, I believe). The gzip PerlIO layer is loaded with the autopop option so that uncompressed files can be opened using this function. If a string reference $mode is provided as a first argument it will be taken as the file mode (the default is "<:gzip(autopop)").

Note: To properly decode UTF-8 files use the mode "<:gzip(autopop):encoding(UTF-8)"

wofhz

Write only compressed filehandle

 my $fh = wofhz $filename;
 my $fh = wofhz \$mode, $filename;

Simply performs an open or croak with an appropriate message. Requires perl compiled with PerlIO support (perl 5.8, I believe). If a string reference $mode is provided as a first argument it will be taken as the file mode (the default is ">:gzip").

Note: To properly encode UTF-8 files use the mode ">:gzip:encoding(UTF-8)"

rwfhz

Read-write compressed filehandle

 my $fh = rwfhz $filename;
 my $fh = rwfhz \$mode, $filename;

Simply performs an open or croak with an appropriate message. Requires perl compiled with PerlIO support (perl 5.8, I believe). The gzip PerlIO layer is loaded with the autopop option so that uncompressed files can be opened using this function. If a string reference $mode is provided as a first argument it will be taken as the file mode (the default is "+<:gzip(autopop)").

Note: To properly decode UTF-8 files use the mode "+<:gzip(autopop):encoding(UTF-8)"

in_and_out

 my ($IN, $OUT) = in_and_out( @ARGV[0,1] );
 my ($IN, $OUT) = in_and_out( @ARGV[0,1], %options );

Open file handles for text processing. Solves typical command line: "do_something input output" where input/output may be missing or "-" (use STDIN/STDOUT) and input may equal output (this sub handles file copying).

Return value is a pair of filehandles ready for processing. Use binmode to append PerlIO layers if desired.

bak

String to append for backups when input == output. Defaults to: ~

Example:

 my ($IN, $OUT) = in_and_out( @ARGV[0,1] );
 binmode $IN, ":encoding(UTF-8)";
 binmode $OUT, ":encoding(iso-8859-1)";
 print $OUT $_ while defined( $_ = <$IN> );

touch

 touch @files;
 touch \MODE @files;

Create files using an optional numeric mode (e.g., touch \0700, "foo"). If files exist, their atime and mtime will be updated to the current time.

canonpath

Like canonpath command in File::Spec, but only works on unix filesystems (also cygwin if $^O eq 'cygwin'). However, it will clean up "/../" components whereas File::Spec->canonpath will not.

The code has been modified from File::Spec::Unix::canonpath in the PathTools package by Ken Williams.
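
To illustrate what cleaning up "/../" components means, here is a small stand-alone sketch. It is not the module's code (which is derived from File::Spec::Unix); collapse_dotdot is a hypothetical name, and the cleanup is purely lexical, so the result can differ from the real filesystem location when symbolic links are involved.

```perl
use strict;
use warnings;

sub collapse_dotdot {
    my ($path) = @_;
    my $abs = $path =~ m{^/};
    my @out;
    for my $part (split m{/+}, $path) {
        next if $part eq '' or $part eq '.';
        if ($part eq '..') {
            if (@out && $out[-1] ne '..') { pop @out }          # cancel previous component
            elsif (!$abs)                 { push @out, '..' }   # keep leading ".." on relative paths
            # at the root of an absolute path, ".." is simply dropped
        }
        else {
            push @out, $part;
        }
    }
    my $clean = join '/', @out;
    return $abs ? "/$clean" : (length($clean) ? $clean : '.');
}

print collapse_dotdot("/a/b/../c"), "\n";   # /a/c
```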

fmap

 my @foos = fmap { s/^FOO: (.*)/$_Util::fmap::file: '$1' line $./ } @files
 my @foos = fmap { s/^FOO: (.*)/$_Util::fmap::file: '$1' line $./ } \%options, @files

Transform files. Loop through the lines of each file and apply a function. Replace each line with the new value of $_. The current file name will be available in the variable $_Util::fmap::file and will be one of the entries in the file list given to the subroutine. Of course, the standard perl variable $. ($INPUT_LINE_NUMBER when use English; is in effect) will be available for your use as well.

In scalar or list context returns a hashref (or hash) of (filename => [ new contents ]) pairs. The values are arrayrefs containing the modified lines of each file.

In void context, alters files in-place, just like using perl -pi -e from the command line.

if_mode

File mode when reading the file (the default is simply "<").

of_mode

File mode when writing the file (the default is simply ">").

backup

If this is a single-character string (e.g., '~') or starts with a leading dot (e.g., '.bak'), it is appended to the filename as a backup suffix. Otherwise it is treated as the backup file name (e.g., 'old_foo'). The default is '~'.
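
The non-void behavior of fmap (return the transformed lines per file, leave the files alone) could be sketched like this. It is an illustration only: fmap_sketch is a hypothetical name, it ignores the options hash and the in-place mode, and it does not set $_Util::fmap::file.

```perl
use strict;
use warnings;

sub fmap_sketch(&@) {
    my ($code, @files) = @_;
    my %contents;
    for my $file (@files) {
        open my $in, '<', $file or die "Can't read '$file': $!";
        local $_;
        my @lines;
        while (<$in>) {
            $code->();          # the block edits $_ in place; $. is the line number
            push @lines, $_;
        }
        close $in;
        $contents{$file} = \@lines;
    }
    return \%contents;          # filename => [ new contents ]
}

# Usage: my $new = fmap_sketch { s/^FOO: /line $.: / } @files;
```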

fgrep

 my @foos = fgrep { s/^FOO: (.*)/$_Util::fgrep::file: '$1' line $./ } @files
 my @foos = fgrep { s/^FOO: (.*)/$_Util::fgrep::file: '$1' line $./ } \"<:encoding(UTF-8)", @files

Grep files. Loop through the lines of each file and apply a function. If the function returns a true value then $_ (after the function application) will be appended to a list to be returned. The current file name will be available in the variable $_Util::fgrep::file and will be one of the entries in the file list given to the subroutine. Of course, the standard perl variable $. ($INPUT_LINE_NUMBER when use English; is in effect) will be available for your use as well.

In scalar context just the number of matches will be returned.

NOTE: If you want to chomp your lines note that the last line of a file may not contain a newline (or whatever $/ is) so use something like either of the following:

 my @foos = fgrep { chomp; /^FOO/ } @files;
 my @foos = fgrep { /^FOO/ and chomp || 1 } @files;
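
The reason for the "|| 1" can be seen with plain grep, without this module at all: chomp returns the number of characters it removed, which is 0 (false) for a final line that has no newline. A small self-contained demonstration:

```perl
use strict;
use warnings;

# Two "lines" as read from a file whose last line lacks a newline.
my @lines = ("FOO one\n", "FOO two");

my @copy  = @lines;
my @naive = grep { /^FOO/ and chomp($_) } @copy;        # loses "FOO two"

@copy = @lines;
my @safe  = grep { /^FOO/ and chomp($_) || 1 } @copy;   # keeps both

print scalar(@naive), " vs ", scalar(@safe), "\n";      # 1 vs 2
```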

If a string reference $mode is provided as the first argument after the subroutine block it will be taken as the file mode (the default is simply "<").

find

  #XXX: BUGS!
  Currently not entirely correct but getting better. Known bug:
    * -mindepth available but broken

 my @files = find [ '/' ], qw/-type f -name *.pm/;

File::Find using find(1) semantics. Currently supported find options are given below (descriptions taken from find(1)). Unlike find, this subroutine defaults to returning the list of matches rather than defaulting to the -print action. Tests are performed in the order specified so a failure early on will prevent further tests/actions from being performed. Note: this function will never be a full find2perl replacement.

-depth

Process each directory's contents before the directory itself.

-follow

Dereference symbolic links. This is the option that most closely follows find(1)'s behavior but is not a perfect match. In particular, a symbolic link which (if followed) would actually result in a circular reference will be processed by find(1), but not by this function.

NOTE: This option corresponds to the follow_fast option to File::Find

-follow_smart

Dereference symbolic links. Circular references (as well as links that would cause a circular reference) will be automatically removed (symbolic links will only appear if the "real" file would not have been found otherwise). Dangling symbolic links will be ignored.

NOTE: This option corresponds to the follow option to File::Find

-no_chdir

Sets the corresponding File::Find option: does not "chdir()" to each directory as it recurses. When true, the first argument to -wanted and -exec routines will be a full path. For example, when examining the file "/some/path/foo.ext" while doing find ["/some"] you will have:

 @_ = ($_ = '/some/path/foo.ext', '/some/path/', '/some/path/foo.ext', '/the/realpath/foo.ext')

-untaint
-untaint_pattern

Untaint directory names before "chdir()"'ing into them. Untaints using -untaint_pattern. -untaint_pattern defaults to qr|^([-+@\w./]+)$|. Your untaint pattern may be a string or pre-compiled (qr) pattern, but MUST capture the directory name to $1.

-maxdepth levels

Descend at most levels (a non-negative integer) levels of directories below the command line arguments. '-maxdepth 0' means only apply the tests and actions to the command line arguments.

-quiet

Disable "Permission denied" warnings for unreadable directories.

Tests

-iname pattern

Like -name, but the match is case insensitive. For example, the patterns 'fo*' and 'F??' match the file names 'Foo', 'FOO', 'foo', 'fOo', etc.

-iregex pattern

Like -regex, but the match is case insensitive.

-name pattern

Base of file name (the path with the leading directories removed) matches glob pattern (or regexp if passed as qr// compiled regexp). The metacharacters ('*', '?', and '[]') do not match a '.' at the start of the base name.

-regex pattern

File name matches regular expression pattern. This is a match on the whole path, not a search. For example, to match a file named './fubar3', you can use the regular expression '.*bar.' or '.*b.*3', but not 'b.*r3'.

-type char

File is of type "char":

  b      block (buffered) special
  c      character (unbuffered) special
  d      directory
  p      named pipe (FIFO)
  f      regular file
  l      symbolic link
  s      socket
  D      door (Solaris)

Actions

-wanted subroutine
-exec subroutine

Execute subroutine. The subroutine is executed in the directory containing the file and is passed three parameters: the file's name, the current directory (relative to the starting directory), and the file's full path (relative to the starting directory). If the "-follow" option is provided then the "true" filename (all symbolic links resolved) will be provided as a fourth argument. That is:

 @_ = ($_, $File::Find::dir, $File::Find::name, \%info);

For example, when examining the file "/some/path/foo.ext" while doing find ["/some"] you will have:

 @_ = ($_ = 'foo.ext', '/some/path', '/some/path/foo.ext', \%info)

Where

 %info = (
   path     => $_                    = "foo.ext",
   dir      => $File::Find::dir      = "/some/path/",
   name     => $File::Find::name     = "/some/path/foo.ext",
   fullname => $File::Find::fullname = "/the/realpath/foo.ext",
   top_dir  => "/some",              # current path being examined
   rel_dir  => "path",               # relative to top_dir
   rel_path => "path/foo.ext",       # relative to top_dir
   filename => "foo.ext",            # even when -no_chdir
   basename => "foo",                # removes last extension only
 );

If we call find ["D"] from "/foo",

 D/
 |-- bar
 `-- bip
     `-- baz.txt

Then %info will be:

 | path    | dir   | name          | fullname | top_dir | rel_dir | rel_path    | filename | basename |
 |---------+-------+---------------+----------+---------+---------+-------------+----------+----------|
 | .       | D     | D             | undef    | D       | .       | .           | .        | .        |
 | bar     | D     | D/bar         | undef    | D       | .       | bar         | bar      | bar      |
 | bip     | D     | D/bip         | undef    | D       | .       | bip         | bip      | bip      |
 | baz.txt | D/bip | D/bip/baz.txt | undef    | D       | bip     | bip/baz.txt | baz.txt  | baz      |

A "-wanted" subroutine will automatically set "$File::Find::prune" if the subroutine returns false. An "-exec" subroutine will do no such magic.

-print0

Print the full file name on the standard output, followed by a null character. This allows file names that contain newlines to be correctly interpreted by programs that process the find output.

-print

Print the full file name on the standard output, followed by a newline.

-prune_all_failures

Discard and prune any files for which any test fails.

-prune_hidden

Discard and prune any hidden files. At the moment this means anything starting with '.' since I don't know how to detect "hidden" files on any systems other than Linux.

-prune_iname pattern

Like -prune_name, but the match is case insensitive. For example, the patterns 'fo*' and 'F??' match the file names 'Foo', 'FOO', 'foo', 'fOo', etc.

-prune_name pattern

Discard and prune any files where base of file name (the path with the leading directories removed) matches shell pattern pattern. The metacharacters ('*', '?', and '[]') do not match a '.' at the start of the base name.

-prune_rcs

Discard and prune any files or directories that look like they belong to a revision control system. At the moment this means any directories named: ".svn", "CVS", "blib", "{arch}", ".bzr", "_darcs", "RCS", "SCCS", ".git", ".pc"

-prune_backup

Discard and prune any files or directories that look like backups. This includes anything ending in "~" or ".bak", matching "#*#", or ending in ".tmp" or matching ".tmp-[_a-zA-Z0-9]+"

-prune_regex pattern

Discard and prune any names matching the regular expression pattern. This is a match on the whole path, not a search. For example, to match a file named './fubar3', you can use the regular expression '.*bar.' or '.*b.*3', but not 'b.*r3'.

Main Limitations:

No grouping via (), no -or.

newer

Returns true if first file is newer than second file. Also returns true if first file exists but second does not.
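
A minimal sketch of such a comparison using modification times. The name newer_sketch is hypothetical, and two details are assumptions not stated above: that "newer" means a later mtime, and that a missing first file yields false.

```perl
use strict;
use warnings;

sub newer_sketch {
    my ($first, $second) = @_;
    return 0 unless -e $first;     # assumption: a missing first file is never newer
    return 1 unless -e $second;    # first exists, second does not
    my $m1 = (stat $first)[9];     # element 9 of stat() is the mtime
    my $m2 = (stat $second)[9];
    return $m1 > $m2 ? 1 : 0;
}

# Usage: rebuild() if newer_sketch("input.txt", "output.txt");
```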

lastline

 my $line = lastline $file;
 my $line = lastline \"<:encoding(UTF-8)", $file;

Returns the last line of a file. Includes a seek() optimization based on the lengths of the first several lines so that reading the last line of a large file should be reasonably efficient.

By default the input will not be decoded. Either provide an initial scalar reference containing the file mode (with proper encoding, for example \"<:encoding(UTF-8)") or decode the string before using it.

fprint

See also: File::Slurp

 fprint $filename, @stuff
 fprint \$mode, $filename, @stuff

Prints stuff to the indicated filename. If a mode is provided (for example, \">:encoding(UTF-8)") then it will be used instead of the default mode (">").

fprint_bu

 fprint_bu $filename, @stuff
 fprint_bu \$mode, $filename, @stuff

Prints stuff to the indicated filename, but backup filename (by appending a ~) first. If a mode is provided (for example, \">:encoding(UTF-8)") then it will be used instead of the default mode (">").

fappend

See also: File::Slurp

 fappend $filename, @stuff
 fappend \$mode, $filename, @stuff

Append stuff to the indicated filename. If a mode is provided (for example, \">>:encoding(UTF-8)") then it will be used instead of the default mode (">>").

fincrement

 fincrement $filename
 fincrement $filename, $amount
 fincrement $filename, pre => $pre, post => $post, layers => $perlio_layers
 fincrement $filename, $amount, pre => $pre, post => $post

Increments the number contained in $filename. On success, the new value is returned (Note: may be zero if $filename contained "-1"). On failure, undef is returned.

The amount to add to the file's value may be provided. If it is missing, then a value of one is assumed. The optional parameters $pre and $post specify strings to print to the file before and after the number. These strings default to the empty string and a single newline respectively.

Note: $filename must contain only a number (with possible whitespace), or must exactly contain the concatenation of $pre, number, and $post.

If $filename does not exist, then it will be initialized to "0"

The "layers" option can be used to set the PerlIO layers for the opened files (for example layers => ":encoding(UTF-8)"). By default, no layers are applied.
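
The basic read, add, write-back cycle could be sketched as below. This is a simplified illustration, not the module's code: fincrement_sketch is a hypothetical name, the pre, post, and layers options are omitted, and there is no file locking, so concurrent callers could race.

```perl
use strict;
use warnings;

sub fincrement_sketch {
    my ($filename, $amount) = @_;
    $amount = 1 unless defined $amount;
    my $value = 0;                                   # a missing file starts from 0
    if (-e $filename) {
        open my $in, '<', $filename or return;
        my $content = do { local $/; <$in> };        # slurp the whole file
        close $in;
        return unless defined $content && $content =~ /^\s*(-?\d+)\s*$/;
        $value = $1;
    }
    $value += $amount;
    open my $out, '>', $filename or return;
    print {$out} "$value\n";
    close $out or return;
    return $value;                                   # may legitimately be 0
}

# Usage: my $n = fincrement_sketch("counter.txt"); defined $n or die "failed";
```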

cat

See also: File::Slurp

 my $stuff = cat $file;
 my $stuff = cat \$mode, $file;

Read in the entirety of a file. If requested in list context, the lines are returned. In scalar context, the file is returned as one large string. If a string reference $mode is provided as a first argument it will be taken as the file mode (the default is "<").

bcat

See also: File::Slurp

Read in the entirety of a binary file. If requested in list context, the lines are returned. In scalar context, the file is returned as one large string.

bu_open

 bu_open $file
 bu_open $fh, $file
 bu_open $fh, $file, "$file.bak"
 bu_open \$mode, $file
 bu_open \$mode, $fh, $file
 bu_open \$mode, $fh, $file, "$file.bak"
 ($writer, $reader) = bu_open \$mode, $file

Backup and open. The general idea is, if the file exists, rename it by appending a "~" to its name, then open the original name in write mode. This sub croaks if any operation fails. The backup file is created new so that the inode of the original file does not change.

If only a single string variable argument is given and the function is called in void context, then the requested file is backed up and opened, "upgrading" the given argument to a filehandle. Example:

 $file = "foo";
 bu_open $file;         # Note: bu_open "foo"; would be a fatal error
 print $file "Bar\n";

In scalar context, $file is unchanged and a write-only filehandle is returned.

In list context, a filehandle for both the new file (write only) and the backup (read only) are returned.

If a mode is provided as a SCALAR reference (for example, \">:encoding(UTF-8)") then it will be used instead of the default mode (">").

If two arguments are given, the first will be used to store the newly opened filehandle, and the second should hold the file name.

Finally, the final argument (if provided) will be used for the backup file (rather than the $file argument with a "~" appended).

catfile

Calls the File::Spec catfile and canonpath methods.

realfile

Unnecessary! use Cwd::realpath

:shell - Shell Operations

safe_pipe

 safe_pipe [ options, ] command, input
 my $results = safe_pipe [ 'command', 'arg' ], @input;
 my @results = safe_pipe [ 'command', 'arg' ], @input;
 my $results = safe_pipe \%opt, [ 'command', 'arg' ], @input;

Pipe data to a shell command safely (without touching a command line) and retrieve the results. Notably, this is the situation that IPC::Open2 says is dangerous (may block forever) when using open2. If process execution fails for any reason an error is thrown.

In void context, all command output will be directed to STDERR making this command almost equivalent to:

 my $pid = open my $F, "|-", 'command', 'arg' or die;
 print $F @input; close $F;
 waitpid( $pid, 0 );

Options:

capture_err

If true, STDERR will also be captured and included in returned results.

allow_error_exit

By default, this sub will verify that the command exited successfully (0 == $?) and throw an error if anything went wrong. Setting allow_error_exit to a true value will prevent this sub from examining the return value of the command.

Setting allow_error_exit to an array of allowed exit status will ignore only those (error) exit codes (code 0 will be considered a success).

Modified code from merlyn: http://www.perlmonks.org/index.pl?node_id=339092

Note: Input and output will not be encoded/decoded thus should be octets.

Note: locally alters $SIG{CHLD}

:color - Color

NOCOLOR

 NOCOLOR(__PACKAGE__) if !$opt{color};
 NOCOLOR()            if !$opt{color};

Replaces subroutines and package variables whose name matches one of the names in the :color_subs or :color_strings export tags with inert versions which do not insert any color sequences. Subroutines are replaced by the identity function and strings are replaced with the empty string. The default package is the caller's current package.

WARNING: This subroutine has no good way of knowing that the subroutines and variables that it finds are really color subroutines and variables. It does however check that subroutines have a '$' prototype and it only has access to package variables (those not declared by my). This combined with the fact that there are only so many things that a function called "BLUE" could reasonably do means that this should not generally be a problem.

SUBS affected:

 BOLD UNDERLINE DARK BLINK REVERSE CONCEALED STRIKE
 BLACK RED GREEN YELLOW BLUE MAGENTA CYAN WHITE
 GREY GRAY BRIGHT_RED BRIGHT_GREEN BRIGHT_YELLOW BRIGHT_BLUE BRIGHT_MAGENTA BRIGHT_CYAN
 ON_BLACK ON_RED ON_GREEN ON_YELLOW ON_BLUE ON_MAGENTA ON_CYAN ON_WHITE
 ON_GREY ON_GRAY ON_BRIGHT_RED ON_BRIGHT_GREEN ON_BRIGHT_YELLOW ON_BRIGHT_BLUE ON_BRIGHT_MAGENTA ON_BRIGHT_CYAN

SCALARS affected:

 $BOLD $BOLD_OFF $UNDERLINE $UNDERLINE_OFF $DARK $DARK_OFF $BLINK $BLINK_OFF $REVERSE $REVERSE_OFF
 $CONCEALED $CONCEALED_OFF $STRIKE $STRIKE_OFF $NORMAL $DEFAULT_FG $DEFAULT_BG
 $BLACK $RED $GREEN $YELLOW $BLUE $MAGENTA $CYAN $WHITE
 $GREY $GRAY $BRIGHT_RED $BRIGHT_GREEN $BRIGHT_YELLOW $BRIGHT_BLUE $BRIGHT_MAGENTA $BRIGHT_CYAN
 $ON_BLACK $ON_RED $ON_GREEN $ON_YELLOW $ON_BLUE $ON_MAGENTA $ON_CYAN $ON_WHITE
 $ON_GREY $ON_GRAY $ON_BRIGHT_RED $ON_BRIGHT_GREEN $ON_BRIGHT_YELLOW $ON_BRIGHT_BLUE $ON_BRIGHT_MAGENTA $ON_BRIGHT_CYAN

hsl2rgb

 my $rgb    = hsl2rgb( $H, $S, $L );
 my @colors = hsl2rgb( @hsl_colors );

Convert HSL colors (triples from 0 to 1) to RGB colors (triples from 0 to 255).
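
For reference, the standard HSL to RGB formula can be sketched as follows. The name hsl2rgb_sketch is hypothetical, it handles a single triple and returns a flat list, and the rounding to the nearest integer is an assumption; the module's own return shape and rounding may differ.

```perl
use strict;
use warnings;

sub hsl2rgb_sketch {
    my ($h, $s, $l) = @_;
    my $q = $l < 0.5 ? $l * (1 + $s) : $l + $s - $l * $s;
    my $p = 2 * $l - $q;
    my $channel = sub {
        my ($t) = @_;
        $t += 1 if $t < 0;          # wrap the hue offset into [0, 1]
        $t -= 1 if $t > 1;
        return $p + ($q - $p) * 6 * $t          if $t < 1/6;
        return $q                               if $t < 1/2;
        return $p + ($q - $p) * (2/3 - $t) * 6  if $t < 2/3;
        return $p;
    };
    # Red, green, and blue are the same curve shifted by a third of a turn.
    return map { int(255 * $channel->($_) + 0.5) } ($h + 1/3, $h, $h - 1/3);
}

print join(",", hsl2rgb_sketch(0, 1, 0.5)), "\n";   # 255,0,0
```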

rainbow

 rainbow( $n );
 rainbow( $n, %colors_options);

Return a list of $n rainbow colors (ROYGBIV).

Any options supported by colors can be provided and will be passed along, including the n and colors options, so you probably don't want to include those options.

wavelength2rgb

Convert a wavelength (a number between 380 nm and 780 nm) to an RGB triplet (0 ≤ x_i ≤ 1). Returns undef if given an out-of-range wavelength.

Formulas taken from Dan Bruton's color science page (http://www.midnightkite.com/color.html).

$_re_color_escape

A pre-compiled regular expression that matches any of the colors or font manipulations provided in this package.

strip_color

Remove the color tags from a list of strings. The uncolored strings are returned. Does not modify the input strings and can be used on constant strings.

strip_color_violently

Remove the color tags from a list of strings. The uncolored strings are returned. Modifies the input strings and therefore may not be used on constant strings.

clength

Compute the length of a possibly colored string. The standard perl length function gets confused about how long a colored or decorated string is. This function fixes that so that you can center or align data.
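
The idea can be sketched as "strip the escape sequences, then measure". The sketch below assumes the color tags are ANSI SGR sequences (ESC [ ... m); clength_sketch and $re_sgr are hypothetical names, and the module's own $_re_color_escape pattern may be stricter or broader.

```perl
use strict;
use warnings;

# Assumed shape of a color escape: ESC [ digits-and-semicolons m
my $re_sgr = qr/\e\[[0-9;]*m/;

sub clength_sketch {
    my ($string) = @_;
    (my $plain = $string) =~ s/$re_sgr//g;   # work on a copy, leave the input alone
    return length $plain;
}

my $colored = "\e[1;31mred\e[0m";
print length($colored), " vs ", clength_sketch($colored), "\n";   # 14 vs 3
```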

%color

A hash of color names => escape sequences. Included are text style sequences,

  BOLD UNDERLINE DARK BLINK REVERSE CONCEALED

Also, the following colors:

  BLACK GREY GRAY WHITE
  RED GREEN YELLOW BLUE MAGENTA CYAN
  BRIGHT_RED BRIGHT_GREEN BRIGHT_YELLOW BRIGHT_BLUE BRIGHT_MAGENTA BRIGHT_CYAN

And their corresponding backgrounds:

  ON_BLACK ON_GREY ON_GRAY ON_WHITE
  ON_RED ON_GREEN ON_YELLOW ON_BLUE ON_MAGENTA ON_CYAN
  ON_BRIGHT_RED ON_BRIGHT_GREEN ON_BRIGHT_YELLOW ON_BRIGHT_BLUE
  ON_BRIGHT_MAGENTA ON_BRIGHT_CYAN

colors

At the most basic level, converts colors to different formats, however this subroutine is capable of quite a bit more than that.

Examples:

 colors [qw/red green blue/], format => "ps";
 colors [qw/red green blue/], format => "ps", n => 2;

colors

A list of colors, can be an X11 color name or any of the other formats recognized by Color::Calc.

n

Only return n colors.

interpolate

If false, requesting more colors than available in the colors list will throw a fatal error. The default is to create new colors between the given colors if there are insufficient colors provided. The interpolate option will also cause colors to be interpolated if the distribute option is set.

distribute

By default, if fewer colors are requested than are contained in the colors list, this subroutine will select the first n colors. Providing a true value for distribute will cause the subroutine to evenly spread out the choice of colors over the range of colors provided (if n > 2 then the first and last colors are guaranteed to be included).

format

Specify the style of the returned colors. Can be anything supported by Color::Calc which is currently (Color::Calc::VERSION == 1.0): "tuple", "hex", "html", "object" (a Graphics::ColorObject object), "pdf". The default format is "object".

The following formats are also accepted and are handled by this subroutine directly: "ps" | "postscript".

background

Try to make the colors appear on the given background color. Colors will be altered if this option is provided.

:color_subs - Color Subroutines

BOLD($)

Make text bold

DARK($)

Make text dark

UNDERLINE($)

Make text underline

BLINK($)

Make text blink

REVERSE($)

Make text reverse

CONCEALED($)

Make text concealed

STRIKE($)

Strike-through text (rarely implemented)

BLACK($)

Make text black

RED($)

Make text red

GREEN($)

Make text green

YELLOW($)

Make text yellow

BLUE($)

Make text blue

MAGENTA($)

Make text magenta

CYAN($)

Make text cyan

WHITE($)

Make text white

GREY($)

Make text grey

GRAY($)

Make text gray

BRIGHT_RED($)

Make text bright_red

BRIGHT_GREEN($)

Make text bright_green

BRIGHT_YELLOW($)

Make text bright_yellow

BRIGHT_BLUE($)

Make text bright_blue

BRIGHT_MAGENTA($)

Make text bright_magenta

BRIGHT_CYAN($)

Make text bright_cyan

ON_BLACK($)

Make text on_black

ON_RED($)

Make text on_red

ON_GREEN($)

Make text on_green

ON_YELLOW($)

Make text on_yellow

ON_BLUE($)

Make text on_blue

ON_MAGENTA($)

Make text on_magenta

ON_CYAN($)

Make text on_cyan

ON_WHITE($)

Make text on_white

ON_GREY($)

Make text on_grey

ON_GRAY($)

Make text on_gray

ON_BRIGHT_RED($)

Make text on_bright_red

ON_BRIGHT_GREEN($)

Make text on_bright_green

ON_BRIGHT_YELLOW($)

Make text on_bright_yellow

ON_BRIGHT_BLUE($)

Make text on_bright_blue

ON_BRIGHT_MAGENTA($)

Make text on_bright_magenta

ON_BRIGHT_CYAN($)

Make text on_bright_cyan

:color_strings - Color Strings

$NORMAL

Undo all color modifications

$DEFAULT_FG

Remove foreground coloring

$DEFAULT_BG

Remove background coloring

$BOLD

Make text bold

$BOLD_OFF

Undo make text bold

$DARK

Make text dark

$DARK_OFF

Undo make text dark

$UNDERLINE

Make text underline

$UNDERLINE_OFF

Undo make text underline

$BLINK

Make text blink

$BLINK_OFF

Undo make text blink

$REVERSE

Make text reverse

$REVERSE_OFF

Undo make text reverse

$CONCEALED

Make text concealed

$CONCEALED_OFF

Undo make text concealed

$STRIKE

Make text strike-through

$STRIKE_OFF

Undo make text strike-through

$BLACK

Make text black

$RED

Make text red

$GREEN

Make text green

$YELLOW

Make text yellow

$BLUE

Make text blue

$MAGENTA

Make text magenta

$CYAN

Make text cyan

$WHITE

Make text white

$GREY

Make text grey

$GRAY

Make text gray

$BRIGHT_RED

Make text bright_red

$BRIGHT_GREEN

Make text bright_green

$BRIGHT_YELLOW

Make text bright_yellow

$BRIGHT_BLUE

Make text bright_blue

$BRIGHT_MAGENTA

Make text bright_magenta

$BRIGHT_CYAN

Make text bright_cyan

$ON_BLACK

Make text on_black

$ON_RED

Make text on_red

$ON_GREEN

Make text on_green

$ON_YELLOW

Make text on_yellow

$ON_BLUE

Make text on_blue

$ON_MAGENTA

Make text on_magenta

$ON_CYAN

Make text on_cyan

$ON_WHITE

Make text on_white

$ON_GREY

Make text on_grey

$ON_GRAY

Make text on_gray

$ON_BRIGHT_RED

Make text on_bright_red

$ON_BRIGHT_GREEN

Make text on_bright_green

$ON_BRIGHT_YELLOW

Make text on_bright_yellow

$ON_BRIGHT_BLUE

Make text on_bright_blue

$ON_BRIGHT_MAGENTA

Make text on_bright_magenta

$ON_BRIGHT_CYAN

Make text on_bright_cyan

:display - Display functions

sprint_one_var

 binmode STDOUT, ":encoding(UTF-8)";
 print sprint_one_var scalar one_var \@data;

Returns a string describing data set, such as:

  N =    12: μ = 11.42, σ =  9.19  95% CI ( 6.11, 16.72)  «  3.55,  6.70,  7.72, 12.05, 36.25 »

five_nr

Show five number summary (default true)

ci

Show 95% confidence interval (default false)

dpad

Width of digit (N) field (default 3)

pad

Width of floating point fields (default 5)

digits

Digits after decimal, use negative number to format with nicef instead of %f (default 2)

mk_progressbar

Generates a progress subroutine. Sample usage might be (you provide the $items iterator and do_something sub or something equivalent):

 my $nr_items = $items->count;
 my $progress = mk_progressbar( total => $nr_items, countdown => 1 );
 print STDERR "Processing items ";
 while (my $item = $items->next) {
   $progress->($nr_items--);
   do_something($item);
 }
 $progress->(0);

With the above, your code now has a nice progress bar.

type

"bar", "dot", "percent", or "spinner". DEFAULT: bar

total

Number of items to process. Note: "total" and progress counts may be decimal. DEFAULT: 1

countdown

When true, progress sub expects value to decrease from total to 0 rather than increase from 0 to total. DEFAULT: undef (false)

format (percent type only)

sprintf format to display percentage. DEFAULT: "%.2f%%"

length (bar type only)

DEFAULT: 20

symbol (bar and dot types only)

DEFAULT: "*" for bar; "." for dot

symbols (spinner type only)

DEFAULT: [ qw( - \ | / ) ]

break (dot type only)

DEFAULT: 50

Print newline after every 50 dots.

space (dot type only)

DEFAULT: 10

Print space after every 10 dots.

fh

Output file handle. DEFAULT: STDERR

prefix

String to print before progress info.

suffix

String to print after progress info.
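
The "generate a progress subroutine" idea is just a closure over the options and the amount drawn so far. A minimal sketch of the bar type only, under stated assumptions: mk_bar_sketch is a hypothetical name, and the prefix, suffix, and other types are not handled.

```perl
use strict;
use warnings;

sub mk_bar_sketch {
    my %opt = (total => 1, length => 20, symbol => '*',
               fh => \*STDERR, countdown => 0, @_);
    my $drawn = 0;                       # symbols already printed
    return sub {
        my ($done) = @_;
        $done = $opt{total} - $done if $opt{countdown};
        my $want = int($opt{length} * $done / $opt{total});
        $want = $opt{length} if $want > $opt{length};
        if ($want > $drawn) {
            print { $opt{fh} } $opt{symbol} x ($want - $drawn);
            $drawn = $want;
        }
        return;
    };
}

my $progress = mk_bar_sketch(total => 10);
$progress->($_) for 1 .. 10;             # draws 20 "*" on STDERR
print STDERR "\n";
```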

clprint

  my ($i, @mark) = (0, qw[ - \ | / ]);
  print "Working: ";
  for (@things) {
    clprint $mark[($i %= 4)++];
    # ... other stuff ...
  }
  clprint;
  clprint \$var, @stuff;
  clprint \*STDOUT, \$var, @stuff;

A CLearing print. Erases whatever was printed last time and prints the next thing. This subroutine is smart enough not to try to erase past a newline even if you are using the perl variables $, or $\. This subroutine makes use of the clength subroutine so that color escape sequences are properly measured.

Calling the subroutine with no arguments forgets the previously printed thing without erasing it from the screen.

If a GLOB or IO::* is given as a first parameter, then that will be used for output. The default is STDERR and is stored in the $_Util::clprint::out variable if you want to change it.

If a reference to a scalar is given then that variable will be used to store the text history. This allows for multiple clprint levels. (Though it is up to you to nest them properly.)

sprint_hash

 sprint_hash $sep, %hash

Returns a string:

 "key" => "value"$sep"key" => "value"$sep...

If $sep is not provided (I.E., sprint_hash is called with an even number of arguments) $sep will default to $/ (typically "\n").
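
The odd/even argument-count trick for the optional separator can be sketched like this. sprint_hash_sketch is a hypothetical name, and sorting the keys is an assumption made so the output is predictable; the module may emit keys in hash order.

```perl
use strict;
use warnings;

sub sprint_hash_sketch {
    # An odd number of arguments means the first one is the separator.
    my $sep  = (@_ % 2) ? shift(@_) : $/;
    my %hash = @_;
    return join $sep, map { qq("$_" => "$hash{$_}") } sort keys %hash;
}

print sprint_hash_sketch(", ", b => 2, a => 1), "\n";   # "a" => "1", "b" => "2"
```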

print_hash

Prints the results of sprint_hash

ctext

 ctext( $text, $width, "left" | "right" )

Center a string horizontally over a given width (both left and right sides are padded with space). An optional third parameter specifies whether to err to the left or to the right. The default is left, to put an extra space to the right if necessary. undef is returned if $width < length $text.
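
The padding arithmetic might look roughly like the following. ctext_sketch is a hypothetical name, and it measures with plain length, so unlike a color-aware version it would miscount strings containing escape sequences.

```perl
use strict;
use warnings;

sub ctext_sketch {
    my ($text, $width, $err) = @_;
    my $pad = $width - length($text);
    return if $pad < 0;                       # undef: text wider than $width
    my $small = int($pad / 2);
    my $big   = $pad - $small;                # gets the extra space, if any
    my ($left, $right) = (defined $err && $err eq 'right')
        ? ($big, $small)                      # err to the right: extra space on the left
        : ($small, $big);                     # err to the left (default)
    return (' ' x $left) . $text . (' ' x $right);
}

print "[", ctext_sketch("ab", 5), "]\n";            # [ ab  ]
print "[", ctext_sketch("ab", 5, "right"), "]\n";   # [  ab ]
```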

lrtext

 lrtext( $left, $right, $width )

Return a string with enough space separating the $left and $right text so that the line fills the entire $width.

text_wrap_paragraphs

See also: Text::Autoformat

Splits a string on multiple consecutive newlines and passes each chunk to text_wrap. Returns the resulting paragraphs as a list of paragraphs. This function takes the same arguments as text_wrap.

text_wrap

See also: Text::Autoformat

Takes a string and wraps the text to be at most a certain width. Text is split at whitespace and hyphens (though actual hyphenation is beyond the scope of my interest). Long words are placed on lines by themselves, all whitespace is canonicalized, and the resulting string does not have a trailing newline.

This function uses the non-core package Term::ReadKey. Available options:

width

Total width of the paragraph, including any indentation. The default is the width of the terminal or the value of $ENV{COLUMNS} or 80. If width is negative, then that value will be subtracted from whatever width is auto-detected.

indent

A per-line indentation amount. The default is zero.

fill

If true, spaces will be added to the END of each line to make them exactly the right width. You might want this if you are colorizing the background so that the background color extends the full width on each line. The default is false.

wrap_chars

A list of characters that we are allowed to wrap on. The default is [ '-', ' ' ].
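The core of the wrapping behaviour can be sketched with a greedy loop. This sketch makes several simplifying assumptions: it breaks only on whitespace (no hyphen handling or wrap_chars), takes the width as a plain argument rather than detecting the terminal, and ignores indent and fill. wrap_sketch is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical greedy wrapper; not the module's text_wrap.
sub wrap_sketch {
    my ($text, $width) = @_;
    my ($line, @lines);
    for my $word (split ' ', $text) {    # split ' ' also canonicalizes whitespace
        if (!defined $line) {
            $line = $word;
        }
        elsif (length($line) + 1 + length($word) <= $width) {
            $line .= " $word";
        }
        else {
            push @lines, $line;          # current line is full
            $line = $word;               # an over-long word sits on its own line
        }
    }
    push @lines, $line if defined $line;
    return join "\n", @lines;            # no trailing newline
}

print wrap_sketch("the  quick\nbrown fox", 10), "\n";
# prints:
# the quick
# brown fox
```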

text_justify_paragraphs

Splits a string on multiple consecutive newlines and passes each chunk to text_justify. Returns the justified paragraphs as a list. This function takes the same arguments as text_justify.

text_justify

Takes a string and wraps the text to be exactly a certain width. Text is split at whitespace and hyphens (though actual hyphenation is beyond the scope of my interest). Long words are placed on lines by themselves, all whitespace is canonicalized, and the resulting string does not have a trailing newline.

This function uses the non-core package Term::ReadKey. Available options:

width

Total width of the paragraph, including any indentation. The default is the width of the terminal or the value of $ENV{COLUMNS} or 80. If width is negative, then that value will be subtracted from whatever width is auto-detected.

indent

A per-line indentation amount. The default is zero.

justify_last

If true the last line of the paragraph will be justified also. The default is false.

fill

If true, spaces will be added to the END of each line to make them exactly the right width. You might want this if you are colorizing the background so that the background color extends the full width on each line. The default is true.

wrap_chars

A list of characters that we are allowed to wrap on. The default is [ '-', ' ' ].

print_cols

Prints the results of format_cols

format_cols

 format_cols \@array, %options

Format the given list of items into columns according to the given options. This function has a couple of improvements over Term::PrintCols. In particular, it has more options, and is capable of correctly formatting lists with embedded ANSI color codes. This function uses the non-core package Term::ReadKey if the total_width option is not specified. The layout algorithm was inspired by GNU ls.

Available options are:

align => alignment string

The alignment string is a word in the characters: l, r, c, standing for Left, Right, and Center respectively. These control the alignment of each column. The last character is repeated as many times as necessary for the number of columns used in the formatted table. For example an alignment string of "lc" would center all columns after the first. The default value is "l".

col_width => integer

The minimum allowed column width. This number will be used if there are no items longer than the given integer minus col_space (to allow for a space).

col_space => integer

The minimal amount of spacing to place between each column. The actual column spacing may be larger since this function expands the columns to occupy the total available width. This value defaults to 2.

col_join => string || array

String(s) used to join columns. This option overrides the col_space option. If more columns are used than elements available in the col_join array then the last element will be repeated for all subsequent column dividers.

indent => integer

Amount of indentation to include on the left side of each line. This number will be taken from the total_width option before the list is formatted. The default is no indentation.

max_cols => integer

The maximum number of columns to create. Sometimes it may be preferable to specify the maximum number of columns rather than the minimum column width.

cols => integer

The exact number of columns to create.

orientation => 'horizontal' | 'vertical'

Specify whether the columns are to be filled horizontally or vertically. For example, if the list of items is (1..9), then the resulting column layouts would be:

 horizontal:              vertical:
    1  2  3  4               1  4  6  8
    5  6  7  8               2  5  7  9
    9                        3

The default orientation is vertical.
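One way to arrive at the two layouts shown for (1..9) is sketched here. It is a guess that happens to reproduce the example, not the module's layout algorithm: the number of columns is passed in rather than computed from widths, and the vertical case assumes that leftover items are given to the leftmost columns. layout_sketch is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical layout helper: returns one string per output row.
sub layout_sketch {
    my ($items, $cols, $orientation) = @_;
    my @rows;
    if ($orientation eq 'horizontal') {
        # Fill each row left to right before starting the next.
        for my $i (0 .. $#$items) {
            push @{ $rows[ int($i / $cols) ] }, $items->[$i];
        }
    }
    else {
        # Fill each column top to bottom; the first ($n % $cols) columns
        # receive one extra item (an assumption matching the example).
        my $n     = @$items;
        my $base  = int($n / $cols);
        my $extra = $n % $cols;
        my $i     = 0;
        for my $c (0 .. $cols - 1) {
            my $height = $base + ($c < $extra ? 1 : 0);
            $rows[$_][$c] = $items->[ $i++ ] for 0 .. $height - 1;
        }
    }
    return map { join "  ", @$_ } @rows;
}

print "$_\n" for layout_sketch([ 1 .. 9 ], 4, 'vertical');
# prints:
# 1  4  6  8
# 2  5  7  9
# 3
```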

total_width => integer

The total number of terminal columns to use. This option tries to find the correct width of the terminal first by using Term::ReadKey, then by examining $ENV{COLUMNS} and finally defaults to 80 characters. If the total_width is negative, then that value will be subtracted from whatever width is auto-detected.

max_width => integer

The maximum number of terminal columns to use. The default is to not constrain the "total_width".

fill_last_column => bool

If true, spaces will be added to the end of the last column in the same way that space is added to the end of all other columns. Otherwise, the last column will not be space padded on the right. The default is false (do not fill the right side of the last column).

uninitialized => 'warn' | 'die' | 'ignore'

Behavior upon finding uninitialized values. The default is 'warn'.

histogram

 print histogram( \%data, %options )

Returns a text histogram. The data hash consists of id => frequency. The graph looks best if the ids are short and all approximately the same length. The following options may also be provided:

height

The height of the tallest bar of the histogram. DEFAULT: 10

key_order

Either an array containing the order in which to display the histogram data or the keyword 'sort'. DEFAULT: sort

Note: %data may contain more data than is requested in key_order. We will only create a histogram with the key_order data.

max_frequency

The largest frequency. You might want to provide this for two reasons: to provide a uniform scaling over multiple histograms, or as an optimization (if you already have this value it would save us the work of recomputing it). By default we will compute it from the key data that we are actually displaying. DEFAULT: undef

bar_width

The width of each histogram bar. If undefined either 1 or the width of the widest label will be used (depending on the value of show_labels). DEFAULT: undef

bar_char

The character to use to draw the bars. DEFAULT: "*"

col_skip

The inter-column spacing. DEFAULT: 1

indent

Amount of indentation to include on the left side of each line. DEFAULT: 0

axis_overhang

The distance beyond the end bars that the histogram extends. DEFAULT: 2

show_axis

Print a horizontal bar beneath the histogram but above the labels. DEFAULT: true

show_labels

Print the labels centered under their respective bars. DEFAULT: true

:input - Prompting and input

get_boolean

 get_boolean { default => $d, true => $t, false => $f }, $input

Canonicalizes boolean input (with default). $input may be a string or a filehandle. If $input is omitted then one line of data is read from <STDIN>. This subroutine just tries to match either /^\s*[yYtT1]\w*\s*$/ or /^\s*[nNfF0]\w*\s*$/. Anything else causes the default value to be returned.

If the default value is not set (which is not the same as being set to undef) then the input will be returned as-is if it does not appear to be boolean or is empty or undefined. Using this subroutine in this way is somewhat fragile since something like "truck" or "Typhoon" will be canonicalized to "true" but "The Clash" will not. Thus, it is not very sophisticated in its distinction between boolean and non-boolean inputs.
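A minimal sketch of the matching rules just described, assuming string input only (the filehandle and STDIN cases are left out). get_boolean_sketch is a hypothetical name; the two patterns are the ones quoted above.

```perl
use strict;
use warnings;

# Hypothetical re-implementation for string input, for illustration only.
sub get_boolean_sketch {
    my ($opt, $input) = @_;
    if (defined $input) {
        return $opt->{true}  if $input =~ /^\s*[yYtT1]\w*\s*$/;
        return $opt->{false} if $input =~ /^\s*[nNfF0]\w*\s*$/;
    }
    # "Not set" is tested with exists, so default => undef still counts as set.
    return exists $opt->{default} ? $opt->{default} : $input;
}

my %yn = (true => "y", false => "n");
print get_boolean_sketch({ %yn, default => "y" }, "Typhoon"), "\n";   # y
print get_boolean_sketch({ %yn }, "The Clash"), "\n";                 # The Clash
```

The second call shows the fragility mentioned above: without a default, input that does not look boolean comes back unchanged.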

Yn

Returns "y" or "n", defaulting to "y", depending on the input. Argument may be a string, filehandle, or empty. If called with no arguments, then a single line is read from <>.

WARNING: This subroutine returns "y" if the input is the empty string or undefined.

yN

Returns "y" or "n", defaulting to "n", depending on the input. Argument may be a string, filehandle, or empty. If called with no arguments, then a single line is read from <>.

This subroutine returns "n" if the input is the empty string or undefined.

yn

Returns "y" or "n" if the input appears to be boolean. Argument may be a string, filehandle, or empty. If called with no arguments, then a single line is read from <>.

WARNING: The empty string and undef are not considered to be boolean and will not be canonicalized to "y" or "n".

Tf

Returns "1" or "0", defaulting to "1", depending on the input. Argument may be a string, filehandle, or empty. If called with no arguments, then a single line is read from <>.

WARNING: This subroutine returns "1" if the input is the empty string or undefined.

tF

Returns "1" or "0", defaulting to "0", depending on the input. Argument may be a string, filehandle, or empty. If called with no arguments, then a single line is read from <>.

This subroutine returns "0" if the input is the empty string or undefined.

tf

Returns "1" or "0" if the input appears to be boolean. Argument may be a string, filehandle, or empty. If called with no arguments, then a single line is read from <>.

WARNING: The empty string and undef are not considered to be boolean and will not be canonicalized to "1" or "0".

prompt

See Also: IO::Prompt(?)

 my $x = prompt();
 my $x = prompt( "prompt" );
 my $x = prompt( "prompt", "help string" );
 my $x = prompt( "prompt", { help hash } );
 my $x = prompt( "prompt", %options );
 my $x = prompt( "prompt", help string/hash, %options );

Prompt the user until valid input is received. The default prompt is '? '. The return value is the user input without the trailing newline.

The provided help may be either a help string which will be printed to screen when the help command is given (see below) or may be a hash of command => "help string" pairs which will be used if help for a particular command is requested. The hash value corresponding to the empty hash key ("" => "General help") will be used for the general help response.

help

Declare the help message/hash explicitly.

default

Specify a default response which will be returned if the user provides no response. Specifying this option makes the value of "allow_empty" irrelevant. Value may be set globally by setting $_Util::prompt::default.

allowed

An expression, of the same kinds accepted by "help_command", which specifies the allowed input values. A list gives all possible case-insensitive inputs. A regular expression may capture a sub-portion of the input line, in which case the captured portion will be used as the canonicalized value. Finally, a subroutine is expected to return the canonicalized value of the input. The default is to allow any DEFINED input value. Value may be set globally by setting $_Util::prompt::allowed.

allow_empty

Boolean value which (if true) allows an empty response value. The default is false. Value may be set globally by setting $_Util::prompt::allow_empty.

help_command

A literal string, list of literals, regular expression pattern, or subroutine which determines whether the user has asked for help. If a help hash was provided then patterns should capture the requested command in $1 and subroutines should return the requested command (or undef if the input is not a request for help). The default help_command is '?'.

Some valid examples:

 help_command => '?'
 help_command => ['?', 'h ', 'help ']
 help_command => qr/\?\s*(\w*)/
 help_command => sub { ($_[0] =~ /\?\s*(\w*)/) ? ($1 || "help_bar") : undef }

Value may be set globally by setting $_Util::prompt::help_command.
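To make the regular expression and subroutine forms concrete, the sketch below shows what each of the last two examples yields for a few inputs. requested_help is a hypothetical helper written for this illustration; how prompt itself dispatches on the option (and how the literal forms are matched) is not shown here.

```perl
use strict;
use warnings;

# Hypothetical dispatcher covering only the regex and subroutine forms.
sub requested_help {
    my ($help_command, $input) = @_;
    return $help_command->($input) if ref $help_command eq 'CODE';
    return $input =~ $help_command ? $1 : undef;   # $1 is the requested command
}

my $re  = qr/\?\s*(\w*)/;
my $sub = sub { ($_[0] =~ /\?\s*(\w*)/) ? ($1 || "help_bar") : undef };

for my $input ("? quit", "?", "quit") {
    my ($by_re, $by_sub) = map { requested_help($_, $input) } $re, $sub;
    printf "%-8s regex=%-8s sub=%s\n", "'$input'",
        map { defined $_ ? "'$_'" : "undef" } $by_re, $by_sub;
}
```

Note the difference for a bare "?": the pattern captures the empty string (general help), while the subroutine substitutes "help_bar".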

trim

A shortcut to set both "trim_leading" and "trim_trailing" to the same value.

trim_leading
trim_trailing

If true, any leading (resp. trailing) whitespace will be removed from the user's input prior to any processing by this subroutine. The default is true. Values may be set globally by setting $_Util::prompt::trim_leading and $_Util::prompt::trim_trailing.

input_filehandle

Specify the input filehandle. The default is STDIN. Value may be set globally by setting $_Util::prompt::input_filename.

output_filehandle

Specify the output filehandle. The default is STDOUT. Value may be set globally by setting $_Util::prompt::output_filename.

on_undef

Specify what to do when an undefined value is given as input. The following values are recognized:

  return     : causes "prompt" subroutine to immediately return undef
  make_empty : replaces the undefined value with the empty string and continues
  continue   : do nothing in particular ("default" and "allowed" will still apply)

Any other value will cause the script to croak with the value as the error message. The default value is "make_empty". Value may be set globally by setting $_Util::prompt::on_undef.

no_echo

If true, user's input is not echoed to the screen. Value may be set globally by setting $_Util::prompt::no_echo.

:plot - Graphs and Plots

plot_colors

 plot_colors( $n );
 plot_colors( $n, %colors_options);

Return a list of $n colors suitable for use in a plot. The colors are chosen to be visually distinct; however, if $n is large enough (more than 13) you will get a rainbow of colors.

Any options supported by colors can be provided and will be passed along, including the n and colors options, so you probably don't want to include those options.

ps_barchart !!incomplete

 ps_barchart( \@data );
 ps_barchart( \@data, %options );
 ps_barchart( %data_and_options );

Generate a PostScript barchart.

Examples:

 my @x = map { int(rand(20)) } 1..15;
 my @y = map { int(rand(20)) } 1..15;
 my @z = map { int(rand(20)) } 1..15;
 # A simple dynamic web graph:
 print "Content-Type: image/png\n\n", ps_barchart \@x;
 # Neighboring bars:
 ps_barchart file => "graph.png",
             data => [ foo => \@x, bar => \@y, baz => \@z ];
 # Stacked bars: ( [ [foo => \@x], [bar => \@y], ... ]  is also OK. )
 ps_barchart file    => "graph.gif", style => "stacked",
             xlabels => [qw/ay bee cee dee ee ef gee ach eye jay kay ell em en oh/],
             data    => [ foo => \@x, bar => \@y, baz => \@z ];
 # xlabels are dates, bars are already tiered ($x[$i] <= $y[$i] <= $z[$i] for all $i):
 ps_barchart file    => "graph.gif", style => "prestacked",
             xlabels => [qw/2005-01 2005-02 2005-03 2005-04 2005-05 2005-06 2005-07 2005-08/],
             timefmt => "%Y-%m", format => ["x %b %y", "y %g" ],
             data    => [ foo => \@x, bar => \@y, baz => \@z ];

##XXX: Alas, I still have to go through and make it be able to handle a proper histogram

:image - Image Routines

compile_latex

Compiles a LaTeX file. The following options are accepted.

latex

An integer specifying the number of times latex is to be run. Reasonable values are 1 (the default) or 2 (if your document has references which need to be resolved).

compiler

Arrayref containing the compile command to use. By default the compiler is chosen automatically from latex, pdflatex, or perltex (each running in batch mode; perltex can handle either latex or pdflatex documents). The choice is made by looking for a pdftex option on the \documentclass line (it may be in a comment at the end of that line) or for an uncommented \usepackage command loading perltex.

pdftex

Set to true if latex compiler produces pdf documents rather than dvi documents.

print

1 or printer name. Will be printed using dvips.

dvips

1/0 creates a PostScript file.

dvipdf

1/0 creates a PDF file.

bibtex

1/0 runs BibTeX at the right time.

index

1/0 runs makeindex at the right time.

Comments: Proposed future interface:

 compile_doc $file | $dir | [ paths ],
   output => [qw/ pdf ps /],
   compile => [qw/ latex1 bibtex makeindex ... latex /],
   # set to reasonable defaults for all known thinguns
   compile_bibtex_command => [ command prefix ],# only reasonable for $file or $dir calls
   compile_bibtex_command => sub { passed file or dir or paths },
   # called with output of prev command in $_; returns true if need to call bibtex
   # called with arguments: ( command_chain => [qw/ latex1 bibtex /], command_output => { latex1 => ..., bibtex => ... } )
   #                          ^- commands called so far (in order)    ^- output from commands (most recent call only)
   compile_bibtex_test => sub{ grep /\.bib$/, glob('*') },
   compile_makeindex_test=>sub{my%o=@_;my $res=$o{command_output}{latex1}||$o{command_output}{latex};$res=~/run makeindex/},
   # If true, test will be performed up to Int times and bibtex will be
   # called up to Int times
   compile_bibtex_multi => Int,
   convertto_pdf => [ ... list of preferred sources ... ],
   convert_dvi_pdf => [ ... command prefix, filename.dvi is appended ... ],
   convert_pdf_ps => sub { called in chdir, given filename.pdf as arg, must produce filename.ps },
   ...

tex2image

Given a string of LaTeX code, returns an image file as a "string". The following options may be provided after the LaTeX string. Also, all options available to compile_latex are accepted in this function.

file

Save output to the indicated file instead of returning the image as a string.

type

Specify the save file type. This should be a standard "file extension" for the desired output (e.g., "gif" or "png"). The default output is an EPS file. (The ImageMagick command "convert" must be available on your system for this option to succeed.)

header

A header string placed between \documentclass{article} and \begin{document}. Only useful if input tex code does not include \begin{document} or \documentclass{article}.

Note: \usepackage{color} and \pagestyle{empty} are always included if either \begin{document} or \documentclass{article} is missing in the provided LaTeX string.

convert_args

Additional arguments to pass to convert when making the image. By default this is ["-transparent", "white"].

X

Specify the X resolution (default 144)

Y

Specify the Y resolution (default 144)

color
pagecolor

Specify the color or page color. Each may be an RGB hex triplet ("#40036f", the "#" is required!), a LaTeX named color (red | green | blue | yellow | cyan | magenta | black | white and perhaps others depending on the DVI driver), a single number representing a gray value, an "r,g,b" triplet, or a "c,m,y,k" quadruple. All numbers are fractions between 0 and 1, inclusive. The default values are "black" and "white" respectively.

magic_convert !UNIMPLEMENTED

 magic_convert $file, %options
 magic_convert $old_file, $new_file, %options
 magic_convert \@files, %options
 magic_convert \@files, $dir, %options

Convert file types and resize images. Valid options are given below. Colors may be given as (X11) color names or RGB hex triplets.

format => $ext

Specify an output file format. This "option" is required for all invocation styles except for the second, where the output format will be guessed from the $new_file name if this option is not provided.

transparent => $color

If the target image type supports transparency, then the specified color will be made transparent during the conversion.

grow => $grow

A boolean value which, if true, indicates that the image should be enlarged in order to fit maximally into the specified resolution / size.

size => $WIDTHxHEIGHT or \@width_and_height

Specified either as a list of two elements or a string of the form "640x480", this option forces the image to fit within a box of the given size.

resolution => $value

For vector-based inputs the resolution will affect the resulting image size. Note that the "size" option will override this option under most circumstances.

intent => "icon" | "thumbnail" | "web" | "email" | "screen" | "print" | "hires"

A fuzzy way to set the "resolution" and "size" options to reasonable (by current technology standards) sizes. The "icon" intent will aim for an image of size 128x128. The "thumbnail" intent will limit the image to a 250x250 box. The "web" and "email" intents assume 640x480 screens, while the "screen" intent assumes a 1024x768 screen. The "print" intent assumes a 5"x5" image at 300 DPI and the "hires" intent assumes 5"x5" at 600 DPI.

:LaTeX - LaTeX generating routines

quotetex

Like quotemeta, but makes strings LaTeX safe. Replaces all LaTeX special characters with replacements which will correctly compile in LaTeX.
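As an illustration of the idea, the sketch below escapes the ten characters that are special in ordinary LaTeX text using a single substitution pass. The replacement table is an assumption based on standard LaTeX conventions; the table used by quotetex itself is not documented here and may differ. quotetex_sketch is a hypothetical name.

```perl
use strict;
use warnings;

# Assumed replacement table (standard LaTeX escapes), for illustration only.
my %TEX = (
    '\\' => '\\textbackslash{}',
    '~'  => '\\textasciitilde{}',
    '^'  => '\\textasciicircum{}',
    map { ($_ => "\\$_") } '&', '%', '$', '#', '_', '{', '}',
);
my $class = join '', map { quotemeta($_) } keys %TEX;

sub quotetex_sketch {
    my ($text) = @_;
    # One pass, so backslashes introduced by a replacement are not re-escaped.
    $text =~ s/([$class])/$TEX{$1}/g;
    return $text;
}

print quotetex_sketch('50% of $x_1 & {y}'), "\n";   # 50\% of \$x\_1 \& \{y\}
```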

tree2tex

 tree2tex \%tree, %options

Convert arbitrarily nested HoH's to the LaTeX code which will produce a tree diagram,

 # This,
 { A => { b => 1, c => 1 }, B => { f => { e => 1, f => 1 }, g => 1 } }
 # Becomes code that produces this,
  A -+- b
     |
     +- c
  B -+- f -+- e
     |     |
     |     +- f
     +- g

The leaf nodes may point to any value which is not a reference. You will need to \usepackage{pstricks,pst-node} for the code produced by this subroutine to function properly. Accepted options,

column_spacing

A LaTeX measurement for the amount of spacing to use between each column. This amount is placed before and after each column (using the LaTeX \tabcolsep variable) so should be half of the actual desired column spacing. The default is "1.5em".

row_stretch

A multiplier for the row stretch. Used to set the LaTeX multiplier \arraystretch. The default is 1.

tabular_format

The tree is built using the tabular environment. This option sets the format for the tabular. If the format is a single character then it will be duplicated for each level of your tree. Otherwise, you will need to make sure that you include enough columns for your diagram (one column for each level of the tree). The default is "l".

node_label_start

The starting node label. Useful if you are using alphabetic node labels elsewhere in your document. The default node labeling is "A".."Z","AA",... using the perl magic incrementer. Meaningful values for node_label_start are all-caps words, all-lower-case words, or numbers.

sort

Boolean value dictating whether we should sort the key values. The default is to sort tree nodes. Set this to false if you have a tied hash which will return keys in your desired order. Some modules which may help with this,

 Tie::Hash::Sorted                    - specify your own sort function
 Tie::IxHash or Tie::Hash::Indexed    - key order is insert order

use_leaf_values

If the leaves of your tree point to useful string values then you may specify use_leaf_values => 1 to have this subroutine use the leaf values as labels for the leaves rather than the leaf keys.

vertical

Boolean value which, if true, tells the subroutine to transpose the resulting tree. This has the effect of putting the root nodes across the top rather than down the left side.

Note: This still needs some work. In particular, when the matrix is transposed, the labels are not centered above their children.

nc

Node connection type. May be any LaTeX node connection type. Currently must be one of: line, Line, curve, arc, bar, diag, diagg, angle, angles, loop, circle. The default is "angles".

node_sep

A LaTeX measurement for the amount of spacing to place around each node. The default is "1ex".

:html - HTML utilities

uri

 my $uri = uri( $base, @path_components, \%query_params );

All arguments optional. The first argument ($base) is treated specially only in that a leading "/" will be respected. Any path components may include query parameters and/or an anchor. Query parameters will be merged (order is preserved). The last anchor defined takes precedence. Any hash references will be URI escaped and appended as query parameters. You may include as many hash refs as you like and they may appear anywhere in the argument list.

xml_attr

 xml_attr( %attr )
 xml_attr( @kv )

Creates XML attribute string. Encodes keys and values. Attributes will be included as long as their values are defined, even if empty. Attribute names must be both defined and non-empty.

If any attribute names are passed as SCALAR references, then their values will be interpreted as boolean values controlling whether the valueless attribute name should be included.

 xml_attr( class => "foo", \"bar" => 1 );             #  'class="foo" bar'
 xml_attr( \'class="foo"' => 1, \'bar="baz"' => 0 );  #  'class="foo"'
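The rules above can be sketched in a few lines. xml_attr_sketch and its enc helper are hypothetical names; the helper stands in for xml_encode, and joining attributes with single spaces (with no leading space) is an assumption taken from the two examples.

```perl
use strict;
use warnings;

my %ENT = ('<' => '&lt;', '>' => '&gt;', '&' => '&amp;', '"' => '&quot;');

# Stand-in for xml_encode (assumed behaviour: encode < > & " only).
sub enc { my ($s) = @_; $s =~ s/([<>&"])/$ENT{$1}/g; return $s }

# Hypothetical re-implementation, for illustration only.
sub xml_attr_sketch {
    my @kv = @_;
    my @out;
    while (@kv) {
        my ($name, $value) = splice @kv, 0, 2;
        if (ref $name eq 'SCALAR') {
            push @out, $$name if $value;      # value is a boolean switch
        }
        elsif (defined $name and length $name and defined $value) {
            push @out, enc($name) . '="' . enc($value) . '"';
        }
    }
    return join ' ', @out;
}

print xml_attr_sketch(title => 'a "b" & c', skip => undef, empty => ""), "\n";
# prints: title="a &quot;b&quot; &amp; c" empty=""
```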

xml_tag

 xml_tag( $name )
 xml_tag( $name => %attr )
 xml_tag( $name => $content )
 xml_tag( $name => \$content )
 xml_tag( $name => $content, %attr )
 xml_tag( $name => \$content, %attr )

Creates XML tag with content and attributes. Name is assumed to not contain XML special characters (><&"), but all other values will be xml encoded except perhaps the content, which will pass through unchanged if a scalar reference is passed.

xml_btag

 xml_btag( $name )
 xml_btag( $name => %attr )

Creates beginning XML tag with attributes. Name is assumed to not contain XML special characters (><&"), but all other values will be xml encoded.

xml_encode

Minimal encoding of XML entities. Behaves like encode_entities($input, '<>&"').
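A sketch of what "minimal" means here: only the four characters listed are replaced, in one pass, so an ampersand that is already part of an entity is encoded again. xml_encode_sketch is a hypothetical name and the code is an illustration rather than the module's source.

```perl
use strict;
use warnings;

my %XML_ENTITY = (
    '<' => '&lt;',
    '>' => '&gt;',
    '&' => '&amp;',
    '"' => '&quot;',
);

# Hypothetical re-implementation, for illustration only.
sub xml_encode_sketch {
    my ($text) = @_;
    $text =~ s/([<>&"])/$XML_ENTITY{$1}/g;
    return $text;
}

print xml_encode_sketch('<a href="x">Tom & Jerry</a>'), "\n";
# prints: &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&lt;/a&gt;
```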

js_toggle

 js_toggle [ label1 => id1, label2 => id2, ... ], %options;
 js_toggle [ label1 => [id1a, id1b, ...], label2 => id2, ... ], %options;
 js_toggle [ [displayed_label1, hidden_label1], [id1a, id1b, ...], ... ], %options;

Constructs a list of HTML snippets that can be placed in a document to switch the indicated ids on and off. Ids may be associated with multiple labels.

The options supported are given below.

type => "radio" | "toggle"

Toggle buttons simply show and hide the corresponding IDs. Radio buttons always show the associated ids and hide all other ids. The default behavior is "toggle".

id_prefix => $prefix

Each label will be wrapped in a <span id="ID"> tag. The "ID" must be unique for each page. To ensure this, the function appends an integer to the "id_prefix" which is incremented as necessary (the incremented value is remembered between calls to js_toggle). The default prefix is "JST", but can be changed using this option.

reset_counter => $bool

If true, the counter used to ensure that ids are unique will be reset to zero at the end of the function call. This can be helpful if you want to include style information for the generated labels in a style sheet (though you could also wrap your label in a <span> tag before passing it to this function). The default is to not reset the counter.

visibility => $visibility

Indicates the initial visibility state of the items (only relevant if "hidden" labels are provided). $visibility may be a label or list of labels listing those labels which will be displayed when the page loads (you will need to manage the page styles to ensure this). Alternatively, $visibility may be "1", "0", or a list of "1"'s and "0"'s indicating which items (by position) will be visible. Specifying just "1" or "0" indicates that all or none of the objects will be initially visible. The default is to assume that the first item is visible and that all others are hidden.

#XXX: Ugh, this is confusingly worded! (and wrong!)

display => \@displaystyles

A list of display styles to be used for making objects visible. This will typically be "block" or "inline", but CSS 2 allows lots of things (list-item, table, ...). Any display styles left undefined will default to "block". The size of the displaystyles list should correspond to the length of the concatenated ids without removing duplicates. For example:

 js_toggle [ foo => ["A", "B"], bar => ["B", "C"], baz => "D" ],
   display => [qw/  block inline       block block       table/];

Thus, display styles may depend on the label the object is currently associated with.

use_functions => $bool

*** NOT IMPLEMENTED ***

The variable $_Util::js_toggle_functions includes a function "js_toggle_display" which can be used by this subroutine to decrease the amount of inline javascript. This can reduce bandwidth by quite a bit if this code is placed in an external file, or by a little bit if placed in the page <head> in a <script> block. If this option is set to a true value, then it will be assumed that the function "js_toggle_display" is available and it will be used. You will need to ensure that the code in $_Util::js_toggle_functions is inserted into the web page in the appropriate fashion.

libxml_doc

 libxml_doc( $thing )
 libxml_doc( $thing, $parser )
 libxml_doc( $thing, type => '<TYPE>' )
 libxml_doc( $thing, $parser, type => '<TYPE>' )

Construct an XML::LibXML HTML document object easily and quietly. $thing can be a filename (or something that stringifies to a filename), a URL, or actual HTML. Alternatively, you can be specific and specify one of the three types: 'FILE', 'URL', or 'HTML'. The real benefit of this subroutine, though, is that all XML::LibXML error messages are discarded.

XML::LibXML and LWP::Simple will be automatically loaded if necessary.

:sql - Database manipulation routines

sql_hash_multi

Use DBIx::Simple: map_pairs { push @{$hash{$a}}, $b } $db->query($sql, @stuff)->flat;

 sql_hash_multi( $dbh, $sql, \@stuff, %options );

Prepares and executes a database request for database pairs returned by the query. Query should produce exactly two values per row. Return value is the hashref constructed from the row pairs with first column as keys and arrays of the second column as values. Hash values will be arrays even if only one result is returned for the corresponding key. @stuff is an optional list of placeholder substitutions (note: not optional when using the closure form).
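The shape of the return value can be illustrated without a database. The sketch below folds two-column rows (arrayrefs, as DBI's fetchall_arrayref would return them) into a hash of arrays; pairs_to_multi_hash is a hypothetical helper and the rows are made-up sample data.

```perl
use strict;
use warnings;

# Hypothetical helper: fold two-column rows into a hash of arrays.
sub pairs_to_multi_hash {
    my ($rows) = @_;
    my %hash;
    push @{ $hash{ $_->[0] } }, $_->[1] for @$rows;
    return \%hash;
}

my $hash = pairs_to_multi_hash([
    [ fruit => "apple" ], [ fruit => "pear" ], [ veg => "kale" ],
]);
# $hash is { fruit => ["apple", "pear"], veg => ["kale"] }
print "$_: @{ $hash->{$_} }\n" for sort keys %$hash;
```

Note that "veg" maps to a one-element array rather than a plain string, matching the statement above that values are always arrays.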

options:

closure

If true, a sub ref is returned instead that will execute and fetch values using a prepared statement handle. When calling the closure, an array of parameters must be provided (even if empty).

 my $get_hash = sql_hash_multi( $dbh, $sql, closure => 1 );
 $hash = $get_hash->([]);
 $hash = $get_hash->([]);
 $hash = $get_hash->([]);
 # clean up statement handle when finished
 $get_hash->();

sql_hash

Use DBIx::Simple: %hash = $db->query($sql, @stuff)->flat;

 sql_hash( $dbh, $sql, \@stuff, %options );

Prepares and executes a database request for database pairs returned by the query. Query should produce exactly two values per row. Return value is the hashref constructed for the row pairs. @stuff is an optional list of placeholder substitutions (note: not optional when using the closure form).

options:

closure

If true, a sub ref is returned instead that will execute and fetch values using a prepared statement handle. When calling the closure, an array of parameters must be provided (even if empty).

 my $get_hash = sql_hash( $dbh, $sql, closure => 1 );
 $hash = $get_hash->([]);
 $hash = $get_hash->([]);
 $hash = $get_hash->([]);
 # clean up statement handle when finished
 $get_hash->();

sql_col

Use DBIx::Simple: @col = $db->query($sql, @stuff)->flat;

 sql_col( $dbh, $sql, \@stuff, %options );

Prepares and executes a database request for an entire table column returned by the query. Return value is an arrayref of zero or more values in the column. @stuff is an optional list of placeholder substitutions (note: not optional when using the closure form).

options:

closure

If true, a sub ref is returned instead that will execute and fetch values using a prepared statement handle. When calling the closure, an array of parameters must be provided (even if empty).

 my $get_col = sql_col( $dbh, $sql, closure => 1 );
 $col = $get_col->([]);
 $col = $get_col->([]);
 $col = $get_col->([]);
 # clean up statement handle when finished
 $get_col->();

sql_all

Use DBIx::Simple: @hashes = $db->query($sql, @stuff)->hashes;

 sql_all( $dbh, $sql, \@stuff, %options );

Prepares and executes a database request for all table rows returned by the query. Return value is an arrayref of zero or more hashrefs describing the result. @stuff is an optional list of placeholder substitutions (note: not optional when using the closure form).

options:

closure

If true, a sub ref is returned instead that will execute and fetch values using a prepared statement handle. When calling the closure, an array of parameters must be provided (even if empty).

 my $get_em = sql_all( $dbh, $sql, closure => 1 );
 $rows = $get_em->([]);
 $rows = $get_em->([]);
 $rows = $get_em->([]);
 # clean up statement handle when finished
 $get_em->();

name

Used to set name canonicalization option in DBI fetcher. May be "lc" or "uc". Anything else returns keys in database case (depends on database).

sql_one

Use DBIx::Simple: $result = $db->query($sql, @stuff); while ($row = $result->hash) { ... }

 sql_one( $dbh, $sql, \@stuff, %options );

Prepares and executes a database request for a single table row. Return value is a unique hashref describing the result row or undef if no results were found. @stuff is an optional list of placeholder substitutions (note: not optional when using the closure form).

options:

closure

If true, a sub ref is returned instead that will execute and fetch values using a prepared statement handle. When calling the closure, an array of parameters must be provided (even if empty).

 my $get_one = sql_one( $dbh, $sql, closure => 1 );
 $row = $get_one->([]);
 $row = $get_one->([]);
 $row = $get_one->([]);
 # clean up statement handle when finished
 $get_one->();

name

Used to set name canonicalization option in DBI fetcher. May be "lc" or "uc". Anything else returns keys in database case (depends on database).

sql_value

Use DBIx::Simple: ($value) = $db->query($sql, @stuff)->list;

 sql_value( $dbh, $sql, \@stuff, %options );

Prepares and executes a database request for a single value. Return value is the requested value. Sub returns an empty list (that is, no value) if no rows matched the query. @stuff is an optional list of placeholder substitutions (note: not optional when using the closure form). If more than one column is returned by the query, an array ref of the row (will be a copy of the DBI arrayref) will be returned.

options:

closure

If true, a sub ref is returned instead that will execute and fetch values using a prepared statement handle. When calling the closure, an array of parameters must be provided (even if empty).

 my $get_val = sql_value( $dbh, $sql, closure => 1 );
 $value = $get_val->([]);
 $value = $get_val->([]);
 $value = $get_val->([]);
 # clean up statement handle when finished (not really necessary here)
 $get_val->();

sql_insert

Use DBIx::Simple: $db->insert($table, \%stuff);

 sql_insert( $dbh, $table, \%stuff, %options );

Prepares and executes a table insert. Return value is the result of the statement execution.

options:

closure

If true, a sub ref is returned instead that will perform inserts using a prepared statement handle. The data in the initial call will not be inserted so can simply be a "template" hash showing which columns will be needed.

 my $my_insert =  sql_insert( $dbh, $table, \%stuff, closure => 1 );
 $my_insert->( \%stuff );
 $my_insert->( \%stuff1 );
 $my_insert->( \%stuff2 );
 # clean up statement handle when finished
 $my_insert->();

on_conflict

Algorithm to deal with conflicts. See: http://www.sqlite.org/lang_conflict.html

Support Matrix (partial support):

          ROLLBACK   ABORT   FAIL   IGNORE   REPLACE
 Pg                   Yes            Yes       Yes
 SQLite     Yes       Yes    Yes     Yes       Yes

primary_key

Possibly needed for on_conflict => REPLACE by some drivers. Is scalar or arrayref of columns to use as primary key. If not provided, the DBI method primary_key will be used to try to determine the primary key columns. This option can be used to override or to avoid computation of the auto-determined values.

:postscript - PostScript generating routines

psplot_sub

This subroutine takes a subroutine reference followed by some or all of the following options. In scalar context the postscript code which draws the graph is returned. In list context, a hash reference which contains the actual option values used is also returned. This can be used to position other information around the graph.

at

Relative translation (an array ref, [dx, dy], specifying bottom left corner). Default: [0,0]

color

RGB triplet in percentages (0 <= percent <= 1). Default: [0,0,0]

intervals

Number of intervals to cut the region into. Default: 100

xscale

Length in points of a unit vector on the x-axis. Default: 1

xmin

Minimal x value. Default: -10

xmax

Maximal x value (will be set from width/xscale if not defined). Default: undef

yscale

Length in points of a unit vector on the y-axis (will be set from height/ymin/ymax if a height is provided). Default: 1

ymin

Minimal y value (will DWIM if not defined. Will chop the graph if set too high). Default: undef

ymax

Maximal y value (will DWIM if not defined. Will chop the graph if set too low). Default: undef

width

Width of the graph in points (72 points = 1 inch). Default: undef

height

Height of the graph in points (72 points = 1 inch). Default: undef

To create a postscript document, you will need a header something like the following:

 print <<HEADER;
 %!PS-Adobe-2.0
 %%Creator: $ENV{USER}
 %%Title: Raster Plot
 %%BoundingBox: -10 -10 500 500
 %%Magnification: 1.0000
 %%EndComments
 HEADER
 print psplot_sub( ... );
 print "\nshowpage\n";

psplot_parametric_sub

This subroutine takes a subroutine reference followed by some or all of the following options. The provided subroutine should return a list of two values for a given input. In scalar context the postscript code which draws the graph is returned. In list context, a hash reference which contains the actual option values used is also returned. This can be used to position other information around the graph.

at

Relative translation (an array ref, [dx, dy], specifying bottom left corner). Default: [0,0]

color

RGB triplet in percentages (0 <= percent <= 1). Default: [0,0,0]

intervals

Number of intervals to cut the t-interval into. Default: 100

tmin

Minimum t value. Default: 0

tmax

Maximal t value. Default: 10

xscale

Length in points of a unit vector on the x-axis (will be set from width/xmin/xmax if a width is provided). Default: 1

xmin

Minimal x value (will DWIM if not defined. Will chop the graph if set too high). Default: undef

xmax

Maximal x value (will DWIM if not defined. Will chop the graph if set too low). Default: undef

yscale

Length in points of a unit vector on the y-axis (will be set from height/ymin/ymax if a height is provided). Default: 1

ymin

Minimal y value (will DWIM if not defined. Will chop the graph if set too high). Default: undef

ymax

Maximal y value (will DWIM if not defined. Will chop the graph if set too low). Default: undef

width

Width of the graph in points (72 points = 1 inch). Default: undef

height

Height of the graph in points (72 points = 1 inch). Default: undef

To create a postscript document, you will need a header something like the following:

 print <<HEADER;
 %!PS-Adobe-2.0
 %%Creator: $ENV{USER}
 %%Title: Raster Plot
 %%BoundingBox: 0 0 500 500
 %%Magnification: 1.0000
 %%EndComments
 HEADER
 print psplot_parametric_sub( ... );
 print "\nshowpage\n";

epsplot_linear_forms

 epsplot_linear_forms \@linear_forms
 epsplot_linear_forms $file, \@linear_forms
 epsplot_linear_forms $file, \@linear_forms, %options

Create a 1" x 1" eps file of the given linear forms. The given matrix should be in general position if you want that, otherwise any elements of the form [0,0,*] will cause a "round box" to be drawn around the lines that represents a line at infinity. If a file name or file handle is provided, then the resulting image will be written to the file, otherwise the image code will be returned.

Any subset of the following options may be provided. If the subroutine is called in list context then the return value will be a hash of the actual option values used together with a plot entry that contains the actual postscript code.

center_origin => boolean

Boolean value indicating whether the center of the generated eps document must correspond to the origin (0,0). (DEFAULT: 0)

force_square => boolean

Boolean value indicating whether the x and y scales must be the same. (DEFAULT: 0)

x_min => number
x_max => number
y_min => number
y_max => number

Numbers giving the corresponding values. If these values are not provided, then they will be computed to ensure that all pair-wise intersection points fit within the final view. You can get (for instance) all positive intersections by only specifying x_min => 0 and y_min => 0.

line_width => number

Width of the lines drawn (in points, 72 pt = 1 in). (DEFAULT: 1.5)

in_sep => multiple

Multiples (possibly non-integral) of the line_width to separate the arrangement graph from the line at infinity since intersecting the line at infinity is bad chi. (DEFAULT: 3)

x_pad => percentage
y_pad => percentage

If this subroutine computes any of the *_min/max values, then this will cause intersections to occur on the edge of the graph which is ugly. The x and y paddings are percentages of the graph that should be dedicated to the space beyond the intersections on each side. (DEFAULT: .15)

precision => integer or undef

Number of places after the decimal to use when comparing slopes of lines. Due to machine precision, determining whether two lines are parallel is tricky. Thus, by default this subroutine throws away some precision in the computation of the bounding box. The down side is that we may sometimes miss the intersection of two lines if they occur "far away". Setting this value to undef will use the standard perl == test for numbers. Setting this value to an integer will use a string equality test rounding to the given number of places after the decimal. (DEFAULT: 7)
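
The two comparison modes described for precision can be sketched as follows. This is only an illustration of the idea (numeric == versus string equality after rounding); slopes_equal is a hypothetical helper, not a function exported by Dean::Util, and the module's actual code may differ.

```perl
use strict;
use warnings;

# Hypothetical helper: compare two slopes either with == (precision undef)
# or as strings rounded to $precision places after the decimal.
sub slopes_equal {
    my ($m1, $m2, $precision) = @_;
    return $m1 == $m2 unless defined $precision;
    return sprintf("%.${precision}f", $m1) eq sprintf("%.${precision}f", $m2);
}

# Two slopes that differ only beyond the seventh decimal place.
print slopes_equal(1/3, 0.33333333, 7) ? "parallel\n" : "not parallel\n";
```

The rounding trades accuracy for robustness: nearly parallel lines are treated as parallel, which is why a far-away intersection can be missed.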

:untaint - Untainting

See also Tests and Patterns

untaint

BAD PROGRAMMER, do not use!

untaint_int

Strict int untainter/converter. Must match /^[-+]?[0-9]+$/

untaint_num

Strict numeric untainter/converter. Must match /^[-+]?([0-9]+\.?|[0-9]*\.[0-9]+)$/
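
A minimal sketch of how untainters built on the two patterns above could look. The sketch_* names are hypothetical, and returning undef on a failed match is an assumption; only the patterns themselves come from this documentation.

```perl
use strict;
use warnings;

# Hypothetical re-implementation: capture through the documented pattern
# (a regex capture is how Perl untaints data) and convert with "+ 0".
sub sketch_untaint_int {
    my ($value) = @_;
    return undef unless defined $value;
    return $value =~ /^([-+]?[0-9]+)$/ ? $1 + 0 : undef;
}

sub sketch_untaint_num {
    my ($value) = @_;
    return undef unless defined $value;
    return $value =~ /^([-+]?(?:[0-9]+\.?|[0-9]*\.[0-9]+))$/ ? $1 + 0 : undef;
}

my $count = sketch_untaint_int("+42");    # numeric 42
my $bad   = sketch_untaint_num("1e3");    # undef: exponents are not matched
```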

untaint_file

Returns untainted string if argument could be passed unquoted to a bash shell (for instance, spaces are not allowed). If argument is undefined or contains any illegal characters at all, undef is returned. See also: canonicalize_filename.

Note: Is currently more restrictive than necessary. This will improve over time.

:logging - Simple logging utilities

info

 info( $level, [\$info_level], [\$ofstream], @text )

Print the information @text to $ofstream if $info_level is greater than or equal to $level. Returns 1 if message was printed and 0 if it was not. $info_level defaults to $CALLER::INFO_LEVEL and $ofstream defaults to $CALLER::LOG if it is a GLOB or IO::* object. A newline will be appended to the last string of @text if it is not already present.

The default INFO_LEVEL is 0. The default LOG is STDERR.

NOTE: $INFO_LEVEL and $LOG must be package variables (declared with our or use vars) for this function to work correctly.
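
The level test and newline handling described above can be sketched like this. It is a simplified stand-in, not the module's code: sketch_info is a hypothetical name, it always writes to STDERR, and it reads $INFO_LEVEL from its own package rather than from the caller's.

```perl
use strict;
use warnings;

our $INFO_LEVEL = 0;    # package variable, as the NOTE above requires

# Hypothetical simplified info(): print only when $INFO_LEVEL >= $level.
sub sketch_info {
    my ($level, @text) = @_;
    return 0 unless $INFO_LEVEL >= $level;
    my $message = join '', @text;
    $message .= "\n" unless $message =~ /\n\z/;    # append missing newline
    print STDERR $message;
    return 1;
}

$INFO_LEVEL = 2;
sketch_info(1, "shown: level 1 <= 2");
sketch_info(3, "suppressed: level 3 > 2");
```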

DEBUG

 DEBUG( @text );

Calls info with a level of 0. Also prefixes each line of text with "DEBUG: ".

INFO

 INFO( @text );

Calls info with a level of 1. Also prefixes each line of text with "INFO: ".

NOTICE

 NOTICE( @text );

Calls info with a level of 2. Also prefixes each line of text with "NOTICE: ".

WARNING

 WARNING( @text );

Calls info with a level of 3. Also prefixes each line of text with "WARNING: ".

ERR

 ERR( @text );

Calls info with a level of 4. Also prefixes each line of text with "ERR: ".

ERROR

 ERROR( @text );

Calls info with a level of 4. Also prefixes each line of text with "ERROR: ". (is an alias for ERR())

CRIT

 CRIT( @text );

Calls info with a level of 5. Also prefixes each line of text with "CRIT: ".

ALERT

 ALERT( @text );

Calls info with a level of 6. Also prefixes each line of text with "ALERT: ".

EMERG

 EMERG( @text );

Calls info with a level of 7. Also prefixes each line of text with "EMERG: ".

:system - System / sysadmin tools

pidof

 my @progs = pidof $program;
 my @progs = pidof %opts;

Searches /proc for running programs matching the given name or options. Any options will match against the corresponding value via smartmatch EXCEPT for the pid option which must be an exact PID or array of PIDs.

program

Name of command (excludes path part).

command

Command - includes path if used in execution of program which makes this a bit unreliable if the command is started from a command prompt.

cmdline

Contents of /proc/$pid/cmdline, namely the command and command line arguments joined by NULL characters. Programs without command line arguments will immediately fail.

args

Matched against array of just the command line arguments $VALUE ~~ @args. Programs without command line arguments will immediately fail.

pid

PID or array of PIDs to examine.

user

User name

uid

User id

group

Group name

gid

Group id

do_as

 do_as "username", sub { ... };
 do_as "username:groupname", sub { ... };

Locally change the effective user id and execute some code. Only works if current user is root!

Ensures that $ENV{USER} and $ENV{HOME} are set appropriately. Will eventually include options which will attempt to setup DISPLAY, DBUS, XAUTH, SSH_AGENT, GPG_AGENT, and other variables useful for running and connecting to existing X sessions, apps, and daemons of the user.

:op - Core function extensions

pmap(&@)

Parallel map. Applies function to each item in input list. Evaluation order is not defined, however, result array will be ordered as if the map were performed sequentially. Function is called in list context and may produce any list of items serializable by Storable.

 # Quickly convert a bunch of images to png:
 pmap { my $old = $_; s/\.[^.]+$/.png/; system convert => $old => $_ } @images;
 # Result order matches input order
 use Time::HiRes qw/ sleep /;
 say join " ", pmap { sleep(my $sleep = rand); say "$_: Hello ($sleep)"; $_ } 0..9;

$_ is set to each value in turn, though note that $_ will be a copy, not an alias. Therefore modifications to $_ will not be preserved as they are when using the normal map.

Overhead is reasonably small, but there is little reason to use this function if your tasks finish quickly. Rough "worst case" benchmarks (on Linux):

 $_Util::pmap::threads = 2;
 pmap { say "Hello" } 1..1;          # 0.028559 seconds
 pmap { say "Hello" } 1..1_000;      # 0.027256 seconds
 pmap { say "Hello" } 1..10_000;     # 0.067582 seconds
 pmap { say "Hello" } 1..100_000;    # 0.556916 seconds
 say "Hello" for 1..100_000;         # 0.011928 seconds
 @x = pmap { $_ + 1 } 1..1;          # 0.032267 seconds
 @x = pmap { $_ + 1 } 1..1_000;      # 0.030821 seconds
 @x = pmap { $_ + 1 } 1..10_000;     # 0.098850 seconds
 @x = pmap { $_ + 1 } 1..100_000;    # 0.660198 seconds
 @x = map  { $_ + 1 } 1..100_000;    # 0.024077 seconds

Optimizations:

Configuration:

The following variables can be used to control the thread objects used. Their default values are shown.

 %_Util::pmap::t_opts  = (stack_size => 16*4096);
 $_Util::pmap::threads = sub { ... };

$_Util::pmap::threads should be a number or a sub which tries to determine the number of CPUs on the system. The default sub should work for Linux, Windows, and BSD|Darwin and will fall back to "2" if it can not determine the proper number of CPUs. This sub will be called at most once as its value will be cached upon the first call of pmap. Setting the NUMBER_OF_PROCESSORS environment variable is probably the easiest way to control the number of threads used.

This sub uses only: threads and Storable (core modules since perl 5.008)

pgrep(&@)

Parallel grep. Returns items in input list for which the function returns true. Evaluation order is not defined, however, result array will be ordered as if the grep were performed sequentially.

 # order is preserved in @good
 my @good = pgrep { expensive_test($_) } @data;

subopts

 my %opt = subopts( \@_, OPTIONS )

"Parse" subroutine options into an options hash. Handles mixtures of positional and named parameters, default values, parameter validation, and other features. Parsing of the argument array is controlled by the following options:

positional => A

 my %opt = subopts( [1,2,3], positional => [qw/foo bar baz/] );
 # %opt = (foo => 1, bar => 2, baz => 3)

List of positional argument names.

p6_positional => A

Like positional, but processing stops at the first "known" key value. This allows for Perl 6-like flexi-parameters. This, however, is somewhat dangerous if data may match key names. For example:

 # Uh-oh: parses as ( date => "Jan 1", late => 0 )
 my %opt = subopts( ["Jan 1", "late"],
                    p6_positional => [qw/date note/],
                    allowed => ["late"],
                    validate => { late => "bool" }
                  );

defaults => Ho*

Default values. Values may be arbitrary objects. Subroutine values may be expanded if 'eval_defaults' option is provided.

required => A

Any required parameters must not be undefined. Defaults are processed before requiredness is considered so any required parameter with a valid default will never cause an error.

validate => Ho*

Hash of validators. Keys are parameter names, values are validators.

 Sub validators:
   arguments: $value, \%params_so_far
   return: BOOL | SCALAR_REF
 Regexp validators:
   not automatically anchored, be sure to anchor your patterns if you want that

untaint => BOOL | HoBOOL

If true, any parameters satisfying their corresponding validator will be untainted. Parameters without a validator will not be untainted.

allowed => A

Key names (in addition to 'required' key names) which may appear in the options.

sloppy_known => BOOL

By default, 'defaults' and 'validate' hashes are ignored when considering 'allowed' option keys (so that the same default and validate hashes may be used for multiple subs). Specifying this option will include their keys in the list of 'allowed' keys.

no_dups => BOOL

By default,

 %opt = subopts( [ foo => 1, foo => 2 ] )

Will set $opt{foo} = 2. If 'no_dups' is true, this sub will throw an error at any duplicated key names.

eval_defaults => BOOL

If true, any default values which are sub refs will be executed and their return values used. Useful if default value is expensive to compute. Default subs are called with two parameters: the key name and the current options hash. The options hash WILL contain ALL user-set parameters but there are no guarantees about the order in which the defaults are expanded.
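
To make the interaction of positional, defaults, and required concrete, here is a minimal sketch. It is not the subopts implementation: sketch_subopts is a hypothetical name, it supports only these three options, and the processing order (positional values, then named pairs, then defaults, then the required check) is an assumption based on the descriptions above.

```perl
use strict;
use warnings;

# Hypothetical, heavily reduced version of subopts().
sub sketch_subopts {
    my ($args, %spec) = @_;
    my @values = @$args;
    my %opt;

    # Positional names consume leading arguments, one name per value.
    for my $name (@{ $spec{positional} || [] }) {
        last unless @values;
        $opt{$name} = shift @values;
    }

    # Whatever remains is treated as key => value pairs.
    %opt = (%opt, @values);

    # Defaults fill in anything that is still undefined.
    my $defaults = $spec{defaults} || {};
    for my $key (keys %$defaults) {
        $opt{$key} = $defaults->{$key} unless defined $opt{$key};
    }

    # Required parameters are checked after defaults have been applied.
    for my $key (@{ $spec{required} || [] }) {
        die "Missing required parameter '$key'\n" unless defined $opt{$key};
    }
    return %opt;
}

my %opt = sketch_subopts(
    [ 1, 2, baz => 3 ],
    positional => [qw/foo bar/],
    defaults   => { qux => 9 },
    required   => ['foo'],
);
# %opt now holds foo => 1, bar => 2, baz => 3, qux => 9
```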

modtime

Stupid! use Path::Class::File->stat->mtime

Computes the epoch time when the file was last modified.

 print "File last modified: " . localtime( modtime($f) )

SPLIT

Split an expression on a pattern ignoring split patterns within delimited text.

 SPLIT PATTERN, EXPR, LIMIT
 SPLIT PATTERN, EXPR
 SPLIT PATTERN
 SPLIT

Split PATTERN may be a string literal, qr// regular expression, or hashref containing splitting options. Beware, unlike Perl's split builtin, this function does not currently support captures in PATTERN. This may be fixed at some point in the future.

If EXPR is missing, a splitting subroutine is generated and returned.

 my $splitter = SPLIT;
 my $splitter = SPLIT qr/\s*,\s*/;
 my $splitter = SPLIT \%options;
 my @pieces = $splitter->( $text );
 my @pieces = $splitter->( $text, $limit );
 my @pieces = SPLIT qr/\s*,\s*/, $text;
 my @pieces = SPLIT \%options, $text;
 my @pieces = SPLIT \%options, $text, $limit;

The following options are recognized in the options hash:

on => PATTERN

String literal or qr// pattern, see above.

delimiters => q|"'`|

Delimiters passed to the Text::Balanced::gen_delimited_pat subroutine.

escape => '\'

Escape characters passed as the second argument to the Text::Balanced::gen_delimited_pat subroutine.
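
The idea of splitting while skipping delimited text can be sketched with Text::Balanced::gen_delimited_pat, which the options above mention. This is an illustrative scanner, not the SPLIT implementation: split_outside_quotes is a hypothetical name, it assumes the separator never matches the empty string, it has no LIMIT support, and it keeps the quote characters in the returned pieces.

```perl
use strict;
use warnings;
use Text::Balanced qw(gen_delimited_pat);

# Hypothetical sketch: walk the string left to right, copying quoted runs
# verbatim and cutting a new piece whenever the separator matches.
sub split_outside_quotes {
    my ($sep, $text) = @_;
    my $quoted = gen_delimited_pat(q{"'`});    # default escape character is '\'
    my @pieces;
    my $piece = '';
    pos($text) = 0;
    while (pos($text) < length $text) {
        if    ($text =~ /\G($quoted)/gc) { $piece .= $1 }
        elsif ($text =~ /\G$sep/gc)      { push @pieces, $piece; $piece = '' }
        elsif ($text =~ /\G(.)/gcs)      { $piece .= $1 }
    }
    push @pieces, $piece;
    return @pieces;
}

my @pieces = split_outside_quotes(qr/\s*,\s*/, q{a, "b, c", d});
# three pieces; the comma inside the double quotes does not split
```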

FORK

 FORK { Child Code }
 FORK \&child, \&parent, \&error
 FORK { Child Code } %options
 &FORK(%options)

Returns child process id on success and nothing (undef) on failure unless parent action is defined.

options:

parent => CODE || 'exit' || $exec_string || \@exec_command
child => CODE || $exec_string || \@exec_command
error => CODE || 'text to die by'
ignore => BOOLEAN

BUGS: "ignore" option doesn't work (local SIGCHLD is useless).

Not sure how to fix it. Want:

* no zombies
* children to be killed when parent exits
* to be able to fork from subs without globals (thus, open "|-" probably bad unless we keep them in a package var)

gzdo

Works just like the builtin do command, but reads the file using the PerlIO gzip layer. Just like the builtin command, this function will search @INC and update %INC. However, in addition this function will also attempt to append a ".gz" extension and will read that file if it exists (or exists in @INC). The following package variables modify the behavior of this subroutine:

$_Util::gzdo::gzip_layer_options

Defaults to "(autopop)", this string is appended to the open MODE. Set to the empty string to disable automatic file type checking.

$_Util::gzdo::gz_extension

Defaults to ".gz", setting this string affects the gzip extension that will be appended to the file name if necessary. Set to a false value to disable.

hpush(\%@)

 hpush %hash, key1 => $value, key2 => $value2, ...

Add pairs to an existing hash. Keys already existing in %hash are overwritten.

hdefaults(\%@)

 hdefaults %hash, key1 => $value, key2 => $value2, ...

Add pairs to an existing hash. Keys already defined in %hash are preserved.
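
The difference between hpush and hdefaults can be sketched as below. The sketch_* subs are hypothetical stand-ins using the documented (\%@) prototype; testing with defined (rather than exists) in the defaults variant is an assumption taken from the wording "already defined".

```perl
use strict;
use warnings;

# Hypothetical stand-in for hpush: new values always win.
sub sketch_hpush(\%@) {
    my ($hash, %pairs) = @_;
    @{$hash}{ keys %pairs } = values %pairs;
    return;
}

# Hypothetical stand-in for hdefaults: existing defined values win.
sub sketch_hdefaults(\%@) {
    my ($hash, %pairs) = @_;
    for my $key (keys %pairs) {
        $hash->{$key} = $pairs{$key} unless defined $hash->{$key};
    }
    return;
}

my %config = (host => 'localhost');
sketch_hpush     %config, host => 'db1', port => 5432;    # host is overwritten
sketch_hdefaults %config, host => 'db2', user => 'dean';  # host is preserved
```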

subhash

 my %shallow = subhash( \%orig, @keys );

Extract keys from a hash. Similar to:

 @shallow{@keys} = @orig{@keys};

But does not auto-vivify when key does not exist in %orig (and does not create key in %shallow).
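
A sketch of the behaviour just described, using exists so that missing keys are neither read nor created. sketch_subhash is a hypothetical name, not the module's code.

```perl
use strict;
use warnings;

# Hypothetical stand-in for subhash: copy only the keys that exist.
sub sketch_subhash {
    my ($orig, @keys) = @_;
    return map { exists $orig->{$_} ? ($_ => $orig->{$_}) : () } @keys;
}

my %orig    = (a => 1, b => 2, c => 3);
my %shallow = sketch_subhash(\%orig, qw/a c missing/);
# %shallow has keys a and c only; "missing" appears in neither hash
```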

parse_date

 my $dt  = parse_date( $string, %opt );
 my $dt2 = parse_date( $dt1, %opt );

Parses a date and then converts it to a DateTime object.

If the input is already a DateTime object, it will be CLONED and returned.

Some date formats specifically guaranteed by this function:

 2006:08:28 20:56:25     # Stored in exif date fields by my camera

floating

If true, time zone information will not be included in the DateTime object.

clone

Defaults to true. When true, DateTime objects passed to this function will be cloned before being returned.

map_pairs(&@)

See also: List::MoreUtils pairwise

Applies a function passing two elements from the array at a time. That is, given a function &f and a list of inputs, x1, x2, ..., this function returns the list ( f(x1, x2), f(x3, x4), ... ).

The function may be a code block and may take either two arguments or use $a and $b as in perl's sort function. If there is an odd number of elements in the list, the last iteration will be called with undef as the second parameter ($b).

Example:

 @z = map_pairs { "$a: $b\n" } %hash;

Note: The return list will not be constructed if this function is called in void context. Therefore you are not a bad person if you do the following:

 map_pairs { print "$a: $b\n" } %hash;
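
One way such a function can expose $a and $b to the block is sketched below. sketch_map_pairs is a hypothetical name, and the sketch assumes the calling code lives in the same package as the sub (the real function presumably handles other packages too); the void-context optimisation mentioned above is not reproduced.

```perl
use strict;
use warnings;

# Hypothetical stand-in for map_pairs: take two elements at a time.
sub sketch_map_pairs(&@) {
    my $code = shift;
    my @out;
    while (@_) {
        # Assumption: block and sub share a package, so the localized
        # package variables $a and $b are visible inside the block.
        local ($a, $b) = splice @_, 0, 2;
        push @out, $code->($a, $b);
    }
    return @out;
}

my @pairs = sketch_map_pairs { "$a=$b" } qw/k1 v1 k2 v2/;
print join(", ", @pairs), "\n";
```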

map_pair(&&@)

Applies a pair of functions on a flat list of tuples. Given two functions, \&f and \&g, and a list of inputs, x1, x2, ..., this function returns the list ( f(x1), g(x2), f(x3), g(x4), ... ).

The first argument may be a code block. Either of the first two arguments may take a single argument or use $_. Some examples:

 sub f { shift()  + 2 }
 sub g { shift() ** 2 }
 @y = map_pair \&f, \&g, 1..20;     # as expected
 # NOT ALLOWED, second argument may not be code block
 @z = map_pair { $_ + 2 } { $_ ** 3 } 1..20;
 # Use this instead (note comma after second argument)
 @z = map_pair { $_ + 2 } sub { $_ ** 3 }, 1..20;
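
The alternating application can be sketched without prototypes as follows. sketch_map_pair is a hypothetical name; it takes two code refs explicitly, so the code-block calling syntax shown above is not reproduced.

```perl
use strict;
use warnings;

# Hypothetical stand-in for map_pair: f on even positions, g on odd ones.
# map aliases $_ to each element, so the subs may use either $_[0] or $_.
sub sketch_map_pair {
    my ($f, $g, @list) = @_;
    my $i = 0;
    return map { my $fn = $i++ % 2 ? $g : $f; $fn->($_) } @list;
}

my @y = sketch_map_pair(sub { $_[0] + 2 }, sub { $_ ** 2 }, 1 .. 4);
# @y is (1+2, 2**2, 3+2, 4**2)
```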

join_pair

 join_pair $a, $b, @x

Does an "alternating join". Returns the string "$x[0]$a$x[1]$b$x[2]$a...".

join_multi

 join_multi \@a, @x

Does an "alternating join". Returns the string "$x[0]$a[0]$x[1]$a[1]$x[2]...". The list @a may be cycled through multiple times if @a < @x.
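
Both alternating joins can be sketched with one helper that cycles through the separators. The sketch_* names are hypothetical and the code is an illustration of the documented output format, not the module's source.

```perl
use strict;
use warnings;

# Hypothetical stand-in for join_multi: cycle through @$seps between items.
sub sketch_join_multi {
    my ($seps, @x) = @_;
    return '' unless @x;
    my $out = shift @x;
    my $i   = 0;
    for my $item (@x) {
        $out .= $seps->[ $i++ % @$seps ] . $item;
    }
    return $out;
}

# join_pair is then the two-separator special case.
sub sketch_join_pair {
    my ($sep1, $sep2, @x) = @_;
    return sketch_join_multi([ $sep1, $sep2 ], @x);
}

print sketch_join_pair('=', '&', qw/a 1 b 2/), "\n";    # prints "a=1&b=2"
```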

deep_eq

Test if two complex (possibly circular) data structures are equal.

Solution based on code by Roy Johnson (http://www.perlmonks.org/?node_id=304250). Modified to match my indenting style and to fix some bugs in the original. I have also made it safe to use on blessed and circular objects.

See also: Test::Deep

SYSTEM

Note: The Dean::Util::safe_pipe() is generally a better choice since that sub checks the exit status of the command.

Works like the perl system command except that any string expressions passed by reference will not be quoted. This allows, for example, pipes and redirects while still allowing safe escaping of arguments.

If the first argument to SYSTEM is a reference to the string "DEBUG" then the escaped command will be printed to STDERR before being executed.

EXAMPLES:

 # A truly silent mplayer
 SYSTEM "mplayer", "Movie Files/LOTR: trailer 2.mov", \">/dev/null", \"2>/dev/null";
 # multiple commands in one line.
 SYSTEM "echo", ";", \";", "echo", q/$( This is echoed properly )/;
 # See what exactly is happening.
 SYSTEM \"DEBUG", "echo", ";", \";", "echo", q/$( This is echoed properly )/;
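
The quoting rule (plain strings are escaped, string references are passed through raw) can be sketched as a command-string builder. sketch_shell_command is a hypothetical name and the single-quote escaping scheme is an assumption; the module may quote differently. The sketch only builds the string and does not run anything.

```perl
use strict;
use warnings;

# Hypothetical quoting helper: wrap in single quotes, escaping embedded ones.
sub sketch_shell_quote {
    my ($arg) = @_;
    $arg =~ s/'/'\\''/g;
    return "'$arg'";
}

# Scalar refs are inserted verbatim, everything else is quoted.
sub sketch_shell_command {
    return join ' ',
        map { ref($_) eq 'SCALAR' ? $$_ : sketch_shell_quote($_) } @_;
}

my $cmd = sketch_shell_command("echo", "a b; rm -rf /", \">/dev/null");
print "$cmd\n";    # the dangerous text stays inside one quoted argument
```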

QX

Note: The Dean::Util::safe_pipe() is generally a better choice since that sub checks the exit status of the command.

Works like a combination of SYSTEM and the qx operator. It behaves like SYSTEM in that it is a subroutine which takes a list of string expressions that are quoted before being passed to the shell. Any string expressions passed by reference will not be quoted. This allows, for example, pipes and redirects while still allowing safe escaping of arguments. The return value is the STDOUT of the executed command, just like with the qx operator.

EXEC

Works like a combination of SYSTEM and exec. It behaves like SYSTEM in that it is a subroutine which takes a list of string expressions that are quoted before being passed to the shell. Any string expressions passed by reference will not be quoted. This allows, for example, pipes and redirects while still allowing safe escaping of file names. This function never returns, just like with the exec function.

SELECT

Works like perl's select function, but is instead given a string which is opened as a file and then selected. The special string '-' will not change the default output stream. An undefined or empty string will select "/dev/null" instead.

If a string reference $mode is provided as a first argument it will be taken as the file mode (the default is ">").

EXISTS

 EXISTS $hash_ref, qw| key1 arbitrarily/deep/key |;
 EXISTS $hash_ref, @paths, { sep => $separator };

Safely test for deep key existence. Recursion happens by splitting on $separator ("/" by default), there is no means for escaping. Returns true only if all keys exist. Array refs are allowed if corresponding path components are numeric.

HAS

 HAS $hash_ref, qw| key1 arbitrarily/deep/key |;
 HAS $hash_ref, @paths, { sep => $separator };

Safely test for deep key definedness. Recursion happens by splitting on $separator ("/" by default), there is no means for escaping. Returns true only if all keys exist and are defined. Array refs are allowed if corresponding path components are numeric.

TRUE

 TRUE $hash_ref, qw| key1 arbitrarily/deep/key |;
 TRUE $hash_ref, @paths, { sep => $separator, false_pat => $pattern };

Safely test for deep key truth. Recursion happens by splitting on $separator ("/" by default, set $separator to undef to disable this behavior), there is no means for escaping. Returns true only if all keys exist and are true. Values matched by $pattern (^(?i:false)$ by default) as well as an empty list or empty hash will all cause 0 to be returned. Array refs are allowed if corresponding path components are numeric.

GETPATH

 GETPATH $hash_ref, 'deeply/nested/value'
 GETPATH $hash_ref, $path, { sep => $separator }

Fetch values in recursive hashes using the handy path notation. Keys are separated by $separator which defaults to "/". There is no way to escape $separators in a key, so choose your separator carefully. Array refs are allowed if corresponding path components are numeric. This function does not perform autovivification at any level. A non-existent key at any level of the requested path causes an immediate return.
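
A path walk that never autovivifies can be sketched as below. sketch_getpath is a hypothetical name; returning undef (or an empty list) for a missing component is an assumption about what "immediately returns" means here.

```perl
use strict;
use warnings;

# Hypothetical stand-in for GETPATH: descend one component at a time,
# checking existence before each step so nothing is created on the way.
sub sketch_getpath {
    my ($data, $path, $opt) = @_;
    my $sep = ($opt && defined $opt->{sep}) ? $opt->{sep} : '/';
    for my $key (split /\Q$sep\E/, $path) {
        if (ref($data) eq 'HASH') {
            return unless exists $data->{$key};
            $data = $data->{$key};
        }
        elsif (ref($data) eq 'ARRAY' && $key =~ /^[0-9]+$/) {
            return unless $key < @$data;
            $data = $data->[$key];
        }
        else {
            return;
        }
    }
    return $data;
}

my $tree = { deeply => { nested => [ 'first', 'second' ] } };
my $value   = sketch_getpath($tree, 'deeply/nested/1');     # 'second'
my $missing = sketch_getpath($tree, 'deeply/absent/key');   # undef
```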

SETPATH

 SETPATH $hash_ref, 'deeply/nested/value', $value
 SETPATH $hash_ref, $path, $value, { sep => $separator }

Set values in recursive hashes using the handy path notation. Keys are separated by $separator which defaults to "/". There is no way to escape $separators in a key, so choose your separator carefully. Nested hashes are created explicitly so that this subroutine is safe to use on tied hashes (see CAVEATS / ISSUES / BUGS in DBM::Deep). Array refs are allowed if corresponding path components are numeric. HOWEVER! Any path components which do not already exist will be created as hashes regardless of the value of the keys.

unique_id

Returns a unique identifier. This is not as "unique" as Sys::UniqueID, but uses only core modules.

Code copied from Sys::UniqueID except that Socket is used to get the IP address rather than Sys::HostIP. This works fine as long as hostname returns a good host name that gethostbyname(3) can resolve to a sufficiently distinct IP address (distinct among all machines generating unique ids). A typical pool of servers on the same network should have no problems. However, ids generated by code running on a client may have issues.

When the counter wraps, this function may need to sleep before proceeding to ensure a unique timestamp. The Time::HiRes module is used for this (core module in 5.8.0). The amount of time slept before polling the new time can be adjusted using the $_Util::unique_id::sleep package variable. The default sleep time is 0.1 seconds.

SPRINTF

 SPRINTF $o, $fmt, $h1, $h2, ...

Format the data in the hashes $h1, $h2, ... into the format string $fmt given in the language specified in the option hash $o.

Example:

 $o = { a => [ s => "artist" ],
        t => [ s => "title", "name" ],
        N => sub { scalar localtime },
      };
 @songs = ( { artist => "Arlo Guthrie", title => "The Motorcycle Song" },
            { artist => "Cypress Hill", name  => "Psycobetabuckdown" },
          );
 @formatted_songs = SPRINTF $o, "%20t - %a", @songs;

See also: String::Formatter

:perl6 - Perl 6 functions

smartmatch

Perl 5.010 pretty much killed the need for this...

 smartmatch( $X, $Y );

smartmatches $X ~~ $Y. Inspired by the Perl6 operator, but a complete deviation since this is not designed to be the deciding form of a switch statement. Primarily tests for things which are annoying to type out.

Returns 1 or '' as long as CODE is not one of the match variables. Returns undef if comparison is not possible.

Matches are commutative unless explicitly presented as otherwise

 str|num ~~   str|num      natural equality test, though see convert_string_to_regexp option
 str|num ~~   Regexp       natural pattern match
 ARRAY   ~~   Regexp       all(ARRAY) =~ Regexp
 HASH    ~~   Regexp       all(keys(HASH)) =~ Regexp
 undef   ~~   ARRAY        undef \in ARRAY
 num|str ~~   ARRAY        str \in @ARRAY
 HASH    ~~   num          keys(%HASH) == num
 ARRAY   ~~   num          @ARRAY == num
 ARRAY   ~~   ARRAY        @ARRAY <<~~>> @ARRAY  # test elements individually
 HASH    ~~   ARRAY        exists(@HASH{@ARRAY})
 HASH    ~~   HASH         have same keys
 CODE    ~~   Any          CODE( Any )
 ARRAY   ~~   CODE         CODE( all(ARRAY) )
 Any     ~~   CODE         reserved
 ARRAY   ~~   undef        reserved
 ARRAY   ~~   str          reserved

convert_string_to_regexp: 'left' | 'right' | Bool

Strings of the form: qr(...), qr/.../, qr(\W)...(\1), ... are automatically upgraded to regular expressions.

zip

See also: List::MoreUtils zip

 zip \@x, \@y, ...

The Perl6 zip function (almost). Given a list of arrays, returns a list of the array elements "zipped" together, that is: $x[0], $y[0], ..., $x[1], $y[1], .... The lists need not be the same length; shorter lists are simply skipped once they run out.
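
The interleaving, including the handling of lists of unequal length, can be sketched as follows. sketch_zip is a hypothetical name, not the module's code.

```perl
use strict;
use warnings;

# Hypothetical stand-in for zip: interleave by index, skipping any array
# that has already run out of elements.
sub sketch_zip {
    my @arrays = @_;
    my $max = 0;
    for my $array (@arrays) {
        $max = @$array if @$array > $max;
    }
    my @out;
    for my $i (0 .. $max - 1) {
        for my $array (@arrays) {
            push @out, $array->[$i] if $i < @$array;
        }
    }
    return @out;
}

my @zipped = sketch_zip([ 1, 2, 3 ], [ 'a', 'b' ]);
# @zipped is (1, 'a', 2, 'b', 3)
```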

uniq

Takes a list (or reference to an array) and discards all but one of successive identical objects (up to stringification) from the list. In scalar context, an array reference is returned.

Note: This is different from the unique function which will remove all duplicates from the list.
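
The "successive duplicates only" behaviour can be sketched as below. sketch_uniq is a hypothetical name; it accepts a plain list only (the array-reference calling form is not reproduced) and compares elements as strings.

```perl
use strict;
use warnings;

# Hypothetical stand-in for uniq: drop an element only when it equals the
# element kept immediately before it.
sub sketch_uniq {
    my @out;
    for my $item (@_) {
        push @out, $item if !@out || $out[-1] ne $item;
    }
    return wantarray ? @out : \@out;
}

my @runs = sketch_uniq(qw/a a b a a c/);
# @runs is (a, b, a, c): the second run of "a" is kept
```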


TODO

range2list and list2range

convert #..#,#-#,a..z,a-z,2:23,2:5:23 strings to lists and back. split /,/ first.

A more general form "suggested" on PerlMonks (http://www.perlmonks.org/?node_id=427615):

 foo[01:100]bar-[fred,barney,wilma]

Though, shell syntax might be better (see bash(1) /EXPANSION):

 foo{001..100}bar-{fred,barney,wilma{1,2}}

Adaptive Simpson's rule

 sub f { sqrt($_[0]) }
 print adaptive( \&f, 0, 1, 0.0005 ), $/;
 sub adaptive {
  my ($f, $a, $b, $eps) = @_;
  my $s1 = simp($f, $a, $b);
  my $s2 = simp2($f, $a, $b);
  my $err = abs($s2-$s1)/15;
  if ($err < $eps) {
    return $s2;
  } else {
    return adaptive($f, $a, ($a+$b)/2, $eps/2) + adaptive($f, ($a+$b)/2, $b, $eps/2);
  }
 }
 sub simp {
  my ($f, $a, $b) = @_;
  return ($b-$a)*($f->($a) + 4*$f->(($a+$b)/2) + $f->($b))/6;
 }
 sub simp2 {
  my ($f, $a, $b) = @_;
  return simp($f,$a,($a+$b)/2) + simp($f,($a+$b)/2,$b);
 }


BUGS

No known bugs. If you find one, please report it via email.


AUTHOR

 Dean Serenevy
 dean@cs.serenevy.net
 http://dean.serenevy.net


LICENSE

This software (except where attributed to another author) is hereby placed into the public domain. If you use this code, a simple comment in the code giving credit and an email letting me know that you find it useful would be courteous but is not required.

The software is provided "as is" without warranty of any kind, either expressed or implied including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose. In no event shall the authors or copyright holders be liable for any claim, damages or other liability, whether in an action of contract, tort or otherwise, arising from, out of or in connection with the software or the use or other dealings in the software.


SEE ALSO

perl(1).
