Dean::Util - Utilities created by Dean Serenevy
use Dean::Util qw/map_pair nsign min_max/; ...
Then later, to remove the dependence on Dean::Util:
perl -MDean::Util -we insert_Dean_Util_functions The/Module.pm
This is a set of utility functions that I find myself rewriting frequently.
Normally, putting functions into a module introduces a dependency on that
module which can be a hassle in some situations. This is a ``smart'' module
which is capable of replacing the use Dean::Util... line with the code
for the requested functions. Thus, machines that have Dean::Util installed
can use it as a module, but when requested, a (Dean::Util) dependency-free
version of the file may be made.
This function prints a column-formatted list of the functions included in the Dean::Util package.
This function attempts to verify that the Dean/Util.pm is properly structured. This function is intended to be run only by people who make changes to the Dean/Util.pm file to check that their code is properly formatted for the module to parse.
Returns a hash ref with an entry of the following type for each function and variable defined in Dean::Util.
name => { code => '...',
pod => '...',
depends => [ 'thing 1', 'thing 2', ... ]
}
Some additional information may be included in each sub-hash for debugging purposes or internal use.
Replaces all occurrences of ``use Dean::Util ...;'' (``...'' is everything up to
the first semicolon, so don't use qw; ;) with the actual source code of the
functions requested from Dean::Util. The original files are saved to a
backup file which is just the original filename with a ~ appended. The
list of files to modify is either included as a list of arguments or is
read from @ARGV.
As in the function
get_Dean_Util_function_string, the
special symbols INCLUDE_POD and POD_ONLY may be used to indicate that
all further inclusions (restricted to each individual ``use'' block) should
include their POD documentation before the code, or exclude the code and
only output the POD documentation. Example:
use Dean::Util qw/max min INCLUDE_POD join_multi map_pair/;
use Dean::Util qw/is_num is_int/;
# ... later, possibly even after __END__
use Dean::Util qw/POD_ONLY is_num is_int/;
Would include code and POD documentation for join_multi and map_pair. The code and POD documentation for is_num and is_int would be inserted separately.
Note: Multiple use Dean::Util inclusions may result in multiple
subroutine definitions so don't use the same function twice unless they
are in different scopes.
Once insert_Dean_Util_functions has been used to ``export'' a list of
Dean::Util functions, this command will replace Dean::Util function
blocks with more recent function versions, thus upgrading the exported
script.
Returns the source code for the functions provided as arguments. If the
argument list is empty, the function list is taken from @ARGV.
The special symbols INCLUDE_POD and POD_ONLY may be used to indicate
that all further inclusions should include their POD documentation before
the code, or exclude the code and only output the POD documentation.
Example:
get_Dean_Util_function_string qw/max min INCLUDE_POD join_multi map_pair/;
Would include the POD documentation for only join_multi and map_pair.
get_Dean_Util_function_string qw/POD_ONLY format_cols/;
Would return just the POD documentation for format_cols.
The string, pi, to 30 digits after the decimal.
The string, e, to 30 digits after the decimal.
See also: List::Util max
Return the maximum number in a list of values. All arguments must be numeric; use max_dirty for untrusted or mixed data.
See also: List::Util min
Return the minimum number in a list of values. All arguments must be numeric; use min_dirty for untrusted or mixed data.
Return the maximum number in a list of values. This version of max should be used for untrusted data since undefined or non-numeric values are silently ignored rather than throwing errors.
Return the minimum number in a list of values. This version of min should be used for untrusted data since undefined or non-numeric values are silently ignored rather than throwing errors.
fmax { block } @list
fmax \&sub, @list
Return the maximum function value given by evaluating the given code at
each element of the list. The code may be either a subroutine reference or
a code block. $_ will be set to each list entry and will also be passed
in as the first (and only) argument. If the code returns any undefined or
non-numeric values, perl will issue warnings.
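The calling convention shared by fmax and its relatives can be sketched in a few lines. This is a minimal illustration and not the module's source: the name fmax_sketch and the (&@) prototype are assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical stand-in for fmax. The (&@) prototype (an assumption) lets
# callers pass either a bare block or a subroutine reference first.
sub fmax_sketch(&@) {
    my $code = shift;
    my $max;
    for (@_) {                        # $_ is aliased to each list entry
        my $val = $code->($_);        # the entry is also the only argument
        $max = $val if !defined $max or $val > $max;
    }
    return $max;
}

# Block form reads $_; sub-ref form reads $_[0].
my $longest = fmax_sketch { length($_) } qw/a bcd ef/;
my $largest = fmax_sketch(sub { $_[0] ** 2 }, -4, 1, 3);
```

Here $longest is the greatest string length and $largest the greatest square; the returned value is the function value, not the list item that produced it.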
fmin { block } @list
fmin \&sub, @list
Return the minimum function value given by evaluating the given code at
each element of the list. The code may be either a subroutine reference or
a code block. $_ will be set to each list entry and will also be passed
in as the first (and only) argument. If the code returns any undefined or
non-numeric values, perl will issue warnings.
fmax_dirty { block } @list
fmax_dirty \&sub, @list
Return the maximum function value given by evaluating the given code at
each element of the list. The code may be either a subroutine reference or
a code block. $_ will be set to each list entry and will also be passed
in as the first (and only) argument. If the code returns any undefined or
non-numeric values, they will be ignored.
fmin_dirty { block } @list
fmin_dirty \&sub, @list
Return the minimum function value given by evaluating the given code at
each element of the list. The code may be either a subroutine reference or
a code block. $_ will be set to each list entry and will also be passed
in as the first (and only) argument. If the code returns any undefined or
non-numeric values, they will be ignored.
minimizer { block } @list
minimizer \&sub, @list
Return the item of @list which yields the minimum value when evaluated
by the given code. The code may be provided either as a subroutine
reference or a code block. $_ will be set to each list entry and will
also be passed in as the first (and only) argument. If the code returns any
undefined or non-numeric values, perl will issue warnings.
maximizer { block } @list
maximizer \&sub, @list
Return the item of @list which yields the maximum value when evaluated
by the given code. The code may be provided either as a subroutine
reference or a code block. $_ will be set to each list entry and will
also be passed in as the first (and only) argument. If the code returns any
undefined or non-numeric values, perl will issue warnings.
minimizer_dirty { block } @list
minimizer_dirty \&sub, @list
Return the item of @list which yields the minimum value when evaluated
by the code. The code may be either a subroutine reference or a code block.
$_ will be set to each list entry and will also be passed in as the
first (and only) argument. If the code returns any undefined or non-numeric
values, they will be ignored and the corresponding list item will not be
considered as a minimizer.
Note however that no filtering is performed on @list so undefined values
will be passed to the subroutine as a normal element.
maximizer_dirty { block } @list
maximizer_dirty \&sub, @list
Return the item of @list which yields the maximum value when evaluated
by the code. The code may be either a subroutine reference or a code block.
$_ will be set to each list entry and will also be passed in as the
first (and only) argument. If the code returns any undefined or non-numeric
values, they will be ignored and the corresponding list item will not be
considered as a maximizer.
Note however that no filtering is performed on @list so undefined values
will be passed to the subroutine as a normal element.
ceil($)
If the argument is numeric, returns the smallest integer which is greater than or equal to the given argument. Otherwise this function will spew warnings.
ceil_dirty($)
If the argument is numeric, returns the smallest integer which is greater than or equal to the given argument. Otherwise this function will return undef.
floor($)
If the argument is numeric, returns the largest integer which is less than or equal to the given argument. Otherwise this function spews warnings.
floor_dirty($)
If the argument is numeric, returns the largest integer which is less than or equal to the given argument. Otherwise this function returns undef.
See also: List::Util sum
Returns the sum of all numeric entries in a list. Undefined/non-numeric values cause warnings.
See also: List::Util reduce
Returns the product of all numeric entries in a list. Undefined/non-numeric values cause warnings.
Returns the average over all entries in a list. Undefined or non-numeric entries will spew warnings.
Returns the sum of all numeric entries in a list. Undefined/non-numeric values are ignored.
Returns the product of all numeric entries in a list. Undefined/non-numeric values are ignored.
Returns the average over all entries in a list. Undefined or non-numeric entries contribute a 0 to the average.
Returns a pair ($m, $M) which is the minimum and maximum numbers,
respectively, in a list of values without looping over the list twice.
Undefined or non-numeric values will cause warnings.
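The "without looping over the list twice" claim amounts to tracking both extremes in one pass. A minimal sketch, assuming numeric input; min_max_sketch is a hypothetical name, not the module's code:

```perl
use strict;
use warnings;

# Single pass over the list, updating both extremes as we go.
sub min_max_sketch {
    my ($min, $max) = ($_[0], $_[0]);
    for my $x (@_) {
        if    ($x < $min) { $min = $x }
        elsif ($x > $max) { $max = $x }
    }
    return ($min, $max);
}
```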
Returns a pair ($M, $m) which is the maximum and minimum numbers,
respectively, in a list of values without looping over the list twice.
Undefined or non-numeric values will cause warnings.
Returns a pair ($m, $M) which is the minimum and maximum numbers,
respectively, in a list of values without looping over the list twice.
Undefined or non-numeric values are silently ignored.
Returns a pair ($M, $m) which is the maximum and minimum numbers,
respectively, in a list of values without looping over the list twice.
Undefined or non-numeric values are silently ignored.
my $sieve = sieve_of_eratosthenes( $n );
sieve_of_eratosthenes( $m, $sieve );
Constructs a bitstring $sieve using the Sieve of Eratosthenes so that:
vec($sieve, $n, 1) == 1 iff $n is prime
If a sieve (or an undefined scalar) is provided as a second argument, it will be appended to.
Note: Since perl's length command deals only in bytes, this subroutine
will round $n up to make sure that $sieve is correct to a whole
number of bytes. In particular, you are guaranteed to be able to trust
$sieve up to $n = 8 * length($sieve) - 1.
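A bitstring sieve of this kind can be sketched with vec. This is an illustration under assumptions, not the module's implementation: sieve_sketch is a hypothetical name, and it builds a fresh sieve rather than extending an existing one.

```perl
use strict;
use warnings;

# Build a bitstring in which bit $n is 1 exactly when $n is prime.
sub sieve_sketch {
    my ($n) = @_;
    my $limit = 8 * (int($n / 8) + 1) - 1;       # round up to a whole byte
    my $sieve = "\xFF" x (($limit + 1) / 8);     # start with every bit set
    vec($sieve, 0, 1) = 0;                       # 0 and 1 are not prime
    vec($sieve, 1, 1) = 0;
    for (my $p = 2; $p * $p <= $limit; $p++) {
        next unless vec($sieve, $p, 1);
        for (my $m = $p * $p; $m <= $limit; $m += $p) {
            vec($sieve, $m, 1) = 0;              # strike out multiples of $p
        }
    }
    return $sieve;
}
```

Because the length is rounded up to whole bytes, every bit up to 8 * length($sieve) - 1 is meaningful, which matches the guarantee described above.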
Determine primality. Constructs the Sieve of Eratosthenes to determine
primality. The sieve is reused for each call to is_prime so scripts are
encouraged to prepare the sieve by calling is_prime on a large number
before making multiple calls to is_prime.
# SLOW: takes 21.89 seconds
@primes = grep is_prime($_), 1..400000;
# FAST: takes 1.387 seconds
@primes = reverse grep is_prime($_), reverse 1..400000;
This function may take some shortcuts when it can, so if you want to prepare the sieve, append the option ``force_sieve'':
# SLOW:
is_prime( 400000 ); # this test shortcuts since 400000 is even
@primes = grep is_prime($_), 1..400000;
# FAST:
is_prime( 400000, force_sieve => 1 );
@primes = grep is_prime($_), 1..400000;
my $m = next_prime( $n )
Compute the next prime integer larger than $n.
Given a base, this function returns a hash which may be used in future calls to the other base functions.
A base is described by:
integer <= 36: digits are 0-9 a-z
array ref: list of symbols, length == base, index i == i (yes, you get to define zero)
string: string of symbols, shortcut for [split //, $str]
hash ref: the output of a previous call to base_hash (this is silly in this case)
base2base( string, base, base )
String may be decimal. The following symbols are tried (in order) to be used as the punctuation between the integer and fraction part of the number:
. , : ; _ | / \ - + ' ` "
Bases are described by:
integer <= 36: digits are 0-9 a-z
array ref: list of symbols, length == base, index i == i (yes, you get to define zero)
string: string of symbols, shortcut for [split //, $str]
hash ref: the output of base_hash
base2integer( string, base )
Convert a string written in the given base to an integer. The string may not be a decimal (it may not contain a fractional part).
Base is described by:
integer <= 36: digits are 0-9 a-z
array ref: list of symbols, length == base, index i == i (yes, you get to define zero)
string: string of symbols, shortcut for [split //, $str]
hash ref: the output of base_hash or symbol => value pairs
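For the simplest base description (an integer no larger than 36), the conversion reduces to Horner's rule over the digit values. A sketch under that assumption only; base2integer_sketch is a hypothetical name and does not handle array, string, or hash base descriptions:

```perl
use strict;
use warnings;

# Convert a digit string in an integer base (2..36) to a number.
sub base2integer_sketch {
    my ($str, $base) = @_;
    my @symbols = (0 .. 9, 'a' .. 'z')[0 .. $base - 1];
    my %value;
    @value{@symbols} = 0 .. $#symbols;           # symbol => digit value
    my $n = 0;
    for my $ch (split //, lc $str) {
        die "invalid digit '$ch' for base $base\n" unless exists $value{$ch};
        $n = $n * $base + $value{$ch};           # Horner's rule
    }
    return $n;
}
```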
base2decimal( string, base )
String may be decimal. The following symbols are tried (in order) to be used as the punctuation between the integer and fraction part of the number:
. , : ; _ | / \ - + ' ` "
Base is described by:
integer <= 36: digits are 0-9 a-z
array ref: list of symbols, length == base, index i == i (yes, you get to define zero)
string: string of symbols, shortcut for [split //, $str]
hash ref: the output of base_hash
decimal2base( string, base )
String may be decimal. The following symbols are tried (in order) to be used as the punctuation between the integer and fraction part of the number:
. , : ; _ | / \ - + ' ` "
Base is described by:
integer <= 36: digits are 0-9 a-z
array ref: list of symbols, length == base, index i == i (yes, you get to define zero)
string: string of symbols, shortcut for [split //, $str]
hash ref: the output of base_hash
factorial( $n )
Returns $n! if $n is a non-negative integer.
prob_model_invariants( \%model, %options )
The model is a hash with keys the outcomes and values the corresponding probabilities. At most one of the probabilities may be undefined in which case it will be computed automatically (as $1 - \sum p_i$) and added to your passed probability model.
Roll n dice (default 1) and return the results. In scalar context, only the sum is returned. In list context, the individual rolls are returned as well as the final sum of the values (the sum is returned in the last position).
See also: List::Util shuffle
Randomize a list of values. Essentially the Fisher-Yates shuffle code from perlfaq4 (``How do I shuffle an array randomly?''). If the array is passed by reference then it will be altered, otherwise a copy is made. Returns a new list or a reference to a list depending on context.
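The Fisher-Yates shuffle mentioned here can be sketched as follows. This version always works on a copy and returns a list; shuffle_sketch is a hypothetical name, and the by-reference and context-sensitive behaviour of the real function is not reproduced.

```perl
use strict;
use warnings;

# Fisher-Yates: walk from the end, swapping each slot with a random
# slot at or before it.
sub shuffle_sketch {
    my @deck = @_;
    for (my $i = $#deck; $i > 0; $i--) {
        my $j = int rand($i + 1);
        @deck[$i, $j] = @deck[$j, $i];
    }
    return @deck;
}
```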
one_var( @data );
one_var( \@data );
one_var( \@data, $sorted );
Returns a hash (or hash reference if called in scalar context) of
one-variable statistics on the input data. If the $sorted parameter is
not defined then the data is assumed to be not sorted and the subroutine
will make its own sorted copy of the data. If the $sorted parameter is
defined but false, then the subroutine will sort @data in place
(@data will be altered). If the $sorted parameter is true then the
data will be assumed to be already sorted. The returned hash will have the
following keys:
The average value of the data
The summation of the data
The sum of the squares of the data
The sample variance, 1/(n-1) * sum (x_i - average)^2
The sample standard deviation, sqrt( Svar )
The population variance, E( (X - E(X))^2 )
The population standard deviation, sqrt( variance )
The number of measurements in the sample
The smallest data element
The largest data element
The first quartile computed using broken ``Basic Math Course Method''.
The sample median
The third quartile computed using broken ``Basic Math Course Method''.
The corresponding Unicode characters: ``\x{2211}'', ``\x{03A3}'', ``\x{03C3}''. Be warned that char:sum is a different symbol than char:Sigma and that the terminal that you are writing to will need to understand UTF-8 font encodings.
Note: the list only needs to be sorted to compute the quartiles, min,
median, and max values. If you are not interested in these values then you
can speed up the computation by providing $sorted with a true value
(regardless of whether the data is sorted) and simply ignoring those values
in the output.
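The order-independent statistics can be sketched directly from the formulas above. The key names used here (n, sum, sum_sq, mean, Svar, variance) are assumptions chosen for the example, as is the name stats_sketch; they are not necessarily the keys that one_var returns.

```perl
use strict;
use warnings;

# Statistics that do not require sorted data (needs at least two values).
sub stats_sketch {
    my @data = @_;
    my $n = @data;
    my ($sum, $sum_sq) = (0, 0);
    for my $x (@data) {
        $sum    += $x;
        $sum_sq += $x ** 2;
    }
    my $mean = $sum / $n;
    my $ss = 0;                                  # sum of squared deviations
    $ss += ($_ - $mean) ** 2 for @data;
    return (
        n        => $n,
        sum      => $sum,
        sum_sq   => $sum_sq,
        mean     => $mean,
        Svar     => $ss / ($n - 1),              # sample variance
        variance => $ss / $n,                    # population variance
    );
}
```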
percentile($p, @data)
percentile($p, \@data)
percentile($p, \@data, $sorted)
percentile($p, \@data, %options)
Return the $p-th percentile using the weighted average at X_{(n+1)p}
method (http://www.xycoon.com/method_2.htm). That is, the number such that
approximately 100 * $p percent of the data values are less than or equal to
it. If an array reference is given as well as a third true value,
the data will be assumed to be already sorted. The following options are
available.
Boolean value indicating whether the data are sorted already. If not, they will be sorted numerically.
One of ``midpoint'', ``floor'', ``ceil'', or ``scaled''. This controls what to do when a percentile divider is between two entries. The default behavior is ``scaled'', the returned percentile will be an appropriate linear combination of the neighboring entries. The ``midpoint'' method always returns the midpoint of the neighboring entries. Finally, the ``floor'' and ``ceil'' methods always return the lower or higher neighbor respectively.
The ``method'' also affects the return value when return => "index" is
enabled.
Either ``value'' or ``index''. Affects whether we return the actual percentile value, or simply its index in the array.
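The weighted average at X_{(n+1)p} rule (the ``scaled'' behaviour) can be sketched as below. Assumptions for the example: $p is a fraction between 0 and 1, positions falling outside the data are clamped to the first or last value, and percentile_sketch is a hypothetical name that ignores the options described above.

```perl
use strict;
use warnings;

# Locate position (n+1)*p (1-based) and interpolate between neighbours.
sub percentile_sketch {
    my ($p, @data) = @_;
    my @x   = sort { $a <=> $b } @data;
    my $n   = @x;
    my $pos = ($n + 1) * $p;
    my $k   = int $pos;                          # integer part
    my $d   = $pos - $k;                         # fractional part
    return $x[0]  if $k < 1;                     # clamp below (assumption)
    return $x[-1] if $k >= $n;                   # clamp above (assumption)
    return $x[$k - 1] + $d * ($x[$k] - $x[$k - 1]);
}
```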
my $r = correlation( \@X, \@Y );
my %I = correlation( \@X, \@Y );
my $r = correlation( \@X, \@Y, %options );
Pearson product-moment correlation coefficient.
The result hash from one_var()
The sample standard deviation and mean of x and y.
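The Pearson coefficient itself is the covariance of the two samples divided by the product of their standard deviations. A minimal sketch that returns only r; pearson_sketch is a hypothetical name and none of the options of correlation are modelled.

```perl
use strict;
use warnings;

# Pearson product-moment correlation of two equal-length samples.
sub pearson_sketch {
    my ($X, $Y) = @_;
    my $n = @$X;
    die "need two equal-length samples\n" unless $n > 1 and $n == @$Y;
    my ($mean_x, $mean_y) = (0, 0);
    $mean_x += $_ / $n for @$X;
    $mean_y += $_ / $n for @$Y;
    my ($sxy, $sxx, $syy) = (0, 0, 0);
    for my $i (0 .. $n - 1) {
        my ($dx, $dy) = ($X->[$i] - $mean_x, $Y->[$i] - $mean_y);
        $sxy += $dx * $dy;
        $sxx += $dx ** 2;
        $syy += $dy ** 2;
    }
    return $sxy / sqrt($sxx * $syy);
}
```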
permutations( $n );
permutations( @list ); # 1 < @list !!
permutations( \@list );
Return a list of all permutations of the given input list.
Note: This subroutine is slow and inefficient. If you want to use this for any real purpose then you should consider using Algorithm::Permute or Algorithm::FastPermute from CPAN.
k_arrangements( \@list, $k );
k_arrangements( $n, $k );
Return a list of all arrangements (sub-permutations) of the given input
list of length $k. If $n and $k are both integers, then simply the
number of $k arrangements is returned.
Note: This subroutine is slow and inefficient. If you want to use this for any real purpose then you should consider looking up an XS module on CPAN.
arrangements( $n );
arrangements( \@list );
arrangements( \@list, $k );
arrangements( $n, $k );
arrangements( @list ); # @list > 2 !!!
Return a list of all arrangements (sub-permutations) of the given input list (regardless of length). If the list is provided as a reference and an integer $k is provided then the results will be restricted to length $k as in the k_arrangements subroutine.
Note: This subroutine is slow and inefficient. If you want to use this for any real purpose then you should consider looking up an XS module on CPAN.
k_combinations( \@list, $k );
k_combinations( $n, $k );
Return a list of all combinations of the given input list of length $k.
Note: This subroutine is slow and inefficient. If you want to use this for any real purpose then you should consider looking up an XS module on CPAN.
combinations( $n );
combinations( \@list );
combinations( \@list, $k );
combinations( $n, $k );
combinations( @list ); # @list > 2 !!!
Return a list of all combinations of the given input list (regardless of length). If the list is provided as a reference and an integer $k is provided then the results will be restricted to length $k as in the k_combinations subroutine.
Note: This subroutine is slow and inefficient. If you want to use this for any real purpose then you should consider looking up an XS module on CPAN.
npdf $x
npdf $x, $mu
npdf $x, $mu, $sigma
Compute the probability density at $x (informally, P( X = $x )) assuming
a normal distribution with mean $mu and standard deviation $sigma. $mu
and $sigma are assumed to be 0 and 1 respectively if they are missing.
$sigma must be positive.
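The normal density is given by the standard closed form exp(-z^2/2) / (sigma * sqrt(2 pi)) with z = (x - mu) / sigma. A sketch with the same defaults as described above; npdf_sketch is a hypothetical name, not the module's code.

```perl
use strict;
use warnings;

# Normal probability density with optional mean and standard deviation.
sub npdf_sketch {
    my ($x, $mu, $sigma) = @_;
    $mu    = 0 unless defined $mu;
    $sigma = 1 unless defined $sigma;
    die "sigma must be positive\n" unless $sigma > 0;
    my $pi = 4 * atan2(1, 1);
    my $z  = ($x - $mu) / $sigma;                # standardized value
    return exp(-$z ** 2 / 2) / ($sigma * sqrt(2 * $pi));
}
```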
ncdf $x
ncdf $x, $mu
ncdf $x, $mu, $sigma
Compute the probability P( X <= $x ) assuming a normal
distribution with mean $mu and standard deviation $sigma. $mu and
$sigma are assumed to be 0 and 1 respectively if they are missing.
$sigma must be positive.
dotprod(\@\@)
my $d = dotprod @x, @y;
my $d = &dotprod(\@x, [1,2,3]);
Compute the dot product of two vectors
$inverse = modular_inverse( $x, $m );
Compute the inverse of $x in the group Z_m. The inverse will be within the set [0..$m-1].
Note: $x must be relatively prime to $m.
Compute the Greatest Common Divisor of a list of integers using the Euclidean algorithm. Negative numbers are treated as positives by this routine.
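The Euclidean algorithm extended to a list can be sketched by folding pairwise GCDs. gcd_sketch is a hypothetical name used only for this example.

```perl
use strict;
use warnings;

# Fold the two-argument Euclidean algorithm across the whole list.
sub gcd_sketch {
    my $g = abs(shift);
    for my $n (@_) {
        my $m = abs($n);                         # negatives treated as positives
        ($g, $m) = ($m, $g % $m) while $m;
    }
    return $g;
}
```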
($alpha, $beta, $d) = extended_euclidean_algorithm($a, $b)
For a pair of integers, a and b, perform the extended Euclidean algorithm to compute alpha, beta, and d such that:
d = alpha * a + beta * b
In particular, if d = 1 then alpha = a^-1 mod b.
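The identity d = alpha * a + beta * b, and the modular inverse that falls out of it, can be illustrated with the usual iterative form of the algorithm. Both function names below are hypothetical stand-ins and the code is a sketch, not the module's source.

```perl
use strict;
use warnings;

# Returns ($alpha, $beta, $d) with $d == $alpha * $x + $beta * $y.
sub ext_euclid_sketch {
    my ($x, $y) = @_;
    my ($alpha0, $alpha1, $beta0, $beta1) = (1, 0, 0, 1);
    while ($y != 0) {
        my $q = int($x / $y);
        ($x, $y)           = ($y, $x - $q * $y);
        ($alpha0, $alpha1) = ($alpha1, $alpha0 - $q * $alpha1);
        ($beta0, $beta1)   = ($beta1, $beta0 - $q * $beta1);
    }
    return ($alpha0, $beta0, $x);
}

# When gcd($x, $m) == 1, alpha reduced mod $m is the inverse of $x.
sub modular_inverse_sketch {
    my ($x, $m) = @_;
    my ($alpha, undef, $d) = ext_euclid_sketch($x, $m);
    die "$x is not invertible mod $m\n" unless $d == 1;
    return $alpha % $m;                          # lands in 0 .. $m - 1
}
```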
my ($N, $D) = frac( $dec )
Convert a decimal to a fraction. Returns undef if number is not rationalizable (must have repeating decimals).
ndiff(&;@)
my $df = ndiff \&f;
my $df = ndiff \&f, $x;
Perform numerical differentiation using the central difference formula.
f'(a) \approx ( f(a+h) - f(a-h) ) / (2h)
If M \approx f(a) \approx f'''(c) for all c \in [a-h, a+h], then the total
error (truncation plus round-off) is on the order of:
error = M * (h^2/6 + eps/h)
where eps is the machine epsilon (eps = 2E-16 on 32-bit perl; (1 + 2E-16 != 1), however (1 + (2E-16)/2 == 1) ). Thus, error is minimized when h \approx \sqrt[3]{eps}. We choose h = 2**(-20) = 0.00000095367431640625.
Examples:
sub f { $_[0]**2 }
my $df = ndiff \&f;
printf "%.5f | %.5f\n", f($_), $df->($_) for 0..10;
say "f'(3) = ", ndiff(\&f, 3);
$df = ndiff { $_ ** 2 };
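The central difference formula can be written as a closure factory. This sketch only covers the form that returns a derivative function; ndiff_sketch and its optional $h argument are assumptions for the example.

```perl
use strict;
use warnings;

# Return a function approximating f' by the central difference formula.
sub ndiff_sketch {
    my ($f, $h) = @_;
    $h = 2 ** -20 unless defined $h;             # step size quoted in the text
    return sub {
        my $x = shift;
        return ($f->($x + $h) - $f->($x - $h)) / (2 * $h);
    };
}

my $dsquare = ndiff_sketch(sub { $_[0] ** 2 });
```

Calling $dsquare->(3) should give a value very close to 6, the exact derivative of x^2 at 3.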
Nintegrate { block } $a, $b, $n
Nintegrate \&sub, $a, $b, $n
Integrate a function between two values using a composite Simpson's rule.
The last argument $n is optional and specifies the number of intervals
to divide the region into. The default is 1000.
The function is assumed to be continuous with continuous derivatives up to
order 4. $n should be even, but we adjust it if it is not. The error is
given by,
err = -\frac{(b-a)^5}{180 n^4} f^{(4)}(x)
for some x in the interval (a,b).
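Composite Simpson's rule weights the endpoints by 1, odd interior points by 4, and even interior points by 2. A minimal sketch taking a code reference rather than a block; simpson_sketch is a hypothetical name.

```perl
use strict;
use warnings;

# Composite Simpson's rule on [$lo, $hi] with $n (even) subintervals.
sub simpson_sketch {
    my ($f, $lo, $hi, $n) = @_;
    $n = 1000 unless defined $n;
    $n++ if $n % 2;                              # force an even count
    my $h   = ($hi - $lo) / $n;
    my $sum = $f->($lo) + $f->($hi);
    for my $i (1 .. $n - 1) {
        $sum += ($i % 2 ? 4 : 2) * $f->($lo + $i * $h);
    }
    return $sum * $h / 3;
}
```

Since the error term involves the fourth derivative, the rule is exact (up to round-off) for cubics, which makes a polynomial a convenient sanity check.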
interpolating_function \%function, $message, $nowarn
Returns a perl subroutine which interpolates %function linearly using
interpolate. $message is an optional message that will
be used if an input value is given which is out of range of the
interpolator.
interpolate $x, \%function, \@keys, $message, $nowarn
Perform an interpolation of the provided function at the point $x. The
keys of the function need not be evenly spaced, the value is approximated
linearly. The last two parameters are optional, @keys is a sorted list
of the keys of the function and $message is used in the error message
that is printed if $x is out of range of the interpolator.
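Linear interpolation over unevenly spaced keys can be sketched as below. Assumptions: the function is a hash of numeric key => value pairs, out-of-range input is fatal, and interpolate_sketch is a hypothetical name that omits the optional @keys, $message, and $nowarn parameters.

```perl
use strict;
use warnings;

# Find the bracketing pair of keys and interpolate linearly between them.
sub interpolate_sketch {
    my ($x, $func) = @_;
    my @keys = sort { $a <=> $b } keys %$func;
    die "$x is out of range\n" if $x < $keys[0] or $x > $keys[-1];
    for my $i (0 .. $#keys - 1) {
        my ($x0, $x1) = @keys[$i, $i + 1];
        next if $x > $x1;
        my $t = ($x - $x0) / ($x1 - $x0);        # fraction of the way across
        return $func->{$x0} + $t * ($func->{$x1} - $func->{$x0});
    }
    return $func->{ $keys[0] };                  # single-point function
}
```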
continuous_compounding P => $P, r => $r, t => $t;
continuous_compounding A => $A, P => $P, r => $r, t => $t, solve_for => $q;
Given any three of ``A'' (Accumulated balance), ``P'' (Principal balance), ``r'' (interest Rate), and ``t'' (Time to withdrawal), this function will return the fourth. If all four values are provided (presumably one of them will be undefined or contain garbage) then you must provide a ``solve_for'' key which points to one of ``A'', ``P'', ``r'', or ``t''. All values are case insensitive.
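Assuming the standard continuous-compounding relation A = P * exp(r * t), each unknown follows by rearranging that one formula. The sketch below solves for whichever key is missing; compounding_sketch is a hypothetical name, and the ``solve_for'' key and case insensitivity are not modelled.

```perl
use strict;
use warnings;

# Solve A = P * exp(r * t) for the one key that is not supplied.
sub compounding_sketch {
    my %v = @_;
    return $v{P} * exp($v{r} * $v{t})   unless defined $v{A};
    return $v{A} / exp($v{r} * $v{t})   unless defined $v{P};
    return log($v{A} / $v{P}) / $v{t}   unless defined $v{r};
    return log($v{A} / $v{P}) / $v{r};           # remaining unknown is t
}
```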
discrete_compounding P => $P, r => $r, t => $t, n => $n;
discrete_compounding A => $A, P => $P, r => $r, t => $t, n => $n, solve_for => $q;
Given ``n'' (Number of compoundings per year) and any three of ``A'' (Accumulated balance), ``P'' (Principal balance), ``r'' (interest Rate), and ``t'' (Time to withdrawal), this function will return the fourth. If all five values are provided (presumably one of them will be undefined or contain garbage) then you must provide a ``solve_for'' key which points to one of ``A'', ``P'', ``r'', or ``t''. All values are case insensitive.
savings_plan pmt => $pmt, r => $r, t => $t, n => $n;
savings_plan A => $A, pmt => $pmt, r => $r, t => $t, n => $n, solve_for => $q;
Given ``n'' (Number of deposits per year), ``r'' (interest Rate), and any two of ``A'' (Accumulated balance), ``pmt'' (Payment amount), and ``t'' (Time to withdrawal), this function will return the third. If all five values are provided (presumably one of them will be undefined or contain garbage) then you must provide a ``solve_for'' key which points to one of ``A'', ``pmt'', ``r'', or ``t''. All values are case insensitive.
loan_payment pmt => $pmt, r => $r, t => $t, n => $n;
loan_payment L => $L, pmt => $pmt, r => $r, t => $t, n => $n, solve_for => $q;
Given ``n'' (Number of payments per year), ``r'' (interest Rate), and any two of ``L'' (Loan amount), ``pmt'' (Payment amount), and ``t'' (Time to full payback), this function will return the third. If all five values are provided (presumably one of them will be undefined or contain garbage) then you must provide a ``solve_for'' key which points to one of ``L'', ``pmt'', ``r'', or ``t''. All values are case insensitive.
union( $L1, $L2, ... )
Return the list of (string) elements which appear in any of the given arrays. Objects are stringified, and the string values are returned. This may be upgraded to be smarter someday.
intersection( $L1, $L2, ... )
Return the list of (string) elements which appear in all of the given arrays. Objects are stringified, and the string values are returned. This may be upgraded to be smarter someday.
difference( $L1, $L2, ... )
Return the list of (string) elements which appear in $L1 but not in any
of the subsequent arrays. Objects are stringified, and the string values
are returned. This may be upgraded to be smarter someday.
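All three set operations can be sketched with hashes, which is also why elements are effectively compared as strings. The function names below are hypothetical and the result order of the first two is unspecified.

```perl
use strict;
use warnings;

# Elements appearing in any list.
sub union_sketch {
    my %seen;
    $seen{$_} = 1 for map { @$_ } @_;
    return keys %seen;
}

# Elements appearing in every list.
sub intersection_sketch {
    my @lists = @_;
    my %count;
    for my $list (@lists) {
        my %in_list = map { ($_ => 1) } @$list;  # count once per list
        $count{$_}++ for keys %in_list;
    }
    return grep { $count{$_} == @lists } keys %count;
}

# Elements of the first list absent from all the others.
sub difference_sketch {
    my ($first, @others) = @_;
    my %drop = map { ($_ => 1) } map { @$_ } @others;
    return grep { !$drop{$_} } @$first;
}
```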
binary_search(&@)
binary_search { $_ > 4 } @sorted_nums;
binary_search \&f, @sorted_nums;
Implements a binary search. Second argument must be an array (not a list)
and must be sorted. Returns the index of the first element for which the
function &f returns true. Returns undef if there is no such element.
Function must return true for all elements larger than desired element. To search for a particular element, the following must be done:
my $i = binary_search { $_ >= 4 } @sorted_nums;
$i = undef unless defined $i and $sorted_nums[$i] == 4;
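Finding the first element for which a monotone test is true is the classic lower-bound form of binary search. A sketch that takes a code reference and an array reference instead of using the (&@) prototype; bsearch_sketch is a hypothetical name.

```perl
use strict;
use warnings;

# Index of the first element for which $test is true, or undef.
# Assumes $test is false for a prefix of the array and true afterwards.
sub bsearch_sketch {
    my ($test, $array) = @_;
    my ($lo, $hi) = (0, scalar @$array);         # answer lies in [$lo, $hi]
    while ($lo < $hi) {
        my $mid = int(($lo + $hi) / 2);
        local $_ = $array->[$mid];               # expose the element as $_
        if ($test->($_)) { $hi = $mid }
        else             { $lo = $mid + 1 }
    }
    return $lo < @$array ? $lo : undef;
}
```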
A ``fast, flexible, stable sort'' that sorts strings naturally (that is, numerical substrings are compared as numbers).
Code lifted from tye on perlmonks: http://www.perlmonks.org/?node_id=442285
Limitations: http://www.perlmonks.org/?node_id=483466
It doesn't "properly" sort negative numbers, non-fixed decimal values, nor integers larger than 2^32-1.
A fast, flexible, stable comparator that sorts strings naturally (that is, numerical substrings are compared as numbers).
Code lifted from tye on perlmonks: http://www.perlmonks.org/?node_id=442285
Limitations: http://www.perlmonks.org/?node_id=483466
It doesn't "properly" sort negative numbers, non-fixed decimal values, nor integers larger than 2^32-1.
cartesian \@list1, \@list2, ...
cartesian $n1, $n2, ...
Form the cartesian product of the elements in the lists. That is, all lists
of the form [ $e1, $e2, ... ] where $e1 comes from @list1, and so
on. This function returns an array reference in scalar context, and a list
in list context.
In the second form, the lists [1..$n1], [1..$n2], ... will be
constructed, and the cartesian product of those lists will be computed.
Note, however, that the two forms cannot be combined: you must provide
either only arrays or only numbers.
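The product can be built up one list at a time by extending every partial tuple with every element of the next list. A sketch of the array-reference form only; cartesian_sketch is a hypothetical name and always returns a list.

```perl
use strict;
use warnings;

# Extend each partial tuple by each element of the next list.
sub cartesian_sketch {
    my @result = ([]);                           # one empty tuple to start
    for my $list (@_) {
        @result = map {
            my $prefix = $_;
            map { [@$prefix, $_] } @$list;
        } @result;
    }
    return @result;
}
```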
transposed \@LoL
Transpose the (possibly non-regular) list of lists @LoL. Returns a new
list reference containing the objects in @LoL.
flatten @LoLoLoL
Recursively runs through each element of the input list and returns all components as a single large list. Lists may be arbitrarily nested, and any objects which are not Perl ARRAY references are considered plain elements. The expansion is done depth-first. Returns a reference in scalar context, and the list of elements in list context.
Example:
@y = flatten [1, 2, 3], [4, 5], [[6, 7], 8, 9];
say "Hooray!" if "@y" eq "1 2 3 4 5 6 7 8 9";
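Depth-first flattening is a one-line recursion over array references. A sketch of the list-context behaviour only; flatten_sketch is a hypothetical name.

```perl
use strict;
use warnings;

# Expand ARRAY references recursively; everything else is a plain element.
sub flatten_sketch {
    return map { ref($_) eq 'ARRAY' ? flatten_sketch(@$_) : $_ } @_;
}
```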
find_index \&f, \@array
find_index { BLOCK } \@array
find_index { BLOCK } \@array, $start, $stop, $step
May be called with either a function or a block as the first argument. The
function will then begin at $start (or zero) and then step by $step
(or 1) until we reach $stop (or the end of the array).
$_ will be set to the current array entry which will also be passed to
the function as its only argument. Thus you may use either $_ or
$_[0] within your function.
$start may be greater than $stop, in which case we will proceed
backwards. In all cases the sign of $step will be adjusted if necessary so
that we finish in finite time.
find_index_with_memory \&f, \@array
find_index_with_memory { BLOCK } \@array
find_index_with_memory { BLOCK } \@array, $start, $stop, $step
May be called with either a function or a block as the first argument. The
function will then begin at $start (or zero) and then step by $step
(or 1) until we reach $stop (or the end of the array).
The function will set the caller's $a to the previous array entry and
$b to the current array entry and will also pass the two entries to the
function as its only arguments. Thus you may use either $a, $b or
$_[0], $_[1] as the previous and current entries respectively.
$start may be greater than $stop, in which case we will proceed
backwards. In all cases the sign of $step will be adjusted if necessary so
that we finish in finite time.
See also: List::Util first
first \&sub, @list # if @list is not list of arrays
first { block } @list # if @list is not list of arrays
first { block } \@list
first { block } \@list, $start_pos
Return the first item of @list for which the code returns true. Code may
be either a subroutine reference or a code block. $_ will be set to each
list entry and will also be passed in as the first (and only) argument. You
may pass @list by reference (which means that you must pass it by
reference if it contains an array reference in its first entry). If you
pass @list by reference and provide a third argument, then the third
argument will be taken to be the first position that should be checked.
See also: List::MoreUtils first_index
first_pos \&sub, @list
first_pos { block } @list
first_pos { block } \@list, $start_pos
Return the index of the first item of @list for which the code returns
true. Code may be either a subroutine reference or a code block. $_ will
be set to each list entry and will also be passed in as the first (and
only) argument. You may pass @list by reference (which means that you
must pass it by reference if it contains an array reference in its first
entry). If you pass @list by reference and provide a third argument,
then the third argument will be taken to be the first position that should
be checked. In this case the returned index will still correspond correctly
to a position in @list.
my %buckets = bucketize { block } @list;
my %buckets = bucketize \&tagger, @list;
my $buckets = bucketize \&tagger, @list;
Partition items into buckets given a generic tagger. Returns hash ref in
scalar context. Tagger should accept a single argument (or use $_) and
should return a tag indicating the bucket to place the item in. Function is
called in list context so that the following works as expected:
%by_file_type = bucketize { /\.([^\.]+)$/ } @images;
Also note that values are given as bound aliases, so they can also be ``cleverly'' modified:
# ("foo-bar", "foo-baz", "bip-bop")
# becomes: ( foo => ["bar","baz"], bip => ["bop"] )
my %buckets = bucketize { s/^([^-]+)-//; $1 } @x;
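The list-context call to the tagger is what makes a bare capturing regex work as a tagger. A minimal sketch; bucketize_sketch is a hypothetical name, and skipping items whose tag is undefined is an assumption of this example rather than documented behaviour.

```perl
use strict;
use warnings;

# Group items by the tag the tagger returns for each of them.
sub bucketize_sketch(&@) {
    my $tagger = shift;
    my %buckets;
    for (@_) {
        my ($tag) = $tagger->($_);               # list context: a match yields $1
        push @{ $buckets{$tag} }, $_ if defined $tag;   # assumption: skip untagged
    }
    return wantarray ? %buckets : \%buckets;
}

my %by_ext = bucketize_sketch { /\.([^.]+)$/ } qw/a.png b.jpg c.png README/;
```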
See also: List::MoreUtils part
($true, $false) = partition { block } @list
($true, $false) = partition \&test_func, @list
Partitions a list into two lists based on the truth value of a subroutine or block. The return value is two array references, the first of which is the elements of the original list for which the function returned true, and the second are those elements for which the function returned false.
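The two-way split can be sketched as a single pass that pushes onto one of two arrays. partition_sketch is a hypothetical name and the (&@) prototype is an assumption.

```perl
use strict;
use warnings;

# Split a list by the truth value of a test; order is preserved.
sub partition_sketch(&@) {
    my $test = shift;
    my (@true, @false);
    for (@_) {
        if ($test->($_)) { push @true,  $_ }
        else             { push @false, $_ }
    }
    return (\@true, \@false);
}

my ($even, $odd) = partition_sketch { $_ % 2 == 0 } 1 .. 6;
```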
@list_2 = even_positions @list_1;
@list_2 = even_positions \@list_1;
Returns the elements of the list that have even indices. Argument may be list or arrayref, always returns a list of values.
@list_2 = odd_positions @list_1; @list_2 = odd_positions \@list_1;
Returns the elements of the list that have odd indices. Argument may be list or arrayref, always returns a list of values.
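Both position filters can be sketched with an array slice. The names below are hypothetical, and treating a lone array reference as "the list passed by reference" is an assumption about how the argument is detected.

```perl
use strict;
use warnings;

# Hypothetical helper: accept either a list or a single array reference.
sub _as_list_ref { return (@_ == 1 && ref($_[0]) eq 'ARRAY') ? $_[0] : [@_] }

# Hypothetical stand-ins for even_positions / odd_positions.
sub even_positions_sketch {
    my $list = _as_list_ref(@_);
    return @{$list}[ grep { $_ % 2 == 0 } 0 .. $#$list ];
}

sub odd_positions_sketch {
    my $list = _as_list_ref(@_);
    return @{$list}[ grep { $_ % 2 == 1 } 0 .. $#$list ];
}

my @letters = qw/a b c d e/;
print join(' ', even_positions_sketch(@letters)), "\n";     # prints "a c e"
print join(' ', odd_positions_sketch(\@letters)), "\n";     # prints "b d"
```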
suggestion_sort \@list, \@preferred
Returns @list sorted by the order of the objects in @preferred. All elements are matched as strings and elements of @list that are not in @preferred are placed at the end of the resulting list in a way that preserves their original ordering within @list.
Notes: Undefined entries will be ignored. Only the first appearance of an
element in the @preferred list will be considered. Repetitions in @list
will be reduced to a single occurrence.
See also: List::MoreUtils uniq
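The ordering rules for suggestion_sort described above can be sketched as follows. The name suggestion_sort_sketch is hypothetical and the code is only one way to satisfy the stated rules.

```perl
use strict;
use warnings;

# Hypothetical stand-in for suggestion_sort; not the module's actual code.
sub suggestion_sort_sketch {
    my ($list, $preferred) = @_;
    my (%first, %emitted);
    # Remember the first occurrence of each defined element, keyed by its string form.
    for my $x (grep { defined } @$list) {
        $first{$x} = $x unless exists $first{$x};
    }
    my @sorted;
    # Preferred elements come first, in @preferred order, each at most once.
    for my $p (grep { defined } @$preferred) {
        push @sorted, $first{$p} if exists $first{$p} && !$emitted{$p}++;
    }
    # Everything else follows in its original relative order, without repeats.
    for my $x (grep { defined } @$list) {
        push @sorted, $x unless $emitted{$x}++;
    }
    return @sorted;
}

my @sorted = suggestion_sort_sketch([qw/d a c b a/], [qw/b a z/]);
print "@sorted\n";    # prints "b a d c"
```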
my @u = unique @list; my @u = unique \@list; my $h = unique @list; my $h = unique \@list;
Takes a list (or reference to an array) and returns a list of unique (up to stringification) objects in apparently random order. In scalar context, a histogram (hash with objects as keys, and counts as values) is returned.
Note: List::MoreUtils::uniq preserves the original order of the elements.
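A minimal sketch of the two return modes of unique, using the hypothetical name unique_sketch. Detecting the by-reference form through a lone array reference is an assumption.

```perl
use strict;
use warnings;

# Hypothetical stand-in for unique; not the module's actual code.
sub unique_sketch {
    my $list = (@_ == 1 && ref($_[0]) eq 'ARRAY') ? $_[0] : [@_];
    my (%count, %object);
    for my $x (@$list) {
        # Keep the first object seen for each distinct string form.
        $object{$x} = $x unless $count{$x}++;
    }
    # List context: the unique objects, in hash (apparently random) order.
    # Scalar context: the histogram.
    return wantarray ? values %object : \%count;
}

my @u = unique_sketch(qw/b a b c a/);
my $h = unique_sketch([qw/b a b c a/]);
print join(' ', sort { $a cmp $b } @u), "\n";    # prints "a b c"
print "b was seen $h->{b} times\n";              # prints "b was seen 2 times"
```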
lex_sort @list_of_lists
lex_sort sub{ }, @list_of_lists
Sort the lists lexicographically element-wise. The sorting subroutine may
use the package variables $a and $b or may take two arguments, but
need only worry about element-wise comparison.
Example:
lex_sort( [qw/abc ac a/], [qw/abc ab c d/], [qw/x y z/], [qw/abc ab c/] )
# gives:
# ( [qw/abc ab c/],
#   [qw/abc ab c d/],
#   [qw/abc ac a/],
#   [qw/x y z/]
# )
Similarly with numerical data using: sub{ $a <=> $b }
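The element-wise comparison can be sketched as below. The name lex_sort_sketch is hypothetical, and for simplicity this sketch only supports comparison subroutines that take two arguments (it does not set $a and $b for the caller's subroutine). Breaking ties by placing the shorter list first is inferred from the example above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for lex_sort; not the module's actual code.
sub lex_sort_sketch {
    my $cmp = ref($_[0]) eq 'CODE' ? shift : sub { $_[0] cmp $_[1] };
    return sort {
        my ($x, $y) = ($a, $b);
        my $n = @$x < @$y ? @$x : @$y;    # length of the shorter list
        my $r = 0;
        for my $i (0 .. $n - 1) {
            $r = $cmp->($x->[$i], $y->[$i]) and last;   # stop at first difference
        }
        $r || @$x <=> @$y;                # equal prefix: shorter list first
    } @_;
}

my @sorted = lex_sort_sketch(
    [qw/abc ac a/], [qw/abc ab c d/], [qw/x y z/], [qw/abc ab c/],
);
print join(' | ', map { "@$_" } @sorted), "\n";
# prints "abc ab c | abc ab c d | abc ac a | x y z"
```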
Pattern which matches an integer expression. Beware, this pattern allows whitespace in the string which perl may not allow when interpreting strings as numbers. You may need to remove all whitespace from strings which match this pattern.
Pattern which matches a floating-point expression. Beware, this pattern allows whitespace in the string which perl may not allow when interpreting strings as numbers. You may need to remove all whitespace from strings which match this pattern.
Pattern which matches an exponent part (Ex: 2.3 e -10) of a floating-point expression. Beware, this pattern allows whitespace in the string which perl may not allow when interpreting strings as numbers. You may need to remove all whitespace from strings which match this pattern.
Pattern which matches safe ``word-like'' data. This pattern does not match whitespace and most punctuation but does allow hyphens ``-'' and underscores.
Returns a true value if the argument looks like an integer expression. If
no argument is provided, $_ is examined. Beware, this subroutine allows
whitespace in the string which perl may not allow when interpreting strings
as numbers. You may need to remove all whitespace from strings for which
this subroutine returns true.
Returns a true value if the argument looks like a floating-point (or
integer) expression. If no argument is provided, $_ is examined. Beware,
this subroutine allows whitespace in the string which perl may not allow
when interpreting strings as numbers. You may need to remove all whitespace
from strings for which this subroutine returns true.
Returns a true value if the argument looks like a floating-point (or
integer) expression. If no argument is provided, $_ is examined. Beware,
this subroutine allows whitespace in the string which perl may not allow
when interpreting strings as numbers. You may need to remove all whitespace
from strings for which this subroutine returns true.
Returns a true value if the argument looks like a word. If no argument is
provided, $_ is examined. Words do not have spaces and do not typically
have punctuation, though hyphens ``-'' and underscores are allowed.
Pattern which matches image-type filename extensions. The list of extensions matched (case insensitive) are:
BMP CMYK CMYKA DCM DCX DIB DPS DPX EPI EPS EPS2 EPS3 EPSF EPSI EPT FAX FITS FPX G3 GIF GIF87 GRAY ICB ICM ICO ICON IPTC JBG JBIG JP2 JPC JPEG JPG MAP MIFF MNG MONO MPC MTV MVG OTB P7 PAL PALM PBM PCD PCDS PCL PCT PCX PDB PGM PICON PICT PIX PLASMA PNG PNM PPM PSD PTIF RAS RGB RGBA RLA RLE ROSE SGI SUN SVG TGA TIF TIFF UYVY VDA VICAR VID VIFF VST WBMP X XBM XC XCF XPM XV XWD YUV
Returns a true value if the argument looks like an image file. If no
argument is provided, $_ is examined. The list of extensions matched
(case insensitive) are:
BMP CMYK CMYKA DCM DCX DIB DPS DPX EPI EPS EPS2 EPS3 EPSF EPSI EPT FAX FITS FPX G3 GIF GIF87 GRAY ICB ICM ICO ICON IPTC JBG JBIG JP2 JPC JPEG JPG MAP MIFF MNG MONO MPC MTV MVG OTB P7 PAL PALM PBM PCD PCDS PCL PCT PCX PDB PGM PICON PICT PIX PLASMA PNG PNM PPM PSD PTIF RAS RGB RGBA RLA RLE ROSE SGI SUN SVG TGA TIF TIFF UYVY VDA VICAR VID VIFF VST WBMP X XBM XC XCF XPM XV XWD YUV
Returns true if scalar argument is readonly. (Taken from Scalar::Util.)
Returns true if the object can behave like an array. (This is just a nicer way to call UNIVERSAL::isa)
Returns true if the object can behave like a hash. (This is just a nicer way to call UNIVERSAL::isa)
Returns true if the object can behave like a scalar. (This is just a nicer way to call UNIVERSAL::isa)
my $hashref = parse_user_agent( $string ); my %hash = parse_user_agent( $string );
Given a user-agent string returns a hash containing the following fields. Fields which can not be determined are left undefined.
Returns the generic operating system type: Windows, Mac, OS2, Linux, UNIX
Returns the specific operating system type: Windows Vista, Windows Server 2003, Windows XP, Windows 2000, Debian, ...
One of: browser, textbrowser, bot, downloader, mobile
Note: For this field, we try to make our best guess at which class the agent string fits into.
Quasi-canonicalized program name: Internet Explorer, Netscape, Mozilla, Firefox, wget, ...
Our best guess at the program version
The Browser's rendering engine: Gecko, KHTML, MSHTML, Presto (opera), WebCore (apple), custom (other custom engines)
The version of the rendering engine
The unmodified user-agent string
If true, the agent appears to be an obsolete web browser
Parse a string into a hash using Text::Balanced::extract_delimited. This function recognises perl 5 style hashes as well as the basic perl 6 adverbial form. Items missing a value will set the corresponding hash value to true.
Example:
str2hash 'foo, bar => "Hmmm, a comma", :baz<23>, :!bip, quxx => Spaces are fine'
Parses to:
{ foo => 1,
bar => 'Hmmm, a comma',
baz => 23,
bip => 0,
quxx => 'Spaces are fine',
}
Unfortunately, the adverbial form will behave strangely with embedded commas:
str2hash ':baz<Well, how odd>'
becomes
{ ':baz<Well' => 1,
'how odd>' => 1,
}
WARNING: still quite experimental!
unformat $fmt, @strings
unformat \%options, @strings
Attempts to reverse the actions of sprintf or other
formatted output (for instance date formats or apache logs). The return
value is a list of reports (see below) unless there was only a single input
string to parse in which case unformat may be safely called in scalar
context.
The format string
Specify how to return the findings. By default just a list of matched components is returned; however, we can also return the following reports:
A hash mapping conversions (or their corresponding names, if given) to their corresponding strings. BEWARE KEY COLLISION
{ ~conv, str, ~conv, str, ... }
The default, the return values are each an array of strings that could have been used to generate one of the input strings.
[ str, str, ... ]
Each return value is an array of two arrays the first of which is the list of strings returned by the ``list'' option. The second is the conversion instructions giving each corresponding string.
[ [ str, str, ... ], [ conv, conv, ... ] ]
Note, in this case, each list of conversions is an array reference pointing to the same array, so altering one will alter them all.
Each return value is a flat array of pairs:
[ conv, str, conv, str, ... ]
Return a regular expression that will match the given pattern. In scalar context just the regular expression is returned. In list context the conversions will be returned also.
( regex, conv, conv, ... )
Each return value is an array of arrays each with two elements. First the conversion instruction and second the string that it matched.
[ [conv, str], [conv, str], ... ]
In all cases except for the hash, the conversion instructions are the precise ones given in the format string, including any formatting options. For the hash, however, the conversions are the simplified two-character labels (e.g. ``%s'' instead of ``% 35s'').
Additionally, the escape '%%' is treated as a string literal '%' and will not appear in any of the report types. A ``formatted percent'' (for instance ``%-05%'') will pass through the conversions and will appear in the reports if you define a special conversion for it (since we define no standard conversion for this case).
A hash of aliases between conversion types. Use this to map your custom
conversion (for instance from the date formatting commands) to standard
perl conversions. Conversions of the form ( a => "s" ) will preserve
formatting options while aliases that start with '%' ( Y => "%04d" )
will use the formatting options ``04'' rather than any options that may have
appeared before the ``Y''. (Which would presumably cause ``0035'' to parse to
35.) Conversion aliases are searched before conversions or special
conversions. One can also add aliases that include the conversion options
to override other behavior ( '02Y' => '%02d', Y => 's' ).
A hash of conversions as in the conversions option but these conversions will be added to the list of standard conversions and will be consulted first should a standard conversion type appear in this listing.
A hash of conversions ( type => action ). Each ``type'' is simply the
conversion type (E.g. the ``s'' in ``%- 10s'') and each action is a pattern
that CAPTURES (preferably non-greedily) the conversion type (for instance
(s => '(.*?)')). The action could also be a subroutine which accepts
two arguments. First the formatting options and second the conversion type.
For instance, a sub action for the ``f'' conversion type might convert its
arguments (".1", "f") into the pattern '(\d+\.\d{1})'.
Be sure that all of your conversions produce a pattern that captures exactly one substring.
Specifying this option replaces the built-in conversions which attempt to reverse the standard perl conversions listed in the sprintf documentation.
If defined and a hash, then the conversions in the above reports will be transformed by this hash. Conversions will first be searched for in their full form (including formatting options), both with and without their leading '%', then searched for under only the conversion type (both with and without the '%'). Anything not appearing in the conversion map will be treated normally as described above.
Default: '(%([^a-zA-Z%]*)([%a-zA-Z]))'
Should capture three strings. The entire conversion pattern, any formatting options that may be present, and the conversion type. The default pattern captures single character conversions as well as the '%' escape (``%%''). See also the ``Limitations'' below.
Limitations: format conversions are assumed to be one character long. That is, conversions like ``%ld'' will be interpreted as ``%l''. This can be fixed by altering the conversion_pattern but I don't have the need to be careful about it. If you code up a more careful parser and are willing to share, feel free to send it and I will add it in.
Also, no locale information is considered. sprintf considers the ``LC_NUMERIC'' value to affect how numbers are formatted. We do not make such considerations here.
rtf2txt( file => $filename_or_handle ) rtf2txt( string => $rtf_text )
rtf2txt( $existing_file ) rtf2txt( $rtf_text )
nicef( $num, $digits )
Nicely formats sprintf(``%.${digits}f'', $num);
Given a string like ``4in'' or ``2ft - 7in'', return the value as a number of
points (72 points per inch). undef is returned if we can't parse the
string.
Recognized units:
pt
in, ft, mi
km, m, cm, mm, nm
my $url = uri_rel2abs( $path, $base )
Converts a path into an absolute path based at the given base unless the path is already absolute. Any file part of the base is ignored.
This subroutine should be a proper RFC 3986 URI parser, as it simply calls URI->new_abs. However, proper parsing pays a penalty in execution time. Compare the benchmarks between uri_rel2abs and uri_rel2abs_fast:
Rate URI FAST
URI 208/s -- -93%
FAST 3012/s 1350% --
my $url = uri_rel2abs_fast( $path, $base )
Converts a path into an absolute path based at the given base unless the path is already absolute. Any file part of the base is ignored.
This subroutine is not and will likely never be a reasonable implementation of a proper rfc3986 uri parser. At the moment, however, it appears to be ``good enough'' for typical web address (http, ftp, mms, ...) handling.
The uri_rel2abs function uses the URI module to properly produce an absolute uri, however at a significant speed cost.
Rate URI FAST
URI 208/s -- -93%
FAST 3012/s 1350% --
Constructs a regular expression pattern (string) that matches the same
patterns as the given glob. The pattern matches a whole string and is
anchored using ^ and $ unless the glob ends with * in which case
the trailing .*$ will be removed. Keep this in mind if you wish to
capture the pattern matched by the glob.
Current capabilities:
* match many chars; ? match one char
\** matches '\*Hello', \\\** matches "\\*Hello"
[abc] match a character, [^abc] don't match chars, {foo,bar} match options
Current restrictions:
str($)
Returns string form of argument (forces string context) if it is defined, otherwise returns the empty string.
Replaces unsightly Extended Windows characters with reasonable ASCII equivalents.
See: http://www.cs.tut.fi/~jkorpela/www/windows-chars.html
Remove all space from the provided argument. If the argument is undefined, return the empty string.
sign($)
Returns ``+'' or ``-'' depending on the sign of the argument.
nsign($)
Returns ``'' or ``-'' depending on the sign of the argument.
Replace CRLF, CR, LF with the Perl magic \n. Arguments are modified
in-place. If no arguments are provided then $_ is altered instead. Any
undefined arguments are ignored. (though canonicalize_newlines(undef)
will not alter $_).
Replace CRLF, CR, LF with the Perl magic \n. Arguments are copied before
canonicalization. If no arguments are provided then $_ is used instead.
Any undefined arguments result in undefined output values.
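The copying variant can be sketched with a single substitution. The name below is hypothetical; the alternation tries CRLF before a lone CR so that a CRLF pair becomes one newline rather than two.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the copying newline canonicalizer.
sub canonical_newlines_copy_sketch {
    my @copy = @_ ? @_ : ($_);          # fall back to $_ when called without arguments
    for my $s (@copy) {
        next unless defined $s;         # undefined in, undefined out
        $s =~ s/\x0D\x0A?|\x0A/\n/g;    # CRLF or CR (first branch), or LF
    }
    return wantarray ? @copy : $copy[0];
}

my $text  = "one\r\ntwo\rthree\n";
my $clean = canonical_newlines_copy_sketch($text);
print $clean eq "one\ntwo\nthree\n" ? "canonical\n" : "unexpected\n";
```

Because the arguments are copied first, $text itself is left unchanged.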
Transforms reasonable (case-insensitive) abbreviations (or plural forms) of ``second'', ``minute'', ``hour'', ``day'', ``week'', ``month'', ``year'' into one of these canonical forms. Whitespace and numerical values are allowed at the beginning of the string and will be ignored (and not included in the return value).
NOTE: minutes are preferred over months, thus ``m'' will return ``minute'' rather than ``month''.
qbash($)
Returns a string quoted for bash-like shells. The string must contain only
printable characters or whitespace, otherwise the subroutine will die.
The return value is an untainted string wrapped in single quotes ' that
is ready (and safe) to pass to a shell.
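The usual single-quote technique for this kind of quoting is sketched below. The name qbash_sketch is hypothetical, the character check is an approximation of "printable characters or whitespace", and this sketch does not deal with taint at all.

```perl
use strict;
use warnings;

# Hypothetical stand-in for qbash; not the module's actual code.
sub qbash_sketch {
    my ($s) = @_;
    die "string contains unprintable characters\n"
        unless $s =~ /\A[[:print:]\s]*\z/;
    # Inside single quotes the shell treats everything literally, so an
    # embedded quote is written as: close quote, escaped quote, reopen quote.
    $s =~ s/'/'\\''/g;
    return "'$s'";
}

print qbash_sketch("it's here"), "\n";    # prints 'it'\''s here'
```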
stringify( $thing, %options )
Stringifies Perl objects (SCALAR, HASH, or ARRAY based). Stringifies only a single object at a time, and accepts the options below. Note: CODE, GLOB, LVALUE, and Regexp references are not supported.
By default, overloaded stringification will be respected. Set this option to true to stringify the underlying object rather than use its overload function.
List which describes how lists are translated.
DEFAULT: [ "[", ",", "]" ]
List which describes how hashes are translated.
DEFAULT: [ "{", "=>", ",", "}" ]
simple_range2list @ranges
Expand ``#,#..#,#-#,a..z,a-z,2:23,2:5:23,a:5:zz'' strings to lists. Beginning and
ending blocks may be anything matching [\w\.]+, though I'm not sure how
well underscores will behave. Commas may separate multiple range chunks.
A plain value v (numerical or non-numerical) will produce the range
1..v or 'a'..v.
If no step size is given, the standard perl .. is used to expand the range.
Ranges with step sizes are incremented by the step size (may only be decimal valued if both start and end values are numerical) until the value exceeds the right hand value.
canonicalize_filename $f; $new = canonicalize_filename $f; canonicalize_filename $f, %options;
Removes anything too exotic from the file name $f. In void context,
$f is modified, otherwise, $f is left unaltered and the modified file
name is returned. In all cases the canonicalized name will be untainted.
The following options will affect the behavior of this subroutine. The
default values are shown:
If a string value, invalid characters will be replaced with this value. If a hash reference then characters will be replaced by their corresponding values. Any values not present in the replacement hash will be replaced with the value in the 'DEFAULT' key (if present) or the empty string.
Must be one of 'print', 'basic', or a pattern matching A SINGLE legal character. The 'print' class will allow just about anything through that is not a control character including unicode characters and punctuation if your perl supports that. The 'basic' class should only allow characters that do not require escaping or quoting in a linux shell (currently allows: \w-+.~%).
If true, subdirectory separators will be allowed (uses File::Spec to determine volume and directory separators for your system).
If false, each invalid character will be replaced separately. If the value is 'like' then, repeated illegal values are replaced by only a single replacement value. If the value is any true value other than 'dwim' then, consecutive illegal values (even if they do not match) will be replaced with the replacement value for the first illegal character in the substring. Finally, if the value is 'dwim' then a replacement hash will cause the ``like'' behavior and a replacement string will result in ``true'' behavior.
Example:
%replace = ( replacement => { ':' => "-", " " => "+" } );
# 'dwim' default using replacement hash: gives "foo-+bar"
canonicalize_filename( "foo: bar", allow => 'basic', %replace );
# 'dwim' default using replacement string: gives "foo-bar"
canonicalize_filename( "foo: bar", allow => 'basic', replacement => "-" );
Trim leading/trailing whitespace. Trims $_ if no arguments provided. In
void context, the arguments are altered, otherwise they are not changed and
the trimmed values are returned.
Simply calls: DateTime->now(time_zone => ``local'');
This exists because I always forget how to properly get a current DateTime object.
Behaves like localtime in scalar context, but returns the date as ``YYYY-MM-DD''. Returns the components of that string in list context.
Behaves like localtime in scalar context, but returns the date as ``YYYY-MM-DD HH:MM:SS''. Returns the components of that string in list context. Hours are presented in 24 hour format.
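Both date helpers can be approximated with POSIX::strftime. The subroutine names below are hypothetical (the real names are not shown in this section), and accepting an optional epoch argument is my reading of "behaves like localtime".

```perl
use strict;
use warnings;
use POSIX qw/strftime/;

# Hypothetical names; a sketch of the described behavior only.
sub ymd_sketch {
    my @t   = localtime(@_ ? $_[0] : time);
    my $str = strftime('%Y-%m-%d', @t);
    return wantarray ? split(/-/, $str) : $str;
}

sub ymd_hms_sketch {
    my @t   = localtime(@_ ? $_[0] : time);
    my $str = strftime('%Y-%m-%d %H:%M:%S', @t);    # %H is the 24 hour clock
    return wantarray ? split(/[- :]/, $str) : $str;
}

my $today = ymd_sketch();
my ($year, $month, $day) = ymd_sketch();
print "$today is year $year, month $month, day $day\n";
```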
seconds2human( seconds, start-unit, end-unit )
Convert an arbitrary number of seconds to a ``nice'' human-readable form. The
second and third arguments are optional and specify the first and last time
units presented (note specifying a start unit rounds the precision of your
result to the given unit). The resulting data are separated by the value of
$". Units available are: seconds, minutes, hours, days, months, and
years. If the input seconds include a decimal portion, then the seconds
value will be rounded to three places using the format "%.3f".
Example:
seconds2human 99999999, 'd', 'mos.' # gives: "38 months 17 days"
local $" = ', '; seconds2human 99999999, 'm', 'hour' # gives: "27777 hours, 46 minutes"
seconds2hms $sec seconds2hms $sec, $sep
Convert an arbitrary number of seconds to a ``hh:mm:ss'' string. The ``hh'' portion of the string will always be at least two digits long (but may be more if more than 99 hours are represented by the given number of seconds).
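A minimal sketch of the conversion, using the hypothetical name seconds2hms_sketch. The default separator ``:'' is an assumption based on the ``hh:mm:ss'' form above, and the sketch expects a non-negative whole number of seconds.

```perl
use strict;
use warnings;

# Hypothetical stand-in for seconds2hms; not the module's actual code.
sub seconds2hms_sketch {
    my ($sec, $sep) = @_;
    $sep = ':' unless defined $sep;      # assumed default separator
    my $h = int($sec / 3600);
    my $m = int(($sec % 3600) / 60);
    my $s = $sec % 60;
    # %02d pads to at least two digits but never truncates larger hour counts.
    return sprintf '%02d%s%02d%s%02d', $h, $sep, $m, $sep, $s;
}

print seconds2hms_sketch(3725), "\n";           # prints "01:02:05"
print seconds2hms_sketch(360000), "\n";         # prints "100:00:00"
print seconds2hms_sketch(3725, '.'), "\n";      # prints "01.02.05"
```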
seconds2time $sec seconds2time $sec, $pad seconds2time $sec, %options
Convert a number of seconds (from 0 to 86400) to a ``h:mm AM/PM'' string. If
a second $pad parameter is given, that symbol will be used to force the
hour portion to be precisely 2 characters wide (typical values are 0 and
`` ''). You may also fully specify ``pad'', ``AM'', ``PM'', and ``sep''
(separator, default ``:'') options. The AM and PM strings should include a
leading space if you want it.
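The three calling forms can be sketched as below. The name seconds2time_sketch is hypothetical, and the default ``AM'' and ``PM'' strings (with a leading space) and the empty default pad are assumptions.

```perl
use strict;
use warnings;

# Hypothetical stand-in for seconds2time; not the module's actual code.
sub seconds2time_sketch {
    my $sec = shift;
    my %o = (pad => '', AM => ' AM', PM => ' PM', sep => ':');   # assumed defaults
    if (@_ == 1) { $o{pad} = shift }        # seconds2time $sec, $pad
    else         { %o = (%o, @_) }          # seconds2time $sec, %options
    my $h24  = int($sec / 3600) % 24;       # 86400 wraps around to midnight
    my $min  = int(($sec % 3600) / 60);
    my $h12  = $h24 % 12 || 12;             # 0 and 12 both display as 12
    my $hour = ($h12 < 10 ? $o{pad} : '') . $h12;
    return $hour . $o{sep} . sprintf('%02d', $min) . ($h24 < 12 ? $o{AM} : $o{PM});
}

print seconds2time_sketch(48600), "\n";                            # prints "1:30 PM"
print seconds2time_sketch(48600, 0), "\n";                         # prints "01:30 PM"
print seconds2time_sketch(300, sep => 'h', AM => 'a'), "\n";       # prints "12h05a"
```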
Converts a human-written string of a timespan expressed in various abbreviations of seconds, minutes, hours, days, weeks, months, and years into an integer representing the same time span in seconds.
Subroutine dies if it is incapable of parsing the input string.
Examples:
human2seconds "3 dys. 2hr 15m"  # 267300
human2seconds "3q 2wk"          # dies: doesn't recognise 3q
A hash containing mappings between various months and abbreviations to their full month names (all keys are lowercase):
month => Month mon => Month mon. => Month ## => Month # => Month
Also includes 4 letter keys for September.
A hash containing mappings between various months and abbreviations to their two digit month numbers (all keys are lowercase):
month => ## mon => ## mon. => ## # => ##
Also includes 4 letter keys for September.
Given a list of sizes (possibly negative) converts each entry to its corresponding number of bytes, sums the values and then converts the result back to a human readable size. Prefixes are computed base 2 (K = 1024, M = 1048576, ...).
Example:
print size_sum qw/1.5MB -650kB -1253kB/;
DEPRECATED: size_sum now uses MB and MiB
Given a list of sizes (possibly negative) converts each entry to its corresponding number of bytes, sums the values and then converts the result back to a human readable size. Prefixes are treated as standard SI prefixes (K = 1000, M = 1000000, ...).
Example:
print size_sum_SI qw/1.5MB -650kB -1253kB/;
Given a string like ``4MB'' or ``3TiB - 400G'', return the value as a number of
bytes. undef is returned if we can't parse the string. Prefixes are
computed base 2 (Ki = 1024, Mi = 1048576, ...) or using standard SI
prefixes (K = 1000, M = 1000000).
Given a string like ``4MB'' or ``3TB - 400G'', return the value as a number of
bytes. undef is returned if we can't parse the string. Prefixes are
computed base 2 (K = 1024, M = 1048576, ...).
DEPRECATED: size2bytes now uses MB and MiB
Given a string like ``4MB'' or ``3TB - 400G'', return the value as a number of
bytes. undef is returned if we can't parse the string. Prefixes are
treated as standard SI prefixes (K = 1000, M = 1000000, ...).
Print a human-readable string of the form 20.4MiB from the corresponding number of bytes (an integer). An optional second parameter specifies the minimal digits of accuracy, which is 3 by default (for example, 1.21 but 12.1). An optional third parameter specifies the minimum number of digits after the decimal place to keep, which is 1 by default. Prefixes are computed using base 2 (Ki = 1024, Mi = 1048576, ...).
DEPRECATED: bytes2size now emits KiB, MiB, ...
Print a human-readable string of the form 20.4MB from the corresponding number of bytes (an integer). An optional second parameter specifies the minimal digits of accuracy, which is 3 by default (for example, 1.21 but 12.1). An optional third parameter specifies the minimum number of digits after the decimal place to keep, which is 1 by default. Prefixes are treated as standard SI prefixes (K = 1000, M = 1000000, ...).
Read only filehandle
my $fh = rofh $filename; my $fh = rofh \$mode, $filename;
Simply performs an open or croak with an appropriate message. If a string
reference $mode is provided as a first argument it will be taken as the
file mode (the default is ``<'').
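The ``open or croak'' pattern with an optional mode reference can be sketched as follows. The name rofh_sketch and the wording of the error message are hypothetical; the write-only and read-write variants would differ only in the default mode.

```perl
use strict;
use warnings;
use Carp;
use File::Temp qw/tempfile/;

# Hypothetical stand-in for rofh; not the module's actual code.
sub rofh_sketch {
    # A leading scalar reference carries the file mode; otherwise read-only.
    my $mode = ref($_[0]) eq 'SCALAR' ? ${ shift() } : '<';
    my $file = shift;
    open my $fh, $mode, $file
        or croak "Can't open '$file' with mode '$mode': $!";
    return $fh;
}

# Demonstration on a temporary file.
my ($tmp, $name) = tempfile(UNLINK => 1);
print {$tmp} "hello\n";
close $tmp;

my $fh   = rofh_sketch(\"<:encoding(UTF-8)", $name);
my $line = <$fh>;
print $line;    # prints "hello"
```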
Write only filehandle
my $fh = wofh $filename; my $fh = wofh \$mode, $filename;
Simply performs an open or croak with an appropriate message. If a string
reference $mode is provided as a first argument it will be taken as the
file mode (the default is ``>'').
Read-write filehandle
my $fh = rwfh $filename; my $fh = rwfh \$mode, $filename;
Simply performs an open or croak with an appropriate message. If a string
reference $mode is provided as a first argument it will be taken as the
file mode (the default is ``+<'').
Read only compressed filehandle
my $fh = rofhz $filename; my $fh = rofhz \$mode, $filename;
Simply performs an open or croak with an appropriate message. Requires perl
compiled with PerlIO support (perl 5.8, I believe). The gzip PerlIO layer
is loaded with the autopop option so that uncompressed files can be opened
using this function. If a string reference $mode is provided as a first
argument it will be taken as the file mode (the default is
``<:gzip(autopop)'').
Note: To properly decode UTF-8 files use the mode ``<:gzip(autopop):encoding(UTF-8)''
Write only compressed filehandle
my $fh = wofhz $filename; my $fh = wofhz \$mode, $filename;
Simply performs an open or croak with an appropriate message. Requires perl
compiled with PerlIO support (perl 5.8, I believe). If a string reference
$mode is provided as a first argument it will be taken as the file mode
(the default is ``>:gzip:encoding(UTF-8)'').
Note: To properly encode UTF-8 files use the mode ``>:gzip:encoding(UTF-8)''
Read-write compressed filehandle
my $fh = rwfhz $filename; my $fh = rwfhz \$mode, $filename;
Simply performs an open or croak with an appropriate message. Requires perl
compiled with PerlIO support (perl 5.8, I believe). The gzip PerlIO layer
is loaded with the autopop option so that uncompressed files can be opened
using this function. If a string reference $mode is provided as a first
argument it will be taken as the file mode (the default is
``+<:gzip(autopop)'').
Note: To properly decode UTF-8 files use the mode ``+<:gzip(autopop):encoding(UTF-8)''
touch @files; touch \MODE @files;
Create files using optional numeric mode (e.g: touch \0700, ``foo''). If files exist, their atime and mtime will be updated to the current time.
Like canonpath command in File::Spec, but only works on unix filesystems (also cygwin if $^O eq 'cygwin'). However, it will clean up ``/../'' components whereas File::Spec->canonpath will not.
The code has been modified from File::Spec::Unix::canonpath in the PathTools package by Ken Williams.
my @foos = fmap { s/^FOO: (.*)/$_Util::fmap::file: '$1' line $./ } @files
my @foos = fmap { s/^FOO: (.*)/$_Util::fmap::file: '$1' line $./ } \%options, @files
Transform files. Loop through the lines of each file and apply a function.
Replace each line with the new value of $_. The current file name will
be available in the variable $_Util::fmap::file and will be one of the
entries in the file list given to the subroutine. Of course, the standard
perl variable $. ($INPUT_LINE_NUMBER when use English; is in
effect) will be available for your use as well.
In scalar or list context returns a hashref (or hash) of (filename =>
[ new contents ]) pairs. The values are arrayrefs containing the modified
lines of each file.
In void context, alters files in-place, just like using perl -pi -e from
the command line.
File mode when reading the file (the default is simply ``<'').
File mode when writing the file (the default is simply ``>'').
If a single-character string (e.g., '~') or a string starting with a leading dot (e.g., '.bak'), it is appended to the filename as a backup suffix. Otherwise it is treated as the backup file name (e.g., 'old_foo'). The default is '~'.
my @foos = fgrep { s/^FOO: (.*)/$_Util::fgrep::file: '$1' line $./ } @files
my @foos = fgrep { s/^FOO: (.*)/$_Util::fgrep::file: '$1' line $./ } \"<:encoding(UTF-8)", @files
Grep files. Loop through the lines of each file and apply a function. If
the function returns a true value then $_ (after the function
application) will be appended to a list to be returned. The current file
name will be available in the variable $_Util::fgrep::file and will be
one of the entries in the file list given to the subroutine. Of course, the
standard perl variable $. ($INPUT_LINE_NUMBER when use English; is
in effect) will be available for your use as well.
In scalar context just the number of matches will be returned.
NOTE: If you want to chomp your lines note that the last line of a file may
not contain a newline (or whatever $/ is) so use something like
either of the following:
my @foos = fgrep { chomp; /^FOO/ } @files;
my @foos = fgrep { /^FOO/ and chomp || 1 } @files;
If a string reference $mode is provided as the first argument after the
subroutine block it will be taken as the file mode (the default is simply
``<'').
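The loop described above can be sketched as follows. The name fgrep_sketch is hypothetical, and for brevity this sketch does not set the $_Util::fgrep::file variable; $. is still available inside the block because it refers to the last filehandle read.

```perl
use strict;
use warnings;
use File::Temp qw/tempfile/;

# Hypothetical stand-in for fgrep; not the module's actual code.
sub fgrep_sketch (&@) {
    my $code = shift;
    my $mode = ref($_[0]) eq 'SCALAR' ? ${ shift() } : '<';
    my @matches;
    for my $file (@_) {
        open my $fh, $mode, $file or die "Can't open '$file': $!";
        local $_;
        while (<$fh>) {
            # Keep $_ as it is AFTER the block ran, so chomp or s/// take effect.
            push @matches, $_ if $code->();
        }
        close $fh;
    }
    return wantarray ? @matches : scalar @matches;
}

# Demonstration: the last line deliberately has no trailing newline.
my ($tmp, $name) = tempfile(UNLINK => 1);
print {$tmp} "FOO: one\nbar\nFOO: two";
close $tmp;

my @foos  = fgrep_sketch { chomp; /^FOO/ } $name;
my $count = fgrep_sketch { /^FOO/ } \"<:encoding(UTF-8)", $name;
print "$count matches: ", join(' / ', @foos), "\n";
# prints "2 matches: FOO: one / FOO: two"
```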
#XXX: BUGS!
Currently not entirely correct but getting better. Known bugs:
* -mindepth available but broken
* not thoroughly tested given its complexity
my @files = find [ '/' ], qw/-type f -name *.pm/;
File::Find using find(1) semantics. Currently supported find options are
given below (descriptions taken from find(1)). Unlike find, this
subroutine defaults to returning the list of matches rather than defaulting
to the -print action. Tests are performed in the order specified so a
failure early on will prevent further tests/actions from being performed.
Note: this function will never be a full find2perl replacement.
Process each directory's contents before the directory itself.
Dereference symbolic links. This is the option that most closely follows find(1)'s behavior but is not a perfect match. In particular, a symbolic link which (if followed) would actually result in a circular reference will be processed by find(1), but not by this function.
NOTE: This option corresponds to the follow_fast option to File::Find
Dereference symbolic links. Circular references (as well as links that would cause a circular reference) will be automatically removed (symbolic links will only appear if the ``real'' file would not have been found otherwise). Dangling symbolic links will be ignored.
NOTE: This option corresponds to the follow option to File::Find
Descend at most levels (a non-negative integer) levels of directories below the command line arguments. '-maxdepth 0' means only apply the tests and actions to the command line arguments.
Disable ``Permission denied'' warnings for unreadable directories.
Tests
Like -name, but the match is case insensitive. For example, the patterns 'fo*' and 'F??' match the file names 'Foo', 'FOO', 'foo', 'fOo', etc.
Like -regex, but the match is case insensitive.
Base of file name (the path with the leading directories removed) matches shell pattern pattern. The metacharacters ('*', '?', and '[]') do not match a '.' at the start of the base name.
File name matches regular expression pattern. This is a match on the whole path, not a search. For example, to match a file named './fubar3', you can use the regular expression '.*bar.' or '.*b.*3', but not 'b.*r3'.
File is of type ``char'':
b block (buffered) special
c character (unbuffered) special
d directory
p named pipe (FIFO)
f regular file
l symbolic link
s socket
D door (Solaris)
Actions
Execute subroutine. The subroutine is executed in the directory containing the file and is passed three parameters: the file's name, the current directory (relative to the starting directory), and the file's full path (relative to the starting directory). If the ``-follow'' option is provided then the ``true'' filename (all symbolic links resolved) will be provided as a fourth argument.
Print the full file name on the standard output, followed by a null character. This allows file names that contain new-lines to be correctly interpreted by programs that process the find output.
Print the full file name on the standard output, followed by a newline.
Discard and prune any files for which any test fails.
Discard and prune any hidden files. At the moment this means anything starting with '.' since I don't know how to detect ``hidden'' files on any systems other than linux.
Like -prune_name, but the match is case insensitive. For example, the patterns 'fo*' and 'F??' match the file names 'Foo', 'FOO', 'foo', 'fOo', etc.
Discard and prune any files where base of file name (the path with the leading directories removed) matches shell pattern pattern. The metacharacters ('*', '?', and '[]') do not match a '.' at the start of the base name.
Discard and prune any files for which an -exec clause returns false.
Discard and prune any files or directories that look like they belong to a revision control system. At the moment this means any directories named: ``.svn'', ``CVS'', ``blib'', ``{arch}'', ``.bzr'', ``_darcs'', ``RCS'', ``SCCS'', ``.git'', ``.pc''
Discard and prune any files or directories that look like backups. This includes anything ending in ``~'' or ``.bak'', matching ``#*#'', or ending in ``.tmp'' or matching ``.tmp-[_a-zA-Z0-9]+''
Discard and prune any names matching the regular expression pattern. This is a match on the whole path, not a search. For example, to match a file named './fubar3', you can use the regular expression '.*bar.' or '.*b.*3', but not 'b.*r3'.
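The whole-path rule for the regular expression tests and actions above can be illustrated with a short sketch. The whole_match helper is hypothetical (it is not part of Dean::Util); it simply anchors the pattern at both ends of the path.

```perl
use strict;
use warnings;

# Hypothetical helper: true only if $pattern matches the entire $path,
# mimicking the "match on the whole path, not a search" rule.
sub whole_match {
    my ($pattern, $path) = @_;
    return $path =~ /\A(?:$pattern)\z/ ? 1 : 0;
}

for my $pattern ('.*bar.', '.*b.*3', 'b.*r3') {
    printf "%-8s %s\n", $pattern,
        whole_match($pattern, './fubar3') ? 'matches' : 'does not match';
}
```

The last pattern would succeed as an unanchored search, but fails here because it cannot consume the leading './fu'.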
Main Limitations:
No grouping via (), no -or.
Returns true if first file is newer than second file. Also returns true if first file exists but second does not.
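A minimal sketch of such a comparison, assuming modification times are what is compared and that a missing first file is never newer (neither detail is stated above). The name is_newer is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the comparison described above.
sub is_newer {
    my ($first, $second) = @_;
    return 0 unless -e $first;     # assumption: a missing first file is never newer
    return 1 unless -e $second;    # first exists but second does not
    # Element 9 of stat() is the modification time.
    return (stat $first)[9] > (stat $second)[9] ? 1 : 0;
}
```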
my $line = lastline $file;
my $line = lastline "<:encoding(UTF-8)", $file;
Returns the last line of a file. Currently this iterates through each line of the file since I don't think that there is a better way to do it.
By default the input will not be decoded. Either provide an initial scalar reference containing the file mode (with proper encoding, for example \"<:encoding(UTF-8)") or decode the returned string before using it.
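The line-by-line approach and the optional mode reference can be sketched as follows. This is an illustration only, not the module's code; the name last_line_of is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical re-implementation: read every line and keep the last one.
# An optional leading scalar reference supplies the open mode.
sub last_line_of {
    my $mode = ref($_[0]) eq 'SCALAR' ? ${ shift() } : '<';
    my ($file) = @_;
    open my $fh, $mode, $file or die "Can't open $file: $!";
    my $last;
    while (my $line = <$fh>) {
        $last = $line;
    }
    close $fh;
    return $last;    # undef for an empty file
}
```

With this sketch, last_line_of(\"<:encoding(UTF-8)", $file) returns a decoded string, while last_line_of($file) returns raw octets.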
fprint $filename, @stuff
fprint \$mode, $filename, @stuff
Prints stuff to the indicated filename. If a mode is provided (for example,
\">:encoding(UTF-8)") then it will be used instead of the default
mode (``>'').
fprint_bu $filename, @stuff
fprint_bu \$mode, $filename, @stuff
Prints stuff to the indicated filename, but first backs up the existing file (by
appending a ~ to its name). If a mode is provided (for example, \">:encoding(UTF-8)")
then it will be used instead of the default mode (``>'').
fappend $filename, @stuff
fappend \$mode, $filename, @stuff
Append stuff to the indicated filename. If a mode is provided (for example,
\">>:encoding(UTF-8)") then it will be used instead of the
default mode (``>>'').
fincrement $filename
fincrement $filename, $amount
fincrement $filename, pre => $pre, post => $post, layers => $perlio_layers
fincrement $filename, $amount, pre => $pre, post => $post
Increments the number contained in $filename. On success, the new value
is returned (Note: may be zero if $filename contained ``-1''). On failure,
undef is returned.
The amount to add to the file's value may be provided. If it is missing,
then a value of one is assumed. The optional parameters $pre and
$post specify strings to print to the file before and after the number.
These strings default to the empty string and a single newline
respectively.
Note: $filename must contain only a number (with possible whitespace),
or must exactly contain the concatenation of $pre, number, and $post.
If $filename does not exist, then it will be initialized to ``0''.
The ``layers'' option can be used to set the PerlIO layers for the opened files (for example layers => ``:encoding(UTF-8)''). By default, no layers are applied.
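The rules above can be sketched roughly as below. This is not the module's implementation: the name increment_file is hypothetical, only integers are handled, the ``layers'' option is omitted, and a missing file is assumed to count as 0 before the increment is applied.

```perl
use strict;
use warnings;

# Hypothetical sketch of the increment logic described above.
sub increment_file {
    my $file   = shift;
    my $amount = @_ % 2 ? shift(@_) : 1;    # odd argument count => leading $amount
    my %opt    = (pre => '', post => "\n", @_);

    my $value = 0;                           # assumption: a missing file starts at 0
    if (-e $file) {
        open my $in, '<', $file or return;
        my $text = do { local $/; <$in> };
        close $in;
        $text = '' unless defined $text;
        # Accept either a bare number or exactly $pre . number . $post.
        if (   $text =~ /\A\s*(-?\d+)\s*\z/
            or $text =~ /\A\Q$opt{pre}\E(-?\d+)\Q$opt{post}\E\z/)
        {
            $value = $1;
        }
        else {
            return;                          # unexpected contents: fail
        }
    }

    $value += $amount;
    open my $out, '>', $file or return;
    print {$out} $opt{pre}, $value, $opt{post};
    close $out or return;
    return $value;
}
```

Because a successful call may legitimately return 0, callers of such a function should test the result with defined rather than for truth.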
my $stuff = cat $file;
my $stuff = cat \$mode, $file;
Read in the entirety of a file. If requested in list context, the lines are
returned. In scalar context, the file is returned as one large string. If a
string reference $mode is provided as a first argument it will be taken
as the file mode (the default is ``<'').
Read in the entirety of a binary file. If requested in list context, the lines are returned. In scalar context, the file is returned as one large string.
bu_open $file
bu_open $fh, $file
bu_open $fh, $file, "$file.bak"
bu_open \$mode, $file
bu_open \$mode, $fh, $file
bu_open \$mode, $fh, $file, "$file.bak"
($writer, $reader) = bu_open \$mode, $file
Backup and open. The general idea is, if the file exists, rename it by appending a ``~'' to its name, then open the original name in write mode. This sub croaks if any operation fails. The backup file is created new so that the inode of the original file does not change.
If only a single string variable argument is given and the function is called in void context, then the requested file is backed up and opened, ``upgrading'' the given argument to a filehandle. Example:
$file = "foo";
bu_open $file;   # Note: bu_open "foo"; would be a fatal error
print $file "Bar\n";
In scalar context, $file is unchanged and a write-only filehandle is returned.
In list context, filehandles for both the new file (write only) and the backup (read only) are returned.
If a mode is provided as a SCALAR reference (for example, \">:encoding(UTF-8)")
then it will be used instead of the default mode (``>'').
If two arguments are given, the first will be used to store the newly opened filehandle, and the second should hold the file name.
Finally, the final argument (if provided) will be used for the backup file
(rather than the $file argument with a ``~'' appended).
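The backup-then-open idea can be sketched as below. This is a simplified illustration, not the module's code: the name backup_and_open is hypothetical, the in-place ``upgrade'' of the argument and the mode reference are omitted, and copying (rather than renaming) is an assumption made so that the original file keeps its inode.

```perl
use strict;
use warnings;
use Carp qw(croak);
use File::Copy qw(copy);

# Hypothetical helper: returns a write handle in scalar context, or a
# write handle plus a read handle on the backup in list context.
sub backup_and_open {
    my ($file, $backup) = @_;
    $backup = "$file~" unless defined $backup;

    my $reader;
    if (-e $file) {
        # Assumption: copy instead of rename, so the original inode survives.
        copy($file, $backup) or croak "Can't back up $file: $!";
        open $reader, '<', $backup or croak "Can't read $backup: $!";
    }

    open my $writer, '>', $file or croak "Can't write $file: $!";
    return wantarray ? ($writer, $reader) : $writer;
}
```

A typical use is a filter that rewrites a file in place: read the old contents from the second handle and print the transformed contents to the first.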
Calls the File::Spec catfile and canonpath methods.
Unnecessary! Use Cwd::realpath instead.
my $results = safe_pipe [ 'command', 'arg' ], @input;
my @results = safe_pipe [ 'command', 'arg' ], @input;
Pipe data to a shell command safely (without touching a command line) and retrieve the results. Notably, this is the situation that the IPC::Open2 documentation warns is dangerous with open2 (it may block forever).
Code from merlyn:
http://www.perlmonks.org/index.pl?node_id=339092
Note: Input and output will not be encoded or decoded, and thus should be octets.
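The double-fork technique behind this kind of pipe can be sketched as follows. This is written from the description above rather than copied from the referenced node, and the name pipe_through is hypothetical. The parent only reads, a grandchild only writes, and the command itself runs via a list-form exec so that no shell is involved.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical sketch: run @$cmd without a shell, feed it @input on STDIN,
# and return everything it writes to STDOUT.
sub pipe_through {
    my ($cmd, @input) = @_;

    # First fork: the parent reads whatever the child writes to STDOUT.
    my $pid = open my $result, '-|';
    die "Can't fork: $!" unless defined $pid;

    if (!$pid) {
        # Second fork: the grandchild's STDOUT becomes this child's STDIN.
        my $feeder = open STDIN, '-|';
        die "Can't fork: $!" unless defined $feeder;
        if (!$feeder) {
            print STDOUT @input;    # grandchild: feed the input, then leave
            close STDOUT;
            POSIX::_exit(0);
        }
        # Child: replace ourselves with the command (list form, no shell).
        exec { $cmd->[0] } @$cmd or POSIX::_exit(127);
    }

    my @output = <$result>;
    close $result;
    return wantarray ? @output : join '', @output;
}
```

Since writing and reading happen in separate processes, neither side can fill a pipe buffer while the other is blocked, which is the deadlock that a single process driving both ends risks.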
NOCOLOR(__PACKAGE__) if !$opt{color};
NOCOLOR() if !$opt{color};
Replaces subroutines and package variables whose name matches one of the names in the :color_subs or :color_strings export tags with inert versions which do not insert any color sequences. Subroutines are replaced by the identity function and strings are replaced with the empty string. The default package is the caller's current package.
WARNING: This subroutine has no good way of knowing that the subroutines
and variables that it finds are really color subroutines and variables. It
does however check that subroutines have a '$' prototype and it only has
access to package variables (those not declared by my). This combined
with the fact that there are only so many things that a function called
``BLUE'' could reasonably do means that this should not generally be a
problem.
SUBS affected:
BOLD UNDERLINE DARK BLINK REVERSE CONCEALED STRIKE BLACK RED GREEN YELLOW BLUE MAGENTA CYAN WHITE GREY GRAY BRIGHT_RED BRIGHT_GREEN BRIGHT_YELLOW BRIGHT_BLUE BRIGHT_MAGENTA BRIGHT_CYAN ON_BLACK ON_RED ON_GREEN ON_YELLOW ON_BLUE ON_MAGENTA ON_CYAN ON_WHITE ON_GREY ON_GRAY ON_BRIGHT_RED ON_BRIGHT_GREEN ON_BRIGHT_YELLOW ON_BRIGHT_BLUE ON_BRIGHT_MAGENTA ON_BRIGHT_CYAN
SCALARS affected:
$BOLD $BOLD_OFF $UNDERLINE $UNDERLINE_OFF $DARK $DARK_OFF $BLINK $BLINK_OFF $REVERSE $REVERSE_OFF $CONCEALED $CONCEALED_OFF $STRIKE $STRIKE_OFF $NORMAL $DEFAULT_FG $DEFAULT_BG $BLACK $RED $GREEN $YELLOW $BLUE $MAGENTA $CYAN $WHITE $GREY $GRAY $BRIGHT_RED $BRIGHT_GREEN $BRIGHT_YELLOW $BRIGHT_BLUE $BRIGHT_MAGENTA $BRIGHT_CYAN $ON_BLACK $ON_RED $ON_GREEN $ON_YELLOW $ON_BLUE $ON_MAGENTA $ON_CYAN $ON_WHITE $ON_GREY $ON_GRAY $ON_BRIGHT_RED $ON_BRIGHT_GREEN $ON_BRIGHT_YELLOW $ON_BRIGHT_BLUE $ON_BRIGHT_MAGENTA $ON_BRIGHT_CYAN
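The replacement mechanism, including the '$' prototype check, can be sketched with symbol-table manipulation. This is an illustration only: the Demo package, its BLUE sub, and the disable_color_subs name are all hypothetical, and package variables are not handled.

```perl
use strict;
use warnings;

{
    package Demo;
    # A stand-in color sub with the '$' prototype mentioned above.
    sub BLUE($) { return "\e[34m$_[0]\e[0m" }
}

# Hypothetical sketch: swap each named sub in $package for the identity
# function, but only if it exists and has a '$' prototype.
sub disable_color_subs {
    my ($package, @names) = @_;
    no strict 'refs';
    no warnings qw(redefine prototype);
    for my $name (@names) {
        my $code = *{"${package}::$name"}{CODE} or next;
        my $proto = prototype $code;
        next unless defined $proto && $proto eq '$';
        *{"${package}::$name"} = sub ($) { $_[0] };
    }
    return;
}

disable_color_subs('Demo', qw(BLUE RED));    # RED does not exist and is skipped
print Demo::BLUE("plain"), "\n";
```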
my $rgb = hsl2rgb( $H, $S, $L );
my @colors = hsl2rgb( @hsl_colors );
Convert HSL colors (triples from 0 to 1) to RGB colors (triples from 0 to 255).
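For reference, the conventional HSL to RGB formula looks like the sketch below. The name hsl_to_rgb is hypothetical, it converts a single triple, and rounding to the nearest integer is an assumption; the module's own rounding may differ.

```perl
use strict;
use warnings;

# Hypothetical helper: one HSL triple (components in 0..1) to one RGB
# triple (components in 0..255), using the conventional formula.
sub hsl_to_rgb {
    my ($h, $s, $l) = @_;
    my $q = $l < 0.5 ? $l * (1 + $s) : $l + $s - $l * $s;
    my $p = 2 * $l - $q;

    # Map a shifted hue to a channel intensity between $p and $q.
    my $channel = sub {
        my $t = shift;
        $t += 1 if $t < 0;
        $t -= 1 if $t > 1;
        return $p + ($q - $p) * 6 * $t          if $t < 1 / 6;
        return $q                               if $t < 1 / 2;
        return $p + ($q - $p) * (2 / 3 - $t) * 6 if $t < 2 / 3;
        return $p;
    };

    # Red, green, and blue sample the hue circle a third of a turn apart.
    return map { int(255 * $channel->($_) + 0.5) } $h + 1 / 3, $h, $h - 1 / 3;
}

print join(',', hsl_to_rgb(0, 1, 0.5)), "\n";    # pure red in HSL
```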
rainbow( $n );
rainbow( $n, %colors_options );
Return a list of $n rainbow colors (ROYGBIV).
Any options supported by colors can be provided and will be passed along, including the n and colors options, so you probably don't want to include those options.
Convert a wavelength (a number between 380 nm and 780 nm) to an RGB triplet. Returns undef if given an out-of-range wavelength.
Formulas taken from Dan Bruton's color science page (http://members.cox.net/astro7/color.html).
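A sketch of this kind of piecewise-linear approximation follows. The breakpoints, the intensity falloff near the ends of the visible range, and the gamma of 0.8 are the values commonly reproduced from that page and should be treated as assumptions here, as should the rounding; the name wavelength_to_rgb is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical sketch of a wavelength (nm) to RGB (0..255) conversion.
sub wavelength_to_rgb {
    my ($nm) = @_;
    return undef if $nm < 380 or $nm > 780;

    # Piecewise-linear color ramps between assumed breakpoints.
    my ($red, $green, $blue) =
          $nm < 440 ? ((440 - $nm) / 60, 0, 1)
        : $nm < 490 ? (0, ($nm - 440) / 50, 1)
        : $nm < 510 ? (0, 1, (510 - $nm) / 20)
        : $nm < 580 ? (($nm - 510) / 70, 1, 0)
        : $nm < 645 ? (1, (645 - $nm) / 65, 0)
        :             (1, 0, 0);

    # Intensity falls off toward the limits of vision.
    my $factor = $nm < 420 ? 0.3 + 0.7 * ($nm - 380) / 40
               : $nm > 700 ? 0.3 + 0.7 * (780 - $nm) / 80
               :             1;

    my $gamma = 0.8;    # assumed gamma correction
    return map { int(255 * ($_ * $factor) ** $gamma + 0.5) } $red, $green, $blue;
}
```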
A precompiled regular expression that matches any of the colors or font manipulations provided in this package.
Remove the color tags from a list of strings. The uncolored strings are returned. Does not modify the input strings and can be used on constant strings.
Remove the color tags from a list of strings. The uncolored strings are returned. Modifies the input strings and therefore may not be used on constant strings.
Compute the length of a possibly colored string. The standard perl length function gets confused about how long a colored or decorated string is. This function fixes that so that you can center or align data.
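The length problem can be demonstrated with a short sketch. The pattern below assumes plain ANSI SGR escape sequences (ESC, '[', digits and semicolons, 'm'), which may be narrower or broader than the regular expression this package actually uses; the name visible_length is hypothetical.

```perl
use strict;
use warnings;

# Assumed pattern for ANSI SGR color and style sequences.
my $SGR = qr/\e\[[\d;]*m/;

# Hypothetical helper: length of a string once color sequences are removed.
sub visible_length {
    my ($string) = @_;
    (my $plain = $string) =~ s/$SGR//g;    # work on a copy
    return length $plain;
}

my $red_word = "\e[31mword\e[0m";
printf "length: %d, visible: %d\n", length($red_word), visible_length($red_word);
```

Padding computed from the visible length, rather than from length, keeps colored columns aligned.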
A hash of color names => escape sequences. Included are text style sequences,
BOLD UNDERLINE DARK BLINK REVERSE CONCEALED
Also, the following colors:
BLACK GREY GRAY WHITE RED GREEN YELLOW BLUE MAGENTA CYAN BRIGHT_RED BRIGHT_GREEN BRIGHT_YELLOW BRIGHT_BLUE BRIGHT_MAGENTA BRIGHT_CYAN
And their corresponding backgrounds:
ON_BLACK ON_GREY ON_GRAY ON_WHITE ON_RED ON_GREEN ON_YELLOW ON_BLUE ON_MAGENTA ON_CYAN
ON_BRIGHT_RED ON_BRIGHT_GREEN ON_BRIGHT_YELLOW ON_BRIGHT_BLUE ON_BRIGHT_MAGENTA ON_BRIGHT_CYAN
At the most basic level, this subroutine converts colors to different formats; however, it is capable of quite a bit more than that.
Examples:
colors [qw/red green blue/], format => "ps";
colors [qw/red green blue/], format => "ps", n => 2;
A list of colors. Each can be an X11 color name or any of the other formats recognised by Color::Calc.
Only return n colors.
If false, requesting more colors than are available in the colors list will throw a fatal error. The default is to create new colors between the given colors if insufficient colors are provided. The interpolate option will also cause colors to be interpolated if the distribute option is set.
By default, if fewer colors are requested than are contained in the colors list, this subroutine will select the first n colors. Providing a true value for distribute will cause the subroutine to evenly spread out the choice of colors over the range of colors provided (if n > 2 then the first and last colors are guaranteed to be included).
Specify the style of the returned colors. Can be anything supported by Color::Calc which is currently (Color::Calc::VERSION == 1.0): ``tuple'', ``hex'', ``html'', ``object'' (a Graphics::ColorObject object), ``pdf''. The default format is ``object''.
The following formats are also accepted and are handled by this subroutine directly: ``ps'' | ``postscript''.
Try to make the colors appear on the given background color. Colors will be altered if this option is provided.
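One way to picture the distribute option described above is an even selection of indices across the list. The sketch below is a guess at the idea, not the subroutine's actual algorithm; the name pick_evenly is hypothetical, and it assumes the requested count does not exceed the number of items.

```perl
use strict;
use warnings;

# Hypothetical sketch: choose $n items spread evenly over @items,
# always keeping the first and last item when $n >= 2.
sub pick_evenly {
    my ($n, @items) = @_;
    return @items[ 0 .. $n - 1 ] if $n < 2;    # zero or one item: take from the front
    my $step = $#items / ($n - 1);              # fractional distance between picks
    return map { $items[ int($_ * $step + 0.5) ] } 0 .. $n - 1;
}

print join(' ', pick_evenly(3, qw(red orange yellow green blue))), "\n";
```

Without distribute, the description above corresponds to simply taking the first n entries of the list.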
BOLD($) Make text bold
DARK($) Make text dark
UNDERLINE($) Make text underline
BLINK($) Make text blink
REVERSE($) Make text reverse
CONCEALED($) Make text concealed
STRIKE($) Strikethrough text (rarely implemented)
BLACK($) Make text black
RED($) Make text red
GREEN($) Make text green
YELLOW($) Make text yellow
BLUE($) Make text blue
MAGENTA($) Make text magenta
CYAN($) Make text cyan
WHITE($) Make text white
GREY($) Make text grey
GRAY($) Make text gray
BRIGHT_RED($) Make text bright_red
BRIGHT_GREEN($) Make text bright_green
BRIGHT_YELLOW($) Make text bright_yellow
BRIGHT_BLUE($) Make text bright_blue
BRIGHT_MAGENTA($) Make text bright_magenta
BRIGHT_CYAN($) Make text bright_cyan
ON_BLACK($) Make text on_black
ON_RED($) Make text on_red
ON_GREEN($) Make text on_green
ON_YELLOW($) Make text on_yellow
ON_BLUE($) Make text on_blue
ON_MAGENTA($) Make text on_magenta
ON_CYAN($) Make text on_cyan
ON_WHITE($) Make text on_white
ON_GREY($) Make text on_grey
ON_GRAY($) Make text on_gray
ON_BRIGHT_RED($) Make text on_bright_red
ON_BRIGHT_GREEN($) Make text on_bright_green
ON_BRIGHT_YELLOW($) Make text on_bright_yellow
ON_BRIGHT_BLUE($) Make text on_bright_blue
ON_BRIGHT_MAGENTA($) Make text on_bright_magenta
ON_BRIGHT_CYAN($) Make text on_bright_cyan
Undo all color modifications
Remove foreground coloring
Remove background coloring
Make text bold
Undo make text bold
Make text dark
Undo make text dark
Make text underline
Undo make text underline
Make text blink
Undo make text blink
Make text reverse
Undo make text reverse
Make text concealed
Undo make text concealed
Make text strikethrough
Undo make text strikethrough
Make text black
Make text red
Make text green
Make text yellow
Make text blue
Make text magenta
Make text cyan
Make text white
Make text grey
Make text gray
Make text bright_red
Make text bright_green
Make text bright_yellow
Make text bright_blue
Make text bright_magenta
Make text bright_cyan