Expressions
Since perl expressions work almost exactly like C expressions, only the differences will be mentioned here.
Here’s what perl has that C doesn’t:
** The exponentiation operator.
**= The exponentiation assignment operator.
( ) The null list, used to initialize an array to null.
. Concatenation of two strings.
.= The concatenation assignment operator.
eq String equality (== is numeric equality). For a mnemonic just think of ‘‘eq’’ as a string. (If you are used to the awk behavior of using == for either string or numeric equality based on the current form of the comparands, beware! You must be explicit here.)
ne String inequality (!= is numeric inequality).
lt String less than.
gt String greater than.
le String less than or equal.
ge String greater than or equal.
cmp String comparison, returning -1, 0, or 1.
<=> Numeric comparison, returning -1, 0, or 1.
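The difference between the string and numeric operators shows up as soon as two values are numerically equal but spelled differently. A minimal sketch (the variable names are illustrative only):

```perl
# The same operands compared as numbers and as strings.
$num_equal = ('1.0' == '1');     # true: both are numerically 1
$str_equal = ('1.0' eq '1');     # false: the strings differ
$by_number = ('10' <=> '9');     # 1: ten is greater than nine
$by_string = ('10' cmp '9');     # -1: "1" sorts before "9"
print "numeric: $by_number, string: $by_string\n";
```

This prints "numeric: 1, string: -1".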
=~ Certain operations search or modify the string ‘‘$_’’ by default. This operator makes that kind of operation work on some other string. The right argument is a search pattern, substitution, or translation. The left argument is what is supposed to be searched, substituted, or translated instead of the default ‘‘$_’’. The return value indicates the success of the operation. (If the right argument is an expression other than a search pattern, substitution, or translation, it is interpreted as a search pattern at run time. This is less efficient than an explicit search, since the pattern must be compiled every time the expression is evaluated.) The precedence of this operator is lower than unary minus and autoincrement/decrement, but higher than everything else.
!~ Just like =~ except the return value is negated.
x The repetition operator. Returns a string consisting of the left operand repeated the number of times specified by the right operand. In an array context, if the left operand is a list in parens, it repeats the list.
print '-' x 80;    # print row of dashes
print '-' x80;    # illegal, x80 is identifier
print "\t" x ($tab/8), ' ' x ($tab%8);    # tab over
@ones = (1) x 80;    # an array of 80 1's
@ones = (5) x @ones;    # set all elements to 5
x= The repetition assignment operator. Only works on scalars.
.. The range operator, which is really two different operators depending on the context. In an array context, returns an array of values counting (by ones) from the left value to the right value. This is useful for writing ‘‘for (1..10)’’ loops and for doing slice operations on arrays.
In a scalar context, .. returns a boolean value. The operator is bistable, like a flip-flop, and emulates the line-range (comma) operator of sed, awk, and various editors. Each .. operator maintains its own boolean state. It is false as long as its left operand is false. Once the left operand is true, the range operator stays true until the right operand is true, AFTER which the range operator becomes false again. (It doesn’t become false till the next time the range operator is evaluated. It can test the right operand and become false on the same evaluation it became true (as in awk), but it still returns true once. If you don’t want it to test the right operand till the next evaluation (as in sed), use three dots (...) instead of two.) The right operand is not evaluated while the operator is in the ‘‘false’’ state, and the left operand is not evaluated while the operator is in the ‘‘true’’ state. The precedence is a little lower than || and &&. The value returned is either the null string for false, or a sequence number (beginning with 1) for true. The sequence number is reset for each range encountered. The final sequence number in a range has the string 'E0' appended to it, which doesn’t affect its numeric value, but gives you something to search for if you want to exclude the endpoint. You can exclude the beginning point by waiting for the sequence number to be greater than 1. If either operand of scalar .. is static, that operand is implicitly compared to the $. variable, the current line number. Examples:
As a scalar operator:
if (101 .. 200) { print; }    # print 2nd hundred lines
next line if (1 .. /^$/);    # skip header lines
s/^/> / if (/^$/ .. eof());    # quote body
As an array operator:
for (101 .. 200) { print; }    # print $_ 100 times
@foo = @foo[$[ .. $#foo];    # an expensive no-op
@foo = @foo[$#foo-4 .. $#foo];    # slice last 5 items
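The sequence numbers returned by the scalar form can be seen without reading a file. The following is a sketch only: the list of lines and the names @tagged and @inner are made up for illustration, and the operands are ordinary comparisons rather than patterns.

```perl
# Collect the lines from BEGIN through END, tagging each one with the
# sequence number returned by the scalar range operator.
@lines  = ('head', 'BEGIN', 'a', 'b', 'END', 'tail');
@tagged = ();
@inner  = ();
foreach $line (@lines) {
    $seq = (($line eq 'BEGIN') .. ($line eq 'END'));
    next unless $seq;                   # null string outside the range
    push(@tagged, $seq . ':' . $line);
    # skip the first line (sequence 1) and the last (ends in E0)
    push(@inner, $line) if $seq > 1 && $seq !~ /E0$/;
}
print join(' ', @tagged), "\n";
print join(' ', @inner), "\n";
```

The first line printed is "1:BEGIN 2:a 3:b 4E0:END" and the second is "a b", showing both the E0 marker on the endpoint and how to exclude the two ends.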
-x A file test. This unary operator takes one argument, either a filename or a filehandle, and tests the associated file to see if something is true about it. If the argument is omitted, tests $_, except for -t, which tests STDIN. It returns 1 for true and '' for false, or the undefined value if the file doesn’t exist. Precedence is higher than logical and relational operators, but lower than arithmetic operators. The operator may be any of:
-r File is readable by effective uid/gid.
-w File is writable by effective uid/gid.
-x File is executable by effective uid/gid.
-o File is owned by effective uid.
-R File is readable by real uid/gid.
-W File is writable by real uid/gid.
-X File is executable by real uid/gid.
-O File is owned by real uid.
-e File exists.
-z File has zero size.
-s File has non-zero size (returns size).
-f File is a plain file.
-d File is a directory.
-l File is a symbolic link.
-p File is a named pipe (FIFO).
-S File is a socket.
-b File is a block special file.
-c File is a character special file.
-u File has setuid bit set.
-g File has setgid bit set.
-k File has sticky bit set.
-t Filehandle is opened to a tty.
-T File is a text file.
-B File is a binary file (opposite of -T).
-M Age of file in days when script started.
-A Same for access time.
-C Same for inode change time.
The interpretation of the file permission operators -r, -R, -w, -W, -x and -X is based solely on the mode of the file and the uids and gids of the user. There may be other reasons you can’t actually read, write or execute the file. Also note that, for the superuser, -r, -R, -w and -W always return 1, and -x and -X return 1 if any execute bit is set in the mode. Scripts run by the superuser may thus need to do a stat() in order to determine the actual mode of the file, or temporarily set the uid to something else.
Example:
while (<>) {
    chop;
    next unless -f $_;    # ignore specials
    ...
}
Note that -s/a/b/ does not do a negated substitution. Saying -exp($foo) still works as expected, however: only single letters following a minus are interpreted as file tests.
The -T and -B switches work as follows. The first block or so of the file is examined for odd characters such as strange control codes or metacharacters. If too many odd characters (>10%) are found, it’s a -B file, otherwise it’s a -T file. Also, any file containing null in the first block is considered a binary file. If -T or -B is used on a filehandle, the current stdio buffer is examined rather than the first block. Both -T and -B return TRUE on a null file, or a file at EOF when testing a filehandle.
If any of the file tests (or either stat operator) are given the special filehandle consisting of a solitary underline, then the stat structure of the previous file test (or stat operator) is used, saving a system call. (This doesn’t work with −t, and you need to remember that lstat and -l will leave values in the stat structure for the symbolic link, not the real file.) Example:
print "Can do.\n" if -r $a || -w _ || -x _;

stat($filename);
print "Readable\n" if -r _;
print "Writable\n" if -w _;
print "Executable\n" if -x _;
print "Setuid\n" if -u _;
print "Setgid\n" if -g _;
print "Sticky\n" if -k _;
print "Text\n" if -T _;
print "Binary\n" if -B _;
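As a self-contained illustration of the underline filehandle, the sketch below creates its own scratch file so the results are predictable. The path under /tmp and the variable names are assumptions made for the example, not part of perl.

```perl
# Create a small scratch file, then test it several ways.
# The path under /tmp is an arbitrary choice for this sketch.
$file = "/tmp/filetest.$$";
open(TMP, ">$file") || die "Can't create $file: $!\n";
print TMP "hello\n";
close(TMP);

$size     = -s $file;    # does the stat; returns the size in bytes
$is_plain = -f _;        # reuses the stat structure left by -s
$is_dir   = -d _;        # likewise, no further system call
unlink($file);

print "size=$size plain=", ($is_plain ? 'yes' : 'no'),
    " dir=", ($is_dir ? 'yes' : 'no'), "\n";
```

On a Unix system this prints "size=6 plain=yes dir=no". Only the first test names the file; the other two rely on the saved stat structure.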
Here is what C has that perl doesn’t:
unary & Address-of operator.
unary * Dereference-address operator.
(TYPE) Type casting operator.
Like C, perl does a certain amount of expression evaluation at compile time, whenever it determines that all of the arguments to an operator are static and have no side effects. In particular, string concatenation happens at compile time between literals that don’t do variable substitution. Backslash interpretation also happens at compile time. You can say
'Now is the time for all' . "\n" .
'good men to come to.'
and this all reduces to one string internally.
The autoincrement operator has a little extra built-in magic to it. If you increment a variable that is numeric, or that has ever been used in a numeric context, you get a normal increment. If, however, the variable has only been used in string contexts since it was set, and has a value that is not null and matches the pattern /^[a-zA-Z]*[0-9]*$/, the increment is done as a string, preserving each character within its range, with carry:
print ++($foo = '99');    # prints '100'
print ++($foo = 'a0');    # prints 'a1'
print ++($foo = 'Az');    # prints 'Ba'
print ++($foo = 'zz');    # prints 'aaa'
The autodecrement is not magical.
The range operator (in an array context) makes use of the magical autoincrement algorithm if the minimum and maximum are strings. You can say
@alphabet = ('A' .. 'Z');
to get all the letters of the alphabet, or
$hexdigit = (0 .. 9, 'a' .. 'f')[$num & 15];
to get a hexadecimal digit, or
@z2 = ('01' .. '31'); print @z2[$mday];
to get dates with leading zeros. (If the final value specified is not in the sequence that the magical increment would produce, the sequence goes until the next value would be longer than the final value specified.)
The || and && operators differ from C’s in that, rather than returning 0 or 1, they return the last value evaluated. Thus, a portable way to find out the home directory might be:
$home = $ENV{'HOME'} || $ENV{'LOGDIR'} ||
    (getpwuid($<))[7] || die "You're homeless!\n";
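The same behavior is handy for supplying defaults. A small sketch, using a made-up %opt array of option values:

```perl
# || yields the first true operand (or the last one evaluated);
# && yields the first false operand (or the last one evaluated).
%opt = ('width', 0, 'title', '');
$width = $opt{'width'} || 80;           # 0 is false, so 80
$title = $opt{'title'} || 'untitled';   # '' is false, so 'untitled'
$both  = 'first' && 'second';           # both true: 'second'
$none  = 0 && 'never evaluated';        # left is false: 0
print "$width $title $both $none\n";
```

This prints "80 untitled second 0". Note that a legitimate value of 0 or '' is replaced by the default as well, which may or may not be what you want.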
Along with the literals and variables mentioned earlier, the operations in the following section can serve as terms in an expression. Some of these operations take a LIST as an argument. Such a list can consist of any combination of scalar arguments or array values; the array values will be included in the list as if each individual element were interpolated at that point in the list, forming a longer single-dimensional array value. Elements of the LIST should be separated by commas. If an operation is listed both with and without parentheses around its arguments, it means you can either use it as a unary operator or as a function call. To use it as a function call, the next token on the same line must be a left parenthesis. (There may be intervening white space.) Such a function then has highest precedence, as you would expect from a function. If any token other than a left parenthesis follows, then it is a unary operator, with a precedence depending only on whether it is a LIST operator or not. LIST operators have lowest precedence. All other unary operators have a precedence greater than relational operators but less than arithmetic operators. See the section on Precedence.
For operators that can be used in either a scalar or array context, failure is generally indicated in a scalar context by returning the undefined value, and in an array context by returning the null list. Remember though that THERE IS NO GENERAL RULE FOR CONVERTING A LIST INTO A SCALAR. Each operator decides which sort of scalar it would be most appropriate to return. Some operators return the length of the list that would have been returned in an array context. Some operators return the first value in the list. Some operators return the last value in the list. Some operators return a count of successful operations. In general, they do what you want, unless you want consistency.
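A few concrete cases may make the lack of a general rule easier to see. This is only a sketch with invented variable names; it shows four constructs that each turn a list of three values into a different scalar.

```perl
@list = (4, 5, 6);
$count    = @list;                  # array in scalar context: length, 3
($first)  = @list;                  # array assignment: first element, 4
$last     = (4, 5, 6);              # comma operator: last value, 6
$assigned = (($p, $q) = @list);     # array assignment in scalar context:
                                    # number of values on the right, 3
print "$count $first $last $assigned\n";
```

This prints "3 4 6 3".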
/PATTERN/
See m/PATTERN/.
?PATTERN?
This is just like the /pattern/ search, except that it matches only once between calls to the reset operator. This is a useful optimization when you only want to see the first occurrence of something in each file of a set of files, for instance. Only ?? patterns local to the current package are reset.
accept(NEWSOCKET,GENERICSOCKET)
Does the same thing that the accept system call does. Returns true if it succeeded, false otherwise. See example in section on Interprocess Communication.
alarm(SECONDS)
alarm SECONDS
Arranges to have a SIGALRM delivered to this process after the specified number of seconds (minus 1, actually) have elapsed. Thus, alarm(15) will cause a SIGALRM at some point more than 14 seconds in the future. Only one timer may be counting at once. Each call disables the previous timer, and an argument of 0 may be supplied to cancel the previous timer without starting a new one. The returned value is the amount of time remaining on the previous timer.
atan2(Y,X) Returns the arctangent of Y/X in the range −π to π.
bind(SOCKET,NAME)
Does the same thing that the bind system call does. Returns true if it succeeded, false otherwise. NAME should be a packed address of the proper type for the socket. See example in section on Interprocess Communication.
binmode(FILEHANDLE)
binmode FILEHANDLE
Arranges for the file to be read in ‘‘binary’’ mode in operating systems that distinguish between binary and text files. Files that are not read in binary mode have CR LF sequences translated to LF on input and LF translated to CR LF on output. Binmode has no effect under Unix. If FILEHANDLE is an expression, the value is taken as the name of the filehandle.
caller(EXPR)
caller Returns the context of the current subroutine call:
($package,$filename,$line) = caller;
With EXPR, returns some extra information that the debugger uses to print a stack trace. The value of EXPR indicates how many call frames to go back before the current one.
chdir(EXPR)
chdir EXPR
Changes the working directory to EXPR, if possible. If EXPR is omitted, changes to home directory. Returns 1 upon success, 0 otherwise. See example under die.
chmod(LIST)
chmod LIST
Changes the permissions of a list of files. The first element of the list must be the numerical mode. Returns the number of files successfully changed.
$cnt = chmod 0755, 'foo', 'bar';
chmod 0755, @executables;
chop(LIST)
chop(VARIABLE)
chop VARIABLE
chop Chops off the last character of a string and returns the character chopped. It’s used primarily to remove the newline from the end of an input record, but is much more efficient than s/\n// because it neither scans nor copies the string. If VARIABLE is omitted, chops $_. Example:
while (<>) {
    chop;    # avoid \n on last field
    @array = split(/:/);
    ...
}
You can actually chop anything that’s an lvalue, including an assignment:
chop($cwd = `pwd`);
chop($answer = <STDIN>);
If you chop a list, each element is chopped. Only the value of the last chop is returned.
chown(LIST)
chown LIST
Changes the owner (and group) of a list of files. The first two elements of the list must be the NUMERICAL uid and gid, in that order. Returns the number of files successfully changed.
$cnt = chown $uid, $gid, 'foo', 'bar';
chown $uid, $gid, @filenames;
Here’s an example that looks up non-numeric uids in the passwd file:
print "User: ";
$user = <STDIN>;
chop($user);
print "Files: ";
$pattern = <STDIN>;
chop($pattern);
open(pass, '/etc/passwd') || die "Can't open passwd: $!\n";
while (<pass>) {
    ($login,$pass,$uid,$gid) = split(/:/);
    $uid{$login} = $uid;
    $gid{$login} = $gid;
}
@ary = <${pattern}>;    # get filenames
if ($uid{$user} eq '') {
    die "$user not in passwd file";
}
else {
    chown $uid{$user}, $gid{$user}, @ary;
}
chroot(FILENAME)
chroot FILENAME
Does the same as the system call of that name. If you don’t know what it does, don’t worry about it. If FILENAME is omitted, does chroot to $_.
close(FILEHANDLE)
close FILEHANDLE
Closes the file or pipe associated with the file handle. You don’t have to close FILEHANDLE if you are immediately going to do another open on it, since open will close it for you. (See open.) However, an explicit close on an input file resets the line counter ($.), while the implicit close done by open does not. Also, closing a pipe will wait for the process executing on the pipe to complete, in case you want to look at the output of the pipe afterwards. Closing a pipe explicitly also puts the status value of the command into $?. Example:
open(OUTPUT, '|sort >foo');    # pipe to sort
...    # print stuff to output
close OUTPUT;    # wait for sort to finish
open(INPUT, 'foo');    # get sort's results
FILEHANDLE may be an expression whose value gives the real filehandle name.
closedir(DIRHANDLE)
closedir DIRHANDLE
Closes a directory opened by opendir().
connect(SOCKET,NAME)
Does the same thing that the connect system call does. Returns true if it succeeded, false otherwise. NAME should be a packed address of the proper type for the socket. See example in section on Interprocess Communication.
cos(EXPR)
cos EXPR Returns the cosine of EXPR (expressed in radians). If EXPR is omitted takes cosine of $_.
crypt(PLAINTEXT,SALT)
Encrypts a string exactly like the crypt() function in the C library. Useful for checking the password file for lousy passwords. Only the guys wearing white hats should do this.
dbmclose(ASSOC_ARRAY)
dbmclose ASSOC_ARRAY
Breaks the binding between a dbm file and an associative array. The values remaining in the associative array are meaningless unless you happen to want to know what was in the cache for the dbm file. This function is only useful if you have ndbm.
dbmopen(ASSOC,DBNAME,MODE)
This binds a dbm or ndbm file to an associative array. ASSOC is the name of the associative array. (Unlike normal open, the first argument is NOT a filehandle, even though it looks like one). DBNAME is the name of the database (without the .dir or .pag extension). If the database does not exist, it is created with protection specified by MODE (as modified by the umask). If your system only supports the older dbm functions, you may perform only one dbmopen in your program. If your system has neither dbm nor ndbm, calling dbmopen produces a fatal error.
Values assigned to the associative array prior to the dbmopen are lost. A certain number of values from the dbm file are cached in memory. By default this number is 64, but you can increase it by preallocating that number of garbage entries in the associative array before the dbmopen. You can flush the cache if necessary with the reset command.
If you don’t have write access to the dbm file, you can only read associative array variables, not set them. If you want to test whether you can write, either use file tests or try setting a dummy array entry inside an eval, which will trap the error.
Note that functions such as keys() and values() may return huge array values when used on large dbm files. You may prefer to use the each() function to iterate over large dbm files. Example:
# print out history file offsets
dbmopen(HIST,'/usr/lib/news/history',0666);
while (($key,$val) = each %HIST) {
    print $key, ' = ', unpack('L',$val), "\n";
}
dbmclose(HIST);
defined(EXPR)
defined EXPR
Returns a boolean value saying whether the lvalue EXPR has a real value or not. Many operations return the undefined value under exceptional conditions, such as end of file, uninitialized variable, system error and such. This function allows you to distinguish between an undefined null string and a defined null string with operations that might return a real null string, in particular referencing elements of an array. You may also check to see if arrays or subroutines exist. Use on predefined variables is not guaranteed to produce intuitive results. Examples:
print if defined $switch{'D'};
print "$val\n" while defined($val = pop(@ary));
die "Can't readlink $sym: $!"
    unless defined($value = readlink $sym);
eval '@foo = ()' if defined(@foo);
die "No XYZ package defined" unless defined %_XYZ;
sub foo { defined &$bar ? &$bar(@_) : die "No bar"; }
See also undef.
delete $ASSOC{KEY}
Deletes the specified value from the specified associative array. Returns the deleted value, or the undefined value if nothing was deleted. Deleting from $ENV{} modifies the environment. Deleting from an array bound to a dbm file deletes the entry from the dbm file.
The following deletes all the values of an associative array:
foreach $key (keys %ARRAY) {
    delete $ARRAY{$key};
}
(But it would be faster to use the reset command. Saying undef %ARRAY is faster yet.)
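The return value of delete can be checked with defined to tell whether anything was actually removed. A brief sketch, using an invented %color array:

```perl
%color = ('apple', 'red', 'banana', 'yellow');
$gone = delete $color{'apple'};     # returns the deleted value, 'red'
$none = delete $color{'kiwi'};      # no such key: the undefined value
@left = keys(%color);               # only 'banana' remains
print "deleted: $gone\n";
print "kiwi was not there\n" unless defined($none);
print "left: ", join(' ', @left), "\n";
```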
die(LIST)
die LIST Outside of an eval, prints the value of LIST to STDERR and exits with the current value of $! (errno). If $! is 0, exits with the value of ($? >> 8) (`command` status). If ($? >> 8) is 0, exits with 255. Inside an eval, the error message is stuffed into $@ and the eval is terminated with the undefined value.
Equivalent examples:
die "Can't cd to spool: $!\n" unless chdir '/usr/spool/news';
chdir '/usr/spool/news' || die "Can't cd to spool: $!\n";
If the value of EXPR does not end in a newline, the current script line number and input line number (if any) are also printed, and a newline is supplied. Hint: sometimes appending ‘‘, stopped’’ to your message will cause it to make better sense when the string ‘‘at foo line 123’’ is appended. Suppose you are running script ‘‘canasta’’.
die "/etc/games is no good";
die "/etc/games is no good, stopped";
produce, respectively
/etc/games is no good at canasta line 123.
/etc/games is no good, stopped at canasta line 123.
See also exit.
do BLOCK
Returns the value of the last command in the sequence of commands indicated by BLOCK. When modified by a loop modifier, executes the BLOCK once before testing the loop condition. (On other statements the loop modifiers test the conditional first.)
do SUBROUTINE (LIST)
Executes a SUBROUTINE declared by a sub declaration, and returns the value of the last expression evaluated in SUBROUTINE. If there is no subroutine by that name, produces a fatal error. (You may use the ‘‘defined’’ operator to determine if a subroutine exists.) If you pass arrays as part of LIST you may wish to pass the length of the array in front of each array. (See the section on subroutines later on.) The parentheses are required to avoid confusion with the ‘‘do EXPR’’ form.
SUBROUTINE may also be a single scalar variable, in which case the name of the subroutine to execute is taken from the variable.
As an alternate (and preferred) form, you may call a subroutine by prefixing the name with an ampersand: &foo(@args). If you aren’t passing any arguments, you don’t have to use parentheses. If you omit the parentheses, no @_ array is passed to the subroutine. The & form is also used to specify subroutines to the defined and undef operators:
if (defined &$var) { &$var($parm); undef &$var; }
do EXPR Uses the value of EXPR as a filename and executes the contents of the file as a perl script. Its primary use is to include subroutines from a perl subroutine library.
do 'stat.pl';
is just like
eval `cat stat.pl`;
except that it’s more efficient, more concise, keeps track of the current filename for error messages, and searches all the -I libraries if the file isn’t in the current directory (see also the @INC array in Predefined Names). It’s the same, however, in that it does reparse the file every time you call it, so if you are going to use the file inside a loop you might prefer to use -P and #include, at the expense of a little more startup time. (The main problem with #include is that cpp doesn’t grok # comments; a workaround is to use ‘‘;#’’ for standalone comments.) Note that the following are NOT equivalent:
do $foo;# eval a file
do $foo();# call a subroutine
Note that inclusion of library routines is better done with the ‘‘require’’ operator.
dump LABEL
This causes an immediate core dump. Primarily this is so that you can use the undump program to turn your core dump into an executable binary after having initialized all your variables at the beginning of the program. When the new binary is executed it will begin by executing a "goto LABEL" (with all the restrictions that goto suffers). Think of it as a goto with an intervening core dump and reincarnation. If LABEL is omitted, restarts the program from the top. WARNING: any files opened at the time of the dump will NOT be open any more when the program is reincarnated, with possible resulting confusion on the part of perl. See also -u.
Example:
#!/usr/bin/perl
require 'getopt.pl';
require 'stat.pl';
%days = (
    'Sun',1,
    'Mon',2,
    'Tue',3,
    'Wed',4,
    'Thu',5,
    'Fri',6,
    'Sat',7);
dump QUICKSTART if $ARGV[0] eq '-d';
QUICKSTART:
do Getopt(’f’);
each(ASSOC_ARRAY)
each ASSOC_ARRAY
Returns a 2 element array consisting of the key and value for the next value of an associative array, so that you can iterate over it. Entries are returned in an apparently random order. When the array is entirely read, a null array is returned (which when assigned produces a FALSE (0) value). The next call to each() after that will start iterating again. The iterator can be reset only by reading all the elements from the array. You must not modify the array while iterating over it. There is a single iterator for each associative array, shared by all each(), keys() and values() function calls in the program. The following prints out your environment like the printenv program, only in a different order:
while (($key,$value) = each %ENV) {
    print "$key=$value\n";
}
See also keys() and values().
eof(FILEHANDLE)
eof()
eof Returns 1 if the next read on FILEHANDLE will return end of file, or if FILEHANDLE is not open. FILEHANDLE may be an expression whose value gives the real filehandle name. (Note that this function actually reads a character and then ungetc’s it, so it is not very useful in an interactive context.) An eof without an argument returns the eof status for the last file read. Empty parentheses () may be used to indicate the pseudo file formed of the files listed on the command line, i.e. eof() is reasonable to use inside a while (<>) loop to detect the end of only the last file. Use eof(ARGV) or eof without the parentheses to test EACH file in a while (<>) loop. Examples:
# insert dashes just before last line of last file
while (<>) {
    if (eof()) {
        print "--------------\n";
    }
    print;
}

# reset line numbering on each input file
while (<>) {
    print "$.\t$_";
    if (eof) {    # Not eof().
        close(ARGV);
    }
}
eval(EXPR)
eval EXPR
eval BLOCK
EXPR is parsed and executed as if it were a little perl program. It is executed in the context of the current perl program, so that any variable settings, subroutine or format definitions remain afterwards. The value returned is the value of the last expression evaluated, just as with subroutines. If there is a syntax error or runtime error, or a die statement is executed, an undefined value is returned by eval, and $@ is set to the error message. If there was no error, $@ is guaranteed to be a null string. If EXPR is omitted, evaluates $_. The final semicolon, if any, may be omitted from the expression.
Note that, since eval traps otherwise-fatal errors, it is useful for determining whether a particular feature (such as dbmopen or symlink) is implemented. It is also Perl’s exception trapping mechanism, where the die operator is used to raise exceptions.
If the code to be executed doesn’t vary, you may use the eval-BLOCK form to trap run-time errors without incurring the penalty of recompiling each time. The error, if any, is still returned in $@. Evaluating a single-quoted string (as EXPR) has the same effect, except that the eval-EXPR form reports syntax errors at run time via $@, whereas the eval-BLOCK form reports syntax errors at compile time. The eval-EXPR form is optimized to eval-BLOCK the first time it succeeds. (Since the replacement side of a substitution is considered a single-quoted string when you use the e modifier, the same optimization occurs there.) Examples:
# make divide-by-zero non-fatal
eval { $answer = $a / $b; }; warn $@ if $@;

# optimized to same thing after first use
eval '$answer = $a / $b'; warn $@ if $@;

# a compile-time error
eval { $answer = };

# a run-time error
eval '$answer =';    # sets $@
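To illustrate the exception-trapping use described above, here is a minimal sketch in which a subroutine raises an error with die and the caller traps it with eval. The subroutine name checked_divide and the variable names are hypothetical.

```perl
# Raise an exception with die; the value of the last expression
# evaluated is returned otherwise.
sub checked_divide {
    local($num, $den) = @_;
    die "denominator is zero\n" if $den == 0;
    $num / $den;
}

$good     = eval { &checked_divide(10, 4); };
$good_err = $@;        # null string: no error occurred
$bad      = eval { &checked_divide(1, 0); };
$bad_err  = $@;        # the message passed to die

print "good=$good\n";
print "trapped: $bad_err" unless defined($bad);
```

The first eval returns 2.5 and leaves $@ null; the second returns the undefined value and leaves the die message in $@, so the script keeps running instead of exiting.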
exec(LIST)
exec LIST If there is more than one argument in LIST, or if LIST is an array with more than one value, calls execvp() with the arguments in LIST. If there is only one scalar argument, the argument is checked for shell metacharacters. If there are any, the entire argument is passed to ‘‘/bin/sh -c’’ for parsing. If there are none, the argument is split into words and passed directly to execvp(), which is more efficient. Note: exec (and system) do not flush your output buffer, so you may need to set $| to avoid lost output. Examples:
exec '/bin/echo', 'Your arguments are: ', @ARGV;
exec "sort $outfile | uniq";
If you don’t really want to execute the first argument, but want to lie to the program you are executing about its own name, you can specify the program you actually want to run by assigning that to a variable and putting the name of the variable in front of the LIST without a comma. (This always forces interpretation of the LIST as a multi-valued list, even if there is only a single scalar in the list.) Example:
$shell = '/bin/csh';
exec $shell '-sh';    # pretend it's a login shell
exit(EXPR)
exit EXPR Evaluates EXPR and exits immediately with that value. Example:
$ans = <STDIN>;
exit 0 if $ans =~ /^[Xx]/;
See also die. If EXPR is omitted, exits with 0 status.
exp(EXPR)
exp EXPR Returns e to the power of EXPR. If EXPR is omitted, gives exp($_).
fcntl(FILEHANDLE,FUNCTION,SCALAR)
Implements the fcntl(2) function. You’ll probably have to say
require "fcntl.ph";    # probably /usr/local/lib/perl/fcntl.ph
first to get the correct function definitions. If fcntl.ph doesn’t exist or doesn’t have the correct definitions you’ll have to roll your own, based on your C header files such as <sys/fcntl.h>. (There is a perl script called h2ph that comes with the perl kit which may help you in this.) Argument processing and value return works just like ioctl below. Note that fcntl will produce a fatal error if used on a machine that doesn’t implement fcntl(2).
fileno(FILEHANDLE)
fileno FILEHANDLE
Returns the file descriptor for a filehandle. Useful for constructing bitmaps for select(). If FILEHANDLE is an expression, the value is taken as the name of the filehandle.
flock(FILEHANDLE,OPERATION)
Calls flock(2) on FILEHANDLE. See manual page for flock(2) for definition of OPERATION. Returns true for success, false on failure. Will produce a fatal error if used on a machine that doesn’t implement flock(2). Here’s a mailbox appender for BSD systems.
$LOCK_SH = 1;
$LOCK_EX = 2;
$LOCK_NB = 4;
$LOCK_UN = 8;
sub lock {
    flock(MBOX,$LOCK_EX);
    # and, in case someone appended
    # while we were waiting...
    seek(MBOX, 0, 2);
}
sub unlock {
    flock(MBOX,$LOCK_UN);
}
open(MBOX, ">>/usr/spool/mail/$ENV{'USER'}")
    || die "Can't open mailbox: $!";
do lock();
print MBOX $msg,"\n\n";
do unlock();
fork Does a fork() call. Returns the child pid to the parent process and 0 to the child process. Note: unflushed buffers remain unflushed in both processes, which means you may need to set $| to avoid duplicate output.
getc(FILEHANDLE)
getc FILEHANDLE
getc Returns the next character from the input file attached to FILEHANDLE, or a null string at EOF. If FILEHANDLE is omitted, reads from STDIN.
getlogin Returns the current login from /etc/utmp, if any. If null, use getpwuid.
$login = getlogin || (getpwuid($<))[0] || "Somebody";
getpeername(SOCKET)
Returns the packed sockaddr address of other end of the SOCKET connection.
# An internet sockaddr
$sockaddr = 'S n a4 x8';
$hersockaddr = getpeername(S);
($family, $port, $heraddr) = unpack($sockaddr,$hersockaddr);
getpgrp(PID)
getpgrp PID
Returns the current process group for the specified PID, 0 for the current process. Will produce a fatal error if used on a machine that doesn’t implement getpgrp(2). If PID is omitted, returns process group of current process.
getppid Returns the process id of the parent process.
getpriority(WHICH,WHO)
Returns the current priority for a process, a process group, or a user. (See getpriority(2).) Will produce a fatal error if used on a machine that doesn’t implement getpriority(2).
getpwnam(NAME)
getgrnam(NAME)
gethostbyname(NAME)
getnetbyname(NAME)
getprotobyname(NAME)
getpwuid(UID)
getgrgid(GID)
getservbyname(NAME,PROTO)
gethostbyaddr(ADDR,ADDRTYPE)
getnetbyaddr(ADDR,ADDRTYPE)
getprotobynumber(NUMBER)
getservbyport(PORT,PROTO)
getpwent
getgrent
gethostent
getnetent
getprotoent
getservent
setpwent
setgrent
sethostent(STAYOPEN)
setnetent(STAYOPEN)
setprotoent(STAYOPEN)
setservent(STAYOPEN)
endpwent
endgrent
endhostent
endnetent
endprotoent
endservent
These routines perform the same functions as their counterparts in the system library. Within an array context, the return values from the various get routines are as follows:
($name,$passwd,$uid,$gid,
    $quota,$comment,$gcos,$dir,$shell) = getpw...
($name,$passwd,$gid,$members) = getgr...
($name,$aliases,$addrtype,$length,@addrs) = gethost...
($name,$aliases,$addrtype,$net) = getnet...
($name,$aliases,$proto) = getproto...
($name,$aliases,$port,$proto) = getserv...
(If the entry doesn’t exist you get a null list.)
Within a scalar context, you get the name, unless the function was a lookup by name, in which
case you get the other thing, whatever it is. (If the entry doesn’t exist you get the undefined value.) For example:
$uid = getpwnam
$name = getpwuid
$name = getpwent
$gid = getgrnam
$name = getgrgid
$name = getgrent
etc.
The $members value returned by getgr... is a space separated list of the login names of the members of the group.
For the gethost... functions, if the h_errno variable is supported in C, it will be returned to you via $? if the function call fails. The @addrs value returned by a successful call is a list of the raw addresses returned by the corresponding system library call. In the Internet domain, each address is four bytes long and you can unpack it by saying something like:
($a,$b,$c,$d) = unpack('C4',$addrs[0]);
getsockname(SOCKET)
Returns the packed sockaddr address of this end of the SOCKET connection.
# An internet sockaddr
$sockaddr = 'S n a4 x8';
$mysockaddr = getsockname(S);
($family, $port, $myaddr) = unpack($sockaddr,$mysockaddr);
getsockopt(SOCKET,LEVEL,OPTNAME)
Returns the socket option requested, or undefined if there is an error.
gmtime(EXPR)
gmtime EXPR
Converts a time as returned by the time function to a 9-element array with the time analyzed for the Greenwich timezone. Typically used as follows:
($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = gmtime(time);
All array elements are numeric, and come straight out of a struct tm. In particular this means that $mon has the range 0..11 and $wday has the range 0..6. If EXPR is omitted, does gmtime(time).
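Because the fields come straight out of a struct tm, they usually need adjusting before display. The following is a minimal sketch; fmt_utc is a hypothetical helper name, and the 1900 offset for $year is the usual struct tm convention rather than something stated above.

```perl
# Format a time value as a UTC timestamp.
# fmt_utc is a hypothetical helper, not a built-in.
sub fmt_utc {
    local($t) = @_;
    local($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = gmtime($t);
    # $mon is 0..11, so add 1; $year counts from 1900 (struct tm).
    sprintf("%04d-%02d-%02d %02d:%02d:%02d",
        $year + 1900, $mon + 1, $mday, $hour, $min, $sec);
}

print &fmt_utc(0), "\n";    # time 0 is the start of the Unix epoch
```

The same adjustments apply to the array returned by localtime.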
goto LABEL
Finds the statement labeled with LABEL and resumes execution there. Currently you may only go to statements in the main body of the program that are not nested inside a do {} construct. This statement is not implemented very efficiently, and is here only to make the sed-to-perl translator easier. I may change its semantics at any time, consistent with support for translated sed scripts. Use it at your own risk. Better yet, don’t use it at all.
grep(EXPR,LIST)
Evaluates EXPR for each element of LIST (locally setting $_ to each element) and returns the array value consisting of those elements for which the expression evaluated to true. In a scalar context, returns the number of times the expression was true.
@foo = grep(!/^#/, @bar);    # weed out comments
Note that, since $_ is a reference into the array value, it can be used to modify the elements of the array. While this is useful and supported, it can cause bizarre results if the LIST is not a named array.
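A short sketch of the three behaviors described above (filtering, counting, and modifying through $_). The array contents are made up for illustration.

```perl
@lines = ("# header", "alpha", "# note", "beta");

# Array context: keep the elements for which EXPR is true.
@code = grep(!/^#/, @lines);

# Scalar context: count how many times EXPR was true.
$ncomments = grep(/^#/, @lines);

# $_ refers to each element in turn, so the substitution
# edits the named array @code in place.
$changed = grep(s/^/> /, @code);

print join(",", @code), " ($ncomments comments, $changed changed)\n";
```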
hex(EXPR)
hex EXPR Returns the decimal value of EXPR interpreted as a hex string. (To interpret strings that might start with 0 or 0x see oct().) If EXPR is omitted, uses $_.
index(STR,SUBSTR,POSITION)
index(STR,SUBSTR)
Returns the position of the first occurrence of SUBSTR in STR at or after POSITION. If POSITION is omitted, starts searching from the beginning of the string. The return value is based at 0, or whatever you’ve set the $[ variable to. If the substring is not found, returns one less than the base, ordinarily -1.
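The POSITION argument makes it easy to walk through every occurrence. This is only a sketch: all_positions is a hypothetical helper, and it assumes $[ has been left at 0 and that the substring is not empty.

```perl
# Collect every position at which $substr occurs in $str.
sub all_positions {
    local($str, $substr) = @_;
    local(@found);
    local($pos) = index($str, $substr);
    while ($pos >= 0) {                 # -1 means "not found"
        push(@found, $pos);
        $pos = index($str, $substr, $pos + 1);
    }
    @found;
}

print join(" ", &all_positions("abracadabra", "abra")), "\n";
```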
int(EXPR)
int EXPR Returns the integer portion of EXPR. If EXPR is omitted, uses $_.
ioctl(FILEHANDLE,FUNCTION,SCALAR)
Implements the ioctl(2) function. You’ll probably have to say
require "ioctl.ph";    # probably /usr/local/lib/perl/ioctl.ph
first to get the correct function definitions. If ioctl.ph doesn’t exist or doesn’t have the correct definitions you’ll have to roll your own, based on your C header files such as <sys/ioctl.h>. (There is a perl script called h2ph that comes with the perl kit which may help you in this.) SCALAR will be read and/or written depending on the FUNCTION; a pointer to the string value of SCALAR will be passed as the third argument of the actual ioctl call. (If SCALAR has no string value but does have a numeric value, that value will be passed rather than a pointer to the string value. To guarantee this to be true, add a 0 to the scalar before using it.) The pack() and unpack() functions are useful for manipulating the values of structures used by ioctl(). The following example sets the erase character to DEL.
require 'ioctl.ph';
$sgttyb_t = "ccccs";        # 4 chars and a short
if (ioctl(STDIN,$TIOCGETP,$sgttyb)) {
    @ary = unpack($sgttyb_t,$sgttyb);
    $ary[2] = 127;
    $sgttyb = pack($sgttyb_t,@ary);
    ioctl(STDIN,$TIOCSETP,$sgttyb)
        || die "Can't ioctl: $!";
}
The return value of ioctl (and fcntl) is as follows:
if OS returns:      perl returns:
-1                  undefined value
0                   string "0 but true"
anything else       that number
Thus perl returns true on success and false on failure, yet you can still easily determine the actual value returned by the operating system:
($retval = ioctl(...)) || ($retval = -1);
printf "System returned %d\n", $retval;
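To see the convention at work without a real device, the sketch below substitutes a stand-in for ioctl. fake_ioctl is hypothetical; it only maps an operating system return value the way the table above describes.

```perl
# Hypothetical stand-in that follows the ioctl return convention.
sub fake_ioctl {
    local($os) = @_;
    return undef if $os == -1;          # failure
    return "0 but true" if $os == 0;    # success, OS returned zero
    $os;                                # success, any other value
}

foreach $code (-1, 0, 5) {
    # A false (undefined) result is turned back into -1.
    ($retval = &fake_ioctl($code)) || ($retval = -1);
    push(@seen, $retval + 0);
    printf "System returned %d\n", $retval;
}
```

The string "0 but true" tests as true in a conditional yet has the numeric value 0, which is what lets one scalar carry both the success flag and the value.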
join(EXPR,LIST)
join(EXPR,ARRAY)
Joins the separate strings of LIST or ARRAY into a single string with fields separated by the value of EXPR, and returns the string. Example:
$_ = join(':', $login,$passwd,$uid,$gid,$gcos,$home,$shell);
See split.
keys(ASSOC_ARRAY)
keys ASSOC_ARRAY
Returns a normal array consisting of all the keys of the named associative array. The keys are returned in an apparently random order, but it is the same order as either the values() or each() function produces (given that the associative array has not been modified). Here is yet another way to print your environment:
@keys = keys %ENV;
@values = values %ENV;
while ($#keys >= 0) {
    print pop(@keys), '=', pop(@values), "\n";
}
or how about sorted by key:
foreach $key (sort(keys %ENV)) {
    print $key, '=', $ENV{$key}, "\n";
}
kill(LIST)
kill LIST Sends a signal to a list of processes. The first element of the list must be the signal to send. Returns the number of processes successfully signaled.
$cnt = kill 1, $child1, $child2;
kill 9, @goners;
If the signal is negative, kills process groups instead of processes. (On System V, a negative process number will also kill process groups, but that’s not portable.) You may use a signal name in quotes.
last LABEL
last The last command is like the break statement in C (as used in loops); it immediately exits the loop in question. If the LABEL is omitted, the command refers to the innermost enclosing loop. The continue block, if any, is not executed:
line: while (<STDIN>) {
    last line if /^$/;    # exit when done with header
    ...
}
length(EXPR) length EXPR
Returns the length in characters of the value of EXPR. If EXPR is omitted, returns length of $_.
link(OLDFILE,NEWFILE)
Creates a new filename linked to the old filename. Returns 1 for success, 0 otherwise.
listen(SOCKET,QUEUESIZE)
Does the same thing that the listen system call does. Returns true if it succeeded, false otherwise. See example in section on Interprocess Communication.
local(LIST)
Declares the listed variables to be local to the enclosing block, subroutine, eval or ‘‘do’’. All the listed elements must be legal lvalues. This operator works by saving the current values of those variables in LIST on a hidden stack and restoring them upon exiting the block, subroutine or eval. This means that called subroutines can also reference the local variable, but not the global one. The LIST may be assigned to if desired, which allows you to initialize your local variables. (If no initializer is given for a particular variable, it is created with an undefined value.) Commonly this is used to name the parameters to a subroutine. Examples:
sub RANGEVAL {
    local($min, $max, $thunk) = @_;
    local($result) = '';
    local($i);

    # Presumably $thunk makes reference to $i

    for ($i = $min; $i < $max; $i++) {
        $result .= eval $thunk;
    }

    $result;
}

if ($sw eq '-v') {
    # init local array with global array
    local(@ARGV) = @ARGV;
    unshift(@ARGV,'echo');
    system @ARGV;
}
# @ARGV restored

# temporarily add to digits associative array
if ($base12) {
    # (NOTE: not claiming this is efficient!)
    local(%digits) = (%digits,'t',10,'e',11);
    do parse_num();
}
Note that local() is a run-time command, and so gets executed every time through a loop, using up more stack storage each time until it’s all released at once when the loop is exited.
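The save-and-restore behavior means a called subroutine sees the caller's local value, not the global one. A minimal sketch, with made-up subroutine names:

```perl
$depth = 0;                         # the global value

sub show { "depth=$depth"; }        # reads whatever $depth is current

sub nested {
    local($depth) = $depth + 1;     # old value goes on the hidden stack
    &show();                        # the called sub sees the local value
}

print &nested(), "\n";              # depth=1
print &show(), "\n";                # depth=0, global restored on exit
```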
localtime(EXPR) localtime EXPR
Converts a time as returned by the time function to a 9-element array with the time analyzed for the local timezone. Typically used as follows:
($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time);
All array elements are numeric, and come straight out of a struct tm. In particular this means that $mon has the range 0..11 and $wday has the range 0..6. If EXPR is omitted, does localtime(time).
log(EXPR)
log EXPR Returns logarithm (base e) of EXPR. If EXPR is omitted, returns log of $_.
lstat(FILEHANDLE)
lstat FILEHANDLE
lstat(EXPR)
lstat SCALARVARIABLE
Does the same thing as the stat() function, but stats a symbolic link instead of the file the symbolic link points to. If symbolic links are unimplemented on your system, a normal stat is done.
m/PATTERN/gio
/PATTERN/gio
Searches a string for a pattern match, and returns true (1) or false (''). If no string is specified via the =~ or !~ operator, the $_ string is searched. (The string specified with =~ need not be an lvalue; it may be the result of an expression evaluation, but remember the =~ binds rather tightly.) See also the section on regular expressions.
If / is the delimiter then the initial ‘m’ is optional. With the ‘m’ you can use any pair of non-alphanumeric characters as delimiters. This is particularly useful for matching Unix path names that contain ‘/’. If the final delimiter is followed by the optional letter ‘i’, the matching is done in a case-insensitive manner. PATTERN may contain references to scalar variables, which will be interpolated (and the pattern recompiled) every time the pattern search is evaluated. (Note that $) and $| may not be interpolated because they look like end-of-string tests.) If you want such a pattern to be compiled only once, add an ‘‘o’’ after the trailing delimiter. This avoids expensive run-time recompilations, and is useful when the value you are interpolating won’t change over the life of the script. If the PATTERN evaluates to a null string, the most recent successful regular expression is used instead.
If used in a context that requires an array value, a pattern match returns an array consisting of the subexpressions matched by the parentheses in the pattern, i.e. ($1, $2, $3...). It does NOT actually set $1, $2, etc. in this case, nor does it set $+, $`, $& or $'. If the match fails, a null array is returned. If the match succeeds, but there were no parentheses, an array value of (1) is returned.
Examples:
open(tty, '/dev/tty');
<tty> =~ /^y/i && do foo();    # do foo if desired

if (/Version: *([0-9.]*)/) { $version = $1; }

next if m#^/usr/spool/uucp#;

# poor man's grep
$arg = shift;
while (<>) {
    print if /$arg/o;    # compile only once
}

if (($F1, $F2, $Etc) = ($foo =~ /^(\S+)\s+(\S+)\s*(.*)/))
This last example splits $foo into the first two words and the remainder of the line, and assigns
those three fields to $F1, $F2 and $Etc. The conditional is true if any variables were assigned,
i.e. if the pattern matched.
The ‘‘g’’ modifier specifies global pattern matching, that is, matching as many times as possible within the string. How it behaves depends on the context. In an array context, it returns a list of all the substrings matched by all the parentheses in the regular expression. If there are no parentheses, it returns a list of all the matched strings, as if there were parentheses around the whole pattern. In a scalar context, it iterates through the string, returning TRUE each time it matches, and FALSE when it eventually runs out of matches. (In other words, it remembers where it left off last time and restarts the search at that point.) It presumes that you have not modified the string since the last match. Modifying the string between matches may result in undefined behavior. (You can actually get away with in-place modifications via substr() that do not change the length of the entire string. In general, however, you should be using s///g for such modifications.) Examples:
# array context
($one,$five,$fifteen) = (`uptime` =~ /(\d+\.\d+)/g);

# scalar context
$/ = ""; $* = 1;
while ($paragraph = <>) {
    while ($paragraph =~ /[a-z]['")]*[.!?]+['")]*\s/g) {
        $sentences++;
    }
}
print "$sentences\n";
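The two contexts can be compared on a fixed string, which avoids depending on uptime or on input files. This is a sketch with made-up data; the variable names are not from the manual.

```perl
$text = "load: 0.25 1.50 12.75";

# Scalar context: each evaluation resumes where the last match ended,
# and $1 and $2 are set on every successful iteration.
while ($text =~ /(\d+)\.(\d+)/g) {
    push(@whole, $1);
    push(@frac, $2);
}
print join(" ", @whole), " | ", join(" ", @frac), "\n";

# Array context: all the parenthesized substrings come back at once.
@all = ($text =~ /(\d+\.\d+)/g);
print scalar(@all), " numbers\n";
```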
mkdir(FILENAME,MODE)
Creates the directory specified by FILENAME, with permissions specified by MODE (as modified by umask). If it succeeds it returns 1, otherwise it returns 0 and sets $! (errno).
msgctl(ID,CMD,ARG)
Calls the System V IPC function msgctl. If CMD is &IPC_STAT, then ARG must be a variable which will hold the returned msqid_ds structure. Returns like ioctl: the undefined value for error, "0 but true" for zero, or the actual return value otherwise.
msgget(KEY,FLAGS)
Calls the System V IPC function msgget. Returns the message queue id, or the undefined value if there is an error.
msgsnd(ID,MSG,FLAGS)
Calls the System V IPC function msgsnd to send the message MSG to the message queue ID. MSG must begin with the long integer message type, which may be created with pack("L", $type). Returns true if successful, or false if there is an error.
msgrcv(ID,VAR,SIZE,TYPE,FLAGS)
Calls the System V IPC function msgrcv to receive a message from message queue ID into variable VAR with a maximum message size of SIZE. Note that if a message is received, the message type will be the first thing in VAR, and the maximum length of VAR is SIZE plus the size of the message type. Returns true if successful, or false if there is an error.
next LABEL
next The next command is like the continue statement in C; it starts the next iteration of the loop:
line: while (<STDIN>) {
    next line if /^#/;    # discard comments
    ...
}
Note that if there were a continue block on the above, it would get executed even on discarded lines. If the LABEL is omitted, the command refers to the innermost enclosing loop.
oct(EXPR)
oct EXPR Returns the decimal value of EXPR interpreted as an octal string. (If EXPR happens to start off with 0x, interprets it as a hex string instead.) The following will handle decimal, octal and hex in the standard notation:
$val = oct($val) if $val =~ /^0/;
If EXPR is omitted, uses $_.
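The conversion idiom can be wrapped in a subroutine and tried on all three notations. A small sketch; to_number is a hypothetical helper name.

```perl
# Convert a string in decimal, octal (leading 0) or hex (leading 0x)
# notation to its numeric value.
sub to_number {
    local($val) = @_;
    $val = oct($val) if $val =~ /^0/;   # oct() also handles 0x...
    $val;
}

foreach $s ("42", "052", "0x2A") {
    print "$s -> ", &to_number($s), "\n";
}
```

All three strings denote the same number, so each line prints 42 on the right-hand side.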
open(FILEHANDLE,EXPR)
open(FILEHANDLE)
open FILEHANDLE
Opens the file whose filename is given by EXPR, and associates it with FILEHANDLE. If FILEHANDLE is an expression, its value is used as the name of the real filehandle wanted. If EXPR is omitted, the scalar variable of the same name as the FILEHANDLE contains the filename. If the filename begins with ‘‘<’’ or nothing, the file is opened for input. If the filename begins with ‘‘>’’, the file is opened for output. If the filename begins with ‘‘>>’’, the file is opened for appending. (You can put a '+' in front of the '>' or '<' to indicate that you want both read and write access to the file.) If the filename begins with ‘‘|’’, the filename is interpreted as a command to which output is to be piped, and if the filename ends with a ‘‘|’’, the filename is interpreted as a command which pipes input to us. (You may not have a command that pipes both in and out.) Opening '-' opens STDIN and opening '>-' opens STDOUT. Open returns non-zero upon success, the undefined value otherwise. If the open involved a pipe, the return value happens to be the pid of the subprocess. Examples:
$article = 100;
open article || die "Can't find article $article: $!\n";
while (<article>) {...

open(LOG, '>>/usr/spool/news/twitlog');    # (log is reserved)

open(article, "caesar <$article |");       # decrypt article

open(extract, "|sort >/tmp/Tmp$$");        # $$ is our process#
# process argument list of files along with any includes

foreach $file (@ARGV) {
    do process($file, 'fh00');    # no pun intended
}

sub process {
    local($filename, $input) = @_;
    $input++;                # this is a string increment
    unless (open($input, $filename)) {
        print STDERR "Can't open $filename: $!\n";
        return;
    }
    while (<$input>) {       # note the use of indirection
        if (/^#include "(.*)"/) {
            do process($1, $input);
            next;
        }
        ...                  # whatever
    }
}
You may also, in the Bourne shell tradition, specify an EXPR beginning with ‘‘>&’’, in which case the rest of the string is interpreted as the name of a filehandle (or file descriptor, if numeric) which is to be duped and opened. You may use & after >, >>, <, +>, +>> and +<. The mode you specify should match the mode of the original filehandle. Here is a script that saves, redirects, and restores STDOUT and STDERR:
#!/usr/bin/perl
open(SAVEOUT, ">&STDOUT");
open(SAVEERR, ">&STDERR");

open(STDOUT, ">foo.out") || die "Can't redirect stdout";
open(STDERR, ">&STDOUT") || die "Can't dup stdout";

select(STDERR); $| = 1;       # make unbuffered
select(STDOUT); $| = 1;       # make unbuffered

print STDOUT "stdout 1\n";    # this works for
print STDERR "stderr 1\n";    # subprocesses too

close(STDOUT);
close(STDERR);

open(STDOUT, ">&SAVEOUT");
open(STDERR, ">&SAVEERR");

print STDOUT "stdout 2\n";
print STDERR "stderr 2\n";
If you open a pipe on the command ‘‘-’’, i.e. either ‘‘|-’’ or ‘‘-|’’, then there is an implicit fork done, and the return value of open is the pid of the child within the parent process, and 0 within the child process. (Use defined($pid) to determine if the open was successful.) The filehandle behaves normally for the parent, but i/o to that filehandle is piped from/to the STDOUT/STDIN of the child process. In the child process the filehandle isn’t opened; i/o happens from/to the new STDOUT or STDIN. Typically this is used like the normal piped open when you want to exercise more control over just how the pipe command gets executed, such as when you are running setuid, and don’t want to have to scan shell commands for metacharacters. The following pairs are more or less equivalent:
open(FOO, "|tr '[a-z]' '[A-Z]'");
open(FOO, "|-") || exec 'tr', '[a-z]', '[A-Z]';

open(FOO, "cat -n '$file'|");
open(FOO, "-|") || exec 'cat', '-n', $file;
Explicitly closing any piped filehandle causes the parent process to wait for the child to finish, and returns the status value in $?. Note: on any operation which may do a fork, unflushed buffers remain unflushed in both processes, which means you may need to set $| to avoid duplicate output.
The filename that is passed to open will have leading and trailing whitespace deleted. In order to open a file with arbitrary weird characters in it, it’s necessary to protect any leading and trailing whitespace thusly:
$file =~ s#^(\s)#./$1#;
open(FOO, "< $file\0");
opendir(DIRHANDLE,EXPR)
Opens a directory named EXPR for processing by readdir(), telldir(), seekdir(), rewinddir() and closedir(). Returns true if successful. DIRHANDLEs have their own namespace separate from FILEHANDLEs.
ord(EXPR)
ord EXPR Returns the numeric ascii value of the first character of EXPR. If EXPR is omitted, uses $_.
pack(TEMPLATE,LIST)
Takes an array or list of values and packs it into a binary structure, returning the string containing the structure. The TEMPLATE is a sequence of characters that give the order and type of values, as follows:
A    An ascii string, will be space padded.
a    An ascii string, will be null padded.
c    A signed char value.
C    An unsigned char value.
s    A signed short value.
S    An unsigned short value.
i    A signed integer value.
I    An unsigned integer value.
l    A signed long value.
L    An unsigned long value.
n    A short in ‘‘network’’ order.
N    A long in ‘‘network’’ order.
f    A single-precision float in the native format.
d    A double-precision float in the native format.
p    A pointer to a string.
v    A short in ‘‘VAX’’ (little-endian) order.
V    A long in ‘‘VAX’’ (little-endian) order.
x    A null byte.
X    Back up a byte.
@    Null fill to absolute position.
u    A uuencoded string.
b    A bit string (ascending bit order, like vec()).
B    A bit string (descending bit order).
h    A hex string (low nybble first).
H    A hex string (high nybble first).
Each letter may optionally be followed by a number which gives a repeat count. With all types except "a", "A", "b", "B", "h" and "H", the pack function will gobble up that many values from the LIST. A * for the repeat count means to use however many items are left. The "a" and "A" types gobble just one value, but pack it as a string of length count, padding with nulls or spaces as necessary. (When unpacking, "A" strips trailing spaces and nulls, but "a" does not.) Likewise, the "b" and "B" fields pack a string that many bits long. The "h" and "H" fields pack a string that many nybbles long. Real numbers (floats and doubles) are in the native machine format only; due to the multiplicity of floating formats around, and the lack of a standard ‘‘network’’ representation, no facility for interchange has been made. This means that packed floating point data written on one machine may not be readable on another - even if both use IEEE floating point arithmetic (as the endian-ness of the memory representation is not part of the IEEE spec). Note that perl uses doubles internally for all numeric calculation, and converting from double -> float -> double will lose precision (i.e. unpack("f", pack("f", $foo)) will not in general equal $foo).
Examples:
$foo = pack("cccc",65,66,67,68); # foo eq "ABCD"
$foo = pack("c4",65,66,67,68); # same thing
$foo = pack("ccxxcc",65,66,67,68); # foo eq "AB\0\0CD"
$foo = pack("s2",1,2);
    # "\1\0\2\0" on little-endian
    # "\0\1\0\2" on big-endian
$foo = pack("a4","abcd","x","y","z"); # "abcd"
$foo = pack("aaaa","abcd","x","y","z"); # "axyz"
$foo = pack("a14","abcdefg"); # "abcdefg\0\0\0\0\0\0\0"
$foo = pack("i9pl", gmtime);
# a real struct tm (on my system anyway)
sub bintodec {
unpack("N", pack("B32", substr("0" x 32 . shift, -32)));
}
The same template may generally also be used in the unpack function.
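A round trip through pack and unpack with one template shows how repeat counts and the "A" padding behave. This is an illustrative sketch; the record layout (port, four address bytes, padded name) is made up and is not a real system structure.

```perl
# 2-byte network-order short, 4 unsigned chars, 8-byte padded string.
$template = "n C4 A8";
$rec = pack($template, 80, 127, 0, 0, 1, "www");
print length($rec), "\n";               # 2 + 4 + 8 bytes

# The same template unpacks the record again.
($port, @rest) = unpack($template, $rec);
$name = pop(@rest);                     # "A" strips the trailing spaces
print "$port ", join(".", @rest), " [$name]\n";
```

Because "n" is defined as network order, the first two bytes are the same on any machine, unlike the "s2" example above.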
pipe(READHANDLE,WRITEHANDLE)
Opens a pair of connected pipes like the corresponding system call. Note that if you set up a loop of piped processes, deadlock can occur unless you are very careful. In addition, note that perl’s pipes use stdio buffering, so you may need to set $| to flush your WRITEHANDLE after each command, depending on the application. [Requires version 3.0 patchlevel 9.]
pop(ARRAY)
pop ARRAY
Pops and returns the last value of the array, shortening the array by 1. Has the same effect as
$tmp = $ARRAY[$#ARRAY--];
If there are no elements in the array, returns the undefined value.
print(FILEHANDLE LIST)
print(LIST)
print FILEHANDLE LIST
print LIST
print Prints a string or a comma-separated list of strings. Returns non-zero if successful. FILEHANDLE may be a scalar variable name, in which case the variable contains the name of the filehandle, thus introducing one level of indirection. (NOTE: If FILEHANDLE is a variable and the next token is a term, it may be misinterpreted as an operator unless you interpose a + or put parens around the arguments.) If FILEHANDLE is omitted, prints by default to standard output (or to the last selected output channel; see select()). If LIST is also omitted, prints $_ to STDOUT. To set the default output channel to something other than STDOUT use the select operation. Note that, because print takes a LIST, anything in the LIST is evaluated in an array context, and any subroutine that you call will have one or more of its expressions evaluated in an array context. Also be careful not to follow the print keyword with a left parenthesis unless you want the corresponding right parenthesis to terminate the arguments to the print; interpose a + or put parens around all the arguments.
printf(FILEHANDLE LIST)
printf(LIST)
printf FILEHANDLE LIST
printf LIST
Equivalent to a ‘‘print FILEHANDLE sprintf(LIST)’’.
push(ARRAY,LIST)
Treats ARRAY (@ is optional) as a stack, and pushes the values of LIST onto the end of ARRAY.
The length of ARRAY increases by the length of LIST. Has the same effect as
for $value (LIST) {
$ARRAY[++$#ARRAY] = $value;
}
but is more efficient.
q/STRING/
qq/STRING/
qx/STRING/
These are not really functions, but simply syntactic sugar to let you avoid putting too many backslashes into quoted strings. The q operator is a generalized single quote, and the qq operator a generalized double quote. The qx operator is a generalized backquote. Any non-alphanumeric delimiter can be used in place of /, including newline. If the delimiter is an opening bracket or parenthesis, the final delimiter will be the corresponding closing bracket or parenthesis. (Embedded occurrences of the closing bracket need to be backslashed as usual.) Examples:
$foo = q!I said, "You said, 'She said it.'"!;
$bar = q('This is it.');
$today = qx{ date };
$_ .= qq
*** The previous line contains the naughty word "$&".\n
    if /(ibm|apple|awk)/;    # :-)
rand(EXPR) rand EXPR
rand Returns a random fractional number between 0 and the value of EXPR. (EXPR should be positive.) If EXPR is omitted, returns a value between 0 and 1. See also srand().
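A common use is turning the fractional result into a random integer with int(). The sketch below assumes, as is the case in the perl implementations I know of, that the result is always strictly less than EXPR; roll_die is a hypothetical helper name.

```perl
# Return a random integer from 1 to 6.
sub roll_die {
    # rand(6) is at least 0 and less than 6; int() truncates to 0..5.
    int(rand(6)) + 1;
}

srand(time);            # seed once; see srand()
for ($i = 0; $i < 10; $i++) {
    push(@rolls, &roll_die());
}
print join(" ", @rolls), "\n";
```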
read(FILEHANDLE,SCALAR,LENGTH,OFFSET) read(FILEHANDLE,SCALAR,LENGTH)
Attempts to read LENGTH bytes of data into variable SCALAR from the specified FILEHANDLE. Returns the number of bytes actually read, or undef if there was an error. SCALAR will be grown or shrunk to the length actually read. An OFFSET may be specified to place the read data at some other place than the beginning of the string. This call is actually implemented in terms of stdio’s fread call. To get a true read system call, see sysread.
readdir(DIRHANDLE) readdir DIRHANDLE
Returns the next directory entry for a directory opened by opendir(). If used in an array context, returns all the rest of the entries in the directory. If there are no more entries, returns an undefined value in a scalar context or a null list in an array context.
readlink(EXPR) readlink EXPR
Returns the value of a symbolic link, if symbolic links are implemented. If not, gives a fatal error. If there is some system error, returns the undefined value and sets $! (errno). If EXPR is omitted, uses $_.
recv(SOCKET,SCALAR,LEN,FLAGS)
Receives a message on a socket. Attempts to receive LEN bytes of data into variable SCALAR from the specified SOCKET filehandle. Returns the address of the sender, or the undefined value if there’s an error. SCALAR will be grown or shrunk to the length actually read. Takes the same flags as the system call of the same name.
redo LABEL
redo The redo command restarts the loop block without evaluating the conditional again. The continue block, if any, is not executed. If the LABEL is omitted, the command refers to the innermost enclosing loop. This command is normally used by programs that want to lie to themselves about what was just input:
# a simpleminded Pascal comment stripper
# (warning: assumes no { or } in strings)
line: while (<STDIN>) {
    while (s|({.*}.*){.*}|$1 |) {}
    s|{.*}| |;
    if (s|{.*| |) {
        $front = $_;
        while (<STDIN>) {
            if (/}/) {      # end of comment?
                s|^|$front{|;
                redo line;
            }
        }
    }
    print;
}
rename(OLDNAME,NEWNAME)
Changes the name of a file. Returns 1 for success, 0 otherwise. Will not work across filesystem boundaries.
require(EXPR) require EXPR
require Includes the library file specified by EXPR, or by $_ if EXPR is not supplied. Has semantics similar to the following subroutine:
sub require {
    local($filename) = @_;
    return 1 if $INC{$filename};
    local($realfilename,$result);
    ITER: {
        foreach $prefix (@INC) {
            $realfilename = "$prefix/$filename";
            if (-f $realfilename) {
                $result = do $realfilename;
                last ITER;
            }
        }
        die "Can't find $filename in \@INC";
    }
    die $@ if $@;
    die "$filename did not return true value" unless $result;
    $INC{$filename} = $realfilename;
    $result;
}
Note that the file will not be included twice under the same specified name. The file must return true as the last statement to indicate successful execution of any initialization code, so it’s customary to end such a file with ‘‘1;’’ unless you’re sure it’ll return true otherwise.
reset(EXPR) reset EXPR
reset Generally used in a continue block at the end of a loop to clear variables and reset ?? searches so that they work again. The expression is interpreted as a list of single characters (hyphens allowed for ranges). All variables and arrays beginning with one of those letters are reset to their pristine state. If the expression is omitted, one-match searches (?pattern?) are reset to match again. Only resets variables or searches in the current package. Always returns 1. Examples:
reset 'X';      # reset all X variables
reset 'a-z';    # reset lower case variables
reset;          # just reset ?? searches
Note: resetting ‘‘A-Z’’ is not recommended since you’ll wipe out your ARGV and ENV arrays.
The use of reset on dbm associative arrays does not change the dbm file. (It does, however, flush any entries cached by perl, which may be useful if you are sharing the dbm file. Then again, maybe not.)
return LIST
Returns from a subroutine with the value specified. (Note that a subroutine can automatically return the value of the last expression evaluated. That’s the preferred method— use of an explicit return is a bit slower.)
reverse(LIST) reverse LIST
In an array context, returns an array value consisting of the elements of LIST in the opposite order. In a scalar context, returns a string value consisting of the bytes of the first element of LIST in the opposite order.
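The difference between the two contexts is easy to trip over, so here is a small sketch with made-up data:

```perl
@words = ("one", "two", "three");

# Array context: the elements come back in the opposite order.
@back = reverse(@words);
print join(" ", @back), "\n";

# Scalar context: the bytes of the string are reversed instead.
$str = reverse("hello");
print "$str\n";
```

Note that print supplies an array context to its arguments, so the scalar form has to be assigned first (or forced with scalar()) as shown.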
rewinddir(DIRHANDLE) rewinddir DIRHANDLE
Sets the current position to the beginning of the directory for the readdir() routine on DIRHANDLE.
rindex(STR,SUBSTR,POSITION)
rindex(STR,SUBSTR)
Works just like index except that it returns the position of the LAST occurrence of SUBSTR in STR. If POSITION is specified, returns the last occurrence at or before that position.
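One typical use is splitting a path at its last slash. This is a sketch only: basename is a hypothetical helper, and it assumes $[ is left at 0 so that a failed search returns -1.

```perl
# Return the part of $path after the last "/".
sub basename {
    local($path) = @_;
    # If there is no slash, rindex gives -1 and substr starts at 0,
    # so the whole string is returned.
    substr($path, rindex($path, "/") + 1);
}

print &basename("/usr/local/bin/perl"), "\n";
print &basename("perl"), "\n";
```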
rmdir(FILENAME) rmdir FILENAME
Deletes the directory specified by FILENAME if it is empty. If it succeeds it returns 1, otherwise it returns 0 and sets $! (errno). If FILENAME is omitted, uses $_.
s/PATTERN/REPLACEMENT/gieo
Searches a string for a pattern, and if found, replaces that pattern with the replacement text and returns the number of substitutions made. Otherwise it returns false (0). The ‘‘g’’ is optional, and if present, indicates that all occurrences of the pattern are to be replaced. The ‘‘i’’ is also optional, and if present, indicates that matching is to be done in a case-insensitive manner. The ‘‘e’’ is likewise optional, and if present, indicates that the replacement string is to be evaluated as an expression rather than just as a double-quoted string. Any non-alphanumeric delimiter may replace the slashes; if single quotes are used, no interpretation is done on the replacement string (the e modifier overrides this, however); if backquotes are used, the replacement string is a command to execute whose output will be used as the actual replacement text. If the PATTERN is delimited by bracketing quotes, the REPLACEMENT has its own pair of quotes, which may or may not be bracketing quotes, e.g. s(foo)(bar) or s<foo>/bar/. If no string is specified via the =~ or !~ operator, the $_ string is searched and modified. (The string specified with =~ must be a scalar variable, an array element, or an assignment to one of those, i.e. an lvalue.) If the pattern contains a $ that looks like a variable rather than an end-of-string test, the variable will be interpolated into the pattern at run-time. If you only want the pattern compiled once the first time the variable is interpolated, add an ‘‘o’’ at the end. If the PATTERN evaluates to a null string, the most recent successful regular expression is used instead. See also the section on regular expressions. Examples:
s/\bgreen\b/mauve/g;            # don't change wintergreen
$path =~ s|/usr/bin|/usr/local/bin|;
s/Login: $foo/Login: $bar/;     # run-time pattern
($foo = $bar) =~ s/bar/foo/;
$_ = 'abc123xyz';
s/\d+/$&*2/e;                   # yields 'abc246xyz'
s/\d+/sprintf("%5d",$&)/e;      # yields 'abc  246xyz'
s/\w/$& x 2/eg;                 # yields 'aabbcc  224466xxyyzz'
s/([^ ]*) *([^ ]*)/$2 $1/;      # reverse 1st two fields
(Note the use of $ instead of \ in the last example. See section on regular expressions.)
scalar(EXPR)
Forces EXPR to be interpreted in a scalar context and returns the value of EXPR.
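A small sketch of the effect, using an illustrative array named @list (not from the text above): inside a list, an array normally contributes its elements, while scalar() makes it contribute its length instead.

```perl
@list  = ('a', 'b', 'c');
@copy  = (@list);             # array context: the three elements
@count = (scalar(@list));     # scalar context forced: one element, the length
print "copy has ", scalar(@copy), " elements\n";
print "count holds $count[0]\n";
```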
seek(FILEHANDLE,POSITION,WHENCE)
Randomly positions the file pointer for FILEHANDLE, just like the fseek() call of stdio. FILEHANDLE may be an expression whose value gives the name of the filehandle. Returns 1 upon success, 0 otherwise.
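The following is a minimal sketch rather than a definitive recipe. It assumes the usual fseek() meanings for WHENCE (0 measures from the start of the file, 2 from the end), which the text above does not spell out, and it uses a hypothetical scratch file under /tmp.

```perl
$file = "/tmp/seek_demo.$$";                 # hypothetical scratch file
open(OUT, ">$file") || die "can't create $file: $!\n";
print OUT "0123456789";
close(OUT);

open(IN, $file) || die "can't open $file: $!\n";
seek(IN, 4, 0) || die "seek failed: $!\n";   # WHENCE 0: offset from the start
read(IN, $buf, 3);                           # the three bytes at offsets 4..6
seek(IN, -2, 2) || die "seek failed: $!\n";  # WHENCE 2: offset from the end
read(IN, $tail, 2);                          # the last two bytes
close(IN);
unlink($file);
print "$buf $tail\n";
```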
seekdir(DIRHANDLE,POS)
Sets the current position for the readdir() routine on DIRHANDLE. POS must be a value returned by telldir(). Has the same caveats about possible directory compaction as the corresponding system library routine.
select(FILEHANDLE)
select
Returns the currently selected filehandle. Sets the current default filehandle for output, if FILEHANDLE is supplied. This has two effects: first, a write or a print without a filehandle will default to this FILEHANDLE. Second, references to variables related to output will refer to this output channel. For example, if you have to set the top of form format for more than one output channel, you might do the following:
select(REPORT1);
$^ = 'report1_top';
select(REPORT2);
$^ = 'report2_top';
FILEHANDLE may be an expression whose value gives the name of the actual filehandle. Thus:
$oldfh = select(STDERR); $| = 1; select($oldfh);
select(RBITS,WBITS,EBITS,TIMEOUT)
This calls the select system call with the bitmasks specified, which can be constructed using fileno() and vec(), along these lines:
$rin = $win = $ein = '';
vec($rin,fileno(STDIN),1) = 1;
vec($win,fileno(STDOUT),1) = 1;
$ein = $rin | $win;
If you want to select on many filehandles you might wish to write a subroutine:
sub fhbits {
    local(@fhlist) = split(' ',$_[0]);
    local($bits);
    for (@fhlist) {
        vec($bits,fileno($_),1) = 1;
    }
    $bits;
}
$rin = &fhbits('STDIN TTY SOCK');
The usual idiom is:
($nfound,$timeleft) =
    select($rout=$rin, $wout=$win, $eout=$ein, $timeout);
or to block until something becomes ready:
$nfound = select($rout=$rin, $wout=$win, $eout=$ein, undef);
Any of the bitmasks can also be undef. The timeout, if specified, is in seconds, which may be fractional. NOTE: not all implementations are capable of returning the $timeleft. If not, they always return $timeleft equal to the supplied $timeout.
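Since all three bitmasks may be undef and the timeout may be fractional, one possible use is a pause of less than a whole number of seconds. This is only a sketch; the 1.1 second figure is arbitrary, and the elapsed time is measured with time, which has one-second resolution.

```perl
$before = time;
select(undef, undef, undef, 1.1);   # no filehandles to watch: just wait 1.1 seconds
$elapsed = time - $before;
print "waited about $elapsed second(s)\n";
```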
semctl(ID,SEMNUM,CMD,ARG)
Calls the System V IPC function semctl. If CMD is &IPC_STAT or &GETALL, then ARG must be a variable which will hold the returned semid_ds structure or semaphore value array. Returns like ioctl: the undefined value for error, "0 but true" for zero, or the actual return value otherwise.
semget(KEY,NSEMS,SIZE,FLAGS)
Calls the System V IPC function semget. Returns the semaphore id, or the undefined value if there is an error.
semop(KEY,OPSTRING)
Calls the System V IPC function semop to perform semaphore operations such as signaling and waiting. OPSTRING must be a packed array of semop structures. Each semop structure can be generated with ’pack("sss", $semnum, $semop, $semflag)’. The number of semaphore operations is implied by the length of OPSTRING. Returns true if successful, or false if there is an error. As an example, the following code waits on semaphore $semnum of semaphore id $semid:
$semop = pack("sss", $semnum, -1, 0);
die "Semaphore trouble: $!\n" unless semop($semid, $semop);
To signal the semaphore, replace "-1" with "1".
send(SOCKET,MSG,FLAGS,TO)
send(SOCKET,MSG,FLAGS)
Sends a message on a socket. Takes the same flags as the system call of the same name. On unconnected sockets you must specify a destination to send TO. Returns the number of characters sent, or the undefined value if there is an error.
setpgrp(PID,PGRP)
Sets the current process group for the specified PID, 0 for the current process. Will produce a fatal error if used on a machine that doesn’t implement setpgrp(2).
setpriority(WHICH,WHO,PRIORITY)
Sets the current priority for a process, a process group, or a user. (See setpriority(2).) Will produce a fatal error if used on a machine that doesn’t implement setpriority(2).
setsockopt(SOCKET,LEVEL,OPTNAME,OPTVAL)
Sets the socket option requested. Returns undefined if there is an error. OPTVAL may be specified as undef if you don’t want to pass an argument.
shift(ARRAY)
shift ARRAY
shift
Shifts the first value of the array off and returns it, shortening the array by 1 and moving everything down. If there are no elements in the array, returns the undefined value. If ARRAY is omitted, shifts the @ARGV array in the main program, and the @_ array in subroutines. (This is determined lexically.) See also unshift(), push() and pop(). Shift() and unshift() do the same thing to the left end of an array that push() and pop() do to the right end.
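A brief sketch of the behavior described above, with illustrative array names (@queue and @empty are not from the text):

```perl
@queue = ('a', 'b', 'c');
$first = shift(@queue);        # $first is 'a'; @queue is now ('b','c')
unshift(@queue, 'z');          # @queue is now ('z','b','c')
@empty = ();
$none = shift(@empty);         # nothing to shift: the undefined value
print "first=$first queue=@queue\n";
print "shift on an empty array is undefined\n" unless defined($none);
```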
shmctl(ID,CMD,ARG)
Calls the System V IPC function shmctl. If CMD is &IPC_STAT, then ARG must be a variable which will hold the returned shmid_ds structure. Returns like ioctl: the undefined value for error, "0 but true" for zero, or the actual return value otherwise.
shmget(KEY,SIZE,FLAGS)
Calls the System V IPC function shmget. Returns the shared memory segment id, or the undefined value if there is an error.
shmread(ID,VAR,POS,SIZE)
shmwrite(ID,STRING,POS,SIZE)
Reads or writes the System V shared memory segment ID starting at position POS for size SIZE by attaching to it, copying in/out, and detaching from it. When reading, VAR must be a variable which will hold the data read. When writing, if STRING is too long, only SIZE bytes are used; if STRING is too short, nulls are written to fill out SIZE bytes. Return true if successful, or false if there is an error.
shutdown(SOCKET,HOW)
Shuts down a socket connection in the manner indicated by HOW, which has the same interpretation as in the system call of the same name.
sin(EXPR)
sin EXPR
Returns the sine of EXPR (expressed in radians). If EXPR is omitted, returns sine of $_.
sleep(EXPR)
sleep EXPR
sleep
Causes the script to sleep for EXPR seconds, or forever if no EXPR. May be interrupted by sending the process a SIGALRM. Returns the number of seconds actually slept. You probably cannot mix alarm() and sleep() calls, since sleep() is often implemented using alarm().
socket(SOCKET,DOMAIN,TYPE,PROTOCOL)
Opens a socket of the specified kind and attaches it to filehandle SOCKET. DOMAIN, TYPE and PROTOCOL are specified the same as for the system call of the same name. You may need to run h2ph on sys/socket.h to get the proper values handy in a perl library file. Return true if successful. See the example in the section on Interprocess Communication.
socketpair(SOCKET1,SOCKET2,DOMAIN,TYPE,PROTOCOL)
Creates an unnamed pair of sockets in the specified domain, of the specified type. DOMAIN, TYPE and PROTOCOL are specified the same as for the system call of the same name. If unimplemented, yields a fatal error. Return true if successful.
sort(SUBROUTINE LIST)
sort(LIST)
sort SUBROUTINE LIST
sort BLOCK LIST
sort LIST
Sorts the LIST and returns the sorted array value. Nonexistent values of arrays are stripped out. If SUBROUTINE or BLOCK is omitted, sorts in standard string comparison order. If SUBROUTINE is specified, gives the name of a subroutine that returns an integer less than, equal to, or greater than 0, depending on how the elements of the array are to be ordered. (The <=> and cmp operators are extremely useful in such routines.) SUBROUTINE may be a scalar variable name, in which case the value provides the name of the subroutine to use. In place of a SUBROUTINE name, you can provide a BLOCK as an anonymous, in-line sort subroutine.
In the interests of efficiency the normal calling code for subroutines is bypassed, with the following effects: the subroutine may not be a recursive subroutine, and the two elements to be compared are passed into the subroutine not via @_ but as $a and $b (see example below). They are passed by reference so don’t modify $a and $b.
Examples:
# sort lexically
@articles = sort @files;
# same thing, but with explicit sort routine
@articles = sort {$a cmp $b} @files;
# same thing in reversed order
@articles = sort {$b cmp $a} @files;
# sort numerically ascending
@articles = sort {$a <=> $b} @files;
# sort numerically descending
@articles = sort {$b <=> $a} @files;
# sort using explicit subroutine name
sub byage {
    $age{$a} <=> $age{$b};    # presuming integers
}
@sortedclass = sort byage @class;
sub reverse { $b cmp $a; }
@harry = ('dog','cat','x','Cain','Abel');
@george = ('gone','chased','yz','Punished','Axed');
print sort @harry;
    # prints AbelCaincatdogx
print sort reverse @harry;
    # prints xdogcatCainAbel
print sort @george, 'to', @harry;
    # prints AbelAxedCainPunishedcatchaseddoggonetoxyz
splice(ARRAY,OFFSET,LENGTH,LIST)
splice(ARRAY,OFFSET,LENGTH)
splice(ARRAY,OFFSET)
Removes the elements designated by OFFSET and LENGTH from an array, and replaces them with the elements of LIST, if any. Returns the elements removed from the array. The array grows or shrinks as necessary. If LENGTH is omitted, removes everything from OFFSET onward. The following equivalencies hold (assuming $[ == 0):
push(@a,$x,$y)       splice(@a,$#a+1,0,$x,$y)
pop(@a)              splice(@a,-1)
shift(@a)            splice(@a,0,1)
unshift(@a,$x,$y)    splice(@a,0,0,$x,$y)
$a[$x] = $y          splice(@a,$x,1,$y);
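As a rough check of these equivalencies, the sketch below applies each operation to @a and the matching splice to a copy @b, then finishes with a replacing splice. The sample values are illustrative only.

```perl
@a = (1, 2, 3, 4, 5);
@b = @a;

push(@a, 'x', 'y');
splice(@b, $#b + 1, 0, 'x', 'y');      # same effect as the push

$p = pop(@a);
($q) = splice(@b, -1);                 # same effect as the pop

unshift(@a, 'u', 'v');
splice(@b, 0, 0, 'u', 'v');            # same effect as the unshift

@removed = splice(@a, 2, 2, 'NEW');    # replace two elements with one
splice(@b, 2, 2, 'NEW');

print "a: @a\n";
print "b: @b\n";
print "removed: @removed\n";
```

Both arrays end up holding the same elements, and @removed holds the two elements that the last splice took out.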
Example, assuming array lengths are passed before arrays:
sub aeq {    # compare two array values
    local(@a) = splice(@_,0,shift);
    local(@b) = splice(@_,0,shift);
    return 0 unless @a == @b;    # same len?
    while (@a) {
        return 0 if pop(@a) ne pop(@b);
    }
    return 1;
}
if (&aeq($len,@foo[1..$len],0+@bar,@bar)) { ... }
split(/PATTERN/,EXPR,LIMIT)
split(/PATTERN/,EXPR)
split(/PATTERN/)
split
Splits a string into an array of strings, and returns it. (If not in an array context, returns the number of fields found and splits into the @_ array. (In an array context, you can force the split into @_ by using ?? as the pattern delimiters, but it still returns the array value.)) If EXPR is omitted, splits the $_ string. If PATTERN is also omitted, splits on whitespace (/[ \t\n]+/). Anything matching PATTERN is taken to be a delimiter separating the fields. (Note that the delimiter may be longer than one character.) If LIMIT is specified, splits into no more than that many fields (though it may split into fewer). If LIMIT is unspecified, trailing null fields are stripped (which potential users of pop() would do well to remember). A pattern matching the null string (not to be confused with a null pattern //, which is just one member of the set of patterns matching a null string) will split the value of EXPR into separate characters at each point it matches that way. For example:
print join(':', split(/ */, 'hi there'));
produces the output ‘h:i:t:h:e:r:e’.
The LIMIT parameter can be used to partially split a line:
($login, $passwd, $remainder) = split(/:/, $_, 3);
(When assigning to a list, if LIMIT is omitted, perl supplies a LIMIT one larger than the number of variables in the list, to avoid unnecessary work. For the list above LIMIT would have been 4 by default. In time critical applications it behooves you not to split into more fields than you really need.)
If the PATTERN contains parentheses, additional array elements are created from each matching substring in the delimiter.
split(/([,-])/,"1-10,20");
produces the array value
(1,'-',10,',',20)
The pattern /PATTERN/ may be replaced with an expression to specify patterns that vary at runtime. (To do runtime compilation only once, use /$variable/o.) As a special case, specifying a space (' ') will split on white space just as split with no arguments does, but leading white space does NOT produce a null first field. Thus, split(' ') can be used to emulate awk’s default behavior, whereas split(/ /) will give you as many null initial fields as there are leading spaces.
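To make the difference concrete, here is a small sketch with an illustrative input line that begins with two spaces:

```perl
$line  = "  alpha beta";           # two leading spaces
@awk   = split(' ', $line);        # awk-like: leading white space ignored
@plain = split(/ /, $line);        # one null field per leading space
print scalar(@awk), " fields with ' '\n";
print scalar(@plain), " fields with / /\n";
```

The first split yields two fields; the second yields four, the first two of which are null.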
Example:
open(passwd, '/etc/passwd');
while (<passwd>) {
    ($login, $passwd, $uid, $gid, $gcos, $home, $shell) = split(/:/);
    ...
}
(Note that $shell above will still have a newline on it. See chop().) See also join.
sprintf(FORMAT,LIST)
Returns a string formatted by the usual printf conventions. The * character is not supported.
sqrt(EXPR)
sqrt EXPR
Returns the square root of EXPR. If EXPR is omitted, returns square root of $_.
srand(EXPR)
srand EXPR
Sets the random number seed for the rand operator. If EXPR is omitted, does srand(time).
stat(FILEHANDLE)
stat FILEHANDLE
stat(EXPR)
stat SCALARVARIABLE
Returns a 13-element array giving the statistics for a file, either the file opened via FILEHANDLE, or named by EXPR. Returns a null list if the stat fails. Typically used as follows:
($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size,
$atime,$mtime,$ctime,$blksize,$blocks)
= stat($filename);
If stat is passed the special filehandle consisting of an underline, no stat is done, but the current contents of the stat structure from the last stat or filetest are returned. Example:
if (-x $file && (($d) = stat(_)) && $d < 0) {
    print "$file is executable NFS file\n";
}
(This only works on machines for which the device number is negative under NFS.)
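A short sketch of typical use follows. The scratch file name under /tmp is hypothetical, and the example assumes a Unix-like system where the newline is written as a single byte.

```perl
$file = "/tmp/stat_demo.$$";            # hypothetical scratch file
open(OUT, ">$file") || die "can't create $file: $!\n";
print OUT "hello\n";
close(OUT);

($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size,
    $atime,$mtime,$ctime,$blksize,$blocks) = stat($file);
printf "%s: %d bytes, permissions %o\n", $file, $size, $mode & 0777;

@missing = stat("$file.nonexistent");   # a failed stat gives a null list
print "second file could not be statted\n" unless @missing;
unlink($file);
```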
study(SCALAR)
study SCALAR
study
Takes extra time to study SCALAR ($_ if unspecified) in anticipation of doing many pattern matches on the string before it is next modified. This may or may not save time, depending on the nature and number of patterns you are searching on, and on the distribution of character frequencies in the string to be searched; you probably want to compare runtimes with and without it to see which runs faster. Those loops which scan for many short constant strings (including the constant parts of more complex patterns) will benefit most. You may have only one study active at a time; if you study a different scalar the first is ‘‘unstudied’’. (The way study works is this: a linked list of every character in the string to be searched is made, so we know, for example, where all the ‘k’ characters are. From each search string, the rarest character is selected, based on some static frequency tables constructed from some C programs and English text. Only those places that contain this ‘‘rarest’’ character are examined.)
For example, here is a loop which inserts index producing entries before any line containing a certain pattern:
while (<>) {
    study;
    print ".IX foo\n" if /\bfoo\b/;
    print ".IX bar\n" if /\bbar\b/;
    print ".IX blurfl\n" if /\bblurfl\b/;
    ...
    print;
}
In searching for /\bfoo\b/, only those locations in $_ that contain ‘f’ will be looked at, because ‘f’ is rarer than ‘o’. In general, this is a big win except in pathological cases. The only question is whether it saves you more time than it took to build the linked list in the first place.
Note that if you have to look for strings that you don’t know till runtime, you can build an entire loop as a string and eval that to avoid recompiling all your patterns all the time. Together with undefining $/ to input entire files as one record, this can be very fast, often faster than specialized programs like fgrep. The following scans a list of files (@files) for a list of words (@words), and prints out the names of those files that contain a match:
$search = 'while (<>) { study;';
foreach $word (@words) {
    $search .= "++\$seen{\$ARGV} if /\\b$word\\b/;\n";
}
$search .= "}";
@ARGV = @files;
undef $/;
eval $search;    # this screams
$/ = "\n";       # put back to normal input delim
foreach $file (sort keys(%seen)) {
    print $file, "\n";
}
substr(EXPR,OFFSET,LEN)
substr(EXPR,OFFSET)
Extracts a substring out of EXPR and returns it. First character is at offset 0, or whatever you’ve set $[ to. If OFFSET is negative, starts that far from the end of the string. If LEN is omitted, returns everything to the end of the string. You can use the substr() function as an lvalue, in which case EXPR must be an lvalue. If you assign something shorter than LEN, the string will shrink, and if you assign something longer than LEN, the string will grow to accommodate it. To keep the string the same length you may need to pad or chop your value using sprintf().
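The lvalue behavior can be sketched as follows, assuming $[ is left at its default of 0. The strings and variable names are illustrative only.

```perl
$s = "Hello, world";
$tail = substr($s, -5);                      # negative OFFSET counts from the end
$t = $s;
substr($t, 0, 5) = "Bye";                    # shorter than LEN: $t shrinks
$u = $s;
substr($u, 0, 5) = "Greetings";              # longer than LEN: $u grows
$v = $s;
substr($v, 0, 5) = sprintf("%-5s", "Hi");    # padded to LEN: length unchanged
print "$tail\n$t\n$u\n$v\n";
```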
symlink(OLDFILE,NEWFILE)
Creates a new filename symbolically linked to the old filename. Returns 1 for success, 0 otherwise. On systems that don’t support symbolic links, produces a fatal error at run time. To check for that, use eval:
$symlink_exists = (eval 'symlink("","");', $@ eq '');
syscall(LIST)
syscall LIST
Calls the system call specified as the first element of the list, passing the remaining elements as arguments to the system call. If unimplemented, produces a fatal error. The arguments are interpreted as follows: if a given argument is numeric, the argument is passed as an int. If not, the pointer to the string value is passed. You are responsible for making sure a string is pre-extended long enough to receive any result that might be written into a string. If your integer arguments are not literals and have never been interpreted in a numeric context, you may need to add 0 to them to force them to look like numbers.
require 'syscall.ph';    # may need to run h2ph
syscall(&SYS_write, fileno(STDOUT), "hi there\n", 9);
sysread(FILEHANDLE,SCALAR,LENGTH,OFFSET)
sysread(FILEHANDLE,SCALAR,LENGTH)
Attempts to read LENGTH bytes of data into variable SCALAR from the specified FILEHANDLE, using the system call read(2). It bypasses stdio, so mixing this with other kinds of reads may cause confusion. Returns the number of bytes actually read, or undef if there was an error. SCALAR will be grown or shrunk to the length actually read. An OFFSET may be specified to place the read data at some other place than the beginning of the string.
system(LIST)
system LIST
Does exactly the same thing as ‘‘exec LIST’’ except that a fork is done first, and the parent process waits for the child process to complete. Note that argument processing varies depending on the number of arguments. The return value is the exit status of the program as returned by the wait() call. To get the actual exit value divide by 256. See also exec.
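A minimal sketch of recovering the exit value. It assumes the $^X variable (the name of the running perl binary) is available, and runs a child perl whose only job is to exit with value 3; both choices are illustrative.

```perl
$status = system($^X, '-e', 'exit 3');   # child perl exits with value 3
$exit = $status / 256;                   # the actual exit value
print "raw status $status, exit value $exit\n";
```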
syswrite(FILEHANDLE,SCALAR,LENGTH,OFFSET)
syswrite(FILEHANDLE,SCALAR,LENGTH)
Attempts to write LENGTH bytes of data from variable SCALAR to the specified FILEHANDLE, using the system call write(2). It bypasses stdio, so mixing this with prints may cause confusion. Returns the number of bytes actually written, or undef if there was an error. An OFFSET may be specified to take the data to be written from some place other than the beginning of the string.
tell(FILEHANDLE)
tell FILEHANDLE
tell
Returns the current file position for FILEHANDLE. FILEHANDLE may be an expression whose value gives the name of the actual filehandle. If FILEHANDLE is omitted, assumes the file last read.
telldir(DIRHANDLE)
telldir DIRHANDLE
Returns the current position of the readdir() routines on DIRHANDLE. Value may be given to seekdir() to access a particular location in a directory. Has the same caveats about possible directory compaction as the corresponding system library routine.
time
Returns the number of non-leap seconds since 00:00:00 UTC, January 1, 1970. Suitable for feeding to gmtime() and localtime().
times
Returns a four-element array giving the user and system times, in seconds, for this process and the children of this process.
($user,$system,$cuser,$csystem) = times;
tr/SEARCHLIST/REPLACEMENTLIST/cds
y/SEARCHLIST/REPLACEMENTLIST/cds
Translates all occurrences of the characters found in the search list with the corresponding character in the replacement list. It returns the number of characters replaced or deleted. If no string is specified via the =~ or !~ operator, the $_ string is translated. (The string specified with =~ must be a scalar variable, an array element, or an assignment to one of those, i.e. an lvalue.) For sed devotees, y is provided as a synonym for tr. If the SEARCHLIST is delimited by bracketing quotes, the REPLACEMENTLIST has its own pair of quotes, which may or may not be bracketing quotes, e.g. tr[A-Z][a-z] or tr(+-*/)/ABCD/.
If the c modifier is specified, the SEARCHLIST character set is complemented. If the d modifier is specified, any characters specified by SEARCHLIST that are not found in REPLACEMENTLIST are deleted. (Note that this is slightly more flexible than the behavior of some tr programs, which delete anything they find in the SEARCHLIST, period.) If the s modifier is specified, sequences of characters that were translated to the same character are squashed down to 1 instance of the character.
If the d modifier was used, the REPLACEMENTLIST is always interpreted exactly as specified. Otherwise, if the REPLACEMENTLIST is shorter than the SEARCHLIST, the final character is replicated till it is long enough. If the REPLACEMENTLIST is null, the SEARCHLIST is replicated. This latter is useful for counting characters in a class, or for squashing character sequences in a class.
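The interaction between the d modifier and a short REPLACEMENTLIST can be sketched like this; the sample string and variable names are illustrative only.

```perl
($with_d = "hello world") =~ tr/lo/0/d;      # l -> 0; o has no counterpart, so it is deleted
($without_d = "hello world") =~ tr/lo/0/;    # l -> 0; final 0 replicated, so o -> 0 as well
$_ = "hello world";
$letters = tr/a-z//;                         # null REPLACEMENTLIST: counts, $_ unchanged
print "$with_d\n$without_d\n$letters\n";
```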
Examples:
$ARGV[1] =~ y/A-Z/a-z/;    # canonicalize to lower case
$cnt = tr/*/*/;            # count the stars in $_
$cnt = tr/0-9//;           # count the digits in $_
tr/a-zA-Z//s;              # bookkeeper -> bokeper
($HOST = $host) =~ tr/a-z/A-Z/;
y/a-zA-Z/ /cs;             # change non-alphas to single space
tr/\200-\377/\0-\177/;     # delete 8th bit
truncate(FILEHANDLE,LENGTH)
truncate(EXPR,LENGTH)
Truncates the file opened on FILEHANDLE, or named by EXPR, to the specified length. Produces a fatal error if truncate isn’t implemented on your system.
umask(EXPR)
umask EXPR
umask
Sets the umask for the process and returns the old one. If EXPR is omitted, merely returns current umask.
undef(EXPR)
undef EXPR
undef
Undefines the value of EXPR, which must be an lvalue. Use only on a scalar value, an entire array, or a subroutine name (using &). (Undef will probably not do what you expect on most predefined variables or dbm array values.) Always returns the undefined value. You can omit the EXPR, in which case nothing is undefined, but you still get an undefined value that you could, for instance, return from a subroutine. Examples:
undef $foo;
undef $bar{'blurfl'};
undef @ary;
undef %assoc;
undef &mysub;
return (wantarray ? () : undef) if $they_blew_it;
unlink(LIST)
unlink LIST
Deletes a list of files. Returns the number of files successfully deleted.
$cnt = unlink 'a', 'b', 'c';
unlink @goners;
unlink <*.bak>;
Note: unlink will not delete directories unless you are superuser and the -U flag is supplied to perl. Even if these conditions are met, be warned that unlinking a directory can inflict damage on your filesystem. Use rmdir instead.
unpack(TEMPLATE,EXPR)
Unpack does the reverse of pack: it takes a string representing a structure and expands it out into an array value, returning the array value. (In a scalar context, it merely returns the first value produced.) The TEMPLATE has the same format as in the pack function. Here’s a subroutine that does substring:
sub substr {
    local($what,$where,$howmuch) = @_;
    unpack("x$where a$howmuch", $what);
}
and then there’s
sub ord { unpack("c",$_[0]); }
In addition, you may prefix a field with a %<number> to indicate that you want a <number>-bit checksum of the items instead of the items themselves. Default is a 16-bit checksum. For example, the following computes the same number as the System V sum program:
while (<>) {
$checksum += unpack("%16C*", $_);
}
$checksum %= 65536;
unshift(ARRAY,LIST)
Does the opposite of a shift. Or the opposite of a push, depending on how you look at it. Prepends list to the front of the array, and returns the number of elements in the new array.
unshift(ARGV, '-e') unless $ARGV[0] =~ /^-/;
utime(LIST)
utime LIST
Changes the access and modification times on each file of a list of files. The first two elements of the list must be the NUMERICAL access and modification times, in that order. Returns the number of files successfully changed. The inode modification time of each file is set to the current time. Example of a ‘‘touch’’ command:
#!/usr/bin/perl
$now = time;
utime $now, $now, @ARGV;
values(ASSOC_ARRAY)
values ASSOC_ARRAY
Returns a normal array consisting of all the values of the named associative array. The values are returned in an apparently random order, but it is the same order as either the keys() or each() function would produce on the same array. See also keys() and each().
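The correspondence with keys() can be sketched as below. The associative array %age and its contents are illustrative; the order printed may differ from run to run, but each key lines up with its own value as long as the array is not modified in between.

```perl
%age = ('tom', 30, 'ann', 25, 'bob', 41);
@k = keys(%age);
@v = values(%age);
$ok = 0;
for ($i = 0; $i < @k; $i++) {
    print "$k[$i] is $v[$i]\n";
    $ok++ if $age{$k[$i]} == $v[$i];    # value i belongs to key i
}
print "all $ok pairs line up\n";
```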
vec(EXPR,OFFSET,BITS)
Treats a string as a vector of unsigned integers, and returns the value of the bitfield specified. May also be assigned to. BITS must be a power of two from 1 to 32.
Vectors created with vec() can also be manipulated with the logical operators |, & and ^, which will assume a bit vector operation is desired when both operands are strings. This interpretation is not enabled unless there is at least one vec() in your program, to protect older programs.
To transform a bit vector into a string or array of 0’s and 1’s, use these:
$bits = unpack("b*", $vector);
@bits = split(//, unpack("b*", $vector));
If you know the exact length in bits, it can be used in place of the *.
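Putting these pieces together, here is a small sketch that sets three single-bit fields and then expands the vector into a string of 0's and 1's. The bit positions are arbitrary choices for illustration.

```perl
$vector = '';
vec($vector, 0, 1) = 1;       # set bit 0
vec($vector, 3, 1) = 1;       # set bit 3
vec($vector, 9, 1) = 1;       # set bit 9; the string grows to two bytes
$bits = unpack("b*", $vector);
print "$bits\n";
print "bit 3 is set\n" if vec($vector, 3, 1);
print "bit 4 is clear\n" unless vec($vector, 4, 1);
```

The string in $bits lists the bits in the same order vec() numbers them, so character N of the string corresponds to bit N of the vector.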
wait
Waits for a child process to terminate and returns the pid of the deceased process, or -1 if there are no child processes. The status is returned in $?.
waitpid(PID,FLAGS)
Waits for a particular child process to terminate and returns the pid of the deceased process, or -1 if there is no such child process. The status is returned in $?. If you say
require "sys/wait.h";
...
waitpid(-1,&WNOHANG);
then you can do a non-blocking wait for any process. Non-blocking wait is only available on machines supporting either the waitpid(2) or wait4(2) system calls. However, waiting for a particular pid with FLAGS of 0 is implemented everywhere. (Perl emulates the system call by remembering the status values of processes that have exited but have not been harvested by the Perl script yet.)
wantarray
Returns true if the context of the currently executing subroutine is looking for an array value. Returns false if the context is looking for a scalar.
return wantarray ? () : undef;
warn(LIST)
warn LIST
Produces a message on STDERR just like ‘‘die’’, but doesn’t exit.
write(FILEHANDLE)
write(EXPR)
write
Writes a formatted record (possibly multi-line) to the specified file, using the format associated with that file. By default the format for a file is the one having the same name as the filehandle, but the format for the current output channel (see select) may be set explicitly by assigning the name of the format to the $~ variable.
Top of form processing is handled automatically: if there is insufficient room on the current page for the formatted record, the page is advanced by writing a form feed, a special top-of-page format is used to format the new page header, and then the record is written. By default the top-of-page format is the name of the filehandle with ‘‘_TOP’’ appended, but it may be dynamically set to the format of your choice by assigning the name to the $^ variable while the filehandle is selected. The number of lines remaining on the current page is in variable $-, which can be set to 0 to force a new page.
If FILEHANDLE is unspecified, output goes to the current default output channel, which starts out as STDOUT but may be changed by the select operator. If the FILEHANDLE is an EXPR, then the expression is evaluated and the resulting string is used to look up the name of the FILEHANDLE at run time. For more on formats, see the section on formats later on.
Note that write is NOT the opposite of read.