Traps
Accustomed awk users should take special note of the following:
-
Semicolons are required after all simple statements in perl
(except at the end of a block). Newline is not a statement delimiter.
-
Curly brackets are required on ifs and whiles.
-
Variables begin with $ or @ in perl.
-
Arrays index from 0 unless you set $[. Likewise string positions in
substr() and index().
-
You have to decide whether your array has numeric or string indices.
-
Associative array values do not spring into existence upon mere
reference.
-
You have to decide whether you want to use string or numeric
comparisons.
-
Reading an input line does not split it for you. You have to split it
into an array yourself, and the split operator takes different
arguments.
-
The current input line is normally in $_, not $0. It generally does
not have the newline stripped. ($0 is the name of the program executed.)
-
$<digit> does not refer to fields; it refers to substrings
matched by the last match pattern.
-
The print statement does not add field and record separators
unless you set $, and $\.
-
You must open your files before you print to them.
-
The range operator is ‘‘..’’, not comma. (The comma operator works
as in C.)
-
The match operator is ‘‘=~’’, not ‘‘~’’. (‘‘~’’ is the one’s
complement operator, as in C.)
-
The exponentiation operator is ‘‘**’’, not ‘‘^’’. (‘‘^’’ is the
XOR operator, as in C.)
-
The concatenation operator is ‘‘.’’, not the null string. (Using the
null string would render ‘‘/pat/ /pat/’’ unparsable, since the third
slash would be interpreted as a division operator. The tokener is in
fact slightly context sensitive for operators like /, ?, and <. And in
fact, . itself can be the beginning of a number.)
-
Next, exit and continue work differently.
-
The following variables work differently:
    Awk         Perl
    ARGC        $#ARGV
    ARGV[0]     $0
    FILENAME    $ARGV
    FNR         $. - something
    FS          (whatever you like)
    NF          $#Fld, or some such
    NR          $.
    OFMT        $#
    OFS         $,
    ORS         $\
    RLENGTH     length($&)
    RS          $/
    RSTART      length($`)
    SUBSEP      $;
-
When in doubt, run the awk construct through a2p and see what it
gives you.
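
Several of the awk traps above can be seen together in one short script. This is only a sketch: the input line is made up, @Fld is simply the array name used in the table above, and the script assumes a perl in which ‘‘chop’’ and ‘‘split’’ behave as described here.

```perl
#!/usr/bin/perl
# Hypothetical input line; a real script would read it from <STDIN> or a file.
$_ = "alice 10 2\n";

chop;                        # the newline is not stripped for you
@Fld = split(' ', $_);       # no automatic field splitting
$nf = $#Fld + 1;             # awk's NF: $#Fld is the last index, so add 1

$sq = $Fld[1] ** $Fld[2];            # exponentiation is **, not ^
$label = $Fld[0] . ':' . $sq;        # concatenation is ., not juxtaposition
$is_name = ($Fld[0] =~ /^[a-z]+$/) ? 'yes' : 'no';    # match is =~, not ~

# Choose the comparison explicitly: == is numeric, eq is string.
$num_equal = ('10' == '10.0') ? 'yes' : 'no';
$str_equal = ('10' eq '10.0') ? 'yes' : 'no';

$, = ' ';                    # like awk's OFS; empty unless you set it
$\ = "\n";                   # like awk's ORS; empty unless you set it
print $label, $nf, $is_name, $num_equal, $str_equal;
# prints "alice:100 3 yes yes no"
```

Note that $Fld[0] holds the first field, since arrays index from 0, whereas awk would call it $1.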
Cerebral C programmers should take note of the following:
-
Curly brackets are required on ifs and whiles.
-
You should use ‘‘elsif’’ rather than ‘‘else if’’.
-
Break and continue become last and next, respectively.
-
There’s no switch statement.
-
Variables begin with $ or @ in perl.
-
Printf does not implement *.
-
Comments begin with #, not /*.
-
You can’t take the address of anything.
-
ARGV must be capitalized.
-
The ‘‘system’’ calls link, unlink, rename, etc. return nonzero for
success, not 0.
-
Signal handlers deal with signal names, not numbers.
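
For C programmers, the control-flow and return-value differences might look like the following. It is a sketch only: the sample numbers and the scratch file name are invented, and it assumes /tmp exists and is writable.

```perl
#!/usr/bin/perl
# @nums is made-up sample data.
@nums = (3, 0, 8, -1, 5);
@seen = ();

foreach $n (@nums) {
    last if $n < 0;              # C's break
    next if $n == 0;             # C's continue
    if ($n < 5) {                # braces are required, even for one statement
        push(@seen, "small:$n");
    } elsif ($n < 10) {          # elsif, not else if; also stands in for switch
        push(@seen, "medium:$n");
    } else {
        push(@seen, "large:$n");
    }
}
print join(' ', @seen), "\n";    # prints "small:3 medium:8"

# unlink returns the number of files removed, so nonzero means success.
$tmp = "/tmp/traps_demo.$$";     # hypothetical scratch file name
open(TMP, ">$tmp") || die "can't create $tmp: $!";
close(TMP);
$removed = unlink($tmp);
print "unlink returned $removed\n";    # prints "unlink returned 1"
```

Testing the result with ‘‘unlink($tmp) || die ...’’ reads naturally in perl, whereas the C habit of treating 0 as success would invert the check.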
Seasoned sed programmers should take note of the following:
-
Backreferences in substitutions use $ rather than \.
-
The pattern matching metacharacters (, ), and | do not have
backslashes in front.
-
The range operator is .. rather than comma.
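
The three sed differences can be sketched in a few lines. The strings and the BEGIN/END markers are invented for illustration.

```perl
#!/usr/bin/perl
# sed would write the grouping as \( \) and the backreferences as \1 and \2.
$_ = "foo=42";
s/(foo|bar)=([0-9]+)/$2=$1/;     # bare ( ) and |, $1 and $2 in the replacement
$swapped = $_;
print "$swapped\n";              # prints "42=foo"

# The sed address range /BEGIN/,/END/ becomes /BEGIN/ .. /END/.
@lines = ("skip\n", "BEGIN\n", "body\n", "END\n", "skip again\n");
@kept = ();
foreach (@lines) {
    push(@kept, $_) if /BEGIN/ .. /END/;
}
print @kept;                     # prints the BEGIN, body and END lines
```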
Sharp shell programmers should take note of the following:
-
The backtick operator does variable interpretation without regard to
the presence of single quotes in the command.
-
The backtick operator does no translation of the return value,
unlike csh.
-
Shells (especially csh) do several levels of substitution on each
command line. Perl does substitution only in certain constructs such
as double quotes, backticks, angle brackets and search patterns.
-
Shells interpret scripts a little bit at a time. Perl compiles the
whole program before executing it.
-
The arguments are available via @ARGV, not $1, $2, etc.
-
The environment is not automatically made available as variables.
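
A short sketch of the shell differences follows. It assumes a Unix-like system where backticks run ‘‘echo’’ through a shell; the stand-in arguments and the TRAPS_DEMO variable name are invented for illustration.

```perl
#!/usr/bin/perl
# Arguments live in @ARGV; $ARGV[0] is the first argument, not the script name.
@ARGV = ('one', 'two') unless @ARGV;    # stand-in arguments for illustration
$first = $ARGV[0];
$count = $#ARGV + 1;

# The environment is read through %ENV rather than through plain variables.
$ENV{'TRAPS_DEMO'} = 'from-env';        # hypothetical variable name
$val = $ENV{'TRAPS_DEMO'};

# Backticks interpolate perl variables even inside the shell's single quotes,
# so the shell is handed the command: echo 'hello'
$word = 'hello';
$out = `echo '$word'`;
chop($out);                             # the trailing newline is left in place

print "$first $count $val $out\n";
```

Run without arguments, this should print ‘‘one 2 from-env hello’’.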