Traps

Accustomed awk users should take special note of the following:

  • Semicolons are required after all simple statements in perl
    (except at the end of a block). Newline is not a statement delimiter.

  • Curly brackets are required on ifs and whiles.

  • Variables begin with $ or @ in perl.

  • Arrays index from 0 unless you set $[. The same goes for string
    positions in substr() and index().

  • You have to decide whether your array has numeric or string indices.

  • Associative array values do not spring into existence upon mere
    reference.

  • You have to decide whether you want to use string or numeric
    comparisons.

  • Reading an input line does not split it for you. You have to split
    it yourself into an array, and the split operator takes different
    arguments.

  • The current input line is normally in $_, not $0. It generally does
    not have the newline stripped. ($0 is the name of the program
    executed.)

  • $<digit> does not refer to fields; it refers to substrings
    matched by the last match pattern.

  • The print statement does not add field and record separators
    unless you set $, and $\.

  • You must open your files before you print to them.

  • The range operator is "..", not comma. (The comma operator works
    as in C.)

  • The match operator is "=~", not "~". ("~" is the one’s
    complement operator, as in C.)

  • The exponentiation operator is "**", not "^". ("^" is the
    XOR operator, as in C.)

  • The concatenation operator is ".", not the null string. (Using the
    null string would render "/pat/ /pat/" unparsable, since the third
    slash would be interpreted as a division operator; the tokener is
    in fact slightly context sensitive for operators like /, ?, and <.
    And in fact, . itself can be the beginning of a number.)

  • Next, exit and continue work differently.

  • The following variables work differently:

      Awk         Perl
      ARGC        $#ARGV
      ARGV[0]     $0
      FILENAME    $ARGV
      FNR         $. - something
      FS          (whatever you like)
      NF          $#Fld, or some such
      NR          $.
      OFMT        $#
      OFS         $,
      ORS         $\
      RLENGTH     length($&)
      RS          $/
      RSTART      length($`)
      SUBSEP      $;

  • When in doubt, run the awk construct through a2p and see what it
    gives you.
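
Several of the points above can be seen together in a short sketch. It assumes a Perl 5 interpreter (for "my", "exists" and "chomp"); the sample record and the variable names ($line, @fld, %seen and so on) are made up for illustration and do not come from the manual.

```perl
#!/usr/bin/perl
# Illustrative sketch only: the sample record and all names are hypothetical.
use strict;
use warnings;

my $line = "alice 10 9\n";       # stands in for one record read from input
chomp($line);                    # the newline is not stripped for you
my @fld = split(' ', $line);     # fields are not split for you; @fld indexes from 0
my $nf  = scalar(@fld);          # field count ($#fld is the last index, one less)

# Numeric and string comparisons use different operators.
my $num_lt = ($fld[1] <  $fld[2]) ? 1 : 0;   # 10 < 9 as numbers is false
my $str_lt = ($fld[1] lt $fld[2]) ? 1 : 0;   # "10" lt "9" as strings is true

# Testing for a key does not create an entry.
my %seen;
my $present = exists $seen{$fld[0]} ? 1 : 0;
my $nkeys   = scalar(keys %seen);

# $1 is the substring matched by the parentheses, not the first field.
my $matched = ($line =~ /^(\w+)/) ? $1 : '';

my $joined = $fld[0] . ":" . $fld[1];   # "." concatenates
my $power  = 2 ** 3;                    # "**" exponentiates

{
    # print adds separators only when $, and $\ are set.
    local $, = "\t";
    local $\ = "\n";
    print $fld[0], $nf, $joined, $power;
}
```

Running the equivalent awk program through a2p remains the quickest way to check how a given construct translates.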

Cerebral C programmers should take note of the following:

  • Curly brackets are required on ifs and whiles.

  • You should use "elsif" rather than "else if".

  • Break and continue become last and next, respectively.

  • There’s no switch statement.

  • Variables begin with $ or @ in perl.

  • Printf does not implement *.

  • Comments begin with #, not /*.

  • You can’t take the address of anything.

  • ARGV must be capitalized.

  • The "system" calls link, unlink, rename, etc. return nonzero for
    success, not 0.

  • Signal handlers deal with signal names, not numbers.
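
A minimal sketch of the control-flow, return-value and signal points above. It assumes Perl 5 on a Unix-like system (USR1 is used as the example signal) and uses the standard File::Temp module to get a scratch file; all variable names are hypothetical.

```perl
#!/usr/bin/perl
# Illustrative sketch only: all names are hypothetical.
use strict;
use warnings;
use File::Temp qw(tempfile);

my @out;
for my $n (1 .. 10) {
    next if $n % 2;            # where C would say "continue"
    last if $n > 8;            # where C would say "break"
    if ($n == 2) {             # braces are required, even for one statement
        push @out, "two";
    } elsif ($n == 4) {        # elsif, not "else if"
        push @out, "four";
    } else {
        push @out, "other";
    }
}

# unlink reports success with a nonzero value (the number of files removed).
my ($fh, $path) = tempfile();
close($fh);
my $removed = unlink($path);   # file exists: nonzero
my $again   = unlink($path);   # file already gone: 0

# Signal handlers are installed by name and are passed the name.
my $caught = '';
$SIG{USR1} = sub { $caught = $_[0] };
kill 'USR1', $$;               # send the signal to this process

print join(" ", @out), "\n";
print "removed=$removed again=$again caught=$caught\n";
```

The if/elsif chain is also one way to stand in for the missing switch statement.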

Seasoned sed programmers should take note of the following:

  • Backreferences in substitutions use $ rather than \.

  • The pattern matching metacharacters (, ), and | do not have
    backslashes in front.

  • The range operator is ".." rather than comma.
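
The three sed points can be sketched as follows, with the rough sed equivalents shown in comments. The sample strings and variable names are made up for illustration, and a Perl 5 interpreter is assumed.

```perl
#!/usr/bin/perl
# Illustrative sketch only: sample data and names are hypothetical.
use strict;
use warnings;

# Backreferences in the replacement are $1, $2; grouping parentheses are bare.
# sed: s/\([a-z]*\) \([a-z]*\)/\2, \1/
my $s = "john smith";
(my $swapped = $s) =~ s/(\w+) (\w+)/$2, $1/;

# Alternation uses a bare | inside bare parentheses.
my $alt = ("dog" =~ /^(cat|dog)$/) ? 1 : 0;

# A line range is written with "..", not a comma.
# sed: /BEGIN/,/END/p
my @lines = ("a", "BEGIN", "b", "END", "c");
my @picked;
for (@lines) {
    push @picked, $_ if /BEGIN/ .. /END/;
}

print "$swapped\n";
print join(" ", @picked), "\n";
```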

Sharp shell programmers should take note of the following:

  • The backtick operator does variable interpretation without regard to
    the presence of single quotes in the command.

  • The backtick operator does no translation of the return value,
    unlike csh.

  • Shells (especially csh) do several levels of substitution on each
    command line. Perl does substitution only in certain constructs
    such as double quotes, backticks, angle brackets and search
    patterns.

  • Shells interpret scripts a little bit at a time. Perl compiles the
    whole program before executing it.

  • The arguments are available via @ARGV, not $1, $2, etc.

  • The environment is not automatically made available as variables.
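
A short sketch of the quoting, argument and environment points above. It assumes Perl 5 and a Unix-like system with an echo command available; the sample arguments and variable names are hypothetical and are only there so the sketch runs standalone.

```perl
#!/usr/bin/perl
# Illustrative sketch only: sample arguments and names are hypothetical.
use strict;
use warnings;

# Arguments live in @ARGV; $ARGV[0] is what a shell script would call $1.
@ARGV = ("first", "second") unless @ARGV;   # sample values when none are given
my $arg1 = $ARGV[0];

# The environment is reached through %ENV, not through plain variables.
my $home = defined $ENV{HOME} ? $ENV{HOME} : '';

# Substitution happens in double quotes but not in single quotes.
my $name = "world";
my $dq = "hello $name";
my $sq = 'hello $name';

# Backticks interpolate $name even though the command single-quotes it,
# and the result comes back untranslated, trailing newline included.
my $raw = `echo '$name'`;
(my $echoed = $raw) =~ s/\n\z//;

print "$arg1\n$dq\n$sq\n$echoed\n";
```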