Perl is usually expanded as “Practical Extraction and Report Language,” but
Perl is really much more than a practical reporting language. It combines many
of the best-liked features of the shells, awk, sed, grep, and C. Programmers
particularly value its powerful pattern matching features.
Perl is free software (distributed under the Artistic License or the GNU
General Public License) and is an interpreted language. It is used primarily
as a scripting language and runs on a number of platforms.
Although designed for Unix, Perl is renowned for its portability and also runs
on DOS, Windows, Macintosh, etc.
Typical Perl users include system administrators, Web developers, database
administrators, and application developers in fields such as bioinformatics.
% perl -e 'print "Hello, world\n";'     # -e runs a one-line script given on the command line
% cat firstPerl
#!/usr/local/bin/perl                   # the first line of the script
print "Hello, world\n";                 # each statement ends with a semicolon
% perl -c firstPerl                     # -c checks the syntax without running the script
To run a Perl script (there is no separate compilation step):
% perl firstPerl
OR
% firstPerl                             # after chmod +x firstPerl
Quoting rules in Perl are similar to those of the C shell. Perl uses single
quotes (all characters are treated as literals), double quotes (like single
quotes, except that variables are substituted and backslash escapes such as \n
are interpreted), the backslash \ (to quote a single character), and
backquotes `` (for executing commands).
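The four quoting forms can be compared side by side. This is a minimal sketch, assuming a Unix-like system where the echo command is available; the variable names are illustrative only, and the block uses strict and my, which the examples in these notes do not.

```perl
use strict;
use warnings;

my $name = "Tommy";                 # illustrative value

my $single  = 'Hello, $name\n';     # literal: no substitution, \n stays as two characters
my $double  = "Hello, $name\n";     # $name is substituted, \n becomes a newline
my $escaped = "Cost: \$5";          # the backslash protects the $ from substitution
my $output  = `echo hi there`;      # backquotes run a command and capture its output

print $single, "\n";
print $double;
print $escaped, "\n";
print $output;
```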
Special literals
__LINE__    # represents the current line number
__FILE__    # represents the current filename
__END__     # represents the logical end of the script
#!/usr/local/bin/perl
print "Hello, world\n";
print "We are on line number ", __LINE__, ".\n";
print "The name of this file is ", __FILE__, ".\n";    # the name of the current file
__END__
And this part, after __END__, will be ignored by Perl.
printf("%-15s%-20s\n", "Jack", "Sprat");    # the - flag left-justifies each field
printf "Hello, my name is %s!\n", "Sam";
printf "The number in decimal is %d\n", 100;
printf "The formatted floating point number is %8.2f\n", 14.3456;
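The Perl here document is a line-oriented form of quoting. It requires the <<
operator followed by a terminating string and, at the end of the statement, a
semicolon. The quoted text runs from the next line up to a line that contains
only the terminating string. There can be no spaces between << and an unquoted
terminating string.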
$price = 100;
# Unquoted terminator: variables are expanded.
print <<EOF;
The price is $price.
EOF
# The terminator must stand alone on its line; NO surrounding spaces are allowed.
# Terminator enclosed in single quotes: the variable is not expanded.
print <<'FINAL';
The price is $price.
FINAL
# Empty terminator: the here document ends at the first blank line, so the
# blank line is necessary. The line is printed 4 times. Older versions of Perl
# also accepted a bare << here; Perl 5.28 and later require <<"".
print <<"" x 4;
Hello, there!

# Terminator enclosed in back quotes: the lines are executed as Unix commands.
print <<`END`;
echo hi there
date
END
Here documents are used
extensively in CGI scripts for enclosing large chunks of HTML tags for
printing.
As in shell scripts, Perl variables don’t have to be declared before being
used. Perl has three types of variables: scalars (preceded by $), lists or
arrays (preceded by @), and associative arrays or hashes (preceded by %).
For example, $name, @name, and %name are all different variables.
· Variables are case sensitive.
· Since reserved words and filehandles are not preceded by a special
  character, variable names will not conflict with reserved words or filehandles.
$salary = 50000;                    # scalar variable
@months = ("Mar", "Apr", 5);        # a Perl list can store different types of data
print "$salary\n";
print "@months\n";
print "$months[0], $months[1]\n";   # array subscripts start at 0
print "The number of the last subscript of months is $#months\n";
$sym = "net";
print "${sym}work\n";               # the curly braces separate the variable name from the appended text
$name = "Tommy";
print "OK\n" if defined $name;      # checks whether the variable has a defined value
undef $name;                        # undefines an already defined variable
@months = ();                       # assigns a null list (empties the array)
@digits = (0..10);                  # range operator: contains 0, 1, 2, ..., 10
@letters = ('A'..'Z');
### Array slice ###
@names = ('Tom', 'Dick', 'Harry', 'Pete', 'Smith');
$count = @names;                    # the number of elements, 5, is assigned to $count
@people = @names;                   # the names list is copied to @people
@friends = @names[1,2,3];           # @names[1..3] is also ok
($enemy[0], $enemy[2]) = @names;    # the enemy array is created from the first two names
print "@enemy\n";                   # the new list is (Tom, undefined, Dick)
@matrix = ([1,2], [3,4], [5,6]);    # 3x2 multi-dimensional array
print "Row 0, Column 0 is $matrix[0][0].\n";
@record = ("Adams",   [2,1],        # each name is followed by an anonymous array; this one holds 2 and 1
           "Edwards", [1,0,3],
           "Howard",  [3,4,5,6]);
print "In the first row $record[0]\n";       # Adams
print "In the third row $record[5][2]\n";    # 5
Associative Arrays (Hashes)
%states = ('CA' => 'California', 'TX' => 'Texas', 'MT' => 'Montana');    # hash
# the first string of each pair is called the key, and the second string is called the value
print "$states{'CA'}, $states{'MT'}\n";
%days = ('Mon', 'Monday', 'Tue', 'Tuesday', 'Wed', undef);    # the key Wed has no value yet
$days{'Wed'} = "Wednesday";         # the value Wednesday is assigned to the key Wed
$days{5} = "Friday";                # the value Friday is assigned to the key 5
# Array of Hashes
@band = ({name => "Tom Jones",    age => 30, city => "New York"},
         {name => "Michael Jack", age => 40, city => "LA"});
print "The total number of members: ", $#band + 1, "\n";
print "First member name is $band[0]{name}\n";
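To read every record rather than just the first, the array of hashes can be walked with foreach. The following is a sketch only, reusing the band data above and written with strict and my; the loop variable $member is an illustrative name.

```perl
use strict;
use warnings;

my @band = (
    { name => "Tom Jones",    age => 30, city => "New York" },
    { name => "Michael Jack", age => 40, city => "LA" },
);

# Each element of @band is a reference to a hash, so $member->{...} reads one field.
foreach my $member (@band) {
    print "$member->{name} ($member->{age}) lives in $member->{city}\n";
}

my $count = @band;    # an array in scalar context gives its size: 2
print "Total members: $count\n";
```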
There are three predefined filehandles: STDIN, STDOUT, and STDERR.
print "What is your name? ";        # the string is sent to STDOUT by default
$name = <STDIN>;                    # one line of input is read and assigned to $name
@all = <STDIN>;                     # lines are read into the array until Ctrl-d is pressed
$course{$course_num} = <STDIN>;     # a line is stored in the hash under the key $course_num
$num = read(STDIN, $indata, 100);   # reads up to 100 bytes into $indata; returns the number read
$answer = getc;                     # reads one character at a time
The chop function removes the last character of a scalar variable, or the last
character of each element of an array. It is used primarily for removing the
newline from a line of input.
The chomp function (introduced
in Perl 5) is similar to chop except that it removes the last character only if
that character is the newline.
print "What is your name? ";
$name = <STDIN>;
chop($name);                        # removes the last character and returns it
chomp($name = <STDIN>);             # removes the last character only if it is the newline
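The difference between the two functions shows up when the input has no trailing newline. A small sketch, using fixed strings instead of STDIN so that it runs unattended; the variable names are illustrative.

```perl
use strict;
use warnings;

my $with_newline    = "Tommy\n";
my $without_newline = "Tommy";

# chop always removes the last character, whatever it is.
my $chopped = $without_newline;
chop($chopped);                     # now "Tomm"

# chomp removes the last character only when it is a newline.
my $chomped = $without_newline;
chomp($chomped);                    # unchanged: "Tommy"

chomp($with_newline);               # newline removed: "Tommy"

print "$chopped $chomped $with_newline\n";    # Tomm Tommy Tommy
```

For this reason chomp is usually the safer choice for cleaning up input lines.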
The join function joins the elements of an array into a single string,
separating the elements with a given delimiter. It is the opposite of split.
Format: join(delimiter, list)
$name = "John";
$birthdate = "1/1/2000";
$place = "LA";
print join(":", $name, $birthdate, $place), "\n";    # John:1/1/2000:LA
The split function splits up
a string by some delimiter (whitespace by default) and returns an array.
$line = "a,b,c,d";
@letter = split(',', $line);
print "The characters in the line are @letter\n";    # a b c d
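Because join and split are opposites, a record can be taken apart and rebuilt with the same delimiter. A minimal sketch of that round trip, assuming a colon-separated record like the one in the join example:

```perl
use strict;
use warnings;

my $record = "John:1/1/2000:LA";

# split breaks the string at each colon ...
my ($name, $birthdate, $place) = split(/:/, $record);

# ... and join with the same delimiter rebuilds it.
my $rebuilt = join(":", $name, $birthdate, $place);

print "$name was born on $birthdate in $place\n";
print "round trip ok\n" if $rebuilt eq $record;
```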
The pop function pops off
the last element of an array and returns it. The array size is decreased by
one. The push function pushes values onto the end of an array, increasing the
size of the array.
@list = ("tomatoes", "tomorrow", "potatoes", "phantom", "tommy");
$boy = pop(@list);                  # returns tommy, and tommy is removed from the list
push(@list, "bobby", "tomb");       # bobby and tomb are added to the end of @list
The shift function shifts off and returns the first element of an array,
decreasing the size of the array; the unshift function adds elements to the
front. The splice function removes and replaces elements in an array and
returns the removed elements. The general format is:
splice(array, offset, length, list).
The general format of the split function is: split(/delimiter/, expr).
@names = ("bob", "dan", "tom");
$man = shift @names;                # returns "bob"; @names now contains ("dan", "tom")
unshift(@names, "Liz", "bean");     # adds to the front; @names is now ("Liz", "bean", "dan", "tom")
@newnames = splice(@names, 1, 3, "yellow", "orange");    # @newnames is ("bean", "dan", "tom")
print "the spliced array is @names\n";                   # @names is now ("Liz", "yellow", "orange")
$line = "a b c d e";
@letter = split(' ', $line);        # @letter contains (a, b, c, d, e)
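The offset and length arguments of splice decide whether it removes, replaces, or only inserts. The sketch below walks through those cases together with shift, unshift, pop, and push; the names in the list are illustrative values.

```perl
use strict;
use warnings;

my @names = qw(Liz bean dan tom);

my @removed = splice(@names, 1, 2);         # remove 2 elements starting at offset 1
print "@removed | @names\n";                # bean dan | Liz tom

splice(@names, 1, 0, "yellow", "orange");   # length 0: insert without removing anything
print "@names\n";                           # Liz yellow orange tom

my $first = shift @names;                   # take "Liz" off the front
unshift(@names, "amy");                     # put "amy" on the front
my $last = pop @names;                      # take "tom" off the end
push(@names, "zed");                        # put "zed" on the end

print "$first $last | @names\n";            # Liz tom | amy yellow orange zed
```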
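The sort function returns a sorted copy of an array, in string order unless a
comparison subroutine is given. The reverse function returns the elements of
an array in reverse order.
@string = ("a", "d", "f", "c", "b");
@string_sorted = sort(@string);                 # a b c d f
@string_reverse = reverse(@string);             # b c f d a
@string_descending = reverse(sort(@string));    # f d c b a
sub numeric { $a <=> $b; }                      # numeric comparison subroutine definition
@sorted_num = sort numeric 10, 5, 6, 0, 1;      # 0 1 5 6 10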
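The reason a numeric comparison is needed becomes clear when the default string sort is applied to numbers. A short sketch using the same values, with an inline comparison block in place of a named subroutine:

```perl
use strict;
use warnings;

my @numbers = (10, 5, 6, 0, 1);

my @as_strings = sort @numbers;                         # default string order
my @ascending  = sort { $a <=> $b } @numbers;           # numeric order
my @descending = reverse sort { $a <=> $b } @numbers;   # numeric order, largest first

print "@as_strings\n";    # 0 1 10 5 6
print "@ascending\n";     # 0 1 5 6 10
print "@descending\n";    # 10 6 5 1 0
```

In string order "10" sorts before "5" because the comparison is made character by character.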
%weekdays = ('1' => 'Mon', '2' => 'Tue', '3' => 'Wed', '4' => 'Thu', '5' => 'Fri');
foreach $key (keys(%weekdays))
   {print "$key ";}                 # prints the keys 1 to 5, in no guaranteed order
foreach $value (values(%weekdays))
   {print "$value ";}               # prints Mon to Fri, in the same order as the keys
while (($key, $value) = each %weekdays)
   {print "$key = $value\n";}       # prints each pair of key and value
delete $weekdays{1};                # removes the element with the key 1, which is Mon
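Since a hash returns its keys in no particular order, sorting them gives predictable output. The following sketch, written with strict and my, also checks the effect of delete with the exists function, which is not covered above.

```perl
use strict;
use warnings;

my %weekdays = (1 => 'Mon', 2 => 'Tue', 3 => 'Wed', 4 => 'Thu', 5 => 'Fri');

# Sort the keys numerically so the pairs print in a stable order.
foreach my $key (sort { $a <=> $b } keys %weekdays) {
    print "$key = $weekdays{$key}\n";
}

delete $weekdays{1};                                  # remove the pair 1 => 'Mon'
print "key 1 is gone\n" unless exists $weekdays{1};

my $remaining = keys %weekdays;                       # keys in scalar context gives the count
print "$remaining entries left\n";                    # 4 entries left
```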
%ENV is an associative array that contains the environment variables handed
to Perl from the parent shell.
foreach $key (keys(%ENV)) {print "$key\n";}
print "your home directory is $ENV{'HOME'}\n";
The grep function evaluates the expression for each element of the list and
selects the elements for which it is true.
Format: grep(expr, list)
@list = ("tomatoes", "tomorrow", "potatoes", "phantom", "tommy");
$count = grep(/tom/i, @list);       # scalar context: the number of times the expression was true (4)
@items = grep(/tom/i, @list);       # list context: the elements for which it was true
                                    # i means case-insensitive
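grep also accepts a block in place of the expression, and the test need not be a pattern. A sketch using the same word list; the length test is an illustrative addition.

```perl
use strict;
use warnings;

my @list = qw(tomatoes tomorrow potatoes phantom tommy);

my $count = grep { /tom/i } @list;            # scalar context: number of matches
my @items = grep { /tom/i } @list;            # list context: the matching elements
my @long  = grep { length($_) > 7 } @list;    # any expression works, not only a pattern

print "$count matches: @items\n";    # 4 matches: tomatoes tomorrow phantom tommy
print "long words: @long\n";         # long words: tomatoes tomorrow potatoes
```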
Perl performs the appropriate type conversion according to the operator, so
operands of mixed types can be combined. The operators are very similar to
those of the C language.
+, -, *, /, % (modulus), ++, --, == (equal to), != (not equal to),
&& (logical and), || (logical or), <, <=, >, >=, +=, -=,
<=> (signed return: -1, 0, or 1 for a numeric comparison),
.. (range operator), x (string repetition), . (string concatenation),
?: (ternary conditional).
String comparison operators: eq (equal to), ne (not equal to),
cmp (signed return), gt (greater than), ge (greater than or equal),
lt (less than), le (less than or equal).
$price = ($age > 60) ? 0 : 5.55;    # if $age > 60 then 0, else 5.55, is assigned to $price
print "The numbers: ", 1..10;       # range operator ..
$z = "kid";
print $z x 5, "\n";                 # prints "kid" 5 times
print $z . "nap", "\n";             # concatenates "kid" and "nap"
$num1 <=> $num2                     # returns -1, 0, or 1
srand time;                         # sets the seed value
print "Random number: ", int(rand 6);    # a random number in the range 0 to 5
$roll = int(rand 6) + 1;            # a random number in the range 1 to 6
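Choosing between the numeric and the string comparison operators matters, because the same operands can compare differently. A small sketch with illustrative values:

```perl
use strict;
use warnings;

my ($x, $y) = (10, 9);

print "numeric: ", $x <=> $y, "\n";    # 1, because 10 is greater than 9
print "string:  ", $x cmp $y, "\n";    # -1, because "10" sorts before "9"

print "== compares as numbers\n"    if "1.0" == 1;       # true: both are the number 1
print "ne compares as characters\n" if "1.0" ne "1";     # true: the strings differ
```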
The regular expression operators are used for matching patterns in searches
and for replacements in substitution operations. The m// operator (m/pattern/
or /pattern/) is used for matching patterns, and the s/// operator
(s/old/new/) is used for substituting one pattern for another. The m is
optional if the delimiter is the forward slash (the default), so m/good/ is
equivalent to /good/.
/abc/     Matches any string that contains the pattern ‘abc’.
m?abc?    Matches only the first occurrence of the pattern, until reset is
          called. Older versions of Perl also accepted ?abc? without the m;
          since Perl 5.22 the m is required.
$_ = "xabcy";
print "found it\n" if /abc/;        # will print the message 'found it'
                                    # $_ is the default space for pattern matching
Modifiers: i (turn off case sensitivity), m (treat a string as multiple
lines), g (match globally, i.e., find all occurrences; returns a list of the
matches in an array context, and true or false in a scalar context), s (treat
the string as a single line when a newline is embedded), e (evaluate the
replacement side as an expression).
$_ = "I lost my gloves in the clover, Love.";
@list = /love/g;                    # love love
@list = /love/gi;                   # love love Love
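The g, i, and e modifiers can be combined on a substitution as well as on a match. A sketch using the same sentence, plus a second string with illustrative numbers to show e:

```perl
use strict;
use warnings;

$_ = "I lost my gloves in the clover, Love.";

my @all      = /love/gi;          # list context: every match, ignoring case
my $replaced = s/love/LOVE/gi;    # s///g returns the number of replacements made

print "@all\n";                   # love love Love
print "$replaced: $_\n";          # 3: I lost my gLOVEs in the cLOVEr, LOVE.

$_ = "3 apples and 4 pears";
s/(\d+)/$1 * 2/ge;                # e evaluates the replacement as an expression
print "$_\n";                     # 6 apples and 8 pears
```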
$ cat sample.dat
Steve Blenheim
Norma Cord
Jon DeLoach
$ perl -ne 's/Norma/Jane/; print;' sample.dat    # will replace Norma by Jane
Steve Blenheim
Jane Cord
Jon DeLoach
$ perl -ne 'print if s/Jon/Tom/;' sample.dat     # prints only the lines that were changed
Tom DeLoach
$_ = 50;
s/$_/$&*2/e;                        # the special variable $& holds the string that was matched
print "The new value is $_\n";      # will print 100
If
you have a string that is not stored in the $_ variable and need to perform
matches or substitutions on that string, then the pattern binding operators are
used. They are also used with the tr function for string translations.
General Formats:
$var =~ /expr/          # true if $var contains the pattern /expr/;
                        # returns 1 for true, the empty string for false
$var !~ /expr/          # true if $var does not contain the pattern /expr/
$var =~ s/old/new/      # replace the first occurrence of /old/ with new
$var =~ s/old/new/g     # replace all occurrences of /old/ with new
$var =~ tr/a-z/A-Z/     # translate all lower case letters to upper case
$var =~ /$pattern/      # a variable can be used in the search string
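The formats above can be tried on ordinary variables. This is a sketch with hypothetical sample values for $phone and $name; the (my $copy = $original) =~ ... idiom is used so that the originals are left unchanged.

```perl
use strict;
use warnings;

my $phone = "408-400-1234";    # hypothetical sample value
my $name  = "tommy savage";    # hypothetical sample value

print "has 400\n"    if $phone =~ /400-/;       # match against $phone instead of $_
print "no letters\n" if $phone !~ /[a-z]/i;     # true when the pattern does not match

(my $dots  = $phone) =~ s/-/./g;                # copy, then replace every dash
(my $upper = $name)  =~ tr/a-z/A-Z/;            # copy, then translate to upper case

print "$dots\n";     # 408.400.1234
print "$upper\n";    # TOMMY SAVAGE
```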
$ cat test.pl                       # Perl code
while (<>) {
    ($name, $phone, $address) = split(/:/, $_);
    print "$name\n" if $phone =~ /400-/;
}
$ perl test.pl customer.dat         # assume customer.dat contains customer data
The while loop explicitly loops through the file named on the command line.
It gets a line from the file and stores it in the $_ variable. The line in $_
is split at each colon (:), and the values returned are stored in the list
($name, $phone, $address). The pattern /400-/ is matched against the $phone
variable. If the pattern is found in $phone, the value of $name is printed.
.           matches any character except newline.
[a-z0-9]    matches any single character in the set.
[^a-z0-9]   matches any single character not in the set.
\d          matches one digit.
\D          matches a non-digit; equivalent to [^0-9].
\w          matches an alphanumeric character or an underscore.
\W          matches any character that \w does not match.
\s          matches a whitespace character: space, tab, or newline.
\S          matches a non-whitespace character.
^           matches the beginning of the line.
$           matches the end of the line.
\A          matches the beginning of the string.
\Z          matches the end of the string.
x?          matches 0 or 1 x.
x*          matches 0 or more x’s.
x+          matches 1 or more x’s.
x{m,n}      matches at least m x’s and no more than n x’s.
a|b|c       matches a or b or c.
/^a...c/                     # searches at the beginning of the line for an 'a', followed by any
                             # three characters, followed by a 'c'. For example, it will match
                             # 'abbbc', 'a123c', 'aAx3c', etc.
print if /[A-Z][a-z]eve/;    # find A-Z, followed by a-z, followed by 'eve'
print if /2\d\d/;            # find a '2' followed by two digits
print if /5+/;               # find one or more 5's
print if /5{1,3}/;           # find at least one 5 but not more than 3
print if /10*/;              # find a 1 followed by 0 or more 0's
print if /5{3}/;             # find three consecutive 5's
print if /5{1,}/;            # find one or more consecutive 5's
tr/a-z/A-Z/;                 # each lower case letter a-z is replaced by the upper case letter A-Z
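A quick way to check what a pattern accepts is to run it against a few strings. The sketch below pairs some of the patterns discussed here with sample strings chosen for illustration; the last pair is deliberately a non-match.

```perl
use strict;
use warnings;

# Each pair is a pattern and a sample string to test it against.
my @tests = (
    [ '^a...c',        'a123c' ],
    [ '[A-Z][a-z]eve', 'Steve' ],
    [ '2\d\d',         'x2059' ],
    [ '10*',           '1000'  ],
    [ '5{3}',          '15551' ],
    [ '5{3}',          '5515'  ],    # only two consecutive 5's: no match
);

my @verdicts;
foreach my $test (@tests) {
    my ($pattern, $string) = @$test;
    my $verdict = $string =~ /$pattern/ ? "matches" : "does not match";
    push(@verdicts, $verdict);
    print "'$string' $verdict /$pattern/\n";
}
```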
Control structures and compound statements (block)
Simple if modifier: expr2 if expr1;    # if expr1 is true, execute expr2
$x = 10;
print $x if $x > 5;                 # will print 10
A compound statement
consists of a group of statements surrounded by curly braces. Unlike C, Perl
requires if, else, while, etc. to have {} even
with one statement.
if (expr) {block}
if (expr) {block} else {block}
if (expr1) {block 1} elsif (expr2) {block 2} ... else {block n}
unless (expr) {block}
unless (expr) {block} else {block}
unless (expr1) {block} elsif (expr2) {block} ... else {block}
# example
$hour = 10;
if ($hour <= 10)
   {print "Good morning\n";}
elsif ($hour == 12)
   {print "Lunch time\n";}
else
   {print "Good night\n";}
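The unless form listed among the formats runs its block when the condition is false. A minimal sketch, with an illustrative value for $hour, shows it next to the equivalent negated if:

```perl
use strict;
use warnings;

my $hour = 14;    # illustrative value
my $message;

# unless runs its block when the condition is false.
unless ($hour < 12) {
    $message = "It is past noon";
}
else {
    $message = "It is still morning";
}
print "$message\n";    # It is past noon

# The same test written with if and a negated condition.
if (!($hour < 12)) {
    print "It is past noon\n";
}
```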
While modifier: expr2 while expr1;    # repeatedly executes expr2 as long as expr1 is true
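As a small illustration of the while modifier, the statement on the left below runs until the counter reaches zero. The variable names are illustrative.

```perl
use strict;
use warnings;

my $count = 3;
my @seen;

# The statement on the left repeats as long as the condition on the right is true.
push(@seen, $count--) while $count > 0;

print "@seen\n";    # 3 2 1
```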