An Introduction to Perl
This page is an overview of the Perl language, and aims to provide a novice with enough information to get started using Perl.
It skims many advanced features of Perl, and does not attempt to explain the more unusual features of Perl.
Variables in Perl
(See also Non user defined variables )
In Perl, variables are created dynamically, the first time they are used.
There are no restrictions on the size of variables, and it is not necessary to specify or even know the size of an array before using it. All scalar variables
are of a variant type - they accept characters, strings, integers, floating point numbers etc.
Syntax and examples:
Like UNIX shell variables, scalar variables are prefixed with a dollar sign '$'. Unlike the shell, the '$' is used both when setting the value and when retrieving it, eg,
    $first_name = "Eoin";
    $number_of_ears = 2;
    print "$first_name has $number_of_ears ears.\n";
would print "Eoin has 2 ears.". And I do.
Arrays:
Arrays are prefixed with an '@' sign when referring to them as a whole, but with the '$' when indexing a particular element. Indexes start at 0, eg,
    @my_jumpers = ("Green one", "Blue one", "Red one", "Yellow one");
    $day_of_the_week = 1;
    print "Jumpers - today I will wear the $my_jumpers[$day_of_the_week]\n";
which prints "Jumpers - today I will wear the Blue one".
Hashes (associative arrays):
These are arrays that do not use a numerical index; each value is looked up by a string key instead. The syntax is slightly different - the whole hash is prefixed with '%' and elements are indexed with curly braces rather than square brackets - but the concept is pretty much the same, eg,
    %my_jumpers = ("Monday", "Green one",
                   "Tuesday", "Blue one",
                   "Wednesday", "Red one",
                   "Thursday", "Yellow one");
    $index = "Tuesday";
    print "On $index, I wear the $my_jumpers{$index}\n";
or
    print "My favourite jumper is the $my_jumpers{'Monday'}\n";
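As a rough sketch of how the day-to-jumper structure above can be walked, the following uses the built-in functions keys, sort and exists. The data repeats the example above; the names @days and $count are made up for illustration, and the 'my' declarations are only there because the sketch runs under 'use strict'.

```perl
use strict;
use warnings;

# A hash maps string keys to values; => is a comma that also quotes its left side.
my %my_jumpers = (
    Monday    => "Green one",
    Tuesday   => "Blue one",
    Wednesday => "Red one",
    Thursday  => "Yellow one",
);

# keys returns the keys in no particular order, so sort them for stable output.
my @days = sort keys %my_jumpers;
foreach my $day (@days) {
    print "On $day, I wear the $my_jumpers{$day}\n";
}

# exists tests for a key without creating it.
print "No jumper planned for Friday\n" unless exists $my_jumpers{Friday};

# keys in scalar context gives the number of entries.
my $count = scalar(keys %my_jumpers);
print "$count jumpers in total\n";
```

Note that the days come out in alphabetical order, not weekday order, because sort compares the keys as strings.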
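To get the number of elements in an array, use 'scalar', eg,
    $number_of_elements = scalar(@my_array);
Arrays can be sorted using 'sort'. It returns the sorted list and leaves the original array untouched, so the result must be assigned, eg,
    @my_array = sort @my_array;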
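One thing worth seeing in action is that 'sort' compares its elements as strings unless told otherwise. The sketch below, with made-up sample values, contrasts the default order with a numeric sort using a comparison block and the <=> operator.

```perl
use strict;
use warnings;

my @my_array = (10, 9, 100, 1);

# Default sort compares as strings, so "10" sorts before "9".
my @as_strings = sort @my_array;

# A comparison block with <=> sorts numerically; $a and $b are set by sort.
my @as_numbers = sort { $a <=> $b } @my_array;

# reverse flips a list, giving descending order here.
my @descending = reverse @as_numbers;

my $number_of_elements = scalar(@my_array);

print "String order:  @as_strings\n";
print "Numeric order: @as_numbers\n";
print "Descending:    @descending\n";
print "Elements:      $number_of_elements\n";
```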
Basic operators
in Perl
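| Operator | Meaning | String equivalent / notes |
| = | Assignment | None. |
| .= | Append to variable (string concatenation) | None. |
| == | Is equal to (numeric comparison) | eq (string comparison) |
| != | Not equal to | ne |
| > | Greater than | gt |
| < | Less than | lt |
| >= | Greater than or equal to | ge |
| <= | Less than or equal to | le |
| += | Add n to VAR giving VAR | None. |
| -=, *= | Same idea as += | None. |
| ++, -- | Increment, decrement. | None. |
| =~ | Matches (binds a string to a pattern) | Used for regexp matching, substitution and tr. |
Note: the symbolic comparison operators (==, !=, >, <, >=, <=) compare their operands as numbers, while the word forms (eq, ne, gt, lt, ge, le) compare them as strings. They are not interchangeable: use == for numbers and eq for strings.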
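The difference between the symbolic comparisons and the word forms in the table is easiest to see by running both on the same values. This is a small sketch with made-up values; it also exercises .=, += and =~.

```perl
use strict;
use warnings;

my $x = "10";
my $y = "10.0";

# == converts both sides to numbers: 10 and 10.0 are equal.
my $numeric_equal = ($x == $y) ? "yes" : "no";

# eq compares character by character: "10" and "10.0" differ.
my $string_equal = ($x eq $y) ? "yes" : "no";

# .= appends to a string, += adds to a number.
my $greeting = "Hello";
$greeting .= ", world";
my $total = 5;
$total += 3;

# =~ tests a string against a pattern.
my $has_world = ($greeting =~ /world/) ? "yes" : "no";

print "numeric: $numeric_equal, string: $string_equal\n";
print "$greeting ($total), contains 'world': $has_world\n";
```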
Conditional statements in Perl
Perl uses the following (note that strings are compared with 'eq', not '=='):
- 'if' statements - eg,
    if ( $name eq 'Eoin' )
    {
        # Do something.
    }
- 'else' statements - eg,
    if ( $name eq 'Eoin' )
    {
        # Do something
    }
    else
    {
        # Do something else
    }
- 'elsif' statements - eg,
    if ( $name eq 'Eoin' )
    {
        # Do something
    }
    elsif ( $opponent eq 'Billy' )
    {
        # Do something else
    }
    else
    {
        # Another scenario
    }
- The 'switch' statement does not officially exist in Perl.
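Where a long if/elsif chain on one value would otherwise be needed, one common workaround for the missing 'switch' is a hash that maps each value to an anonymous subroutine. This is only a sketch of that idea; the names %action_for and greet are hypothetical, and it uses subroutine references, which this page does not otherwise cover.

```perl
use strict;
use warnings;

# Map each expected value to an anonymous subroutine.
my %action_for = (
    Eoin  => sub { return "Welcome back, Eoin" },
    Billy => sub { return "Hello, opponent" },
);

sub greet {
    my ($name) = @_;
    # Look the name up; fall back to a default when there is no entry.
    my $action = $action_for{$name} || sub { return "Who are you?" };
    return $action->();
}

print greet("Eoin"), "\n";
print greet("Billy"), "\n";
print greet("Nobody"), "\n";
```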
Loops in Perl
Perl supports the following loops:
- do { BLOCK OF CODE } until ( CONDITION ) - the condition is tested after the first execution of the block, so the block always runs at least once, eg,
    do
    {
        print "$i";
        $i++;
    } until $i == 100;
- while ( CONDITION ) { BLOCK } - the condition is tested before the block is executed, so the block may not be executed at all, eg,
    while ( $i <= 100 )
    {
        print "$i";
        $i++;
    }
- for VARIABLE ( RANGE ) { BLOCK } eg,
    for $i (1..100)
    {
        print $i;
    }
  Note: if $i is omitted here, eg, for (1..100), Perl automatically uses the non-user defined variable $_ . The above code is equivalent to:
    for (1..100)
    {
        print "$_";
    }
- foreach ELEMENT (LIST/ARRAY) { BLOCK } eg,
    foreach $CD (@music_collection)
    {
        print "$CD\n";
    }
  Note - again $_ can be used - eg,
    foreach (@music_collection)
    {
        print "$_ \n";
    }
Tips when using loops:
Loops in Perl are often implemented using labels, and the next, last and redo commands.
An example of this would be:
    SEARCH: while ( $line ne $some_search_string )
    {
        # Read the next line from the file (see files for more info).
        $line = <FD>;
        if ( !defined $line )    # End of file
        {
            # Like 'break' in C.
            last SEARCH;
        }
        chomp $line;
        if ( $line =~ /^#/ )    # if line begins with a '#'
        {
            # Skip processing below and go to next iteration of the loop.
            next SEARCH;
        }
        if ( $line eq $some_search_string )
        {
            if ( $profile ne "Eoin" )
            {
                # Restart the loop body without evaluating the loop condition.
                redo SEARCH;
            }
        }
    }
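For something that can be run without a file to hand, here is a smaller sketch of the same idea over an in-memory array. The sample data, the "STOP" marker and the names @lines and @kept are invented for the example.

```perl
use strict;
use warnings;

my @lines = ("# a comment", "alpha", "", "beta", "STOP", "gamma");
my @kept;

LINE: foreach my $line (@lines) {
    # Skip comments and empty lines, moving straight to the next element.
    next LINE if $line =~ /^#/ || $line eq "";

    # Leave the loop entirely, like 'break' in C.
    last LINE if $line eq "STOP";

    push @kept, $line;
}

print "Kept: @kept\n";
```

With a single loop the label is optional, since next and last act on the innermost loop by default. Labels earn their keep when loops are nested and an inner loop needs to leave an outer one.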
Screen Input and Output
This is the same as for files and any other kind of I/O. The only difference is that the standard input (STDIN) and standard output (STDOUT) descriptors do not have to be explicitly specified; they are the defaults of the I/O commands.
Examples:
To print a line to standard output:
    print "This is a line\n";
or
    print STDOUT "This is a line\n";
To retrieve a line from standard input:
    $line = <>;
or
    $line = <STDIN>;
(With no arguments on the command line the two are the same; if file names are given as arguments, <> reads from those files instead.)
Most escape sequences are the same as in C, i.e., print "\t Tab \n Newline etc..";
Exception handling in Perl
Perl provides two main exception handling methods to the programmer:
- die "Error text" - prints the text to standard error and ends the script
- warn "Text to display" - prints the text to standard error and carries on
Both can be used with the system error returned (see $! in non-user defined variables).
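A 'die' does not have to end the script: wrapping the call in an eval block traps it and leaves the message in the built-in variable $@. The eval mechanism is not covered elsewhere on this page, so treat the following as a minimal sketch; safe_divide is a hypothetical subroutine written for the example.

```perl
use strict;
use warnings;

sub safe_divide {
    my ($numerator, $denominator) = @_;
    die "Cannot divide by zero\n" if $denominator == 0;
    return $numerator / $denominator;
}

# eval runs the block and traps any die inside it; the message lands in $@.
my $result = eval { safe_divide(10, 0) };
my $error  = $@;

if ($error) {
    # warn prints to standard error but lets the script carry on.
    warn "Recovered from: $error";
    $result = 0;
}

print "Result is $result\n";
```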
Example:
    print FILEDESCRIPTOR "Some text" or die "Cannot write to file: $!\n";
Files in Perl
- open FD, $Filename
- close FD
- read FD, $buffer, $len <, $offset>
- syswrite FD, $buffer, $len <, $offset>
- print FD "Some text and $some_var"
To open a file, use:
    open FD, $file or die "Cannot open file: $!\n";
Modes:
- Prefix filename with '>' for output
- Prefix filename with '<' for input
- Prefix filename with '>>' for appending
- Prefix filename with '+<' for input and output
If the filename begins with the pipe character '|', it is taken to be a command to be executed, and anything written to the file descriptor is piped to that command. If it ends with '|', the command's output can be read from the file descriptor.
To close a file, use:
    close FD;
To read from a file:
    $line_in_file = <FD>;
or
    read FD, $some_buffer, $number_of_bytes;
To write to a file:
    print FD "Name is: $entered_name\n";
or
    syswrite FD, $formatted_buffer, length($formatted_buffer);
(Perl's own 'write' is not the counterpart of 'read': it prints a record laid out by 'format', see the string manipulation section.)
It is also possible to do the following:
    @entire_file = <FD>;
    foreach $line (@entire_file)
    {
        # Do something
    }
or
    print FD "@some_array";
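The pieces above can be put together into a complete write, append and read-back cycle. This sketch departs from the examples above in two ways that are choices of this example rather than anything this page prescribes: it passes the mode as a separate argument to open instead of prefixing the filename, and it keeps the filehandle in a 'my' variable instead of a bareword like FD. The scratch file comes from the standard File::Temp module.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# tempfile gives us a scratch file so the example does not touch real data.
my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
close $tmp_fh;

# Write two lines. The mode is passed as its own argument.
open(my $out, '>', $filename) or die "Cannot open $filename for writing: $!\n";
print $out "first line\n";
print $out "second line\n";
close $out or die "Cannot close $filename: $!\n";

# Append a third line.
open(my $app, '>>', $filename) or die "Cannot open $filename for appending: $!\n";
print $app "third line\n";
close $app;

# Read everything back, one line per array element.
open(my $in, '<', $filename) or die "Cannot open $filename for reading: $!\n";
my @entire_file = <$in>;
close $in;

chomp @entire_file;
print scalar(@entire_file), " lines read, last is '$entire_file[-1]'\n";
```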
Executing other programs
Perl provides various methods for executing other programs, such as:
- qx! COMMAND !;
- system( COMMAND );
- open PIPEDESC, "COMMAND |";
- exec "COMMAND"
These commands all execute external programs on the system.
Usages:
    # Strange way of reading a file!!
    @entire_file = qx! cat *.dbg0 !;
    # system waits for the command and returns its exit status.
    $Result = system("cat test.txt");
    # Another strange way of reading a file!!
    open CMDPIPE, "cat something.txt |";
    $first_line = <CMDPIPE>;
    # exec replaces the running script with the command - it never returns to the script!
    # Usually only used with fork()
    exec $command;
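To show what comes back from qx and system, the sketch below runs a second copy of perl as the external command, on the assumption that this is the one program certain to be installed wherever the script runs. $^X is the built-in variable holding the path of the running perl; the one-line programs passed with -e are invented for the example.

```perl
use strict;
use warnings;

# $^X holds the path of the running perl, so this works without assuming
# any particular external command is installed.
my $perl = $^X;

# qx captures the command's standard output.
my $output = qx{"$perl" -e "print 6*7"};

# system runs the command and returns its wait status; the exit code
# is in the high byte.
my $status    = system($perl, '-e', 'exit 3');
my $exit_code = $status >> 8;

print "Captured: $output\n";
print "Exit code: $exit_code\n";
```

Passing system a list of arguments, as here, runs the program directly without going through the shell, which avoids quoting problems.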
Regular expressions in Perl
Perl supports the standard regular expressions supported by UNIX. Syntax is as follows, along with common usage:
Matching:
    EXTRACT: foreach (@lines_in_file)
    {
        # If the line begins with a #.
        if (/^#/)
        {
            next EXTRACT;
        }
        .
        .
    }
Could have used 'foreach $line (@lines_in_file)' and 'if ($line =~ /^#/)' here
or
    $line = <FD>;
    # if the line contains the word 'Error'
    if ($line =~ /Error/)
    {
        print "An error occurred";
    }
Extracting information:
    $ps_returned = qx! ps -ef | grep eoins | grep tpkernel !;
    if ($ps_returned =~ /^\s*eoins\s+([0-9]+)\s/)
    {
        # Non-user defined variable $1 holds the text matched by the part of the regexp that is in brackets, eg, ([0-9]+).
        $tpkernel_pid = $1;
    }
    else
    {
        $tpkernel_pid = "Kernel is not running!";
    }
Substitution/Replacement:
    @Template = <FD>;
    $me = "EoinS";
    foreach $line (@Template)
    {
        # if we can find the regexp in $line, substitute it.
        # 'gi' at the end of the expression instructs the regexp to do a global replace, ignoring case.
        $line =~ s/^Reviewer:\s*([a-zA-Z]*)$/$me/gi;
        print NEWFILE "$line";
    }
The same =~ syntax applies to y/../../ or tr/../../, although these do not accept the 'g' or 'i' modifiers.
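The three operations above (matching with captures, substitution, and tr) can be tried on fixed strings without running ps or opening a template file. The sample strings below are invented, and the substitution keeps the "Reviewer:" label by capturing it, which is a choice of this sketch.

```perl
use strict;
use warnings;

my $ps_line = "eoins     4242     1  0 10:15 ?  00:00:01 tpkernel";

# Parentheses capture; in list context the match returns the captures.
my ($user, $pid) = $ps_line =~ /^\s*(\w+)\s+(\d+)\s/;

# s/// edits the string in place and returns the number of replacements.
# The 'm' modifier lets ^ and $ match at every line of the string.
my $text  = "Reviewer: nobody\nreviewer: nobody\n";
my $count = ($text =~ s/^(reviewer:\s*)\w+$/${1}EoinS/gim);

# tr/// maps characters one for one; here lower case to upper case.
my $word = "elephant";
(my $copy = $word) =~ tr/a-z/A-Z/;

print "user=$user pid=$pid\n";
print "$count replacements:\n$text";
print "$copy\n";
```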
String manipulations
There are many utility functions provided by perl for string manipulation (if you decide to avoid regular expressions):
- chop $variable - removes the last character. More efficient than regexp.
- chomp $variable - removes a trailing newline from the end of the string, if there is one.
- split $split_char, $variable - returns an array of the pieces of $variable separated by occurrences of $split_char - eg, @words = split ' ', $my_string;. This would extract all words (or space delimited characters) from the variable $my_string.
- join $delimiter, @list_of_vars - creates a single variable from the list of variables - separated by the delimiter, eg, $passwd = join(':', $login,$passwd,$uid,$gid,$gcos,$home,$shell);
- format - sets a fixed format for 'write' call. See manpage.
- grep - extracts data from lists/arrays, returning the list of results (if assigned to @x) or the number of occurrences in the list (if assigned to $x), eg, @code_lines = grep(!/^#/, @inputfile); keeps every line that does not begin with '#'.
- lc - puts the string into lower case.
- lcfirst - puts the first char in the string into lower case.
- uc, ucfirst - upper case conversions.
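Several of these functions are commonly used together when picking a record apart and putting it back. The following sketch does that to a made-up line in the style of a password file entry; the field values are invented.

```perl
use strict;
use warnings;

my $record = "eoins:x:501:100:Eoin S:/home/eoins:/bin/sh\n";

chomp $record;                         # drop the trailing newline
my @fields = split /:/, $record;       # break the line on each colon
my ($login, $home) = @fields[0, 5];    # an array slice picks two fields

my $display = ucfirst(lc($login));     # "eoins" -> "Eoins"
my $shout   = uc($login);

# grep in list context returns matches, in scalar context a count.
my @paths      = grep { m{^/} } @fields;
my $path_count = grep { m{^/} } @fields;

my $rebuilt = join(':', @fields);

print "$display lives in $home ($path_count path fields: @paths)\n";
print "Round trip ok\n" if $rebuilt eq $record;
```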
Subroutines (defining and calling)
Perl uses subroutines in the same way as any programming language, although it is not necessary to have a 'main' or something similar. Anything not in a subroutine is considered (effectively) as being within the main subroutine.
Definition of a subroutine is as follows:
    sub my_subroutine_name {
        # do something
        # in subroutine
    }
Syntax used to call this subroutine is:
    &my_subroutine_name;
To use parameters, it is not necessary to explicitly prototype the function or anything like that. We again make use of the perl built-in variables: the parameters arrive in the array @_. For example:
In code:
    &add_record("Billy", 25, "Irish");
    sub add_record {
        if (scalar(@_) >= 3) {
            ($name, $age, $nationality) = @_;
            print FD "$name, $age, $nationality. Done.";
        }
        else {
            print "Not enough information to add $_[0]\n";
        }
    }
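Subroutines can also hand values back with 'return', either a single value or a whole list. The sketch below shows both, copying the parameters into 'my' variables so they stay local to the subroutine; describe_person and min_and_max are hypothetical names chosen for the example.

```perl
use strict;
use warnings;

# Arguments arrive in @_; copying them into 'my' variables keeps them local.
sub describe_person {
    my ($name, $age, $nationality) = @_;
    return "Not enough information" if @_ < 3;
    return "$name, $age, $nationality";
}

# A subroutine can return a list as easily as a single value.
sub min_and_max {
    my @sorted = sort { $a <=> $b } @_;
    return ($sorted[0], $sorted[-1]);
}

my $summary      = describe_person("Billy", 25, "Irish");
my $too_short    = describe_person("Billy");
my ($low, $high) = min_and_max(7, 3, 12, 5);

print "$summary\n$too_short\n";
print "Range is $low to $high\n";
```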
Non-User defined Variables
Perl provides a lot of information to the user automatically - in built-in variables. Although typically these variables supply information about the process and the process environment, they also provide shorthand syntax etc. to the programmer as well. For example, we have already seen the non-user variable '$_' being used in conjunction with the 'for' and 'foreach' looping commands (see the looping section for more information).
Other built-in variables include:
| Variable | Description | Aliases |
| @_, $_ | $_ is the default variable used if none is specified. See looping section for an example of how this is used. It is also the default parameter to many commands such as chop, split, etc. @_ holds the parameters passed to a subroutine. | $ARG (for $_), @ARG (for @_) |
| $$ | The process ID. This variable holds the PID of this process. | $PROCESS_ID, $PID |
| $! | System error message. It is the error returned by the OS translated into text. | $ERRNO, $OS_ERROR |
| $| | Automatic I/O flushing. If this value is set to non-zero, the currently selected channel is flushed after every write, otherwise output is buffered until a more convenient stage. See the sockets section of the network programming section below for an example of this. | $OUTPUT_AUTOFLUSH |
| $< | User ID - this is the real user ID of the user running the process. | $REAL_USER_ID, $UID |
| $> | Effective User ID - this is the effective user ID - may be different to the user running the program | $EFFECTIVE_USER_ID, $EUID |
| $( | Group ID - the real group ID of the person running the script | $GID, $REAL_GROUP_ID |
| $) | Effective Group ID - this is the effective group ID - may be different to the group of the user running the program | $EGID, $EFFECTIVE_GROUP_ID |
| @ARGV | List containing any command line arguments to the program. Indexed into like any Perl list - eg, $ARGV[0]. Note - 'C' holds the program name in argv[0], Perl does not. In perl, the ARGV list does not contain the program name. | None |
| $0 | The program name. This is the name of the script that is running. Changing this will change the name that appears when the 'ps' command is executed etc. | $PROGRAM_NAME |
The aliases are only available after 'use English;' at the top of the script.
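The long names in the Aliases column come from the standard English module, so a script has to load it before using them. As a quick sketch, the following checks that a few short names and their aliases really are the same variable, and shows $! after a failed open. The file path is deliberately one that should not exist and is purely an assumption of the example.

```perl
use strict;
use warnings;
use English qw(-no_match_vars);

# Each short name has a longer alias once the English module is loaded.
my $same_pid  = ($$ == $PROCESS_ID)   ? "yes" : "no";
my $same_name = ($0 eq $PROGRAM_NAME) ? "yes" : "no";
my $same_uid  = ($< == $REAL_USER_ID) ? "yes" : "no";

# $! holds the reason the most recent system call failed.
my $opened = open(my $fh, '<', '/no/such/dir/no-such-file');
my $reason = $opened ? "" : "$!";

my $arg_count = scalar(@ARGV);

print "PID $$ running $0 with $arg_count argument(s)\n";
print "Aliases agree: $same_pid $same_name $same_uid\n";
print "Open failed: $reason\n" unless $opened;
```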
Using Environment variables:
In perl, to access any of the environment variables, we can use the %ENV special variable, a hash indexed by variable name.
This variable is used as follows:
    print "Printing to $ENV{'LPDEST'}...";
or
    $ENV{'HOME'} = "Kansas";
Signals
Signal handling in Perl is very easy. To ignore a signal, we do:
    $SIG{'CHLD'} = 'IGNORE';
The word IGNORE above is recognised by Perl as being a special word in this case. Specifying DEFAULT here is also recognised, and restores the default handling of the signal.
Alternatively, we can register a signal handler for a signal too - for example:
    sub my_signal_handler
    {
        # Do something. The name of the signal is passed as the first parameter.
        print "Signal received was $_[0]";
    }
and then in the code:
    $SIG{'BUS'} = 'my_signal_handler';
The wait and waitpid system calls are available through perl too.
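A handler can be tried out safely by having the script send a signal to itself with the built-in 'kill'. This sketch assumes a UNIX-like system where the USR1 and USR2 signals exist. It registers the handler as a subroutine reference rather than by name, which is a choice of this example, and @received is a made-up variable for recording what arrived.

```perl
use strict;
use warnings;

my @received;

# The handler is called with the signal name as its first argument.
sub my_signal_handler {
    my ($signal_name) = @_;
    push @received, $signal_name;
}

$SIG{USR1} = \&my_signal_handler;
$SIG{USR2} = 'IGNORE';

# Send both signals to ourselves; $$ is this process's PID.
kill 'USR2', $$;   # ignored
kill 'USR1', $$;   # handled

# Give the handler a moment in case delivery is not immediate.
sleep 1 unless @received;

print "Handled: @received\n";
```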
Network Programming (eg, Sockets & IPC etc..)
Sockets:
To use sockets, the following code is required at the top of your script:
    use Socket;
To create a socket, do the following:
    $port_no = 23;
    $host = 'my.machine.com';
    $iaddr = inet_aton( $host );
    $paddr = sockaddr_in( $port_no, $iaddr );
    $proto = getprotobyname( 'tcp' );
    socket( SOCKET_FD, AF_INET, SOCK_STREAM, $proto ) or die "$!";
To connect the socket, now do:
    connect( SOCKET_FD, $paddr ) or die "$!";
To read data from the socket:
    $my_data = <SOCKET_FD>;
To write data to the socket:
    print SOCKET_FD "There once was a man from Nantucket\n";
Note: Output to the socket is buffered by default, so it is a good idea to make sure written data is flushed straight away. We can do this by using the built-in variable $| (see non-user defined variables for more info), eg,
    select( SOCKET_FD );
    $| = 1;
    select( STDOUT );
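The address handling can be explored without a network connection. The sketch below packs a port and address with inet_aton and sockaddr_in as above, then unpacks them again to show what the packed structure contains. It uses the loopback address 127.0.0.1 so that no name lookup is needed, which is an assumption of the example and not part of the connection code above.

```perl
use strict;
use warnings;
use Socket;

my $port_no = 23;
my $host    = '127.0.0.1';

# inet_aton turns a host name or dotted address into 4 packed bytes.
my $iaddr = inet_aton($host) or die "Cannot resolve $host\n";

# sockaddr_in packs port and address into the structure connect() expects.
my $paddr = sockaddr_in($port_no, $iaddr);

# In list context sockaddr_in unpacks the structure again.
my ($port_back, $iaddr_back) = sockaddr_in($paddr);

# inet_ntoa is the reverse of inet_aton for dotted addresses.
my $host_back = inet_ntoa($iaddr_back);

print "Packed address is ", length($paddr), " bytes\n";
print "Unpacked: $host_back port $port_back\n";
```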
IPC:
Message queues:
A message queue is an IPC method using the UNIX kernel to manage a queue of messages with a fixed structure. Although it is considered resource heavy and most recommend that it should not be used, perl provides an interface none-the-less.
Message queues are created (or the queue ID of an existing queue is retrieved from the KEY) with msgget, messages are submitted using msgsnd, messages are received using msgrcv, and msgctl is used to query, configure or remove the queue.
Pipes:
Pipes are another means of IPC supported by perl. There are two ways of using pipes in perl - using pipes across a new process created by the perl process, eg,
    pipe READHANDLE, WRITEHANDLE;
    $pid = fork;
    if ($pid == 0) {
        # Child process
        # Can communicate to parent process now by writing to the WRITEHANDLE
    }
    else {
        # Parent process
        # Can read from READHANDLE to communicate with child process.
    }
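Filling in the skeleton above gives a complete, runnable exchange. This sketch assumes a UNIX-like system where fork is available, and keeps the two ends of the pipe in 'my' variables ($reader and $writer, names chosen for the example) rather than barewords.

```perl
use strict;
use warnings;

pipe(my $reader, my $writer) or die "Cannot create pipe: $!\n";

my $pid = fork;
die "Cannot fork: $!\n" unless defined $pid;

if ($pid == 0) {
    # Child process: it only writes, so close the end it does not use.
    close $reader;
    print $writer "hello from child $$\n";
    close $writer;
    exit 0;
}

# Parent process: it only reads. Closing the write end matters, since the
# read below sees end-of-file only when every writer has closed.
close $writer;
my @messages = <$reader>;
close $reader;

# Collect the child's exit status so it does not linger as a zombie.
waitpid($pid, 0);
my $child_exit = $? >> 8;

chomp @messages;
print "Parent $$ got: @messages (child exit $child_exit)\n";
```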
Using pipes and external programs:
We can also open pipes from another program's output. The pipe character goes at the end of the command when reading from it, for example:
    open (MY_FD, "grep ELEPHANT animals.txt |") or die "$!";
    @data = <MY_FD>;
    print "The following elephants were found: \n";
    print "@data\n";
This can be expanded, eg,
    open(FD, "cat animals.txt | grep ELEPHANT | grep -v PINK |") or die "$!";
Semaphores:
Semaphores are also supported in Perl. A semaphore is an integer value in memory that is accessible to any process that knows the key. It is usually used for access control to a shared memory segment, but has many other uses.
Semaphores are created using semget, and are manipulated by semop and semctl.
Shared Memory:
Shared memory is a section of memory that is visible to all programs that know the key. This is a common form of IPC, due to its flexible nature. It is often used along with semaphores, the semaphores maintaining access control to the shared memory segment.
Shared memory is created using shmget, can be manipulated & queried using shmctl, and is read and written using shmread and shmwrite.
Information sources
Web sites:
www.perl.org - the Perl Institute
www.perl.com - Perl developers resource
www.perl.net - another perl site with good links etc.