Practical Extraction and Report Language (PERL)
Internet & Web Based Technology 2
Introduction
- What is PERL?
– Practical Extraction and Report Language.
– It is an interpreted language optimized for scanning arbitrary text files, extracting information from them, and printing reports based on that information.
– Very powerful string handling features.
– Available on almost all platforms.
Main Advantages
- Speed of development
– You can enter the program in a text file, and just run it. It is an interpreted language; no separate compilation step is needed.
- It is powerful
– The regular expressions of Perl are extremely powerful.
– Uses sophisticated pattern matching techniques to scan large amounts of data very quickly.
- Portability
– Perl is a standard language and is available on almost all platforms.
– Free versions are available on the Internet.
- Editing Perl programs
– No sophisticated editing tool is needed.
– Any simple text editor like Notepad or vi will do.
- Flexibility
– Perl does not limit the size of your data.
– If memory is available, Perl can handle the whole file as a single string.
– Allows one to write simple programs to perform complex tasks.
How to run Perl?
- Perl can be downloaded from the Internet.
– Available on almost all platforms.
- Assumptions:
– For Windows operating system, you can run Perl programs from the command prompt.
- Run “cmd” to get command prompt window.
– For Unix/Linux, you can run directly from the shell prompt.
Working through an example
- Recommended steps:
– Create a directory/folder where you will be storing the Perl files.
– Using any text editor, create a file “test.pl” with the following content:
    print "Good day\n";
    print "This is my first Perl program\n";
– Execute the program by typing the following at the command prompt:
    perl test.pl
- On Unix/Linux, a script that is to be run directly (as ./test.pl, after making it executable) needs an additional first line that names the Perl interpreter. The line is not required when the program is run as “perl test.pl”.
    #!/usr/bin/perl
    print "Good day\n";
    print "This is my first Perl program\n";
Variables
- Scalar variables
– A scalar variable holds a single value.
– Other variable types are also available (array and associative array); these are discussed later.
– A ‘$’ is used before the name of a variable to indicate that it is a scalar variable.
    $xyz = 20;
- Some examples:
    $a = 10;
    $name = "Indranil Sen Gupta";
    $average = 28.37;
– Variables do not have any fixed types.
– Variables can be printed as:
    print "My name is $name, the average temperature is $average\n";
- Data types:
– Perl does not require the type of a variable to be declared.
- It is a loosely typed language: the same scalar can hold a number at one point and a string at another.
- Languages like C or Java require every variable to be declared with a fixed type.
Variable Interpolation
- A powerful feature
– Variable names are automatically replaced by values when they appear in double-quoted strings.
- An example:
    $stud = "Rupak";
    $marks = 75;
    print "Marks obtained by $stud is $marks\n";
    print 'Marks obtained by $stud is $marks\n';
– The program will give the following output:
    Marks obtained by Rupak is 75
    Marks obtained by $stud is $marks\n
– What do we see:
- If we need to do variable interpolation, use double quotes; otherwise, use single quotes.
- Escape sequences such as \n are also processed only inside double quotes, so the second line ends with the two characters \ and n instead of a newline.
- Another example:
    $Expense = '$100';
    print "The expenditure is $Expense.\n";   # prints: The expenditure is $100.
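The two quoting styles can be compared side by side. The following is a minimal sketch that reuses the values from the slides; the variable names $double, $single and $report are introduced here only for illustration.

```perl
use strict;
use warnings;

my $stud  = "Rupak";
my $marks = 75;

# Double quotes: variables are interpolated.
my $double = "Marks obtained by $stud is $marks";

# Single quotes: the text is taken literally.
my $single = 'Marks obtained by $stud is $marks';

# A literal dollar sign survives when it comes from a single-quoted value.
my $expense = '$100';
my $report  = "The expenditure is $expense.";

print "$double\n$single\n$report\n";
```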
Expressions with Scalars
- Illustrated through examples (syntax similar to C)
    $abc = 10;
    $abc++;
    $total--;
    $a = $b ** 10;    # exponentiation
    $a = $b % 10;     # modulus
    $balance = $balance + $deposit;
    $balance += $deposit;
- Operations on strings:
– Concatenation: the dot (.) is used.
    $a = "Good";
    $b = " day";
    $c = "\n";
    $total = $a.$b.$c;    # concatenate the strings
    $a .= " day\n";       # add to the string $a
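– Arithmetic operations on strings
    $a = "bat";
    $b = $a;
    $b++;
    print $a, " and ", $b;
  will print
    bat and bau
– The auto-increment operator (++) steps a purely alphanumeric string to its successor, character by character.
- The ordinary arithmetic operators do not do this: $a + 1 converts "bat" to the number 0 and gives 1.
- May not always be meaningful.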
– String repetition operator (x).
    $a = $b x 3;
  will concatenate three copies of $b and assign it to $a.
    print "Ba" . "na" x 2;
  will print the string “Banana”.
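The difference between ++ and + on a string, and the precedence of the repetition operator, can be checked with a short script. This is only a sketch; the names $word, $next, $sum and $fruit are assumptions made for the example.

```perl
use strict;
use warnings;

my $word = "bat";

# ++ on an alphabetic string steps it to the next string.
my $next = $word;
$next++;

# + converts the string to a number first ("bat" counts as 0).
my $sum;
{
    no warnings 'numeric';   # silence the "isn't numeric" warning
    $sum = $word + 1;
}

# x binds more tightly than the dot, so only "na" is repeated.
my $fruit = "Ba" . "na" x 2;

print "$next $sum $fruit\n";   # bau 1 Banana
```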
String as a Number
- A string can be used in an arithmetic expression.
– How is the value evaluated?
– When converting a string to a number, Perl skips any leading spaces, then takes an optional sign and as many digits as it can find (including a decimal point) at the beginning of the string, and ignores everything else.
    "23.54" evaluates to 23.54
    "123Hello25" evaluates to 123
    "banana" evaluates to 0
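The conversion rule can be observed by adding 0 to a few strings. A minimal sketch follows; the sample "  -7 apples" is an extra case added here to show the leading spaces and the minus sign, and the 'numeric' warning is switched off because such conversions normally trigger it.

```perl
use strict;
use warnings;
no warnings 'numeric';   # these conversions would otherwise warn

my @samples = ("23.54", "123Hello25", "banana", "  -7 apples");

# Adding 0 forces each string to be evaluated as a number.
my %as_number = map { ($_ => $_ + 0) } @samples;

print "\"$_\" evaluates to $as_number{$_}\n" for @samples;
```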
Escaping
- The character ‘\’ is used as the escape character.
– Inside a double-quoted string it removes the special meaning of the character that follows it (e.g., $, @, " and \ itself).
    $num = 20;
    print "Value of \$num is $num\n";
    print "The windows path is c:\\perl\\";
Line Oriented Quoting
- Perl supports specification of a string spanning multiple lines.
– Use the marker ‘<<’.
– Follow it immediately, without a space, by a string, which is used to terminate the quoted material.
– In the program file, the terminating string must appear alone on a line, with nothing before it.
- Example:
    print <<terminator;
    Hello, how are you?
    Good day.
    terminator
- Another example:
    print "<HTML>\n";
    print "<HEAD><TITLE>Test page </TITLE></HEAD>\n";
    print "<BODY>\n";
    print "<H2>This is a test document.</H2>\n";
    print "</BODY></HTML>";
– The same output using line-oriented quoting:
    print <<EOM;
    <HTML>
    <HEAD><TITLE>Test page </TITLE></HEAD>
    <BODY>
    <H2>This is a test document.</H2>
    </BODY></HTML>
    EOM
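Line-oriented quoting follows the same interpolation rules as ordinary strings. The sketch below stores the text in variables instead of printing it directly; $title, $page and $raw are names assumed for the example. A terminator written in single quotes switches interpolation off.

```perl
use strict;
use warnings;

my $title = "Test page";

# Plain terminator: behaves like a double-quoted string.
my $page = <<EOM;
<HTML>
<HEAD><TITLE>$title</TITLE></HEAD>
<BODY>
<H2>This is a test document.</H2>
</BODY></HTML>
EOM

# Single-quoted terminator: the text is taken literally.
my $raw = <<'EOM';
Title is $title
EOM

print $page, $raw;
```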
Lists and Arrays
Basic Difference
- A list is an ordered collection of scalars.
- An array is a variable that holds a list.
- Each element of an array is a scalar.
- The size of an array:
– Lower limit: 0
– Upper limit: no specific limit; depends on virtual memory.
List Literal
- Examples:
    (10, 20, 50, 100)
    ('red', "blue", "green")
    ("a", 1, 2, 3, 'b')
    ($a, 12)
    ()             # empty list
    (10..20)       # range operator: the integers 10 to 20
    ('A'..'Z')     # same, for letters
Specifying Array Variable
- We use the special character ‘@’.
    @months        # denotes an array
– The individual elements of the array are scalars, and can be referred to as:
    $months[0]     # first element of @months
    $months[1]     # second element of @months
    ...
Initializing an Array
- Two ways:
– Specify values, separated by commas.
    @color = ('red', 'green', "blue", "black");
– Use the quote words (qw) operator, which uses whitespace as the delimiter:
    @color = qw(red green blue black);
Array Assignment
– Assign from a list of literals
    @numbers = (1, 2, 3);
    @colors = ("red", "green", "blue");
– From the contents of another array.
    @array1 = @array2;
– Using the qw operator:
    @word = qw(Hello good morning);
– Combination of above:
    @allcolors = ("white", @colors, "brown");
– Some other examples:
    @xyz = (2..5);       # (2, 3, 4, 5)
    @xyz = (1, @xyz);    # (1, 2, 3, 4, 5)
    @xyz = (@xyz, 6);    # (1, 2, 3, 4, 5, 6)
Multiple Assignments
    ($x, $y, $z) = (10, 20, 30);
    ($x, $y) = ($y, $x);                      # swap elements
    ($a, @col) = ('red', 'green', 'blue');
        # $a gets the value 'red'
        # @col gets the value ('green', 'blue')
    ($first, @val, $last) = (1, 2, 3, 4);
        # $first gets the value 1
        # @val gets the value (2, 3, 4)
        # $last is undefined
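The assignments above can be verified with a small script. This is a sketch only; $head, @rest and $size are names chosen for the example, and it shows that an array on the left-hand side absorbs all remaining values.

```perl
use strict;
use warnings;

my ($x, $y, $z) = (10, 20, 30);
($x, $y) = ($y, $x);                        # swap without a temporary

my ($head, @rest) = ('red', 'green', 'blue');

# @val takes everything that is left, so nothing remains for $last.
my ($first, @val, $last) = (1, 2, 3, 4);
my $size = @val;                            # array in scalar context gives its length

print "$x $y $z | $head | @rest | $size\n";
print "\$last is undefined\n" unless defined $last;
```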
Number of Elements in Array
- Two ways:
    $size = scalar @colors;
    $size = @colors;
Accessing Elements
    @list = (1, 2, 3, 4);
    $first = $list[0];
    $fourth = $list[3];
    $list[1]++;         # array becomes (1, 3, 3, 4)
    $x = $list[5];      # $x gets the value undef
    $list[2] = "Go";    # array becomes (1, 3, "Go", 4)
- $#value is the index of the last element of the array @value.
    @value = (1, 2, 3, 4, 5);
    print "$#value \n";    # prints 4
- For an empty array, $#value is -1.
shift and unshift
- They operate on the front of the array.
– ‘shift’ removes the first element of the array and returns it.
– ‘unshift’ adds one or more elements at the start of the array.
- Example:
    @color = qw(red blue green black);
    $first = shift @color;       # $first gets "red", and @color becomes
                                 # (blue, green, black)
    unshift (@color, "white");   # @color becomes (white, blue, green, black)
pop and push
- They operate on the end of the array.
– ‘pop’ removes the last element of the array and returns it.
– ‘push’ adds one or more elements at the end of the array.
- Example:
    @color = qw(red blue green black);
    $last = pop @color;          # $last gets "black", and @color becomes
                                 # (red, blue, green)
    push (@color, "white");      # @color becomes (red, blue, green, white)
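The four functions can be tried together on one array. A minimal sketch, using the colour list from the slides; the extra values "brown" and "grey" are added here only to show that ‘push’ accepts several elements.

```perl
use strict;
use warnings;

my @color = qw(red blue green black);

my $first = shift @color;        # removes "red" from the front
unshift @color, "white";         # adds "white" at the front

my $last = pop @color;           # removes "black" from the end
push @color, "brown", "grey";    # adds two elements at the end

print "$first $last | @color\n"; # red black | white blue green brown grey
```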
Reversing an Array
- By using the ‘reverse’ function.
    @names = ("Mina", "Tina", "Rina");
    @rev = reverse @names;      # Reversed list stored in @rev.
    @names = reverse @names;    # Original array is reversed.
Printing an Array
- Example:
    @colors = qw(red green blue);
    print @colors;      # prints without spaces: redgreenblue
    print "@colors";    # prints with spaces: red green blue
Sort the Elements of an Array
- Using the ‘sort’ function, by default we can sort the elements of an array lexicographically.
– Elements considered as strings.
    @colors = qw(red blue green black);
    @sort_col = sort @colors;    # Array @sort_col is (black blue green red)
– Another example:
    @num = qw(10 2 5 22 7 15);
    @new = sort @num;                # @new will contain (10 15 2 22 5 7)
– How do we sort numerically?
    @num = qw(10 2 5 22 7 15);
    @new = sort {$a <=> $b} @num;    # @new will contain (2 5 7 10 15 22)
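The effect of the comparison block is easiest to see when the same data is sorted in several ways. A short sketch follows; the descending sort is an addition not shown on the slides, obtained by swapping $a and $b.

```perl
use strict;
use warnings;

my @num = qw(10 2 5 22 7 15);

my @as_strings = sort @num;                  # default: string comparison
my @ascending  = sort { $a <=> $b } @num;    # numeric, smallest first
my @descending = sort { $b <=> $a } @num;    # numeric, largest first

print "@as_strings\n@ascending\n@descending\n";
```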
The ‘splice’ function
- Arguments to the ‘splice’ function:
– The first argument is an array.
– The second argument is an offset (index number of the list element to begin splicing at).
– Third argument is the number of elements to remove.
    @colors = ("red", "green", "blue", "black");
    @middle = splice (@colors, 1, 2);
        # @middle contains the elements removed: ("green", "blue")
        # @colors is left with ("red", "black")
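A short sketch of ‘splice’ in action. The second call goes slightly beyond the slide: ‘splice’ also accepts a list of replacement elements after the length, and with a length of 0 it inserts without removing anything. The values "white" and "grey" are assumptions made for the example.

```perl
use strict;
use warnings;

my @colors = ("red", "green", "blue", "black");

# Remove 2 elements starting at index 1.
my @middle = splice(@colors, 1, 2);          # @colors is now (red, black)

# Remove nothing at index 1, but insert two new elements there.
splice(@colors, 1, 0, "white", "grey");

print "@middle | @colors\n";                 # green blue | red white grey black
```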
File Handling
Interacting with the user
- Read from the keyboard (standard input).
– Use the predefined file handle STDIN with the line input operator: <STDIN>.
– Very simple to use.
    print "Enter your name: ";
    $name = <STDIN>;    # Read from keyboard
    print "Good morning, $name. \n";
– $name also contains the newline character.
- Need to chop it off.
The ‘chop’ Function
- The ‘chop’ function removes the last character of whatever it is given to chop.
- In the following example, it chops the newline.
    print "Enter your name: ";
    chop ($name = <STDIN>);    # Read from keyboard and chop newline
    print "Good morning, $name. \n";
- ‘chop’ removes the last character irrespective of whether it is a newline or not.
– Sometimes dangerous.
Safe chopping: ‘chomp’
- The ‘chomp’ function works like ‘chop’, with the difference that it removes the last character only if it is a newline.
    print "Enter your name: ";
    chomp ($name = <STDIN>);    # Read from keyboard and chomp newline
    print "Good morning, $name. \n";
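The difference between the two functions shows up when the input has no trailing newline. The sketch below uses fixed strings instead of keyboard input so that the result is repeatable; the variable names are chosen for the example.

```perl
use strict;
use warnings;

my $chomp_nl    = "Rupak\n";  chomp $chomp_nl;     # newline removed
my $chomp_plain = "Rupak";    chomp $chomp_plain;  # nothing to remove
my $chop_nl     = "Rupak\n";  chop  $chop_nl;      # newline removed
my $chop_plain  = "Rupak";    chop  $chop_plain;   # loses the final "k"

print "$chomp_nl $chomp_plain $chop_nl $chop_plain\n";   # Rupak Rupak Rupak Rupa
```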
File Operations
- Opening a file
– The ‘open’ function opens a file and associates it with a file handle.
– For standard input, we have a predefined handle STDIN.
    $fname = "/home/isg/report.txt";
    open XYZ, $fname;
    while (<XYZ>) {
        print "Line number $. : $_";
    }
– Checking the error code:
    $fname = "/home/isg/report.txt";
    open XYZ, $fname or die "Error in open: $!";
    while (<XYZ>) {
        print "Line number $. : $_";
    }
– $. holds the current line number (starting at 1)
– $_ holds the line that was just read
– $! holds the error code/message
- Reading from a file:
– The last example also illustrates file reading.
– The angle brackets (< >) are the line input operator.
- Inside the condition of a ‘while’ loop, the line read goes into $_
- Writing into a file:
    $out = "/home/isg/out.txt";
    open XYZ, ">$out" or die "Error in write: $!";
    for $i (1..20) {
        print XYZ "$i :: Hello, the time is ", scalar(localtime), "\n";
    }
- Appending to a file:
    $out = "/home/isg/out.txt";
    open XYZ, ">>$out" or die "Error in write: $!";
    for $i (1..20) {
        print XYZ "$i :: Hello, the time is ", scalar(localtime), "\n";
    }
- Closing a file:
close XYZ;
where XYZ is the file handle of the file being closed.
- Printing a file:
– This is very easy to do in Perl.
    $input = "/home/isg/report.txt";
    open IN, $input or die "Error in open: $!";
    while (<IN>) {
        print;
    }
    close IN;
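Writing, appending and reading can be combined in one round trip. The sketch below departs from the slides in two labelled ways: it uses a temporary file from the core File::Temp module instead of a fixed path, and it uses lexical file handles with the three-argument form of ‘open’, which is a common alternative to the bareword handles shown above.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Temporary file, removed automatically when the program ends.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh;

# Write three lines (">" truncates the file).
open(my $out, '>', $path) or die "Error in write: $!";
print $out "line $_\n" for 1 .. 3;
close $out;

# Append a fourth line (">>" keeps the existing content).
open($out, '>>', $path) or die "Error in append: $!";
print $out "line 4\n";
close $out;

# Read the file back, numbering the lines with $.
open(my $in, '<', $path) or die "Error in open: $!";
my @numbered;
while (<$in>) {
    chomp;
    push @numbered, "$. : $_";
}
close $in;

print "$_\n" for @numbered;
```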
Command Line Arguments
- Perl uses a special array called @ARGV.
– List of arguments passed along with the script name on the command line.
– Example: if you invoke Perl as:
    perl test.pl red blue green
  then @ARGV will be (red blue green).
– Printing the command line arguments:
    foreach (@ARGV) {
        print "$_ \n";
    }
Standard File Handles
- STDIN
– Read from standard input (keyboard), as in <STDIN>.
- STDOUT
– Print to standard output (screen).
- STDERR
– For outputting error messages.
- ARGV
– Reading from <ARGV> takes the names of the files from the command line and opens them all, one after another.
– The @ARGV array contains the text after the program’s name in the command line.
- <ARGV> takes each file in turn.
- If there is nothing specified on the command line, it reads from the standard input.
– Since this is very commonly used, Perl provides an abbreviation for <ARGV>, namely, <>
– An example is shown.
    $lineno = 1;
    while (<>) {
        print "$lineno: $_";
        $lineno++;
    }
– In this program, the name of the file has to be given on the command line.
    perl list_lines.pl file1.txt
    perl list_lines.pl a.txt b.txt c.txt
Control Structures
Introduction
- There are many control constructs in Perl.
– Similar to those in C.
– Illustrated through examples.
– The available constructs:
- for
- foreach
- if/elsif/else
- while
- do, etc.
Concept of Block
- A statement block is a sequence of statements enclosed in a matching pair of { and }.
    if ($year == 2000) {
        print "You have entered the new millennium.\n";
    }
- Blocks may be nested within other blocks.
Definition of TRUE in Perl
- In Perl, only the following values are considered as FALSE:
– The number 0
– The empty string ("") and the string "0"
– undef
– An empty list
- Everything else in Perl is TRUE.
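The rule can be tested directly. In this sketch the helper ‘truth’ is a hypothetical name introduced for the example; it reports how Perl treats a value in a condition. Note that strings such as "0.0", "00" and " " are TRUE, because they are neither empty nor exactly "0".

```perl
use strict;
use warnings;

# Hypothetical helper: says how a value behaves in a condition.
sub truth {
    my ($value) = @_;
    return $value ? "TRUE" : "FALSE";
}

my @report = (
    truth(0), truth(""), truth("0"), truth(undef),   # the false values
    truth("0.0"), truth("00"), truth(" "),           # true, despite appearances
);

print "@report\n";
```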
if .. else
- General syntax:
    if (test expression) {
        # if TRUE, do this
    }
    else {
        # if FALSE, do this
    }
- Examples:
    if ($name eq 'isg') {
        print "Welcome Indranil. \n";
    }
    else {
        print "You are somebody else. \n";
    }
    if ($flag == 1) {
        print "There has been an error. \n";
    }   # The else block is optional
elsif
- Example:
    print "Enter your id: ";
    chomp ($name = <STDIN>);
    if ($name eq 'isg') {
        print "Welcome Indranil. \n";
    }
    elsif ($name eq 'bkd') {
        print "Welcome Bimal. \n";
    }
    elsif ($name eq 'akm') {
        print "Welcome Arun. \n";
    }
    else {
        print "Sorry, I do not know you. \n";
    }
– Note the spelling: Perl uses ‘elsif’, not ‘elseif’ or ‘else if’.
while
- Example: (Guessing the correct word)
    $your_choice = '';
    $secret_word = 'India';
    while ($your_choice ne $secret_word) {
        print "Enter your guess: \n";
        chomp ($your_choice = <STDIN>);
    }
    print "Congratulations! Mera Bharat Mahan.";
for
- Syntax same as in C.
- Example:
    for ($i=1; $i<10; $i++) {
        print "Iteration number $i \n";
    }
foreach
- Very commonly used loop construct that iterates over a list.
- Example:
    @colors = qw(red blue green);
    foreach $name (@colors) {
        print "Color is $name. \n";
    }
- We can use ‘for’ in place of ‘foreach’.
- Example: Counting odd numbers in a list
    @xyz = qw(10 15 17 28 12 77 56);
    $count = 0;
    foreach $number (@xyz) {
        if (($number % 2) == 1) {
            print "$number is odd. \n";
            $count++;
        }
    }
    print "Number of odd numbers is $count. \n";
Breaking out of a loop
- The statement ‘last’, if it appears in the body of a loop, will cause Perl to immediately exit the loop.
– Used with a conditional.
    last if ($i > 10);
Skipping to end of loop
- For this we use the statement ‘next’.
– When executed, the remaining statements in the loop will be skipped, and the next iteration will begin.
– Also used with a conditional.
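Both statements can be seen at work in one loop. The sketch reuses the number list from the earlier example; the threshold of 50 and the array @odd are assumptions made for illustration.

```perl
use strict;
use warnings;

my @xyz = qw(10 15 17 28 12 77 56);
my @odd;

foreach my $number (@xyz) {
    next if $number % 2 == 0;   # even: skip the rest of this iteration
    last if $number > 50;       # first odd number above 50: leave the loop
    push @odd, $number;
}

print "@odd\n";   # 15 17
```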
Relational Operators
The Operators Listed
    String   Numeric   Comparison
    eq       ==        Equal
    ne       !=        Not equal
    gt       >         Greater than
    lt       <         Less than
    ge       >=        Greater or equal
    le       <=        Less or equal
Logical Connectives
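- If $a and $b are logical expressions, then the following connectives are supported by Perl:
– $a and $b      $a && $b
– $a or $b       $a || $b
– not $a         ! $a
- The two alternatives in each row have the same meaning; the first one is more readable.
- The word forms have lower precedence than the symbolic forms, which is why ‘or’ is used in statements such as open ... or die ...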
String Functions
The Split Function
- ‘split’ is used to split a string into multiple pieces using a delimiter, and create a list out of it.
    $_ = 'Red:Blue:Green:White:255';
    @details = split /:/, $_;
    foreach (@details) {
        print "$_\n";
    }
– The first parameter to ‘split’ is a regular expression that specifies what to split on.
– The second specifies what to split.
- Another example:
    $_ = 'Indranil isg@iitkgp.ac.in 283496';    # single quotes, so that @iitkgp is not interpolated
    ($name, $email, $phone) = split / /, $_;
- By default (when called with no arguments), ‘split’ breaks the string in $_ using whitespace as delimiter.
The Join Function
- ‘join’ is used to concatenate several elements into a single string, with a specified delimiter in between.
    $new = join ' ', $x1, $x2, $x3, $x4, $x5, $x6;
    $sep = '::';
    $new = join $sep, $x1, $x2, $x3, @abc, $x4, $x5;
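‘split’ and ‘join’ are inverse operations, which a short round trip makes visible. A minimal sketch using the strings from the slides; $record and $rejoined are names assumed for the example, and the second ‘split’ uses the special pattern ' ', which splits on any run of whitespace.

```perl
use strict;
use warnings;

my $record  = 'Red:Blue:Green:White:255';
my @details = split /:/, $record;            # five pieces
my $rejoined = join '::', @details;          # glue them back with a new separator

# ' ' as the pattern means "split on whitespace, ignoring leading spaces".
my ($name, $email, $phone) = split ' ', 'Indranil isg@iitkgp.ac.in 283496';

print scalar(@details), " pieces, rejoined as $rejoined\n";
print "$name | $email | $phone\n";
```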
Regular Expressions
Introduction
- One of the most useful features of Perl.
- What is a regular expression (RegEx)?
– A pattern written according to a fixed set of syntax rules.
– Basically specifies a chunk of text.
– Very powerful way to specify string patterns.
An Example: without RegEx
    $found = 0;
    $_ = "Hello good morning everybody";
    $search = "every";
    foreach $word (split) {
        if (index ($word, $search) >= 0) {    # does $word contain $search?
            $found = 1;
            last;
        }
    }
    if ($found) {
        print "Found the word 'every' \n";
    }
Using RegEx
    $_ = "Hello good morning everybody";
    if ($_ =~ /every/) {
        print "Found the word 'every' \n";
    }
- Very easy to use.
- The text between the forward slashes defines the regular expression.
- If we use “!~” instead of “=~”, the test succeeds when the pattern is not present in the string.
- The previous example illustrates literal texts as
regular expressions.
– Simplest form of regular expression.
- Point to remember:
– When performing the matching, all the characters in the string are considered to be significant, including punctuation and white spaces.
- For example, /every / will not match in the previous
example.
Another Simple Example
    $_ = "Welcome to IIT Kharagpur, students";
    if (/IIT K/) {
        print "'IIT K' is present in the string\n";
    }
    if (/Kharagpur students/) {
        print "This will not match\n";
    }
Types of RegEx
- Basically two types:
– Matching
- Checking if a string contains a substring.
- The symbol ‘m’ is used (optional if forward slash used as delimiter).
– Substitution
- Replacing a substring by another substring.
- The symbol ‘s’ is used.
Matching
The =~ Operator
- Tells Perl to apply the regular expression on the
right to the value on the left.
- The regular expression is contained within
delimiters (forward slash by default).
– If some other delimiter is used, then a preceding ‘m’ is essential.
Examples
    $string = "Good day";
    if ($string =~ m/day/) {
        print "Match successful \n";
    }
    if ($string =~ /day/) {
        print "Match successful \n";
    }
- Both forms are equivalent.
- The ‘m’ in the first form is optional.
    $string = "Good day";
    if ($string =~ m@day@) {
        print "Match successful \n";
    }
    if ($string =~ m[day]) {
        print "Match successful \n";
    }
- Both forms are equivalent.
- The character following ‘m’ is the delimiter; an opening bracket is closed by the matching closing bracket.
Character Class
- Use square brackets to specify “any value in the list of possible values”.
    my $string = "Some test string 1234";
    if ($string =~ /[0123456789]/) {
        print "found a number \n";
    }
    if ($string =~ /[aeiou]/) {
        print "Found a vowel \n";
    }
    if ($string =~ /[0123456789ABCDEF]/) {
        print "Found a hex digit \n";
    }
Character Class Negation
- Use ‘^’ at the beginning of the character class to specify “any single element that is not one of these values”.
    my $string = "Some test string 1234";
    if ($string =~ /[^aeiou]/) {
        print "Found a character that is not a lowercase vowel\n";
    }
– Note that [^aeiou] also matches digits, spaces and punctuation, not only consonants.
Pattern Abbreviations
- Useful in common cases
    .     Anything except newline (\n)
    \d    A digit, same as [0-9]
    \w    A word character, [0-9a-zA-Z_]
    \s    A space character (tab, space, etc)
    \D    Not a digit, same as [^0-9]
    \W    Not a word character
    \S    Not a space character
    $string = "Good and bad days";
    if ($string =~ /d..s/) {
        print "Found something like days\n";
    }
    if ($string =~ /\w\w\w\w\s/) {
        print "Found a four-letter word!\n";
    }
Anchors
- Three ways to define an anchor:
    ^  :: anchors to the beginning of string
    $  :: anchors to the end of the string
    \b :: anchors to a word boundary
    if ($string =~ /^\w/)        :: does string start with a word character?
    if ($string =~ /\d$/)        :: does string end with a digit?
    if ($string =~ /\bGood\b/)   :: does string contain the word “Good”?
Multipliers
- There are three multiplier characters.
    * :: Find zero or more occurrences
    + :: Find one or more occurrences
    ? :: Find zero or one occurrence
- Some example usages:
    $string =~ /^\w+/;
    $string =~ /\d?/;
    $string =~ /\b\w+\s+/;
    $string =~ /\w+\s?$/;
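The anchors and multipliers can be checked against one sample string. This is a sketch only; the hash %check and its keys are names invented for the example, and each entry records whether the pattern matched (1) or not (0).

```perl
use strict;
use warnings;

my $string = "Good and bad days";

my %check = (
    starts_with_word_char => ($string =~ /^\w/)      ? 1 : 0,
    ends_with_digit       => ($string =~ /\d$/)      ? 1 : 0,
    has_word_Good         => ($string =~ /\bGood\b/) ? 1 : 0,
    has_word_day          => ($string =~ /\bday\b/)  ? 1 : 0,   # "days" is a different word
    word_then_spaces      => ($string =~ /\b\w+\s+/) ? 1 : 0,
);

print "$_: $check{$_}\n" for sort keys %check;
```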
Substitution
Basic Usage
- Uses the ‘s’ character.
- Basic syntax is:
    $new =~ s/pattern_to_match/new_pattern/;
- What does this do?
– Looks for pattern_to_match in $new and, if found, replaces it with new_pattern.
– It looks for the pattern once. That is, only the first occurrence is replaced.
– There is a way to replace all occurrences (to be discussed shortly).
Examples
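    $xyz = "Rama and Lakshman went to the forest";
    $xyz =~ s/Lakshman/Bharat/;    # Rama and Bharat went to the forest
    $xyz =~ s/R\w+a/Bharat/;       # Bharat and Bharat went to the forest
    $xyz =~ s/[aeiou]/i/;          # Bhirat and Bharat went to the forest
    $abc = "A year has 11 months \n";
    $abc =~ s/\d+/12/;             # A year has 12 months
    $abc =~ s/\n$/ /;              # the trailing newline is replaced by a space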
Common Modifiers
- Two commonly used modifiers:
    /i :: ignore case
    /g :: match/substitute all occurrences
    $string = "Ram and Shyam are very honest";
    if ($string =~ /RAM/i) {
        print "Ram is present in the string";
    }
    $string =~ s/m/j/g;    # Ram -> Raj, Shyam -> Shyaj
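The effect of each modifier is clearest when the same operation is run with and without it. In this sketch the substitutions are applied to copies, so that the original string stays unchanged; the variable names are assumptions made for the example.

```perl
use strict;
use warnings;

my $string = "Ram and Shyam are very honest";

my $found_ignoring_case = ($string =~ /RAM/i) ? 1 : 0;   # matches "Ram"
my $found_exact_case    = ($string =~ /RAM/)  ? 1 : 0;   # no match

# Copy first, then substitute in the copy.
(my $first_only = $string) =~ s/m/j/;     # only the first "m"
(my $all        = $string) =~ s/m/j/g;    # every "m"

print "$found_ignoring_case $found_exact_case\n$first_only\n$all\n";
```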
Use of Memory in RegEx
- We can use parentheses to capture a piece of
matched text for later use.
– Perl memorizes the matched texts.
– Multiple sets of parentheses can be used.
- How to recall the captured text?
– Use \1, \2, \3, etc. if still in RegEx.
– Use $1, $2, $3 if after the RegEx.
Examples
    $string = "Ram and Shyam are honest";
    $string =~ /^(\w+)/;
    print $1, "\n";         # prints "Ram\n"
    $string =~ /(\w+)$/;
    print $1, "\n";         # prints "honest\n"
    $string =~ /^(\w+)\s+(\w+)/;
    print "$1 $2\n";        # prints "Ram and\n"
    $string = "Ram and Shyam are very poor";
    if ($string =~ /(\w)\1/) {
        print "found 2 in a row\n";     # matches the "oo" in "poor"
    }
    if ($string =~ /(\w+).*\1/) {
        print "found repeat\n";
    }
    $string =~ s/(\w+) and (\w+)/$2 and $1/;    # Shyam and Ram are very poor
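A match in list context returns the captured texts directly, which gives a compact way to check the results above. The sketch assumes the variable names shown; the substitution is applied to a copy of the string.

```perl
use strict;
use warnings;

my $string = "Ram and Shyam are honest";

my ($first_word) = $string =~ /^(\w+)/;             # first word
my ($last_word)  = $string =~ /(\w+)$/;             # last word
my ($one, $two)  = $string =~ /^(\w+)\s+(\w+)/;     # first two words

# $1 and $2 are used in the replacement part of a substitution.
(my $swapped = $string) =~ s/(\w+) and (\w+)/$2 and $1/;

# \1 refers back to the first capture inside the same pattern.
my ($doubled) = "very poor" =~ /(\w)\1/;

print "$first_word $last_word | $one $two | $swapped | $doubled\n";
```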
Example 1
- validating user input
    print "Enter age (or 'q' to quit): ";
    chomp (my $age = <STDIN>);
    exit if ($age =~ /^q$/i);
    if ($age =~ /\D/) {
        print "$age is a non-number!\n";
    }
Example 2: validation contd.
- File has 2 columns, name and age, delimited by one or more spaces. Can also have blank lines or commented lines (start with #).
    open IN, $file or die "Cannot open $file: $!";
    while (my $line = <IN>) {
        chomp $line;
        next if ($line =~ /^\s*$/ or $line =~ /^\s*#/);
        my ($name, $age) = split /\s+/, $line;
        print "The age of $name is $age. \n";
    }
Some Special Variables
$&, $` and $'
- What is $&?
– It represents the string matched by the last successful pattern match.
- What is $`?
– It represents the string preceding whatever was matched by the last successful pattern match.
- What is $'?
– It represents the string following whatever was matched by the last successful pattern match.
– Example:
    $_ = 'abcdefghi';
    /def/;
    print "$`:$&:$'\n";    # prints abc:def:ghi
- So actually ….
– $` represents pre match
– $& represents present match
– $' represents post match
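The three variables are only meaningful after a successful match, so it is safer to copy them inside the ‘if’ that performs the match. A minimal sketch; $pre, $match and $post are names chosen for the example.

```perl
use strict;
use warnings;

my $text = 'abcdefghi';
my ($pre, $match, $post) = ('', '', '');

if ($text =~ /def/) {
    # Copy the special variables while they still refer to this match.
    ($pre, $match, $post) = ($`, $&, $');
}

print "$pre:$match:$post\n";   # abc:def:ghi
```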
Associative Arrays
Introduction
- Associative arrays are also known as hashes.
– Similar to a list
- Every element consists of a pair: a hash key and a value.
- Hash keys must be unique.
– Accessing an element
- Unlike an array, an element value is found by specifying the hash key rather than a numeric index.
- Associative search.
– A hash variable name must begin with a ‘%’.
Specifying Hash Array
- Two ways to specify:
– Specifying hash keys and values, in proper sequence.
    %directory = (
        "Rabi", "258345",
        "Chandan", "325129",
        "Atul", "445287",
        "Sruti", "237221"
    );
– Using the => operator.
    %directory = (
        Rabi => "258345",
        Chandan => "325129",
        Atul => "445287",
        Sruti => "237221"
    );
– A bare word that appears on the left hand side of ‘=>’ is automatically treated as a quoted string.
Conversion Array <=> Hash
- An array can be converted to hash.
    @list = qw(Rabi 258345 Chandan 325129 Atul 445287 Sruti 237221);
    %directory = @list;
- A hash can be converted to an array:
    @list = %directory;
– The key/value pairs come back in no particular order.
Accessing a Hash Element
- Given the hash key, the value can be accessed
using ‘{ }’.
- Example:
    @list = qw(Rabi 258345 Chandan 325129 Atul 445287 Sruti 237221);
    %directory = @list;
    print "Atul's number is $directory{Atul} \n";
Modifying a Value
- By simple assignment:
    @list = qw(Rabi 258345 Chandan 325129 Atul 445287 Sruti 237221);
    %directory = @list;
    $directory{Sruti} = "453322";
    $directory{'Chandan'}++;
Deleting an Entry
- A (hash key, value) pair can be deleted from a hash
array using the “delete” function.
– Hash key has to be specified.
    @list = qw(Rabi 258345 Chandan 325129 Atul 445287 Sruti 237221);
    %directory = @list;
    delete $directory{Atul};
Swapping Keys and Values
- Why needed?
– Suppose we want to search for a person, given the phone number.
    @list = qw(Rabi 258345 Chandan 325129 Atul 445287 Sruti 237221);
    %directory = @list;
    %revdir = reverse %directory;
    print "$revdir{237221} \n";
– This works as intended only when the values are unique; otherwise entries are lost in the reversed hash.
Using Functions ‘keys’, ‘values’
- ‘keys’ returns all the hash keys as a list.
- ‘values’ returns all the values as a list.
    @list = qw(Rabi 258345 Chandan 325129 Atul 445287 Sruti 237221);
    %directory = @list;
    @all_names = keys %directory;
    @all_phones = values %directory;
An Example
- List all person names and telephone numbers.
    @list = qw(Rabi 258345 Chandan 325129 Atul 445287 Sruti 237221);
    %directory = @list;
    foreach $name (keys %directory) {
        print "$name \t $directory{$name} \n";
    }
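The hash operations above can be combined into one short script. This is a sketch under the assumption that a sorted listing is wanted: ‘keys’ returns the names in no particular order, so the example sorts them before printing.

```perl
use strict;
use warnings;

my %directory = (
    Rabi    => "258345",
    Chandan => "325129",
    Atul    => "445287",
    Sruti   => "237221",
);

$directory{Sruti} = "453322";     # modify a value
delete $directory{Atul};          # delete an entry

my %revdir = reverse %directory;  # phone number => name
my @names  = sort keys %directory;

print "$_\t$directory{$_}\n" for @names;
print "453322 belongs to $revdir{453322}\n";
```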
Subroutines
Introduction
- A subroutine …..
– Is a user-defined function.
– Allows code reuse.
– Define once, use multiple times.
How to use?
- Defining a subroutine
    sub test_sub {
        # the body of the subroutine goes here
        # ...
    }
- Calling a subroutine
– Use the ‘&’ prefix to call a subroutine.
    &test_sub;
    &gcd ($val1, $val2);    # Two parameters
– However, the ‘&’ is optional when the call is written with parentheses.
Subroutine Return Values
- Use the ‘return’ statement.
– This is also optional.
– If the keyword ‘return’ is omitted, Perl functions return the last value evaluated.
- A subroutine can also return a non-scalar.
- Some examples are given next.
Example 1
    $name = 'Indranil';
    welcome();         # call the first sub
    welcome_name();    # call the second sub
    exit;
    sub welcome {
        print "hi there\n";
    }
    sub welcome_name {
        print "hi $name\n";    # uses global $name variable
    }
Example 2
    # Return a non-scalar
    sub return_alpha_and_beta {
        return ($alpha, $beta);
    }
    $alpha = 15;
    $beta = 25;
    @c = return_alpha_and_beta;    # @c gets (15, 25)
Passing Arguments
- All arguments are passed into a Perl function through the special array @_.
– Thus, we can send as many arguments as we want.
- Individual arguments can also be accessed as $_[0], $_[1], $_[2], etc.
Example 3
    # Two different ways to write a subroutine to add two numbers
    sub add_ver1 {
        ($first, $second) = @_;
        return ($first + $second);
    }
    sub add_ver2 {
        return $_[0] + $_[1];    # $_[0] and $_[1] are the first two
                                 # elements of @_
    }
Example 4
    $total = find_total (5, 10, -12, 7, 40);
    sub find_total {
        # adds all numbers passed to the sub
        $sum = 0;
        for $num (@_) {
            $sum += $num;
        }
        return $sum;
    }
‘my’ variables
- We can define local variables using the ‘my’ keyword.
– Confines a variable to a region of code (within a block { } ).
– ‘my’ variable’s storage is freed whenever the variable goes out of scope.
– All variables in Perl are by default ‘global’.
Example 5
    $sum = 7;
    $total = add_any (20, 10, -15);    # $total gets 15
    sub add_any {
        # local variable, won't interfere
        # with global $sum
        my $sum = 0;
        for my $num (@_) {
            $sum += $num;
        }
        return $sum;
    }
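The points about arguments, return values and ‘my’ can be checked together. In this sketch ‘min_and_max’ is a hypothetical subroutine added to show a list being returned; the global $sum is declared with ‘our’ only because the example runs under ‘use strict’.

```perl
use strict;
use warnings;

our $sum = 7;    # global variable, as in Example 5

sub add_any {
    my $sum = 0;                 # local to this subroutine
    $sum += $_ for @_;           # @_ holds all the arguments
    return $sum;
}

# Hypothetical helper: returns a list of two values.
sub min_and_max {
    my @sorted = sort { $a <=> $b } @_;
    return ($sorted[0], $sorted[-1]);
}

my $total = add_any(20, 10, -15);
my ($min, $max) = min_and_max(5, 10, -12, 7, 40);

print "total=$total global sum=$sum min=$min max=$max\n";
```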
Writing CGI Scripts in Perl
Introduction
- Perl provides a number of facilities that simplify the writing of CGI scripts.
– Standard library modules.
- CGI.pm was included as part of the Perl distribution up to Perl 5.20.
- From Perl 5.22 onwards it has to be installed separately from CPAN.
    #!/usr/bin/perl
    use CGI qw(:standard);
- Some of the functions included in CGI.pm (the .pm is left out in the ‘use’ statement) are:
– header
- This generates the “Content-type” header, which the script then prints.
- With no arguments, the type is assumed to be “text/html”.
– start_html
- This generates the <html>, <head>, <title> and <body> tags.
- Accepts optional arguments.
– end_html
- This generates the closing HTML tags, </body> and </html>.
- Typical usages and arguments are illustrated through examples.
Example 1 (without using CGI.pm)
    #!/usr/bin/perl
    print <<TO_END;
    Content-type: text/html

    <HTML>
    <HEAD>
    <TITLE> Server Details </TITLE>
    </HEAD>
    <BODY>
    Server name: $ENV{SERVER_NAME} <BR>
    Server port number: $ENV{SERVER_PORT} <BR>
    Server protocol: $ENV{SERVER_PROTOCOL}
    </BODY>
    </HTML>
    TO_END
– The blank line after the Content-type header is required; it separates the HTTP header from the document body.
Example 2 (using CGI.pm)
    #!/usr/bin/perl -wT
    use CGI qw(:standard);
    print header ("text/html");
    print start_html ("Hello World");
    print "<h2>Hello, world!</h2>\n";
    print end_html;
Example 3: Decoding Form Input
    sub parse_form_data {
        my %form_data;
        my $name_value;
        my @nv_pairs = split /&/, $ENV{QUERY_STRING};
        if ( $ENV{REQUEST_METHOD} eq 'POST' ) {
            my $query = "";
            read (STDIN, $query, $ENV{CONTENT_LENGTH});
            push @nv_pairs, split /&/, $query;
        }
        foreach $name_value (@nv_pairs) {
            my ($name, $value) = split /=/, $name_value;
            $name =~ tr/+/ /;
            $name =~ s/%([\da-f][\da-f])/chr (hex($1))/egi;
            $value =~ tr/+/ /;
            $value =~ s/%([\da-f][\da-f])/chr (hex($1))/egi;
            $form_data{$name} = $value;
        }
        return %form_data;
    }
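The decoding step can be tried outside a web server by passing the query string as an argument. The following is a sketch, not the routine from the slide: ‘decode_pairs’ is a hypothetical name, the sample query string is invented, and a missing value is treated as an empty string. It applies the same two steps to names and values: ‘+’ becomes a space, and each %XX sequence becomes the character with that hexadecimal code.

```perl
use strict;
use warnings;

# Hypothetical helper: decodes "name=value&name=value" into a hash.
sub decode_pairs {
    my ($query) = @_;
    my %form_data;
    foreach my $pair (split /&/, $query) {
        my ($name, $value) = split /=/, $pair, 2;
        $value = '' unless defined $value;
        for ($name, $value) {                        # $_ aliases each variable in turn
            tr/+/ /;                                 # "+" encodes a space
            s/%([\da-f][\da-f])/chr(hex($1))/egi;    # %40 -> "@", etc.
        }
        $form_data{$name} = $value;
    }
    return %form_data;
}

my %form = decode_pairs('name=Indranil+Sen&email=isg%40iitkgp.ac.in');
print "$_ = $form{$_}\n" for sort keys %form;
```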
Using CGI.pm
- The decoded form value can be directly accessed as:
    $value = param ('fieldname');
- Code equivalent to the last example, written using CGI.pm, is shown next.
Example 4
    #!/usr/bin/perl -wT
    use CGI qw(:standard);
    my %form_data;
    foreach my $name (param()) {
        $form_data{$name} = param($name);
    }
Example 5: sending mail
    #!/usr/bin/perl -wT
    use CGI qw(:standard);
    print header;
    print start_html ("Response to Guestbook");
    $ENV{PATH} = "/usr/sbin";    # to locate sendmail
    open (MAIL, "| /usr/sbin/sendmail -oi -t");    # open the pipe to sendmail
    my $recipient = 'xyz@hotmail.com';
    print MAIL "To: $recipient\n";
    print MAIL "From: isg\@cse.iitkgp.ac.in\n";
    print MAIL "Subject: Submitted data\n\n";