Introduction to Perl
Scott Hazelhurst
http://www.bioinf.wits.ac.za/~scott/perl.pdf
August 2013
Introduction and Motivation
Practical Extraction and Report Language
◮ General language – intended for systems programming, scripting
◮ Popular with systems programmers & Web developers
◮ Relatively new language (ca. 1990)
◮ Big language – support for concurrency and OO.
◮ Portable.
Perl is a powerful and flexible language:
◮ Many devotees
◮ Many criticisms of the language
Easy to write difficult-to-understand code.
◮ Focus on writing readable code (for yourself, others)
Objectives
◮ Describe basic features of Perl
◮ Write simple Perl programs
◮ Use basic matching facilities
Lots of resources:
◮ Books
◮ http://cpan.mirror.ac.za/, www.cpan.org, www.perl.org
◮ perldoc, info perl
Assume: knowledge of programming, C-like language
Perl’s Data Structures
Perl is an imperative language.
◮ State is represented by a set of variables.
◮ Program is an ordered sequence of commands.
◮ Computation is accomplished by the execution of these commands in the specified order
Will explore Perl’s OO features later.
Flexible (perhaps too flexible) language.
◮ Scalars: numbers, strings
◮ Arrays, lists
◮ Hash tables
◮ References
Variables have a special prefix character that indicates the kind of variable ($, @, %, . . . )
Feature Warning
◮ Variables automatically declared
◮ Variables given default values
◮ Implicit type coercion common
◮ Meaning dependent on context
Good idea to use the “use strict; use warnings;” pragmas
Scalar types
All scalar variables have a $ as prefix, e.g. $x, $year, etc.
    $c = 10;
    $f = 9/5 * $c + 32;
    $cname = 'Introduction to Perl';
    print "Course $cname: size $c\n";
Interpolation
◮ Inside a double-quoted string, Perl does variable interpolation, and interprets escape characters
    print "Course $cname size \t $f\n";
◮ Not done inside single-quoted strings.
    print 'Course $cname size \t $f\n';
Other string operations
Concatenation: .
    $fruits = 'apples and pears';
    $base = 'pies';
    $full = $fruits . ' make ' . $base;
    $full .= ': ' . $fruits;
◮ Length of string: length. Substrings: substr
◮ Repeat operator: x
◮ Powerful & flexible string matching and processing – regular expressions.
◮ Type conversion to and from numbers happens automatically as necessary (beware!)
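The string operations listed above can be tried out with a short script. This is only an illustrative sketch: the variable names and sample strings are chosen here, not taken from the slides.

```perl
use strict;
use warnings;

my $fruits = 'apples and pears';
my $full   = $fruits . ' make ' . 'pies';   # concatenation with .

my $len   = length $full;          # number of characters in the string
my $first = substr($full, 0, 6);   # 6 characters starting at offset 0
my $rule  = '-' x 10;              # repeat operator: ten dashes

# Implicit conversion: + treats the string "3" as a number,
# and . treats the number 7 as a string.
my $sum  = "3" + 4;
my $text = 7 . " days";

print "$full\n";
print "$len $first $rule $sum $text\n";
```

Note that the operator, not the variable, decides whether a value is used as a number or as a string.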
Arithmetic Operations
The basic Perl arithmetic operations are: +, -, *, /, %, **.
◮ Default: floating point arithmetic
◮ There are many arithmetic procedures: int, sqrt, . . .
Short-cuts: Perl has many C-like features, e.g.
    $x = 3;
    $x += $y+1;
    $x++;
    --$y;
Truth and Logical operations
1. Any string is true except for the empty string and "0";
2. Any number is true except for 0
3. Any reference is true
4. Any undefined value is false
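The four rules can be checked directly. In the sketch below, truth is a hypothetical helper written for this illustration (it is not a Perl built-in); it simply uses its argument as a condition.

```perl
use strict;
use warnings;

# Hypothetical helper: reports how Perl treats a value used as a condition.
sub truth {
    my ($value) = @_;
    return $value ? "true" : "false";
}

# Strings, numbers, an undefined value and a reference to an empty array.
my @samples = ("", "0", "0.0", "00", 0, 0.0, 1, -1, undef, [], "abc");
for my $v (@samples) {
    my $shown = defined $v ? "'$v'" : "undef";
    print "$shown is ", truth($v), "\n";
}
```

The strings "0.0" and "00" are true even though they are numerically zero, because only the exact string "0" and the empty string are false.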
Logical operators
    High binding    Low binding
    &&              and
    ||              or
    !               not
Short-circuit evaluation! A logical expression returns the last value evaluated, e.g.
    2 or 3 and 5
    2 and (0 or 5)
    $a ||= 2
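Because a logical expression returns the last value evaluated, the three expressions above produce values other than plain true or false. The sketch below uses variables (names chosen here for illustration) instead of literals so that each step is explicit.

```perl
use strict;
use warnings;

my ($zero, $two, $three, $five) = (0, 2, 3, 5);

# "and" binds tighter than "or", so this is: $two or ($three and $five).
# $two is true, so evaluation stops there and 2 is returned.
my $r1 = ($two or $three and $five);

# $two is true, so the right-hand side decides: (0 or 5) gives 5.
my $r2 = ($two and ($zero or $five));

# ||= assigns only when the left-hand side is false.
my $unset;
$unset ||= 2;          # was undefined (false), becomes 2
my $set = 7;
$set ||= 2;            # already true, stays 7

print "$r1 $r2 $unset $set\n";
```

The ||= form is commonly used to supply a default value, with the caveat that a legitimate 0 or empty string is also replaced.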
Relational operators
    Comparison              Numeric    String
    Equal, Not equal        ==, !=     eq, ne
    Less than (equal)       <, <=      lt, le
    Greater than (equal)    >, >=      gt, ge
    Comparison              <=>        cmp
    Smart matching          ~~         ~~
File operators
    Example    Name
    -e $a      File exists
    -T $a      File is a text file
    -r $a      Readable file
    -w $a      Writable file
    -d $a      File is a directory
    -f $a      Regular file
Standard input and output
Default: file handle STDIN associated with keyboard (input), STDOUT with console output.
◮ Input is line oriented.
Output
The print command by default sends output to STDOUT.
    print "Hello";
is the same as
    print STDOUT "Hello";
Input
A reference to <STDIN> waits for input from the console.
◮ data typed in is returned
    print "Enter the temp in C: ";
    $c = <STDIN>;
    print "Temperature in F is " . (9/5*$c+32);
Control structures
Conditional
◮ if/else/elsif:
    if ($x < $y) {$x=1} else {$y=1};
◮ unless:
    unless ($ok) { die "Error in file I/O"; }
◮ no explicit switch/case
◮ implicit through use of short-circuit
    (-e "f.dat") or ($numf++)
◮ ternary conditional operator:
    cond ? ex1 : ex2
Example
    unless (-e "data.dat") { die "File data.dat not found" };
Example
    $c = $a cmp $b;
    if ($c == 0) {
        print "The strings are the same.\n";
    } elsif ($c < 0) {
        print "$a comes before $b\n";
    } else {
        print "$b comes before $a\n";
    }
Loops
◮ while: while (cond) { ... }
◮ for statement, C-style: for (init; condition; change) { ... }
◮ foreach
Example while loop
Read in a list of numbers terminated by 0. Compute the sum.
    $sum = 0;
    $num = <STDIN>;
    while ($num != 0) {
        $sum = $sum + $num;
        $num = <STDIN>;
    }
    print "Sum is $sum\n";
Example for loop: print a triangle of stars.
    use strict;
    my $i;
    my $j;
    for ($i=1; $i<11; $i++) {
        for ($j=0; $j<$i; $j++) {
            print "*";
        }
        print "\n";
    }
The same, using the repeat operator x:
    use strict;
    my $i;
    for ($i=1; $i<11; $i++) {
        print "*" x $i;
        print "\n";
    }
Other flow of control. . .
◮ next, last: the counterparts of C’s continue and break
◮ scalar range operator: ..
  don’t use unless you’re an expert.
◮ goto
  don’t use
Simple process control
Can execute Unix commands or executables using the backtick operator `
    $today = `date "+%C%y%m%d"`;
    print "Today is $today\n";
Other system calls
system: similar to backtick, but returns the command’s exit status instead of its output
    $x = system("ls");
    print "Returned <$x>\n";
exec: similar to system, but replaces the running Perl program with the command, so it does not return on success.
File operations
To read data from a file or write it to a file, use a file handle.
◮ Need to associate file handle with external file.
◮ File can be virtual – could be a pipe.
Files in General
File handle is the Perl construct used to manipulate files
◮ Open it – associate with external file or pipe
◮ Use it
◮ Close it
◮ open(DATAF, "figs.dat")              read
◮ open(DATAF, "<figs.dat")             read
◮ open(DATAF, ">figs.dat")             write
◮ open(DATAF, ">>figs.dat")            append
◮ open(DATAF, "| output-pipe-cmd");    set up output filter
◮ open(DATAF, "input-pipe-cmd |");     set up input filter
To print to a file that is open for writing, use the related file handle in the print statement.
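A minimal sketch of the open, use, close cycle follows. It departs from the slides in two labelled ways: it uses the three-argument form of open with a lexical file handle (my $out) rather than the bareword DATAF, and it writes to a temporary file created with the core File::Temp module instead of figs.dat.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# A scratch file standing in for figs.dat (assumption for this example).
my ($tmp, $path) = tempfile(UNLINK => 1);
close($tmp);

# Open for writing, print through the handle, close.
open(my $out, '>', $path) or die "Can't open $path for writing: $!";
print $out "3\n";
print $out "4\n";
close($out);

# Open for appending: existing contents are kept.
open($out, '>>', $path) or die "Can't open $path for appending: $!";
print $out "5\n";
close($out);

# Open for reading and fetch all lines at once (list context).
open(my $in, '<', $path) or die "Can't open $path for reading: $!";
my @lines = <$in>;
close($in);
chomp @lines;          # strip the end-of-line markers

print "Read back: @lines\n";
```

There is no comma between the file handle and the text in print $out "3\n"; with a comma, Perl would print the handle itself to STDOUT.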
Input from files
Suppose that INP is a file handle. A reference to <INP> does the following:
◮ Returns the next line of input;
◮ Consumes the input – in the case of a file, advances the file pointer to the next line of the file.
◮ NB: end-of-line marker read in.
    for ($i=0; $i<3; $i++) {
        $x = <INP>;
        print "**$x##";
    }
Assuming that the file contains the lines apple, banana, cherry, the output is
    **apple
    ##**banana
    ##**cherry
    ##
To add up the numbers in a file with three numbers.
    open(INP, "nums.txt");
    $x1 = <INP>;
    $x2 = <INP>;
    $x3 = <INP>;
    print "Answer is " . ($x1 + $x2 + $x3);
(the parentheses make sure the sum is computed before the concatenation) or
    open(INP, "nums.txt");
    print "Answer is " . (<INP> + <INP> + <INP>);
What’s the difference between
    print "Answer is " . (<INP> + <INP> + <INP>);
and
    print "Answer is . (<INP> + <INP> + <INP>) ";
Add up odd numbers in file
    print "Enter the file name: ";
    $namef = <STDIN>;
    chomp $namef;    # remove the end-of-line marker
    open(DATAF, $namef);
    while ($x = <DATAF>) {
        if ($x % 2) { $sum = $sum + $x; }
    }
    close(DATAF);
    print "The sum is $sum\n";
A common Perl idiom is
    open(DATAF, $namef) or die "Can't open ...";
Implicit Operands
Perl uses implicit operands extensively.
◮ We are told that this is a feature.
◮ The main culprit: $_ (also called $ARG under use English)
  Set (among other places) by reading from a file handle in a while condition. Many operators use $_ if not given an explicit operand.
◮ The following prints out the contents of a file:
    while (<DATAF>) { print };
Exercise
Write a Perl program that reads integers from a file called nums.dat and counts how many numbers are
◮ less than zero
◮ between zero and 10
◮ greater than or equal to 11
If there is no such file, your program should print an error message and halt.
Arrays/lists
Array variables are prefixed by @
    my @days;
    @days = ("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat");
◮ Arrays indexed from 0
◮ Individual elements are scalars: so $days[0]
Common to use foreach
    @days = ("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat");
    foreach $d (@days) {
        print "Day $d\n";
    }
Quote words
Shorthand for arrays of words
    @days = qw(Sun Mon Tue Wed Thu Fri Sat);
@ARGV – program’s arguments
@ARGV: Predefined array variable
◮ Contains the program’s arguments – values passed by the caller.
◮ If the program is run as follows:
    ./example.pl apple pear 1 2 3
  array @ARGV contains those values
◮ $ARGV[0] is apple;
◮ $ARGV[3] is 2;
Array semantics
Expressions are evaluated in scalar or list context.
◮ Similar or identical expressions can evaluate to different things in different contexts.
Context determined by operation:
◮ LHS of assignment determines context
◮ Operators or functions may determine context
Array context
◮ print @days yields SunMonTueWedThuFriSat
◮ @weekdays = @days[1..5] sets the array @weekdays to Mon .. Fri
◮ @days[1] is a one-element slice (a list), not a scalar
Scalar context
◮ $numdays = @days; sets $numdays to 7
◮ $y = ("Sun","Mon","Tue","Wed","Thu","Fri","Sat") sets $y to Sat
◮ $weekdays = @days[1..5] sets $weekdays to Fri: a slice in scalar context yields its last element, not its length
◮ $#days is 6 – the highest index of @days
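The effect of context can be seen by assigning the same array or slice to an array and to a scalar. The following is a small sketch; the variable names $slice_last and $slice_count are introduced here for illustration, and the "= () =" idiom for counting is not from the slides.

```perl
use strict;
use warnings;

my @days = qw(Sun Mon Tue Wed Thu Fri Sat);

# List context: the left-hand side is an array, so the slice is copied.
my @weekdays = @days[1..5];

# Scalar context: an array yields its number of elements ...
my $numdays = @days;

# ... but a slice yields its last element.
my $slice_last = @days[1..5];

# Forcing list context first, then counting the resulting list.
my $slice_count = () = @days[1..5];

print "@weekdays\n";
print "$numdays $slice_last $slice_count $#days\n";
```

Interpolating an array in a double-quoted string, as in "@weekdays", joins the elements with spaces, whereas print @weekdays prints them with nothing in between.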
Useful functions
◮ push: adds something to the right of the array
◮ pop: removes the rightmost element of the array.
◮ shift/unshift
◮ sort: returns a sorted list
  by default string order ascending, but can be changed.
Arrays are inherently 1D
◮ Need references to handle multi-dimensional arrays
Asides on strings
◮ splitting
    @x = split ";", "1;2;3";
    foreach $s (@x) {
        print "$s\n";
    }
◮ chop, chomp.
◮ see regexes later
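A short sketch of split together with chomp and chop; the sample strings are made up for the example. chomp removes a trailing end-of-line marker if there is one, while chop removes the last character whatever it is.

```perl
use strict;
use warnings;

my $line = "1;2;3\n";      # as it would arrive from <STDIN> or a file
chomp $line;               # drop the trailing newline, if any

my @fields = split /;/, $line;
my $sum = 0;
$sum += $_ for @fields;    # the strings are used as numbers here

my $word = "apples";
chop $word;                # removes the final character unconditionally

print "fields: @fields, sum: $sum, word: $word\n";
```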
Hash tables
Unordered set of scalars which allows fast information retrieval:
◮ Elements accessed/indexed by a string value (the key) associated with it
◮ Variables prefixed by a %, e.g. %currency
◮ Individual elements are scalar and are indexed with curly braces: $currency{"South Africa"}
    $currency{"South Africa"} = "ZAR";
    $currency{"Britain"} = "GBP";
Example
    %day2num = ("Sun",0, "Mon",1, "Tue",2, "Wed",3,
                "Thu",4, "Fri",5, "Sat",6);
Better is
    %day2num = ("Sun"=>0, "Mon"=>1, "Tue"=>2, "Wed"=>3,
                "Thu"=>4, "Fri"=>5, "Sat"=>6);
Referred to as $day2num{"Wed"}
Useful hash operations
◮ keys returns list of keys in hash table. Order non-deterministic.
◮ values returns list of values stored in the hash table. Order of no significance.
◮ sort keys %table lists the keys in lexicographic order
Example
    # read in from file
    while ($name = <INP>) {
        $reg = <INP>;
        chomp($name, $reg);    # drop the end-of-line markers
        $carreg{$name} = $reg;
    }
    # print out in order of name
    foreach $n (sort keys %carreg) {
        print "Car reg of $n is $carreg{$n} \n";
    }
Can also tell sort how to sort:
◮ sort {$a <=> $b} list sorts the list in numeric order.
◮ sort {$table{$a} cmp $table{$b}} keys %table sorts the keys so that corresponding hash table elements are in order
    # print out in order of car reg
    foreach $n (sort {$carreg{$a} cmp $carreg{$b}} keys %carreg) {
        print "$carreg{$n} owned by $n\n";
    }
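The different sort orders can be compared side by side. The hash contents below are invented sample data, not taken from the slides.

```perl
use strict;
use warnings;

# Hypothetical owners and registration numbers.
my %carreg = (Thandi => "CA 123", Sipho => "GP 456", Anna => "EC 789");

my @by_name = sort keys %carreg;                                  # key order
my @by_reg  = sort { $carreg{$a} cmp $carreg{$b} } keys %carreg;  # value order

my @nums       = (10, 9, 100, 1);
my @as_strings = sort @nums;                 # default: string comparison
my @as_numbers = sort { $a <=> $b } @nums;   # numeric comparison

print "@by_name\n@by_reg\n@as_strings\n@as_numbers\n";
```

The default string order puts 100 before 9, which is why numeric data needs the explicit <=> comparison.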
Procedures
Perl has procedures but its parameter passing mechanism is poor.
◮ Suppose we have a procedure plus that adds up two numbers.
  Called:
    $x = &plus($a, $b);
  where $a and $b are the parameters.
  Call by value semantics
Declaring a subroutine
To declare a subroutine: sub NAME BLOCK, e.g.
    sub printhead {
        print "Name Age Number Balance\n";
    }
    ...
    &printhead();
Parameters
Procedures have one formal parameter: @_
◮ The values of the actual parameters are given to the formal parameter; once copied out of @_ they behave as call-by-value (the elements of @_ themselves alias the caller’s variables).
◮ @_: a list local to the procedure.
    sub plus {
        $x = $_[0];
        $y = $_[1];
        return $x + $y;
    }
or
    sub plus {
        my ($x,$y) = @_;
        return $x+$y;
    }
A common idiom uses shift
    sub plus {
        $x = shift @_;
        $y = shift @_;
        return $x + $y;
    }
Example
A procedure to add up a list:
    sub addlist {
        my $sum = 0;
        foreach $num (@_) { $sum += $num };
        return $sum;
    }
    sub proclist {
        my ($f1,$f2,@nums) = @_;
        my $sum = 0;    # local to proclist; otherwise addlist's total would leak in
        foreach $n (@nums) { $sum = $sum+$n; }
        return ($f1*$f2*$sum);
    }
    $x = &addlist(1,2,3,5,6,9);
    $y = &proclist(3,4,1,1,0,2);
The following does not work:
    sub dotprod {
        (@a,@b) = @_;    # @a takes every element of @_; @b is left empty
        ...
    }
    @x = (1,2,3);
    @y = (4,5,6);
    &dotprod(@x,@y);
◮ Formal parameter is an unstructured list (possible danger in passing multiple lists as parameters)
◮ Forward declarations: sub addlist;
  Telling Perl that addlist is a procedure which will be declared elsewhere.
◮ Can have anonymous procedures that get assigned to variables.
◮ Weakness in parameter system can be overcome by using references.
◮ Default: all variables are global wherever defined. This is even recognised by the Perl community to be a Bad Thing.
  Declare variables local inside procedures. Best way: prefix first use of variable in a procedure with my. Lexical scoping.
References
References are mechanisms for a variable to refer to something, e.g. (1) another variable; (2) a piece of data; (3) a function
Symbolic references
Can turn a string into a variable (not allowed under use strict)
    $x = 123;
    $y = "x";
    $$y = 5;
    print $x;
Hard references
Can use the \ operator to take a reference to a variable, and -> to dereference.
    @a = (10,20,30,40,50);
    $x = \@a;
    for $e ( @{$x} ) {
        print "$e\n";
    }
    print $x->[0];
Call by reference
    sub count {
        my ($fname, $rcount) = @_;
        chomp $fname;
        unless (-T $fname) { return; }
        open(FINP, $fname);
        while (<FINP>) { $$rcount++ }
        close(FINP);
    }
    @files = `ls -1 $ARGV[0]`;
    $n = 0;
    for $f (@files) {
        &count($f, \$n);
    }
    print "Total number of lines is $n\n";
Passing multiple arrays
    sub dotprod {
        ($a,$b) = @_;
        ....
    }
    @x = (1,2,3);
    @y = (4,5,6);
    &dotprod(\@x,\@y);
Note that only scalars are passed.
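One way the elided body of dotprod might be filled in is sketched below. This is an illustration rather than the author's code: the parameter names $u and $v are chosen here (avoiding $a and $b, which sort uses), and the arrays are assumed to have the same length.

```perl
use strict;
use warnings;

# Dot product of two arrays passed as references.
sub dotprod {
    my ($u, $v) = @_;              # two scalars, each a reference to an array
    my $total = 0;
    for my $i (0 .. $#{$u}) {      # $#{$u} is the highest index of @$u
        $total += $u->[$i] * $v->[$i];
    }
    return $total;
}

my @x = (1, 2, 3);
my @y = (4, 5, 6);
my $d = dotprod(\@x, \@y);         # 1*4 + 2*5 + 3*6

print "Dot product is $d\n";
```

Because each reference is a single scalar, the two arrays stay separate inside @_ instead of being flattened into one list.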
Refs to anonymous arrays and hashes
Square brackets create an array and return a reference to it
    $a = [ 10, 20, 30, 40];
◮ Mistake: @a = [ 10, 20, 30, 40]; gives @a a single element, the reference
Curly braces create a hash and return a reference to it
    $day = {"sun"=>0, "mon"=>1, "tues"=> .....}
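Anonymous arrays and hashes can be nested, which is how multi-dimensional structures are built from inherently one-dimensional arrays. A minimal sketch with invented data:

```perl
use strict;
use warnings;

# A 2 x 3 grid: a reference to an array of array references.
my $grid = [ [1, 2, 3],
             [4, 5, 6] ];

my $cell = $grid->[1][2];          # row 1, column 2
my $rows = @{$grid};               # scalar context: number of rows
my @row0 = @{ $grid->[0] };        # copy of the first row

# A reference to an anonymous hash, indexed with ->{...}.
my $day = { sun => 0, mon => 1 };
my $mon = $day->{mon};

# The mistake from the slide: the array gets ONE element, a reference.
my @wrong = [10, 20, 30, 40];
my $count = @wrong;

print "$cell $rows @row0 $mon $count\n";
```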
Modules
A module is a collection of code and data structures that can be used as a library
◮ Often see it in OO programming
Magic word is use
    use IO;
◮ To access something that is in a module use ::. So, module::thing
◮ Modules can be nested.
Example
    use IO::Compress::Bzip2;
    IO::Compress::Bzip2::bzip2("do.pl", "do.pl.bz2");
Can also import specific things
    use IO::Compress::Bzip2 qw(bzip2 $Bzip2Error);
    unless (bzip2("do.plx", "do.pl.bz2")) {
        warn "Compression failed returning: $Bzip2Error\n";
    }
Objects
Data structures which
◮ contain data, know functions that can apply to them;
◮ anonymous
◮ accessed through references
◮ typically organised in classes
    use Bio::Seq;
    use Bio::SeqIO;
    $seqio = Bio::SeqIO->new('-format' => 'embl', -file => ...);
    $seqobj = $seqio->next_seq();
    $seqstr = $seqobj->seq();
    $seqstr = $seqobj->subseq(10,50);
    @features = $seqobj->get_SeqFeatures();
    foreach my $feat ( @features ) {
        print "Feature ", $feat->primary_tag,
              " starts ", $feat->start,
              " ends ", $feat->end,
              " strand ", $feat->strand, "\n";
    }
Regular expressions, matching, and more
Perl’s regular expression support is powerful
◮ concise way of describing a set of strings.
Typical: a command uses a regular expression to process some argument.
◮ Example: split uses a regex to split a string
    $line = "Gauteng;Johannesburg GP 7";
    @info = split("[ ;]", $line);
    ($prov, $capital, $reg, $pop) = split("[ ;]", $line);
Specifying Regular Expressions
◮ Most characters stand for themselves: F Fred 6312
◮ \ | ( ) [ { ^ $ * + ? . are metacharacters (have special meaning)
◮ escape with a backslash to match the chars themselves: \(Fred\) stands for the string (Fred)
◮ To group things together, use parentheses.
◮ To specify alternatives, use |
  (green|red) apples stands for green apples or red apples
◮ A list of characters in square brackets matches any of the characters.
◮ [YyNn] matches any of an upper or lower case “y” or “n”.
◮ [A-Za-z0-9] is all the alphanumeric characters
◮ \n new line; \t tab; \s a whitespace character;
◮ \d digit; \D non-digit;
◮ \w a word character, \W a non-word character
◮ . anything but a \n
◮ m{3} stands for mmm
  (map){2,3} stands for mapmap or mapmapmap
◮ m* stands for 0 or more m’s
◮ m+ stands for 1 or more m’s
◮ Many others. Rules long and complex.
Processing with regular expressions
Various commands that allow
◮ finding a pattern matching a regular expression in a string;
◮ extracting out the regular expression;
◮ substituting;
◮ or other modification
e.g. splitting, matching, substitution, translation.
Splitting up input
split(pat,string)
Takes the input string and splits the string wherever the pattern pat occurs.
◮ returns a list of strings as a result
◮ can choose whether the split pattern is part of the returned string or not
    @x = split /\*|::/, $c
processes the string $c, splits it wherever a * or :: appears, and returns the split list.
So, if $c="Dat: 23 * Mon: 11; LgA -632 ::: LgB -217* a", @x = ?
Matching
◮ m/regexpr/ returns true if the regular expression matches $_.
  The m is optional in most cases (it is needed when another delimiter is used because the pattern contains a slash): m-/apple-
◮ To match a particular string use the binding operator: =~
    $inp =~ m/(Pascal)|(\WC\W)|(C\+\+)/;
◮ A number of modifiers available: i (case insensitive), g (global)
  Count number of matches in a string:
    while ($inp =~ m/(Pascal)|(\WC\W)|(C\+\+)/g) { $wrd++ }
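The counting loop can be tried on a simpler pattern. The sketch below uses a made-up sentence and the pattern \bperl\b with both modifiers; it also shows an alternative not on the slide, where a /g match in list context returns all the matches at once.

```perl
use strict;
use warnings;

my $text = "Perl is fun. I like perl; PERL rocks, but perlish is not a word.";

# Scalar context with /g: each successful match resumes where the last ended.
my $count = 0;
while ($text =~ m/\bperl\b/gi) {
    $count++;
}

# List context with /g: returns every matched substring.
my @found = ($text =~ m/\bperl\b/gi);

print "count: $count, found: @found\n";
```

The \b anchors keep "perlish" from being counted, and the i modifier lets all three spellings match.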
Extracting patterns
◮ Can refer to and extract out subexpressions matched: $1, $2, . . .
◮ To look for a repeat (the same word twice in a row): m/\b(\w+)\s+\1/
◮ To extract out
    while ($line = <INP>) {
        next unless $line =~ m/(\w+)\s+(\w+)/;   # skip lines that do not match
        $name = $1;
        $reg = $2;
        $carreg{$name} = $reg;
    }
Substitution
s/regexpr/replacement/ replaces the text matched by the regular expression with the replacement string. Default string processed is $_.
    s/Apple/Orange/;
Replaces the first occurrence of Apple in $_ with Orange
Similar modifiers to m: i, g, ...
Use the binding operator to choose the string processed:
    $name = "Alan Turing";
    $name =~ s/(\w)\w*/$1\./;
    $c = ($name =~ s/(\w)\w*/$1\./);
    $c = ($name =~ s/(\w)\w*/$1\./g);
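To see what the substitutions above do, the sketch below applies them to separate copies of the name (the copies and their variable names are introduced here so that each result starts from the original string). The value returned by s/// is the number of substitutions made.

```perl
use strict;
use warnings;

my $name = "Alan Turing";

# Without g: only the first word is abbreviated.
my $first_only = $name;
my $c1 = ($first_only =~ s/(\w)\w*/$1./);

# With g: every word is abbreviated.
my $all = $name;
my $c2 = ($all =~ s/(\w)\w*/$1./g);

print "$first_only ($c1 substitution), $all ($c2 substitutions)\n";
```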
Translation
Use the tr or y operator.
◮ Operates on strings, rather than regular expressions.
tr/range1/range2/ replaces each character in range1 with the corresponding character in range2.
◮ $c =~ tr/a-z/A-Z/ changes every lower case letter in $c to uppercase.
◮ $cnt = $c =~ tr/a-z/a-z/ sets $cnt to the number of lower case letters in $c.
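Both uses of tr can be checked on a sample string (the string is chosen here for illustration). tr returns the number of characters it translated, which is what makes the counting idiom work.

```perl
use strict;
use warnings;

my $c = "Hello World";

# Counting: each lower case letter is "replaced" by itself, so $c is unchanged.
my $cnt = ($c =~ tr/a-z/a-z/);

# Upper-casing a copy, leaving $c alone.
(my $upper = $c) =~ tr/a-z/A-Z/;

print "$c has $cnt lower case letters; upper case: $upper\n";
```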
And more
◮ maths functions
◮ more sophisticated I/O
◮ lots of built in variables: e.g. $NR
◮ modules, OO
◮ IPC, networking
◮ process control