Lexical analysis



  1. Lexical analysis

  2. Lexical analysis

     Lexical analysis checks that the words of a program are well formed and transforms the program into a stream of tokens:
     – removes whitespace and comments;
     – identifies keywords, identifiers and literal constants;
     – constructs a symbol table;
     – finds line/column numbers of symbols;
     – reports lexical errors when necessary.

     Lexical analysis is also called scanning, and the corresponding analyser is called a scanner.
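The tasks listed on this slide can be sketched as a small scanner. This is only an illustration under stated assumptions: the token kinds, the keyword set `KEYWORDS`, the `#` comment syntax and the function name `scan` are all hypothetical and do not come from the slides.

```python
import re
from typing import Dict, List, NamedTuple, Tuple


class Token(NamedTuple):
    kind: str
    text: str
    line: int
    column: int


KEYWORDS = {"if", "else", "while"}  # hypothetical keyword set

# Order matters: earlier alternatives win when several could match.
TOKEN_SPEC = [
    ("COMMENT", r"#[^\n]*"),   # assumed comment syntax
    ("NEWLINE", r"\n"),
    ("SKIP", r"[ \t\r]+"),
    ("NUMBER", r"[0-9]+"),
    ("IDENT", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("OP", r"[-+*/=<>()]"),
    ("ERROR", r"."),           # any other character is a lexical error
]
MASTER = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_SPEC))


def scan(source: str) -> Tuple[List[Token], Dict[str, Tuple[int, int]]]:
    """Return the token stream and a symbol table (identifier -> first position)."""
    tokens: List[Token] = []
    symbols: Dict[str, Tuple[int, int]] = {}
    line, line_start = 1, 0
    for match in MASTER.finditer(source):
        kind, text = match.lastgroup, match.group()
        column = match.start() - line_start + 1
        if kind == "NEWLINE":
            line, line_start = line + 1, match.end()
        elif kind in ("SKIP", "COMMENT"):
            continue  # whitespace and comments are dropped
        elif kind == "ERROR":
            raise SyntaxError(
                f"unexpected character {text!r} at line {line}, column {column}"
            )
        else:
            if kind == "IDENT":
                if text in KEYWORDS:
                    kind = "KEYWORD"
                else:
                    symbols.setdefault(text, (line, column))
            tokens.append(Token(kind, text, line, column))
    return tokens, symbols
```

For instance, `scan("if x1 = 42 # note\n  y = x1\n")` yields seven tokens and a symbol table with entries for `x1` and `y`, while `scan("a $ b")` raises `SyntaxError`.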

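  3. Regular expressions

     Regular expressions over a (finite) alphabet $\Sigma$:

     $$E ::= \emptyset \mid \varepsilon \mid a \mid (E\,E) \mid (E \mid E) \mid E^*$$

     where $a \in \Sigma$. A regular expression $E$ defines a language $L(E) \subseteq \Sigma^*$:

     $$\begin{aligned}
     L(\emptyset) &= \emptyset \\
     L(\varepsilon) &= \{\varepsilon\} \\
     L(a) &= \{a\} \\
     L(E_1 E_2) &= \{\, uv \mid u \in L(E_1),\ v \in L(E_2) \,\} \\
     L(E_1 \mid E_2) &= L(E_1) \cup L(E_2) \\
     L(E^*) &= \{\, w_1 w_2 \cdots w_i \mid w_1, \ldots, w_i \in L(E),\ i \ge 0 \,\}
     \end{aligned}$$

     where the concatenation of zero words ($i = 0$) is $\varepsilon$.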

  4. Regular expressions

     Examples (regular expression and the language it defines):
     – $a \mid b$ defines $\{a, b\}$;
     – $abba$ defines $\{abba\}$;
     – $ab^*a$ defines $\{aa, aba, abba, abbba, \ldots\}$;
     – $(ab)^*$ defines $\{\varepsilon, ab, abab, ababab, \ldots\}$.

     To minimize the number of parentheses needed, operators have priorities:
     – the closure operator $(\cdot)^*$ has the highest priority;
     – the choice operator $(\cdot \mid \cdot)$ has the lowest priority.
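The examples and the priority rules can be checked mechanically. The sketch below uses Python's `re` module, whose syntax for choice, closure and grouping happens to coincide with the notation on this slide; the helper name `language` and the length bound are assumptions made for illustration.

```python
import re
from itertools import product
from typing import List


def language(pattern: str, alphabet: str = "ab", max_len: int = 4) -> List[str]:
    """List all words over `alphabet` up to length `max_len` that lie in L(pattern)."""
    words = []
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            word = "".join(letters)
            # fullmatch requires the whole word to match, as in the definition of L(E)
            if re.fullmatch(pattern, word):
                words.append(word)
    return words


print(language("a|b"))     # words of the first example
print(language("ab*a"))    # closure applies to b only
print(language("(ab)*"))   # parentheses extend closure to ab
```

Comparing `language("ab*")` with `language("(ab)*")` shows the effect of closure binding tighter than concatenation.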

  5. Regular expressions

     A regular description over an alphabet $\Sigma$ is a set of rules

     $$\begin{aligned}
     d_1 &\to E_1 \\
     d_2 &\to E_2 \\
     &\ \,\vdots \\
     d_n &\to E_n
     \end{aligned}$$

     where $d_i$ is a (unique) name and $E_i$ is a regular expression over the alphabet $\Sigma \cup \{d_1, \ldots, d_{i-1}\}$.

     Shorthand notation for regular expressions:
     – nonempty closure: $E^+ = E E^*$;
     – option: $E? = \varepsilon \mid E$;
     – character classes: e.g. $[a, b, c] = a \mid b \mid c$ or $[a{-}z] = a \mid \ldots \mid z$.

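  6. Regular expressions

     Examples of regular descriptions.

     Identifiers:

     $$\begin{aligned}
     \mathit{Letter} &\to [a{-}z, A{-}Z] \\
     \mathit{Digit} &\to [0{-}9] \\
     \mathit{Identifier} &\to \mathit{Letter}\,(\mathit{Letter} \mid \mathit{Digit})^*
     \end{aligned}$$

     Numeric constants:

     $$\begin{aligned}
     \mathit{Sign} &\to (+ \mid -)? \\
     \mathit{Integer} &\to 0 \mid \mathit{Sign}\,[1{-}9]\,\mathit{Digit}^* \\
     \mathit{Decimal} &\to \mathit{Integer}\,.\,\mathit{Digit}^+ \\
     \mathit{Real} &\to (\mathit{Integer} \mid \mathit{Decimal})\,E\,\mathit{Integer}
     \end{aligned}$$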
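A regular description maps naturally onto named pattern strings, where each rule may refer only to names defined before it. The following is a minimal sketch using Python's `re` syntax; the constant names mirror the rule names on this slide, and the extra parentheses around `INTEGER` are an assumption needed so that its top-level choice survives substitution into later rules.

```python
import re

# Identifiers
LETTER = r"[a-zA-Z]"
DIGIT = r"[0-9]"
IDENTIFIER = rf"{LETTER}({LETTER}|{DIGIT})*"

# Numeric constants
SIGN = r"(\+|-)?"
INTEGER = rf"(0|{SIGN}[1-9]{DIGIT}*)"   # parenthesised: contains a top-level choice
DECIMAL = rf"{INTEGER}\.{DIGIT}+"
REAL = rf"({INTEGER}|{DECIMAL})E{INTEGER}"


def matches(rule: str, word: str) -> bool:
    """True if the whole word belongs to the language of the rule."""
    return re.fullmatch(rule, word) is not None


print(matches(IDENTIFIER, "x1"), matches(IDENTIFIER, "1x"))
print(matches(INTEGER, "-42"), matches(INTEGER, "007"))
print(matches(DECIMAL, "3.14"), matches(REAL, "3.14E-2"))
```

Note that, read literally, the rules reject `-0` and integers with leading zeros, because a sign is only allowed in front of a nonzero leading digit.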

  7. Finite automata

     A finite automaton is a quintuple $A = \langle Q, \Sigma, \delta, q_0, F \rangle$, where
     – $Q$ is a finite set of states;
     – $\Sigma$ is the finite alphabet;
     – $\delta \subseteq Q \times (\Sigma \cup \{\varepsilon\}) \times Q$ is the transition relation;
     – $q_0 \in Q$ is the initial state;
     – $F \subseteq Q$ is the set of final states.

     A finite automaton is deterministic (DFA) if the transition relation is a function $\delta : Q \times \Sigma \to Q$. Otherwise, the finite automaton is nondeterministic (NFA).

  8. Finite automata

     Finite automata can be represented by state transition diagrams.

     [Figure: state transition diagram with states $q_0$, $q_1$, $q_2$ and transitions labelled $a$, $a$, $b$]

     The finite automaton $A = \langle Q, \Sigma, \delta, q_0, F \rangle$ accepts the language

     $$L(A) = \{\, w \in \Sigma^* \mid (q_0, w, q_f) \in \delta^*,\ q_f \in F \,\}$$

     where $\delta^* \subseteq Q \times \Sigma^* \times Q$ is the reflexive and transitive closure of the transition relation $\delta$.

     Theorem: The class of languages accepted by finite automata is exactly the class of regular languages.
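Acceptance by a finite automaton can be sketched by tracking the set of states reachable after each input character, which amounts to computing $\delta^*$ on the fly. The dictionary encoding of $\delta$, the empty string as the $\varepsilon$ label and the example automaton (a guess at an automaton for $ab^*a$) are assumptions for illustration only.

```python
from typing import Dict, Set, Tuple

EPS = ""  # label used for epsilon-transitions (an encoding assumption)
Delta = Dict[Tuple[str, str], Set[str]]


def eps_closure(delta: Delta, states: Set[str]) -> Set[str]:
    """All states reachable from `states` using only epsilon-transitions."""
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for p in delta.get((q, EPS), set()):
            if p not in closure:
                closure.add(p)
                stack.append(p)
    return closure


def accepts(delta: Delta, start: str, finals: Set[str], word: str) -> bool:
    """True if some path labelled `word` leads from `start` to a final state."""
    current = eps_closure(delta, {start})
    for a in word:
        step: Set[str] = set()
        for q in current:
            step |= delta.get((q, a), set())
        current = eps_closure(delta, step)
    return bool(current & finals)


# Hypothetical automaton for ab*a: q0 -a-> q1, q1 -b-> q1, q1 -a-> q2.
DELTA: Delta = {("q0", "a"): {"q1"}, ("q1", "b"): {"q1"}, ("q1", "a"): {"q2"}}

print(accepts(DELTA, "q0", {"q2"}, "abba"))
print(accepts(DELTA, "q0", {"q2"}, "ab"))
```

The same function works for deterministic and nondeterministic automata, since a DFA is the special case where every set in the table has exactly one element and there are no `EPS` entries.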

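  9. Converting a regular expression to an automaton

     Thompson's construction for converting a regular expression to an NFA:
     – for a regular expression $E$, construct the "automaton" $q_0 \xrightarrow{E} q_f$;
     – transform the "automaton" using the following rules until all transitions have only simple labels (i.e. $\varepsilon$ or a character).

     Rules:
     – concatenation: $p \xrightarrow{E_1 E_2} q$ is replaced by $p \xrightarrow{E_1} q_1 \xrightarrow{E_2} q$ with a new state $q_1$;
     – choice: $p \xrightarrow{E_1 \mid E_2} q$ is replaced by two parallel transitions $p \xrightarrow{E_1} q$ and $p \xrightarrow{E_2} q$;
     – closure: $p \xrightarrow{E^*} q$ is replaced by new states $q_1$, $q_2$, a transition $q_1 \xrightarrow{E} q_2$ and four $\varepsilon$-transitions that allow $E$ to be skipped or repeated.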

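  10. Converting a regular expression to an automaton

      Example: $a(a \mid b)^*$
      – start with $q_0 \xrightarrow{a(a \mid b)^*} q_f$;
      – concatenation rule: $q_0 \xrightarrow{a} q_1 \xrightarrow{(a \mid b)^*} q_f$;
      – closure rule: new states $q_2$, $q_3$ with $q_2 \xrightarrow{a \mid b} q_3$ and four $\varepsilon$-transitions linking $q_1$, $q_2$, $q_3$ and $q_f$;
      – choice rule: $q_2 \xrightarrow{a} q_3$ and $q_2 \xrightarrow{b} q_3$.

      [Figure: resulting NFA with states $q_0$, $q_1$, $q_2$, $q_3$, $q_f$, transitions labelled $a$, $a$, $b$ and four $\varepsilon$-transitions]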
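The rewriting procedure can be sketched as a work-list of transitions whose labels are still compound expressions. Several choices here are assumptions rather than something the slides confirm: the tuple encoding of expressions, the state naming, and in particular the placement of the four $\varepsilon$-transitions in the closure rule, which follows the usual textbook form of Thompson's construction.

```python
from itertools import count
from typing import List, Tuple

Edge = Tuple[str, str, str]  # (source, label, target); label "" stands for epsilon


def thompson(expr: tuple) -> Tuple[str, str, List[Edge]]:
    """Rewrite q0 --expr--> qf until every label is a character or epsilon.

    Expressions are nested tuples (a hypothetical encoding):
    ("sym", c), ("eps",), ("empty",), ("cat", e1, e2), ("alt", e1, e2), ("star", e).
    """
    fresh = count(1)
    start, final = "q0", "qf"
    work = [(start, expr, final)]  # transitions that still carry compound labels
    edges: List[Edge] = []
    while work:
        p, e, q = work.pop()
        kind = e[0]
        if kind == "sym":
            edges.append((p, e[1], q))
        elif kind == "eps":
            edges.append((p, "", q))
        elif kind == "empty":
            pass  # the empty language contributes no transition
        elif kind == "cat":
            mid = f"q{next(fresh)}"
            work += [(p, e[1], mid), (mid, e[2], q)]
        elif kind == "alt":
            work += [(p, e[1], q), (p, e[2], q)]
        elif kind == "star":
            q1, q2 = f"q{next(fresh)}", f"q{next(fresh)}"
            # enter, leave, repeat, skip (assumed standard arrangement)
            edges += [(p, "", q1), (q2, "", q), (q2, "", q1), (p, "", q)]
            work.append((q1, e[1], q2))
        else:
            raise ValueError(f"unknown expression kind: {kind!r}")
    return start, final, edges


# a(a|b)*
EXPR = ("cat", ("sym", "a"), ("star", ("alt", ("sym", "a"), ("sym", "b"))))
START, FINAL, EDGES = thompson(EXPR)
for edge in sorted(EDGES):
    print(edge)
```

With this encoding the example produces seven transitions over the states `q0`, `q1`, `q2`, `q3` and `qf`.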


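  14. Constructing DFA

      Given an NFA $A = \langle Q, \Sigma, \delta, q_0, F \rangle$, construct an equivalent DFA $A' = \langle Q', \Sigma, \delta', q_0', F' \rangle$ by subset construction.

      Auxiliary functions:
      – the $\varepsilon$-closure function

        $$\varepsilon\text{-}\mathit{closure} : 2^Q \to 2^Q, \qquad \varepsilon\text{-}\mathit{closure}(S) = \{\, p \mid q \in S,\ (q, \varepsilon, p) \in \delta^* \,\}$$

      – the single step function

        $$\mathit{move} : 2^Q \times \Sigma \to 2^Q, \qquad \mathit{move}(S, a) = \{\, p \mid q \in S,\ (q, a, p) \in \delta \,\}$$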

  15. Constructing DFA

      Algorithm:

        $Q' := \emptyset$; $F' := \emptyset$; $\delta' := \emptyset$;
        $q_0' := \varepsilon\text{-}\mathit{closure}(\{q_0\})$; $U := \{q_0'\}$;
        while $\exists S \in U$ do
          $U := U \setminus \{S\}$; $Q' := Q' \cup \{S\}$;
          foreach $a \in \Sigma$ do
            $T := \varepsilon\text{-}\mathit{closure}(\mathit{move}(S, a))$;
            if $T \notin U \cup Q'$ then $U := U \cup \{T\}$;
            $\delta' := \delta' \cup \{(S, a) \mapsto T\}$;
          end
        end
        $F' := \{\, S \in Q' \mid S \cap F \ne \emptyset \,\}$;
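The algorithm translates almost line by line into code. In the sketch below the NFA is given as a list of `(source, label, target)` triples with `""` standing for $\varepsilon$; this encoding, the function names and the example NFA for $a(a \mid b)^*$ (with the $\varepsilon$-transitions placed as in the usual form of Thompson's construction) are assumptions made for illustration.

```python
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple

Edge = Tuple[str, str, str]  # (source, label, target); "" is epsilon
State = FrozenSet[str]


def eps_closure(edges: List[Edge], states: Iterable[str]) -> Set[str]:
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for s, label, t in edges:
            if s == q and label == "" and t not in closure:
                closure.add(t)
                stack.append(t)
    return closure


def move(edges: List[Edge], states: Iterable[str], a: str) -> Set[str]:
    return {t for s, label, t in edges if s in states and label == a}


def subset_construction(edges: List[Edge], start: str, finals: Set[str], alphabet: str):
    q0: State = frozenset(eps_closure(edges, {start}))
    dfa_states: Set[State] = set()            # Q'
    dfa_delta: Dict[Tuple[State, str], State] = {}
    unmarked: List[State] = [q0]              # U
    while unmarked:
        S = unmarked.pop()
        dfa_states.add(S)
        for a in alphabet:
            T: State = frozenset(eps_closure(edges, move(edges, S, a)))
            if T not in dfa_states and T not in unmarked:
                unmarked.append(T)
            dfa_delta[(S, a)] = T
    dfa_finals = {S for S in dfa_states if S & finals}
    return dfa_states, dfa_delta, q0, dfa_finals


# Assumed NFA for a(a|b)*
NFA = [
    ("q0", "a", "q1"), ("q1", "", "q2"), ("q2", "a", "q3"), ("q2", "b", "q3"),
    ("q3", "", "qf"), ("q3", "", "q2"), ("q1", "", "qf"),
]
STATES, DELTA, START, FINALS = subset_construction(NFA, "q0", {"qf"}, "ab")
print(len(STATES), "DFA states,", len(FINALS), "of them final")
```

Because the algorithm computes a target set for every letter, the empty set appears as an explicit dead state whenever some transition is missing in the NFA, as happens here for `b` in the initial state.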

  16. Constructing DFA

      Example: subset construction applied to the NFA for $a(a \mid b)^*$ (states $q_0$, $q_1$, $q_2$, $q_3$, $q_f$), with the DFA built up step by step:
      – the initial state $q_0'$;
      – $q_0' \xrightarrow{a} q_1'$;
      – $q_1' \xrightarrow{a} q_2'$ and $q_1' \xrightarrow{b} q_2'$;
      – $q_2' \xrightarrow{a} q_2'$ and $q_2' \xrightarrow{b} q_2'$.

      [Figure: the NFA (top) and the resulting DFA with states $q_0'$, $q_1'$, $q_2'$ (bottom)]

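  22. Minimizing DFA

      DFA constructed from the regular expression $a(a \mid b)^*$: $q_0 \xrightarrow{a} q_1$, $q_1 \xrightarrow{a} q_2$, $q_1 \xrightarrow{b} q_2$, $q_2 \xrightarrow{a} q_2$, $q_2 \xrightarrow{b} q_2$.

      An equivalent smaller DFA: $q_0 \xrightarrow{a} q_1$, $q_1 \xrightarrow{a} q_1$, $q_1 \xrightarrow{b} q_1$.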

  23. Minimizing DFA

      A DFA is minimal if there is no smaller DFA accepting the same language. For every DFA $A = \langle Q, \Sigma, \delta, q_0, F \rangle$ there exists an equivalent minimal DFA $A' = \langle Q', \Sigma, \delta', q_0', F' \rangle$, unique up to renaming of states.

      Idea: partition the set of states into equivalence classes.
      – States $p, q \in Q$ are equivalent (indistinguishable) if the automata having these as initial states accept the same language, i.e. for any word $w \in \Sigma^*$, one succeeds if and only if the other does.
      – For every letter, the transition function maps equivalent states to equivalent states.

  24. Minimizing DFA

      Minimization algorithm:
      – Remove all states unreachable from the initial state $q_0$.
      – On the remaining set of states, find the coarsest partition $\Pi$ into equivalence classes.
      – Construct the new automaton $A' = \langle Q', \Sigma, \delta', q_0', F' \rangle$, where
        – the set of states is $Q' = \Pi$;
        – the initial state is $q_0' = P_0$, where $P_0 \in \Pi$ and $q_0 \in P_0$;
        – the set of final states is $F' = \{\, P \in \Pi \mid P \cap F \ne \emptyset \,\}$;
        – the transition function is $\delta' = \{\, (P_i, a) \mapsto P_j \mid \mathit{move}(P_i, a) \subseteq P_j \,\}$.

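  25. Minimizing DFA

      Naive algorithm for finding the partition:

        $P := \{F,\ Q \setminus F\}$;
        do
          $\Pi := P$; $P := \emptyset$;
          foreach $S \in \Pi$ do
            foreach $a \in \Sigma$ do
              $U := \{\, T \in \Pi \mid T \cap \mathit{move}(S, a) \ne \emptyset \,\}$;
              $V := \{\, S \cap \mathit{move}_a^{-1}(T) \mid T \in U \,\}$;
              $P := P \cup V$;
            end
          end
        until $\Pi = P$;

      Here $\mathit{move}_a^{-1}(T)$ denotes the set of states that have an $a$-transition into $T$.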
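The whole minimization procedure can be sketched as follows. This is a variation on the naive refinement, not a literal transcription: it assumes a total transition function (every state has a transition for every letter) and splits each block by the combined behaviour of its states on all letters at once. The function names and the example DFA, which adds a hypothetical dead state `qd` and an unreachable state `qx`, are assumptions for illustration.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Delta = Dict[Tuple[str, str], str]


def reachable(start: str, alphabet: str, delta: Delta) -> Set[str]:
    seen, stack = {start}, [start]
    while stack:
        q = stack.pop()
        for a in alphabet:
            p = delta[(q, a)]
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def refine(states: Set[str], alphabet: str, delta: Delta, finals: Set[str]) -> List[Set[str]]:
    """Split blocks until states in a block agree on the blocks of all successors."""
    partition = [b for b in (states & finals, states - finals) if b]
    while True:
        block_of = {q: i for i, block in enumerate(partition) for q in block}
        groups: Dict[tuple, Set[str]] = {}
        for q in states:
            signature = (block_of[q],) + tuple(block_of[delta[(q, a)]] for a in alphabet)
            groups.setdefault(signature, set()).add(q)
        if len(groups) == len(partition):  # blocks only ever split, so this means stable
            return partition
        partition = list(groups.values())


def minimize(alphabet: str, delta: Delta, start: str, finals: Set[str]):
    states = reachable(start, alphabet, delta)
    partition = refine(states, alphabet, delta, finals)
    block_of: Dict[str, FrozenSet[str]] = {q: frozenset(b) for b in partition for q in b}
    new_delta = {(block_of[q], a): block_of[delta[(q, a)]] for q in states for a in alphabet}
    new_finals = {block_of[q] for q in states & finals}
    return set(block_of.values()), new_delta, block_of[start], new_finals


# DFA for a(a|b)* made total with a dead state qd; qx is unreachable.
DELTA: Delta = {
    ("q0", "a"): "q1", ("q0", "b"): "qd", ("q1", "a"): "q2", ("q1", "b"): "q2",
    ("q2", "a"): "q2", ("q2", "b"): "q2", ("qd", "a"): "qd", ("qd", "b"): "qd",
    ("qx", "a"): "q1", ("qx", "b"): "qx",
}
MIN_STATES, MIN_DELTA, MIN_START, MIN_FINALS = minimize("ab", DELTA, "q0", {"q1", "q2"})
print(sorted(sorted(block) for block in MIN_STATES))
```

On this input the states `q1` and `q2` end up in the same block, which corresponds to the smaller automaton shown for $a(a \mid b)^*$ earlier in the deck, plus the explicit dead state.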

  26. Minimizing DFA

      The naive algorithm tries to split every block of the partition at every iteration.
      – In the worst case it has quadratic complexity.
      – It is enough to consider only those blocks from which one can move to some block that has been split.

      Hopcroft's algorithm for finding the partition:
      – uses a work-list of split blocks that have not yet been examined;
      – if a block that is not in the work-list is split, then only one (the smaller) sub-block is put on the work-list.
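The work-list idea can be sketched as below. This follows the commonly presented form of Hopcroft's algorithm rather than anything spelled out on the slide, and it assumes a total DFA; the function name, the dictionary encoding of the transition function and the example automaton with a hypothetical dead state `qd` are illustrative assumptions.

```python
from typing import Dict, FrozenSet, Set, Tuple

Delta = Dict[Tuple[str, str], str]
Block = FrozenSet[str]


def hopcroft(states: Set[str], alphabet: str, delta: Delta, finals: Set[str]) -> Set[Block]:
    """Return the partition of `states` into classes of equivalent states."""
    final_block = frozenset(states & finals)
    other_block = frozenset(states - finals)
    partition: Set[Block] = {b for b in (final_block, other_block) if b}
    worklist: Set[Block] = set(partition)

    # inverse[(p, a)] = states with an a-transition into p
    inverse: Dict[Tuple[str, str], Set[str]] = {}
    for (q, a), p in delta.items():
        inverse.setdefault((p, a), set()).add(q)

    while worklist:
        splitter = worklist.pop()
        for a in alphabet:
            pre: Set[str] = set()
            for p in splitter:
                pre |= inverse.get((p, a), set())
            for block in list(partition):
                inside, outside = block & pre, block - pre
                if not inside or not outside:
                    continue  # the splitter does not distinguish states of this block
                partition.remove(block)
                partition |= {inside, outside}
                if block in worklist:
                    worklist.remove(block)
                    worklist |= {inside, outside}
                else:
                    # only the smaller half needs to be examined later
                    worklist.add(min(inside, outside, key=len))
    return partition


# Total DFA for a(a|b)* with an explicit dead state qd.
DELTA: Delta = {
    ("q0", "a"): "q1", ("q0", "b"): "qd", ("q1", "a"): "q2", ("q1", "b"): "q2",
    ("q2", "a"): "q2", ("q2", "b"): "q2", ("qd", "a"): "qd", ("qd", "b"): "qd",
}
PARTITION = hopcroft({"q0", "q1", "q2", "qd"}, "ab", DELTA, {"q1", "q2"})
print(sorted(sorted(block) for block in PARTITION))
```

Putting only the smaller sub-block on the work-list is what bounds the number of times each state is re-examined, which is the source of the improvement over the naive refinement.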
