CSE 115 Introduction to Computer Science I Road map Review - - PowerPoint PPT Presentation



slide-1
SLIDE 1

CSE 115

Introduction to Computer Science I

slide-2
SLIDE 2

Road map

▶︎ Review ◀ Exercises from last time Reading csv files exercise

slide-3
SLIDE 3

File reading

A text file is a sequence of characters. The contents can be read line by line:

As one sequence of characters:

  A bit of text\non several lines\n…

Line by line:

  A bit of text\n
  on several lines\n
  …

slide-4
SLIDE 4

File reading

File objects support iteration:

with open("Chapter1.txt") as f:
    for line in f:
        ...  # do something with each line
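A minimal sketch of this kind of iteration, assuming a small temporary file stands in for Chapter1.txt (the file name and contents below are made up for illustration). It shows that each line produced by the loop still ends with its newline character.

```python
import os
import tempfile

# Create a small text file to read (a stand-in for Chapter1.txt).
path = os.path.join(tempfile.mkdtemp(), "example.txt")
with open(path, "w") as f:
    f.write("A bit of text\non several lines\n")

# Iterating over the file object yields one line at a time,
# and each line keeps its trailing "\n".
lines = []
with open(path) as f:
    for line in f:
        lines.append(line)

print(lines)
```

Calling line.rstrip("\n") inside the loop is one way to drop the newline when it is not wanted.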

slide-5
SLIDE 5

Road map

Review ▶︎ Exercises from last time ◀ Reading csv files exercise

slide-6
SLIDE 6

Exercises

1. Define a function that takes a file name as an argument and returns a map with character counts for the file.

def countCharacters(filename):
    count = {}
    with open(filename) as f:  # Read data from file
        for line in f:
            for ch in line:
                if ch in count:
                    count[ch] = count[ch] + 1
                else:
                    count[ch] = 1
    return count

slide-7
SLIDE 7

Exercises

1. Define a function that takes a file name as an argument and returns a map with character counts for the file.

def countCharacters(filename):
    count = {}
    with open(filename) as f:
        for line in f:  # Process each line from file
            for ch in line:
                if ch in count:
                    count[ch] = count[ch] + 1
                else:
                    count[ch] = 1
    return count

slide-8
SLIDE 8

Exercises

1. Define a function that takes a file name as an argument and returns a map with character counts for the file.

def countCharacters(filename):
    count = {}
    with open(filename) as f:
        for line in f:
            for ch in line:  # Process each character from line
                if ch in count:
                    count[ch] = count[ch] + 1
                else:
                    count[ch] = 1
    return count

slide-9
SLIDE 9

Exercises

1. Define a function that takes a file name as an argument and returns a map with character counts for the file.

def countCharacters(filename):
    count = {}
    with open(filename) as f:
        for line in f:
            for ch in line:
                if ch in count:
                    # If we've seen a character before, increment its count
                    count[ch] = count[ch] + 1
                else:
                    # but the first time we see a character,
                    # enter it with a count of 1
                    count[ch] = 1
    return count

slide-10
SLIDE 10

Exercises

1. Define a function that takes a file name as an argument and returns a map with character counts for the file.

def countCharacters(filename):
    count = {}
    with open(filename) as f:
        for line in f:
            for ch in line:
                if ch in count:
                    count[ch] = count[ch] + 1
                else:
                    count[ch] = 1
    return count
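To try the function out, a short usage sketch follows. The sample file and its contents are hypothetical, created in a temporary directory only for this demonstration. Note that newline characters are counted like any other character.

```python
import os
import tempfile

def countCharacters(filename):
    count = {}
    with open(filename) as f:
        for line in f:
            for ch in line:
                if ch in count:
                    count[ch] = count[ch] + 1
                else:
                    count[ch] = 1
    return count

# Hypothetical sample file, created only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w") as f:
    f.write("aab\nb\n")

counts = countCharacters(path)
print(counts)
```

For this sample the result has three keys: 'a', 'b' and the newline character, each with a count of 2.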

slide-11
SLIDE 11

Exercises

2. Define a function that takes a file name as an argument and returns a map with word counts for the file.

Q: What counts as a word?
Anything consisting of uppercase letters A-Z, lowercase letters a-z, and the single quote '. This means that anything that is not A-Z or a-z or ' must come between words.

Q: How do we segment a string into words?
We can use a library called re, which is a regular expression library. The relevant regular expression to split a string into words is

[^A-Za-z']+

slide-12
SLIDE 12

Exercises

2. Define a function that takes a file name as an argument and returns a map with word counts for the file.

import re  # import regular expression library

def countWords(filename):
    count = {}
    with open(filename) as f:  # Read data from file
        for line in f:
            wordList = re.split("[^a-zA-Z']+", line)
            for word in wordList:
                if word in count:
                    count[word] = count[word] + 1
                else:
                    count[word] = 1
    return count

slide-13
SLIDE 13

Exercises

2. Define a function that takes a file name as an argument and returns a map with word counts for the file.

import re

def countWords(filename):
    count = {}
    with open(filename) as f:
        for line in f:  # Process each line from file
            wordList = re.split("[^a-zA-Z']+", line)
            for word in wordList:
                if word in count:
                    count[word] = count[word] + 1
                else:
                    count[word] = 1
    return count

slide-14
SLIDE 14

Exercises

2. Define a function that takes a file name as an argument and returns a map with word counts for the file.

import re

def countWords(filename):
    count = {}
    with open(filename) as f:
        for line in f:
            wordList = re.split("[^a-zA-Z']+", line)
            for word in wordList:  # Process each word from line
                if word in count:
                    count[word] = count[word] + 1
                else:
                    count[word] = 1
    return count

slide-15
SLIDE 15

Exercises

2. Define a function that takes a file name as an argument and returns a map with word counts for the file.

import re

def countWords(filename):
    count = {}
    with open(filename) as f:
        for line in f:
            wordList = re.split("[^a-zA-Z']+", line)  # Break line into words
            for word in wordList:  # Process each word from line
                if word in count:
                    count[word] = count[word] + 1
                else:
                    count[word] = 1
    return count

slide-16
SLIDE 16

Regular expressions

Regular expressions are used to match patterns. We will use a regular expression library to split each line from the file into words in a reasonable way. Q: What counts as a word? Anything consisting of uppercase letters A-Z, lowercase letters a-z, and the single quote '. This means that anything that is not A-Z or a-z or ' must come between words.

slide-17
SLIDE 17

Regular expressions

This regular expression will break a string into parts at character sequences which are not letters or the single quote (apostrophe):

Sally's new puppy is named Rover. Rover's tail was wagging. Rover was happy!
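A small sketch of the split applied to the example sentence, using re.split from Python's standard library with the pattern from the exercise. The variable names text and words are only illustrative.

```python
import re

text = "Sally's new puppy is named Rover. Rover's tail was wagging. Rover was happy!"

# Split wherever one or more characters appear that are
# neither letters nor the single quote.
words = re.split("[^a-zA-Z']+", text)

print(words)
```

Because the text ends with a separator (the exclamation mark), the last element of the resulting list is an empty string.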

slide-18
SLIDE 18

Exercises

2. Define a function that takes a file name as an argument and returns a map with word counts for the file.

import re

def countWords(filename):
    count = {}
    with open(filename) as f:
        for line in f:
            # [^a-zA-Z']  Any character that's not a letter or the single quote
            # +           One or more such characters
            wordList = re.split("[^a-zA-Z']+", line)
            for word in wordList:  # Process each word from wordList
                if word in count:
                    count[word] = count[word] + 1
                else:
                    count[word] = 1
    return count

slide-19
SLIDE 19

Exercises

2. Define a function that takes a file name as an argument and returns a map with word counts for the file.

import re

def countWords(filename):
    count = {}
    with open(filename) as f:
        for line in f:
            wordList = re.split("[^a-zA-Z']+", line)
            for word in wordList:
                if word in count:
                    # If we've seen a word before, increment its count
                    count[word] = count[word] + 1
                else:
                    # but the first time we see a word,
                    # enter it with a count of 1
                    count[word] = 1
    return count

slide-20
SLIDE 20

Exercises

2. Define a function that takes a file name as an argument and returns a map with word counts for the file.

import re

def countWords(filename):
    count = {}
    with open(filename) as f:
        for line in f:
            wordList = re.split("[^a-zA-Z']+", line)
            for word in wordList:
                if word in count:
                    count[word] = count[word] + 1
                else:
                    count[word] = 1
    return count
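One thing to watch for when running countWords: every line read from a file ends with "\n", which is itself a separator, so re.split returns an empty string after the last word and that empty string gets counted as a word. The variant below is a sketch, not the version from the slides. The name countWordsSkippingEmpty and the sample file are made up here, and skipping empty strings is an assumption about the intended behavior.

```python
import os
import re
import tempfile

def countWordsSkippingEmpty(filename):
    count = {}
    with open(filename) as f:
        for line in f:
            for word in re.split("[^a-zA-Z']+", line):
                if word == "":
                    # Ignore the empty strings produced at the edges of a line.
                    continue
                if word in count:
                    count[word] = count[word] + 1
                else:
                    count[word] = 1
    return count

# Hypothetical sample file, created only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w") as f:
    f.write("Rover was happy!\nRover's tail\n")

wordCounts = countWordsSkippingEmpty(path)
print(wordCounts)
```

With the filter in place the result contains only real words, and "Rover" and "Rover's" are counted separately because the single quote is treated as part of a word.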

slide-21
SLIDE 21

Road map

Review Exercises from last time ▶︎ Reading csv files ◀ exercise

slide-22
SLIDE 22

csv files

Comma-separated values

In computing, a comma-separated values (CSV) file is a delimited text file that uses a comma to separate values. A CSV file stores tabular data (numbers and text) in plain text. Each line of the file is a data record. Each record consists of one or more fields, separated by commas. The use of the comma as a field separator is the source of the name for this file format.
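As a small illustration of records and fields, the sketch below feeds two records to csv.reader from Python's standard library, which accepts any iterable of strings. The second record is invented here to show that a quoted field may itself contain a comma, which is one reason to use the csv library rather than splitting on commas by hand.

```python
import csv

records = [
    "January,200,190",      # three plain fields
    '"Smith, John",42',     # made-up record with a quoted field
]

# Each record becomes a list of its fields.
rows = list(csv.reader(records))

print(rows)
```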

Excerpt from https://en.wikipedia.org/wiki/Comma-separated_values

slide-23
SLIDE 23

csv files

A csv file is a plain text file that contains rows of data, one row per line, with data elements separated by commas on each line. For example:

Heating.csv

Month,Budget,Actual
January,200,190
February,200,210
March,150,185
April,100,110
May,50,40
June,50,15
July,50,12
August,50,14
September,50,35
October,100,78
November,150,125
December,200,167

slide-24
SLIDE 24

csv files

Heating.csv

Month,Budget,Actual
January,200,190
February,200,210
March,150,185
April,100,110
May,50,40
June,50,15
July,50,12
August,50,14
September,50,35
October,100,78
November,150,125
December,200,167

A csv file can be read from and written to by different applications, such as Excel (left) and Numbers (right).

slide-25
SLIDE 25

Reading csv files

Let's write a program to read the data in our csv file into a dictionary. We'll use the month as a key, and put the rest of the data into a list. For example:

{'Month': ['Budget', 'Actual'],
 'January': ['200', '190'],
 'February': ['200', '210'],
 'March': ['150', '185'],
 'April': ['100', '110'],
 'May': ['50', '40'],
 'June': ['50', '15'],
 'July': ['50', '12'],
 'August': ['50', '14'],
 'September': ['50', '35'],
 'October': ['100', '78'],
 'November': ['150', '125'],
 'December': ['200', '167']}

slide-26
SLIDE 26

Reading csv files

import csv  # import csv library

def readBudget(filename):
    budget = {}
    with open(filename, newline='') as f:  # Read data from file
        reader = csv.reader(f)
        for line in reader:
            month = line[0]
            line.pop(0)
            budget[month] = line
    return budget

slide-27
SLIDE 27

Reading csv files

import csv

def readBudget(filename):
    budget = {}
    # newline='': documentation says this is needed when reading csv files
    with open(filename, newline='') as f:
        reader = csv.reader(f)
        for line in reader:  # Process each line from file
            month = line[0]
            line.pop(0)
            budget[month] = line
    return budget

slide-28
SLIDE 28

Reading csv files

import csv

def readBudget(filename):
    budget = {}
    with open(filename, newline='') as f:
        reader = csv.reader(f)
        for line in reader:
            # Process data from line: a list of the comma separated values
            month = line[0]
            line.pop(0)
            budget[month] = line
    return budget

slide-29
SLIDE 29

Reading csv files

Class came up with this approach:

import csv

def readBudget(filename):
    budget = {}
    with open(filename, newline='') as f:
        reader = csv.reader(f)
        for line in reader:
            key = line[0]
            value = [line[1], line[2]]
            budget[key] = value
    return budget

slide-30
SLIDE 30

Reading csv files

…as well as the approach I thought of:

import csv

def readBudget(filename):
    budget = {}
    with open(filename, newline='') as f:
        reader = csv.reader(f)
        for line in reader:
            month = line[0]
            line.pop(0)
            budget[month] = line
    return budget

slide-31
SLIDE 31

Reading csv files

import csv

def readBudget(filename):
    budget = {}
    with open(filename, newline='') as f:
        reader = csv.reader(f)
        # line is a list of comma separated values,
        # as in: [ 'July', '50', '12' ]
        for line in reader:
            month = line[0]
            line.pop(0)
            budget[month] = line
    return budget

slide-32
SLIDE 32

Reading csv files

import csv

def readBudget(filename):
    budget = {}
    with open(filename, newline='') as f:
        reader = csv.reader(f)
        for line in reader:
            month = line[0]  # month is first item in that list
            line.pop(0)
            budget[month] = line
    return budget

slide-33
SLIDE 33

Reading csv files

import csv

def readBudget(filename):
    budget = {}
    with open(filename, newline='') as f:
        reader = csv.reader(f)
        for line in reader:
            month = line[0]
            # remove first item from line…
            # leaving the rest of the data in line
            line.pop(0)
            budget[month] = line
    return budget

slide-34
SLIDE 34

Reading csv files

import csv

def readBudget(filename):
    budget = {}
    with open(filename, newline='') as f:
        reader = csv.reader(f)
        for line in reader:
            month = line[0]
            line.pop(0)
            budget[month] = line  # Add the key-value pair to the dictionary
    return budget

slide-35
SLIDE 35

Reading csv files

The complete function

import csv

def readBudget(filename):
    budget = {}
    with open(filename, newline='') as f:
        reader = csv.reader(f)
        for line in reader:
            month = line[0]
            line.pop(0)
            budget[month] = line
    return budget
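A usage sketch for readBudget follows. It writes the first few rows of Heating.csv to a temporary directory (the location is an assumption made so the example is self-contained) and reads them back. All values come back as strings, and the header row is stored under the key 'Month'.

```python
import csv
import os
import tempfile

def readBudget(filename):
    budget = {}
    with open(filename, newline='') as f:
        reader = csv.reader(f)
        for line in reader:
            month = line[0]
            line.pop(0)
            budget[month] = line
    return budget

# Write a shortened copy of Heating.csv to a temporary location.
path = os.path.join(tempfile.mkdtemp(), "Heating.csv")
with open(path, "w", newline='') as f:
    f.write("Month,Budget,Actual\nJanuary,200,190\nFebruary,200,210\n")

budget = readBudget(path)
print(budget)
```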

slide-36
SLIDE 36

Road map

Review Exercises from last time Reading csv files ▶︎ exercise ◀

slide-37
SLIDE 37

Exercises

1. Define a function 'overspent' which takes a dictionary like the one the readBudget function returns, and returns a dictionary of the months in which expenditures were over the budget, along with the difference (as a negative value).

2. Define a function 'underspent' which takes a dictionary like the one the readBudget function returns, and returns a dictionary of the months in which expenditures were under budget, along with the difference (as a positive value).
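One possible starting point for these exercises is sketched below. It is not a reference solution, and it rests on a few assumptions: the difference is computed as budget minus actual, the header entry under the key 'Month' is skipped, and months that are exactly on budget appear in neither result. The helper name difference is made up for this sketch.

```python
def difference(values):
    # values is a list like ['200', '190']: budget first, then actual.
    return int(values[0]) - int(values[1])

def overspent(budget):
    result = {}
    for month in budget:
        if month == "Month":
            continue  # skip the header row (assumption)
        diff = difference(budget[month])
        if diff < 0:  # actual exceeded budget
            result[month] = diff
    return result

def underspent(budget):
    result = {}
    for month in budget:
        if month == "Month":
            continue  # skip the header row (assumption)
        diff = difference(budget[month])
        if diff > 0:  # actual was below budget
            result[month] = diff
    return result

sample = {
    "Month": ["Budget", "Actual"],
    "January": ["200", "190"],
    "February": ["200", "210"],
    "March": ["150", "185"],
}

over = overspent(sample)
under = underspent(sample)
print(over)
print(under)
```

The values must be converted with int because readBudget stores every field as a string.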