 
INF1100 Lectures, Chapter 6: Files, Strings, and Dictionaries

Hans Petter Langtangen
Simula Research Laboratory
University of Oslo, Dept. of Informatics

Reading data from a file

A file is a sequence of characters (text)
We can read text in the file into strings in a program
This is a common way for a program to get input data
Basic recipe:

    infile = open('myfile.dat', 'r')

    # read next line:
    line = infile.readline()

    # read lines one by one:
    for line in infile:
        <process line>

    # load all lines into a list of strings (lines):
    lines = infile.readlines()
    for line in lines:
        <process line>

Example: reading a file with numbers (part 1)

The file data1.txt has a column of numbers:

    21.8
    18.1
    19
    23
    26
    17.8

Goal: compute the average value of the numbers:

    infile = open('data1.txt', 'r')
    lines = infile.readlines()
    infile.close()
    mean = 0
    for number in lines:
        mean = mean + number
    mean = mean/len(lines)
    print mean

Running the program gives an error message:

    TypeError: unsupported operand type(s) for +: 'int' and 'str'

Problem: number is a string!

Example: reading a file with numbers (part 2)

We must convert strings to numbers before computing:

    infile = open('data1.txt', 'r')
    lines = infile.readlines()
    infile.close()
    mean = 0
    for line in lines:
        number = float(line)
        mean = mean + number
    mean = mean/len(lines)
    print mean

A quicker and shorter variant:

    infile = open('data1.txt', 'r')
    numbers = [float(line) for line in infile.readlines()]
    infile.close()
    mean = sum(numbers)/len(numbers)
    print mean
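The data1.txt example above can also be sketched in Python 3 with a with statement, which closes the file automatically. This is only an illustration under stated assumptions: the function name read_mean and the temporary copy of data1.txt are not from the slides, and blank lines are skipped as an extra precaution.

```python
import os
import tempfile


def read_mean(filename):
    """Return the mean of the numbers in a one-column text file."""
    # The with statement closes the file even if float() raises an error.
    with open(filename, 'r') as infile:
        # Skip blank lines (an assumption; the slide file has none).
        numbers = [float(line) for line in infile if line.strip()]
    return sum(numbers) / len(numbers)


# Recreate the data1.txt column from the slides in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'data1.txt')
with open(path, 'w') as outfile:
    outfile.write('21.8\n18.1\n19\n23\n26\n17.8\n')

mean = read_mean(path)
print(mean)
```

The six numbers sum to 125.7, so the printed mean should be close to 20.95 (up to floating-point rounding).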
While loop over lines in a file

Especially older Python programs employ this technique:

    infile = open('data1.txt', 'r')
    mean = 0
    n = 0
    while True:                  # loop "forever"
        line = infile.readline()
        if not line:             # line='' at end of file
            break                # jump out of loop

        mean += float(line)
        n += 1
    infile.close()
    mean = mean/float(n)
    print mean

Experiment with reading techniques

    >>> infile = open('data1.txt', 'r')
    >>> fstr = infile.read()      # read file into a string
    >>> fstr
    '21.8\n18.1\n19\n23\n26\n17.8\n'
    >>> line = infile.readline()  # read after end of file...
    >>> line
    ''
    >>> bool(line)                # test if line:
    False                         # empty object is False
    >>> infile.close(); infile = open('data1.txt', 'r')
    >>> lines = infile.readlines()
    >>> lines
    ['21.8\n', '18.1\n', '19\n', '23\n', '26\n', '17.8\n']
    >>> infile.close(); infile = open('data1.txt', 'r')
    >>> for line in infile: print line,
    ...
    21.8
    18.1
    19
    23
    26
    17.8

Reading a mixture of text and numbers (part 1)

The file rainfall.dat looks like this:

    Average rainfall (in mm) in Rome: 1188 months between 1782 and 1970
    Jan 81.2
    Feb 63.2
    Mar 70.3
    Apr 55.7
    May 53.0
    ...

Goal: read the numbers and compute the mean
Technique: for each line, split the line into words, convert the 2nd word to a number and add to sum

    for line in infile:
        words = line.split()     # list of words on the line
        number = float(words[1])

Note line.split(): very useful for grabbing individual words on a line, can split wrt any string, e.g., line.split(';'), line.split(':')

Reading a mixture of text and numbers (part 2)

The complete program:

    def extract_data(filename):
        infile = open(filename, 'r')
        infile.readline()        # skip the first line
        numbers = []
        for line in infile:
            words = line.split()
            number = float(words[1])
            numbers.append(number)
        infile.close()
        return numbers

    values = extract_data('rainfall.dat')
    from scitools.std import plot
    month_indices = range(1, 13)
    plot(month_indices, values[:-1], 'o2')
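The split technique can be tried without scitools. The sketch below is written for Python 3 and makes some assumptions that are not in the slides: it returns the labels together with the numbers, it skips incomplete lines, and it works on a temporary sample file that holds only the header and the five data lines visible above (the rest of rainfall.dat is elided there and is not guessed here).

```python
import os
import tempfile


def extract_data(filename):
    """Return (labels, numbers) from a file with one header line
    followed by lines of the form '<label> <number>'."""
    labels, numbers = [], []
    with open(filename, 'r') as infile:
        infile.readline()              # skip the header line
        for line in infile:
            words = line.split()       # split on any whitespace
            if len(words) < 2:         # ignore blank or incomplete lines
                continue
            labels.append(words[0])
            numbers.append(float(words[1]))
    return labels, numbers


# Only the lines shown on the slide; the file name is hypothetical.
sample = (
    'Average rainfall (in mm) in Rome: 1188 months between 1782 and 1970\n'
    'Jan 81.2\nFeb 63.2\nMar 70.3\nApr 55.7\nMay 53.0\n'
)
path = os.path.join(tempfile.mkdtemp(), 'rainfall_sample.dat')
with open(path, 'w') as outfile:
    outfile.write(sample)

months, values = extract_data(path)
mean = sum(values) / len(values)
print(months, mean)

# split() also accepts an explicit delimiter (made-up example string).
fields = 'Jan;81.2;mm'.split(';')
print(fields)
```

Returning the labels alongside the numbers is a design choice for this sketch; it makes it easy to check which line each value came from.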
What is a file?

A file is a sequence of characters
For simple text files, each character is one byte (= 8 bits, a bit is 0 or 1), which gives $2^8 = 256$ different characters
(Text files in, e.g., Chinese and Japanese need several bytes for each character)
Save the text "ABCD" to file in Emacs and OpenOffice/Word and examine the file
In Emacs, the file size is 4 bytes

Dictionaries

Lists and arrays are fine for collecting a bunch of objects in a single object
Lists and arrays use an integer index, starting at 0, for reaching the elements
For many applications the integer index is "unnatural": a general text (or an integer not restricted to start at 0) will ease programming
Dictionaries meet this need
Dictionary = list with text (or any constant object) as index
Other languages use names like hash, HashMap and associative array for what is known as dictionary in Python

Example on a dictionary

Suppose we need to store the temperatures in Oslo, London and Paris
List solution:

    temps = [13, 15.4, 17.5]
    # temps[0]: Oslo
    # temps[1]: London
    # temps[2]: Paris

We need to remember the mapping between the index and the city name; with a dictionary we can index the list with the city name directly (e.g., temps["Oslo"]):

    temps = {'Oslo': 13, 'London': 15.4, 'Paris': 17.5}
    # or
    temps = dict(Oslo=13, London=15.4, Paris=17.5)
    # application:
    print 'The temperature in London is', temps['London']

Dictionary operations (part 1)

Add a new element to a dict (dict = dictionary):

    >>> temps['Madrid'] = 26.0
    >>> print temps
    {'Oslo': 13, 'London': 15.4, 'Paris': 17.5,
     'Madrid': 26.0}

Loop (iterate) over a dict:

    >>> for city in temps:
    ...     print 'The temperature in %s is %g' % \
    ...           (city, temps[city])
    ...
    The temperature in Paris is 17.5
    The temperature in Oslo is 13
    The temperature in London is 15.4
    The temperature in Madrid is 26

The index in a dictionary is called key (a dictionary holds key-value pairs)

    for key in dictionary:
        value = dictionary[key]
        print value
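For readers on Python 3, the same dictionary operations might look as follows. This is a sketch and not part of the original slides: print is a function in Python 3, and the items() and get() methods used here are standard dict methods that the slides do not mention. The names report and berlin are introduced only for illustration.

```python
temps = {'Oslo': 13, 'London': 15.4, 'Paris': 17.5}
temps['Madrid'] = 26.0              # add a new key-value pair

# items() yields (key, value) pairs, so no second lookup is needed.
report = ['The temperature in %s is %g' % (city, temp)
          for city, temp in temps.items()]
for line in report:
    print(line)

# get() returns None (or a chosen default) for a missing key
# instead of raising KeyError.
berlin = temps.get('Berlin')
print(berlin)
```

In Python 3.7 and later a dict keeps insertion order, so the lines come out as Oslo, London, Paris, Madrid; the slide output shows the arbitrary order of the older Python version used there.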
Dictionary operations (part 2)

Does the dict have a particular key?

    >>> if 'Berlin' in temps:
    ...     print 'Berlin:', temps['Berlin']
    ... else:
    ...     print 'No temperature data for Berlin'
    ...
    No temperature data for Berlin
    >>> 'Oslo' in temps     # standard boolean expression
    True

The keys and values can be reached as lists:

    >>> temps.keys()
    ['Paris', 'Oslo', 'London', 'Madrid']
    >>> temps.values()
    [17.5, 13, 15.4, 26.0]

Note: the sequence of keys is arbitrary! Never rely on it; if you need a specific order of the keys, use a sort:

    for key in sorted(temps):
        value = temps[key]
        print value

Dictionary operations (part 3)

More operations:

    >>> del temps['Oslo']   # remove Oslo key w/value
    >>> temps
    {'Paris': 17.5, 'London': 15.4, 'Madrid': 26.0}
    >>> len(temps)          # no of key-value pairs in dict.
    3

Two variables can refer to the same dictionary:

    >>> t1 = temps
    >>> t1['Stockholm'] = 10.0    # change t1
    >>> temps                     # temps is also changed
    {'Stockholm': 10.0, 'Paris': 17.5, 'London': 15.4,
     'Madrid': 26.0}
    >>> t2 = temps.copy()         # take a copy
    >>> t2['Paris'] = 16
    >>> t1['Paris']
    17.5

Examples: polynomials represented by dictionaries

The polynomial

    $p(x) = -1 + x^2 + 3x^7$

can be represented by a dict with power as key and coefficient as value:

    p = {0: -1, 2: 1, 7: 3}

Evaluate polynomials represented as dictionaries: $\sum_{i \in I} c_i x^i$

    def poly1(data, x):
        sum = 0.0
        for power in data:
            sum += data[power]*x**power
        return sum

Shorter:

    def poly1(data, x):
        return sum([data[p]*x**p for p in data])

Lists as dictionaries

A list can also represent a polynomial
The list index must correspond to the power
$-1 + x^2 + 3x^7$ becomes

    p = [-1, 0, 1, 0, 0, 0, 0, 3]

Must store all zero coefficients, think about $1 + x^{100}$ ...
Evaluating the polynomial at a given x value: $\sum_{i=0}^{N} c_i x^i$

    def poly2(data, x):
        sum = 0
        for power in range(len(data)):
            sum += data[power]*x**power
        return sum
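The two representations can be compared directly. The following Python 3 sketch is an illustration, not the slides' own code: the function names poly_dict and poly_list are hypothetical, and enumerate and items() replace the explicit index lookups used in poly1 and poly2.

```python
def poly_dict(coeffs, x):
    """Evaluate the sum of c_i * x**i for a {power: coefficient} dict."""
    return sum(c * x**power for power, c in coeffs.items())


def poly_list(coeffs, x):
    """Evaluate the sum of c_i * x**i for a list indexed by power."""
    return sum(c * x**power for power, c in enumerate(coeffs))


p_dict = {0: -1, 2: 1, 7: 3}
p_list = [-1, 0, 1, 0, 0, 0, 0, 3]

x = 2
value_dict = poly_dict(p_dict, x)   # -1 + 2**2 + 3*2**7 = 387
value_list = poly_list(p_list, x)   # same polynomial, same value
print(value_dict, value_list)

# 1 + x**100 needs only two dict entries, but a list of 101 coefficients.
sparse = {0: 1, 100: 1}
dense = [1] + [0] * 99 + [1]
print(len(sparse), len(dense))
```

Both calls give 387 at x = 2, and the sparse example shows why the dictionary form is attractive when most coefficients are zero.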