Week 13 - Monday What did we talk about last time? Bit fields - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Week 13 - Monday

slide-2
SLIDE 2

• What did we talk about last time?
• Bit fields
• Unions

slide-3
SLIDE 3
slide-4
SLIDE 4
slide-5
SLIDE 5

"Programs must be written for people to read and only incidentally for machines to execute."

Harold Abelson and Gerald Jay Sussman, authors of Structure and Interpretation of Computer Programs

slide-6
SLIDE 6
slide-7
SLIDE 7

• You just learned how to read and write files
  • Why are we going to do it again?
• There is a set of Unix/Linux system calls that do the same thing
• Most of the higher-level calls (fopen(), fprintf(), fgetc(), and even trusty printf()) are built on top of these low-level I/O calls
• These give you direct access to the file system (including pipes)
• They are often more efficient
• You'll use the low-level file style for networking
• All low-level I/O is binary

slide-8
SLIDE 8

• To use low-level I/O functions, include headers as follows:

#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

• You won't need all of these for every program, but you might as well throw them all in

slide-9
SLIDE 9

• High-level file I/O uses a FILE* variable for referring to a file
• Low-level I/O uses an int value called a file descriptor
• These are small, nonnegative integers
• Each process has its own set of file descriptors
• Even the standard I/O streams have descriptors

Stream   Descriptor   Defined Constant
stdin    0            STDIN_FILENO
stdout   1            STDOUT_FILENO
stderr   2            STDERR_FILENO

slide-10
SLIDE 10

• To open a file for reading or writing, use the open() function
  • There used to be a creat() function that was used to create new files, but it's now obsolete
• The open() function takes the file name, an int of flags that includes the access mode, and an (optional) int for permissions
• It returns a file descriptor, or -1 on failure

int fd = open("input.dat", O_RDONLY);

slide-11
SLIDE 11

• The main modes are
  • O_RDONLY: Open the file for reading only
  • O_WRONLY: Open the file for writing only
  • O_RDWR: Open the file for both reading and writing
• There are many other optional flags that can be combined with the main modes
• A few are
  • O_CREAT: Create file if it doesn't already exist
  • O_DIRECTORY: Fail if pathname is not a directory
  • O_TRUNC: Truncate existing file to zero length
  • O_APPEND: Writes are always to the end of the file
• These flags can be combined with the main modes (and each other) using bitwise OR

int fd = open("output.dat", O_WRONLY | O_CREAT | O_APPEND, 0644); // O_CREAT requires a third argument giving the permissions

slide-12
SLIDE 12

• Because this is Linux, we can also specify the permissions for a file we create
• The last value passed to open() can be any of the following permission flags bitwise ORed together
  • S_IRUSR: User read
  • S_IWUSR: User write
  • S_IXUSR: User execute
  • S_IRGRP: Group read
  • S_IWGRP: Group write
  • S_IXGRP: Group execute
  • S_IROTH: Other read
  • S_IWOTH: Other write
  • S_IXOTH: Other execute

int fd = open("output.dat", O_WRONLY | O_CREAT | O_APPEND, S_IRUSR | S_IRGRP );

slide-13
SLIDE 13

• Opening the file is actually the hardest part
• Reading is straightforward with the read() function
• Its arguments are
  • The file descriptor
  • A pointer to the memory to read into
  • The number of bytes to read
• Its return value is the number of bytes successfully read (0 at end of file, -1 on error)

int fd = open("input.dat", O_RDONLY);
int buffer[100];
// Fill the buffer with up to 100 ints from the file
read( fd, buffer, sizeof(int)*100 );

slide-14
SLIDE 14

• Writing to a file is almost the same as reading
• Arguments to the write() function are
  • The file descriptor
  • A pointer to the memory to write from
  • The number of bytes to write
• Its return value is the number of bytes successfully written (which can be fewer than requested), or -1 on error

int fd = open("output.dat", O_WRONLY); // The file must already exist, since O_CREAT is not given
int buffer[100];
int i = 0;
for( i = 0; i < 100; i++ )
    buffer[i] = i + 1;
write( fd, buffer, sizeof(int)*100 );

slide-15
SLIDE 15

• To close a file descriptor, call the close() function
• Like always, it's a good idea to close files when you're done with them

int fd = open("output.dat", O_WRONLY);
// Write some stuff
close( fd );

slide-16
SLIDE 16

• It's possible to seek with low-level I/O using the lseek() function
• Its arguments are
  • The file descriptor
  • The offset
  • Location to seek from: SEEK_SET, SEEK_CUR, or SEEK_END

int fd = open("input.dat", O_RDONLY);
lseek( fd, 100, SEEK_SET );

slide-17
SLIDE 17

• Use low-level I/O to write a hex dump program
• Print out the bytes in a program, 16 at a time, in hex, along with the current offset in the file, also in hex
• Sample output:

0x000000 : 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
0x000010 : 02 00 03 00 01 00 00 00 c0 83 04 08 34 00 00 00
0x000020 : e8 23 00 00 00 00 00 00 34 00 20 00 06 00 28 00
0x000030 : 1d 00 1a 00 06 00 00 00 34 00 00 00 34 80 04 08

slide-18
SLIDE 18

• A file descriptor is not necessarily the only one that refers to its open file
  • Not even in the same process
• It's possible to duplicate file descriptors
  • Thus, output written to either descriptor goes to the same place
  • Input is similar
slide-19
SLIDE 19

• stderr usually prints to the screen, even if stdout is being redirected to a file
• What if you want stderr to get printed to that file as well?
• You can also redirect only stderr to a file

./program > output.txt
./program > output.txt 2>&1
./program 2> errors.log

slide-20
SLIDE 20

• If you want a new file descriptor number that refers to the same open file as an existing descriptor, you can use the dup() function
• It's often useful to change an existing file descriptor to refer to another stream, which you can do with dup2()
• After the dup2() call below, all writes to stderr will go wherever stdout goes

int fd = dup(1); // Makes a copy of stdout
dup2(1, 2); // Makes 2 (stderr) a copy of 1 (stdout)

slide-21
SLIDE 21

• Reading from and writing to files on a hard drive is expensive
• These operations are buffered by the operating system so that one big read or write happens instead of lots of little ones
  • If another program is reading from a file you've written to, it reads from the buffer, not the old file
• Even so, it is more efficient for your code to write larger amounts of data in one pass
  • Each system call has overhead
slide-22
SLIDE 22

• To avoid having too many system calls, stdio adds a second kind of buffering inside your program
  • This is an advantage of stdio functions rather than using low-level read() and write() directly
• The default buffer size is BUFSIZ (8192 bytes with glibc)
• The setvbuf(), setbuf(), and setbuffer() functions let you specify your own buffer

slide-23
SLIDE 23

• Stdio output buffers are generally flushed (sent to the system) when they get full or, for output to a terminal, when they hit a newline ('\n')
  • When debugging code that can crash, make sure you put a newline in your printf(), otherwise you might not see the output before the crash
• There is an fflush() function that can flush stdio buffers

fflush(stdout); // Flushes stdout; could be any FILE*
fflush(NULL); // Flushes all buffers

slide-24
SLIDE 24
slide-25
SLIDE 25

• You can build layers of I/O on top of other layers
  • printf() is built on top of the low-level write() call
• The standard networking model is called the Open Systems Interconnection Reference Model
  • Also called the OSI model
  • Or the 7 layer model
slide-26
SLIDE 26

• There are many different communication protocols
• The OSI reference model is an idealized model of how different parts of communication can be abstracted into 7 layers
• Imagine that each layer is talking to another parallel layer called a peer on another computer
• Only the physical layer is a real connection between the two

Application
Presentation
Session
Transport
Network
Data Link
Physical

slide-27
SLIDE 27

• Not every layer is always used
• Sometimes user errors are referred to as Layer 8 problems

Layer  Name          Mnemonic     Activity                                                          Example
7      Application   Away         User-level data                                                   HTTP
6      Presentation  Pretzels     Data appearance, some encryption                                  Unicode
5      Session       Salty        Sessions, sequencing, recovery                                    TLS
4      Transport     Throw        Flow control, end-to-end error detection                          TCP
3      Network       Not          Routing, blocking into packets                                    IP
2      Data Link     Dare         Data delivery, packets into frames, transmission error recovery   Ethernet
1      Physical      Programmers  Physical communication, bit transmission                          Electrons in copper

slide-28
SLIDE 28

• This is where the rubber meets the road
• The actual protocols for exchanging bits as electronic signals happen at the physical layer
• At this level are things like RJ45 jacks and rules for interpreting voltages sent over copper
  • Or light pulses over fiber
slide-29
SLIDE 29

• Ethernet is the most widely used example of the data link layer
• Machines at this layer are identified by a 48-bit Media Access Control (MAC) address
• The Address Resolution Protocol (ARP) can be used for one machine to ask another for its MAC address
  • Try the arp command (or ip neigh) in Linux to list the MAC addresses your machine has learned
• Some routers allow a MAC address to be spoofed, but MAC addresses are intended to be unique and unchanging for a particular piece of hardware

slide-30
SLIDE 30

• The most common network layer protocol is Internet Protocol (IP)
• Each computer connected to the Internet should have a unique IP address
  • IPv4 is 32 bits written as four numbers from 0 to 255, separated by dots
  • IPv6 is 128 bits written as 8 groups of 4 hexadecimal digits
• We can use traceroute to see the path of hosts leading to some IP address

slide-31
SLIDE 31

• There are two popular possibilities for the transport layer
• Transmission Control Protocol (TCP) provides reliability
  • Sequence numbers for out of order packets
  • Retransmission for packets that never arrive
• User Datagram Protocol (UDP) is simpler
  • Packets can arrive out of order or never show up
  • Many online games use UDP because speed is more important than reliability
slide-32
SLIDE 32

• The session layer isn't a key part of the TCP/IP model
• The secure sessions provided by TLS can be considered the session layer

slide-33
SLIDE 33

• The presentation layer is often optional
• It specifies how the data should appear
• This layer is responsible for character encoding (ASCII, UTF-8, etc.)
• MIME types are sometimes considered presentation layer issues
• Encryption and decryption can happen here

slide-34
SLIDE 34

• The application layer is where the data is interpreted and used
• HTTP is an example of an application layer protocol
• A web browser takes the information delivered via HTTP and renders it
• Code you write deals a great deal with the application layer

slide-35
SLIDE 35

• The goal of the OSI model is to make lower layers transparent to upper ones

[Figure: the seven-layer stacks of two hosts side by side (Application, Presentation, Session, Transport, Network, Data Link, Physical). Going down the stack, the data grows from Payload, to UDP + Payload at the transport layer, to IP + UDP + Payload at the network layer, to MAC + IP + UDP + Payload at the data link layer.]

slide-36
SLIDE 36

• Seven layers is a lot to remember
• Mnemonics have been developed to help

Layer         Mnemonic 1  Mnemonic 2  Mnemonic 3    Mnemonic 4
Application   All         All         A             Away
Presentation  Pros        People      Powered-Down  Pretzels
Session       Search      Seem        System        Salty
Transport     Top         To          Transmits     Throw
Network       Notch       Need        No            Not
Data Link     Donut       Data        Data          Dare
Physical      Places      Processing  Packets       Programmers

slide-37
SLIDE 37

• The OSI model is sort of a sham
  • It was invented after the Internet was already in use
  • You don't need all layers
  • Some people think this categorization is not useful
• Most network communication uses TCP/IP
• We can view TCP/IP as four layers:

Layer        Action                        Responsibilities                           Protocol
Application  Prepare messages              User interaction                           HTTP, FTP, etc.
Transport    Convert messages to packets   Sequencing, reliability, error correction  TCP or UDP
Internet     Convert packets to datagrams  Flow control, routing                      IP
Physical     Transmit datagrams as bits    Data communication

slide-38
SLIDE 38

• A TCP/IP connection between two hosts (computers) is defined by four things
  • Source IP
  • Source port
  • Destination IP
  • Destination port
• One machine can be connected to many other machines, but the port numbers keep the connections straight

slide-39
SLIDE 39

• Certain kinds of network communication are usually done on specific ports
  • 20 and 21: File Transfer Protocol (FTP)
  • 22: Secure Shell (SSH)
  • 23: Telnet
  • 25: Simple Mail Transfer Protocol (SMTP)
  • 53: Domain Name System (DNS) service
  • 80: Hypertext Transfer Protocol (HTTP)
  • 110: Post Office Protocol (POP3)
  • 443: HTTP Secure (HTTPS)

slide-40
SLIDE 40

• Computers on the Internet have addresses, not names
• Google.com is actually reached at an address such as 74.125.67.100
• Google.com is called a domain
• The Domain Name System or DNS turns the name into an address

slide-41
SLIDE 41

• Old-style IP addresses are in this form:
  • 74.125.67.100
• 4 numbers between 0 and 255, separated by dots
• That's a total of 256^4 = 4,294,967,296 addresses
• But there are 7 billion people on earth…

slide-42
SLIDE 42

• IPv6 addresses are the new IP addresses that are beginning to be used by modern hardware
  • 8 groups of 4 hexadecimal digits each
  • 2001:0db8:85a3:0000:0000:8a2e:0370:7334
  • 1 hexadecimal digit has 16 possibilities
  • How many different addresses is this?
  • 16^32 = 2^128 ≈ 3.4×10^38 is enough to have 500 trillion addresses for every cell of every person's body on Earth
  • Will it be enough?!
slide-43
SLIDE 43
slide-44
SLIDE 44

• More on networking
• Sockets

slide-45
SLIDE 45

• Work on Project 5
• Read LPI chapters 13, 14, and 15