CSC 2400: Computer Systems
Data Representation
Computers and Programs
- A computer is basically a processor (CPU) interacting with memory
- Your program (executable) must first be loaded into memory before it can start executing
[Figure: the CPU connected to memory by a bus carrying control and data signals; your program is loaded from disk into memory.]
Memory: Array of Bytes
- Memory is basically an array of bytes, each with its own address
- Memory addresses are defined using unsigned binary integers
Memory: Array of Words
- A word is a group of bytes handled as a unit by the CPU
  - tied to the CPU architecture
  - natural storage size for numbers
- Word address
  - address of the first byte in the word
  - addresses of successive words differ by 4 (32-bit) or 8 (64-bit)
[Figure: bytes at addresses 0000 through 0015 grouped into words. The 32-bit words start at addresses 0000, 0004, 0008, and 0012; the 64-bit words start at addresses 0000 and 0008.]
Memory and Variables
- What happens when you declare a variable?
  - The compiler allocates a memory box for that variable
- How big a box?
  - Depends on the type of the variable
char c = 'A';
Memory address 0016 holds the value 01000001 (65, the ASCII code for 'A').
One Annoying Thing: Byte Order
- Hosts differ in how they store data
  - E.g., four-byte number (byte3, byte2, byte1, byte0)
- Little endian (“little end comes first”), used by Intel PCs
  - Low-order byte stored at the lowest memory location
  - Byte0, byte1, byte2, byte3
- Big endian (“big end comes first”)
  - High-order byte stored at the lowest memory location
  - Byte3, byte2, byte1, byte0
- Makes it more difficult to write portable code
  - Client may be a big or little endian machine
  - Server may be a big or little endian machine
Memory and Variables (contd.)
int i = 258;
Memory view:
Memory Address    0020     0021     0022     0023
BIG ENDIAN        00000000 00000000 00000001 00000010   (least significant byte at higher address)
OR
LITTLE ENDIAN     00000010 00000001 00000000 00000000   (least significant byte at lower address)
Memory and Variables (contd.)
float f = 0.1;
Memory view:
Memory Address    0020     0021     0022     0023
BIG ENDIAN        00111101 11001100 11001100 11001101   (least significant byte at higher address)
OR
LITTLE ENDIAN     11001101 11001100 11001100 00111101   (least significant byte at lower address)
Data Representations
- Sizes of C Data Types (in bytes)
C Data Type    Sparc    Typical 32-bit    Intel IA32
int            4        4                 4
long int       8        4                 4
char           1        1                 1
short          2        2                 2
float          4        4                 4
double         8        8                 8
long double    8        8                 10/12
void *         8        4                 4
The sizeof Operator
- Unique among operators: evaluated at compile time (the one exception is a C99 variable-length array operand)
- Evaluates to type size_t, an unsigned integer type; on many 32-bit systems, same as unsigned int
- Two forms: sizeof(type) and sizeof(expr)
- Examples
int i = 10;
double d = 100.0;
…
… sizeof(int) …          /* On matrix, evaluates to 4 */
… sizeof(i) …            /* On matrix, evaluates to 4 */
… sizeof(double) …       /* On matrix, evaluates to 8 */
… sizeof(d) …            /* On matrix, evaluates to 8 */
… sizeof(d + 200.0) …    /* On matrix, evaluates to 8 */
Determining Data Sizes
- Program to determine data sizes on your computer
#include <stdio.h>
int main() {
    printf("char: %d\n", (int)sizeof(char));
    printf("short: %d\n", (int)sizeof(short));
    printf("int: %d\n", (int)sizeof(int));
    printf("long: %d\n", (int)sizeof(long));
    printf("float: %d\n", _________________);
    printf("double: %d\n", _________________);
    printf("long double: %d\n", _________________);
    return 0;
}
- Output on matrix
char: 1
short: 2
int: 4
long: 4
float: 4
double: 8
long double: 16
Limits of the Machine: Overflow
Overflow: Running Out of Room
- Adding two large integers together
  - Sum might be too large to store in available bits
  - What happens?
- We have overflow if:
  - signs of both operands are the same, and
  - sign of sum is different
Assuming 5-bit 2’s complement numbers:
    01000   (8)          11000   (-8)
  + 01001   (9)        + 10111   (-9)
    -----                -----
    10001   (-15)        01111   (+15)
Overflow
- Unsigned integers
  - All arithmetic is “modulo” arithmetic
  - Sum would just wrap around
- Signed integers
  - Can get nonsense values
  - Example with 16-bit integers (short datatype)
    - Sum: 10000+20000+30000
    - Result: -5536 (the true sum, 60000, exceeds the largest short, 32767; 60000 - 65536 = -5536)
Try It Out
- Write a program that computes the sum 10000+20000+30000
Use only short variables in your code:
short a = 10000;
short b = 20000;
short c = 30000;
short sum = a + b + c;
printf("%d, %d, %d, sum = %d\n", a, b, c, sum);
Exercise
- Assume a 4-bit two’s complement representation for integer variables
- Compute the value of the expression 7 + 7
Casting Signed to Unsigned
- C allows conversions from signed to unsigned
short x = 5;
unsigned short ux = (unsigned short) x;
short y = -5;
unsigned short uy = (unsigned short) y;
- Memory allocation:
x     00000000 00000101
ux    00000000 00000101
y     11111111 11111011
uy    11111111 11111011
- Resulting value
  - No change in bit representation
  - Nonnegative values unchanged (ux = 5)
  - Negative values change into (large) positive values (uy = 65531)
Exercise
- Assume a 5-bit two’s complement representation for int variables
- What is the output of the following piece of code?
int x = 8;
unsigned int ux = (unsigned int) x;
int y = -8;
unsigned int uy = (unsigned int) y;
printf("%d %u %d %u\n", x, ux, y, uy);
Try It Out
- C code:
char a = 0xFF;
unsigned char b = 0xFF;
printf("a = %d\n", a);
printf("b = %d\n", b);
Int to Char? Try It Out …
#include <stdio.h>
int main() {
    char c = 0x81;
    int i;
    i = c;
    printf(" integer = %x\n character = %x\n", i, c);
    i = 0x87654321;
    c = i;
    printf(" integer = %d\n character code = %d\n", i, c);
    return 0;
}
[Figure: c holds 10000001; the bits of i are left to be filled in.]
C vs. Java: Cast Conversions
- Java: demotions are not automatic
- C: demotions are automatic
int i;
char c;
…
i = c;          /* Implicit promotion */
                /* C: sign extension when char is signed */
                /* Java: zero extension, because char is unsigned */
c = i;          /* Implicit demotion */
                /* Java: Compile-time error */
                /* C: OK; truncation */
c = (char)i;    /* Explicit demotion */
                /* Truncation in Java and C */
Int to Floating-Point
- C Guarantees Two Levels
  float     single precision (32 bits)
  double    double precision (64 bits)
- Conversions
  - Casting between int, float, and double changes bit values
  - int to float
    - Round according to rounding mode
  - int to double
    - Exact conversion, as long as int has ≤ 53 bit word size
unsigned int i = 0x800002F1;
float f = (float) i;
i    10000000 00000000 00000010 11110001
f    01001111 00000000 00000000 00000011
Int to Float Rounding – Try It Out
unsigned int i = 0x800002F1;
float f = (float)i;
printf("%x\n", i);
i = (unsigned int)f;
printf("%x\n", i);
Why is this important? Ariane 5!
- June 4, 1996
- Exploded 37 seconds after liftoff
- Cargo worth $500 million
- Why?
  - Computed horizontal velocity as 64-bit floating point number
  - Converted to 16-bit integer
  - Worked OK for Ariane 4
  - Overflowed for Ariane 5
  - Used same software
What did we learn?
- Memory is bytes, words
- Datatype size (in bytes) is machine-dependent
- Byte ordering (little endian, big endian)
- Limits of machine, overflow
- Cast conversions in C