Data (Some repeating CS1083, ECE course) bits and bit sequences - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Data

  • (Some repeating CS1083, ECE course)
  • bits and bit sequences
  • integers (signed and unsigned)
  • bit vectors
  • strings and characters
  • floating point numbers
  • hexadecimal and octal notations
slide-2
SLIDE 2

Bits and Bit Sequences

  • Fundamentally, we have the binary digit, 0 or 1.
  • More interesting forms of data can be encoded into a bit

sequence.

  • 00100 = “drop the secret package by the park entrance”

00111 = “Keel Meester Bond”

  • A given bit sequence has no meaning unless you

know how it has been encoded.

  • Common things to encode: integers, doubles, chars. And

machine instructions.

slide-3
SLIDE 3

Encoding things in bit sequences (From textbook)

  • Floats
  • Machine Instructions
slide-4
SLIDE 4

How Many Bit Patterns?

  • With k bits, you can have 2^k different patterns
  • 00..00, 00..01, 00..10, … , 11..10, 11..11
  • Remember this! It explains much...
  • E.g., if you represent numbers with 8 bits, you

can represent only 256 different numbers.

slide-5
SLIDE 5

Names for Groups of Bits

  • nibble or nybble: 4 bits
  • octet: 8 bits, always. Seems pedantic.
  • byte: 8 bits except with some legacy systems. In

this course, byte == octet.

  • after that, it gets fuzzy (platform dependent).

For 32-bit ARM,

– halfword: 16 bits
– word: 32 bits

slide-6
SLIDE 6

Unsigned Binary (review)

  • We can encode non-negative integers in unsigned binary (base 2).
  • 10110 = 1*2^4 + 0*2^3 + 1*2^2 + 1*2^1 + 0*2^0 represents the mathematical concept of “twenty-two”. In decimal, this same concept is written as 22 = 2*10^1 + 2*10^0.

  • Converting binary to decimal is just a matter of adding

up powers of 2, and writing the result in decimal.

  • Going from decimal to binary is trickier.
slide-7
SLIDE 7

Division Method (decimal → binary)

  • Repeatedly divide by 2. Record remainders as

you do this.

  • Stop when you hit zero.
  • Write down the remainders (left to right),

starting with the most recent remainder.
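The division method can be sketched in Java roughly as follows. This is an illustration only: the class and method names are made up for this example, and it assumes a non-negative input.

```java
class DivisionMethod {
    // Converts a non-negative int to an unsigned binary string
    // by repeatedly dividing by 2 and recording remainders.
    static String toBinary(int n) {
        if (n == 0) return "0";
        StringBuilder remainders = new StringBuilder();
        while (n > 0) {
            remainders.append(n % 2); // record the remainder
            n /= 2;                   // divide by 2; stop when we hit zero
        }
        // The most recent remainder is the leftmost bit.
        return remainders.reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(toBinary(22)); // prints 10110
    }
}
```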

slide-8
SLIDE 8

Subtract-powers method

  • Find the largest power of 2, say 2^p, that is not larger than N (your number). The binary number has a 1 in the 2^p's position.
  • Then similarly encode N - 2^p.
  • Eg, 22 has a 1 in the 16's position
    22-16 = 6, which has a 1 in the 4's position
    6-4 = 2, which has a 1 in the 2's position
    2-2 = 0, so we can stop....

slide-9
SLIDE 9

Adding in Unsigned Binary

  • Just like grade school, except your addition table is really easy:
  • No carry in:
    0+0 = 0 (no carry out)
    0+1 = 1+0 = 1 (no carry out)
    1+1 = 0 (with carry out)
  • Have carry in:
    0+0 = 1 (no carry out)
    0+1 = 1+0 = 0 (with carry out)
    1+1 = 1 (with carry out)

slide-10
SLIDE 10

Fixed-Width Binary Integers

  • Inside the computer, we work with fixed-width values.
  • Eg, an instruction might add together two 16-bit unsigned binary

values and compute a 16-bit result.

  • Hope your result doesn't exceed 65535 = 2^16 - 1. Otherwise, you have overflow. Can be detected by a carry from the leftmost stage.

  • If a value doesn't really need all 16 bits, a process called zero-extension just prepends the required number of zeros.

  • 10111 becomes 0000000000010111.
  • Mathematically, these bit strings both represent the same

number.
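Java has no 16-bit unsigned type, so the sketch below simulates one by keeping values in an int and masking to 16 bits. The helper names are hypothetical, and both inputs are assumed to lie in 0..65535.

```java
class FixedWidth16 {
    // 16-bit result of adding two 16-bit unsigned values (extra bits are dropped).
    static int add16(int a, int b) {
        return (a + b) & 0xFFFF;
    }

    // Overflow shows up as a carry out of the leftmost (bit 15) stage,
    // i.e. a nonzero bit 16 in the full-width sum.
    static boolean overflows16(int a, int b) {
        return ((a + b) >>> 16) != 0;
    }

    public static void main(String[] args) {
        System.out.println(add16(65535, 1));       // prints 0
        System.out.println(overflows16(65535, 1)); // prints true
    }
}
```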

slide-11
SLIDE 11

Signed Numbers

  • A signed number can be positive, negative or

zero.

  • An unsigned number can be positive or zero.
  • Note: “signed number” does NOT necessarily

mean “negative number”. In Java, ints are signed numbers. Can they be positive?? Can they be negative??

slide-12
SLIDE 12

Some Ways to Encode Signed Numbers

  • All assume fixed width; examples below for 4 bits
  • Sign/magnitude: first bit = 1 iff number -ve

Remaining bits are the magnitude, unsigned binary Ex: 1010 means -2

  • Biased: Store X+bias in unsigned binary

Ex: 0110 means -2 if the bias is 8. (8+(-2) = 6)

  • Two's complement: Sign bit's weight is the negative of what it'd be for an unsigned number

Ex: 1110 means -2: -8+4+2 = -2

  • You can generally assume 2's complement...
slide-13
SLIDE 13

Why 2's Complement?

  • There is only one representation of 0. (Other

representations have -0 and +0.)

  • To add 2's complement numbers, you use

exactly the same steps as unsigned binary.

  • There is still a “sign bit” - easy to spot negative

numbers

  • You get one more number (but it's -ve)

Range of N bits 2's complement: -2^(N-1) to +2^(N-1) - 1

slide-14
SLIDE 14

2's Complement Tricks

  • +ve numbers are exactly as in unsigned binary
  • Given a 2's complement number X (where X may be -ve, +ve or zero), compute -X using the two's complementation algorithm (“flip and increment”)

  • Flip all bits (0s become 1s, 1s become zeros)
  • Add 1, using the unsigned binary algorithm
  • Ex: 00101 = +5 In 5 bit 2's complement

11010 + 1 → 11011 is -5 in 2's complement

  • And Flip(-5) = 00100. 00100 + 1 = 00101, back to +5
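Flip-and-increment maps directly onto Java's ~ operator plus an addition. The sketch below is only an illustration with made-up names; the 5-bit variant masks the result back to 5 bits so it matches the example above.

```java
class FlipAndIncrement {
    // 32-bit version: flip all bits, then add 1.
    static int negate(int x) {
        return ~x + 1;
    }

    // 5-bit version: same steps, keeping only the low 5 bits of the result.
    static int negate5(int bits) {
        return (~bits + 1) & 0b11111;
    }

    public static void main(String[] args) {
        System.out.println(negate(5));                                   // prints -5
        System.out.println(Integer.toBinaryString(negate5(0b00101)));    // prints 11011
    }
}
```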
slide-15
SLIDE 15

Converting a 2's complement number X to decimal

  • Determine whether X is -ve (inspect sign bit)
  • If so, use the flip-and-increment to compute -X

Pretend you have unsigned binary. Slap a negative sign in front.

  • If number is +ve, just treat it like unsigned

binary.

slide-16
SLIDE 16

Sign extension

  • Recall zero-extension is to slap extra leading zeros onto a number.
  • Eg: 5 bit 2's compl. to 7 bit: 10101 → 0010101

Oops: -11 turns into +21. Zero extension didn't preserve numeric value.

  • The sign-extension operation is to slap extra copies of the leading bit onto a number
  • +ve numbers are just zero extended
  • But for -11: 10101 → 1110101 (stays -11)
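One possible way to try this in Java is to shift the narrow value to the top of a 32-bit int and shift it back down. The arithmetic right shift (>>) copies the sign bit, while the logical one (>>>) inserts zeros; both operators are covered later in these slides. The helper names are hypothetical and the width is assumed to be between 1 and 32.

```java
class SignExtend {
    // Treat the low `width` bits as 2's complement and sign-extend to 32 bits.
    static int signExtend(int bits, int width) {
        int shift = 32 - width;
        return (bits << shift) >> shift;   // >> copies the leading bit
    }

    // Treat the low `width` bits as unsigned and zero-extend to 32 bits.
    static int zeroExtend(int bits, int width) {
        int shift = 32 - width;
        return (bits << shift) >>> shift;  // >>> inserts zeros
    }

    public static void main(String[] args) {
        System.out.println(signExtend(0b10101, 5)); // prints -11
        System.out.println(zeroExtend(0b10101, 5)); // prints 21
    }
}
```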
slide-17
SLIDE 17

Overflow for 2's complement

  • Although the addition algorithm is the same as for fixed-width unsigned, the conditions under which overflow occurs are different.

  • If A and B are both same sign (eg, both +ve), then if A+B

is the opposite sign, something bad happened (overflow)

  • Overflow always causes this. And if this does not happen,

there is no overflow.

  • Eg, 1001 + 1001 →0010 but -7 + -7 isn't +2.

Note that -14 cannot be encoded in 4-bit 2's complement.
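The sign rule can be checked in code. The sketch below (hypothetical names) applies it to Java's 32-bit int addition, which silently wraps, and also reproduces the 4-bit example by masking bit patterns to 4 bits.

```java
class SignedOverflow {
    // True iff a + b overflows 32-bit 2's complement:
    // inputs have the same sign, but the wrapped sum has the opposite sign.
    static boolean addOverflows(int a, int b) {
        int sum = a + b; // wraps around on overflow
        boolean sameSign = (a < 0) == (b < 0);
        return sameSign && ((sum < 0) != (a < 0));
    }

    // Adds two 4-bit patterns and keeps only the low 4 bits of the result.
    static int add4(int a, int b) {
        return (a + b) & 0xF;
    }

    public static void main(String[] args) {
        System.out.println(addOverflows(Integer.MAX_VALUE, 1));            // prints true
        System.out.println(Integer.toBinaryString(add4(0b1001, 0b1001)));  // prints 10
    }
}
```

If an exception is preferable to a silent wrap, Java's Math.addExact throws ArithmeticException when an int addition overflows.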

slide-18
SLIDE 18

Numbering Bits

  • On paper, we often write bit positions above the actual data bits.
  • 543210 ← normally in a smaller font than this

001010 bits 3 and 1 are ones.

  • Sometimes we like to write bits left to right, and other times, right to

left (which is more number-ish). We usually start numbering at zero.

  • Inside computer, how we choose to draw on paper is irrelevant.
  • Computer architecture defines the word size (usually 32 or 64).

Usually viewed as the largest fixed-width size that the computer can handle, at maximum speed, with most math operations.

  • So bit positions would be numbered 0 to 31 for a 32-bit architecture.
slide-19
SLIDE 19

More Arithmetic in 2's complement

  • Subtract: To calculate A-B, you can use A + (-B)

Most CPUs have a subtract operation to do this for you.

  • Multiplication: easiest in unsigned. (Most CPUs have instr.)
  • D.I.Y. unsigned multiplication is like Grade 3, but your times table is the Boolean AND!! The product of two N-bit numbers may need 2N bits.
  • For 2's complement, the 2 inputs' signs determine the product's sign. eg, -ve * -ve → +ve
  • And you can multiply the positive versions of the two numbers. Finally, correct for sign.

slide-20
SLIDE 20

Bit Vectors (aka Bitvectors)

  • Sometimes we like to view a sequence of bits as

an array (of Booleans)

  • Eg hasOfficeHours[x] for 1 <= x <= 31

says whether I hold office hours on the xth of this month.

  • And isTuesday[x] says whether the xth is a

Tuesday.

  • So what if you want to find a Tuesday when I hold office hours?
slide-21
SLIDE 21

Bitwise Operations for Bit Vectors

  • Bitwise AND of B1 and B2:

Bit k of the result is 1 iff bit k of both B1 and B2 is 1.

  • Java supports bitwise operations on longs, ints

int b1 = 6, b2 = 12;    // 0b110, 0b1100
int result = b1 & b2;   // = 4 or 0b100

  • Bitwise NOT (~ in Java)
  • Bitwise OR ( | in Java)
  • Bitwise Exclusive Or ( ^ in Java) Also write “XOR” or “EOR”.
  • Pretty well every ISA will support these operations directly.
slide-22
SLIDE 22

Find First Set

  • Some ISAs have a Find First Set instruction.

(You've got a bitvector marking the Tuesdays when I have office hours – but now you want to find the first such day.)

  • Integer.numberOfTrailingZeros() in Java

achieves this.

  • So use

Integer.numberOfTrailingZeros(hasOfficeHours & isTuesday)
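A small sketch of the idea follows. It assumes bit 0 stands for the 1st of the month, bit 1 for the 2nd, and so on; that layout is an assumption made for this example, and the sample calendars are invented. Note that numberOfTrailingZeros returns 32 when no bit is set, so the empty case is handled separately.

```java
class FirstTuesdayOfficeHours {
    // Returns the first day of the month (1-based) set in both bit vectors, or -1 if none.
    static int firstCommonDay(int hasOfficeHours, int isTuesday) {
        int both = hasOfficeHours & isTuesday;   // bitwise AND of the two vectors
        if (both == 0) return -1;                // no such day
        return Integer.numberOfTrailingZeros(both) + 1; // bit index -> day number
    }

    public static void main(String[] args) {
        // Invented data: Tuesdays on the 3rd, 10th, 17th, 24th, 31st.
        int isTuesday = (1 << 2) | (1 << 9) | (1 << 16) | (1 << 23) | (1 << 30);
        // Invented data: office hours on the 5th, 10th and 17th.
        int hasOfficeHours = (1 << 4) | (1 << 9) | (1 << 16);
        System.out.println(firstCommonDay(hasOfficeHours, isTuesday)); // prints 10
    }
}
```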

slide-23
SLIDE 23

Bit Masking

  • Think about painting and masking tape. You can put a

piece of tape on an object, paint it, then peel off the tape. Area under the tape has been protected from painting.

  • We can do the same when we want to “paint” a bit vector

with zeros, except in certain positions.

  • Eg, I decide to cancel my office hours except for the first

10 days of the month.

  • Or we can protect positions against painting with ones.
  • Details next...
slide-24
SLIDE 24

Bit Masking with AND

  • AND(x,0) = 0 for both Boolean values of x
  • AND(x,1) = x for both Boolean values of x
  • bitwise AND wants to paint bits 0, except where the mask protects (1

protects)

  • hasOfficeHours & 0b1111111111

is a bitvector that keeps my office hours for the first 10 days (only). Later in month, all days are painted false.

  • hasOfficeHours &= 0b1111111111 modifies hasOfficeHours. By analogy

to the += operator you may already love.

  • The value 0b1111111111 is being used as a mask.
  • Quiz: what does hasOfficeHours & ~0b1111111111 do?
slide-25
SLIDE 25

Bit Masking with OR

  • OR(x,1) = 1 for both Boolean values x
  • OR(x,0) = x for both Boolean values x
  • bitwise OR wants to paint bits with 1s, except where the

mask prevents it. A 0 prevents painting.

  • hasOfficeHours | 0b1111111111 is a bitvector where I

have made sure to hold office hours on each of the first 10 days (and left things alone for the rest of the month)

  • hasOfficeHours |= 0b1111111111 makes it permanent.
  • Quiz: what would hasOfficeHours |= 0b101 do?
slide-26
SLIDE 26

Bit Masking with EOR (aka XOR)

  • EOR(0,x) = x for both Boolean values x
  • EOR(1,x) = NOT(x) for both Boolean values x
  • bitwise EOR wants to flip bits in positions that are

not protected with a 0 in the mask.

  • hasOfficeHours ^= 0b111100

inverts my office hour situation for Jan 3-6.

  • Bit masking with EOR is less common than OR and

AND.
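The three kinds of masking can be tried side by side. This is a minimal sketch with hypothetical method names; the bit vectors passed in are arbitrary test values, not real schedules.

```java
class MaskDemo {
    static final int FIRST_TEN_DAYS = 0b1111111111; // ten 1s: bits 0..9

    // AND: paints 0s everywhere except where the mask has a 1.
    static int keepFirstTen(int v)  { return v & FIRST_TEN_DAYS; }

    // OR: paints 1s wherever the mask has a 1, leaves the rest alone.
    static int forceFirstTen(int v) { return v | FIRST_TEN_DAYS; }

    // EOR/XOR: flips the bits wherever the mask has a 1 (here bits 2..5).
    static int flipBits2to5(int v)  { return v ^ 0b111100; }

    public static void main(String[] args) {
        System.out.println(Integer.toBinaryString(keepFirstTen(-1)));       // prints 1111111111
        System.out.println(Integer.toBinaryString(flipBits2to5(0b000100))); // prints 111000
    }
}
```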

slide-27
SLIDE 27

Example: Is a Number Odd?

  • Fact: A number is odd iff its least significant (i.e., rightmost) bit

is 1.

  • Java:

if ((myNum & 0b1) == 0b1)
    System.out.println("Very odd number");

  • Note: in decreasing precedence, the operators are ==, &, ^, |, &&, ||. Since == binds tighter than &, the parentheses around (myNum & 0b1) are required here. Even when you don't have to, maybe parenthesizing is a good idea. It's hard to remember weird operators' precedence levels.

slide-28
SLIDE 28

Example: Multiple of 8?

  • A binary value is a multiple of 8 (= 2^3) iff it ends with 000.
  • Related to the fact that a decimal number is a multiple of 1000 (= 10^3) iff it ends with 000.
  • Java:

if ((myVal & 0b111) == 0)
    System.out.println("multiple of 8");

  • Fact: a more general rule is that the rightmost k bits of X are (X mod 2^k). For -ve numbers in 2's complement this holds for the non-negative mathematical mod (Math.floorMod in Java), not for Java's % operator.

slide-29
SLIDE 29

Bit Shifting

  • A bunch of operations let the elements of a bitvector play “musical chairs”.
  • logical left shift: every bit slides one position left. The old leftmost bit is lost. The new rightmost bit is 0. Java << operator repeats this to shift the value several positions.

  • Eg, 0b11 << 4 is same as 0b110000.
  • logical right shift: similar, Java >>> operator.
slide-30
SLIDE 30

Dynamically Generating Masks

  • Shifts are useful for dynamically generating masks for use with

bitwise AND, OR, EOR.

  • The Hamming weight of a bunch of bits is the number of bits that are 1. (After Richard Hamming, 1915-1998.) Many modern CPUs have a special “population count” instruction to compute Hamming weight. Except for speed, it is not needed:

int hWeight = 0; // Hamming weight of int value x
for (int bitPos = 0; bitPos < 32; ++bitPos) {
    int myMask = 0b1 << bitPos;  // mask with only bit bitPos set
    if ((x & myMask) != 0)       // parentheses needed: != binds tighter than &
        ++hWeight;
}

slide-31
SLIDE 31

Hacker's Delight

  • Henry Warren's book, Hacker's Delight, belongs on

the bookshelves of serious low-level programmers.

  • It is a collection of neat bit tricks collected over the years. It is the source of much of the implementation of Java's class Integer.

  • Course website has a link to a web page with a

similar collection of “bit hacks”.

  • Despite the title, this book is not about breaking

security...it's the older, honourable use of “hacker”.

slide-32
SLIDE 32

Poor Man's Multiplication

  • What happens if you take a decimal number and slap 3 zeros on the right? It's same as multiplying by 10^3.
  • Similarly, X << 3 is same as X * 8. Even works if X is -ve. (Unless X*8 overflows or underflows)
  • Poor man's X*10 is (X<<3)+(X<<1) since it equals (8*X + 2*X)
  • Compilers routinely optimize multiplications by some constants like this, since multiplication is often a harder operation than shifting and adding. Called strength reduction.
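The shift-and-add form of X*10 is easy to check against ordinary multiplication. The sketch below uses a made-up method name; in practice you would write x * 10 and let the compiler decide.

```java
class TimesTen {
    // Poor man's x * 10: (8 * x) + (2 * x), using shifts instead of a multiply.
    static int timesTen(int x) {
        return (x << 3) + (x << 1);
    }

    public static void main(String[] args) {
        System.out.println(timesTen(7));  // prints 70
        System.out.println(timesTen(-4)); // prints -40
    }
}
```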

slide-33
SLIDE 33

Poor Man's Division

  • So then, does shifting bits to the right then correspond to division by

powers of 2?

  • For unsigned, yes. (Throwing away remainders).
  • For 2's complement +ve numbers, yes.
  • -ve numbers: no. Regular right shift inserts zeros at the leftmost

position (the sign bit).

  • 11111000 → 01111100 means -8 → +124
  • A modified form, arithmetic right shift inserts copies of the sign bit at

the leftmost. Java operator >> vs >>>

  • 11111000 → 11111100 means -8 → -4 as desired
  • For -ve dividends, the result is a bit wrong if remainder != 0: it rounds down (toward minus infinity) instead of toward zero.
slide-34
SLIDE 34

Poor Man's Division by 2^k

  • (From Hacker's Delight): To handle all cases

– Alt1: add 2^k - 1 before the arithmetic shift, if dividend is negative.

– Alt2: (branchless – faster on many modern CPUs)

temp ← arithmetic right shift dividend by k-1 positions
temp2 ← logical right shift temp by 32-k positions
dividend += temp2 /* adjust by 0 or 2^k - 1 */
arithmetic shift right dividend by k
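The branchless alternative can be written in Java along the following lines. This is a sketch with a hypothetical method name, assuming a 32-bit int dividend and 1 <= k <= 31; it is meant to agree with Java's / operator, which rounds toward zero.

```java
class DivPow2 {
    // Divides n by 2^k, rounding toward zero, without branching.
    static int divPow2(int n, int k) {
        int temp = n >> (k - 1);        // arithmetic shift: smears the sign bit
        int temp2 = temp >>> (32 - k);  // logical shift: 0 if n >= 0, else 2^k - 1
        return (n + temp2) >> k;        // adjusted dividend, then arithmetic shift
    }

    public static void main(String[] args) {
        System.out.println(divPow2(-9, 2)); // prints -2, same as -9 / 4
        System.out.println(-9 >> 2);        // prints -3, the unadjusted shift
    }
}
```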

slide-35
SLIDE 35

Division by Constant, via Multiplication and Right Shifting

  • Low-end CPUs may not have an integer divide instruction but

may have a multiply. Want to divide by a constant y that is not a power of 2.

  • Mathematically, x/y = x * 1/y
  • Multiply 1/y by p/p, for p being some power of 2. Say p = 2^k. So x/y = ( x * (p/y)) / p.

  • p/y is a constant that you can compute. Division by p is a right

shift.

  • Considerations: effect of truncations and whether the

multiplication overflows.

slide-36
SLIDE 36

Example: Divide x by 17

  • We get to choose p. Say p = 2^8.
  • 256/17 ≈ 15.06 is close to 15.
  • Compute (x*15) >> 8 to approximate x/17.
  • Test run for x=175. (175*15)/256 = 10 (throwing away remainder). Good.
  • Test run for x=170. (170*15)/256 = 9 (because we throw away a fractional part of about .96), but 170/17 = 10. Oops!

  • Can be improved, but maybe an approximate answer is okay.
  • Closer approximations by using bigger values of p.
  • Using 32-bit integers, what is the biggest number we can divide by 17

this way, without getting overflow?
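The two test runs can be reproduced with a few lines of Java. The method name is made up for this sketch, and it deliberately keeps the slide's approximate constant 15 rather than an exact "magic number".

```java
class DivideBy17 {
    // Approximates x / 17 as (x * 15) / 256, using p = 2^8 and p/17 truncated to 15.
    static int approxDiv17(int x) {
        return (x * 15) >> 8;
    }

    public static void main(String[] args) {
        System.out.println(approxDiv17(175) + " vs " + (175 / 17)); // prints 10 vs 10
        System.out.println(approxDiv17(170) + " vs " + (170 / 17)); // prints 9 vs 10
    }
}
```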

slide-37
SLIDE 37

General-Purpose Mult. and Div.

  • What if you want to multiply and divide by a

variable?

  • Today, most CPUs come with instructions to do

this, except maybe the kind in your digital toaster.

  • But you can always implement * by the Grade-3

shift-and-add algorithm. Or repeated addition.

  • Division: see how many times you can subtract y

from x (in a loop). Or (harder), implement the algorithm you learned in Grade 3.

slide-38
SLIDE 38

More Bit Shuffling

  • Most CPUs support more exotic ways for bits to play

musical chairs. No operator like >> in Java or C for this, though:

  • Left rotation by 1 position:

– every bit but leftmost moves left 1. The leftmost bit circles

around and becomes the new rightmost bit.

  • Left rotation by >1 positions is same result as doing

multiple left rotations by 1.

  • A right rotation by 1, or by >1 positions, also exists.
  • Example: 1010011 right rotated by 1 is 1101001
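Although Java has no rotation operator, its Integer class does provide rotateLeft and rotateRight methods for 32-bit values. For narrower widths such as the 7-bit example, a helper can be sketched as below; the helper name is hypothetical, and it assumes 1 <= width <= 31 with no bits set above the given width.

```java
class Rotate {
    // Right-rotates the low `width` bits of `bits` by one position.
    static int rotateRight1(int bits, int width) {
        int low = bits & 1;                          // the bit that falls off the right
        return (bits >>> 1) | (low << (width - 1));  // ...circles around to the left
    }

    public static void main(String[] args) {
        System.out.println(Integer.toBinaryString(rotateRight1(0b1010011, 7))); // prints 1101001
        // Library rotation on the full 32 bits:
        System.out.println(Integer.rotateLeft(0x80000001, 1));                  // prints 3
    }
}
```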
slide-39
SLIDE 39

Confessions

  • A detailed analysis of the divide-by-a-constant approach (eg the

divide-by-17 example) can avoid the small errors. People have worked out approaches for dividing exactly by using multiplications and shifts….

  • Modern compilers usually do this: CPU divide instructions are slow.
  • Chapter 10 of Hacker’s Delight is “Integer Division by Constants”

and is 72 pages that are quite mathematical. Also the word “magic” appears many times.

  • hackersdelight.org can compute magic numbers; see also

libdivide.org.

slide-40
SLIDE 40

Character Data

  • Characters are encoded into binary. One historical

method that is still used is a 7-bit code, American Standard Code for Information Interchange.

  • ASCII contains upper and lower-case letters,

punctuation marks and digits that would commonly have been needed for US English data processing needs.

  • ASCII also encodes other things that control the assumed teletypewriter machine: these control codes include carriage return, line feed, tab, ring the bell, end of file, …
slide-41
SLIDE 41

Control Codes

  • Control codes are often invisible when printed, and some text editors won't show them. But software (e.g. compilers) can be thrown off by them. Leads to puzzled students sometimes.
  • A common convention is to discuss control

codes by using a letter preceded by ^. Eg, ^C. On many keyboards, pressing the Ctrl key at the same time as the letter can generate a control code.

slide-42
SLIDE 42

Backslash Escapes

  • Many programming languages have backslash escapes to represent some popular control codes.

  • Eg, '\t', '\n', '\r' in Java and C. You type \t as two

characters, but it represents a single character (a tab, ASCII code 9, ^I)

  • '\123' represents a single byte whose ASCII code

is 123 in octal (base 8 – more on this later) (In many programming languages)

slide-43
SLIDE 43

Unicode

  • Many people outside the US found ASCII too limiting, so attempts were first made to extend it to other Western European character sets.

  • Unicode seeks to represent all current and historical symbols in all

cultures and languages. Original idea was that 16 bits was enough. Java uses this early Unicode idea, so char in Java is 16 bits.

  • Unicode version 9.0 (2016) has >100k characters plus many more symbols. 16 bits is not enough.
  • Each Unicode character is represented by a numeric code point. First 128 of them correspond to ASCII, for backwards compatibility. First 2^16 are the Basic Multilingual Plane.

  • There are several ways of encoding code points into bytes.
slide-44
SLIDE 44

UTF-8, UTF-16 etc

  • UTF-32 uses 32 bits to store a code point. It is a fixed-width encoding: if I

know how many characters I need to store, I know precisely how many bytes it will cost me.

  • But UTF-32 wastes bytes. Characters outside the Basic Multilingual Plane

are rare. Codepoints from 0-127 (“ASCII”) are very common.

  • UTF-16 represents codes in the BMP with 2 bytes. Weird codes outside

need 2 more. Not a fixed-width encoding.

  • UTF-8 represents ASCII codes in 1 byte, other BMP codepoints with 2 or 3

bytes, and weird codes with 4.

  • UTF-8 is fully backward compatible with old ASCII files.
  • In Java, the constructor for OutputStreamWriter (which can wrap a FileOutputStream) has an optional parameter that can be “UTF-8” or “UTF-16” etc. Otherwise, it uses the platform's default charset.
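The byte counts claimed above can be checked with String.getBytes. The sketch below uses hypothetical helper names and writes the non-ASCII test characters as \u escapes so the source file's own encoding does not matter. UTF_16BE is chosen so that no byte-order mark is counted.

```java
class EncodingSizes {
    // Number of bytes needed to store s in UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(java.nio.charset.StandardCharsets.UTF_8).length;
    }

    // Number of bytes needed to store s in UTF-16 (big-endian, no byte-order mark).
    static int utf16Length(String s) {
        return s.getBytes(java.nio.charset.StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        System.out.println(utf8Length("A"));             // ASCII: prints 1
        System.out.println(utf8Length("\u00E9"));        // e with acute accent: prints 2
        System.out.println(utf8Length("\u20AC"));        // euro sign: prints 3
        System.out.println(utf8Length("\uD83D\uDE00"));  // U+1F600, outside the BMP: prints 4
        System.out.println(utf16Length("\uD83D\uDE00")); // prints 4
    }
}
```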

slide-45
SLIDE 45

Strings

  • A string is a sequence of characters. In Java it's represented by

a String object, as you know.

  • In lower-level programming (C, assembler), a string is more likely

viewed as a sequence of consecutive memory locations that store the successive characters in the string.

  • Q: how do you know when the next memory location doesn't

store the next character in a string (how to know a string is over)?

  • Common convention: null-terminated string. A string always ends

with a character whose ASCII code is 0. “C-style string”

  • ARM assembly language: if you want a C-style string, you have

to put the null at the end. Fun bugs if you forget.

slide-46
SLIDE 46

Representing Fractional Values

  • You can represent fractional values using a fixed-point convention. In decimal and for money (unit dollars), an example would be agreeing to store each value as a whole number of cents.
  • So 2.35 is stored as 235. We have shifted the decimal point by a fixed amount, two positions.
  • In unsigned binary 101.011 means 101 (base 2) and a fraction of 0*2^-1 + 1*2^-2 + 1*2^-3. I.e., 5.375.

slide-47
SLIDE 47

Fractional Values, 2.

  • We can store all numbers by shifting the binary point right 3 (for example). So we are measuring everything in eighths. 5.375 (decimal) is then stored as 101011, instead of 101.011.

  • Can add and subtract fixed-point numbers successfully, as

long as each is, for instance, measured in eighths.

  • But multiplying two numbers given in eighths results in a

product that is measured in 64ths. So have to divide by 8 (just shift right 3 positions...)

  • Advantage: fractions handled using only integer arithmetic.
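A minimal fixed-point sketch, measuring everything in eighths, might look like this. All names are invented for the example, the values are assumed to be non-negative and small enough not to overflow, and the conversion helpers use double only to get values in and out.

```java
class FixedPointEighths {
    static final int FRACTION_BITS = 3; // values are stored as a count of eighths

    static int toFixed(double v)  { return (int) Math.round(v * 8); }
    static double toDouble(int f) { return f / 8.0; }

    // Addition needs no adjustment: eighths + eighths = eighths.
    static int add(int a, int b) { return a + b; }

    // The raw product is in 64ths, so shift right 3 to get back to eighths.
    static int multiply(int a, int b) { return (a * b) >> FRACTION_BITS; }

    public static void main(String[] args) {
        int x = toFixed(5.375);                       // stored as 43 = 0b101011
        int two = toFixed(2.0);                       // stored as 16
        System.out.println(Integer.toBinaryString(x)); // prints 101011
        System.out.println(toDouble(multiply(x, two))); // prints 10.75
    }
}
```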
slide-48
SLIDE 48

Floats, Doubles etc.

  • Scientific processes generate huge and tiny numbers. No

single fixed-point shift will suit everything.

  • Measured values have limited precision - no sense to store

the number of meters to Alpha Centauri as an integer.

  • Floating-point representation is a computer version of the “scientific notation” you learned in school, eg: 3.456 x 10^-5

  • +3.456 is the significand or mantissa. We've 4 sig. digits
  • Number is normalized to 1 significant digit before decimal

point.

  • The exponent is -5, and the sign is positive.
slide-49
SLIDE 49

IEEE-754 Standard

  • IEEE-754 is the standard way to represent a binary

floating point value in 16 (half-precision), 32 (single precision), 64 (double precision), 128 (quad precision) or 256 (octuple precision) bits.

  • 32-bit form available in C & Java as float; 64-bit

form as double.

  • Overall, it's a sign-magnitude scheme.
  • But exponent is signed quantity using the biased

approach.

slide-50
SLIDE 50

IEEE-754 Floats

  • 1 sign bit, S. (Bit position 31)
  • next, 8 exponent bits with binary value E.
  • Exponent bias b of 127.
  • 23 fraction bits fff..fff to represent the significand of 1.ffff..fff. Note “hidden” or “implicit” leading 1.
  • Formula for “normal” floats:

Value = (-1)^S * 1.ffff...fff * 2^(E-b)

slide-51
SLIDE 51

Example

  • Find the numerical value of a float with bits

0 00111100 00110000000000000000000

  • Use formula (-1)^S * 1.ffff...fff * 2^(E-b)
  • S=0. E = 00111100 (base 2) = 60 (base 10). b=127 (always)

ffff...fff = 00110000000000000000000

  • So: (-1)^0 * 1.0011 (base 2) * 2^-67

= +1 * (1 + 3/16) * 2^-67

  • It's a small positive number. Calculator for details.
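Instead of a calculator, the formula can be applied in Java and compared with the library's own decoding (Float.intBitsToFloat). The sketch below handles “normal” floats only. The class and method names are made up, and 0x1E180000 is simply the example's bit pattern (sign 0, exponent 00111100, fraction 0011 followed by zeros) packed into hex.

```java
class FloatDecode {
    // Applies (-1)^S * 1.fff...f * 2^(E - 127) to a 32-bit pattern (normal floats only).
    static float decode(int bits) {
        int s = bits >>> 31;              // sign bit (bit 31)
        int e = (bits >>> 23) & 0xFF;     // 8 exponent bits
        int f = bits & 0x7FFFFF;          // 23 fraction bits
        double significand = 1.0 + f / (double) (1 << 23); // implicit leading 1
        double sign = (s == 0) ? 1.0 : -1.0;
        return (float) Math.scalb(sign * significand, e - 127);
    }

    public static void main(String[] args) {
        int bits = 0x1E180000;
        System.out.println(decode(bits) == Float.intBitsToFloat(bits)); // prints true
        System.out.println(decode(bits));
    }
}
```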
slide-52
SLIDE 52

Example 2: Determine the bits

  • Determine how to represent -2253.2017.
  • Helpful facts: 2253 = 2048 + 4 + 1.
  • 0.2017 * 2^24 = 3383964 + fraction
  • 3383964 (base 10) = 1100111010001010011100 (base 2)
  • Now let's put the pieces together.
slide-53
SLIDE 53

Representable Values

  • There is an uncountable infinity of real numbers.
  • There are at most 2^32 different bit patterns for a float. No bit pattern represents more than one real number.
  • Therefore, there are real numbers that cannot be represented. (Overwhelming majority.)
  • For any given exponent value, there are only 2^23 different mantissas, 1.000... to 1.111...

  • No number whose exponent exceeds 255-127
  • No number whose exponent < (0 – 127).
  • (Though in CS3813 you'll learn about subnormal numbers, so I've

lied; IEEE-754 is a fair bit hairier than I've presented.)

slide-54
SLIDE 54

Example

  • What is the next representable value, after

5/16?

  • 5/16 = 0.0101 (base 2) * 2^0 = 1.0100...0 (base 2) * 2^-2
  • Now let's reason.
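One way to check the reasoning afterwards is Java's Math.nextUp, which returns the next representable float above its argument. Bumping the last of the 23 fraction bits at exponent -2 should move the value by 2^-23 * 2^-2 = 2^-25. The class and method names below are made up for this sketch.

```java
class NextFloat {
    // Gap between x and the next representable float above it.
    static float gapAbove(float x) {
        return Math.nextUp(x) - x;
    }

    public static void main(String[] args) {
        float x = 5f / 16f; // exactly representable: 1.01 (base 2) * 2^-2
        System.out.println(gapAbove(x) == Math.scalb(1.0f, -25)); // prints true
    }
}
```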
slide-55
SLIDE 55

IEEE-754 Doubles

  • 64 bits, divided up into

– 1 sign bit
– 11 exponent bits, bias 1023
– 52 fraction bits stored

  • More exponent bits: better range of numbers
  • More fraction bits: smaller gaps between

representable numbers (higher precision).

  • Otherwise, like Float.
slide-56
SLIDE 56

Machine Instructions

  • Another thing that becomes binary: machine instructions.
  • Typical m/c instruction has

– an operation code (opcode) that indicates which of the supported operations is desired
– codes indicating addressing modes that provide the input data (“operands”)
– code indicating where the result should be put
– code indicating the conditions under which the instruction should be ignored

  • An instruction-format specification helps you determine how to

assemble these codes into a machine-code instruction.

  • Chapter 1: To store the constant 0 into a register variable:

0b0101010010100000 in LC3 machine code. (Actually a bitwise AND with mask 0...0)

  • We'll study instruction formats later.
slide-57
SLIDE 57

Hexadecimal

  • A decimal number has about 1/3 as many digits as the

corresponding binary number: small base, lots of digits.

  • Humans do poorly with many-digit numbers.
  • So for humans, it is handy to work in larger bases. But it's

hard to convert base 10 numbers into base 2 numbers.

  • Base 16, or hexadecimal (hex), is the go-to base for

machine-level human programmers.

– numbers have few digits
– it's easy to convert to/from binary

slide-58
SLIDE 58

Hexadecimal digits

  • Whereas decimal uses digits 0 to 9,

hexadecimal uses 0-9,A,B,C,D,E,F.

– Digit 7 has value seven, just like decimal
– Digit A has value ten, B has value eleven, …, F has value fifteen

  • Numbers have a ones place, a sixteens place, a 16^2s place, a 16^3s place, etc.
  • 2F32 means 2*16^3 + 15*16^2 + 3*16^1 + 2
  • In many languages, you prefix hex constants with 0x, so

int fred = 0x2f32;   // works fine in Java.
int george = 0x100;  // equivalent to george = 256;

slide-59
SLIDE 59

Converting Hex to Binary

  • Because 16 = 2^4, each hex digit expands to 4 binary digits.
  • For 0x9A4

– the 9 expands to 1001
– the A expands to 1010
– the 4 expands to 0100

  • So 0x9A4 expands to 0b100110100100
slide-60
SLIDE 60

Converting Binary to Hex

  • Binary → Hex is the reverse process.
  • Only trick: you want the binary number to be the correct

length ( a multiple of 4 in length)

  • So zero extend it, if necessary
  • Then each group of 4 bits collapses to a hex digit.
  • 101010 → 0010 1010 → 2A
  • Rather than count bits and zero extend first, just circle

groups of 4 bits starting from the right. If the last group has fewer than 4 bits, it's okay.

slide-61
SLIDE 61

Small Negative Numbers

  • We usually use unsigned hex to reflect bit patterns, even if they are meant to be 2's complement numbers.

  • So what does a 32-bit negative number look like, if it is

pretty close to 0?

  • The corresponding bit pattern has a lot of leading ones.

When converted to hex, each group of 4 ones turns into an F digit.

  • So your not-very-negative number has lots of leading F's.
  • 0xFFFF FFFB is the bit pattern for -5 (decimal) = 0b111...111011
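Java's Integer.toHexString prints the unsigned hex view of an int's bit pattern, which makes the leading F's easy to see. A short sketch, with a made-up class and method name:

```java
class SmallNegativesInHex {
    // Hex digits of the 32-bit 2's complement bit pattern of v.
    static String hex(int v) {
        return Integer.toHexString(v);
    }

    public static void main(String[] args) {
        System.out.println(hex(-5));           // prints fffffffb
        System.out.println(hex(-1));           // prints ffffffff
        System.out.println(0xFFFFFFFB == -5);  // prints true
    }
}
```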
slide-62
SLIDE 62

Hex Arithmetic

  • It's sometimes handy to do addition and subtraction of hex numbers without converting to decimal.

  • (typically, subtraction when you want to figure out

the size of something in memory, and you've got the starting and ending positions)

  • Like Grade 3, except your addition/subtraction table

is bigger.

– don't memorize: just use the values of digits
– you carry and borrow 16, not 10

slide-63
SLIDE 63

Example: Hex Addition

  • A debugger reports that an item begins in memory at

address 0x1234. You know its size is 0x7D. What is the first address after the item?

  • 1234
    + 7D
  • 4 is worth 4, D is worth 13. Sum to 17, or 0x11
  • Keep the 1, carry the 16 to the next stage
  • 3 and 7 sum to 10. But there is a carry, bumping you to 11, or 0xB. No carry to next stage.
  • So 1234 + 7D = 12B1.
slide-64
SLIDE 64

Example: Hex Subtraction

  • 1203
    - 0F15
  • Since 3 < 5, borrow from 0x20 (making it 0x1F).
  • You borrowed 16, so 3 is now worth 16+3=19.
  • Take away 5, get 14. Hex digit for 14 is E.
  • F-1 is E (no borrow needed).
  • 1-F needs a borrow, makes 1 worth 16+1=17.
  • Take away F (value 15), get 2. (hex digit is 2).
  • You borrowed from the 1, so it's 0. 0-0=0.
  • You could write down this leading zero, if you wanted....
  • 1203-0F15 = 02EE
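Hand-worked hex arithmetic is easy to double-check in Java, since hex literals are ordinary ints. The sketch below (hypothetical names) redoes the addition and subtraction examples and prints the results in hex.

```java
class HexArithmetic {
    // Sum of two values, formatted as upper-case hex.
    static String hexSum(int a, int b) {
        return Integer.toHexString(a + b).toUpperCase();
    }

    // Difference of two values, formatted as upper-case hex.
    static String hexDiff(int a, int b) {
        return Integer.toHexString(a - b).toUpperCase();
    }

    public static void main(String[] args) {
        System.out.println(hexSum(0x1234, 0x7D));    // prints 12B1
        System.out.println(hexDiff(0x1203, 0x0F15)); // prints 2EE
    }
}
```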
slide-65
SLIDE 65

Octal (base 8)

  • In bygone days, octal (base 8) was an alternative to

hexadecimal.

  • Conversion to/from binary is by grouping bits into groups of

size 3, but otherwise same as hex.

  • Octal survives in some niches. In a string or a character, a

backslash can be followed by 3 octal digits (typically the ASCII code of some otherwise unprintable character).

  • In Java and C, any digit string that starts with a leading zero

is assumed to be octal. Remaining digits must be 0 to 7.

  • So: int fred = 09; // mysterious compile error