ENGN2219/COMP6719 Computer Architecture & Simulation

SLIDE 1

ENGN2219/COMP6719 Computer Architecture & Simulation
Ramesh Sankaranarayana
Semester 1, 2020
(based on original material by Ben Swift and Uwe Zimmer)

SLIDE 2

Week 3: Memory Operations

SLIDE 3

Outline

  • addresses
  • load/store instructions
  • address space
  • labels & branching

SLIDE 4

Memory is how your CPU interacts with the outside world

SLIDE 5

SLIDE 6

but first, a few more instructions

SLIDE 7

Bitwise instructions

Not all instructions treat the bit patterns in the registers as “numbers”
Some treat them like bit vectors (and, orr, etc.)
There are even some instructions (e.g. cmp, tst) which don’t calculate a “result”, but they do set the flags
Look at the Bit operations section of your cheat sheet

SLIDE 8

Example: bitwise clear

register   bits 31-8   bits 7-0
r1         all 0       1111 1111
r2         all 0       1010 1010
r3         all 0       0101 0101

    mov r1, 0xFF
    mov r2, 0b10101010
    bic r3, r1, r2        @ r3 = r1 AND (NOT r2) = 0b01010101
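
The bitwise clear above can be checked outside the board with a short Python sketch. This is only a model of what `bic` computes, not how the CPU implements it; the helper name `bic` and the 32-bit mask constant are choices made for this illustration.

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide


def bic(rn: int, rm: int) -> int:
    """Model of `bic rd, rn, rm`: clear the bits of rn that are set in rm."""
    return rn & ~rm & MASK32


r1 = 0xFF
r2 = 0b10101010
r3 = bic(r1, r2)
print(f"{r3:#010b}")  # 0b01010101
```

Every bit that is 1 in r2 is forced to 0 in the result, and the remaining bits of r1 pass through unchanged.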

SLIDE 9

Bit-shifts and rotations

Other instructions will shift (or rotate) the bits in the register, and there are lots of different ways to do this!
See the Shift/Rotate section of the cheat sheet
Be careful of the difference between logical shift and arithmetic shift
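
To make the logical/arithmetic distinction concrete, here is a minimal Python sketch of the two right shifts on 32-bit values. The function names mirror the `lsr` and `asr` mnemonics, but the code is an illustration that assumes shift amounts between 0 and 31.

```python
MASK32 = 0xFFFFFFFF


def lsr(value: int, n: int) -> int:
    """Logical shift right: vacated top bits are filled with zeros."""
    return (value & MASK32) >> n


def asr(value: int, n: int) -> int:
    """Arithmetic shift right: vacated top bits copy the sign bit (bit 31)."""
    value &= MASK32
    if value & 0x80000000:          # negative in two's complement
        value -= 1 << 32            # reinterpret as a signed Python int
    return (value >> n) & MASK32    # Python's >> on negative ints sign-extends


x = 0xFFFFFFF0                      # -16 as a signed 32-bit value
print(hex(lsr(x, 4)))               # 0xfffffff
print(hex(asr(x, 4)))               # 0xffffffff
```

The logical shift turns the negative number into a large positive one, while the arithmetic shift keeps the sign, giving -1 (the signed result of dividing -16 by 16).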

SLIDE 10

SLIDE 11

ARM barrel shifter

Your discoboard’s CPU actually has special hardware (called a “barrel shifter”) to perform these shifts as part of another instruction (e.g. an add); that’s what {, <shift>} means on e.g. the cheat sheet

There are dedicated bit shift instructions (e.g. lsl) and other instructions which can take an extra shift argument, e.g.

    @ some examples
    adds r0, r2, r1, lsl 4    @ r0 = r2 + (r1 << 4), and set the flags

    mov r3, 4
    mov r3, r3, lsr 2         @ r3 = 4 >> 2 = 1
    mov r3, r3, lsr 3         @ off the end! r3 = 0
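
The examples above can be traced by hand with a small Python model. The starting values for r1 and r2 are hypothetical (the slide does not give them), and the flags that `adds` would set are not modelled.

```python
MASK32 = 0xFFFFFFFF


def lsl(value: int, n: int) -> int:
    """Logical shift left, keeping only the low 32 bits."""
    return (value << n) & MASK32


def lsr(value: int, n: int) -> int:
    """Logical shift right, filling with zeros from the top."""
    return (value & MASK32) >> n


# adds r0, r2, r1, lsl 4
r1, r2 = 3, 10                       # hypothetical starting values
r0 = (r2 + lsl(r1, 4)) & MASK32      # 10 + (3 << 4) = 58

# mov r3, 4 ; mov r3, r3, lsr 2 ; mov r3, r3, lsr 3
r3 = 4
r3 = lsr(r3, 2)                      # 1
r3 = lsr(r3, 3)                      # 0: the only set bit fell off the end
print(r0, r3)                        # 58 0
```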

SLIDE 12

Is that everything?

We haven’t looked at everything on the cheat sheet (not even close!)
The cheat sheet doesn’t have everything in the reference manual (not even close!)
But you can do a lot with just the basics, and you can refer to the cheat sheet whenever you need it

SLIDE 13

talk

How do you keep track of all the registers you’re using in your program? What if you “run out”?

SLIDE 14

Memory addresses

SLIDE 15

…but registers can store data

Yes, they can! That’s what we’ve been doing with the instructions so far (e.g. mov, add, etc.): manipulating values in registers.

Registers are super-convenient for the CPU, because they’re inside the CPU itself. And we can give them all special names: r0, r9, lr, pc, etc.

SLIDE 16

A pet duck for ENGN2219

SLIDE 17

Let’s step back a bit

SLIDE 18

Von Neumann Architecture

SLIDE 19

Memory Hierarchy

SLIDE 20

Von Neumann Architecture

SLIDE 21

Now, back to the present…

SLIDE 22

Random Access Memory

RAM (Random Access Memory) is for storing lots of data. Perhaps

  • character data for an MMORPG
  • RGB pixel data from a high-resolution photo
  • or the machine code instructions which make up a large program

Current price: ~$140 for 16GB

SLIDE 23

RAM types

Two main types:

  1. static RAM (SRAM)
  2. dynamic RAM (DRAM)

SRAM is faster, more expensive and physically larger; it’s used in caches & situations where performance is crucial
DRAM is slow(er), more power-efficient, cheaper and physically denser; it’s used where you need more capacity (bytes)
Most systems have both types, but when you talk about the RAM in your gaming rig, you’re usually referring to DRAM

SLIDE 24

now we’ve got an addressing problem

SLIDE 25

SLIDE 26

Memory addresses

The solution: refer to each different section of memory with a (numerical) address
Each of these addressable units is called a cell
Think of it like a giant array in your favourite programming language:

    byte[] memory = { 80, 65, 54, /* etc. */ };
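
The array analogy can be tried directly. The sketch below uses a Python `bytearray` as a stand-in for memory; the three values are the ones from the slide, and everything else (the variable names, the value written) is made up for illustration.

```python
# Three cells of "memory", reusing the values from the slide.
memory = bytearray([80, 65, 54])

address = 1                  # an address is just an index into the array
print(memory[address])       # 65: the value stored in that cell

memory[address] = 66         # writing changes the cell's contents, not its address
print(list(memory))          # [80, 66, 54]
```

Each cell holds one byte, so trying to store a value outside 0 to 255 in a `bytearray` raises an error, much as a byte-sized memory cell cannot hold a larger number.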

SLIDE 27

Analogy: street addresses

SLIDE 28

Byte addressing (addresses in blue)

SLIDE 29

SLIDE 30

The byte: the smallest addressable unit

One interesting question: what should the smallest addressable unit be? In other words, how many bits are in each bucket? 1, 8, 16, 32, 167?
The ARMv7-M ISA uses 8-bit bytes (so do most of the systems you’ll come across these days)
Usually, we use a lowercase b to mean bits, and an uppercase B to mean bytes, e.g. 1 Mbps == 1 million bits per second, 3.9 GB means 3.9 billion bytes

SLIDE 31

8 bits == 1 byte

SLIDE 32

Why 8 bits to a byte?

Again, there’s no fundamental reason it had to be that way
But there’s a trade-off between the number of bits you can store and the address granularity (why?)
8 bits provides 256 (2^8) different values, which is enough to store an ASCII character

SLIDE 33

A memory address is just a number

SLIDE 34

A note about “drawing” memory

It’s a one-dimensional array (i.e. there’s just a single numerical address for each memory cell)
When “drawing a picture” of memory (like in the earlier slides) sometimes we draw left-to-right (with line wrapping!), sometimes top-to-bottom, sometimes bottom-to-top
It doesn’t matter! The address is all that matters

SLIDE 35

talk

Can you get data in and out of memory with the instructions we’ve covered already in the course? Nope.

SLIDE 36

Load/store instructions

SLIDE 37

Load instructions

We need a new instruction (well, a bunch of them actually)

ldr is the load register instruction

It’s on the cheat sheet under Load & Store

SLIDE 38

Loading from memory into a register

Any load instruction loads (reads) some bits from memory and puts them in a register of your choosing
The data in memory is unaffected (it doesn’t take the bits “out” of memory, they’re still there after the instruction)

    @ load some data into r0
    ldr r0, [r1]

SLIDE 39

What’s with the [r1]?

Here’s some new syntax for your .S files: using a register name inside square brackets (e.g. [r1])
This means interpret the value in r1 as a memory address, and read the 32-bit word at that memory address into r0
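
A rough Python model of what `ldr r0, [r1]` does is sketched below. The tiny 16-byte `sram` array, its contents and the helper name `ldr` are all invented for the illustration, and the bytes are combined in little-endian order, which is an assumption about the board rather than something stated on this slide.

```python
SRAM_BASE = 0x20000000                        # base address used in this sketch
sram = bytearray(16)                          # hypothetical: a tiny 16-byte RAM
sram[0:4] = bytes([0x67, 0x45, 0x23, 0x01])   # made-up contents


def ldr(address: int) -> int:
    """Model of `ldr rd, [rn]`: read the 32-bit word starting at `address`."""
    offset = address - SRAM_BASE
    return int.from_bytes(sram[offset:offset + 4], "little")


r1 = 0x20000000          # the register holds an address...
r0 = ldr(r1)             # ...and the load fetches the word stored there
print(hex(r0))           # 0x1234567
print(sram[0:4].hex())   # 67452301: memory is unchanged by the load
```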

SLIDE 40

remember, memory addresses are just numbers

SLIDE 41

Addresses in immediate values?

Can we specify the memory address in an immediate value?
Yes, but the number of addresses would be limited to what could fit in the instruction encoding (remember, that’s what immediates are!)
But more often you’ll read the address from a register (so you get the full 2^32 possible addresses, but you have to get the address into a register before the ldr instruction)

SLIDE 42

ldr example

What value will be in r0?

    mov r1, 0x20000000    @ put the address in r1
    ldr r0, [r1]          @ load the data into r0

SLIDE 43

Let’s find out

SLIDE 44

Now, with buckets!

SLIDE 45

SLIDE 46

The converter slide

Decimal   Hex            Binary
0         0x 0000 0000   0b 0000 0000 0000 0000 0000 0000 0000 0000

SLIDE 47

Are these valid memory addresses?

  • 0x55
  • 0x5444666
  • 9
  • 0x467ab787e

Answers: yes, yes, yes, no (too big!)
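
The "too big" answer comes down to whether the number fits in 32 bits. A minimal check, assuming "valid" only means "representable as a 32-bit address" (whether anything useful lives at that address is a separate question), might look like this; the function name is made up for the sketch.

```python
def is_valid_address(value: int, bits: int = 32) -> bool:
    """True if `value` fits in an unsigned address of the given width."""
    return 0 <= value < 2**bits


for candidate in (0x55, 0x5444666, 0x467ab787e):
    print(hex(candidate), is_valid_address(candidate))
```

The last candidate has nine hex digits, i.e. more than 32 bits, so it cannot be a 32-bit address.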

SLIDE 48

ARM immediate value encoding

ARM instructions have at most 12 bits of room for immediate values (depending on encoding), but those bits aren’t used to represent every value from 0 to 4095 (2^12 values)
Instead, the encoding uses an 8-bit immediate with a 4-bit rotation; Alistair McDiarmid has a really nice blog post which explains how it works
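
The "8-bit immediate with a 4-bit rotation" idea can be sketched as a small encodability check. The code below models the classic ARM scheme, in which the 8-bit value is rotated right by twice the 4-bit field; as far as I know, the Thumb-2 encoding used on Cortex-M parts is a related but different "modified immediate" scheme, so treat this as an illustration of the idea rather than a description of the discoboard. The function names are invented for the sketch.

```python
MASK32 = 0xFFFFFFFF


def rotate_left(value: int, n: int) -> int:
    """Rotate a 32-bit value left by n bits."""
    n %= 32
    return ((value << n) | (value >> (32 - n))) & MASK32


def is_rotated_imm8(value: int) -> bool:
    """True if value is an 8-bit number rotated right by an even amount (0, 2, ..., 30)."""
    value &= MASK32
    # Undo each possible rotation; if what is left fits in 8 bits, it is encodable.
    return any(rotate_left(value, 2 * rot) < 256 for rot in range(16))


print(is_rotated_imm8(0xFF))         # True
print(is_rotated_imm8(0x20000000))   # True: e.g. 0x02 rotated right by 4
print(is_rotated_imm8(0x101))        # False: the set bits span 9 bits
```

This is why a large-looking constant such as 0x20000000 can be an immediate while a small one such as 0x101 cannot.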

SLIDE 49

Storing to memory

Ok, so we probably want to put some data in memory first
The ARMv7-M has a paired store register instruction (str) for ldr, which takes a value in a register and stores (writes) it to a memory location
Again, the [r1] syntax means “use the value in r1 as the memory address”; this time it’s the address to store the data to

    str r0, [r1]

SLIDE 50

str example

What will the memory at 0x20000000 look like after this?

    mov r0, 42
    mov r1, 0x20000000
    str r0, [r1]
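
One way to predict the answer is to model the store in Python. The 8-byte `sram` array and the helper name `str_word` are hypothetical, and the byte order is assumed to be little-endian.

```python
SRAM_BASE = 0x20000000
sram = bytearray(8)               # hypothetical: 8 bytes of zeroed RAM


def str_word(value: int, address: int) -> None:
    """Model of `str rt, [rn]`: write a 32-bit word at `address` (little-endian assumed)."""
    offset = address - SRAM_BASE
    sram[offset:offset + 4] = (value & 0xFFFFFFFF).to_bytes(4, "little")


r0 = 42
r1 = 0x20000000
str_word(r0, r1)
print(sram.hex(" "))              # 2a 00 00 00 00 00 00 00
```

Under these assumptions the single byte 0x2a (42) lands at the lowest address and the other three bytes of the word are zero.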

SLIDE 51

More livecoding

SLIDE 52

Endianness

The fundamental issue here: memory is byte addressable, but a register can fit 4 bytes
So we can load up to 4 bytes into a register; which order do we “combine” them in?

SLIDE 53

SLIDE 54

Assume that the number stored is 0x01234567

Address         0x00   0x01   0x02   0x03
Big Endian      01     23     45     67
Little Endian   67     45     23     01
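
The two byte orders in the table can be reproduced with Python's built-in integer conversions. This is a sketch for checking the table, not board code.

```python
value = 0x01234567

big = value.to_bytes(4, "big")        # most significant byte at the lowest address
little = value.to_bytes(4, "little")  # least significant byte at the lowest address

print(big.hex(" "))      # 01 23 45 67
print(little.hex(" "))   # 67 45 23 01

# Reading the same four bytes back with the wrong byte order gives a different number.
print(hex(int.from_bytes(little, "big")))   # 0x67452301
```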

SLIDE 55

Why do I need to care?

Because the memory at those addresses might have been:

  • the result of a str operation from your discoboard
  • read from a file created on some other machine
  • received over the network

Little-endian is now more common, but it’s important to know that other options exist

SLIDE 56

you looked at this in the week 2 lab

SLIDE 57

Further reading on endianness

https://betterexplained.com
https://www.embedded.com

SLIDE 58

Load/store halfwords & bytes

Sometimes, you just want to read a byte or a halfword (2 bytes), even though you’ve got a 4 byte register
The instruction set provides additional load/store instructions for this:

    ldrb    @ load a byte from memory into a register
    ldrh    @ load a halfword from memory into a register
    strb    @ store a byte from a register to memory
    strh    @ store a halfword from a register to memory

They work just the same, but they read (or write) fewer bytes of memory; the loads pad the value in the register with zeroes
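
The zero-padding behaviour of the narrower loads can be seen in a small model. The four bytes of `memory` are made-up contents, the helper `load` is invented for this sketch, and little-endian byte order is assumed.

```python
memory = bytes([0x67, 0x45, 0x23, 0x01])   # hypothetical contents


def load(address: int, size: int) -> int:
    """Read `size` bytes at `address`, zero-extended to a 32-bit register value."""
    return int.from_bytes(memory[address:address + size], "little")


print(hex(load(0, 1)))   # like ldrb: 0x67
print(hex(load(0, 2)))   # like ldrh: 0x4567
print(hex(load(0, 4)))   # like ldr:  0x1234567
```

In each case the upper bits of the 32-bit result that were not read from memory are zero.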

SLIDE 59

talk

Are these byte/halfword versions of the instructions necessary? Or could you live without them?

SLIDE 60

the address & the value at that address are different (but they’re both just numbers)

remember the buckets!

SLIDE 61

More load/store options

There are more ways to use these load/store instructions than what we’ve covered here, but we’ll get to them later in the course

SLIDE 62

Questions

SLIDE 63

Memory address space

SLIDE 64

Address space?

SLIDE 65

Address space

The address space is the set of all valid addresses
So on a machine with 32-bit addresses (like your discoboard) that’s 2^32 = 4294967296 different addresses
So you can address about 4GB of memory (is that a lot?)

SLIDE 66

SLIDE 67

A memory address is just a number

SLIDE 68

Not all memory is the same

You can see from the diagram on the previous slide: the address space is divided into “chunks”
Some parts look like “memory” as we’ve been talking about so far (e.g. SRAM, External RAM) but some parts don’t (e.g. Peripherals)
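
As a rough illustration of those “chunks”, the sketch below looks up which region an address falls in. The region boundaries are the default ARMv7-M memory map as I recall it; they are not taken from the slide, so check them against the reference manual before relying on them. The name `region_of` and the lumping of the top region into one "System" entry are simplifications made here.

```python
# (start address, region name); each region runs up to the next start address.
REGIONS = [
    (0x00000000, "Code"),
    (0x20000000, "SRAM"),
    (0x40000000, "Peripheral"),
    (0x60000000, "External RAM"),
    (0xA0000000, "External device"),
    (0xE0000000, "System"),
]


def region_of(address: int) -> str:
    """Return the name of the region containing a 32-bit address."""
    if not 0 <= address <= 0xFFFFFFFF:
        raise ValueError("not a 32-bit address")
    name = REGIONS[0][1]
    for start, region in REGIONS:
        if address >= start:
            name = region
    return name


print(region_of(0x20000000))   # SRAM
print(region_of(0x080001C8))   # Code
print(region_of(0x40000000))   # Peripheral
```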

SLIDE 69

The load/store architecture

What if everything the CPU did in interacting with the outside world was treated like a load or a store to a memory address?

  • loading & storing data to RAM
  • configuring the various peripherals on the board
  • blinking the LED
  • firing the missiles!

This is the idea behind the load/store architecture, and it’s the model your discoboard CPU uses

SLIDE 70

Recap: reading memory diagrams

You’ll see “Memory diagrams” (picture representations of data in memory, or at least in the Cortex address space)
Look for the addresses: which direction are they ascending/descending?
Remember that the spatial layout can be misleading!

SLIDE 71

SLIDE 72

Not all memory is “data”

Some of it is:

  • readable
  • writable
  • executable
  • connected to external peripherals (which could still be r, w, x or some combination)

This is a consequence of the load/store model: we treat everything like memory, because it makes the CPU simpler

SLIDE 73

STM32L476VG memory map

The discoboard conforms to that Cortex M memory map (since it’s a Cortex M CPU)
But even within those memory ranges the addresses of specific peripherals (e.g. timers, GPIO, LCD, audio codec) are unique to this particular model of discoboard
To find out more, you need the discoboard reference manual
This is one of the main differences between different microcontrollers: the memory maps they use for peripherals

SLIDE 74

Code in memory

You probably noticed the Code section at the bottom (i.e. the lower memory addresses) of the address space/memory map diagram
That’s where the encoded instructions are: this is sometimes called the instruction stream
Each instruction has a memory address
That’s where the fetch-decode-execute cycle fetches from (based on the address in the pc register)

SLIDE 75

Labels and branching

SLIDE 76

Labels: addresses for humans

All these 32-bit numbers are fine for the discoboard, but not so good for humans
Labels provide a way to (temporarily) give a name to a memory address
You’ve seen labels already: main is one! Any word followed by a colon (:) in your assembly code is a label

SLIDE 77

Label gotchas

  • a label is not an instruction, it doesn’t get encoded, it’s not in memory
  • by default, labels aren’t “visible” outside the source file
  • the label points to the address of the next instruction (whether it’s on the same line or a new line)
  • only certain characters are allowed in label names

    @ these two are the same
    label1: mov r0, 5

    label1:
        mov r0, 5

SLIDE 78

SLIDE 79

Branch: select the next instruction to execute

Remember the “Adele problem”? How do we go back to the chorus after the second verse?
The answer: change the value in the program counter (pc) to “jump back” to an earlier instruction
To do this, use a b (branch) instruction, e.g.

    b 0x80001c8

SLIDE 80

Branch? Why not jump?

SLIDE 81

but where to branch to?

SLIDE 82

Labels in the instruction stream

You don’t want to have to figure out the address of the instruction “by hand” and move it into the pc
So we use labels in the assembly code to keep track of the addresses of specific instructions
And there’s a b (branch) instruction to tell your discoboard to make the jump to that instruction

SLIDE 83

Branch & labels example

    main:
        mov r0, 0

        @ infinite loop - r0 will overflow eventually
    loop:
        add r0, 1
        b loop

SLIDE 84

Branches & labels are best friends :)

SLIDE 85

Conditional branch

    b<c> <label>

The <c> suffix tells us that the branch instruction knows about the condition flags, i.e. NZCV
This is huge.

SLIDE 86

Conditional branch examples

    beq <label>    @ branch if Z = 1
    bne <label>    @ branch if Z = 0
    bcs <label>    @ branch if C = 1
    bcc <label>    @ branch if C = 0
    bmi <label>    @ branch if N = 1
    bpl <label>    @ branch if N = 0
    bvs <label>    @ branch if V = 1
    bvc <label>    @ branch if V = 0

See the back of the cheat sheet for the full list
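
The table above is small enough to write down as data. The Python sketch below maps each condition suffix to the flag test listed for it; the dictionary, the `branch_taken` helper and the example flag values are all constructed for this illustration.

```python
# Condition suffix -> test on the flags, following the table above.
CONDITIONS = {
    "eq": lambda f: f["Z"] == 1,
    "ne": lambda f: f["Z"] == 0,
    "cs": lambda f: f["C"] == 1,
    "cc": lambda f: f["C"] == 0,
    "mi": lambda f: f["N"] == 1,
    "pl": lambda f: f["N"] == 0,
    "vs": lambda f: f["V"] == 1,
    "vc": lambda f: f["V"] == 0,
}


def branch_taken(suffix: str, flags: dict) -> bool:
    """Would `b<suffix>` branch, given these NZCV flag values?"""
    return CONDITIONS[suffix](flags)


# Flags as left by comparing two equal values (result zero, no borrow, no overflow).
flags = {"N": 0, "Z": 1, "C": 1, "V": 0}
print(branch_taken("eq", flags))   # True
print(branch_taken("ne", flags))   # False
```

The branch instruction itself does no comparison: it only looks at flags that an earlier instruction (e.g. cmp) has already set.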

SLIDE 87

talk

Does your discoboard need to know about the labels? Where is that information stored?

SLIDE 88

Quiz

Given a number n in r1, calculate the sum 1+2+3+...+n and store the result in r0

Translate the following ‘C’ code into assembly. Assume that r1 contains the value of x

    if (x > 5)
        x = 5;
    else
        x = 0;

SLIDE 89

Labels: just for humans

When you build your program, the linker program:

  1. figures out what exact memory addresses the labels refer to, and
  2. swaps all the label names for the 32-bit address values the discoboard understands

    Compiling .pioenvs/disco_l476vg/src/main.S.o
    Linking .pioenvs/disco_l476vg/firmware.elf
    Calculating size .pioenvs/disco_l476vg/firmware.elf
    text    data    bss     dec     hex     filename
    792     1080    1600    3472    d90     .pioenvs/disco_l476vg/firmware.elf
    Building .pioenvs/disco_l476vg/firmware.bin

SLIDE 90

Your CPU never knows about the labels

The linker replaces them all with addresses before you create the binary file (e.g. firmware.bin) which is uploaded to your discoboard
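
The two steps (work out the addresses, then swap the names for numbers) can be sketched as a toy two-pass resolver. This is a deliberately simplified model, not how the real toolchain works: the base address and the fixed 4-byte instruction size are assumptions made for the illustration (Thumb-2 encodings are 2 or 4 bytes), and real branch instructions encode an offset relative to the pc rather than an absolute address.

```python
BASE = 0x08000000          # hypothetical start address of the code
INSTRUCTION_SIZE = 4       # simplification: assume every instruction is 4 bytes

source = [
    "main:",
    "mov r0, 0",
    "loop:",
    "add r0, 1",
    "b loop",
]

# Pass 1: a label names the address of the next instruction and takes no space.
labels, instructions = {}, []
for line in source:
    if line.endswith(":"):
        labels[line[:-1]] = BASE + INSTRUCTION_SIZE * len(instructions)
    else:
        instructions.append(line)

# Pass 2: swap label names in branches for the addresses they stand for.
resolved = []
for text in instructions:
    op, _, operand = text.partition(" ")
    if op == "b" and operand in labels:
        text = f"b {labels[operand]:#x}"
    resolved.append(text)

print(labels)      # {'main': 134217728, 'loop': 134217732}
print(resolved)    # ['mov r0, 0', 'add r0, 1', 'b 0x8000004']
```

After the second pass no label names remain in the instruction list, which mirrors the point above: only addresses reach the board.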

SLIDE 91

Memory segmentation

The other thing that the linker does is to make sure that the various parts of your program get put in the right part of the address space:

  • make sure secret data isn’t readable
  • make sure code/instructions aren’t writable
  • make sure “storage” memory isn’t executable

This is a good thing™
It’s all controlled by the linker file (lib/bare_stm32l476/ldscripts/STM32L476VG_FLASH.ld)

SLIDE 92

Questions?

SLIDE 93

next lecture