CS356 Unit 4 x86 Instruction Set

4.1 CS356 Unit 4 x86 Instruction Set

4.2 Why Learn Assembly
• Understand hardware limitations
• Understand performance
• Use HW options that high-level languages don't allow (e.g., operating systems, utilizing special HW features, etc.)
• Understand security vulnerabilities
• Can help debugging

4.3 Compiling and Disassembling (CS:APP 3.2.2)
Original Code:
    void abs_value(int x, int *res) {
        if (x < 0) {
            *res = -x;
        } else {
            *res = x;
        }
    }
• From C to assembly code
    $ gcc -Og -c -S file1.c
• Looking at binary files
    $ gcc -Og -c file1.c
    $ hexdump -C file1.o
• From binary to assembly
    $ gcc -Og -c file1.c
    $ objdump -d file1.o
Compiler Output (Machine code & Assembly):
    Disassembly of section .text:
    0000000000000000 <abs_value>:
       0: 85 ff    test %edi,%edi
       2: 78 03    js   7              “if(x<0) goto 7”
       4: 89 3e    mov  %edi,(%rsi)
       6: c3       retq
       7: f7 df    neg  %edi
       9: 89 3e    mov  %edi,(%rsi)
       b: c3       retq
Notice how each instruction is turned into binary (shown in hex).
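To make the mapping between the C source and the disassembly easier to follow, here is a minimal sketch that rewrites abs_value with the same control flow as the machine code. The name abs_value_goto is hypothetical and not part of the slides; the comments pair each C statement with the instruction it corresponds to in the listing above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not in the slides): abs_value written so that its
   control flow mirrors the disassembly one instruction at a time. */
void abs_value_goto(int x, int *res) {
    assert(res != NULL);
    if (x < 0) goto negate;   /* test %edi,%edi ; js 7 */
    *res = x;                 /* mov %edi,(%rsi)       */
    return;                   /* retq                  */
negate:
    x = -x;                   /* neg %edi              */
    *res = x;                 /* mov %edi,(%rsi)       */
    return;                   /* retq                  */
}
```

Calling abs_value_goto(-5, &r) stores 5 in r, the same result the original abs_value produces.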

4.4 Basic Computer Organization Check the recorded lecture

4.5 x86-64 Memory Organization
• Because each byte of memory has its own address we can picture memory as one column of bytes (Fig. 2)
• With a 64-bit logical data bus we can access up to 8 bytes of data at a time
• We will usually show memory arranged in rows of 4 bytes (Fig. 3) or 8 bytes
  – Still with separate addresses for each byte
Recall variables live in memory & need to be loaded into the processor to be used:
    int x, y=5, z=8;
    x = y+z;
Figure: [processor connected to memory by an address bus (A, 40 bits) and a data bus (D, 64 bits)]
Fig. 2, Logical Byte-Oriented View of Mem.:
    Address     Byte
    0x000002    F8
    0x000001    13
    0x000000    5A
Fig. 3, Logical DWord-Oriented View (byte offsets within each row are 3 2 1 0):
    Address     Bytes
    0x000008    8E AD 33 29
    0x000004    8E AD 33 29
    0x000000    7C F8 13 5A

4.6 Memory & Word Size (CS:APP 3.9.3)
• To refer to a chunk of memory we must provide:
  – The starting address
  – The size: byte (B), word (W), double word (L), quad word (Q)
• There are rules for valid starting addresses
  – A valid starting address should be a multiple of the data size
  – Words (2-byte chunks) must start on an even (divisible by 2) address
  – Double words (4-byte chunks) must start on an address that is a multiple of (divisible by) 4
  – Quad words (8-byte chunks) must start on an address that is a multiple of (divisible by) 8
Figure: [bytes at addresses 0x4000 to 0x4007 grouped by size]
    Size     Valid starting addresses in 0x4000..0x4007
    Byte     0x4000, 0x4001, 0x4002, 0x4003, 0x4004, 0x4005, 0x4006, 0x4007
    Word     0x4000, 0x4002, 0x4004, 0x4006
    DWord    0x4000, 0x4004
    QWord    0x4000
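The alignment rule above boils down to one remainder check. The following is a small sketch under that reading; is_aligned is a hypothetical helper name, not something defined in the slides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when addr is a valid (aligned) starting address
   for an access of size_bytes bytes (1, 2, 4 or 8 in the slide). */
bool is_aligned(uint64_t addr, unsigned size_bytes) {
    assert(size_bytes != 0);          /* a zero-byte access is meaningless */
    return addr % size_bytes == 0;    /* address must be a multiple of the size */
}
```

For example, 0x4004 is a valid start for a double word (4 bytes) but not for a quad word (8 bytes), matching the figure.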

4.7 Endian-ness (CS:APP 2.1.3)
• Endian-ness refers to the two alternate methods of ordering the bytes in a larger unit (2, 4, 8 bytes)
  – Big-Endian
    • PPC, Sparc, TCP/IP
    • MS byte is put at the starting address
  – Little-Endian
    • used by Intel processors / original PCI bus
    • LS byte is put at the starting address
• Some processors (like ARM) and busses can be configured for either big- or little-endian
The DWORD value 0x12345678 can be stored differently:
    Address    Big-Endian    Little-Endian
    0x00       12            78
    0x01       34            56
    0x02       56            34
    0x03       78            12
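The two byte orders for the DWORD 0x12345678 can be reproduced with a few shifts. This is an illustrative sketch only: store_be32, store_le32 and host_is_little_endian are hypothetical names, and the array index plays the role of the address offset 0x00 to 0x03.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Big-endian: the most significant byte goes at the starting address out[0]. */
void store_be32(uint32_t value, uint8_t out[4]) {
    assert(out != NULL);
    out[0] = (uint8_t)(value >> 24);
    out[1] = (uint8_t)(value >> 16);
    out[2] = (uint8_t)(value >> 8);
    out[3] = (uint8_t)value;
}

/* Little-endian: the least significant byte goes at the starting address out[0]. */
void store_le32(uint32_t value, uint8_t out[4]) {
    assert(out != NULL);
    out[0] = (uint8_t)value;
    out[1] = (uint8_t)(value >> 8);
    out[2] = (uint8_t)(value >> 16);
    out[3] = (uint8_t)(value >> 24);
}

/* Returns 1 when the machine running this code stores the LS byte first. */
int host_is_little_endian(void) {
    uint32_t probe = 1;
    return *(const uint8_t *)&probe == 1;
}
```

On an Intel machine host_is_little_endian() is expected to return 1, in line with the slide; the two store functions give the same bytes on any host because they never reinterpret memory.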

4.8 Big-endian vs. Little-endian
• Big-endian
  – makes sense if you view your memory as starting at the top-left and addresses increasing as you go down
• Little-endian
  – makes sense if you view your memory as starting at the bottom-right and addresses increasing as you go up
Figure: [two memory maps covering addresses 000000 to 000014, each holding the DWORD 12345678 at address 000000. Big-endian map: addresses increasing downward, byte columns 0 1 2 3 from left to right, so 12 is Byte 0, 34 is Byte 1, 56 is Byte 2 and 78 is Byte 3. Little-endian map: addresses increasing upward, byte columns 3 2 1 0 from left to right, so 12 is Byte 3, 34 is Byte 2, 56 is Byte 1 and 78 is Byte 0.]

4.9 Big-endian vs. Little-endian Issues
• Issues arise when transferring data between different systems
  – Byte-wise copy of data from big-endian system to little-endian system
  – Major issue in networks (little-endian computer => big-endian computer) and even within a single computer (system memory => I/O device)
• Intel is LITTLE-ENDIAN
Figure: [byte-wise copy from a big-endian system to a little-endian system: copy byte 0 to byte 0, byte 1 to byte 1, etc. The big-endian system holds the DWORD 12345678 at address 0 (Byte 0 = 12, Byte 1 = 34, Byte 2 = 56, Byte 3 = 78). After the copy, the little-endian system reads the DWORD at address 0 as 78563412, which is wrong: the DWORD @ 0 in the big-endian system is now different from the DWORD @ 0 in the little-endian system.]
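The mismatch in the figure, and one common way to repair it, can be sketched as follows. The helper names load_be32, load_le32 and swap_bytes32 are hypothetical; the slides do not prescribe a particular fix, so treat the byte swap as one possible approach rather than the course's method.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Interpret 4 consecutive bytes the way a big-endian system would. */
uint32_t load_be32(const uint8_t *p) {
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Interpret the same 4 bytes the way a little-endian system would. */
uint32_t load_le32(const uint8_t *p) {
    assert(p != NULL);
    return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
           ((uint32_t)p[1] << 8)  |  (uint32_t)p[0];
}

/* Reverse the byte order of a 32-bit value. */
uint32_t swap_bytes32(uint32_t v) {
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}
```

With the bytes 12 34 56 78 copied one by one, load_be32 yields 0x12345678 while load_le32 yields 0x78563412; applying swap_bytes32 to the latter recovers the intended value.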

4.10 x86-64 ASSEMBLY

4.11 x86-64 Data Sizes (CS:APP 3.3)
• Integer: 4 sizes
  – Byte (B): 8 bits = 1 byte
  – Word (W): 16 bits = 2 bytes
  – Double Word (L): 32 bits = 4 bytes
  – Quad Word (Q): 64 bits = 8 bytes
• Floating Point: 2 sizes
  – Single (S): 32 bits = 4 bytes
  – Double (D): 64 bits = 8 bytes
    • (For a 32-bit data bus, a double would be accessed from memory in 2 reads)
In x86-64, an instruction generally specifies the size of the data it accesses from memory and operates on.
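As a quick cross-check of the size list, the sketch below ties each integer suffix to a byte count and to the matching fixed-width C type. suffix_size is a hypothetical helper, and the float/double checks assume a typical x86-64 C compiler where those types are the 4-byte and 8-byte formats listed above.

```c
#include <assert.h>
#include <stdint.h>

/* Compile-time checks that the C types have the sizes listed on the slide.
   The float/double lines are an assumption about the target platform. */
_Static_assert(sizeof(int8_t)  == 1, "byte (b) is 1 byte");
_Static_assert(sizeof(int16_t) == 2, "word (w) is 2 bytes");
_Static_assert(sizeof(int32_t) == 4, "double word (l) is 4 bytes");
_Static_assert(sizeof(int64_t) == 8, "quad word (q) is 8 bytes");
_Static_assert(sizeof(float)   == 4, "single (s) is 4 bytes");
_Static_assert(sizeof(double)  == 8, "double (d) is 8 bytes");

/* Hypothetical helper: number of bytes accessed for an integer size suffix,
   or 0 if the suffix is not one of b, w, l, q. */
unsigned suffix_size(char suffix) {
    switch (suffix) {
    case 'b': return 1;
    case 'w': return 2;
    case 'l': return 4;
    case 'q': return 8;
    default:  return 0;
    }
}
```

So the l in movl means a 4-byte access and the q in movq means an 8-byte access.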

4.12 x86-64 Register Names (CS:APP 3.4)
Size suffixes: b (1 byte), w (2 bytes), l (4 bytes), q (8 bytes)
    w (2 bytes)    l (4 bytes)    q (8 bytes)    Name
    %ax            %eax           %rax           accumulate
    %bx            %ebx           %rbx           base
    %cx            %ecx           %rcx           counter
    %dx            %edx           %rdx           data
    %si            %esi           %rsi           source index
    %di            %edi           %rdi           destination index
    %sp            %esp           %rsp           stack pointer
    %bp            %ebp           %rbp           base pointer
• In addition: %al, %bl, %cl, %dl, %sil, %dil, %spl, %bpl for least significant byte
• In addition: %r8 to %r15 (%r8d / %r8w / %r8b for lower 4 / 2 / 1 bytes)

4.13 Intel x86 Register Set
• 8-bit processors in late 1970s
  – 4 registers for integer data: A, B, C, D
  – 4 registers for address/pointers: SP (stack pointer), BP (base pointer), SI (source index), DI (dest. index)
• 16-bit processors extended registers to 16-bits but continued to support 8-bit access!
  – Use prefix/suffix to indicate size:
    • AL referenced the lower 8-bits of register A
    • AH the higher 8-bits of register A
    • AX referenced the full 16-bit value
• 32-/64-bit processors (see next slide)
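The AL / AH / AX relationship can be modelled with masks and shifts on a 16-bit integer. This is only an illustration of the naming scheme: the function names are hypothetical, and a plain uint16_t stands in for register A.

```c
#include <assert.h>
#include <stdint.h>

/* AL: the lower 8 bits of the 16-bit value AX. */
uint8_t reg_al(uint16_t ax) { return (uint8_t)(ax & 0xFF); }

/* AH: the higher 8 bits of the 16-bit value AX. */
uint8_t reg_ah(uint16_t ax) { return (uint8_t)(ax >> 8); }

/* Writing AL replaces only the low byte; AH is preserved. */
uint16_t write_al(uint16_t ax, uint8_t v) {
    return (uint16_t)((ax & 0xFF00u) | v);
}

/* Writing AH replaces only the high byte; AL is preserved. */
uint16_t write_ah(uint16_t ax, uint8_t v) {
    return (uint16_t)((ax & 0x00FFu) | ((uint16_t)v << 8));
}
```

With AX = 0x1234, AL is 0x34 and AH is 0x12; writing 0xAB to AL gives AX = 0x12AB.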

4.14 x86-64 Instruction Classes
• Data Transfer (mov instruction)
  – Moves data between registers, or between registers and memory (One operand must be a processor register.)
  – Specifies size via a suffix on the instruction (movb, movw, movl, movq)
• ALU Operations
  – One operand must be a processor register or a constant
  – Size and operation specified by instruction (addl, orq, andb, subw)
• Control / Program Flow
  – Unconditional/Conditional Branch (cmpq, jmp, je, jne, jl, jge)
  – Subroutine Calls (call, ret)
• Privileged / System Instructions
  – Instructions that can only be used by OS or other “supervisor” software (e.g. int to access certain OS capabilities, etc.)

4.15 Operand Locations
• Source operands must be in one of the following 3 locations:
  – A register value (e.g. %rax)
  – A value in a memory location (e.g. value at address 0x0200e8)
  – A constant stored in the instruction itself (known as “immediate value”)
    • The $ indicates the constant/immediate
• Destination operands must be
  – A register
  – A memory location (specified by its address, e.g. 0x0200e8)
Figure: [processor (registers and ALU) connected to memory by address (A) and data (D) buses; memory holds instructions at addresses 400, 401, ... followed by data; example instruction: add $1,d0]

4.16 DATA TRANSFER INSTRUCTIONS

4.17 mov Instruction & Data Size (CS:APP 3.4.2)
• Moves data between memory and processor register
• Always provide the LS-Byte address (little-endian) of the desired data
• Size is explicitly defined by the instruction suffix ('mov[bwlq]') used
• Recall: Start address should be divisible by size of access
Figure: [assume start address = A; memory holds fedc ba98 at A and 7654 3210 at A+4; the processor register is 64 bits wide, bits 63..0]
    Instruction    Size          Register bits written    Behavior
    movb           Byte          7..0                     Byte operations only access the 1 byte at the specified address; movb leaves upper bits unaffected
    movw           Word          15..0                    Word operations access the 2 bytes starting at the specified address; movw leaves upper bits unaffected
    movl           Double Word   31..0                    Double word operations access the 4 bytes starting at the specified address; movl zeros the upper bits (63..32)
    movq           Quad Word     63..0                    Quad word operations access the 8 bytes starting at the specified address

4.18 Mem/Register Transfer Examples
• mov[b,w,l,q] src, dst
• Initial Conditions:
    Memory / RAM:        0x00204: 7654 3210
                         0x00200: fedc ba98
    Processor Register:  rax = ffff ffff 1234 5678
Treat these instructions as a sequence where one affects the next.
    Instruction          Result
    movq 0x200, %rax     rax = 7654 3210 fedc ba98
    movl 0x204, %eax     rax = 0000 0000 7654 3210   (movl zeros the upper bits of dest. reg)
    movw 0x202, %ax      rax = 0000 0000 7654 fedc
    movb 0x207, %al      rax = 0000 0000 7654 fe76
    movb %al, 0x4e5      0x004e4: 0000 7600
                         0x004e0: 0000 0000
    movl %eax, 0x4e0     0x004e4: 0000 7600
                         0x004e0: 7654 fe76          (movl changes only 4 bytes here)
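The sequence above can be replayed in software to check each intermediate value. The sketch below is a simplified model, not real hardware behavior in every detail: memory is a small byte array, rax is a uint64_t, and the helpers load, store, reg_write and run_slide_sequence are hypothetical names. It assumes the bytes at 0x4e0 to 0x4e7 start out as zero, as the slide shows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MEM_SIZE 0x500   /* hypothetical tiny memory covering 0x000..0x4ff */

/* Little-endian load of n bytes (1, 2, 4 or 8) starting at addr. */
uint64_t load(const uint8_t *mem, size_t addr, unsigned n) {
    assert(n <= 8 && addr + n <= MEM_SIZE);
    uint64_t v = 0;
    for (unsigned i = 0; i < n; i++)
        v |= (uint64_t)mem[addr + i] << (8 * i);
    return v;
}

/* Little-endian store of the low n bytes of v starting at addr. */
void store(uint8_t *mem, size_t addr, unsigned n, uint64_t v) {
    assert(n <= 8 && addr + n <= MEM_SIZE);
    for (unsigned i = 0; i < n; i++)
        mem[addr + i] = (uint8_t)(v >> (8 * i));
}

/* Register write of n bytes: 1- and 2-byte writes keep the upper bits,
   a 4-byte write zeros bits 63..32, an 8-byte write replaces everything. */
uint64_t reg_write(uint64_t reg, unsigned n, uint64_t v) {
    switch (n) {
    case 1:  return (reg & ~0xFFull) | (v & 0xFF);
    case 2:  return (reg & ~0xFFFFull) | (v & 0xFFFF);
    case 4:  return v & 0xFFFFFFFFull;
    default: return v;
    }
}

/* Sets up the slide's initial conditions, runs the six instructions in
   order, and returns the final value of rax. */
uint64_t run_slide_sequence(uint8_t *mem) {
    store(mem, 0x200, 4, 0xfedcba98u);
    store(mem, 0x204, 4, 0x76543210u);
    store(mem, 0x4e0, 8, 0);
    uint64_t rax = 0xffffffff12345678ull;
    rax = reg_write(rax, 8, load(mem, 0x200, 8));  /* movq 0x200, %rax */
    rax = reg_write(rax, 4, load(mem, 0x204, 4));  /* movl 0x204, %eax */
    rax = reg_write(rax, 2, load(mem, 0x202, 2));  /* movw 0x202, %ax  */
    rax = reg_write(rax, 1, load(mem, 0x207, 1));  /* movb 0x207, %al  */
    store(mem, 0x4e5, 1, rax);                     /* movb %al, 0x4e5  */
    store(mem, 0x4e0, 4, rax);                     /* movl %eax, 0x4e0 */
    return rax;
}
```

Running run_slide_sequence on a zeroed MEM_SIZE-byte array leaves rax at 0x000000007654fe76, with 0x7654fe76 stored at 0x4e0 and 0x00007600 at 0x4e4, matching the last rows of the slide.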
