Resources: Patt and Patel PPTs
Audience: WASE 2006 batch
Chapter 5 The LC-3
Instruction Set Architecture
ISA = All of the programmer-visible components and operations of the computer
• memory organization
Ø address space -- how many locations can be addressed?
Ø addressability -- how many bits per location?
• register set
Ø how many? what size? how are they used?
• instruction set
Ø opcodes
Ø data types
Ø addressing modes
The ISA provides all the information needed by someone who wants to write a program in machine language (or translate from a high-level language to machine language).
Instruction Set
Executing instructions is the basic operation of the CPU.
The instruction set defines the functional requirements for the CPU.
Through the instruction set a programmer becomes aware of the registers, the memory organization, the data types, and the functions of the ALU.
An instruction consists of an opcode and operands. The opcode specifies the operation to be performed.
The collection of all instructions is the instruction set.
Instruction Set
Opcodes use mnemonics like Add, Subtract.
A typical instruction format:
4 bits   6 bits   6 bits
opcode   oper1    oper2
1 bit   2 bits   1 bit      3 bits
sign    opcode   register   memory
Instruction format
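8 bits (first byte): opcode (6 bits), d (1 bit), w (1 bit)
8 bits (second byte): mod (2 bits), reg (3 bits), r/m (3 bits)
d = 0: the register in the reg field is the source; d = 1: it is the destination
w = 0: 8-bit operand; w = 1: 16-bit operand
mod (00, 01, 10, 11) together with r/m specifies the addressing mode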
Instruction format
MOV AX, BX
OPCODE  D  W  MOD  REG  R/M
100010  1  1  11   000  011  = 8BC3H
D = 1: the REG field is the destination. W = 1: the operand length is 16 bits. MOD = 11: register addressing, so the second operand is a register. REG = 000: AX. R/M = 011: BX.
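The bit packing above can be checked with a short sketch. This is only an illustration, not an assembler: the function name `encode_mov_reg_reg` is made up for this example, and the register-code table is the standard 8086 numbering for 16-bit registers (w = 1).

```python
# Register codes used in the REG and R/M fields when w = 1 (16-bit registers).
REG16 = {"AX": 0b000, "CX": 0b001, "DX": 0b010, "BX": 0b011,
         "SP": 0b100, "BP": 0b101, "SI": 0b110, "DI": 0b111}


def encode_mov_reg_reg(dst: str, src: str) -> int:
    """Encode MOV dst, src for two 16-bit registers, using d = 1 (REG is the destination)."""
    opcode, d, w, mod = 0b100010, 1, 1, 0b11
    byte1 = (opcode << 2) | (d << 1) | w                  # OPCODE D W
    byte2 = (mod << 6) | (REG16[dst] << 3) | REG16[src]   # MOD REG R/M
    return (byte1 << 8) | byte2


print(f"{encode_mov_reg_reg('AX', 'BX'):04X}H")  # 8BC3H
```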
Instruction types
The number of addresses an instruction needs varies. In the most general case an instruction names a source, a destination, a result, and the next instruction, so four (4) addresses may be required.
If the address of the next instruction is implied, three (3) addresses are required.
If the result is stored in the destination, two (2) addresses are required.
If an accumulator is used, one (1) address is required.
If a stack is used, zero (0) addresses are required.
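To make the zero-address case concrete, here is a minimal sketch of a stack machine in Python. The operation names (PUSH, POP, ADD, MUL), the function `run_stack_program`, and the sample expression Y = (A + B) * C are all assumptions chosen for illustration. Only PUSH and POP name a memory location; the arithmetic operations take their operands from the stack and so carry no address at all.

```python
def run_stack_program(program, memory):
    """Run a tiny zero-address program; arithmetic operands come from the stack."""
    stack = []
    for op, *arg in program:
        if op == "PUSH":                 # copy a memory value onto the stack
            stack.append(memory[arg[0]])
        elif op == "POP":                # store the top of the stack to memory
            memory[arg[0]] = stack.pop()
        elif op == "ADD":                # no address field: uses the top two entries
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown operation {op}")
    return memory


mem = {"A": 2, "B": 3, "C": 4, "Y": 0}
prog = [("PUSH", "A"), ("PUSH", "B"), ("ADD",),
        ("PUSH", "C"), ("MUL",), ("POP", "Y")]
run_stack_program(prog, mem)
print(mem["Y"])  # 20
```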
Types
Types of operands: addresses, numbers, characters, logical data
Types of operations:
A. Data Transfer: Move, load, store
B. Arithmetic: Add, Subtract
C. Logical: AND, OR, NOT
D. Transfer of Control: JUMP, RET, HALT
E. I/O: Input
F. Conversion: Convert
Addressing Modes
•Immediate
•Direct
•Indirect
•Register
•Register Indirect
•Displacement
•Stack
An addressing mode specifies how to calculate the effective memory address of an operand by using information held in registers and/or constants contained within a machine instruction or elsewhere.
Addressing Modes
a. Immediate
add reg1, reg2, constant ; reg1 := reg2 + constant
MOV AX, 0005H
The operand is present in the instruction itself. Immediate addressing is used to define constants or to set initial values of variables.
Addressing modes
b. Direct
load reg, address
MOV AX, [5000H] ; 5000H is the effective address
The disadvantage is a limited address space: the address must fit in the address field of the instruction.
Addressing Modes
c. Indirect
MOV AX, [BX]
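Indirect addressing is how pointers are implemented: the instruction names a location that holds the address of the operand.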
d. Register
MOV BX, AX
e. Register indirect: MOV AX, [BX]
f. Displacement: MOV AX, 50H[BX]
g. Stack: the operand is implicitly on top of the stack, as in PUSH AX and POP AX
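The modes listed above differ only in how the operand is located. The sketch below models that in Python under some loud simplifications: registers and memory are plain dictionaries, segment registers are ignored, and the function name `fetch_operand` and the mode strings are invented for this example.

```python
def fetch_operand(mode, field, regs, mem):
    """Return the operand selected by an addressing mode (simplified model)."""
    if mode == "immediate":          # MOV AX, 0005H : the field is the operand
        return field
    if mode == "direct":             # MOV AX, [5000H] : the field is the effective address
        return mem[field]
    if mode == "indirect":           # the field addresses a location holding the address
        return mem[mem[field]]
    if mode == "register":           # MOV BX, AX : the field names a register
        return regs[field]
    if mode == "register_indirect":  # MOV AX, [BX] : the register holds the address
        return mem[regs[field]]
    if mode == "displacement":       # MOV AX, 50H[BX] : address = register + constant
        reg, disp = field
        return mem[regs[reg] + disp]
    raise ValueError(f"unsupported mode {mode}")


regs = {"AX": 0x1234, "BX": 0x5000}
mem = {0x5000: 0xAAAA, 0x5050: 0xBBBB, 0x6000: 0x5000}

print(hex(fetch_operand("register_indirect", "BX", regs, mem)))      # 0xaaaa
print(hex(fetch_operand("displacement", ("BX", 0x50), regs, mem)))   # 0xbbbb
```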
Instruction length
8 bit 16 bit 32 bit 64 bit
Programmers want more addresses and more operands, which call for longer instructions.
But a 64-bit instruction wastes memory when a shorter one would do.
Opcodes can all be the same length, or they can vary in length.
Instruction Formats
MOV AX, 25H
1011 W REG 16-BIT-DATA
1011 1 000 0025H = B8 0025H (stored in memory as the bytes B8 25 00, low byte of the data first)
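As a rough check of this format, the following sketch packs the fields and emits the bytes in memory order. The function name `encode_mov_reg_imm16` is hypothetical; the register code 000 for AX is taken from the example above.

```python
def encode_mov_reg_imm16(reg_code: int, value: int) -> bytes:
    """Encode MOV reg16, imm16 using the format 1011 W REG followed by 16-bit data."""
    first = (0b1011 << 4) | (1 << 3) | reg_code      # W = 1 selects a 16-bit register
    low, high = value & 0xFF, (value >> 8) & 0xFF    # the 8086 stores the low byte first
    return bytes([first, low, high])


encoded = encode_mov_reg_imm16(0b000, 0x0025)        # MOV AX, 25H
print(" ".join(f"{b:02X}" for b in encoded))         # B8 25 00
```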
LC-3 Overview: Memory and Registers
Memory
• address space: 2^16 locations (16-bit addresses), i.e. 65,536 addresses
• addressability: 16 bits
Registers
• temporary storage, accessed in a single machine cycle
Ø accessing memory generally takes longer than a single cycle
• eight general-purpose registers: R0 - R7
Ø each 16 bits wide
Ø how many bits to uniquely identify a register?
• other registers
Ø not directly addressable, but used by (and affected by) instructions
Ø PC (program counter), condition codes
LC-3 Overview: Instruction Set
Opcodes
• 15 opcodes
• Operate instructions: ADD, AND, NOT
• Data movement instructions: LD, LDI, LDR, LEA, ST, STR, STI
• Control instructions: BR, JSR/JSRR, JMP, RTI, TRAP
• some opcodes set/clear condition codes, based on result:
Ø N = negative, Z = zero, P = positive (> 0)
Data Types
• 16-bit 2’s complement integer
Addressing Modes
• How is the location of an operand specified?
• non-memory addresses: immediate, register
• memory addresses: PC-relative, indirect, base+offset
Operate Instructions
Only three operations: ADD, AND, NOT
Source and destination operands are registers
• These instructions do not reference memory.
• ADD and AND can use “immediate” mode, where one operand is hard-wired into the instruction.
Will show dataflow diagram with each instruction.
• illustrates when and where data moves to accomplish the desired operation
NOT (Register)
ADD/AND (Register)
ADD/AND (Immediate)
Using Operate Instructions
With only ADD, AND, NOT…
• How do we subtract?
• How do we OR?
• How do we copy from one register to another?
• How do we initialize a register to zero?
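One possible set of answers to the questions above, modeled in Python on 16-bit values. The helper names (`add`, `and_`, `not_`, `subtract`, `or_`, `copy`, `clear`) are invented for this sketch; each comment shows an LC-3 sequence that would have the same effect, with register choices that are only illustrative.

```python
MASK = 0xFFFF  # LC-3 registers are 16 bits wide


def add(a, b):
    return (a + b) & MASK


def and_(a, b):
    return a & b & MASK


def not_(a):
    return ~a & MASK


def subtract(a, b):
    # NOT R2,R2 ; ADD R2,R2,#1 ; ADD R3,R1,R2   (add the 2's complement of b)
    return add(a, add(not_(b), 1))


def or_(a, b):
    # De Morgan: a OR b = NOT((NOT a) AND (NOT b))
    return not_(and_(not_(a), not_(b)))


def copy(a):
    # ADD R2,R1,#0   (adding zero copies R1 into R2)
    return add(a, 0)


def clear(a):
    # AND R1,R1,#0   (ANDing with zero clears the register)
    return and_(a, 0)


print(subtract(5, 3), hex(or_(0x00F0, 0x0F00)))  # 2 0xff0
```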
Data Movement Instructions
Load -- read data from memory to register
• LD: PC-relative mode
• LDR: base+offset mode
• LDI: indirect mode
Store -- write data from register to memory
• ST: PC-relative mode
• STR: base+offset mode
• STI: indirect mode
Load effective address -- compute address, save in register
• LEA: immediate mode
• does not access memory
PC-Relative Addressing Mode
Want to specify address directly in the instruction
• But an address is 16 bits, and so is an instruction!
• After subtracting 4 bits for opcode and 3 bits for register, we have 9 bits available for address.
Solution:
• Use the 9 bits as a signed offset from the current PC.
9 bits: -256 ≤ offset ≤ +255
Can form any address X, such that: PC - 256 ≤ X ≤ PC + 255
Remember that the PC is incremented as part of the FETCH phase;
this is done before the EVALUATE ADDRESS stage.
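A small sketch of the address calculation described above: sign-extend the low 9 bits of the instruction and add them to the already-incremented PC. The function names are made up for this example. The sample word x25AF is LD R2 with offset field x1AF, assumed to sit at address x4018.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a 2's complement number."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return value - (1 << bits) if value & sign else value


def pc_relative_address(instr_addr: int, instruction: int) -> int:
    """Effective address = incremented PC + sign-extended IR[8:0]."""
    pc = (instr_addr + 1) & 0xFFFF                  # PC was incremented during FETCH
    offset = sign_extend(instruction & 0x1FF, 9)    # 9-bit signed offset
    return (pc + offset) & 0xFFFF


print(hex(pc_relative_address(0x4018, 0x25AF)))     # 0x3fc8
```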
LD (PC-Relative)
ST (PC-Relative)
Indirect Addressing Mode
With PC-relative mode, can only address data within 256 words of the instruction.
• What about the rest of memory?
Solution #1:
• Read address from memory location, then load/store to that address.
The first address is generated from the PC and IR (just like PC-relative addressing); the content of that address is then used as the target for the load/store.
LDI (Indirect)
STI (Indirect)
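The two memory accesses of indirect mode can be sketched as follows. Memory is modeled as a dictionary, the function names are hypothetical, and the sample values (an LDI at x4A1B with offset field x1CC, pointer x2110) are illustrative.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a 2's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def ldi(mem, instr_addr: int, offset9_field: int) -> int:
    """Model LDI: the PC-relative location holds the address of the data."""
    pc = (instr_addr + 1) & 0xFFFF
    pointer_addr = (pc + sign_extend(offset9_field, 9)) & 0xFFFF
    target = mem[pointer_addr]    # first read: fetch the pointer
    return mem[target]            # second read: fetch the data it points to


mem = {0x49E8: 0x2110, 0x2110: 0x00FF}
print(hex(ldi(mem, 0x4A1B, 0x1CC)))  # 0xff
```

Because the pointer is a full 16-bit word, the data can live anywhere in memory, even though the pointer itself must be within reach of the 9-bit offset.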
Base + Offset Addressing Mode
With PC-relative mode, can only address data within 256 words of the instruction.
• What about the rest of memory?
Solution #2:
• Use a register to generate a full 16-bit address.
4 bits for opcode, 3 for src/dest register, 3 bits for base register -- remaining 6 bits are used as a signed offset.
• Offset is sign-extended before adding to base register.
LDR (Base+Offset)
STR (Base+Offset)
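Base+offset mode can be modeled in the same style: the 6-bit field is sign-extended and added to the base register. The register file is a list of eight values; the function names and sample values (R2 = x2345, offset field x1D) are assumptions for this sketch.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a 2's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def ldr(mem, regs, base_reg: int, offset6_field: int) -> int:
    """Model LDR: address = base register + sign-extended 6-bit offset."""
    address = (regs[base_reg] + sign_extend(offset6_field, 6)) & 0xFFFF
    return mem[address]


regs = [0] * 8
regs[2] = 0x2345
mem = {0x2362: 0x0F0F, 0x2344: 0x1111}
print(hex(ldr(mem, regs, 2, 0x1D)))  # 0xf0f   (x2345 + x1D = x2362)
print(hex(ldr(mem, regs, 2, 0x3F)))  # 0x1111  (offset field x3F is -1)
```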
Load Effective Address
Computes address like PC-relative (PC plus signed offset) and stores the result into a register.
Note: The address is stored in the register, not the contents of the memory location.
LEA (Immediate)
Example
Control Instructions
Used to alter the sequence of instructions (by changing the Program Counter)
Conditional Branch
• branch is taken if a specified condition is true
Ø signed offset is added to PC to yield new PC
• else, the branch is not taken
Ø PC is not changed, points to the next sequential instruction
Unconditional Branch (or Jump)
• always changes the PC
TRAP
• changes PC to the address of an OS “service routine”
• routine will return control to the next instruction (after TRAP)
Condition Codes
LC-3 has three condition code registers: N -- negative, Z -- zero, P -- positive (greater than zero)
Set by any instruction that writes a value to a register (ADD, AND, NOT, LD, LDR, LDI, LEA)
Exactly one will be set at all times
• Based on the last instruction that altered a register
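The rule that exactly one code is set follows from treating the written value as a 16-bit 2's complement number. A minimal sketch, with a hypothetical function name:

```python
def condition_codes(value: int) -> str:
    """Return which of N, Z, P is set after `value` is written to a register."""
    value &= 0xFFFF                # registers hold 16 bits
    if value == 0:
        return "Z"
    return "N" if value & 0x8000 else "P"   # bit 15 is the sign bit


for v in (0x0000, 0x0005, 0xFFFF):
    print(f"x{v:04X} -> {condition_codes(v)}")
```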
Branch Instruction
Branch specifies one or more condition codes.
If the set bit is specified, the branch is taken.
• PC-relative addressing: target address is made by adding signed offset (IR[8:0]) to current PC.
• Note: PC has already been incremented by FETCH stage.
• Note: Target must be within 256 words of BR instruction.
If the branch is not taken, the next sequential instruction is executed.
BR (PC-Relative)
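The branch decision can be sketched as below. It relies on the LC-3 encoding in which IR[11], IR[10], and IR[9] select N, Z, and P; the function names are invented for this example, and the sample word x04D9 is BRz with offset field x0D9, assumed to sit at x4027.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a 2's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def branch_taken(instruction: int, n: bool, z: bool, p: bool) -> bool:
    """True if any condition code named by the instruction is currently set."""
    want_n = bool(instruction & 0x0800)   # IR[11]
    want_z = bool(instruction & 0x0400)   # IR[10]
    want_p = bool(instruction & 0x0200)   # IR[9]
    return (want_n and n) or (want_z and z) or (want_p and p)


def next_pc(instr_addr: int, instruction: int, n: bool, z: bool, p: bool) -> int:
    """PC after executing a BR instruction located at instr_addr."""
    pc = (instr_addr + 1) & 0xFFFF        # already incremented by FETCH
    if branch_taken(instruction, n, z, p):
        pc = (pc + sign_extend(instruction & 0x1FF, 9)) & 0xFFFF
    return pc


print(hex(next_pc(0x4027, 0x04D9, n=False, z=True, p=False)))   # 0x4101
print(hex(next_pc(0x4027, 0x04D9, n=False, z=False, p=True)))   # 0x4028
```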
Using Branch Instructions
Compute sum of 12 integers. Numbers start at location x3100. Program starts at location x3000.
Sample Program
JMP (Register)
Jump is an unconditional branch -- always taken.
• Target address is the contents of a register.
• Allows any target address.
TRAP
Calls a service routine, identified by 8-bit “trap vector.”
When routine is done, PC is set to the instruction following TRAP.
(We’ll talk about how this works later.)
Another Example
Count the occurrences of a character in a file
• Program begins at location x3000
• Read character from keyboard
• Load each character from a “file”
Ø File is a sequence of memory locations
Ø Starting address of file is stored in the memory location immediately after the program
• If file character equals input character, increment counter
• End of file is indicated by a special ASCII value: EOT (x04)
• At the end, print the number of characters and halt (assume there will be fewer than 10 occurrences of the character)
A special character used to indicate the end of a sequence is often called a sentinel.
• Useful when you don’t know ahead of time how many times to execute a loop.
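Before looking at the LC-3 version, the sentinel-controlled loop can be sketched in Python. Memory is modeled as a dictionary, the function name `count_occurrences` is hypothetical, and the sample "file" contents and its start address x4000 are chosen only for illustration.

```python
EOT = 0x04  # sentinel: ASCII End Of Transmission


def count_occurrences(mem, start: int, target: int) -> int:
    """Count how often `target` appears from `start` up to the EOT sentinel."""
    count = 0
    ptr = start
    while mem[ptr] != EOT:        # loop length is not known in advance
        if mem[ptr] == target:
            count += 1
        ptr += 1                  # point to the next character
    return count


file_start = 0x4000
mem = {file_start + i: ord(c) for i, c in enumerate("banana")}
mem[file_start + len("banana")] = EOT

count = count_occurrences(mem, file_start, ord("a"))
print(chr(ord("0") + count))      # prints 3; adding x30 only works for counts below 10
```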
Flow Chart
Program (1 of 2)
Program (2 of 2)
LC-3 Data Path Revisited
Data Path Components
Global bus
• special set of wires that carry a 16-bit signal to many components
• inputs to the bus are “tri-state devices,” that only place a signal on the bus when they are enabled
• only one (16-bit) signal should be enabled at any time
Ø control unit decides which signal “drives” the bus
• any number of components can read the bus
Ø register only captures bus data if it is write-enabled by the control unit
Memory
• Control and data registers for memory and I/O devices
• memory: MAR, MDR (also control signal for read/write)
Data Path Components
ALU
• Accepts inputs from register file and from sign-extended bits from IR (immediate field).
• Output goes to bus.
Ø used by condition code logic, register file, memory
Register File
• Two read addresses (SR1, SR2), one write address (DR)
• Input from bus
Ø result of ALU operation or memory read
• Two 16-bit outputs
Ø used by ALU, PC, memory address
Ø data for store instructions passes through ALU
Data Path Components
PC and PCMUX
• Three inputs to PC, controlled by PCMUX
Ø PC+1 – FETCH stage
Ø Address adder – BR, JMP
Ø bus – TRAP (discussed later)
MAR and MARMUX
• Two inputs to MAR, controlled by MARMUX
• Address adder – LD/ST, LDR/STR
• Zero-extended IR[7:0] -- TRAP (discussed later)
Data Path Components
Condition Code Logic
• Looks at value on bus and generates N, Z, P signals
• Registers set only when control unit enables them (LD.CC)
Ø only certain instructions set the codes (ADD, AND, NOT, LD, LDI, LDR, LEA)
Control Unit – Finite State Machine
• On each machine cycle, changes control signals for next phase of instruction processing
Ø who drives the bus? (GatePC, GateALU, …)
Ø which registers are write enabled? (LD.IR, LD.REG, …)
Ø which operation should ALU perform? (ALUK)
Ø …
• Logic includes decoder for opcode, etc.
Chapter 7 Assembly Language
Human-Readable Machine Language
Computers like ones and zeros…
Humans like symbols…
Assembler is a program that turns symbols into machine instructions.
• ISA-specific: close correspondence between symbols and instruction set
Ø mnemonics for opcodes
Ø labels for memory locations
• additional operations for allocating storage and initializing data
An Assembly Language Program
;
; Program to multiply a number by the constant 6
;
        .ORIG   x3050
        LD      R1, SIX
        LD      R2, NUMBER
        AND     R3, R3, #0      ; Clear R3. It will
                                ; contain the product.
; The inner loop
;
AGAIN   ADD     R3, R3, R2
        ADD     R1, R1, #-1     ; R1 keeps track of
        BRp     AGAIN           ; the iteration.
;
        HALT
;
NUMBER  .BLKW   1
SIX     .FILL   x0006
;
        .END
LC-3 Assembly Language Syntax
Each line of a program is one of the following:
• an instruction
• an assembler directive (or pseudo-op)
• a comment
Whitespace (between symbols) and case are ignored.
Comments (beginning with “;”) are also ignored.
An instruction has the following format:
LABEL OPCODE OPERANDS ; COMMENTS
(the label and the comment are optional)
Opcodes and Operands
Opcodes
• reserved symbols that correspond to LC-3 instructions
• listed in Appendix A
Ø ex: ADD, AND, LD, LDR, …
Operands
• registers -- specified by Rn, where n is the register number
• numbers -- indicated by # (decimal) or x (hex)
• label -- symbolic name of memory location
• separated by comma
• number, order, and type correspond to instruction format
Ø ex: ADD R1,R1,R3
      ADD R1,R1,#3
      LD R6,NUMBER
      BRz LOOP
Labels and Comments
Label
• placed at the beginning of the line
• assigns a symbolic name to the address corresponding to line
Ø ex: LOOP ADD R1,R1,#-1
           BRp LOOP
Comment
• anything after a semicolon is a comment
• ignored by assembler
• used by humans to document/understand programs
• tips for useful comments:
Ø avoid restating the obvious, as “decrement R1”
Ø provide additional insight, as in “accumulate product in R6”
Ø use comments to separate pieces of program
Assembler Directives
Pseudo-operations
• do not refer to operations executed by program
• used by assembler
• look like instruction, but “opcode” starts with dot
Trap Codes
LC-3 assembler provides “pseudo-instructions” for each trap code, so you don’t have to remember them.
Style Guidelines
Use the following style guidelines to improve the readability and understandability of your programs:
• Provide a program header, with author’s name, date, etc., and purpose of program.
• Start labels, opcode, operands, and comments in same column for each line. (Unless entire line is a comment.)
• Use comments to explain what each register does.
• Give explanatory comment for most instructions.
• Use meaningful symbolic names.
Ø Mixed upper and lower case for readability.
Ø ASCIItoBinary, InputRoutine, SaveR1
• Provide comments between program sections.
• Each line must fit on the page -- no wraparound or truncations.
Ø Long statements split in aesthetically pleasing manner.
Sample Program
Count the occurrences of a character in a file. Remember this?
Char Count in Assembly Language (1 of 3)
;
; Program to count occurrences of a character in a file.
; Character to be input from the keyboard.
; Result to be displayed on the monitor.
; Program only works if no more than 9 occurrences are found.
;
;
; Initialization
;
        .ORIG   x3000
        AND     R2, R2, #0      ; R2 is counter, initially 0
        LD      R3, PTR         ; R3 is pointer to characters
        GETC                    ; R0 gets character input
        LDR     R1, R3, #0      ; R1 gets first character
;
; Test character for end of file
;
TEST    ADD     R4, R1, #-4     ; Test for EOT (ASCII x04)
        BRz     OUTPUT          ; If done, prepare the output
Char Count in Assembly Language (2 of 3)
;
; Test character for match. If a match, increment count.
;
        NOT     R1, R1
        ADD     R1, R1, R0      ; If match, R1 = xFFFF
        NOT     R1, R1          ; If match, R1 = x0000
        BRnp    GETCHAR         ; If no match, do not increment
        ADD     R2, R2, #1
;
; Get next character from file.
;
GETCHAR ADD     R3, R3, #1      ; Point to next character.
        LDR     R1, R3, #0      ; R1 gets next char to test
        BRnzp   TEST
;
; Output the count.
;
OUTPUT  LD      R0, ASCII       ; Load the ASCII template
        ADD     R0, R0, R2      ; Convert binary count to ASCII
        OUT                     ; ASCII code in R0 is displayed.
        HALT                    ; Halt machine
Char Count in Assembly Language (3 of 3)
;
; Storage for pointer and ASCII template
;
ASCII   .FILL   x0030
PTR     .FILL   x4000
        .END
Assembly Process
Convert assembly language file (.asm) into an executable file (.obj) for the LC-3 simulator.
First Pass:
• scan program file
• find all labels and calculate the corresponding addresses; the result is called the symbol table
Second Pass:
• convert instructions to machine language, using information from symbol table
First Pass: Constructing the Symbol Table
• Find the .ORIG statement, which tells us the address of the first instruction.
• Initialize location counter (LC), which keeps track of the current instruction.
• For each non-empty line in the program:
• If line contains a label, add label and LC to symbol table.
• Increment LC.
– NOTE: If statement is .BLKW or .STRINGZ, increment LC by the number of words allocated.
• Stop when .END statement is reached.
NOTE: A line that contains only a comment is considered an empty line.
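The first pass described above can be sketched in Python. This is a simplified model, not the real LC-3 assembler. It assumes .BLKW counts are decimal, that strings contain no semicolons or escape sequences, and that any leading token which is not an opcode, trap pseudo-instruction, or directive is a label. The names `first_pass` and `is_operation` are made up for the example.

```python
OPCODES = {"ADD", "AND", "NOT", "LD", "LDI", "LDR", "LEA", "ST", "STI", "STR",
           "JMP", "JSR", "JSRR", "RET", "RTI", "TRAP",
           "GETC", "OUT", "PUTS", "IN", "PUTSP", "HALT"}
DIRECTIVES = {".ORIG", ".END", ".FILL", ".BLKW", ".STRINGZ"}


def is_operation(token: str) -> bool:
    """True for opcodes, trap pseudo-instructions, directives, and BR variants."""
    t = token.upper()
    if t in OPCODES or t in DIRECTIVES:
        return True
    return t.startswith("BR") and set(t[2:]) <= set("NZP")   # BR, BRz, BRnp, ...


def first_pass(lines):
    """Build the symbol table: label -> address."""
    symbols, lc = {}, None
    for raw in lines:
        line = raw.split(";", 1)[0].strip()      # drop comments
        if not line:
            continue                             # comment-only lines count as empty
        tokens = line.replace(",", " ").split()
        if tokens[0].upper() == ".ORIG":
            lc = int(tokens[1][1:], 16)          # operand looks like x3050
            continue
        if tokens[0].upper() == ".END":
            break
        if not is_operation(tokens[0]):          # leading token is a label
            symbols[tokens[0]] = lc
            tokens = tokens[1:]
            if not tokens:
                continue                         # label alone on a line
        op = tokens[0].upper()
        if op == ".BLKW":
            lc += int(tokens[1].lstrip("#"))     # reserve that many words
        elif op == ".STRINGZ":
            lc += len(line.split('"')[1]) + 1    # characters plus terminating zero
        else:
            lc += 1
    return symbols


program = """
        .ORIG   x3050
        LD      R1, SIX
        LD      R2, NUMBER
        AND     R3, R3, #0      ; Clear R3
AGAIN   ADD     R3, R3, R2
        ADD     R1, R1, #-1
        BRp     AGAIN
        HALT
NUMBER  .BLKW   1
SIX     .FILL   x0006
        .END
""".splitlines()

for label, addr in first_pass(program).items():
    print(f"{label:8s} x{addr:04X}")
```

For the multiply-by-6 program this gives AGAIN at x3053, NUMBER at x3057, and SIX at x3058.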
Practice
Construct the symbol table for the program in Figure 7.1 (Slides 7-11 through 7-13).
Second Pass: Generating Machine Language
For each executable assembly language statement, generate the corresponding machine language instruction.
• If operand is a label, look up the address from the symbol table.
Potential problems:
• Improper number or type of arguments
Ø ex: NOT R1,#7
      ADD R1,R2
      ADD R3,R3,NUMBER
• Immediate argument too large
Ø ex: ADD R1,R2,#1023
• Address (associated with label) more than 256 from instruction
Ø can’t use PC-relative addressing mode
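The range problems listed above come down to whether a value fits in a signed field of a given width. The sketch below shows that check and uses it to encode an LD instruction from a symbol table. The function names are hypothetical, and the symbol table holds only the entry needed here (PTR at x3013, as in the character count program).

```python
def signed_field(value: int, bits: int) -> int:
    """Return value as a `bits`-wide 2's complement field, or raise if it does not fit."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not lo <= value <= hi:
        raise ValueError(f"{value} does not fit in a {bits}-bit signed field")
    return value & ((1 << bits) - 1)


def encode_ld(dr: int, instr_addr: int, label: str, symbols: dict) -> int:
    """Encode LD DR, label: opcode 0010, DR, then the 9-bit PC-relative offset."""
    offset = symbols[label] - (instr_addr + 1)    # relative to the incremented PC
    return (0b0010 << 12) | (dr << 9) | signed_field(offset, 9)


symbols = {"PTR": 0x3013}
word = encode_ld(3, 0x3001, "PTR", symbols)       # LD R3, PTR at x3001
print(f"{word:016b}")                             # 0010011000010001

try:
    signed_field(1023, 5)                         # ADD R1,R2,#1023: imm5 is too small
except ValueError as err:
    print("error:", err)
```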
Practice
Using the symbol table constructed earlier, translate these statements into LC-3 machine language.
LC-3 Assembler
Running “assemble” (Unix) or LC3Edit (Windows) generates several different output files.
Object File Format
LC-3 object file contains
• Starting address (location where program must be loaded), followed by…
• Machine instructions
Example
• Beginning of “count character” object file looks like this:
Multiple Object Files
An object file is not necessarily a complete program.
• system-provided library routines
• code blocks written by multiple developers
For LC-3 simulator, can load multiple object files into memory, then start executing at a desired address.
• system routines, such as keyboard input, are loaded automatically
Ø loaded into “system memory,” below x3000
Ø user code should be loaded between x3000 and xFDFF
• each object file includes a starting address
• be careful not to load overlapping object files
Linking and Loading
Loading is the process of copying an executable image into memory.
• more sophisticated loaders are able to relocate images to fit into available memory
• must readjust branch targets, load/store addresses
Linking is the process of resolving symbols between independent object files.
• suppose we define a symbol in one module, and want to use it in another
• some notation, such as .EXTERNAL, is used to tell assembler that a symbol is defined in another module
• linker will search symbol tables of other modules to resolve symbols and complete code generation before loading
Monday, February 5, 2007