Intel 8086/80186/80286/80386 CPU Reference

Overview

The Intel 8086 family established the x86 architecture, which became the foundation of modern PC computing. This implementation provides a unified CPU core that supports multiple processor models (8086, 80186, 80286, 80386). PC/DOS emulation drives the core through the Memory8086 trait.

Implementation: crates/core/src/cpu_8086.rs, crates/core/src/cpu_8086_protected.rs

Supported Processor Models

The implementation supports the following CPU models via the CpuModel enum:

  • Intel8086: Original 8086 processor (16-bit, real mode only)
  • Intel80186: Enhanced 8086 with additional instructions
  • Intel80286: Adds protected mode with 16-bit segments
  • Intel80386: Adds 32-bit operations and extended registers (partial support)

Set the model when creating the CPU:

let mut cpu = Cpu8086::new(memory, CpuModel::Intel80286);

Architecture

Registers (Real Mode)

General Purpose Registers (16-bit)

  • AX (Accumulator): Primary arithmetic register
    • AH: High byte of AX
    • AL: Low byte of AX
  • BX (Base): Base pointer for memory access
    • BH: High byte of BX
    • BL: Low byte of BX
  • CX (Count): Loop counter and shift operations
    • CH: High byte of CX
    • CL: Low byte of CX
  • DX (Data): I/O operations and extended arithmetic
    • DH: High byte of DX
    • DL: Low byte of DX
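
As a minimal sketch of how the 8-bit halves map onto a 16-bit register, the wrapper below reads and writes the high and low bytes of a single value. Reg16 and its methods are hypothetical names used for illustration; the core in cpu_8086.rs may store its registers differently.

```rust
/// Hypothetical 16-bit register wrapper (illustration only, not the core's type).
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub struct Reg16(pub u16);

impl Reg16 {
    /// High byte, e.g. AH for AX.
    pub fn high(self) -> u8 {
        (self.0 >> 8) as u8
    }

    /// Low byte, e.g. AL for AX.
    pub fn low(self) -> u8 {
        self.0 as u8
    }

    /// Replace the high byte and keep the low byte.
    pub fn set_high(&mut self, v: u8) {
        self.0 = (self.0 & 0x00FF) | ((v as u16) << 8);
    }

    /// Replace the low byte and keep the high byte.
    pub fn set_low(&mut self, v: u8) {
        self.0 = (self.0 & 0xFF00) | (v as u16);
    }
}
```

Writing AH through set_high leaves AL untouched, which is the behaviour the register list above implies.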

Pointer and Index Registers (16-bit)

  • SP (Stack Pointer): Points to top of stack
  • BP (Base Pointer): Base pointer for stack frame
  • SI (Source Index): String/memory operations source
  • DI (Destination Index): String/memory operations destination

Segment Registers (16-bit)

  • CS (Code Segment): Segment for instruction fetch
  • DS (Data Segment): Default segment for data
  • SS (Stack Segment): Segment for stack operations
  • ES (Extra Segment): Additional data segment

Control Registers

  • IP (Instruction Pointer): Offset of current instruction
  • FLAGS: 16-bit processor status and control flags

32-bit Extensions (80386+)

When running in 80386 mode, registers can be accessed in 32-bit form:

  • EAX, EBX, ECX, EDX: 32-bit versions of AX, BX, CX, DX
  • ESP, EBP, ESI, EDI: 32-bit versions of SP, BP, SI, DI
  • EIP: 32-bit instruction pointer (protected mode only)

FLAGS Register

ODIT SZ-A-P-C  (16-bit FLAGS)
││││ ││ │ │ │
││││ ││ │ │ └─ Carry Flag (CF)
││││ ││ │ └─── Parity Flag (PF)
││││ ││ └───── Auxiliary Carry Flag (AF)
││││ │└─────── Zero Flag (ZF)
││││ └──────── Sign Flag (SF)
│││└────────── Trap Flag (TF) - Single step
││└─────────── Interrupt Enable Flag (IF)
│└──────────── Direction Flag (DF) - String operations
└───────────── Overflow Flag (OF)

Additional flags on later models:

  • IOPL (bits 12-13, 80286+): I/O Privilege Level
  • NT (bit 14, 80286+): Nested Task
  • RF (bit 16, 80386+): Resume Flag
  • VM (bit 17, 80386+): Virtual 8086 Mode
  • AC (bit 18, 80486+): Alignment Check (not present on the 80386)
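
The bit positions in the diagram can be written as masks. The sketch below assumes that layout (CF at bit 0 up to OF at bit 11) and shows one way to refresh ZF, SF and PF from an 8-bit result. The constant names and the set_szp8 helper are illustrative and not taken from the core.

```rust
// FLAGS bit masks for the low 12 bits (names chosen for this sketch).
pub const CF: u16 = 1 << 0;
pub const PF: u16 = 1 << 2;
pub const AF: u16 = 1 << 4;
pub const ZF: u16 = 1 << 6;
pub const SF: u16 = 1 << 7;
pub const TF: u16 = 1 << 8;
pub const IF: u16 = 1 << 9;
pub const DF: u16 = 1 << 10;
pub const OF: u16 = 1 << 11;

/// Recompute ZF, SF and PF from an 8-bit result; all other bits are preserved.
pub fn set_szp8(flags: u16, result: u8) -> u16 {
    let mut f = flags & !(ZF | SF | PF);
    if result == 0 {
        f |= ZF;
    }
    if result & 0x80 != 0 {
        f |= SF;
    }
    // PF is set when the low byte has an even number of 1 bits.
    if result.count_ones() % 2 == 0 {
        f |= PF;
    }
    f
}
```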

Memory Segmentation

The 8086 uses segmented memory addressing:

Physical Address = (Segment << 4) + Offset

Example:

  • CS:IP = 0x1000:0x0234 → Physical = 0x10234
  • Maximum addressable memory: 1MB (20-bit address space)
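
The address formula can be sketched as a small function. The name physical_address is hypothetical, and the 20-bit mask models the 8086, where addresses past 1MB wrap to zero; an 80286 or later with the A20 line enabled would not apply that mask.

```rust
/// Real-mode address translation: (segment << 4) + offset, wrapped to 20 bits.
/// The wrap models the 8086 address bus (assumption: A20 is not enabled).
pub fn physical_address(segment: u16, offset: u16) -> u32 {
    (((segment as u32) << 4) + offset as u32) & 0x000F_FFFF
}
```

With the values from the example, physical_address(0x1000, 0x0234) gives 0x10234, and FFFF:0010 wraps around to address 0.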

Usage

Systems using the 8086 must implement the Memory8086 trait:

pub trait Memory8086 {
    fn read(&self, addr: u32) -> u8;
    fn write(&mut self, addr: u32, val: u8);
    fn port_in(&mut self, port: u16) -> u8;
    fn port_out(&mut self, port: u16, val: u8);
}

Example

use emu_core::cpu_8086::{Cpu8086, Memory8086, CpuModel};

struct PcSystem {
    ram: Vec<u8>,
}

impl Memory8086 for PcSystem {
    fn read(&self, addr: u32) -> u8 {
        self.ram[addr as usize]
    }
    
    fn write(&mut self, addr: u32, val: u8) {
        self.ram[addr as usize] = val;
    }
    
    fn port_in(&mut self, _port: u16) -> u8 {
        // Handle hardware I/O reads here; this stub returns 0 for every port
        0
    }
    
    fn port_out(&mut self, _port: u16, _val: u8) {
        // Handle hardware I/O writes here; this stub ignores them
    }
}

let system = PcSystem { ram: vec![0; 0x100000] };
let mut cpu = Cpu8086::new(system, CpuModel::Intel8086);
cpu.reset();
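
Because the trait is byte-oriented, 16-bit accesses have to be assembled from two byte accesses in little-endian order. The sketch below repeats the trait so that it compiles on its own. read_word, write_word and FlatRam are hypothetical helpers, and the core may already provide equivalents.

```rust
pub trait Memory8086 {
    fn read(&self, addr: u32) -> u8;
    fn write(&mut self, addr: u32, val: u8);
    fn port_in(&mut self, port: u16) -> u8;
    fn port_out(&mut self, port: u16, val: u8);
}

/// Read a little-endian word: low byte at addr, high byte at addr + 1.
pub fn read_word<M: Memory8086>(mem: &M, addr: u32) -> u16 {
    let lo = mem.read(addr) as u16;
    let hi = mem.read(addr.wrapping_add(1)) as u16;
    (hi << 8) | lo
}

/// Write a little-endian word: low byte first, then high byte.
pub fn write_word<M: Memory8086>(mem: &mut M, addr: u32, val: u16) {
    mem.write(addr, val as u8);
    mem.write(addr.wrapping_add(1), (val >> 8) as u8);
}

/// Flat RAM with no devices attached (illustration only).
pub struct FlatRam {
    pub bytes: Vec<u8>,
}

impl Memory8086 for FlatRam {
    fn read(&self, addr: u32) -> u8 {
        self.bytes[addr as usize]
    }
    fn write(&mut self, addr: u32, val: u8) {
        self.bytes[addr as usize] = val;
    }
    fn port_in(&mut self, _port: u16) -> u8 {
        0xFF // no device present in this sketch
    }
    fn port_out(&mut self, _port: u16, _val: u8) {}
}
```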

Instruction Set Overview

The x86 instruction set is extensive. Key categories include:

Data Transfer

  • MOV: Move data between registers/memory
  • PUSH/POP: Stack operations
  • XCHG: Exchange values
  • IN/OUT: Port I/O
  • LEA: Load effective address
  • LDS/LES: Load pointer with segment
  • LAHF/SAHF: Load/store AH from/to flags
  • PUSHF/POPF: Push/pop flags

Arithmetic

  • ADD/SUB: Addition/subtraction
  • ADC/SBB: Add/subtract with carry
  • INC/DEC: Increment/decrement
  • NEG: Two's complement negation
  • CMP: Compare (subtract without storing)
  • MUL/IMUL: Unsigned/signed multiply
  • DIV/IDIV: Unsigned/signed divide
  • AAA/AAS/AAM/AAD: ASCII adjust for arithmetic
  • DAA/DAS: Decimal adjust for addition/subtraction

Logical

  • AND/OR/XOR: Bitwise operations
  • NOT: Bitwise NOT
  • TEST: Logical compare (AND without storing)

Shift/Rotate

  • SHL/SHR: Logical shift left/right
  • SAL/SAR: Arithmetic shift left/right
  • ROL/ROR: Rotate left/right
  • RCL/RCR: Rotate through carry left/right

String Operations

  • MOVS: Move string
  • CMPS: Compare string
  • SCAS: Scan string
  • LODS: Load string
  • STOS: Store string
  • Each has byte/word variants (MOVSB/MOVSW)
  • REP/REPE/REPNE: Repeat prefixes

Control Flow

  • JMP: Unconditional jump
  • Jcc: Conditional jumps (JE, JNE, JA, JB, JG, JL, etc.)
  • CALL/RET: Subroutine call/return
  • LOOP/LOOPE/LOOPNE: Loop instructions
  • INT/INTO: Software interrupt
  • IRET: Return from interrupt

Flag Operations

  • CLC/STC/CMC: Clear/set/complement carry
  • CLD/STD: Clear/set direction flag
  • CLI/STI: Clear/set interrupt flag

80186+ Instructions

  • PUSHA/POPA: Push/pop all general registers
  • BOUND: Check array bounds
  • ENTER/LEAVE: High-level procedure entry/exit
  • INS/OUTS: String I/O

80286 Protected Mode

  • LGDT/SGDT: Load/store GDT register
  • LIDT/SIDT: Load/store IDT register
  • LLDT/SLDT: Load/store LDT register
  • LTR/STR: Load/store task register
  • LAR/LSL: Load access rights/segment limit
  • VERR/VERW: Verify segment for read/write
  • ARPL: Adjust RPL field
  • LMSW/SMSW: Load/store machine status word

80386 32-bit Extensions

  • 32-bit versions of most instructions
  • MOVZX/MOVSX: Move with zero/sign extension
  • SHLD/SHRD: Double-precision shifts
  • BT/BTS/BTR/BTC: Bit test operations
  • BSF/BSR: Bit scan forward/reverse
  • SETcc: Set byte on condition

Addressing Modes

The 8086 supports complex addressing modes:

  1. Register: MOV AX, BX
  2. Immediate: MOV AX, 1234h
  3. Direct: MOV AX, [1234h]
  4. Register Indirect: MOV AX, [BX], MOV AX, [SI]
  5. Based: MOV AX, [BX+1234h], MOV AX, [BP+1234h]
  6. Indexed: MOV AX, [SI+1234h], MOV AX, [DI+1234h]
  7. Based Indexed: MOV AX, [BX+SI], MOV AX, [BP+DI]
  8. Based Indexed with Displacement: MOV AX, [BX+SI+1234h]
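
The memory forms above all come from one small lookup on the 3-bit R/M field. The sketch below computes a 16-bit effective address under these assumptions: the caller has already sign-extended any 8-bit displacement to 16 bits, and the second return value reports whether the default segment is SS (true for the BP-based forms). Regs16 and effective_address16 are hypothetical names.

```rust
/// The four registers that 16-bit addressing can use (illustration only).
pub struct Regs16 {
    pub bx: u16,
    pub bp: u16,
    pub si: u16,
    pub di: u16,
}

/// Returns (offset, default_segment_is_ss) for a 16-bit memory operand.
/// `mod_bits` and `rm` are the Mod and R/M fields of the ModR/M byte.
pub fn effective_address16(regs: &Regs16, mod_bits: u8, rm: u8, disp: u16) -> (u16, bool) {
    // Mod 00 with R/M 110 is the direct form [disp16]; it uses DS by default.
    if mod_bits == 0 && rm == 6 {
        return (disp, false);
    }
    let (base, uses_ss) = match rm & 7 {
        0 => (regs.bx.wrapping_add(regs.si), false),
        1 => (regs.bx.wrapping_add(regs.di), false),
        2 => (regs.bp.wrapping_add(regs.si), true),
        3 => (regs.bp.wrapping_add(regs.di), true),
        4 => (regs.si, false),
        5 => (regs.di, false),
        6 => (regs.bp, true),
        _ => (regs.bx, false),
    };
    // Mod 00 carries no displacement; Mod 01 and 10 add one.
    let d = if mod_bits == 0 { 0 } else { disp };
    (base.wrapping_add(d), uses_ss)
}
```

Offsets wrap at 64KB, so [BP-2] is passed in as a displacement of 0xFFFE.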

80386 32-bit Addressing

The 80386 adds more flexible addressing:

  • Any general register can be a base: [EAX], [ECX], etc.
  • SIB (Scale-Index-Base) byte: [EAX+EBX*4+1234h]
  • Scale factors: 1, 2, 4, 8
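
A scaled-index address is base + index * scale + displacement, with the scale stored as a 2-bit shift count. The function below is a sketch of that arithmetic only; sib_address is a hypothetical name, and it leaves out the special encodings (no index, no base).

```rust
/// 32-bit scaled-index address. `scale_bits` is the 2-bit SIB scale field,
/// so the multiplier is 1 << scale_bits (1, 2, 4 or 8).
pub fn sib_address(base: u32, index: u32, scale_bits: u8, disp: u32) -> u32 {
    base.wrapping_add(index << (scale_bits & 3)).wrapping_add(disp)
}
```

For [EAX+EBX*4+1234h] with EAX = 0x1000 and EBX = 0x10, the scale field is 2 and the result is 0x2274.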

Operating Modes

Real Mode

  • Default mode after reset
  • 20-bit addressing (1MB limit)
  • Direct memory access
  • No memory protection

Protected Mode (80286+)

  • Activated by setting the PE bit in the Machine Status Word (80286, via LMSW) or in CR0 (80386)
  • Segment descriptors in GDT/LDT
  • Memory protection and privilege levels
  • Up to 16MB (80286) or 4GB (80386)

Virtual 8086 Mode (80386+)

  • Run real-mode code in protected mode
  • Provides isolation and protection

Interrupts

The 8086 supports 256 interrupt vectors (0-255):

Predefined Interrupts

  • INT 0: Divide by zero
  • INT 1: Single step (debug)
  • INT 2: Non-maskable interrupt (NMI)
  • INT 3: Breakpoint
  • INT 4: Overflow (INTO)
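
In real mode each vector occupies four bytes in the interrupt vector table at the bottom of memory: the handler offset (IP) first, then the segment (CS), both little-endian. The lookup below is a sketch over a plain byte slice; ivt_entry is a hypothetical name.

```rust
/// Fetch the real-mode handler address for `vector` as (cs, ip).
/// Assumes `mem` covers at least the first 1024 bytes of physical memory.
pub fn ivt_entry(mem: &[u8], vector: u8) -> (u16, u16) {
    let base = vector as usize * 4;
    let ip = u16::from_le_bytes([mem[base], mem[base + 1]]);
    let cs = u16::from_le_bytes([mem[base + 2], mem[base + 3]]);
    (cs, ip)
}
```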

BIOS and DOS Interrupts (System-Specific)

  • INT 10h: BIOS video services
  • INT 13h: BIOS disk services
  • INT 16h: BIOS keyboard services
  • INT 21h: DOS services

See docs/PC_INTERRUPT_ANALYSIS.md for DOS interrupt implementation details.

Protected Mode (80286/80386)

Segment Descriptors

In protected mode, segment registers point to descriptors:

Descriptor (8 bytes):
  - Base Address (24 bits on 80286, 32 bits on 80386)
  - Limit (16 bits on 80286, 20 bits on 80386)
  - Access Rights (type, DPL, present)
  - Granularity, Size flags (80386)
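
The fields are scattered across the 8 bytes, so a decoder has to reassemble them. The sketch below follows the 80386 layout (on the 80286 bytes 6 and 7 are reserved and zero, which this code reads as a 16-bit limit and 24-bit base). The Descriptor struct and its method names are hypothetical and may not match cpu_8086_protected.rs.

```rust
/// Decoded segment descriptor (illustration only).
pub struct Descriptor {
    pub base: u32,
    pub limit: u32, // raw 20-bit limit, before granularity scaling
    pub access: u8,
    pub granularity: bool, // G bit: limit counts 4KB pages
    pub default_32: bool,  // D/B bit: 32-bit default size
}

impl Descriptor {
    pub fn parse(raw: [u8; 8]) -> Self {
        let limit = u32::from(raw[0])
            | (u32::from(raw[1]) << 8)
            | (u32::from(raw[6] & 0x0F) << 16);
        let base = u32::from(raw[2])
            | (u32::from(raw[3]) << 8)
            | (u32::from(raw[4]) << 16)
            | (u32::from(raw[7]) << 24);
        Descriptor {
            base,
            limit,
            access: raw[5],
            granularity: raw[6] & 0x80 != 0,
            default_32: raw[6] & 0x40 != 0,
        }
    }

    /// Present bit (bit 7 of the access byte).
    pub fn present(&self) -> bool {
        self.access & 0x80 != 0
    }

    /// Descriptor privilege level (bits 5-6 of the access byte).
    pub fn dpl(&self) -> u8 {
        (self.access >> 5) & 3
    }

    /// Highest valid byte offset after applying the granularity bit.
    pub fn byte_limit(&self) -> u32 {
        if self.granularity {
            (self.limit << 12) | 0xFFF
        } else {
            self.limit
        }
    }
}
```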

Privilege Levels

Four privilege levels (rings):

  • Ring 0: Kernel (highest privilege)
  • Ring 1: Device drivers
  • Ring 2: System services
  • Ring 3: User applications (lowest privilege)

Task Switching

The 80286 and 80386 support hardware task switching:

  • Task State Segment (TSS) stores task context
  • LTR loads task register
  • Far JMP/CALL to TSS descriptor switches tasks

Implementation Notes

CPU Model Selection

Different CPU models affect:

  • Available instructions
  • Register sizes
  • Addressing modes
  • Operating modes
  • Timing characteristics

Instruction Prefixes

Instructions can have multiple prefixes:

  • Segment override: CS:, DS:, ES:, SS: (all models); FS:, GS: (80386+)
  • Repeat: REP, REPE/REPZ, REPNE/REPNZ
  • Lock: LOCK (for atomic operations)
  • Operand size: Switch between 16/32-bit (80386+)
  • Address size: Switch between 16/32-bit addressing (80386+)

String Operations

String instructions can use repeat prefixes:

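  • REP: Repeat CX times (CX is decremented on each iteration)
  • REPE/REPZ: Repeat while CX != 0 and ZF=1 (equal/zero); used with CMPS and SCAS
  • REPNE/REPNZ: Repeat while CX != 0 and ZF=0 (not equal/not zero); used with CMPS and SCAS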

Direction controlled by DF flag:

  • DF=0: Increment (forward)
  • DF=1: Decrement (backward)
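
The interaction of REP, CX and DF can be sketched for the byte move. To stay short, the function below ignores segmentation and treats SI and DI as direct indexes into one 64KB slice, which is an assumption of this sketch. The real instruction reads from DS:SI and writes to ES:DI.

```rust
/// REP MOVSB over a single flat 64KB buffer (segments omitted for brevity).
/// `mem` must be at least 0x10000 bytes long so every 16-bit index is valid.
pub fn rep_movsb(mem: &mut [u8], si: &mut u16, di: &mut u16, cx: &mut u16, df: bool) {
    // DF=0 steps forward by 1; DF=1 steps backward (adding 0xFFFF wraps to -1).
    let step: u16 = if df { 0xFFFF } else { 1 };
    while *cx != 0 {
        mem[*di as usize] = mem[*si as usize];
        *si = (*si).wrapping_add(step);
        *di = (*di).wrapping_add(step);
        *cx -= 1;
    }
}
```

When CX is already zero the loop body never runs, which matches REP executing zero iterations.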

BCD and ASCII Operations

The x86 supports decimal arithmetic:

  • DAA/DAS: Decimal adjust after addition/subtraction
  • AAA/AAS/AAM/AAD: ASCII adjust for arithmetic
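
As a worked example of decimal adjustment, the sketch below follows the DAA pseudocode published in Intel's instruction set reference: add 6 if the low nibble overflowed, add 60h if the high nibble did. The function name and tuple return are choices made for this sketch, and behaviour on undocumented edge cases of the original 8086 is not modelled.

```rust
/// DAA: adjust AL after adding two packed BCD bytes.
/// Takes AL and the incoming CF and AF; returns (new AL, new CF, new AF).
pub fn daa(al: u8, cf: bool, af: bool) -> (u8, bool, bool) {
    let mut result = al;
    // Low digit is invalid (> 9) or carried into the high digit.
    let adjust_low = (al & 0x0F) > 9 || af;
    if adjust_low {
        result = result.wrapping_add(0x06);
    }
    // High digit is invalid or the original addition carried out of AL.
    let adjust_high = al > 0x99 || cf;
    if adjust_high {
        result = result.wrapping_add(0x60);
    }
    (result, adjust_high, adjust_low)
}
```

For example, 15h + 27h gives 3Ch in binary; DAA turns that into 42h, the packed BCD result of 15 + 27.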

Systems Using 8086

This CPU core is used by:

  • PC (IBM PC/XT) - crates/systems/pc/

Quick Instruction Reference

Highly unofficial reference for cross-checking implementation gaps in source code.

Legend

  • r/m: Register or Memory operand.
  • imm: Immediate value (constant).
  • src/dst: Source / Destination.
  • st(i): FPU Stack Register (0-7). st(0) is the Top of Stack.
  • IP: Modifies Instruction Pointer directly (Jump/Call).
  • Fl: Modifies CPU Status Flags (CF, PF, AF, ZF, SF, OF).
  • m16/32/64/80: Memory operand size in bits.

1. Integer Core: Data Transfer

Fundamental addressing and data movement. Verify Segment Registers (CS, DS, ES, SS, FS, GS) logic.

Mnemonic     Operands      Size     CPU   Emulator Notes
MOV          r/m, r/m/imm  8/16/32  8086  No flags affected. No memory-to-memory form. Handle Sreg moves carefully (protection faults).
PUSH         r/m/imm/sreg  16/32    8086  Decs SP/ESP. 186+ allows PUSH imm.
POP          r/m/sreg      16/32    8086  Incs SP/ESP. POP CS (opcode 0Fh) executes only on the 8086/8088; 0Fh is an invalid opcode on the 186 and the two-byte opcode escape on the 286+.
XCHG         r/m, r        8/16/32  8086  Atomic (implicit LOCK) if operand is memory.
XLAT         -             8        8086  Table lookup: AL = [DS:BX + unsigned AL].
IN / OUT     port          8/16/32  8086  I/O address space. Use DX for ports > 255.
LEA          r, m          16/32    8086  Calc effective address only. Does not access memory.
LDS/LES      r, m          32/48    8086  Load Far Pointer.
LFS/LGS/LSS  r, m          32/48    386   Load Far Pointer (FS/GS/SS).
BSWAP        r32           32       486   Byte Swap (Endianness conversion).
MOVZX/SX     r, r/m        16/32    386   Zero-Extend / Sign-Extend. Essential for casting.
CMOVcc       r, r/m        16/32    P6*   Conditional Move. Technically P6, but supported by some late Socket 7 CPUs.

2. Integer Core: ALU (Arithmetic & Logic)

Source of most bugs. Pay close attention to Flag updates (Overflow vs Carry).

Mnemonic    Operands        Size     CPU   Emulator Notes
ADD/ADC     dst, src        8/16/32  8086  ADC includes CF. Fl: All status flags.
SUB/SBB     dst, src        8/16/32  8086  SBB subtracts CF. Fl: All status flags.
INC/DEC     r/m             8/16/32  8086  Fl: Does NOT affect Carry Flag (CF). Crucial!
CMP         r/m, r/m/imm    8/16/32  8086  Non-destructive SUB. Only updates Fl.
NEG         r/m             8/16/32  8086  Two's complement (0 - x).
MUL/IMUL    r/m             8/16/32  8086  Unsigned/Signed. Result in AX, DX:AX, or EDX:EAX. 186+ adds 3-op IMUL.
DIV/IDIV    r/m             8/16/32  8086  Divides AX, DX:AX, or EDX:EAX by operand. Trap #DE on divide by zero or quotient overflow.
AND/OR/XOR  r/m, r/m/imm    8/16/32  8086  Fl: Clears CF and OF. Updates ZF, SF, PF.
TEST        r/m, r/m/imm    8/16/32  8086  Non-destructive AND. Updates Fl.
NOT         r/m             8/16/32  8086  1's complement. Affects NO flags.
SHL/SHR     r/m, cl/imm     8/16/32  8086  Logical Shift. 186+ allows immediate != 1.
SAL/SAR     r/m, cl/imm     8/16/32  8086  Arithmetic Shift (SAR preserves sign bit).
ROL/ROR     r/m, cl/imm     8/16/32  8086  Rotate.
RCL/RCR     r/m, cl/imm     8/16/32  8086  Rotate through Carry Flag.
SHLD/SHRD   r/m, r, imm/cl  16/32    386   Double precision shift (across two registers).
XADD        r/m, r          8/16/32  486   Exchange + add. Atomic only with a LOCK prefix.
CMPXCHG     r/m, r          8/16/32  486   Compare and Exchange. (Pentium adds CMPXCHG8B).
BCD Ops     DAA, DAS, etc.  8        8086  Legacy. Decimal Adjust. Often implemented incorrectly.
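
Since Carry and Overflow are the flags most often confused, here is a minimal sketch of an 8-bit ADD/ADC that derives CF, OF and AF separately. AddResult and add8 are hypothetical names. CF is the unsigned carry out of bit 7, OF is set when both operands share a sign that the result does not, and AF is the carry out of bit 3.

```rust
/// Result of an 8-bit addition with the three carry-related flags.
#[derive(Debug, PartialEq, Eq)]
pub struct AddResult {
    pub value: u8,
    pub cf: bool,
    pub of: bool,
    pub af: bool,
}

/// 8-bit ADD (carry_in = false) or ADC (carry_in = current CF).
pub fn add8(a: u8, b: u8, carry_in: bool) -> AddResult {
    let wide = a as u16 + b as u16 + carry_in as u16;
    let value = wide as u8;
    AddResult {
        value,
        // Unsigned overflow: the true sum does not fit in 8 bits.
        cf: wide > 0xFF,
        // Signed overflow: result sign differs from both operand signs.
        of: ((a ^ value) & (b ^ value) & 0x80) != 0,
        // Carry from bit 3 into bit 4.
        af: ((a ^ b ^ value) & 0x10) != 0,
    }
}
```

7Fh + 01h sets OF but not CF, while FFh + 01h sets CF but not OF.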

3. Integer Core: Control Flow

Directly modifies IP/EIP.

Mnemonic    Operands  CPU   Emulator Notes
JMP         rel/r/m   8086  IP update. Short/Near/Far types. Verify absolute vs relative logic.
CALL        rel/r/m   8086  Pushes return address (IP or CS:IP), then Jumps.
RET / RETF  imm?      8086  Pops IP (and CS if Far). Optional imm added to SP (stdcall).
Jcc         rel       8086  Conditional Jump (JE, JNE, JG, etc). Checks Fl. 386+ adds near conditional.
LOOP/x      rel8      8086  Decs (E)CX. Jumps if (E)CX!=0 (and Z-flag check for LOOPE/NE).
INT n       imm8      8086  Software Interrupt. Pushes Flags, CS, IP. Vectors via IDT/IVT.
IRET        -         8086  Return from Interrupt. Pops IP, CS, Flags. Handles Task Switch in Protected Mode.

4. Integer Core: String Operations

Usually prefixed with REP, REPE, REPNE. Checks DF (Direction Flag).

Mnemonic  Operation             CPU   Emulator Notes
MOVS      [ES:DI] = [DS:SI]     8086  Inc/Dec SI/DI based on operand size.
CMPS      CMP [DS:SI], [ES:DI]  8086  Compare memory.
SCAS      CMP Acc, [ES:DI]      8086  Compare Accumulator (AL/AX/EAX) with memory.
LODS      Acc = [DS:SI]         8086  Load memory to Accumulator.
STOS      [ES:DI] = Acc         8086  Store Accumulator to memory.

5. System & Protected Mode

Essential for OS booting.

Mnemonic    Description               CPU   Emulator Notes
LGDT/LIDT   Load GDT/IDT Register     286   Reads 6 bytes (limit + base) from memory.
LLDT/LTR    Load LDT/Task Register    286   Selector loads.
MOV CRn, r  Move to/from Control Reg  386   CR0 (PE, PG), CR3 (Page Dir), CR4 (Pentium+). Triggers mode switches.
LMSW        Load Machine Status Word  286   Precursor to CR0 modification.
CLTS        Clear Task Switched Flag  286   Used in FPU context switching logic.
CPUID       Processor ID              Pent  Returns features in EAX/EBX/ECX/EDX. Also present on late 486 models.
RDTSC       Read Time Stamp           Pent  64-bit cycle count to EDX:EAX.

6. FPU (x87 Floating Point)

Operates on 80-bit stack ST(0)...ST(7). Distinct from Integer registers.

Critical: FPU Status Word (SW) contains condition codes (C0-C3).

Data Transfer

Mnemonic      Operands           Notes
FLD           m32/64/80 / st(i)  Push Value to Stack. (Converts m32/m64 floats to 80-bit ext; integers use FILD.)
FST / FSTP    m32/64/80 / st(i)  Store (Copy) / Store & Pop. Only FSTP accepts m80.
FILD          m16/32/64          Load Integer (convert to float) & Push.
FIST / P      m16/32/64          Store Integer (convert float to int). Only FISTP accepts m64.
FXCH          st(i)              Swap ST(0) with ST(i).
FBLD / FBSTP  m80                Load/Store BCD (Decimal) 80-bit.

Arithmetic

Most instructions have a P variant (e.g., FADDP) which pops the stack after op.

Mnemonic     Description         Notes
FADD / FSUB  Add / Subtract      st(0) += src. Watch for NaNs and Infinities.
FMUL / FDIV  Multiply / Divide   st(0) *= src. Handle #Z (Divide by Zero).
FPREM / 1    Partial Remainder   FPREM truncates the quotient (8087 behaviour); FPREM1 (387+) is the IEEE 754 remainder. Important for trig reduction.
FABS         st(0) = abs(st(0))  Clears sign bit.
FCHS         st(0) = -st(0)      Inverts sign bit.
FRNDINT      Round to Integer    Uses RC (Rounding Control) field in Control Word.
FSCALE       st(0) * 2^st(1)     Fast multiplication by power of 2. st(1) is truncated toward zero first.
FSQRT        Square Root

Comparison & Control

Mnemonic       Description              Emulator Notes
FCOM / P / PP  Compare                  Sets C0, C2, C3 in Status Word. Does not set CPU Flags.
FCOMI / P      Compare                  Sets CPU Flags (ZF, PF, CF) directly (Pentium Pro+).
FSTSW          Store Status Word        Usually FSTSW AX. Used to move FPU conditions to CPU flags (via SAHF).
FINIT          Initialize FPU           Reset Control/Status words, tags to Empty.
FLDCW / FSTCW  Load/Store Control Word  Sets Rounding Mode, Precision, Exception Masks.
FWAIT          Wait                     Checks for pending FPU exceptions.
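
The FSTSW AX / SAHF idiom works because the condition codes land on flag positions once the high byte of the status word is copied into FLAGS: C0 (bit 8) becomes CF, C2 (bit 10) becomes PF, and C3 (bit 14) becomes ZF. The helper below sketches that mapping; the function name and tuple order are choices made for this example.

```rust
/// Emulate FSTSW AX followed by SAHF for the compare-related flags.
/// Returns (cf, pf, zf) as they would appear in FLAGS afterwards.
pub fn fpu_status_to_cpu_flags(status_word: u16) -> (bool, bool, bool) {
    // FSTSW AX puts status bits 8-15 into AH; SAHF copies AH into FLAGS.
    let ah = (status_word >> 8) as u8;
    let cf = ah & 0x01 != 0; // C0 -> CF (FLAGS bit 0)
    let pf = ah & 0x04 != 0; // C2 -> PF (FLAGS bit 2)
    let zf = ah & 0x40 != 0; // C3 -> ZF (FLAGS bit 6)
    (cf, pf, zf)
}
```

After FCOM, "equal" sets only C3 (ZF), "less than" sets only C0 (CF), and an unordered compare sets all three.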

Transcendental (Trig)

Hard to emulate perfectly bit-exact due to internal microcode variations.

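Mnemonic     Description
FSIN / FCOS  Sine / Cosine of ST(0).
FSINCOS      Computes both. Replaces ST(0) with the sine, then pushes the cosine (ST(0) = cos, ST(1) = sin).
FPTAN        Partial Tangent. Replaces ST(0) with the tangent, then pushes 1.0.
FPATAN       Partial Arctangent: atan(st(1) / st(0)), then pops.
FYL2X        st(1) * log2(st(0))
FYL2XP1      st(1) * log2(st(0) + 1)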

Constants

Mnemonic         Value Pushed
FLDZ / FLD1      0.0 / 1.0
FLDPI            π
FLDL2T / FLDL2E  log₂10 / log₂e
FLDLG2 / FLDLN2  log₁₀2 / ln2

7. MMX (Pentium MMX)

SIMD operations. Registers MM0-MM7 are aliased onto the low 64 bits (the significand) of the physical x87 registers R0-R7, not the stack-relative ST(i) names.

Warning: Mixing MMX and FPU instructions without EMMS corrupts data.

Mnemonic      Operands         Operation
EMMS          -                Empty MMX State. Sets FPU tags to empty. Must be called before returning to Float code.
MOVD          mm, r/m32        Move 32-bit (Doubleword).
MOVQ          mm, mm/m64       Move 64-bit (Quadword).
PACKSS/US     mm, mm/m64       Pack with Signed/Unsigned Saturation (e.g., 16-bit -> 8-bit).
PUNPCKH/L     mm, mm/m64       Unpack (Interleave) High/Low data.
PADD/SUB      mm, mm/m64       Parallel Add/Sub. Wraps around on overflow.
PADDS/US      mm, mm/m64       Parallel Add/Sub with Saturation (clamps to min/max).
PMULL/H       mm, mm/m64       Parallel Multiply (stores Low or High bits of result).
PMADDWD       mm, mm/m64       Multiply-Add (Dot product backbone).
PAND/OR/XOR   mm, mm/m64       Bitwise logical operations.
PCMPGT/EQ     mm, mm/m64       Parallel Compare. Result is mask of 1s (True) or 0s (False).
PSLL/SRL/SRA  mm, mm/m64/imm8  Shift Packed Data (Logical Left/Right, Arithmetic Right).
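
The difference between wrapping and saturating packed arithmetic is easiest to see per lane. The sketch below models a 64-bit MMX register as a u64 holding eight unsigned byte lanes and implements PADDB (wrap) and PADDUSB (unsigned saturation). The helper names are hypothetical.

```rust
/// Apply `op` to each of the eight byte lanes of two 64-bit values.
fn map_lanes(a: u64, b: u64, op: impl Fn(u8, u8) -> u8) -> u64 {
    let (x, y) = (a.to_le_bytes(), b.to_le_bytes());
    let mut out = [0u8; 8];
    for i in 0..8 {
        out[i] = op(x[i], y[i]);
    }
    u64::from_le_bytes(out)
}

/// PADDB: each byte lane wraps independently; no carry crosses lanes.
pub fn paddb(a: u64, b: u64) -> u64 {
    map_lanes(a, b, u8::wrapping_add)
}

/// PADDUSB: each byte lane clamps at 0xFF instead of wrapping.
pub fn paddusb(a: u64, b: u64) -> u64 {
    map_lanes(a, b, u8::saturating_add)
}
```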

Emulator Implementation Tip: The ModR/M Byte

For most integer and MMX instructions, immediately following the opcode is the ModR/M byte. You must parse this correctly to identify the operands.

  • Format: [Mod (2 bits)] [Reg/Opcode (3 bits)] [R/M (3 bits)]
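  • Mod 00: [reg] (Memory). Exception: R/M 110 with 16-bit addressing, or R/M 101 with 32-bit addressing, means a bare displacement with no base register.
  • Mod 01: [reg + disp8] (Memory); disp8 is sign-extended
  • Mod 10: [reg + disp16] with 16-bit addressing, [reg + disp32] with 32-bit addressing (Memory)
  • Mod 11: reg (Direct Register)

Note: With 32-bit addressing, the SIB (Scale Index Base) byte follows ModR/M if R/M == 100 and Mod != 11.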
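
Splitting the byte into its three fields is a pair of shifts and masks. The sketch below uses hypothetical names (ModRm, decode_modrm, has_sib) and only separates the fields; turning them into operands is left to the decoder.

```rust
/// The three fields of a ModR/M byte.
#[derive(Debug, PartialEq, Eq)]
pub struct ModRm {
    pub mode: u8, // bits 7-6
    pub reg: u8,  // bits 5-3: register number or opcode extension
    pub rm: u8,   // bits 2-0
}

pub fn decode_modrm(byte: u8) -> ModRm {
    ModRm {
        mode: byte >> 6,
        reg: (byte >> 3) & 7,
        rm: byte & 7,
    }
}

impl ModRm {
    /// True when a SIB byte follows: 32-bit addressing, a memory operand,
    /// and R/M == 100.
    pub fn has_sib(&self, addr32: bool) -> bool {
        addr32 && self.mode != 3 && self.rm == 4
    }
}
```

For example, D8h decodes to Mod 11, Reg 011, R/M 000, a register-to-register form with no displacement.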

References