Bringing SMP to Your UP Operating SystemSidney Cammeresi
SigOps OS Tutorial to teach the fundamentals of symmetric multprocessing using Intel MP compliant hardware. Knowledge of the concepts and implementations of basic operating system parts such as managing virtual memory and multitasking are assumed and will not be discussed except as they relate to multiprocessing. Knowledge equivalent to an intermediate or advanced computer architecture college course will be helpful in understanding scheduling issues, but is not required.
This tutorial is not intended to be a complete explanation of how to implement an SMP-capable operating system, nor as a replacement for Intel's documentation. Rather it is designed to give an overview of the things I learned in writing SMP support for OpenBLT, a freely redistributable microkernel-based operating system under the BSD licence. Particularly, some tedious hardware aspects will not be discussed in detail when the reader could just as easily read official Intel documentation. The interested reader should refer to the references for more detailed information. For code examples, the reader should refer to the source code of OpenBLT or FreeBSD. The Linux kernel source code might be helpful, although it is under the GPL.
This tutorial is a work in progress. If you see an error or something that needs clarification, please e-mail me.
First, you need to find the floating pointer structure. According to the spec, it can be in one of four places: (1) in the first kilobyte of the extended BIOS data area, (2) the last kilobyte of base memory, (3) the top of physical memory, or (4) the BIOS read-only memory space between 0xe0000 and 0xfffff. You need to search these areas for the four-byte signature "_MP_" which denotes the start of the floating pointer structure. Absence of this structure indicates that the system is not MP compliant. At this point your operating system can either halt, or it can fall back into a UP setup.
You should checksum the structure to make sure it has not been corrupted. There is not much of interest in the floating pointer structure, unless your system does not have a configuration table. In this case, you will need to get the number of the default configuration your system adheres to and set up the system for SMP using those parameters. Otherwise, you will need to get the address of the configuration table and begin parsing that.
The configuration table is divided into three parts: a header, a base section, and an extended section. The header begins with the four-byte signature "PCMP", although you do not have to search for it. Once you find it, checksum it. At this point, you can print the OEM and product ID strings in the configuration table if you want. You should get the address of the local APIC from this and store it. Then, proceed to parse the base section.
The base section consists of a set of entries that describe either processors, system busses, I/O APICs, I/O interrupt assignments, or local interrupt assignments. All entries are eight bytes in length, save processor entries which are twenty bytes. The first byte of each entry denotes the type of the entry. Look through each entry. You will probably want to generate quite a few OS-specific data structure here. In particular, you will want to note the APIC ID of each processor in the system, its version, and its type as well as the address of the system's I/O APIC.
Since the APs will wake up in real mode, everything they need to get started should be in low memory (below 0x100000 or one megabyte). First, set the shutdown code by setting address 0:f to 0xa. Then, grab a page of memory for the AP's stack. You will also need space to store the `trampoline' code, i.e. the code the processor executes after waking up to switch to protected mode and jump to the kernel. You can either use the same page of code for each processor or store the code at the bottom of the processor's stack. Note that the start of the code must be at a page-aligned address. Copy the code there, then set the warm reset vector at address 40:67 to the start of this code. Next, you should reset a bit in the kernel which the processor will use to signal that it has booted and finished initialisation and clear any APIC error by writing a zero to the error status register. If you need to pass any parameters or data to the AP, now would be a good time to set that up. For example, since OpenBLT's kernel runs in high memory, I have to pass the address of the page directory in memory so that the AP can load it and enable paging before calling the kernel.
Now you can actually boot the processor. The procedure consists of sending a sequence of interrupts to the processor. The incremental effect of each is undefined, but at the end of the sequence, the processor will be booted. First send an INIT IPI. Assert the INIT signal by writing the target processor's APIC ID to the high word of the interrupt command register. Then write to the low word with the bits set to enable the INIT delivery mode, level triggered, and assert the interrupt. Deassert INIT by repeating the procedure with the assert bit reset. Now, wait 10 ms. Use of the APIC timer is suggested.
If the local APIC is not an 82489dx, you need to send two STARTUP IPIs. Clear APIC errors, set the target APIC ID in the ICR, then send the interrupt by writing to the low word of the ICR with bits set for STARTUP delivery mode and with the code vector in the low byte. The code vector is the physical page number at which the processor should start executing, i.e. the start of your trampoline code. Wait 200 ms, then check the low word of the ICR to make sure bit 16 is reset to indicate the interrupt was dispatched before sending the second STARTUP. After sending it, spin and wait for the AP to set its ready bit in memory. You may want to set a timeout of 5 seconds, after which you assume the processor did not wake up.
Don't reference any symbols in this code since it will be running at an address for which it was not linked; all memory references must be absolute. Since your kernel is above one megabyte in memory, you can't access any global variables in real mode. Also be careful in specifying your offset address for the ljmp instruction, and do specify the address of the start of your kernel, not a symbol in the instruction that goes into the kernel. Jumping to a symbol doesn't seem to work. For details, see OpenBLT's kernel/trampoline.S.
Debugging this part is really not too bad. What you have to do is establish some communication space in low memory, then have the AP write bytes to that memory to explain what it is doing and print these out on the BSP.
Using the local APIC, you can send interrupts to all processors, all processors but the one sending the interrupt, or a specific processor or logical address as well as self-interrupts. To send an IPI, write the destination APIC ID, if needed, into the high word of the ICR, then write the low word of ICR with the destination shorthand and interrupt vector set to send the IPI. Be sure to wrap these functions in spinlocks. You might want to turn off interrupts as well while sending IPIs.
developer.intel.com. Supposedly, by request, Intel will also send you printed documentation by post.
07 Nov 1998 - initial version, most sections filled out.
Bringing SMP to Your UP Operating System is Copyright © 1998 by Sidney Cammeresi in its entirety. All rights reserved.
Permission is granted to make verbatim copies of this tutorial for non-commercial use provided this notice remains intact on all copies.
$Id: smp.html 1.3 Thu, 01 Jul 1999 10:51:51 -0500 sac $