The Protected Mode Tutorial

by : Peter Quiring
date : Aug/97

Table of Contents

Overview
Why Protected mode?
The 80386 Processor
Selectors and descriptors
Descriptors
Exception Handlers
Special 386 Registers
V86 Mode
The DPMI Standard
Calling RMODE from PMODE
Transferring control from RMODE to PMODE
Default Segment Size
16bit PMODE vs. 32bit PMODE
Last note and "QLIB"

Overview

The purpose of this tutorial is to teach programmers the difference between protected mode and real mode, and how to start programming in protected mode using a DOS extender (examples of DOS extenders are DOS/4GW and PMODE/W).
Real mode is the original 8086 mode where you have 16bit registers and can address only 1MB of memory, of which just 640k is available to programs.
Protected mode offers much more.
This tutorial is intended for programmers who are already quite good at programming in real mode and want to move on to something better, but do not want to waste their time learning to program in the complicated and ugly Windows environment just to make a dumb calculator.
If you need to learn about real mode and how to program for the PC at all, I suggest you try another homepage or wait to see if I create such a tutorial.
Note that this tutorial is to teach you how to program in protected mode using a DOS extender, not to create a DOS extender. I will create such a tutorial later.
Here are three important abbreviations I will use throughout this tutorial:
RMODE = real mode
PMODE = protected mode
V86 = virtual 8086 mode

Why Protected Mode?

Why start programming in PMODE? Well, it's a great move from RMODE. Once in protected mode the programmer has far more resources available directly. If you've ever used XMS or EMS you know it's very slow. Why not access all that extra memory directly, in PMODE? No more asking some memory manager to move pages of memory below the 640k barrier so that you can access them.
Protected mode also offers a lot of program security, but most programmers do not really concern themselves with this aspect unless they are planning to create their own operating system. I may create such a tutorial on my homepage someday.
Getting to PMODE on your own is very hard, but DOS extenders have made this problem very easy. A DOS extender moves your program from real mode into PMODE before your program starts. Programming in PMODE could not be any easier.

The 386 Processor

With the creation of the 80386 processor, 32bit PMODE was born. The 80286 was the first CPU to have PMODE, but it was only 16bit PMODE and is hardly ever used. The 80286 could only address 16MB of RAM, which the 386 increased to 4GB. When in 32bit PMODE you can access all memory directly. No need to go through an XMS or EMS driver anymore.
Since the 80386 CPU is a 32bit processor all registers are now 32bits wide.
As shown here:

bit numbers:
31      16 15 8 7  0
+---------+----+----+
|         | AH | AL |
+---------+----+----+
          [-- AX  --]
[------- EAX -------]

+---------+----+----+
|         | BH | BL |
+---------+----+----+
          [-- BX  --]
[------- EBX -------]

+---------+----+----+
|         | CH | CL |
+---------+----+----+
          [-- CX  --]
[------- ECX -------]

+---------+----+----+
|         | DH | DL |
+---------+----+----+
          [-- DX  --]
[------- EDX -------]

+---------+----+----+
|         |   SI    |
+---------+----+----+
[------- ESI -------]

+---------+----+----+
|         |   DI    |
+---------+----+----+
[------- EDI -------]

+---------+----+----+
|         |   BP    |
+---------+----+----+
[------- EBP -------]

+---------+----+----+
|         |   SP    |
+---------+----+----+
[------- ESP -------]

+---------+----+----+
|         |  flags  |
+---------+----+----+
[----- Eflags ------]
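
The overlap shown in these diagrams is just bit masking. Here is a rough sketch in C of how AX, AH and AL sit inside EAX. The function names are made up for this illustration and are not part of any library.

```c
#include <assert.h>
#include <stdint.h>

/* Bits 15..0 of a 32bit register (AX within EAX). */
static uint16_t reg_low16(uint32_t e) { return (uint16_t)(e & 0xFFFFu); }

/* Bits 15..8 (AH within EAX). */
static uint8_t reg_high8(uint32_t e) { return (uint8_t)((e >> 8) & 0xFFu); }

/* Bits 7..0 (AL within EAX). */
static uint8_t reg_low8(uint32_t e) { return (uint8_t)(e & 0xFFu); }

/* Loading AX replaces bits 15..0 and leaves bits 31..16 untouched. */
static uint32_t reg_set_low16(uint32_t e, uint16_t ax)
{
    return (e & 0xFFFF0000u) | ax;
}

/* Small self-check with EAX = 12345678h. */
void reg_alias_check(void)
{
    uint32_t eax = 0x12345678u;
    assert(reg_low16(eax) == 0x5678u);   /* AX */
    assert(reg_high8(eax) == 0x56u);     /* AH */
    assert(reg_low8(eax)  == 0x78u);     /* AL */
    assert(reg_set_low16(eax, 0xABCDu) == 0x1234ABCDu);
}
```

Note that there is no name for the upper 16 bits on their own. To get at them you have to shift or rotate the whole 32bit register.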

All of the segment registers are still 16bit, but they work a little differently, as shown in the next section.
There are also two new segment registers, FS and GS. They work exactly like ES and can be used however the programmer sees fit.
There are also many other new system registers that configure how the system operates and allow CPU debug trapping, paging (which allows virtual memory) and much more. But if you do not intend to build your own DOS extender, you need not concern yourself with them.
These 32bit registers allow us to access up to 4GB of memory. If you do a little math, 2 ^ 32 gives you 4GB (over 4 billion bytes). Of course you'll probably never see 4GB, but the potential is there. Intel has already created a new processor, the P2, that has a 36bit wide address bus, enabling it to access 64GB of RAM, can you imagine? Don't confuse this 32bit/36bit address bus size with the CPU's data bus size, which is 64bit for the Pentium processors.
One last note about the new addressing modes. The 80386 has a great new addressing mode called scaled index base mode (SIB).
Basically you may now put TWO registers inside the square brackets (any two general registers, except that ESP can not be the scaled one) and multiply one of them by 1, 2, 4 or 8. You may omit one of the registers if you like. You can still add an offset too.
Examples:

mov al,[eax*2]
mov bl,[ebx+ecx*4+my_offset]
mov [esp+eax*2],edi
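
To make the arithmetic behind these addressing forms concrete, here is a small C sketch of the offset the CPU works out. The name sib_offset is hypothetical, and the sketch only models the sum, not the memory access itself.

```c
#include <assert.h>
#include <stdint.h>

/* offset = base + index*scale + disp, wrapped to 32 bits.
   Pass 0 for a register or displacement that is omitted. */
static uint32_t sib_offset(uint32_t base, uint32_t index,
                           unsigned scale, uint32_t disp)
{
    /* The scale factor can only be 1, 2, 4 or 8. */
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return base + index * (uint32_t)scale + disp;
}
```

For example, with EBX = 1000h, ECX = 3 and my_offset = 20h, the operand [ebx+ecx*4+my_offset] refers to offset sib_offset(0x1000, 3, 4, 0x20), which is 102Ch.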

There are also many new instructions in the 80386 which you can learn about in your favorite 80386+ reference.
Here are a few important instructions:

iretd - 32bit iret
pushad/pushfd/popad/popfd - 32bit versions

Remember to add the ".386" directive at the beginning of your ASM files to use these new instructions.

Selectors and Descriptors

When the 386 is in real mode it operates just like an 8086 processor. Every time there is an access to memory, the value in the segment register is multiplied by 10h and the offset is added to it to get the "physical" address.
Well in PMODE everything changes, and I mean everything! The segment registers now hold an offset into system tables. Inside these system tables are blocks of information that define the characteristics of the segment, like where it is in memory and how big it is. The offset held in a segment register is called a selector, and the blocks of info in the system tables are called descriptors.
The descriptors also contain a lot of other info, like who is allowed to use the descriptor and what is its purpose. Descriptors can point to segments that could contain code, data, an interrupt handler and much more.
Therefore in PMODE you can no longer load the segment registers with whatever value you want in order to access memory. You must use selectors that are provided to you by the system, or ones that you allocate from the system. Segment registers can no longer hold a segment value such as 0a000h. To access memory below the 640k mark you must use a selector whose segment starts in the region you need. Since the registers are now 32bit, you can use one selector that starts at the very beginning of memory and access the entire 4GB memory range (including the 1st 640k) without ever having to reload the segment register with another selector.
For example, if your DS register was loaded with such a selector then the following is possible.

mov edi,0b8000h    ;Colour text video segment (0b800h * 10h)
mov [edi],al       ;put a byte on the text screen
mov edi,0400h      ;BIOS data segment (0040h * 10h)
mov eax,[edi+6ch]  ;get the BIOS timer tick count (updated by IRQ#0)

Where did I get those values to load into EDI?
You take the segment you want to access, multiply it by 10h and use that as the offset. Since the base of the segment is zero, any offset will be the final address.
With many of the DOS extenders available today, all the selectors provided begin at zero (including CS). This makes accessing conventional memory (below 640k) very easy.
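
The conversion from an RMODE segment:offset pair to an offset inside a zero-based selector is simple enough to sketch in C. The function name rm_to_linear is made up for this example, and the sketch assumes the selector really does have a base of zero.

```c
#include <assert.h>
#include <stdint.h>

/* RMODE segment:offset -> offset within a selector whose base is 0. */
static uint32_t rm_to_linear(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 4) + off;   /* seg * 10h + off */
}

/* The values used in the listing above. */
void rm_to_linear_check(void)
{
    assert(rm_to_linear(0xB800u, 0u) == 0xB8000u);  /* colour text screen */
    assert(rm_to_linear(0x0040u, 0u) == 0x00400u);  /* BIOS segment */
}
```

The same rule gives 0a0000h for the graphics segment 0a000h.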

The system tables (GDT, LDT, IDT)

Within every system there are three types of tables:
GDT - Global Descriptor Table
LDT - Local Descriptor Table
IDT - Interrupt Descriptor Table

These tables each serve an important purpose while in PMODE. The GDT is a system wide table used by every program in the system. This table can hold up to 8,192 descriptors, but the 1st descriptor is reserved and can not be used. The reserved descriptor is called the NULL descriptor. Loading the NULL descriptor into a segment register is valid, but if you then attempt to use that segment you will get into trouble. I'll get into that later.
The LDT is the same as the GDT except that each task (program) in the system _may_ have its own LDT.
The IDT is where all the interrupt handlers are stored. This table can only hold up to 256 descriptors. Note that the RMODE interrupt table is totally separate from the PMODE interrupt table. The RMODE table is always at 0:0 but the PMODE table can exist anywhere in memory.
To reference any of these descriptors you need a selector which looks like this:

Selector layout
     +---------------------------+
bit  | 15 ...     3 | 2  | 1   0 |
desc | Descriptor   | TI |  PL   |
     +---------------------------+

Bits 15-3 are the descriptor index. This number is 13 bits wide, and with a little math we get 2^13 = 8192. That's where the 8,192 descriptors come from.
Bit 2 is the TI (table indicator). If this bit is 0 the selector points to a descriptor in the GDT, otherwise (1) it points to one in the LDT. A selector can not point into the IDT.
Bits 1 and 0 are the privilege level. This has to do with the protection mechanism of the CPU. I will go over this a little later on.
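
Pulling the three fields out of a selector is plain shifting and masking. A minimal C sketch follows. The helper names are hypothetical and only mirror the layout drawn above.

```c
#include <assert.h>
#include <stdint.h>

/* Bits 15..3: index of the descriptor within its table. */
static unsigned sel_index(uint16_t sel) { return (unsigned)sel >> 3; }

/* Bit 2: table indicator, 0 = GDT, 1 = LDT. */
static unsigned sel_ti(uint16_t sel) { return ((unsigned)sel >> 2) & 1u; }

/* Bits 1..0: privilege level. */
static unsigned sel_pl(uint16_t sel) { return (unsigned)sel & 3u; }

/* Build a selector from its three fields. */
static uint16_t sel_make(unsigned index, unsigned ti, unsigned pl)
{
    assert(index < 8192u && ti < 2u && pl < 4u);
    return (uint16_t)((index << 3) | (ti << 2) | pl);
}
```

So selector 0008h is descriptor 1 in the GDT at privilege level 0, and the byte offset of a descriptor within its table is simply the selector with the low three bits cleared.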
The CPU has special registers to define where these tables exist in memory and how big each table is. They are:
gdtr - GDT Register
idtr - IDT Register
The LDT does not have a register like these, I'll show where it is in the next section.
Each register is 48bits: 32bits for the location in memory and another 16bits to indicate how large the table is.
The limit = number of bytes - 1. (Not the # of descriptors).
-or-
The limit = # descriptors * 8 - 1. (since each descriptor is 8 bytes)
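
As a worked example of the limit rule, here is a C sketch that computes the limit for a table of N descriptors and lays out the 48bit value. The byte order used (16bit limit first, then the 32bit base) is the in-memory format the LGDT and LIDT instructions take, which is my addition and not something stated above. The function names are made up.

```c
#include <assert.h>
#include <stdint.h>

/* limit = number of descriptors * 8 - 1 */
static uint16_t table_limit(unsigned ndesc)
{
    assert(ndesc >= 1u && ndesc <= 8192u);
    return (uint16_t)(ndesc * 8u - 1u);
}

/* Lay out the 6 bytes: limit in bytes 0-1, base in bytes 2-5. */
static void table_reg_pack(uint8_t out[6], uint32_t base, uint16_t limit)
{
    out[0] = (uint8_t)(limit & 0xFFu);
    out[1] = (uint8_t)(limit >> 8);
    out[2] = (uint8_t)(base & 0xFFu);
    out[3] = (uint8_t)((base >> 8) & 0xFFu);
    out[4] = (uint8_t)((base >> 16) & 0xFFu);
    out[5] = (uint8_t)(base >> 24);
}
```

A full GDT of 8,192 descriptors gives a limit of 0ffffh, and a full IDT of 256 descriptors gives 07ffh.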

Descriptors

There are 18 different types of descriptors you could possibly create and use. Many of them are reserved or old (like 286 stuff).
I will not get into much detail about them in this tutorial, since the DOS extender will handle most of these details for you. If you are interested in these things, wait for my Advanced PMODE Tutorial.
Basically each descriptor has the following info:
BASE : This defines where the segment starts in memory. An example would be 0a0000h for the video graphics segment.
LIMIT : This defines the highest offset that may be used within this segment. Most of the time the limit is set to the max (4GB) while using a DOS extender.
TYPE : This defines exactly what type of segment this is. There are generally three types here: code, data (or stack) and gates. Gates are usually interrupt handlers and such. It also defines whether the segment can be read from if it's a code segment, and whether it can be written to if it's a data segment. Stacks are simply writable data segments. The LDT is also really just a descriptor itself that must reside in the GDT. Note that some descriptors can not exist in certain system tables.
But you really don't need to worry about that unless you are going to create a DOS extender or OS.
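
For the curious, here is a C sketch of how BASE, LIMIT and TYPE are scattered across the 8 bytes of a code or data descriptor. The bit positions come from Intel's documented 386 descriptor format, not from this tutorial, and the function names are made up. The access byte and flag values in the usage note (9ah, 92h, 0ch) are likewise standard encodings, given only as examples.

```c
#include <assert.h>
#include <stdint.h>

#define DESC_FLAG_G 0x8u   /* limit counts 4k pages instead of bytes */
#define DESC_FLAG_D 0x4u   /* default segment size is 32bit */

/* base: 32 bits, limit: 20 bits, access: type byte, flags: 4 bits. */
static void desc_pack(uint8_t d[8], uint32_t base, uint32_t limit,
                      uint8_t access, uint8_t flags)
{
    assert(limit <= 0xFFFFFu && flags <= 0xFu);
    d[0] = (uint8_t)(limit & 0xFFu);           /* limit 7..0   */
    d[1] = (uint8_t)((limit >> 8) & 0xFFu);    /* limit 15..8  */
    d[2] = (uint8_t)(base & 0xFFu);            /* base 7..0    */
    d[3] = (uint8_t)((base >> 8) & 0xFFu);     /* base 15..8   */
    d[4] = (uint8_t)((base >> 16) & 0xFFu);    /* base 23..16  */
    d[5] = access;                             /* TYPE info    */
    d[6] = (uint8_t)(((limit >> 16) & 0x0Fu) | ((unsigned)flags << 4));
    d[7] = (uint8_t)(base >> 24);              /* base 31..24  */
}

static uint32_t desc_base(const uint8_t d[8])
{
    return (uint32_t)d[2] | ((uint32_t)d[3] << 8) |
           ((uint32_t)d[4] << 16) | ((uint32_t)d[7] << 24);
}

static uint32_t desc_limit(const uint8_t d[8])
{
    return (uint32_t)d[0] | ((uint32_t)d[1] << 8) |
           ((uint32_t)(d[6] & 0x0Fu) << 16);
}

/* Highest usable offset, taking the granularity flag into account. */
static uint32_t desc_last_offset(const uint8_t d[8])
{
    uint32_t limit = desc_limit(d);
    if ((d[6] >> 4) & DESC_FLAG_G)
        return (limit << 12) | 0xFFFu;
    return limit;
}
```

A flat 4GB code segment like the ones a DOS extender hands you would be desc_pack(d, 0, 0xFFFFF, 0x9A, DESC_FLAG_G | DESC_FLAG_D). A 64k data segment at 0a0000h would be desc_pack(d, 0xA0000, 0xFFFF, 0x92, 0).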

Exceptions

Exceptions are a very important topic. An exception (or fault) is something the CPU generates when something goes wrong. Exceptions are really just interrupts, numbered from INT #0h to INT #01fh, each with its own handler.
The most popular exception is #13 (INT #0dh). Its handler catches most programming bugs such as these:

loading CS (with a JMP or CALL) with a descriptor that is not a code segment
writing into the code segment
loading SS with a descriptor that is not a writable data segment
accessing a segment beyond the allowed limit
attempting to switch to PMODE directly while in V86 mode

And there are many, many more.
I will soon create a PMODE Reference where I will list all exceptions and their meanings. Here are some of interest though:

INT # 0h - Divide by zero
INT # 1h - Debug trap (using advanced 386 trapping hardware)
INT # 3h - User Breakpoint (debugging)
INT # 0dh - General Exc Handler (catches all exceptions that do not belong to any other exception handler - the catch-all handler).
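
If you write your own exception reporting, a lookup like the following C sketch may be handy. It only names the exceptions listed above (the general protection exception is vector 13 decimal, which is 0dh) and lumps the rest of the 0h to 1fh range together. The function name is made up.

```c
#include <assert.h>
#include <string.h>

/* Name an interrupt vector if it is one of the exceptions listed above. */
static const char *exception_name(unsigned vec)
{
    assert(vec <= 0xFFu);                 /* there are only 256 vectors */
    switch (vec) {
    case 0x00: return "Divide by zero";
    case 0x01: return "Debug trap";
    case 0x03: return "User breakpoint";
    case 0x0D: return "General exception";
    default:
        return vec <= 0x1Fu ? "Other exception" : "Not an exception";
    }
}

void exception_name_check(void)
{
    assert(strcmp(exception_name(13u), "General exception") == 0);
    assert(strcmp(exception_name(0x20u), "Not an exception") == 0);
}
```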

Special Registers (CRx)

The 386 has many new registers for system configuration, debugging, testing and more.
Here are a few you may want to know about:
CR0 - This is a very important one. Bit#0 indicates if the CPU is in PMODE(1) or in RMODE(0).
CR1 -> CR3 - other system registers (CR1 is reserved, CR2 and CR3 are used for paging).
DR0 -> DR7 - debugging registers.
TR0 -> TR? - registers for testing the CPU (warning: these can change or be removed at any time from CPU to CPU).

Virtual 8086 Mode (V86)

The 386 also has a special mode called V86 mode. This is basically a special form of PMODE. When in this mode the CPU is still in PMODE but acts as if it is executing in RMODE. This is done so that the protection PMODE offers stays intact while older RMODE applications can still be executed.
To do this, many instructions such as INT are emulated by the exception #13 (GPF) handler.
This mode is in use when bit #17 of EFLAGS is set.
You can not set this bit by simply popping a value into the flags.
This is the mode your RMODE programs run under if the system is currently in PMODE. Your system is running in PMODE if you are using any of the following: Windoze, QEMM, EMM386, etc.
So when I say RMODE I also mean V86 mode unless specified.
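
The two bits mentioned so far (bit #0 of CR0 and bit #17 of EFLAGS) are enough to tell the three modes apart. Here is a C sketch of that decision. It takes the register values as plain numbers, since actually reading CR0 is a privileged operation, and the names are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

enum cpu_mode { MODE_RMODE, MODE_PMODE, MODE_V86 };

#define CR0_PE    (1u << 0)    /* bit #0 of CR0: PMODE enabled */
#define EFLAGS_VM (1u << 17)   /* bit #17 of EFLAGS: V86 mode  */

static enum cpu_mode classify_mode(uint32_t cr0, uint32_t eflags)
{
    if (!(cr0 & CR0_PE))
        return MODE_RMODE;                /* PMODE bit clear */
    return (eflags & EFLAGS_VM) ? MODE_V86 : MODE_PMODE;
}

void classify_mode_check(void)
{
    assert(classify_mode(0x00000000u, 0x00000002u) == MODE_RMODE);
    assert(classify_mode(0x00000001u, 0x00000002u) == MODE_PMODE);
    assert(classify_mode(0x00000001u, 0x00020002u) == MODE_V86);
}
```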

The DPMI Standard

First a little history about the 386 CPU and its software. When the PMODE operating mode was first introduced it was not handled correctly. RMODE applications would simply set bit#0 of CR0 to get to PMODE themselves. But if another application were started while in V86 mode (if the 1st program tried to launch another app in V86 mode), the system would freeze.
So VCPI was designed, which kept the system in V86 mode and allowed many applications access to the system in a common fashion.
But VCPI soon became unusable in multi-tasking environments (such as Windows) because it gave apps too much power, and so the DPMI server was born.
DPMI (DOS Protected Mode Interface) allows many applications to get to PMODE while keeping system security much higher than VCPI could.
DPMI also provides many services to alloc/free system resources, which we will look into in great detail.
Note that most DOS extenders today provide these DPMI services themselves, even if no DPMI server exists. To do this the DOS extender may convert the DPMI calls to VCPI equivalents. DPMI has become a very popular standard, and this makes programming in PMODE much easier.
The DPMI server works very much the same way as the DOS INT 21h handler works. To call the DPMI server you must use INT 31h, in PMODE only. You can not call the DPMI server from RMODE except for detection and requesting a switch to PMODE.

Calling RMODE from PMODE

While running in PMODE you may find it necessary to get back to RMODE to call an RMODE INT or another RMODE function. To do this you have several options.
If you are going to call an RMODE INT which does not require a parameter in one of the segment registers, you may simply call the INT from PMODE as you would in RMODE. Just load the registers as needed and issue an INT ##. The DPMI host has set up most of the PMODE interrupt handlers to simply pass the interrupt call down to RMODE, passing along just the general registers (including EBP, ESI, EDI). The reason you can not pass anything in the segment registers is that they have a different meaning in PMODE than they do in RMODE. If you load an RMODE segment number into a segment register while in PMODE, it may be a selector value that you do not have the rights to use, and this will cause your program to crash.
The DOS extender does extend some INT calls so that you can give a selector in the segment registers, and the DOS extender will make sure the RMODE handler can access that segment. Most INT 21h functions should be extended by your DOS extender in this manner.
Most extenders also extend other functions such as INT 33h. See your DOS extender's documentation for info.
If you still need to pass a buffer to the RMODE INT handler, or you must CALL an RMODE routine, then you must go through the DPMI server to do this.
The DPMI server provides a set of functions to call RMODE INTs and CALLs through the use of this special structure:

callstruct struct   ;50 bytes  (32h)
  _edi dd ?     ;0
  _esi dd ?     ;4
  _ebp dd ?     ;8
  _res dd ?     ;0ch reserved
  _ebx dd ?     ;10h
  _edx dd ?     ;14h
  _ecx dd ?     ;18h
  _eax dd ?     ;1ch
  _flg dw ?     ;20h flags
  _es dw ?      ;22h segments (NOT selectors)
  _ds dw ?      ;24h  "
  _fs dw ?      ;26h  "
  _gs dw ?      ;28h  "
  _ip dw ?      ;2ah ignored in some calls
  _cs dw ?      ;2ch  "
  _sp dw ?      ;2eh must be 0 to use system stacks
  _ss dw ?      ;30h  "
callstruct ends

This call structure must be filled in by you with the values you want to pass on to the RMODE INT handler or function. Remember to give segment values, not selectors!
The _sp and _ss fields must be zero so that the DPMI server will provide a system stack for the call. If you need to use your own special stack you can, but I see no reason for that. The DPMI server has many stacks, which allows you to call your function many times.
If you are CALLing an RMODE function then _cs and _ip must hold the address of the FAR function. You can not call NEAR functions in RMODE.
If you are requesting an INT in RMODE the _cs and _ip are ignored.
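
If you work from C rather than ASM, the same structure might be declared as in the sketch below. This is only an illustration: the member names drop the leading underscore, the two helper functions are hypothetical, and the pack pragma is assumed to be supported by your compiler (it is by Watcom, GCC and most others). The second helper shows the "segments, not selectors" rule by splitting the address of a conventional memory buffer into an RMODE ES:DI pair.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#pragma pack(push, 1)
struct callstruct {                 /* 50 bytes (32h) */
    uint32_t edi, esi, ebp, res;    /* 00h..0ch, res is reserved */
    uint32_t ebx, edx, ecx, eax;    /* 10h..1ch */
    uint16_t flg;                   /* 20h flags */
    uint16_t es, ds, fs, gs;        /* 22h..28h segments (NOT selectors) */
    uint16_t ip, cs;                /* 2ah, 2ch */
    uint16_t sp, ss;                /* 2eh, 30h, zero = use system stack */
};
#pragma pack(pop)

/* Clear everything. Leaving ss:sp at zero asks for a system stack. */
static void callstruct_init(struct callstruct *c)
{
    memset(c, 0, sizeof *c);
}

/* Point the RMODE ES:DI at a buffer that lives below 1MB. */
static void callstruct_set_es_di(struct callstruct *c, uint32_t linear)
{
    assert(linear < 0x100000u);     /* RMODE can not reach any higher */
    c->es  = (uint16_t)(linear >> 4);
    c->edi = linear & 0xFu;
}
```

The buffer itself has to be in conventional memory, which you would normally get from the DPMI server or your DOS extender.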
Here are the DPMI functions to implement RMODE calling:

ax = 300h - Simulate RMODE INT
  bl = INT #
  bh = 0  (must always be zero)
  cx = ?  (number of dwords to pass on stack)
  es:edi = offset of 'callstruct'
ax = 301h - Call RMODE Procedure with FAR stack frame
  bh = 0  (must always be zero)
  cx = ?  (number of dwords to pass on stack)
  es:edi = offset of 'callstruct'
ax = 302h - Call RMODE Procedure with IRET stack frame
  bh = 0  (must always be zero)
  cx = ?  (number of dwords to pass on stack)
  es:edi = offset of 'callstruct'

The value of cx indicates how many parameters will be copied from your PMODE stack to the RMODE stack. The DPMI server will not pop these off of your stack. The size of each parameter will be dwords in a 32bit PMODE program or words in a 16bit PMODE program.
Simply set up the callstruct as needed, load the registers as shown above and then use INT 31h.
One note about using this from an ISR (interrupt service routine). If your ISR is re-entrant and you use these functions from within it, you must make sure that you do not use the same single callstruct each time. I suggest simply creating the callstruct on the stack each time, using the following code:

myISR proc
  pushad
  sub esp,sizeof callstruct     ;create a callstruct buffer
  mov edi,esp
  ...   ;place your code here
  add esp,sizeof callstruct     ;remove callstruct buffer
  popad
  iretd
myISR endp

Within the routine EDI will hold a temp buffer that you can use for a callstruct as needed in the ISR, and a new one will be created each time the ISR is called.

Transferring control from RMODE to PMODE

There is only one more thing you need to learn now to understand the basics, and that is transferring control from RMODE to PMODE. To do this you must allocate an RMODE callback from the DPMI host. When you alloc one of these callbacks you must provide a callstruct that will hold the RMODE registers while the transfer is made to PMODE. In return you get a CS:IP value from the DPMI host that, when called from RMODE, will jump to a PMODE address that you specify. During this PMODE routine the registers will be as follows:

ES:EDI = callstruct you provided
DS:ESI = top of RMODE stack (SS:SP)
interrupts disabled

When you return (using an IRETD) the registers must be as follows:

ES:EDI = callstruct to be used for returning to RMODE

Note that the callstruct you provide when returning need not be the same one provided to you. The reason you would do this is that if you have to enable interrupts during the call, you must first copy the callstruct to a buffer and then use that buffer (which can not be on the stack of course). If you do not enable interrupts then there is no need to use another buffer.
The contents of this callstruct are an exact copy of the RMODE registers at the exact moment the RMODE code stepped into the callback. If you return without modifying the values in the callstruct, the RMODE program will simply step into the callback again (causing the system to go into an infinite loop). So using the values in the callstruct you must simulate a RETF or IRET (whatever the case may be). DS:ESI will help you in popping values off the RMODE stack and placing them into the callstruct to simulate the RETF or IRET. Here are a couple of functions that I use to help me do this:

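simulate_retf proc
  ;Assume ES:EDI -> callstruct
  ;       DS:ESI -> RMODE stack
  ;This version rebuilds the address from _ss:_sp instead of using ESI,
  ;so it relies on DS being a selector with a base of zero.
  xor eax,eax
  mov ax,es:[edi].callstruct._ss
  shl eax,4                        ;EAX = RMODE SS * 10h
  xor ebx,ebx
  add bx,es:[edi].callstruct._sp
  add eax,ebx                      ;EAX = linear address of RMODE SS:SP
  mov bx,[eax]                     ;pop IP
  mov es:[edi].callstruct._ip,bx
  mov bx,[eax+2]                   ;pop CS
  mov es:[edi].callstruct._cs,bx
  add es:[edi].callstruct._sp,4    ;discard the 4 bytes just popped
  iretd
simulate_retf endp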

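simulate_iret proc
  ;Assume ES:EDI -> callstruct
  ;       DS:ESI -> RMODE stack
  ;The address is rebuilt from _ss:_sp, so DS must be a selector with a
  ;base of zero. An IRET frame also holds the flags (6 bytes in total).
  xor eax,eax
  mov ax,es:[edi].callstruct._ss
  shl eax,4                        ;EAX = RMODE SS * 10h
  xor ebx,ebx
  add bx,es:[edi].callstruct._sp
  add eax,ebx                      ;EAX = linear address of RMODE SS:SP
  mov bx,[eax]                     ;pop IP
  mov es:[edi].callstruct._ip,bx
  mov bx,[eax+2]                   ;pop CS
  mov es:[edi].callstruct._cs,bx
  mov bx,[eax+4]                   ;pop flags
  mov es:[edi].callstruct._flg,bx
  add es:[edi].callstruct._sp,6    ;discard the 6 bytes just popped
  iretd
simulate_iret endp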
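
To check the logic of these two routines without a DOS box, the same pops can be modelled in C against a plain byte array standing in for RMODE memory. This is only a model under that assumption. The struct holds just the fields the routines touch, and all names are made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Only the callstruct fields that RETF and IRET touch. */
struct rm_regs { uint16_t flg, ip, cs, sp, ss; };

/* Read a 16bit little-endian word from the memory image. */
static uint16_t peek16(const uint8_t *mem, uint32_t linear)
{
    return (uint16_t)(mem[linear] | ((unsigned)mem[linear + 1] << 8));
}

/* RETF: pop IP then CS from SS:SP, then move SP up by 4. */
static void sim_retf(struct rm_regs *r, const uint8_t *mem)
{
    uint32_t top;
    assert(r != NULL && mem != NULL);
    top = ((uint32_t)r->ss << 4) + r->sp;
    r->ip = peek16(mem, top);
    r->cs = peek16(mem, top + 2);
    r->sp = (uint16_t)(r->sp + 4);
}

/* IRET: pop IP, CS and flags from SS:SP, then move SP up by 6. */
static void sim_iret(struct rm_regs *r, const uint8_t *mem)
{
    uint32_t top;
    assert(r != NULL && mem != NULL);
    top = ((uint32_t)r->ss << 4) + r->sp;
    r->ip  = peek16(mem, top);
    r->cs  = peek16(mem, top + 2);
    r->flg = peek16(mem, top + 4);
    r->sp  = (uint16_t)(r->sp + 6);
}
```

After either call the struct describes the RMODE state just after the return, which is what you want to hand back to the DPMI host.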

And that's all it takes. The only time I ever use these callbacks is when I need to use the mouse callback service. INT 33h func 0ch requires an RMODE far routine that will be called whenever the mouse is moved, and using a callstruct in this case helps a lot.

Default Segment Size

There is a flag in each descriptor that defines whether a segment is 16bit or 32bit. This bit is very important to understand. When running in 16bit mode (as in RMODE or 16bit PMODE) the following instruction looks like this:

  0100:0000         89h d8h          mov ax,bx

And if you were running in a 32bit segment and ran this:

  0100:0000         89h d8h          mov eax,ebx

Note that each is assembled exactly the same! This bizarre behaviour was designed to create smaller code, since in 32bit segments you tend to use 32bit registers more often (I guess).
But if you want to use 32bit registers in a 16bit segment (or vice versa), you must insert the operand size override prefix. This prefix is simply 066h.
There is also a prefix (067h) to override the size of the offset used in indirect addressing (ie: mov ax,[bx]). When the address size override is used in a 16bit segment, both the registers and the offset are assumed to be 32bits, and vice versa.
So if you are in a 16bit segment and need to do a 32bit RETF you would assemble the following:

  db 66h
  retf

That's it!
Note that not all instruction mnemonics operate in this manner, for example PUSHFD, POPFD, PUSHAD and POPAD.
PUSHF always pushes a 16bit flags register regardless of default segment size.
The default segment size in the currently running code descriptor determines if IP or EIP is used in code pre-fetching, and the default operand and address size of data references made through DS, ES, FS, GS.
The default segment size in the stack descriptor determines if SP or ESP is used with the stack.
The default segment size in the DS, ES, FS, GS descriptors is ignored.
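
The prefix rule boils down to a toggle: the override flips whatever the code segment's default is. Here is a small C sketch of that rule, together with the number of bytes a far return pops for each operand size (IP and CS as two words, or EIP and CS as two dwords). The function names are invented for this example.

```c
#include <assert.h>

/* default_bits comes from the code descriptor (16 or 32).
   has_prefix is nonzero when the override prefix is present. */
static unsigned effective_bits(unsigned default_bits, int has_prefix)
{
    assert(default_bits == 16u || default_bits == 32u);
    if (!has_prefix)
        return default_bits;
    return default_bits == 16u ? 32u : 16u;   /* the prefix flips it */
}

/* Bytes removed from the stack by a RETF of the given operand size. */
static unsigned retf_pop_bytes(unsigned operand_bits)
{
    assert(operand_bits == 16u || operand_bits == 32u);
    return 2u * (operand_bits / 8u);
}
```

So "db 66h / retf" in a 16bit segment pops retf_pop_bytes(effective_bits(16, 1)) bytes, which is 8, while a plain retf there pops 4. The same toggle applies to the address size override.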

16bit PMODE vs. 32bit PMODE

When programming with PMODE/W and Watcom's DOS/4GW you can create 16bit and 32bit PMODE applications. I suggest you simply use 32bit apps to avoid confusion. You can also mix 16bit, 32bit, and even RMODE code. I've done so, but I don't believe there is ever a need for it. Doing so does give a slight speed increase for IRQ handlers, but I've never tried it. If you do try, remember this: the DPMI host works either in 32bit mode or in 16bit mode (not both). So if your application is considered a 32bit application it expects all pointers to be 32bit (ie: ES:EDI), but if your app is a 16bit one then all pointers must be 16bit (ie: ES:DI).
Another thing to consider is this: if you create a 32bit routine that needs to call a 16bit routine, you can't! Only the IP would be popped off during the RETF (and not the EIP as you would expect). Now if you wrote the 16bit routine yourself, you could create a 32bit RETF with the following code:

  db 66h
  retf

But if you are chaining down an interrupt handler you could not control how it returns.
So I suggest you simply stay in 32bit mode and avoid all headaches.

Last note and "QLIB"

Well, this concludes this basic tutorial. If you need more help I suggest you download QLIB from this site. It contains many examples and a nearly complete set of 32bit ASM coded C library functions.
Please read the other Tutorials on this site including "Mixing ASM and C" and "DOS extender descriptions".
Then try creating your own little projects with QLIB.
Thank you for reading this tutorial ;)