Category: LINUX
2008-06-10 14:52:03
An outline of the boot sequence
Things start rolling when you press the power button on the computer (no!
do tell!). Once the motherboard is powered up it initializes its own firmware -
the chipset and other tidbits - and tries to get the CPU running. If things
fail at this point (e.g., the CPU is busted or missing) then you will likely
have a system that looks completely dead except for rotating fans. A few
motherboards manage to emit beeps for an absent or faulty CPU, but the
zombie-with-fans state is the most common scenario based on my experience.
Sometimes USB or other devices can cause this to happen: unplugging all
non-essential devices is a possible cure for a system that was working and
suddenly appears dead like this. You can then single out the culprit device by
elimination.
If all is well the CPU starts running. In a multi-processor or multi-core
system one CPU is dynamically chosen to be the bootstrap processor (BSP) that
runs all of the BIOS and kernel initialization code. The remaining processors,
called application processors (AP) at this point, remain halted until later on
when they are explicitly activated by the kernel. Intel CPUs have been evolving
over the years but they’re fully backwards compatible, so modern CPUs can
behave like the original 1978 Intel 8086, which is exactly what they do after
power up. In this primitive power-up state the processor is in real mode with
memory paging
disabled. This is like ancient MS-DOS where only 1 MB of memory can be
addressed and any code can write to any place in memory - there’s no notion of
protection or privilege.
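To make the 1 MB limit concrete, here is a minimal sketch of how real mode turns a 16-bit segment and a 16-bit offset into a physical address. The function name and the sample values are mine, picked for illustration; the shift-by-four rule is the standard 8086 scheme.

```python
def real_mode_address(segment: int, offset: int) -> int:
    """Translate a 16-bit segment:offset pair into a physical address."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    # The segment is multiplied by 16 (shifted left 4 bits) and added to the offset.
    return (segment << 4) + offset


ONE_MB = 1 << 20

# Two different segment:offset pairs can name the same physical byte.
print(hex(real_mode_address(0x1000, 0x0010)))
print(hex(real_mode_address(0x1001, 0x0000)))

# Segment 0xF000 with the largest offset lands on the last byte below 1 MB.
print(real_mode_address(0xF000, 0xFFFF) == ONE_MB - 1)
```

Nothing in this calculation checks who is doing the access, which is the "no protection or privilege" part: any code can compute and touch any address.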
Most registers
in the CPU have well-defined values after power up, including the instruction
pointer (EIP) which holds the memory address for the instruction being executed
by the CPU. Intel CPUs use a hack whereby even though only 1MB of memory can be
addressed at power up, a hidden base address (an offset, essentially) is
applied to EIP so that the first instruction executed is at address 0xFFFFFFF0
(16 bytes short of the end of 4 gigs of memory and well above one megabyte).
This magical address is called the reset vector and is
standard for modern Intel CPUs.
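The arithmetic behind that address is short enough to check by hand. The sketch below assumes the reset values documented by Intel for modern x86 CPUs (a hidden code-segment base of 0xFFFF0000 and an instruction pointer of 0xFFF0); those two numbers are not from this post, and the constant and function names are made up for the example.

```python
RESET_CS_BASE = 0xFFFF0000  # assumed hidden base applied at power up
RESET_IP = 0xFFF0           # assumed instruction pointer value at power up
FOUR_GB = 1 << 32


def first_fetch_address(cs_base: int = RESET_CS_BASE, ip: int = RESET_IP) -> int:
    """Physical address of the first instruction fetched after reset."""
    # At reset the hidden base is used directly instead of segment * 16.
    return cs_base + ip


address = first_fetch_address()
print(hex(address))
print(FOUR_GB - address, "bytes short of 4 GB")
```

With the base cleared to the usual segment-times-16 value (0xF0000 for segment 0xF000), the same offset lands at 0xFFFF0, just below 1 MB.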
The motherboard ensures that the instruction at the reset vector is a jump
to the memory location mapped to the BIOS entry point. This jump implicitly
clears the hidden base address present at power up. All of these memory
locations have the right contents needed by the CPU thanks to the memory
map kept by the chipset. They are all mapped to flash memory containing the
BIOS since at this point the RAM modules have random crap in them. An example
of the relevant memory regions is shown below:
Important memory regions during boot
The CPU then starts executing BIOS code, which initializes some of the
hardware in the machine. Afterwards the BIOS kicks off the Power-on Self Test
(POST), which tests various components in the computer. Lack of a working video
card fails the POST and causes the BIOS to halt and emit beeps to let you know
what’s wrong, since messages on the screen aren’t an option. A working video
card takes us to a stage where the computer looks alive: manufacturer logos are
printed, memory starts to be tested, angels blare their horns. Other POST
failures, like a missing keyboard, lead to halts with an error message on the
screen. The POST involves a mixture of testing and initialization, including
sorting out all the resources - interrupts, memory ranges, I/O ports - for PCI
devices. Modern BIOSes that follow the Advanced Configuration and Power
Interface (ACPI) build a number of data tables that describe the devices in the
computer; these tables are later used by the kernel.
After the POST the BIOS wants to boot up an operating system, which must
be found somewhere: hard drives, CD-ROM drives, floppy disks, etc. The actual
order in which the BIOS seeks a boot device is user configurable. If there is
no suitable boot device the BIOS halts with a complaint like “Non-System Disk
or Disk Error.” A dead hard drive might present with this symptom. Hopefully
this doesn’t happen and the BIOS finds a working disk allowing the boot to
proceed.
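The search for a boot device is essentially a loop over the configured order. Here is a rough sketch of that idea, not how any particular BIOS implements it: the device names, the `fake_disks` table, and the `find_boot_device` helper are all hypothetical, and real firmware typically applies extra checks that are left out here.

```python
from typing import Callable, Optional, Sequence, Tuple

SECTOR_SIZE = 512


def find_boot_device(
    boot_order: Sequence[str],
    read_sector_zero: Callable[[str], Optional[bytes]],
) -> Optional[Tuple[str, bytes]]:
    """Return the first device whose sector zero can be read, with its contents."""
    for device in boot_order:
        sector = read_sector_zero(device)
        # A missing or dead device yields nothing; move on to the next one.
        if sector is not None and len(sector) == SECTOR_SIZE:
            return device, sector
    return None


# Hypothetical machine: empty floppy and CD-ROM drives, one working hard disk.
fake_disks = {"floppy": None, "cdrom": None, "hard disk": bytes(SECTOR_SIZE)}
boot_order = ["floppy", "cdrom", "hard disk"]

result = find_boot_device(boot_order, fake_disks.get)
if result is None:
    print("Non-System Disk or Disk Error")
else:
    print("Booting from", result[0])
```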
The BIOS now reads the first 512-byte sector (sector zero) of the
hard disk. This is called the Master Boot Record (MBR)
and it normally contains two vital components: a tiny OS-specific bootstrapping
program at the start of the MBR followed by a partition table for the disk. The
BIOS however does not care about any of this: it simply loads the contents of
the MBR into memory location 0×
Master Boot Record
The specific code in the MBR could be a Windows MBR loader, code from
Linux loaders such as LILO or GRUB, or even a virus. In contrast the partition
table is standardized: it is a 64-byte area with four 16-byte entries
describing how the disk has been divided up (so you can run multiple operating
systems or have separate volumes in the same disk). Traditionally Microsoft MBR
code takes a look at the partition table, finds the (only) partition marked as
active, loads the boot sector for that partition, and runs that code.
The boot sector is the first sector of a partition, as opposed to the
first sector for the whole disk. If something is wrong with the partition table
you would get messages like “Invalid Partition Table” or “Missing Operating
System.” These messages do not come from the BIOS but rather from the
MBR code loaded from disk. Thus the specific message depends on the MBR flavor.
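Since the partition table layout is standardized, it is easy to pick apart. The sketch below assumes the classic layout: the 64-byte table starts at byte offset 446 of the sector, and each 16-byte entry holds a status byte (0x80 means active), a type byte, and 32-bit little-endian start and length fields. The function names and the sample disk are invented for the example, and the "exactly one active partition" rule is a simplification of what real MBR code does.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446        # the 64-byte table is followed by a 2-byte signature
ENTRY_SIZE = 16
ENTRY_COUNT = 4
ACTIVE_FLAG = 0x80
# status, 3 CHS bytes (skipped), type, 3 CHS bytes (skipped), first LBA, sector count
ENTRY_FORMAT = "<B3xB3xII"


def parse_partition_table(mbr: bytes) -> list:
    """Decode the four partition entries of a 512-byte MBR."""
    if len(mbr) != SECTOR_SIZE:
        raise ValueError("an MBR is exactly one 512-byte sector")
    entries = []
    for index in range(ENTRY_COUNT):
        start = TABLE_OFFSET + index * ENTRY_SIZE
        status, ptype, first_lba, sector_count = struct.unpack_from(
            ENTRY_FORMAT, mbr, start
        )
        entries.append({
            "index": index,
            "active": status == ACTIVE_FLAG,
            "type": ptype,
            "first_lba": first_lba,
            "sector_count": sector_count,
        })
    return entries


def find_active_partition(entries: list):
    """Return the single active entry, or None if there is not exactly one."""
    active = [entry for entry in entries if entry["active"]]
    return active[0] if len(active) == 1 else None


# Hypothetical disk: only the second entry is used, and it is marked active.
sample = bytearray(SECTOR_SIZE)
struct.pack_into(ENTRY_FORMAT, sample, TABLE_OFFSET + ENTRY_SIZE,
                 ACTIVE_FLAG, 0x83, 2048, 204800)

active = find_active_partition(parse_partition_table(bytes(sample)))
print(active)
```

The `first_lba` field is where the boot sector of that partition lives, which is the sector the traditional MBR code would load next.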
Boot loading has gotten more sophisticated and flexible over time. The
Linux boot loaders LILO and GRUB can handle a wide variety of operating
systems, file systems, and boot configurations. Their MBR code does not
necessarily follow the “boot the active partition” approach described above.
But functionally the process goes like this:
There’s a complication worth mentioning (aka, I told you this thing is
hacky). The image for a current Linux kernel, even compressed, does not fit
into the 640K of RAM available in real mode. My vanilla Ubuntu kernel is 1.7 MB
compressed. Yet the boot loader must run in real mode in order to call the BIOS
routines for reading from the disk, since the kernel is clearly not available
at that point. The solution is the venerable unreal mode. This is not a
true processor mode (I wish the engineers at Intel were allowed to have fun
like that), but rather a technique where a program switches back and forth between
real mode and protected mode in order to access memory above 1MB while still
using the BIOS. If you read GRUB source code, you’ll see these transitions all
over the place (look under stage2/ for calls to real_to_prot and prot_to_real).
At the end of this sticky process the loader has stuffed the kernel in memory,
by hook or by crook, but it leaves the processor in real mode when it’s done.
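The back-and-forth can be pictured as a copy loop. What follows is a toy simulation, not GRUB's code: it assumes the loader reads each chunk through the BIOS into a small buffer in low memory, hops to protected mode to copy the chunk above 1 MB, then hops back. The buffer size, the function name, and the fake kernel image are all made up for illustration.

```python
LOW_MEMORY_LIMIT = 640 * 1024     # RAM reachable in real mode, per the text
BOUNCE_BUFFER_SIZE = 32 * 1024    # hypothetical low-memory buffer size


def load_kernel(image: bytes, buffer_size: int = BOUNCE_BUFFER_SIZE):
    """Simulate loading an image above 1 MB through a low-memory buffer."""
    mode = "real"
    high_memory = bytearray()     # stands in for RAM above 1 MB
    switches = 0
    for start in range(0, len(image), buffer_size):
        if mode != "real":
            raise RuntimeError("BIOS disk reads need real mode")
        # Real mode: the BIOS reads the next chunk into the low buffer.
        bounce = image[start:start + buffer_size]
        # Protected mode: copy the chunk to memory above 1 MB.
        mode = "protected"
        switches += 1
        high_memory += bounce
        # Back to real mode for the next BIOS call.
        mode = "real"
        switches += 1
    return bytes(high_memory), mode, switches


# Fake 1.7 MB kernel image, too big for the 640K available in real mode.
kernel_image = bytes(int(1.7 * 1024 * 1024))
print(len(kernel_image) > LOW_MEMORY_LIMIT)

loaded, final_mode, switch_count = load_kernel(kernel_image)
print(loaded == kernel_image, final_mode, switch_count)
```

The simulated loader ends in real mode with the whole image sitting in "high memory", which mirrors the state described above.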
We’re now at the jump from “Boot Loader” to “Early Kernel Initialization”
as shown in the first diagram. That’s when things heat up as the kernel starts
to unfold and set things in motion. The next post will be a guided tour through
the Linux kernel initialization with links to the sources. I can’t do the same for
Windows but I’ll point out
the highlights.
[Update: cleared up discussion of NTLDR.]
June 5, 2008 at 11:40 am | Filed Under Internals, Software Illustrated
Comments
4 Responses to “How Computers Boot Up”
How computers boot up
[Eng]…
Excellent article for learning
the boot sequence of a computer. And yes, the first step is turning it on. Seen at caballe.cat/wp/
Nice article thanks. For a grub centric bootup
description see:
I’m a programmer in high school, have done work with
successively lower-level programming (mostly Java, C, and then assembly) but
this is a great article to give a really good sense of the technical specifics
that happen on startup. Thank you.
@Pádraig: Thanks for the link, that’s a great reference
for GRUB.
@Brian: Thank you for reading. That’s cool that you’re
getting into the lower level stuff. Experience with lower-level programming
makes a big difference in how effectively you design, troubleshoot, and think
about software. Regardless of what you end up doing, I think it’s pretty useful
stuff, not to mention it’s a lot of fun.
Let me know if I can help out in any way with regards to
questions or book recommendations or whatever. I’ve been working 60+ hours
these past few weeks, but I try to reply to the comments as often as possible.