Windows Operating System Internals - by David A. Solomon and Mark E. Russinovich with Andreas Polze
Unit OS11: Performance Evaluation
11.2. Boot/Startup Troubleshooting
Roadmap for Section 11.2
Windows Boot Process
Shutdown
Causes for Crashes
Master Boot Record (MBR)
Boot sector
NTLDR – NT Boot Loader
NTDETECT.COM
BOOT.INI
SCSI driver – Ntbootdd.sys (not present on all systems)
System files – %SystemRoot%: Ntoskrnl.exe, Hal.dll, etc.
1. BIOS
Reads MBR from boot device
2. MBR
Contains a small amount of code that scans the partition table (4 entries)
First partition marked active is selected as the system volume
Loads boot sector of system volume
3. Boot sector (NT-specific code)
Reads root directory of volume and loads NTLDR
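The partition-table scan described above can be sketched in a few lines. This is an illustration only, not the real MBR boot code (which is 16-bit assembly). It assumes the classic MBR layout: a 512-byte sector, four 16-byte entries starting at offset 446, a boot indicator of 0x80 for the active entry, and the 0x55 0xAA signature at the end. The helper names and the sample sector are hypothetical.

```python
import struct

SECTOR_SIZE = 512
PARTITION_TABLE_OFFSET = 446   # the table follows the boot code
ENTRY_SIZE = 16
ACTIVE_FLAG = 0x80             # boot indicator of an active partition


def find_active_partition(mbr: bytes):
    """Return (index, type, start_lba, sector_count) of the first active entry, or None."""
    if len(mbr) != SECTOR_SIZE or mbr[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    for index in range(4):
        offset = PARTITION_TABLE_OFFSET + index * ENTRY_SIZE
        entry = mbr[offset:offset + ENTRY_SIZE]
        # Fields: boot flag, CHS start (skipped), type, CHS end (skipped),
        # starting LBA, sector count
        flag, ptype, start_lba, count = struct.unpack("<B3xB3xII", entry)
        if flag == ACTIVE_FLAG:
            return index, ptype, start_lba, count
    return None


def make_sample_mbr() -> bytes:
    """Build a synthetic sector whose second entry is active (hypothetical values)."""
    sector = bytearray(SECTOR_SIZE)
    entries = [
        struct.pack("<B3xB3xII", 0x00, 0x0B, 63, 1000),
        struct.pack("<B3xB3xII", ACTIVE_FLAG, 0x07, 2048, 409600),
    ]
    for i, raw in enumerate(entries):
        start = PARTITION_TABLE_OFFSET + i * ENTRY_SIZE
        sector[start:start + ENTRY_SIZE] = raw
    sector[510:512] = b"\x55\xaa"
    return bytes(sector)
```

Calling `find_active_partition(make_sample_mbr())` selects the second entry, mirroring the rule that the first partition marked active becomes the system volume.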
4. NTLDR
Moves system from 16-bit to 32-bit mode and enables paging
Reads and uses Ntbootdd.sys to perform disk I/O if the boot volume is on a SCSI disk different from the system volume
This is a copy of the SCSI miniport driver used when the OS is booted
Reads Boot.ini
Boot.ini selections point to the boot drive
Specifies OS boot selections and optional switches (most for debugging/troubleshooting) that are passed to the kernel during boot
If there is more than one selection, NTLDR displays a boot menu (with timeout)
If you select a 64-bit installation, NTLDR moves the CPU into 64-bit mode
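To make the Boot.ini handling concrete, here is a minimal sketch of reading the boot selections and their switches. It is not NTLDR's actual parser. The sample file contents, the function names, and the assumption that each entry is a quoted menu title followed by switches are illustrative.

```python
# Hypothetical Boot.ini contents, used only for illustration
SAMPLE_BOOT_INI = r'''[boot loader]
timeout=30
default=multi(0)disk(0)rdisk(0)partition(1)\WINDOWS
[operating systems]
multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Microsoft Windows XP Professional" /fastdetect
multi(0)disk(0)rdisk(0)partition(2)\WINNT="Windows 2000 Server" /fastdetect /sos
'''


def parse_boot_ini(text):
    """Split the file into sections, each a list of (key, value) pairs."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(";"):
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1].lower()
            sections[current] = []
            continue
        if current is None or "=" not in line:
            continue
        key, value = line.split("=", 1)
        sections[current].append((key.strip(), value.strip()))
    return sections


def boot_entries(sections):
    """Return one dict per boot selection: path, menu title, and switches."""
    entries = []
    for path, value in sections.get("operating systems", []):
        if value.startswith('"') and '"' in value[1:]:
            title, _, rest = value[1:].partition('"')
        else:
            title, rest = value, ""
        entries.append({"path": path, "title": title, "switches": rest.split()})
    return entries


def needs_menu(entries):
    """A boot menu is only needed when there is more than one selection."""
    return len(entries) > 1
```

With the sample above, `boot_entries(parse_boot_ini(SAMPLE_BOOT_INI))` yields two selections, so `needs_menu` is true and a menu with the configured timeout would be shown.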
4. NTLDR (continued)
Once a boot selection is made, the user can press F8 to get to the special boot menu
Last Known Good, Safe modes, hardware profile, Debugging mode
NTLDR loads and executes Ntdetect.com to perform BIOS hardware detection (x86 and x64 only)
Later saved into HKLM\Hardware\Description
NTLDR loads:
Ntoskrnl.exe, Hal.dll, and Bootvid.dll (and Kdcom.dll for XP and later)
The registry SYSTEM hive (\Windows\System32\Config\System)
Later this becomes HKLM\System
Based on the SYSTEM hive, the boot drivers are loaded
Boot driver: critical to boot process (e.g. boot file system driver)
Transfers control to main entry point of Ntoskrnl.exe
Every driver has a key in HKLM\System\CurrentControlSet\Services
Type: 1 for driver, 2 for file system driver, others are Win32 services
Start: 0 = boot, 1 = system, 2 = auto, 3 = manual, 4 = disabled
Some drivers need fine-grained control over load order to satisfy dependencies
A driver’s optional Group value controls load order within a start phase (boot, system, auto)
HKLM\System\CurrentControlSet\Control\ServiceGroupOrder
A driver’s optional Tag value controls startup order within its group
Note: Plug-and-play (discussed in the I/O section) controls load order of PnP drivers
Special case: the file system driver for the boot volume is always loaded and started, regardless of what its start type is
Lab: run LoadOrd from Sysinternals to see driver ordering
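The ordering rules above (Start phase, then Group, then Tag) can be sketched as a sort key. This is a simplified model. The driver names and group list are hypothetical stand-ins for registry data. It assumes that lower Tag values start first and that ungrouped or untagged drivers go last, which glosses over how Windows actually resolves tags and PnP ordering.

```python
START_NAMES = {0: "boot", 1: "system", 2: "auto", 3: "manual", 4: "disabled"}

# Hypothetical stand-in for the list stored under ServiceGroupOrder
GROUP_ORDER = ["Boot Bus Extender", "SCSI miniport", "File system"]

# Hypothetical driver entries, shaped like values under ...\Services\<name>
DRIVERS = [
    {"name": "fsdrv", "Type": 2, "Start": 0, "Group": "File system"},
    {"name": "scsi_b", "Type": 1, "Start": 0, "Group": "SCSI miniport", "Tag": 2},
    {"name": "scsi_a", "Type": 1, "Start": 0, "Group": "SCSI miniport", "Tag": 1},
    {"name": "busext", "Type": 1, "Start": 0, "Group": "Boot Bus Extender"},
    {"name": "sysdrv", "Type": 1, "Start": 1},
    {"name": "olddrv", "Type": 1, "Start": 4},
]


def load_order(drivers, group_order):
    """Order boot/system/auto drivers by Start phase, then Group, then Tag."""
    def sort_key(driver):
        group = driver.get("Group")
        if group in group_order:
            group_rank = group_order.index(group)
        else:
            group_rank = len(group_order)      # ungrouped drivers go last
        tag = driver.get("Tag")
        # Untagged drivers sort after tagged ones within the same group
        return (driver["Start"], group_rank, tag is None, tag or 0)

    loadable = [d for d in drivers if d["Start"] in (0, 1, 2)]
    return [d["name"] for d in sorted(loadable, key=sort_key)]
```

For the sample data, `load_order(DRIVERS, GROUP_ORDER)` places the boot-start drivers first in group order, with the two SCSI miniport entries ordered by Tag, and leaves out the disabled driver.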
Smss.exe (Session Manager): the first user-mode process
Runs programs specified in BootExecute
e.g. autochk, the native API version of chkdsk
Processes “Delayed move/rename” commands
Used to replace in-use system files by hotfixes, service packs, etc.
Initializes the paging files and rest of Registry (hives or files)
Loads and initializes kernel-mode part of Win32 subsystem (Win32k.sys)
Starts Csrss.exe (user-mode part of Win32 subsystem)
Starts Winlogon.exe
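The “Delayed move/rename” step can be pictured as replaying a queued list of source/destination pairs before anything has the files open. The sketch below is an illustration, not the Session Manager's implementation. It assumes a flat list of pairs in which an empty destination means "delete the source", uses a dictionary in place of the file system, and all paths are hypothetical.

```python
def apply_delayed_operations(files, operations):
    """Replay queued move/rename operations against a dict acting as a file system.

    files:      mapping of path -> content
    operations: flat list [source, destination, source, destination, ...];
                an empty destination means the source is deleted (assumption)
    """
    if len(operations) % 2:
        raise ValueError("operations must be source/destination pairs")
    result = dict(files)
    for source, destination in zip(operations[0::2], operations[1::2]):
        if source not in result:
            continue                       # nothing queued can be applied
        content = result.pop(source)
        if destination:
            # Move into place, replacing the in-use file that could not be
            # overwritten while the system was running
            result[destination] = content
    return result


# Hypothetical example: a hotfix staged a new DLL and queued a temp-file delete
FILES = {
    "C:\\Windows\\System32\\foo.dll": "old version",
    "C:\\Temp\\foo.new": "new version",
    "C:\\Temp\\junk.tmp": "scratch",
}
QUEUE = [
    "C:\\Temp\\foo.new", "C:\\Windows\\System32\\foo.dll",
    "C:\\Temp\\junk.tmp", "",
]
```

Running `apply_delayed_operations(FILES, QUEUE)` leaves only the system DLL, now holding the staged contents, which is the effect a hotfix relies on when it cannot replace an in-use file directly.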
Starts Lsass.exe (Local Security Authority)
Loads GINA DLL (Graphical Identification and Authentication)
Default is Msgina.dll
Displays logon dialog
Starts Services.exe (the service controller)
Also includes any drivers marked Automatic start
Service startup continues asynchronously to logons
Autoruns (Sysinternals)
Msconfig (in \Windows\PCHEALTH\HELPCTR\Binaries)
ExitWindowsEx function sent to Csrss
Start menu->shutdown: Explorer calls it
CTRL+ALT+DEL->shutdown: Winlogon calls it
If not a forced shutdown, Csrss sends query message to all threads owning top-level windows
Processes can cancel shutdown if not a “forced” shutdown
Interactive shutdowns are not forced
If all answer ok, Csrss sends shutdown message
Csrss waits for time defined by HKCU\Control Panel\Desktop\HungAppTimeout
If the timeout expires, shows a popup
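The query-then-shutdown handshake above can be simulated in a few lines. This is a toy model rather than Csrss code. The application records, the event names, and the 5000 ms default for the hung-application timeout are assumptions made for the example.

```python
def run_shutdown(apps, forced=False, hung_app_timeout_ms=5000):
    """Simulate the shutdown handshake and return the sequence of events.

    apps: list of dicts with 'name', 'agrees' (bool) and 'response_ms' (int),
          standing in for threads that own top-level windows (hypothetical).
    """
    events = []
    if not forced:
        # Query phase: any application may veto a non-forced shutdown
        for app in apps:
            events.append(("query", app["name"]))
            if not app["agrees"]:
                events.append(("cancelled_by", app["name"]))
                return events
    # Shutdown phase: everyone agreed, or the shutdown was forced
    for app in apps:
        events.append(("shutdown", app["name"]))
        if app["response_ms"] > hung_app_timeout_ms:
            # The application did not respond within the timeout
            events.append(("popup", app["name"]))
    return events


# Hypothetical applications
APPS = [
    {"name": "notepad", "agrees": True, "response_ms": 100},
    {"name": "editor", "agrees": False, "response_ms": 100},
]
```

With `APPS`, a normal shutdown stops at the second application's veto, while `run_shutdown(APPS, forced=True)` skips the query phase and delivers the shutdown message to both.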
Csrss tells Service Control Manager (Services.exe) to exit, which tells all Win32 services to exit
Csrss.exe waits for HKLM\System\CurrentControlSet\Control\WaitToKillServiceTimeout
After the timeout, Services.exe is terminated (even though service processes may still be shutting down)
Example: IIS, Exchange
Some sites lengthen the value to accommodate long shutdowns
Finally, calls NtShutdownSystem, which calls the Plug and Play manager’s NtSetSystemPowerState to orchestrate final system shutdown
Drivers are called to shut down (e.g. flush data to disk)
Finally, the HAL is called, which then tells the hardware either to reboot or power off
Systems without power management end with the dialog “it is safe to power off your system now”
System memory saved to hiberfil.sys on system volume
On power-on, NTLDR reads hiberfil.sys and continues where the system left off
No boot.ini or boot option menu if hiberfil.sys has valid data
Not supported on x86 Server systems (works on x64 Server 2003 systems)
Hibernation file is better compressed
I/O overlapped on IDE drives
Resume is faster because reads are larger
Device parallelization during power up improved
Power up done asynchronously in the background by drivers (specifically power-pageable devices without children)
Unhandled exception (e.g. executing invalid instruction)
OS or driver detects severe inconsistency
Referencing paged out memory at interrupt level (famous “IRQL_NOT_LESS_OR_EQUAL” crash)
A reschedule is attempted at dispatch level IRQL or higher
Hardware error
~70% caused by 3rd party driver code
~15% caused by unknown (memory is too corrupted to tell)
~10% caused by hardware issues
~5% caused by Microsoft code
From online crash analysis database:
55,000 unique drivers - 24 new / day (28,000 in 2004)
220,000 total drivers - 98 revised / day (130,000 in 2004)
Many Devices
Over 1,263,300 distinct Plug and Play (PnP) IDs (680,000 in 2004)
1,600 PnP IDs added every day
Stop code (also called bugcheck code)
4 stop-code defined parameters
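A stop code and its four parameters are usually read off a line of the blue screen. The sketch below shows one way to pull them apart. The sample line and its parameter values are hypothetical, the lookup table is deliberately tiny, and the regular expression assumes the common `STOP: 0x... (p1,p2,p3,p4)` layout.

```python
import re

# Deliberately small lookup table; real systems define many more codes
KNOWN_STOP_CODES = {0x0000000A: "IRQL_NOT_LESS_OR_EQUAL"}

STOP_LINE = re.compile(
    r"STOP:\s*(0x[0-9A-Fa-f]{8})\s*\(\s*"
    r"(0x[0-9A-Fa-f]+)\s*,\s*(0x[0-9A-Fa-f]+)\s*,\s*"
    r"(0x[0-9A-Fa-f]+)\s*,\s*(0x[0-9A-Fa-f]+)\s*\)"
)


def parse_stop_line(line):
    """Return the stop code, its symbolic name, and the four parameters."""
    match = STOP_LINE.search(line)
    if match is None:
        return None
    code, *params = (int(group, 16) for group in match.groups())
    return {
        "code": code,
        "name": KNOWN_STOP_CODES.get(code, "UNKNOWN"),
        "params": params,
    }


# Hypothetical blue-screen line
SAMPLE = "*** STOP: 0x0000000A (0x00000004,0x00000002,0x00000000,0x804E3B92)"
```

The meaning of the four parameters depends on the stop code, so a parser like this only separates the fields. Interpreting them still requires the documentation for that particular code.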
Simple repair-oriented command-line environment
Built on a minimal NT kernel
Bootable from Win2000/XP/Server 2003 Setup CD
Type “r” to repair and then select the installation
Installable onto hard disk (winnt32.exe /cmdcons)
Winnt32.exe must match service pack you are running
Can also network boot using PXE boot from a RIS server
Only access \Windows, \System Volume Information, and root of non-removable media
Can only copy files onto system, not off
You can override these in the Local Security Policy editor (secpol.msc) on the installation when it’s running
Backup of all system state and user data on system volume
Includes registry, system files, boot sector, MBR
Made by Windows Backup (Ntbackup.exe)
Windows XP Professional and higher
Boot into ASR from Windows setup (press F2 when prompted) and insert the ASR floppy
Will restore entire system state, including boot sector, MBR, system files, and registry
You have to keep the backup up-to-date
No control over granularity of restore (all-or-nothing)
Not included with Windows XP Home Edition
Disk is corrupt
File is missing or corrupt
Boot into RC
Run Chkdsk
If no Chkdsk errors, obtain clean copy of file and replace file
Check in \Windows\System32\DLLCache for backup
Replacement must be an identical match, i.e. from the same hotfix
If there’s more than one corrupt file, use Setup Repair Install
If you can’t find a replacement, use Automated System Recovery (ASR)
System blue screens on boot
Hang before logon prompt appears
NOTE: If system auto-reboots on crash you won’t see the blue screen!
Buggy driver
Registry corruption of non-System hive
Last Known Good
Safe Mode
RC
An existing driver was updated
A latent driver bug for some reason becomes active
Files or registry hives are missing or corrupt
HKLM\System\CurrentControlSet\Control\SafeBoot\AlternateShell specifies shell for Command-Prompt boot
But might be needed to boot the system
Check all settings except “low resource simulation”
[Figure: System Restore architecture. User mode: Applications issue a file system request. Kernel mode: System Restore Filter and File System Driver (NTFS/FAT). Files shown under \System Volume Information\_restore{XX-XXX-XXX}\RP5: Change.log1, A0009653.exe, A0009654.ini]
NTFSDOS Professional (Winternals)
Access NTFS from DOS
Can run DOS virus scanners and other DOS applications
ERD Commander 2003 (Winternals)
Windows-like recovery environment booted from CD
Full GUI interface (previous version was command line)
Based on WinPE
Special subset of XP that replaces having to use DOS boot disks
Only available to hardware & software vendors
Since it’s XP, plug and play configures the system
Offers more functionality than Recovery Console:
Reset any password
Full registry editor
Text editor
System compare wizard
System Restore
No security restrictions