Computer Science from the Bottom Up

Ian Wienand

Computer Science from the Bottom Up is a short course teaching students about the computer from the bottom end up. Topics covered include binary and binary logic, operating systems fundamentals, compiler fundamentals, system library fundamentals and more.

This work is licensed under the Creative Commons Attribution-ShareAlike License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/2.0/ or send a letter to Creative Commons, 559 Nathan Abbott Way, Stanford, California 94305, USA.


Table of Contents
Introduction
Welcome
Philosophy
Why from the bottom up?
Enabling technologies
Who is this aimed at?
Who should be teaching this course?
The Open Source Philosophy
About the course
1. General Unix and Advanced C
Everything is a file!
Abstraction and function pointers
Application Programming Interfaces
Libraries
Summary
Exercises
Standard File Descriptors
The Shell
Redirection
2. Binary and Number Representation
Binary -- the basis of computing
Binary Theory
Hexadecimal
Practical Implications
Types and Number Representation
C Standards
Types
Number Representation
Exercises
3. Computer Architecture for Beginners
The CPU
Branching
Cycles
Fetch, Decode, Execute, Store
CISC v RISC
Memory
Memory Hierarchy
Peripherals and busses
PCI Bus
DMA
Other Busses
Small to big systems
Symmetric Multi-Processing
Clusters
Non-Uniform Memory Access
Memory ordering, locking and atomic operations
Exercises
4. The Operating System
The role of the operating system
Abstraction of hardware
Multitasking
Standardised Interfaces
Security
Performance
Operating System Organisation
The Kernel
Userspace
System Calls
Overview
Analysing a system call
Privileges
Hardware
Other ways of communicating with the kernel
File Systems
5. The Process
What is a process?
Elements of a process
Process ID
Memory
File Descriptors
Registers
Kernel State
Process Hierarchy
Fork and Exec
Fork
Exec
How Linux actually handles fork and exec
The init process
Context Switching
Scheduling
Preemptive v co-operative scheduling
Realtime
Nice value
A brief look at the Linux Scheduler
The Shell
Signals
Example
6. Virtual Memory
What Virtual Memory isn't
What virtual memory is
64 bit computing
Using the address space
Pages
Physical Memory
Pages + Frames = Page Tables
Virtual Addresses
Page
Offset
Virtual Address Translation
Consequences of virtual addresses, pages and page tables
Individual address spaces
Protection
Swap
Sharing memory
Disk Cache
Hardware Support
Physical v Virtual Mode
The TLB
TLB Management
Linux Specifics
Address Space Layout
Three Level Page Table
7. The Toolchain
Compiled v Interpreted Programs
Compiled Programs
Interpreted programs
Building an executable
Compiling
The process of compiling
Syntax
Assembly Generation
Optimisation
Assembler
Linker
Symbols
The linking process
A practical example
Compiling
Assembly
Linking
The Executable
8. Behind the process
Review of executable files
Representing executable files
Three Standard Sections
Binary Format
Binary Format History
ELF
ELF in depth
Libraries
Static Libraries
Executable Files
Shared Libraries
ABI's
Byte Order
Calling Conventions
Starting a process
Kernel communication to programs
Starting the program
9. Dynamic Linking
Code Sharing
Dynamic Library Details
Including libraries in an executable
The Dynamic Linker
Relocations
Position Independence
Global Offset Tables
The Global Offset Table
Libraries
The Procedure Lookup Table
Working with libraries and the linker
Library versions
Finding symbols
10. I/O Fundamentals
File System Fundamentals
Networking Fundamentals
Computer Science from the Bottom Up Glossary
List of Tables
1-1. Standard Files Provided by Unix
1-2. Standard Shell Redirection Facilities
2-1. Binary
2-2. 203 in base 10
2-3. 203 in base 2
2-4. Convert 203 to binary
2-5. Bytes
2-6. Truth table for not
2-7. Truth table for and
2-8. Truth table for or
2-9. Truth table for xor
2-10. Boolean operations in C
2-11. Hexadecimal, Binary and Decimal
2-12. Convert 203 to hexadecimal
2-13. Standard Integer Types and Sizes
2-14. Standard Scalar Types and Sizes
2-15. One's Complement Addition
2-16. Two's Complement Addition
2-17. IEEE Floating Point
2-18. Scientific Notation for 1.98765x10^6
2-19. Significands in binary
2-20. Example of normalising 0.375
3-1. Memory Hierarchy
9-1. Relocation Example
9-2. ELF symbol fields
List of Figures
1-1. Abstraction
1-2. Default Unix Files
2-1. Masking
2-2. Types
3-1. The CPU
3-2. Inside the CPU
3-3. Reorder buffer example
3-4. A Hypercube
3-5. Acquire and Release semantics
4-1. The Operating System
4-2. The Operating System
4-3. Rings
5-1. The Elements of a Process
5-2. The Stack
5-3. Process memory layout
5-4. Threads
5-5. The O(1) scheduler
6-1. Virtual memory pages
6-2. Virtual Address Translation
6-3. Segmentation
6-4. Linux address space layout
6-5. Linux Three Level Page Table
7-1. Alignment
7-2. Alignment
8-1. ELF Overview
9-1. Memory access via the GOT
9-2. sonames
List of Examples
1-1. Abstraction with function pointers
2-1. Using flags
2-2. Example of warnings when types are not matched
2-3. Floats versus Doubles
2-4. Program to find first set bit
2-5. Examining Floats
2-6. Analysis of 8.45
3-1. Memory Ordering
4-1. getpid() example
4-2. PowerPC system call example
4-3. x86 system call example
5-1. Stack pointer example
5-2. pstree example
5-3. Zombie example process
5-4. Signals Example
7-1. Struct padding example
7-2. Stack alignment example
7-3. Page alignment manipulations
7-4. Hello World
7-5. Function Example
7-6. Compilation Example
7-7. Assembly Example
7-8. Readelf Example
7-9. Linking Example
7-10. Executable Example
8-1. The ELF Header
8-2. The ELF Header, as shown by readelf
8-3. Inspecting the ELF magic number
8-4. Investigating the entry point
8-5. The Program Header
8-6. Sections
8-7. Sections
8-8. Sections readelf output
8-9. Sections and Segments
8-10. Creating and using a static library
8-11. Disassembley of program startup
8-12. Constructors and Destructors
9-1. Specifying Dynamic Libraries
9-2. Looking at dynamic libraries
9-3. Checking the program interpreter
9-4. Relocation as defined by ELF
9-5. Specifying Dynamic Libraries
9-6. Using the GOT
9-7. Relocations against the GOT
9-8. Hello World PLT example
9-9. Hello world main()
9-10. Hello world sections
9-11. Hello world PLT
9-12. Hello world GOT
9-13. Dynamic Segment
9-14. Code in the dynamic linker for setting up special values (from libc sysdeps/ia64/dl-machine.h)
9-15. Symbol definition from ELF
9-16. Examples of symbol bindings
9-17. Example of LD_PRELOAD
9-18. Example of symbol versioning