Overview
The bring-up of UNIX 2.11BSD on the w11 system required very detailed study of the source code. That quickly called for a hyperlinked and cross-referenced rendition of the sources, much like doxygen or lxr provide. The desired features were:
- cover boot, kernel, networking, library, and userland code
- cover C, assembler, and other relevant languages with syntax highlighting
- provide 'what is used where' information, over the whole system
- provide hyperlinks for a fast code path analysis, especially across the user-library-kernel interfaces
- make it possible to follow the execution logic of a system call from user code, through the library layer and the kernel interface, to the kernel code, just by following hyperlinks
Since none of the existing tools seemed to match that profile, a new tool was written. The task is greatly simplified by the fact that the code base is quasi-static and small by today's standards, which allows holding the metadata of the full code base in memory for analysis. The tool takes the whole file system and creates a set of static HTML files which show all sources with proper hyperlinks. Even though the original motivation was to understand the 2.11BSD system, the tool was designed such that all UNIX systems with a similar structure can be processed. So far the 2.11BSD configuration is the most advanced, which is why all examples in the following refer to the 2.11BSD code base.
Brief Example
It's very easy to follow the execution flow, even across the user-kernel boundary. A simple example starts in the code of the telnet(1) command:
- /usr/src/ucb/telnet.c.html
- It opens the required socket for communication with a call of socket(2) in line 2079:
  /usr/src/ucb/telnet.c.html#n:2079
- A click on socket() leads to
  /usr/src/lib/libc/pdp/sys/__socket.s.html#s:_socket
  which is an assembler stub in libc that implements the userland side of the socket(2) system call. The comments summarize the control flow via the trap instruction, the syscall() handler, and the sysent[] dispatcher to the socket kernel function.
- A click on socket leads to the kernel handler
  /usr/src/sys/sys/uipc_syscalls.c.html#s:_socket
  This routine is another interface layer, needed to reach the networking code, which resides in supervisor space and is executed in supervisor mode. This switch is done by SOCREATE.
- A click on SOCREATE leads to
  /usr/src/sys/pdp/net_mac.h.html#m:SOCREATE
  and shows that it is a macro which uses KScall() to switch from kernel to supervisor mode and call socreate.
- A click on socreate finally leads to
  /usr/src/sys/sys/uipc_socket.c.html#s:_socreate
  which is the handler doing the actual work.
Each source is fully cross-referenced against the whole system code base. Clicking on a symbol definition (or the X in the sidebar) takes you to the cross-reference information. This gives comprehensive answers to questions like
- where is setjmp() called?
- where is struct iob used?
- where is macro NBPG used?
- where is nlist.h included?
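The kind of lookup behind these questions can be sketched as a small in-memory index. This is only an illustration under stated assumptions, not the tool's actual implementation: the class `XrefIndex` and its method names are hypothetical, and only the paths and the line number 2079 are taken from the walk-through above.

```python
from collections import defaultdict


class XrefIndex:
    """Toy in-memory cross-reference index (hypothetical, not the tool's code)."""

    def __init__(self):
        # (kind, name) -> list of (path, line) tuples
        self._defs = defaultdict(list)
        self._uses = defaultdict(list)

    def add_def(self, kind, name, path, line=None):
        """Record that the object (kind, name) is defined in path."""
        self._defs[(kind, name)].append((path, line))

    def add_use(self, kind, name, path, line):
        """Record that the object (kind, name) is used in path at line."""
        self._uses[(kind, name)].append((path, line))

    def where_defined(self, kind, name):
        return list(self._defs.get((kind, name), []))

    def where_used(self, kind, name):
        return sorted(self._uses.get((kind, name), []))


index = XrefIndex()
# Definitions named in the walk-through; their line numbers are not
# given there, so they are left as None rather than guessed.
index.add_def("s", "_socket", "/usr/src/lib/libc/pdp/sys/__socket.s")
index.add_def("s", "_socket", "/usr/src/sys/sys/uipc_syscalls.c")
# The call site in telnet.c, line 2079, is taken from the walk-through.
index.add_use("s", "_socket", "/usr/src/ucb/telnet.c", 2079)

print(index.where_used("s", "_socket"))
```

Keying the index on a (kind, name) pair keeps symbols, structs, macros, and includes in separate namespaces, which matches the separate anchor kinds used in the generated pages.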
Brief User Guide
Supported languages and file formats
Essentially all languages used in the UNIX system code base are supported, along with some other file formats:
File Type | Language | Example |
---|---|---|
.c .h | C | /usr/src/bin/ed.c |
.c | C (Bourne style) | /usr/src/bin/sh/service.c |
.s | assembler (UNIX style) | /usr/src/sys/pdp/mch_trap.s |
.m11 | assembler (DEC style) | /usr/src/new/m11/mac.m11 |
.y | yacc | /usr/src/bin/expr.y |
.f .F | Fortran | /usr/src/new/PORT/dungeon/objcts.F |
.p | Pascal | /usr/src/ucb/PORT/pascal/pdx/test/parall.p |
.0 | man pages | /usr/man/cat1/awk.0 |
The language sources are syntax highlighted with a common color code:
Token Type | Example |
---|---|
string literal | "some string" |
character literal | 'a' |
number literal | 0x123 |
keyword | if then else |
type | int char float |
directive | .data .globl |
comment | /* this is a comment */ |
Navigation
A navigation bar lists all defined functions, global variables, macros, and structs. Clicking a name leads to the line of its definition; clicking the 'X' leads to the cross-reference listing. Each relevant object in an HTML-ized source file can be addressed via a URL:
Object | Anchor (#...) | Example | Comment |
---|---|---|---|
source code line | #n:<number> | /usr/src/sys/sys/sys_generic.c.html#n:126 | source line 126 |
symbol definition | #s:<name> | /usr/src/sys/sys/sys_generic.c.html#s:_ioctl | definition ioctl() |
struct definition | #sd:<name> | /usr/src/sys/h/file.h.html#sd:fileops | definition struct fileops |
macro definition | #m:<name> | /usr/src/sys/h/dir.h.html#m:MAXNAMLEN | #define MAXNAMLEN |
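The anchor scheme in the table above can be sketched as a small URL builder. The helper names `anchor_url` and `c_symbol` are hypothetical and not part of the tool. The leading underscore on C symbols is an assumption inferred from the examples (`_ioctl`, `_socket`, `_socreate`), not a documented rule.

```python
# Anchor prefixes as listed in the table above.
ANCHOR_PREFIX = {
    "line": "n",     # source code line
    "symbol": "s",   # symbol definition
    "struct": "sd",  # struct definition
    "macro": "m",    # macro definition
}


def anchor_url(source_path, kind, name):
    """Build '<source_path>.html#<prefix>:<name>' for an object of the given kind."""
    return f"{source_path}.html#{ANCHOR_PREFIX[kind]}:{name}"


def c_symbol(name):
    """Assumption: C symbols appear with a leading underscore, as in the examples."""
    return "_" + name


print(anchor_url("/usr/src/sys/sys/sys_generic.c", "line", 126))
print(anchor_url("/usr/src/sys/sys/sys_generic.c", "symbol", c_symbol("ioctl")))
print(anchor_url("/usr/src/sys/h/file.h", "struct", "fileops"))
print(anchor_url("/usr/src/sys/h/dir.h", "macro", "MAXNAMLEN"))
```

The four calls reproduce the example URLs from the table, so the sketch can be checked against it directly.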
The background color tells you which territory you are in:
Territory | 2.11BSD Example |
---|---|
boot/standalone | /usr/src/sys/pdpstand/boot.c |
kernel | /usr/src/sys/sys/sys_generic.c |
network | /usr/src/sys/netinet/ip_icmp.c |
libraries | /usr/src/lib/libc/gen/readdir.c |
user level | /usr/src/bin/cp.c |
Cross-Reference
The cross-reference listing shows where global symbols, macros, and structs are defined and where they are used, in the current source as well as in the whole system:
Object | Anchor (#...) | Example | Comment |
---|---|---|---|
symbol | #xref:s:<name> | /usr/src/lib/libc/gen/readdir.c.html#xref:s:_readdir | where is readdir() used |
struct | #xref:sd:<name> | /usr/src/sys/h/mbuf.h.html#xref:sd:mbuf | where is struct mbuf used |
macro | #xref:m:<name> | /usr/src/sys/h/dir.h.html#xref:m:MAXNAMLEN | where is MAXNAMLEN used |
include | #xref:i:<name> | /usr/src/sys/h/dir.h.html#xref:i:dir.h | where is dir.h used |
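Going the other way, a URL of the documented form can be split into its parts. The following is a minimal sketch that assumes only the anchor kinds shown in this guide (`n`, `s`, `sd`, `m`, `i`) and the `xref:` prefix; the function name `parse_anchor` is hypothetical and not part of the tool.

```python
from urllib.parse import urlsplit

# Anchor kinds that appear in this guide. In the examples, 'n' occurs
# only as a plain anchor and 'i' only in cross-reference anchors.
KINDS = {
    "n": "source line",
    "s": "symbol",
    "sd": "struct",
    "m": "macro",
    "i": "include",
}


def parse_anchor(url):
    """Split a URL into (page path, is_xref, kind description, name)."""
    parts = urlsplit(url)
    fragment = parts.fragment
    is_xref = fragment.startswith("xref:")
    if is_xref:
        fragment = fragment[len("xref:"):]
    kind, sep, name = fragment.partition(":")
    if not sep or kind not in KINDS or not name:
        raise ValueError(f"unrecognized anchor in {url!r}")
    return parts.path, is_xref, KINDS[kind], name


print(parse_anchor("/usr/src/sys/h/dir.h.html#xref:m:MAXNAMLEN"))
print(parse_anchor("/usr/src/sys/sys/sys_generic.c.html#n:126"))
```

Anything that does not match the documented pattern raises `ValueError` rather than being guessed at, since other anchor forms may exist that this guide does not list.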