UNIX® on the Game Boy Advance
© Amit Singh. All Rights Reserved.
Written in August 2004
UNIX® is a registered trademark of The Open Group.
Introduction
In this document, I discuss "gbaunix", a rather contrived experiment in which I run an ancient version of the UNIX operating system on a popular hand-held video game system using a simulator. I can imagine this to be loosely of interest to a few types of people:
Game Boy hobbyist programmers
Students of operating systems, emulators, and compilers
UNIX aficionados
Specifically, I run 5th edition UNIX on the Nintendo Game Boy Advance. I briefly cover the following topics in this discussion:
Apple teamed with Acorn to fund a new company called Advanced RISC Machines, Limited, which became the new ARM. VLSI Technology was a technology partner in this endeavor. ARM Limited's first processor was the ARM6, based on Version 3 of the ARM architecture (ARMv3). It had full 32-bit code and data addressing. An ARM6 processor, a 20 MHz 610, was used in Apple's MessagePad hand-held (Newton).
The GBA's 32-bit processor is an ARM7TDMI, an implementation of Version 4T of the ARM architecture. We look at some of its details in the next section.
The GBA's ARM
The GBA uses an ARM7TDMI processor, with certain ARM features missing. The ARM7TDMI is the most widely-used 32-bit embedded RISC microprocessor. Its architecture (ARMv4T) is almost a decade old (as of 2004). The ARM7TDMI has neither a memory management unit nor a cache. Its nomenclature may be understood as follows:
ARM7: It is an ARM7 processor
T: It includes the Thumb 16-bit compressed instruction set
D: It has on-chip Debug support
M: It has an enhanced 32-bit x 8 Multiplier, with 64-bit result
I: It has EmbeddedICE hardware that supports on-chip breakpoints and watchpoints
The ARM7TDMI core has a single 32-bit data bus carrying both instructions and data. Data can be 8, 16, or 32 bits, and only load, store, and swap instructions can access data from memory. There are 31 general-purpose 32-bit registers, 6 status registers, a barrel-shifter, an Arithmetic Logic Unit (ALU), and an enhanced multiplier. Not all registers are accessible at the same time. For example, in ARM state, 16 general registers and one or two status registers are accessible at any one time. The processor can operate in seven modes: a user mode that is used for executing most programs, and six privileged modes (Fast Interrupt, Interrupt, Supervisor, Abort, Undefined, and System). The processor has two states: ARM and Thumb (see below).
The GBA has DMA hardware (with 4 channels) external to the processor. While ARM supports two types of interrupt requests, normal (IRQ) and "fast" (FIQ), the GBA uses only the normal IRQ.
The ARM7 has a simple three-stage pipeline with the following stages:
Fetch: An instruction is fetched (from memory), and put into the instruction pipeline.
Decode: An instruction is decoded (for example, the registers used in the instruction are decoded).
Execute: One or more registers are read from the register bank, shift and ALU operations occur, and results are written to one or more registers.
At any point during normal operation, while one instruction executes, the next instruction is decoded and a third instruction is fetched.
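The overlap described above can be sketched in a few lines of C. This is only an idealized model: it assumes one instruction enters the pipeline per cycle and ignores stalls, branches, and multi-cycle instructions. The names `stage` and `pipeline_occupant` are made up for this illustration.

```c
enum stage { FETCH = 0, DECODE = 1, EXECUTE = 2 };

/* Index of the instruction occupying `stage` during `cycle` (both 0-based),
 * or -1 if that stage is still empty while the pipeline fills.
 * Idealized: instruction i is fetched in cycle i, decoded in cycle i + 1,
 * and executed in cycle i + 2. */
int pipeline_occupant(int cycle, enum stage stage)
{
    int index = cycle - (int)stage;
    return index >= 0 ? index : -1;
}
```

In cycle 2 of this model, instruction 0 executes while instruction 1 is decoded and instruction 2 is fetched.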
A traditional drawback of RISCs is their relatively poor code density (as compared to CISCs), due to their fixed-length instruction sets. This increases the size of a program's working-set, and leads to poorer cache utilization, more memory traffic, and higher power consumption. Such problems become particularly important in embedded applications. While an efficient solution to the power consumption problem would be multi-pronged (efficient use of parallelism, innovative electronics, etc.), ARM incorporated the "Thumb" architecture into certain processors to improve code density.
Thumb is a 16-bit compressed version of the normal 32-bit ARM instruction set. It includes a subset of the most commonly used 32-bit ARM instructions. While Thumb instructions have 16-bit wide opcodes, they operate on the same 32-bit register set as ARM code, and have most other benefits of the 32-bit core (32-bit address space, 32-bit barrel shifter, 32-bit ALU, etc.). Thumb-enabled processors (such as the GBA's ARM7TDMI) have decompression hardware in the instruction pipeline. The decompressor translates Thumb instructions into equivalent ARM instructions. Thumb code density approaches, and even exceeds, that of many CISC processors. It is much better than ARM in certain contexts, for example (the numbers are approximate):
Thumb code takes 35% less space than ARM code
Thumb code uses 40% more instructions than ARM code
Thumb code is 40% slower than ARM code, if using 32-bit memory
Thumb code is up to 60% faster than ARM code, if using 16-bit memory
Thumb code results in 30% less external memory power consumption than ARM code
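As a back-of-the-envelope check on the first two figures (a rough estimate, not vendor data): if a Thumb program needs about 1.4 times as many instructions as its ARM equivalent, and each Thumb instruction occupies 2 bytes instead of 4, then

```latex
\frac{\text{Thumb size}}{\text{ARM size}}
  \approx \frac{1.4 \times 2\ \text{bytes}}{1 \times 4\ \text{bytes}}
  = 0.7
```

that is, roughly 30% smaller, which is in the same neighborhood as the approximate 35% quoted above.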
Typical applications, including those on the GBA (and our experimental code), use a mixture of ARM and Thumb. The ARM7TDMI interprets an instruction stream as ARM or Thumb based on a bit (the "T" bit) in the Current Program Status Register (CPSR). The GBA has a small amount of fast 32-bit memory (the 32 KB IWRAM). Typically, some amount of speed-critical code can be ARM code executing from this memory. The majority of an application's code might not be speed-critical, however, and thus may be Thumb executing from slower memory (such as a GBA GamePak). Nevertheless, as we saw, Thumb code can be faster than ARM code under certain circumstances, so it is not always a straightforward comparison.
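Both the processor mode and the ARM/Thumb state discussed above are recorded in the CPSR. The following is a minimal sketch of decoding them in C. The bit positions (mode in bits 4..0, T in bit 5) and the mode encodings come from the ARMv4T architecture, not from gbaunix; the function and macro names are made up for this illustration.

```c
#include <stdint.h>

#define CPSR_MODE_MASK 0x1Fu      /* M[4:0]: processor mode */
#define CPSR_T_BIT     (1u << 5)  /* set: Thumb state, clear: ARM state */

/* Human-readable name of the mode encoded in a CPSR value. */
const char *cpsr_mode_name(uint32_t cpsr)
{
    switch (cpsr & CPSR_MODE_MASK) {
    case 0x10: return "User";
    case 0x11: return "Fast Interrupt";
    case 0x12: return "Interrupt";
    case 0x13: return "Supervisor";
    case 0x17: return "Abort";
    case 0x1B: return "Undefined";
    case 0x1F: return "System";
    default:   return "(invalid)";
    }
}

/* Nonzero if the CPSR value says the core is executing Thumb code. */
int cpsr_is_thumb(uint32_t cpsr)
{
    return (cpsr & CPSR_T_BIT) != 0;
}
```

For example, a CPSR value of 0x3F decodes as System mode in Thumb state, while 0x10 is User mode in ARM state.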
gbaunix is based on SIMH, a highly portable PDP-11 simulator written in C. SIMH also implements simulators for numerous other systems, but I only use the PDP-11 portion. While the GBA has a C runtime via freely available toolchains, the port of SIMH to the GBA requires several additions and a few modifications. The high-level architecture of gbaunix is shown below:
The gbaunix "game cartridge" (or ROM) is the concatenation of a simulator runtime and a UNIX disk image. The latter could be taken as is, or may have additional programs loaded onto it.
The simulator contains SIMH at its core. I have attempted to make minimal modifications to SIMH itself, abstracting GBA specific functionality into a logically separate layer that SIMH talks to. This layer could be thought of as having the following components:
TTY Output: gbaunix provides an illusion of a text terminal to SIMH. All TTY output is re-routed to GBA routines that format it, and send it to the GBA's framebuffer. Text-scrolling is done, if necessary. printf() is implicitly converted to a two step operation: sprintf() into a formatting buffer, followed by displaying the buffer on the screen.
TTY Input: gbaunix does not have an input mechanism currently. You can only execute a canned sequence of UNIX shell commands. The sequence must be specified at compile-time as an array of strings in gba/gba_kbd.h in the source. While UNIX is running, pressing the START button feeds the next command line into the TTY's input buffer. gbaunix can either simply poll for keypresses, or can use the GBA's keypad interrupt. The latter ensures that no keypresses are missed. An example sequence is shown below:
/* gba/gba_kbd.h */
const char *gba_kbdinput[] = {
    "chdir /work\r",
    "ls -l\r",
    "./fact 100\r",
    "cat hanoi.c\r",
    "./hanoi\r",
    "./hanoi 3\r",
    "chdir /tmp\r",
    "echo 'main() { printf(\"Hello, World!\\n\"); }' \
    > hello.c\r",
    "cc hello.c\r",
    "./a.out\r",
    /* ... more commands */
};
File System: Due to the size of the UNIX disk image (2.5 MB), the only area of GBA memory that it would fit in is the ROM. Even though I cannot write to the ROM, I must accommodate writes to this "disk". A simple-minded solution is to allocate a "shadow" buffer in RAM every time a new write happens. All reads and writes look up the shadow buffer chain to see if the source or target (respectively) regions of the disk already exist. Note that a subsequent operation may span multiple shadow buffers, partially or fully. Thus, we may need to coalesce buffers occasionally. Furthermore, a fake stdio layer is exported to SIMH.
Memory: The simulated PDP-11 is given 128 KB of memory from the GBA's 256 KB EWRAM. Simulator memory operations (such as copying or moving) are implicitly converted to equivalent GBA operations (with optimizations, such as the use of DMA, if applicable).
Miscellaneous: This category includes code to initialize the runtime: for setting up the TTY, setting up any interrupt handlers, calling file system initialization hooks, etc.
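The shadow-buffer idea described under File System above can be sketched as follows. This is a simplified illustration, not the gbaunix code: all type and function names are hypothetical, every write allocates a fresh buffer (nothing is reused or coalesced), and malloc() stands in for whatever RAM allocator the real runtime uses.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical types and names; not taken from the gbaunix source. */
typedef struct shadow {
    size_t offset;          /* start of the written region on the "disk" */
    size_t length;          /* number of bytes written */
    uint8_t *data;          /* RAM copy of the written bytes */
    struct shadow *next;    /* next (newer) write in the chain */
} shadow_t;

typedef struct {
    const uint8_t *rom;     /* read-only disk image */
    size_t size;            /* image size in bytes */
    shadow_t *head, *tail;  /* chain of writes, oldest first */
} disk_t;

/* Record a write in a new shadow buffer; the ROM image is never touched. */
int disk_write(disk_t *d, size_t off, const void *buf, size_t len)
{
    shadow_t *s;
    if (off > d->size || len > d->size - off) return -1;
    s = malloc(sizeof *s);
    if (s == NULL) return -1;
    s->data = malloc(len ? len : 1);
    if (s->data == NULL) { free(s); return -1; }
    memcpy(s->data, buf, len);
    s->offset = off;
    s->length = len;
    s->next = NULL;
    if (d->tail) d->tail->next = s; else d->head = s;
    d->tail = s;
    return 0;
}

/* Read from ROM, then overlay each overlapping shadow, oldest to newest,
 * so that the most recent write to any byte wins. */
int disk_read(const disk_t *d, size_t off, void *buf, size_t len)
{
    const shadow_t *s;
    uint8_t *out = buf;
    if (off > d->size || len > d->size - off) return -1;
    memcpy(out, d->rom + off, len);
    for (s = d->head; s != NULL; s = s->next) {
        size_t lo = s->offset > off ? s->offset : off;
        size_t end_s = s->offset + s->length;
        size_t end_r = off + len;
        size_t hi = end_s < end_r ? end_s : end_r;
        if (lo < hi)
            memcpy(out + (lo - off), s->data + (lo - s->offset), hi - lo);
    }
    return 0;
}
```

Because a read walks the whole chain, a long-running session would get slower and use more RAM over time, which is one reason occasional coalescing of buffers is attractive.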
Support a wide range of applications.
Have flexible and convenient remote access, with acceptable response time.
Have a hierarchical information structure (allowing for hierarchical control, and decentralization, of resource allocations and accounting authorizations).
Have a reliable internal file system.
Have selective information sharing (for example, a user may selectively allow others to access his files).
Have on-line system documentation.
BTL withdrew from the Multics project in early 1969.
Multics continued as a commercial product after Honeywell acquired GE's computer assets, and after Bull acquired the Honeywell properties. The system eventually met many of its design objectives. The last Multics site was shut down in 2000. In any case, Multics' design would influence several systems to come.
After BTL's withdrawal, there were a few at BTL who had been working on Multics and were restless (Ken Thompson, Dennis Ritchie, Stu Feldman, Doug McIlroy, Bob Morris, and Joe Ossanna). Thompson was working on a "game", Space Travel, on the GE-635. Space Travel had first been written for Multics, and had a FORTRAN port for the General Electric Comprehensive Operating System (GECOS) running on the 635. The game simulated the motion of certain celestial bodies in the Solar System. The player's goal was to land a space ship on a planet or a moon. The GE computer's hardware and software were both ill-equipped to run the game, and playing was expensive in terms of CPU time (although the "money" so spent was only theoretical). Thompson proactively looked for an alternative. He decided to take over a "little-used" DEC PDP-7. This particular unit had 8K 18-bit words of memory and a capable vector display processor. Thompson, with help from Ritchie, rewrote Space Travel in PDP-7 assembly language from scratch (including a floating-point simulator). It ran standalone on the PDP-7.
0th Edition (late 1969)
Thompson and Ritchie did not use any native software on the PDP-7 to write Space Travel, but used a cross-assembler on the GE machine, and used paper tapes to run the program on the PDP. Unhappy with this setup, Thompson began writing an operating system for the PDP-7, starting with a file system. The "cross-assembly on GECOS followed by paper tape" arrangement continued until Thompson had a system capable of hosting development. At this point (late 1969), the system had a kernel, an editor, an assembler, a simple command shell, and some file utilities (cat, cp, rm, etc.). This was UNICS: a pun on Multics whether you think of it as "Uniplexed ULTICS" or a castrated Multics. Later, UNICS became UNIX. This first incarnation of UNIX may be regarded as the 0th edition.
The ancient cp command operated on multiple file name arguments by taking them in pairs. Thus:
# cp file11 file12 file21 file22 ...
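The pairwise convention can be illustrated with a short C sketch. This is not the original cp (which was written in PDP-7 assembly); the names `cp_pairs`, `copy_fn`, and `print_pair` are invented here, and rejecting an odd number of names is an assumption of this sketch, not documented behavior.

```c
#include <stdio.h>

/* Callback type: copy `src` to `dst`, return 0 on success. */
typedef int (*copy_fn)(const char *src, const char *dst);

/* Stand-in for a real copy routine: just report what would be copied. */
int print_pair(const char *src, const char *dst)
{
    printf("%s -> %s\n", src, dst);
    return 0;
}

/* Treat names[0..n-1] as (source, destination) pairs and copy each pair.
 * Returns the number of pairs copied, or -1 on error. */
int cp_pairs(int n, const char *const names[], copy_fn copy)
{
    int i, copied = 0;
    if (n % 2 != 0)
        return -1;              /* assumption: an unpaired name is an error */
    for (i = 0; i < n; i += 2) {
        if (copy(names[i], names[i + 1]) != 0)
            return -1;
        copied++;
    }
    return copied;
}
```

With the four names from the command line above, the callback is invoked twice: once for (file11, file12) and once for (file21, file22).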
The dsw command (contraction for "delete using switches") was used for deleting files interactively.
Multics (and the earlier CTSS) influenced many aspects of UNIX, such as:
The shell (the command interpreter was even called "shell" on Multics). UNIX `shell command` is analogous to Multics' [shell command].
Utilities such as ls, pwd, chdir (cwd on Multics), mail, man (help on Multics).
"rc" files (CTSS had a program called RUNCOM).
roff, the command for rendering text (the CTSS RUNOFF command was used for Multics documentation).
A file as a stream of bytes with no underlying structure
A text file as a stream of characters with newlines
File system with a tree structure
Structure of the underlying disk hidden from applications performing file I/O
The arguments (file handle, buffer, count) used in file I/O calls
I/O redirection
Ritchie said in "The UNIX Time-sharing System: A Retrospective" that "In most ways UNIX is a very conservative system. Only a handful of its ideas are genuinely new. In fact, a good case can be made that it is in essence a modern implementation of MIT's CTSS system. The claim is intended as a compliment to both UNIX and CTSS."
PDP-7 UNIX had a file system with i-nodes, although an i-node contained very basic information: the list of physical blocks, and minimal metadata (size, protection mode, and type). While special files and directories were supported, there were no pathnames. There was even buffering in the file system. Some stark limitations included:
You could only create directories and special files at file system creation time.
The system had only one disk, and you could not mount any more disks.
The system was not multiprogrammed. Thus, only one program could exist in memory at a time.
Disk I/O completely blocked the CPU.
There were no fork, exec, or wait. The shell executed commands using a tedious mechanism, which terminated the shell upon every command. After command completion, exit ran a brand new copy of the shell.
The PDP-7 UNIX system also caused the inception of a high-level language, B, which was influenced by the BCPL programming language. Dennis Ritchie called B "... C without types ... BCPL squeezed into 8k bytes of memory and filtered through Thompson's brain ..." B would eventually lead to C. PDP-7 UNIX and its utility programs were all written in assembly language.
For an example of a BCPL program, see .
Note that UNIX also ran on the PDP-9.
1969 was also the year that the first ARPANET node became operational, and the first Internet RFC was published.
The UNIX "group" then made efforts to convince BTL to acquire a better machine, the PDP-11. They promised to deliver a document editing and formatting system (meant to run standalone, without UNIX), and were supposed to use UNIX only as a development platform. The first PDP-11 they received was an 11/20, with 24 KB of memory. UNIX ran on the PDP-11 in early 1971. Of its memory, 12 KB was used by the kernel, some by user programs, and the rest as a RAM disk.
1st Edition (November, 1971)
The 1st edition ran on the PDP-11/20, which had no MMU, nor any hardware protection features. Therefore, it was trivial to crash the operating system. There was no multiprogramming. Pathnames were present. The only (documented) system calls were: break, cemt, chdir, chmod, chown, close, creat, exec, exit, fork, fstat, getuid, gtty, ilgins, intr, link, mkdir, mount, open, quit, read, rele, seek, setuid, smdate, stat, stime, stty, tell, time, umount, unlink, wait, and write.
A few programming languages were supported: Assembly, B, BASIC, and FORTRAN (but no C).
The B (and Assembly) compilation environment consisted of the following:
Component      Description
/bin/as        assembler (the output of the assembler went to a file called a.out by default)
/bin/ld        the link editor (note that only one person could use ld at a time in a given directory due to its use of temporary files)
/bin/nm        program for printing the symbol table from the output file of an assembler or loader run
/bin/strip     program for removing symbols and relocation bits
               program for printing a list of undefined symbols in an appropriate file
/etc/as2       second pass of the PDP-11 assembler
/etc/ba        B assembler (prog.i -> prog.s)
/etc/bc        B compiler (prog.b -> prog.i)
/etc/bilib     B interpreter library
/etc/brt1      B runtime routines
/etc/brt2      B runtime routines
/etc/liba.a    assembly language subroutines
/etc/libb.a    general utility routines for B programs
/usr/b/rc      a shell script that compiled, by invoking other programs, a B source file (program.b) into an executable (a.out). The compilation sequence was program.b -> program.i -> program.s -> a.out
It is noteworthy that the 1st edition did not carry any copyright notice.
The system had impressive documentation. The manual was divided into seven sections:
Commands (programs intended to be invoked directly by the user)
System calls (entries into the UNIX supervisor, accessed via the trap instruction)
Subroutines (intended to be called by user programs)
Special files (files referring to I/O devices)
File formats
User maintained programs
Miscellaneous
Subsequent editions had an eighth section titled System Maintenance.
Each logical page of the manual, a "man page", contained subsections: name, synopsis, description, files, see also, diagnostics, bugs, and owner. The manual was prepared using the UNIX text editor ed, and the roff text formatter.
The first man page was for the cat command.
2nd Edition (June, 1972)
The 2nd edition contained cc, the C compiler, but it was not yet implemented in C. A few new commands, system calls, and subroutines were introduced, such as: :(1), cc, echo(1), exit(1), goto(1), if(1), login(1), m6(1), man(1), mt(1), opr(1), stty(1), tmg(1), tss(1), kill(2), sleep(2), sync(2), atan(3), hypot(3), nlist(3), qsort(3), salloc(3), and sqrt(3).
Component      Description
:(1)           A command that did nothing (its function was to place a label for the goto command: thus, the shell didn't have to be fixed to ignore lines with :'s).
cc(1)          The C compiler.
m6(1)          A general purpose macro processor.
opr(1)         Submit a job for off-line printing.
tmg(1)         A compiler compiler (TMG was a compiler-writing language).
tss(1)         Interface to the Honeywell TSS.
salloc(3)      A set of routines for dealing with (almost) arbitrary length strings of words and bytes.
2nd edition UNIX was still unprotected, with no multiprogramming. It did carry a copyright.
3rd Edition (February, 1973)
The 3rd edition was the first UNIX to run on the PDP-11/45, which had hardware segmentation and support for more core memory (256 KB).
3rd edition UNIX had some noteworthy features, such as pipes and multiprogramming. Some other interesting additions included:
Component      Description
cdb(1)         The C debugger
crypt(3)       The password encoding routine
proof(1)       A program for comparing two text files (latter day diff)
sno(1)         A SNOBOL III compiler and interpreter
speak(1)       A word to voice translator that turned a stream of ASCII words into utterances, and output them to a voice synthesizer.
typo(1)        Quoted verbatim from the man page: "... hunts through a document for unusual words, typographic errors, and hapax legomena and prints them on the standard output."
yacc(6)        A compiler compiler
4th Edition (November, 1973)
The 4th edition was essentially the 3rd edition rewritten in C. It also supported newer PDP-11 systems (such as the /60 and /70). Implementation in a high-level language resulted in a system about a third bigger than the previous edition.
A few commands were added, while the B programming language was not included.
is also the magazine of the Association) has to do with a popular terminal (Teletype model 37) in the early 1970s. The ';' escape sequence put that terminal in full-duplex mode. Hence, the UNIX greeting message contained this sequence. A terminal that does not understand this sequence, such as our GBA TTY, would print the semi-colon.
We noted that even the first UNIX had special files: I/O devices represented as files. When such a file was read from or written to, the underlying device was activated. In the accompanying screenshot, /dev/rk0 and /dev/rrk0 are the block and character devices, respectively, corresponding to the first moving-head "RK" disk drive. /dev/mem mapped the core memory of the computer into a file. It was possible to patch the running system using a debugger on this device.
glob, contraction for "global", was an external (to the shell) program used by the shell to expand special characters ('*' and '?') in the argument list. glob would expand any metacharacters and invoke the command itself. Upon failing to generate any matches, glob would print the classic UNIX error message: "No match".
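The kind of matching glob performed can be sketched with a small recursive matcher for '*' and '?'. This is a modern illustration only, not the 5th edition glob source; `glob_match` and `glob_count` are names invented for this sketch, and the real program also had to read directories and then execute the command.

```c
#include <stddef.h>

/* Return 1 if `name` matches `pat`, where '*' matches any run of
 * characters (including none) and '?' matches exactly one character. */
int glob_match(const char *pat, const char *name)
{
    for (; *pat != '\0'; pat++, name++) {
        if (*pat == '*') {
            /* Try every possible length for the run matched by '*'. */
            do {
                if (glob_match(pat + 1, name))
                    return 1;
            } while (*name++ != '\0');
            return 0;
        }
        if (*name == '\0')
            return 0;               /* pattern left over, name exhausted */
        if (*pat != '?' && *pat != *name)
            return 0;               /* literal character mismatch */
    }
    return *name == '\0';
}

/* Count how many of the n candidate names match the pattern.
 * A result of 0 corresponds to the "No match" case. */
size_t glob_count(const char *pat, const char *const names[], size_t n)
{
    size_t i, count = 0;
    for (i = 0; i < n; i++)
        if (glob_match(pat, names[i]))
            count++;
    return count;
}
```

Given the names hanoi.c, fact.c, and a.out, the pattern *.c selects the first two, while *.s selects none.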
Note that certain commands, such as mkfs, were kept in /etc to lessen the probability of them being invoked accidentally, or out of curiosity!
The dc command, an arbitrary precision reverse Polish calculator, made its appearance very early in UNIX. In fact, it was the first program to be tested on the PDP-11, even before UNIX ran on it.
A typical 5th edition kernel was less than 26 KB. Note also the sizes of the shell (5738 bytes) and init (1972 bytes). The system shown here has a minimal rc. /etc/update updated the super block every 30 seconds, so as to ensure that the file system was fairly up to date in case of a crash.
The long format output of ls listed the mode, number of links, owner, size in bytes, time of last modification, and name.
By the time the 5th edition was released, UNIX had a rich programming environment with support for numerous programming languages, such as Algol-68, APL, Assembly, BASIC, C, FORTRAN, M6, PASCAL, Snobol, and TMG. In fact, using gbaunix you can even program on the GBA itself in several languages (in theory; it might be painful in real life). Compilers/interpreters are included on the 5th edition disk image for C, PDP-11 assembly, BASIC, Shell (scripting), and FORTRAN. I have tried Algol-68 too, although you may have to load it onto the disk image.
Following are some more examples of programming in the gbaunix environment (including the obligatory "Towers of Hanoi"):
Some more screenshots, including those of the output of miscellaneous commands, can be seen on the
Subsequent Editions
The 6th edition (May, 1975) was important in that BSD and XENIX were derived from it. John Lions wrote his famous Commentary based on the 6th edition (A Commentary on the Sixth Edition UNIX Operating System, 1977). Moreover, 6th edition is the earliest UNIX system that is available in machine readable form in its entirety.
The on-line documentation for 5th edition is missing. Only bits and pieces are available of the 4th edition and earlier systems.
Beginning with the 6th edition, the UNIX system proliferated rather rapidly.
Further editions in the BTL lineage were:
7th edition (January, 1979)
8th edition (February, 1985)
9th edition (September, 1986)
10th edition (October, 1989)
It is also possible to run some other systems on the GBA that normally run under SIMH. However, doing so becomes increasingly hard (and after a point, impossible) as system requirements increase (more memory required, say).
Following are some screenshots depicting BSD 2.9 running on the GBA:
Note that in order to run 5th edition UNIX with gbaunix, you must have an RK05 disk image of 5th edition UNIX, which is not included in the gbaunix distribution. SCO owns the copyright for the 5th edition (and several others). You can download a disk image, after reading and understanding the license and ensuring that you are eligible, from the Unix Archive at the PDP-11 Unix Preservation Society:
If you only want to run gbaunix, without recompiling it, you can simply concatenate the pre-compiled executable (unixv5.tmp) in the distribution with the RK05 disk image:
% cat unixv5.tmp disks/unixv5.dsk > unixv5.gba
unixv5.gba is the ready-to-run "game cartridge", which can be used with a Game Boy Advance emulator, or on real hardware.
If you want to compile gbaunix (highly recommended, and the only realistic way to experiment with it), you would need a cross-compilation environment: an ARM toolchain for your platform. You may need to edit the Makefile to set the path to your ARM compiler. The RK05 image must be present as disks/unixv5.dsk in the source tree. Thereafter, simply run make.
References
Ancient UNIX sources, binaries, and documentation.
Dennis Ritchie's writings.
SIMH documentation.