December 18, 2018
I've been around compilers, code generation, and object formats enough that this was mostly a refresher - but I did like some of the historical references and notes on atypical processors. I think this works at a good level of detail for experienced programmers who can fill in the blanks for things like relocation records that identify which bit positions need to be set. An experienced programmer will also have used symbolic debuggers, and will have a fair idea of what those tools expect to find in an executable image.
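For readers who want one of those blanks filled in, here is a minimal sketch in C of what applying such a relocation might look like. The `struct reloc` layout and the function names are my own illustration, not any real object format: the record just says which instruction word to patch and which bit positions hold the address field.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical relocation record (not taken from any real object format). */
struct reloc {
    uint32_t offset;  /* index of the 32-bit instruction word to patch */
    unsigned shift;   /* lowest bit position of the address field */
    unsigned width;   /* number of bits in the address field */
};

/* Return `word` with the field described by `r` replaced by `value`.
 * Bits of `value` that do not fit in the field are silently dropped;
 * a real linker would normally report that as an overflow error. */
uint32_t patch_field(uint32_t word, const struct reloc *r, uint32_t value)
{
    assert(r->width >= 1 && r->width <= 32 && r->shift + r->width <= 32);
    uint32_t field = (r->width == 32) ? UINT32_MAX
                                      : ((UINT32_C(1) << r->width) - 1);
    uint32_t mask = field << r->shift;
    return (word & ~mask) | ((value << r->shift) & mask);
}

/* Patch the instruction word that the record points at inside an image. */
void apply_reloc(uint32_t *image, const struct reloc *r, uint32_t value)
{
    image[r->offset] = patch_field(image[r->offset], r, value);
}
```

A record of `{0, 8, 16}` would mean "bits 8 through 23 of word 0 hold the address", so only those bits change and the opcode bits around them are left alone.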
I found discussion of linker scripts a bit thin, though - on one embedded application, I think I wrote more lines of linker scripts than of assembler. The author does mention things like linking a piece of code to run at address X but storing it in memory at address Y. You'll need this, for example, when your processor has a small but fast on-chip RAM (address X), and you have a few different code fragments (stored at addresses Y) you'll be loading into that buffer at different times.
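A rough sketch of the run-time half of that arrangement, in C. Everything here is an assumption for illustration: the `struct fragment` descriptor and `load_fragment` are hypothetical names, and in a real build the load address and size would usually come from symbols defined in the linker script rather than from ordinary arrays.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor for one code fragment as stored in slow memory. */
struct fragment {
    const unsigned char *load_addr;  /* address Y: where the bytes are stored */
    size_t size;                     /* number of bytes in the fragment */
};

/* Copy a fragment into the fast RAM region it was linked to run in.
 * run_addr is address X, the address the linker assumed when it resolved
 * the fragment's internal references. Returns 0 on success, or -1 if the
 * fragment does not fit in the region. */
int load_fragment(unsigned char *run_addr, size_t run_capacity,
                  const struct fragment *f)
{
    if (f->size > run_capacity) {
        return -1;
    }
    memcpy(run_addr, f->load_addr, f->size);
    return 0;
}
```

On real hardware there is usually more to it than the copy, such as instruction cache maintenance before jumping into the buffer, and that part is very processor-specific.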
It also doesn't mention useful things like defining a symbol at link time instead of in the application code - helpful when coding to a memory mapped device that might live at different addresses in different application configurations. I also used this feature with the GNU linker and this command line option: "-defsym,buildDateTime=$(shell date +%s)". That embeds the build time in the application - not as a value stored in memory, but as the address of the buildDateTime symbol which then gets cast to a time value.
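The application side of that trick can be sketched in a few lines of C. The helper name `time_from_symbol_address` is hypothetical. When linking through the gcc driver, the option would typically be spelled `-Wl,--defsym,buildDateTime=...`. This also assumes a statically linked image that is not relocated at load time, as in my embedded case; position-independent builds may complicate it.

```c
#include <stdint.h>
#include <time.h>

/* Recover a value that is carried as a symbol's address rather than as
 * data stored in memory. Nothing is ever read through the pointer. */
time_t time_from_symbol_address(const void *symbol_address)
{
    return (time_t)(uintptr_t)symbol_address;
}

/* In a real build the symbol is declared but never defined in C; the
 * linker option supplies its "address":
 *
 *     extern const char buildDateTime[];
 *     time_t built = time_from_symbol_address(buildDateTime);
 */
```

Declaring the symbol as an array rather than as a scalar makes it harder to read from it by accident, since the name decays to the address, which is the only thing that means anything here.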
This does mention overlays, which can still be life-savers when coding a large application for a small (e.g. 16-bit) address space. I saw only 'tree-structured' overlays described, though, the kind found in DEC's RSX-11 operating system. I did not see mention of 'band-structured' overlays, as found in DEC's RT-11 system. That disappointed me because I've found the concept helpful in small address spaces with larger ROMs where, for example, something like overlays could be used for internationalization. With a helpful bus controller, you could swap between UI messages in different languages by choosing which language's message overlay appears in the address space at any given time.
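To make the internationalization idea concrete, here is a small simulation in C. It is only a sketch under loud assumptions: the bank names, message IDs, and `select_language_bank` are all made up, and a plain pointer stands in for the bus controller's bank-select register. On real hardware every bank would be linked at the same address and the code would never see more than one of them at a time.

```c
#include <assert.h>

/* Message IDs shared by every language bank (illustrative only). */
enum { MSG_GREETING, MSG_FAREWELL, MSG_COUNT };

/* Each bank has an identical layout, so the same ID finds the matching
 * message no matter which bank is currently visible. */
const char *const bank_en[MSG_COUNT] = { "Hello", "Goodbye" };
const char *const bank_de[MSG_COUNT] = { "Hallo", "Auf Wiedersehen" };

/* Stand-in for the bank-select register: which bank is "mapped in". */
static const char *const *mapped_bank = bank_en;

/* Simulates telling the bus controller to map a different bank. */
void select_language_bank(const char *const *bank)
{
    mapped_bank = bank;
}

/* Application code looks messages up by ID and never names a language. */
const char *ui_message(int id)
{
    assert(id >= 0 && id < MSG_COUNT);
    return mapped_bank[id];
}
```

The point is that the caller of `ui_message` does not change at all when the language does; only the mapping does.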
No text can cover everything. For example, this omits the "cmpexe" compound executable format used in the Apollo Domain system. The one runnable file actually held code in two different instruction sets, so one program file could be used on either the 680x0 or the "Prism" processor architecture - the loader just chose which side of the program to run. Heterogeneous environments like that have been rarities, but the idea remains interesting. I was also involved in the design of a microcode linker, where addresses were not numbers but bit-strings. Since sequential addresses had little meaning, individual instructions from different input segments could be interleaved, subject to bit pattern constraints on micro-addresses.
Historical exotica aside, this gives a strong foundation in the basic concepts of linking and loading. It offers just enough of a look at CPU hardware to show how instruction formats and memory characteristics affect the process. It also presents a nice progression from simpler to more complex object formats, and the reasons for them. I imagine this as a useful adjunct to a college course in compilers or operating systems, and helpful to professionals self-teaching about what's "under the hood" in familiar programming tools. Highly recommended, but you might outgrow it quickly once you start working on the tools yourself.