FAT12 floppy image (Python)

From LiteratePrograms


theory

A filesystem is essentially a container data structure. To read a filesystem (or to modify one efficiently) we would need to handle every valid state, but to initialize one we are free to choose the simplest valid layout.

In this case, we attempt to build rudimentary DOS FAT12 floppy images; as this filesystem format is over three decades old, a direct route will be less than a screenful of lines.

practice

We use the following straightforward code to write the filesystem image:

<<writing image>>=
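def mkimg(filename,blobs):
    f = open(filename, "wb")        # binary mode: the image is raw bytes
    for off, data in blobs:
        f.seek(off*512)             # block offset -> byte offset
        f.write(''.join(data))      # concatenate the pieces at that block
    f.close()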

Therefore, we describe our desired filesystem in terms of block offsets and the data which will be concatenated at that location.
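
As a minimal sketch of this idea, here is a Python 3 variant of the writer together with a tiny usage example. The names write_blobs, SECTOR and the demo file are illustrative assumptions, not part of the program itself; the sketch assumes 512-byte blocks and relies on the gap left by seeking past the end of a file reading back as zero bytes.

```python
import os
import tempfile

SECTOR = 512  # bytes per block (assumed, as on a 1.44 MB floppy)

def write_blobs(filename, blobs):
    """Write (block offset, list of byte strings) pairs into an image file."""
    with open(filename, "wb") as f:          # binary mode: no newline translation
        for off, data in blobs:
            f.seek(off * SECTOR)             # block offset -> byte offset
            f.write(b"".join(data))          # pieces are concatenated in place

# Usage: one blob at block 0 and one at block 2, in a throwaway file.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.raw")
write_blobs(demo_path, [(0, [b"boot"]), (2, [b"ab", b"cd"])])
with open(demo_path, "rb") as f:
    image = f.read()
print(len(image))   # 1028: two full blocks plus the four bytes "abcd"
```

Because every blob carries its own offset, the order of the list does not matter; only the highest offset determines the final size of the file.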

The filesystem consists of the following areas:

  • the Master Boot Record (on a floppy this is simply the boot sector, block 0)
  • a File Allocation Table
  • directory entries for each file
  • and, of course, the data comprising each file's contents.

There are only a few details to watch out for:

  • the FAT occurs twice, to provide a redundant spare
  • we also arrange to write a zero block in the final sector, forcing the image to the desired size.
<<image description>>=
[( 0,[open(param('-b',"mbr.raw"),"rb").read()]),
 ( 1,fat),
 (19,[dirent(*e) for e in zip(files,offs,lens)])] +
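[(31+o,[c]) for o,c in zip(offs,contents)] +
[(10,fat),                              # assumed: spare FAT copy at block 10 (9 blocks per FAT)
 (int(param('-s',2880))-1,['\0'*512])]  # assumed: zero block in the last sector (2880 = 1.44 MB)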

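The block numbers 1, 19 and 31 in the description above are not arbitrary. Assuming the standard 1.44 MB floppy geometry (one reserved sector, two FATs of 9 sectors each, 224 root directory entries of 32 bytes), they can be rederived with the following sketch; all constant and function names are illustrative.

```python
SECTOR = 512            # bytes per sector
RESERVED = 1            # the boot sector
NUM_FATS = 2            # primary FAT plus the spare copy
SECTORS_PER_FAT = 9
ROOT_ENTRIES = 224      # root directory slots
DIRENT_SIZE = 32        # bytes per directory entry
TOTAL_SECTORS = 2880    # 1.44 MB

fat1_start = RESERVED                                  # first FAT
fat2_start = fat1_start + SECTORS_PER_FAT              # spare FAT
root_start = RESERVED + NUM_FATS * SECTORS_PER_FAT     # root directory
root_sectors = ROOT_ENTRIES * DIRENT_SIZE // SECTOR    # its length in sectors
data_start = root_start + root_sectors                 # first data sector

def cluster_to_sector(cluster):
    """Data clusters are numbered from 2, so cluster 2 is the first data sector."""
    return data_start + (cluster - 2)

print(fat1_start, fat2_start, root_start, data_start)  # 1 10 19 33
print(cluster_to_sector(2))                            # 33, i.e. 31 + 2
```

Under these assumptions the data area starts at sector 33, and since the first data block carries the number 2, a block with offset o lives at sector 31+o.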
directory entries

Creating the directory entries is messy, but straightforward, especially because we ignore all metadata except for the file name, offset of its first block, and the file length.

dosfn	=lambda f: "%-8.8s%-3.3s" % tuple((f.upper()+'.').split('.'))[:2]
dirent  =lambda f,o,l: pack('<11s15xHI',dosfn(basename(f)),o,l)
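
To make the 32-byte layout easier to see, here is a hedged Python 3 sketch of the same two helpers, followed by a round trip through unpack. The names dos_name and dir_entry are illustrative, and the sketch assumes plain ASCII file names; the 15 pad bytes cover the attribute, timestamp and other metadata fields that we leave zeroed.

```python
from struct import pack, unpack, calcsize
from os.path import basename

ENTRY_FORMAT = "<11s15xHI"   # 8.3 name, 15 zeroed bytes, first block, length

def dos_name(path):
    """Upper-case 8.3 name, space padded, without the dot."""
    parts = (basename(path).upper() + ".").split(".")[:2]
    return ("%-8.8s%-3.3s" % tuple(parts)).encode("ascii")

def dir_entry(path, first_block, length):
    """Pack one root directory entry (little-endian)."""
    return pack(ENTRY_FORMAT, dos_name(path), first_block, length)

entry = dir_entry("docs/readme.txt", 2, 1000)
print(calcsize(ENTRY_FORMAT))        # 32
print(unpack(ENTRY_FORMAT, entry))   # (b'README  TXT', 2, 1000)
```

Names longer than 8.3 are silently truncated, so dos_name("verylongname.html") yields b"VERYLONGHTM"; two files that truncate to the same name would collide.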

file data blocks

Deciding where to place the file contents is also easy. We don't have to deal with an arbitrary file system state, but can instead choose an easy state: the files are laid out one after the other contiguously on disk.

Because files are block-aligned, we round up the file sizes to the number of blocks taken by the file, then accumulate the number of blocks taken up by the previous files to determine each file's starting offset.

Because FAT entries 0 and 1 are reserved and do not describe data blocks, we number the data area starting at block offset 2 instead of 0.

<<calculate offsets>>=
blocks  =lambda n,b: n//b + (n%b>0)     # blocks needed for n bytes, rounded up
accum   =lambda vs,v0: reduce(lambda l,r: l + [r+l[-1]],vs,[v0])    # running totals from v0
offs    = accum((blocks(l,512) for l in lens),2)

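A small worked example may help here. The following Python 3 sketch mirrors the rounding and accumulation for three files of 100, 512 and 513 bytes; the function names and the sample lengths are assumptions made for illustration.

```python
from itertools import accumulate

SECTOR = 512

def blocks(nbytes, block_size=SECTOR):
    """Number of blocks needed for nbytes, rounded up."""
    return nbytes // block_size + (nbytes % block_size > 0)

def start_offsets(lengths, first=2):
    """Start offset of each file, plus one final entry marking the end of the data."""
    counts = [blocks(n) for n in lengths]
    return list(accumulate([first] + counts))

offs = start_offsets([100, 512, 513])
print(offs)   # [2, 3, 4, 6]
```

The result has one more element than there are files: the last value is the first free block, which is what the FAT encoding uses as its upper bound.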
the File Allocation Table

Finally, encoding the FAT — for the contiguous case — is trivial. The FAT is basically a disk-based linked list; the successor of each block within a file is the next block, so range(1,ω) provides almost what we want. The only problem would be that the final block of each file would then point to the start offset of the next one. By changing each occurrence of a start offset to the EOF flag (0xfff) we correctly mark the end of each file and preserve the property that the start of each file shouldn't have a predecessor.
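
As a sketch of this substitution, assume three files starting at offsets 2, 3 and 4, with the data ending at offset 6 (so the last file occupies blocks 4 and 5). The helper name fat_entries is illustrative and the code is written for Python 3.

```python
EOF = 0xFFF   # end-of-file marker for FAT12

def fat_entries(offs):
    """Entry i holds i+1 (its successor), unless i+1 starts a file or ends the data."""
    starts = set(offs)
    return [EOF if b in starts else b for b in range(1, offs[-1] + 1)]

entries = fat_entries([2, 3, 4, 6])
print(entries)   # [1, 4095, 4095, 4095, 5, 4095]
```

Reading the list by index: entries 0 and 1 are the reserved slots, entries 2 and 3 are single-block files that end immediately, entry 4 points at block 5, and entry 5 ends the last file.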

The sole difficulty here is that FAT12 is a 12-bit encoding, meaning that each entry is 3 nybbles long and hence not byte-aligned. We do the obvious thing, first expanding each FAT entry into its 3 component nybbles then using n2bytes to reduce each pair of nybbles to a byte value.

<<encode FAT>>=
nybbles =lambda vs: sum(((v&0xf,(v>>4)&0xf,(v>>8)&0xf) for v in vs),())
n2bytes =lambda ns: [pack('B',o*0x10+e) for e,o in zip(ns[0::2],ns[1::2])]
fat     = n2bytes(nybbles((b in offs) and 0xfff or b
                          for b in range(1,offs[-1]+1)))

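The nybble shuffling is easiest to check on two known entries. This Python 3 sketch follows the same two steps; pack_fat12 is an illustrative name, and padding an odd number of nybbles with a zero is an assumption of the sketch (zip in the chunk above drops the unpaired nybble instead).

```python
def nybbles(values):
    """Split each 12-bit value into three 4-bit pieces, lowest first."""
    out = []
    for v in values:
        out += [v & 0xF, (v >> 4) & 0xF, (v >> 8) & 0xF]
    return out

def pack_fat12(values):
    """Pair up nybbles: the even one is the low half of each byte, the odd one the high half."""
    ns = nybbles(values)
    if len(ns) % 2:
        ns.append(0)              # assumption: pad instead of dropping the last nybble
    return bytes(hi * 16 + lo for lo, hi in zip(ns[0::2], ns[1::2]))

print(pack_fat12([0x123, 0xABC]).hex())        # 23c1ab
print(pack_fat12([1, 0xFFF, 0xFFF]).hex())     # 01f0ffff0f
```

Two entries always fill exactly three bytes, so the middle byte carries the high nybble of the first entry and the low nybble of the second.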

wrapping up

Finally, we get the file contents and lengths in the obvious way...

<<get file data>>=
contents= [open(f,"rb").read() for f in files]
lens    = [len(c) for c in contents]

... and provide a simple command-line wrapper.

from struct import pack
from os.path import basename

<<writing image>>
def mkfat12(opts,files):
    if '-h' in opts:
        print("Usage: [-b MBR] [-o output] [-s sector count] files...")
        return

    param   = opts.get
    <<get file data>>
    <<calculate offsets>>
    <<encode FAT>>
    mkimg(param('-o',"a.raw"),
          <<image description>>)

if __name__ == '__main__':
    import sys, getopt

    os,files = getopt.getopt(sys.argv[1:],"b:o:s:h")
    mkfat12(dict(os),files)     # getopt returns (option, value) pairs

testing

You should provide mbr.raw or use the -b option to specify a 512-byte Master Boot Record. This can be copied from an existing floppy with something along the lines of:

dd if=/dev/fd0 of=mbr.raw bs=512 count=1

The resulting output file (defaults to a.raw as a 1.44 MB floppy image) can be set as a virtual device for a PC emulator such as Q or Bochs, written to a physical floppy disk with dd(1), or even placed on a CD-ROM (depending upon burner software) as a bootstrap image.
