bun - bun file format


A bun file is a cdb file. cdb provides indexed and sequential access methods to records; refer to the cdb documentation for details of how the database is organized on disk.

The bun file format assumes ASCII throughout.

Index record

A valid bun file contains an index record with a null key (the key of length 0). The data associated with the index is the concatenation of the pathnames that appear in the bun file, each encoded as a netstring.

Here is an example of a complete index, in cdbmake format.

    +0,19:->3:foo,3:bar,4:quux,

This index contains three pathnames: foo, bar, and quux.
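As a minimal sketch of how an index record for these three pathnames could be produced, the following Python builds the netstring concatenation and wraps it in a cdbmake input line. The helper names (netstring, cdbmake_line, index_record) are illustrative only and are not part of bun; the +klen,dlen:key->data layout is the cdbmake input format.

```python
def netstring(data: bytes) -> bytes:
    """Encode data as a netstring: decimal length, colon, data, comma."""
    return str(len(data)).encode("ascii") + b":" + data + b","


def cdbmake_line(key: bytes, data: bytes) -> bytes:
    """Format one record as a cdbmake input line: +klen,dlen:key->data."""
    return b"+%d,%d:%s->%s\n" % (len(key), len(data), key, data)


def index_record(pathnames):
    """Build the index record: null key, data is every pathname as a netstring."""
    data = b"".join(netstring(p.encode("ascii")) for p in pathnames)
    return cdbmake_line(b"", data)


print(index_record(["foo", "bar", "quux"]).decode("ascii"), end="")
# prints: +0,19:->3:foo,3:bar,4:quux,
```

The data length of 19 is the sum of the three netstrings: 6 bytes each for foo and bar, and 7 bytes for quux.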

Head records

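A valid bun file contains a head record for each pathname. The head key is H (the capital letter aitch), followed by the pathname. The head data is a concatenation of fields that give the item's file type character, its reference number, and a (possibly empty) string of metadata characters.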

The file type character is one of the following.

_ (underscore)
regular file
(hard) link
symbolic link
block device
character device

Here are some example head records.


The first says that foo is a regular file; its reference number is 0, and there are no metadata records associated with it. bar is also a regular file, its reference number is 1, and it has two metadata records associated with it. quux is a directory; its reference number is 2, and there are four metadata records associated with it.

Content records

All file types, except directory and pipe, require a content record. The content key is D (the capital letter dee), followed by the relevant reference number.

For a file, the content data is the (possibly compressed) contents of the file.

For a link or symbolic link, the content data is the name of the file linked to.

For a device, whether character or block, the content data is the value of st_rdev stored as 8 bytes, big-endian.
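The device encoding can be sketched in Python with the struct module. This assumes st_rdev is treated as an unsigned 64-bit integer, which the text does not state explicitly; the function names and the sample device number are hypothetical.

```python
import struct


def pack_rdev(rdev: int) -> bytes:
    """Encode a device number as 8 bytes, big-endian (assumed unsigned)."""
    return struct.pack(">Q", rdev)


def unpack_rdev(data: bytes) -> int:
    """Decode the 8-byte big-endian content data of a device record."""
    if len(data) != 8:
        raise ValueError("device content data must be exactly 8 bytes")
    (rdev,) = struct.unpack(">Q", data)
    return rdev


sample = 0x0801  # hypothetical st_rdev value, for illustration only
packed = pack_rdev(sample)
print(packed.hex())  # prints: 0000000000000801
```

Using a fixed width and byte order keeps the record independent of the size and endianness of dev_t on the machine that wrote the file.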

Here is a trivial content record; the item with reference number 0 (the regular file foo in this example) contains the words bar and baz on separate lines.


Metadata records

Each head record contains a (possibly empty) string of metadata characters. For each character, there is a corresponding metadata record: the key is the metadata character followed by the reference number.
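The key scheme for head, content, and metadata records can be sketched as follows. This is an illustration under one loud assumption: the reference number is written in ASCII decimal, which is plausible given that the format assumes ASCII throughout but is not confirmed by the text. The function names and the sample metadata string "OP" are hypothetical.

```python
def ref_bytes(ref: int) -> bytes:
    """ASSUMPTION: reference numbers appear in keys as ASCII decimal."""
    return str(ref).encode("ascii")


def head_key(pathname: str) -> bytes:
    """Head key: H followed by the pathname."""
    return b"H" + pathname.encode("ascii")


def content_key(ref: int) -> bytes:
    """Content key: D followed by the reference number."""
    return b"D" + ref_bytes(ref)


def metadata_keys(meta_chars: str, ref: int):
    """One key per metadata character: the character, then the reference number."""
    return [c.encode("ascii") + ref_bytes(ref) for c in meta_chars]


print(head_key("quux"))          # prints: b'Hquux'
print(content_key(0))            # prints: b'D0'
print(metadata_keys("OP", 2))    # prints: [b'O2', b'P2']
```

A reader would look up the head record by pathname first, then use the reference number and metadata characters found there to fetch the remaining records.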

The following metadata characters are defined.

Time of the item's last access, in external TAI64NA format.
G
Global file permissions: file permission codes to use in the absence of specific permissions.
Time of the item's last modification, in external TAI64NA format.
O (capital letter oh)
File ownership: concatenation, in netstring format, of the names or IDs of all the owners of the file, and any names that have permissions associated with them. Each begins with a letter showing what sort of name it is: U is a user name; u a numeric user ID; G is a group name; g a numeric group ID; and O is other users.
P
File permissions: concatenation, in netstring format, of the file permission codes for each owner.
Uncompression algorithm.
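For the two timestamp records, the external TAI64NA format is 16 bytes: an 8-byte big-endian TAI64 label (2^62 plus the number of seconds since the start of 1970 TAI), followed by 4 bytes of nanoseconds and 4 bytes of attoseconds, both big-endian. The sketch below packs and unpacks that layout. It takes TAI seconds as input and deliberately does not convert from UTC, so it ignores the leap-second offset; the function names are illustrative.

```python
import struct

TAI64_EPOCH = 1 << 62  # label of the second beginning 1970-01-01 00:00:00 TAI


def pack_tai64na(seconds: int, nanoseconds: int = 0, attoseconds: int = 0) -> bytes:
    """Encode a TAI time as 16 bytes of external TAI64NA."""
    return struct.pack(">QII", TAI64_EPOCH + seconds, nanoseconds, attoseconds)


def unpack_tai64na(data: bytes):
    """Decode 16 bytes of external TAI64NA into (seconds, nanoseconds, attoseconds)."""
    if len(data) != 16:
        raise ValueError("external TAI64NA is exactly 16 bytes")
    label, nano, atto = struct.unpack(">QII", data)
    return label - TAI64_EPOCH, nano, atto


print(pack_tai64na(0).hex())
# prints: 40000000000000000000000000000000
```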

Here are some example metadata records.


These say that file bar can be uncompressed with gunzip; directory quux is owned by tjg, has a numeric group ID of 312, and the permissions are rwxr-xr-x.
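Reading an ownership record means splitting its data into netstrings and classifying each entry by its leading letter. The following is a minimal sketch of that step. The sample data is hypothetical: it is modeled on the owner tjg and group ID 312 described above, not copied from an actual record, and the function names are not part of bun.

```python
OWNER_KINDS = {
    "U": "user name",
    "u": "numeric user ID",
    "G": "group name",
    "g": "numeric group ID",
    "O": "other users",
}


def split_netstrings(data: bytes):
    """Split a concatenation of netstrings into a list of payloads."""
    items = []
    pos = 0
    while pos < len(data):
        colon = data.index(b":", pos)      # raises ValueError if no colon
        length = int(data[pos:colon])
        start = colon + 1
        end = start + length
        if data[end:end + 1] != b",":
            raise ValueError("netstring is missing its trailing comma")
        items.append(data[start:end])
        pos = end + 1
    return items


def parse_owners(data: bytes):
    """Return (kind, name) pairs for each entry of an ownership record."""
    owners = []
    for item in split_netstrings(data):
        text = item.decode("ascii")
        kind = text[:1]
        if kind not in OWNER_KINDS:
            raise ValueError("unknown owner type: %r" % kind)
        owners.append((OWNER_KINDS[kind], text[1:]))
    return owners


print(parse_owners(b"4:Utjg,4:g312,1:O,"))
# prints: [('user name', 'tjg'), ('numeric group ID', '312'), ('other users', '')]
```

The same split_netstrings helper would apply to the index record and to the per-owner file permissions record, since both are netstring concatenations.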

File permission codes

The following file permission codes are defined for use in G and P metadata records.

G only; directory only. BSD group semantics apply. Corresponds to the meaning in Unix of the setgid bit on a directory.
P only, and meaningless unless X is also set. When executed, the process acquires the owner's credentials. Corresponds to the usual meaning of Unix's setuid (for a U or u owner) and setgid (for a G or g owner) bits.
G only. Mandatory file locking applies.
G only; directory only. Restricted write permission: users may only rename and delete files they own within this directory. Corresponds to the meaning of Unix's directory sticky bit.
Read access.
Directory only. Search access.
G only, and meaningless unless X is also set. The program's text segment should remain in VM. Corresponds to the usual meaning of Unix's sticky bit.
Write access.
Execute access.


Tim Goodwin; cdb by Dan Bernstein.