NAME

bun - bundle many files into a single file

SYNOPSIS

bun command [ -flag ... ] bundle [ pathname ... ]

bun -h

bun -v

DESCRIPTION

The bun program is used to bundle many files, directories, symbolic links, etc. into a single bundle file. bun can also be used to extract items from a previously created bundle.

bun -h writes a usage summary to standard output and exits; bun -v reports the version number of bun to standard output, and exits. Otherwise, bun must be invoked with at least two command line arguments: the command specifies which operation to perform (creation, extraction, or table of contents), and the bundle names the file which is to be operated on. bundle must be a seekable file (a normal file in the file system; not a pipe, tape device, etc.).

Commands

The following commands are available.

c
create bundle. If one or more pathnames are specified, they are each added to bundle. If no pathnames are specified, bun reads them from standard input, terminated by newlines.
x
extract items from bundle. If one or more pathnames are specified, they are each extracted from bundle. If no pathnames are specified, every item contained in bundle is extracted.

In addition to the items stored in bundle, bun creates directories as needed.

t
table of contents. As for the x command, if one or more pathnames are specified, they are each printed to standard output, followed by a newline. If no pathnames are specified, every pathname contained in bundle is printed.
z
create compressed. The z command is equivalent to the c command, except that each regular file is a candidate for compression.

Flags

Command line flags may be concatenated with the command, or they may appear as separate arguments introduced by - (a hyphen). A hyphen is also permitted before the command. Thus, the following are equivalent.

    bun clv mybun.bun
    bun -c -l -v mybun.bun
    bun c -lv mybun.bun

The argument -- (two hyphens) indicates the end of flag processing. It is only necessary if bundle begins, or may begin, with a hyphen.

    bun x -- "$BUN"

Two flags specify which meta-information is to be stored or retrieved.

d
date: file modification and access times. A bundle is capable of storing timestamps with attosecond precision. bun uses as much precision as the underlying filesystem offers: normally nanosecond, microsecond, or second precision.
u
user: file ownership and permission information (including group ownership).

These flags both work in the same way. If the flag is specified for bundle creation (that is, with the c or z commands), that meta-information is stored in the bundle.

During extraction, bun by default retrieves all meta-information present in bundle. However, if the d or u flag is specified on the command line, the corresponding meta-information will be ignored. (It is not an error to ignore meta-information that is not present!)

Likewise, when listing the verbose table of contents (the tv command), specifying d or u suppresses the corresponding meta-information.

Although a little unusual, this scheme reflects common usage. Most bundles do not need to store date meta-information: the default is not to store it. But if a bundle includes date meta-information, it must be there for a reason: the default is to use it.

Other flags which can be specified when creating a bundle are as follows.

f
flat mode. By default, when a pathname (either mentioned on the command line, or read from standard input) names a directory, the directory and its contents are stored in bundle, recursively. If the f flag is in effect, just the directory is stored.
i (lower case letter aye)
use numeric identifiers for user information. This flag only makes sense in combination with u; it instructs bun to store owner information using your system's numeric identifiers, instead of names.
l (lower case letter ell)
look for hard links. By default, bun doesn't make the extra effort to detect hard links when creating bundle. If the l flag is in effect, hard links will be detected: the contents of a multiply-linked file will only be stored once in bundle; each subsequent occurrence will be stored as a link. When extracting from a bundle containing hard links, bun always re-creates the links.
s
follow symbolic links. By default, bun stores symbolic links as such in bundle. If the s flag is in effect, the target of the link is stored instead. If this flag is present, and the part of the file system being stored contains a loop (a symbolic link pointing to one of its own parent directories), bun will loop till it exhausts some system resource (usually the limit on open file descriptors).

Other flags which can be specified when extracting are as follows.

a
use absolute pathnames. During bundle creation, bun always stores pathnames exactly as they were specified. By default, bun removes a leading / from absolute pathnames during extraction: this provides flexibility, and some degree of protection from mistakes. If the a flag is present, all files will be extracted to the stored pathnames, even if they are absolute.
o (lower case letter oh)
copy regular files to standard output; anything which is not a regular file is silently skipped.

Flags which can be specified for any operation are as follows.

n
write pathnames to standard output after they are added to, or extracted from, the bundle.
q
quicker, but safe only for quiescent trees. By default, bun x extracts files (everything except directories) to a temporary name, synchronizes the file (that is, ensures that the file is physically on the disk), then atomically renames the new file. If the q flag is present, files are extracted in place (overwritten) and not synchronized. This is considerably faster, but voids bun's reliability guarantee.

Similarly, bun c creates bundle with a temporary name, and, if there are no errors, synchronizes it and renames it into place. bun cq writes bundle in place, which is slightly quicker (the difference is not as significant as when extracting). Additionally, if a pathname cannot be inserted (perhaps because it is unreadable), bun cq will create bundle regardless; bun c will not.

bun tq is permitted, but the q flag has no effect.

v
write verbose information about each file to standard output; see below for details.
0 (digit zero)
use NUL terminated pathnames. When reading pathnames from standard input, or writing them to standard output, by default they are terminated by the LF character. If the 0 flag is in effect, they are terminated by NUL instead, permitting pathnames which contain LF to appear in bundle. Note that in the command bun cn0 foo.bun (with no pathnames specified on the command line), the 0 flag causes both the input and the output to be NUL terminated.

Verbose table of contents

When the v flag is present, verbose information about each file is written to standard output. For example:

    $ bun zudv bar.bun /bin/su
    /bin/su file 14124 G:RWX M:1999-08-18T02:31:25.000Z A:2000-07-11T03:05:37.000Z P:Uroot(IRWX),Gsystem(RX),O(RX) Z:gunzip

Each line of output has at least the following fields, separated by spaces.

Name
The pathname of the item, as stored in the bundle.
Type
One of file, directory, pipe, block-special, or character-special.
Size
The size of the data associated with this item, in bytes. This will be 0 for directories and pipes. For compressed files, it is the size of the compressed data.

These may be followed by several optional fields, each of which is introduced by a capital letter followed by a colon.

A
The access time of the item. Times are expressed in UTC, to millisecond precision.
G
Global permissions; see below.
M
The modification time of the item.
P
Specific owners and permissions associated with the item; see below.
Z
The uncompression algorithm, if the item is compressed.
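
Because each optional field carries its own tag, a script can pick one field out of a verbose line without knowing which others are present. The following is a minimal sketch and not part of bun: bun_field is a hypothetical helper name, and it assumes that the pathname contains no spaces, so that the name, type, and size are exactly the first three words of the line.

```shell
# bun_field TAG LINE
# Print the value of the optional field TAG (for example M or Z) from one
# line of verbose output; return 1 if the field is absent.
# Hypothetical helper; assumes the pathname in LINE contains no spaces.
bun_field() {
    tag=$1
    set -f                      # no globbing while the line is split
    set -- $2                   # one positional parameter per word
    set +f
    [ $# -ge 3 ] || return 1
    shift 3                     # skip the name, type, and size fields
    for word in "$@"; do
        case $word in
            "$tag":*) printf '%s\n' "${word#*:}"; return 0 ;;
        esac
    done
    return 1
}
```

Applied to the /bin/su line shown earlier, bun_field Z prints gunzip, and bun_field M prints the modification time.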

Compression

With the z command, each regular file encountered is a candidate for compression. Compression is controlled by three environment variables.

BUN_ZIP (default gzip)
The command to use for compression: it must be found in PATH (or be an absolute pathname); it must read from standard input and write compressed data to standard output.

If BUN_ZIP is set, BUN_UNZIP must be set, too.

BUN_UNZIP (default gunzip)
The name of the uncompression algorithm. This is stored in bundle; during extraction, it is interpreted as the filter command to run. It should be an unadorned name: recommended values are gunzip or bunzip2.
BUN_ZIP_MIN (default 188)
The minimum file size to compress. Files smaller than this will not be compressed.

Note that all these variables, including BUN_UNZIP, are only relevant during bundle creation. The uncompression command to use is stored in bundle itself; changing BUN_UNZIP has no effect on extraction.
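
Since BUN_ZIP and BUN_UNZIP must form a matching pair of filters, it can be worth checking a candidate pair before creating a bundle with it. The sketch below is independent of bun: check_filter_pair is a hypothetical helper that pipes a short sample through both commands and confirms that the sample survives the round trip.

```shell
# check_filter_pair ZIP UNZIP
# Succeed only if ZIP and UNZIP both act as filters (standard input to
# standard output) and UNZIP undoes ZIP. Hypothetical helper, not part of bun.
check_filter_pair() {
    sample='a short sample, repeated: a short sample, repeated'
    result=$(printf '%s' "$sample" | "$1" | "$2") || return 1
    [ "$result" = "$sample" ]
}

# Possible use, with bzip2 in place of the default gzip:
#   check_filter_pair bzip2 bunzip2 &&
#       BUN_ZIP=bzip2 BUN_UNZIP=bunzip2 bun z foo.bun foo
```

The check says nothing about whether the uncompression command will be available on the machine where the bundle is eventually extracted.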

File permissions

bun -cu stores user information as names, not numeric UIDs, GIDs, etc. If the v flag is in effect, user and permission information is introduced by P:, which is followed by a comma separated list. Each element of the list consists of a letter indicating what sort of user this is, an optional name, and a parenthesized list (possibly empty) of the permissions associated with that user.

U
The user that owns the item.
u
The numeric ID of the user that owns the item.
G
The group that owns the item.
g
The numeric ID of the group that owns the item.
O
Other, which must have an empty name: the permissions granted to all users not otherwise named.
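
The list format described above is regular enough to take apart with ordinary shell parameter expansion. This is a minimal sketch under stated assumptions, not part of bun: bun_perm_list is a hypothetical helper, and it assumes that user and group names contain no commas or parentheses.

```shell
# bun_perm_list LIST
# Split a P: list such as Uroot(IRWX),Gsystem(RX),O(RX) into one line per
# element: the kind letter, the name (or - if empty), and the permission
# codes (or - if empty). Hypothetical helper, not part of bun.
# The body runs in a subshell, so the IFS and set -f changes do not leak.
bun_perm_list() (
    IFS=,
    set -f
    set -- ${1#P:}                  # drop an optional P: tag, split at commas
    for elem in "$@"; do
        rest=${elem#?}              # everything after the kind letter
        kind=${elem%"$rest"}        # the kind letter itself
        name=${rest%%"("*}          # text before the opening parenthesis
        perms=${rest#*"("}          # text after it ...
        perms=${perms%")"}          # ... minus the closing parenthesis
        printf '%s %s %s\n' "$kind" "${name:--}" "${perms:--}"
    done
)
```

For the list Uroot(IRWX),Gsystem(RX),O(RX) this prints three lines: U root IRWX, G system RX, and O - RX.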

In addition to the full ownership and permissions information which is stored with bun -cu, bun also stores global permissions information, whether the u flag was specified or not. If the v flag is in effect, global permissions are introduced by G:. The global permissions are the union of all the specific read, write, search, and execute permissions: if some user can execute a file, it will have eXecute global permissions. To save some space, the most common global permissions are not stored: RWS is the default for a directory; RW for any other type.

During bundle extraction, the global permissions are used if there is no other permissions data, or if the u flag is in effect. Global permissions are modified by the process's umask before being applied.

The following permission codes may be reported if the v flag is in effect.

R
Read access.
S
Directory only. Search access.
W
Write access.
X
Execute access.
B
G only; directory only. BSD group semantics apply. Corresponds to the meaning in Unix of the setgid bit on a directory.
I
P only, and meaningless unless X is also set. When executed, the process acquires the owner's credentials. Corresponds to the usual meaning of Unix's setuid bit (if this is a U owner) or setgid bit (if this is a G owner).
M
G only. Mandatory file locking applies.
P
G only; directory only. Restricted write permission: users may only rename and delete files they own within this directory. Corresponds to the meaning of Unix's directory sticky bit.
T
G only, and meaningless unless X is also set. The program's text segment should remain in VM. Corresponds to the usual meaning of Unix's sticky bit.

EXAMPLES

Specifying pathnames for creation

These two examples are equivalent: they demonstrate the two ways of supplying pathnames to the create command.

    bun c foo.bun foo1.c foo2.c

    { echo foo1.c; echo foo2.c; } | bun c foo.bun

There is little point in using the second form with a fixed list of pathnames; it is intended to be combined with tools such as find(1).

    find foo -name '*.c' | bun c foo.bun

Specifying pathnames for table of contents

Specifying a pathname in combination with the t command is a quick way to check if a bundle contains that pathname.

    $ bun t foo.bun foo1.c foo3.c
    foo1.c
    bun: cannot look up `foo3.c': head record missing

The exit status of bun will be zero only if every pathname specified was found in bundle.
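
In a script, the exit status alone is usually what matters. A minimal sketch of a wrapper follows; bun_has is a hypothetical name, and the only bun features it relies on are the t command, the -- argument, and the exit status described above.

```shell
# bun_has BUNDLE PATHNAME...
# Succeed only if every PATHNAME is present in BUNDLE. Hypothetical wrapper
# around the t command; the listing and any diagnostics are discarded.
bun_has() {
    bundle=$1
    shift
    # Without pathnames, bun t would list the whole bundle instead.
    [ $# -gt 0 ] || return 2
    bun t -- "$bundle" "$@" >/dev/null 2>&1
}

# Possible use:
#   if bun_has foo.bun foo1.c foo2.c; then echo "both present"; fi
```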

Date and user meta-information

This command creates a bundle that includes date meta-information for each file.

    bun cd mybun.bun *

The usual command will extract all files from mybun.bun, and (attempt to) set each file's date meta-information to the stored values.

    bun x mybun.bun

But if the d flag is specified during extraction,

    bun xd mybun.bun

the date meta-information will be ignored, just as if the d flag had not been specified for the creation of mybun.bun.

Users and permissions

Here is a motley collection of (empty) files, showing a variety of owners and permissions.

    $ ls -l
    total 0
    -r-xr-xr-t   1 ftp      adm             0 Jul 11 11:33 01555
    -rwxrwsr-x   1 bin      mem             0 Jul 11 11:33 02775
    -r--------   1 tjg      xra             0 Jul 11 11:32 0400
    -r--r--rw-   1 nobody   nobody          0 Jul 11 11:32 0446
    -rwsr-xr-x   1 root     system          0 Jul 11 11:32 04755
    -rwxr-xr-x   1 qmailq   qmail           0 Jul 11 11:32 0755

    $ bun cuv bar.bun *
    01555 file 0 G:RTX P:Uftp(RX),Gadm(RX),O(RX)
    02775 file 0 G:RWX P:Ubin(RWX),Gmem(IRWX),O(RX)
    0400 file 0 G:R P:Utjg(R),Gxra(),O()
    0446 file 0 P:Unobody(R),Gnobody(R),O(RW)
    04755 file 0 G:RWX P:Uroot(IRWX),Gsystem(RX),O(RX)
    0755 file 0 G:RWX P:Uqmailq(RWX),Gqmail(RX),O(RX)

Global permissions

Here is the bun tv listing for a particular bundle. Each file's name is descriptive of its permissions.

    read-execute file 4 G:RX
    read-only file 4 G:R
    read-search directory 0 G:RS
    read-write file 4
    read-write-search directory 0
    write-execute file 4 G:RWX

bun has not stored any global permissions for the file read-write, nor the directory read-write-search, as these are the default.

If a umask of 027 were in effect, unpacking the above bundle would produce the following file permissions.

    -r-xr-x--- [...] read-execute
    -r--r----- [...] read-only
    dr-xr-x--- [...] read-search
    -rw-r----- [...] read-write
    drwxr-x--- [...] read-write-search
    -rwxr-x--- [...] write-execute
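
The listing above is consistent with a simple rule: grant the global permissions to user, group, and other alike, then clear the bits named by the umask. The sketch below reproduces that arithmetic. It is an illustration only, not bun's code: global_mode is a hypothetical helper, it ignores the special codes B, I, M, P, and T, and the caller must supply the default (RW or RWS) when no G: field is listed.

```shell
# global_mode PERMS UMASK
# Print the octal mode obtained by granting the global permissions PERMS
# (letters R, W, X, S) to user, group, and other alike, and then clearing
# the bits set in UMASK (given in octal). Hypothetical helper, not part of
# bun; the special codes B, I, M, P, and T are ignored.
global_mode() {
    bits=0
    case $1 in *R*) bits=$((bits | 4)) ;; esac
    case $1 in *W*) bits=$((bits | 2)) ;; esac
    case $1 in *[XS]*) bits=$((bits | 1)) ;; esac
    # Multiplying by octal 0111 copies the three bits into the user, group,
    # and other positions; the leading 0 makes the umask an octal constant.
    printf '%04o\n' $(( (bits * 0111) & ~0$2 ))
}
```

For instance, global_mode RX 027 prints 0550, matching the read-execute entry, and global_mode RW 027 prints 0640, matching read-write.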

The 0 flag

The 0 flag works well in combination with a find(1) that supports -print0, and xargs(1) that supports -0.

    find foo -type f -print0 | bun c0 foo.bun

    bun t0 foo.bun | xargs -0 rm -f

It is only necessary if you expect pathnames which may contain the LF character: since bun does not split input at any other character, it can handle pathnames that contain spaces even without the 0 flag.

The f flag

The f flag is particularly useful when using find(1) to select files for inclusion in bundle.

    find foo -name RCS -prune -o -print | bun cf foo.bun

The o flag

The o flag is a quick way to examine a file in a bundle, without having to extract anything.

    bun xo pkg-1.1.bun pkg-1.1/README | more

It can be combined with bun -t to loop over every file in a bundle.

    bun t foo.bun | while IFS= read -r i; do
        echo "$i"
        bun xo foo.bun "$i" | wc
    done

The i flag

The i flag is useful if you need to bundle up a tree which contains user or group IDs that cannot be looked up; for example, if they belong to an account which has been deleted.

    $ bun cu gone.bun /export/home/gone
    bun: cannot insert `/export/home/gone': cannot find name for uid `123': file does not exist
    bun: warning: `gone.bun' not created due to earlier errors

    $ bun cuiv gone.bun /export/home/gone
    /export/home/gone directory 0 P:u123(RSW),g123(RS),O(RS)

A side note: if you're wondering what ``file does not exist'' means here, that is the error reported by the getpwuid(3) call. bun reports this error back to you, even though most flavours of Unix don't return sensible error information from getpwuid(3) and friends.

RETURN VALUE

bun exits with status 0 if it was able to do everything it was asked to. If not, suitable diagnostics are printed, and the exit status is 1.

RELIABILITY GUARANTEE

The bun reliability guarantee is a design goal, and depends on both bun and your operating system being free of bugs. No legal warranty is expressed or implied.

bun promises that, at all times, the files it creates will either i) not exist with their final names; or ii) be complete. The promise applies even if the operating system halts in the middle of a bun operation. The promise applies both to the single bundle created with bun c, and the (potentially) many files created with bun x.

The reliability guarantee does not apply if the q flag is present: the q flag should only be used for quiescent trees.

For example, if you are unpacking the source of a package which is distributed as a bundle, you are presumably creating a completely new directory tree, and using the q flag is perfectly sensible. If you are unpacking a bundle which contains /etc/passwd, you definitely want the default, reliable mode of operation.
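
The default mode rests on a familiar pattern: write to a temporary name, synchronize, then rename. The sketch below shows that pattern in shell, purely as an illustration of the general technique; atomic_write is a hypothetical helper and makes no claim about how bun itself is implemented. It assumes that mktemp(1) and sync(1) are available.

```shell
# atomic_write DEST
# Copy standard input to DEST by way of a temporary file, so that DEST is
# never seen half written. Illustration of the general technique only;
# hypothetical helper, not bun's actual implementation.
atomic_write() {
    dest=$1
    # The temporary file lives beside DEST, so the rename stays within one
    # file system.
    tmp=$(mktemp "$dest.XXXXXX") || return 1
    # sync flushes all pending writes, which is coarser than synchronizing
    # the one file but serves the same purpose here.
    if cat >"$tmp" && sync && mv -f "$tmp" "$dest"; then
        return 0
    fi
    rm -f "$tmp"
    return 1
}

# Possible use:
#   generate_config | atomic_write /tmp/example.conf
```

Note that the new file takes the restrictive permissions of the temporary file created by mktemp(1); a fuller version would set the intended mode before the rename.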

SEE ALSO

bun(5).

BUGS

There should be a way of specifying flags (particularly d, u, and z) on a per-pathname basis.

RESTRICTIONS

A bundle cannot exceed 4GB.

It is theoretically possible to confuse bun's detection of hard links (the l flag), by deleting and creating files within the file tree being bundled.

It is not meaningful to attempt to store a (Unix domain) socket.

Files of type block-special and character-special are inherently unportable.

AUTHOR

Tim Goodwin; library code from Dan Bernstein.