Wasabi Journaling Filesystem

WasabiJFS is an optional extension to Wasabi Certified BSD that uses Write-Ahead Physical Block Logging (WAPBL) to speed up recovery of the Unix filesystem. A form of journaling, WAPBL allows a filesystem to be repaired within seconds after a system or disk crash, making it crucial to achieving high availability. It also has the potential to increase the performance of large disk operations that make heavy use of the directory structure.

The traditional tool used to repair a filesystem after a failure, known as fsck, is impractical on large filesystems. It can take hours, or even days, to find and repair errors. The process takes time proportional to the size of the filesystem; the larger the filesystem, the longer it takes. With filesystem sizes ever-increasing, the process will soon be unable to finish in any useful length of time.

Journaling a filesystem solves this problem by logging changes that were underway when the filesystem crashed. Because it tracks filesystem modifications, journaling allows the system to be repaired in time that depends on the number of operations in progress when the system stopped, rather than on the size of the filesystem.

Wasabi's implementation of journaling:

  • builds on the established and trusted Berkeley Unix Filesystem
  • is not restricted by the GNU General Public License, unlike IBM's JFS and SGI's XFS
  • works on FFS2, allowing filesystems larger than 2 terabytes
  • can place the journal in non-volatile memory, eliminating duplicate disk writes

Developed and proven by the database industry, which needed fast and reliable database recovery, journaling keeps a log of system operations to be played back if the system fails. Disk devices are designed to guarantee that, if the device is disconnected, a block is either written to permanent storage completely or not written at all. However, many filesystem operations must modify several blocks on the disk to complete. For example, when a file inode is allocated, the table (bitmap) of available inodes must be updated, summaries used to speed lookup in these tables must be updated, and the inode must be referenced from a directory so that the file may be named. If the system crashes while some of these locations are updated but not others, then the filesystem must be repaired before it can be used again.
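
The inode allocation example can be illustrated with a toy model. The Python sketch below is only an illustration: the names Disk, Crash, allocate_inode and is_consistent, and the three block labels, are assumptions made for this example and are not taken from WasabiJFS or FFS code. It treats each block write as atomic and shows that stopping between writes leaves the bitmap, the summary and the directory in disagreement.

```python
class Crash(Exception):
    """Raised to simulate power loss between two block writes."""


class Disk:
    """Toy disk: one write_block call is atomic, a sequence of calls is not."""

    def __init__(self, writes_before_crash=None):
        self.blocks = {
            "inode_bitmap": [False] * 4,     # which inodes are allocated
            "summary": {"free_inodes": 4},   # cached count that speeds lookups
            "directory": {},                 # file name -> inode number
        }
        # Hypothetical knob: number of writes allowed before the simulated crash.
        self.writes_before_crash = writes_before_crash

    def write_block(self, name, value):
        if self.writes_before_crash is not None:
            if self.writes_before_crash == 0:
                raise Crash(name)
            self.writes_before_crash -= 1
        self.blocks[name] = value


def allocate_inode(disk, filename):
    """Create a file: three separate blocks must all be updated."""
    bitmap = list(disk.blocks["inode_bitmap"])
    ino = bitmap.index(False)
    bitmap[ino] = True
    disk.write_block("inode_bitmap", bitmap)

    summary = dict(disk.blocks["summary"])
    summary["free_inodes"] -= 1
    disk.write_block("summary", summary)

    directory = dict(disk.blocks["directory"])
    directory[filename] = ino
    disk.write_block("directory", directory)
    return ino


def is_consistent(disk):
    """The bitmap, the summary and the directory must agree with each other."""
    bitmap = disk.blocks["inode_bitmap"]
    allocated = {i for i, used in enumerate(bitmap) if used}
    referenced = set(disk.blocks["directory"].values())
    free = disk.blocks["summary"]["free_inodes"]
    return allocated == referenced and free == bitmap.count(False)
```

Calling allocate_inode on Disk(writes_before_crash=1) raises Crash after the bitmap write, and is_consistent then returns False: the inode is marked allocated, but no directory entry refers to it and the summary still counts it as free.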

Journaling has been applied to filesystems in two ways: logical operation journaling and physical block logging.

In logical operation journaling, the journal is constructed as a list of abstract operations to be performed. On recovery, the filesystem is compared to the list of operations and any partially completed operations are either completed or undone. Unfortunately, this analysis is complicated and requires many changes to the existing filesystem software. This is the type of journaling implemented in Veritas's VxFS and Solaris UFS.
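
To give a feel for why this analysis is complicated, here is a minimal Python sketch of logical recovery for a single operation type. It is a hypothetical illustration, not the VxFS or Solaris UFS design: the function recover_logical, the "create" record layout and the dictionary standing in for the filesystem are all assumptions made for this example. Note that the recovery code has to inspect every structure the operation touches, and each additional operation type would need its own rule.

```python
def recover_logical(fs, journal):
    """Complete every journaled 'create' operation that was only partly applied.

    fs is a dict with keys "bitmap" (list of bool), "directory" (name -> inode)
    and "free_inodes" (int). journal is a list of (op, name, inode) tuples.
    """
    for op, name, ino in journal:
        if op != "create":
            # Every other operation type would need its own recovery rule.
            raise ValueError(f"no recovery rule for {op!r}")
        # Inspect each structure separately to learn how far the operation got.
        if not fs["bitmap"][ino]:
            fs["bitmap"][ino] = True
        if fs["directory"].get(name) != ino:
            fs["directory"][name] = ino
        # The summary is derived data, so recompute it instead of adjusting it.
        fs["free_inodes"] = fs["bitmap"].count(False)
    return fs
```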

We used physical block logging to develop WAPBL because it is simple to implement and does not require a dramatic and potentially dangerous rework of the existing filesystem code. This is the type of journaling implemented in Apple's HFS+, Linux's ext3, and BeOS's BeFS.

With this form of journaling, filesystem operations are completed in memory. Their corresponding physical block changes are then recorded to the journal. Once the journal is guaranteed to be safely on disk, those same blocks can be written to the filesystem. To recover from a crash, we copy the blocks from the journal to the filesystem; this is safe because rewriting blocks that have already been written to the filesystem does no harm.
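
The write-ahead sequence described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the WAPBL implementation: the class JournaledStore, its method names and the crash_after parameter are invented for this example, and the journal is assumed to be durable as soon as commit returns.

```python
class JournaledStore:
    """Toy write-ahead physical block log: journal first, filesystem second."""

    def __init__(self):
        self.journal = []      # committed transactions, each {block number: data}
        self.filesystem = {}   # block number -> data at its home location

    def commit(self, changes):
        """Record the changed blocks in the journal (assumed durable on return)."""
        self.journal.append(dict(changes))

    def checkpoint(self, crash_after=None):
        """Copy journaled blocks to the filesystem, then discard the journal.

        crash_after is a hypothetical knob: stop after that many block writes
        to simulate a crash, leaving the journal in place.
        """
        written = 0
        for txn in self.journal:
            for blkno, data in txn.items():
                if crash_after is not None and written == crash_after:
                    return False
                self.filesystem[blkno] = data
                written += 1
        self.journal.clear()
        return True

    def recover(self):
        """Replay the whole journal in order.

        Blocks that reached the filesystem before the crash are simply
        rewritten with the same contents, which is harmless.
        """
        return self.checkpoint()
```

For example, after commit({10: b"bitmap", 11: b"summary", 12: b"dir"}) and a checkpoint that crashes after one write, only block 10 is in the filesystem. Calling recover() rewrites block 10 and adds blocks 11 and 12, and the amount of work depends on the size of the journal, not on the size of the filesystem.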

This is the ideal form of journaling to add to an existing filesystem such as UFS/UFS2. The existing filesystem code already converts logical operations to physical block changes, and the buffer cache can be used to assemble these changes in memory and hold them as they are written first to the journal and then to the filesystem.