This is a short note detailing the issues raised by the filesystem operation of renaming a directory from its parent directory to some other directory.

Background

A filesystem typically allows a nested collection of directories: the root directory contains subdirectories, which in turn contain subdirectories, and so on. The on-disk state of the filesystem will typically lag the in-memory state. Indeed, more advanced filesystems may even permit the on-disk state to diverge from the in-memory state, so that the on-disk state may not represent any state that ever occurred in-memory. For example, suppose we create a file "foo.txt" and then a file "bar.txt". It is quite possible that a filesystem that then crashes will restart in a state where the file "bar.txt" exists but "foo.txt" does not, even though this state was never seen in-memory.

Implementation optimizations

Implementations of filesystems would like to treat every file and every directory as indepe...
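To make the "foo.txt"/"bar.txt" example above concrete, here is a minimal Python sketch of how an application might try to rule out the "bar.txt exists but foo.txt does not" outcome on a POSIX-style system. It is not taken from the post: the helper name `create_durably` is hypothetical, and the sketch assumes that calling fsync on a file and then on its parent directory persists both the contents and the directory entry, which is the usual convention on Linux but is ultimately filesystem-dependent.

```python
import os
import tempfile


def create_durably(dirpath, name, data=b""):
    """Create dirpath/name, then flush the file and its parent directory.

    Hypothetical helper: assumes POSIX semantics where fsync on a directory
    file descriptor persists newly created directory entries.
    """
    path = os.path.join(dirpath, name)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    try:
        view = memoryview(data)
        while view:  # os.write may write fewer bytes than requested
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)  # flush file contents and inode metadata
    finally:
        os.close(fd)

    # Flush the parent directory so the new entry itself reaches disk.
    dir_fd = os.open(dirpath, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        # foo.txt is flushed before bar.txt is even created, so (under the
        # assumptions above) a crash should not leave bar.txt without foo.txt.
        create_durably(d, "foo.txt", b"first")
        create_durably(d, "bar.txt", b"second")
        print(sorted(os.listdir(d)))
```

The cost of this approach is two flushes per file, which is exactly the kind of synchronization that filesystems try to avoid by default.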
[NOTE: this post was inspired by looking at the mirage/index code and trying to understand the properties that code provides in a crash scenario]

In systems with reasonable claims to robustness in the event of a system crash, the issue arises that we want to write to a file in an "atomic" way. Let's simplify by considering only appending data to a file. Let's also assume that the filesystem performs blk-sized, blk-aligned writes atomically (but multiple such writes may be reordered). Simplify further by assuming we only want to append a single block of data to a file, at a position that is a multiple of the block size. What can go wrong?

The problem is that there are many moving parts, each of which can result in incorrectness. Here are some possibilities: We append a new block at the end of the file (at a position that is a multiple of the blk size). This should be atomic. Has the FS (filesystem) freelist correctly recorded that the block is now in use? Has the FS updated th...
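The simplified scenario above (append exactly one block at a block-aligned offset) can be sketched from the application side as follows. This is an illustrative Python sketch, not the mirage/index implementation: `BLK_SIZE` and `append_block` are hypothetical names, 4096 is only an assumed block size, and the fsync call merely asks the kernel to flush data and metadata. Whether the filesystem's own bookkeeping survives a crash consistently is exactly the question the post raises.

```python
import os
import tempfile

BLK_SIZE = 4096  # assumed block size; the real value depends on the filesystem


def append_block(path, block):
    """Append one BLK_SIZE block at a block-aligned end of file.

    Returns the offset at which the block was written. Hypothetical helper
    for illustration only; requires a Unix-like OS (uses os.pwrite).
    """
    if len(block) != BLK_SIZE:
        raise ValueError("block must be exactly BLK_SIZE bytes")
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        offset = os.fstat(fd).st_size
        if offset % BLK_SIZE != 0:
            raise ValueError("file size is not a multiple of BLK_SIZE")
        written = os.pwrite(fd, block, offset)
        if written != BLK_SIZE:
            raise OSError("short write; the append was not a single block")
        os.fsync(fd)  # request that data and file size reach stable storage
    finally:
        os.close(fd)
    return offset


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        log = os.path.join(d, "log")
        print(append_block(log, b"a" * BLK_SIZE))
        print(append_block(log, b"b" * BLK_SIZE))
```

Even with the alignment checks and the flush, this sketch only covers the application's half of the problem; the freelist and metadata updates listed above happen inside the filesystem and are outside the caller's control.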
TL;DR: Writing documentation can be beneficial even for a lone programmer writing code that will never be read by anyone else; however, when other programmers are involved, the benefits are cumulative and potentially huge in terms of the time those programmers save when trying to grok the existing codebase.

It is well known that programmers in general hate to write documentation. There are lots of reasons for this. For example, in the heat of writing code, the programmer simply cannot envisage what it would be like for someone else to come along in a couple of years and try to understand the code from scratch... And anyway, what does the programmer owe to that future person? (The programmer will likely have moved on by that point...) etc. etc. So, poor documentation (or a lack of documentation altogether) is a problem that stems from the programmer, and is probably left unfixed by the programmer's managers, right up the chain of responsibility. This post isn't an attempt to add...