~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
What is it?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

DFS is a fully distributed file system. Unlike most network file systems,
which operate in a client/server model where one or more servers keep the
master copy and the clients just access it, DFS is actually a peer-to-peer
file system with the ability to handle disconnected operation. It provides
the accessibility of a network file system, the speed of a local hard drive,
the redundancy of mirroring and the scalability of a RAID.

At least... that's what it _will_ be. Right now, it's just a prototype and
proof of concept, with really only the speed and redundancy being supported
to some degree. But looking into the future...

DFS accomplishes this by registering itself with the operating system to
process file system requests. When a file is created or modified, it
instantly informs all DFS machines of the change and opens itself to send
the new version to others. By keeping a local copy we get speed and the
ability to operate when disconnected from the network; by sending copies to
other machines we get redundancy.

The real key to DFS, though, is that not all machines will keep copies of
all files. Each machine will keep the files it uses the most and perhaps a
few others, leaving the remaining files to be stored on other machines but
always ensuring that any given file is held by a minimum number of hosts. It
is this feature that sets DFS apart from similar schemes such as Coda and
InterMezzo. If a request is made by an application for a file not held
locally, DFS will automatically and transparently fetch that file from the
nearest host that does have a copy. Because no host is required to have a
complete copy of the file system, each host added past X (where X is the
minimum number of copies the system will keep of any given file) adds
approximately linearly to the total file system space.
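
To put rough numbers on the capacity claim and the "nearest host" fetch
described above, here is a small Python sketch. It is not DFS code: the
function names, the example value X = 2, and the idea of measuring "nearest"
with a numeric distance table are all assumptions made for illustration. The
capacity figure is only an upper bound (total disk space divided by X); it
ignores metadata overhead and badly mismatched disk sizes.

```python
MIN_COPIES = 2  # "X" in the text; the value 2 is an assumed example


def usable_capacity(disk_sizes, min_copies=MIN_COPIES):
    """Upper bound on unique data the pool can hold.

    Every file is stored min_copies times, so the unique data can be at
    most the total disk space divided by min_copies.
    """
    if len(disk_sizes) < min_copies:
        raise ValueError("need at least min_copies hosts")
    return sum(disk_sizes) / min_copies


def nearest_holder(path, holders, distance):
    """Pick the closest host that has a copy of `path`.

    holders  maps a path to the set of hosts holding it (hypothetical).
    distance maps a host to a numeric cost such as round-trip time
             (hypothetical; the text does not say how "nearest" is measured).
    """
    candidates = holders.get(path, set())
    if not candidates:
        raise FileNotFoundError(path)
    # sorted() makes ties resolve the same way on every run
    return min(sorted(candidates), key=lambda host: distance[host])


if __name__ == "__main__":
    print(usable_capacity([100, 100, 100]))       # three 100 GB hosts
    print(usable_capacity([100, 100, 100, 100]))  # add a fourth host
    holders = {"/docs/a.txt": {"hostB", "hostC"}}
    distance = {"hostB": 5, "hostC": 2}
    print(nearest_holder("/docs/a.txt", holders, distance))
```

With three 100 GB hosts and X = 2 the bound is 150 GB; adding a fourth host
raises it to 200 GB, so the new host contributes its disk size divided by X.
That is partial replication at work: no host has to hold a complete copy.
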
Not requiring a complete copy on every host also means that ordinary
"client" machines, with disks typically much smaller than those in "server"
machines, can act as peers and contribute to the whole instead of just using
their disks as a cache of what is stored on the network.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Why was it written?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The idea of DFS was born somewhere in the mid-90s, for reasons I can no
longer recall, but no work was ever done on it beyond writing down some of
the requirements. Then, some 10 years later, while working on how to expand
my company's network to multiple subnets with shared files, the idea arose
again. None of NFS, Coda and InterMezzo had the scalability I wanted, while
OceanStore was far, far beyond what I wanted. So, being a sucker for
punishment (me being Brian White, by the way), I took many of my free hours
to see if I could turn the idea into a reality.