Speculate about the nature of file system use and extrapolate possible improvements.
Present performance and efficiency enhancements to the existing UNIX file system.
Compare and contrast performance and efficiency benchmarks after implementation of
enhancements.
Modifications
Increasing the block size (from 512 bytes to >= 4096 bytes / block)
Using principles of locality to position data blocks on the disk, thereby reducing
rotational latency, etc.
Long filenames
File Locking (replacing the need for conventional "lock files")
Symbolic Links (both relative and absolute path names)
Increasing the reliability of rename (without copying)
Quotas
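The file-locking modification listed above (advisory locks via the new flock(2) call, replacing conventional "lock files") can be sketched using Python's POSIX wrappers. The path and helper name here are illustrative, not from the paper:

```python
# Sketch: advisory file locking via the flock(2) facility, exposed in
# Python through the fcntl module. POSIX-only; names are illustrative.
import fcntl
import os

def with_exclusive_lock(path, update):
    """Open path, take an exclusive advisory lock, then run update(fd)."""
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)   # blocks until the lock is free
        update(fd)                        # critical section on the file
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)                      # lock also released on close/exit
```

Unlike a conventional lock file, an advisory lock vanishes automatically when the holding process closes the descriptor or exits, so a crashed process cannot leave a stale lock behind.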
Implementation
Blocks of 4096 bytes or larger, divided into 1024-byte segments.
Scattered the superblock across multiple clusters to guard against loss of a single copy.
The superblock contained a vector of lists called rotational layout tables.
Bitmap array to hold "free" segments.
Three-step allocation hierarchy for writing files that have increased in size.
Cylinder groups, comprising one or more consecutive cylinders on a disk, used to
organize files within a directory.
The global allocator calls the local allocator, which uses a four-level allocation
strategy for data placement.
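The block/segment scheme above earns the throughput of large blocks while limiting internal fragmentation by storing a file's tail in small segments rather than a whole block. A rough sketch of the accounting, assuming the 4096-byte blocks and 1024-byte segments mentioned in these notes (illustrative arithmetic, not kernel code):

```python
# Sketch of block/segment space accounting: whole blocks hold most of a
# file, and the tail occupies only enough small segments to cover it.
# Block and segment sizes follow the 4096/1024 figures in the notes.
def space_used(file_size, block=4096, frag=1024):
    full_blocks, tail = divmod(file_size, block)
    tail_frags = -(-tail // frag)          # ceiling division for the tail
    return full_blocks * block + tail_frags * frag

# A 500-byte file wastes most of a 4096-byte block without segments,
# but only part of one 1024-byte segment with them:
waste_no_frags = 4096 - 500                # 3596 bytes wasted
waste_with_frags = space_used(500) - 500   # 524 bytes wasted
```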
Achievements
Intelligent organization of directories and files to take advantage of locality
resulting in increased throughput.
Developed a system to derive the benefits of large blocks while still coping with the
resulting waste of internal fragmentation.
Made the existing file system more robust through the addition of useful features.
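The locality-driven organization credited above can be illustrated with a toy placement policy in the spirit of the design: spread directories across cylinder groups, but keep a file's data in its parent directory's group. The free-block table and function names below are made up for illustration:

```python
# Toy sketch of locality-aware placement across cylinder groups.
# free_blocks_per_group is a made-up stand-in for per-group free counts
# kept in the real superblock/cylinder-group summaries.

def pick_group_for_dir(free_blocks_per_group):
    """Spread directories out: choose the group with the most free blocks."""
    return max(range(len(free_blocks_per_group)),
               key=lambda g: free_blocks_per_group[g])

def pick_group_for_file(parent_dir_group, free_blocks_per_group):
    """Locality: keep a file in its parent directory's group if it has room."""
    if free_blocks_per_group[parent_dir_group] > 0:
        return parent_dir_group
    return pick_group_for_dir(free_blocks_per_group)
```

Placing a file's blocks near its directory (and near each other) shortens seeks for the common access pattern of reading files in the same directory together, which is the throughput gain the notes describe.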
Drawbacks
High CPU usage is required to implement solutions to problems resulting from larger
block sizes (impractical for commercial use).
The number of small files that tend to exist on a file system prevents larger blocks
from being feasible without some way of dealing with internal fragmentation (an
intolerable amount of waste).
Quotas do not account for files created by one user, hard-linked by another user, and
then deleted by the creator.
No built-in facility for eliminating "dangling" symbolic links.
The file system relies on a machine-dependent parameter that can cause a performance
drop if the file system is moved between physical machines.
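Since the file system offers no built-in cleanup for dangling symbolic links, detection is left to user code. A minimal sketch using Python's POSIX wrappers (the helper name is illustrative):

```python
# Sketch: detect a "dangling" symlink by noting that islink() inspects
# the link itself while exists() follows it to the target.
import os

def is_dangling_symlink(path):
    """True if path is a symbolic link whose target no longer exists."""
    if not os.path.islink(path):
        return False
    return not os.path.exists(path)   # exists() follows the link
```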
Questions
Are the assumptions on which these improvements are based still valid today?
(I.e., Do most file systems still contain many short files? Do larger blocks still
increase performance?)
Given the power of today's CPUs, would the percentage of cycles needed to implement the
proposed improvements be about the same? Is the cost still prohibitive?
Which optimizations still make sense given today's disks?
Do you know the parameters necessary to optimize the disk drive in your workstation?
What do you think of the "enhancements"? Symlinks, locks, long file names,
rename, quotas.
* UNIX was a trademark of Bell Laboratories at the time...