Changing the Size of a File in an EXT2 File System
One common problem with a file system is its tendency to fragment. The blocks that
hold the file's data get spread all over the file system and this makes
sequentially accessing the data blocks of a file more and more inefficient the
further apart the data blocks are. The EXT2 file system tries to overcome this
by allocating the new blocks for a file physically close to its current data
blocks or at least in the same Block Group as its current data blocks. Only
when this fails does it allocate data blocks in another Block Group.
Whenever a process attempts to
write data into a file the Linux file system checks to see if the data has gone
off the end of the file's last allocated block. If it has, then it must allocate
a new data block for this file. Until the allocation is complete, the process
cannot run; it must wait for the file system to allocate a new data block and
write the rest of the data to it before it can continue. The first thing that
the EXT2 block allocation routines do is to lock the EXT2 Superblock for this
file system. Allocating and deallocating blocks changes fields within the superblock,
and the Linux file system cannot allow more than one process to do this at the
same time. If another process needs to allocate more data blocks, it will have
to wait until this process has finished. Processes waiting for the superblock
are suspended, unable to run, until control of the superblock is relinquished
by its current user. Access to the superblock is granted on a first come, first
served basis and once a process has control of the superblock, it keeps control
until it has finished. Having locked the superblock, the process checks that
there are enough free blocks left in this file system. If there are not enough
free blocks, then this attempt to allocate more will fail and the process will
relinquish control of this file system's superblock.
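
The locking and the free block check described above can be sketched in a few lines of Python. This is only an illustration, not the kernel's code: the Superblock class, its free_blocks_count and dirty fields, the NoSpaceError exception and the reserve_block function are all hypothetical names invented for the sketch, and Python's threading.Lock stands in for the kernel's superblock lock.

```python
import threading


class NoSpaceError(Exception):
    """Raised when the file system has too few free blocks (hypothetical)."""


class Superblock:
    """Hypothetical stand-in for the superblock fields used in this sketch."""

    def __init__(self, free_blocks_count):
        self.free_blocks_count = free_blocks_count
        self.dirty = False
        # One lock per file system. threading.Lock suspends waiters, but
        # unlike the queue described in the text it does not promise
        # first come, first served ordering.
        self.lock = threading.Lock()


def reserve_block(sb, wanted=1):
    """Take `wanted` blocks from the free count while holding the lock."""
    with sb.lock:  # released on every exit path, including the failure
        if sb.free_blocks_count < wanted:
            raise NoSpaceError("not enough free blocks")
        sb.free_blocks_count -= wanted
        sb.dirty = True  # the superblock has changed
        return sb.free_blocks_count
```

The with statement releases the lock whether the check succeeds or fails, which mirrors the way the process relinquishes the superblock when the allocation cannot go ahead.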
If there are enough free blocks
in the file system, the process tries to allocate one.
If the EXT2 file system has been
built to preallocate data blocks then we may be able to take one of those. The
preallocated blocks do not actually exist; they are just reserved within the
allocated block bitmap. The VFS inode representing the file that we are trying
to allocate a new data block for has two EXT2 specific fields, prealloc_block and prealloc_count, which are the block number of
the first preallocated data block and how many of them there are, respectively.
If there were no preallocated blocks or block preallocation is not enabled, the
EXT2 file system must allocate a new block. The EXT2 file system first looks to
see if the data block after the last data block in the file is free. Logically,
this is the most efficient block to allocate as it makes sequential accesses
much quicker. If this block is not free, then the search widens and it looks
for a data block within 64 blocks of the ideal block. This block,
although not ideal, is at least fairly close and within the same Block Group as
the other data blocks belonging to this file.
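
The order of preference described above (a preallocated block, then the block after the file's last block, then something within 64 blocks of it) can be sketched as follows. The sketch makes several assumptions that are not in the text: the inode is a plain dictionary holding prealloc_block and prealloc_count, the Block Group's bitmap is a list of booleans where True means "in use", and the nearby search only looks forward from the ideal block. The function names are hypothetical.

```python
SEARCH_WINDOW = 64  # "within 64 blocks of the ideal block"


def take_preallocated(inode):
    """Return the next reserved block number, or None if none are left."""
    if inode["prealloc_count"] == 0:
        return None
    block = inode["prealloc_block"]
    inode["prealloc_block"] += 1
    inode["prealloc_count"] -= 1
    return block


def find_near_goal(bitmap, goal):
    """Return the goal block if it is free, else the first free block
    in the window after it, else None."""
    end = min(goal + SEARCH_WINDOW + 1, len(bitmap))
    for block in range(goal, end):
        if not bitmap[block]:
            return block
    return None


def allocate_near(inode, bitmap, last_block):
    """Try the cheap options in order; None means 'search other groups'."""
    block = take_preallocated(inode)
    if block is not None:
        return block  # already marked in the bitmap when it was reserved
    block = find_near_goal(bitmap, last_block + 1)
    if block is not None:
        bitmap[block] = True  # mark the block as allocated
    return block
```

A return value of None corresponds to the case where nothing suitable is free nearby and the search has to move on to the other Block Groups.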
If even that block is not free,
the process starts looking in all of the other Block Groups in turn until it
finds some free blocks. The block allocation code looks for a cluster of eight
free data blocks somewhere in one of the Block Groups. If it cannot find eight
together, it will settle for less. If block preallocation is wanted and enabled,
it will update prealloc_block and prealloc_count accordingly.
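
A rough sketch of this wider search is given below. It is not the kernel's algorithm: the bitmaps are again assumed to be lists of booleans (True means "in use"), "settling for less" is interpreted here as keeping the longest shorter run seen in any group, and the helper names are hypothetical. The claim_cluster function assumes that the first block of the run goes to the file and the rest are recorded as preallocated.

```python
CLUSTER = 8  # the text's "cluster of eight free data blocks"


def longest_free_run(bitmap, limit=CLUSTER):
    """Return (start, length) of the first free run of `limit` blocks,
    or of the longest shorter run; (None, 0) if the group is full."""
    best_start, best_len = None, 0
    run_start, run_len = None, 0
    for block, used in enumerate(bitmap):
        if used:
            run_len = 0
            continue
        if run_len == 0:
            run_start = block
        run_len += 1
        if run_len > best_len:
            best_start, best_len = run_start, run_len
        if best_len == limit:
            break
    return best_start, best_len


def search_groups(groups, start_group):
    """Visit every other group in turn, starting after `start_group`.

    Returns (group, start, length), or None if no block is free.
    """
    count = len(groups)
    fallback = None
    for step in range(1, count):
        group = (start_group + step) % count
        start, length = longest_free_run(groups[group])
        if length == CLUSTER:
            return group, start, length
        if length > 0 and (fallback is None or length > fallback[2]):
            fallback = (group, start, length)
    return fallback


def claim_cluster(bitmap, start, length, inode):
    """Mark the run as in use; keep all but the first block in reserve."""
    for block in range(start, start + length):
        bitmap[block] = True
    inode["prealloc_block"] = start + 1
    inode["prealloc_count"] = length - 1
    return start
```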
Wherever it finds the free block,
the block allocation code updates the Block Group's block bitmap and allocates
a data buffer in the buffer cache. That data buffer is uniquely identified by
the file system's supporting device identifier and the block number of the
allocated block. The data in the buffer is zeroed and the buffer is marked as
"dirty" to show that its contents have not been written to the physical
disk. Finally, the superblock itself is marked as "dirty" to show that it has
been changed and it is unlocked. If there were any processes waiting for the
superblock, the first one in the queue is allowed to run again and will gain
exclusive control of the superblock for its file operations. The process's data
is written to the new data block and, if that data block is filled, the entire
process is repeated and another data block allocated.
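
The buffer handling can be illustrated with a small sketch. Everything here is an assumption made for the example: the 1024-byte block size, the Buffer and BufferCache classes, and the new_block callable that stands in for the whole allocation procedure described above. It shows a buffer keyed by device and block number, zeroed and marked dirty, and a write that asks for another block each time one fills up.

```python
BLOCK_SIZE = 1024  # assumed block size for this sketch


class Buffer:
    """One cached block (hypothetical structure)."""

    def __init__(self, device, block):
        self.device = device
        self.block = block
        self.data = bytearray(BLOCK_SIZE)  # a new bytearray is zero-filled
        self.dirty = True  # contents not yet written to the disk


class BufferCache:
    """Buffers identified by (device identifier, block number)."""

    def __init__(self):
        self._buffers = {}

    def get_zeroed(self, device, block):
        buf = Buffer(device, block)
        self._buffers[(device, block)] = buf
        return buf

    def lookup(self, device, block):
        return self._buffers.get((device, block))


def write_blocks(cache, device, new_block, data):
    """Copy `data` into freshly allocated blocks, one block at a time.

    `new_block` is a callable returning the next allocated block number.
    """
    blocks = []
    for offset in range(0, len(data), BLOCK_SIZE):
        buf = cache.get_zeroed(device, new_block())
        chunk = data[offset:offset + BLOCK_SIZE]
        buf.data[:len(chunk)] = chunk
        blocks.append(buf.block)
    return blocks
```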