10 THE VNODE LAYER
can point to the same ofile structure. The POSIX call dup() depends on this functionality to be able to duplicate a file descriptor. Similarly, different ofile structures can point to the same vnode, which corresponds to the ability to open a file multiple times in the same program or in different programs. The separation of the information maintained in the ofile structure and the vnode that it refers to is important.
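The difference between sharing an ofile and sharing only a vnode can be observed from user space through the file position. The following is a minimal sketch using standard POSIX calls; the function name check_offset_sharing and the temporary file path are illustrative assumptions, and it assumes a writable /tmp directory. A descriptor obtained with dup() sees the position advance after a write on the original descriptor, while a second open() of the same file starts at position zero.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 when the offsets behave as described,
 * 1 when they do not, and -1 if a system call fails. */
int check_offset_sharing(void)
{
    char path[] = "/tmp/ofile_demo_XXXXXX";
    int result = -1;

    int fd = mkstemp(path);               /* first open of the file */
    if (fd < 0)
        return -1;

    int dup_fd = dup(fd);                 /* second descriptor, same open-file state */
    int second_fd = open(path, O_RDONLY); /* separate open-file state, same file */

    if (dup_fd >= 0 && second_fd >= 0 && write(fd, "hello", 5) == 5) {
        /* Query the current position through each descriptor. */
        off_t dup_pos = lseek(dup_fd, 0, SEEK_CUR);
        off_t second_pos = lseek(second_fd, 0, SEEK_CUR);

        /* The dup()ed descriptor shares the position; the second open does not. */
        result = (dup_pos == 5 && second_pos == 0) ? 0 : 1;
    }

    if (second_fd >= 0)
        close(second_fd);
    if (dup_fd >= 0)
        close(dup_fd);
    close(fd);
    unlink(path);
    return result;
}
```

In the terms used here, fd and dup_fd refer to one ofile, while second_fd refers to a different ofile that points at the same vnode.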
Another important thing to notice about the above diagram is that every vnode structure has a vnode-id. In the BeOS, every vnode has a vnode-id that uniquely identifies a file on a single file system. For convenience, we abbreviate the term "vnode-id" to just "vnid." Given a vnid, a file system should be able to access the i-node of a file. Conversely, given a name in a directory, a file system should be able to return the vnid of the file.
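The two directions of this mapping can be sketched with a toy in-memory file system. Everything in the block below is a hypothetical illustration, not the BeOS interface: the vnode_id typedef, the toy_inode and toy_dirent structures, the table contents, and the functions toy_lookup and toy_read_inode are all assumed names chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef long long vnode_id;      /* assumption: a vnid is a 64-bit integer */

struct toy_inode {               /* hypothetical on-disk i-node stand-in */
    vnode_id vnid;
    long     size;
};

struct toy_dirent {              /* hypothetical directory entry: name -> vnid */
    const char *name;
    vnode_id    vnid;
};

static struct toy_inode inode_table[] = {
    { 42, 128 },
    { 43, 4096 },
};

static const struct toy_dirent root_dir[] = {
    { "notes.txt", 42 },
    { "image.raw", 43 },
};

/* Given a name in the directory, return the vnid of the file (0 on success). */
int toy_lookup(const char *name, vnode_id *vnid_out)
{
    for (size_t i = 0; i < sizeof root_dir / sizeof root_dir[0]; i++) {
        if (strcmp(root_dir[i].name, name) == 0) {
            *vnid_out = root_dir[i].vnid;
            return 0;
        }
    }
    return -1;                   /* no such name */
}

/* Given a vnid, return the i-node of the file, or NULL if it is unknown. */
struct toy_inode *toy_read_inode(vnode_id vnid)
{
    for (size_t i = 0; i < sizeof inode_table / sizeof inode_table[0]; i++) {
        if (inode_table[i].vnid == vnid)
            return &inode_table[i];
    }
    return NULL;
}
```

A real file system would usually derive the i-node location from the vnid directly rather than scan a table; the linear search only keeps the sketch short.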
To better understand how this structure is used, let's consider the concrete example of how a write() on a file descriptor actually takes place. It all starts when a user thread executes the following line of code:
write(4, "hello world\n", 12);
In user space, the function write() is a system call that traps into the kernel. Once in kernel mode, the kernel system call handler passes control to the kernel routine that implements the write() system call. The kernel write() call, sys_write(), is part of the vnode layer. Starting from the calling thread's ioctx structure, sys_write() uses the integer file descriptor (in this case, the value 4) to index the file descriptor array, fdarray (which is pointed to by the ioctx). Indexing into fdarray yields a pointer to an ofile structure.
The ofile structure contains state information (such as the position we are currently at in the file) and a pointer to the underlying vnode associated with this file descriptor. The vnode structure refers to a particular vnode and also has a pointer to a structure containing information about the file system that this vnode resides on. The structure containing the file system information has a pointer to the table of functions supported by this file system as well as a file system state structure provided by the file system. The vnode layer uses the table of function pointers to call the file system write() with the proper arguments to write the data to the file associated with the file descriptor.
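The chain of pointers just described can be sketched in C. The structure layouts, field names, and the function sketch_sys_write below are assumptions made for illustration; only the names ioctx, fdarray, ofile, and vnode come from the text, and the real kernel structures carry far more state (including locks). The toy "RAM file system" at the end exists only so that the function table has something to call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

struct vnode;

/* Hypothetical table of functions supported by a file system. */
struct fs_ops {
    ssize_t (*write)(void *fs_state, struct vnode *vn, off_t pos,
                     const void *buf, size_t len);
};

/* Hypothetical per-file-system information: function table plus private state. */
struct fs_info {
    const struct fs_ops *ops;
    void                *fs_state;
};

struct vnode { long long vnid; struct fs_info *fs; };  /* vnid and owning file system */
struct ofile { off_t pos; struct vnode *vn; };         /* file position and vnode */
struct ioctx { struct ofile **fdarray; int fd_count; };/* per-thread I/O context */

/* Follow ioctx -> fdarray[fd] -> ofile -> vnode -> fs_info -> ops->write. */
ssize_t sketch_sys_write(struct ioctx *io, int fd, const void *buf, size_t len)
{
    if (fd < 0 || fd >= io->fd_count || io->fdarray[fd] == NULL)
        return -1;                       /* not an open descriptor */

    struct ofile   *of = io->fdarray[fd];
    struct vnode   *vn = of->vn;
    struct fs_info *fs = vn->fs;

    ssize_t n = fs->ops->write(fs->fs_state, vn, of->pos, buf, len);
    if (n > 0)
        of->pos += n;                    /* the position lives in the ofile */
    return n;
}

/* Toy file system: a single 64-byte file held in memory. */
struct ram_state { char data[64]; size_t used; };

static ssize_t ram_write(void *fs_state, struct vnode *vn, off_t pos,
                         const void *buf, size_t len)
{
    struct ram_state *st = fs_state;
    (void)vn;                            /* only one file, so the vnode is unused */
    if (pos < 0 || (size_t)pos > sizeof st->data ||
        len > sizeof st->data - (size_t)pos)
        return -1;                       /* write would not fit */
    memcpy(st->data + pos, buf, len);
    if ((size_t)pos + len > st->used)
        st->used = (size_t)pos + len;
    return (ssize_t)len;
}

const struct fs_ops ram_ops = { ram_write };
```

Placing an ofile in slot 4 of fdarray and calling sketch_sys_write(&io, 4, "hello world\n", 12) mirrors the user-level call from the example: the vnode layer never touches the data itself, it only locates the right file system and forwards the request.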
Although it may seem like a circuitous and slow route, this path from user level through the vnode layer and down to a particular file system happens very frequently and must be rather efficient. This example is simplified in many respects (for example, we did not discuss locking at all) but serves to demonstrate the flow from user space, into the kernel, and through to a particular file system.
The BeOS vnode layer also manages the file system name space and handles all aspects of mounting and unmounting file systems. The BeOS vnode layer maintains the list of mounted file systems and where they are mounted in the name space. This information is necessary to manage programs traversing the hierarchy as they transparently move from one file system to another.
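As a rough illustration of why the list of mount points is needed, the sketch below resolves a path to the file system that owns it by choosing the longest matching mount point. This is a deliberately simplified stand-in, not the BeOS mechanism: a vnode layer normally crosses mount points one path component at a time during lookup. The mount_entry structure, the table contents, and the function fs_for_path are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mount table entry: where a file system is attached. */
struct mount_entry {
    const char *mount_point;
    const char *fs_name;
};

/* Illustrative contents only. */
static const struct mount_entry mount_table[] = {
    { "/",            "rootfs"  },
    { "/boot",        "bfs"     },
    { "/boot/mnt/cd", "iso9660" },
};

/* Return the name of the file system owning an absolute path, or NULL. */
const char *fs_for_path(const char *path)
{
    const struct mount_entry *best = NULL;
    size_t best_len = 0;

    for (size_t i = 0; i < sizeof mount_table / sizeof mount_table[0]; i++) {
        const char *mp = mount_table[i].mount_point;
        size_t len = strlen(mp);

        if (strncmp(path, mp, len) != 0)
            continue;                    /* mount point is not a prefix */

        /* The prefix must end on a component boundary, so "/boot" does
         * not claim "/bootleg". The root "/" covers every absolute path. */
        int boundary = strcmp(mp, "/") == 0 ||
                       path[len] == '\0' || path[len] == '/';

        if (boundary && len > best_len) {
            best = &mount_table[i];
            best_len = len;
        }
    }
    return best ? best->fs_name : NULL;
}
```

With this table, a path such as /boot/mnt/cd/track1 passes through three file systems on the way down, and the deepest mount point determines which one finally services the request.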
Practical File System Design: The Be File System, Dominic Giampaolo