Driver Porting: block layer overview

This article is part of the LWN Porting Drivers to 2.6 series.
The first big, disruptive changes to the 2.6 kernel came from the reworking of the block I/O layer. As one might guess, the result of all this work is a great many changes as seen by driver authors - or anybody else who works with block I/O. The transition may be painful for some, but it's worth it: the new block layer is easier to work with and offers much better performance than its predecessor.

Fully covering these changes will require a whole series of articles, so we'll start with an overview which highlights the major ones without getting into any sort of detail. Subsequent articles will fill in the rest.

Note that parts of the block layer remain volatile - this development is not yet complete. We'll keep up with further changes as they happen.

So, what has changed with the block layer?

  • A great deal of old cruft is gone. For example, it is no longer necessary to work with a whole set of global arrays within block drivers. These arrays (blk_size, blksize_size, hardsect_size, read_ahead, etc.) have simply vanished. The kernel still maintains much of the same information, of course, but the management of that information is much improved.

  • As part of the cruft removal, most of the <linux/blk.h> macros (DEVICE_NAME, DEVICE_NR, CURRENT, INIT_REQUEST, etc.) have been removed; <linux/blk.h> is now empty. Any block driver which used these macros to implement its request loop will have to be rewritten. It is still possible to implement a simple request loop for straightforward devices where performance is not a big issue, but the mechanisms have changed.

  • The io_request_lock is gone; locking is now done on a per-queue basis.

  • Request queues have, in general, gotten more sophisticated. Quite a bit of work has been done in the area of fancy request scheduling (though drivers don't generally need to know about that). There is simple support for tagged command queueing, along with features like request barriers and queue-time device command generation. Request queues must be allocated dynamically in 2.6.

  • Buffer heads are no longer used in the block layer; they have been replaced with the new "bio" structure. The new representation of block I/O operations is designed for flexibility and performance; it encourages keeping large operations intact. Simple drivers can pretend that the bio structure does not exist, but most performance-oriented drivers - i.e. those that want to implement clustering and DMA - will need to be changed to work with bios.

    One of the most significant features of the bio structure is that it represents I/O buffers directly with page structures and offsets, not in terms of kernel virtual addresses. By default, I/O buffers can be located in high memory, on the assumption that computers equipped with that much memory will also have reasonably modern I/O controllers. Support operations have been provided for tasks like bio splitting and the creation of DMA scatter/gather maps.

  • Sector numbers can now be 64 bits wide, making it possible to support very large block devices.

  • The rudimentary gendisk ("generic disk") structure from 2.4 has been greatly improved in 2.6; generic disks are now used extensively throughout the block layer. Among other things, each generic disk has its own block_device_operations structure; the operations are no longer directly associated with the driver. The most significant change for block driver authors, though, may be the fact that partition handling has been moved up into the block layer, and drivers no longer need to know anything about partitions. That is, of course, the way things should always have been.
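
To make the bio idea above a little more concrete, here is a rough user-space sketch of an I/O operation described as a list of (page, offset, length) segments rather than kernel virtual addresses. It is not the kernel's bio structure: every toy_* name, the 4096-byte page size, and the eight-segment limit are assumptions made up for illustration.

```c
/* User-space toy model only; NOT the kernel's bio structure.
 * All toy_* names and constants here are hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u  /* assumed page size for this sketch */
#define TOY_MAX_SEGS  8u     /* arbitrary limit for this sketch */

/* One buffer segment: a page plus a byte range inside it. */
struct toy_segment {
    unsigned long page_frame;  /* stands in for a page structure */
    unsigned int  offset;      /* byte offset within the page */
    unsigned int  len;         /* number of bytes in this segment */
};

/* One I/O operation: a starting sector and a list of segments. */
struct toy_bio {
    uint64_t           sector; /* 64-bit starting sector number */
    unsigned int       nsegs;  /* segments currently in use */
    struct toy_segment segs[TOY_MAX_SEGS];
};

/* Append a segment. Returns 0 on success, or -1 if the bio is full
 * or the byte range does not fit inside a single page. */
int toy_bio_add(struct toy_bio *bio, unsigned long page_frame,
                unsigned int offset, unsigned int len)
{
    if (bio->nsegs >= TOY_MAX_SEGS)
        return -1;
    if (len == 0 || offset >= TOY_PAGE_SIZE || len > TOY_PAGE_SIZE - offset)
        return -1;
    bio->segs[bio->nsegs].page_frame = page_frame;
    bio->segs[bio->nsegs].offset = offset;
    bio->segs[bio->nsegs].len = len;
    bio->nsegs++;
    return 0;
}

/* Total number of bytes covered by all segments of the operation. */
size_t toy_bio_bytes(const struct toy_bio *bio)
{
    size_t total = 0;
    unsigned int i;

    assert(bio->nsegs <= TOY_MAX_SEGS);
    for (i = 0; i < bio->nsegs; i++)
        total += bio->segs[i].len;
    return total;
}
```

In this model, adding 1024 bytes at offset 512 of one page and a full 4096-byte page after it yields one operation covering 5120 bytes; nothing in the description needs the pages to be mapped into the kernel's address space, which is the property that lets buffers live in high memory.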

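The value of wider sector numbers is easy to check with a little arithmetic. The sketch below assumes 512-byte sectors and compares against a 32-bit sector number as the older limit; both are assumptions of this example rather than statements from the list above, and the toy_* helpers are hypothetical.

```c
/* User-space arithmetic sketch; the toy_* names are hypothetical. */
#include <assert.h>
#include <stdint.h>

#define TOY_SECTOR_SHIFT 9u  /* assumed 512-byte sectors (1 << 9) */

/* Largest device size, in bytes, addressable with a sector number
 * that is `bits` wide. Limited to 54 bits so that the byte count
 * still fits in a uint64_t. */
uint64_t toy_max_device_bytes(unsigned int bits)
{
    assert(bits <= 54);
    return ((uint64_t)1 << bits) << TOY_SECTOR_SHIFT;
}

/* Sector number that contains the given byte offset. */
uint64_t toy_byte_to_sector(uint64_t byte_offset)
{
    return byte_offset >> TOY_SECTOR_SHIFT;
}
```

Under these assumptions, toy_max_device_bytes(32) is 2199023255552 bytes, or 2 TiB. A byte offset of 3 TiB falls in sector 6442450944, which does not fit in 32 bits; truncated to a 32-bit value it would wrap around to sector 2147483648.
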
Subsequent articles will explore the above changes in depth; stay tuned.



Copyright (©) 2003, Eklektix, Inc.