Loading Modules

Each module is made up of object code that can be dynamically linked to the running kernel by the insmod program and can be unlinked by the rmmod program. modprobe, like insmod, loads a module into the kernel. It differs in that it will look at the module to be loaded to see whether it references any symbols that are not currently defined in the kernel. If any such references are found, modprobe looks for other modules in the current module search path that define the relevant symbols. When modprobe finds those modules (which are needed by the module being loaded), it loads them into the kernel as well. If you use insmod in this situation instead, the command fails with an “unresolved symbols” message left in the system logfile.
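The difference between the two loaders can be sketched with a toy user-space model. This is not how insmod or modprobe are implemented; the module table, the symbol names, and every toy_* function are hypothetical and exist only to mirror the behavior described above. The model assumes an acyclic dependency table and does no cycle detection.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMS 4

/* Toy model of a module: the symbols it defines and the ones it references. */
struct toy_module {
    const char *name;
    const char *defines[MAX_SYMS];  /* NULL-terminated */
    const char *needs[MAX_SYMS];    /* NULL-terminated */
    int loaded;
};

/* Hypothetical "module search path": driver needs bus, bus needs core. */
static struct toy_module toy_modules[] = {
    { "core",   { "core_alloc" },   { NULL },                         0 },
    { "bus",    { "bus_register" }, { "core_alloc" },                 0 },
    { "driver", { NULL },           { "bus_register", "core_alloc" }, 0 },
};
#define N_MODULES (sizeof(toy_modules) / sizeof(toy_modules[0]))

static struct toy_module *toy_find(const char *name)
{
    size_t i;

    for (i = 0; i < N_MODULES; i++)
        if (strcmp(toy_modules[i].name, name) == 0)
            return &toy_modules[i];
    return NULL;
}

/* Return the module that defines sym, or NULL if no module does. */
static struct toy_module *toy_provider(const char *sym)
{
    size_t i;
    int j;

    for (i = 0; i < N_MODULES; i++)
        for (j = 0; j < MAX_SYMS && toy_modules[i].defines[j]; j++)
            if (strcmp(toy_modules[i].defines[j], sym) == 0)
                return &toy_modules[i];
    return NULL;
}

/* A symbol counts as defined only once its provider has been loaded. */
static int toy_symbol_defined(const char *sym)
{
    struct toy_module *p = toy_provider(sym);

    return p != NULL && p->loaded;
}

/* insmod-like: fail if any referenced symbol is still unresolved. */
static int toy_insmod(struct toy_module *m)
{
    int i;

    assert(m != NULL);
    for (i = 0; i < MAX_SYMS && m->needs[i]; i++)
        if (!toy_symbol_defined(m->needs[i]))
            return -1;
    m->loaded = 1;
    return 0;
}

/* modprobe-like: load the providers of unresolved symbols first. */
static int toy_modprobe(struct toy_module *m)
{
    int i;

    assert(m != NULL);
    if (m->loaded)
        return 0;
    for (i = 0; i < MAX_SYMS && m->needs[i]; i++) {
        struct toy_module *dep;

        if (toy_symbol_defined(m->needs[i]))
            continue;
        dep = toy_provider(m->needs[i]);
        if (dep == NULL || toy_modprobe(dep) != 0)
            return -1;
    }
    return toy_insmod(m);
}
```

In this model, toy_insmod(toy_find("driver")) fails while "bus" and "core" are unloaded, whereas toy_modprobe(toy_find("driver")) loads "core", then "bus", then "driver".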

As mentioned before, modules may be removed from the kernel with the rmmod utility. Note that module removal fails if the kernel believes that the module is still in use (e.g., a program still has an open file for a device exported by the module), or if the kernel has been configured to disallow module removal.

The lsmod program produces a list of the modules currently loaded in the kernel. Some other information, such as any other modules making use of a specific module, is also provided. lsmod works by reading the /proc/modules virtual file. Information on currently loaded modules can also be found in the sysfs virtual filesystem under /sys/module.
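To make the /proc/modules format concrete, here is a minimal sketch of a parser for one line of that file. It assumes the common layout "name size refcount used_by state address", where used_by is a comma-terminated list of dependent modules or "-" when there are none. The struct and function names are hypothetical, and this is not lsmod's actual source. On kernels built without module unloading, the reference count may not be numeric; this sketch simply rejects such lines.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* The fields lsmod typically displays for each module. */
struct module_entry {
    char name[64];
    unsigned long size;   /* module size in bytes */
    int refcount;         /* number of current users */
    char used_by[256];    /* comma-separated dependents, "" if none */
};

/* Parse one /proc/modules line; return 0 on success, -1 on a malformed line. */
static int parse_modules_line(const char *line, struct module_entry *out)
{
    size_t len;

    assert(line != NULL && out != NULL);
    if (sscanf(line, "%63s %lu %d %255s",
               out->name, &out->size, &out->refcount, out->used_by) != 4)
        return -1;

    if (strcmp(out->used_by, "-") == 0) {
        out->used_by[0] = '\0';          /* "-" means no dependents */
    } else {
        len = strlen(out->used_by);
        if (len > 0 && out->used_by[len - 1] == ',')
            out->used_by[len - 1] = '\0'; /* drop the trailing comma */
    }
    return 0;
}
```

A caller would open /proc/modules with fopen, read it line by line with fgets, and pass each line to parse_modules_line. A nonzero refcount is the same "in use" condition that makes rmmod refuse to unload a module.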

Module Types

The Linux way of looking at devices distinguishes between three fundamental device types. Each module usually implements one of these types, and thus is classifiable as a char module, a block module, or a network module. The three classes are:

  • Character devices
    A character (char) device is one that can be accessed as a stream of bytes (like a file). Such a driver usually implements at least the open, close, read, and write system calls. The text console (/dev/console) and the serial ports (/dev/ttyS0 and friends) are examples of char devices. The only relevant difference between a char device and a regular file is that you can always move back and forth in the regular file, whereas most char devices are just data channels, which you can only access sequentially.
  • Block devices
    A block device is a device (e.g., a disk) that can host a filesystem. Linux allows the application to read and write a block device like a char device—it permits the transfer of any number of bytes at a time. As a result, block and char devices differ only in the way data is managed internally by the kernel, and thus in the kernel/driver software interface. Like a char device, each block device is accessed through a filesystem node in the /dev directory. Block drivers have a completely different interface to the kernel than char drivers.
  • Network interfaces
    Any network transaction is made through an interface, that is, a device that is able to exchange data with other hosts. A network interface is in charge of sending and receiving data packets, driven by the network subsystem of the kernel, without knowing how individual transactions map to the actual packets being transmitted. A network driver knows nothing about individual connections; it only handles packets.

Character devices are conventionally located in the /dev directory. Special files for char drivers are identified by a “c” in the first column of the output of ls -l. Block devices appear in /dev as well, but they are identified by a “b”. If you issue the ls -l command, you’ll see two numbers (separated by a comma) in the device file entries before the date of the last modification. These numbers are the major and minor device numbers for the particular device. In the listing below, the major numbers are 1, 4, 7, and 10, while the minor numbers are 1, 3, 5, 64, 65, and 129.

 crw-rw-rw-    1 root     root       1,   3 Apr 11  2022 null
 crw-------    1 root     root      10,   1 Apr 11  2022 psaux
 crw-------    1 root     root       4,   1 Oct 28 03:04 tty1
 crw-rw-rw-    1 root     tty        4,  64 Apr 11  2022 ttyS0
 crw-rw----    1 root     uucp       4,  65 Apr 11  2022 ttyS1
 crw--w----    1 vcsa     tty        7,   1 Apr 11  2022 vcs1
 crw--w----    1 vcsa     tty        7, 129 Apr 11  2022 vcsa1
 crw-rw-rw-    1 root     root       1,   5 Apr 11  2022 zero

The major number identifies the driver associated with the device. For example, /dev/null and /dev/zero are both managed by driver 1, whereas virtual consoles and serial terminals are managed by driver 4.
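The same information that ls -l prints can be read from a program. The following is a minimal user-space sketch, assuming a Linux system with glibc, where the major() and minor() macros come from <sys/sysmacros.h>. The dev_info struct and describe_device function are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/* What ls -l shows for a device node: its type letter and its number pair. */
struct dev_info {
    char type;              /* 'c' for char device, 'b' for block device */
    unsigned int major_no;  /* identifies the driver */
    unsigned int minor_no;  /* identifies the device within that driver */
};

/* Fill *out for a device node; return -1 if path is missing or not a device. */
static int describe_device(const char *path, struct dev_info *out)
{
    struct stat st;

    assert(path != NULL && out != NULL);
    if (stat(path, &st) != 0)
        return -1;

    if (S_ISCHR(st.st_mode))
        out->type = 'c';
    else if (S_ISBLK(st.st_mode))
        out->type = 'b';
    else
        return -1;

    /* st_rdev holds the device number of a special file. */
    out->major_no = major(st.st_rdev);
    out->minor_no = minor(st.st_rdev);
    return 0;
}
```

Called on /dev/null, this should report type 'c' with major 1 and minor 3, matching the first line of the listing above.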

Example

#include <linux/init.h>
#include <linux/module.h>
MODULE_LICENSE("Dual BSD/GPL");

static int hello_init(void)
{
    printk(KERN_ALERT "Hello, world\n");
    return 0;
}

static void hello_exit(void)
{
    printk(KERN_ALERT "Goodbye\n");
}

module_init(hello_init);
module_exit(hello_exit);

This module defines two functions, one to be invoked when the module is loaded into the kernel (hello_init) and one for when the module is removed (hello_exit). The module_init and module_exit lines use special kernel macros to indicate the role of these two functions. Another special macro (MODULE_LICENSE) is used to tell the kernel that this module bears a free license; without such a declaration, the kernel complains when the module is loaded.

The printk function is defined in the Linux kernel and made available to modules. The kernel needs its own printing function because it runs by itself, without the help of the C library. The module can call printk because, after insmod has loaded it, the module is linked to the kernel and can access the kernel’s public symbols (functions and variables).

Initialization and Shutdown

As already mentioned, the module initialization function registers any facility offered by the module. The actual definition of the initialization function always looks like:

static int __init initialization_function(void)
{
    /* Initialization code here */
    return 0;  /* 0 on success, a negative error code on failure */
}
module_init(initialization_function);

Initialization functions should be declared static, since they are not meant to be visible outside the specific file. The __init token is a hint to the kernel that the given function is used only at initialization time. The module loader drops the initialization function after the module is loaded, making its memory available for other uses. Use of __init is optional, but it is worth the trouble. Just be sure not to use it for any function you will be using after initialization completes; the same caution applies to data marked with the similar __initdata tag.

Every nontrivial module also requires a cleanup function, which unregisters interfaces and returns all resources to the system before the module is removed. This function is defined as:

static void __exit cleanup_function(void)
{
    /* Cleanup code here */
}
module_exit(cleanup_function);

The cleanup function has no value to return, so it is declared void. The __exit modifier marks the code as being for module unload only (by causing the compiler to place it in a special ELF section). If your module is built directly into the kernel, or if your kernel is configured to disallow the unloading of modules, functions marked __exit are simply discarded. For this reason, a function marked __exit can be called only at module unload or system shutdown time; any other use is an error. If your module does not define a cleanup function, the kernel does not allow it to be unloaded.

Kernel Symbol Table

We’ve seen how insmod resolves undefined symbols against the table of public kernel symbols. The table contains the addresses of global kernel items—functions and variables—that are needed to implement modularized drivers. When a module is loaded, any symbol exported by the module becomes part of the kernel symbol table. In the usual case, a module implements its own functionality without the need to export any symbols at all. You need to export symbols, however, whenever other modules may benefit from using them. New modules can use symbols exported by your module, and you can stack new modules on top of other modules.

If your module needs to export symbols for other modules to use, the following macros should be used.

EXPORT_SYMBOL(name);
EXPORT_SYMBOL_GPL(name);

Either of the above macros makes the given symbol available outside the module. The _GPL version makes the symbol available to GPL-licensed modules only. Symbols must be exported in the global part of the module’s file, outside of any function, because the macros expand to the declaration of a special-purpose variable that is expected to be accessible globally. This variable is stored in a special part of the module executable (an “ELF section”) that is used by the kernel at load time to find the variables exported by the module.
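The "special-purpose variable in an ELF section" mechanism can be imitated in user space. The sketch below is not the kernel's EXPORT_SYMBOL definition. It assumes GCC or Clang on a Linux ELF target, where the linker defines __start_<section> and __stop_<section> symbols for sections whose names are valid C identifiers. TOY_EXPORT_SYMBOL, struct toy_export, the toy_exports section name, and toy_lookup are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*toy_fn)(int, int);

/* One record per exported symbol: its name and its address. */
struct toy_export {
    const char *name;
    toy_fn fn;
};

/*
 * Hypothetical stand-in for EXPORT_SYMBOL: declares a special-purpose
 * variable and places it in the "toy_exports" ELF section. Like the real
 * macros, it must be used at file scope, outside of any function.
 */
#define TOY_EXPORT_SYMBOL(sym)                                        \
    static const struct toy_export toy_export_##sym                  \
        __attribute__((section("toy_exports"), used)) = { #sym, sym }

/* Section bounds supplied by the linker (an assumption about the toolchain). */
extern const struct toy_export __start_toy_exports[];
extern const struct toy_export __stop_toy_exports[];

static int toy_add(int a, int b) { return a + b; }
static int toy_mul(int a, int b) { return a * b; }

TOY_EXPORT_SYMBOL(toy_add);
TOY_EXPORT_SYMBOL(toy_mul);

/* Walk the section, as a loader might, to resolve a name to an address. */
static toy_fn toy_lookup(const char *name)
{
    const struct toy_export *e;

    assert(name != NULL);
    for (e = __start_toy_exports; e < __stop_toy_exports; e++)
        if (strcmp(e->name, name) == 0)
            return e->fn;
    return NULL;
}
```

Because each record lands in one named section, toy_lookup needs no central list of exports; adding a TOY_EXPORT_SYMBOL line is enough to make a function discoverable, which is the property the kernel macros rely on.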