[ Curiosity,Experimentation ]

Random stuff from the parallel universe of Ones and Zeroes

Posts Tagged ‘Kernel’

Writing a Linux character Device Driver

Posted by appusajeev on June 18, 2011

In this post, we will write a Linux device driver for a hypothetical character device that reverses any string given to it. That is, if we write a string to the device file and then read that file, we get the same string back, reversed (e.g., with myDev as our device, echo "hello" > /dev/myDev ; cat /dev/myDev would print "olleh").
We will simulate the functionality of the device in the driver itself (this is precisely what emulation tools like Daemon Tools and Alcohol do).
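To make the device's behaviour concrete, here is a minimal userspace sketch of the reversal the driver has to perform on whatever was written to it. The function name reverse_bytes is an assumption made for illustration; it is not taken from the driver source.

```c
#include <stddef.h>
#include <string.h>

/* Reverse the first len bytes of buf in place.
 * (reverse_bytes is a hypothetical helper, not part of the driver.) */
void reverse_bytes(char *buf, size_t len)
{
    if (len < 2)
        return;                      /* nothing to swap */
    for (size_t i = 0, j = len - 1; i < j; i++, j--) {
        char tmp = buf[i];
        buf[i] = buf[j];
        buf[j] = tmp;
    }
}
```

Applied to the six bytes that echo "hello" produces (the word plus a trailing newline), the newline ends up at the front of the result.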

Download Code

We will be dealing with

  • Introduction and some basics
  • Creating a device file
  • The driver code
  • Compiling the driver
  • Loading and unloading the driver
  • Testing the driver

This post explains how to write a device driver, the significance of a device file, and its role in the interaction between a user program and the device driver.


Devices in UNIX fall into two categories: character devices and block devices. Character devices can be compared to normal files in that we can read or write an arbitrary number of bytes at a time (although, for the most part, seeking is not supported). They work with a stream of bytes. Block devices, on the other hand, operate on blocks of data, not arbitrary bytes. The usual block size is 512 bytes or a larger power of two. However, block devices can be accessed the same way as character devices; the driver does the block management. (Networking devices do not belong to either category; the interface provided by their drivers is entirely different from that of char/block devices.)

The beauty of UNIX is that devices are represented as files. Both character devices and block devices are represented by files in the /dev directory. This means that you can read from and write to a device by manipulating those files with standard system calls like open, read, write and close.
For example, you could read or write the hard disk directly through a /dev/sd* file, a dangerous act unless you know what you are doing (for those interested, try hexdump -C /dev/sda -n 512; what you see is the boot sector of your hard disk!). As another example, you could look at the contents of RAM by reading /dev/mem.

Every device file represented in this manner is associated with the device driver of that device which is actually responsible for interacting with the device on behalf of the user request. So when you access a device file, the request is forwarded to the respective device driver which does the processing and returns the result.

For instance, you may know the files /dev/zero (an infinite source of zeroes), /dev/null (a data black hole) and /dev/random (a source of random numbers). When you read these files, a particular function in the device driver registered for the file is invoked, and it returns the respective data.
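As a small illustration of treating a device as a file, the sketch below reads from a device file with nothing but open, read and close. The helper name read_from_device is hypothetical and exists only for this example.

```c
#include <fcntl.h>
#include <unistd.h>

/* Read up to len bytes from the file at path into buf.
 * Returns the number of bytes read, or -1 on error.
 * (read_from_device is a made-up helper for illustration.) */
ssize_t read_from_device(const char *path, void *buf, size_t len)
{
    int fd = open(path, O_RDONLY);   /* same call as for a regular file */
    if (fd < 0)
        return -1;

    ssize_t n = read(fd, buf, len);  /* the driver's read callback runs here */
    close(fd);
    return n;
}
```

Reading 16 bytes from /dev/zero this way fills the buffer with zero bytes.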

In our example, we will be developing a character device represented by the device file /dev/myDev.  The mechanisms for creating this file will be explained later.

Under the hood

Now how does Linux know which driver is associated with which file? Each device file has a major number and a minor number associated with it. With the interface used here, no two character drivers share the same major number. When a device file is opened, Linux examines its major number and forwards the call to the driver registered for that number. Subsequent read/write/close calls are processed by the same driver. As far as the kernel is concerned, only the major number matters for choosing the driver. The minor number is used by the driver to identify the specific device instance if it controls more than one device of a type.

To find the major and minor numbers of a device, use the ls -l command as shown below.

ls -l


The leading ‘c’ means it's a character device, 1 is the major number and 8 is the minor number.
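The same numbers can be obtained from a program. The following sketch, which assumes Linux with glibc (for the major and minor macros in sys/sysmacros.h), uses stat to fetch them; char_dev_numbers is a name made up for this example.

```c
#include <sys/stat.h>
#include <sys/sysmacros.h>   /* major(), minor() on Linux/glibc */

/* Store the major and minor numbers of the character device file at path.
 * Returns 0 on success, -1 if path is missing or is not a character device.
 * (char_dev_numbers is a hypothetical helper.) */
int char_dev_numbers(const char *path, unsigned int *maj, unsigned int *min)
{
    struct stat st;

    if (stat(path, &st) != 0 || !S_ISCHR(st.st_mode))
        return -1;

    *maj = major(st.st_rdev);    /* st_rdev holds the device number */
    *min = minor(st.st_rdev);
    return 0;
}
```

For /dev/null on Linux this yields major 1 and minor 3.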

A Linux driver is a Linux module which can be loaded and linked to the kernel at runtime. The driver operates in kernel space and becomes part of the kernel once loaded, the kernel being monolithic. It can then access the symbols exported by the kernel.

When the device driver module is loaded, the driver first registers itself as a driver for a particular device specifying a particular Major number.

It uses the register_chrdev function for registration. The call takes the major number, the device name and the address of a structure of type file_operations (discussed later) as arguments. In our example, we will use a major number of 89. The choice of major number is arbitrary, but it has to be unique on the system.

The syntax of register_chrdev is

int register_chrdev(unsigned int major,const char *name,struct file_operations *fops)

The driver is unregistered by calling the unregister_chrdev function.

Since a device driver is a kernel module, it should implement the init_module and cleanup_module functions. The register_chrdev call is made in init_module and the unregister_chrdev call is made in cleanup_module.

The register_chrdev call returns a non-negative number on success. If we specify the Major number as 0, the kernel returns a Major number unique at that instant which can be used to create a device file.
A device file can be created either before the driver is loaded if we know the major and minor number beforehand or it can be created later after letting the driver specify a major number for us.

Apart from those, the driver must also define callback functions that are invoked when file operations are performed on the device file. That is, it must define functions that the kernel invokes when a process uses the open, read, write or close system calls on the file. Every driver must implement functions for processing these requests.
When register_chrdev is called, the third argument is a structure that contains the addresses of these callback functions. The structure is of type file_operations and has 4 main fields that should be set: read, write, open and release. Each field must be assigned the address of the function to call when the read, write, open and close system calls, respectively, are made on the file. For example:

file_operations structure initialisation


It is important to note that all these callback functions have a predefined prototype although the name can be any.
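The idea of a table of callbacks can be tried out in userspace. The following is only an analogy, not kernel code: struct demo_ops and the demo_* functions are invented for this sketch, and the real file_operations callbacks have different prototypes (they receive a struct file pointer, a user-space buffer, and so on). What it does show is how a caller that only knows the table ends up running the driver's functions.

```c
#include <stddef.h>
#include <string.h>

/* Userspace stand-in for struct file_operations (hypothetical, simplified). */
struct demo_ops {
    int  (*open)(void);
    long (*write)(const char *src, size_t len);
    long (*read)(char *dst, size_t len);
    int  (*release)(void);
};

static char   store[64];   /* data held by the simulated device */
static size_t stored;      /* number of valid bytes in store */

static int demo_open(void)    { return 0; }
static int demo_release(void) { return 0; }

/* Keep a copy of what was written, truncated to the buffer size. */
static long demo_write(const char *src, size_t len)
{
    if (len > sizeof store)
        len = sizeof store;
    memcpy(store, src, len);
    stored = len;
    return (long)len;
}

/* Hand the stored bytes back in reverse order. */
static long demo_read(char *dst, size_t len)
{
    if (len > stored)
        len = stored;
    for (size_t i = 0; i < len; i++)
        dst[i] = store[stored - 1 - i];
    return (long)len;
}

/* The table: callers go through these pointers, never the names above. */
const struct demo_ops demo_device_ops = {
    .open    = demo_open,
    .write   = demo_write,
    .read    = demo_read,
    .release = demo_release,
};
```

The function names are free to choose, as long as each one matches the prototype its slot in the table expects.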

Creating a device file

A device file is a special file. It can’t just be created using cat or gedit or shell redirection for that matter. The shell command mknod is usually used to create device file. The syntax of mknod is

mknod path type major minor

path: the path where the file is to be created. The device file does not have to be created in the /dev directory; that is a mere convention. A device file can be created just about anywhere.

type: -‘c’ or ‘b’ . Whether the device being represented is a character device or a block device. In our example, we will be simulating a character device and hence we choose ‘c’.

major, minor:- the major and minor number of the device.

Here's how:

mknod command


chmod, though not strictly necessary, is done because otherwise only processes with root permission can read or write our device file.

The driver code

Given below is the code of the device driver


Device Driver Code


For debugging, I have included some printk messages in the code. To see those messages while the driver is in action, run dmesg | tail.

Compiling the driver

A Linux module cannot be compiled the way we compile normal C files; cc filename.c won't work. Compiling a Linux module is a separate process of its own. We use the kernel Makefile for compilation. The makefile we will be using is:

makefile for module compilation


Here, we are making use of the kbuild mechanism used for compiling the kernel.
The result of compilation is a ko file (kernel object) which can then be loaded dynamically when required.

Loading and Unloading the Driver

Once compilation is complete, we can load the module with either insmod or modprobe. insmod takes the path of the module file (insmod myDev.ko, assuming the current directory contains the compiled module). modprobe takes a module name (modprobe myDev), looks the module up in the standard module directory under /lib/modules, and automatically loads any modules it depends on before loading our module (so our module and its dependencies must be present in the standard path!). insmod lacks this feature.

To test if the driver has been loaded successfully, do cat /proc/modules and cat /proc/devices.  We should see our module name in the first case and device name in the second.

cat /proc/modules


cat /proc/devices


To unload the driver, use the rmmod command (rmmod myDev).

Testing the driver

To test the driver, we try writing something to the device file and then reading it. For example,

Testing the driver


See the output. (The output looks ‘ugly’ because echo automatically appends a newline character to the string. When the driver reverses the string, the newline moves to the front and there is no newline at the end.)

To see how this can be done from our program, I wrote a demo program given below

Interacting with the driver
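A user program talks to the driver through the device file with ordinary system calls. The sketch below is one way such a demo could look; it is not the author's program, and the helper name write_then_read is an assumption.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Write msg to the file at path, then reopen it and read back up to
 * outlen - 1 bytes into out, which is NUL-terminated.
 * Returns the number of bytes read, or -1 on error.
 * (write_then_read is a hypothetical helper.) */
ssize_t write_then_read(const char *path, const char *msg,
                        char *out, size_t outlen)
{
    if (outlen == 0)
        return -1;

    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;
    ssize_t w = write(fd, msg, strlen(msg));   /* driver's write callback */
    close(fd);
    if (w < 0)
        return -1;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    ssize_t n = read(fd, out, outlen - 1);     /* driver's read callback */
    close(fd);
    if (n < 0)
        return -1;

    out[n] = '\0';
    return n;
}
```

With the driver loaded, write_then_read("/dev/myDev", "hello", out, sizeof out) would be expected to leave "olleh" in out; on a regular file it simply returns what was written.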


Compile it normally (or run make test), run ./test some_string and see the output.

Testing the driver


Note: You need to be root to load and unload the module and to create the device file with mknod. Compiling the module does not normally require root.

The driver interface presented here is an old one. There is a newer one (based on struct cdev), but the old one is still supported.

Posted in Kernel | 69 Comments »

Writing a 16-bit Real mode OS [NASM]

Posted by appusajeev on January 27, 2011

This article is about writing a minimal 16-bit, real-mode, DOS-like operating system that boots off a pen drive and provides a shell to run pure binary executables (a.k.a. COM files in the DOS era), with a custom file system. This means that the OS can run a COM file directly if you have one. COM files are pure binary files in the sense that they have no header and contain machine instructions only. For demonstration, I have developed some sample executables which can be run using our OS (apps like a clone of the unix echo, a register dump, etc.). See the end of the post for some pictures of the OS in action.

Download Source

The post explains how to

  • Write a bootloader
  • Write a shell, a kernel placeholder
  • Implement a basic filesystem
  • Write the OS to disk
  • Boot from the OS

The OS is written for the open source NASM assembler on Linux. To follow along, you need some understanding of the x86 boot process, bootloaders and the real mode of processor operation.

Booting Process Basics

When the system is powered on, the processor starts executing at a fixed address called the reset vector, which on x86 lies 16 bytes below the top of the address space (F000:FFF0, i.e. physical address 0xFFFF0, on the original 8086). The BIOS ROM is mapped at this address, and the instruction stored there is simply a jump to the start of the BIOS code, which traditionally lives in the segment beginning at 0xF0000. The BIOS code then takes control. It performs the Power On Self Test (POST) to verify that the devices work and to initialize them, and then carries out operations like setting up the Interrupt Vector Table (IVT), setting up certain information in RAM (the BIOS Data Area), finding a boot device and loading the bootloader.

BIOS provides certain basic interrupts for the programmer for basic functions like loading and storing disk sectors, reading keyboard, printing to screen etc. These interrupts are similar to DOS interrupts but are not DOS interrupts.

Back to the topic: once these steps are over, the BIOS iterates through the list of boot devices in the configured boot order and searches each one for a bootloader. If a bootloader is found, it's loaded into memory and given control.

A point to note is that whenever the system is powered on, the processor, be it a Core 2 Duo, a Core i7 or anything else, operates in 16-bit real mode by default until it is explicitly asked to switch to 32-bit protected mode (by setting up a GDT, setting the PE bit in the CR0 register and doing a far jump so that CS refers to a segment descriptor).

Bootloader Basics

A bootloader is basically a 512-byte piece of machine code that is present in the first sector of a boot device. The bootloader is the first user-defined program that is loaded into memory and given control. It is the duty of the bootloader to load the OS into memory and pass control to it.

A bootloader must be exactly 512 bytes in size. The BIOS identifies a valid bootloader by means of a signature: the 511th byte of the bootloader (offset 510) should contain the value 0x55 and the 512th byte (offset 511) should contain the value 0xAA.
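The signature check is easy to express in code. Here is a minimal sketch of the test a BIOS applies to a candidate boot sector; has_boot_signature and SECTOR_SIZE are names chosen for this example.

```c
#include <stdint.h>

#define SECTOR_SIZE 512

/* Return 1 if the 512-byte sector ends with the boot signature 0x55, 0xAA
 * (bytes at offsets 510 and 511), and 0 otherwise.
 * (has_boot_signature is a hypothetical helper.) */
int has_boot_signature(const uint8_t sector[SECTOR_SIZE])
{
    return sector[SECTOR_SIZE - 2] == 0x55 &&
           sector[SECTOR_SIZE - 1] == 0xAA;
}
```

Read as a little-endian 16-bit word at offset 510, the same two bytes form the value 0xAA55.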

A bootloader will always be loaded at physical address 0x7C00 in RAM. Usually, this corresponds to a CS:IP pair of 0:0x7C00, but some BIOSes set CS:IP to 0x7C0:0, which points to the same address but causes trouble when we have to specify the offset at which our code will be loaded.
This is easily dealt with by defining an offset of 0 and then making a far jump to 0x7C0:start, where start is the label of the instruction following the far jump. This jump sets CS = 0x7C0.

(Note: Physical address = CS x 16 + IP)
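The formula can be checked with a few lines of C. This is a small sketch of the real-mode address calculation; the function name real_mode_phys is made up here, and the 20-bit wrap-around of the original 8086 is deliberately ignored.

```c
#include <stdint.h>

/* Physical address for a real-mode segment:offset pair:
 * segment * 16 + offset. (real_mode_phys is a hypothetical helper.) */
uint32_t real_mode_phys(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;   /* << 4 multiplies by 16 */
}
```

Both 0:0x7C00 and 0x7C0:0 come out as 0x7C00, which is why the two CS:IP pairs refer to the same byte, and 0x1000:0 comes out as 0x10000.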

Our Bootloader

Our bootloader, present in the first sector of the pendrive  (the mechanism for writing the bootloader and the OS into the pendrive will be discussed later) will be loaded into memory and execution will immediately start.

Our bootloader serves 3 purposes: it displays a welcome message, loads the OS and the file table from disk, and jumps to the OS entry point. The OS is loaded at 0x1000:0 and the file table at 0x2000:0.

Here's our bootloader source


Our Bootloader


The BIOS provides interrupts for displaying characters as well as strings on screen. Here I have used the character display interrupt to write a function that prints a zero-terminated string, like in C.

The interrupt for displaying character on screen is

INT 0x10, BX = character color, AL = character to display, AH = 0x0E

Our bootloader, OS and executables are stored on disk. For execution, they need to be loaded into RAM. The bootloader is loaded by the BIOS; the rest we have to load ourselves when required, through the sector loading interrupt. A block of 512 bytes (in this case; the size need not always be 512) is called a sector.
For loading sectors off the disk into RAM, the BIOS provides an interrupt, INT 0x13, and the parameters to set are

AH = 2

DL = drive number (0x80 for a hard disk, and this applies to our pen drive)

DH = head number

CH = track number

CL= sector number of the sector to be loaded

AL = number of sectors to load

ES = segment to load the sector

BX = offset from ES where the sector is to be loaded

Our bootloader occupies the first sector, the file table the second sector and the OS the third sector.

The bootloader loads the OS to address 0x1000:0 and the file table to address 0x2000:0.

Implementing Filesystem

The shell provided by our OS enables the user to type an executable name and run it. For that, the OS must know where exactly each executable is located on disk, and this is where the concept of a file system surfaces. A file system, in simple words, is a specification that tells how to locate a file on disk given its name.

For our purpose, I made a simple filesystem. Each file is mapped to a sector on disk to form a string with the following structure


This file table is loaded to 0x2000:0 by the bootloader. When an executable is requested, this memory is scanned to find the sector where the file is stored on disk, and that sector is brought into RAM.

Our OS/Shell

So, the bootloader has loaded our OS at 0x1000:0 and the file table at 0x2000:0, and it makes a far jump to 0x1000:0, the OS entry point. The working of the shell is simple: using the BIOS interrupt to read a character, we read the executable name from the user. The file table loaded at 0x2000:0 is scanned for a match, the associated sector number is read, that sector is loaded into memory at 0x9000:0 and the shell makes a far jump to this address. If the name entered by the user cannot be matched in the file table, an error message is shown. After the executable completes, it must make a far jump to 0x1000:0, the OS entry point, to return to the OS. This step is functionally similar to terminating a program in DOS (AH = 0x4C, INT 0x21).

The interrupt for reading a character from a keyboard is INT 0x16 with AH=0, the read character will be available in AL.

Here's our OS source


Our Shell


Writing the OS to disk

Okay, so everything is done. The final thing to do before booting from the pen drive is to write our OS to it. This can't be done by simply copying the files to the pen drive, for two reasons:

1. We cannot write the bootloader to the first sector using this method. When we ask the host OS to copy a file, it copies the file to some free sectors and adds this information to its file table.

2. If we were to copy the files to the pen drive, our files would have to live in a filesystem known to the host OS, like FAT, which means we would have to write code to parse that filesystem at boot time, which is overkill for this OS.

So, to write the bootloader and the custom file system, we need low-level disk access, for which the obvious choice is Linux. Linux treats devices like files which can be read and written. This is a really powerful and useful concept. Commands like ‘dd’ use this concept. The file representing our hard disk would be something like /dev/sda, and something like /dev/sdc for the pen drive; it varies. To find the file allocated, run dmesg|tail in the terminal after the pen drive is attached.

Now whatever we write to the device file will be written as such to the device, which is exactly what we want. To write the bootloader, write the compiled bootloader into the first 512 bytes of the device file; it ends up in the first 512 bytes of the disk. This can be done using C file operations. Pretty easy, isn't it? Now to write the file table and the OS to the 2nd and 3rd sectors of the disk, just write these files to the next two 512-byte blocks of the device file. The same procedure applies to writing the executables to disk.

The following C code does the job of writing everything to disk. It takes the path of the device file, the bootloader, the OS and the list of executables as command line arguments.

Copy Program


Booting the OS

Plug in the pen drive, find its corresponding device file (use dmesg|tail), open the makefile and substitute its path in the dev variable. Then run ‘make’, reboot the system and choose to boot from mass storage, and you will see the sweet sight of our OS booting. At the prompt, type ‘help’ and hit enter to see the list of available executables, like reg, which dumps the registers, and echo, which echoes back a string read from the user.

Here are some pics of the OS in action.

OS in action


Our OS in action


Posted in Kernel | 42 Comments »

Implementing a System call in Linux Kernel 2.6.35

Posted by appusajeev on November 13, 2010

As we know, system calls are a set of services/functions provided to the programmer by the OS. These functions can be invoked from any language that provides some interface to the system call mechanism of the OS. Some common Linux system calls are open, read and write.
While executing a system call, the calling process moves from user space to kernel space, and back to user space when it completes executing the call.

There are around 338 system calls in the Linux kernel by default. Presented here is a howto on adding a new one to the kernel, a 339th one, so that it will be available globally to any program. As an example, we will implement the strcpy function as a system call so that it can be used without including string.h.

Obviously, you need the kernel source tree since some kernel modification is involved. Get it from kernel.org (any kernel version higher than would work fine) and untar it to get linux- The paths used below are all relative to this path.
We need to edit 4 files and create 2 files.

The Code

First, let's start off by writing the code for strcpy. We need to include the file linux/linkage.h because it contains the macro asmlinkage, which means that the system call expects its arguments on the stack and not in registers. printk is the kernel's counterpart of printf, but with certain peculiar properties.

Code for the system call


“<1>” tells printk the priority of the message. It corresponds to KERN_ALERT, the second-highest level (the highest is “<0>”, KERN_EMERG).

Create a folder named ‘test’ in the root of the Linux source directory and save this file as strcpy.c in that directory. Create a Makefile in that directory containing only the line

obj-y := strcpy.o

Thus, the strcpy.c file and the Makefile are now both present in the test directory.


The Edits

The following files need to be edited.

1. linux-

Append to the file the following line

.long sys_strcpy



2. linux-

This file contains the unique number associated with each system call. We can see the names of all the system calls and the number associated with each. After the last system call-number pair (around line 345), add a line

#define __NR_strcpy  338

(if 337 was the number associated with the last system call).
Then replace the NR_syscalls directive, which denotes the total number of system calls, with the previous value incremented by 1, i.e. the new value is

#define NR_syscalls 339

Note down the number 338, we need it later.



3. linux-

This file contains the prototypes of all the system calls. Here we append the prototype of our function to the file,
i.e. we add the line

asmlinkage  long sys_strcpy(char *dest,char *src);



4. Makefile

Open Makefile present in the root of source directory and find the area where core-y is defined and add the folder test to the end of that line as shown below



Next, compile the kernel. Assuming you are familiar with kernel compilation, execute
make bzImage -j4
The last argument is optional and is intended to speed up compilation on multi-core CPUs.
Once compilation is complete, install the kernel by executing the following command with root permission
make install
Once the kernel image is installed to /boot, reboot the system.


Now we need to check the newly done system call. Run the code below and feel the satisfaction 🙂

System call test


The kernel has performed the strcpy for us. Cool, isn't it?
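A user program reaches a system call that has no C library wrapper through syscall(2), passing the call number followed by the arguments. Since the new call exists only on the modified kernel, the sketch below demonstrates the mechanism with a call every kernel has, SYS_getpid; getpid_via_syscall is a name invented here.

```c
#include <sys/syscall.h>   /* SYS_getpid and the other call numbers */
#include <unistd.h>        /* syscall() */

/* Invoke getpid by number through the generic syscall() entry point
 * instead of the libc wrapper. (getpid_via_syscall is hypothetical.) */
long getpid_via_syscall(void)
{
    return syscall(SYS_getpid);
}
```

On the rebuilt kernel, the new call would be invoked the same way with the number noted earlier, for example syscall(338, dest, src).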

Now execute the command dmesg; you can see ‘done’ printed at the end. printk by default doesn't print to the terminal. It writes to the kernel ring buffer, which dmesg prints.

Try putting an infinite loop inside a system call: the system just drops dead. Unless it is built with kernel preemption enabled, the Linux kernel does not preempt code running in kernel mode.


Posted in Kernel | 26 Comments »