Tuesday, January 20, 2026

Ring Buffers & ioctl() – Time-Series Telemetry in Kernel Space

 


The Problem: One-Shot Temperature Reads Aren't Enough

At Cepheid, we shipped diagnostic instruments that had to detect thermal anomalies in real time. The GPU would throttle mid-test, and field engineers couldn't debug it because they only had the last temperature reading. They needed history: the thermal trajectory, not just a snapshot.

This is where character devices hit their limit. In the previous post, /dev/tempmon gave you one temperature each time you cat it. Useful for a quick check, useless for diagnostics. You need a buffer that accumulates samples over time, so you can read the last 100 samples and see what happened.

Enter the ring buffer (circular buffer). Fixed-size memory that wraps around. When full, it overwrites the oldest data. Kernel-side it's fast; userspace sees a continuous stream of recent history.


What Is a Ring Buffer?

 

A ring buffer is a fixed-size array with two pointers:

  • head: Next write position
  • tail: Oldest unread data

Array: [sample_0, sample_1, ..., sample_1023]
       ^                            ^
       |                            |
      tail                         head

When head reaches the end of the array, it wraps: head = (head + 1) % BUFFER_SIZE
If head catches up with tail, the buffer is full, so tail advances by one slot too (with the same wraparound) and the oldest sample is dropped.
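
The wrap-and-drop behavior is easy to check outside the kernel. Here is a minimal userspace sketch, not the module's code: the names ring_push, ring_dump and RING_SIZE are made up for illustration, and the buffer is shrunk to 4 slots so the overwrite shows up after a few pushes.

```c
#include <assert.h>
#include <stdio.h>

#define RING_SIZE 4  /* hypothetical small size so wraparound shows up quickly */

struct ring {
    int data[RING_SIZE];
    int head;  /* next write position */
    int tail;  /* oldest stored sample */
};

/* Store one value; if the buffer is full, the oldest value is dropped. */
void ring_push(struct ring *r, int value)
{
    assert(r != NULL);
    r->data[r->head] = value;
    r->head = (r->head + 1) % RING_SIZE;
    if (r->head == r->tail)                 /* head caught up with tail */
        r->tail = (r->tail + 1) % RING_SIZE;
}

/* Print the stored values from oldest to newest. */
void ring_dump(const struct ring *r)
{
    assert(r != NULL);
    for (int i = r->tail; i != r->head; i = (i + 1) % RING_SIZE)
        printf("%d ", r->data[i]);
    printf("\n");
}
```

Starting from a zeroed struct ring and pushing 1 through 6 leaves 4, 5 and 6 in the buffer: with 4 slots and the head == tail means empty convention, only 3 values are retained at any time.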

Why ring buffers for GPU telemetry?

  • Fixed memory footprint (no kmalloc/kfree per sample)
  • O(1) write (just increment pointer)
  • No garbage collection or allocation fragmentation
  • Wrap-around is automatic modulo arithmetic

For a GPU sampled at 1 kHz, a 1024-slot buffer holds about one second of history (1023 samples, since this design keeps one slot free to tell full from empty). Old data is lost gracefully. No surprises.
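
As a rough check of that sizing arithmetic, here is a small sketch in plain C. The function name history_ms is made up for illustration, and it assumes the convention used in this post, where head == tail means empty and one slot therefore stays unused.

```c
#include <assert.h>
#include <stdint.h>

/* Milliseconds of history a ring buffer can hold.
 * slots: array size; sample_hz: samples per second.
 * One slot stays unused because head == tail means "empty". */
uint64_t history_ms(uint64_t slots, uint64_t sample_hz)
{
    assert(slots > 0 && sample_hz > 0);
    return (slots - 1) * 1000 / sample_hz;
}
```

With 1024 slots at 1000 Hz this gives 1023 ms. Dropping the rate to 10 Hz stretches the same buffer to roughly 102 seconds, which is the usual trade-off when sizing the array.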


The Code: Adding Ring Buffer to Character Device

We already have this working. Let's walk through the key parts:

Data Structure

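#define BUFFER_SIZE 1024  // slots in the ring; one stays unused to tell full from empty

struct temp_sample{
    u64 timestamp;      // ktime_get_ns() at capture time
    int temp_celsius;   // temperature in degrees Celsius
};
static struct temp_sample sample_buffer[BUFFER_SIZE];
static int head = 0, tail = 0;  // head: next write slot, tail: oldest sample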

Each sample captures when (nanosecond timestamp via ktime_get_ns()) and what (temperature in Celsius).

Adding a Sample

static void add_sample(int temp){
    sample_buffer[head].timestamp = ktime_get_ns();
    sample_buffer[head].temp_celsius = temp;
    head = (head + 1) % BUFFER_SIZE;
    if (head == tail) tail = (tail + 1) % BUFFER_SIZE;
}

Line by line:

  • Capture timestamp (kernel monotonic time, which never jumps when the wall clock is stepped)
  • Store temperature
  • Advance head with wraparound
  • If head catches tail (buffer full), advance tail to drop oldest sample

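This is a couple of stores and an index update, so it is cheap and O(1). There is no locking yet, which is only safe while a single context touches head and tail. Two processes calling read() at the same time can race on them, so a real device would protect the buffer with a spinlock.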

Reading Via Character Device

static ssize_t temp_read(struct file *filp, char __user *buf,
                        size_t len, loff_t *off){
    char temp_data[64];
    int temp = 45 + (get_random_u32() % 30);  // Simulate temp: 45..74 C
    int bytes;

    if (*off > 0) return 0;  // Already read, return EOF

    add_sample(temp);  // Add to ring buffer

    // scnprintf returns the bytes actually written, never more than the buffer holds
    bytes = scnprintf(temp_data, sizeof(temp_data),
                "Temperature: %d C [Buffered: %d samples]\n",
                temp, (head - tail + BUFFER_SIZE) % BUFFER_SIZE);

    if ((size_t)bytes > len) bytes = len;  // Never copy more than the caller asked for

    if (copy_to_user(buf, temp_data, bytes)){
        return -EFAULT;  // Copy failed
    }

    *off += bytes;
    return bytes;
}

Each read() call:

  1. Adds one temperature sample to the ring buffer
  2. Reports current temp + buffer occupancy
  3. Returns formatted string to userspace
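
The read handler above reports only the latest reading and the occupancy. As a sketch of how the buffered history itself could be copied out, here is a userspace model in plain C. The function snapshot_recent is hypothetical and not part of the module, and unsigned long long stands in for the kernel's u64.

```c
#include <assert.h>
#include <stddef.h>

#define BUFFER_SIZE 1024

struct temp_sample {
    unsigned long long timestamp;  /* stands in for the kernel's u64 */
    int temp_celsius;
};

/* Copy up to max_out of the most recent samples into out, oldest first.
 * Returns the number of samples copied. */
size_t snapshot_recent(const struct temp_sample *buf, int head, int tail,
                       struct temp_sample *out, size_t max_out)
{
    assert(buf != NULL && out != NULL);
    size_t stored = (size_t)((head - tail + BUFFER_SIZE) % BUFFER_SIZE);
    size_t n = stored < max_out ? stored : max_out;

    /* Start n slots behind head, wrapping if that goes below index 0. */
    int start = (head - (int)n + BUFFER_SIZE) % BUFFER_SIZE;
    for (size_t i = 0; i < n; i++)
        out[i] = buf[(start + (int)i) % BUFFER_SIZE];
    return n;
}
```

In a driver, the copy into out would happen under whatever lock protects head and tail, and the result would then go to userspace with copy_to_user(). Those parts are left out here so the index arithmetic can be tested on its own.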

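The (head - tail + BUFFER_SIZE) % BUFFER_SIZE formula handles wraparound. If head=10, tail=5, that's 5 samples. If head=5, tail=10 (wrapped), that's 1019 samples (5 - 10 + 1024 = 1019). Adding BUFFER_SIZE before the modulo keeps the intermediate value non-negative. Because head == tail is reserved for "empty", the count tops out at 1023, not 1024.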
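
The same formula as a standalone function, so the two cases can be checked directly. This is a userspace sketch: buffered_count is a name chosen for illustration, and BUFFER_SIZE is assumed to be 1024 as in the example numbers.

```c
#include <assert.h>

#define BUFFER_SIZE 1024

/* Number of samples currently stored between tail and head. */
int buffered_count(int head, int tail)
{
    assert(head >= 0 && head < BUFFER_SIZE);
    assert(tail >= 0 && tail < BUFFER_SIZE);
    /* Adding BUFFER_SIZE keeps the left operand non-negative;
     * in C, % applied to a negative int gives a negative result. */
    return (head - tail + BUFFER_SIZE) % BUFFER_SIZE;
}
```

buffered_count(10, 5) is 5, buffered_count(5, 10) is 1019, and the largest value it can return is 1023, reached when head sits one slot behind tail.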


ioctl() – Kernel Commands from Userspace

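Our character device is read-only so far. But you also need to control the kernel module: get the sample count, clear the buffer, set thresholds. Enter ioctl() (input/output control), a syscall for device-specific commands. The handler below implements the first two. A non-negative return value from the handler is passed straight back to the caller, which is how command 0 reports the count.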

static long temp_ioctl(struct file *filp, unsigned int cmd, unsigned long arg){
    int count;

    switch(cmd){
        case 0:  // Get sample count
            count = (head - tail + BUFFER_SIZE) % BUFFER_SIZE;
            return count;
        
        case 1:  // Clear buffer
            head = tail = 0;
            return 0;
        
        default:
            return -EINVAL;  // Invalid command
    }
}
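
Bare command numbers like 0 and 1 work for a demo, but the kernel's convention is to build command numbers with the _IO family of macros from <linux/ioctl.h>, which pack a driver-specific magic byte and a sequence number into the value. A sketch of what that could look like follows. The names and the magic byte 0xA7 are assumptions made for this example, not taken from the module, and a real driver should check its magic byte against the kernel's ioctl number documentation.

```c
#include <linux/ioctl.h>

/* Hypothetical names and magic byte, not taken from the module. */
#define TEMPMON_IOC_MAGIC   0xA7
#define TEMPMON_GET_COUNT   _IO(TEMPMON_IOC_MAGIC, 0)  /* sample count as return value */
#define TEMPMON_CLEAR       _IO(TEMPMON_IOC_MAGIC, 1)  /* reset head and tail */

/* True if cmd carries this driver's magic byte. */
int tempmon_cmd_is_ours(unsigned int cmd)
{
    return _IOC_TYPE(cmd) == TEMPMON_IOC_MAGIC;
}
```

With this scheme the switch in temp_ioctl would use TEMPMON_GET_COUNT and TEMPMON_CLEAR as case labels, and a check like tempmon_cmd_is_ours() lets the handler reject commands meant for some other driver before it reaches the switch.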

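How to use from userspace (ioctl_request is shorthand for any small program that opens /dev/tempmon and calls ioctl() with the given command number):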

# Get sample count
ioctl_request 0  # Returns # of samples in buffer

# Clear buffer
ioctl_request 1  # Resets head/tail
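
In C, issuing one of these commands takes an open(), an ioctl() and a close(). The helper below is a minimal sketch under that assumption. The name tempmon_request is made up for this example, and it uses the raw command numbers 0 and 1 from the handler above.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Open the device, issue one ioctl command, close it again.
 * Returns the driver's return value, or -1 if open() or ioctl() fails
 * (errno then holds the reason). */
int tempmon_request(const char *path, unsigned long cmd)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    int ret = ioctl(fd, cmd);  /* commands 0 and 1 take no argument */

    int saved_errno = errno;   /* keep ioctl's errno across close() */
    close(fd);
    errno = saved_errno;
    return ret;
}
```

Assuming the module is loaded and the caller may open the device, tempmon_request("/dev/tempmon", 0) would return the sample count and tempmon_request("/dev/tempmon", 1) would clear the buffer.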

In the file_operations struct, register this:

static struct file_operations fops = {
    .owner = THIS_MODULE,
    .read = temp_read,
    .unlocked_ioctl = temp_ioctl,  // <-- Add this
};

Module Initialization & Cleanup

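Registration is the same as before. Return values are left unchecked here to keep the listing short; a production module should check each call and undo the earlier steps on failure. The one-argument class_create() is the form used by recent kernels (6.4 and later, as far as I know); older kernels take THIS_MODULE as the first argument: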

static int __init temp_init(void){
    alloc_chrdev_region(&dev_num, 0, 1, DEVICE_NAME);
    cdev_init(&temp_cdev, &fops);
    cdev_add(&temp_cdev, dev_num, 1);

    temp_class = class_create(DEVICE_NAME);
    device_create(temp_class, NULL, dev_num, NULL, DEVICE_NAME);

    printk(KERN_INFO "Temp Monitor: Device with ring buffer created at /dev/%s\n", DEVICE_NAME);
    return 0;
}

static void __exit temp_exit(void){
    device_destroy(temp_class, dev_num);
    class_destroy(temp_class);
    cdev_del(&temp_cdev);
    unregister_chrdev_region(dev_num, 1);
    printk(KERN_INFO "TempMonitor: Device removed\n");
}

Building & Testing

make
sudo insmod temp_monitor_character_device_ring_buffer.ko

# Read multiple times, see buffer fill up
for i in {1..10}; do cat /dev/tempmon; sleep 0.1; done

# Output: Temperature: X C [Buffered: N samples]

Each read adds a sample. After 10 reads, you see [Buffered: 10 samples].


Why This Matters

Ring buffers are invisible infrastructure in production systems:

  • Prometheus uses them for metrics (fixed-size circular storage)
  • NVIDIA driver telemetry uses them for thermal events
  • Any telemetry system needs them to handle burst capture without malloc/free

This module demonstrates:

  • Time-series data collection in kernel space
  • Efficient wraparound logic (no allocations)
  • ioctl() for device-specific control
  • Understanding of kernel constraints (fixed memory, no sleep)

Next: Userspace Library Wrapper

Character devices work, but manually managing /dev/tempmon from userspace is tedious. Next post: build a C library (libtempmon) that:

  • Opens/closes the device
  • Reads current + buffered samples
  • Handles ioctl() commands
  • Exposes a clean API: tempmon_get_samples(), tempmon_clear_buffer()

Then: Python bindings, anomaly detection, CUDA acceleration.


Code Repository

NVTherm on GitHub

Branch: main
kernel_module/temp_monitor_character_device_ring_buffer.c


