To buffer or not to buffer

May 19, 2024

Musings

The DMA-based UART driver in JeeH is an interesting example of the interaction between hardware, memory use, and blocking behaviour.

Reading data

In JeeH, the way to read bytes from the UART is to send a message to its device driver and wait for the reply. The mTag field is 'R', with the mLen and mPtr fields starting off as zero. The driver uses a small ring buffer internally, which is filled via DMA as bytes arrive, without involving the CPU.

There are three cases which trigger an IRQ and produce a reply:

There is some data and the middle of the ring buffer has been reached.
There is some data and the end of the ring buffer has been reached.
There is some data and the input has gone idle for at least one character time.

In each of these cases, the reply is returned with mPtr set to the first available byte and mLen set to the number of contiguous bytes available. Since it’s a ring buffer, data will end up arriving in small pieces.
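To illustrate why replies come in pieces, here is a minimal sketch of how the contiguous run could be computed. The `Ring` struct and the names `head` and `tail` are hypothetical and not taken from JeeH's source: `head` stands for the position the DMA will write next, `tail` for the first byte the app has not yet released.

```cpp
#include <stddef.h>
#include <stdint.h>

// Hypothetical ring buffer state; the names are illustrative, not JeeH's.
struct Ring {
    static constexpr size_t size = 16;
    uint8_t data [size];
    size_t head = 0;  // next position the DMA will write (assumed)
    size_t tail = 0;  // first byte not yet released by the app (assumed)
};

// Number of bytes readable from tail without wrapping past the buffer end.
// In this sketch, head == tail is treated as "empty".
size_t contiguous (Ring const& r) {
    return r.head >= r.tail ? r.head - r.tail : Ring::size - r.tail;
}
```

When the data wraps around the end of the buffer, only the part up to the end is reported in this sketch; the remainder would show up in a later reply.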

But there is a twist: the available data is not removed when the reply is sent, i.e. the ring buffer pointer is not advanced. Instead, the driver expects to see a new read request with the mPtr and mLen indicating which part has been accepted by the app, and can therefore be released from the ring buffer.

In normal use, we simply repeat the read request, leaving mPtr and mLen as is. This will release everything read, and start a fresh read cycle.

This design allows reading only as much as needed. The most common example is reading only up to and including a separator, such as a newline. Otherwise, we would need to either read one character at a time, or read too much and save the excess data in another buffer, to be consumed later.

In this design, there is no need for an extra buffer. We also don’t need to specify how much data to read up front: the reply will tell us how much is available, based on the three cases mentioned earlier, and we “take as much as we like”.
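As a rough sketch of “taking as much as we like”, the helper below trims a read reply so that only the bytes up to and including the first newline are accepted. `Msg` is a stand-in with just the fields named in the text, and `acceptLine` is a hypothetical helper; neither is JeeH's actual definition.

```cpp
#include <stdint.h>
#include <string.h>

// Stand-in for JeeH's Message, reduced to the fields named in the text.
struct Msg {
    char mTag;
    uint16_t mLen;
    uint8_t* mPtr;
};

// Trim a read reply so that only the bytes up to and including the first
// newline are accepted. Returns true if a newline was found. Otherwise the
// reply is left untouched, i.e. the whole chunk is accepted.
bool acceptLine (Msg& m) {
    auto nl = static_cast<uint8_t*>(memchr(m.mPtr, '\n', m.mLen));
    if (nl == nullptr)
        return false;
    m.mLen = static_cast<uint16_t>(nl - m.mPtr + 1);
    return true;
}
```

Passing the trimmed message to the next read request would release just those bytes, with anything after the newline staying in the ring buffer for the next reply.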

There is one exceptional case, though: what if the app no longer wants to read more data? It needs to tell the driver to release what it last read, without starting a new request. This special case is handled by sending a read request with mLen > 0 and mPtr == nullptr.
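A minimal sketch of how such a release-only request could be prepared from the last reply. `Msg` is a stand-in with only the fields named in the text, and `makeRelease` is a hypothetical helper, not part of JeeH. Sending the message is left out, and skipping the request when nothing was accepted is an assumption.

```cpp
#include <stdint.h>

// Stand-in for JeeH's Message, reduced to the fields named in the text.
struct Msg {
    char mTag;
    uint16_t mLen;
    uint8_t* mPtr;
};

// Turn the last read reply into a release-only request: mLen keeps the
// number of accepted bytes, mPtr becomes nullptr so no new read is started.
// Returns false when nothing was accepted (assumed: then nothing to release).
bool makeRelease (Msg& m) {
    if (m.mLen == 0)
        return false;
    m.mTag = 'R';
    m.mPtr = nullptr;
    return true;
}
```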

A drawback of this ring buffer design is that when the input buffer overruns, it can mess up what is received: some data is lost, and some data may be returned twice.

Writing data

Writing data is somewhat different. The app “owns” the data it wants to send, and must keep that memory area intact while the DMA request is running and sending out all the bytes. To write, the app sends a message to the UART driver, with mTag set to 'W' and with mPtr and mLen describing its data area.

In many situations, we don’t care how much time it takes to send this data. For this reason, the UART driver has a small transmit buffer, into which it will copy the application data and immediately reply to signal that the data is no longer needed.

This works as long as the TX buffer in the driver has room. Otherwise, the incoming message gets queued, and will be processed by the driver later. The app will not get a reply as long as its data needs to stay around.

For large sends exceeding the size of the transmit buffer, the request is immediately queued, and no copying will take place at all: the DMA transfer will use the app memory area itself.

Sometimes, you want to make sure that the data has actually been sent, even when it’s a small amount which might easily fit into the transmit buffer. There is special logic for this, triggered by sending zero bytes of data. Such an “empty” request will always be queued and replied to after all pending requests have been completed.
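The write cases described above can be summarised as a small decision function. This is only a sketch of the rules as stated in the text, not the driver's actual code: `TxPath`, `choosePath`, `txFree` (free space in the transmit buffer right now) and `txSize` (its total size) are all hypothetical names, and the sketch ignores whether other requests are already queued.

```cpp
#include <stddef.h>

// Hypothetical labels for the four write cases described in the text.
enum class TxPath {
    CopyAndReply,  // fits in the free TX space: copy, reply immediately
    Queue,         // would fit the TX buffer, but not right now: wait
    QueueDirect,   // larger than the TX buffer: DMA straight from app memory
    QueueFlush,    // zero bytes: reply once all pending requests are done
};

TxPath choosePath (size_t len, size_t txFree, size_t txSize) {
    if (len == 0)
        return TxPath::QueueFlush;
    if (len > txSize)
        return TxPath::QueueDirect;
    if (len > txFree)
        return TxPath::Queue;
    return TxPath::CopyAndReply;
}
```

Only the first case replies before the data has left the app's memory area; in the other three, the app's call blocks until the driver is done with the request.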

Some examples

Read data forever in blocking mode, as it comes in:

Message m { uart.id(), 'R' };
while (true) {
    sys::call(m);
    // ... process the bytes in [m.mPtr, m.mPtr+m.mLen-1]
}

Read data forever, one byte at a time:

Message m { uart.id(), 'R' };
while (true) {
    sys::call(m);
    // ... process the byte at *m.mPtr
    m.mLen = 1;  // accept only this byte, the rest stays in the ring buffer
}

Write data, blocking as needed:

uint8_t buf [10];
// ... fill the buffer
Message m { uart.id(), 'W', buf, 10 };
sys::call(m);

Write data and then block until actually sent:

uint8_t buf [10];
// ... fill the buffer
Message m { uart.id(), 'W', buf, 10 };
sys::call(m);
m.mLen = 0;
sys::call(m);

It should be fairly straightforward to implement wrappers around all this, such as reading a specific number of bytes or reading up to a newline terminator.
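As an illustration, here is a minimal sketch of a wrapper which reads a specific number of bytes. Everything in it is an assumption made for the sake of a self-contained example: `Msg` is a stand-in with only the fields named in the text, `readExactly` is a hypothetical helper, and the `call` parameter stands in for sys::call, so that the logic can be tried without the real driver.

```cpp
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Stand-in for JeeH's Message, reduced to the fields named in the text.
struct Msg {
    char mTag;
    uint16_t mLen;
    uint8_t* mPtr;
};

// Copy exactly `count` bytes into `dst`. `call` stands in for sys::call:
// it must release the bytes described by `m` and fill `m` with the next
// reply. On return, `m` describes the last accepted chunk, which is
// released by the next request made with the same message.
template <typename Call>
void readExactly (Call call, Msg& m, uint8_t* dst, size_t count) {
    while (count > 0) {
        call(m);  // releases what was accepted before, waits for more data
        if (m.mLen > count)
            m.mLen = static_cast<uint16_t>(count);  // take only what we need
        memcpy(dst, m.mPtr, m.mLen);
        dst += m.mLen;
        count -= m.mLen;
    }
}
```

Since the message is kept by the caller, several calls can be chained, and any bytes not accepted by one call are offered again in the next reply. The sketch copies the data out, which is what a fixed-size read typically needs; a newline-terminated read could return a pointer into the ring buffer instead.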