by Paul Williamson KB5MU
In the August 2025 Inner Circle newsletter, I wrote some design notes under the title IIO Timeline Management in Dialogus — Transmit. I came to the tentative conclusion that the simplest approach might be just fine. This time, I will discuss the implementation of that approach. You might want to re-read those articles before proceeding with this one.
In that simplest approach, we set a recurring decision time Td based on the arrival time of the first frame. Ideally, the frames would all arrive at 40ms intervals. We set Td to occur halfway between a nominal arrival time and the next nominal arrival time. No matter what, we push a frame to IIO at Td, and never at any other time. If no frames have arrived for transmission before Td, we push a dummy frame. If exactly one frame has arrived, we push that frame. If somehow more than one frame has arrived, we count an error and push the latest arriving frame. The “window” for arrival of a valid frame is as wide as it can possibly be: the full 40ms. In fact, the window is always open, so we don’t need to worry about what to do with frames arriving outside the window.
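The decision-time schedule described above can be sketched in a few lines of C. This is only an illustration, not the Dialogus source: the names FRAME_PERIOD_US, first_decision_time_us, and next_decision_time_us are hypothetical, and times are assumed to be microseconds read from a monotonic clock.

```c
#include <stdint.h>

/* Nominal frame interval from the text: 40ms, expressed in microseconds. */
#define FRAME_PERIOD_US 40000

/* First Td: halfway between the first frame's arrival and the next
 * nominal arrival, i.e. 20ms after the first frame arrives. */
int64_t first_decision_time_us(int64_t first_arrival_us)
{
    return first_arrival_us + FRAME_PERIOD_US / 2;
}

/* Every later Td is exactly one frame period after the previous one.
 * Stepping from the previous Td (rather than from the current time)
 * keeps wakeup jitter from accumulating into drift. */
int64_t next_decision_time_us(int64_t td_us)
{
    return td_us + FRAME_PERIOD_US;
}
```

For a first frame arriving at 1,000,000 microseconds, this puts the decision times at 1,020,000, then 1,060,000, then 1,100,000 microseconds, and so on.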
As part of that approach, Dialogus groups the arriving frames into transmission sessions. The idea is that the frames in a session are to be transmitted back-to-back with no gaps. To make acquisition easier for the receiver, we transmit a frame of a special preamble sequence before transmitting the first normal frame of the session. Also for the benefit of the receiver, we transmit a frame of a special postamble sequence at the very end of the session. The incoming encapsulated frames don’t carry any signaling to identify the end of a session. Dialogus has to infer the session end by noticing that encapsulated frames are no longer arriving. We don’t want to declare the end of a session due to the loss of a single encapsulated frame packet. So, when a Td passes with no incoming frame, Dialogus generates a dummy frame to take its place. This begins a hang time, currently hard-coded to 25 frames (one second). If a frame arrives before the end of the hang time, it is transmitted in place of a dummy frame and the hang time counter is reset. If the hang time expires with no further encapsulated frame arrival, that triggers the transmission of the postamble and the end of the transmission session.
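The hang time logic amounts to a small counter that is consulted once per Td. The sketch below is an assumption-laden illustration rather than the actual Dialogus code: the type and function names are hypothetical, and it assumes that "25 frames" means 25 consecutive dummy frames are sent before the postamble goes out on the next empty Td.

```c
#include <stdbool.h>

/* Hang time from the text: 25 frames, one second at 40ms per frame. */
#define HANG_TIME_FRAMES 25

typedef enum { SEND_DATA, SEND_DUMMY, SEND_POSTAMBLE } frame_kind_t;

/* Hypothetical per-session state: consecutive dummy frames sent so far. */
typedef struct {
    int dummies_sent;
} hang_state_t;

/* Called once per decision time Td. Returns the kind of frame to push. */
frame_kind_t on_decision_time(hang_state_t *s, bool frame_arrived)
{
    if (frame_arrived) {
        s->dummies_sent = 0;        /* a real frame resets the hang time */
        return SEND_DATA;
    }
    if (s->dummies_sent < HANG_TIME_FRAMES) {
        s->dummies_sent++;          /* start or continue the hang time */
        return SEND_DUMMY;
    }
    return SEND_POSTAMBLE;          /* hang time expired: end the session */
}
```

A single lost packet therefore costs one dummy frame and nothing more, since the next real frame resets the counter.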
As it turns out, the scheme we arrived at is not so different from what we intended to implement in the first place. Older versions of Dialogus tried to set a consistent deadline at 40ms intervals, just as we now intend to do. There were several implementation problems.
One problem was that the 40ms intervals were only approximated by software timekeeping, which under Linux is subject to drift and uncontrolled extra delays. A bigger problem was that the deadline was set to expire just after the nominal arrival time of the frame, meaning that frames that were only slightly late would trigger the push of a dummy frame. The late frame would then be pushed as well. The double push would soon exceed the four-buffer default capacity of the IIO kernel driver. After that, each push function call would block, waiting for a kernel buffer to be freed.
The original Dialogus architecture had one thread in addition to the main thread. That thread was responsible for receiving the encapsulated frames arriving over the USB network interface. It used the blocking call recvfrom(), so it had to be in a separate thread. That thread also took care of all the processing that was triggered by the arrival of an encapsulated frame, including pushing the de-encapsulated frames to the kernel.
Timekeeping was done in the main thread by checking the kernel’s monotonic timer, then sleeping a millisecond, and looping like that until the deadline arrived. Iterations of this loop were also counted to obtain a rough interval of 10 seconds to trigger a periodic report of statistics. If the timekeeping code detected the deadline passing before the listener thread detected the frame’s arrival, the timekeeping code would push a dummy frame and then the listener thread would push the de-encapsulated frame, and either or both could end up blocked waiting for a kernel buffer to be freed.
In the new architecture, two threads were added: one for frame timekeeping and the other for timing the periodic statistics report. The duties of the existing listener thread were limited to listening to the network interface and de-encapsulating the arriving frames, placing the data in a global buffer named modulator_frame_buffer. All responsibility for pushing buffers to the kernel was shifted to the new frame timekeeping thread, called the timeline manager. The main thread was left with no real-time responsibilities at all. This architecture change was not strictly necessary, but I felt that it was cleaner and easier to implement correctly.
Timestamping for debug print purposes has so far been done to a resolution of 1ms using the get_timestamp_ms() function. For keeping track of timing for non-debug purposes, we added get_timestamp_us() and keep time in 64-bit microseconds. This is probably unnecessary, but it seemed possible that we might need more than millisecond precision for something.
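A 64-bit microsecond timestamp of this kind is typically derived from the POSIX monotonic clock. The following is a sketch of how such a function might look, not the actual get_timestamp_us() from Dialogus; the name monotonic_time_us is hypothetical, and the use of CLOCK_MONOTONIC is an assumption.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Returns the monotonic clock reading in microseconds as a signed
 * 64-bit value. The monotonic clock never jumps backward when the
 * wall-clock time is adjusted, which suits interval timing. */
int64_t monotonic_time_us(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000 + ts.tv_nsec / 1000;
}
```

A signed 64-bit microsecond count covers roughly 292,000 years, so overflow is not a practical concern.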
The three threads share some data structures. In particular, the listener thread fills up a buffer, which the timeline manager thread may push to IIO. To make this data sharing thread-safe, we use a common mutex that covers almost all processing in all three of the threads. The threads release the mutex only when waiting for a timer or pushing a buffer to IIO. This is a little simple-minded, but it should be ok for this application. None of the threads do much in the way of I/O, so there would be little point in trying to interleave their execution at a finer time scale. The exception would be output to the console, which consists of the periodic statistics report (not very big and only occurs every 250th frame) and any debug prints from anywhere in the threaded code. If this becomes a bottleneck, it will be easy enough to do that I/O outside of the mutex.
Each of the three threads is implemented in three functions. One function is the code that runs in the thread, the second function starts the thread, and the third stops the thread. The main function of each thread runs a loop that checks the global variable named stop but otherwise repeats indefinitely.
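The three-function pattern can be sketched with POSIX threads as follows. Everything here other than the global named stop is hypothetical: the worker names, the iteration counter standing in for real work, and the choice of a C11 atomic for the stop flag are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

atomic_bool stop;               /* global stop flag checked by the loop */
atomic_long worker_iterations;  /* hypothetical stand-in for real work */
pthread_t worker_tid;

/* Function 1: the code that runs in the thread. */
void *worker_thread(void *arg)
{
    (void)arg;
    const struct timespec nap = { .tv_sec = 0, .tv_nsec = 1000000 }; /* 1ms */
    while (!atomic_load(&stop)) {
        atomic_fetch_add(&worker_iterations, 1);
        nanosleep(&nap, NULL);
    }
    return NULL;
}

/* Function 2: start the thread. Returns 0 on success. */
int start_worker(void)
{
    atomic_store(&stop, false);
    return pthread_create(&worker_tid, NULL, worker_thread, NULL);
}

/* Function 3: ask the loop to exit, then wait until it has. */
int stop_worker(void)
{
    atomic_store(&stop, true);
    return pthread_join(worker_tid, NULL);
}
```

Joining in the stop function means the caller knows the thread has fully exited before any shared resources are torn down.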
Listener Thread
The listener thread is rather similar to the one thread in previous versions of Dialogus. It still handles the beginning of a transmission session, but then relinquishes frame processing to the timeline manager thread. Pushing of data frames to IIO (after the first frame of a session), pushing dummy frames, and pushing postamble frames are no longer the responsibility of the listener thread.
The listener thread waits blocked in a recvfrom() call, waiting for an encapsulated frame to arrive over the network. When the first encapsulated frame arrives and a transmission session is not currently in progress, the listener thread detects that as the start of a new transmission session. It calls start_transmission_session, which takes note of the frame’s arrival time, and computes the first Td of the new session. It then creates and pushes a preamble frame, and also pushes the de-encapsulated frame. This creates the starting conditions for the timeline manager to take over for the rest of the transmission session.
Thereafter, for the duration of the transmission session, the listener thread merely de-encapsulates incoming frames into the single shared buffer named modulator_frame_buffer and increments a counter named ovp_txbufs_this_frame. These are left for the timeline manager thread to handle.
Timeline Manager Thread
The timeline manager has a pretty simple job. It wakes up at 40ms intervals during a transmission session, and pushes exactly one frame to IIO each time. That frame will be a frame received by the listener thread in encapsulated form, if one is available. Otherwise, it will be a dummy frame or a postamble frame.
The timeline manager thread has two main paths, depending on the state of the variable ovp_transmission_active. If no transmission is active, the timeline manager thread waits on a condition, which will be set when the listener thread receives the first encapsulated frame of the next transmission.
If a transmission is already active, the timeline manager thread locks the timeline_lock mutex, then checks to see if a decision time has been scheduled. If so, it gets the current time and compares that with the decision time to see if it has passed. If it has not, the timeline manager releases the mutex, computes the duration until the decision time, and sleeps for that long, so that the next time this thread runs, it will probably be time for decision processing.
When decision time has arrived, it starts to decide. First it checks to see if one or more encapsulated frames have been copied into the buffer. Each frame beyond one represents an untimely frame error, which it simply counts. The buffer contains the one arrived frame, or the last of multiple arrived frames, and this buffer is now pushed to IIO. On the other hand, if no frames have arrived, we push either a dummy frame (starting or continuing a hang time) or, if the hang time has been completed, a postamble frame. If we push a postamble frame, we go on to end the transmission session normally, and then wait for notification of a new session. Otherwise, we again compute the duration from now until the next decision time, and sleep for that long.
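Two small calculations sit at the heart of that loop: how long to sleep before the next decision time, and how many untimely frame errors to count when the decision time arrives. A minimal sketch follows, with hypothetical function names and with times assumed to be microseconds from a monotonic clock.

```c
#include <stdint.h>

/* Microseconds left to sleep before Td. A result of 0 means Td has
 * arrived or passed, so decision processing should run right away. */
int64_t time_until_decision_us(int64_t now_us, int64_t td_us)
{
    return (td_us > now_us) ? (td_us - now_us) : 0;
}

/* Untimely frame errors for one decision period: every frame beyond
 * the first that arrived since the previous Td counts as one error. */
unsigned untimely_frame_errors(unsigned frames_since_last_td)
{
    return (frames_since_last_td > 1) ? (frames_since_last_td - 1) : 0;
}
```

Clamping the sleep duration at zero covers the case where the thread wakes late, so a late wakeup leads straight into decision processing instead of a negative sleep request.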
Periodic Statistics Reporter Thread
The periodic statistics reporter thread has an even simpler job. It wakes up at approximately 10s intervals during a transmission session, and prints a report to the console of all the monitored statistics counts.
Some Optimizations
Ending a Transmission Session
The normal end of a transmission session includes pushing a postamble frame, and then waiting for it to actually be transmitted before shutting down the transmitter. There is no good way to wait for a frame to actually be transmitted, so we make do with a fixed timer. That timer was 50ms, which I believe is just 40ms plus some margin. The new value is 65ms. Computing from the Td that resulted in the postamble being pushed, that’s 20ms for the last dummy frame to go out, then 40ms for the postamble frame to go out, plus 5ms of margin. This will need to be checked against reality when we’re able.
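The 65ms figure is just the sum of the three terms named above. Written out as constants (all names here are hypothetical, chosen only to label the terms):

```c
/* All values in milliseconds, taken from the text. */
#define FRAME_PERIOD_MS            40  /* time for the postamble to go out */
#define REMAINDER_OF_LAST_DUMMY_MS 20  /* last dummy frame still going out at Td */
#define SHUTDOWN_MARGIN_MS          5  /* safety margin */

/* Delay from the postamble-pushing Td until transmitter shutdown. */
int postamble_shutdown_delay_ms(void)
{
    return REMAINDER_OF_LAST_DUMMY_MS + FRAME_PERIOD_MS + SHUTDOWN_MARGIN_MS;
}
```

Keeping the terms separate makes it easier to revise one of them once the timing can be measured on real hardware.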
The delay is not desired when the session is interrupted by the user hitting control-C. In that case, we want to shut down the transmitter immediately. This is handled in a new abort_transmission_session function.
Special Frame Processing Paths
The preamble is a 40ms pattern of symbols. It does not start with a frame sync word and it does not contain a frame header. Thus, the preamble processing (found in start_transmission_session) skips over some of the usual processing steps and goes directly to push_txbuf_to_msk.
The dummy frames and postamble frames are less special. They do start with the normal frame sync word, and contain a frame header copied (with FEC encoding already done) from the last frame of data transmitted in that transmission session. This provides legal ID and also enables the receiver to correctly associate these frames with the transmitting station. As a result, these can be created by overlaying the encoded dummy payload or encoded postamble on top of the last payload in modulator_frame_buffer and pushing that buffer to IIO as normal.
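The overlay step can be pictured as a single memcpy into the payload region of the buffer, leaving the sync word and encoded header from the last data frame untouched. The sketch below uses made-up sizes, because the encoded sizes are not given here; SYNC_AND_HEADER_BYTES, SPECIAL_PAYLOAD_BYTES, and the function name are all hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical sizes for illustration only; the real encoded sizes
 * depend on the sync word, FEC, and interleaving in use. */
#define SYNC_AND_HEADER_BYTES 16
#define SPECIAL_PAYLOAD_BYTES 64
#define FRAME_BYTES (SYNC_AND_HEADER_BYTES + SPECIAL_PAYLOAD_BYTES)

/* Overwrite only the payload portion of the last transmitted frame
 * with a precomputed dummy or postamble payload. The sync word and
 * encoded header at the front of the buffer are left as they were. */
void overlay_special_payload(uint8_t frame[FRAME_BYTES],
                             const uint8_t special[SPECIAL_PAYLOAD_BYTES])
{
    memcpy(frame + SYNC_AND_HEADER_BYTES, special, SPECIAL_PAYLOAD_BYTES);
}
```

Because the header bytes are never touched, the station identification from the last real frame carries over into every dummy and postamble frame.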
All three of these special data patterns (preamble, dummy, and postamble) are now computed just once and saved for re-use.
About Hardware/Software Partitioning
Up until this work on timeline management, we have assumed that the modulator in the FPGA would eventually take care of adding the frame sync word, scrambling (whitening), encoding, and interleaving the data in the frames. Under those assumptions, Dialogus just sends a logical frame consisting of the contents of the frame header (12 bytes) concatenated with the contents of the payload portion of the packet (122 bytes), for a total of 134 bytes per frame.
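Under those original assumptions, the logical frame is just the header followed by the payload. One way to express that layout in C, with the 134-byte total checked at compile time, is sketched below. The type and function names are hypothetical; only the byte counts come from the text.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HEADER_BYTES   12   /* frame header contents */
#define PAYLOAD_BYTES 122   /* payload portion of the packet */

/* Logical frame: header immediately followed by payload. */
typedef struct {
    uint8_t header[HEADER_BYTES];
    uint8_t payload[PAYLOAD_BYTES];
} logical_frame_t;

/* Fail the build if padding or a size change breaks the layout. */
_Static_assert(sizeof(logical_frame_t) == 134,
               "logical frame must be 134 bytes");
_Static_assert(offsetof(logical_frame_t, payload) == HEADER_BYTES,
               "payload must follow the header directly");

/* Concatenate a header and a payload into one logical frame. */
void build_logical_frame(logical_frame_t *out,
                         const uint8_t *header,
                         const uint8_t *payload)
{
    memcpy(out->header, header, HEADER_BYTES);
    memcpy(out->payload, payload, PAYLOAD_BYTES);
}
```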
The work described here was done under a different set of assumptions, so that we could make progress independent of the FPGA work. We assumed that the modulator hardware is completely dumb. That is, it just takes a sequence of 1’s and 0’s and directly MSK-modulates them, without any other computations.
Probably, neither of these assumptions is completely correct. It now seems that the FPGA in the Pluto is not big enough to handle all of those functions. However, a completely dumb modulator in the FPGA is not a very satisfying or clean solution. We may end up with multiple solutions, depending on the hardware available and the requirements of the use case targeted.
About Integrating Receiver Functions
This work has been entirely focused on the transmit chain. This makes sense for a Haifuraiya-type satellite system, where the downlink is a completely different animal from the uplink. However, we also want to support terrestrial applications, including direct user-to-user communications. In that case, it would be nice to have transmitter and receiver implemented together.
We already have taken some steps in that direction, in that the FPGA design we’re working with has both modulator and demodulator implementations for MSK. We’ve been able to demonstrate “over the air” tests of MSK for quite some time already, using the built-in PRBS transmitter and receiver in the FPGA design. Likewise, we’ve been able to demonstrate two-way and multi-way Opulent Voice communications using a network interface in place of the MSK radios, using Interlocutor and Locus.
Integrating the two demos will bring us significantly closer to a working system. The receive timeline has to be independent of the transmit timeline, and has different requirements, but the same basic approach seems likely to work for receive as for transmit. I expect that adding another thread or two for receive will be a good start.