As can be seen here (mplex), the mplex multiplexer was designed in the early days of libp2p. It is a simple protocol, and that simplicity comes with some drawbacks.
Even so, an implementation of this simple protocol still has a few issues to solve. In this brief blog post I try to explain them.
Inner workings
The main goal of a multiplexer is to combine several streams into one. On the sending side, packets from multiple streams are encapsulated into “frames”. A frame contains the data from the packet and usually also the id of the stream it came from. Depending on the protocol, it may carry additional information such as a length, type, or hash. On the receiving side the frames are interpreted, and the id determines which stream the data is pushed to. A frame that carries data like this is typically called a data-frame.
There are usually some additional frame types, for example to open a new stream, to close a stream, and to reset a stream.
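To make the frame layout concrete, here is a minimal sketch in Go of how an mplex-style frame could be encoded and decoded. It follows my reading of the mplex specification: a varint header holding `stream_id << 3 | flag`, a varint payload length, then the payload. The `Frame` type and the `Encode`/`Decode` helpers are hypothetical names chosen for illustration, not part of any existing library.

```go
package main

import (
	"encoding/binary"
	"errors"
)

// Flag values as listed in the mplex specification (to the best of my knowledge).
const (
	NewStream        = 0
	MessageReceiver  = 1
	MessageInitiator = 2
	CloseReceiver    = 3
	CloseInitiator   = 4
	ResetReceiver    = 5
	ResetInitiator   = 6
)

// Frame is a hypothetical in-memory representation of a single frame.
type Frame struct {
	StreamID uint64
	Flag     uint64
	Data     []byte
}

// Encode writes the header (id<<3 | flag) and the payload length as
// unsigned varints, followed by the payload itself.
func Encode(f Frame) []byte {
	var tmp [binary.MaxVarintLen64]byte
	n := binary.PutUvarint(tmp[:], f.StreamID<<3|f.Flag)
	out := append([]byte{}, tmp[:n]...)
	n = binary.PutUvarint(tmp[:], uint64(len(f.Data)))
	out = append(out, tmp[:n]...)
	return append(out, f.Data...)
}

// Decode parses exactly one frame from the start of b.
func Decode(b []byte) (Frame, error) {
	header, n := binary.Uvarint(b)
	if n <= 0 {
		return Frame{}, errors.New("bad header varint")
	}
	b = b[n:]
	length, n := binary.Uvarint(b)
	if n <= 0 {
		return Frame{}, errors.New("bad length varint")
	}
	b = b[n:]
	if uint64(len(b)) < length {
		return Frame{}, errors.New("truncated payload")
	}
	// The low three bits of the header are the flag, the rest is the stream id.
	return Frame{StreamID: header >> 3, Flag: header & 7, Data: b[:length]}, nil
}
```

The receiver uses `StreamID` to pick the stream and `Flag` to decide whether the frame opens, feeds, closes, or resets it.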
In my mplex implementation, there is one multiplexer (the connection) and multiple streams. The multiplexer contains a coroutine that reads frames, interprets them, and creates, closes, and resets streams (updating its internal administration). The multiplexer contains (among other things) one output channel and multiple streams. Each stream has its own unique input channel, and all streams share the same output channel. When a new data-frame arrives, the multiplexer determines the right stream and pushes the data into that stream’s input channel. When a stream wants to send data, it sends it on the output channel. Because the output channel is shared, streams can push to it simultaneously (this is thread safe).
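The channel layout described above can be sketched roughly as follows. This is an illustration in Go, not the actual implementation: the `Mux`, `Stream`, and `DataFrame` types, the buffer size of 16, and the mutex around the stream table are all assumptions made for the example.

```go
package main

import "sync"

// DataFrame is a hypothetical decoded data-frame.
type DataFrame struct {
	StreamID uint64
	Data     []byte
}

// Stream has its own input channel and a reference to the shared output channel.
type Stream struct {
	ID  uint64
	In  chan []byte      // unique per stream, filled by the multiplexer
	Out chan<- DataFrame // shared by all streams of one multiplexer
}

// Write sends data on the shared output channel. Go channels may be
// used by several senders at once without extra locking.
func (s *Stream) Write(data []byte) {
	s.Out <- DataFrame{StreamID: s.ID, Data: data}
}

// Mux owns the stream table and the single output channel.
type Mux struct {
	mu      sync.Mutex
	streams map[uint64]*Stream
	Out     chan DataFrame
}

func NewMux() *Mux {
	return &Mux{streams: make(map[uint64]*Stream), Out: make(chan DataFrame, 16)}
}

// OpenStream registers a new stream under the given id.
func (m *Mux) OpenStream(id uint64) *Stream {
	s := &Stream{ID: id, In: make(chan []byte, 16), Out: m.Out}
	m.mu.Lock()
	m.streams[id] = s
	m.mu.Unlock()
	return s
}

// Dispatch routes an incoming data-frame to the matching stream.
// It returns false when no stream with that id is known.
func (m *Mux) Dispatch(f DataFrame) bool {
	m.mu.Lock()
	s, ok := m.streams[f.StreamID]
	m.mu.Unlock()
	if !ok {
		return false
	}
	s.In <- f.Data // blocks while the stream's input channel is full
	return true
}
```

Note that `Dispatch` blocks when a stream's input channel is full, which is exactly the backpressure problem discussed next.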
Three topics are worth mentioning:
- Backpressure
- Half-close stream
- Full close multiplexer
Backpressure
As mentioned, when the multiplexer receives a data-frame, it uses the id to determine the target stream and sends the data into the corresponding channel. However, when this channel is full, the send blocks. It waits until the receiver (the stream) reads data from the channel, which frees up space and unblocks the sender (the multiplexer). It’s important to note that this blocks the entire multiplexer, which also stalls the other streams! Therefore, there is a small timeout whenever the multiplexer sends data into a stream channel. When this timeout elapses, the stream is considered too slow (or perhaps dead) and the multiplexer resets it.
Half-close stream
Streams can be closed in two ways: full-close and half-close. A full-close closes the stream for both sending and receiving. A half-close closes it for sending only. Usually a sender sends a request and, once the request is sent, closes the stream for sending. The stream is still open for receiving, and usually the sender then waits for the response.
On the sender side, when a stream half-closes, it sends a close-frame to the receiver. The receiver receives this frame and processes it accordingly: it should update its internal administration (both for the multiplexer and the corresponding stream) to record that communication has stopped in one direction and is still ongoing in the other. It’s an error to send data on a stream after half-closing it.
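The bookkeeping for a half-closed stream can be illustrated with two flags, one per direction. The following Go sketch uses hypothetical names (`HalfCloseStream`, `CloseWrite`, `OnRemoteClose`) and a plain slice as a stand-in for the shared output channel. It is not safe for concurrent use; a real implementation would need synchronization.

```go
package main

import "errors"

// ErrClosedForWrite is returned when writing after a half-close.
var ErrClosedForWrite = errors.New("stream is closed for writing")

// HalfCloseStream tracks each direction separately.
type HalfCloseStream struct {
	WriteClosed bool     // we sent a close-frame
	ReadClosed  bool     // the remote side sent a close-frame
	Sent        []string // stand-in for frames put on the output channel
}

// Write queues a data-frame unless the stream was half-closed locally.
func (s *HalfCloseStream) Write(data string) error {
	if s.WriteClosed {
		return ErrClosedForWrite
	}
	s.Sent = append(s.Sent, "data:"+data)
	return nil
}

// CloseWrite half-closes the stream: it queues one close-frame and
// rejects later writes. Reading stays possible.
func (s *HalfCloseStream) CloseWrite() {
	if s.WriteClosed {
		return
	}
	s.WriteClosed = true
	s.Sent = append(s.Sent, "close")
}

// OnRemoteClose records a close-frame received from the other side.
func (s *HalfCloseStream) OnRemoteClose() { s.ReadClosed = true }

// FullyClosed reports whether both directions are closed.
func (s *HalfCloseStream) FullyClosed() bool { return s.WriteClosed && s.ReadClosed }
```

A typical request/response exchange would call `Write`, then `CloseWrite`, and keep reading until `OnRemoteClose` fires, at which point `FullyClosed` becomes true.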
Full close multiplexer
Suppose we have a stream and we are done sending data on it, so we close the stream. Perhaps we are also done with the multiplexer, so we close it as well. We cannot close the multiplexer immediately, since we want to wait until all the data is sent to the receiver. Remember that when we close a stream, we send a close-frame to the receiver. This is done on the shared output channel of the multiplexer.
So we can have the following scenario: close the stream, then close the multiplexer. If the two closes happen in this order, everything is fine: the stream close puts a close-frame on the output channel, and when the multiplexer closes, it closes the output channel. All elements already in the output channel are still processed, including the close-frame. However, we cannot guarantee this order (race condition). When the multiplexer is closed first, it closes the channel, and all elements already present in the channel are processed. But the close-frame is not yet in the output channel, because the stream close has not completed. So when the stream close continues, it tries to send a close-frame on a closed channel.
Therefore, the output channel may only be closed once the multiplexer is closed and all the streams are closed.
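One way to enforce this ordering, sketched here in Go under the assumption that every stream is registered before the multiplexer is closed, is to count open streams with a `sync.WaitGroup` and close the output channel only after the count drops to zero. In Go, sending on a closed channel panics, so the ordering matters. The `ClosingMux` type and its methods are hypothetical.

```go
package main

import "sync"

// ClosingMux is a hypothetical multiplexer reduced to its close logic.
type ClosingMux struct {
	Out     chan string    // shared output channel (frames as strings for brevity)
	streams sync.WaitGroup // counts streams that have not yet closed
}

func NewClosingMux() *ClosingMux {
	return &ClosingMux{Out: make(chan string, 16)}
}

// OpenStream registers a stream. It must be called before Close.
func (m *ClosingMux) OpenStream() { m.streams.Add(1) }

// CloseStream queues the close-frame first and only then marks the
// stream as done. It must be called exactly once per opened stream.
func (m *ClosingMux) CloseStream(id string) {
	m.Out <- "close:" + id
	m.streams.Done()
}

// Close waits until every stream has queued its close-frame and only
// then closes the output channel, so no stream sends on a closed channel.
func (m *ClosingMux) Close() {
	m.streams.Wait()
	close(m.Out)
}
```

With this arrangement it no longer matters whether `Close` or `CloseStream` is called first: `Close` simply blocks until the last close-frame is in the channel.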