Garbage in, garbage out

Or with verbs: Connect. Send. Receive.

For a while now, I’ve been of the opinion that we need a generic streaming/networking library for the GNOME core platform, for two reasons:

1. The Unix I/O API is hard to use correctly, and many applications get it wrong.

For instance, a “readable” condition followed by a zero-byte read() means that a socket was disconnected in an orderly fashion. This is not a good API: an important event is reported through what looks like an empty success, and it is easy to miss.
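To make the zero-byte case concrete, here is a minimal sketch of a wrapper that turns the result of read() into an explicit status. The names `read_status` and `read_classified` are made up for this illustration and do not belong to any existing library.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical status codes, invented for this example. */
typedef enum {
    READ_DATA,   /* at least one byte was read */
    READ_EOF,    /* zero-byte read: the peer closed in an orderly fashion */
    READ_RETRY,  /* EINTR/EAGAIN: not an error, try again later */
    READ_ERROR   /* real failure; the saved errno is in *err_out */
} read_status;

static read_status read_classified(int fd, void *buf, size_t len,
                                   size_t *n_out, int *err_out)
{
    assert(buf != NULL && n_out != NULL && err_out != NULL);
    assert(len > 0); /* a zero-length read would be mistaken for EOF */

    ssize_t n = read(fd, buf, len);
    int saved = errno; /* save errno before anything can overwrite it */

    *n_out = 0;
    *err_out = 0;

    if (n > 0) {
        *n_out = (size_t)n;
        return READ_DATA;
    }
    if (n == 0)
        return READ_EOF;
    if (saved == EINTR || saved == EAGAIN || saved == EWOULDBLOCK)
        return READ_RETRY;

    *err_out = saved;
    return READ_ERROR;
}
```

A caller would typically invoke this after poll() or a GLib I/O watch reports the FD as readable, and switch on the returned status instead of inspecting the raw return value.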

Problems caused by this API include:

  • Blocking the UI while waiting for a network or file operation. People don’t want to implement a buffering async pump every single time. By the way, did you know that I/O on regular local files always blocks on Linux (and probably most other Unices), even if you set O_NONBLOCK?
  • Spinning on a slow, non-blocking FD, pegging the CPU.
  • Failure to save errno early. This leads to the infamous “Connection failed: Success” and other equally confusing error messages.
  • Not checking for the EINTR non-error, closing a perfectly good FD and potentially losing data.
  • Writing to a socket that was closed at the remote end, resulting in an unhandled SIGPIPE that terminates the process.
  • Not checking the return value from close(). Very few apps do this. See the close(2) man page for why you should.
  • Closing an already closed FD. Since nobody checks the return value from close(), this goes largely undetected until the code is used in a threaded app. Then it will cause heisenbugs when the FD is re-used by another thread between the first and second close() statements. Unless you know what you’re looking for, this is incredibly hard to pin down.
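Several of the pitfalls above can be avoided with a few small helpers. The following is a sketch only, assuming a blocking FD. The helper names (`ignore_sigpipe`, `write_all`, `close_once`) are invented for this example. It saves errno immediately, retries on EINTR, turns SIGPIPE into an ordinary EPIPE error, and guards against closing the same FD twice.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Ignore SIGPIPE process-wide, so that writing to a closed peer
 * fails with EPIPE instead of killing the process. */
static int ignore_sigpipe(void)
{
    return signal(SIGPIPE, SIG_IGN) == SIG_ERR ? -1 : 0;
}

/* Write all len bytes to a blocking FD, retrying after EINTR and
 * short writes. Returns 0 on success or the saved errno on failure. */
static int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    assert(buf != NULL || len == 0);

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            int saved = errno; /* save before any other call */
            if (saved == EINTR)
                continue;      /* interrupted, not an error */
            return saved;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Close *fd at most once and report the result. The FD is marked
 * invalid (-1) afterwards, so a second call is a harmless no-op
 * rather than a close() on a number another thread may now own. */
static int close_once(int *fd)
{
    assert(fd != NULL);

    if (*fd < 0)
        return 0;

    int rc = close(*fd);
    int saved = errno;
    *fd = -1; /* never retry close(), even if it failed */
    return rc == 0 ? 0 : saved;
}
```

On a non-blocking FD, `write_all` would return EAGAIN as soon as the kernel buffer fills up, so it would need to be paired with poll() or a main-loop watch. That is exactly the "async pump" nobody wants to write by hand.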

It would be nice to have a single place in which to fix this. Also, we can provide Windows portability for applications that insist on this (transparently handling the fact that on Windows, files and sockets are very different things).

2. There is too little code reuse in stream implementations.

We need a way to compartmentalize stream elements in C, so that they can be re-combined to achieve specific tasks. For instance, you may want to add a rate control/throttling element to a file transfer protocol, or construct a pipeline of exec() subprocesses – or you may want to do something more adventurous, like SSL + uuencode + unencrypted IM protocol. We can also provide policy to help ease the pain of shunting data back and forth. If you’ve written code that asynchronously forwards data from a pipe to another, handling slow consumer/fast producer and vice versa, you know how ridiculously complex such code can get. What we want (or at least, what I want) is the ability to just connect the black boxes and call it a day.
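As a rough illustration of what "connecting the black boxes" could look like in C, here is a deliberately simplified, synchronous sketch. It is not Flow's API. The `element` struct and every function name below are hypothetical, and it leaves out the hard parts (asynchronous operation, back-pressure, error propagation).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stream element: receives a chunk of data and may
 * pass something on to the next element in the chain. */
typedef struct element element;
struct element {
    void (*push)(element *self, const char *data, size_t len);
    element *next;  /* downstream element, or NULL for a sink */
    void *state;    /* per-element private data */
};

static void forward(element *self, const char *data, size_t len)
{
    if (self->next != NULL)
        self->next->push(self->next, data, len);
}

/* Pass-through that counts bytes (a stand-in for rate control). */
static void count_push(element *self, const char *data, size_t len)
{
    *(size_t *)self->state += len;
    forward(self, data, len);
}

/* Transform that upper-cases its input (a stand-in for an encoder). */
static void upcase_push(element *self, const char *data, size_t len)
{
    char tmp[256];

    while (len > 0) {
        size_t n = len < sizeof tmp ? len : sizeof tmp;
        for (size_t i = 0; i < n; i++)
            tmp[i] = (char)toupper((unsigned char)data[i]);
        forward(self, tmp, n);
        data += n;
        len -= n;
    }
}

/* Sink that collects data into a fixed, NUL-terminated buffer. */
typedef struct {
    char buf[64];
    size_t used;
} collector;

static void collect_push(element *self, const char *data, size_t len)
{
    collector *c = self->state;
    assert(c != NULL);

    size_t room = sizeof c->buf - 1 - c->used;
    size_t n = len < room ? len : room; /* truncates; a real sink would push back */
    memcpy(c->buf + c->used, data, n);
    c->used += n;
    c->buf[c->used] = '\0';
}
```

Wiring a counter into an upper-caser into a collector is then just a matter of setting the `next` pointers and calling `push` on the first element. Any element can be swapped out without the others knowing.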

GStreamer has had success with such an element-based architecture, and I think it makes sense in the more generic case too.

A solution?

I’m working on a library to resolve this situation, with the working title “Flow”. Currently it’s about 1/3 finished, with 10k LOC written. It depends on GLib, GObject and GThread. I’m reusing old network code of mine that is known to compile and run on Linux, FreeBSD and Windows, which should ease portability once the initial API is done.

You can get more details about the implementation I’m aiming for, as well as a preliminary tarball with totally unfinished code in it (but it passes distcheck!). If the response is positive, I’ll transfer it to the GNOME wiki.

I’m aware of other efforts, like gnet and gnetwork/gio, but for various reasons I don’t like them. gnet is too naive, and although James is a great guy, I disagree with him about some of his gio/gnetwork design goals – specifically that elements are always two-way and the decision to implement loadable modules with an XML registry. So rather than spend my time arguing, I’m doing this. I hope I’m not offending anyone too much by doing so.
