Undefined behavior in C

I don’t usually like posts that are just a bunch of links, but this is particularly interesting: the LLVM blog has a series of posts describing the actual consequences of certain common kinds of undefined behavior in C (for example: dereferencing a null pointer is illegal, so if you dereference a pointer, the compiler is allowed to assume it’s not null, and remove checks against null from the code; INT_MAX + 1 is not guaranteed to wrap around to INT_MIN; etc)

Read the posts here: part 1, part 2, part 3.

A war story

Unix processes close all file descriptors on exit. This happens automatically and without user intervention, of course; the kernel takes care of it regardless of the reason of the process’s death.

This doesn’t happen immediately, though. There are a lot of things that happen before closing the file descriptors: dumping core, munmap() ing any mmap() ed files, etc. In the case of Facebook’s internal search index server, this would take up to 2 minutes, during which time existing TCP connections (and new connection attempts) would hang.

So, some time ago, I wrote a piece of code (which I named ShutdownSocketSet ) that ensures that, whenever the process dies because of a signal ( abort() / SIGABRT , SIGSEGV , SIGQUIT , etc), we disconnect all incoming connections first (including the accepting socket) before doing any of the other exciting things I mentioned above. (In the index server case above, this means that callers can identify the server as dead and try another replica)

ShutdownSocketSet needs to track all sockets that are currently assigned to the server so it knows which ones to disconnect; note that you can’t just close() them, as the file descriptors might still be in use in other threads that are still running; after closing, the same file descriptor would be reassigned to another file, so those threads would end up doing I/O in the wrong place. (We dup2() /dev/null onto the file descriptor that used to belong to the socket instead.)

A few months later, another server at Facebook started failing an assertion in ShutdownSocketSet: when adding a socket to the set (when the server received a new connection), the file descriptor was already there. Which is of course unexpected, so we abort.

After scrutinizing ShutdownSocketSet and the TCP server code for a few hours, I convinced myself that they’re mostly correct, which leaves one possible explanation: some code is closing file descriptors that don’t belong to it.

That is:

  1. file descriptor A is assigned to an incoming TCP connection and gets added to the ShutdownSocketSet
  2. some other piece of code, through means unknown, closes A
  3. A gets assigned to another incoming connection
  4. that other incoming connection tries to add A to the ShutdownSocketSet again, boom.

The “means unknown” part is a bit puzzling. How do you close() a random file descriptor that belongs to an incoming connection? It’s not like we have code extracting file descriptors from random connection objects and running close() on them, so how can that happen?

One possible theory is that you close your own file descriptor twice. Perhaps once when you hit an error and once again from a destructor. That is:

  1. BadCode gets file descriptor A assigned to it
  2. BadCode closes A
  3. A gets assigned to an incoming TCP connection and gets added to the ShutdownSocketSet
  4. BadCode closes A again
  5. A gets assigned to another incoming connection
  6. that other incoming connection tries to add A to the ShutdownSocketSet again, boom.

Now: how do you track something like this?

I wrote a small library that intercepts the close() glibc function, records the stack trace and thread id where the close happened, and delays the actual close() for some time (that is, the file descriptor isn’t recycled). If, during that time, the library observes another close() for the same file descriptor, it’s clearly a bug, so we dump the stack traces and crash. (This is the same as jemalloc‘s “quarantine” option, where free() d memory is quarantined for a while, and an attempt to free() the same address indicates a bug.)

I loaded the library (using LD_PRELOAD ) into the affected server and waited. We didn’t have to wait for very long: a few minutes later, we found that a new version (that we’d just upgraded to) of an internal MySQL client library had a bug that made it close() connections twice on connect() errors. We reverted that upgrade and everything was fine.

Anyway, I thought this was a fun story (and would make a good answer for the “tell me about an interesting bug that you’ve debugged” interview question) so I thought my readers would enjoy it as well. Carry on :)

The zombie apocalypse

In case you haven’t read James Mickens‘s article in the November 2013 issue of ;login::

The systems programmer has read the kernel source, to better understand the deep ways of the universe, and the systems programmer has seen the comment in the scheduler that says “DOES THIS WORK LOL,” and the systems programmer has wept instead of LOLed, and the systems programmer has submitted a kernel patch to restore balance to The Force and fix the priority inversion that was causing MySQL to hang. A systems programmer will know what to do when society breaks down, because the systems programmer already lives in a world without law.

It’s about zombies.

Welcome back

I moved all my old posts off of Quora. You might still experience some slight hiccups.

You’ll have to register to comment, and your first comment will be held for moderation. (sorry; I don’t like spam)

Have fun!

Pointer arithmetic in C

You know that piece of the C standard that says that pointer arithmetic is only valid if the pointers point into the same array? (I don’t have a copy of the C standard, but C++11 says, in 5.7.6: “Unless both pointers point to elements of the same array object, or one past the last element of the array object, the behavior is undefined.”)

It’s true. Here’s an example:

On my machine, it prints:

What’s happening here? Clearly bar.b[0] is stored at a higher address than bar.a[0] , why is the difference a humongous negative number?

Let’s look at the disassembly of foo_diff() (on Linux x86_64):

So we subtract a ( %rdi) from b ( %rsi), and then, rather than dividing the result by the size of struct Foo (9 bytes), we multiply by 10248191152060862009. What?

That’s 9’s multiplicative inverse. Dividing by 9 produces the same result as multiplying by the multiplicative inverse (but presumably multiplication is faster), but this property only holds if the division is exact (the remainder is zero). The compiler is allowed to make this assumption (because a and b are supposed to point in the same array, remember?), and so it produces code that won’t work if the assumption doesn’t hold.

Singletons are easy

In C++11, initialization of static local variables is guaranteed to be thread-safe. (This has been true by default in gcc since times immemorial — version 3.4 or so?)

So stop with your overengineered singleton classes (or class templates). This is what a singleton looks like in 3 lines of code:

If you want to ensure that there is no way there could be more than one [code]Foo[/code] created, use a private constructor and a static method:

Update: Static variables, just like globals, have the issue that they may be destroyed in any order (there are some limited ordering constraints, but they’re mostly useless). So if foo depends on other statics or globals, then the program might not be able to exit cleanly: the dependencies might be destroyed before foo is destroyed, so for a while foo might try to use other objects after destruction.

So, purists be damned, unless you actually need the singleton’s destructor to be called at program exit, I’d recommend allocating singletons on the heap and never freeing them (they live until program exit anyway, and all resources get freed when that happens):

Read the ZeroMQ documentation

ZeroMQ is a very well-designed piece of software. Even if you don’t plan to use it but you have an interest in distributed systems, low-latency networking, or protocol design, I urge you to read its documentation in detail, particularly chapters 3, 4, and especially 7. You’ll need to familiarize yourself with ZeroMQ before you do this, so you lose something if you just jump to chapter 7 and start reading, but trust me, it’s worth it.

Yet another gcc bug

Fahim ul Haq was debugging something involving shared_ptr that ended up being a bug in GCC related to C++11 lambdas.

Original code:

I confirmed that it is, indeed, a GCC bug, and reduced it to:

We filed the bug as Incorrect code generation in lambda with argument of type reference to template class.

boost NOT considered harmful

Boost C++ libraries vary a lot in quality and efficiency, but here’s a list of things that I’ve personally found useful.

Some components started out in Boost and made their way into the C++ standard: algorithm,
array, bind, enable_if, function, random, ref, smart pointers, system, thread, tuple, type traits; use the standard version whenever possible.

  • Non-standard containers, specifically flat_set and flat_map.
  • Filesystem access; never use readdir_r() again.
  • Intrusive: old-style C linked lists where elements link to each other instead of using a level of indirection (and memory allocation). Also particularly useful is get_parent_from_member , which allows you to save memory when you know that an object of type Foo is embedded in an object of type Bar (you don’t need to save a pointer from Foo to its enclosing Bar ).
  • Iterator: iterator_facade and iterator_adaptor make it easy to write iterators that behave like STL iterators (they supply all the boilerplate for you). The premade iterators are also useful.
  • I/O Stream State Savers for the (hopefully rare) cases where you actually have to deal with iostreams.
  • THE BOOST MPL LIBRARY makes complex generic programming somewhat easier. It’s big and loud (as its all-upper-case name indicates) but powerful; use very sparingly.
  • Multi-Index allows you to implement containers indexed by multiple keys — think of a LRU cache (elements are sorted by access time for expiration, but also keyed by a key for retrieval).
  • Operators: overload the minimal set of (comparison, assignment, arithmetic, etc) operators for your class, let Boost generate the rest.
  • Smart pointers: unique_ptr and shared_ptr are now standard, but intrusive_ptr is also useful when you need to save memory.
  • String algorithms, particularly starts_with.
  • Utility, particularly noncopyable , next() , prior().