Archive for the 'Fun' Category

Hash Table Shootout!

It’s gratifying to see GHashTable faring not-too-badly against an assortment of other hash table implementations in Nick Welch’s excellent benchmarks. I did some work on this previously, but there’s at least one thing that can still be done to reduce its memory footprint on 64-bit architectures. The table is a big array of structs that look like this:

struct _GHashNode
{
  gpointer   key;
  gpointer   value;
  guint      key_hash;
};

The key_hash is useful, as it lets us skip over mismatches and resize the hash table quickly - and on a 32-bit arch, it adds only 4 bytes to every entry, for a total of 12 bytes. However, on a 64-bit arch, it causes entries to be padded so that each entry starts on a multiple of 8 bytes. That wastefulness can be remedied by packing key_hash values in a separate array - as a bonus, entry size becomes a power of two, which means offsets into the array can be calculated using a single shift. On the downside, we’ll incur the added overhead of maintaining two arrays and accessing an extra cache line for each lookup. I suspect it’s still worth it, though.

A couple of open questions/comments on the benchmarks themselves:

  • Google’s dense_hash_map is suspiciously fast at integer deletion - is it shrinking the hash table at all, re-inserting the remaining items as it should? Or does it have some other trick up its sleeve?
  • How robust are the different strategies with respect to poorly distributed keys? E.g. N*2^M spaced integers: 1024, 2048, 3072, 4096, etc.
  • How about poor hash functions supplied by the API user? GHashTable attempts to tackle this by calculating the initial array index modulo a prime number, before applying a quadratic modulo on subsequent probes (faster, and required for quadratic probing).
  • What’s the per-table overhead? How much memory will, say, some 100.000 tables with <100 items each consume? This is not uncommon in practice - nested hashes are often used to store complex properties, and given suitably large working sets, this can become a significant factor for some programs.
  • Are the approaches affected by memory fragmentation? Do they cause it? This is hard to measure; maybe many tables could be grown simultaneously in the same address space.

More GUADEC, Istanbul

A day and a half left of the conference. Some pictures from the awesome, Collabora-sponsored Bosporus cruise, sightseeing, etc. below. Everyone seems to be having a good time.

Bosporus cruise Istanbul bridge Castle Mosque Another mosque Cats everywhere Clemens and Federico

Fun at GUADEC 2008

Tram in istanbul

At GUADEC 2008. Istanbul is a fun and interesting place, and the talks aren’t all that bad either. Not much time to write or code, though.

Home, sweet home

I’m finally back in Xalapa after more than three weeks on the road.

After the wildly successful GTK+ Berlin hackfest, I spent a day in the Hague with my good friends Andrea and Marcus, then headed for home. The flight was awful as usual, but mitigated by the fact that I was bringing back good memories and lots and lots of Belgian and Dutch candy.

Three happy hackers
Christian, Alex and Benjamin: Three happy hackers
(Ryan was AWOL for the picture)

GTK+ 2008 hackfest crew
The hackfest gang
(Picture courtesy of Tor Lillqvist)

Andrea, Marcus and their parrot, Perla
Andrea, Marcus and their pet parrot, Perla

Bikes at the Hague central station
Bikes parked at the Hague central station

Swinging by Europe: The Story Thus Far

Late last month, I strapped on my backpack and went across the ocean to FOSDEM in Brussels. It was good to see some of the openSUSE and old Ximian guys again, but I was feeling kind of drained the whole time. Had Belgian waffles w/everything, which is a lot more “everything” than it is “waffle” (i.e. ice cream, whipped cream, chocolate sauce, chocolate shavings, strawberry jam, strawberries and banana).

Old Building in Brussels
Some old building in Brussels

After FOSDEM, I spent close to two weeks in Norway, visiting family and friends in between work. I even went skiing one weekend, an experience I’ve been idealizing in my mind since I moved to Mexico. Even so, it lived up to expectations.

Ski Tracks
Ski tracks

Me and Mom Skiing
Me and my Mom skiing

My friends at Copyleft Software A/S were kind enough to lend me office space in Oslo. I must say they have an awesome work environment thing going.

Outside Copyleft’s Offices
Outside Copyleft’s offices

Lunch at Copyleft
Lunch at Copyleft

Leo is Donald Duck
Leo making his famous Donald Duck impression

After Norway, I left for Berlin where I’m currently taking up space at the GTK+ Hackfest.