ezyang's research log

What I did today. High volume, low context. Sometimes nonsense. (Archive)

foldingcookie: use-after-free on prev? I feel like a DSL or even a diagrammatical tool would make these pointer shuffle loops way easier to inspect

Bingo!

Debugging segfaults in GHC

GHC is a nice, complicated piece of Haskell code, so if you introduce a bug in the compiler, you are likely to cause ghc-stage2 to segfault. Before you try debugging it directly, see if you can build a simpler program using ghc-stage1 -debug which segfaults, and debug that instead. Of course, for subtle compiler bugs, GHC itself may be the easiest piece of code to induce the failure with. If you are doing in-place builds, you can rebuild the stage 2 compiler with the debugging RTS by deleting ghc/stage2/build/tmp/ghc-stage2 and then running make 2 EXTRA_HC_OPTS=-debug in the ghc directory. (Substitute 1 for 2 if debugging the stage 1 compiler, although that one really shouldn’t be segfaulting!)

It’s not possible to run GHC under GDB directly with gdb inplace/bin/ghc-stage2 on Linux, because this file is not actually an executable: it’s a shell script. You can convert it to run gdb by editing the last line to read exec gdb --args "$executablename" -B"$topdir" ${1+"$@"}; alternatively, you can manually import the environment variables set by this script and then run GDB by hand on the result. (XXX: Someone should put instructions for how to run gdb on GHC, while feeding it information from stdin, on Windows.)

Spot the bug

prev = NULL;
for (oc = unloaded_objects; oc; prev = oc, oc = next) {
  next = oc->next;
  if (oc->referenced == 0) {
    if (prev == NULL) {
      unloaded_objects = oc->next;
    } else {
      prev->next = oc->next;
    }
    deallocate(oc);
  }
}
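
For comparison, here is a minimal sketch of one way to write the loop so that prev never ends up pointing at a freed node: prev only advances when the current node survives. The ObjectCode struct, the counting deallocate, and the helper push_object are hypothetical stand-ins invented for this example, not the real RTS definitions.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the record the loop walks. */
typedef struct ObjectCode {
    struct ObjectCode *next;
    int referenced;
} ObjectCode;

/* Hypothetical stand-in for deallocate(); counts calls so a sweep can be checked. */
int freed_count = 0;

void deallocate(ObjectCode *oc) {
    freed_count++;
    free(oc);
}

/* Unlink and free every unreferenced object in the list rooted at *head. */
void sweep_unreferenced(ObjectCode **head) {
    assert(head != NULL);
    ObjectCode *prev = NULL;
    ObjectCode *next;
    for (ObjectCode *oc = *head; oc != NULL; oc = next) {
        next = oc->next;            /* read the link before oc may be freed */
        if (oc->referenced == 0) {
            if (prev == NULL) {
                *head = next;       /* removing the current head */
            } else {
                prev->next = next;  /* splice around the dead node */
            }
            deallocate(oc);         /* prev is deliberately left unchanged */
        } else {
            prev = oc;              /* only surviving nodes become prev */
        }
    }
}

/* Hypothetical helper: push a fresh node onto the front of a list. */
ObjectCode *push_object(ObjectCode *head, int referenced) {
    ObjectCode *oc = malloc(sizeof *oc);
    assert(oc != NULL);
    oc->referenced = referenced;
    oc->next = head;
    return oc;
}
```

The difference from the puzzle is that the prev = oc update moved out of the for header and into the else branch, so two adjacent unreferenced objects no longer cause a write through a dangling prev.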

How foreign exports and -split-objs work

In the case of a foreign export, we generate C code (compiled to a single object file) that looks like this:

extern StgClosure Mod_foo_closure;
void foo() { /* code involving Mod_foo_closure */ }
static void __attribute__((constructor)) stginit_export_Mod_foo() {
  getStablePtr((StgPtr) &Mod_foo_closure);
}

Then in the splittable generated code, we have references to Mod_foo_closure. The stub object is thus linked in, and its constructor called, in two situations:

  1. When we have a reference to Mod_foo_closure (i.e. we link with Haskell code which refers to the closure), or
  2. When we have a reference to foo (i.e. we link with C code which refers to the foreign export).

What about dynamically loaded code? Well, as it turns out, the correct way to implement a loader is to reload the library in question if some dynamically loaded code requests it, and since the GHCi linker does not drop unreferenced object files, the library has a second chance to set up the foreign export.

Split objects and initializers

One day, you discover GCC’s support for constructors, and write this code:

#include <stdio.h>

void __attribute__((constructor)) c() {
    printf("init\n");
}

int main(void) {
    printf("main\n");
    return 0;
}

You think, “Wow, what a wonderful way to implement a plugin system, where if you load an object, some initialization code can get run.”

Later, you refactor your code into two source files, aux.c:

#include <stdio.h>

void __attribute__((constructor)) c() {
    printf("init\n");
}

and main.c:

#include <stdio.h>

int main(void) {
    printf("main\n");
    return 0;
}

build the object files and link them together, and things still work. You nod approvingly.

Finally, one day, you decide aux.c deserves to be put in a library.

ar rcs libaux.a aux.o
gcc main.o -laux -L.

And suddenly, the initializer is no longer being called! You got punked, and your friend demonstrates to you that an object file is not loaded unless it contains a symbol that is being requested:

gcc main.o -laux -L. -uc

“Well!” you think, “That’s pretty annoying!”
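
One way to avoid the -u flag is to have the main program reference some symbol defined in the same object file as the constructor, so the linker has a reason to pull that object out of the archive. The following is only a sketch of that pattern, squeezed into one listing: the comments mark which part would live in aux.c and which in main.c, and the names aux_init, aux_is_initialized and run_main are hypothetical ones made up for this example.

```c
#include <assert.h>
#include <stdio.h>

/* ---- would live in aux.c, archived into libaux.a ---- */

static int initialized = 0;

/* Runs before main() on GCC and Clang, provided this object file is linked in. */
static void __attribute__((constructor)) aux_init(void) {
    initialized = 1;
    printf("init\n");
}

/* Hypothetical anchor symbol: gives other code something in aux.o to request. */
int aux_is_initialized(void) {
    return initialized;
}

/* ---- would live in main.c ---- */

/* Stand-in for the body of main(); call it from your real main. */
int run_main(void) {
    /* This reference is what would force aux.o out of the archive. */
    int ready = aux_is_initialized();
    assert(ready && "constructor should have run before main");
    printf("main\n");
    return ready ? 0 : 1;
}
```

The cost of this approach is that the plugin is no longer invisible to the main program: something has to name the anchor, which is exactly the coupling the constructor trick was meant to avoid.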

Why Launchpad can’t build GHC

I’m sure there are more problems on the way, but here are the ones I have encountered so far:

  1. Some of the repository imports are hung on the “Bazaar import stage”. Affected: https://code.launchpad.net/~ghc/ghc/packages-containers-master and https://code.launchpad.net/~ghc/ghc/packages-time-master. See also bug https://bugs.launchpad.net/launchpad/+bug/904683. (Update: wgrant has delightfully unwedged these branches!)

  2. Submodules are not supported by the importer: https://code.launchpad.net/~ghc/ghc/master. See also bug https://bugs.launchpad.net/launchpad/+bug/402814

  3. The importer can’t deal with gpgsig fields on commits: https://code.launchpad.net/~ghc/ghc/haddock-master. See also bug https://bugs.launchpad.net/ubuntu/+source/bzr-git/+bug/1084403