Tuesday, January 18, 2011

Writing Ruby Extensions in C - Part 12, Allocating memory

This is the twelfth in my series of posts about writing ruby extensions in C. The first post talked about the basic structure of a project, including how to set up building. The second post talked about generating documentation. The third post talked about initializing the module and setting up classes. The fourth post talked about types and return values. The fifth post focused on creating and handling exceptions. The sixth post talked about ruby catch and throw blocks. The seventh post talked about dealing with numbers. The eighth post talked about strings. The ninth post focused on arrays. The tenth post looked at hashes. The eleventh post explored blocks and callbacks. This post will look at allocating and freeing memory.

Allocating memory


When creating a new ruby object, ruby automatically allocates the memory for it from its garbage-collected heap as needed.

If the ruby extension needs to allocate C-style memory, the basic malloc/realloc/calloc calls can be used. However, there are ruby counterparts that do the work of malloc/realloc/calloc in a slightly better way. The advantage of the following calls is that they first try to allocate memory, and if they fail, they will invoke the garbage collector to free up a bit of memory and try again. That way if the program is low on memory, or the address space is fragmented because of the ruby memory allocator, these functions will succeed where basic malloc/realloc/calloc would fail:
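  • ALLOC(type) - allocate one structure of the given type and return a pointer to it
  • ALLOC_N(type, num) - allocate num contiguous structures of the given type and return a pointer to the first
  • REALLOC_N(var, type, num) - resize the memory that var points to so that it holds num structures of the given type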

It is important to use xfree() to free the memory allocated by these calls. In the nominal case there isn't much difference between regular free() and xfree(), but if ruby is built a certain way, xfree() does some additional internal accounting. In any case, there is no reason not to use xfree(), so it is recommended to always use xfree(). Thanks to SodaBrew for pointing this out in the comments.
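To make the "try, collect, try again" behavior concrete, here is a minimal sketch in plain C. It is not ruby's implementation: run_gc(), raw_alloc(), retrying_alloc(), reserve, and fail_next are all hypothetical names made up for illustration, and the "garbage collector" just releases one cached buffer. The real functions raise a ruby NoMemoryError when the second attempt also fails; this sketch simply aborts.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the interpreter's collector. It releases one
 * cached buffer and returns nonzero if it managed to free anything. */
static void *reserve = NULL;

static int run_gc(void) {
    if (reserve == NULL)
        return 0;
    free(reserve);
    reserve = NULL;
    return 1;
}

/* Hook that forces the next N allocations to fail, so the retry path can
 * be exercised without actually exhausting memory. */
static int fail_next = 0;

static void *raw_alloc(size_t size) {
    if (fail_next > 0) {
        fail_next--;
        return NULL;
    }
    return malloc(size);
}

/* Try once; on failure collect garbage and try a second time. */
static void *retrying_alloc(size_t size) {
    void *mem;

    assert(size > 0);
    mem = raw_alloc(size);
    if (mem == NULL && run_gc())
        mem = raw_alloc(size);
    if (mem == NULL) {
        /* ruby raises NoMemoryError at this point; the sketch aborts. */
        fprintf(stderr, "out of memory\n");
        abort();
    }
    return mem;
}
```

The point of the sketch is only the control flow: a caller of retrying_alloc() never sees NULL, which is why extension code does not check the result of ALLOC or ALLOC_N.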

A simple example to demonstrate the use of ALLOC, ALLOC_N, REALLOC_N, and xfree():

 1) struct mystruct {
 2)     int a;
 3)     char *b;
 4) };
 5)
 6) static VALUE implementation(VALUE a) {
 7)     struct mystruct *single;
 8)     struct mystruct *multiple;
 9)
10)     single = ALLOC(struct mystruct);
11)     xfree(single);
12)
13)     multiple = ALLOC_N(struct mystruct, 5);
14)
15)     REALLOC_N(multiple, struct mystruct, 10);
16)
17)     xfree(multiple);
18)
19)     return Qnil;
20) }

Lines 1 through 4 just define a simple structure containing an int and a char *. The implementation of a ruby method on lines 6 through 20 shows the use of the allocation functions. Line 10 shows the allocation of a single structure of type struct mystruct, which is freed on line 11. Line 13 shows the allocation of an array of 5 elements of type struct mystruct into the multiple pointer. Line 15 shows the reallocation of the multiple array to 10 elements. Notice that since REALLOC_N is a macro, it operates slightly differently than realloc(); in particular, the macro assigns the new address back to the variable you pass in, so there is no need (and no way) to re-assign the pointer yourself. Finally, line 17 frees up the multiple pointer and line 19 returns successfully from the function.
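The reason a macro can update the caller's pointer while a function like realloc() cannot is easier to see with a stand-in. The sketch below is plain C and does not use ruby's headers; MY_REALLOC_N, checked_realloc(), and grow_example() are hypothetical names, and the macro only imitates the general shape of REALLOC_N.

```c
#include <assert.h>
#include <stdlib.h>

/* realloc() that never returns NULL. ruby raises an exception on failure;
 * this sketch just aborts. */
static void *checked_realloc(void *ptr, size_t size) {
    void *mem = realloc(ptr, size);
    if (mem == NULL)
        abort();
    return mem;
}

/* Hypothetical stand-in for REALLOC_N: because it is a macro, it can
 * assign the resized block back to the caller's own variable. */
#define MY_REALLOC_N(var, type, n) \
    ((var) = (type *)checked_realloc((var), sizeof(type) * (n)))

struct mystruct {
    int a;
    char *b;
};

/* Grow an array from 5 to 10 elements and return an element written
 * before the resize, to show that existing contents are preserved. */
static int grow_example(void) {
    struct mystruct *multiple = checked_realloc(NULL, sizeof(*multiple) * 5);
    int kept;

    multiple[4].a = 42;
    MY_REALLOC_N(multiple, struct mystruct, 10);  /* no "multiple = ..." needed */
    assert(multiple[4].a == 42);                  /* old elements survive */
    multiple[9].a = 7;                            /* new elements are usable */
    kept = multiple[4].a;

    free(multiple);
    return kept;
}
```

With plain realloc() you would have to write multiple = realloc(multiple, ...) yourself; the macro form hides that assignment.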

Error handling and not leaking memory


Ruby is a garbage collected language, meaning that applications don't generally have to worry about freeing memory after it is used. This garbage collection extends into C extension modules, but only to a certain point. If you are writing a C extension to ruby, there are some places that you have to worry about keeping track of your pointers and freeing them up. To understand why, we need to dig a little into the memory allocation functions of ruby.

When you are writing pure ruby, and execute a line of code like:

x = ['a']

the ruby virtual machine causes some memory to come into existence to hold that array for you. The way that this memory is allocated is with rb_ary_new() (or one of its derivatives). The call chain looks like: rb_ary_new() -> rb_ary_new2() -> ary_new() -> NEWOBJ() -> rb_newobj(). Inside of rb_newobj(), no memory is actually allocated; instead, the new object that we need to come into existence is just taken off of the list of free objects, and the free list head is moved to the next object. If it turns out that no memory is available in this freelist, the garbage collector is run to try to reap some memory, and then the memory is given to this new object. Because this memory is coming from the freelist, it is all tracked by (and can later be reaped by) the garbage collector.
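As a rough picture of the free-list idea, here is a toy allocator in plain C. It is a sketch under heavy assumptions, not ruby's gc.c: struct slot, heap, freelist, sweep(), and new_object() are hypothetical names, the heap has a fixed 4 slots, and "garbage" is simply any slot whose in_use flag has been cleared.

```c
#include <assert.h>
#include <stddef.h>

#define HEAP_SLOTS 4

/* Hypothetical object slot; a real ruby slot holds flags, a class
 * pointer, and the object's data. */
struct slot {
    int in_use;
    struct slot *next_free;
};

static struct slot heap[HEAP_SLOTS];
static struct slot *freelist = NULL;

/* Thread every unused slot onto the free list (stands in for the sweep
 * phase of a collection). */
static void sweep(void) {
    int i;

    freelist = NULL;
    for (i = 0; i < HEAP_SLOTS; i++) {
        if (!heap[i].in_use) {
            heap[i].next_free = freelist;
            freelist = &heap[i];
        }
    }
}

/* Pop the head of the free list; collect first if the list is empty. */
static struct slot *new_object(void) {
    struct slot *obj;

    if (freelist == NULL)
        sweep();
    if (freelist == NULL)
        return NULL;            /* a real interpreter would grow the heap */

    obj = freelist;
    assert(!obj->in_use);       /* free-list entries are never live */
    freelist = obj->next_free;  /* move the head to the next free slot */
    obj->in_use = 1;
    return obj;
}
```

Handing out an object is just a pointer swap, and every slot stays known to the collector because it never leaves the heap array.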

When you allocate memory in C code using malloc (or one of its derivatives), no such thing happens. The memory is properly allocated, but it is not involved in any of the garbage collection schemes. This leads to 2 problems:
  1. Since malloc isn't involved in the garbage collection, the malloc can fail earlier than it normally would due to address space fragmentation. This isn't generally a problem on 64-bit architectures, but it could crop up as a problem on 32-bit ones.
  2. If a ruby call in your extension module fails, it will throw an exception. In ruby, exceptions are done via a longjmp out of the extension code and into the ruby exception handling code. If you have allocated any memory with malloc and friends, you have now lost the pointers to that memory, so you now have a memory leak (apparently this problem is much worse when dealing with C++; see [1]).

Problem 1) is partially solved by using the built-in ruby ALLOC, ALLOC_N, and ruby_xmalloc functions. Problem 2) is much more insidious, and more difficult to handle. Luckily, it is not impossible to handle.
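The second problem can be reproduced without ruby at all, since it is really a property of longjmp. The plain C sketch below counts outstanding allocations to show the leak; handler, live_blocks, tracked_alloc(), tracked_free(), failing_call(), extension_method(), and run_and_count_leaks() are all hypothetical names, and failing_call() merely stands in for a ruby call that raises.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>

static jmp_buf handler;      /* stands in for ruby's exception handler */
static int live_blocks = 0;  /* allocations that have not been freed yet */

static void *tracked_alloc(size_t size) {
    void *mem = malloc(size);
    if (mem == NULL)
        abort();
    live_blocks++;
    return mem;
}

static void tracked_free(void *mem) {
    assert(live_blocks > 0);
    free(mem);
    live_blocks--;
}

/* Stands in for a ruby call that raises: it never returns to its caller. */
static void failing_call(void) {
    longjmp(handler, 1);
}

static void extension_method(void) {
    int *ids = tracked_alloc(5 * sizeof(int));

    ids[0] = 0;
    failing_call();      /* jumps straight to the handler ... */
    tracked_free(ids);   /* ... so this line is never reached */
}

/* Runs the method and returns how many blocks are still allocated once
 * control has arrived back at the handler. */
static int run_and_count_leaks(void) {
    if (setjmp(handler) == 0)
        extension_method();
    return live_blocks;
}
```

After the jump, the only copy of the ids pointer lived in a stack frame that no longer exists, so nothing can free the block.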

Assume you have the following code snippet:


 1) int *ids;
 2) VALUE result;
 3) int i;
 4)
 5) ids = ALLOC_N(int, 5);
 6) for (i = 0; i < 5; i++)
 7)     ids[i] = i;
 8)
 9) result = rb_ary_new2(5);
10)
11) for (i = 0; i < 5; i++)
12)     rb_ary_push(result, INT2NUM(ids[i]));
13)
14) xfree(ids);

(while this is a bit of a contrived example, it actually bears a lot of resemblance to this[2] code in ruby-libvirt)

What this code is trying to do is to create an array full of the values in the "ids" array. If there are no errors, then this code works absolutely fine and doesn't leak any memory (ids gets freed at line 14, and the ruby array will get reaped by the garbage collector eventually). However, if either rb_ary_new2() or rb_ary_push() fails in lines 9 or 12, then they will automatically longjmp to the ruby exception handler, completely skipping the xfree at line 14. This code has now leaked memory.

The way to fix this is to interrupt ruby's normal longjmp-on-exception mechanism so that you can insert code of your own before the exception propagates. The rb_protect() ruby call can be used to do exactly this. Unfortunately the interface is a bit clunky, but we have to do what we have to do.

rb_protect() takes 3 arguments: a callback function that takes exactly 1 argument, the argument to pass to that callback function, and a pointer to an integer in which to store the exception state (zero if the callback returned normally, non-zero if it raised). Because the callback function can only take one argument, typical usage is to create a callback "wrapper" that takes the one and only argument. The data that you pass in can be anything, so if you want to pass in multiple arguments, you can do so by passing in a pointer to a structure containing all of the data that you need. An example should help clarify some of this:


 1) struct rb_ary_push_arg {
 2)     VALUE arr;
 3)     VALUE value;
 4) };
 5)
 6) static VALUE rb_ary_push_wrap(VALUE arg) {
 7)     struct rb_ary_push_arg *e = (struct rb_ary_push_arg *)arg;
 8)
 9)     return rb_ary_push(e->arr, e->value);
10) }
11)
12) int *ids;
13) VALUE result;
14) int i;
15) int exception = 0;
16) struct rb_ary_push_arg args;
17)
18) ids = ALLOC_N(int, 5);
19) for (i = 0; i < 5; i++)
20)     ids[i] = i;
21)
22) result = rb_ary_new2(5);
23)
24) for (i = 0; i < 5; i++) {
25)     args.arr = result;
26)     args.value = INT2NUM(ids[i]);
27)     rb_protect(rb_ary_push_wrap, (VALUE)&args, &exception);
28)     if (exception) {
29)         xfree(ids);
30)         rb_jump_tag(exception);
31)     }
32) }
33)
34) xfree(ids);

Now when we add entries to the ruby array, we are doing so through the rb_ary_push_wrap() function, called by rb_protect(). This means that if rb_ary_push() fails for any reason and throws an exception, control will be returned back to the code above at line 28, but with exception set to a non-zero number. We have a chance to clean up after ourselves, and then continue propagating the exception with rb_jump_tag(). Note that with the use of a proper structure, we can pass any number of arguments through to the wrapper function, so we can use this for all internal ruby functions. Notice that I did not wrap rb_ary_new2(), even though that can cause the same problem; I leave this as an exercise to the reader.
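If the shape of the rb_protect() interface still feels odd, it may help to see a toy version of it. The plain C sketch below is not how ruby implements rb_protect(); value_t, raise_error(), protect(), struct push_args, and push_wrap() are hypothetical stand-ins built on setjmp/longjmp, and the toy protect() returns 0 when the callback raises. It only illustrates the three-argument contract and the "pack everything into one struct" trick.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t value_t;              /* hypothetical stand-in for VALUE */

static jmp_buf *current_handler = NULL; /* innermost active protect() */
static int pending_tag = 0;             /* tag of the error in flight */

/* Stand-in for raising an exception: jump to the innermost handler. */
static void raise_error(int tag) {
    assert(tag != 0 && current_handler != NULL);
    pending_tag = tag;
    longjmp(*current_handler, 1);
}

/* Call func(arg). On a normal return, *state is 0 and the callback's
 * result is returned. If the callback raises, *state receives the
 * nonzero tag and 0 is returned. */
static value_t protect(value_t (*func)(value_t), value_t arg, int *state) {
    jmp_buf here;
    jmp_buf *saved = current_handler;
    value_t result;

    current_handler = &here;
    if (setjmp(here) == 0) {
        result = func(arg);
        *state = 0;
    } else {
        result = 0;
        *state = pending_tag;
    }
    current_handler = saved;            /* allow nested protect() calls */
    return result;
}

/* Several arguments packed into one struct so they fit through the
 * single-argument callback. */
struct push_args {
    int *dest;
    int count;
    int value;
};

static value_t push_wrap(value_t arg) {
    struct push_args *a = (struct push_args *)arg;

    if (a->value < 0)
        raise_error(7);                 /* simulate a failing call */
    a->dest[a->count++] = a->value;
    return (value_t)a->count;
}
```

A caller checks state after every protect() call, frees whatever it owns if state is non-zero, and only then lets the error continue, which is the same sequence the listing above follows with rb_protect() and rb_jump_tag().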

[1] http://www.thoughtsincomputation.com/posts/ruby-c-extensions-c-and-weird-crashing-on-rb_raise
[2] http://libvirt.org/git/?p=ruby-libvirt.git;a=blob;f=ext/libvirt/domain.c;h=eb4426252af635311e14e234a62780fbd4048f0b;hb=HEAD#l80

11 comments:

  1. Really useful stuff.
    I am wondering, how does the GC garbage collect ruby objects that you use only in a C extension? For example if you had a ruby array in a global variable in the extension, and you wouldn't even pass it into ruby code...
    My first thought is that GC will free the array as soon as it's run since no ruby reference exists to it - and probably it can't check if C code has a reference to it. I base this guess on some general knowledge of the 1.8 mark and sweep GC but I don't know how it works in 1.9, which is what I am more interested in.
    So how does this work, do you know?

  2. Honestly, I'm not sure. It's been a long time since I looked at it :). What I do remember is that any objects you want to be automatically garbage collected must be announced to the GC via a particular call. Unfortunately I don't quite remember the call, and I also vaguely recall that there were some limitations with it, but I'm really not sure.

  3. What happens when ALLOC or ALLOC_N fails? Does it fail like malloc - or is there special treatment that makes it raise a ruby exception?

    Replies
    1. Unfortunately it has been quite a while since I looked at this, so I don't exactly remember. I took a quick look at my ruby-libvirt code. I am always very conscious to check for NULL after allocating memory in my C code. However, in the ruby-libvirt code, I never check for NULL after an ALLOC or ALLOC_N, which pretty much leads me to believe that it throws a ruby exception. You can confirm by looking through the ruby code itself, which is pretty easy to follow.

  4. Any chance you would have this series wrapped up as a PDF?

    Replies
    1. Unfortunately, no. I honestly don't know a good way to do that from Blogger, but if I find one, I might do it :).

  5. Ruby's xmalloc allocates memory from within a region of memory that Ruby itself has allocated with libc malloc. If you use free on the pointer returned by xmalloc, you will be releasing back to the system a subset of the memory that Ruby thinks it still owns. This will crash your program. Therefore: pointers from xmalloc must be xfree'd and pointers from malloc must be free'd.

    Replies
    1. Ah, I didn't realize that. That is the missing piece. Thanks, I'll update the article later on.

    2. Actually, it looks more subtle than that. If ruby is built with CALC_EXACT_MALLOC_SIZE as 0 (the default), there really isn't much difference in calling malloc/free and ALLOC_N/xfree (as far as I can tell from reading gc.c). It's when ruby is built with CALC_EXACT_MALLOC_SIZE set to 1 that xfree() becomes important. I guess that argues for using xfree() just to be safe.

  6. Thanks for the great tutorials, got mostly everything i need now to create some fine extensions.
