Tuesday, January 11, 2011

Writing Ruby Extensions in C - Part 5, Exceptions

This is the fifth in my series of posts about writing ruby extensions in C. The first post talked about the basic structure of a project, including how to set up building. The second post talked about generating documentation. The third post talked about initializing the module and setting up classes. The fourth post talked about types and return values. This post will focus on creating and handling exceptions.

Exceptions

When a method implementation in a ruby C extension encounters an error, the typical response is to throw an exception (a value indicating error can also be returned, but that is not idiomatic). The exception to be thrown can either be one of the built-in exception classes, or a custom defined exception class. The built-in exception classes are:
  • rb_eException
  • rb_eStandardError
  • rb_eSystemExit
  • rb_eInterrupt
  • rb_eSignal
  • rb_eFatal
  • rb_eArgError
  • rb_eEOFError
  • rb_eIndexError
  • rb_eStopIteration
  • rb_eRangeError
  • rb_eIOError
  • rb_eRuntimeError
  • rb_eSecurityError
  • rb_eSystemCallError
  • rb_eThreadError
  • rb_eTypeError
  • rb_eZeroDivError
  • rb_eNotImpError
  • rb_eNoMemError
  • rb_eNoMethodError
  • rb_eFloatDomainError
  • rb_eLocalJumpError
  • rb_eSysStackError
  • rb_eRegexpError
  • rb_eScriptError
  • rb_eNameError
  • rb_eSyntaxError
  • rb_eLoadError

Extension modules should usually define a custom exception class for errors related directly to the extension, and use one of the built-in exception classes for standard errors. The custom exception class should generally be a subclass of rb_eException or rb_eStandardError, though if the module has special needs any of the built-in exception classes can be used. Example:

 1) static VALUE m_example;
 2) static VALUE e_ExampleError;
 3)
 4) static VALUE exception_impl(VALUE klass, VALUE input) {
 5)     if (TYPE(input) != T_FIXNUM)
 6)         rb_raise(rb_eTypeError, "invalid type for input");
 7)
 8)     if (NUM2INT(input) == -1)
 9)         rb_raise(e_ExampleError, "input was < 0");
10)     return Qnil;
11) }
12)
13) void Init_example() {
14)     m_example = rb_define_module("Example");
15)
16)     e_ExampleError = rb_define_class_under(m_example, "Error",
17)                                            rb_eStandardError);
18)
19)     rb_define_module_function(m_example, "exception_example",
20)                               exception_impl, 1);
21) }
Line 14 sets up the extension module. Line 16 creates the custom exception class as a subclass of rb_eStandardError. Now if the extension module runs into a situation that it can't accept, it can raise e_ExampleError, which Ruby callers see as an exception of type Example::Error. Line 19 defines a module function that demonstrates the use of standard and custom exceptions. If Example::exception_example is called with an argument that is not a number, it raises the TypeError exception on line 6 (side note: Check_Type should really be used to do this type of checking, but for example purposes we omit that). If Example::exception_example is called with a number argument that is -1, then the custom exception Example::Error is raised on line 9. Otherwise, the method succeeds and Qnil is returned.
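
To see what callers of the extension above would observe, here is a pure-Ruby stand-in for it. This is only a sketch, not the compiled extension: the describe helper is made up for illustration, and Integer is used as an approximation of the T_FIXNUM check in the C code.

```ruby
# Pure-Ruby stand-in for the C extension above (an approximation for
# illustration; the real module is created by Init_example in C).
module Example
  # Mirrors rb_define_class_under(m_example, "Error", rb_eStandardError)
  class Error < StandardError; end

  # Mirrors exception_impl: TypeError for non-integers, Example::Error for -1
  def self.exception_example(input)
    raise TypeError, "invalid type for input" unless input.is_a?(Integer)
    raise Error, "input was < 0" if input == -1
    nil
  end
end

# Hypothetical helper showing how a caller tells the two exceptions apart
def describe(input)
  Example.exception_example(input)
  "ok"
rescue TypeError => e
  "built-in: #{e.message}"
rescue Example::Error => e
  "custom: #{e.message}"
end

puts describe(5)
puts describe("five")
puts describe(-1)
```

Because Example::Error derives from StandardError, a bare rescue in calling code would also catch it, which is usually what users of an extension expect.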

Raising exceptions

There are a few different ways to raise exceptions:
  • rb_raise(error_class, error_string, ...) - the main interface for raising exceptions. A new exception object of class type error_class is created and then raised, with the error message set to error_string (plus any printf-style arguments)
  • rb_fatal(error_string, ...) - a function for raising an exception of type rb_eFatal with the error message set to error_string (plus any printf-style arguments). After this call the entire ruby interpreter will exit, so extension modules typically should not use it
  • rb_bug(error_string, ...) - prints out the error string (plus any printf-style arguments) and then calls abort(). Since this call doesn't allocate an error object or do any of the other typical exception handling steps, it isn't technically a function to raise exceptions. This function should only be used when a bug in the interpreter is found, and as such, should not be used by extension modules
  • rb_sys_fail(error_string) - raises an exception based on errno. Ruby defines a separate class for each of the errno values (such as Errno::EAGAIN, Errno::EACCES, etc), and this function will raise an exception of the type that corresponds to the current errno
  • rb_notimplement() - raises an exception of rb_eNotImpError. This is used when a particular function is implemented on one platform, but possibly not on other platforms that ruby supports
  • rb_exc_new2(error_class, error_string) - allocate a new exception object of type error_class, and set the error message to error_string. Note that rb_exc_new2() does not accept printf-style options, so the string will have to be fully-formed before passing it to rb_exc_new2()
  • rb_exc_raise(error_object) - a low-level interface to raise exceptions that have been allocated by rb_exc_new2()
  • rb_exc_fatal(error_object) - a low-level interface to raise a fatal exception that has been allocated by rb_exc_new2(). After this call the entire ruby interpreter will exit, so extension modules typically should not use it
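
For readers more at home in Ruby than in C, some of the calls in this list have rough Ruby-level counterparts. The sketch below is an analogy only; the method names are invented for illustration, and Errno::ENOENT::Errno stands in for whatever value the C errno variable would hold.

```ruby
# Rough Ruby-level counterparts of some of the C calls listed above.
# These are analogies for illustration, not the C API itself.

# Like rb_raise(rb_eArgError, "bad value %d", 42)
def raise_formatted
  raise ArgumentError, format("bad value %d", 42)
end

# Like rb_sys_fail("open"): the errno value selects the Errno::* class
def raise_from_errno
  raise SystemCallError.new("open", Errno::ENOENT::Errno)
end

# Like rb_exc_new2() followed by rb_exc_raise(): build the object, then raise it
def raise_prebuilt
  error = RuntimeError.new("fully formed message")
  raise error
end

[:raise_formatted, :raise_from_errno, :raise_prebuilt].each do |name|
  begin
    send(name)
  rescue StandardError => e
    puts "#{name}: #{e.class}"
  end
end
```
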
The example below shows the use of rb_raise() and rb_exc_raise(), which are the only two calls that extension modules should really use.

 1) static VALUE m_example;
 2) static VALUE e_ExampleError;
 3)
 4) static VALUE example_method(VALUE klass, VALUE input) {
 5)     VALUE exception;
 6)
 7)     if (TYPE(input) != T_FIXNUM)
 8)         rb_raise(rb_eTypeError, "invalid type for input");
 9)
10)     if (NUM2INT(input) < 0) {
11)         exception=rb_exc_new2(e_ExampleError, "input was < 0");
12)         rb_iv_set(exception, "@additional_info",
13)                   rb_str_new2("additional information"));
14)         rb_exc_raise(exception);
15)     }
16)
17)     return Qnil;
18) }
19)
20) void Init_example() {
21)     m_example = rb_define_module("Example");
22)
23)     e_ExampleError = rb_define_class_under(m_example, "Error",
24)                                            rb_eStandardError);
25)     rb_define_attr(e_ExampleError, "additional_info", 1, 0);
26)
27)     rb_define_module_function(m_example, "method",
28)                               example_method, 1);
29) }
Lines 20 through 29 show the module initialization. Since this is described in more detail elsewhere, I'll only point out line 25, where a custom attribute for the error class e_ExampleError is defined. When an error occurs in the extension module, additional error information can be placed into that attribute, and any caller can look inside of the error object to retrieve that additional information.

Lines 4 through 18 implement an example method that takes one and only one input parameter. Line 7 checks to see if the input value is a number, and if not an exception is raised with rb_raise() on line 8. Line 10 checks to see if the number is less than 0. If it is, then a new exception object of type e_ExampleError is allocated on line 11 with rb_exc_new2(), and the additional_info attribute of the object is set to "additional information" on line 12. As with most other things, the value that additional_info is set to can be any valid ruby object. Line 14 then raises the exception. This example shows very clearly the power of rb_exc_new2() and rb_exc_raise(), in that additional error information can be passed through to callers.
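
From the Ruby side, the extra attribute is read like any other. The following pure-Ruby stand-in sketches what a caller of the extension above would see; it is an approximation rather than the compiled code, and the method is named example_method here (the C code registers it as "method") only to avoid shadowing Object#method in the sketch.

```ruby
# Pure-Ruby stand-in for the extension above (illustrative only).
module Example
  class Error < StandardError
    # Mirrors rb_define_attr(e_ExampleError, "additional_info", 1, 0):
    # readable but not writable
    attr_reader :additional_info
  end

  def self.example_method(input)
    raise TypeError, "invalid type for input" unless input.is_a?(Integer)
    if input < 0
      error = Error.new("input was < 0")                 # like rb_exc_new2
      error.instance_variable_set(:@additional_info,     # like rb_iv_set
                                  "additional information")
      raise error                                        # like rb_exc_raise
    end
    nil
  end
end

begin
  Example.example_method(-5)
rescue Example::Error => e
  puts "#{e.message} (#{e.additional_info})"
end
```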

Handling exceptions

The other half of dealing with exceptions in an extension module is handling exceptions in C code when they are thrown from ruby functions. How is that done since C has no raise/rescue type mechanism? Through callbacks.

There are a few functions that can be used for handling exceptions:
  • rb_ensure(cb, cb_args, ensure, ensure_args) - Call function cb with cb_args. The callback must take in a single VALUE parameter and return VALUE. When cb() finishes, regardless of whether it completes successfully or raises an exception, call ensure with ensure_args. The ensure function must take in a single VALUE parameter and return VALUE
  • rb_protect(cb, cb_args, line_pointer) - Call cb with cb_args. The callback must take in a single VALUE parameter and return VALUE. If an exception is raised by cb(), store the exception handler point in line_pointer and return control. It is then the responsibility of the caller to call rb_jump_tag() to return to the exception point
  • rb_jump_tag(line) - do a longjmp to the line saved by rb_protect(). No code after this statement will be executed
  • rb_rescue(cb, cb_args, rescue, rescue_args) - Call function cb with cb_args. The callback must take in a single VALUE parameter and return VALUE. If cb() raises an exception that is a StandardError (or a subclass of it), rescue is called with rescue_args; other exceptions propagate as usual. The rescue callback should take in two VALUE parameters and return VALUE

Another example should make some of this clear:

 1) static VALUE cb(VALUE args) {
 2)     if (TYPE(args) != T_FIXNUM)
 3)         rb_raise(rb_eTypeError, "expected a number");
 4)     return Qnil;
 5) }
 6)
 7) static VALUE ensure(VALUE args) {
 8)     fprintf(stderr, "Ensure value is %s\n",
 9)             StringValueCStr(args));
10)     return Qnil;
11) }
12)
13) static VALUE rescue(VALUE args, VALUE exception_object) {
14)     fprintf(stderr, "Rescue args %s, object classname %s\n",
15)             StringValueCStr(args),
16)             rb_obj_classname(exception_object));
17)     return Qnil;
18) }
19)
20) VALUE res;
21) int exception;
22)
23) res = rb_ensure(cb, INT2NUM(0), ensure, rb_str_new2("data"));
24) res = rb_ensure(cb, rb_str_new2("bad"), ensure,
25)                 rb_str_new2("data"));
26)
27) res = rb_protect(cb, INT2NUM(0), &exception);
28) res = rb_protect(cb, rb_str_new2("bad"), &exception);
29) if (exception) {
30)     fprintf(stderr, "Failed cb\n");
31)     rb_jump_tag(exception);
32) }
33)
34) res = rb_rescue(cb, INT2NUM(0), rescue, rb_str_new2("data"));
35) res = rb_rescue(cb, rb_str_new2("bad"), rescue,
36)                 rb_str_new2("data"));
Line 23 kicks off the action with a call to rb_ensure(). In this first rb_ensure, we pass a FIXNUM object to cb(), which means that no exception is raised. Because of the rb_ensure(), however, the ensure() callback on lines 7 through 11 is called anyway and does some printing.

Line 24 passes a String object to cb(), which causes cb() to raise an exception. Because of the rb_ensure, the ensure() callback on lines 7 through 11 is called and does some printing. Importantly, after ensure() is called the exception is propagated, so in reality none of the code after the rb_ensure() call on lines 24 and 25 will be executed (we'll ignore this fact for the sake of this example).

Line 27 uses rb_protect() to call the callback; since a FIXNUM object is passed, no exception is raised. Note that if the call that is being wrapped by rb_protect() does not raise an exception, exception is always initialized to 0.

Line 28 uses rb_protect() to call cb() with a String object, which causes an exception to be raised. Because rb_protect() is being used, control will be returned to the calling code at line 29, and that code can then check for the exception. Since an exception was raised, the "exception" integer will have a non-0 number and the code can do whatever we need to clean up and then propagate the exception further with rb_jump_tag() on line 31.

Line 34 uses the rb_rescue() wrapper to call cb(). Since a FIXNUM object is passed to cb(), no exception is raised and no callbacks other than cb() are called.

Line 35 uses rb_rescue() to call cb() with a String object, which causes an exception to be raised and the rescue() callback to be executed. The rescue() callback on lines 13 through 18 takes two arguments: the VALUE initially passed into the rb_rescue() rescue_args, and the exception_object that caused the exception. Based on the exception_object, the rescue() callback can choose to handle this exception or not.
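
It can help to line the three wrappers up against the Ruby constructs they resemble. The sketch below is an analogy in plain Ruby rather than the C API; the with_* method names and the log array are made up for illustration.

```ruby
# Ruby-level analogies for the three C wrappers (illustrative only;
# the method names below are made up for this sketch).

def cb(arg)
  raise TypeError, "expected a number" unless arg.is_a?(Integer)
  nil
end

# Like rb_ensure: the ensure step always runs, then any exception propagates
def with_ensure(arg, log)
  cb(arg)
ensure
  log << "ensure"
end

# Like rb_protect followed by rb_jump_tag: regain control, clean up, re-raise
def with_protect(arg, log)
  cb(arg)
rescue Exception
  log << "cleanup"
  raise
end

# Like rb_rescue: the handler runs and its return value becomes the result
def with_rescue(arg, log)
  cb(arg)
rescue StandardError => e
  log << "rescued #{e.class}"
  :handled
end

log = []
with_ensure(0, log)            # no exception, ensure still runs
begin
  with_ensure("bad", log)      # ensure runs, then TypeError propagates
rescue TypeError
  log << "propagated"
end
with_rescue("bad", log)        # TypeError is handled inside with_rescue
puts log
```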

Example

Before finishing this post, I'll leave you with another example. When writing ruby code, the full begin..rescue block goes something like:

begin
  ...
rescue FooException => e
  ...
rescue
  ...
else
  ...
ensure
  ...
end
How would we implement this in C?

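#include <stdio.h>
#include <string.h>
#include <ruby.h>

/* Handler for "rescue FooException => e" */
static VALUE foo_exception_rescue(VALUE args) {
    fprintf(stderr, "foo_exception_rescue value is %s\n",
            StringValueCStr(args));
    return Qnil;
}

/* Handler for the bare "rescue" */
static VALUE other_exception_rescue(VALUE args) {
    fprintf(stderr, "other_exception_rescue value is %s\n",
            StringValueCStr(args));
    return Qnil;
}

/* Dispatch on the class name of the exception that was raised */
static VALUE rescue(VALUE args, VALUE exception_object) {
    if (strcmp(rb_obj_classname(exception_object),
               "FooException") == 0)
        return foo_exception_rescue(args);
    else
        return other_exception_rescue(args);
}

/* The body of the "begin" block; the work that may raise goes here */
static VALUE body(VALUE args) {
    if (TYPE(args) != T_FIXNUM)
        rb_raise(rb_eTypeError, "expected a number");
    return Qnil;
}

/* Wrap body() in rb_rescue(); passing cb itself here would recurse forever */
static VALUE cb(VALUE args) {
    return rb_rescue(body, args, rescue, rb_str_new2("data"));
}

static VALUE ensure(VALUE args) {
    fprintf(stderr, "Ensure args %s\n", StringValueCStr(args));
    return Qnil;
}

/* The call has to live inside a function, such as a method implementation */
static VALUE run_example(VALUE self) {
    return rb_ensure(cb, INT2NUM(0), ensure, rb_str_new2("data"));
}
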
This example implements almost the entire ability of the ruby begin..rescue blocks. What it does not implement is the "else" clause; I have not yet come up with a good way to do that. If you think of something to make this example work for the "else" clause, please leave a comment.
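
For reference, this is the order in which Ruby itself runs the clauses, which is the behaviour a complete C version would have to reproduce. The sketch is plain Ruby, not a C solution; the trace helper and the FooException definition are assumptions made for the illustration.

```ruby
# Stand-in for the FooException used in the example above (assumed here
# to be a StandardError subclass).
class FooException < StandardError; end

# Returns the clauses that ran, in order, for a given body
def trace
  steps = []
  begin
    yield
    steps << :body_finished
  rescue FooException
    steps << :rescue_foo
  rescue
    steps << :rescue_other
  else
    steps << :else
  ensure
    steps << :ensure
  end
  steps
end

p(trace { nil })                  # body completes normally
p(trace { raise FooException })   # first rescue clause
p(trace { raise "boom" })         # bare rescue clause
```

The else clause runs only when the body raised nothing, and it runs before ensure.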

6 comments:

  1. This series of articles was really helpful -- thanks for sharing!

  2. I see there are many features for handling exceptions.
    However, there's a feature I really need that I can't seem to find: I would like to be able to know at any moment whether I'm already protected by something.

    I work on a project called Rarity. C++ code catches the Ruby exceptions, and right now it always converts them with a bundle of std::exception/ruby exception. I would like to be able to choose whether I should send them through rb_raise or throw them the C++ way. To make that choice, I must know if the portion of code I'm in is under the protection of rb_protect or not.

    It looks like there is no feature for that yet, but I'm asking just to be certain of it.

    Replies
    1. Sorry for not replying sooner; I totally missed your post back in May!

      I'm not aware of a standard way to detect whether you are in rb_protect() (that doesn't mean it is impossible, just that I don't see one). I hope you are able to find something!

  3. This blog is awesome and I learn a lot about programming from here. The best thing about this blog is that you go from beginner to expert level.

    Love from

  4. While it may look like something, it fails my usage.
    Everything thus far was fine. It was not good to introduce something that is out of scope of what we thought was a good project. After all what good is it if you can't use it.

    make
    compiling example.c
    example.c:89:1: warning: data definition has no type or storage class [enabled by default]
    example.c:89:1: warning: type defaults to 'int' in declaration of 'res' [-Wimplicit-int]
    example.c:89:1: error: conflicting types for 'res'
    example.c:87:7: note: previous declaration of 'res' was here
    example.c:89:41: error: braced-group within expression allowed only inside a function
    make: *** [example.o] Error 1
    rake aborted!

    Replies
    1. I figured out where to put things so that it now compiles without warnings or errors. I didn't mean to complain about such a wonderful set of lessons but I didn't think we needed that much experience in the first place. Just my bad assumption I guess. Thanks again. I am learning a lot and this is one of the few that really does hold your hand.
