Wednesday, January 5, 2011

Writing Ruby Extensions in C - Part 2, RDoc

This is the second in my series of posts about writing Ruby extensions in C. The first post talked about the basic structure of a project, including how to set up the build. This post focuses on documentation generation.

RDoc and ri

RDoc is the documentation generation system for Ruby. The general idea is that the source code is marked up with specially formatted comments, and then the rdoc tool is run against the source to generate the documentation. The output is HTML documentation, ri documentation, or both. Generating RDoc documentation is a simple matter of:
  1. Annotating the source code with the appropriate tags. The basic form of an RDoc tag is:
     /*
      * call-seq:
      *   obj.method(required, optional=0) -> retval
      *
      * Call +wrappedLibraryFunction+[]
      * to execute wrappedLibraryFunction.  This method takes a
      * single required argument, and one optional argument that
      * defaults to 0 if not specified.  It returns retval, which
      * can be any valid ruby object
      */

    Most of my own knowledge about RDoc syntax comes from [1]; it is highly recommended reading. For more real-world examples of markup, please look at the ruby-libvirt bindings[3]; all of the methods are properly marked up for RDoc.
  2. Adding appropriate task(s) to the Rakefile. This is very easy as rake has pre-defined tasks for generating RDoc documentation:
    1) require 'rake/rdoctask'
    3) RDOC_FILES = FileList["README.rdoc", "ext/example.c"]
    5) Rake::RDocTask.new do |rd|
    6)     rd.main = "README.rdoc"
    7)     rd.rdoc_dir = "doc/site/api"
    8)     rd.rdoc_files.include(RDOC_FILES)
    9) end
    11) Rake::RDocTask.new(:ri) do |rd|
    12)     rd.main = "README.rdoc"
    13)     rd.rdoc_dir = "doc/ri"
    14)     rd.options << "--ri-system"
    15)     rd.rdoc_files.include(RDOC_FILES)
    16) end

    Line 1 pulls in the rake rdoctask that does most of the work for us. Line 3 defines the files that will be scanned when generating the rdoc. Note that the order of files is important; if there are dependencies between C files, the earlier dependencies must be listed first. Lines 5 through 9 define the main rdoc task. By default Rake::RDocTask creates a task called "rdoc", so no name needs to be supplied. The "main" attribute specifies which file provides the top-level documentation. The "rdoc_dir" attribute specifies where the output will go. The "rdoc_files" attribute specifies which files to look at; here we point it at the list defined on line 3. With this task in place, we can now execute:
    $ rake rdoc

    at the command line and the rdoc files will be generated from the C files and placed in doc/site/api. Lines 11 through 16 look very similar to the previous rdoc task, with a couple of differences. First, since we supply a symbol to the method, we get a task named "ri" instead of "rdoc". Second, we specify an option on line 14 that tells rdoc to generate the ri documentation instead of the HTML rdoc documentation. Execution is again easy:
    $ rake ri

    This will generate the ri documentation from the C files and place the output in doc/ri.
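The Rakefile above uses rake/rdoctask, which was current when this post was written. Newer Rake releases no longer ship that file, and RDoc provides its own task class instead. Here is a minimal sketch of the same two tasks under that assumption. The rdoc/task require and the RDoc::Task class name are not from the original Rakefile; the attributes and the file list mirror it.

```ruby
# Sketch only: assumes a Rake/RDoc combination where RDoc::Task
# (from 'rdoc/task') replaces the older Rake::RDocTask.
require 'rake'
require 'rdoc/task'

# Same file list as before; order still matters for C sources.
RDOC_FILES = FileList["README.rdoc", "ext/example.c"]

# With no argument the task is named "rdoc" (HTML output).
RDoc::Task.new do |rd|
  rd.main = "README.rdoc"
  rd.rdoc_dir = "doc/site/api"
  rd.rdoc_files.include(RDOC_FILES)
end

# Passing a symbol names the task; here "ri", with the same
# option the original Rakefile uses to request ri output.
RDoc::Task.new(:ri) do |rd|
  rd.main = "README.rdoc"
  rd.rdoc_dir = "doc/ri"
  rd.options << "--ri-system"
  rd.rdoc_files.include(RDOC_FILES)
end
```

The tasks are invoked the same way as before, with "rake rdoc" and "rake ri".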
While the idea behind RDoc is very cool, the actual implementation is a little bit weak for C extensions. RDoc just cannot handle several common C idioms:
  • Using a macro to define constants - I used to have code like:
    #define DEF_DOMSTATE(name) rb_define_const(c_domain, #name, INT2NUM(VIR_DOMAIN_##name))

    in ruby-libvirt. This was nice because I didn't have to repeat the name on every definition line. Since RDoc couldn't handle the macro, I had to remove all of these to get proper RDoc documentation.
  • Classes and methods split across multiple files - this one is an absolute deal-breaker for me. ruby-libvirt consists of around 7500 lines of C code, and having all of that in one file is just not feasible. Instead I have the code split along functional lines, which makes maintenance much easier. However, RDoc as of Ruby 1.8.7 cannot follow the dependencies across different files, and hence almost none of my documentation was being generated. Luckily I found a patch[2] that makes RDoc smart enough to work across different files, but it sucks because I have to continually patch my local Ruby version. Maybe 1.9 fixes this in a better way; the RDoc parser seems to have been completely rewritten, so there is hope on that front.
  • Having methods for a class defined in a different file - this one isn't a C idiom as such, but it seems like a simple thing. Given the nature of the ruby-libvirt bindings, I used to have all of the methods concerning a particular class (say, Libvirt::Network) in the same file. That included the lookup and definition methods, which are technically methods of class Libvirt::Connect (e.g. network = conn.lookup_network_by_name('netname')). However, RDoc also cannot handle this, so I was missing the RDoc documentation for all of the lookup and definition methods. I've now changed this to have all of the lookup and definition methods in the connect.c file, but it clutters that file unnecessarily. Again, maybe the Ruby 1.9 rewrite of RDoc fixes this.
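To illustrate the first limitation: each DEF_DOMSTATE(name) use has to become an explicit rb_define_const call that RDoc can read as plain text. The Ruby sketch below only prints what those explicit lines look like for a few state names. The short list of names and the helper explicit_const_line are hypothetical, chosen for illustration; this is not how ruby-libvirt itself produces its code.

```ruby
# Hypothetical subset of domain state names, for illustration only.
STATE_NAMES = %w[NOSTATE RUNNING BLOCKED]

# Build the explicit C line that replaces DEF_DOMSTATE(name).
# The macro's #name becomes a string literal and
# VIR_DOMAIN_##name becomes the pasted identifier.
def explicit_const_line(name)
  %(rb_define_const(c_domain, "#{name}", INT2NUM(VIR_DOMAIN_#{name}));)
end

lines = STATE_NAMES.map { |name| explicit_const_line(name) }
puts lines
```

The first printed line is rb_define_const(c_domain, "NOSTATE", INT2NUM(VIR_DOMAIN_NOSTATE)); which is the repetitive form that RDoc can parse.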
That being said, RDoc is the canonical Ruby way to generate documentation, so whatever limitations it has must be worked around. The above is just a list of problems that I have come across that need workarounds in order to properly generate RDoc documentation.

Update: edited to make the example RDoc tagging readable
Update: edited to make the references readable
Update: edited to fix up minor formatting problem