Showing posts with label C. Show all posts

Wednesday, January 15, 2014

Developing STM32 microcontroller code on Linux (Part 7 of 8, building and running a simple STM32 program)

The first post of this series covered the steps to build and run code for the STM32. The second post covered how to build a cross-compiler for the STM32. The third post covered how to build a debugger for the STM32. The fourth post covered building and configuring OpenOCD for your development environment. The fifth post covered building the device library, libopencm3. The sixth post covered linker scripts and command-line options necessary for building and linking programs to run on the STM32. This post will cover building and running a program on the STM32.

In the previous posts we dealt with all of the setup necessary to build programs for the STM32. It is finally time to take advantage of all of those tools and build and run something. Recall from previous posts that we already have an OpenOCD configuration file, a linker script, and a Makefile set up. All that really remains is for us to write the code, build it, and flash it to our device. The code below is very STM32F3DISCOVERY specific; that is, it requires that the LED be on GPIO bank E, pin 12 on the board. If you have one of the other STM32 DISCOVERY boards, you'll need to look at the schematics and find one of the GPIOs that is hooked to an LED.

We are going to take an extremely simple example from libopencm3. This example does nothing more than blink one of the LEDs on the board on and off continuously. While this is simple, it will validate that everything that we've done before is actually correct.

Here is the code:

$ cd ~/stm32-project
$ cat <<EOF > tut.c
#include <libopencm3/stm32/rcc.h>
#include <libopencm3/stm32/gpio.h>

static void gpio_setup(void)
{
        /* Enable GPIOE clock. */
        rcc_peripheral_enable_clock(&RCC_AHBENR, RCC_AHBENR_IOPEEN);

        /* Set GPIO12 (in GPIO port E) to 'output push-pull'. */
        gpio_mode_setup(GPIOE, GPIO_MODE_OUTPUT, GPIO_PUPD_NONE,
                        GPIO12);
}

int main(void)
{
        int i;

        gpio_setup();

        /* Blink the LED (PE12) on the board. */
        while (1) {
                /* Using API function gpio_toggle(): */
                gpio_toggle(GPIOE, GPIO12);     /* LED on/off */
                for (i = 0; i < 2000000; i++) /* Wait a bit. */
                        __asm__("nop");
        }

        return 0;
}
EOF
You should now be able to type "make" and have the program build. Typing "make flash" should run OpenOCD, write the program to the board's flash, and start blinking an LED. Remember that our Makefile uses sudo to run openocd. If you don't have sudo access, you can either add it (by adding your user to the wheel group), or just su to root and run the openocd command by hand.
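To make the toggle in the main loop a little more concrete, here is a minimal host-side sketch of the idea, assuming a plain variable stands in for the port's output data register. The names fake_odr, PIN12, and sketch_toggle are made up for illustration; this is not libopencm3's implementation, which writes to the real GPIO registers.

```c
#include <stdint.h>

/* Hypothetical stand-in for a GPIO port's output data register.
 * On the real chip this would be a memory-mapped hardware register. */
uint16_t fake_odr = 0;

/* Bit mask for pin 12 (illustrative; mirrors the idea behind GPIO12). */
#define PIN12 ((uint16_t)(1u << 12))

/* Flip the given pins: set bits become clear and clear bits become set. */
void sketch_toggle(uint16_t pins)
{
        fake_odr ^= pins;
}
```

Calling sketch_toggle(PIN12) once sets bit 12 (LED on), and calling it again clears it (LED off), which is why the loop in tut.c only needs a single call per iteration.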

Monday, January 13, 2014

Developing STM32 microcontroller code on Linux (Part 6 of 8, building and linking STM32 programs)

The first post of this series covered the steps to build and run code for the STM32. The second post covered how to build a cross-compiler for the STM32. The third post covered how to build a debugger for the STM32. The fourth post covered building and configuring OpenOCD for your development environment. The fifth post covered building the device library, libopencm3. This post will cover linker scripts and command-line options necessary for building and linking programs to run on the STM32.

Once we have all of the previous steps done, we are achingly close to being able to build and run code on our target STM32 processor. However, there is one more set of low-level details that we have to understand before we can get there. Those details revolve around how our C code gets turned into machine code, and how that code is laid out in memory.

As you may know, compiling code to run on a target is roughly a two-step process:
  1. Turn C/C++ code into machine code the target processor understands. The output of this step is a set of object files.
  2. Take the object files and link them together to form a coherent binary. The output of this step is generally an ELF file.
Let's talk about these two steps in more detail.

Compile step

During compilation, the compiler parses the C/C++ code and turns it into an object file. A little more concretely, what we want to have our cross-compiler do is to take our C code, turn it into ARM instructions that can run on the STM32, and then output that into object files.

To do this, we use our cross-compiler. As with any version of gcc, there are many flags that can be passed to our cross-compiler, and they can have many effects on the code that is output. What I'm going to present here is a set of flags that I've found works pretty well. This isn't necessarily optimal in any dimension, but will at least serve as a starting point for our code. I'll also point out that this is where we start to get into the differences between the various STM32F* processors. For instance, the STM32F4 has an FPU, while the Cortex-M3-based STM32F1 does not. This will affect the flags that we will pass to the compiler.

For the STM32F3 that I am using, here are the compiler flags I ended up with: -Wall -Wextra -Wimplicit-function-declaration -Wredundant-decls -Wstrict-prototypes -Wundef -Wshadow -g -fno-common -mcpu=cortex-m3 -mthumb -mfloat-abi=hard -MD
Let's go through each of them. The -W* flags tell the compiler to generate compile-time warnings for several classes of common errors. I find that enabling these warnings and getting rid of them usually makes the code much better. The -g flag tells the compiler to include debugging symbols in the binary; this makes the code easier to debug, at the expense of a larger ELF file. The -fno-common flag tells gcc to place uninitialized global variables in the object file's own data (BSS) section rather than in a shared "common" block, so an accidental duplicate definition in two files becomes a link error instead of being silently merged. The -mcpu=cortex-m3 flag tells the compiler to generate code for the Cortex-M3. (The STM32F3 is actually built around a Cortex-M4 core, which also runs Cortex-M3 code, so these flags worked for me; -mcpu=cortex-m4 would let the compiler use the extra instructions.) The -mthumb flag tells gcc to generate Thumb code, which is more compact than full ARM code and is the only instruction set the Cortex-M cores execute. The -mfloat-abi=hard flag tells gcc that we want to use the hard-float ABI, in which floating-point arguments are passed in FPU registers; it matters little for code that does no floating-point math, but it has to match the ABI of the libraries we link against. Finally, the -MD flag tells gcc to generate dependency files while compiling, which is useful for Makefiles.

Linking step

Once all of the individual files have been compiled, they are put together into the final binary by the linker. This is more complicated when targeting an embedded platform vs. a regular program. In particular, we have to tell the linker not only which files to link together, but also how to lay the resulting binary out on flash and in memory.

We'll start by talking about the flags that we need to pass to the linker to make this work. Here is the set of flags we are going to start with: --static -lc -lnosys -T tut.ld -nostartfiles -Wl,--gc-sections -mcpu=cortex-m3 -mthumb -mfloat-abi=hard -lm -Wl,-Map=tut.map
Again, let's go through each of them. The --static flag tells the linker to link a static, not a dynamically linked, binary. This flag probably isn't strictly necessary in this case, but we add it anyway. The -lc flag tells the linker to link this binary against the C library, which is newlib in our case. That gives us access to various convenient functions, such as printf(), scanf(), etc. The -lnosys flag tells the linker to link this binary against the "nosys" library. Several of the convenience functions in the C library require underlying implementations of certain functions to operate, such as _write() for printf(). Since we don't have a POSIX operating system that can provide these for us, the nosys library provides empty stub functions for these. If we want, we can later on define our own versions of these stub functions that will get used instead. The -T tut.ld flag tells the linker to use tut.ld as the linker script; we'll talk more about linker scripts below. The -nostartfiles flag tells the linker not to use standard system startup files. Since we don't have an OS here, we can't rely on the standard OS utilities to start our program up. The -Wl,--gc-sections flag tells the linker to garbage collect unused sections. That is, any sections that are not referenced are removed from the resulting binary, which can shrink the binary. The -mcpu=cortex-m3, -mthumb, and -mfloat-abi=hard flags have the same meaning as for the compile flags. The -lm flag tells the linker to link this binary against the math library. It isn't strictly required for our little programs, but most programs want it sooner or later. Finally, the -Wl,-Map=tut.map tells the linker to generate a map file and stick it into tut.map. The map file is helpful for debugging, but is informational only.
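As a rough illustration of replacing one of the nosys stubs, the sketch below defines _write() with the signature newlib expects and forwards each byte to an output routine. The routine sketch_putc and its capture buffer are hypothetical stand-ins so that the example can run on a development machine; on the board you would send the byte to a UART or similar instead.

```c
#include <stddef.h>

/* Hypothetical output sink: on real hardware this would push one byte to a
 * UART; here it just records the byte so the sketch can run anywhere. */
char sketch_out[64];
size_t sketch_out_len = 0;

void sketch_putc(char c)
{
        if (sketch_out_len < sizeof(sketch_out))
                sketch_out[sketch_out_len++] = c;
}

/* newlib calls _write() underneath printf() and friends.  Because the
 * linker only pulls a symbol from a library when nothing else defines it,
 * a definition in our own code is used instead of the nosys stub. */
int _write(int file, char *ptr, int len)
{
        int i;

        (void)file;             /* treat every descriptor the same way */
        for (i = 0; i < len; i++)
                sketch_putc(ptr[i]);
        return len;             /* report that all bytes were consumed */
}
```

With something like this in place, a printf() call in the program ends up in sketch_putc() one character at a time, which is the hook you would point at real hardware.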

Linker script

As mentioned before, the linker script tells the linker how to lay out the resulting binary in memory. This script is highly chip specific. The details have to do with where the processor jumps to on reset, and where it expects certain things to be. Note that most chips are actually configurable (based on some jumper settings), so where it jumps to on reset can change. Luckily, for most off-the-shelf STM32 designs, including the DISCOVERY boards, it is always configured to expect the code to start out in flash. Therefore, the linker script tells the linker to lay out the code in flash, but to put the data and bss in RAM.
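To make "where the processor jumps to on reset" concrete: on Cortex-M parts the image placed at the start of flash begins with a vector table, whose first 32-bit word is the initial stack pointer and whose second word is the address of the reset handler. The sketch below decodes those two words from the first bytes of a binary image. The function and struct names are invented for illustration, and the sample bytes in the note that follows are made up rather than taken from a real build.

```c
#include <stdint.h>

/* The two vector table entries the core reads on reset (illustrative type). */
struct sketch_reset_info {
        uint32_t initial_sp;    /* word 0: initial stack pointer */
        uint32_t reset_handler; /* word 1: reset handler address */
};

/* Read a 32-bit little-endian value from a byte buffer. */
uint32_t sketch_read_le32(const uint8_t *p)
{
        return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
               ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode the first 8 bytes of an image laid out for the start of flash. */
struct sketch_reset_info sketch_decode_reset(const uint8_t *image)
{
        struct sketch_reset_info info;

        info.initial_sp = sketch_read_le32(image);
        info.reset_handler = sketch_read_le32(image + 4);
        return info;
}
```

For example, an image starting with the bytes 00 a0 00 20 01 01 00 08 would decode to an initial stack pointer of 0x2000a000 and a reset handler at 0x08000101; the low bit of the handler address is set because the handler is Thumb code.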

With all that said, libopencm3 actually makes this easy on you. They have default linker scripts for each of the chips that are supported. All you really need to do is to fill in a small linker script with the RAM and FLASH size of your chip, include the default libopencm3 one, and away you go.

So we are going to put all of the above together and write a Makefile and a linker script into the project directory we created in the last tutorial. Neither of these is necessarily the best example of what to do, but they will get the job done. First the Makefile:

$ cd ~/stm32-project
$ cat <<EOF > Makefile
CC=arm-none-eabi-gcc
LD=\$(CC)
OBJCOPY=arm-none-eabi-objcopy
OPENOCD=~/opt/cross/bin/openocd
CFLAGS=-Wall -Wextra -Wimplicit-function-declaration -Wredundant-decls -Wstrict-prototypes -Wundef -Wshadow -g -fno-common -mcpu=cortex-m3 -mthumb -mfloat-abi=hard -MD -DSTM32F3
LDFLAGS=--static -lc -lnosys -T tut.ld -nostartfiles -Wl,--gc-sections -mcpu=cortex-m3 -mthumb -mfloat-abi=hard -lm -Wl,-Map=tut.map
OBJS=tut.o

all: tut.bin

tut.bin: tut.elf
$( echo -e "\t" )\$(OBJCOPY) -Obinary tut.elf tut.bin

tut.elf: \$(OBJS)
$( echo -e "\t" )\$(LD) -o tut.elf \$(OBJS) ~/opt/cross/arm-none-eabi/lib/libopencm3_stm32f3.a \$(LDFLAGS)

flash: tut.bin
$( echo -e "\t" )sudo \$(OPENOCD) -f stm32-openocd.cfg -c "init" -c "reset init" -c "flash write_image erase tut.bin 0x08000000" -c "reset run" -c "shutdown"

clean:
$( echo -e "\t")rm -f *.elf *.bin *.list *.map *.o *.d *~
EOF
You should notice a couple of things in the Makefile. First, we use all of the compiler and linker flags that we talked about earlier. Second, our only object ($(OBJS)) is tut.o, which make builds from tut.c using its built-in rule; we'll create tut.c in the next post. And third, we have a flash target that will build the project and flash it onto the target processor. This requires the OpenOCD configuration file that we created a couple of posts ago.

Now the linker script:

$ cat <<EOF > tut.ld
MEMORY
{
        rom (rx) : ORIGIN = 0x08000000, LENGTH = 256K
        ram (rwx) : ORIGIN = 0x20000000, LENGTH = 40K
}

/* Include the common ld script. */
INCLUDE libopencm3_stm32f3.ld
EOF
You'll notice that there isn't a lot here. We just have to define the RAM location and size, and the ROM (Flash) location and size, and the default libopencm3 linker script will take care of the rest.
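As a quick sanity check on the numbers in the MEMORY block, here is a small sketch that computes where each region ends and tests whether an address falls inside it. The origins and lengths are copied from tut.ld above; the type and function names are hypothetical.

```c
#include <stdint.h>

/* One region from the MEMORY block of a linker script (illustrative type). */
struct sketch_region {
        const char *name;
        uint32_t origin;
        uint32_t length;
};

/* Values copied from tut.ld: 256K of flash and 40K of RAM. */
const struct sketch_region sketch_rom = { "rom", 0x08000000u, 256u * 1024u };
const struct sketch_region sketch_ram = { "ram", 0x20000000u, 40u * 1024u };

/* First address past the end of the region. */
uint32_t sketch_region_end(const struct sketch_region *r)
{
        return r->origin + r->length;
}

/* Return 1 if addr lies inside the region, 0 otherwise. */
int sketch_region_contains(const struct sketch_region *r, uint32_t addr)
{
        return addr >= r->origin && addr < sketch_region_end(r);
}
```

This can be handy when reading tut.map: code addresses should fall inside rom (0x08000000 up to, but not including, 0x08040000), and the run-time addresses of data and bss inside ram (0x20000000 up to, but not including, 0x2000a000).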

We now have all of the parts in place. The next post will write, compile, and run a simple program on the board.

Friday, January 10, 2014

Developing STM32 microcontroller code on Linux (Part 5 of 8, building libopencm3)

The first post of this series covered the steps to build and run code for the STM32. The second post covered how to build a cross-compiler for the STM32. The third post covered how to build a debugger for the STM32. The fourth post covered building and configuring OpenOCD for your development environment. This post will cover building the device library, libopencm3.

As mentioned in the introductory post, it makes our life a lot easier if we use a device library. This is a library that abstracts the low-level details of the hardware registers away from us, and gives us a nice consistent API to use. While ST provides one of these directly, it is not open-source (or more specifically, its open-source status is murky). Luckily there is libopencm3, an open-source re-implementation that is also a better library in my opinion. As usual, I'm going to compile a certain version of libopencm3; newer or older versions may or may not work better for you.

As before, we start out by exporting some environment variables:

$ export TOPDIR=~/cross-src
$ export TARGET=arm-none-eabi
$ export PREFIX=~/opt/cross
$ export BUILDPROCS=$( getconf _NPROCESSORS_ONLN )
$ export PATH=$PREFIX/bin:$PATH
The TOPDIR environment variable is the directory in which the sources are stored. The TARGET environment variable is the architecture that we want our compiler to emit code for. For ARM chips without an operating system (like the STM32), we want arm-none-eabi. The PREFIX environment variable is the location we want our cross-compile tools to end up in; feel free to change this to something more suitable. The BUILDPROCS environment variable is the number of processors that we can use; we will use all of them while building to substantially speed up the build process. Finally, we need to add the location of the cross-compile binaries to our PATH so that later building stages can find it.

Now that we have our environment set up, we can get the code. Note that unlike most of the other tools covered in this tutorial, libopencm3 does not do releases. They expect (more specifically, require) that you clone the latest version and use that. That's what we are going to do here. As of this writing, the latest libopencm3 commit hash is a909b5ca9e18f802e3caef19e63d38861662c128. Since the libopencm3 developers don't guarantee API stability, all of the steps below will assume the API as of that commit. If you decide to use a newer version of libopencm3, you may have to update the example code I give you to conform to the new API. With that out of the way, let's get it:

$ sudo yum install git
$ cd $TOPDIR
$ git clone https://github.com/libopencm3/libopencm3.git
$ cd libopencm3
$ git checkout -b clalancette-tutorial \
a909b5ca9e18f802e3caef19e63d38861662c128
What we've done here is to clone the repository, then check out a new branch with the head at commit a909b5ca9e18f802e3caef19e63d38861662c128. This ensures that even if the library moves forward in the future, we will always use that commit for the purposes of this tutorial. Next we build the library:

$ unset PREFIX
$ make DETECT_TOOLCHAIN=1
$ make DETECT_TOOLCHAIN=1 install
$ export PREFIX=~/opt/cross
Here we need to unset PREFIX because libopencm3 uses PREFIX for the toolchain name prefix (arm-none-eabi), not the path prefix. Once we've done that, we can tell libopencm3 to detect the toolchain, and then use it to build libopencm3. Finally we use the install target to install the headers and the static libraries (.a files) to our toolchain. Assuming this is successful, everything necessary should be in ~/opt/cross/arm-none-eabi/, with the libraries in lib/libopencm3* and the header files in include/libopencm3. Note that there is one .a file per chip that is supported by libopencm3; we'll return to this later when we start building code for our chip.

Thursday, January 9, 2014

Developing STM32 microcontroller code on Linux (Part 4 of 8, building openocd)

The first post of this series covered the steps to build and run code for the STM32. The second post covered how to build a cross-compiler for the STM32. The third post covered how to build a debugger for the STM32. This post is going to cover building OpenOCD for your development environment.

As mentioned in the introductory post, we need OpenOCD so we can take binaries that we build and upload them onto the STM32. OpenOCD is a highly configurable tool and understands a number of different protocols. For our purposes, we really only need it to understand STLinkV2, which is what the STM32 uses. Also note that unlike previous posts, this post does not need or build a cross-compiled tool. That's because OpenOCD itself runs on our development machine, so we just need to do a normal compile. As before, I'm going to compile a certain version of OpenOCD (0.7.0). Newer or older versions may work, but your mileage may vary.

As before, we start out by exporting some environment variables:

$ export TOPDIR=~/cross-src
$ export TARGET=arm-none-eabi
$ export PREFIX=~/opt/cross
$ export BUILDPROCS=$( getconf _NPROCESSORS_ONLN )
$ export PATH=$PREFIX/bin:$PATH
The TOPDIR environment variable is the directory in which the sources are stored. The TARGET environment variable is the architecture that we want our compiler to emit code for. For ARM chips without an operating system (like the STM32), we want arm-none-eabi. The PREFIX environment variable is the location we want our cross-compile tools to end up in; feel free to change this to something more suitable. The BUILDPROCS environment variable is the number of processors that we can use; we will use all of them while building to substantially speed up the build process. Finally, we need to add the location of the cross-compile binaries to our PATH so that later building stages can find it.

Now we are ready to start. Let's fetch openocd:

$ cd $TOPDIR
$ wget http://downloads.sourceforge.net/project/openocd/\
openocd/0.7.0/openocd-0.7.0.tar.gz
To start the compile, we first need to install a dependency:

$ sudo yum install libusbx-devel
Now let's unpack and build openocd:

$ tar -xvf openocd-0.7.0.tar.gz
$ cd openocd-0.7.0
$ ./configure --enable-stlink --prefix=$PREFIX
$ make
$ make install
Here we are unpacking, configuring, building, and installing OpenOCD. The configure flags require a bit of explanation. The --enable-stlink flag means to enable support for STLink and STLinkV2, which is what we need for this board. The --prefix flag tells the build system to install OpenOCD to our ~/opt/cross location. This isn't strictly correct; this isn't a cross compile tool. However, it is convenient to have everything in one place, so we install it there.

Assuming everything went properly, we should now have an openocd binary in ~/opt/cross/bin. There will also be a bunch of configuration files installed to ~/opt/cross/share/openocd. These are important, as they are pre-canned configuration files provided by OpenOCD. While it is possible to create your own from scratch, the syntax is baroque and it is a lot more work than you would think. Luckily OpenOCD already comes with scripts for STLinkV2 and STM32, so we'll just use those.

In order to have a working configuration, we are going to start creating our "project" directory. This is where the code that eventually runs on the STM32 is going to be placed. I'm going to call my directory ~/stm32-project; feel free to change it for your project. So we do:

$ mkdir ~/stm32-project
$ cd ~/stm32-project
$ cat <<EOF > stm32-openocd.cfg
source [find interface/stlink-v2.cfg]
source [find target/stm32f3x_stlink.cfg]
reset_config srst_only srst_nogate
EOF
Here we create the project directory, cd into it, and then create the configuration file for OpenOCD. The configuration file deserves a bit of explanation. First, we tell it to "find" the stlink-v2.cfg configuration file. Where it looks depends on the PREFIX we configured, so in our case it is going to look through ~/opt/cross/share/openocd for that file (where it should find it). Next we tell OpenOCD to "find" the stm32f3x_stlink.cfg file. Again, that file is located in ~/opt/cross/share/openocd, and it again should find it. Note that if you have a different STM32 chip, you should substitute f3x with whatever version of the chip you have. Finally there is the reset_config line. In OpenOCD's terms, srst_only says that only the system reset (SRST) signal is available, not the JTAG TAP reset (TRST), and srst_nogate says that the debug interface can still be used while SRST is asserted. These settings seem to be necessary.

That's it for OpenOCD. Everything should be built, configured, and ready to go.

Wednesday, January 8, 2014

Release of ruby-libvirt 0.5.2

This is a release notification for ruby-libvirt 0.5.2. ruby-libvirt is a ruby wrapper around the libvirt API. The changelog between 0.5.1 and 0.5.2 is:
  • Fix to make sure we don't free more entries than retrieved (potential crash)
Version 0.5.2 is available from http://libvirt.org/ruby:

Tarball: http://libvirt.org/ruby/download/ruby-libvirt-0.5.2.tgz
Gem: http://libvirt.org/ruby/download/ruby-libvirt-0.5.2.gem

It is also available from rubygems.org; to get the latest version, run:

$ gem install ruby-libvirt

As usual, if you run into questions, problems, or bugs, please feel free to mail me (clalancette@gmail com) and the libvirt mailing list.

Thanks to Guido Günther for the patch to fix this problem.

Developing STM32 microcontroller code on Linux (Part 3 of 8, building gdb)

The first post of this series covered the steps to build and run code for the STM32. The second post covered how to build a cross-compiler for the STM32. This post is going to cover how to build a debugger for the STM32.

Building a debugger isn't strictly necessary for developing on the STM32. However it can make certain debugging tasks easier, and it is relatively simple to do, so we'll do it here. As with the tools in the last post, the version of gdb used (7.6) worked for me. Your mileage may vary. If you fail to cross-compile gdb, then try a slightly newer or older version and try again on your development setup. If you can't build gdb, you can safely skip this step, though you may run into some problems later.

To build gdb, we'll assume you installed the tools to the path from the last post. If you changed the path, you'll have to edit the PREFIX path below.

As before, we start out by exporting some environment variables:

$ export TOPDIR=~/cross-src
$ export TARGET=arm-none-eabi
$ export PREFIX=~/opt/cross
$ export BUILDPROCS=$( getconf _NPROCESSORS_ONLN )
$ export PATH=$PREFIX/bin:$PATH
The TOPDIR environment variable is the directory in which the sources are stored. The TARGET environment variable is the architecture that we want our compiler to emit code for. For ARM chips without an operating system (like the STM32), we want arm-none-eabi. The PREFIX environment variable is the location we want our cross-compile tools to end up in; feel free to change this to something more suitable. The BUILDPROCS environment variable is the number of processors that we can use; we will use all of them while building to substantially speed up the build process. Finally, we need to add the location of the cross-compile binaries to our PATH so that later building stages can find it.

Next we'll download, unpack, and build gdb:

$ cd $TOPDIR
$ wget ftp://ftp.gnu.org/gnu/gdb/gdb-7.6.tar.gz
$ tar -xvf gdb-7.6.tar.gz
$ mkdir build-gdb
$ cd build-gdb
$ ../gdb-7.6/configure --target=$TARGET --prefix=$PREFIX \
--enable-interwork
$ make -j$BUILDPROCS
$ make install
We download gdb, unpack it, then configure and build it. The flags to configure deserve some explanation. The --target flag tells gdb what target you want the debugger to be built for; that is, what kind of code it will understand. In our case, we want ARM with no operating system. The --prefix flag tells gdb that we want our debugger to be installed to $PREFIX. The --enable-interwork flag is carried over from the binutils build, where it allows a combination of ARM and THUMB code; if you don't know what that is, don't worry about it for now. Assuming this step went fine on your development machine, there should be a binary in ~/opt/cross/bin (or whatever your top-level output directory is) called arm-none-eabi-gdb.

Tuesday, January 7, 2014

Developing STM32 microcontroller code on Linux (Part 2 of 8, building the cross-compiler)

The first post of this series covered the steps to build and run code for the STM32. This post is going to cover how to build a cross-compiler for the STM32.

The steps to build a cross-compiler are somewhat covered here and here. In theory, building a cross-compiler is a pretty straightforward process:
  1. Cross compile binutils, to get things like as (assembler), ld (linker), nm (list object symbols), etc.
  2. Cross compile gcc, which gives you a C and C++ compiler.
  3. Cross compile newlib, which gives you a minimal libc-like environment to program in.
However, there is a big gotcha. Not all combinations of binutils, gcc, and newlib work together. Worse, not all combinations of them build on all development environments, which can make this something of a frustrating experience. For instance, it is known that binutils < 2.24 does not build on machines with texinfo 5.x or later. Thus, on modern machines (like Fedora 19), you must use binutils 2.24 or later. Also, I found that the latest newlib of this writing (2.1.0) does not build on Fedora 19. Your mileage may vary, and this will almost certainly change in the future; the best advice I can give is to start with the latest versions of the packages and then slowly back off the ones that fail until you get a relatively recent combination that works. For the purposes of this post, I ended up using binutils 2.24, gcc 4.8.2, and newlib 2.0.0. This combination builds just fine on Fedora 19.

Now onto the steps needed to build the cross compiling environment. We first need to make sure certain tools are installed. We'll install the development tools through yum:

$ sudo yum install gcc make tar wget bzip2 gmp-devel \
mpfr-devel libmpc-devel gcc-c++ texinfo ncurses-devel
Next we fetch the relevant versions of the packages:

$ mkdir ~/cross-src
$ cd ~/cross-src
$ wget ftp://ftp.gnu.org/gnu/binutils/binutils-2.24.tar.gz
$ wget ftp://ftp.gnu.org/gnu/gcc/gcc-4.8.2/gcc-4.8.2.tar.bz2
$ wget ftp://sources.redhat.com/pub/newlib/newlib-2.0.0.tar.gz
Next we set some environment variables. This isn't strictly necessary, but will help us reduce errors in the following steps:

$ export TOPDIR=~/cross-src
$ export TARGET=arm-none-eabi
$ export PREFIX=~/opt/cross
$ export BUILDPROCS=$( getconf _NPROCESSORS_ONLN )
$ export PATH=$PREFIX/bin:$PATH
The TOPDIR environment variable is the directory in which the sources are stored. The TARGET environment variable is the architecture that we want our compiler to emit code for. For ARM chips without an operating system (like the STM32), we want arm-none-eabi. The PREFIX environment variable is the location we want our cross-compile tools to end up in; feel free to change this to something more suitable. The BUILDPROCS environment variable is the number of processors that we can use; we will use all of them while building to substantially speed up the build process. Finally, we need to add the location of the cross-compile binaries to our PATH so that later building stages can find it.

Now we can start building. We first need to build binutils:

$ cd $TOPDIR
$ tar -xvf binutils-2.24.tar.gz
$ mkdir build-binutils
$ cd build-binutils
$ ../binutils-2.24/configure --target=$TARGET --prefix=$PREFIX \
--enable-interwork --disable-nls
$ make -j$BUILDPROCS
$ make install
Basically we are unpacking binutils, doing an out-of-tree build (recommended), and then installing it. The flags to configure deserve some explanation. The --target flag tells binutils what target you want the tools to build for; that is, what kind of code will be emitted by the tools. In our case, we want ARM with no operating system. The --prefix flag tells binutils that we want our tools to be installed to $PREFIX. The --enable-interwork flag allows binutils to emit a combination of ARM and THUMB code; if you don't know what that is, don't worry about it for now. Finally, the --disable-nls flag tells binutils not to build translation files, which speeds up the build. Assuming this step went fine on your development machine, there should be a set of tools in ~/opt/cross/bin (or whatever your top-level output directory is) called arm-none-eabi-*. If this didn't work, then you might want to try a newer or older version of binutils; you can't proceed any further without this working.

With binutils built, we can now move on to gcc:

$ cd $TOPDIR
$ tar -xvf newlib-2.0.0.tar.gz
$ tar -xvf gcc-4.8.2.tar.bz2
$ mkdir build-gcc
$ cd build-gcc
$ ../gcc-4.8.2/configure --target=$TARGET --prefix=$PREFIX \
--enable-interwork --disable-nls --enable-languages="c,c++" \
--without-headers --with-newlib \
--with-headers=$TOPDIR/newlib-2.0.0/newlib/libc/include
$ make -j$BUILDPROCS all-gcc
$ make install-gcc
Here we are unpacking gcc and newlib (which is required for building gcc), doing an out-of-tree build of the initial part of gcc, and then installing it. The flags to configure deserve some explanation. The --target flag tells gcc what target you want the tools to emit code for. The --prefix flag tells gcc that we want our tools to be installed to $PREFIX. The --enable-interwork flag allows gcc to emit a combination of ARM and THUMB code. The --disable-nls flag tells gcc not to build translation files, which speeds up the build. The --enable-languages flag tells gcc which compilers we want it to build; in our case, both the C and C++ compilers. The --without-headers, --with-newlib, and --with-headers flags tell gcc not to use internal headers, but rather to use newlib and the headers from newlib. Assuming this step finished successfully, there should be a file called ~/opt/cross/bin/arm-none-eabi-gcc, which is the initial compiler. Again, if it didn't work, then you might want to try a newer or older version of gcc; you can't proceed any further without this.

With the initial compiler built, we can now build newlib:

$ cd $TOPDIR
$ mkdir build-newlib
$ cd build-newlib
$ ../newlib-2.0.0/configure --target=$TARGET --prefix=$PREFIX \
--enable-interwork
$ make -j$BUILDPROCS
$ make install
Since we've already unpacked newlib, we skip that step. Here we are doing an out-of-tree build of newlib, using the compiler that we built in the last step. The configure flags have the same meaning as previously.

With newlib built, we can now go back and finish the build of gcc (the last step!):

$ cd $TOPDIR/build-gcc
$ make -j$BUILDPROCS
$ make install
This finishes the gcc build, and installs it to $PREFIX. That's it! You should now have a $PREFIX directory full of tools and headers useful for building code to run on the STM32.

Update Jan 8, 2014: Updated the formatting so it is more readable.

Monday, January 6, 2014

Developing STM32 microcontroller code on Linux (Part 1 of 8, introduction)

Recently I've been playing with the STM32, which is a small microcontroller made by ST. These seem to be pretty great microcontroller chips; they are relatively fast (depending on what model you get), have a decent amount of flash (up to 1MB), and have a decent amount of memory (up to 192KB). It is also easy to get development boards for them; there is a line of boards called the STM32DISCOVERY boards that are really cheap and easy to get. It is possible to work on these chips entirely with open source tools, which is important to me.

This series of posts will go through all of the steps necessary to develop on these boards. Note that all of this is covered elsewhere on the web, but a lot of the information is either outdated or scattered. I'll build all of the pieces from the ground up to get a working set of tools and binaries that you can use to develop your own STM32 applications.

To start with, I'm going to describe my hardware setup. I have a laptop running Fedora 19 x86_64. This is my main development machine, and this is going to be the host for everything I do with the STM32. For an STM32 board, I have an STM32F3DISCOVERY board, as shown here. However, note that for most of the posts, the exact board that you have isn't that important. As long as it is one of the STM32F*DISCOVERY boards, the steps below will mostly apply. The differences will become more important when we start to actually write code that deals with the GPIOs (as the GPIOs differ per board), but for the development environment they are really all quite similar.

This series of posts will do the steps in the following order:
  1. In order to do anything, we need a cross compiler. This is a set of tools that runs on our development environment (Fedora 19 x86_64), but emits instructions for our target hardware (STM32 ARM). Besides the C/C++ compiler, this also includes things like the assembler and linker. Part of the cross-compile toolchain also includes a minimal libc-like environment to program in, which gives you access to <stdio.h> and other familiar header files and functions. We will build a cross compile environment from binutils, gcc, and newlib.
  2. Once we have a cross-compiler, we need some way to debug the programs we write. The simplest thing to do here is to build gdb, the GNU debugger. Unfortunately we can't just use the system gdb, as that generally only understands how to debug and disassemble code on your development machine architecture. So we'll build our own version of gdb that understands ARM.
  3. With the debugger finished, we need some way to take the compiled version of our code and put it onto the target device. The STM32 devices use something called STLinkV2, which is a multi-purpose communication protocol (generally over USB). In order to upload our code to the device, we need a piece of software that speaks this protocol. Luckily there is OpenOCD, the Swiss Army Knife of communication protocols. We'll need to build a version of this that runs on our development machine, but knows how to speak STLinkV2. In this step we'll also build a configuration file that can communicate over STLinkV2.
  4. With the communications taken care of, we need a device library. This is basically an abstraction layer that allows us to talk directly to the hardware on the target device. For the purposes of these posts we are going to use libopencm3. This step will build libopencm3 for the target device.
  5. Once we have libopencm3 built, we have to know how to link programs so that they run on the STM32. This step will discuss linker scripts and command-line directives necessary to build programs that run on the STM32.
  6. Here we build our first simple program, upload it to the STM32, and watch it run! Finally!
  7. For bonus, I discuss running a simple Real-Time Operating System on the STM32, FreeRTOS. Using this will allow you to define several tasks and have the RTOS switch between them, much like tasks on a full-fledged OS. This opens up new possibilities and new problems, some of which will be discussed.
Whew, that's a lot of steps just to get the equivalent of "Hello World" running on the board. However, it should be educational and collect a lot of this information together in one place.

Sunday, December 15, 2013

Release of ruby-libvirt 0.5.1

I'm pleased to announce the release of ruby-libvirt 0.5.1. ruby-libvirt is a ruby wrapper around the libvirt API. The changelog between 0.5.0 and 0.5.1 is:
  • Fixes to compile against older libvirt
  • Fixes to compile against ruby 1.8
Version 0.5.1 is available from http://libvirt.org/ruby:

Tarball: http://libvirt.org/ruby/download/ruby-libvirt-0.5.1.tgz
Gem: http://libvirt.org/ruby/download/ruby-libvirt-0.5.1.gem

It is also available from rubygems.org; to get the latest version, run:

$ gem install ruby-libvirt

As usual, if you run into questions, problems, or bugs, please feel free to mail me (clalancette@gmail.com) and/or the libvirt mailing list. Thanks to everyone who contributed patches and submitted bugs.

Monday, December 9, 2013

Release of ruby-libvirt 0.5.0

I'm pleased to announce the release of ruby-libvirt 0.5.0. ruby-libvirt is a ruby wrapper around the libvirt API. Version 0.5.0 brings new APIs, more documentation, and bugfixes:
  • Updated Network class, implementing almost all libvirt APIs
  • Updated Domain class, implementing almost all libvirt APIs
  • Updated Connection class, implementing almost all libvirt APIs
  • Updated DomainSnapshot class, implementing almost all libvirt APIs
  • Updated NodeDevice class, implementing almost all libvirt APIs
  • Updated Storage class, implementing almost all libvirt APIs
  • Add constants for almost all libvirt defines
  • Improved performance in the library by using alloca
Version 0.5.0 is available from http://libvirt.org/ruby:

Tarball: http://libvirt.org/ruby/download/ruby-libvirt-0.5.0.tgz
Gem: http://libvirt.org/ruby/download/ruby-libvirt-0.5.0.gem

It is also available from rubygems.org; to get the latest version, run:

$ gem install ruby-libvirt

As usual, if you run into questions, problems, or bugs, please feel free to mail me (clalancette@gmail.com) and/or the libvirt mailing list. Thanks to everyone who contributed patches and submitted bugs.

Saturday, November 9, 2013

Writing Ruby Extensions in C - Part 13, Wrapping C data structures

This is the thirteenth in my series of posts about writing ruby extensions in C. The first post talked about the basic structure of a project, including how to set up building. The second post talked about generating documentation. The third post talked about initializing the module and setting up classes. The fourth post talked about types and return values. The fifth post focused on creating and handling exceptions. The sixth post talked about ruby catch and throw blocks. The seventh post talked about dealing with numbers. The eighth post talked about strings. The ninth post focused on arrays. The tenth post looked at hashes. The eleventh post explored blocks and callbacks. The twelfth post looked at allocating and freeing memory. This post will focus on wrapping C data structures in ruby objects.

Wrapping C data structures

When developing a ruby extension in C, it may be necessary to save an allocated C structure inside a Ruby object. For instance, in the ruby-libvirt bindings, a virConnectPtr (which points to a libvirt connection object) is saved inside of a Libvirt::Connect ruby object, and that pointer is fetched from the object any time an instance method is called. Note that the pointer to the C structure is stored inside the Ruby object in a way that the ruby code can't get to; only C extensions will have access to this pointer. There are only 3 APIs that are used to manipulate these pointers:
  • Data_Wrap_Struct(VALUE klass, void (*mark)(), void (*free)(), void *ptr) - Wrap the C data structure in ptr into a class of type klass. The free argument is a function pointer to a function that will be called when the object is being garbage collected. If the C structure references other ruby objects, then the mark function pointer must also be provided and must properly mark the other objects with rb_gc_mark(). This function returns a VALUE which is an object of type klass.
  • Data_Make_Struct(VALUE klass, c-type, void (*mark)(), void (*free)(), c-type *ptr) - Similar to Data_Wrap_Struct(), but first allocates and then wraps the C structure in an object. The klass, mark, free, and ptr arguments have the same meaning as Data_Wrap_Struct(). The c-type argument is the actual name of the type that needs to be allocated (sizeof(type) will be used to allocate).
  • Data_Get_Struct(VALUE obj, c-type, c-type *ptr) - Get the C data structure of c-type out of the object obj, and put the result in ptr. Note that this pointer assignment works because this is a macro.

An example will demonstrate the use of these functions:

 1) static VALUE m_example;
 2) static VALUE c_conn;
 3)
 4) struct mystruct {
 5)     int a;
 6)     int b;
 7) };
 8)
 9) static void mystruct_free(void *s)
10) {
11)    xfree(s);
12) }
13)
14) static VALUE example_open(VALUE m)
15) {
16)     struct mystruct *conn;
17)     conn = ALLOC(struct mystruct);
18)     conn->a = 25;
19)     conn->b = 99;
20)     return Data_Wrap_Struct(c_conn, NULL, mystruct_free, conn);
21) }
22)
23) static VALUE conn_get_a(VALUE c)
24) {
25)     struct mystruct *conn;
26)     Data_Get_Struct(c, struct mystruct, conn);
27)     return INT2NUM(conn->a);
28) }
29)
30) void Init_example(void)
31) {
32)     m_example = rb_define_module("Example");
33)     rb_define_module_function(m_example, "open", example_open, 0);
34)     c_conn = rb_define_class_under(m_example, "Conn", rb_cObject);
35)     rb_define_method(c_conn, "get_a", conn_get_a, 0);
36) }

On lines 32 and 33, we define the Example module and give it a module function called "open". Lines 34 and 35 define a Conn class under the Example module, and give the Conn class a "get_a" method. Lines 14 through 21 are where we implement the Example::open function. There, we allocate memory for our C structure, then use Data_Wrap_Struct() to wrap that C structure in a ruby object of type Example::Conn. Note that we also pass mystruct_free() as the free callback; when the object gets reaped by the garbage collector, this function on lines 9 through 12 will be called to free up any memory. Now when the user calls "get_a" on the Example::Conn ruby object, the function on lines 23 through 28 will be called. There we use Data_Get_Struct() to fetch the structure back out of the object, and then return a ruby number for the integer stored inside.

Update: added links to all of the previous articles.

Update Jan 27, 2014: Updated the example to fix the use of ALLOC(). Thanks to Thomas Thomassen in the comments.
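If you want to poke at the wrap/unwrap/free lifecycle without building a ruby extension, here is a rough model of it in plain C. To be clear, this is only a sketch of the idea and not how ruby implements it; the toy_wrapper structure and the toy_wrap(), toy_unwrap(), and toy_collect() functions are names I made up for illustration, and they stand in for the ruby object, Data_Wrap_Struct(), Data_Get_Struct(), and the garbage collector respectively.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a ruby object that wraps a C pointer. */
struct toy_wrapper {
    void *data;             /* the wrapped C structure */
    void (*dfree)(void *);  /* called when the wrapper is collected */
};

struct mystruct {
    int a;
    int b;
};

static void mystruct_free(void *s)
{
    free(s);
}

/* Roughly the role of Data_Wrap_Struct(): remember the pointer and
 * how to free it later. */
struct toy_wrapper *toy_wrap(void (*dfree)(void *), void *ptr)
{
    struct toy_wrapper *w = malloc(sizeof(*w));

    assert(w != NULL);
    w->data = ptr;
    w->dfree = dfree;
    return w;
}

/* Roughly the role of Data_Get_Struct(): hand the pointer back. */
void *toy_unwrap(struct toy_wrapper *w)
{
    return w->data;
}

/* What a collector does once the wrapper becomes garbage. */
void toy_collect(struct toy_wrapper *w)
{
    if (w->dfree != NULL)
        w->dfree(w->data);
    free(w);
}

/* Wrap a struct, read a field back through the wrapper, collect it. */
int toy_wrap_demo(void)
{
    struct mystruct *conn = malloc(sizeof(*conn));
    struct toy_wrapper *w;
    int a;

    assert(conn != NULL);
    conn->a = 25;
    conn->b = 99;
    w = toy_wrap(mystruct_free, conn);
    a = ((struct mystruct *)toy_unwrap(w))->a;
    toy_collect(w);  /* frees conn via mystruct_free(), then w */
    return a;
}
```

Calling toy_wrap_demo() from a small main() walks through the same steps as Example::open followed by get_a, with the collection happening at a point you control rather than whenever the garbage collector decides to run.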

Tuesday, January 18, 2011

Writing Ruby Extensions in C - Part 12, Allocating memory

This is the twelfth in my series of posts about writing ruby extensions in C. The first post talked about the basic structure of a project, including how to set up building. The second post talked about generating documentation. The third post talked about initializing the module and setting up classes. The fourth post talked about types and return values. The fifth post focused on creating and handling exceptions. The sixth post talked about ruby catch and throw blocks. The seventh post talked about dealing with numbers. The eighth post talked about strings. The ninth post focused on arrays. The tenth post looked at hashes. The eleventh post explored blocks and callbacks. This post will look at allocating and freeing memory.

Allocating memory


When creating a new ruby object, memory will be automatically allocated from the garbage collector as needed.

If the ruby extension needs to allocate C-style memory, the basic malloc/realloc/calloc calls can be used. However, there are ruby counterparts that do the work of malloc/realloc/calloc in a slightly better way. The advantage of the following calls is that they first try to allocate memory, and if they fail, they will invoke the garbage collector to free up a bit of memory and try again. That way if the program is low on memory, or the address space is fragmented because of the ruby memory allocator, these functions will succeed where basic malloc/realloc/calloc would fail:
  • ALLOC(type) - allocate a structure of the pointer type
  • ALLOC_N(type, num) - allocate num structures of pointer type
  • REALLOC_N(var, type, num) - realloc var to num structure of pointer type

It is important to use xfree() to free the memory allocated by these calls. In the nominal case there isn't much difference between regular free() and xfree(), but if ruby is built a certain way, xfree() does some additional internal accounting. In any case, there is no reason not to use xfree(), so it is recommended to always use xfree(). Thanks to SodaBrew for pointing this out in the comments.

A simple example to demonstrate the use of these functions:

 1) struct mystruct {
 2)     int a;
 3)     char *b;
 4) };
 5)
 6) static VALUE implementation(VALUE a) {
 7)     struct mystruct *single;
 8)     struct mystruct *multiple;
 9)
10)     single = ALLOC(struct mystruct);
11)     xfree(single);
12)
13)     multiple = ALLOC_N(struct mystruct, 5);
14)
15)     REALLOC_N(multiple, struct mystruct, 10);
16)
17)     xfree(multiple);
18)
19)     return Qnil;
20) }

Lines 1 through 4 just define a simple structure containing an int and a char *. The implementation of a ruby method on lines 6 through 20 shows the use of the allocation functions. Line 10 shows the allocation of a single structure of type struct mystruct, which is freed on line 11. Line 13 shows the allocation of an array of 5 struct mystruct elements into the multiple pointer. Line 15 shows the reallocation of the multiple array to 10 elements. Notice that since REALLOC_N is a macro, it operates slightly differently than realloc(); in particular, there is no need (and no way) to re-assign the pointer. Finally, line 17 frees up the multiple pointer and line 19 returns successfully from the function.
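The point about REALLOC_N being a macro is easier to see with a cut-down version in plain C. The sketch below is my own approximation, not ruby's definition: TOY_REALLOC_N and grow_or_die() are hypothetical names, and grow_or_die() simply aborts on failure where ruby would try the garbage collector and then raise.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: realloc that aborts instead of returning NULL. */
static void *grow_or_die(void *ptr, size_t size)
{
    void *p = realloc(ptr, size);

    if (p == NULL) {
        fprintf(stderr, "out of memory\n");
        abort();
    }
    return p;
}

/* Because this is a macro, it can assign to `var` itself; a function
 * would only see a copy of the pointer. */
#define TOY_REALLOC_N(var, type, n) \
    ((var) = (type *)grow_or_die((var), sizeof(type) * (n)))

/* Grow an array from 5 to 10 ints and return the sum of all 10. */
int toy_realloc_demo(void)
{
    int *nums = malloc(sizeof(int) * 5);
    int i;
    int sum = 0;

    assert(nums != NULL);
    for (i = 0; i < 5; i++)
        nums[i] = i;

    TOY_REALLOC_N(nums, int, 10);  /* nums now points at the new block */
    for (i = 5; i < 10; i++)
        nums[i] = i;

    for (i = 0; i < 10; i++)
        sum += nums[i];
    free(nums);
    return sum;
}
```

The first 5 values survive the reallocation, so toy_realloc_demo() returns the sum of 0 through 9.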

Error handling and not leaking memory


Ruby is a garbage collected language, meaning that applications don't generally have to worry about freeing memory after it is used. This garbage collection extends into C extension modules, but only to a certain point. If you are writing a C extension to ruby, there are some places that you have to worry about keeping track of your pointers and freeing them up. To understand why, we need to dig a little into the memory allocation functions of ruby.

When you are writing pure ruby, and execute a line of code like:

x = ['a']

the ruby virtual machine causes some memory to come into existence to hold that list for you. The way that this memory is allocated is with rb_ary_new() (or one of its derivatives). The call chain looks like: rb_ary_new() -> rb_ary_new2() -> ary_new() -> NEWOBJ() -> rb_newobj(). Inside of rb_newobj(), no memory is actually allocated; instead, the new object that we need to come into existence is just taken off of the list of free objects, and the free list head is moved to the next object. If it turns out that no memory is available in this freelist, the garbage collector is run to try to reap some memory, and then the memory is given to this new object. Because this memory is coming from the freelist, it is all involved with (and can later be reaped by) the garbage collection.
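To make the freelist idea concrete, here is a tiny fixed-size pool in plain C. This is a toy model under my own assumptions (a static pool of 4 slots, hypothetical names toy_obj, toy_newobj(), and toy_reap()); the real rb_newobj() is considerably more involved, but the "pop the head of the list" step looks much like this.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 4

/* A toy object slot; while free, `next` links it into the free list. */
struct toy_obj {
    struct toy_obj *next;
    int value;
};

static struct toy_obj pool[POOL_SIZE];
static struct toy_obj *freelist;

/* Thread every slot in the pool onto the free list. */
void toy_pool_init(void)
{
    int i;

    freelist = NULL;
    for (i = POOL_SIZE - 1; i >= 0; i--) {
        pool[i].next = freelist;
        freelist = &pool[i];
    }
}

/* Take the head of the free list and advance the head to the next
 * slot.  NULL means the list is empty, which is the point where a
 * real runtime would run its garbage collector. */
struct toy_obj *toy_newobj(void)
{
    struct toy_obj *obj = freelist;

    if (obj == NULL)
        return NULL;
    freelist = obj->next;
    obj->next = NULL;
    return obj;
}

/* Put a reaped object back on the free list. */
void toy_reap(struct toy_obj *obj)
{
    assert(obj != NULL);
    obj->next = freelist;
    freelist = obj;
}
```

After toy_pool_init(), four calls to toy_newobj() succeed and the fifth returns NULL; reaping any one of the objects makes the next toy_newobj() hand that same slot back out.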

When you allocate memory in C code using malloc (or one of its derivatives), no such thing happens. The memory is properly allocated, but it is not involved in any of the garbage collection schemes. This leads to 2 problems:
  1. Since malloc isn't involved in the garbage collection, the malloc can fail earlier than it normally would due to address space fragmentation. This isn't generally a problem on 64-bit architectures, but it could crop up as a problem on 32-bit ones.
  2. If a ruby call in your extension module fails, it will throw an exception. In ruby, exceptions are done via a longjmp out of the extension code and into the ruby exception handling code. If you have allocated any memory with malloc and friends, you have now lost the pointers to that memory, so you now have a memory leak (apparently this problem is much worse when dealing with C++; see [1]).

Problem 1) is partially solved by using the built-in ruby ALLOC, ALLOC_N, and ruby_xmalloc functions. Problem 2) is much more insidious, and more difficult to handle. Luckily, it is not impossible to handle.

Assume you have the following code snippet:


 1) int *ids;
 2) VALUE result;
 3) int i;
 4)
 5) ids = ALLOC_N(int, 5);
 6) for (i = 0; i < 5; i++)
 7)     ids[i] = i;
 8)
 9) result = rb_ary_new2(5);
10)
11) for (i = 0; i < 5; i++)
12)     rb_ary_push(result, INT2NUM(ids[i]));
13)
14) xfree(ids);

(while this is a bit of a contrived example, it actually bears a lot of resemblance to this[2] code in ruby-libvirt)

What this code is trying to do is to create an array full of the values in the "ids" array. If there are no errors, then this code works absolutely fine and doesn't leak any memory (ids gets freed at line 14, and the ruby array will get reaped by the garbage collector eventually). However, if either rb_ary_new2() or rb_ary_push() fails in lines 9 or 12, then they will automatically longjmp to the ruby exception handler, completely skipping the xfree at line 14. This code has now leaked memory.

The way to fix this is to interrupt ruby's normal longjmp on exception mechanism so that you can insert code of your own before throwing the exception. The rb_protect() ruby call can be used to do exactly this. Unfortunately the interface is a bit clunky, but we have to do what we have to do.

rb_protect() takes 3 arguments: the name of a callback function that takes 1 (and exactly 1) argument, the argument to pass to that callback function, and a pointer to an integer in which to store the exception state (non-zero if an exception was raised). Because the callback function can only take one argument, typical usage is to create a callback "wrapper" that takes the one and only argument. The data that you pass in can be anything, so if you want to pass in multiple arguments, you can do so by passing in a pointer to a structure containing all of the data that you need. An example should help clarify some of this:


 1) struct rb_ary_push_arg {
 2)     VALUE arr;
 3)     VALUE value;
 4) };
 5)
 6) static VALUE rb_ary_push_wrap(VALUE arg) {
 7)     struct rb_ary_push_arg *e = (struct rb_ary_push_arg *)arg;
 8)
 9)     return rb_ary_push(e->arr, e->value);
10) }
11)
12) int *ids;
13) VALUE result;
14) int i;
15) int exception = 0;
16) struct rb_ary_push_arg args;
17)
18) ids = ALLOC_N(int, 5);
19) for (i = 0; i < 5; i++)
20)     ids[i] = i;
21)
22) result = rb_ary_new2(5);
23)
24) for (i = 0; i < 5; i++) {
25)     args.arr = result;
26)     args.value = INT2NUM(ids[i]);
27)     rb_protect(rb_ary_push_wrap, (VALUE)&args, &exception);
28)     if (exception) {
29)         xfree(ids);
30)         rb_jump_tag(exception);
31)     }
32) }
33)
34) xfree(ids);

Now when we add entries to the ruby array, we are doing so through the rb_ary_push_wrap() function, called by rb_protect(). This means that if rb_ary_push() fails for any reason and throws an exception, control will be returned back to the code above at line 28, but with exception set to a non-zero number. We have a chance to clean up after ourselves, and then continue propagating the exception with rb_jump_tag(). Note that with the use of a proper structure, we can pass any number of arguments through to the wrapper function, so we can use this for all internal ruby functions. Notice that I did not wrap rb_ary_new2(), even though that can cause the same problem; I leave this as an exercise to the reader.
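If the longjmp business still feels a bit magical, the following plain C sketch models it with setjmp()/longjmp() directly. Everything here is a stand-in that I made up for illustration: toy_raise() plays the part of a failing ruby call, toy_protect() plays the part of rb_protect(), and the single global jump buffer means it does not handle nested protection the way ruby does.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>

static jmp_buf toy_handler;  /* where toy_raise() jumps back to */
static int toy_pending_tag;  /* tag of the exception in flight */
static int live_buffers;     /* buffers allocated but not yet freed */

/* Toy stand-in for a failing ruby call: jumps away, never returns. */
static void toy_raise(int tag)
{
    toy_pending_tag = tag;
    longjmp(toy_handler, 1);
}

/* Toy stand-in for rb_protect(): run func(arg) and catch any raise.
 * *state is set to 0 on success, or to the raised tag otherwise. */
static void toy_protect(void (*func)(int), int arg, int *state)
{
    if (setjmp(toy_handler) == 0) {
        func(arg);
        *state = 0;
    } else {
        *state = toy_pending_tag;
    }
}

/* A step that raises when asked to. */
static void risky_step(int should_fail)
{
    if (should_fail)
        toy_raise(7);
}

/* Allocate, run the risky step under protection, and always free.
 * Returns the state reported by toy_protect(). */
int guarded_work(int should_fail)
{
    int state = 0;
    int *ids = malloc(5 * sizeof(int));

    assert(ids != NULL);
    live_buffers++;

    toy_protect(risky_step, should_fail, &state);

    /* Reached on both paths, so the buffer is never leaked.  Real
     * code would re-raise at this point if state is non-zero. */
    free(ids);
    live_buffers--;
    return state;
}

/* Number of buffers currently allocated by guarded_work(). */
int toy_live_buffers(void)
{
    return live_buffers;
}
```

Had risky_step() been called directly with no toy_protect() around it, the longjmp would have skipped the free() entirely, which is exactly the leak described above.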

[1] http://www.thoughtsincomputation.com/posts/ruby-c-extensions-c-and-weird-crashing-on-rb_raise
[2] http://libvirt.org/git/?p=ruby-libvirt.git;a=blob;f=ext/libvirt/domain.c;h=eb4426252af635311e14e234a62780fbd4048f0b;hb=HEAD#l80

Monday, January 17, 2011

Writing Ruby Extensions in C - Part 11, Blocks and Callbacks

This is the eleventh in my series of posts about writing ruby extensions in C. The first post talked about the basic structure of a project, including how to set up building. The second post talked about generating documentation. The third post talked about initializing the module and setting up classes. The fourth post talked about types and return values. The fifth post focused on creating and handling exceptions. The sixth post talked about ruby catch and throw blocks. The seventh post talked about dealing with numbers. The eighth post talked about strings. The ninth post focused on arrays. The tenth post looked at hashes. This post will talk about blocks and callbacks.

Blocks


Blocks [1] are a great idiom in ruby, equivalent to anonymous functions attached to a line of code. As with many other things in ruby C extensions, they are fairly easy to deal with. There are a few functions to know about:
  • rb_block_given_p() - returns 1 if a block was given to this ruby function, 0 otherwise
  • rb_yield(value) - yield a single value to the given block
  • rb_yield_values() - yield multiple values to the given block

In ruby terms "yield"ing sends a value from a statement into a block. If you want to yield multiple values to a ruby block, you have two options: rb_yield() with an array or hash, and rb_yield_values(). They both work equally well, though rb_yield_values() with multiple values is a bit more idiomatic to ruby. It is also possible for the ruby block to return a result from the block; the return value of the last statement of the block will be returned from the rb_yield() or rb_yield_values() call. However, note that the last line of the block cannot be a return; in that case, the value will be lost forever. Unfortunately this puts a bit of a burden on the consumers of the APIs, but it is coded into the ruby runtime[2]. The following example will demonstrate all of these calls.

First let's look at the ruby code:


 1) obj.rb_yield_example {|single|
 2)     puts "Single element is #{single}"
 3)     "done"
 4) }
 5)
 6) obj.rb_yield_values_example {|first, second, third|
 7)     puts "1st is #{first}, 2nd is #{second}, 3rd is #{third}"
 8)     "done"
 9) }

Now let's look at the C code to implement the above:


 1) static VALUE example_rb_yield(VALUE c) {
 2)     VALUE result;
 3)
 4)     if (!rb_block_given_p())
 5)         rb_raise(rb_eArgError, "Expected block");
 6)
 7)     result = rb_yield(rb_str_new2("hello"));
 8)
 9)     fprintf(stderr, "Return value from block is %s\n",
10)             StringValueCStr(result));
11)
12)     return Qnil;
13) }
14)
15) static VALUE example_rb_yield_values(VALUE c){
16)     VALUE result;
17)
18)     if (!rb_block_given_p())
19)         rb_raise(rb_eArgError, "Expected block");
20)
21)     result = rb_yield_values(3, rb_str_new2("first"),
22)                              rb_str_new2("second"),
23)                              rb_str_new2("third"));
24)
25)     fprintf(stderr, "Return value from block is %s\n",
26)             StringValueCStr(result));
27)
28)     return Qnil;
29) }
30)
31) rb_define_method(c_obj, "rb_yield_example",
32)                  example_rb_yield, 0);
33) rb_define_method(c_obj, "rb_yield_values_example",
34)                  example_rb_yield_values, 0);


Callbacks


Although blocks are idiomatic to ruby and should be used wherever possible, there are situations in which they do not work. For instance, if a ruby method needs to be used as a callback for an asynchronous event, blocks do not work; they are only active for the duration of the method call the block is attached to. If it is necessary to call a particular ruby method from a C library asynchronous callback, there are 2 options:

  1. Procs (lambdas)
  2. Named Methods

Procs are more idiomatic to ruby, but as far as I can tell there aren't a whole lot of advantages to Procs over named methods. I'll go through both of them after setting up the example.

Let's assume that the C library being wrapped requires callbacks for asynchronous events. In this case, the library is expecting a function pointer with a signature looking like:

int (*asynccallback)(int event, void *userdata);

(that is, the function must take an event and a void pointer in, and return an int result). Also assume that we have to register the callback with the library:

void register_async_callback(int (*cb)(int, void *), void *userdata);

How would we go about calling a ruby method that the user writes when the library does the asynchronous callback?

Procs


With Procs, we would have the user of our ruby library create a Proc and pass it to the extension. An example ruby client:

 1)  cb = Proc.new {|event, userdata|
 2)      puts "event is #{event}, userdata is #{userdata}"
 3)  }
 4)
 5)  ruby_extension.register_async_proc(cb, "my user data")

Note that the body of the Proc can be any valid ruby; here we simply print out the arguments that were passed into the Proc.

In the extension, we would define a method called "register_async_proc" that takes 2 arguments: the Proc and the user data that we want passed through to the Proc. The extension C code would look something like:


 1) int internal_callback(int event, void *userdata) {
 2)     VALUE passthrough = (VALUE)userdata;
 3)     VALUE cb;
 4)     VALUE cbdata;
 5)
 6)     cb = rb_ary_entry(passthrough, 0);
 7)     cbdata = rb_ary_entry(passthrough, 1);
 8)
 9)     rb_funcall(cb, rb_intern("call"), 2, INT2NUM(event),
10)                cbdata);
11)
12)     return 0;
13) }
14)
15) VALUE ext_register(VALUE obj, VALUE cb, VALUE userdata) {
16)     VALUE passthrough;
17)
18)     if (rb_class_of(cb) != rb_cProc)
19)         rb_raise(rb_eTypeError, "Expected Proc callback");
20)
21)     passthrough = rb_ary_new();
22)     rb_ary_store(passthrough, 0, cb);
23)     rb_ary_store(passthrough, 1, userdata);
24)
25)     register_async_callback(internal_callback, (void *)passthrough);
26)     return Qnil;
27) }
28)
29) rb_define_method(c_extension, "register_async_proc",
30)                  ext_register, 2);

The above is not a lot of code, but there is a lot going on, so let's step through it one line at a time starting from the end. Line 29 defines our new method called register_async_proc, that will call the internal extension function ext_register (lines 15 to 27) with 2 arguments. Lines 18 and 19 inside of ext_register check to make sure that what the user actually passed us was a Proc. Lines 21 through 23 set up a new ruby array that contains both the callback that the user gave to us and any additional user data that they want passed into the Proc. Line 25 calls the C library function register_async_callback with our *internal* callback, and the ruby array that we set up in lines 21 through 23. There are a couple of things to note with this. First, we cannot use the ruby Proc as the callback directly; the Proc will have the wrong signature, and the C library doesn't have any idea of how to marshal data so that ruby can understand it. Instead, we have the C library call an internal callback inside the extension; this internal callback will marshal the data for the ruby callback, and then invoke the ruby callback. The second thing to note about line 25 is that we pass the array that we created in lines 21 through 23 to the C library in the "opaque" callback data. It is imperative that the C library function provide a void * pointer for user data, otherwise this technique cannot work.

After line 25, the asynchronous callback is set up. When an event happens in the C library, it will call back to the function given to it by register_async_callback. In our case, this callback is internal_callback, lines 1 through 13. The first thing that internal_callback does on line 2 is to cast the void * back to a VALUE so we can operate on it. In lines 6 and 7, the array that was created and registered earlier is pulled apart into separate pieces. Finally, line 9 calls out to the Proc that was originally registered by the user, passing the event that happened and the original user data to be passed into the Proc.
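Since the C library in this example is imaginary, it can help to have a stand-in for it when experimenting with the userdata round trip. The sketch below is one possible stub, written in plain C with no ruby involved; register_async_callback() has the signature given earlier, while fire_event(), struct passthrough, and record_event() are hypothetical names of my own.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the C library: remembers one callback and its data. */
static int (*saved_cb)(int event, void *userdata);
static void *saved_userdata;

void register_async_callback(int (*cb)(int, void *), void *userdata)
{
    saved_cb = cb;
    saved_userdata = userdata;
}

/* Hypothetical: what the library does when an event arrives.
 * Returns -1 if nothing has been registered yet. */
int fire_event(int event)
{
    if (saved_cb == NULL)
        return -1;
    return saved_cb(event, saved_userdata);
}

/* Plays the role of the two-element ruby array in the example. */
struct passthrough {
    int last_event;      /* written by the callback */
    const char *cbdata;  /* the user's own data */
};

/* Same shape as internal_callback(): cast the opaque pointer back
 * to what was registered, then use it. */
int record_event(int event, void *userdata)
{
    struct passthrough *p = (struct passthrough *)userdata;

    assert(p != NULL);
    p->last_event = event;
    return 0;
}
```

Registering record_event() along with a pointer to a struct passthrough, and then calling fire_event(42), leaves 42 in last_event. The library never looks inside the void pointer; it only hands it back, which is why the technique depends on the library offering that parameter.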

Named methods


Named method callbacks work very similarly to Proc callbacks, so I won't go into great lengths to describe them. I'll show the (very similar) example code, and explain the differences to the Proc callback method.

First the ruby client code:

 1)  def cb(event, userdata)
 2)      puts "event is #{event}, userdata is #{userdata}"
 3)  end
 4)
 5)  ruby_extension.register_async_symbol(:cb, "my user data")

There are two important differences to the Proc code: the callback is a real method (defined with def), and we pass it into the extension call differently. We cannot just use "cb", because otherwise ruby attempts to execute the function cb before calling register_async_symbol. Instead we have to pass the Symbol that represents the callback method.

Now we look at the extension code:

 1) int internal_callback(int event, void *userdata) {
 2)     VALUE passthrough = (VALUE)userdata;
 3)     VALUE cb;
 4)     VALUE cbdata;
 5)
 6)     cb = rb_ary_entry(passthrough, 0);
 7)     cbdata = rb_ary_entry(passthrough, 1);
 8)
 9)     rb_funcall(rb_class_of(cb), rb_to_id(cb), 2, INT2NUM(event),
10)                cbdata);
11)
12)     return 0;
13) }
14)
15) VALUE ext_register(VALUE obj, VALUE cb, VALUE userdata) {
16)     VALUE passthrough;
17)
18)     if (rb_class_of(cb) != rb_cSymbol)
19)         rb_raise(rb_eTypeError, "Expected Symbol callback");
20)
21)     passthrough = rb_ary_new();
22)     rb_ary_store(passthrough, 0, cb);
23)     rb_ary_store(passthrough, 1, userdata);
24)
25)     register_async_callback(internal_callback, (void *)passthrough);
26)     return Qnil;
27) }
28)
29) rb_define_method(c_extension, "register_async_symbol",
30)                  ext_register, 2);

The differences are minor. Line 29 defines this as "register_async_symbol" instead of "register_async_proc". Line 18 checks to make sure that this is of type rb_cSymbol instead of rb_cProc. Line 9 is where the biggest difference is. Instead of using the "call" method to invoke the Proc, we instead use the class and the ID of the method that the user originally gave to us.

[1] http://ruby-doc.org/docs/ProgrammingRuby/html/tut_containers.html
[2] http://stackoverflow.com/questions/1435743/why-does-explicit-return-make-a-difference-in-a-proc

Sunday, January 16, 2011

Writing Ruby Extensions in C - Part 10, Hashes

This is the tenth in my series of posts about writing ruby extensions in C. The first post talked about the basic structure of a project, including how to set up building. The second post talked about generating documentation. The third post talked about initializing the module and setting up classes. The fourth post talked about types and return values. The fifth post focused on creating and handling exceptions. The sixth post talked about ruby catch and throw blocks. The seventh post talked about dealing with numbers. The eighth post talked about strings. The ninth post focused on arrays. This post will look at hashes.

Hashes


The nice thing about hashes in ruby C extensions is that they act very much like the ruby hashes they represent. There are a few functions to know about:
  • rb_hash_new() - create a new ruby Hash
  • rb_hash_aset(hash, key, value) - set the hash key to value
  • rb_hash_aref(hash, key) - get the value for hash key
  • rb_hash_foreach(hash, callback, args) - call callback for each key,value pair in the hash. Callback must have a prototype of int (*cb)(VALUE key, VALUE val, VALUE in)

An example will demonstrate this:

 1) int do_print(VALUE key, VALUE val, VALUE in) {
 2)      fprintf(stderr, "Input data is %s\n", StringValueCStr(in));
 3)
 4)      fprintf(stderr, "Key %s=>Value %s\n", StringValueCStr(key),
 5)              StringValueCStr(val));
 6)
 7)      return ST_CONTINUE;
 8) }
 9)
10) VALUE result;
11) VALUE val;
12)
13) result = rb_hash_new();
14) // result is now {}
15) rb_hash_aset(result, rb_str_new2("mykey"),
16)              rb_str_new2("myvalue"));
17) // result is now {"mykey"=>"myvalue"}
18) rb_hash_aset(result, rb_str_new2("anotherkey"),
19)              rb_str_new2("anotherval"));
20) // result is now {"mykey"=>"myvalue",
21) //                "anotherkey"=>"anotherval"}
22) rb_hash_aset(result, rb_str_new2("mykey"),
23)              rb_str_new2("differentval"));
24) // result is now {"mykey"=>"differentval",
25) //                "anotherkey"=>"anotherval"}
26) val = rb_hash_aref(result, rb_str_new2("mykey"));
27) // result is now {"mykey"=>"differentval",
28) //                "anotherkey"=>"anotherval"},
29) // val is "differentval"
30) rb_hash_delete(result, rb_str_new2("mykey"));
31) // result is now {"anotherkey"=>"anotherval"}
32)
33) rb_hash_foreach(result, do_print, rb_str_new2("passthrough"));

Most of this is pretty straightforward. The most interesting part of this is line 33, where we perform an operation on all elements in the hash by utilizing a callback. This callback is defined on lines 1 through 8, and takes in the key, value, and the user data provided to the original rb_hash_foreach() call. The return code from the callback defines what happens to the processing of the rest of the hash. If the return value is ST_CONTINUE, then the rest of the hash is processed as normal. If the return value is ST_STOP, then no further processing of the hash is done. If the return value is ST_DELETE, then the current hash key is deleted from the hash and the rest of the hash is processed. If the return value is ST_CHECK, then the hash is checked to see if it has been modified during this operation. If so, processing of the hash stops.

Update: Fixed up the example code to show on the screen.
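
For comparison, here is a rough ruby-level version of the same sequence of operations. This is only a sketch: comparing ST_STOP to break and ST_DELETE to delete_if is my own analogy for what the callback return codes do, not something the C API describes in those terms, and the counts hash is made up for the illustration.

```ruby
# Ruby-level sketch of the C example above (an analogy, not the C API itself).
result = {}                          # rb_hash_new()
result["mykey"] = "myvalue"          # rb_hash_aset()
result["anotherkey"] = "anotherval"
result["mykey"] = "differentval"     # setting an existing key overwrites it
val = result["mykey"]                # rb_hash_aref()
result.delete("mykey")               # rb_hash_delete()

# rb_hash_foreach() with a callback that always returns ST_CONTINUE
seen = []
result.each { |key, value| seen << "#{key}=>#{value}" }

# Returning ST_STOP is roughly like breaking out of the iteration
counts = { "a" => 1, "b" => 2, "c" => 3 }   # hypothetical data
visited = []
counts.each do |key, value|
  visited << key
  break if value == 2
end

# Returning ST_DELETE for some pairs is roughly like delete_if
counts.delete_if { |_key, value| value.odd? }
```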

Saturday, January 15, 2011

Writing Ruby Extensions in C - Part 9, Arrays

This is the ninth in my series of posts about writing ruby extensions in C. The first post talked about the basic structure of a project, including how to set up building. The second post talked about generating documentation. The third post talked about initializing the module and setting up classes. The fourth post talked about types and return values. The fifth post focused on creating and handling exceptions. The sixth post talked about ruby catch and throw blocks. The seventh post talked about dealing with numbers. The eighth post talked about strings. This post will focus on arrays.

Arrays


The nice thing about arrays in ruby C extensions is that they act very much like the ruby arrays they represent. There are a few functions to know about:
  • rb_ary_new() - create a new array with 0 elements. Elements can be added later using rb_ary_push(), rb_ary_store(), or rb_ary_unshift().
  • rb_ary_new2(size) - create a new, empty array with room preallocated for size elements. The array still has a length of 0 until elements are added
  • rb_ary_store(array, index, value) - put the ruby value into array at index. This can be used to create sparse arrays; intervening elements that have not yet had values assigned will be set to nil
  • rb_ary_push(array, value) - put value at the end of the array
  • rb_ary_unshift(array, value) - put value at the start of the array
  • rb_ary_pop(array) - pop the last element of array off and return it
  • rb_ary_shift(array) - remove the first element of array and return it
  • rb_ary_entry(array, index) - examine array element located at index without changing array
  • rb_ary_dup(array) - copy array and return the copy
  • rb_ary_to_s(array) - invoke the "to_s" method on the array. Note that this concatenates the array elements together without spacing, so is not generally useful
  • rb_ary_join(array, string_object) - create a string by converting each element of the array to a string separated by string_object. If string_object is Qnil, then no separator is used
  • rb_ary_reverse(array) - reverse the order of all of the elements in array
  • rb_ary_to_ary(ruby_object) - create an array out of any ruby object. If the object is already an array, a reference to the same object is returned. If the object supports the "to_ary" method, then "to_ary" is invoked on the object and the result is returned. If neither of the previous are true, then a new array with 1 element containing the object is returned

An example should make most of this clear:

VALUE result, elem, arr2, mystr;

result = rb_ary_new();
// result is now []
rb_ary_push(result, INT2FIX(1));
// result is now [1]
rb_ary_push(result, INT2FIX(2));
// result is now [1, 2]
rb_ary_unshift(result, INT2FIX(0));
// result is now [0, 1, 2]
rb_ary_store(result, 3, INT2FIX(3));
// result is now [0, 1, 2, 3]
rb_ary_store(result, 5, INT2FIX(5));
// result is now [0, 1, 2, 3, nil, 5]
elem = rb_ary_pop(result);
// result is now [0, 1, 2, 3, nil] and elem is 5
elem = rb_ary_shift(result);
// result is now [1, 2, 3, nil] and elem is 0
elem = rb_ary_entry(result, 0);
// result is now [1, 2, 3, nil] and elem is 1
arr2 = rb_ary_dup(result);
// result is now [1, 2, 3, nil] and arr2 is [1, 2, 3, nil]
mystr = rb_ary_to_s(result);
// result is now [1, 2, 3, nil] and mystr is 123
mystr = rb_ary_join(result, rb_str_new2("-"));
// result is now [1, 2, 3, nil] and mystr is 1-2-3-
rb_ary_reverse(result);
// result is now [nil, 3, 2, 1]
rb_ary_shift(result);
// result is now [3, 2, 1]
result = rb_ary_to_ary(rb_str_new2("hello"));
// result is now ["hello"]
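
If it helps to map these calls back to ruby, the following is a rough ruby-level equivalent of most of the sequence above. It is a sketch rather than a statement about how the C functions are implemented; in particular, using Array() as a stand-in for rb_ary_to_ary() is my own approximation, and the padded variable exists only so the intermediate state can be inspected.

```ruby
# Ruby-level sketch of the array calls above.
result = []                 # rb_ary_new()
result.push(1)              # rb_ary_push()
result.push(2)
result.unshift(0)           # rb_ary_unshift()
result[3] = 3               # rb_ary_store()
result[5] = 5               # storing past the end pads the gap with nil
padded = result.dup         # snapshot: [0, 1, 2, 3, nil, 5]
last = result.pop           # rb_ary_pop()
first = result.shift        # rb_ary_shift()
arr2 = result.dup           # rb_ary_dup()
joined = result.join("-")   # rb_ary_join(); nil contributes an empty string
result.reverse!             # rb_ary_reverse() reverses in place
wrapped = Array("hello")    # approximates rb_ary_to_ary() for a non-array
```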

Friday, January 14, 2011

Writing Ruby Extensions in C - Part 8, Strings

This is the eighth in my series of posts about writing ruby extensions in C. The first post talked about the basic structure of a project, including how to set up building. The second post talked about generating documentation. The third post talked about initializing the module and setting up classes. The fourth post talked about types and return values. The fifth post focused on creating and handling exceptions. The sixth post talked about ruby catch and throw blocks. The seventh post talked about dealing with numbers. This post will talk about strings.

Dealing with Strings


It is fairly easy to convert C-style strings to ruby string objects, and vice-versa. There are a few functions to know about:
  • rb_str_new(c_str, length) - take the char * c_str pointer and a length in, and return a ruby string object. Note that c_str does *not* have to be NULL terminated; this is one way to deal with binary data
  • rb_str_new2(c_str) - take the NULL terminated char * c_str pointer in, and return a ruby string object
  • rb_str_dup(ruby_string_object) - take ruby_string_object in and return a copy
  • rb_str_plus(string_object_1, string_object_2) - concatenate string_object_1 and string_object_2 and return the result without modifying either object
  • rb_str_times(string_object_1, fixnum_object) - concatenate string_object_1 with itself fixnum_object number of times and return the result
  • rb_str_substr(string_object, begin, length) - return the substring of string_object starting at position begin and going for length characters. A negative begin counts back from the end of the string. If length is less than 0, then "nil" is returned. If begin is past the end of the string or before the beginning of the string, then "nil" is returned. Otherwise, this function returns the substring that starts at begin and runs for length characters, though it may be cut short if there are not enough characters in the string
  • rb_str_cat(string_object, c_str, length) - take the char * c_str pointer and length in, and concatenate onto the end of string_object
  • rb_str_cat2(string_object, c_str) - take the NULL-terminated char *c_str pointer in, and concatenate onto the end of string_object
  • rb_str_append(string_object_1, string_object_2) - concatenate string_object_2 onto string_object_1
  • rb_str_concat(string_object, ruby_object) - concatenate ruby_object onto string_object. If ruby_object is a FIXNUM between 0 and 255, then it is first converted to a character before concatenation. Otherwise it behaves exactly the same as rb_str_append
  • StringValueCStr(ruby_object) - take ruby_object in, attempt to convert it to a String, and return the NULL terminated C-style char *
  • StringValue(ruby_object) - take ruby_object in and attempt to convert it to a String. Assuming this is successful, the C char * pointer for the string is available via the macro RSTRING_PTR(return_value) and the length of the string is available via the macro RSTRING_LEN(return_value). This is useful to retrieve binary data out of a String object

An example should make most of this clear:

VALUE result, str2, substr;

result = rb_str_new2("hello");
// result is now "hello"
str2 = rb_str_dup(result);
// result is now "hello", str2 is now "hello"
result = rb_str_plus(result, rb_str_new2(" there"));
// result is now "hello there"
result = rb_str_times(result, INT2FIX(2));
// result is now "hello therehello there"
substr = rb_str_substr(result, 0, 2);
// result is now "hello therehello there", substr is "he"
substr = rb_str_substr(result, -2, 2);
// result is now "hello therehello there", substr is "re"
substr = rb_str_substr(result, -2, 5);
// result is now "hello therehello there", substr is "re"
// (substring was cut short because the length goes past the end of the string)
substr = rb_str_substr(result, 0, -1);
// result is now "hello therehello there", substr is Qnil
// (length is negative)
substr = rb_str_substr(result, 23, 1);
// result is now "hello therehello there", substr is Qnil
// (requested start point after end of string)
substr = rb_str_substr(result, -23, 1);
// result is now "hello therehello there", substr is Qnil
// (requested start point before beginning of string)
rb_str_cat(result, "wow", 3);
// result is now "hello therehello therewow"
rb_str_cat2(result, "bob");
// result is now "hello therehello therewowbob"
rb_str_append(result, rb_str_new2("again"));
// result is now "hello therehello therewowbobagain"
rb_str_concat(result, INT2FIX(33));
// result is now "hello therehello therewowbobagain!"
fprintf(stderr, "Result is %s\n", StringValueCStr(result));
// "Result is hello therehello therewowbobagain!" is printed to stderr

Update: modified the code to fit in the pre box.
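
The substring corner cases are easy to check from ruby itself, since String#[] with a start and a length follows the same rules described for rb_str_substr() above. The snippet below is a sketch of the ruby-level equivalents; the variable names are made up for the illustration, and only one of the three append calls is shown.

```ruby
# Ruby-level sketch of the string calls above.
result = "hello"                 # rb_str_new2()
str2 = result.dup                # rb_str_dup()
result = result + " there"       # rb_str_plus() returns a new string
result = result * 2              # rb_str_times()
head = result[0, 2]              # rb_str_substr(result, 0, 2)
tail = result[-2, 5]             # cut short at the end of the string
negative_length = result[0, -1]  # nil: length is negative
past_end = result[23, 1]         # nil: start is after the end (length is 22)
before_start = result[-23, 1]    # nil: start is before the beginning
result << "wow"                  # like rb_str_cat() / rb_str_cat2() / rb_str_append()
result << 33                     # like rb_str_concat() with a number: appends "!"
```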

Thursday, January 13, 2011

Writing Ruby Extensions in C - Part 7, Numbers

This is the seventh in my series of posts about writing ruby extensions in C. The first post talked about the basic structure of a project, including how to set up building. The second post talked about generating documentation. The third post talked about initializing the module and setting up classes. The fourth post talked about types and return values. The fifth post focused on creating and handling exceptions. The sixth post talked about ruby catch and throw blocks. This post will talk about numbers.

Dealing with numbers


Numbers are pretty easy to deal with in a ruby C extension. There are two possible types of Ruby integers: FIXNUMs and Bignums. FIXNUMs are very fast since they just use the native long type of the architecture. However, due to some implementation details, the range of a FIXNUM is limited to one-half of the range of the native long type. If larger (or smaller) numbers need to be manipulated, Bignums are full-blown ruby objects that can represent an integer of any size, at a performance cost. The ruby C extension API has support for converting native integer types to ruby FIXNUMs and Bignums and vice-versa. Some of the functions are:
  • INT2FIX(int) - take an int and convert it to a FIXNUM object (but see INT2NUM below)
  • LONG2FIX(long) - synonym for INT2FIX
  • CHR2FIX(char) - take an ASCII character (0x00-0xff) and convert it to a FIXNUM object
  • INT2NUM(int) - take an int and convert it to a FIXNUM object if it will fit; otherwise, convert to a Bignum object. Since this does the right thing in all circumstances, this should always be used in place of INT2FIX
  • LONG2NUM(long) - synonym for INT2NUM
  • UINT2NUM(unsigned int) - take an unsigned int and convert it to a FIXNUM object if it will fit; otherwise, convert to a Bignum object
  • ULONG2NUM(unsigned long int) - synonym for UINT2NUM
  • LL2NUM(long long) - take a long long int and convert it to a FIXNUM object if it will fit; otherwise, convert to a Bignum object
  • ULL2NUM(unsigned long long) - take an unsigned long long int and convert it to a FIXNUM object if it will fit; otherwise, convert to a Bignum object
  • OFFT2NUM(off_t) - take an off_t and convert it to a FIXNUM object if it will fit; otherwise, convert to a Bignum object
  • FIX2LONG(fixnum_object) - take a FIXNUM object and return the long representation (but see NUM2LONG below)
  • FIX2ULONG(fixnum_object) - take a FIXNUM object and return the unsigned long representation (but see NUM2ULONG below)
  • FIX2INT(fixnum_object) - take a FIXNUM object and return the int representation (but see NUM2INT below)
  • FIX2UINT(fixnum_object) - take a FIXNUM object and return the unsigned int representation (but see NUM2UINT below)
  • NUM2LONG(numeric_object) - take a FIXNUM or Bignum object in and return the long representation. Since this does the right thing in all circumstances, this should be used in favor of FIX2LONG
  • NUM2ULONG(numeric_object) - take a FIXNUM or Bignum object in and return the unsigned long representation. Since this does the right thing in all circumstances, this should be used in favor of FIX2ULONG
  • NUM2INT(numeric_object) - take a FIXNUM or Bignum object in and return the int representation. Since this does the right thing in all circumstances, this should be used in favor of FIX2INT
  • NUM2UINT(numeric_object) - take a FIXNUM or Bignum object in and return the unsigned int representation. Since this does the right thing in all circumstances, this should be used in favor of FIX2UINT
  • NUM2LL(numeric_object) - take a FIXNUM or Bignum object in and return the long long representation
  • NUM2ULL(numeric_object) - take a FIXNUM or Bignum object in and return the unsigned long long representation
  • NUM2OFFT(numeric_object) - take a FIXNUM or Bignum object in and return the off_t representation
  • NUM2DBL(numeric_object) - take a FIXNUM or Bignum object in and return the double representation
  • NUM2CHR(ruby_object) - take ruby_object in and return the char representation of the object. If ruby_object is a string, then the char of the first character in the string is returned. Otherwise, NUM2INT is run on the object and the result is returned

For this particular topic I'll omit the example. There aren't really a lot of interesting things to show or odd corner cases that you need to deal with when working with numbers.
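
That said, the "one-half of the native long" limit is easy to put numbers on. The sketch below assumes that the implementation detail in question is a single bit of the long being reserved to tag the value as a FIXNUM, which is my understanding of the standard C interpreter; the variable names are made up for the illustration.

```ruby
# Size of the native C long on this platform, in bits.
long_bits = [0].pack("l!").bytesize * 8

# Assumption: one bit of the long tags the value as a FIXNUM,
# leaving a signed range half as large as that of long.
fixnum_max = 2**(long_bits - 2) - 1
fixnum_min = -(2**(long_bits - 2))
long_max = 2**(long_bits - 1) - 1

puts "long is #{long_bits} bits"
puts "FIXNUM range: #{fixnum_min}..#{fixnum_max}"
puts "long max:     #{long_max}"
```

Anything outside that range is what INT2NUM and friends hand off to a Bignum.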

Wednesday, January 12, 2011

Writing Ruby Extensions in C - Part 6, Catch/Throw

This is the sixth in my series of posts about writing ruby extensions in C. The first post talked about the basic structure of a project, including how to set up building. The second post talked about generating documentation. The third post talked about initializing the module and setting up classes. The fourth post talked about types and return values. The fifth post focused on creating and handling exceptions. This post will talk about ruby catch and throw blocks.

Catch/Throw

In ruby, raising exceptions is used to transfer control out of a block of code when something goes wrong. Ruby has a second mechanism for transferring control out of blocks, called catch/throw. Any ruby block can be labelled via catch(), and then any line of code within that block can throw() to terminate the rest of the block. This also works with nested catch/throw blocks, so an inner nested throw could throw all the way back out to the outer block. Essentially, they are a fancy goto mechanism; see [1] for some examples. How can we catch and throw from within our C extension module? Like exceptions, we accomplish this through callbacks.

To set up a catch in a C extension, the rb_catch() function is used. rb_catch() takes 3 parameters: the first parameter is the name of the catch block, the second parameter is the callback to invoke in block context, and the third parameter is data to be passed to the callback. The callback receives the catch tag as its first argument and the data as its second, and it must return a VALUE; the example below also declares a third parameter for self.

To return to a catch point in a C extension, the rb_throw() function is used. rb_throw() takes two parameters: the name of the catch block to return to, and the return value (which can be any valid ruby object, including Qnil). If rb_throw() is executed, control is returned from the point of the rb_throw() to the end of the rb_catch() block, and execution continues from there.
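
Before looking at the C side, it may help to see the ruby behavior that rb_catch() and rb_throw() mirror. This is a small sketch in plain ruby; the tag names are made up, and the exception class for an uncaught throw is an assumption that holds for ruby 1.9 and later (older versions raised a different class).

```ruby
# The value passed to throw becomes the return value of the matching catch.
result = catch(:outer) do
  catch(:inner) do
    throw :outer, "thrown from the inner block"
  end
  "not reached"
end

# If nothing is thrown, catch returns the value of its block.
normal = catch(:done) { 42 }

# A throw with no matching catch raises an error instead of jumping.
# (On recent rubies this is UncaughtThrowError, a subclass of ArgumentError.)
uncaught = begin
  throw :nowhere
rescue ArgumentError => e
  e.class
end
```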

An example can demonstrate much of this. First let's look at the C code to implement an example catch/throw:

 1) static VALUE m_example;
 2)
 3) static VALUE catch_cb(VALUE val, VALUE args, VALUE self) {
 4)     /* hand the block's value back to rb_catch() */
 5)     return rb_yield(args);
 6) }
 7)
 8) static VALUE example_method(VALUE klass) {
 9)     VALUE res;
10)
11)     if (!rb_block_given_p())
12)         rb_raise(rb_eStandardError, "Expected a block");
13)
14)     res = rb_catch("catchpoint", catch_cb, rb_str_new2("val"));
15)     if (TYPE(res) != T_FIXNUM)
16)         rb_throw("catchpoint", Qnil);
17)
18)     return res;
19) }
20)
21) void Init_example() {
22)     m_example = rb_define_module("Example");
23)
24)     rb_define_module_function(m_example, "method",
25)                               example_method, 0);
26) }

Lines 21 through 26 set up the extension module, as described elsewhere.

Lines 8 through 19 implement the module function "method". Line 11 checks if a block is given; if not, an exception is raised on line 12. Line 14 sets up an rb_catch() named "catchpoint". The callback catch_cb() will be executed, and a new string of "val" will be passed into the callback. Lines 3 through 6 implement the callback; the value is yielded to the block initially passed into "method", and the block's result is returned, which becomes the return value of rb_catch(). Line 15 checks the return value from the block; if it is not a number, then line 16 does an rb_throw(). Note that by this point rb_catch() has already returned, so there is no enclosing "catchpoint" left to jump to; ruby reports the uncaught throw as an exception, which propagates to the caller of Example::method. If the value from the block is a number, then it is returned at line 18. Note that this particular sequence of calls is contrived, since the value returned from the block is just returned to the caller. Still, I think it is a good example of what can be done with rb_catch() and rb_throw().

Now let's look at some example ruby code that might utilize the above code:

require 'example'

# if the method were to be called like this, an exception would be
# raised since no block is given
# retval = Example::method

# if the method were to be called like this, an exception would be
# raised since the return value from the block is not a number
# retval = Example::method {|input|
#     "hello"
# }

# this works properly, since the return value is a number
retval = Example::method {|input|
    puts "Input is #{input}"
    6
}

[1] http://ruby-doc.org/docs/ProgrammingRuby/html/tut_exceptions.html

Tuesday, January 11, 2011

Writing Ruby Extensions in C - Part 5, Exceptions

This is the fifth in my series of posts about writing ruby extensions in C. The first post talked about the basic structure of a project, including how to set up building. The second post talked about generating documentation. The third post talked about initializing the module and setting up classes. The fourth post talked about types and return values. This post will focus on creating and handling exceptions.

Exceptions

When a method implementation in a ruby C extension encounters an error, the typical response is to throw an exception (a value indicating error can also be returned, but that is not idiomatic). The exception to be thrown can either be one of the built-in exception classes, or a custom defined exception class. The built-in exception classes are:
  • rb_eException
  • rb_eStandardError
  • rb_eSystemExit
  • rb_eInterrupt
  • rb_eSignal
  • rb_eFatal
  • rb_eArgError
  • rb_eEOFError
  • rb_eIndexError
  • rb_eStopIteration
  • rb_eRangeError
  • rb_eIOError
  • rb_eRuntimeError
  • rb_eSecurityError
  • rb_eSystemCallError
  • rb_eThreadError
  • rb_eTypeError
  • rb_eZeroDivError
  • rb_eNotImpError
  • rb_eNoMemError
  • rb_eNoMethodError
  • rb_eFloatDomainError
  • rb_eLocalJumpError
  • rb_eSysStackError
  • rb_eRegexpError
  • rb_eScriptError
  • rb_eNameError
  • rb_eSyntaxError
  • rb_eLoadError

Extension modules should usually define a custom exception class for errors related directly to the extension, and use one of the built-in exception classes for standard errors. The custom exception class should generally be a subclass of rb_eException or rb_eStandardError, though if the module has special needs any of the built-in exception classes can be used. Example:

 1) static VALUE m_example;
 2) static VALUE e_ExampleError;
 3)
 4) static VALUE exception_impl(VALUE klass, VALUE input) {
 5)     if (TYPE(input) != T_FIXNUM)
 6)         rb_raise(rb_eTypeError, "invalid type for input");
 7)
 8)     if (NUM2INT(input) == -1)
 9)         rb_raise(e_ExampleError, "input was -1");
10)     return Qnil;
11) }
12)
13) void Init_example() {
14)     m_example = rb_define_module("Example");
15)
16)     e_ExampleError = rb_define_class_under(m_example, "Error",
17)                                            rb_eStandardError);
18)
19)     rb_define_module_function(m_example, "exception_example",
20)                               exception_impl, 1);
21) }

Line 14 sets up the extension module. Line 16 creates the custom exception class as a subclass of rb_eStandardError. Now if the extension module runs into a situation that it can't accept, it can raise e_ExampleError, which ruby code sees as an exception of type Example::Error. Line 19 defines a module function that demonstrates the use of standard and custom exceptions. If Example::exception_example is called with an argument that is not a number, it raises the TypeError exception on line 6 (side-note: Check_Type should really be used to do this type of checking, but for example purposes we omit that). If Example::exception_example is called with a number argument that is -1, then the custom exception Example::Error is raised on line 9. Otherwise, the method succeeds and Qnil is returned.
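
From the ruby side, the two kinds of exception can be rescued separately. Since the compiled extension isn't available in a standalone snippet, the sketch below uses a pure-ruby stand-in for the Example module that is meant to behave the same way; the classify helper is hypothetical and only exists to show which rescue clause fires.

```ruby
# Pure-ruby stand-in for the C extension above, for illustration only.
module Example
  class Error < StandardError; end

  def self.exception_example(input)
    raise TypeError, "invalid type for input" unless input.is_a?(Integer)
    raise Error, "input was -1" if input == -1
    nil
  end
end

# Hypothetical caller showing which rescue clause handles which error.
def classify(input)
  Example.exception_example(input)
  :ok
rescue Example::Error
  :custom_error
rescue TypeError
  :type_error
end
```

With this, classify(5) returns :ok, classify(-1) returns :custom_error, and classify("x") returns :type_error.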

Raising exceptions

There are a few different ways to raise exceptions:
  • rb_raise(error_class, error_string, ...) - the main interface for raising exceptions. A new exception object of class type error_class is created and then raised, with the error message set to error_string (plus any printf-style arguments)
  • rb_fatal(error_string, ...) - a function for raising an exception of type rb_eFatal with the error message set to error_string (plus any printf-style arguments). After this call the entire ruby interpreter will exit, so extension modules typically should not use it
  • rb_bug(error_string, ...) - prints out the error string (plus any printf-style arguments) and then calls abort(). Since this call doesn't allocate an error object or do any of the other typical exception handling steps, it isn't technically a function to raise exceptions. This function should only be used when a bug in the interpreter is found, and as such, should not be used by extension modules
  • rb_sys_fail(error_string) - raises an exception based on errno. Ruby defines a separate class for each of the errno values (such as Errno::EAGAIN, Errno::EACCES, etc), and this function will raise an exception of the type that corresponds to the current errno
  • rb_notimplement() - raises an exception of rb_eNotImpError. This is used when a particular function is implemented on one platform, but possibly not on other platforms that ruby supports
  • rb_exc_new2(error_class, error_string) - allocate a new exception object of type error_class, and set the error message to error_string. Note that rb_exc_new2() does not accept printf-style options, so the string will have to be fully-formed before passing it to rb_exc_new2()
  • rb_exc_raise(error_object) - a low-level interface to raise exceptions that have been allocated by rb_exc_new2()
  • rb_exc_fatal(error_object) - a low-level interface to raise a fatal exception that has been allocated by rb_exc_new2(). After this call the entire ruby interpreter will exit, so extension modules typically should not use it

The example below shows the use of rb_raise() and rb_exc_raise(), which are the only two calls that extension modules should really use.

 1) static VALUE m_example;
 2) static VALUE e_ExampleError;
 3)
 4) static VALUE example_method(VALUE klass, VALUE input) {
 5)     VALUE exception;
 6)
 7)     if (TYPE(input) != T_FIXNUM)
 8)         rb_raise(rb_eTypeError, "invalid type for input");
 9)
10)     if (NUM2INT(input) < 0) {
11)         exception=rb_exc_new2(e_ExampleError, "input was < 0");
12)         rb_iv_set(exception, "@additional_info",
13)                   rb_str_new2("additional information"));
14)         rb_exc_raise(exception);
15)     }
16)
17)     return Qnil;
18) }
19)
20) void Init_example() {
21)     m_example = rb_define_module("Example");
22)
23)     e_ExampleError = rb_define_class_under(m_example, "Error",
24)                                            rb_eStandardError);
25)     rb_define_attr(e_ExampleError, "additional_info", 1, 0);
26)
27)     rb_define_module_function(m_example, "method",
28)                               example_method, 1);
29) }

Lines 20 through 29 show the module initialization. Since this is described in more detail elsewhere, I'll only point out line 25, where a custom attribute for the error class e_ExampleError is defined. When an error occurs in the extension module, additional error information can be placed into that attribute, and any caller can look inside of the error object to retrieve that additional information.

Lines 4 through 18 implement an example method that takes one and only one input parameter. Line 7 checks to see if the input value is a number, and if not an exception is raised with rb_raise() on line 8. Line 10 checks to see if the number is less than 0. If it is, then a new exception object of type e_ExampleError is allocated on line 11 with rb_exc_new2(), and the additional_info attribute of the object is set to "additional information" on line 12. As with most other things, the value that additional_info is set to can be any valid ruby object. Line 14 then raises the exception. This example shows very clearly the power of rb_exc_new2() and rb_exc_raise(), in that additional error information can be passed through to callers.
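
Here is how a caller might retrieve that additional information. As before, this is a sketch that uses a pure-ruby stand-in for the extension so that it can run on its own; the comments note which C call each line is meant to correspond to.

```ruby
# Pure-ruby stand-in for the C extension above, for illustration only.
module Example
  class Error < StandardError
    attr_reader :additional_info   # rb_define_attr(..., 1, 0): read-only
  end

  def self.method(input)
    raise TypeError, "invalid type for input" unless input.is_a?(Integer)
    if input < 0
      exception = Error.new("input was < 0")              # rb_exc_new2()
      exception.instance_variable_set(:@additional_info,  # rb_iv_set()
                                      "additional information")
      raise exception                                     # rb_exc_raise()
    end
    nil
  end
end

begin
  Example.method(-5)
rescue Example::Error => e
  message = e.message
  info = e.additional_info
end
```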

Handling exceptions

The other half of dealing with exceptions in an extension module is handling exceptions in C code when they are thrown from ruby functions. How is that done since C has no raise/rescue type mechanism? Through callbacks.

There are a few functions that can be used for handling exceptions:
  • rb_ensure(cb, cb_args, ensure, ensure_args) - Call function cb with cb_args. The callback must take in a single VALUE parameter and return VALUE. When cb() finishes, regardless of whether it completes successfully or raises an exception, call ensure with ensure_args. The ensure function must take in a single VALUE parameter and return VALUE
  • rb_protect(cb, cb_args, line_pointer) - Call cb with cb_args. The callback must take in a single VALUE parameter and return VALUE. If an exception is raised by cb(), store the exception handler point in line_pointer and return control. It is then the responsibility of the caller to call rb_jump_tag() to return to the exception point
  • rb_jump_tag(line) - do a longjmp to the line saved by rb_protect(). No code after this statement will be executed
  • rb_rescue(cb, cb_args, rescue, rescue_args) - Call function cb with cb_args. The callback must take in a single VALUE parameter and return VALUE. If cb() raises an exception that is a StandardError (or a subclass of it), rescue is called with rescue_args; other exceptions are not caught by rb_rescue(). The rescue callback should take in two VALUE parameters and return VALUE

Another example should make some of this clear:

 1) static VALUE cb(VALUE args) {
 2)     if (TYPE(args) != T_FIXNUM)
 3)         rb_raise(rb_eTypeError, "expected a number");
 4)     return Qnil;
 5) }
 6)
 7) static VALUE ensure(VALUE args) {
 8)     fprintf(stderr, "Ensure value is %s\n",
 9)               StringValueCStr(args));
10)     return Qnil;
11) }
12)
13) static VALUE rescue(VALUE args, VALUE exception_object) {
14)     fprintf(stderr, "Rescue args %s, object classname %s\n",
15)             StringValueCStr(args),
16)             rb_obj_classname(exception_object));
17)     return Qnil;
18) }
19)
20) VALUE res;
21) int exception;
22)
23) res = rb_ensure(cb, INT2NUM(0), ensure, rb_str_new2("data"));
24) res = rb_ensure(cb, rb_str_new2("bad"), ensure,
25)                 rb_str_new2("data"));
26)
27) res = rb_protect(cb, INT2NUM(0), &exception);
28) res = rb_protect(cb, rb_str_new2("bad"), &exception);
29) if (exception) {
30)     fprintf(stderr, "Failed cb\n");
31)     rb_jump_tag(exception);
32) }
33)
34) res = rb_rescue(cb, INT2NUM(0), rescue, rb_str_new2("data"));
35) res = rb_rescue(cb, rb_str_new2("bad"), rescue,
36)                 rb_str_new2("data"));

Line 23 kicks off the action with a call to rb_ensure(). In this first rb_ensure, we pass a FIXNUM object to cb(), which means that no exception is raised. Because of the rb_ensure(), however, the ensure() callback on lines 7 through 11 is called anyway and does some printing.

Line 24 passes a String object to cb(), which causes cb() to raise an exception. Because of the rb_ensure, the ensure() callback on lines 7 through 11 is called and does some printing. Importantly, after ensure() is called the exception is propagated, so in reality none of the code after line 25 will be executed (we'll ignore this fact for the sake of this example).

Line 27 uses rb_protect() to call the callback; since a FIXNUM object is passed, no exception is raised. Note that if the call that is being wrapped by rb_protect() does not raise an exception, the "exception" integer is always set to 0.

Line 28 uses rb_protect() to call cb() with a String object, which causes an exception to be raised. Because rb_protect() is being used, control will be returned to the calling code at line 29, and that code can then check for the exception. Since an exception was raised, the "exception" integer will have a non-0 number and the code can do whatever we need to clean up and then propagate the exception further with rb_jump_tag() on line 31.

Line 34 uses the rb_rescue() wrapper to call cb(). Since a FIXNUM object is passed to cb(), no exception is raised and no callbacks other than cb() are called.

Line 35 uses rb_rescue() to call cb() with a String object, which causes an exception to be raised and the rescue() callback to be executed. The rescue() callback on lines 13 through 18 takes two arguments: the VALUE initially passed into the rb_rescue() rescue_args, and the exception_object that caused the exception. Based on the exception_object, the rescue() callback can choose to handle this exception or not.
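
It can help to line these three wrappers up against the ruby constructs they stand in for. The following is a sketch in plain ruby, not a description of how the C functions are implemented; the log array and the pending variable are made up so the order of events can be inspected.

```ruby
log = []

# The callback: raises unless it is given an Integer.
cb = lambda do |arg|
  raise TypeError, "expected a number" unless arg.is_a?(Integer)
  nil
end

# Like rb_ensure(): the ensure code runs whether or not cb raises,
# and the exception keeps propagating afterwards.
begin
  begin
    cb.call("bad")
  ensure
    log << "ensure ran"
  end
rescue TypeError
  log << "exception propagated past ensure"
end

# Like rb_protect(): note that something went wrong, clean up, and
# only then decide whether to re-raise.
pending = nil
begin
  cb.call("bad")
rescue Exception => e
  pending = e
end
log << "cleanup" if pending
# raise pending   # the counterpart of rb_jump_tag()

# Like rb_rescue(): the rescue code runs only when cb raises.
begin
  cb.call(0)
rescue StandardError
  log << "rescue ran for 0"   # not reached, since nothing was raised
end

begin
  cb.call("bad")
rescue StandardError => e
  log << "rescue ran for #{e.class}"
end
```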

Example

Before finishing this post, I'll leave you with another example. When writing ruby code, the full begin..rescue block goes something like:

begin
  ...
rescue FooException => e
  ...
rescue
  ...
else
  ...
ensure
  ...
end

How would we implement this in C?

/* strcmp() below needs <string.h> */

static VALUE foo_exception_rescue(VALUE args) {
    fprintf(stderr, "foo_exception_rescue value is %s\n",
            StringValueCStr(args));
    return Qnil;
}

static VALUE other_exception_rescue(VALUE args) {
    fprintf(stderr, "other_exception_rescue value is %s\n",
            StringValueCStr(args));
    return Qnil;
}

static VALUE rescue(VALUE args, VALUE exception_object) {
    if (strcmp(rb_obj_classname(exception_object),
               "FooException") == 0)
        return foo_exception_rescue(args);
    else
        return other_exception_rescue(args);
}

/* stand-in for whatever code goes in the "begin" section */
static VALUE body(VALUE args) {
    if (TYPE(args) != T_FIXNUM)
        rb_raise(rb_eTypeError, "expected a number");
    return Qnil;
}

/* wrap body() with the rescue handling */
static VALUE cb(VALUE args) {
    return rb_rescue(body, args, rescue, rb_str_new2("data"));
}

static VALUE ensure(VALUE args) {
    fprintf(stderr, "Ensure args %s\n", StringValueCStr(args));
    return Qnil;
}

VALUE res;

res = rb_ensure(cb, INT2NUM(0), ensure, rb_str_new2("data"));

This example implements almost the entire ability of the ruby begin..rescue blocks. What it does not implement is the "else" clause; I have not yet come up with a good way to do that. If you think of something to make this example work for the "else" clause, please leave a comment.
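
For reference, this is the behavior that any C version of the "else" clause would need to reproduce: it runs only when the begin section raised nothing, and it runs before the ensure section. The snippet is plain ruby, not a C solution, and the trace helper is made up for the illustration.

```ruby
# Records the order in which the clauses of a begin block run.
def trace(fail_with = nil)
  steps = [:begin]
  begin
    raise fail_with, "boom" if fail_with
  rescue ArgumentError
    steps << :rescue_argument_error
  rescue StandardError
    steps << :rescue_other
  else
    steps << :else
  ensure
    steps << :ensure
  end
  steps
end
```

Calling trace with no argument goes through begin, else, and ensure; calling it with ArgumentError or RuntimeError goes through the matching rescue clause and then ensure, skipping else.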