Advanced topics in Ruby FFI
Short primer: what is FFI?
This article is not a tutorial on the basics of FFI. However, if you’ve never heard of FFI before, I’d like to whet your appetite before continuing.
FFI is an alternative to writing C to use functionality locked within native libraries from Ruby. It allows you to explain, with an intuitive Ruby DSL, which functions your native library contains, and how they should be used. Once the functionality of your native library is mapped out, you can call the functions directly from Ruby.
Furthermore, gems using ffi do not need to be compiled, and will run without modifications on CRuby, JRuby and Rubinius! In practice there could be small differences between the platforms in the behaviour and usage of FFI; if you find any, report them to the Ruby FFI issue tracker so they can be dealt with.
As for basic tutorials on using FFI, your best resource is the FFI wiki. It also has a list of projects using FFI, which is your second best resource for learning how to use FFI.
Aliasing with typedef
Consider the header for a function from libspotify:
SP_LIBEXPORT(sp_error) sp_session_player_prefetch(sp_session *session, sp_track *track);
Naively mapping this to FFI, we get:
enum :error, [ … ]
attach_function :sp_session_player_prefetch, [ :pointer, :pointer ], :error
Unfortunately, we lost two pieces of valuable information here. Both sp_session and sp_track are types that occur many times in the library. When we look at the Ruby implementation, there is no hint whatsoever of what the two pointers should point to.
It does not need to be like this. Using typedef we can name our parameters, and bring back the information that we lost in our translation.
typedef :pointer, :session
typedef :pointer, :track
enum :error, [ … ]
attach_function :sp_session_player_prefetch, [ :session, :track ], :error
The functionality of our method does not change, but the implementation is now slightly clearer and more maintainable.
Specializing in attach_function
C libraries do not follow Ruby naming conventions, which makes sense since they’re not written in Ruby. However, bindings written with Ruby FFI are in Ruby and will be called from Ruby, so they should have the look and feel of Ruby.
attach_function can be called in two ways:
attach_function :c_name, [ :params ], :returns, { :options => values } # 1
attach_function :ruby_name, :c_name, [ :params ], :returns, { :options => values } # 2
Using the first form will create your Ruby methods with the same name as your native library’s functions. Using the second form allows you to rename the bound method, giving it a more expected final name.
Native libraries you bind with FFI will have naming conventions of their own. For example, OpenAL prefixes its functions with al or alc, and uses camelCase. libspotify prefixes its functions with sp_. Apart from removing the prefix, and snake_casing the function name, we want the Ruby method to be named similarly. We could repeat ourselves for every method:
attach_function :open_device, :alcOpenDevice, [ :string ], :device
attach_function :close_device, :alcCloseDevice, [ :device ], :bool
But remember! When you use FFI, you extend FFI::Library inside a module of your own. This also means you can override the attach_function call, without your specialized version leaking to the outside world. By overriding attach_function we can avoid unnecessary noise in our FFI bindings.
def self.attach_function(c_name, args, returns)
  # strip the al/alc prefix, then snake_case what is left
  ruby_name = c_name.to_s.sub(/\Aalc?/, "").gsub(/(?<!\A)\p{Lu}/u, '_\0').downcase
  super(ruby_name, c_name, args, returns)
end
attach_function :alcOpenDevice, [ :string ], :device # gets bound to open_device
attach_function :alcCloseDevice, [ :device ], :bool # gets bound to close_device
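The renaming rule is easy to check in isolation before it is wired into attach_function. The sketch below pulls the conversion out into a helper I made up for illustration, ruby_name_for (it is not part of FFI), and runs it against a few OpenAL-style names. It needs neither the ffi gem nor a native library.

```ruby
# Hypothetical helper, not part of FFI: derive a Ruby-style method name
# from an OpenAL-style C function name, using the same regexes as above.
def ruby_name_for(c_name)
  c_name.to_s
        .sub(/\Aalc?/, "")               # drop the leading "al" or "alc"
        .gsub(/(?<!\A)\p{Lu}/u, '_\0')   # underscore before each inner capital
        .downcase
        .to_sym
end

ruby_name_for(:alcOpenDevice)  # => :open_device
ruby_name_for(:alcCloseDevice) # => :close_device
ruby_name_for(:alGenSources)   # => :gen_sources
```

Keeping the rule in one small function like this also makes it cheap to unit test when a library has exceptions to its own naming convention.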
This does not end here. After calling super inside attach_function you have the option of further specializing the newly bound method. You could implement automatic error checking for every API call, or alter the parameters based on native library conventions, and more. Just remember that the added complexity should be worth the savings.
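As a rough sketch of the automatic error checking idea: after super has bound the method, grab it and redefine it with a wrapper. To keep the example runnable without OpenAL or the ffi gem, FakeLibrary below is a stand-in I made up that only mimics the four-argument form of attach_function. OpenAL::Error and the rule that a false return from a :bool function means failure are also assumptions for illustration, not OpenAL's real error convention.

```ruby
# Stand-in for FFI::Library (an assumption so the sketch runs anywhere).
# It "binds" a function by defining a singleton method backed by a lambda.
module FakeLibrary
  FAKE_NATIVE = {
    alcCloseDevice: ->(device) { device == :valid_device } # true on success
  }

  def attach_function(ruby_name, c_name, _args, _returns)
    native = FAKE_NATIVE.fetch(c_name)
    define_singleton_method(ruby_name) { |*args| native.call(*args) }
  end
end

module OpenAL
  extend FakeLibrary # a real binding would `extend FFI::Library` instead

  class Error < StandardError; end

  def self.attach_function(c_name, args, returns)
    ruby_name = c_name.to_s.sub(/\Aalc?/, "").gsub(/(?<!\A)\p{Lu}/u, '_\0').downcase
    super(ruby_name, c_name, args, returns)

    # Further specialization: wrap the freshly bound method so that a
    # false result from a :bool function raises instead of being returned.
    return unless returns == :bool
    bound = method(ruby_name)
    define_singleton_method(ruby_name) do |*call_args|
      bound.call(*call_args) or raise Error, "#{c_name} failed"
    end
  end

  attach_function :alcCloseDevice, [:device], :bool
end

OpenAL.close_device(:valid_device) # => true
begin
  OpenAL.close_device(:bogus)
rescue OpenAL::Error => e
  e.message # => "alcCloseDevice failed"
end
```

The wrapper captures the originally bound method before redefining it, so the native call itself is untouched and only the handling of its result changes.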
FFI::Structs as parameters
Structs in FFI can be used as parameters, and are by default equivalent to specifying a type of :pointer.
class SomeStruct < FFI::Struct
end
attach_function :some_function, [ SomeStruct ], :void
# equivalent to:
attach_function :some_function, [ :pointer ], :void
callback :some_callback, [ SomeStruct ], :void
# equivalent to:
callback :some_callback, [ :pointer ], :void
I’d like to bring forth an alternative for your referenced struct parameters, namely FFI::Struct.by_ref. It behaves very similarly to the above, with the important difference that it has type safety built in!
attach_function :some_function, [ SomeStruct ], :void
some_function FFI::Pointer.new(0xADDE55) # this is possibly unsafe, but allowed
attach_function :some_function, [ SomeStruct.by_ref ], :void
some_function FFI::Pointer.new(0xADDE55) # BOOM, wrong argument type FFI::Pointer (expected SomeStruct) (TypeError)
some_function SomeOtherStruct.new # BOOM, wrong argument type SomeOtherStruct (expected SomeStruct) (TypeError)
Furthermore, if you use FFI::Struct.by_ref for your callback parameters or function return values, FFI will automatically cast the pointer to an instance of your struct for you!
callback :some_callback, [ SomeStruct.by_ref ], :void
attach_function :some_function, [ :some_callback ], :void
some_function(proc do |struct|
  # struct is an instance of SomeStruct, instead of an FFI::Pointer
end)
attach_function :some_other_function, [ ], SomeStruct.by_ref
some_other_function.is_a?(SomeStruct) # true, instead of being an FFI::Pointer
Keep in mind that on JRuby 1.7.3, a FFI::Struct.by_ref type accepts any descendant of FFI::Struct, and not only instances of YourStruct. See https://github.com/jruby/jruby/issues/612 for updates.
Piggy-back on Ruby’s garbage collection with regular FFI::Structs
Let’s take another look at the above code with SomeStruct as the return value:
attach_function :some_other_function, [ ], SomeStruct.by_ref
In some libraries, the memory for the pointer to SomeStruct returned from some_other_function is expected to be managed by us. This means we’ll most likely need to call some function free_some_struct to specifically free the memory used by SomeStruct when the object is no longer needed. Here’s how it would be used:
begin
some_struct = some_other_function
# do something with some_struct
ensure
free_some_struct(some_struct)
end
Unfortunately, if we pass some_struct somewhere else beyond our control, we must be able to trust that the new guardian of some_struct calls free_some_struct in the future, or we will have a memory leak! Oh no!
Fear not, for FFI::Struct has a trick up its sleeve for us. Have a look at this.
class SomeStruct < FFI::Struct
  def self.release(pointer)
    MyFFIBinding.free_some_struct(pointer) unless pointer.null?
  end
end
attach_function :some_other_function, [], SomeStruct.auto_ptr
With the above binding code, some_other_function still returns an instance of SomeStruct. However, when our object is garbage collected FFI will call upon SomeStruct.release to free the native memory used by our struct. We can safely pass our instance of SomeStruct around everywhere and to everyone, and rest assured that when the object is no longer referenced and Ruby garbage collects it, FFI will call upon us to free the underlying memory!
Related to this, you should look into FFI::ManagedStruct and FFI::AutoPointer if you have not already.
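To get a feel for the general mechanism, here is a plain-Ruby analogy built on ObjectSpace.define_finalizer. It is not FFI's implementation, which may well differ. ManagedHandle and its releaser argument are names I made up, and the integer address merely stands in for native memory.

```ruby
# Plain-Ruby analogy (not FFI's code): tie a release callback to an
# object's lifetime so that cleanup happens when the object is collected.
class ManagedHandle
  def initialize(address, releaser)
    @address = address
    # The finalizer must not capture self, or the object could never be
    # collected. Building it in a class method keeps self out of the closure.
    @finalizer = self.class.finalizer_for(address, releaser)
    ObjectSpace.define_finalizer(self, @finalizer)
  end

  def self.finalizer_for(address, releaser)
    released = false
    proc do
      next if released # already freed, nothing left to do
      released = true
      releaser.call(address)
    end
  end

  # Release early and deterministically; the finalizer then becomes a no-op.
  def free
    @finalizer.call
  end
end

freed = []
handle = ManagedHandle.new(0x1234, ->(address) { freed << address })
handle.free
handle.free # second call does nothing
freed       # => [4660]
```

Garbage collection timing is not deterministic, so the usage above frees the handle by hand. The released flag is what makes a manual free and a later finalizer run safe together.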
Writing our own data types
class Device < FFI::Pointer
end
attach_function :some_function, [ ], Device
Subclassing FFI::Pointer is a convenient way of making pointers from native libraries less generic. Using the above code, when we call some_function we’ll receive an instance of Device, instead of the FFI::Pointer we would get if we specified the return value as a :pointer.
If objects in our native library are not pointers we can’t do what we’ve done above. For example, in OpenAL there’s a concept of audio sources, but they are represented by an integer, and not a pointer. Passing arbitrary integers around is not a nice practice, so what you could do is wrap the source in an object for further use.
class Source
  def initialize(id)
    @id = id
  end
  attr_reader :id
end
typedef :int, :source
attach_function :create_source, [], :source
attach_function :destroy_source, [ :source ], :void
# Usage
source = Source.new(create_source)
destroy_source(source.id)
While the code above is not bad, we could do much better by utilizing something in FFI called DataConverters. DataConverters are a way of writing code that tells FFI how to convert a native value to a Ruby value and back. By doing this, we could have FFI automatically wrap source above in an object, making it completely transparent to the developer using the library.
class Source
  extend FFI::DataConverter
  native_type FFI::Type::INT
  class << self
    # `value` is a ruby object that we want to convert to a native object
    # this method should return a type of the native_type we specified above
    def to_native(value, context)
      if value
        value.id # in our case, we convert a Source to an int
      else
        -1 # if value is nil, we represent a `no source` value as -1
      end
    end
    # `value` is a type of the native_type specified above, we should return
    # a ruby object we wish to pass around in our application
    def from_native(value, context)
      new(value)
    end
    # this is needed when FFI needs to figure out the native size of your native type
    # for example, if you want to generate a pointer to hold something of this type
    # e.g. FFI::MemoryPointer.new(Source) # <= requires size to be defined and correct
    def size
      FFI.type_size(FFI::Type::INT)
    end
    # this method is a hint to FFI that the object returned from to_native needs to
    # be kept alive for the native value in the object to remain valid, so that if we
    # return an object that automatically frees itself on garbage collection, ffi will
    # prevent it from being garbage collected while it’s still needed, mainly useful
    # for to_native methods that allocate memory
    def reference_required?
      false
    end
  end
  def initialize(id)
    @id = id
  end
  attr_reader :id
end
attach_function :create_source, [], Source
attach_function :destroy_source, [ Source ], :void
source = create_source # an instance of Source, created through Source.from_native!
source.id # => the native value
destroy_source(source) # converts source to native value through Source.to_native!
You could do this to all types, even pointers. Even more, you are not constrained to only doing type conversion in to_native and from_native; you could perform validation, making sure your values have the correct type, length, or whatever you may need!
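Here is a minimal sketch of validation inside a converter. Percent is a hypothetical type for a value that some native function is assumed to take as an int between 0 and 100. In a real binding the module would also `extend FFI::DataConverter` and declare `native_type FFI::Type::INT`; both are left out so the sketch runs without the ffi gem, and the converter methods are simply called by hand with a nil context.

```ruby
# Hypothetical converter: passes Integers through unchanged, but refuses
# anything that is not an Integer within 0..100 before it can reach C.
module Percent
  RANGE = 0..100

  # Ruby => native: validate, then hand over the raw integer.
  def self.to_native(value, _context)
    unless value.is_a?(Integer) && RANGE.cover?(value)
      raise ArgumentError, "expected an Integer in #{RANGE}, was #{value.inspect}"
    end
    value
  end

  # native => Ruby: nothing to convert in this sketch.
  def self.from_native(value, _context)
    value
  end
end

Percent.to_native(80, nil) # => 80
begin
  Percent.to_native(101, nil)
rescue ArgumentError => e
  e.message # => "expected an Integer in 0..100, was 101"
end
```

Raising in to_native means the bad value never crosses into native code, which is the cheapest place to catch it.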
If you’d like some more examples of custom types, I’ve written down a few in this gist: https://gist.github.com/varvet-dev/41c27fdb0a007ad4cac6
Implementing type safety
Do you remember what I mentioned earlier about FFI::Struct.by_ref automatically giving us some kind of type safety, preventing us from shenanigans where somebody sends invalid values to native functions? We can implement the very same kind of type safety ourselves for all types, by overriding to_native in our DataConverters.
# A to_native DataConverter method that raises an error if the value is not of the same type.
module TypeSafety
  def to_native(value, ctx)
    if value.kind_of?(self)
      super
    else
      raise TypeError, "expected a kind of #{name}, was #{value.class}"
    end
  end
end
We could now mix the above module into our own custom data types from the previous chapters.
# Even if we have another object that happens to look like a Source from our previous chapter,
# by having an #id method, we now won’t allow sending it down to C unless it’s an instance of
# Source or any of its subclasses. Source defines to_native directly on itself, and such a method
# is found before one from an extended module, so we prepend TypeSafety to make it run first.
Source.singleton_class.prepend(TypeSafety)
# Remember Device from earlier? It’s a descendant of FFI::Pointer. Now all parameters of type Device
# will only accept instances of Device or any of its subclasses. All else results in a type error.
Device.extend(TypeSafety)
Duck-typing is very useful in Ruby, where raising an exception is the worst thing that can happen when we try to call a method on an object that does not respond to such a method. However, when interfacing with C libraries, passing in the wrong type will segfault your application with little information on what went wrong. Using this TypeSafety module, we can catch errors early, with a useful error message as a result.
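Method lookup order matters when mixing TypeSafety in, and it can be checked without any native library. The sketch below is plain Ruby: PassThroughConverter is a made-up stand-in for whatever supplies an inherited to_native in a real binding, and Handle and Id are hypothetical types. It shows that extend is enough when to_native is inherited, while a class that defines to_native on itself needs the module prepended to its singleton class, because a class's own singleton methods are found before methods from extended modules.

```ruby
# The TypeSafety module from above, unchanged.
module TypeSafety
  def to_native(value, ctx)
    if value.kind_of?(self)
      super
    else
      raise TypeError, "expected a kind of #{name}, was #{value.class}"
    end
  end
end

# Stand-in converter base (assumption): a default to_native that passes
# the value through untouched.
module PassThroughConverter
  def to_native(value, _ctx)
    value
  end
end

# Case 1: to_native is inherited, so `extend` places TypeSafety in front of it.
class Handle
  extend PassThroughConverter
  extend TypeSafety
end

# Case 2: to_native is defined on the class itself, so TypeSafety has to be
# prepended to the singleton class in order to run first.
class Id
  def self.to_native(value, _ctx)
    value.id
  end
  singleton_class.prepend(TypeSafety)

  attr_reader :id

  def initialize(id)
    @id = id
  end
end

Id.to_native(Id.new(7), nil) # => 7
begin
  Handle.to_native("not a handle", nil)
rescue TypeError => e
  e.message # => "expected a kind of Handle, was String"
end
```

Had Id used extend instead of prepend, its own to_native would have been called directly and the check silently skipped.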
Final words
Personally I really like using FFI. It’s a low-pain way of writing gems that use native libraries, and if you set your types up properly, not having a compiler that type-checks your code won’t be so bad. If you can work with native libraries through the means of FFI instead of writing a C extension, by all means do. Even if you intend to write a C extension, using FFI can be a quick way of exploring a native API without wiring up C functions and data structures together with the Ruby C API.
Something that FFI excels at, in comparison to writing a C extension, is handling asynchronous callbacks from non-Ruby threads in C. FFI can save you a lot of headache in that area.
Thank you.
References
- FFI on rubygems: https://rubygems.org/gems/ffi
- FFI on GitHub: https://github.com/ffi/ffi
- FFI wiki on Why use FFI?: https://github.com/ffi/ffi/wiki/Why-use-FFI
- spotify gem, from where I learnt most of these things as I went.
- Gist of custom types: https://gist.github.com/elabs-dev/41c27fdb0a007ad4cac6