Making Rubinius .rbc Files Disappear

Rubinius is rather unusual as a Ruby implementation. It both compiles Ruby source code to bytecode and saves the compiled code to a cache so it does not need to recompile unless the source code changes. This can be great for utilities that are run often from the command line (including IRB). Rubinius merely reloads the cached file and runs the bytecode directly rather than needing to parse and compile the file. Sounds like a real win!

Unfortunately, it is not that simple. We need some place to store that cache and this is where the thorns on that pretty rose start poking us in the thumbs. The solution we have been using since forever is to store the cached file alongside the source file in the same directory, like so:

$ echo 'puts "hello!"' > hello.rb
$ ls hello.*
hello.rb
$ rbx hello.rb
hello!
$ ls hello.*
hello.rb	hello.rbc

That doesn’t look too crazy, but it can get more complicated:

$ mv hello.rb hello
$ rbx hello
$ ls hello.*
hello.compiled.rbc	hello.rbc

Whoa, what is hello.compiled.rbc? Since hello did not have an extension, we add that longer compiled.rbc to make it clear which file the cache is for. Also, note that we have that hello.rbc hanging about even though the original hello.rb is gone.
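The naming rule just described can be sketched in a few lines of Ruby. This is only an illustration inferred from the two examples above; the method name cache_name is hypothetical and is not taken from the Rubinius source.

```ruby
# Hypothetical helper that mirrors the naming seen in the examples above.
# It is a sketch, not the code Rubinius actually uses.
def cache_name(source_path)
  if File.extname(source_path) == ".rb"
    source_path + "c"              # hello.rb -> hello.rbc
  else
    source_path + ".compiled.rbc"  # hello    -> hello.compiled.rbc
  end
end
```

Calling cache_name("hello.rb") gives "hello.rbc", while cache_name("hello") gives "hello.compiled.rbc", matching the directory listings above.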

To summarize the issues with our caching scheme:

  1. It requires an additional file for every Ruby source file.
  2. It requires some potentially complicated naming scheme to associate the cache file with the source and not clash with other names.
  3. Removing or renaming the Ruby source file leaves the cache file behind.
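Issue 3 can at least be detected with a small script. The sketch below assumes only the two naming patterns shown earlier (.rbc and .compiled.rbc); the method names possible_sources and orphaned_caches are made up for this example.

```ruby
# Sketch: list cache files whose source file no longer exists.
# ASSUMPTION: only the two naming patterns shown above are in use.
def possible_sources(rbc_path)
  sources = [rbc_path.chomp("c")]  # hello.rbc -> hello.rb
  if rbc_path.end_with?(".compiled.rbc")
    sources << rbc_path.chomp(".compiled.rbc")  # hello.compiled.rbc -> hello
  end
  sources
end

def orphaned_caches(dir)
  # Keep only the cache files for which no candidate source file exists.
  Dir.glob(File.join(dir, "**", "*.rbc")).reject do |rbc_path|
    possible_sources(rbc_path).any? { |source| File.file?(source) }
  end
end
```

Something like orphaned_caches(".").each { |path| puts path } would print the leftovers under the current directory. Note that a name such as foo.compiled.rbc has two candidate sources, which is exactly the ambiguity described in issue 2.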

Again, the advantage of the cache file is that you do not have to wait for Rubinius to recompile the file if you have not changed the source. Let’s see if we can get all the advantages with none of the disadvantages. The old saying about having your cake and eating it, too, comes to mind, so we may not be successful, but it is worth a shot.

First, let’s take a step back. This issue is not unique to Rubinius. Python has .pyc and .pyo files. Java has .class files. C/C++ has .o files. Lots of things need a place to store a compiled or cached representation of some data. Every SCM worth mentioning has some mechanism to ignore the files you don’t want to track. The same is generally true of editors. So in some sense, this is a solved problem. However, we have always received complaints about the .rbc files, so we thought we would try to make other, hopefully better, solutions available.

Solution 1: No Cache

One simple solution is just to never ever ever create the compiled cache files in any form anywhere. We have an option for that:

$ ls hello.*
hello.rb
$ rbx -Xcompiler.no_rbc hello.rb
hello!
$ ls hello.*
hello.rb

Win! Not one lousy .rbc file in sight. Although, that’s quite the option to type. Never fear, we have a solution to that below.

Here is our scorecard for solution 1:

Use Case: Use when you never want any compiler cache files created. For example, on a server where startup time is not really a concern.

Pros: No .rbc files at all.

Cons: Startup will be slightly slower depending on what Ruby code you are running. It will be more noticeable in a Rails application, for example. However, the Rubinius bytecode compiler is several times faster than it was a couple years ago so it may not be an issue for you.

Solution 2: Cache Database

What if we could put all the compilation data in a single cache location, something like a database? We have an option for that.

This option is a little more complex, so let’s take it in two steps.

$ ls hello.*
hello.rb
$ rbx -Xrbc.db hello.rb
hello!
$ ls hello.*
hello.rb
$ ls -R .rbx
60

.rbx/60:
60c091c3ed34c1b93ffbb33d82d810772902d3f9

Success! No .rbc files here. But what’s with all the numbers in the .rbx directory and how did that directory get there?

The -Xrbc.db option without any argument will store the compilation cache in the .rbx directory in the current working directory. The cache files themselves are split into subdirectories to avoid creating too many entries for the file system to handle in one directory.
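A layout like the one in the listing can be sketched as follows. The 40-character file name looks like a SHA-1 hex digest and the subdirectory matches its first two digits, but what Rubinius actually hashes is not stated here, so hashing the absolute source path is purely an assumption for illustration. The name db_cache_path is hypothetical.

```ruby
require "digest/sha1"

# Sketch of a sharded cache layout like the one listed above.
# ASSUMPTION: the key is a SHA-1 digest of the absolute source path.
# That is a guess made for this example, not documented behavior.
def db_cache_path(source_path, root = ".rbx")
  key = Digest::SHA1.hexdigest(File.expand_path(source_path))
  File.join(root, key[0, 2], key)  # the first two hex digits pick the subdirectory
end
```

Splitting on the leading digits keeps any single directory from collecting every cache entry, which is the point of the subdirectories described above.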

What if you have a special location where you would prefer all compilation cache files be saved? No problem, just give -Xrbc.db a path as follows:

$ ls hello.*
hello.rb
$ rbx -Xrbc.db=$HOME/.my_special_place hello.rb 
hello!
$ ls hello.*
hello.rb
$ ls -R $HOME/.my_special_place
60

/Users/brian/.my_special_place/60:
60c091c3ed34c1b93ffbb33d82d810772902d3f9

If you primarily work with projects, putting the .rbx directory in the current working directory may be the best solution because it keeps the compilation cache with the project. It is easy to add an SCM ignore for the directory and easy to remove the directory to clear the cache (e.g. in a clean task).

However, if you are frequently running scripts in many directories, you may not want to litter .rbx directories everywhere. In this case, putting the directory in your $HOME dir or /tmp may be preferable. Additionally, /tmp may be cleared on every reboot so you will not accumulate many stale cache files.

Note that, right now, Rubinius does not clear the cache directory. It will happily continue adding to it indefinitely. However, this may not be an issue unless you are cycling through a bunch of Ruby files, for example, working on a number of Ruby projects in series. In that case, using a per-project (per current working directory) cache is probably the best option.
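Until Rubinius clears the cache itself, a small script can do the pruning. The following is a minimal sketch, assuming the cache root holds one level of subdirectories containing plain files, as in the listings above. prune_cache is a hypothetical name, and the age test uses the file modification time, which reflects when an entry was written rather than when it was last used.

```ruby
# Sketch: delete cache entries older than a given number of days.
# ASSUMPTION: entries live at <root>/<subdirectory>/<file>.
def prune_cache(root, max_age_days)
  cutoff = Time.now - (max_age_days * 24 * 60 * 60)
  stale = Dir.glob(File.join(root, "*", "*")).select do |entry|
    File.file?(entry) && File.mtime(entry) < cutoff
  end
  stale.each { |entry| File.delete(entry) }
  stale  # return the paths that were removed
end
```

For example, prune_cache(".rbx", 30) would remove entries written more than 30 days ago from a per-project cache.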

Here is how solution 2 shakes out:

Use Case: You want to combine all compilation cache files in one location.

Pros: No .rbc files mixed in with the rest of your files.

Cons: You may still need a per-project or per-working-directory cache directory. However, you can easily specify where to put that directory.

Using RBXOPT for Options

As mentioned above, the -X options can get a little long and you certainly don’t want to retype them constantly. We have added support for the RBXOPT environment variable, which is an analog of the RUBYOPT environment variable that we already support.

Use RBXOPT to specify -X options that Rubinius should use. For example:

export RBXOPT=-Xrbc.db=/path/to/dir

You can check out all the -X options with rbx -Xconfig.print or rbx -Xconfig.print=2 for more verbose output. If you want to use multiple -X options in RBXOPT, use quotes and separate the options with a space:

export RBXOPT='-Xrbc.db -Xagent.start'
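To make the format concrete, here is a rough Ruby sketch of how a space-separated RBXOPT value breaks down into individual -X settings. It only illustrates the syntax shown above; parse_rbxopt is a hypothetical name and this is not how Rubinius itself reads the variable.

```ruby
# Sketch: split an RBXOPT-style string into a hash of settings.
# Options without "=value" are recorded as true.
def parse_rbxopt(value)
  value.to_s.split.each_with_object({}) do |option, config|
    # Skip anything that is not of the form -Xname or -Xname=value.
    next unless option.start_with?("-X") && option.length > 2
    key, setting = option[2..-1].split("=", 2)
    config[key] = setting.nil? ? true : setting
  end
end
```

With the second export above, parse_rbxopt(ENV["RBXOPT"]) would yield a hash with the keys "rbc.db" and "agent.start", both set to true.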

Conclusion

Rubinius saves a compilation cache for compiled Ruby code to avoid wasting time and resources recompiling source that has not changed. However, we need some place to store the cache. Rubinius provides options for omitting the cache altogether or for storing it in a directory of your choosing. Note that the format of the compilation cache is an implementation detail and we reserve the right to change it at any time, so please don’t rely on it being in any particular format.

We have not turned on -Xrbc.db by default yet because we don’t know what a good default is. So give us feedback on your use cases and what you would find most useful.

Finally, whenever we discuss the compilation cache we are inevitably asked if you can run directly from the cache and not use the Ruby source at all after it has been compiled. The short answer is “Yes”, the long answer is “It depends”. I will be writing a post exploring this question in detail shortly. For now, get out there and write more Ruby code!

Why Use Rubinius

Why should I use Rubinius? We have been asked that question many, many times over the past four years. It is a great question. It is an important question. It’s a hard question. I’m not holding out on you. I want to give you an answer that sates your curiosity, helps you make informed decisions, and empowers you to speak eloquently when you are inevitably asked, “Why do you use Rubinius?”

The trouble is, there are many different situations in which people use Ruby and there is simply no answer, however comprehensive, that really speaks to everyone’s concerns. So rather than boring you at length, I thought a “Choose Your Own Adventure” style would be a better approach.

From the list below, select the persona that best describes you. Don’t worry, if the one you select doesn’t sound right, you can easily backtrack here. Read as many as interest you. After all, none of us fit easily into any one box. When you are done exploring all the fascinating reasons to use Rubinius, let’s meet up at the Conclusion for some parting words.

Enjoy!

Choose Your Persona

Rails or Ruby Newbie

You are pretty new to programming and after hearing about Ruby on Rails you watched a screencast and made a website. You are curious and enthusiastic.

You are the empty teacup of the Zen proverb. You are a fresh-faced flower glistening with the morning dew. The sun smiles on you and you smile back. You seem to like this Ruby language that makes programmers happy and you’ve come to lend your cheery spirit…

Welcome!

So, you have heard of this thing called Rubinius or rbx or whatever and some folks you respect or admire seem to like it and naturally you want to know what the big deal is and you’re like, “Yo, why would I use Rubinius?”.

Cool.

Well, you should use Rubinius because I said so. Try your code on it. Tell us what worked for you. Tell us if something didn’t work by opening an issue. Set your imagination loose and tell us what tool you would use if you could.

Spend some time reading the Rubinius source code. Start at the kernel/ directory. It’s full of Ruby code! As you read through how Ruby is implemented, how it actually works, it will give you a level of understanding of your code that many programmers don’t have in any language.

Most of all, hang on to your curiosity and enthusiasm. Those were vital to the creation of the Rubinius project in the beginning and have sustained us through many challenges. We can make our Ruby experience better, freeing us from the shackles of other languages and foreign libraries. We can have fast and reliable web servers, games, editors, websites and applications written in Ruby. We can have first class tools written for and with Ruby. The world can be rosy red without our glasses.

Back to personas

The Creative

Ruby is groovy. No, not that Groovy, eww, no. I mean:

groovy |ˈgroōvē| adj.

  • fashionable and exciting : sporting a groovy new haircut
  • enjoyable and excellent : he played all the remarkably groovy guitar parts himself

(Apple's dashboard dictionary widget.)

Ruby respects creativity. It has an aesthetic. You don’t just write Ruby code, you write beautiful Ruby code. It would be unthinkable to do otherwise. Sure, there is more than one way to do many things. This is not some sterile laboratory. We are not automatons; we are people. Of course, being utilitarian is not bad. But other languages have that angle pretty well covered. There is probably only one right way to implement Python.

Rubinius has an aesthetic, too: excellence, utility, simplicity, beauty, joy. Mostly in that order. Useful code that isn’t of very good quality is a drag. It slows you down. It gives you a headache. It drives you away. We strive to keep it out of Rubinius. On the other hand, we are not just writing sonnets here. This is Serious Business™. We have some hard-core problems to solve. So we strive for excellent, useful, beautiful code that is a joy to work with.

Of course, this is an ongoing process. It is a journey, not a destination. There are areas of Rubinius that could use a thorough cleaning or a new perspective on making the implementation of this beautiful object-oriented language more beautiful and object-oriented.

We welcome your artistic perspective. Help us improve the dialog between Rubinius and the person using it. The command line doesn’t have to be a desolate place of obscure, condescending error messages. Web interfaces to the diagnostic tools deserve a good dose of user-experience and interaction design. You know that feeling you get when looking at an Enterprise web application? That weird plastic-masquerading-as-quality-material feeling? The too much 1996-Enterprise-faux-rounded-corner-wannabe-2006-hip gloss? Gives me the willies whenever I have to use an app like that. Yeah, we don’t want that.

We want to create tools that are powerful, graceful, easy to use, and beautiful to look at. Beautiful tools are easier to use. (Yehuda Katz provided a couple links related to this: The Impact of Design and Aesthetics on Usability, Credibility, and Learning in an Online Environment and In Defense of Eye Candy. If you know of other research, leave us a comment.) So if you have a creative bent but enjoy writing code also, try out Rubinius and let us know where it could use some polish.

Back to personas

Experienced programmer

You live by the saying “Time is money.” You have applications to deliver and you choose the best tool for the job. You are professional, conscientious, duly cautious, and not inclined to episodes of emotional exuberance about the latest fad. You accept compromises. There are always trade-offs. The correct approach is cost-benefit analysis. The numbers tell the story and level-headed decision making follows the numbers.

You have heard about Rubinius and you are curious whether it may be appropriate for your current project. As usual, rather than speculating or paying too much heed to the buzz, you look into it yourself. After some investigation, you discover that:

  1. Much of Rubinius is implemented in Ruby itself. This may be a big help when tracking down troublesome bugs.
  2. Rubinius has a very fast bytecode virtual machine, as well as a modern generational garbage collector so memory profiles should be more predictable and consistent in deployed applications.
  3. It has a profile-driven JIT compiler that uses type-feedback to aggressively inline methods resulting in significant performance improvements.
  4. It has a built-in debugger and precise method profiler, both of which are fast due to being well integrated.
  5. It has a built-in API for monitoring a VM out-of-process, even on a remote machine. We are building a variety of diagnostic tools atop this API.

Of course, even if the technology in Rubinius sounds terrific in theory, how suitable is Rubinius for your application? How does it perform under your specific constraints? Again, you do some investigating. You have a solid test suite for your application, so you start by running that. If you hit any problems, please open an issue to let us know.

If everything goes well with the tests, you start running some of the benchmarks that you have accumulated while doing performance tuning. Of course, no sensible person asks for benchmark results from other people’s code. That defies logic. It’s like asking if your program will run because your Aunt Mabeline likes decaf coffee. It’s contrary to the very point of benchmarking, where you are trying to correlate two values that are connected.
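If you do not have benchmarks on hand yet, Ruby’s standard benchmark library is a minimal way to start timing your own code. The sketch below is only an illustration: workload is a placeholder to be replaced with code from your application, and best_of is a hypothetical helper name.

```ruby
require "benchmark"

# Placeholder workload for illustration only; replace it with code from
# your own application.
def workload
  (1..50_000).map { |n| n.to_s.reverse }.uniq.size
end

# Run the given block several times and return the fastest run in seconds.
# Repeating the run gives a JIT compiler some time to warm up.
def best_of(runs)
  Array.new(runs) { Benchmark.realtime { yield } }.min
end

seconds = best_of(5) { workload }
puts format("fastest run: %.4f s", seconds)
```

Running the same file with each Ruby implementation you are considering, for example with rbx, gives numbers that describe your code rather than someone else’s.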

Again, if you note any significant issues, please let us know. Sometimes Rubinius exposes issues in existing code. Performance characteristics of real applications are vital to making Rubinius faster. Also, if you have suggestions for tools you would like to use, tell us. If you just want to chat about the technology, that’s fine, too. We’re hanging out in the #rubinius channel on freenode.net.

Back to personas

Seasoned programmer

Well, I am being kind by saying seasoned. You know when you look in the mirror that jaded and cynical are much more apt. You’ve seen it all and it has worn you down. You’ve been fighting the good fight, carefully guarding that last flicker of optimism that burns in the secret place deep in your heart. You’ve programmed Java/.NET/C++ professionally. You’ve even sucked it up and written some PHP and Python when asked; you are a professional, they ask and you deliver. You’ve seen attacked servers on fire off the shoulder of Rackspace…

Rubinius has a lot to offer you. Remember that little flicker of optimism? It is only the idealists that get ground down by the complete indifference to pursuit of an ideal in so much of the world. Deep down, you are an idealist and you will find plenty to refresh you here.

Rubinius aims to be the best possible implementation of Ruby by putting Ruby itself front and center. We are using modern technology and always improving. We change when there is a better way to do things. We judiciously rewrite and are not too attached to any code or algorithm. The legacy Enterprise isn’t on the steering committee. Our work will be done when you can use Ruby, just Ruby, to solve your thorny problems.

Sure, that sounds idealistic. But never mind the pessimists that tell you that you have to compromise. If you are not idealistic, you will not be unsatisfied with things that are not as good as they could be; you will not try to change the world. So give Rubinius a try, you may be surprised. And if you are, put all that hard-earned wisdom you have gained to use for the betterment of Ruby.

Back to personas

Academic Researcher

Forgive me for staring, I know it is impolite. I’m just… intrigued. Of course, you know Ruby is a late bound language, every message sent could conceivably fail to find a target, potentially resulting in an uncaught exception and program termination. There’s shared state, wild orgies of mutation that disallow any reasonable attempt at automated parallelization. Program proof is as oxymoronic a concept as military intelligence. It’s a very messy affair of programming and meta-programming and meta-meta-programming, which, for the love of Lisp, could be done so simply with macros. There’s all this eager evaluation and complete disregard for purity. Despite vast odds, somehow programs are written that actually run. You have noted all this with great objectivity but you are nonetheless interested.

Excellent, we are pleased. We have much to learn and welcome the opportunity for lively discussions about bringing formal methods to bear on the problems of making Ruby as fast as possible.

Java benefited tremendously from the amount of attention it received from academic researchers. Ruby can benefit from some of this research as well, not to mention the research into Smalltalk and Self that preceded it. But Ruby has its own set of problems to solve and deserves specific attention. The problems are hard but not insurmountable. Rubinius is already demonstrating that. The suggestions that we need to add more keywords, restrict Ruby dynamism, or write public static final int all over are simply nonsense.

Rubinius already leverages research for fast virtual machines, garbage collection (e.g. the generational approach and the Immix mark-region algorithm), and JIT compilers (based on pioneering research done in Self and used in the JVM HotSpot VM). Rubinius uses the exceptional LLVM project for optimization and code generation in the JIT compiler. We are also working on better infrastructure for the JIT to address Ruby complexities head-on.

Rubinius would be excellent to use in teaching. A compiler construction class could study the architecture of the bytecode compiler written in Ruby and experiment with exploratory changes to the compiler using IRB without having to recompile anything! A 30-minute introduction to Rubinius could proceed immediately to simple AST generation and have students experimenting with their own syntax immediately. While it is easy to get started, there is plenty of depth for exploring complex topics in virtual-machine construction and garbage collection.

Whether you are interested in language research or language pedagogy, Rubinius is a great project to consider. We look forward to hearing from you.

Back to personas

Über programmer

You learned the untyped lambda calculus sitting on your mother’s knee while she worked on her doctorate in computer science. You were substituting terms before you even uttered the word, “dada”. You wrote three different Lisp implementations in Commodore Basic before you were seven. You can write multi-threaded web servers in one pass with no tests and never hit a deadlock or critical data race. You write parsers and compilers for odd languages on a Friday night for the heck of it while waiting for the pizza to arrive before a night out at the karaoke bar where you give an inspiring performance of Lady Gaga’s Poker Face.

(Loooong pause. You’re not reading this. You’ve already written one or a few languages on Rubinius and posted them to our Projects page. But anyway, I’ll continue…)

You are the Luke Skywalker of Ruby; Yoda has nothing more to teach you. Only your fate confronts you now. Use the Source, Luke, and save the Federation of Ruby loyalists from the Evil Oracle and its Java the Hurt.

There are a number of domains in which Ruby could benefit tremendously from excellent libraries:

  1. Servers and web servers: the web is here to stay but the argument that all applications are going to be in JavaScript on the client is not valid. A variety of hybrid client-server architectures will continue to be the norm. We need software that enables application authors to build a suitable solution to their particular problems rather than trying to stuff their apps into someone else’s solution with layers of wrapping.
  2. Concurrency: multi-core is here to stay but it is not only functional programming that is suitable for high-concurrency applications.
  3. Graphical user interface: the web browser is also here to stay but it is not the last word in applications. There are many cases where GUI apps are the best option and Ruby needs a mature library or set of libraries to build these apps on any major platform. I know some of these libraries exist, but they seem to be collecting dust lately.
  4. Big data and data analysis libraries: our industry repeatedly witnesses the same pattern: domain X starts with huge applications running on huge horsepower servers for huge businesses and then it starts appearing in small applications on small computers for small businesses. Accounting and geographic information systems (GIS) are two examples. Data analysis is coming to a laptop near you.

These are general areas in which Ruby can be an excellent solution. So how does Rubinius fit in? Rubinius is dedicatedly pushing more and more into Ruby itself. Each of these domains is typically handled in Ruby right now by going to find a library in a foreign language to wrap in a fuzzy Ruby embrace. Rubinius is calling on the über-programmers of the world to implement solutions in Ruby to help us identify performance challenges and address them.

Rubinius is also being used in some fascinating language experiments. Two of these are Atomo (http://atomo-lang.org), which is implemented in Haskell with a Rubinius implementation code-named quanto, and Fancy (http://fancy-lang.org). So, if language design is your cup of tea, Rubinius offers an excellent platform for experimentation.

Back to personas

Philosophy Student Seeking the Meaning of Ruby

Like your persona description, you tend to be long winded. You find most descriptions too brief, almost dismissive. There are words and words should be used to delve into the minutiae of minutiae. You, more than anyone, want to know “Why?” with every fiber of your being. You will continue asking long after the supply of hallucinogens has been exhausted and everyone else is drooling in their sleep.

For you, Rubinius is an existential dilemma crying out for justification. If we already have MRI, why build Rubinius?

It would be accurate to say that Rubinius has a philosophy. That philosophy is simply this: Ruby should be a first class language. What does that mean? Simply that it should be possible to solve problems writing Ruby code.

Let’s consider libraries: Being first class means not having to wrap a Java library or build a C extension. If wrapping the library were the end of the story, it wouldn’t be so bad. But that is never the case. Libraries have bugs, weird APIs, incompatibility with other libraries, threading issues, and disappearing maintainers. They may even be incompatible with newer versions of the language in which they are written.

This list goes on. To address any one of these issues requires delving into a different language with weird and incompatible semantics. If the library is your core competency, that’s not such a big deal. But I will wager that it is not, which is why you are using the library in the first place. Also, the language in which you are wrapping the library (Ruby here) is not likely the core competency of the library author, or you probably wouldn’t need to be wrapping it. So Ruby wrapping one of these libraries will always be a second-class citizen. Decisions will be made about the library’s API that do not give one thought to the Ruby programs using it. Furthermore, the code written in that foreign language does nothing to support the ecosystem of Ruby. The knowledge gained in writing the library and the improved skills of the library author do not benefit Ruby. Ruby deserves better.

Ruby has gotten a big boost recently with the production release of MRI 1.9.2. There are significant speed improvements and welcomed additions to the core libraries, like powerful handling of String encodings. At the same time, the Complex and Rational libraries were added to the core library and rewritten from Ruby to C code. This is disappointing. We should be able to solve these problems more effectively in Ruby itself.

The philosophy of Rubinius is to make Ruby a first-class citizen. Ruby plays second fiddle to no one. There is no other language whose history, semantics, or vested interests compete with Ruby’s. It is true that there are difficult problems to solve in making Ruby fast. But much of the technology already exists and we will build what does not. Evan often quips that if we can get Rubinius caught up to the dynamic language technology of ten years ago, Ruby will be light-years ahead. That may be overstating how far behind Ruby is, but it illustrates the focus of Rubinius.

There’s the saying, “In theory, there is no difference between theory and practice. In practice, there is.” In Rubinius, theory and practice are merging. We are motivated by the desire for Ruby to be a first-class language. But we are also showing real progress in making that a reality. The Rubinius VM executes Ruby code blazingly fast. The JIT compiler, while still being quite young, is showing great promise. Compatibility with MRI is quite good and speed is constantly improving.

Is the Rubinius philosophy valid? We think the proof is in the pudding.

Back to personas

Manager

No, it did not cross my mind to describe this persona as Pointy-haired Boss. Not only would that be unfair to Dilbert, but that persona would be reading an article on Web Scale. No, you are someone who has fought hard battles in the trenches and learned valuable lessons: it’s about execution and execution depends on good technology.

Rubinius is building solid technology. We started the RubySpec project and have contributed tens of thousands of lines of code to it. With the support of RubySpec, in just over four years as a public project, we have basically caught up with MRI 1.8.7 in compatibility and performance. For some code, our performance is much better; for other code, it is not as good. However, Rubinius is built on solid, modern technology and the project’s trajectory and velocity are outstanding.

Rubinius is a completely new implementation of core Ruby. Rubinius did not start as a port of existing code. Furthermore, Rubinius implements its own virtual machine and garbage collector in C++. The bytecode compiler that targets the virtual machine is pure Ruby. The core Ruby library is mostly Ruby with some primitive operations in C++. The JIT compiler uses the LLVM project. Given the amount of work being done in the project, Rubinius is pacing extremely well relative to other implementations.

Currently, we are working on support for Ruby 1.9 features, Windows support, and full concurrency with no global interpreter lock (GIL).

If you are looking at Ruby to implement your next project, rest assured that Ruby will have the support of excellent technology. If you are already using Ruby, consider investigating how your application runs on Rubinius. We welcome the feedback and look forward to solving challenging engineering problems.

Back to personas

Knowledge Seeker

You thirst for Knowledge. You follow it wherever it leads you. You’ll happily walk Haskell’s hallowed halls of pure laziness or sit at the feet of the meta-program gazing raptly at class transmogrification. You don’t judge. You have more than enough knowledge to be dangerous, enough to know that the universe is amoral and knowledge is the only Truth there is. Nor does any mere mortal language bind you. All languages are finite. You’ll be here today and gone tomorrow; there is no permanence for the knowledge seeker.

Rubinius is merely a step along the path you journey. Take what you want, it is all free. As a Ruby implementation, it has much to offer your quest for knowledge. The Ruby code in the core library is accessible and easy to follow. The interface between Ruby and the C++ primitives is consistent. The C++ code itself is restrained. You won’t need a PhD in Turing-complete template languages to understand it.

Rubinius offers extensive opportunities to learn about programming languages in general and Ruby in particular. When I first started working with Rubinius, I knew a little bit about garbage collection and virtual machines. I would call what I knew, toy knowledge. As I struggled to learn more, it seemed helpful to consider layers of understanding:

  1. General programming language semantics: the procedure abstraction, looping and iteration, recursion, references and values, etc.
  2. Ruby semantics: modules and classes, access restrictions, blocks and lambdas, etc. Even with fundamental programming knowledge, a particular language can be confusing. When I was learning C, a friend was also studying it. One day he walked over and threw The C Programming Language book down on my desk and said, “This for loop makes no sense!” He was quite upset. “Look,” he said, “in this example for (i=0; i < n; i++) how can i < n get executed after the code in the body?!” It’s easy to laugh at that confusion, but coming from BASIC, that really threw him. Deepening our understanding to this second level requires confronting some “counter-intuitive” notions.
  3. Hypothetical implementation: knowing how Ruby works, how might one implement it. I think this is an important layer of understanding and it is easy to miss or gloss over it. By pausing at this layer and thinking how you might implement something, you test whether or not you are really understanding it.
  4. The MRI implementation: Reading the MRI source code is an excellent way to investigate Ruby. For one thing, it will inform you how Ruby actually works, and you may be surprised.
  5. The Rubinius implementation: here you are exposed to the philosophy of Rubinius and the challenges to implementing Ruby. We are attempting to bring the beauty of Ruby as an object-oriented language deep into the core of Ruby itself.
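The evaluation order behind the for loop in item 2 can be spelled out in Ruby. The sketch below is only an illustration (trace_for_loop is a made-up name): it unrolls for (i=0; i < n; i++) into its init, test, body and step parts and records the order in which they run.

```ruby
# Sketch: record the order in which the parts of a C-style
# for (i=0; i < n; i++) { body } loop are evaluated.
def trace_for_loop(n)
  trace = ["init i=0"]
  i = 0
  loop do
    trace << "test i=#{i}"   # i < n is checked before every pass
    break unless i < n
    trace << "body i=#{i}"
    i += 1                   # i++ runs after the body, then the test repeats
    trace << "step i=#{i}"
  end
  trace
end

puts trace_for_loop(2)
# prints, one per line:
# init i=0, test i=0, body i=0, step i=1, test i=1, body i=1, step i=2, test i=2
```

The test runs once before the first pass and again after each step, so it is evaluated both before and after the body even though it is written ahead of it.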

While the Rubinius code itself offers many opportunities for learning, don’t hesitate to drop by the #rubinius channel on freenode.net and ask us questions. Perhaps you already know a lot about another language and are interested in how Rubinius implements some feature. Or you may be relatively new to programming languages and have some basic questions. We enjoy talking about these concepts. If you are quite new to Rubinius, you may find these posts informative:

Finally, consider helping other knowledge seekers by writing blog posts on what you learn about Rubinius. Or, help us write documentation!

Back to personas

Language Enthusiast

You like languages for their intrinsic value. Of course the world comes in many shapes and sizes. You wouldn’t have it any other way. That’s the fun and spice, joie de vivre, raison d’etre, supermarché… Sometimes you get carried away writing a program in another language just because you like how the letters arrange down the screen. Ruby is definitely one of the impressive languages and sometimes you almost notice a tiny bit of favoritism in your normally egalitarian attitude.

As with any enthusiast, you like to experiment. Your interest is not mere curiosity or sterile investigation. You want to get your feet wet and your hands dirty. Rubinius is an excellent opportunity to delve into a number of fascinating subjects. We can merely suggest a path; your experiences along the way will tell you whether or not Rubinius has value to you.

If you are most interested in languages themselves, the syntax and arrangement of features, Rubinius offers you immediate gratification. Look for Evan’s upcoming post on his Language Toolkit or check out the code to prattle, a Smalltalk dialect used to illustrate the ease of building a language on Rubinius. Also look at some of the existing language projects targeting Rubinius.

If it is the machinery under the covers that is more interesting, start reading some code. The bytecode compiler lives in lib/compiler/. The virtual machine is in vm/, and the garbage collector is in vm/gc. As you are reading through, consider helping us write better documentation. There are already sections for the virtual machine, garbage-collector, JIT compiler and bytecode compiler in the documentation, so adding content is easy.

You may also be interested in these previous posts about Rubinius:

Most of all, experiment. Rubinius is easy to hack on. Are you curious about a particular feature needed in your language? Try adding it to Rubinius. Think Lua is all the rage because it uses a register VM? You could probably write a register-based bytecode interpreter for Rubinius in an afternoon. That’s just an example, of course. The point is to play around with your ideas and have fun doing it. I think you’ll find Rubinius to be an adventuresome companion.

Be sure to let us know what you’re working on. We like to be inspired, too! Consider writing a blog post about things that you find interesting, like this recent post by Yehuda Katz.

Conclusion

So there you have it. Just like there are many different viewpoints, there are many different reasons to use Rubinius. Not all those reasons make sense to everyone. We believe, however, that Rubinius has something to offer to just about everyone interested in Ruby. Most importantly, try it!

If we didn’t answer your question here, leave us a comment. If you have a reason for using Rubinius that we didn’t mention, let us know. As always, we appreciate your feedback. Chat with us in the #rubinius channel on freenode.net, watch our Github project, and follow us on Twitter.

P.S. Thanks to David Waite for suggesting the Academic Researcher and Language Enthusiast personas. I always forget those!

Introduction to Fancy

Fancy is a new general-purpose programming language targeting the Rubinius VM.

This blog post will give a short introduction to the language, what kind of problems it’s trying to solve and why I chose Rubinius as the VM to run Fancy on.

What is Fancy?

Fancy is a new general-purpose, dynamic, pure object-oriented programming language heavily inspired by Ruby, Smalltalk and Erlang that runs on the Rubinius VM. It’s the first fully bootstrapped language, aside from Ruby, running on Rubinius. This means that the compiler that generates bytecode for Rubinius is written in Fancy itself.

You can think of Fancy as a mix of features from the languages mentioned above, building on their strengths and improving upon their weaknesses. Fancy has a very small core and is largely based on the concept of message passing, just like Smalltalk. It tries to make as many language concepts as possible first-class values in the language.

Just like Ruby, Fancy is a dynamic object-oriented language that allows changing code at runtime, treats everything as an expression and generally embraces more than one way to do things. Fancy also has all the literal support that Ruby has, plus literal syntax for Tuples and Patterns (more on that below).

In contrast to Ruby and just like Smalltalk, Fancy has a very small amount of built-in keywords and all of the control structures are implemented in terms of message sends to objects using closures.
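To make the idea of control structures as message sends more concrete, here is a rough sketch in Ruby rather than Fancy. It is not how Fancy implements conditionals; the method name `if_true_else` is made up for this illustration, and the two branches are passed as lambdas.

```ruby
# Illustration only (Ruby, not Fancy): a conditional written as a message
# send that takes closures. The method name if_true_else is hypothetical.
class TrueClass
  # true runs the first closure and ignores the second
  def if_true_else(then_branch, else_branch)
    then_branch.call
  end
end

class FalseClass
  # false runs the second closure and ignores the first
  def if_true_else(then_branch, else_branch)
    else_branch.call
  end
end

def sign_label(n)
  (n > 0).if_true_else(-> { "positive" }, -> { "not positive" })
end

puts sign_label(3)   # prints "positive"
puts sign_label(-1)  # prints "not positive"
```

Because the conditional is just a method, it can be overridden or wrapped like any other method, which is the property the message-passing design relies on.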

The third language that served as an inspiration is Erlang, from which Fancy takes the idea that concurrent programming should be easy by having the Actor Model built into the language. This part is still a work in progress, but should come together soon. The fact that Rubinius has a built-in Channel type, inter-VM communication capabilities and even an actor library makes implementing this easier than in traditional systems.

Why Fancy?

I believe there is real value in having a language that supports certain things out of the box. Especially when it comes to things like asynchronous and concurrent programming, having proper semantics built into the language can often help developers more than a library can. Very often it’s not just about the functionality itself but also about the semantics you want that functionality to have. This can cause problems, particularly if the language’s semantics are at odds with what your library is trying to achieve. A good example is the callback-based approach to asynchronous programming, which leads to code that differs from synchronous code both in semantics and in structure. Ideally you’d still want to write code in a synchronous fashion, where exceptions pop up naturally, while still being highly asynchronous.

In that sense Fancy is more flexible than Ruby, as there are not many special-case semantics built into the core language. Everything’s done via message passing, which fits the actor model approach to concurrency nicely. Fancy’s syntax is a lot simpler, too.

Since all the core control structures are just implemented in Fancy itself and adhere to the message passing protocol, you can easily override them for your personal needs. This is especially interesting when implementing domain specific languages. Say, you’d want to add some logging to conditional or looping constructs - it’s as easy as overriding a method in your DSL’s classes. Fancy also has class-based mixins, so it makes it easy to share functionality across class hierarchy boundaries.

Finally, I created Fancy because I wanted a language implementation that was well documented, easy to understand and very flexible to extend. Ruby is a nice language, but it has some inconsistencies and there’s only so much you can do when you’re bound by backwards compatibility. By starting fresh, Fancy has a clean, simple and easy to extend core which allows further exploration of features and abstractions.

Why target Rubinius?

The initial implementation of Fancy was a simple interpreter written in C++, similar to how Ruby 1.8 (MRI) works. It was a simple AST walker. After moving to Rubinius and writing an initial bootstrap compiler in Ruby, the codebase shrank to about 20% of the original implementation while actually being more performant. This of course is mostly due to Rubinius’ architecture and JIT compiler, but it was a great experience nonetheless.
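For readers unfamiliar with the term, an AST walker evaluates a program by recursively visiting its syntax tree every time the code runs. The following is a minimal sketch in Ruby, not Fancy's original C++ interpreter; the nested-array node format is an assumption made for the example.

```ruby
# Illustration only: a tiny AST-walking evaluator for arithmetic.
# Nodes are nested arrays such as [:add, left, right]; this node format
# is an assumption made for the sketch.
def walk(node)
  return node if node.is_a?(Numeric)   # a literal leaf evaluates to itself

  op, left, right = node
  case op
  when :add then walk(left) + walk(right)
  when :mul then walk(left) * walk(right)
  else raise ArgumentError, "unknown node: #{op.inspect}"
  end
end

ast = [:add, 1, [:mul, 2, 3]]   # represents 1 + (2 * 3)
puts walk(ast)                  # prints 7
```

A bytecode compiler instead translates the tree once into a flat instruction sequence, which the VM can then interpret or hand to a JIT.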

The nice part about having a common virtual machine and runtime is that you’re not forced to a completely different platform to get the job done. Fancy and Ruby can coexist in the same application nicely and calling code from one another is dead simple. In fact, as of now, Rubinius doesn’t know anything about Fancy. And it shouldn’t. As long as all languages running on top of it adhere to the same interface (in this case the bytecode), it should just work fine.

Choosing Rubinius as a successor platform for Fancy was easy. It’s built for Ruby, a language that’s closely related to Fancy. Rubinius, while having been developed as a VM for running Ruby code, is very flexible and there are many features that abstract over Ruby’s external semantics. It was just a natural choice given the fact that Rubinius’ architecture and design were heavily influenced by Smalltalk VMs. Also, it’s a very nice dynamic bytecode virtual machine. The community is very responsive and helpful. Bugs get fixed instantly, there’s always someone to help out and overall it’s been a great experience.

Let’s look at some code!

OK, enough talking. Let’s have a look at how to get some Fancy code up and running. Our little sample application will be a simple IRC bot that connects to Fancy’s IRC channel on Freenode and says hello to everyone that greets it. To make life easier, there’s already a Fancy package out there that helps with exactly this task: FancyIRC.

FancyIRC is a simple IRC client library inspired by Ruby’s IRC bot framework Cinch. It’s much simpler and the code is fairly easy to read, but it gives you a similar interface for writing IRC clients or bots.

So let’s get going by installing Fancy. You can either use the Fancy Rubygem and install it with Rubinius or get the code from GitHub and run rake in the directory. You’ll also then have to add the bin directory to your $PATH. If you want the latest and greatest version of Fancy I recommend building directly from source, as the Gem might not be up to date all the time. For demonstration purposes, let’s install the Rubygem.

$ rbx -S gem install fancy

To get the FancyIRC package we use Fancy’s built-in package manager, which knows how to find the code on GitHub and install it locally:

$ fancy install bakkdoor/fancy_irc

Writing the code

require: "fancy_irc"

greeter_bot = FancyIRC Client new: {
  configuration: {
    nickname: "greeter_bot"
    server: "irc.freenode.net"
    port: 6667
    channels: ["#fancy"]
  }

  # greet person back
  on: 'channel pattern: /^[hH]ello greeter_bot/ do: |msg| {
    msg reply: "Hello to you too, #{msg author}!"
  }

  # "echo" command
  # invoke with: !echo <text>
  on: 'channel pattern: /^!echo (.*)$/ do: |msg, text| {
    msg reply: "#{msg author} said: #{text}"
  }

  # tell bot to shutdown via !shutdown command
  on: 'channel pattern: /^!shutdown/ do: |msg| {
    msg reply: "OK, shutting down"
    System exit
  }
}

greeter_bot connect
greeter_bot run

I think the code is pretty straightforward. This should give you a feeling for what Fancy looks and feels like. There is of course lots more to Fancy than what was shown here. It would not fit into a single blog post.

A quick list of what’s currently being worked on:

Interested?

If you got interested in Fancy and want to know where to go next, here’s a short list of things to check out:

Running Multiple Rubinius Branches Simultaneously with RVM.

This article is written with the assumption that you have RVM installed already. If you do not, follow the Installation Instructions followed by the Basics closely first.

Named Ruby Installs

Everyone familiar with RVM knows that it allows you to quickly and easily install a particular Ruby interpreter by simply running, for example,

rvm install rbx

What is not widely known (yet) is that there is a “Named Rubies” feature that allows you to install altered versions of the same Ruby installation alongside the original.

In the case of Rubinius there is this fascinating branch called ‘hydra’. So let us see how we can have the Rubinius master branch installed as the main rbx with the hydra branch installed alongside it.

As above, you first install rbx. It currently defaults to the -head version, so

rvm install rbx

is currently equivalent to

rvm install rbx-head

After we have the mainline head Rubinius branch installed, we want to use the named rubies feature. This is done using the -n specifier in the Ruby identifier string. For example, to install the hydra branch as an RVM Ruby with the name ‘hydra’ in it, we do the following:

rvm install --branch hydra rbx-nhydra

Now we can see that they can be used together! Using the Rubinius master environment,

$ rvm rbx ; ruby -v
rubinius 1.2.1 (1.8.7 6feb585f 2011-02-15 JI) [x86_64-apple-darwin10.6.0]

Whereas using the Rubinius hydra environment,

$ rvm rbx-nhydra ; ruby -v
rubinius 1.3.0dev (1.8.7 6feb585f xxxx-xx-xx JI) [x86_64-apple-darwin10.6.0]

We see that the next release of Rubinius (hydra branch) is indeed version 1.3.0 whereas the master branch is version 1.2.1.

Also, please note that RVM creates wrapper scripts, so you do not need to switch out the entire environment just to run the different versions:

For Rubinius master,

$ rbx-head -v
rubinius 1.2.1 (1.8.7 6feb585f 2011-02-15 JI) [x86_64-apple-darwin10.6.0]

For Rubinius hydra,

$ rbx-head-nhydra -v
rubinius 1.3.0dev (1.8.7 6feb585f xxxx-xx-xx JI) [x86_64-apple-darwin10.6.0]

There is a lot more available to you than this. For more information on RVM capabilities, please visit the RVM Website and come talk to us in #rvm on irc.freenode.net during the daytime EDT.

I hope that this is helpful and informative to you!

~Wayne

Rubinius, What's Next?

On Tuesday, we released version 1.2.1 (see the Release notes). This release weighs in at 256 commits and 21 tickets closed in the 56 calendar days since the release of 1.2.0. Many thanks to those who contributed patches and to everyone who helped us test it.

While we were working on 1.2.1, we were also working on a Top Secret project that we’ve craftily hidden in plain sight. I’d like to introduce the work we are doing on the hydra branch and the features you can expect to see in Rubinius soon.

Daedalus - A new build system

Rubinius is a fairly complex project. It combines multiple components into a single system. We have worked hard to contain this complexity and from the beginning we insisted that building Rubinius be as simple as possible. For example, Rubinius can be run from the source directory, there is no need to install it first. Typically, building requires:

./configure
rake

The Rubinius system combines:

  1. External libraries written in C/C++, sometimes built with just Makefiles and sometimes using autotools.
  2. The virtual machine, garbage collector, and JIT compiler written in C++.
  3. The virtual machine interpreter instructions, including support code for the JIT, and instruction documentation all generated at build time from an instruction template.
  4. The core library and bytecode compiler written in Ruby.
  5. Various C extensions like the Melbourne parser, BigDecimal, Digest, and OpenSSL libraries. In the case of the parser, we have to build two versions, one for the bootstrapping system and one for the Rubinius system being built.

It has not been easy to make this work and over the years we have compiled a list of exactly what we need in a build system. Evan, in typical form, started hacking out a first pass and created daedalus, our new build system. It features such exotic (and extremely useful) features as SHA-based change detection, parallel builds, single-process execution, and use-aware configuration options. Allow me to elaborate.
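As a rough illustration of what SHA-based change detection means in general, the sketch below decides whether a file needs rebuilding by comparing a digest of its contents with the digest recorded at the last build. This is not daedalus code; the `ChangeDetector` class and its methods are hypothetical names invented for the example.

```ruby
require "digest/sha1"
require "tmpdir"

# Illustration only: not daedalus code. A file counts as changed when the
# SHA-1 of its contents differs from the digest recorded at the last build.
# The class name ChangeDetector is hypothetical.
class ChangeDetector
  def initialize
    @digests = {}   # path => hex digest recorded at the last build
  end

  def changed?(path)
    Digest::SHA1.file(path).hexdigest != @digests[path]
  end

  def record(path)
    @digests[path] = Digest::SHA1.file(path).hexdigest
  end
end

Dir.mktmpdir do |dir|
  source = File.join(dir, "example.c")
  File.write(source, "int main() { return 0; }\n")

  detector = ChangeDetector.new
  puts detector.changed?(source)   # true: never built
  detector.record(source)

  # Bumping the timestamps changes the mtime but not the contents.
  later = Time.now + 60
  File.utime(later, later, source)
  puts detector.changed?(source)   # false: same bytes, same digest

  File.write(source, "int main() { return 1; }\n")
  puts detector.changed?(source)   # true: contents differ
end
```

Unlike a timestamp comparison, a content digest does not trigger a rebuild when a file is merely touched or checked out again with identical contents.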

Full-on Concurrency

Nobody likes waiting in line. In fact, the more desirable a thing is, the less we want to stand idly waiting in a line for it, tapping our foot, twiddling our thumbs. The same could be said about our programs.

Threads give us the ability to add concurrency to our programs. However, unless the hardware either has multiple CPUs or multiple cores per CPU (or both), the apparent concurrency will still be executing serially. Since there are so many multi-core CPUs around these days, our programs should be getting stuff done in parallel.

Unfortunately, there’s a twist. Even with native threads on a multi-core CPU, the amount of parallelism you get depends on how well you manage locks around shared data and resources. Sometimes managing these locks is complex and you opt for one big lock, essentially only allowing one thread at a time to run. That big lock is usually called a global interpreter lock (GIL) or global VM lock (GVL).

The Rubinius VM originally had green (user-space) threads, but it has had native threads with a GIL for a while now. In the hydra branch, Evan and contributors like Dirkjan Bussink have been working on replacing the GIL with fine-grained locks so that threads truly execute in parallel. This work has been going very well, owing in part to the fact that so much code in Rubinius is actually written in Ruby. Contributors like Chuck Remes have been running hydra under heavy concurrency loads and Rubinius is performing well.
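From the Ruby programmer's side, the counterpart of fine-grained locking is protecting each piece of shared data with its own lock. The sketch below is ordinary Ruby and says nothing about the VM's internal locks; the method name `parallel_count` is made up for the example. Whether the threads actually run in parallel depends on the Ruby implementation.

```ruby
# Illustration only: several threads update shared state under a lock
# scoped to just that data. parallel_count is a hypothetical helper.
def parallel_count(thread_count, increments)
  counter = 0
  counter_lock = Mutex.new

  threads = thread_count.times.map do
    Thread.new do
      increments.times do
        # Only the read-modify-write of the counter is serialized.
        counter_lock.synchronize { counter += 1 }
      end
    end
  end

  threads.each(&:join)
  counter
end

puts parallel_count(4, 1_000)   # prints 4000
```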

Rubinius also has experimental support for Fibers and a built-in Actor library. There is more work to be done but Rubinius is quickly becoming an excellent platform for concurrency, with a variety of approaches available to the programmer. Evan has also suggested rewriting the Rubinius IO subsystem to enable even better APIs for concurrency, all from Ruby.
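For readers who have not used Fibers, here is a small sketch using the standard Ruby `Fiber` API: a computation that pauses itself with `Fiber.yield` and continues when resumed. The helper name `fibonacci_fiber` is invented for the example.

```ruby
# Illustration only: a Fiber as a resumable computation that hands back
# one Fibonacci number each time it is resumed.
def fibonacci_fiber
  Fiber.new do
    a, b = 0, 1
    loop do
      Fiber.yield a      # pause here and return a to the caller
      a, b = b, a + b
    end
  end
end

fib = fibonacci_fiber
first_five = 5.times.map { fib.resume }
puts first_five.inspect   # prints [0, 1, 1, 2, 3]
```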

Performance

Forget everything anyone has ever told you about Ruby being slow. There are two things that make Ruby, as implemented, slow: 1) inexperience; 2) inadequate tools. These two result in one big thing: doing too much. Or, as they say: No code runs faster than no code. We have been working for 4+ years to build adequate tools in Rubinius, and there is plenty of experience in Smalltalk, Self, and other languages for making dynamic languages fast.

Presently, Rubinius typically runs pure Ruby code almost 2 times faster than MRI 1.9. However, there are also cases where Rubinius is slower. These mostly involve core libraries that are implemented in C in MRI. There are three main fronts on which we are attacking performance issues: 1) improving the algorithms in the Ruby code that implements the core library; 2) continuing to tune the VM and garbage collector; and 3) improving the JIT compiler. Which leads me to one of the most exciting things we are working on…

JIT Intermediate Representation (IR)

The just-in-time (JIT) compiler is the key to making Ruby fast. One of the biggest challenges with a dynamic language like Ruby is knowing what method is actually being invoked when a message is sent to an object. Consider the following code:

class A
  def m(x)
    # ...
  end
end

class B
  def m(x)
    # ...
  end
end

class C
  # obj may be an A, a B, or anything else that responds to m
  def work(obj, y)
    obj.m(y)
  end
end

What method is being invoked by obj.m(y)? There is no way to definitively know this by looking at the source code. However, when the program is actually running, we can know precisely what obj is and precisely which method m was invoked. This is called type profiling and that is exactly what the Rubinius VM does. Then the JIT uses the type information to make decisions like whether to inline a method into another method. When methods are inlined, it gives the optimizer more data and more possibilities to remove redundant code. The less code we can run, the faster Ruby will be.
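The idea of recording receiver types at a call site can be sketched in plain Ruby. The `ProfiledCallSite` class below is hypothetical and is not how the Rubinius VM is implemented; it only shows the kind of data that type profiling collects and one question a JIT might ask of it.

```ruby
# Illustration only: a toy call site that counts the receiver classes it
# sees. ProfiledCallSite is a hypothetical name, not a Rubinius class.
class ProfiledCallSite
  attr_reader :seen

  def initialize(method_name)
    @method_name = method_name
    @seen = Hash.new(0)   # receiver class => number of calls observed
  end

  def call(receiver, *args)
    @seen[receiver.class] += 1
    receiver.public_send(@method_name, *args)
  end

  # A site that has only ever seen one class is a simple inlining candidate.
  def monomorphic?
    @seen.size == 1
  end
end

class A
  def m(x)
    x + 1
  end
end

class B
  def m(x)
    x * 2
  end
end

site = ProfiledCallSite.new(:m)
3.times { site.call(A.new, 1) }
puts site.monomorphic?   # prints true: only A has been seen
site.call(B.new, 1)
puts site.monomorphic?   # prints false: both A and B have been seen
```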

Presently, the JIT compiler converts Rubinius bytecode into LLVM IR and LLVM handles the thorny task of generating machine code. However, Rubinius bytecode is designed for fast execution by the virtual machine rather than as a rich intermediate representation. So Evan has started work on a new JIT IR.

This new IR will help us to express Ruby semantics in a way that enables many powerful optimizations and will ultimately allow LLVM to generate even better machine code. Put another way, Rubinius loves Ruby code! Right down to the metal. There’s no fighting a foreign type system or the semantics of a language at odds with Ruby’s rosy view of the world.

Ruby 1.9

MRI 1.9 introduced two completely different changes to Ruby. The first was a new implementation based on a bytecode virtual machine. While the virtual machine replaced the AST-walking interpreter, little else changed architecturally. Mostly the same core library and garbage collector code exists in MRI 1.9 as was in MRI 1.8. The second change introduced some new syntax (minor) and encodings (major). Many of the other changes, for example, returning Enumerator objects from methods that take blocks, have been back-ported to Ruby 1.8.7 and are already available in Rubinius.
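The Enumerator behavior mentioned above looks like this in practice. The example uses only core methods; the printed class name is `Enumerator` on Ruby 1.9 and later, so no output is shown for that line.

```ruby
# Calling a block-taking method without a block returns an enumerator
# object instead of raising or iterating.
enum = [10, 20, 30].each_with_index
puts enum.class

# The enumerator can then be chained with other Enumerable methods.
pairs = enum.map { |value, index| value * index }
puts pairs.inspect   # prints [0, 20, 60]
```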

So, the key to supporting Ruby 1.9 in Rubinius essentially involves supporting the 1.9 syntax changes and encodings. We have begun implementing the parser changes and introduced the foundation for Encoding-aware Strings. A good amount of work remains to be done, but over the next month we expect that we will be starting to run Ruby 1.9-specific code in Rubinius.

Tools of Information

It has been said that printf is the mother of all debuggers. That illustrates two points: 1) data is often buried in our program code; and 2) we should have tools (e.g. a debugger) that enable us to access the data without manually instrumenting our code.

Presently, Rubinius has a built-in debugger, precise method profiler, memory analysis tool, and Agent interface that permits querying a running Rubinius VM–even one running on a remote machine–for a variety of information.

We will be adding the ability to track the location where objects are allocated to assist finding object leaks or code that is creating unusually large numbers of objects. We are also working on a tool to graphically display information like number of running threads, amount of CPU usage, and amount of memory used while actively monitoring a VM.

I am also curious about correlating this VM information with external data to enable play-back review. For example, I would like to monitor RubySpec runs and correlate which spec is running with the VM data. I imagine a simple monotonic reference ID provided by the VM would be useful in correlating these two otherwise unrelated pieces of data. The RubySpec runner would request the ID before running each spec and the Agent monitor would request the ID when gathering VM data. Later the two data sets could easily be merged.
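The merge step could look something like the following sketch. It assumes the ID increases on every request, so each VM sample is attributed to the most recent spec whose ID is not greater than the sample's ID. The record shapes, field names, and the `correlate` method are all hypothetical; no such Agent API is implied.

```ruby
# Illustration only: attribute each VM sample to the spec that was running
# when it was taken, using a shared, monotonically increasing ID.
# All field names here are hypothetical.
def correlate(spec_events, vm_samples)
  events = spec_events.sort_by { |event| event[:id] }

  vm_samples.sort_by { |sample| sample[:id] }.map do |sample|
    # The running spec is the last one that started before this sample.
    running = events.select { |event| event[:id] <= sample[:id] }.last
    sample.merge(:spec => running && running[:spec])
  end
end

spec_events = [
  { :id => 1, :spec => "first spec" },
  { :id => 4, :spec => "second spec" }
]
vm_samples = [
  { :id => 2, :memory_kb => 2048 },
  { :id => 3, :memory_kb => 2112 },
  { :id => 5, :memory_kb => 2200 }
]

correlate(spec_events, vm_samples).each do |row|
  puts "#{row[:id]}: #{row[:spec]} (#{row[:memory_kb]} KB)"
end
```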

When you find yourself manually instrumenting some code, consider what data you are trying to get your hands on and let us know the scenario. We’ll likely be able to build a tool that will open up new vistas into the behavior of your Ruby programs.

Windows®

However one may feel about Windows as an operating system, it is undeniable that the vast majority of people in the world use Windows. We believe those people have an inalienable right to use Rubinius, too.

Thanks to the wonderful, hard-working MinGW-w64 folks, we are able to compile the Rubinius VM into a native Windows executable. Presently, the VM will compile, link, and attempt to load the Ruby core library. More platform-specific work is needed to load the library. The next step after that will be getting the RubySpecs to run and start fixing issues.

Since the Windows work is being done on the hydra branch, the other features discussed above will be available on Windows as soon as we complete them.

Multi-language-ualization

The Rubinius VM began as an effort to create a modern, first-class environment for running programs written in Ruby. However, it turns out that Ruby is a terrific language for writing subsystems for other programming languages. Actually, this should come as no surprise; Ruby is a fabulous general purpose programming language.

To support experimenting with writing other languages that run on the Rubinius VM, Evan has started to put together a Language Toolkit. This includes things like a built-in PEG parser, convenient ways to create methods from Rubinius bytecode, and decoupling method dispatch from Ruby semantics.

Hopefully, Evan will introduce us to all this in a future blog post, but here is a taste of what you can do:

class Hello
  dynamic_method :world do |g|
    g.push :self
    g.push_literal "Hello, world"
    g.send :puts, 1, true
    g.ret
  end
end

Hello.new.world

Of course, that is much more concisely written in Ruby, but combine this ability with a built-in PEG parser and you can be experimenting with your own fascinating syntax in a matter of minutes.
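For comparison, a plain-Ruby method that does roughly what the bytecode example above spells out instruction by instruction (push self, push the string, send puts with one argument, return) would be:

```ruby
# Plain-Ruby counterpart to the dynamic_method example: send puts to self
# with one string argument and return the result.
class Hello
  def world
    puts "Hello, world"
  end
end

Hello.new.world   # prints "Hello, world"
```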

Check out the Rubinius Projects page for some of these language experiments. One language in particular is Fancy, which is fully bootstrapped (i.e. the Fancy compiler is now written in Fancy) on Rubinius.

Documentation

On the one hand, Rubinius just runs Ruby code, and you shouldn’t need any special knowledge to run your application on Rubinius. On the other hand, as I’ve discussed above, there are some specific Rubinius features that may be very helpful to you. However, they can only be as helpful as the documentation we have for them.

Before we released 1.2.0 in December last year, I spent quite a bit of time getting a new documentation system in place. Since then, we’ve had contributors help with translations to Russian, Polish, Spanish, and German. Adam Gardiner started documenting the garbage collector algorithms. Yehuda Katz (you may have heard the name) has contributed documentation for the bytecode compiler, complete with diagrams! Chuck Remes wrote up a great piece on the memory analysis tool.

We really appreciate these contributions. We understand the need for great documentation and we have been creating better support for it. In many cases, all that is needed is to just open a file and start writing. Of course, one cannot expect to understand much about Rubinius without digging into the code. If there is a particular part of Rubinius that you are curious about, jump in the #rubinius channel on freenode.net and ask us questions. We can point you in the right direction and help clarify things. If nothing else, let us know which part of the missing documentation is most important to you and we can start filling that in.

How you can help

There you have it, some super exciting things coming very soon for Rubinius and for Ruby! We would love to have your help making Rubinius even better. The most important thing you can do is try running your Ruby code. Give us feedback. Let us know what features or tools would make your life easier. Help us to build them.

Rubinius adopts Ruby’s rosy view of the world. We want to empower you to solve your hardest problems with Ruby, and have fun doing it.

Rubinius Has a Blog!

Many thought the day would never come, but Rubinius finally has a blog. That’s not all, though: We have integrated the website, blog, and documentation using Jekyll. The source code for it all is in the main Rubinius repository.

People have often requested that we write more about the awesome features in Rubinius. We hear you and we’d love to do this. However, there is always a trade-off between working on those awesome features and writing about them. Until now, it’s been rather painful to write docs or blog posts because we did not have good infrastructure in place. Now, I think we do. I’m sure there are still a lot of improvements we can make, but we have a good place to start. I’d like to give a brief tour of our new system.

The primary goal was to improve collaboration and reduce friction for writing new documentation and blog posts. That’s right, improve collaboration. There are many people who have experience developing Rubinius and running their applications on it. We love how people have collaborated with source code commits. Now anyone has the ability to write a blog post as well. I’ve written a basic How-To - Write a Blog Post document. If you have an idea for a blog post, just let us know. We will exercise a bit of editorial control just to ensure the topics are appropriate for Rubinius, but generally, we are thrilled to have your contributions.

Recently, we added the rbx docs command. This will run a web server on your machine and open a browser window to display the Rubinius documentation. Now the documentation will also be available at the rubini.us website. I have added a basic outline and a bunch of files to further simplify the task of writing docs. In many cases, merely open a file and start writing docs in Markdown format.

We have also begun translating our documentation to other languages. I am excited about this, being a huge language geek. I wish that I were proficient in 10 languages so I could polish our documentation for the many people who are not native English speakers. Alas, I only have a fair ability to write in Spanish, so we are again depending on your help. I started the translation effort by passing the existing English docs through Google translate. We have a beginning guide for How-To - Translate Documentation. I’ve been told by kronos_vano in our #rubinius IRC channel that he’s already working on a Russian translation. I personally would love to see Japanese and Chinese translations.

So that’s a brief introduction to our new infrastructure for documenting and explaining Rubinius. It’s been such a joy to see so many people contribute to the Rubinius source code over the years. We hope that the blog, documentation, and translations will further empower people to contribute and benefit from the value that Rubinius has to offer the Ruby community.

¡Adelante!