Thursday, January 30, 2014

What Ruby does at parse time

Yesterday at the Las Vegas Ruby Group, someone was surprised that Kernel#global_variables will list global variables that have not been used yet. The same thing is actually true for symbols and methods, as shown here:


p Symbol.all_symbols.map(&:to_s).grep(/neve/)
    # => ["never", "never_runs", "$never_global", "never_sym"]

p global_variables.grep(/neve/)  # => [:$never_global]

def never_runs
  $never_global
  :never_sym
end

The first statement of this program tells us that Ruby already knows about four names. It found a symbol :never, which apparently comes from the Ruby interpreter itself since we never defined it. It found the name of the method in our program even though that method has not been defined yet at that point. It found a symbol we used inside that method even though we are never going to call the method. Similarly, it found the name of a global variable that we are never going to use.

The second line just reproduces the original thing that happened at the Ruby group, and shows that Ruby is aware of "$never_global" as an actual global variable and not just a symbol.

To understand this, you need to know that Ruby does not simply run a source file one line at a time:

  • First, Ruby parses the syntax of all the code in the file.

  • Later, Ruby runs the code in the file.
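
The two phases can also be observed separately. Here is a minimal sketch, assuming MRI (RubyVM::InstructionSequence is specific to that implementation); the global $phase_marker and the source string are made up for illustration:

```ruby
# MRI-specific sketch: RubyVM::InstructionSequence is not part of every Ruby implementation.
# $phase_marker is a hypothetical global used only to detect whether the code has run.
source = "$phase_marker = :ran"

# Phase 1: parse and compile only. Nothing inside `source` has executed yet.
iseq = RubyVM::InstructionSequence.compile(source)
before_run = $phase_marker
p before_run  # => nil

# Phase 2: run the code that was compiled above.
iseq.eval
after_run = $phase_marker
p after_run   # => :ran
```

The assignment only takes effect once the compiled code is evaluated, which is the "later" step in the list above.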

We already intuitively know this because whenever we make a syntax error in a file, we can see that Ruby just reports a syntax error and does not run any of the other code in the file. This means it is parsing the entire file before running any of it.
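
The same effect can be reproduced inside a single script with eval. This is only a sketch: the source string and the global $ran_first_line are hypothetical, and eval stands in for loading a separate file.

```ruby
# A valid first line followed by a line with a syntax error.
# $ran_first_line is a hypothetical global used to detect execution.
source = "$ran_first_line = true\ndef broken(\n"

error_seen = false
begin
  eval(source)
rescue SyntaxError
  # SyntaxError is not a StandardError, so it has to be rescued by name.
  error_seen = true
end

p error_seen       # => true
p $ran_first_line  # => nil, the valid first line never ran
```

If Ruby ran the string one line at a time, the global would have been set before the error was reached.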

So it shouldn't be surprising that the Ruby interpreter is aware of symbols and global variables before they are used; it just tells us that those things are probably recorded at parse time instead of run time. Some more evidence for this conclusion can be found in the fact that the Ruby interpreter has lots of symbol table code in parse.y.
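
One way to test this conclusion is to compile some code without ever running it and check the symbol table before and after. The following is a sketch under assumptions: it relies on the MRI-specific RubyVM::InstructionSequence, and the probe name and method name are invented for the experiment.

```ruby
# MRI-specific sketch. The probe name is held in a String so that
# this script does not create the Symbol just by mentioning it.
probe = "zz_parse_probe_sym"  # hypothetical name, chosen to be unlikely to exist already
src   = "def zz_parse_probe_method; :#{probe}; end"

# Checks whether a symbol with the probe's name currently exists.
known = lambda { Symbol.all_symbols.map(&:to_s).include?(probe) }

before = known.call

# Parse and compile only: the method is never defined, let alone called.
iseq = RubyVM::InstructionSequence.compile(src)

after = known.call

p before  # => false
p after   # => true
```

The symbol shows up after compilation alone, which is consistent with the idea that it is recorded before any of the code runs.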

Most of the time you can just pretend that Ruby processes your file one line at a time, but this is a case where the details of its behavior reveal a little bit about how it actually works.

The code in this blog post was executed with Ruby 2.0.0. Ruby 1.8.7 had the same results except the symbol "never" was not present.