Full Frontal Nerdity

Saturday, January 10, 2009

vendor everything should include Rubygems itself

There's a lot of talk about the vendor everything approach for Rails. That is, you put all gems that your Rails app depends on into vendor (also called freezing). The newer versions of Rails even include rake tasks to help with specifying gem dependencies and freezing them into vendor/gems.

The problem is, Rails is tightly bound to features in Rubygems itself. For example, the feature I mentioned above ("config.gem") only works for Rubygems version > 1.1.1. Even if you aren't using any of the Rubygems-related features of Rails, it breaks on really old versions of Rubygems with errors such as

undefined method `loaded_specs' for Gem:Module (NoMethodError)

when using Rails 2.1.0 and Rubygems <= 0.9.0.
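If you'd rather fail with a clear message than a NoMethodError, a version guard is one option. This is only a sketch: rubygems_newer_than? is a made-up helper name, and I'm assuming Gem::VERSION is available (very old Rubygems releases may expose the version under a different constant).

```ruby
require 'rubygems'

# Returns true when `current` is strictly newer than `minimum`.
# Gem::Version compares segment by segment, so '1.10.0' is newer than '1.9.0'.
def rubygems_newer_than?(current, minimum)
  Gem::Version.new(current) > Gem::Version.new(minimum)
end

# Hypothetical guard; the 1.1.1 threshold comes from the config.gem note above.
# abort 'Rubygems is too old for config.gem' unless rubygems_newer_than?(Gem::VERSION, '1.1.1')
```

Comparing through Gem::Version rather than plain strings matters once a version segment reaches two digits.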

So why is this an issue?

First of all, it's a philosophical thing. We're trying to isolate our Rails app from system changes, right? We freeze Rails into vendor, we freeze gems into vendor, and yet we're still beholden to the Rubygems system? Why?

More practically, what if you're running on a host that has old software? Or, for reasons out of your control, you have to deploy your Rails app to an old system? In my case, I'm using Rails 2.1.0 but have to deploy to an Ubuntu Feisty (!) system. Feisty's Rubygems is 0.9.0.

What to do?

All of the "vendor everything" articles I've found just talk about vendoring the gems-- none of them ever talk about vendoring Rubygems itself (except, in a roundabout fashion, here).

To the extent that anyone addresses this issue, mostly they mention doing a "gem update --system" or somesuch. Sadly this doesn't usually work. It seems to break Rubygems on Feisty (assuming you even have rights to update Rubygems in the first place), and on newer systems you'll get this error:

gem update --system is disabled on Debian. RubyGems can be updated using the official Debian repositories by aptitude or apt-get.

which locks you into whatever the latest package is for your distro.

So I had to do this myself. It turns out it's not that hard!

First we have to find where the Rubygems code is. If you look in /usr/lib/ruby/1.8/ you'll see a bunch of files. Copy these entries

rbconfig/ rubygems/ rubygems.rb ubygems.rb

to vendor/rubygems (keeping the structure intact, of course).
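The copy step can be scripted. Here's a minimal sketch, assuming the four entries listed above are the ones you need; vendor_rubygems and RUBYGEMS_ENTRIES are names I made up for illustration.

```ruby
require 'fileutils'

# Entries to copy, taken from the list above.
RUBYGEMS_ENTRIES = %w[rbconfig rubygems rubygems.rb ubygems.rb]

# Hypothetical helper: copies the Rubygems files from a Ruby lib directory
# into vendor_dir, keeping the directory structure intact.
# Returns the entries that were actually found and copied.
def vendor_rubygems(ruby_lib_dir, vendor_dir)
  FileUtils.mkdir_p(vendor_dir)
  copied = []
  RUBYGEMS_ENTRIES.each do |entry|
    source = File.join(ruby_lib_dir, entry)
    next unless File.exist?(source) # skip entries this system doesn't have
    FileUtils.cp_r(source, vendor_dir) # lands at vendor_dir/entry
    copied << entry
  end
  copied
end

# Example call (paths are assumptions about your layout):
# vendor_rubygems('/usr/lib/ruby/1.8', 'vendor/rubygems')
```

Copy from a machine whose Rubygems version you actually want to ship, not from the old box you're deploying to.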

Now we have to tell our Rails app to look in vendor for Rubygems rather than use the system version. We do this by adding vendor/rubygems to the load path ($:) variable:

RUBYGEMS_VENDORED = File.join(RAILS_ROOT, 'vendor/rubygems')
$:.insert(0, RUBYGEMS_VENDORED)

But where do we put this code? My first thought was to put it in config/environment.rb, but this doesn't work. It's not early enough. We need to put this at the top of config/boot.rb (despite what the "do not edit" comment says), after the line that defines RAILS_ROOT, since the snippet uses it. The reason for this is that boot.rb does some Rubygems stuff before it loads environment.rb.

Now you should be able to run your Rails app anywhere-- even on systems that don't even have Rubygems installed. Your only system dependency should be Ruby itself (and possibly rake).

Friday, December 26, 2008

MySQL gotcha

Try this on a fresh Ubuntu installation:

$ sudo apt-get install mysql-server mysql-client
$ mysql -u root -p
mysql> grant all on testdb.* to 'testuser'@'%' identified by 'testpass';
mysql> quit;
$ mysql -u testuser -ptestpass
ERROR 1045 (28000): Access denied for user 'testuser'@'localhost' (using password: YES)

WTF? Why is the testuser being denied access? We just created it!

Let's have a look at the users in the system:

$ mysql -u root -p
mysql> select User,Host from mysql.user order by User;
+------------------+----------------+
| User             | Host           |
+------------------+----------------+
|                  | localhost      |
|                  | ubuntu-desktop |
| debian-sys-maint | localhost      |
| root             | localhost      |
| root             | ubuntu-desktop |
| root             | 127.0.0.1      |
| testuser         | %              |
+------------------+----------------+
7 rows in set (0.00 sec)
mysql> quit;

Notice the empty user names (these show up as "Any" in phpMyAdmin). It turns out that ''@'localhost' will trump 'testuser'@'%'. Why is this? Intuitively you'd think the account that actually names testuser would win out, but MySQL looks at the host first, and 'localhost' beats the wildcard '%' because it is more specific. From the MySQL Docs:

One account ('monty'@'localhost') can be used only when connecting from the local host. The other ('monty'@'%') can be used to connect from any other host. Note that it is necessary to have both accounts for monty to be able to connect from anywhere as monty. Without the localhost account, the anonymous-user account for localhost that is created by mysql_install_db would take precedence when monty connects from the local host. As a result, monty would be treated as an anonymous user. The reason for this is that the anonymous-user account has a more specific Host column value than the 'monty'@'%' account and thus comes earlier in the user table sort order.
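To make the sort order concrete, here's a toy model of the lookup in Ruby. It's a simplification based on the passage above, not MySQL's actual code: Account, matching_account and the rank helpers are all invented names, and host matching only handles a literal name or a bare '%'.

```ruby
# Toy model of how the server picks an account row at connect time.
Account = Struct.new(:user, :host)

# Lower rank sorts first: literal hosts beat wildcard patterns.
def host_rank(host)
  host.include?('%') ? 1 : 0
end

# Lower rank sorts first: a named user beats the anonymous user ''.
def user_rank(user)
  user.empty? ? 1 : 0
end

# Simplified: a host pattern matches if it is '%' or equals the client host.
def host_matches?(pattern, client_host)
  pattern == '%' || pattern == client_host
end

# The anonymous user matches any login name.
def user_matches?(account_user, login)
  account_user.empty? || account_user == login
end

# Sort by host specificity, then user specificity, and take the first match.
def matching_account(accounts, login, client_host)
  sorted = accounts.sort_by { |a| [host_rank(a.host), user_rank(a.user)] }
  sorted.find { |a| host_matches?(a.host, client_host) && user_matches?(a.user, login) }
end
```

Feed it the rows from the table above with login 'testuser' and client host 'localhost', and the first match is the anonymous ''@'localhost' row, whose password isn't testpass. Hence the access denied error.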

So what we need to do is have two testusers-- one at 'localhost' and one at '%'. (Although if you know testuser is only ever going to connect locally, you don't need the '%'.)

$ mysql -u root -p
mysql> grant all on testdb.* to 'testuser'@'localhost' identified by 'testpass';
mysql> quit;
$ mysql -u testuser -ptestpass
mysql> quit;

Woot!

Wednesday, December 10, 2008

Spoofing subdomains

Apache supports subdomains (e.g., subdomain.mydomain.com) through the use of VirtualHost and ServerName.

This isn't magic, though! Apache can't help you if the DNS isn't set up to find your subdomain. That is, you need DNS set up to point your subdomain at your machine (e.g., *.mydomain.com => mydomain.com or something).

If you're on a development box, this can be complicated. You either have to get your DNS admin to add the forwarding (if you're in a big corporate network, this can be problematic) or you have to create your own LAN-local DNS server (using bind9 or somesuch...omg).

This is way too much trouble if you just want to try some stuff out! Isn't there a way to simulate accessing a subdomain?

Actually, yes.

If you think about it: how does Apache even know when you're hitting a subdomain? When you tell your browser to go to a URL, it's just hitting an IP address in the end (subdomain or not). How is this information passed on?

It turns out it's passed in through the "Host" HTTP header.

If there were some way to hack the "Host" header, we could access the ordinary URL but have the "Host" header be the subdomain, in which case Apache should respond with the subdomain's website.

As it happens, there's a great little Firefox addon called, appropriately enough, Modify Headers. Using it we can edit, add, or remove any HTTP headers that Firefox will send to the webserver.

For example, I can set the "Host" header to "subdomain.mydomain.com" in Modify Headers.

Then in Firefox I can go to the URL "mydomain.com" and Apache will give me the website for "subdomain.mydomain.com". Woot!
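You can do the same thing without a browser. Here's a rough sketch using Ruby's Net::HTTP; request_for_vhost and fetch_vhost are names I made up, and I'm relying on Net::HTTP leaving an explicitly set Host header alone, which is my understanding of how it behaves.

```ruby
require 'net/http'

# Builds a GET request whose Host header names the virtual host we want,
# regardless of which address the request is actually sent to.
def request_for_vhost(path, vhost)
  request = Net::HTTP::Get.new(path)
  request['Host'] = vhost
  request
end

# Sends the request to `address` (the name that really resolves) while asking
# Apache for `vhost`. Needs a reachable server, so it is not called here.
def fetch_vhost(address, vhost, path = '/', port = 80)
  Net::HTTP.start(address, port) do |http|
    http.request(request_for_vhost(path, vhost))
  end
end

# Example call (host names taken from the post):
# puts fetch_vhost('mydomain.com', 'subdomain.mydomain.com').body
```

Handy for scripted checks of a VirtualHost before the DNS entry exists.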

Apache and mod_rewrite

If you're getting errors like this in Apache:

.htaccess: Invalid command 'RewriteEngine', perhaps misspelled or defined by a module not included in the server configuration

that probably means you don't have mod_rewrite enabled. You can enable it easily enough by using the "a2enmod" tool. On Ubuntu, like this:

$ sudo a2enmod rewrite

and then restart Apache, of course.

Friday, December 5, 2008

Piping input to exec in Ant

Using the Exec task in Ant is pretty useful.

But what if the program you want to run is interactive? That is, it prompts for input and won't let you pass in arguments from the command line (or it's insecure to pass them in on the command line, like a password). Well, unfortunately we can't get the interactivity to show up in Ant. (It's a design decision. I'm not totally clear on why, but it has to do with reliably being able to know when the child process is done.)

So we have to pipe the input in. Using the "inputstring" attribute we can do this. (And using the "input" task, you can prompt for things in Ant and then pass them on to the exec task using inputstring.)

But what about multiple arguments? e.g., where the user would hit <enter>?

This stumped me for a bit (there's a dearth of exec and inputstring examples on the web) until I realized that we need to encode the <enter> into the inputstring ourselves. But how? '\n' doesn't mean anything to Ant.

But we can encode the ASCII code for '\n' as an XML entity. The ASCII code for '\n' is 0x0A, so we use "&#x0A;".

Like so:

<!-- Line-Feed (LF or '\n') -->
<property name="LF" value="&#x0A;" />

<!-- arguments to feed into exec -->
<property name="username" value="joeschmoe" />
<property name="password" value="blahblah" />

<!-- each ${LF} stands in for the user pressing Enter -->
<exec executable="${myprog}" failonerror="true" inputstring="${username}${LF}${password}${LF}">
  <arg value="lalala" />
</exec>

How do you verify your Ant version?

In my last post, I mentioned that "SecureInputHandler" was only added as of Ant 1.7.1.

But how do you make your buildfile require a particular version of Ant?

You can do it this way. It works, but it's inelegant.

You can also do it this way:

<property name="ant.min-version" value="1.7.1" />
<target name="verify-ant">
  <fail message="Ant version is '${ant.version}'.  Must have ant version ${ant.min-version}+">
    <condition>
      <not><antversion atleast="${ant.min-version}" /></not>
    </condition>
  </fail>
</target>

Password prompting in Ant

Sometimes you've just gotta prompt for info in Ant. Everything can't go in config files-- passwords, for example.

The problem is, for things like passwords we'd really like to not have Ant echo what we're typing.

It turns out that in Ant 1.7.1 they added a "SecureInputHandler" which does this (it takes advantage of some things in Java 1.6). It's not documented, exactly, but it's in there:

<target name="input-test">
  <input message="username:>" addproperty="username" defaultvalue="lalala" />

  <input message="password:>" addproperty="password">
    <handler classname="org.apache.tools.ant.input.SecureInputHandler" />
  </input>

  <echo message="username= ${username}" />
  <echo message="password= ${password}" />
</target>