kaukas

An Idiomatic Ruby Interview

I had an interview for a Ruby developer position. As part of a live Rails coding session a class performing operations needed refactoring into supporting services. Let's say the logic in question is

query = 'troglodytes troglodytes'
response = URI.open("https://xeno-canto.org/api/2/recordings?query=#{CGI.escape(query)}")
if response.status[0] == '200'
  body = JSON.load(response)
  ...
end

This is a time-boxed assignment, so my mind races through the possible improvements. The simplest that could work is a method:

def fetch_bird(query)
  response = URI.open("https://xeno-canto.org/api/2/recordings?query=#{CGI.escape(query)}")
  [response.status[0], JSON.load(response)]
end

Ah, that return tuple is ugly. We should return some object instead. Also, typically you do not have free floating functions in Rails. The "idiomatic" approach that I've encountered in production Rails apps the most is to create a class like this:

class BirdFetcher
  BIRDS_API_URL = "https://xeno-canto.org/api/2/recordings?query=%s"

  attr_reader :status, :body

  def initialize(query)
    @query = query
  end

  def fetch
    url = sprintf(BIRDS_API_URL, CGI.escape(@query))
    response = URI.open(url)
    @status = response.status[0]
    @body = JSON.load(response)
  end
end

This is great. But also not, primarily because the instance of BirdFetcher is stateful. There is the right way to use it (initialize, fetch, then status) and the wrong way (initialize, status).

Seasoned Ruby developers call memoization to the rescue!

class BirdFetcher
  BIRDS_API_URL = "https://xeno-canto.org/api/2/recordings?query=%s"

  def initialize(query)
    @query = query
  end

  def status
    response.status[0]
  end

  def body
    JSON.load(response)
  end

  private

  def response
    @response ||=
      begin
        url = sprintf(BIRDS_API_URL, CGI.escape(@query))
        URI.open(url)
      end
  end
end

Great, only one way to call it! Except the statefulness is now hidden and waits to bite us back. You see, the call is lazy now. It means both status and body can be very slow or very fast, might throw errors or not, depending on which one you call first! Also, notice the boilerplate in the previous example. This one has more!

A better design uses classes to prevent state:

class BirdFetcher
  class Result
    attr_accessor :status, :body

    def initialize(status, body)
      @status = status
      @body = body
    end
  end

  BIRDS_API_URL = "https://xeno-canto.org/api/2/recordings?query=%s"

  def initialize(query)
    @query = query
  end

  def fetch
    url = sprintf(BIRDS_API_URL, CGI.escape(@query))
    response = URI.open(url)
    Result.new(response.status[0], JSON.load(response))
  end
end

This is an improvement; you can't fetch birds wrong anymore. But now we have an inner class which is slightly ugly in itself (besides, with Rails autoloading you're tempted to put it in bird_fetcher/result.rb. Yuck.). Let's see if we can get rid of the inner class:

class BirdFetcher
  BIRDS_API_URL = "https://xeno-canto.org/api/2/recordings?query=%s"

  def self.fetch(query)
    url = sprintf(BIRDS_API_URL, CGI.escape(query))
    response = URI.open(url)
    new(response.status[0], JSON.load(response))
  end

  attr_accessor :status, :body

  def initialize(status, body)
    @status = status
    @body = body
  end
end

We flipped the fetching code and the Result class around. Now BirdFetcher is the result itself. If you get its instance via BirdFetcher.fetch you're guaranteed to do the right thing. Nobody in their right mind would pass status and body to directly create an instance of BirdFetcher and still expect it to fetch birds. I mean, we could still try and hide the initializer…

class BirdFetcher
  ...

  def initialize(status, body)
    @status = status
    @body = body
  end

  # Do not initialize directly.
  private_class_method :new
end

That's overkill but works!

private method `new' called for BirdFetcher:Class (NoMethodError)

So that's what I should have written during the interview.

But I did not. See, we're dancing around the fact that initialize is needed by Ruby to create an object and set instance variables. But we do not want users to initialize the instance directly. And—it is pure boilerplate. Instead I put the fetching logic into the initialize:

class BirdFetcher
  BIRDS_API_URL = "https://xeno-canto.org/api/2/recordings?query=%s"

  attr_accessor :status, :body

  def initialize(query)
    url = sprintf(BIRDS_API_URL, CGI.escape(query))
    response = URI.open(url)
    @status = response.status[0]
    @body = JSON.load(response)
  end
end

There. One class. A simple, stateless interface, no laziness, no way to misuse it. And, by the way, the remaining boilerplate is pretty essential.

I could almost hear alarms going off in the minds of my interviewers. Blasphemy! As a matter of fact, I kept suppressing a tiny alarm in my own mind, too. Why?

Not sure really. It's an old adage, "avoid logic in constructors". If you search the internets for this phrase most of the content deals with Java and similar languages.

There's also an old but very good article by Miško Hevery on why and how they are a bad idea, with Java examples. Let's see how those apply to Ruby.

It violates the Single Responsibility Principle

Indeed, BirdFetcher should probably not play bird songs in the initializer. But it should not play them at all. BirdFetcher should be responsible for one thing, fetching birds. Where you put that thing, initializer or another method, is an implementation detail.

Testing Directly is Difficult

If we had to pass URI as a dependency to be able to mock it then yes. But in Ruby web requests are typically mocked at a lower level using webmock or something similar.

Subclassing and Overriding to Test is Still Flawed

BirdFetcher is an algorithm masking as a class. Subclassing is a bad idea. Use composition. This applies to every BirdFetcher implementation, not just the last one.

It Forces Collaborators on You

Again, the collaborator can be tested with webmock. If BirdFetcher gets more logic in the future then reconsider the approach. That said, be sure to follow the SRP first.

In general, the above arguments make sense when dealing with an object that has algorithms. They are irrelevant in our case since we're dealing with an algorithm that involves trivial objects.

By the way, I am not the first to propose the idea to do the messy stuff in the constructor and treat objects as values otherwise. Clojure is (spiritually) fine with mutability while the object (for example, a collection) is being constructed but expects it to become immutable right afterwards.


Anyway. I failed that interview. No doubt my BirdFetcher equivalent was not the main reason, likely not even the largest one. And they were not even wrong: I believe in my solution because it fits in this particular case. However, "keep it idiomatic" is a valid argument as well. It still irked me that you could use this as one of the reasons to reject a person.

Imagine you employed me and I created a refactoring PR and you found a logic-in-constructor class in it. No clean code nonsense 'round here! Fine. All you would need to do is say

Err… Don't do dat.

… and I will happily rewrite it with memoization.