Tables and lambdas, a cure for smelly cases

Lots of folks consider case expressions in Ruby a code smell. I’m not ready to write them off just yet, but I know a good replacement for some uses of case when I see it. Rad co-worker David Copeland’s Lookup Tables With Lambdas is one of those replacements. For cases where a method takes a parameter, throws it into a case, and returns a value, I can replace all that lookup business with a hash lookup. To carry the metaphor through, the hash is the lookup table. Rad.

Where it gets fun is when I need to do some kind of dynamic lookup in the hash. Normally I wouldn’t want that work to happen at the moment the Ruby interpreter evaluates my hash literal. If I reach into my functional programming bag of tricks, I recall that lambdas can be used to defer evaluation. And that’s exactly what David recommends. If I’ve got database lookups or logic I need to embed in my tables, Ruby’s lambda comes to the rescue!
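Here’s a minimal sketch of the idea, not David’s actual code. The shipping-rate domain, the constant names, and the `shipping_rate` method are all made up for illustration. Every entry in the table is a lambda, so nothing is computed until a caller asks for it:

```ruby
# Hypothetical lookup table: every value is callable, so evaluation is
# deferred until lookup time instead of when the hash literal is built.
SHIPPING_RATES = {
  domestic:      ->(_order) { 5.00 },
  international: ->(order)  { 15.00 + order.fetch(:weight_kg) * 2.50 }
}.freeze

# Plays the role of the `else` branch in a case expression.
DEFAULT_RATE = ->(_order) { 0.0 }

def shipping_rate(region, order)
  SHIPPING_RATES.fetch(region, DEFAULT_RATE).call(order)
end

puts shipping_rate(:domestic, { weight_kg: 2 })       # prints 5.0
puts shipping_rate(:international, { weight_kg: 2 })  # prints 20.0
```

Making every value a lambda, even the constant ones, keeps the calling code uniform: it always calls `.call` and never has to check what kind of value it got back.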

This approach works great at the small-to-medium scale. That said, I always keep in mind that a bunch of methods manipulating a hash, using its keys as a convention, is an encapsulated, orthogonal object begging to happen. Remember, it’s Ruby; we can make our objects behave like hashes but still do OO- and test-driven design.
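When the table does start begging to be an object, the step up can be small. This is a rough sketch under my own assumptions; `RateTable`, `rate_for`, and the sample rates are hypothetical names, not anything from David’s post. The object still answers `[]` like a hash, but now it has a home for behavior and is easy to test on its own:

```ruby
# Hypothetical wrapper: a lookup table behind a small, testable interface.
class RateTable
  def initialize(rates, default: ->(_order) { 0.0 })
    @rates = rates
    @default = default
  end

  # Hash-like read access: returns the callable registered for the key,
  # or the default when the key is unknown.
  def [](region)
    @rates.fetch(region, @default)
  end

  # Intention-revealing entry point, so callers need not know about lambdas.
  def rate_for(region, order)
    self[region].call(order)
  end
end

rates = {
  domestic:      ->(_order) { 5.00 },
  international: ->(order)  { 15.00 + order.fetch(:weight_kg) * 2.50 }
}
table = RateTable.new(rates)

puts table.rate_for(:international, { weight_kg: 2 })  # prints 20.0
```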

Turns out I was wrong about RSpec subjects

I was afraid that David Chelimsky was going to take away my toys! Consider, explicit use of subject in RSpec considered a smell:

The problem with this example is that the word “subject” is not very intention revealing. That might not appear problematic in this small example because you can see the declaration on line 3 and the reference on line 6. But when this group grows to where you have to scroll up from the reference to find the declaration, the generic nature of the word “subject” becomes a hindrance to understanding and slows you down.

I’m so guilty of using subject heavily. Even worse, I’ve been advocating it to others too. In my defense, it does lend a good deal of concision to specs and seemed like a golden path.

Luckily, David isn’t taking away my toys. He’s got an even better recommendation: just use a method or let with an intention-revealing name. Here’s his example:

describe Article do
  def article; Article.new; end

  it "validates presence of :title" do
    article.should validate_presence_of(:title)
  end
end

This is, now that I’m looking at it, way better. As this spec grows, you can add helpers for article_with_comments, article_with_author, etc., and it’s clear what’s going on right on the line where the helper is used. No jumping back and forth between contexts. Thumbs up!

Three Easy Essays on Distributed Systems

Ryan Smith is pretty good at thinking about distributed systems. Distributed systems, the systems we (sometimes unwittingly) create on a regular basis these days, are a complicated, dense, far-reaching topic. Ryan’s managed to take a few of its problems and concisely introduce them with simple solutions that apply to all but the largest systems.

In The Worker Pattern, he presents a novel solution to a problem you are probably tackling with background or asynchronous job queues. Teaser: do you know what the HTTP 202 status code does?

A web service that requires high throughput will undoubtedly need to ensure low latency while processing requests. In other words, the process that is serving HTTP requests should spend the least amount of time possible to serve the request. Subsequently if the server does not have all of the data necessary to properly respond to the request, it must not wait until the data is found. Instead it must let the client know that it is working on the fulfillment of the request and that the client should check back later.
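To make the idea concrete, here’s a rough sketch of the request side, not Ryan’s implementation. The in-memory `RESULTS` and `JOB_QUEUE`, the report domain, and both method names are assumptions standing in for a real data store, job queue, and worker process. The handler returns a Rack-style status/headers/body triple and never waits for the work to finish:

```ruby
# In-memory stand-ins (assumptions): a real system would use a durable
# job queue and a shared data store instead of these constants.
RESULTS = {}
JOB_QUEUE = []

# Rack-style handler: returns [status, headers, body] without blocking.
def handle_report_request(report_id)
  if RESULTS.key?(report_id)
    [200, { "Content-Type" => "text/plain" }, [RESULTS[report_id]]]
  else
    JOB_QUEUE << report_id unless JOB_QUEUE.include?(report_id)
    # 202 Accepted: the request was received, but processing isn't done.
    [202, { "Content-Type" => "text/plain" }, ["Working on it, check back later"]]
  end
end

# A separate worker process would drain the queue; here it's just a method.
def work_off_queue
  while (report_id = JOB_QUEUE.shift)
    RESULTS[report_id] = "report #{report_id} is ready"
  end
end

first_status, = handle_report_request("r1")   # 202, job enqueued
work_off_queue
second_status, = handle_report_request("r1")  # 200, data is ready
puts first_status, second_status
```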

Coordinating multiple processes that need to process a dataset in bulk is tricky. Large systems usually end up needing some kind of consensus service (Paxos or similar) like Doozer or ZooKeeper to keep all the worker processes from butting heads or duplicating work. Leader Election shows how, by scoping the problem space to existing tools, it becomes possible to put together a solution that scales down to small and medium-sized systems:

My environment already is dependent on Ruby & PostgreSQL so I want a solution that leverages my existing technologies. Also, I don’t want to create a table other than the one which I need to process.

As applications grow, they tend to maintain more and more state across more and more systems. Incidental state is problematic, especially when you have to maintain several services to keep all of it available. Applying Event Buffering mitigates many of these problems. The core idea of this one is my favorite:

We have seen several examples of how to transfer state from our client to our server. The primary reason that we take these steps to transfer state is to eliminate the number of services in our distributed system that have to maintain state. Keeping a database on a service eventually becomes an operational hazard.

Most of the systems we build on the web today are distributed systems. Ryan’s writings are an excellent introduction to thinking about and building these systems. It certainly helps to comb through research papers on the topic, but these three essays are excellent starters down the path to intentionally building distributed systems.

Posted in Code

Ruby anthropology with Hopper

Zach Holman is doing some interesting code anthropology on the Ruby community. Consider Aggressively Probing Ruby Projects:

Hopper is a Sinatra app designed to pull down tens of thousands of Ruby projects from GitHub, snapshot each repository into ten equidistant revisions, run them through a battery of tests (which we call Probes), and hopefully come up with some deeply moving insights about how we write Ruby.

There are plenty of code metric gizmos out there. At a glance, Hopper takes a few nice steps over extant projects. Unlike previous tools, it has a clear design, an obvious extension mechanism, and the analysis tools are distinct from the reporting tools. Further, it’s designed to run out-in-the-open, on existing open source projects. This makes it immediately useful and gives it a ton of data to work with.
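The “ten equidistant revisions” part is a neat little sampling problem. I haven’t checked how Hopper actually does it; this is only a sketch of one way to pick evenly spaced snapshots from an ordered commit list, and `equidistant_revisions` and the fake commit names are mine:

```ruby
# Pick `count` evenly spaced entries from an ordered list of revisions,
# always including the first and the last one.
def equidistant_revisions(revisions, count = 10)
  return revisions.dup if revisions.size <= count
  return [revisions.first] if count == 1

  step = (revisions.size - 1).to_f / (count - 1)
  Array.new(count) { |i| revisions[(i * step).round] }
end

# Fake history (assumption): 100 commits, oldest first.
commits = (1..100).map { |n| "commit-#{n}" }
snapshots = equidistant_revisions(commits)

puts snapshots.size   # prints 10
puts snapshots.first  # prints commit-1
puts snapshots.last   # prints commit-100
```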

For entertainment, here’s some information collected on some stuff I worked on at Gowalla: Chronologic and Audit.

Posted in Curated

A real coding workspace

Do you miss the ability to take a bunch of paper, books, and writing utensils and spread them out over a huge desk or table? Me too!

Light Table is based on a very simple idea: we need a real work surface to code on, not just an editor and a project explorer. We need to be able to move things around, keep clutter down, and bring information to the foreground in the places we need it most.

This project is fantastic. It’s taking a page from the Smalltalk environments of yore, cross-referencing that with Bret Victor’s ideas on workspace interactivity. The result is a kick-in-the-pants to almost every developer’s current workflow.

There’s a lot to think about here. A lot of people focus on making their workflow faster, but what about a workspace that makes it easier to think? There’s a lot of room to design a better workspace, even if you’re not going as far as Light Table does.

There’s a project on Kickstarter to fund further development of Light Table. If you write software, it’s likely in your interest to chip in.

UserVoice’s extremely detailed project workflow

Some nice people at UserVoice took the time to jot down how they manage their product. Amongst the lessons learned:

Have a set amount of time per week that will be spent on bugs

We have roughly achieved this by setting a limit on the number of bugs we’ll accept into Next Up per week. This was a bit contentious at first but has resolved a lot of strife about whether a bug is worthy. The customer team is now empowered (or burdened) with choice of choosing which cards will move on. It’s the product development version of the Hunger Games.

This, to me, is an interesting juxtaposition. Normally, I think of bugs as things that should all be fixed, eventually. Putting some scarcity of labor into them is a great idea. Fixing bugs is great, until it negatively affects morale. Better to address the most critical and pressing bugs and then move the product ball forward. A mechanism to limit the number of bugs to fix, plus the feedback loop of recognizing those who fix bugs in an iteration (they mention this elsewhere in the article), is a great idea.

Posted in Code, Curated

Cowboy dependencies

So you’ve written a cool open source library. It’s at the point where it’s useful. You’re pretty excited. Even better, it seems like something that might be useful at your day job. You could go ahead and integrate it. Win-win! You get to work out the rough edges on your open source project and make progress on your professional project.

This is tricky ground and it’s not as win-win as you might think. Integrating a new dependency, whether it’s one maintained by a teammate or not, requires communication. Everyone on the team will have to know about the dependency, how to work with it, and how to maintain it within the project. If there’s a deal-breaking concern with the library, consider it feedback on your library; it either needs to better address the problem, or it needs better documentation to address why the problem isn’t so much a problem.

It all comes down to communication. Adding a dependency, even if you know the person who wrote it really well, requires collaboration from your teammates. If you’re not talking to your teammates, you’re just cowboy coding.

Don’t cowboy dependencies into your project!

Posted in Code

A Presenter is a signal

When someone says “your view or API layer needs presenters”, it’s easy to get confused. Presenter has become wildcard jargon for a lot of different sorts of things: coordinators, conductors, representations, filters, projections, helpers, etc. Even worse, many developers are in the “uncanny valley” stage of understanding the pattern; it’s close to being a thing, but not quite. I’ve come across presenters with entirely too much logic, presenters that don’t pull their weight as objects, and presenters that are merely indirection. Presenter is becoming a catch-all that stands for organizing logic more strictly than your framework typically provides for.

I could make it my mission to tell every single person they’re wrong about presenters, but that’s not productive and it’s not entirely correct. Rather, presenters are a signal. When you say “we use presenters for our API”, I hear you say “we found we had too much logic hanging around in our templates and so we started moving it into objects”. From there on out, every application is likely to vary. Some applications are template-focused and so need objects that are focused on presentational logic. Other apps are API-focused and need more support in the area of emitting and parsing JSON.

At first, I was a bit concerned about the explosion of options for putting more objects into the view part of your typical model-view-controller application. But as I see more applications and highly varied approaches, I’m fine with Rails not providing a standard option. Better to decide what your application really needs and craft it yourself or find something viable off the shelf.

As long as your “presenters” reduce the complexity of your templates and make logic easier to decouple and test, we’re all good, friend.
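For the sake of having something concrete to point at, here’s one small shape a presenter can take. It’s a sketch, not a recommendation of the one true pattern; the `Article` struct, `ArticlePresenter`, and the sample data are all hypothetical:

```ruby
# Hypothetical model; stands in for whatever your ORM hands you.
Article = Struct.new(:title, :published_at, :author_name)

class ArticlePresenter
  def initialize(article)
    @article = article
  end

  # Presentational logic that would otherwise sit in a template.
  def byline
    "by #{@article.author_name || 'Anonymous'}"
  end

  def published_on
    @article.published_at ? @article.published_at.strftime("%Y-%m-%d") : "Draft"
  end

  # An API-focused app might lean on something like this instead.
  def as_json
    { "title" => @article.title, "byline" => byline, "published_on" => published_on }
  end
end

article = Article.new("Tables and lambdas", Time.new(2012, 4, 17), nil)
presenter = ArticlePresenter.new(article)

puts presenter.byline        # prints by Anonymous
puts presenter.published_on  # prints 2012-04-17
```

The template (or JSON emitter) now asks the presenter simple questions, and the conditionals get tested in plain Ruby without rendering anything.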

Posted in Code

Learn Unix the Jesse Storimer way

11 Resources for Learning Unix Programming:

I tend to steer clear of the thick reference books and go instead for books that give me a look into how smart people think about programming.

I have a soft spot in my heart for books that are way too long. But Jesse’s on to something, I think. The problem with big Unix books is that they are tomes of arcane rites; most of their content just isn’t relevant to those building systems on a modern Unix (Linux) with modern tools (Java, Python, Ruby, etc.).

Jesse’s way of learning you a Unix is way better, honestly. Read concise programs, cross-reference them with manual pages. Try writing your own stuff. Rinse. Repeat.

How to approach a database-shaped problem

When it comes to caching and primary storage of an application’s data, developers are faced with a plethora of shiny tools. It’s easy to get caught up in how novel these tools are and get overenthusiastic about adopting them; I certainly have in the past! Sadly, this route often leads to pain. Databases, like programming languages, are best chosen carefully, rationally, and somewhat conservatively.

The thought process you want to go through is a lot like what former Gowalla colleague Brad Fults did at his new gig with OtherInbox. He needed to come up with a new way for them to store a mapping of emails. He didn’t jump on the database of the day, the system with the niftiest features, the one with the greatest scalability, or the one that would look best on his resume. Instead, he proceeded as follows:

  1. Describe the problem domain and narrow it down to two specific, actionable challenges
  2. Elaborate on the existing solution and its shortcomings
  3. Identify the possible databases to use and summarize their advantages and shortcomings
  4. Describe the new system and how it solves the specific challenges

Of course, what Brad wrote is post-hoc. He most likely did the first two steps in a matter of hours, took some days to evaluate each possible solution, decided which path to take, and then hacked out the system he later wrote about.

But more importantly, he cheated aggressively. He didn’t choose one database, he chose two! He identified a key attribute unique to his problem: he only needed a subset of his data to be relatively fresh. This gave him the luxury of choosing a cheaper, easier data store for the complete dataset.

In short: solve your problem, not the problem that fits the database, and cheat aggressively when you can.
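The two-database cheat is easier to see in code. This is my own sketch of the general shape, not Brad’s system; `TieredMapping`, its method names, and the plain hashes standing in for the two data stores are all assumptions:

```ruby
# Two-tier lookup: a small, fast store for the fresh subset and a cheaper
# store for the complete dataset. Both stores are Hash stand-ins here.
class TieredMapping
  def initialize(fresh_store:, archive_store:)
    @fresh = fresh_store
    @archive = archive_store
  end

  # Try the fresh tier first, then fall back to the complete dataset.
  def lookup(key)
    @fresh.fetch(key) { @archive[key] }
  end

  # New data lands in both tiers.
  def store(key, value)
    @fresh[key] = value
    @archive[key] = value
  end

  # Data that no longer needs to be fresh leaves only the fast tier.
  def expire(key)
    @fresh.delete(key)
  end
end

mapping = TieredMapping.new(fresh_store: {}, archive_store: {})
mapping.store("alice@example.com", "mailbox-1")
mapping.expire("alice@example.com")

puts mapping.lookup("alice@example.com")  # prints mailbox-1 (from the archive)
```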
