July 05, 2012
Older: Misleading Title About Queueing
Newer: Booleans are Baaaaaaaaaad
Four Guidelines That I Feel Have Improved My Code
I have been thinking a lot about isolation, dependencies and clean code of late. I know there is a lot of disagreement with people vehemently standing in both camps.
I certainly will not say either side is right or wrong, but what follows is what I feel has improved my code. I post it here to formalize some recent thoughts and, if I am lucky, get some good feedback.
Before I rush into the gory details, I feel I should mention that I went down this path, not as an architecture astronout, but out of genuine pain in what I was working on.
My models were growing large. My tests were getting slow. Things did not feel “right”.
I started watching Gary Bernhardt’s Destroy All Software screencasts. He is a big proponent of testing in isolation. Definitely go get a subscription and take a day to get caught up.
On top of DAS, I started reading everything I could on the subject of growing software, clean code and refactoring. When I say reading, I really should say devouring.
I was literally prowling about like a lion, looking for the next book I could devour. Several times my wife asked me to get off my hands and knees and to kindly stop roaring about SRP.
Over the past few months as I have tried to write better code, I have definitely learned a lot. Learning without reflection and writing is not true learning for me.
Reflecting on why something feels better and then writing about it formalizes it in my head and has the added benefit of being available for anyone else who is struggling with the same.
Here are a few guidelines that have jumped out at me over the past few days as I reflected on what I have been practicing the past few months.
Guideline #1. One responsibility to rule them all
Single responsibility principle (SRP) is really hard. I think a lot of us are frustrated and feeling the pain of our chubby <insert your favorite ORM> classes. Something does not feel right. Working on them is hard.
The problem is context. You have to load a lot of context in your brain when you crack open that INFAMOUS user model. That context takes up the space where we would normally create and come up with new solutions.
Create More Classes
So what are we to do? Create more classes. Your models do not need to inherit from ActiveRecord::Base, or include MongoMapper::Document, or whatever.
A model is something that has business logic. Start breaking up your huge models that have persistence bolted on into plain old Ruby classes.
I am not going to lie to you. If you have not been doing this, it will not be easy. Everything will seem like it should just be tucked as another method in a model that also happens to persist data in a store.
Naming is Hard
Another pain point will be naming. Naming is fracking hard. You are welcome for the BSG reference there. I would like to take that statement a step further though.
Naming is hard because our classes and methods are doing too much. The fewer responsibilities your class has, the easier it will be to name, especially after a few months of practice.
An Example
Enough talk, lets see some code. In our track processors, which pop tracks off a queue and store reports in a database, we query for the gauge being tracked before storing reports. The purpose of this query is to ensure that the gauge is in good standing and that we should, in fact, store reports in the database for it.
A lot of people throw the tracking code on their site and never remove it or sign up for a paying account. We do this find to make sure those people noop, instead of creating tons of data that no one is paying for.
This query happens for each track and it is pulling information that rarely if ever changes. It seemed like a prime spot for a wee bit of caching.
First, I created a tiny service around the memcached client I decided to use. This only took an hour and it means that my application now has an interface for caching (get
, set
, delete
, and fetch
). I’ll talk more about this in guideline #3.
Once I had defined the interface Gauges would use for caching, I began to integrate it. After much battling and rewriting of the caching code, each piece felt like it was doing too much and things were getting messy.
I stepped back and thought through my plans. I wanted to cache only the attributes, so I threw everything away and started with that. First, I wanted to be able to read attributes from the data store.
class GaugeAttributeService
def get(id)
criteria = {:_id => Plucky.to_object_id(id)}
if (attrs = gauge_collection.find_one(criteria))
attrs.delete('_id')
attrs
end
end
end
Given an id, this class returns a hash of attributes. That is pretty much one responsibility. Sweet action. Let’s move on.
Second, I knew that I wanted to add read-through caching for this. Typically read-through caching uses some sort of fetch pattern. Fetch is basically a shortcut for look first in the cache and if it is not there, compute the block, store the computed result in the cache and return the computed result.
If I would have added caching in the GaugeAttributeService
class, I would have violated SRP. Describing the class would have been "checks the cache and if not there it fetches from database". Note the use of "and".
As Growing Object Oriented Software states:
Our heuristic is that we should be able to describe what an object does without using any conjunctions (“and,” “or”).
Instead, I created a new class to wrap (or decorate) my original service.
class GaugeAttributeServiceWithCaching
def initialize(attribute_service = GaugeAttributeService.new)
@attribute_service = attribute_service
end
def get(id)
cache_service.fetch(cache_key(id)) {
@attribute_service.get(id)
}
end
end
I left a few bits out of this class so we can focus on the important part, which is that all we do with this class is wrap the original one with a cache fetch.
As you can see, naming is pretty easy for this class. It is a gauge attribute service with caching and is named as such. It initializes with an object that must respond to get
. Note also that it defaults to an instance of GaugeAttributeService
.
Unit testing this class is easy as well. We can isolate the dependencies (attribute_service
and cache_service
) in the unit test and make sure that they do what we expect (fetch
and get
).
Note: There definitely could a point made that "with" is the same as "and" and therefore means that we are breaking SRP. Naming is hard, really hard. Rather than get mired forever in naming, I rolled with this convention and, at this point, it does not bother me. I am definitely open to suggestions. Another name I played with was CachedGaugeAttributeService.
Below is an example setup with new dependencies inject in the test that help us verify this classes behavior in isolation.
attributes = {'title' => 'GitHub'}
attribute_service = Class.new do
def get(id)
attributes
end
end.new
cache_service = Class.new do
def fetch(key)
get(key) || yield
end
def get(key)
end
end.new
service = GaugeAttributeServiceWithCaching.new(attribute_service)
service.cache_service = cache_service
Above I used dynamic classes. Instead of dynamic classes, one could use stubbing or whatever. I’ll talk more about cache_service=
later.
Decorating in this manner means we can easily find without caching by using GaugeAttributeService or with caching by using GaugeAttributeServiceWithCaching.
The important thing to note is that we added new functionality to our application by extending existing parts instead of changing them. I read recently, but cannot find the quote, that if you can add a new feature purely by extending existing classes and creating new classes, you are winning.
Guideline #2. Use accessors for collaborators
In the example above, you probably noticed that when testing GaugeAttributeServiceWithCaching
, I changed the cache service used by assigning a new one. What I often see is others using some top level config, or even worse they actually use a $
global.
# bad
Gauges.cache = Memcached.new
class GaugeAttributeServiceWithCaching
def get(id)
Gauges.cache.fetch(cache_key(id)) { … }
end
end
# worse
$cache = Memcached.new
class GaugeAttributeServiceWithCaching
def get(id)
$cache.fetch(cache_key(id)) { … }
end
end
What sucks about this is you are coupling this class to a global and coupling leads to pain. Instead, what I have started doing is using accessors to setup collaborators. Here is the example from above, but now with the cache service accessors included.
class GaugeAttributeServiceWithCaching
attr_writer :cache_service
def cache_service
@cache_service ||= CacheService.new
end
end
By doing this, we get a sane, memoized default for our cache service (CacheService.new
) and the ability to change that default (cache_service=
), either in our application or when unit testing.
Finding ourselves doing this quite often, we created a library, aptly named Morphine. Right now it does little more than what I just showed (memoized default and writer method to change).
As I have started to use this gem, I am getting more ideas for things that would be helpful. Here is the same code as above, but using Morphine. What I like about it, over a memoized method and an attr_writer is that it feels a little more declarative and creates a standard way of declaring collaborators for classes.
class GaugeAttributeServiceWithCaching
include Morphine
register :cache_service do
CacheService.new
end
end
Note also that I am not passing these dependencies in through initialize. At first I started with that and it looked something like this:
class GaugeAttributeServiceWithCaching
def initialize(attribute_service = GaugeAttributeService.new,
cache_service = CacheService.new)
@attribute_service = attribute_service
@cache_service = cache_service
end
end
Personally, over time I found this method tedious. My general guideline is pass a dependency through initialize when you are going to decorate it, otherwise use accessors. Let’s look at the attribute service with caching again.
class GaugeAttributeServiceWithCaching
include Morphine
register :cache_service do
CacheService.new
end
def initialize(attribute_service = GaugeAttributeService.new)
@attribute_service = attribute_service
end
end
Since this class is decorating an attribute service with caching, I pass in the service we want to decorate through initialize. I do not, however, pass in the cache service through initialize. Instead, the cache service uses Morphine (or accessors).
First, I think this makes the intent more obvious. The intent of this class is to wrap another object, so that object should be provided to initialize. Defaulting the service to wrap is merely a convenience.
Second, the cache service is a dependency, but not one that is being wrapped. It purely needs a sane default and a way to be replaced, therefore it uses Morphine (or accessors).
I cannot say this is a hard and fast rule that everyone should follow and that you are wrong if you do not. I can say that through trial and error, following this guideline has led to the least amount of friction while maintaining flexibility and isolation.
Guideline #3. Create real interfaces
As I mentioned above, the first thing I started with when working on the caching code was an interface for caching for the application, rather than just using a client directly. Occasionally what I see people do is create an interface, but wholesale pass arguments through to a client like so:
# bad idea
class CacheService
def initialize(driver)
@driver = driver
end
def get(*args)
@driver.get(*args)
end
def set(*args)
@driver.set(*args)
end
def delete(*args)
@driver.delete(*args)
end
end
In my opinion, this is abstracting at the wrong level. All you are doing is adding a layer of indirection on top of a driver. It makes it harder to follow and any exceptions that the driver raises will be raised in your application. Also, any parameters that the driver works with, your interface will work with. There is no point in doing this.
Instead, create a real interface. Define the methods and parameters you want your application to be able to use and make that work with whatever driver you end up choosing or changing to down the road.
Handling Exceptions
First, I created the exceptions that would be raised if anything goes wrong.
class CacheService
class Error < StandardError
attr_reader :original
def initialize(original = $!)
if original.nil?
super
else
super(original.message)
end
@original = original
end
end
class NotFound < Error; end
class NotStored < Error; end
end
CacheService::Error is the base that all other errors inherit from. It wraps whatever the original error was, instead of discarding it, and defaults to the last exception that was raised $!
. I will show how these are used in a bit.
Portability and serialization
I knew that I wanted the cache to be portable, so instead of just defaulting to Marshal’ing, I used only raw operations and ensured that I wrapped all raw operations with serialize and deserialize, where appropriate.
In order to allow this cache service class to work with multiple serialization methods, I registered a serializer dependency, instead of just using MultiJson’s dump
and load
directly. I then wrapped convenience methods (serialize
and deserialize
) that handle a few oddities induced by the driver I am wrapping.
class CacheService
include Morphine
register :serializer do
Serializers::Json.new
end
private
def serialize(value)
serializer.serialize(value)
end
def deserialize(value)
if value.is_a?(Hash) # get with multiple keys
value.each { |k, v| value[k] = deserialize(v) }
value
else
serializer.deserialize(value)
end
end
end
Handling exceptions (continued)
I then created a few private methods that hit the driver and wrap exceptions. These private methods are what the public methods use to ensure that exceptions are properly handled and such.
class CacheService
private
def driver_read(keys)
deserialize(@driver.get(keys, false))
rescue Memcached::NotFound
raise NotFound
rescue Memcached::Error
raise Error
end
def driver_write(method, key, value)
@driver.send method, key, serialize(value), DefaultTTL.call, false
rescue Memcached::NotStored
raise NotStored
rescue Memcached::Error
raise Error
end
def driver_delete(key)
@driver.delete(key)
rescue Memcached::NotFound
raise NotFound
end
end
At this point, no driver specific exceptions should ever bubble outside of the cache service. When using the cache service in the application, I need only worry about handling the cache service exceptions and not the specific driver exceptions.
If I change to a different driver, only this class changes. The rest of my application stays the same. Big win. How many times have you upgraded a gem and then had to update pieces all over your application because they willy-nilly changed their interface.
The public interface
All that is left is to define the public methods and parameters that can be used in the application.
class CacheService
def get(keys)
driver_read(keys)
rescue NotFound
nil
end
def set(key, value)
driver_write :set, key, value
end
def delete(key)
driver_delete key
rescue NotFound
nil
end
end
At this point, the application has a defined interface that it can work with for caching and for the most part does not need to worry about exceptions as they are wrapped and, in some cases, even handled (ie: nil for NotFound).
Creating real interfaces ensures that expectations are set and upgrades are easy. Defined interfaces give other developers on the project confidence that if they follow the rules, things will work as expected.
Guideline #4. Test the whole way through
Whatever you want to call them, you need tests that prove all your components are wired together and working as expected, in the same manor as they will be used in production.
The reason a lot of developers have felt pain with pure unit testing and isolation is because they forget to add that secondary layer of tests on top that ensure that the way things are wired together works too.
Unit tests are there to drive our design. Acceptance tests are there to make sure that things are actually working the whole way through. Each of these are essential and not to be skipped over.
If you are having problems testing, it may be your design. If you are getting burned by isolation, you are probably missing higher level tests. You should be able to kill your unit tests and still have reasonable confidence that your system is working.
Nowadays, I often start with a high level test and then work my way in unit testing the pieces as I make them. I’ve found this keeps me focused on the value I am adding and ensures that my coverage is good.
Conclusion
While it has definitely taken a lot of trial and error, I am starting to find the right balance between flexibility, isolation and overkill.
- Stick to single responsibilities.
- Inject decorated dependencies through initialization and use accessors for other dependencies.
- Create real interfaces.
- Test in isolation and the whole way through.
Follow these guidelines and I believe you will start to feel better about the code you are writing, as I have over the past few months.
I would love to hear what others of you are doing and see examples. Comment below with gists, github urls, and other thoughts. Thanks!
16 Comments
Jul 05, 2012
FWIW, Ruby already has a “convention” for serialize/deserialize and it’s…wait for it…`load`/`dump`. Marshal, YAML, and JSON already implement it.
The interface you build up here is generally useful. I’ve worked on at least two apps that have or needed this general idea wrapped around all key/value operations. Thanks for writing up how to build a good cache interface up in layers!
Jul 05, 2012
This is the Open/closed principle – you maybe read about it from Bob Martin’s ASDPPP book?
Jul 05, 2012
@Adam: Good point.
@Jamie: Yep, thanks for linking. Didn’t read it in that book, but probably was linked to from another book.
Jul 06, 2012
I like these thoughts in general. One thing on the real interface that you build though: it still relies on the driver’s return values for
set
anddelete
. It would probably behoove us to provide our own consistent return values, lest the driver change them and wreak havoc throughout our application.Jul 06, 2012
Do you tink it would help a little using namespaces? With the amount of new classes that come with this type of development, I think some kind of organization might be helpful.
How would you organize it then? GaugeAttributeService::Base and GaugeAttributeService::Cache perhaps?
I’m working on a fairly big project nowadays and I’m facing these kind of decisions.
Jul 06, 2012
So, I’ve been doing this stuff for a while, and have a few thoughts/comments to add here:
Naming stuff:
GaugeAttributeService
bothers me a bit, because of theService
suffix. Goos highly recommends against naming stuff with patterns, and I tend to agree.Service
tells me nothing about why I’d use this thing, so I’d be tempted to go with something likeFindGaugeAttributes
, and then call the methodfirst_with_id
. This might seem super nitpicky, but it means the rest of your code isn’t thinking about what a “service” object is or does, only what it needs to do.GaugeAttributeServiceWithCaching
could actually be extremely generic (if you always constructor inject all dependencies, more on that in a moment), so you could have something like this:On the idea of using accessors for collaborators, I’m not a huge fan of this. I prefer my objects (that do computation) to be mostly immutable, unless some requirement means I might need to say, swap out which caching service I’m using in code, at runtime. Instead I pass everything through the constructor of every object, until you get to a `main` method (often a rails controller, or a queue object of some kind)
This leads to a thing some people find weird, but is also wonderful, which is that you will have a couple of top-level methods that just call
new
on a bunch of objects, passing them into each other, and then tells one of them to start processing. These main methods can lead to huge flexibility in how the system works, by wiring objects together differently, you can change behaviour without writing much code at all.Whilst this seems tedious to start off with, defaulting constructors has burnt me a bunch in the pass, so I’m mostly sworn off it now. If I have objects that are too hard to create, that’s a good design smell about that object doing too much. Ocasionally I break up the toplevel methods into a “dsl” layer that literally just constructs object graphs in particular configurations (note I don’t like the phrase dsl here, it’s just some ruby methods with readable names).
I love the wrapping of third party code in the driver_read etc methods, though I’d be tempted to move them into another object.
Jul 07, 2012
Your “class extension” recall maybe have been from reading anything that referenced the SOLID principles espoused by “Uncle” Bob Martin.
Jul 07, 2012
Regarding ‘Use accessors for collaborators’ I would ask if you can write code like this:
and the service instance is “ready to go,” then you are good with the API design.
But if the code had to look like this, with a “setup” stage:
I would say that is a case where you would want to demand a class be created in a useful state (via an initializer). Otherwise, it makes it very challenging to use “out of the box” without reading the API docs :-)
Of course, there are always exceptions…
Jul 07, 2012
(Odd formatting snafu above)
Regarding naming… absolutely agree. It is worth arguing about. The more critical the class is to the system design, the more I might go to the mat and wrestle a good name to the ground. The decision on a name can be a fleeting event, but it will have everlasting impact. Think: Write Once, Read Many.
For a C++ app for manufacturing (’95-98), I employed a very layered architecture. Portable business objects were sent to the thin client (no UI talking to DB crap allowed!). The paradigm of pick lists was commonplace… show me a list of parts. The list UI component merely needed a set of IDs and Names. So each domain class could basically implement the interface and they too could be tossed into a drop-down list. My clever name for this little device never grew past “IdString” — we kind of joked about it, because a better name never surfaced.
When I see something like “GaugeAttributeServiceWithCaching” in isolation, it is not as easy to unilaterally discuss a better name.
However, were this inside a basic system called “Gauge” and I saw a bunch of other classes prefixed with “Gauge” — I would throw a red flag.
BTW: My rules are strict for domain-y things, and less strict for utility classes, and lesser things.
I mostly dislike prefixes, postfixes, redundant things of any sort in a name — class or attribute. If I have to mentally strip off a prefix to get at the gist of what I am reading, then it should not be there in the first place.
For example (non-Ruby community, mostly):
I also raise an eyebrow and look more closely at any (domain) class ending in “er.” Yup. Try it. Look for some. They are usually “doing” things.
You can go too far in making “God” classes that have no properties of their own, are stuffed full of collaborators via the initializer, and wouldn’t know how to delegate if it hit them over the head. It’s the kind of “Manager” class (there’s that ‘er’) that — instead of asking (delegating) for it’s gauges to compute some stat, it gets all the data from the gauges and does the work for the gauges. Don’t be that guy!
Conversely, look for the boring data class. No methods, just accessor stuff. While it might be just fine, and there truly is no business logic for this class, I would look around just to be sure there are no over achiever “er” classes lurking in the dark alleys — ready to pimp the business logic for the data class.
Thanks for sharing, John!
Jul 07, 2012
@Tom Crayford: I left out some functionality for clarity. The gauge attribute service handles CRUD for gauges. It is the persistence layer for the gauge class. Not a fan of FindGaugeAttributes as it feels verby.
Jul 07, 2012
I’m with you on most of the important points here, but you hurt my feelings in two spots:
Assignment in that spot is, to me, unacceptable. It would cause me to do a double-take every time I read the code. I’d prefer:
The second was this:
I would never use a $ variable in Ruby code, but that’s my latent Perl-racism.
Jul 12, 2012
The quote you didn’t find was:
“When you can extend a system solely by adding new objects without
modifying any existing objects, then you have a system that is
flexible and cheap to maintain”
- Kent Beck in Smalltalk Best Practices Patterns.
Was quoted in Avdi Grimm’s “Objects on Rails” recently.
Jul 21, 2012
I read “clean code” and “the clean coder” books, both are really good. I wrote a blog post about it. Some of his advice is hard to follow but is a good guideline.
Sep 09, 2012
I made a slightly enhanced version of your 2009 ClassLevelInheritableAttributes module, and I’m wondering if it’s ok with you if I publish it as a gem.
All I did was:
1) make the names a bit shorter
2) make the attr_… into a Kernal method that starts with
include so that you don’t have to include it yourself (note, Ruby checks for multiple includes so that’s no problem)
3) had it generate object level access functions so that, like C++ you can access class variables directly from an instance without prefacing it with .class
4) made an analogous attr_ for regular class variables that just generates that access functions
5) made an attr_read option for read only class variables.
I like simple, easy to document libraries like this. I’ve seen other libraries that have similar abilities, but since they have more features and nothing is documented, I can’t bring myself to use them.
Sep 09, 2012
Actually after playing some more this has gotten more interesting.
It turns out that “include” does run more than once at least in some cases.
With some work I’ve come up with the right solution for hooking inherited, which includes making an around alias and chaining calls to it in the case where more than one extension wants to use the hook.
Frankly that should be built into the language if we’re gonna have hooks at all. As it is you can’t put aliases into modules so you have inject code with class_eval. And it’s an error to do an around alias if you’re the first one at the hook so you have to test whether you’re first before you inject code. And it will include more than once so you have to test for that too.
What a mess. But it’s ruby so even a bad solution is short and works.
Apr 18, 2013
Does your blog have a contact page? I’m having trouble locating it but, I’d like
to shoot you an e-mail. I’ve got some creative ideas for your blog you might be interested in hearing. Either way, great website and I look forward to seeing it improve over time.
Sorry, comments are closed for this article to ease the burden of pruning spam.