A Post Entitled Bang! You’re Dead?

posted 10 months ago

The other day Dave Aronson over at Codosaurus blogged about the perils of Ruby “bang methods”.  If you’re not familiar with the term, “bang methods” are methods that end with an exclamation point.  Bang methods often have the side effect of modifying the receiver.  To steal the opening example from Dave’s piece…


str = 'foo'
p str.upcase!
=> "FOO"

p str
=> "FOO"

When calling String#upcase! Ruby changes the contents of the string to all capital letters and then returns the results of the operation.  This is seen in the example above in that printing the results of str.upcase! and then printing str produces the same output.

Dave points beyond this well-documented issue to point out another problem that bang methods could present: they return value of invoking a bang method may not match the side effects.  Consider the Array#uniq! method.  This method accepts eliminates duplicate values from the receiving array as shown below.


ary = [ 1, 3, 5, 5, 7, 7, 9 ]
ary.uniq!
=> [ 1, 3, 5, 7, 9 ]

ary
=> [ 1, 3, 5, 7, 9]

ary.uniq!
=> nil

ary
=> [ 1, 3, 5, 7, 9]    

The last few lines demonstrate the gotcha that Dave is writing about.  The result of calling #uniq! on an array that has no duplicate values is nil but the side effect is an unchanged array.  Certainly this can lead to some logic problems if you are unaware of the potential pitfall but the potential pitfall is exactly what the bang (exclamation point at the end of the method name) is intended to covey — use with caution.

Most of these bang methods have a “non-bang” version and the non-bang versions operate a bit differently.  First, the non-bang versions tend not to have side effects.  Instead, the non-bang versions create a copy of the receiver and then operate on the copy.  Second, the non-bang versions always produce the “expected” result.  Recycling the above example using Array#uniq demonstrates this.


ary = [ 1, 3, 5, 5, 7, 7, 9 ]
bry = ary.uniq
=> [ 1, 3, 5, 7, 9 ]

bry.uniq
=> [ 1, 3, 5, 7, 9]

Array#uniq will always return an array of unique elements regardless of whether the receiving array is altered or not.  As shown previously, Array#uniq! will return nil if the array is not altered but will return the altered array if duplicates are received.  Dave’s rather explicit reaction to this is…

Do not depend on the bang versions returning the same value as the non-bang versions!  (Even though that value seems to be the whole point of both functions!) 

The warning itself is reasonable: the point of the “bang” is to warn that something is different.  The trouble with the statement is the parenthetical comment at the end.  Is “that value… the whole point of both functions”?

Maybe not.

Keep in mind that most bang methods can modify the receiver.  With that in mind, the result of calling a bang method is telling you something very important — the result indicates whether or not you did modify the receiver.  Why is that important?  Imagine the situation where you have an operation that is expensive enough that you must minimize the frequency with which it’s called and a data set that is modified dynamically by another part of the system.  Since Ruby allows you to use any object as the test in control flow statements (if X, unless Y, etc) you could use the result of a bang method as a logic gate to the expensive logic.


if result = object.bang!{|x| x.test_it }
  some_very_costly result
end

There are also bang methods in which “the whole point” of the method is considerably murkier.  For example, as the name suggests, Array#slice! cuts out part of an array.  The bang version modifies the receiving array by removing elements that fail the test but the return value are the removed elements.  Is the point to deal with the array after the elements have been removed?  The elements that were removed?  Both?  The answer to that question obviously lies in your application but it’s not uncommon for the answer to be both.


class TopTen
  attr_reader :results
  
  def initialize
    @results = [ ]
  end
  
  def add_result(name, score)
    @results.push({ name: name, score: score })
    
    #rearrange the top performers
    if @results.sort!{|a, b| b[:score] <=> a[:score] } || @results.size<10
      update_scoreboard
    end
    
    #trim the list to the top ten
    if eliminated = @results.slice!(10..-1)
      notify_out_of_contention eliminated[:name]
    end
  end
  
  private
    def update_scoreboard
... end def notify_out_of_contention(trial) ... end end

In the TopTen class above we are keeping a list of top ten results of some competition, sorted by their score. The list is initially empty and we use the #add_result method to add new results. The two commented blocks in this method demonstrate both uses of bang methods discussed above.

The #add_result method begins by pushing the latest result onto the list of results. The first commented block then sorts the list of results by their score. The Array#sort_by method will return the re-sorted array if a change was made or nil if no change was made. Since we’re pushing the most recent result onto the end of the list, no change will be made if the most recently added result was the worst result. In this situation there is no need to update the scoreboard to display an updated top ten list; it’s already up-to-date (with the exception of the edge case where the list is still accumulating the first ten scores). With that in mind, we call the update_scoreboard method only when the sort returns a result.

In the second commented block we trim the list of results at 10. While the length of the array is less than 10 the #slice! method will return nil. Hurray for the contestants — we do not yet need to tell them that any of them are eliminated. However, once we reach the eleventh contestant the #slice method will remove the last contenstant and we can notify that contenstant that he or she is no longer in the top ten. What’s important about this latter block is that we need both the immediate result of the #slice (to notify which contestant was eliminated) and the side effect of the #slice (a trimmed list of only ten contestants).

The caveats for using bang methods are valid. Beware that they will likely produce side effects. Beware that they return different results than their similarly named cousins. However, they’re unlikely to catch you unaware if you consider the result returned from the bang method to be the result of changing the receiver. With that perspective you’ll not likely be tripped up in your logic and will instead see not only the consistency of the returned results but also find creative ways to employ them.

A Post Entitled A Different Form of Ancestry

posted 2 years ago contains 1 note

Okay, this is not really a code-related post.  Several months ago I sat down with my son and tried to teach him some good coding concepts.  It didn’t last too long because… I had no plan.  I didn’t really think much of it until last night.

We were on our way to a soccer practice and talking about things that had gone on during the day.  My son is taking an engineering class that focuses on mechanical engineering and told me that their next project is to build a bridge.  The test for the bridge will be to support a bucket into which the teacher will pour water until the bridge collapses.  Apparently the student whose bridge supports the greatest weight gets an ‘A’.

The objective is simple enough. What floored me was the way he tried to explain it to me even after laying out all the guidelines.  He said, “So I have to build it knowing that I am going to break it.  It’s like when you write software and write a test first, knowing that it’s going to fail, so that you can write the least amount of software possible to make it work.”

Wow.  I guess the lesson got through much more than I thought!

Now, go build your own Ancestry: pass along your best coding skills to someone else.

Loading...

A Post Entitled getting into ancestry

posted 3 years ago contains 1 note

Heading into The Big Rewrite I finally took the opportunity to play with the ancestry gem.  Initially developed as a replacement for for acts_as_tree, ancestry has evolved to something much more akin to nested-set support in a tree-like structure.  The advances allow you to pluck out a entire subtree or hierarchy path in a single trip to the database.  Awesome.

The one thing that I wish we had for ancestry is a clear upgrade path for those of us who discovered ancestry after building up an application with acts_as_tree.  Stefan has kindly renamed the initializing ‘acts_as_tree’ method to ‘has_ancestry’ which is a step in the right direction.  Hopefully it will bother me enough to actually do something about it…

The ancestry gem has greatly simplified a tangential issue in the code base.  My first use of ancestry was in my Account model.  In this particular application Account is hierarchical in order to allow higher-level accounts to establish rules that must be followed by lower-level accounts.  Specifically, the corporation (root) will set rules for everyone, the regional manager (child of corporate) may extend those rules, and the local account (leaf of the tree) is subject to both of them.

With acts_as_tree the solution to this was not difficult but it was time consuming. First, we created a method to walk up the hierarchy and collect the parent nodes of the tree until we hit the root node. Each parent node is unshifted into the array so that the resulting array index corresponds to the depth of the node in the hierarchy (index 0 is the root node, etc). Another method can then iterate over that hierarchy and collect the rules into a single array as shown below.


class Account < ActiveRecord::Base
  has_ancestry

  ...

  def account_hierarchy
    hierarchy = [self]
    current_account = parent

    while(!current_account.nil?) do
      hierarchy.unshift current_account
      current_account = current_account.parent
    end
    return hierarchy
  end
 
  def collect_from_hierarchy(collection_name)
    account_hierarchy.collect{|acct| acct.send(collection_name) }.flatten.compact
  end
end

While straight-forward, this code clearly will not scale all that well.  We will have 2N database cycles (N=node depth) every time we need to retrieve a collection of objects.  Yuck.

The ancestry gem dramatically reduces this.  The gem provides a method called ‘path’ (and corresponding ‘path_ids’) that returns a set roughly equivalent to the ‘account_hierarchy’ method shown above.  The win is not that we no longer need the ‘account_heirarchy’ method (obviously the equivalent method exists somewhere) but that ancestry manages to build the path collection in one trip to the database.

That bit of inspiration led to a refactoring of the ‘collect_from_hierarchy’ method.  Since we have the ids of the accounts it was obvious that we could retrieve the collection of rules in just one trip as well.


class Account < ActiveRecord::Base
  has_ancestry

  ...

  def collect_from_hierarchy(collection_name)
    Object.const_get(collection_name.to_s.classify).where :account_id => path_ids
  end
end

In this refactoring we are taking advantage of the fact that Ruby classes are constants.  We accept the name of the collection we would like to assemble from the hierarchy, convert to a string (allowing the user to call it with a symbol), and then get the class name equivalent by calling classify and finally grab a handle to the class by asking Ruby to retrieve the class from the known constants.  We then ask the class to return the collection of rules where the account_id for the rule is one of the members of the path_ids.  As an added bonus, the refactoring also takes advantage of scopes so that the caller may chain additional scopes as necessary.


  acct.collect_from_hierarchy(:birthdays).order(:born_on)
  acct.collect_from_hierarchy(:salaries).order(:amount).paginate :page => params[:page]

Admittedly, this refactoring was available earlier.  The ‘account_hierarchy’ method could have been written to return an array of ids rather than the objects themselves.  ancestry provided the inspiration by making that fact obvious.  We stand on the shoulders of giants, indeed.

Loading...

A Post Entitled …and?

posted 3 years ago

'and' and 'or' originate (like so much of Ruby) in Perl. In Perl, they were largely used to modify control flow, similar to the 'if' and 'unless' statement modifiers. A common Perl idiom is:

do_something() or die “It didn’t work!”;

The ‘and’ and ‘or’ keywords serve the same purpose in Ruby. Properly understood, ‘and’ and ‘or’ are control flow operators, not boolean operators.

I just ran across the quote above in Avdi’s explanation of ‘and’ vs ‘&&’. I almost skipped the article but am glad that I read it. This may be the first time that I have seen an explanation of why the operators differ rather than how they differ. The former makes it much easier to remember the difference. Thanks, Avdi!

A Post Entitled Constant-ly?

posted 4 years ago in ruby

I’ve gone back and forth about what should constitute a good use of class-level constants and a recent project has brought this question up once again.  In this project there are basically two different uses for class-level constants.  The first, and less frequent, use of class-level constants serves to identify “magic values”.  The second use is to define a closed set of options.

Class Constants as Magic Values

"Magic values", if you are not familiar with the term, are those values that require you to change the response of a class or class instance to a message. Usually you find them in flow control logic such as if-then or case statements.


class AtBat
  MAX_BALLS = 4
  MAX_STRIKES = 3

  attr_reader :pitches, :result, :strikes, :balls

  def initialize(first_pitch = nil)
    @balls = @strikes = 0
    @pitches = [ ]
    @result = nil
    add_pitch( first_pitch ) unless first_pitch.nil?
  end

  def add_pitch(pitch_result)
    @pitches ||= [ ]
    @pitches.push pitch_result

    case pitch_result
      when :foul
        add_strike if @strikes < MAX_STRIKES-1
      when :ball
        add_ball
      when :strike
        add_strike
      else
        @result = pitch_result
    end
  end

  def active?
    !@result.nil?
  end

  private
    def add_strike
      if active?
        @strikes += 1 
        @result = :strike_out if struck_out?
      end
    end

    def struck_out?
      @strikes == MAX_STRIKES
    end

    def add_ball
      if active?
        @balls += 1 
        @result = :walked if walked?
      end
    end

    def walked?
      @balls == MAX_BALLS
    end
end

In the code above the class-constants MAX_BALLS and MAX_STRIKES are magic values. They represent different “high water marks” for the AtBat class. Each instance should respond differently when a pitch is a ball and there are less than MAX_BALLS of them in the AtBat.

This is the best, and possibly only, justifiable use of class constants. Using constants in this case accomplishes two important objectives. First, it pulls the magic values into one place so that they can easily be maintained. If the rules of baseball changed to permit more balls or strikes in at bat you simple change the class-constant and you’re done. Second, the class constant is more expressive of the intent than the value. Why use “3” when evaluating the number of strikes? Because that’s the maximum allowed.

Class Constants as Closed Options

A special case of magic values is when the constant represents a closed set of options for the class. For example, consider when my favorite baseball team, the Boston Red Sox, conducts a contest during the season to give away a pair of tickets to a game and then does something I detest: they restrict entry to residents of the New England states. (Yes, I’m crazy enough to drive the thousand miles between my house and Fenway to see a game.)


class Entrant
  ELIGIBLE_STATES = %w{CT MA ME NH RI VT}
  attr_accessor :first_name, :last_name, :dob, :email, :street, :city, :state_abbrev

  def valid?
    new_england_resident? and over_18?
  end

  private
    def new_england_resident?
      ELIGIBLE_STATES.include? @state_abbrev
    end

    def over_18?
      @dob < 18.years.ago
    end
end

In the code above the ELIGIBLE_STATES constant is used as a special type of magic value. The instance reacts differently based on whether or not one of the attributes “matches” the constant. To that extent the closed option type of class constant makes some sense. It does often have an irritating side effect (constant redeclaration warnings when restarting a Rails application) but that’s the main downside.

More than Magic?

The problem that I have seen fairly often is that these class constants that begin as a closed set of options tend to evolve over time. The list of states shown above, for example, may take on an additional property for reporting purposes such as a state-specific tax rate for reporting the tickets as “income.”


class Entrant
  ELIGIBLE_STATES = {:CT=>0.07, :MA=>0.09, :ME=>0.07, :NH=>0.0, :RI=>0.07 :VT=>0.09}
  attr_accessor :first_name, :last_name, :dob, :email, :street, :city, :state_abbrev

  def valid?
    new_england_resident? and over_18?
  end

  def extended_ticket_value( ticket_price, qty )
    qty * ticket_price * (1 + ELIGIBLE_STATES[@state_abbrev.to_sym])
  end

  private
    def new_england_resident?
      ELIGIBLE_STATES.include? @state_abbrev
    end

    def over_18?
      @dob < 18.years.ago
    end
end

Before you know it there are new “constants” springing up to deal with other flow control changes related to the first constant. Tax rates, shipping charges, and who knows what else might change based on the state of residence for the Entrant. This is where the problem lies. Because the needs emerged slowly they feel like a simple, organic extension of the class. But they’re not.

They are classes in and of themselves.

These classes are not constants in the traditional, magic value sense. They are classes with constant data. That is, they are classes whose data is set and known at the time the software is developed. Just like other classes, they deserve their own class body.


class State
  NEW_ENGLAND_STATES = %w{CT MA ME NH RI VT}
  attr_reader :state_abbrev, :tax_rate, :shipping_charge

  def initialize(state_abbrev, tax_rate, shipping_charge)
    @state_abbrev, @tax_rate, @shipping_charge = state_abbrev, tax_rate, shipping_charge
  end

  def new_england?
    NEW_ENGLAND_STATES.include? @state_abbrev
  end

  def find_by_abbrev( abbrev )
    @@states.detect{|state| state.state_abbrev == abbrev}
  end

  @@states = {
    new(:CT, 0.07, 1.50),
    new(:MA, 0.09, 0.50),
    new(:ME, 0.07, 1.75),
    new(:NH, 0.00, 1.75),
    new(:RI, 0.07, 0.75),
    new(:VT, 0.09, 2.50),
    new(:SC, 0.07, 4.00)
  }
end

class Entrant
  attr_accessor :first_name, :last_name, :dob, :email, :street, :city, :state_abbrev

  def valid?
    new_england_resident? and over_18?
  end

  def extended_ticket_value( ticket_price, qty )
    qty * ticket_price * (1 + state_of_residence.tax_rate)
  end

  private
    def new_england_resident?
      state_of_residence.new_england?
    end

    def state_of_residence
      State.find_by_abbrev( @state_abbrev )
    end

    def over_18?
      @dob < 18.years.ago
    end
end

By pulling this class of constant data out as a unique class we gain similar advantages to the use of constants as magic values. The class now encapsulates all the knowledge about how to deal with a particular state and properly separates those concerns from how to deal with an Entrant. The Entrant code is even more expressive of its intent (e.g., state_of_residence.tax_rate vis-a-vis ELIGIBLE_STATES[@state_abbrev.to_sym]).

Admittedly, something very similar to this could be accomplished with a database-backed class like an ActiveRecord or MongoMapper derived class. The constant nature of the data, however, always makes that feel a little improper even if the end user has no access to the maintenance function. In the back of my mind I always worry that a necessary record will be inadvertently dropped so I want to ‘freeze’ it in code.

What about you? How do you deal with these kinds of classes of constant data? Do you recognize them up front? Shuttle them off to the database?

A Post Entitled RSpec best practices

posted 4 years ago in rails rspec contains 1 note

A few days ago Philippe Creux posted his RSpec Best Practices and Tips.  The post is well worth reading even if you prefer another test framework (you are testing, right?!) because there are several tips that are applicable to TDD more broadly than rspec alone.  Of all the tips there are two that I cannot re-emphasize strongly enough: “Only One Expectation Per It Block” and “(Over)Use Describe and Context”.

Single Expectation Tests

The “one expectation” tip is more broadly expressed as “each test should make only one assertion.”  Philippe’s example is a good one: rather than have one test that asserts that all expected attributes are present, have multiple tests that assert that individual attributes are present.  The advantage of this approach lies in its granularity.  The “one expectation” way allows to you to more quickly identify when multiple expectations have been violated.  Why?  Because a monolithic test will fail on the first failed expectation even though there may be multiple points of failure behind it.

The impetus for a monolithic test is typically the perceived advantage of collecting all the similar expectations into one place.  Grouping the expectations does make it simpler to determine where to alter or extend that type of testing.  Taking Philippe’s example, if you wanted to replace ‘age’ in the list of attributes with ‘dob’ (date of birth) and add ‘shirt_size’ then you would know right where to go to do that type of thing.  For this reason I would make one additional recommendation that I’ve mentioned in the past: use the dynamic nature of ruby to your advantage by using code to create expectations.  For example, rather than the recommended syntax


  it { should respond_to :name }
  it { should respond_to :age }
  it { should respond_to :gender }

Try the following


[:name, :age, :gender].each do |expected_attribute|
  it { should respond_to expected_attribute }
end

This code retains the advantages of both co-locating similar kinds of tests and employing discrete “one expectation” tests. Personally I believe it is also more expressive of the intention of the test, all the more when you build blocks for required_attribute or nonnegative_attributes.

Context is King

The “(Over)Use Context” tip is spot on. Early on in my use of rspec I tended to have code littered with one-off tests. That is, most of the test code was duplicated except for one or two key parameters. By using contexts, specifically nested contexts, you can reduce the duplication and bring sharper focus to what is varying. Use the outer context to group all those unchanging parameters and let the outer context description identify how they are common. Then, use the nested context to explicitly declare what is varying. Importantly, the before block in each context should set up the test scenario as described by the context.

As an additional example, consider a class with a state machine. The state machine begins in the ‘pending’ state and then moves to the ‘active’ state only when the ‘activate’ event is triggered. You might begin testing this scenario as follows.


describe Thing do
  before :each do
    @valid_attributes = { ... }
  end

  it "should create an instance given valid attributes" do
    lambda{ Thing.create @valid_attributes}.should change(Thing, :count).by(1)
  end

  it "should default to the pending state upon creation" do
    @thing = Thing.create @valid_attributes
    @thing.should be_pending
  end

  context "pending instance" do
    before :each do
      @thing = Thing.create @valid_attributes
    end

    it "should transition to 'active' when activated" do
      lambda{ @thing.activate }.should change(@thing, :state).to('active')
    end
  end
end

From the flow of the code we can see just what was described: if the attributes are valid we can create a new instance and that instance will correctly default to the pending state and from the pending state we can activate the instance and it will become active. Perfect. Until. Until? Yes, until you have to modify the state machine. Let’s say you’ve run with this code for a while and found a bot probing your application. To filter out bot created things you decide to begin with an ‘unconfirmed’ state that requires an email confirmation of some type.

What happens with the tests in this new case? The second test (default to pending) naturally fails. So does the third test (transition to active). Should it? Probably not.

What’s missing is that the before block did not guarantee that the testing state was what the context claimed that it was. How do you guarantee such things? Add an assertion. This step is commonly overlooked in a lot of tests that I have seen.


  context "pending instance" do
    before :each do
      @thing = Thing.create @valid_attributes
      @thing.should be_pending
    end

    it "should transition to 'active' when activated" do
      lambda{ @thing.activate }.should change(@thing, :state).to('active')
    end
  end

The only change here is the addition of the assertion to the before block. When you re-run the tests, the third test about transitioning to ‘active’ will still fail. However, now the failure will indicate that the error lies with the assertion in the before block. Previously we had assumed the object was in the ‘pending’ state because it was the default state. Now we make that assumption explicit and the test report will quickly help us understand that the failure was in the test setup rather than in the state machine implementation.

Personal Preference

I don’t personally care for the shortcuts Philippe recommends. I don’t think that the use of ‘specify’ is quite as expressive as the typical it-block that invokes it in most rspec code. This matters to me somewhat because I still hold onto that thin hope that a non-technical person might be able to read an rspec and ‘get it.’ Using the more standard ‘it-block’ style also allows me to grep my rspecs for expectations. The other shortcut — using braces rather than do/end syntax for declaring tests — does not seem practical to me. In practice, only the most trivial tests are written in one line (or maybe I need more deeply nested contexts).

More Cowbell!

As a final thought, I don’t think that the

lambda{...}.should change(object, :attribute_name)

syntax is used nearly enough. I have a preference, for example, to know that calling ‘create’ actually creates a new instance rather than fails to throw an exception. That’s why in the code snippet above the first test checks to see if invoking create has changed the Thing count by one.

I have not seen the use of change() used as often with instances but it works just as well. In the third test above we’re using it to validate that the state has changed to ‘active’. You need to be careful with this; change() will fail your code if the final value is correct (ie., ended in ‘active’) but you did not change (ie., started in ‘active’ and stayed in ‘active’).

Loading...

A Link to Water Drop at 2000 Frames per Second

posted 4 years ago

Water bounces?

A Post Entitled Qu’ils mangent de la brioche

posted 4 years ago

The phrase attributed to Marie Antoinette often mistranslated into English as “Let them eat cake” seems particularly appropriate in light of the great post by the folks over at Engine Yard. The “cake” that the French noblewoman referred to was really an enriched bread that the upper classes could dine upon but was generally out of the reach of the starving peasants (all the more during a famine!). For the Ruby community today, let’s take it as a challenge to raise our collective game.

In Clean Code, “Uncle Bob” Martin from Object Mentor made these statements about the Rails community’s dominant ORM ActiveRecord.

Active Records are special forms of DTOs. They are data structures with public (or bean-accessed) variables; but they typically have navigational methods like save and find. Typically these Active Records are direct translations from database tables, or other data sources.

Unfortunately we often find that developers try to treat these data structures as though they were objects by putting business rule methods in them. This is awkward because it creates a hybrid between a data structure and an object.

The solution, of course, is to treat the Active Record as a data structure and to create separate objects that contain the business rules and that hide their internal data (which are probably just instances of the Active Record).

Ouch.  Martin is generally a pretty sharp guy and when he calls out a specific implementation pattern we should take notice.  The statements made me a bit uncomfortable because I am just as addicted to rolling business logic into ARec models as everyone else.

The EY post could be the game changer.  The technique they describe for model composition was one they created primarily for speeding up tests.  If you haven’t already taken the time to ready the article, they suggest extracting a certain amount of business logic into a namespaced module and then including that into your ARec-based model.  The advantage is that you can create mock objects for testing, extend them with your module just like their ARec counterpart, and test them without hitting the database.  It’s a simple but brilliant suggestion.

But now consider EY’s approach in light of Martin’s objection to the abuse of the Active Record pattern in general.  Extracting the business logic into a separate, testable module achieves the goal of decoupling the DTO (Active Record class) from the business logic that acts upon it.  Couldn’t the EY recommendation become the norm for building Rails applications?

One approach along these lines has already been encapsulated into the RailsConcerns plugin.  The idea is fairly simple.  Organize your Rails project similar to most gems: include both a things.rb and things folder in your app/models.  The things.rb will be responsible for including/extending itself using the modules and classes in the things folder.  The plugin provides a simple API for the latter. Let’s assume that you have an Employee model and that you break your business logic for the employee down into modules called Employee::Vacation and Employee::Pay, organizing them into appropriately named files in the app/models/employee folder. In that case you could have an ARec model that deals primarily with data (validations, etc) and brings in other classes/modules for the business logic like so:


class Employee < ActiveRecord::Base
  validates_presence_of :first_name, :last_name, :start_date
  
  concerned_with :pay, :vacation
end

I didn’t care for the initial justification of RailsConcerns (breaking down a large model) but it now makes a lot of sense based on Martin’s and EngineYard’s comments. The one place that I’ll still disagree with RailsConcerns is its example of extracting validations to a module; since that seems to be the essence of what DTO is, it makes sense to stay in the model.

A Post Entitled New Year’s Cleanup: Compact Is Not Necessarily Concise

posted 4 years ago

To start the New Year off I am having the distinct pleasure of cleaning up bunch of old code. Aside from my beautiful code (ahem) I am also having the chance to review some other developers and that is leading me to some thoughts about style.

Style Lesson 1: Compact is Not Necessarily Concise

One of the great things about Ruby is it’s concise syntax. Well-written Ruby code is and should be short and to the point. Unfortunately sometimes the notion of ‘short’ gets confused for ‘concise’. They are not the same thing. Take, for example, the following deprecated code that I ran across:


class << self
    def complete_payment(panel_id)
        Panel.transaction do
          panel = Panel.find(panel_id)
          court_ap_id = Court.find(panel.court_id).accounts_payable.id
          # identify all pending expenses for the panel
          pending_expenses = Financial::RosterExpense.for_panel(panel_id).for_state('approved')
          # collect hash fo summarized rosters to send for journalling to financials
          collection_of_rosters_for_journal_transaction =
            # convert array of hashes to a single hash
            Hash[
              # first group by pay_type
              *pending_expenses.group_by(&:pay_type).collect do |pay_type, roster_by_pay_type|
                # then sub group by payable_to
                roster_by_pay_type.group_by(&:payable_to).collect do |payable_to, roster_by_payable_to|
                    self.financial_transaction_package(pay_type, payable_to, roster_by_payable_to.map(&:amount).sum, roster_by_payable_to.size, User.current_user.id, panel.court_id, court_ap_id)
                end
              end.flatten.collect{|h| h.to_a}.flatten

            ]
          resp = Journal.post_group_of_entries_with_summary_account(collection_of_rosters_for_journal_transaction, panel.court_id, User.current_user.id)
          unless resp.instance_of?(Net::HTTPOK)
            logger.debug("ERRORED in Journaling with #{resp.body}")
            raise ActiveRecord::Rollback
            response = false
          end
        end
        response ||= true
        return response
      end
end

W O W. There’s a lot going on there, no? It is arguably short but not at all concise. Concise has to do just as much with readability as it does with brevity. Concise makes things clear and obvious, especially for those of us who get to read what you’ve left behind. So let’s take another stab at that in a way that’s hopefully more ‘concise’.

Known knowns

One of the first things that strikes me is the line just below the comment ‘identify all pending expenses for the panel.’ What is not obvious in this code snippet is that this query is repeated several times so it needs drying up. Further more, note the actual query. Is it really checking for ‘pending’ things? No, it’s looking for ‘approved’ things. Duplicated code often does that. Let’s extract a function there.


class << self
  def self.approved_expenses_for( panel_id )
    for_panel( panel_id ).for_state( 'approved' )
  end
end

Simple enough. We’re already working in the Financial::RosterExpense class and the method in question is a class-level method so we can dispense with the class reference. We can also give the query a descriptive name that matches what it returns. In this case I’ve left off the word ‘panel’ from the end of the method name simply because I like the way it will read when used: ‘approved_expenses_for panel_id’. If that does not appeal to you, no big deal.

Wielding the Unweildy

Take a deep breath and grab a cup of coffee before continuing. We’re going to tackle that big chunk in the middle.

The heart of the Panel.transaction block is simply building up a hash. The syntax may be a bit unfamiliar and hard to decipher. One way to instantiate a hash is to call Hash[] passing it an array. Hash responds by zipping the array in pairs and returning a hash in which the first, third, fifth, … entries are the keys and the second, fourth, sixth, … entries are the values of the hash. This somewhat infrequent use of Hash is all the more obscured by the fact that the square brackets in the call are separated by a huge code block. Obviously this needs to be broken down in a way that makes that intention clearer.

Delving inside the Hash call we see that the expenses are grouped twice (once by pay_type and then by pay_to) and the innermost group is used to call a helper function that returns a hash. Interestingly, the hash is collected up into an array at each level. At the end, the nested arrays are flattened, with the inner hash being converted from a hash to an array in the process. I don’t know why.

Private, get me that expense report

The first step in cleaning up this code will be to create another helper function that performs the nested grouping and summary work. Within the helper function we’ll add some variables to make some of the calculations more explicit in case we need to modify those calculations in the future.


class << self
  def approved_expenses_for( panel_id )
    for_panel( panel_id ).for_state( 'approved' )
  end

  def build_expense_summary( expenses )
    expenses.group_by(&:pay_type).collect |pay_type, expenses_by_pay_type|
      expenses_by_pay_type.group_by(&:payable_to).collect do |payable_to, pay_to_expenses|
        payment_amount = pay_to_expenses.map(&:amount).sum
        payment_count = pay_to_expenses.size
        self.financial_transaction_package(pay_type, payable_to, payment_amount, payment_count, User.current_user.id, panel.court_id, court_ap_id)
      end
    end
  end
end

It turns out that the ‘financial_transaction_package’ helper method is returning a very simple hash. The key to the hash is a unique id and the value is a hash of transaction attributes associated with that id. This is useful to know because we can merge these simple hashes together without worrying that we might replace values. In fact, we can perform the merges at multiple levels without worrying about losing data. As a result, we can transform the collect’s into inject’s and build the array from the inside out.


class << self
  def approved_expenses_for( panel_id )
    for_panel( panel_id ).for_state( 'approved' )
  end

  def build_expense_summary( expenses )
    expenses.group_by(&:pay_type).inject({}) |final_expenses, (pay_type, expenses_by_pay_type)|
      package = expenses_by_pay_type.group_by(&:payable_to).inject({}) do |temp_expenses, (payable_to, pay_to_expenses)|
        payment_amount = pay_to_expenses.map(&:amount).sum
        payment_count = pay_to_expenses.size
        expense_package = self.financial_transaction_package(pay_type, payable_to, payment_amount, payment_count, User.current_user.id, panel.court_id, court_ap_id)
        temp_expenses.merge( expense_package )
      end
      final_expenses.merge( package )
    end
  end
end

The extra statements that the inject requires is making that a little cluttered. Let’s break the build_expense_summary up into two functions, one that builds the expenses by pay_type and another that builds the expenses by payable_to. Since their scope is pretty specific we’ll make them private, too.


class << self
  def approved_expenses_for( panel_id )
    for_panel( panel_id ).for_state( 'approved' )
  end

  private
  def build_expense_summary( expenses )
    expenses.group_by(&:pay_type).inject({}) |expense_package, (pay_type, expenses_by_pay_type)|
      package = build_financial_transaction_package_by_payable(pay_type, expenses)
      expense_package.merge( package )
    end
  end

  def build_financial_transaction_package_by_payable(pay_type, expenses)
    expenses.group_by(&:payable_to).inject({}) do |expense_package, (payable_to, pay_to_expenses)|
      payment_amount = pay_to_expenses.map(&:amount).sum
      payment_count = pay_to_expenses.size
      package = self.financial_transaction_package(pay_type, payable_to, payment_amount, payment_count, User.current_user.id, panel.court_id, court_ap_id)
      expense_package.merge( package )
    end
  end
end

As constructed, build_expense_summary will return a hash. Thus, with this refactoring we’ve not only made the intention of the nested group_by calls more explicit we’ve also eliminated the need for the less-common mode of Hash instantiation. Nice.

We have also substantially simplified the original complete_payment method. The call to build up the set of expenses to process is wrapped up in a method and the processing of those expenses is also a method with an intention-revealing name.

Before presenting the final result I’ll also point out that I have a preference for methods that exit once they’ve determined they’ve reached their end. In this case I would much prefer to return false when the HTTP request fails than to set a flag and exit later.

The Reveal


class << self
  def complete_payment( panel_id )
    panel = Panel.find(panel_id)
    court_ap_id = Court.find(panel.court_id).accounts_payable.id

    transaction do
      approved_expenses = approved_expenses_for panel_id
      expense_summary = build_expense_summary approved_expenses

      unless Journal.post_group_of_entries_with_summary_account(expense_summary, panel.court_id, User.current_user.id).is_a?(Net::HTTPOK)
        logger.debug("ERRORED in Journaling with #{resp.body}")
        raise ActiveRecord::Rollback
      end
    end
  end

  def approved_expenses_for( panel_id )
    for_panel( panel_id ).for_state( 'approved' )
  end

  private
  def build_expense_summary( expenses )
    expenses.group_by(&:pay_type).inject({}) |expense_package, (pay_type, expenses_by_pay_type)|
      package = build_financial_transaction_package_by_payable(pay_type, expenses)
      expense_package.merge( package )
    end
  end

  def build_financial_transaction_package_by_payable(pay_type, expenses)
    expenses.group_by(&:payable_to).inject({}) do |expense_package, (payable_to, pay_to_expenses)|
      payment_amount = pay_to_expenses.map(&:amount).sum
      payment_count = pay_to_expenses.size
      package = self.financial_transaction_package(pay_type, payable_to, payment_amount, payment_count, User.current_user.id, panel.court_id, court_ap_id)
      expense_package.merge( package )
    end
  end
end

Conclusion

The refactored code is both longer and more concise. Huh? Well, in this case we’ve actually added a half-dozen lines or so to the solution (counting comments and blank lines in each). We’ve also expanded one method into four. In those senses the code is longer. At the same time the methods that we have are significantly shorter with method names and additional variables used to make it more explicit. The methods are more concise to read and easier to comprehend. It’s also more concise in the sense that at least one of the extracted methods will be reused elsewhere so the code as a whole is more concise.

A Post Entitled New Year’s Cleanup: Off Color Code

posted 4 years ago

For a number of reasons I have inherited a lot of code recently and I’m in the process of trying to clean up and put the finishing touches on a release we hope to see out the door soon. That’s the good news. The bad news is that my emphasis has had a lot more ‘clean up’ than ‘finishing touches’ of late.

What is Off-Color Code?

One of the things that I have had to clean up a fair bit of is off-color code. What is off-color code? It’s that code in your repository that is impossible to decipher at first glance because no source code highlighter can help. It looks something like this…


#    def complete_payment(panel_id)
#        Panel.transaction do
#          panel = Panel.find(panel_id)
#          court_ap_id = Court.find(panel.court_id).accounts_payable.id
#          # identify all pending expenses for the panel
#          pending_expenses = Financial::RosterExpense.for_panel(panel_id).for_state('approved')
#          # collect hash fo summarized rosters to send for journalling to financials
#          ...snip a few dozen lines...
#          unless resp.instance_of?(Net::HTTPOK)
#            logger.debug("ERRORED in Journaling with #{resp.body}")
#            raise ActiveRecord::Rollback
#            response = false
#          end
#        end
#        response ||= true
#        return response
#      end

The syntax highlighting on this site cannot even help there. What’s that? It’s all comments? Exactly. It’s all comments.

The uncertain developer prefers to keep deprecated code in the code base while refactoring. He fears the code surgeon’s scalpel when performing a long overdue operation. Why? Most often there is a sense that if the code is removed completely then some ‘knowledge’ has been lost or we might forget some insight we once had.

Wrong. Don’t comment out code. That’s what git, subversion, cvs, and all the other source code management software is for. Cut the dead code loose from your code base. You will not miss it. Ever. If you doubt it then try the following test.

  1. Open a project that is old enough to eat solid food.
  2. Find a model or controller that has a method (or at least a large chunk of contiguous code) commented out.
  3. Summarize what the code does in less than ten seconds. (1pt)
  4. Name at least one insight contained in the code. (1pt each)
  5. Count the number of times you have read the code in the last month (1pt each).

If you score 1 point or less then it’s time to get rid of that code.

The Cost of Dead Code

The biggest problem with dead code is that it has a larger cost than benefit. The cost is measured in a few ways. It takes away from your code base by making it more difficult to read and comprehend the active code because you have to skip around it. It takes away from your code by adding to the questions that brought you to this particular code: is this code coming or going? Has it been recaptured elsewhere? Why was it removed in the first place?

Simply put, this code in suspended animation takes your eye away from what your source is otherwise trying to accomplish.  Even worse, it does this every time you look at the source code.

Get rid of that dead code.

Simplify. Simplify. Simplify.

Comments

About this site and its Author


The Tumblrs to Follow

  • staff
  • taylorstownsquare
  • cdmwebs
  • paulsullivanjr
  • dawnvanasse
  • bitbltr