This is a reworking of a list I created a couple of years ago for a client. Hope you find it useful.
Contents
- Object-orientation
- Dry and Shy
- Tell don’t ask
- The law of Demeter
- Object Composition
- Ruby considerations
- Replace case statements with behaviour
- Fat model thin controller
- Short methods and classes
- General points
- Databases
- HTML and CSS
- Javascript
- Recommendations for fixing “unstable” code
- Continuous Testing
Object-orientation
DRY & Shy
Don’t repeat yourself (DRY)
An example of this is having setup and teardown code repeated in every method. This avoids a heavy maintenance load and ensures that the “cut and paste monster” does not create code that looks correct but references incorrect variables, or require fixes to be applied across many classes where they might not be implemented properly.
Don’t tell people about yourself (shy)
In object-oriented design this means that an object should not expose how it does things, it may expose a method that gives you a result saying that it did it. Outside of a class used to transport data around most classes should not expose their internal attributes to client classes.
Tell don’t ask
Procedural code gets information then makes decisions. Object-oriented code tells objects to do things.
— Alec Sharp
Objects should be told what to do – not asked about their state and then the client object decides, procedural code uses state like this. This causes strong coupling between objects and makes testing and refactoring a nightmare. This also avoids the “train wreck” scenario where you end up with long strings of method calls, e.g.
project.client.calls.first.message
See the following links for more discussion on this:
Mock objects and tell don’t ask
The law of Demeter
This came out of a project on object-oriented design called Demeter, it is a more formal way of talking about tell don’t ask. It is also described as the principle of least knowledge. A caricature of this is to say that there should never be more than one ‘.’ in the string of method calls; of course common sense says this is a guideline rather than a hard and fast approach. In essence constructing helper methods to mask out long chains and make it clear what you are doing.
In summary:
More formally, the Law of Demeter for functions requires that a method M of an object O may only invoke the methods of the following kinds of objects:
1. itself
2. M’s parameters
3. any objects created/instantiated within M
4. O’s direct component objects
A common mistake is discussed here; one should endeavour to delegate behaviour not attributes
Object Composition
It is often better to compose an object from others rather than extending something and giving it too much to do or overriding the default constructor and creating a hybrid object in a non-standard way.
Ruby-specific considerations
Reusability in Ruby is not just through inheritance. It also works through modules. You need to think hard about using modules or inheritance and be able to justify why you chose one over the other. There are no right answers but don’t just blindly do what you did last time. Sometimes modules can be hard to understand and debug, and even hard to find if they’re in, say, a library directory outside of the main code base.
Replace case statement with behaviour
Long case statements are not true object code. Depending upon the circumstances there are several ways to refactor this:
- Use a Hash
- If an object needs to be instantiated or a method called then use the technique in 1 to call a lambda function (created using create_function if you’re a PHP head) that will return the object.
- Any long list, e.g. of return codes, should be held as a Hash and used to return the associated value directly.
A long case statement indicates the target object is not broken up into grouped behaviours correctly – the client object should be calling a method to do what it needs directly and the case statements would be unnecessary.
Testing code written this way is far easier. The dispatching array needs to be set up correctly and each single method can be tested independently. Testing a large case statement is almost impossible. The behaviour in each case needs to be put into its own method.
Fat models, thin controllers in MVC
When using the Model/View/Controller architecture any complex work should be done by the model code. This encourages reuse and moves the controller to just controlling and dispatching.
Short methods and classes
Methods should not be in excess of about 25 lines including comments. A class should have its responsibilities well-defined and not have more than ten or so methods.
General points
Magic numbers
There should be no references to numbers or stages that aren’t given symbolic names to make the code clear. This also means method names, e.g. fred_12321 is not helpful.
Let boolean be boolean
Consider the following
isMyNumber = (defined?(biff) && boff == 3) ? true : false;
There is no point in testing a boolean result to return a boolean. (I also have a personal dislike of the ternary operator, because it encourages this kind of sloppiness, think!).
No surprises
If there is an accepted way of doing something, e.g. a class constructor calls the super class method, do not create non-standard ways of doing things. If something has to be done in a non-standard way there must be a comment to that effect.
Meaningful comments
An example of a meaningless comment is:
-- Open cursor
open cursor x for ...
A comment that just says what the code is doing is a waste of time.
Another example is:
/**
* Enter description here...
*/
Enter the description or remove the comment. I don’t care if your code generator puts it in, it’s garbage.
Exceptions
try{
someDodgyOperation() ;
} catch( e Exception ){
return null ;
}
Just don’t, ok? Handle the exception, or at least log it somewhere to give the next poor soul a chance of finding where the problem is. In Java you’ll get a null pointer exception somewhere probably completely unrelated to where the problem actually is, same in Ruby or any language you care to name that has exceptions, just don’t do this.
Databases
No filtering of result sets from the database
This involves getting the database to do the work – there should be no need to filter result sets. Retrieving more data than needed has several deleterious effects:
- Data sets need to be passed through several layers to get to the calling module – the less data that needs to be marshalled and transmitted the better.
- Retrieving large sets and filtering for a few records can easily cause locking and contention problems.
- Testing may be difficult because the PHP instance may run out of memory if the sets are not released quickly enough to be garbage collected.
In essence: make sure the database returns as little as possible and keys and relationships are properly policed.
Generic database code
If the where clause changes but the columns selected do not, the where clause should be made into a something that can be set. This reuses the method and the return bindings, meaning they only have to be changed in one place
Do not ‘select *’ from tables
Under no circumstances should * be selected from a table:
- All of the columns will be selected regardless of the need for them – see the discussion in in the above section about this.
- The values returned are dependent on the default order the columns appear in in the database and that is a function of how they were created. There is no guarantee that, for operational reasons, columns in your development and production environments were created in the same order, an RFC could have been applied in a different order – this will break and be very hard to debug.
Ruby on Rails Active Record fixes the ordering problem. But if you have several K of text you don’t use often, it won’t hurt to think about not retrieving it.
HTML and CSS
HTML code should not use deprecated tag attributes, e.g. bgcolor, or valign. Styling should be done through CSS.
Sizes of elements, e.g. tables, should be done through CSS also. This allows a DRY approach to styling output.
CSS should not be in line but in its own sheets to promote reusablility and allow the web browser to cache it. It also allows restyling work to be done with a minimum of fuss.
HTML should be conforming, e.g. attributes should be inside double or single quotes and lower case as recommended by the W3C
HTML 5 may change the definition of conforming HTML, because there’s a lot less nonsense in there.
JavaScript
JavaScript should be in separate files, if it needs PHP then the web server will sort this out. If the script is less than twenty lines or so then inline is fine, otherwise it should be handled in its own file. This is to promote reusability of the scripts and allow the web browser the chance to cache the JavaScript.
Most of the comments about code in the above section also apply to Javascript.
Recommendations for fixing “unstable” code
In order to change code you must be able to say that your changes have not broken any existing assumptions. This means that you need a reasonably complete test suite to begin with.
Creating such a test suite, and the culture that keeps it up to date, is a sound investment in future. When bugs are encountered then a new test should be created that exposes this error – then it is fixed. The whole suite running ensures that no new bugs have been introduced.
There is a good article (and book) explaining this in more depth
Continuous testing
This does not require a large amount of resources and can be built up over time. It needs to be in place before any refactoring takes place. Normal practice is to keep the testing running and ensure that refactoring has not broken any existing functionality.
Deploys are automated and run the test suite against a test database before they happen. Any failure will stop deployment. (in my dreams)