Good software design part 1: What is refactoring? What are code smells?

In SDV701 (Software Development) we have to memorize information about refactoring and design patterns for an exam at the end of this semester. In order to help me memorize these terms, dig  deeper into the topics of refactoring and good software design, and reflect on what I have learned, I am going to do four journal entries about good software design.

I am splitting this up into four because it is a huge subject; I have to cover 22 code smells alone (which will be split between this entry and the next one) and I want to do a in depth job understanding and writing about them for my own benefit.

The concepts of refactoring and design patterns fall under the umbrella term of good software design because they make the code easier to read.

Some overarching concepts I have realized from researching refactoring are:

  • In software development you will always have to change the code you write in the future. This is the only certainty of software development.
  • Even if applications are built according to a design pattern over time new user requirements will cause additional functionality to be added to the system. If the process of refactoring (which will be defined and discussed below) is not followed when new functionality is added to a system then software decay can happen. This is where the original design pattern is lost due to new functions, and classes being added hurriedly to the system without consideration for how they fit into the existing design pattern.

refactor.png

What is refactoring?

“A disciplined technique for restructuring an existing body of code, altering its internal structure without changing its external behavior” (“Refactoring,” n.d.).

As I see it refactoring is small changes to code that make it easier to understand and change in the future. However it is important to remember that refactoring does not change the external behavior of the system, just the internal structure and readability.

Each refactoring action should be a small and reversible change to a working system, which is tested (by running self checking tests) to make sure no bugs have been introduced to the system by the refactoring. Subsequent refactors should not be made until after the tests from the previous refactor have shown that everything is working as intended.

Now I originally thought that refactoring was just changing method and variable names to make the code easier to understand when read, however a number of refactoring actions change the structure of the code. For example Move Method which would move a method from Class A to Class B if Class B uses the method more than Class A.

Therefore an important thing to realize about refactoring is that it is not just changing variable and method names it is also changes to code structure to make it more flexible, scalable and understandable.

So an analogy I thought of that makes sense to me is refactoring is kind of like editing a book. One action that takes place during editing is the grammar and spelling is corrected so that the content of the book makes more sense. i.e. a readability change

Additionally the structure of the book is changed if a paragraph makes more sense in chapter 2 than in chapter 1, and so is moved to chapter 2. i.e. an internal structural change.

But after all the editing changes the book still has the same message and much of the same external appearance.

What are the benefits of refactoring?

Refactoring can help developers to understand a system by changing the code to a state which can be read and understood without the need for comments. This is achieved through appropriate variable and method names, short parameter lists, and short methods that only perform one action.

For example if a variable name does not make sense to a developer they can change it; this is a refactoring action. An example of this might be the variable result is changed to CustomerFee, this makes more sense to the developer now and it makes the code more closely match the systems problem domain.

code smell.jpg

(Oztoprak, 2014)

What is a code smell?

A code smell is an observable sign of a potential underlying problem. Because refactoring is changing the structure of externally working code to identify code smells you must read through the code, it isn’t as obvious as a software error crashing your program.

As Martin Fowler states in his website “smells don’t always indicate a problem” (Fowler, 2006), in other words if you find a code smell, for example a long parameter list this is not necessarily a problem you have to look at the situation and judge if the code smell is indicating an actual flaw in the code or not.

Of course having this subjective nature means that refactoring is best learnt through practice.

Types of code smell

Long method – According to Matthias any method that is longer than around 20 lines is too long.

Whereas according to the coding tutorial website sourcemaking.com methods longer than 10 lines are too long (“Long Method,” n.d.).

So what do I have to look out for with these two different viewpoints? Well I think Martin Fowler and Kent Beck have a pretty much universally applicable metric which is the name of a method describes what the method does. If a method grows so long that the method name no longer describes all the functionality of the method and you are forced to write in a comment then it is time to perform the Extract method refactoring action. (Fowler & Beck, 1999).

But what is bad about long methods? The main problem with long methods is they are  hard to understand, just like when reading a book a single huge paragraph is much harder to read then a series of many short paragraphs.

long method.png

(“Long Method,” n.d.)

Large class – You can identify a large class in two ways:

  1. A particular class is instantiated many times and so there are many instance variables of that class floating around in your program.
  2. A class contains variables and/or methods that are not directly related to the area of the problem domain that the class addresses.

Whats the problem with irrelevant variables and methods in my class? Well it means that my singular class is trying to do the job of multiple classes, this is not advantageous because in good software development we have the concept of single responsibility  (Fowler & Beck, 1999).

The single responsibility principle states that if there is more than one reason to change a class or method then that class/method is doing too much and it needs to be split up. This is because a change to the functionality of the class could unintentionally alter the other functionality of the class.  (“Single Responsibility Principle | Object Oriented Design,” n.d.).

In my view the drawing below sums up what a large class is, as the class in the drawing is doing multiple jobs when it should only be responsible for one part of the system.

What refactoring action can I do to remove long methods? The Extract class action is the most appropriate one to use. To do this create a new class making sure there is a relationship with the old class so that the old class can call the methods of this new class.

Then as part of Extract class perform the move method and move field refactoring’s, these actions actually move the data and responsibilities across to the new class. Now moving code between classes can throw a whole lot of errors it is a good idea to move the variables and methods one by one testing in between(“Extract Class,” n.d.)

large class.png

(“Large Class,” n.d.)

Primitive obsession – This is the use of primitive data types for representing data values that are important to the problem domain.

An example of an important data value would be employee wages in an accounting system. If the system is using the data type of decimal to represent the employee wages then this is the use of a primitive, instead what we want to do is make a class  for salary which is in a relationship with the employee class. This type of refactoring technique is called Replace data value with object.

Let me clarify:

In software development there are two types of data, record (also known as composite) and primitive types.

Record types are data containers such as objects, they can store multiple values. For example you can instantiate an array of objects, this means that this record type has multiple objects instances that it is storing.

Primitive types whereas store a single value. Primitives store their singular data value straight onto flash rather than pointing to another data type.

data types.jpg

(Srivastava, 2016)

Long parameter list –

Now this is a bad code smell I can definitely relate to for two reasons, not only are method calls with long parameter lists hard to understand but when you update the method parameter list you have to change both the method header and the caller and it is easy to forget to change the caller.

Having methods with long parameter lists is a a legacy of the days before Object Oriented (OO) programming when all relevant variables had to be handed to a method. Nowadays with OO “if you don’t have something you need, you can always ask another object to get it for you” (Fowler & Beck, 1999) you do this by placing the other object into your method as a parameter, and then calling the method in the other object to get the variable you need (Fowler & Beck, 1999).

How do you know when a method has too many parameters? Well if a method has a list of more than around 4 parameters then this is too long (“Long Parameter List,” n.d.).

What refactoring technique can I use to remove this bad code smell? You can use the Replace parameter with method call technique. This is where for example instead of handing MethodA a parameter price which is generated by the method GetPrice you instead call the GetPrice method from within MethodA. Thereby removing a parameter value from your parameter list.

parameter.jpg

(Bhattacaharya, 2015)

Data clumps – This is where multiple variables exist in several places throughout the application in a group. This is a sign that you need to collate these variables into a class.

To do this you need to perform the Extract class refactoring technique. This involves simply moving the variables to a newly created class, and then performing the Introduce Parameter Object technique to change parameter lists that contained the group of variables to having the object name instead.

Switch statements – When I first saw that switch statements were a bad code smell I was skeptical as I couldn’t see from first glance what the problem with them was, and how you could replace them but the solution is surprisingly easy.

The issue with switch statements is that they are often duplicated throughout a system. i.e. you are checking the type of an object (the object is in an inheritance trees and so the switch statement tries to find a subclass type matching the object type) in multiple classes.

For example in the case of NMIT students you may have a switch statement in the Institution and Course classes to check if they are local or international. Student would be the superclass of an inheritance tree with the subclasses: International and Local.

Both Institution and Course need to know the student type because the institution needs to know for accountancy whilst the course tutor needs to know their status for attendance. If a third status student subclass type became available then an additional case statement to check for the third student object type would mean changes to at least two sets of switch statements.

I can imagine in large systems this can become a real headache.

What refactoring technique can I use to remove this code smell? Well it is recommended you make use of the great Object Oriented concept polymorphism. This concept means that system treats objects differently based on what their data type is.

Now often switch statements go through a series of potential values for a object type. In the example I used before the switch statement would check the value of the object Student against the possible types of International and local.

Using polymorphism I can move the functionality that is run if the student is local to the ‘local’ subclass (and place it in the method GetStudentStatus()) and move the functionality for international to the ‘international’ class (also making the method GetStudentStatus() in the ‘international’ subclass), and then just place an abstract method of the name GetStudentStatus in the Student superclass (an abstract method is a method in a superclass which contains no code; it is just a placeholder) to be overridden by the Local and International subclasses.

I can then call the GetStudentStatus method in the Student superclass and as long as I am using a true OO programming language it will determine whether or not to run the Local subclasses GetStudentStatus() or International GetStudentStatus() method.

Below is a visual representation of what I am trying to explain:

polymorphism.jpg

(De Carufel, 2011)

Temporary field – A temporary field is a variable that is only filled with a value at certain times, this means it is empty a lot of the time (Fowler & Beck, 1999).

What is the problem with temporary fields? Effectively this is a readability issue, the use of temporary fields makes the code harder to read, we want the code to be understandable so it can be easily and efficiently extended in the future so the readability is important.

What technique can I use to remove this bad code smell? You can place the temporary field and the methods that use it in a new class, this is the Extract class technique. This creates a method object which is a where the method becomes a whole new class and the local variables that existed inside the method become variables of the new class (“Replace Method with Method Object,” n.d.)

Refused bequest – This is an easy one to remember, I just have to keep in mind my view towards inheritance which is that it shouldn’t be accepted and that people should make their own money.

As I have alluded to above, this bad code smell is where subclasses do not use all of the variables and methods they inherit from their superclass.

When developers set up inheritance they need to keep in mind Liskov substitution principle which requires the subclass to be an extension of the superclass it is inheriting from. In other words the inheritance has to be logical when looking at the system from a real world perspective (“Replace Inheritance with Delegation,” n.d.)

What refactoring technique can I use? You can use the Replace Inheritance with Delegation technique which will dismantle the inheritance tree by taking the subclass and turning it into a non-inheriting class which will call any necessary methods in the old superclass rather than inheriting them.The old subclass object calls the method in the old superclass using a delegate field.

What is a delegate? It is a variable containing an instantiated object, in this case the delegate variable will be the instantiated old superclass. So for example in the below image Stack contains a delegate variable ‘vector’ which contains the instantiated Vector class.

delegate.png

(“Replace Inheritance with Delegation,” n.d.)

The benefit of breaking up unnecessary inheritance is both the program makes more sense holistically and there are no longer un-used variables or methods in the old subclass.

Conclusion

I have found writing up the different code smells and defining what refactoring is very useful in clarifying them in my mind, and I will continue to talk about the rest of the code smells in my next Good Software design research journal entry.

Bibliography:

Fowler, M., & Beck, K. (1999). Refactoring: improving the design of existing code. Reading, MA: Addison-Wesley.

Refactoring. (n.d.). Retrieved March 7, 2017, from https://sourcemaking.com/

Fowler, M. (2006, February 9). CodeSmell. Retrieved March 13, 2017, from https://martinfowler.com/bliki/CodeSmell.html

Long Method. (n.d.). Retrieved March 14, 2017, from https://sourcemaking.com/refactoring/smells/long-method

Extract Class. (n.d.). Retrieved March 14, 2017, from https://sourcemaking.com/refactoring/extract-class

Primitive Obsession. (2013, February 18). Retrieved March 14, 2017, from http://wiki.c2.com/?PrimitiveObsession

Long Parameter List. (n.d.). Retrieved March 14, 2017, from https://sourcemaking.com/refactoring/smells/long-parameter-list

De Carufel, E. (2011, November). Top 5 ways to improve your code. Retrieved from https://www.slideshare.net/ericdecarufel/arc305-en

Replace Method with Method Object. (n.d.). Retrieved March 14, 2017, from https://sourcemaking.com/refactoring/replace-method-with-method-object

Replace Inheritance with Delegation. (n.d.). Retrieved March 14, 2017, from https://sourcemaking.com/refactoring/replace-inheritance-with-delegation

Large Class. (n.d.). Retrieved March 14, 2017, from https://sourcemaking.com/refactoring/smells/large-class

Srivastava, V. (2016, August). Data Types – Classification of Information. Retrieved from https://www.slideshare.net/vivsri/data-types-65014968

Bhattacaharya, M. (2015, September). Software Craftsmanship – Code Smells. Retrieved from https://www.slideshare.net/MrinalBhattacaharya/code-smells-52370759

Replace Inheritance with Delegation. (n.d.). Retrieved March 14, 2017, from https://sourcemaking.com/refactoring/replace-inheritance-with-delegation

Oztoprak, S. (2014, November 26). Code smells Refused bequest with example. Retrieved March 14, 2017, from http://siloracle.blogspot.co.nz/2014/11/code-smells-refused-bequest-with-example.html

 

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s