Chapter 2 Code Smells

2.1 Bloaters

Bloaters are code, functions or objects that have been increased to an unmanageable size. This can happen slowly or fast but should be handled either way.

2.1.1 Long Function

Long functions can be a hassle to deal with as the complexity slowly can creep in and become unmanageable after a while. A function can become long for various reasons. sometimes it is just a series of calculations that happens one after another that each takes up some space, in that case, it can be worth it to Extract Functions to give yourself a better overhead of the code. Other times it is the conditional logic in your function that will increase the size, in that case, many of the Simplifying Conditional Expressions refactors be useful.

Common refactors: Extract Function, Replace Temp with Function

2.1.2 Primitive/Atomic Obsession

Working with specialized data structures such as dates, phone-numbers, coordinates, and currencies it might be tempting to represent them as simple integers or characters. A problem that might sneak in is that units are going to be ignored or non-sensical values appear. It might be time to introduce a new S3 vector class Replace Atomic with S3 Class.

groups of primitive vectors that always travel together are Data Clumps and should be fixed with Extract Class and Introduce Parameter Object.

TODO talk about vctrs package

2.1.3 Data Clumps

When you have multiple objects always traveling together it might be time to combine them to one object. Think x-coordinates and y-coordinates.

Refactors: Extract Class, Introduce Parameter Object and

2.1.4 Many Parameters

Having a function takes many parameters is often a necessary evil if you want to avoid Global Data. However, the function still might become overwhelming. There are a handful of different patterns we have refactors we can use.

If a parameter can be derived from another parameter then you can use Replace Parameter with Query to remove one of the parameters.

If you notice that some variables always travel together, then it might be worth it to Introduce Parameter Object. Sometimes you pass a lot of elements of an object or list into a function, it might be worth it to pass the whole object in and extract inside. In that case, you can use Preserve Whole Object. Sometimes a parameter is used to denote how a function should be dispatched, here you would want to use Remove Control Flag.

2.2 Change Preventers

These smells mean that if you need to change something in one place in your code, you have to make many changes in other places too. These smells violate the rule suggested by Fowler and Beck which says that classes and possible changes should have a one-to-one relationship.

2.2.1 Divergent Change

When the same function is changed in different ways for different reasons you might have a problem with Divergent Change. Usually, when we are writing code we want to be able to focus on one point and fix an issue there.

TODO write a better explanation.

2.2.2 Shotgun Surgery

Shotgun Surgery is similar to Divergent Change but is the opposite problem. If you find yourself having to make a bunch of little edits all over your codebase when you update a single function you are having this smell. Having to make edits all over the place is hard and it is easy to accidentally miss one.

2.3 Dispensables

Dispensables are things that we can easily get rid out.

2.3.1 Comments

I’m not saying that you shouldn’t use comments. Comments are great at explaining why something is done a certain way. But it sometimes happens that comments are used to hide/excuse badly written code. If that is the case then it is the code, not the comments that are smelly and should be changed, leaving the comments no longer needed.

2.3.2 Duplicate Code

Writing the same code multiple places is a waste of time and a way to potentially introduce mistakes. Here we would want to Extract Function.

2.3.3 Dead Code

Having code that is never used isn’t needed and should be removed, simply use Remove Dead Code.

2.3.4 Dead Stores

Dead stores happen when you assign a value to a variable and assigns a different value to it before using the original value. This is at worst a waste of time, but it is likely a bug in your code.

2.3.5 Speculative Generality

This smell is all too common. You think to yourself “We will need this feature later, let me just add the needed infrastructure to support it” and then you never end up needing it. You added a lot of things that aren’t being used, in the best case slowing down your runtime and worst-case making it hard to implement changes you need right now.

Unused parameters can be removed with Change Function Declaration and unused code with Remove Dead Code.

2.4 Others

2.4.1 Lazy Element

We like to add structure, but over the course of development, some functions or classes become too small to justify having around. They simply don’t do enough and become lazy.

get_mtcars <- function() {
  mtcars
}

get_mtcars()
##                      mpg cyl  disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4           21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag       21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710          22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive      21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout   18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
## Valiant             18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
## Duster 360          14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
## Merc 240D           24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
## Merc 230            22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
## Merc 280            19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
## Merc 280C           17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4
## Merc 450SE          16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3
## Merc 450SL          17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3
## Merc 450SLC         15.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3
## Cadillac Fleetwood  10.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4
## Lincoln Continental 10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4
## Chrysler Imperial   14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4
## Fiat 128            32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
## Honda Civic         30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
## Toyota Corolla      33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
## Toyota Corona       21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1
## Dodge Challenger    15.5   8 318.0 150 2.76 3.520 16.87  0  0    3    2
## AMC Javelin         15.2   8 304.0 150 3.15 3.435 17.30  0  0    3    2
## Camaro Z28          13.3   8 350.0 245 3.73 3.840 15.41  0  0    3    4
## Pontiac Firebird    19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2
## Fiat X1-9           27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
## Porsche 914-2       26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2
## Lotus Europa        30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
## Ford Pantera L      15.8   8 351.0 264 4.22 3.170 14.50  0  1    5    4
## Ferrari Dino        19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6
## Maserati Bora       15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8
## Volvo 142E          21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2

Refactors: Inline Function, Inline Class.

2.4.2 Loops

Loops have been around since the early days of programming. And you will have a hard time producing something without iteration. But this doesn’t mean that you have to write the loop yourself.

Refactors: Replace Loop With Apply

One problem that loops might introduce is that it leaves the iterating value in the environment

i <- 124

for (i in 1:10) {
  # ...
}

i
## [1] 10

to avoid this you either have to make sure you don’t reuse names, pull the whole loop in a function call or replace with a function from the apply family.

TODO: Add examples where you can’t avoid loops and what to do about it.

2.4.3 Mysterious Name

When coding you don’t want to spend an unnecessary amount of time trying to figure out what something is. Often this can be alleviated by properly naming objects in our code.

Common refactors: Rename Variable, Rename Function and Rename Argument.

2.4.4 Repeated Switches

Having branching logic isn’t necessarily a bad idea. But having the same switching logic multiple places invites trouble. Every time you need to add a new case you need to remember to introduce it everywhere.

2.5 Smells not covered in this book

  • Mutable Data
  • Global Data
  • Feature Envy
  • Temporary Field
  • Message Chains
  • Middle Man
  • Insider Trading
  • Large Class
  • Alternative Classes with Different Interfaces
  • Data Class
  • Refused Bequest
  • Switch Statements
  • Parallel Inheritance Hierarchies
  • Lazy Class
  • Inappropriate Intimacy