Taking apart StringBuilder (Java)
If you are a Java developer, you almost certainly know that it is better to use a string builder than to concatenate strings with +. You may also know that you should use StringBuilder rather than StringBuffer, because the latter is thread-safe: its methods are synchronized, which is wasteful overhead if you don’t need it (and you likely don’t).
But how exactly is StringBuilder so much better, nay, awesome, than just adding strings together? Well, remember that in Java strings are immutable: once a string has been created it cannot be changed. This has all sorts of benefits, but it also means that we have to create new strings if we want new values.
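Let’s back up a second: what exactly does Java do when you write an expression that chains + operators, starting with "Greetings " + title and then adding a space, firstName, and so on?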
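A minimal sketch of the kind of expression discussed next. Only the "Greetings " + title start, the space and firstName come from the text; the lastName variable, the sample values and the class name are assumptions made for illustration.

```java
// Illustrative only: class name, lastName and the sample values are assumptions.
class GreetingConcat {
    static String greet(String title, String firstName, String lastName) {
        // Read left to right, this chain performs five '+' operations,
        // each of which conceptually yields a new String.
        return "Greetings " + title + " " + firstName + " " + lastName;
    }

    public static void main(String[] args) {
        System.out.println(greet("Dr.", "Ada", "Lovelace"));
        // prints: Greetings Dr. Ada Lovelace
    }
}
```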
The answer? Conceptually, it creates and throws away an entire bunch of strings! First it creates a temporary string that contains the result of "Greetings " + title (it also creates a string that contains "Greetings ", but Java has that optimised as a literal, so creating it is cheap and there is no way around it anyhow). Then it creates a new temporary string that contains the result of adding the previous temporary and the space. At this point the first temporary is garbage and will need to be collected. With the second temporary in hand, Java creates a third temporary, sets it to the result of adding the second temporary and the firstName variable, etc.
In all, not counting the literals in the code, Java creates 5 strings, and all but the last are garbage right away. That assumes this isn’t inside a loop, in which case it creates another 5 strings on every iteration. This can be done better, and this is what StringBuilder is for.
It helps to have the Javadoc and the source code for StringBuilder open in a new tab (or a new window) while reading, along with the source code of the AbstractStringBuilder class, since much of what we will be looking at lives there.
Right then, let’s rewrite the previous example to use a StringBuilder: create one builder, call append once per piece, and finish with toString.
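A sketch of what such a rewrite might look like. The variable names, the sample values and the class name are assumptions for illustration; append and toString are the real StringBuilder methods.

```java
// Illustrative only: class name, lastName and the sample values are assumptions.
class GreetingBuilder {
    static String greet(String title, String firstName, String lastName) {
        StringBuilder sb = new StringBuilder();
        // Each append copies characters into the builder's internal buffer
        // instead of producing an intermediate String.
        sb.append("Greetings ");
        sb.append(title);
        sb.append(" ");
        sb.append(firstName);
        sb.append(" ");
        sb.append(lastName);
        // Only here is the single result String created.
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(greet("Dr.", "Ada", "Lovelace"));
        // prints: Greetings Dr. Ada Lovelace
    }
}
```

Since append returns the builder itself, the calls could equally be chained in one statement.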
Most uses of StringBuilder involve just two methods: append and toString. So let’s look at the append(String str) method, which is implemented in AbstractStringBuilder.
If you have read my entry on ArrayLists, nothing here should be surprising: StringBuilder is basically backed by a big char array, and append copies the chars of the string into it, growing the array first if there is not enough room.
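The idea can be sketched with a small stand-in class. This is a simplified illustration, not the JDK source: the class name TinyStringBuilder, the initial size of 16 and the "double plus two" growth rule are assumptions modelled on older JDK versions, and integer overflow for huge buffers is ignored.

```java
// Simplified sketch, NOT the JDK implementation; names and details are assumptions.
class TinyStringBuilder {
    private char[] value = new char[16]; // backing store
    private int count = 0;               // number of chars actually in use

    TinyStringBuilder append(String str) {
        if (str == null) {
            str = "null"; // same convention as StringBuilder: null appends "null"
        }
        int len = str.length();
        int newCount = count + len;
        if (newCount > value.length) {
            expandCapacity(newCount); // not enough room: grow first
        }
        // Copy the string's chars straight into the free tail of the array.
        str.getChars(0, len, value, count);
        count = newCount;
        return this; // allows chaining
    }

    private void expandCapacity(int minimumCapacity) {
        // Assumed growth rule: roughly double, or more if the caller needs it.
        int newCapacity = Math.max((value.length + 1) * 2, minimumCapacity);
        char[] bigger = new char[newCapacity];
        System.arraycopy(value, 0, bigger, 0, count);
        value = bigger;
    }

    int length() { return count; }

    int capacity() { return value.length; }

    @Override
    public String toString() {
        return new String(value, 0, count);
    }
}
```

As long as the appended text fits, append allocates nothing; only an expansion creates a new array.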
What about the expandCapacity method? It (surprise) implements the logic to expand the backing store, and it serves as an illustration of why you shouldn’t reimplement something like that if you don’t have to.
Here the new capacity is always larger than the sum of all the previous capacities, which can cause heap fragmentation: the arrays freed by earlier expansions are never big enough, even taken together, to hold the next one. If you read my article on the ArrayList you can see the details; the important part is that this wouldn’t have been an issue if the implementation had used ArrayList. Unfortunately that isn’t possible, because char is a primitive type in Java and storing chars in an ArrayList would mean boxing every one of them into a Character.
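To make the growth argument concrete, here is a small sketch that prints successive capacities next to the combined size of all earlier arrays. The "double plus two" rule and the class and method names are assumptions modelled on older JDK sources, not a quote of them; 16 is StringBuilder’s documented default capacity.

```java
// Illustrative only: the growth rule and all names here are assumptions.
class CapacityGrowth {
    // Roughly double the capacity, unless the caller needs even more than that.
    static int nextCapacity(int current, int minimum) {
        int doubled = (current + 1) * 2;
        return Math.max(doubled, minimum);
    }

    public static void main(String[] args) {
        int capacity = 16;     // default capacity of a new StringBuilder
        int sumOfPrevious = 0; // combined size of all arrays allocated so far
        for (int i = 0; i < 5; i++) {
            sumOfPrevious += capacity;
            int next = nextCapacity(capacity, capacity + 1);
            System.out.println(capacity + " -> " + next
                    + " (earlier arrays combined: " + sumOfPrevious + ")");
            capacity = next;
        }
    }
}
```

With a growth factor of two or more, each new array exceeds the combined size of every array before it, so space freed by earlier expansions cannot be reused for the next one.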
Okay, enough digression. How is this more efficient than allocating a ton of strings? Simple: as long as there is enough space in the backing store, we can add characters without creating any new objects, which means we also avoid creating any garbage.
StringBuilder has append methods for just about anything, but they all work the same way: get the array of chars that corresponds to the string representation of the object we should add and put it on the end of an array.
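A short usage sketch of a few of those overloads. The class name, method name and sample values are made up for illustration; the append overloads themselves are part of the standard StringBuilder API.

```java
// Illustrative only: class name, method name and sample values are assumptions.
class AppendOverloads {
    static String describe(String name, int age, double score, boolean active) {
        return new StringBuilder()
                .append(name)    // append(String)
                .append(' ')     // append(char)
                .append(age)     // append(int)
                .append(' ')
                .append(score)   // append(double)
                .append(' ')
                .append(active)  // append(boolean)
                .toString();
    }

    public static void main(String[] args) {
        System.out.println(describe("Ada", 36, 9.5, true));
        // prints: Ada 36 9.5 true
    }
}
```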
StringBuilder also has insert methods, but those aren’t used as often so we will pretend they don’t exist.
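Okay, so now we have added all the strings; how do we get them back out? With toString. Here we create one String object using a less common constructor: the one that takes a char array, an offset and a count, and copies that range into the new string.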
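A minimal sketch of that last step, assuming a hypothetical buffer and helper method in place of the builder’s internals. The String(char[], int, int) constructor is real; it copies the requested range, so the resulting String is unaffected by later changes to the array.

```java
// Illustrative only: class name, helper name and the buffer are assumptions.
class ToStringSketch {
    static String fromBuffer(char[] value, int count) {
        // Copies the first 'count' chars; the unused tail of the array is ignored.
        return new String(value, 0, count);
    }

    public static void main(String[] args) {
        char[] buffer = new char[16];       // stand-in for the backing store
        "Hello".getChars(0, 5, buffer, 0);  // pretend "Hello" was appended
        String s = fromBuffer(buffer, 5);
        buffer[0] = 'J';                    // later edits do not touch the String
        System.out.println(s);              // prints: Hello
    }
}
```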
And that is pretty much it actually. A relatively simple implementation which gives a huge speed bonus.