Diving into String (Java)

Ah yes, String. Almost certainly the most used class in Java, and yet I bet you have never looked at the source code.

Well, there is no time like the present, so let's dive in (source code, if you want to follow along). Okay, first things first: what happens when we create a string from another string?

    public String(String original) {
        int size = original.count;
        char[] originalValue = original.value;
        char[] v;
        if (originalValue.length > size) {
            // The original uses only part of its array: copy just that part.
            int off = original.offset;
            v = Arrays.copyOfRange(originalValue, off, off+size);
        } else {
            // The original uses its whole array: share it.
            v = originalValue;
        }
        this.offset = 0;
        this.count = size;
        this.value = v;
    }

Okay, from this we can see quite a few things. First (and not too surprisingly), a String is backed by an array of chars. Second, since strings in Java are immutable, the new string can simply refer to the backing array of the original string, which means that we do not use memory to store two copies of the same thing.
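To make the sharing concrete, here is a rough sketch of the same idea. SketchString is a made-up class that only imitates the value/offset/count layout from the constructor above; it is not the real java.lang.String, and the sharesArrayWith helper exists purely for the demonstration.

```java
import java.util.Arrays;

// Hypothetical, simplified model of the String layout discussed above.
// Not the real java.lang.String; names other than value/offset/count are invented.
final class SketchString {
    final char[] value;  // backing array, possibly shared with other instances
    final int offset;    // index of the first character used in value
    final int count;     // number of characters used

    SketchString(char[] chars) {
        // Copy the input so nobody outside can mutate our backing array.
        this.value = Arrays.copyOf(chars, chars.length);
        this.offset = 0;
        this.count = chars.length;
    }

    // Mirrors the copy constructor shown above.
    SketchString(SketchString original) {
        int size = original.count;
        if (original.value.length > size) {
            // The original is a slice of a larger array: copy only the used part.
            this.value = Arrays.copyOfRange(
                    original.value, original.offset, original.offset + size);
        } else {
            // The original uses its whole array, and it never changes: share it.
            this.value = original.value;
        }
        this.offset = 0;
        this.count = size;
    }

    // Demonstration helper (an assumption of this sketch, not a String method).
    boolean sharesArrayWith(SketchString other) {
        return this.value == other.value;
    }

    public static void main(String[] args) {
        SketchString a = new SketchString("hello".toCharArray());
        SketchString b = new SketchString(a);
        System.out.println(b.sharesArrayWith(a)); // prints "true"
    }
}
```

Sharing is only safe because nothing ever writes to the array after construction. If the class exposed a way to modify value, a change made through one instance would show up in the other.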

Since the String class stores an offset, you may have guessed how the substring method is implemented:

 new String(offset + beginIndex, endIndex - beginIndex, value) 

This means that substring avoids copying any characters and that creating a substring is O(1), with one potential snag. As long as a single String instance refers to a char array, the entire array has to be kept in memory, even if only a very small part of it is actually used. That is why, if we later create a new string from a string produced by substring, the constructor (as you might have noticed in the code above) does create a new char array to back the new String: the copy holds only the characters that are needed, so the large original array can be garbage collected. Keeping the large array alive is obviously wasteful, and it is also very difficult to track down, since the String class is used in practically all code.
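The trade-off can be sketched with a small stand-in class. SliceString below is hypothetical and only models the offset/count layout from the source discussed here; retainedChars is an invented helper that reports how large the backing array is. As far as I know, newer JDKs (from Java 7 update 6 on) dropped this layout and make substring copy the characters instead, so the sketch describes the older behaviour.

```java
import java.util.Arrays;

// Hypothetical model of an offset/count based string. Not java.lang.String.
final class SliceString {
    private final char[] value;
    private final int offset;
    private final int count;

    SliceString(char[] chars) {
        this.value = Arrays.copyOf(chars, chars.length);
        this.offset = 0;
        this.count = chars.length;
    }

    // Sharing constructor: no characters are copied, so it runs in O(1).
    private SliceString(int offset, int count, char[] value) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }

    // Copy constructor: trims the backing array when the original is a slice.
    SliceString(SliceString original) {
        this.count = original.count;
        this.offset = 0;
        this.value = original.value.length > original.count
                ? Arrays.copyOfRange(original.value, original.offset,
                                     original.offset + original.count)
                : original.value;
    }

    SliceString substring(int beginIndex, int endIndex) {
        if (beginIndex < 0 || endIndex > count || beginIndex > endIndex) {
            throw new StringIndexOutOfBoundsException("bad range");
        }
        return new SliceString(offset + beginIndex, endIndex - beginIndex, value);
    }

    // Invented helper: how many chars this instance keeps alive.
    int retainedChars() {
        return value.length;
    }

    @Override
    public String toString() {
        return new String(value, offset, count);
    }

    public static void main(String[] args) {
        SliceString big = new SliceString(new char[1_000_000]);
        SliceString small = big.substring(10, 13);     // O(1), shares the big array
        SliceString trimmed = new SliceString(small);  // copies just 3 chars
        System.out.println(small.retainedChars());     // prints 1000000
        System.out.println(trimmed.retainedChars());   // prints 3
    }
}
```

With this layout, holding on to small keeps all one million chars reachable, while trimmed keeps only three. That is the reason code written against the older String would sometimes wrap a substring in new String(...) before storing it for a long time.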