I defined the efficiency function in JS, turned the crank, and got this table. It shows the efficiency in bases 2 through 8 for the numbers 100, 1000, 10 000, 1 000 000 and 1 000 000 000.

Efficiency isn’t just about getting the shortest string. There are two things that make a number more costly to store. The first: the more unique digits there are in a base, the more space it takes to store each digit of a number in that base. The second: the more digits are needed to represent a number, the more storage it takes. So the formula for the cost is: the length of the number times the number of unique digits in the base.
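Here is a minimal sketch of that formula in JS. The names `digitCount` and `storageCost` are mine for illustration, not necessarily the ones used to build the table, and the sketch assumes non-negative integers only.

```javascript
// Number of digits needed to write the non-negative integer n in the given base.
function digitCount(n, base) {
  if (n === 0) return 1; // zero still takes one digit
  let count = 0;
  while (n > 0) {
    n = Math.floor(n / base); // drop the last digit
    count++;
  }
  return count;
}

// Cost from the formula above: length times number of unique digits.
// Lower means cheaper to store. (Hypothetical name.)
function storageCost(n, base) {
  return digitCount(n, base) * base;
}

// Example: 100 is 1100100 in base 2, so 7 digits times 2 symbols.
console.log(storageCost(100, 2)); // 14
```

For small numbers the result jumps around depending on how close the number is to a power of the base, which is why a single small number is not enough to rank the bases.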
We now define the efficiency of a base to be the efficiency of a huge number in that base. As the number approaches infinity, the comparison between bases gets more and more accurate.
The optimal integer base is base 3. The reason is that 3 is the closest integer to e. If you allow non-integer bases, then base e is optimal. But if you stick to integer bases (which you should, because computers are very bad at non-integer bases), base 3 is the optimal base.
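To see where e comes from, here is a rough sketch under one assumption: for a huge number n, the digit count in a base is about ln(n) / ln(base). The cost is then about base × ln(n) / ln(base), and the ln(n) part is the same for every base, so only base / ln(base) matters. The function name `asymptoticCostFactor` is made up for this example.

```javascript
// Per-base cost factor for huge numbers: base / ln(base).
// Assumes digit count ≈ ln(n) / ln(base), which ignores rounding up to a whole digit.
function asymptoticCostFactor(base) {
  return base / Math.log(base); // Math.log is the natural logarithm
}

// Print the factor for bases 2 through 8, plus e for comparison.
for (const base of [2, Math.E, 3, 4, 5, 6, 7, 8]) {
  console.log(base.toFixed(3), asymptoticCostFactor(base).toFixed(3));
}
```

The factor is smallest at e (about 2.718). Among the integers, base 3 comes out lowest at about 2.731, while base 2 and base 4 tie at about 2.885.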
So why do computers use base 2 to store data? The answer is: some of them don’t! And I’m not talking about quantum computers, I’m talking about photonic computers: computers that use light to store stuff. And that light actually has 3 states, not 2: an off state, a “positive” state, and a “negative” state. So these computers are ternary, but with 0, 1 and -1 as their digits instead of 0, 1 and 2! (There’s no agreed-upon notation for these digits, but I like to use 0, + and -.)
This weird base is called balanced ternary.
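To get a feel for it, here is a small sketch that writes an ordinary integer in balanced ternary, using 0, + and - as the digits. The function name `toBalancedTernary` is just for illustration. The trick is that a remainder of 2 is written as -1 with a carry into the next place.

```javascript
// Convert an integer (positive, negative or zero) to a balanced ternary string.
function toBalancedTernary(n) {
  if (n === 0) return "0";
  let digits = "";
  while (n !== 0) {
    // Remainder in {0, 1, 2}, kept non-negative even when n is negative.
    const r = ((n % 3) + 3) % 3;
    if (r === 0) {
      digits = "0" + digits;
    } else if (r === 1) {
      digits = "+" + digits;
      n -= 1; // this place contributes +1
    } else {
      digits = "-" + digits;
      n += 1; // this place contributes -1, so carry one upward
    }
    n /= 3; // n is now an exact multiple of 3
  }
  return digits;
}

console.log(toBalancedTernary(5)); // "+--", which is 9 - 3 - 1
```

Negative numbers need no separate minus sign: flipping every + and - in the string negates the number.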
