“B” refer to the size of parameters of the model in the billion but people has started referring them to “bits”.
This is not entirely correct. B does stand for “Billion” parameters, but bits are a different thing. You can have, for example, a 4-bit 13B model or an 8-bit 3B model. They don’t correlate at all.
An absolutely moronic decision.