The MB Confusion

Every software developer thinks they know what MB means. It is, of course, 1,048,576 bytes. Only the hard drive vendors disagree.

How about a normal computer user? You’ll probably agree that they do not know for sure. It does not matter, anyway, as long as they know that 100 MB is greater than 90 MB. Right?

Let me ask you now, how many bytes are there in a 1.44 MB floppy disk?

You’ll probably be frustrated by the fact that 1.44 × 1024 × 1024 is not an integral value. The fact is, 1.44 MB is a misnomer: it is actually 1440 KB, i.e. 1.44 × 1000 × 1024 = 1,474,560 bytes.

Again, the confusion comes from the storage vendors. Or does it?

In fact, only the semiconductor industry favours powers of two. The only thing related to powers of two in the storage industry is that a sector is 512 bytes by convention—so ‘1.44 MB’ is actually 2880 sectors, as such a floppy disk has 2 sides, 80 tracks per side, and 18 sectors per track. All the other numbers have no relationship with powers of two. So it is natural that the storage industry now reports drive capacity in decimal MBs, GBs, and TBs.
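The floppy arithmetic is easy to check. The following is a minimal sketch that multiplies out the geometry given above; the constant names are my own, chosen for illustration.

```python
# Geometry of a '1.44 MB' floppy disk, as described in the text.
SIDES = 2
TRACKS_PER_SIDE = 80
SECTORS_PER_TRACK = 18
BYTES_PER_SECTOR = 512

sectors = SIDES * TRACKS_PER_SIDE * SECTORS_PER_TRACK
total_bytes = sectors * BYTES_PER_SECTOR

print(sectors)                      # 2880
print(total_bytes)                  # 1474560
print(total_bytes / 1024)           # 1440.0   -> the '1440 KB'
print(total_bytes / (1000 * 1024))  # 1.44     -> the mixed-unit 'MB'
print(total_bytes / 1000**2)        # 1.47456  -> decimal megabytes
print(total_bytes / 1024**2)        # 1.40625  -> binary megabytes
```

So the same disk is 1.44, 1.47, or 1.41 'MB', depending on which megabyte you pick.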

In order to mitigate the confusion, IEC 60027-2 introduced a series of binary unit prefixes in 1999:

  • Ki- or kibi-, 2^10
  • Mi- or mebi-, 2^20
  • Gi- or gibi-, 2^30
  • Ti- or tebi-, 2^40

So instead of saying a memory page is 4 KB (kilobytes), we should really say 4 KiB (kibibytes). The only problem is that more than 20 years after its introduction and more than 10 years after the ISO standardization (ISO/IEC 80000-13 in 2008), these prefixes are still not popular. The situation is so bad that Wikipedia explicitly discourages their use in its Manual of Style. The reason is very practical: most Wikipedia readers are not familiar with the IEC binary prefixes. So instead of using terms like mebibytes, Wikipedia recommends using the more common prefixes, and asks the authors to ‘explicitly specify the meaning of k and K as well as the primary meaning of M, G, T, etc. in an article’.
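To see why the distinction matters more as sizes grow, here is a small sketch that puts the IEC binary prefixes next to their SI decimal namesakes and prints the relative gap. The dictionary names are illustrative only and do not come from any library.

```python
# IEC binary prefixes and the SI decimal prefixes they are confused with.
BINARY_PREFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}
DECIMAL_PREFIXES = {"k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12}

for (bname, bval), (dname, dval) in zip(BINARY_PREFIXES.items(),
                                        DECIMAL_PREFIXES.items()):
    # How much larger the binary unit is than the decimal one, in percent.
    gap = (bval / dval - 1) * 100
    print(f"1 {bname}B = {bval:,} bytes; 1 {dname}B = {dval:,} bytes; "
          f"gap {gap:.1f}%")
```

The gap is about 2.4% at the kilo level and grows to roughly 10% at the tera level, which is why a 'terabyte' drive looks noticeably smaller in a tool that counts in binary units.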

Anyway, when we talk about RAM, there is no real confusion, as we always use binary powers, and 4 kB, 4 KB, and 4 KiB all mean 4096 bytes in most cases. When we talk about frequency or bandwidth, we always use decimal powers, and people will not misunderstand what 4 GHz or 100 Mbps means. The only place where there is a lot of confusion is storage. We have seen that 1.44 MB is neither binary nor decimal. We should also be aware that different OSs and tools use different conventions. While Microsoft Windows always sticks to the binary notation with units like KB, MB, and GB, Linux and GNU core utilities have begun to use the IEC binary prefixes, and macOS has been using the SI decimal prefixes (1 GB = 1,000,000,000 bytes) since 2009 (Snow Leopard).

While I am not sure I will switch to representing 1,048,576 bytes as 1 MiB when talking about RAM usage, I am pretty sure I will not report a 123,456,789-byte file as 117.7 MB ever again; 123.5 MB seems much simpler, more natural, and more correct.
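For the curious, here is a minimal sketch of a size formatter that supports both conventions and labels the result unambiguously. The function name `format_size` and its `binary` parameter are my own invention for illustration, not an existing API.

```python
def format_size(num_bytes: int, binary: bool = False) -> str:
    """Format a byte count with SI decimal units (default) or IEC binary units."""
    base = 1024 if binary else 1000
    units = (["B", "KiB", "MiB", "GiB", "TiB"] if binary
             else ["B", "kB", "MB", "GB", "TB"])
    value = float(num_bytes)
    index = 0
    # Scale down until the value fits the current unit or we run out of units.
    while value >= base and index < len(units) - 1:
        value /= base
        index += 1
    return f"{value:.1f} {units[index]}"


print(format_size(123_456_789))               # 123.5 MB
print(format_size(123_456_789, binary=True))  # 117.7 MiB
```

Whichever base you choose, printing the matching unit (MB for decimal, MiB for binary) is what removes the ambiguity.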

What will be your choice?

P.S. For more in-depth coverage of this topic, check out the Wikipedia article ‘Binary prefix’.

P.P.S. By the way, did you notice that I had a number inconsistency in the very first sentence? If not, it is evidence that we can get rid of ‘he or she’. For more details, check out On the Use of She as a Generic Pronoun.
