The MB Confusion

Every software developer thinks they know what MB means. It is, of course, 1,048,576 bytes. Only the hard drive vendors disagree.

How about a normal computer user? You’ll probably agree that they do not know for sure. It does not matter, anyway, as long as they know that 100 MB is greater than 90 MB. Right?

Let me ask you now, how many bytes are there in a 1.44 MB floppy disk?

You’ll probably be frustrated by the fact that 1.44 × 1024 × 1024 is not an integral value. The fact is, 1.44 MB is a misnomer: it is actually 1440 KB.

Again, the confusion comes from storage vendors. Or does it?

In fact, only the semiconductor industry favours powers of two. The only thing related to powers of two in the storage industry is that a sector is 512 bytes by convention—so ‘1.44 MB’ is actually 2880 sectors, as such a floppy disk has 2 sides, 80 tracks per side, and 18 sectors per track. All the other numbers have no relationship with powers of two. So it is natural that the storage industry now reports drive capacity in decimal MBs, GBs, and TBs.
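The floppy arithmetic above can be checked mechanically. The following is a minimal sketch; the names `floppy_sectors`, `floppy_bytes`, and `floppy_size_in` are made up for illustration, and only the geometry figures come from the text.

```cpp
#include <cassert>
#include <cstdint>

// Geometry of the "1.44 MB" floppy disk, as given in the text.
constexpr std::uint64_t kSides = 2;
constexpr std::uint64_t kTracksPerSide = 80;
constexpr std::uint64_t kSectorsPerTrack = 18;
constexpr std::uint64_t kBytesPerSector = 512;

constexpr std::uint64_t floppy_sectors()
{
    return kSides * kTracksPerSide * kSectorsPerTrack;
}

constexpr std::uint64_t floppy_bytes()
{
    return floppy_sectors() * kBytesPerSector;
}

// Compile-time checks of the numbers quoted in the article.
static_assert(floppy_sectors() == 2880u, "2 sides x 80 tracks x 18 sectors");
static_assert(floppy_bytes() == 1474560u, "2880 sectors x 512 bytes");
static_assert(floppy_bytes() == 1440u * 1024u, "exactly 1440 KiB");

// Capacity expressed in an arbitrary unit (hypothetical helper).
double floppy_size_in(double bytes_per_unit)
{
    assert(bytes_per_unit > 0);
    return static_cast<double>(floppy_bytes()) / bytes_per_unit;
}
```

Calling `floppy_size_in(1000.0 * 1000.0)` gives 1.47456 (decimal megabytes), `floppy_size_in(1024.0 * 1024.0)` gives 1.40625 (mebibytes), and only the hybrid unit of 1000 × 1024 bytes yields the familiar 1.44.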

In order to mitigate the confusion, IEC 60027-2 introduced a series of binary unit prefixes in 1999:

  • Ki- or kibi-, 2^10
  • Mi- or mebi-, 2^20
  • Gi- or gibi-, 2^30
  • Ti- or tebi-, 2^40

So instead of saying a memory page is 4 KB (kilobytes), we should really say 4 KiB (kibibytes). The only problem is that more than 20 years after their introduction and more than 10 years after the ISO standardization (ISO/IEC 80000-13 in 2008), these prefixes are still not popular. The situation is so bad that Wikipedia explicitly discourages their use in its Manual of Style. The reason is very practical: most Wikipedia readers are not familiar with the IEC binary prefixes. So instead of using terms like mebibytes, Wikipedia recommends using the more common prefixes, and asks the authors to ‘explicitly specify the meaning of k and K as well as the primary meaning of M, G, T, etc. in an article’.

Anyway, when we talk about RAM, there is no real confusion, as we always use binary powers, and 4 kB, 4 KB, and 4 KiB all mean 4096 bytes in most cases. When we talk about frequency or bandwidth, we always use decimal powers, and people will not misunderstand what 4 GHz or 100 Mbps means. The only place where there is a lot of confusion is storage. We have seen that 1.44 MB is neither binary nor decimal. We should also be aware that different OSs/tools use different conventions. While Microsoft Windows always sticks to the binary notation with units like KB, MB, and GB, Linux and GNU core utilities have begun to use the IEC binary prefixes, and macOS has been using the SI decimal prefixes (1 GB = 1,000,000,000 bytes) since 2009 (Snow Leopard).

While I am not sure I will switch to representing 1,048,576 bytes as 1 MiB when talking about RAM usage, I am pretty sure I will not report a 123,456,789-byte file as 117.7 MB ever again: 123.5 MB seems much simpler, more natural, and more correct.
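To make the two conventions concrete, here is a small sketch of a size formatter. `format_size` is a hypothetical helper written for this illustration, not something taken from any OS or tool mentioned above; it prints one decimal place with either SI (base 1000) or IEC (base 1024) units.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Formats a byte count with one decimal place, using either SI decimal
// units (base 1000) or IEC binary units (base 1024).
std::string format_size(std::uint64_t bytes, bool binary)
{
    static const char* const si_units[]  = {"B", "kB", "MB", "GB", "TB"};
    static const char* const iec_units[] = {"B", "KiB", "MiB", "GiB", "TiB"};
    const double base = binary ? 1024.0 : 1000.0;
    const char* const* units = binary ? iec_units : si_units;

    // Divide by the base until the value fits the largest suitable unit.
    double value = static_cast<double>(bytes);
    int index = 0;
    while (value >= base && index < 4) {
        value /= base;
        ++index;
    }
    assert(index >= 0 && index <= 4);

    char buffer[32];
    if (index == 0) {
        // Plain byte counts are printed without a fractional part.
        std::snprintf(buffer, sizeof buffer, "%llu %s",
                      static_cast<unsigned long long>(bytes), units[0]);
    } else {
        std::snprintf(buffer, sizeof buffer, "%.1f %s", value, units[index]);
    }
    return buffer;
}
```

With this sketch, `format_size(123456789, false)` yields "123.5 MB", while `format_size(123456789, true)` yields "117.7 MiB", which makes it obvious which convention a figure belongs to.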

What will be your choice?

P.S. For more in-depth coverage of this topic, check out the Wikipedia article binary prefix.

P.P.S. By the way, did you notice that I had a number inconsistency in the very first sentence? If not, it is evidence that we can get rid of ‘he or she’. For more details, check out On the Use of She as a Generic Pronoun.

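The Right to Read

By Richard Stallman

This article appeared in the February 1997 issue of Communications of the ACM (Volume 40, Number 2).


(From The Road to Tycho, a collection of articles about the forerunners of the Lunarian Revolution, published in Luna City in 2096.)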

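For Dan Halbert, the road to Tycho began in college, when Lissa Lenz asked to borrow his computer. Hers had broken down, and unless she could borrow another, she would fail her midterm project. There was no one she dared ask, except Dan.

This put Dan in a dilemma. He had to help her, but if he lent her his computer, she might read his books. The thought alone shocked him, quite apart from the fact that it was actually a crime. You could go to prison for many years for letting someone else read your books! Like everyone else, he had been taught since elementary school that sharing books was nasty and wrong, something that only pirates would do.

Moreover, the SPA (short for the Software Protection Authority) would very likely catch him. In his software class, Dan had learned that every book had a copyright monitor that reported when and where it was read, and by whom, to Central Licensing. (They used this information to catch reading pirates, but also to sell personal interest profiles to retailers.) The next time his computer was networked, Central Licensing would find out what he had done. He, as the computer's owner, would receive the harshest punishment, for not taking pains to prevent the crime.

Of course, Lissa did not necessarily intend to read his books. She might want the computer only to finish her midterm. But Dan knew that she came from a middle-class family and could hardly afford the tuition, let alone the reading fees. Reading his books might be the only way she could graduate. He understood this situation; he himself had had to borrow to pay for the research papers he read. (Ten percent of those fees went to the authors of the papers. Since Dan aimed for an academic career, he could hope that his own research papers, if frequently cited, would bring in enough to repay the loan.)

Later on, Dan would learn that there had been a time when anyone could go to a library and read journal articles, and even whole books, for free. There had been independent scholars who read thousands of pages without needing permission from a government library. But from the 1990s on, both commercial and nonprofit journal publishers had begun charging for access. By 2047, few people remembered that there had once been libraries offering the general public access to scholarly literature.

Of course, there were ways to get around the SPA and Central Licensing. They were all illegal. Dan had a classmate in software class, Frank Martucci, who had obtained a debugging tool by illicit means and used it to skip over the copyright monitor code when reading books. But he had told too many friends about it, and eventually someone turned him in to the SPA for a reward (students deep in debt were easily tempted into betrayal). In 2047, Frank was in prison, not for pirate reading, but for possessing a debugger.

Later, Dan would also learn that there had been a time when anyone could have debugging tools. Some were even free, available on CD or for download over the net. But as ordinary users started using them to bypass copyright monitors, a judge eventually ruled that this had become their principal use in actual practice. This meant that debuggers were illegal; the developers of debuggers were sent to prison.

Programmers still needed debugging tools, of course. In 2047, debugger vendors sold numbered copies only, and only to officially licensed and bonded programmers. The debugger Dan used in software class was kept behind a special firewall so that it could be used only for class exercises.

Another possible way to bypass the copyright monitors was to install a modified system kernel. Eventually, Dan would find out about the free kernels, even entire free operating systems, that had existed since around the turn of the century. But not only were they illegal, like debuggers; you could not install one even if you had one, unless you knew your computer's root password. And neither the FBI nor Microsoft Support would tell you that.

Dan concluded that he could not simply lend Lissa his computer. But he could not refuse to help her either, because he loved her. Every chance to talk with her filled him with delight. And the fact that she had turned to him for help meant that she loved him too.

Dan resolved the dilemma by doing something unthinkable: he not only lent Lissa the computer, but also told her his password. That way, when Lissa read his books, Central Licensing would think he was the one reading them. It was still a crime, but the SPA would not find out automatically. They would find out only if Lissa reported him.

Of course, if the school ever found out that he had given Lissa his password, it would be the end for both of them, regardless of what she had used it for. School policy was that any interference with the monitoring of students' computers was grounds for disciplinary action. It did not matter whether you had actually done anything harmful; making it hard for the administrators to check on you was already the offence. They assumed it meant you were doing something forbidden, and they did not need to know what it was.

Students were not usually expelled for this, at least not directly. Instead, they were banned from the school computer systems, and would inevitably fail all their classes.

Later, Dan would also learn that this kind of university policy had started only in the 1980s, when university students began using computers in large numbers. Before then, universities had taken a different approach to student discipline: they punished activities that were actually harmful, not those that merely raised suspicion.

Lissa did not report Dan to the SPA. His decision to help her led to their marriage, and also led them to question what they had been taught about piracy as children. The couple began reading about the history of copyright, about the Soviet Union and its restrictions on copying, and even the original United States Constitution. They moved to Luna City, where they found others who had escaped the long arm of the SPA. When the Tycho Uprising began in 2062, the universal right to read soon became one of its central aims.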

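Author's Note

This note was updated in 2007.

The right to read is a battle still being fought today. Although it may take 50 years for our present way of life to be forgotten, most of the specific laws and practices described above have already been proposed, and many have been enacted into law in the US and elsewhere. In the US, the 1998 Digital Millennium Copyright Act (DMCA) established the legal basis for restricting the reading and lending of computerized books (and other works). The European Union imposed similar restrictions in a 2001 copyright directive. In France, under the law on copyright and related rights in the information society (DADVSI) adopted in 2006, mere possession of DeCSS, the free program for decrypting video on a DVD, is a crime.

In 2001, Senator Hollings, with funding from Disney, proposed a bill called the SSSCA that would require every new computer to have mandatory copy-restriction facilities that the user cannot bypass. Following the Clipper chip and similar US government key-escrow proposals, this shows a long-term trend: computer systems are increasingly set up to give control to third parties rather than to the people actually using them. The SSSCA was later renamed the CBDTPA (hard to pronounce), which people deliberately glossed as the ‘Consume But Don't Try Programming Act’.

The Republicans took control of the US Senate shortly thereafter. They are less closely tied to Hollywood than the Democrats, so they did not press these proposals. Now that the Democrats are back in control, the danger is once again higher.

In 2001 the US began attempting to use the proposed Free Trade Area of the Americas (FTAA) treaty to impose the same rules on all the countries of the Western Hemisphere. The FTAA is one of the so-called ‘free trade’ treaties, which are actually designed to give business increased power over democratic governments; imposing laws like the DMCA is typical of this spirit. President Lula of Brazil rejected the DMCA requirement and others like it, which effectively killed the FTAA.

Since then, the US has imposed similar requirements on countries such as Australia and Mexico through bilateral ‘free trade’ agreements, and on countries such as Costa Rica through the Central American Free Trade Agreement. Ecuador's President Correa refused to sign a ‘free trade’ agreement, but Ecuador had adopted something like the DMCA in 2003. Ecuador's new constitution may provide an opportunity to get rid of it.

One of the ideas in the story did not come up in reality until 2002: the idea that the FBI and Microsoft will hold the root passwords for your personal computers, and not let you have them.

The proponents of this scheme gave it names such as ‘trusted computing’ and ‘Palladium’. We call it ‘treacherous computing’, because its effect is to make your computer obey other companies rather than you. In 2007, it was implemented as part of Windows Vista; we expect Apple to do something similar. In this scheme, the manufacturer keeps the secret code, but the FBI would have little trouble getting it.

What Microsoft keeps is not a password in the traditional sense; no one ever types it on a terminal. Rather, it is a signature and encryption key that corresponds to a second key stored in your computer. This gives Microsoft, and potentially any web sites that cooperate with Microsoft, ultimate control over the user's own computer.

Vista gives Microsoft additional powers as well. For instance, Microsoft can forcibly install upgrades, and it can order all machines running Vista to refuse to run a certain device driver. The main purpose of Vista's many restrictions is to impose DRM that users cannot overcome.

The SPA, which actually stands for Software Publishers Association, has been replaced in this police-like role by the BSA (Business Software Alliance). Today it is not an official police force; unofficially, it acts very much like one. It invites people to inform on their coworkers and friends, using methods reminiscent of the former Soviet Union. A BSA terror campaign in Argentina in 2001 made veiled threats that people sharing software would be raped.

When this story was first written, the SPA was threatening small Internet service providers (ISPs), demanding that they permit the SPA to monitor all users. Most ISPs surrendered when threatened, because they could not afford to fight back in court. At least one ISP, Community ConneXion in Oakland, California, refused the demand and was actually sued. The SPA later dropped the suit, but it obtained the DMCA, which gave it the power it sought.

The university security policies described above are not imaginary either. For example, a computer at one Chicago-area university displays this message when you log in:

This system is for the use of authorized users only. Individuals using this computer system without authority or in excess of their authority are subject to having all of their activities on this system monitored and recorded by system personnel. In the course of monitoring individuals improperly using this system, or in the course of system maintenance, the activities of authorized users may also be monitored. Anyone using this system expressly consents to such monitoring and is advised that if such monitoring reveals possible evidence of illegal activity or violation of University regulations, system personnel may provide the evidence of such monitoring to University authorities and/or law enforcement officials.

This is an interesting approach to the Fourth Amendment: pressure almost everyone to agree, in advance, to waive their rights under it.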

Translator: Wu Yongwei

Original: https://www.gnu.org/philosophy/right-to-read.en.html

Note: This is an article I translated quite a few years ago. Its intended usage has ceased to exist, and I am sharing it online. Recent changes at the English site are not reflected in this translation.

This work is free to share under a Creative Commons Attribution-ShareAlike 4.0 Licence.

On the Use of She as a Generic Pronoun

When reading the August 2017 issue of Communications of the ACM, I was continually distracted by the use of she as a generic pronoun:

Instead of a field engineer constantly traveling between locations, she could troubleshoot machinery and refine product designs in real time . . .

There were times when one person had to be in charge while she captured the organization of the emerging article . . .

. . . we can let the user specify how much precision she wants . . .

A mathematician using “brute force” is a kind of barbaric monster, is she not?

I am not sure whether this is just my personal problem, but I find this usage obtrusive and annoying. I was reading something supposed to be objective and scientific, but the images of women kept surfacing. The last case was especially so, as I could not help envisioning a female mathematician (er, how many female mathematicians have there been?) who was also a barbaric monster, oops, how bad it was!

I dug around for a while for related resources. Before long, I realized one thing: my view is at least partly shaped by my education, which taught me that he be used as the third-person singular pronoun when the gender is unknown, for both English and Chinese. My unscientific survey shows that while many of my female friends are uncomfortable with either he or she used generically, most Chinese female friends actually prefer he to she! According to an online discussion, at least some peoples in Continental Europe still use the masculine pronoun when the gender is unknown, say, hij in Dutch and il/ils in French.[1] I think the French example is quite interesting to Chinese speakers, as neither French nor Chinese has a gender-neutral third-person plural pronoun: the generic forms ils and 他们 are actually masculine forms. Unlike the English they, we never had a nice and simple way to escape the problem.

Talking about they, one fact during the search surprised me. My favourite English author, Jane Austen, apparently preferred they/their in her novels.[2] Examples (emphasis is mine):

You wanted me, I know, to say ‘Yes,’ that you might have the pleasure of despising my taste; but I always delight in overthrowing those kind of schemes, and cheating a person of their premeditated contempt.

To be sure, you knew no actual good of me—but nobody thinks of that when they fall in love.

Digging deeper, I found that they has been used after words like each, everybody, nobody, etc. since the Middle Ages. The entries everybody and their in the Oxford English Dictionary are nearly a demonstration of such usages, with a note in the latter entry that reads ‘Not favoured by grammarians’.[3] Professor Steven Pinker also argues that using they/their/them after everyone is not only correct, but logical as well.[4] Oops to the prescriptivist grammarians and my English education!

By chance, I came across an old article by Douglas R. Hofstadter,[5] author of the famous book Gödel, Escher, Bach: An Eternal Golden Braid (also known as GEB). It is highly satirical, and it attacks most points I have for supporting the use of man and he (go read it; it is highly recommended even though I do not fully agree). It influenced my thinking, even though it ignored the etymology of man. The Oxford Dictionary of English has this usage note:[6]

Traditionally the word man has been used to refer not only to adult males but also to human beings in general, regardless of sex. There is a historical explanation for this: in Old English the principal sense of man was ‘a human being’, and the words wer and wif were used to refer specifically to ‘a male person’ and ‘a female person’ respectively. Subsequently, man replaced wer as the normal term for ‘a male person’, but at the same time the older sense ‘a human being’ remained in use. In the second half of the twentieth century the generic use of man to refer to ‘human beings in general’ (as in ‘reptiles were here long before man appeared on the earth’) became problematic; the use is now often regarded as sexist or at best old-fashioned.

Etymology is not a good representation of word meaning, but I want to point out that Hofstadter committed a logical fallacy in comparing man/woman with white/black. Man did include woman at one point in time; one probably cannot say the same for white and black.

This said, the war for continued use of -man is already lost. Once aware of this issue, I do not think I want to use words like policeman again when the gender is unknown. I still do not think words like mankind, manhole, actress, or mother tongue are bad.[7] Society and culture are probably a much bigger headache for women facing inequalities. . . .[8]

I started being angry, but ended up more understanding. And I also reached a different conclusion than I had expected. It is apparent that somebody will be offended, whether I use he, she, he or she, or they after a noun of unknown gender. I think offending grammarians would now probably be my default choice.

P.S. I have also found Professor Ellen Spertus’s article ‘Why are There so Few Female Computer Scientists?’ worth reading.[9] Recommended.


  1. StackExchange discussion: Is using “he” for a gender-neutral third-person correct? Retrieved on 21 October 2017. 
  2. Henry Churchyard: Singular “their” in Jane Austen and elsewhere: Anti-pedantry page. 1999. Internet Archive. 
  3. Oxford English Dictionary. Oxford University Press, 2nd edition, 1989. 
  4. Steven Pinker: On the English singular “their” construction—from The Language Instinct. 1994. Internet Archive. 
  5. Douglas R. Hofstadter: A Person Paper on Purity in Language. 1985. Internet Archive. 
  6. Oxford Dictionary of English. Oxford University Press, macOS built-in edition, 2016. This is different from the famous OED.
  7. These words are already banned in some places. See entry sexist language in R. W. Burchfield: Fowler’s Modern English Usage. Oxford University Press, revised 3rd edition, 2004. 
  8. Henry Etzkowitz et al.: Barriers to Women in Academic Science and Engineering. 1994. Internet Archive. 
  9. Ellen Spertus: Why are There so Few Female Computer Scientists? 1991. Internet Archive. 

Performance of My Line Readers

After I wrote the article about Python yield and C++ Coroutines, I felt that I needed to test the performance of istream_line_reader. The immediate result was both good and bad: good in that there was no actual difference between the straightforward std::getline and my istream_line_reader (as anticipated), and bad in that neither version performed well (a surprise to me). I vaguely remembered that sync_with_stdio(false) might affect the performance, so I also tested calling this function at the beginning. However, it did not seem to matter. By the way, my compiler of choice lately has been Clang (and I use a Mac).

Seeing that istream_line_reader had a performance problem, I tried other approaches. One thing I tried was using the traditional C I/O functions. I wrote another file_line_reader, which used either fgets or fread to read the data, depending on what the delimiter is. (fgets could only use ‘\n’ as the delimiter, but it performed better than fread, for I could fgets into the final line buffer, but had to fread into a temporary buffer first.) I also added a switch on whether to strip the delimiter, something not possible with the getline function. The result was a more than 10x performance improvement (from about 28 MB/s to 430 MB/s). I was happy, and presented this on the last slide of my presentation on C++ and Functional Programming in the 2016 C++ and System Software Summit (China).

Until C++17, modifying the character array accessed through string::data() has undefined behaviour. To be on the safe side, I implemented a self-expanding character buffer on my own, which complicated the implementation a little bit. It also made the interface slightly different from istream_line_reader, which can be demonstrated in the following code snippets.

Iteration with istream_line_reader:

for (auto& line : istream_line_reader(cin)) {
    puts(line.c_str());
}

Iteration with file_line_reader:

for (auto& line : file_line_reader(stdin)) {
    puts(line);
}

I.e. each iteration with file_line_reader returns a char* instead of a string. This should be OK, as a raw character pointer is often enough. One can always construct a string from char* easily, anyway.
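To illustrate the idea of reading with fgets directly into a self-expanding buffer, here is a minimal sketch. It is not the Nvwa implementation: `read_line` is a hypothetical function, it uses a std::vector<char> as the growing buffer for brevity, and it assumes text input without embedded null characters.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <vector>

// Reads one '\n'-terminated line from fp into buf, doubling buf whenever
// the line does not fit. Returns false when nothing could be read (end of
// file or error). If strip is true, the trailing '\n' is removed.
// On success, buf.data() is a null-terminated C string holding the line.
bool read_line(std::FILE* fp, std::vector<char>& buf, bool strip)
{
    assert(fp != nullptr);
    if (buf.size() < 16)
        buf.resize(16);

    std::size_t len = 0;
    for (;;) {
        // fgets writes straight into the final buffer, after what we have.
        if (!std::fgets(buf.data() + len,
                        static_cast<int>(buf.size() - len), fp)) {
            if (len == 0)
                return false;       // nothing read at all
            break;                  // EOF right after a full buffer
        }
        std::size_t chunk = std::strlen(buf.data() + len);
        len += chunk;
        if (chunk == 0 || buf[len - 1] == '\n')
            break;                  // complete line
        if (len + 1 < buf.size())
            break;                  // last line without a final '\n'
        buf.resize(buf.size() * 2); // buffer full: grow and continue
    }
    if (strip && len > 0 && buf[len - 1] == '\n')
        buf[--len] = '\0';
    return true;
}
```

A caller would loop with `while (read_line(fp, buf, true)) puts(buf.data());`, which mirrors the char*-based iteration shown above.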


After the presentation, I turned to implementing a small enhancement: iterating over the lines with mmap. This proved interesting work. Not only did it improve the line reading performance, but the code was simplified as well. As I could access the file content directly with a pointer, I was able to copy the lines to a string simply with string::assign. As I used string again, there was no need to define a custom copy constructor, copy assignment operator, move constructor, and move assignment operator either. The performance was, of course, also good: the throughput rate reached 650 MB/s, a 50% improvement! The only negative side was that it could not work on stdin, so testing it required more lines. Apart from that, I was quite satisfied. And I had three different line readers that could take an istream&, FILE*, or file descriptor as the input source. So all situations were dealt with. Not bad!
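The mmap approach can be sketched roughly as follows. This is an illustration under stated assumptions, not the code of mmap_line_reader: `read_lines_mmap` is a hypothetical function, it is POSIX-only, it collects all lines into a vector instead of iterating lazily, and it does not release the mapping if an allocation throws.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <string>
#include <vector>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Maps the whole file read-only and returns its lines, delimiters kept.
std::vector<std::string> read_lines_mmap(const char* path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        throw std::runtime_error("cannot open file");
    struct stat st;
    if (fstat(fd, &st) != 0) {
        close(fd);
        throw std::runtime_error("cannot stat file");
    }
    assert(st.st_size >= 0);

    std::vector<std::string> lines;
    const std::size_t size = static_cast<std::size_t>(st.st_size);
    if (size == 0) {                // mmap rejects zero-length mappings
        close(fd);
        return lines;
    }
    void* mapped = mmap(nullptr, size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                      // the mapping stays valid after close
    if (mapped == MAP_FAILED)
        throw std::runtime_error("cannot map file");

    const char* pos = static_cast<const char*>(mapped);
    const char* const end = pos + size;
    std::string line;
    while (pos != end) {
        // The file content is plain memory: find the delimiter, then
        // copy the whole line with a single string::assign call.
        const void* hit =
            std::memchr(pos, '\n', static_cast<std::size_t>(end - pos));
        const char* next = hit ? static_cast<const char*>(hit) + 1 : end;
        line.assign(pos, static_cast<std::size_t>(next - pos));
        lines.push_back(line);
        pos = next;
    }
    munmap(mapped, size);
    return lines;
}
```

Because the function needs a path (or descriptor) that can be mapped, it cannot serve a pipe such as stdin, which matches the limitation noted above.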

One thing of note about the implementation. I tried copying (a character at a time) while searching, before adopting the current method of searching first before assigning to the string. The latter proved faster when dealing with long lines. I can see two reasons:

  1. Strings are normally (and required to be since C++11) null-terminated, so copying one character at a time has a big overhead of zeroing the next byte. I confirmed the case from the libc++ source code of Clang.
  2. Assignment can use memcpy or memmove internally, which normally has a fast platform-specific implementation. In the case of string::assign(const char*, size_t), I verified that libc++ used memmove indeed.
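The two strategies compared above can be put side by side in a short sketch. Both function names are hypothetical, and the code only illustrates the difference in approach; it makes no claim about the exact timings on any platform.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Strategy 1: copy one character at a time while searching for the
// delimiter. Every push_back must also keep the string null-terminated.
// Returns the position just past the extracted line.
const char* copy_while_searching(const char* pos, const char* end,
                                 char delim, std::string& line)
{
    assert(pos <= end);
    line.clear();
    while (pos != end) {
        char ch = *pos++;
        line.push_back(ch);
        if (ch == delim)
            break;
    }
    return pos;
}

// Strategy 2: search for the delimiter first, then copy the whole line
// with one assign call, which can use a bulk memory copy internally.
// Returns the position just past the extracted line.
const char* search_then_assign(const char* pos, const char* end,
                               char delim, std::string& line)
{
    assert(pos <= end);
    const std::size_t remaining = static_cast<std::size_t>(end - pos);
    const void* hit = std::memchr(pos, delim, remaining);
    const char* next = hit ? static_cast<const char*>(hit) + 1 : end;
    line.assign(pos, static_cast<std::size_t>(next - pos));
    return next;
}
```

Both functions produce the same line and the same end position; the difference is only in how many small writes the string has to perform along the way.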

If you are interested, this is the assembly code I finally traced into on my Mac (comments are my analysis; you may need to scroll horizontally to see them all):

libsystem_c.dylib`memcpy$VARIANT$sse42:
   0x7fff9291fcbd:  pushq  %rbp
   0x7fff9291fcbe:  movq   %rsp, %rbp
   0x7fff9291fcc1:  movq   %rdi, %r11           ; save dest
   0x7fff9291fcc4:  movq   %rdi, %rax
   0x7fff9291fcc7:  subq   %rsi, %rax           ; dest - src
   0x7fff9291fcca:  cmpq   %rdx, %rax
   0x7fff9291fccd:  jb     0x7fff9291fd04       ; dest in (src, src + len)?
   ; Entry condition: dest <= src or dest >= src + len; copy starts from front
   0x7fff9291fccf:  cmpq   $80, %rdx
   0x7fff9291fcd3:  ja     0x7fff9291fd09       ; len > 80?
   ; Entry condition: len <= 80
   0x7fff9291fcd5:  movl   %edx, %ecx
   0x7fff9291fcd7:  shrl   $2, %ecx             ; len / 4
   0x7fff9291fcda:  je     0x7fff9291fcec       ; len < 4?
   0x7fff9291fcdc:  movl   (%rsi), %eax         ; 4-byte read
   0x7fff9291fcde:  addq   $4, %rsi             ; src <- src + 4
   0x7fff9291fce2:  movl   %eax, (%rdi)         ; 4-byte write
   0x7fff9291fce4:  addq   $4, %rdi             ; dest <- dest + 4
   0x7fff9291fce8:  decl   %ecx
   0x7fff9291fcea:  jne    0x7fff9291fcdc       ; more 4-byte blocks?
   ; Entry condition: len < 4
   0x7fff9291fcec:  andl   $3, %edx
   0x7fff9291fcef:  je     0x7fff9291fcff       ; len == 0?
   0x7fff9291fcf1:  movb   (%rsi), %al          ; 1-byte read
   0x7fff9291fcf3:  incq   %rsi                 ; src <- src + 1
   0x7fff9291fcf6:  movb   %al, (%rdi)          ; 1-byte write
   0x7fff9291fcf8:  incq   %rdi                 ; dest <- dest + 1
   0x7fff9291fcfb:  decl   %edx
   0x7fff9291fcfd:  jne    0x7fff9291fcf1       ; more bytes?
   0x7fff9291fcff:  movq   %r11, %rax           ; restore dest
   0x7fff9291fd02:  popq   %rbp
   0x7fff9291fd03:  ret
   0x7fff9291fd04:  jmpq   0x7fff9291fdb9
   ; Entry condition: len > 80
   0x7fff9291fd09:  movl   %edi, %ecx
   0x7fff9291fd0b:  negl   %ecx
   0x7fff9291fd0d:  andl   $15, %ecx            ; 16 - dest % 16
   0x7fff9291fd10:  je     0x7fff9291fd22       ; dest 16-byte aligned?
   0x7fff9291fd12:  subl   %ecx, %edx           ; adjust len
   0x7fff9291fd14:  movb   (%rsi), %al          ; one-byte read
   0x7fff9291fd16:  incq   %rsi                 ; src <- src + 1
   0x7fff9291fd19:  movb   %al, (%rdi)          ; one-byte write
   0x7fff9291fd1b:  incq   %rdi                 ; dest <- dest + 1
   0x7fff9291fd1e:  decl   %ecx
   0x7fff9291fd20:  jne    0x7fff9291fd14       ; until dest is aligned
   ; Entry condition: dest is 16-byte aligned
   0x7fff9291fd22:  movq   %rdx, %rcx           ; len
   0x7fff9291fd25:  andl   $63, %edx            ; len % 64
   0x7fff9291fd28:  andq   $-64, %rcx           ; len <- align64(len)
   0x7fff9291fd2c:  addq   %rcx, %rsi           ; src <- src + len
   0x7fff9291fd2f:  addq   %rcx, %rdi           ; dest <- dest + len
   0x7fff9291fd32:  negq   %rcx                 ; len <- -len
   0x7fff9291fd35:  testl  $15, %esi
   0x7fff9291fd3b:  jne    0x7fff9291fd80       ; src not 16-byte aligned?
   0x7fff9291fd3d:  jmp    0x7fff9291fd40
   0x7fff9291fd3f:  nop
   ; Entry condition: both src and dest are 16-byte aligned
   0x7fff9291fd40:  movdqa (%rsi,%rcx), %xmm0   ; aligned 16-byte read
   0x7fff9291fd45:  movdqa 16(%rsi,%rcx), %xmm1
   0x7fff9291fd4b:  movdqa 32(%rsi,%rcx), %xmm2
   0x7fff9291fd51:  movdqa 48(%rsi,%rcx), %xmm3
   0x7fff9291fd57:  movdqa %xmm0, (%rdi,%rcx)   ; aligned 16-byte write
   0x7fff9291fd5c:  movdqa %xmm1, 16(%rdi,%rcx)
   0x7fff9291fd62:  movdqa %xmm2, 32(%rdi,%rcx)
   0x7fff9291fd68:  movdqa %xmm3, 48(%rdi,%rcx)
   0x7fff9291fd6e:  addq   $64, %rcx
   0x7fff9291fd72:  jne    0x7fff9291fd40       ; more 64-byte blocks?
   0x7fff9291fd74:  jmpq   0x7fff9291fcd5
   0x7fff9291fd79:  nopl   (%rax)               ; 7-byte nop
   ; Entry condition: src is NOT 16-byte aligned but dest is
   0x7fff9291fd80:  movdqu (%rsi,%rcx), %xmm0   ; unaligned 16-byte read
   0x7fff9291fd85:  movdqu 16(%rsi,%rcx), %xmm1
   0x7fff9291fd8b:  movdqu 32(%rsi,%rcx), %xmm2
   0x7fff9291fd91:  movdqu 48(%rsi,%rcx), %xmm3
   0x7fff9291fd97:  movdqa %xmm0, (%rdi,%rcx)   ; aligned 16-byte write
   0x7fff9291fd9c:  movdqa %xmm1, 16(%rdi,%rcx)
   0x7fff9291fda2:  movdqa %xmm2, 32(%rdi,%rcx)
   0x7fff9291fda8:  movdqa %xmm3, 48(%rdi,%rcx)
   0x7fff9291fdae:  addq   $64, %rcx
   0x7fff9291fdb2:  jne    0x7fff9291fd80       ; more 64-byte blocks?
   0x7fff9291fdb4:  jmpq   0x7fff9291fcd5
   ; Entry condition: dest > src and dest < src + len; copy starts from back
   0x7fff9291fdb9:  addq   %rdx, %rsi           ; src <- src + len
   0x7fff9291fdbc:  addq   %rdx, %rdi           ; dest <- dest + len
   0x7fff9291fdbf:  cmpq   $80, %rdx
   0x7fff9291fdc3:  ja     0x7fff9291fdf6       ; len > 80?
   ; Entry condition: len <= 80
   0x7fff9291fdc5:  movl   %edx, %ecx
   0x7fff9291fdc7:  shrl   $3, %ecx             ; len / 8
   0x7fff9291fdca:  je     0x7fff9291fdde       ; len < 8?
   ; Entry condition: len >= 8
   0x7fff9291fdcc:  subq   $8, %rsi             ; src <- src - 8
   0x7fff9291fdd0:  movq   (%rsi), %rax         ; 8-byte read
   0x7fff9291fdd3:  subq   $8, %rdi             ; dest <- dest - 8
   0x7fff9291fdd7:  movq   %rax, (%rdi)         ; 8-byte write
   0x7fff9291fdda:  decl   %ecx
   0x7fff9291fddc:  jne    0x7fff9291fdcc       ; until len < 8
   ; Entry condition: len < 8
   0x7fff9291fdde:  andl   $7, %edx
   0x7fff9291fde1:  je     0x7fff9291fdf1       ; len == 0?
   0x7fff9291fde3:  decq   %rsi                 ; src <- src - 1
   0x7fff9291fde6:  movb   (%rsi), %al          ; 1-byte read
   0x7fff9291fde8:  decq   %rdi                 ; dest <- dest - 1
   0x7fff9291fdeb:  movb   %al, (%rdi)          ; 1-byte write
   0x7fff9291fded:  decl   %edx
   0x7fff9291fdef:  jne    0x7fff9291fde3       ; more bytes?
   0x7fff9291fdf1:  movq   %r11, %rax           ; restore dest
   0x7fff9291fdf4:  popq   %rbp
   0x7fff9291fdf5:  ret
   ; Entry condition: len > 80
   0x7fff9291fdf6:  movl   %edi, %ecx
   0x7fff9291fdf8:  andl   $15, %ecx
   0x7fff9291fdfb:  je     0x7fff9291fe0e       ; dest 16-byte aligned?
   0x7fff9291fdfd:  subq   %rcx, %rdx           ; adjust len
   0x7fff9291fe00:  decq   %rsi                 ; src <- src - 1
   0x7fff9291fe03:  movb   (%rsi), %al          ; one-byte read
   0x7fff9291fe05:  decq   %rdi                 ; dest <- dest - 1
   0x7fff9291fe08:  movb   %al, (%rdi)          ; one-byte write
   0x7fff9291fe0a:  decl   %ecx
   0x7fff9291fe0c:  jne    0x7fff9291fe00       ; until dest is aligned
   ; Entry condition: dest is 16-byte aligned
   0x7fff9291fe0e:  movq   %rdx, %rcx           ; len
   0x7fff9291fe11:  andl   $63, %edx            ; len % 64
   0x7fff9291fe14:  andq   $-64, %rcx           ; len <- align64(len)
   0x7fff9291fe18:  subq   %rcx, %rsi           ; src <- src - len
   0x7fff9291fe1b:  subq   %rcx, %rdi           ; dest <- dest - len
   0x7fff9291fe1e:  testl  $15, %esi
   0x7fff9291fe24:  jne    0x7fff9291fe61       ; src not 16-byte aligned?
   ; Entry condition: both src and dest are 16-byte aligned
   0x7fff9291fe26:  movdqa -16(%rsi,%rcx), %xmm0 ; aligned 16-byte read
   0x7fff9291fe2c:  movdqa -32(%rsi,%rcx), %xmm1
   0x7fff9291fe32:  movdqa -48(%rsi,%rcx), %xmm2
   0x7fff9291fe38:  movdqa -64(%rsi,%rcx), %xmm3
   0x7fff9291fe3e:  movdqa %xmm0, -16(%rdi,%rcx) ; aligned 16-byte write
   0x7fff9291fe44:  movdqa %xmm1, -32(%rdi,%rcx)
   0x7fff9291fe4a:  movdqa %xmm2, -48(%rdi,%rcx)
   0x7fff9291fe50:  movdqa %xmm3, -64(%rdi,%rcx)
   0x7fff9291fe56:  subq   $64, %rcx
   0x7fff9291fe5a:  jne    0x7fff9291fe26       ; more 64-byte blocks?
   0x7fff9291fe5c:  jmpq   0x7fff9291fdc5
   ; Entry condition: src is NOT 16-byte aligned but dest is
   0x7fff9291fe61:  movdqu -16(%rsi,%rcx), %xmm0 ; unaligned 16-byte read
   0x7fff9291fe67:  movdqu -32(%rsi,%rcx), %xmm1
   0x7fff9291fe6d:  movdqu -48(%rsi,%rcx), %xmm2
   0x7fff9291fe73:  movdqu -64(%rsi,%rcx), %xmm3
   0x7fff9291fe79:  movdqa %xmm0, -16(%rdi,%rcx) ; aligned 16-byte write
   0x7fff9291fe7f:  movdqa %xmm1, -32(%rdi,%rcx)
   0x7fff9291fe85:  movdqa %xmm2, -48(%rdi,%rcx)
   0x7fff9291fe8b:  movdqa %xmm3, -64(%rdi,%rcx)
   0x7fff9291fe91:  subq   $64, %rcx
   0x7fff9291fe95:  jne    0x7fff9291fe61       ; more 64-byte blocks?
   0x7fff9291fe97:  jmpq   0x7fff9291fdc5

I am happy that I can take advantage of such optimizations, but do not need to write such code on my own—there are so many different cases to deal with!


Of course, nothing is simple regarding performance. More tests revealed more facts that are interesting and/or surprising:

  • While libc++ (it is the library, but not the compiler, that matters here) seems to completely ignore sync_with_stdio, it makes a big difference in libstdc++. The same function call gets a more than 10x performance improvement when the istream_line_reader test program is compiled with GCC (which uses libstdc++), from ~28 MB/s to ~390 MB/s. It shows that I made a wrong assumption! Interestingly, reading from stdin (piped from the pv tool) is slightly faster than reading from a file on my Mac (when compiled with GCC).
  • On a CentOS 6.5 Linux system, sync_with_stdio(false) has a bigger performance win (~23 MB/s vs. ~800 MB/s). Reading from a file directly is even faster at 1100 MB/s. That totally beats my file_line_reader (~550 MB/s reading a file directly) and mmap_line_reader (~600 MB/s reading a file directly) on the same machine. I was stunned when first seeing this performance difference of nearly 40 times!

So, apart from the slight difference in versatility, the first and simplest form of my line readers is also the best on Linux, while the mmap-based version may be a better implementation on OS X—though your mileage may vary depending on the different combinations of OS versions, compilers, and hardware. Should I be happy, or sad?


You can find the implementation of istream_line_reader among my example code for the ‘C++ and Functional Programming’ presentation, and the implementations of file_line_reader and mmap_line_reader in the Nvwa repository. And the test code is as follows:

test_istream_line_reader.cpp:

#include <fstream>
#include <iostream>
#include <string>
#include <getopt.h>
#include <stdio.h>
#include <stdlib.h>
#include "istream_line_reader.h"

using namespace std;

int main(int argc, char* argv[])
{
    int optch;  // getopt returns int; a plain char may be unsigned
    while ( (optch = getopt(argc, argv, "s")) != -1) {
        switch (optch) {
        case 's':
            cin.sync_with_stdio(false);
            break;
        }
    }
    if (!(optind == argc || optind == argc - 1)) {
        fprintf(stderr,
                "Only one file name can be specified\n");
        exit(1);
    }

    istream* is = nullptr;
    ifstream ifs;
    if (optind == argc) {
        is = &cin;
    } else {
        ifs.open(argv[optind]);
        if (!ifs) {
            fprintf(stderr,
                    "Cannot open file '%s'\n",
                    argv[optind]);
            exit(1);
        }
        is = &ifs;
    }

    for (auto& line : istream_line_reader(*is)) {
        puts(line.c_str());
    }
}

test_file_line_reader.cpp:

#include <stdio.h>
#include <stdlib.h>
#include <nvwa/file_line_reader.h>

using nvwa::file_line_reader;

int main(int argc, char* argv[])
{
    FILE* fp = stdin;
    if (argc == 2) {
        fp = fopen(argv[1], "r");
        if (!fp) {
            fprintf(stderr,
                    "Cannot open file '%s'\n",
                    argv[1]);
            exit(1);
        }
    }

    file_line_reader
        reader(fp, '\n',
               file_line_reader::no_strip_delimiter);
    for (auto& line : reader) {
        fputs(line, stdout);
    }
}

test_mmap_line_reader.cpp:

#include <stdio.h>
#include <stdlib.h>
#include <stdexcept>
#include <nvwa/mmap_line_reader.h>

using nvwa::mmap_line_reader;

int main(int argc, char* argv[])
{
    if (argc != 2) {
        fprintf(stderr,
                "A file name shall be provided\n");
        exit(1);
    }

    try {
        mmap_line_reader
            reader(argv[1], '\n',
                   mmap_line_reader::no_strip_delimiter);

        for (auto& str : reader) {
            fputs(str.c_str(), stdout);
        }
    }
    catch (const std::runtime_error& e) {
        puts(e.what());
    }
}

Universal Force of LOVE

A purported letter from Albert Einstein to his daughter Lieserl has been circulating on WeChat Moments recently. It can be summarized as: ‘The universal force is LOVE.’ I was sceptical immediately: as a physics major, I could hardly imagine that our grand master could utter such nonsense (sorry to folks who happen to like the letter). He made mistakes, he could be sentimental, but it was beyond my imagination to assume that he had generated this kind of ‘chicken soup for the soul’.

I am slightly comforted to see my search results indicate that the letter did not originate in China. It could be found here and here. People have already been discussing it, and it was quite obvious to me that the letter was a fake. A conclusive article appeared on The Huffington Post web site. Katharine Rose discussed the letter, and mentioned that she could not find anything in the online Albert Einstein archives. She also got a response from Diana Kormos-Buchwald, director and editor of the Einstein Papers Project, who clearly stated:

This document is not by Einstein. The family letters donated to the Hebrew University—referred to in this rumor—were not given by Lieserl. They were given by Margot Einstein, who was Albert Einstein’s stepdaughter. Many of those letters were published in Volume 10 of The Collected Papers of Albert Einstein in 2006 and in subsequent volumes, in chronological order.

Interestingly, Katharine Rose thought the letter was ‘seemingly written by Albert Einstein’, and thought it was ‘a beautiful read, offering a universal message that speaks to the essence of the human condition and our incessant yearning to believe in love’s conquering force’. I definitely could not agree. Fortunately, she was rational enough to investigate further. She said it was most important that ‘we always remember and strive to seek the truth in all things’, that ‘we not shy away from asking questions and challenging notions’, and that ‘we remain curious’. On that, I could not agree more.

The same cannot be said about some other bloggers and commentators. People have used the fake letter to strengthen their faith. Even when challenged about the truthfulness of the letter, one blogger said:

The message is powerful and I believe it to be true. Whether Einstein wrote it or not, some Genius did and I would think that Genius would have shown him/herself by now to claim this letter as theirs.

Faith is stronger than reason. They simply ignored the fact that the message could not have circulated that much if no one had claimed it was written by Einstein. And I think it is naïve to suppose love can solve problems automatically (unlike Ms Rose, I could not concur with the sentiments). Hey, maybe I should agree that the author is a genius, not of writing, but of psychology. I even think the letter could be bait, considering that the author used the name Lieserl, who never grew up …

Anyway, it is amazing to see so many people enjoy the chicken soup:

In awe of Mr. Einstein’s brilliance which is just as relative today as it was when he wrote the letter. No question that his words will have the same wonderful ‘light’ decades in the future as well. Thanks for sharing, Sue. Light and love to you.

How wonderful, “God is love and love is God”. The great scientist concluded this! Love is a powerful force that unites and i [sic] think Love alone will bring Peace upon the planet.

Amen. Men of science will come to know what men of faith have always known.

I do not see that anybody contest the value of this massage [sic], because it is uncontestable.

Some people may read into this sentimentality, but I think that Einstein was onto something much more profound. I’m thinking that this universal force, when focused on loving others, is what can eventually overcome all obstacles in one’s own mind (soul) and others. How do we develop pure love? Worth contemplating.

I do not think I stand a chance of persuading them the other way.

I cannot help thinking about one famous quote attributed to Einstein, which is probably false, but more like what he might have said:

Two things are infinite, the universe and human stupidity, and I am not yet completely sure about the universe.

A Complaint of ODF’s Asian Language Support

I recently read news about better support for ODF from Google. The author then went on to complain that neither Google nor Microsoft makes ‘it easy to use ODF as part of a workflow’. This reminded me that maybe I should write down a long-standing complaint I have about ODF.

I have always loved open standards. However, there are not only open and proprietary standards, there are also good and bad standards. ODF looks pretty bad regarding Asian language support. It can be powerfully demonstrated by this image:

[Image: ODF Issue]

If you are interested in it, you can download the document yourself. It simply contains four lines:

  • The first line has a left quotation mark, the English word ‘Test’, and the corresponding Chinese word. It looks OK.
  • The second line is a duplication of the first line, with an additional colon added at the beginning. It immediately changes the font of the left quotation mark.
  • The third line is a duplication of the second line, with the Chinese word removed. Both quotation marks are now using the default Western font ‘Times New Roman’.
  • The fourth line is a duplication of the third line, with the leading colon removed. Weirdly enough, the left quotation mark now uses the Chinese font. (This may be related to my using the Chinese OpenOffice version or Chinese Windows OS.)

Isn’t it ridiculous that adding or removing a character can change how other characters are rendered? Still, I would not blog about it if it were only a bug in OpenOffice (actually, I filed three bug reports back in 2006, more ancient than I thought, and this bug remains unfixed). It actually seems to be a problem in the ODF standard. After extracting the content from the .ODT file (which is a zip file), I can shrink the content of the document to these XML lines (content.xml with irrelevant contents removed and the result reformatted):

<office:font-face-decls>
  <style:font-face
      style:name="Times New Roman"
      style:font-family-generic="roman"
      style:font-pitch="variable"/>
  <style:font-face
      style:name="宋体"
      style:font-family-generic="system"
      style:font-pitch="variable"/>
</office:font-face-decls>
<office:automatic-styles>
  <style:style
      style:name="P1"
      style:family="paragraph"
      style:parent-style-name="Standard">
    <style:text-properties
        fo:font-size="12pt" fo:language="en" fo:country="GB"
        style:language-asian="zh" style:country-asian="CN"/>
  </style:style>
</office:automatic-styles>
<office:body>
  <office:text>
    <text:p text:style-name="P1">“Test测试”</text:p>
    <text:p text:style-name="P1">:“Test测试”</text:p>
    <text:p text:style-name="P1">:“Test”</text:p>
    <text:p text:style-name="P1">“Test”</text:p>
  </office:text>
</office:body>
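
For anyone who wants to repeat the extraction, here is a minimal Python sketch of that step. It assumes only that the .odt file is a zip package containing content.xml, as described above; the function name odf_style_languages is my own invention for illustration and is not part of any ODF tool.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace URIs that content.xml uses for the style: and fo: prefixes.
NS = {
    "style": "urn:oasis:names:tc:opendocument:xmlns:style:1.0",
    "fo": "urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0",
}


def odf_style_languages(odt_file):
    """Map each style name to (Western language tag, Asian language tag).

    `odt_file` is a path or a binary file-like object for an .odt package.
    The helper name and the return shape are illustrative assumptions.
    """
    with zipfile.ZipFile(odt_file) as package:
        root = ET.fromstring(package.read("content.xml"))

    def attr(element, prefix, name):
        # ElementTree stores namespaced attributes as "{uri}name".
        return element.get("{%s}%s" % (NS[prefix], name), "")

    result = {}
    for style in root.iter("{%s}style" % NS["style"]):
        props = style.find("style:text-properties", NS)
        if props is None:
            continue
        western = "-".join(
            part for part in (attr(props, "fo", "language"),
                              attr(props, "fo", "country")) if part)
        asian = "-".join(
            part for part in (attr(props, "style", "language-asian"),
                              attr(props, "style", "country-asian")) if part)
        result[attr(style, "style", "name")] = (western, asian)
    return result
```

Calling odf_style_languages on the test document should report the P1 style with both an English and a Chinese language tag at the same time, which is exactly the pairing shown in the listing.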

The problem is that instead of specifying a single language for each piece of text, the style specifies both a ‘fo:language’ and a ‘style:language-asian’ for the whole paragraph, and the application has to guess which of the two applies to each character. The designer of this feature definitely did not think carefully about the fact that many symbols exist in both Asian and non-Asian contexts and can often be rendered differently!
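
One way to see that the quotation marks really do sit between the two worlds is the Unicode East Asian Width property, which Python exposes through unicodedata. The sketch below is only an illustration of that property; I am not claiming that OpenOffice consults it when choosing a font, and the helper name ambiguous_chars is made up for this example.

```python
import unicodedata


def ambiguous_chars(text):
    """Return the characters whose East Asian Width is 'A' (ambiguous).

    'A' marks characters that may be rendered in either a Western or an
    East Asian style depending on context.
    """
    return [ch for ch in text if unicodedata.east_asian_width(ch) == "A"]


if __name__ == "__main__":
    sample = ":“Test测试”"
    for ch in sample:
        # Print each character with its East Asian Width class.
        print(repr(ch), unicodedata.east_asian_width(ch))
    print("ambiguous:", ambiguous_chars(sample))
```

In the sample line, only the two curly quotation marks are classified as ambiguous; the colon and the Latin letters are narrow, and the Chinese characters are wide. A paragraph-level pair of languages gives the renderer no way to know which reading of the ambiguous characters the author intended.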

When I repeated the same process in Microsoft Word (on Windows), all text appeared correctly—Microsoft applications recognize which keyboard I use and which language it represents. Pasting as plain text introduced one error (as no language information is present). Even in that case, fixing the problem is easier. In OpenOffice I have to change the font manually, but in Microsoft Word I only need to specify the correct language (‘Office, this is English, not Chinese’). It is much more intuitive and natural.

I also analysed the XML in the resulting .DOCX file. Its styles.xml contained this:

<w:lang w:val="en-US" w:eastAsia="zh-CN" w:bidi="ar-SA"/>

So these are default languages. I had to use UK English and Traditional Chinese to force Word to specify the languages in the document explicitly. The embedded document.xml now contains content like the following:

<w:p>
  <w:r>
    <w:rPr>
      <w:rFonts w:eastAsia="PMingLiU" w:hint="eastAsia"/>
      <w:lang w:eastAsia="zh-TW"/>
    </w:rPr>
    <w:t>“</w:t>
  </w:r>
  <w:r>
    <w:rPr>
      <w:rFonts w:eastAsia="PMingLiU"/>
      <w:lang w:val="en-GB" w:eastAsia="zh-TW"/>
    </w:rPr>
    <w:t>Test</w:t>
  </w:r>
  <w:r>
    <w:rPr>
      <w:rFonts w:eastAsia="PMingLiU" w:hint="eastAsia"/>
      <w:lang w:eastAsia="zh-TW"/>
    </w:rPr>
    <w:t>測試”</w:t>
  </w:r>
</w:p>
...
<w:p>
  <w:r>
    <w:rPr>
      <w:rFonts w:eastAsia="PMingLiU"/>
      <w:lang w:val="en-GB" w:eastAsia="zh-TW"/>
    </w:rPr>
    <w:t>“Test”</w:t>
  </w:r>
</w:p>

We can argue the structure is somewhat similar (compare ‘w:val’ in <w:lang> with ‘fo:language’ and ‘fo:country’, and ‘w:eastAsia’ with ‘style:language-asian’ and ‘style:country-asian’), but the semantics are obviously different: the paragraph is split into runs, and text of different languages is not mixed together in one run. The English text has the language attribute <w:lang w:val="en-GB" w:eastAsia="zh-TW"/>, and the Chinese text has only <w:lang w:eastAsia="zh-TW"/>. It looks to me like a more robust approach to processing mixed text.
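
The per-run tagging can be checked mechanically. The following Python sketch walks the runs of the first paragraph quoted above and lists each run’s text with its language attributes. The SAMPLE string is the fragment from this post wrapped with a namespace declaration so that it parses on its own, and run_languages is a hypothetical helper name, not something provided by Word or by any library.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, bound to the w: prefix in document.xml.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# The first paragraph from the post, with the namespace declared on <w:p>.
SAMPLE = """<w:p xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:r>
    <w:rPr>
      <w:rFonts w:eastAsia="PMingLiU" w:hint="eastAsia"/>
      <w:lang w:eastAsia="zh-TW"/>
    </w:rPr>
    <w:t>“</w:t>
  </w:r>
  <w:r>
    <w:rPr>
      <w:rFonts w:eastAsia="PMingLiU"/>
      <w:lang w:val="en-GB" w:eastAsia="zh-TW"/>
    </w:rPr>
    <w:t>Test</w:t>
  </w:r>
  <w:r>
    <w:rPr>
      <w:rFonts w:eastAsia="PMingLiU" w:hint="eastAsia"/>
      <w:lang w:eastAsia="zh-TW"/>
    </w:rPr>
    <w:t>測試”</w:t>
  </w:r>
</w:p>"""


def run_languages(paragraph_xml):
    """Return one (text, w:val, w:eastAsia) tuple per run in the paragraph."""
    root = ET.fromstring(paragraph_xml)
    rows = []
    for run in root.iter("{%s}r" % W):
        text = "".join(t.text or "" for t in run.iter("{%s}t" % W))
        lang = run.find("{%s}rPr/{%s}lang" % (W, W))
        val = east_asia = None
        if lang is not None:
            val = lang.get("{%s}val" % W)
            east_asia = lang.get("{%s}eastAsia" % W)
        rows.append((text, val, east_asia))
    return rows


if __name__ == "__main__":
    for row in run_languages(SAMPLE):
        print(row)
```

Each quotation mark ends up in a run of its own or together with text of one language only, so its language is stated per run and does not have to be guessed from neighbouring characters.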

Although it might be true that Microsoft lobbied strongly to get OOXML approved as an international standard, I do not think ODF’s openness alone is enough to make people truly adopt it.