TONT 42253 & 42223: Low-tech usability testing; Why is address space allocation granularity 64K?

TONT 42253 is too short to stand on its own, so it is combined here with TONT 42223 into a single post.

TONT 42253 Low-tech usability testing

Original post: https://blogs.msdn.microsoft.com/oldnewthing/20031007-00/?p=42253

My pal Jason Moore discusses using paper prototypes as a fast way to get usability feedback. I found it interesting that going low-tech actually gets you better feedback, because people are more willing to criticize a paper model than running code. Another advantage of the paper model is that you can make changes on the fly: if during the session you get the idea, “Maybe if I did it this way,” you can grab a piece of paper, write on it, and insert it into the session instantly. Try doing that with running code.

TONT 42223 Why is address space allocation granularity 64K?

Original post: https://blogs.msdn.microsoft.com/oldnewthing/20031008-00/?p=42223

You may have wondered why VirtualAlloc allocates memory at 64K boundaries even though page granularity is 4K.
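To make the two granularities concrete, here is a minimal Python sketch of the rounding arithmetic involved. The constants are assumptions that match the 4K page size and 64K allocation granularity mentioned above (a real program would query them with GetSystemInfo), and the helper names are hypothetical, not part of any Windows API.

```python
PAGE_SIZE = 0x1000                 # assumed 4K page size
ALLOCATION_GRANULARITY = 0x10000   # assumed 64K allocation granularity


def reservation_base(requested_address):
    """Round a requested base address down to a 64K boundary."""
    return requested_address & ~(ALLOCATION_GRANULARITY - 1)


def region_size(requested_size):
    """Round a requested size up to a whole number of 4K pages."""
    return (requested_size + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1)


print(hex(reservation_base(0x1234F123)))  # 0x12340000
print(hex(region_size(0x1801)))           # 0x2000
```

The base address snaps to a 64K boundary while the size only snaps to a page, which is the mismatch the rest of this post explains.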

You have the Alpha AXP processor to thank for that.

On the Alpha AXP, there is no “load 32-bit integer” instruction for constants. To load a 32-bit integer constant such as an address, you actually load two 16-bit integers and combine them.

So if allocation granularity were finer than 64K, a DLL that got relocated in memory would require two fixups per relocatable address: one to the upper 16 bits and one to the lower 16 bits. And things get worse if the move changes a carry or borrow between the two halves. (For example, moving an address by 4K from 0x1234F000 to 0x12350000 forces both the low and high parts of the address to change. Even though the amount of motion was far less than 64K, it still affected the high part because of the carry.)
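The carry effect can be checked with a few lines of Python. This is only an illustrative sketch: the helper names are hypothetical, and it uses a plain unsigned split into two 16-bit halves rather than anything a real loader does.

```python
def halves(address):
    """Split a 32-bit address into (high 16 bits, low 16 bits)."""
    return (address >> 16) & 0xFFFF, address & 0xFFFF


def changed_halves(old_address, delta):
    """Report which 16-bit halves differ after moving by delta bytes."""
    old_high, old_low = halves(old_address)
    new_high, new_low = halves(old_address + delta)
    return old_high != new_high, old_low != new_low


# Moving by 4K across a 64K boundary: the carry changes the high half too.
print(changed_halves(0x1234F000, 0x1000))   # (True, True)
# Moving by a multiple of 64K: only the high half changes.
print(changed_halves(0x1234F000, 0x30000))  # (True, False)
```

When every move is a multiple of 64K, the low half never changes, so a single fixup to the high half is enough.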

But wait, there’s more.

The Alpha AXP actually combines two signed 16-bit integers to form a 32-bit integer. For example, to load the value 0x1234ABCD, you would first use the LDAH instruction to load the value 0x1235 into the high word of the destination register. Then you would use the LDA instruction to add the signed value -0x5433 to that register. (Since 0x5433 = 0x10000 - 0xABCD.) The result is then the desired value of 0x1234ABCD.

LDAH t1, 0x1235(zero) // t1 = 0x12350000
LDA t1, -0x5433(t1) // t1 = t1 - 0x5433 = 0x1234ABCD
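The split into two signed halves can be sketched in Python as follows. The function name is hypothetical, and the sketch does not check that the high part itself fits in a signed 16-bit field, so it is only meant for values comfortably below the 2GB boundary.

```python
def split_for_ldah_lda(value):
    """Return (high, low) with (high << 16) + low == value,
    where low is a signed 16-bit displacement."""
    low = value & 0xFFFF
    if low >= 0x8000:
        # Upper half of the 64K block: use a negative low part
        # and let the high part absorb the difference.
        low -= 0x10000
    high = (value - low) >> 16
    return high, low


high, low = split_for_ldah_lda(0x1234ABCD)
print(hex(high), hex(low))      # 0x1235 -0x5433
print(hex((high << 16) + low))  # 0x1234abcd
```

Values whose low 16 bits are below 0x8000 keep their natural high part; for example, 0x12341000 splits into 0x1234 and 0x1000.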

So if a relocation caused an address to move between the “lower half” of a 64K block and the “upper half”, additional fixing-up would have to be done to ensure that the arithmetic for the top half of the address was adjusted properly. Since compilers like to reorder instructions when optimizing, that LDAH instruction could be far, far away from its LDA, so the relocation record for the bottom half would have to have some way of finding the matching top half.

What’s more, the compiler is clever: if it needs to compute addresses for two variables that are in the same 64K region, it shares the LDAH instruction between them. If it were possible to relocate by a value that wasn’t a multiple of 64K, then the compiler would no longer be able to do this optimization, since after the relocation the two variables might no longer belong to the same 64K block.
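A rough Python sketch of why sharing breaks: two addresses can share an LDAH only if they need the same high part, and a relocation that is not a multiple of 64K can push just one of them across a boundary. The helper names and the two variable addresses are hypothetical, and the sketch uses the signed-low-half convention described earlier.

```python
def ldah_part(address):
    """High part that LDAH would load for this address (signed low half)."""
    low = address & 0xFFFF
    if low >= 0x8000:
        low -= 0x10000
    return (address - low) >> 16


def can_share_ldah(address_a, address_b, delta=0):
    """True if both addresses, moved by delta, need the same LDAH value."""
    return ldah_part(address_a + delta) == ldah_part(address_b + delta)


a, b = 0x12340100, 0x12347F00   # hypothetical addresses of two variables
print(can_share_ldah(a, b))           # True
print(can_share_ldah(a, b, 0x10000))  # True
print(can_share_ldah(a, b, 0x1000))   # False
```

Moving by exactly 64K shifts both high parts by one and leaves them equal; moving by 4K pushes only the second variable into the range that needs a negative low half.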

Forcing memory allocations at 64K granularity solves all these problems.

If you have been paying really close attention, you’d have seen that this also explains why there is a 64K “no man’s land” near the 2GB boundary. Consider the method for computing the value 0x7FFFABCD: since the lower 16 bits are in the upper half of the 64K range, the value needs to be computed by subtraction rather than addition. The naïve solution would be to use

LDAH t1, 0x8000(zero) // t1 = 0x80000000, right?
LDA t1, -0x5433(t1) // t1 = t1 - 0x5433 = 0x7FFFABCD, right?

Except that this doesn’t work. The Alpha AXP is a 64-bit processor, and 0x8000 does not fit in a 16-bit signed integer, so you have to use -0x8000, a negative number. What actually happens is

LDAH t1, -0x8000(zero) // t1 = 0xFFFFFFFF`80000000
LDA t1, -0x5433(t1) // t1 = t1 - 0x5433 = 0xFFFFFFFF`7FFFABCD

You need to add a third instruction to clear the high 32 bits. The clever trick for this is to add zero and tell the processor to treat the result as a 32-bit integer and sign-extend it to 64 bits.

ADDL t1, zero, t1 // t1 = t1 + 0, with L suffix
// L suffix means sign extend result from 32 bits to 64
// t1 = 0x00000000`7FFFABCD
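The three-instruction sequence can be simulated in Python to confirm the register values in the comments. This is a simplified model under stated assumptions: the functions below are hypothetical stand-ins that capture only the arithmetic described in this post (64-bit wraparound, and sign extension from 32 bits for the L suffix), not a faithful emulation of the processor.

```python
MASK64 = (1 << 64) - 1


def ldah(displacement, base=0):
    """base + displacement * 65536, kept as a 64-bit register value."""
    return (base + displacement * 0x10000) & MASK64


def lda(displacement, base):
    """base + displacement, kept as a 64-bit register value."""
    return (base + displacement) & MASK64


def addl(a, b):
    """32-bit add whose result is sign-extended to 64 bits."""
    result = (a + b) & 0xFFFFFFFF
    if result & 0x80000000:
        result |= 0xFFFFFFFF00000000
    return result


t1 = ldah(-0x8000)
print(hex(t1))  # 0xffffffff80000000
t1 = lda(-0x5433, t1)
print(hex(t1))  # 0xffffffff7fffabcd
t1 = addl(t1, 0)
print(hex(t1))  # 0x7fffabcd
```

The final add changes nothing in the low 32 bits; its only job is to replace the stray high 32 bits with a sign extension of bit 31, which is zero here.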

If addresses within 64K of the 2GB boundary were permitted, then every memory address computation would have to insert that third ADDL instruction just in case the address got relocated to the “danger zone” near the 2GB boundary.

This was an awfully high price to pay to get access to that last 64K of address space (a 50% performance penalty for all address computations, three instructions instead of two, to protect against a case that in practice would never happen), so roping off that area as permanently invalid was a more prudent choice.
