JSC NaN Box Exercise
Description
In JSC, JSValues are "NaN-Boxed values". NaN-Boxing is a technique for storing a type and value in the same 64 bits. This allows the value to be stored in a single register and avoids the need for heap allocation.
This scheme allows the engine to store 3 different types in the same 64 bits of space.
Integers are encoded as:
nanboxed_jsval = some_int | 0xffff000000000000
Pointers are encoded as:
nanboxed_jsval = some_ptr | 0
(note that this is the same as a normal pointer)
Doubles are encoded as:
nanboxed_jsval = reinterpret_cast<uint64_t>(some_double) + 0x1000000000000
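The three encodings above can be tried out with a small Python sketch. The function names are hypothetical (they are not JSC APIs). Masking the integer to its low 32 bits and range-checking pointers are assumptions, based on the 32-bit integer payload and the zero top 16 bits described in this exercise.

```python
import struct

INT_TAG = 0xFFFF000000000000   # top 16 bits all set marks an integer
DOUBLE_OFFSET = 1 << 48        # 0x1000000000000, added to every double

def box_int(value):
    """Box an integer: keep the low 32 bits (assumption) and OR in the tag."""
    return (value & 0xFFFFFFFF) | INT_TAG

def box_ptr(addr):
    """Box a pointer: the encoding is the pointer itself (top 16 bits zero)."""
    if addr >> 48:
        raise ValueError("pointer must fit in 48 bits")
    return addr | 0

def box_double(value):
    """Box a double: reinterpret its 8 bytes as a uint64, then add the offset."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", value))
    return (bits + DOUBLE_OFFSET) & 0xFFFFFFFFFFFFFFFF

print(hex(box_int(0x41)))            # 0xffff000000000041
print(hex(box_ptr(0x7f0012345678)))  # 0x7f0012345678
print(hex(box_double(2.0)))          # 0x4001000000000000
```

The `struct` round trip stands in for the `reinterpret_cast` in the formula above: it copies the bits of the double unchanged rather than converting the number.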
The inverse decoding is summarized by the following logic:
switch (val >> 48):
    0      -> ptr
    0xffff -> 32-bit int
    else   -> double (subtract 1<<48 to get actual value)
Pointer {  0000:PPPP:PPPP:PPPP
         / 0001:****:****:****
Double  {         ...
         \ FFFE:****:****:****
Integer {  FFFF:0000:IIII:IIII
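As a rough illustration, the decoding switch can be written as a short Python function. The name `unbox` is hypothetical, and treating the 32-bit payload as a signed value is an assumption that the text above does not spell out.

```python
import struct

def unbox(val):
    """Decode a 64-bit NaN-boxed value into a (kind, value) pair."""
    top = val >> 48
    if top == 0:
        # Top 16 bits are zero: the value is a plain pointer.
        return ("ptr", val)
    if top == 0xFFFF:
        # Top 16 bits are all ones: the low 32 bits hold the integer.
        payload = val & 0xFFFFFFFF
        if payload >= 1 << 31:       # assumption: payload is a signed int32
            payload -= 1 << 32
        return ("int", payload)
    # Anything else is a double: undo the 1<<48 offset, then reinterpret.
    bits = val - (1 << 48)
    (number,) = struct.unpack("<d", struct.pack("<Q", bits))
    return ("double", number)

print(unbox(0xFFFF000000000042))  # ('int', 66)
print(unbox(0x4001000000000000))  # ('double', 2.0)
print(unbox(0x00007F0012345678))  # ('ptr', 139639973041784)
```

The order of the checks mirrors the switch: the two tag values are tested first, and every other top-16-bit pattern (0x0001 through 0xFFFE) falls through to the double case.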
Steps
First let's practice decoding some NaNBoxed JSValues:
- What is the decoded value of this JSValue:
0xffff000000001337
Answer: The integer 0x1337
- What is the decoded value of this JSValue:
0x4029b0a3d70a3d71
Answer: The double 12.345 (note you need to subtract 0x1000000000000 before decoding, which gives you 0x4028b0a3d70a3d71; see https://float.exposed/0x4028b0a3d70a3d71)
For JavaScript objects, the properties are NaNBoxed. If we construct an object with several properties of different types, we should be able to see how they are encoded in memory.
Warning: If you haven't already, read through the VM overview to understand the VM / exercise setup.
Run the JSC REPL under GDB:
exercise run jsc --gdb
Warning: JSC will exit if given an empty line of input (i.e. just hitting enter), which can be frustratingly easy to do accidentally, especially in gdb after continuing, as the prompt string >>> may be absent.
Create an object with several properties of different types. Make sure to have at least one small-ish integer (fewer than 31 bits), a floating point number, and a reference. For example:
foo = {a:0x414243, b:{}, c:1.1, d:null}
Print the address of the object by calling:
describe(foo)
Break in gdb (with ctrl-c) and print memory at the object's address, e.g.:
x/16gx <object-addr>
Locate the integer property. Does it match the NaNBoxed value you expected?
Locate the double property. Notice how it is offset by 0x1000000000000 compared to the expected double value. For example, the double 1.1 encodes to 0x3ff199999999999a, but the NaNBoxed version is 0x3ff199999999999a + 0x1000000000000 = 0x3ff299999999999a.
Locate the reference property. As you can see, pointers just look like pointers, so you don't have to do anything to decode them yourself. (Note: Even though they look like normal pointers, the engine will still check the NaNBox encoding before using them as such.)
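To check your reading of the memory dump, you could paste the gdb output into a small helper like the one below. This is only a sketch: `classify` and `parse_dump` are hypothetical names, and the sample dump is invented for illustration. Its addresses and pointer value are made up, while the integer and double words are the ones worked out in this exercise. It assumes each output line has the form `<address>:` followed by whitespace-separated hex words.

```python
import struct

def classify(val):
    """Label one 64-bit word using the NaN-box rules from this exercise."""
    top = val >> 48
    if top == 0:
        return f"ptr {val:#x}"
    if top == 0xFFFF:
        return f"int32 {val & 0xFFFFFFFF:#x}"
    # Otherwise a double: subtract the 1<<48 offset and reinterpret the bits.
    (number,) = struct.unpack("<d", struct.pack("<Q", val - (1 << 48)))
    return f"double {number!r}"

def parse_dump(text):
    """Pull the 64-bit words out of pasted `x/<n>gx` output."""
    words = []
    for line in text.splitlines():
        # Everything after the last ':' is the list of hex words.
        _, sep, rest = line.rpartition(":")
        if not sep:
            continue
        words.extend(int(token, 16) for token in rest.split())
    return words

# Invented sample, not real JSC output.
SAMPLE = """\
0x7fffaf0b4000:\t0xffff000000414243\t0x00007fffaf0b8000
0x7fffaf0b4010:\t0x3ff299999999999a
"""

for word in parse_dump(SAMPLE):
    print(f"{word:#018x}  {classify(word)}")
```

Words that are not JSValues at all, such as object headers, will still be given a label by this helper, so treat its output as a hint rather than a verdict.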