JavaScriptCore Engine Internals
As we did with V8, this chapter will explain how JavaScriptCore (usually abbreviated JSC) chooses to implement core features of JavaScript.
JSC NaN Boxing
Among the first things we explored in V8 was how it differentiates between pointers and doubles. There, a method called "pointer-tagging" was used, but JSC uses a different approach, called NaN Boxing.
The main idea behind NaN Boxing is to cleverly utilize the fact that NaN has many different bit pattern representations. These bit patterns can all be normalized to one true NaN.
An IEEE 754 double is NaN if the following is true:
[sign] [exponent] [significand]
* 11111111111 ****************************************************
significand != 0
JSC normalized NaN:
0 11111111111 1000000000000000000000000000000000000000000000000000
JSC range of non-NaN doubles:
0 00000000000 0000000000000000000000000000000000000000000000000000
... Example: 0x3ff199999999999a
1 11111111111 0000000000000000000000000000000000000000000000000000
(anything greater than this would become NaN)
Suppose that for all doubles, we perform an addition of 1<<48 (or 0x0001000000000000), the goal being to give doubles and pointers distinct high bit patterns.
To top it off, add 32-bit integers with a yet-unused pattern for the high bits:
JSC Objects (pointers):
0 00000000000 0000************************************************ Example: 0x00007f48991e4158
[BOXED] JSC normalized NaN:
0 11111111111 1001000000000000000000000000000000000000000000000000
[BOXED] JSC range of non-NaN doubles:
0 00000000000 0001000000000000000000000000000000000000000000000000
... Example: 0x3ff299999999999a
1 11111111111 0001000000000000000000000000000000000000000000000000
[BOXED] JSC integers:
1 11111111111 11110000000000000000******************************** Example: 0xffff000041424344
We've ended up with 3 types representable in 64 bits, with the type implied by the high 16 bits:
switch (val>>48):
0 -> ptr
0xffff -> 32-bit int
else -> double (subtract 1<<48 to get actual value)
Pointer { 0000:PPPP:PPPP:PPPP
/ 0001:****:****:****
Double { ...
\ FFFE:****:****:****
Integer { FFFF:0000:IIII:IIII
Note that the only edge case for adding 1<<48 to double bit patterns is the value "becoming" an integer if the high bits become ffff (or more dangerously, become 0000 due to overflow). However, all such unsafe bit patterns are various representations of NaN (with different significands), which all become the normalized pattern that can perform the addition safely.
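The boxing scheme above can be sketched in a few lines of C++. This is a minimal illustration that assumes the 1<<48 offset described in this section; the names (kDoubleOffset, boxDouble, classify, and so on) are hypothetical and are not JSC's own identifiers.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical names; the constants follow the 1<<48 scheme described above.
constexpr uint64_t kDoubleOffset = 1ULL << 48;           // added to every double
constexpr uint64_t kInt32Tag     = 0xffffULL << 48;      // high 16 bits of an int32
constexpr uint64_t kCanonicalNaN = 0x7ff8000000000000ULL; // the normalized NaN

enum class Kind { Pointer, Int32, Double };

// The type is implied by the high 16 bits of the 64-bit value.
inline Kind classify(uint64_t v) {
    switch (v >> 48) {
    case 0x0000: return Kind::Pointer;
    case 0xffff: return Kind::Int32;
    default:     return Kind::Double;
    }
}

// Every NaN is first normalized, so adding the offset can never
// produce the 0xffff or 0x0000 high bits.
inline uint64_t boxDouble(double d) {
    uint64_t bits = kCanonicalNaN;
    if (d == d) // false only for NaN
        std::memcpy(&bits, &d, sizeof bits);
    return bits + kDoubleOffset;
}

inline double unboxDouble(uint64_t v) {
    assert(classify(v) == Kind::Double);
    uint64_t bits = v - kDoubleOffset;
    double d;
    std::memcpy(&d, &bits, sizeof d);
    return d;
}

inline uint64_t boxInt32(int32_t i) {
    return kInt32Tag | static_cast<uint32_t>(i);
}

inline int32_t unboxInt32(uint64_t v) {
    assert(classify(v) == Kind::Int32);
    return static_cast<int32_t>(static_cast<uint32_t>(v));
}
```

With these helpers, boxDouble(1.1) yields 0x3ff299999999999a and boxInt32(0x41424344) yields 0xffff000041424344, matching the example values in the diagrams above.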
Note: WebKit has since changed the NaN-boxing scheme slightly to use one fewer bit, meaning 1<<49 (or 0x0002000000000000) is added to/subtracted from doubles. This doesn't affect exploit development other than changing a constant or two.
JSC NaN Boxing Exercise
Objects in JSC
The simplest object class in JSC is JSCell.
class JSCell : public HeapCell {
    ...
    StructureID m_structureID;
    IndexingType m_indexingType;
    JSType m_type;
    TypeInfo::InlineTypeFlags m_flags;
    CellState m_cellState;
    ...
};
You may notice that there isn't a Map pointer as in V8. JSC uses a different strategy, namely the concept of structures and structureIDs to keep track of Objects' types.
Below is a diagram showing some more details about the JSCell header:
JSCell 8 Byte header
00: [ StructureID ] <-- Index into StructureIDTable for Structure object
04: [ Indexing Type ] <-- Storage mode for elements
05: [ Cell Type ] <-- JS Type (e.g. object, string, function, array, ...)
06: [ Type Flags ] <-- Some inline flags for object type
07: [ Cell State ] <-- Garbage collector flags
JSC Structure IDs
Instead of a pointer to another class that contains type information, JSC stores an index into the StructureIDTable, which contains Structure objects. Those Structure objects are what actually contain the type information for the object.
Below are some snippets of relevant code from the JSC source:
typedef uint32_t StructureID;
class StructureIDTable {
    UniqueArray<StructureOrOffset> m_table;
};
inline Structure* StructureIDTable::get(StructureID structureID)
{
    ASSERT_WITH_SECURITY_IMPLICATION(structureID);
    ASSERT_WITH_SECURITY_IMPLICATION(!isNuked(structureID));
    ASSERT_WITH_SECURITY_IMPLICATION(structureID < m_capacity);
    // Shift away the low entropy bits; what remains is the table index.
    uint32_t structureIndex = structureID >> s_numberOfEntropyBits;
    RELEASE_ASSERT_WITH_SECURITY_IMPLICATION(structureIndex < m_capacity);
    return decode(table()[structureIndex].encodedStructureBits, structureID);
}
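To make the lookup above more concrete, here is a toy model of an ID table. It is only a sketch: the class and member names are hypothetical, and the choice of 7 entropy bits XORed into the top of the stored pointer is an assumption made for illustration, not something the snippet above confirms. It also assumes a 64-bit platform.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct ToyStructure { const char* className; };

// Toy stand-in for StructureIDTable (hypothetical, 64-bit only).
class ToyStructureIDTable {
public:
    static constexpr uint32_t kEntropyBits = 7; // assumed value, for illustration

    ToyStructureIDTable() { m_table.push_back(0); } // index 0 is reserved: ID 0 is invalid

    // Build an ID as (table index << kEntropyBits) | entropy.
    uint32_t allocateID(ToyStructure* s, uint32_t entropy) {
        uint32_t index = static_cast<uint32_t>(m_table.size());
        uint32_t id = (index << kEntropyBits) | (entropy & ((1u << kEntropyBits) - 1));
        m_table.push_back(encode(s, id));
        return id;
    }

    // Mirrors the shape of get(): shift off the entropy bits, index, decode.
    ToyStructure* get(uint32_t id) const {
        uint32_t index = id >> kEntropyBits;
        assert(id && index < m_table.size());
        return decode(m_table[index], id);
    }

private:
    static constexpr unsigned kShift = 64 - kEntropyBits;

    // The entropy bits of the ID are mixed into the top bits of the pointer,
    // so an ID with the right index but wrong entropy decodes to a bad pointer.
    static uint64_t encode(ToyStructure* s, uint32_t id) {
        return static_cast<uint64_t>(reinterpret_cast<uintptr_t>(s))
            ^ (static_cast<uint64_t>(id) << kShift);
    }
    static ToyStructure* decode(uint64_t bits, uint32_t id) {
        return reinterpret_cast<ToyStructure*>(
            static_cast<uintptr_t>(bits ^ (static_cast<uint64_t>(id) << kShift)));
    }

    std::vector<uint64_t> m_table;
};
```

In this model, a guessed ID that hits a valid index but carries the wrong entropy bits does not yield the original Structure pointer, which is the practical consequence of the decode step for anyone forging a structureID.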
JSC Structure
Now that we've described the machinery involved in storing type information, let's take a look at JSC's Structure class:
class Structure final : public JSCell {
    ...
    uint8_t m_inlineCapacity;
    WriteBarrier<Unknown> m_prototype;
    const ClassInfo* m_classInfo;
    StructureTransitionTable m_transitionTable;
    ...
};
There are many interesting properties here, but we will focus on m_transitionTable.
JSC Transitions
In V8, we saw how the engine keeps track of the "shape" of objects via their Map. When an object changes in some way, its associated Map is updated using Map Transitions, or an entirely new Map is created if necessary.
JSC solves this in a similar way, but uses Property Transitions:
Structure* Structure::addPropertyTransition(VM& vm, Structure* structure, PropertyName propertyName, unsigned attributes, PropertyOffset& offset)
{
    // Reuse a cached transition if this property was already added from this structure.
    Structure* newStructure =
        addPropertyTransitionToExistingStructure(structure, propertyName, attributes, offset);
    if (newStructure)
        return newStructure;
    return addNewPropertyTransition(
        vm, structure, propertyName, attributes, offset, PutPropertySlot::UnknownContext);
}
An important difference is that JSC does not perform a transition when the type of a property changes. This is due to properties already being stored as generic JSValues.
In the past, these transitions were also used for tracking type generalizations in the JIT compiler (this was removed in 2019).
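The "reuse an existing transition, otherwise create a new one" pattern can be modeled with a small tree of structures. The sketch below is a simplified illustration with hypothetical names (ToyStructure and its members); it ignores attributes, the VM, and everything else the real Structure tracks.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Toy model of a structure: a property layout plus cached transitions.
struct ToyStructure {
    std::map<std::string, int> offsets;                               // property name -> slot
    std::map<std::string, std::unique_ptr<ToyStructure>> transitions; // cached child structures

    // Look up the slot of a property that this structure is known to have.
    int offsetOf(const std::string& name) const {
        auto it = offsets.find(name);
        assert(it != offsets.end());
        return it->second;
    }

    // Adding a property either follows a cached transition or creates one.
    // Note that the value's type is not an input: only the name matters here.
    ToyStructure* addPropertyTransition(const std::string& name, int& offset) {
        auto it = transitions.find(name);
        if (it != transitions.end()) {
            offset = it->second->offsetOf(name);
            return it->second.get();
        }
        auto next = std::make_unique<ToyStructure>();
        next->offsets = offsets;                   // inherit the existing layout
        offset = static_cast<int>(offsets.size()); // new property takes the next slot
        next->offsets[name] = offset;
        ToyStructure* raw = next.get();
        transitions[name] = std::move(next);
        return raw;
    }
};
```

Two objects that add the same properties in the same order end up sharing one structure in this model, and storing a different kind of value into an existing property never touches the transition tree.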
JSC JSObject
As in V8, Objects in JSC are represented by a class called JSObject. We can see part of its definition below:
class JSObject : public JSCell {
    ...
    AuxiliaryBarrier<Butterfly*> m_butterfly;
    ...
};
// Inline properties begin immediately after the fixed-size JSObject fields.
inline size_t JSObject::offsetOfInlineStorage()
{
    return sizeof(JSObject);
}
Beyond a normal JSCell, JSObjects also contain a Butterfly pointer and (optionally) inline properties:
00: [ JSCell Header ]
08: [ Butterfly* ] <-- Pointer to butterfly structure
10: [ Inline Properties ] <-- Fast property values stored inline
JSC Butterfly
The Butterfly is a somewhat exotic data structure unique to JSC. It is a structure that is used to hold both the (out-of-line) properties and the elements for an object.
[ <------------- Properties ] [ Elements Length ] [ Element Array -----------> ]
                                                 /|\
                                                  |
                                             m_butterfly
The Butterfly can be expanded dynamically for more properties to the left or elements to the right, with the butterfly pointer (usually m_butterfly) pointing into the middle of the structure.
The name is related to the left/right expansion being somewhat like a butterfly opening its wings.
We can see some relevant code below:
class Butterfly {
    IndexingHeader* indexingHeader() { return IndexingHeader::from(this); }
    ...
};
class IndexingHeader {
    ...
    union {
        struct {
            // The meaning of this field depends on the array type, but for all
            // JSArrays we rely on this being the publicly visible length (array.length)
            uint32_t publicLength;
            // The length of the indexed property storage. The actual size of the
            // storage depends on this, and the type.
            uint32_t vectorLength;
        } lengths;
        struct {
            ArrayBuffer* buffer;
        } typedArray;
    } u;
};
Importantly, there are two different lengths to pay attention to:
publicLength - the semantic length of the element array (i.e. maximum filled index + 1)
vectorLength - the allocated capacity of the element array
As mentioned above, m_butterfly itself points to the start of the element array, rather than to the "start" of the structure as one normally thinks of it.
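The layout can be modeled with a single allocation and a pointer into its middle. The sketch below is illustrative only: the function names are hypothetical, every slot is treated as an 8-byte word, and it assumes a little-endian 64-bit machine with publicLength stored at the lower address of the header (which matches the memory dumps later in this chapter).

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Mirrors the two 32-bit lengths in the 8-byte indexing header.
struct ToyIndexingHeader {
    uint32_t publicLength;
    uint32_t vectorLength;
};

// Allocates [properties][header][elements] and returns a pointer to
// elements[0], i.e. the position m_butterfly would hold.
inline uint64_t* allocateToyButterfly(uint32_t propertyCapacity, uint32_t vectorLength) {
    size_t words = static_cast<size_t>(propertyCapacity) + 1 + vectorLength;
    uint64_t* base = static_cast<uint64_t*>(std::calloc(words, sizeof(uint64_t)));
    assert(base);
    uint64_t* butterfly = base + propertyCapacity + 1;
    ToyIndexingHeader header{0, vectorLength};
    std::memcpy(butterfly - 1, &header, sizeof header);
    return butterfly;
}

// The header sits in the 8 bytes immediately before the butterfly pointer.
inline ToyIndexingHeader headerOf(const uint64_t* butterfly) {
    ToyIndexingHeader header;
    std::memcpy(&header, butterfly - 1, sizeof header);
    return header;
}

// Out-of-line property i grows to the left, past the header.
inline uint64_t& propertyAt(uint64_t* butterfly, uint32_t i) {
    return *(butterfly - 2 - i);
}

inline void freeToyButterfly(uint64_t* butterfly, uint32_t propertyCapacity) {
    std::free(butterfly - 1 - propertyCapacity);
}
```

Elements are reached with non-negative indices (butterfly[0], butterfly[1], ...) while properties and the header are reached with negative ones, which is why a debugger dump of a butterfly usually starts at the pointer minus 8.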
Butterfly Exercise
JSC JSArray
Arrays in JSC follow a similar pattern to that of V8: they are essentially regular Objects with a few things changed. The function to get an "array-like" object's length is defined by JSObject and applies the same to both arrays and all objects with elements (indexed properties):
class JSObject : public JSCell {
    unsigned getArrayLength() const
    {
        if (!hasIndexedProperties(indexingType()))
            return 0;
        return m_butterfly->publicLength();
    }
};
// JSNonFinalObject is a type of JSObject that has some internal storage,
// but also preserves some space in the collector cell for additional
// data members in derived types.
class JSNonFinalObject : public JSObject { ... };
class JSArray : public JSNonFinalObject {
    ...
    unsigned length() const { return getArrayLength(); }
    ...
};
Notably, this means JSC does not store length inline like Arrays do in V8. Instead, the standard butterfly lengths apply, i.e. publicLength.
Indexing Type
The IndexingType (part of the JSCell header) is how the engine decides how to access elements in m_butterfly. This is analogous to the concept of Elements Kind in V8. Under certain circumstances, it becomes possible for the engine to perform optimizations for certain element types.
class JSCell : public HeapCell {
    ...
    IndexingType m_indexingTypeAndMisc;
    ...
};
Below is a list of some values for IndexingType:
typedef uint8_t IndexingType;
...
static const IndexingType Int32Shape = 0x04;
static const IndexingType DoubleShape = 0x06;
static const IndexingType ContiguousShape = 0x08;
...
static const IndexingType NonArrayWithInt32 = Int32Shape;
static const IndexingType NonArrayWithDouble = DoubleShape;
static const IndexingType NonArrayWithContiguous = ContiguousShape;
...
static const IndexingType ArrayWithInt32 = IsArray | Int32Shape;
static const IndexingType ArrayWithDouble = IsArray | DoubleShape;
static const IndexingType ArrayWithContiguous = IsArray | ContiguousShape;
...
These indexing types are similar to the element kinds we've seen in V8:
*WithInt32 for integer elements, like *_SMI_ELEMENTS
*WithDouble for native doubles, like *_DOUBLE_ELEMENTS
*WithContiguous for generic JSValue elements, like *_ELEMENTS
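As a small aid for reading these values out of a debugger, the byte can be split into an "is array" bit and a shape. The shape constants below come from the listing above; the values used for IsArray (0x01) and the shape mask (0x0E) are elided in that listing and are assumptions here, and the helper function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

using IndexingType = uint8_t;

// Values taken from the listing above.
constexpr IndexingType Int32Shape      = 0x04;
constexpr IndexingType DoubleShape     = 0x06;
constexpr IndexingType ContiguousShape = 0x08;

// Assumed values: these constants are elided in the listing above.
constexpr IndexingType IsArray           = 0x01;
constexpr IndexingType IndexingShapeMask = 0x0E;

// Mirrors the ArrayWithX = IsArray | XShape pattern from the listing.
inline IndexingType withArrayBit(IndexingType nonArrayType) {
    assert(!(nonArrayType & IsArray));
    return static_cast<IndexingType>(nonArrayType | IsArray);
}

// Produces names in the same style as the listing, e.g. "ArrayWithDouble".
inline std::string describeIndexingType(IndexingType type) {
    std::string prefix = (type & IsArray) ? "ArrayWith" : "NonArrayWith";
    switch (type & IndexingShapeMask) {
    case Int32Shape:      return prefix + "Int32";
    case DoubleShape:     return prefix + "Double";
    case ContiguousShape: return prefix + "Contiguous";
    default:              return prefix + "OtherShape"; // shapes not covered in this sketch
    }
}
```

Under these assumptions, a header byte of 0x07 decodes as ArrayWithDouble and 0x08 as NonArrayWithContiguous.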
We can create various kinds of Arrays and watch how the IndexingType changes:
>>> a=[1.1,1.1,1.1]
>>> describe(a)
Object: 0x7f4c0ecb4340 ... Array, {}, ArrayWithDouble, ...
>>> a=[1.1]; a[3]=1.1
>>> describe(a)
Object: 0x7f4c0ecb4350 ... Array, {}, ArrayWithDouble, ...
>>> a=[{},1.1]
>>> describe(a)
Object: 0x7f4c0ecb4360 ... Array, {}, ArrayWithContiguous, ...
>>> a=[0x41424344, 0x51525354]
>>> describe(a)
Object: 0x7f4c0ecb4390 ... Array, {}, CopyOnWriteArrayWithInt32, ...
WithDouble
The WithDouble indexing type allows doubles to be stored unboxed (without the NaN-boxing offset), for the same reason we could drop pointer tagging / "boxing" for our doubles in V8: the engine knows all the elements will be doubles.
>>> a = [1.1,1.1,{}]
>>> describe(a)
Object: ... with butterfly 0x7f48991e4158 ... Array, {}, ArrayWithContiguous, ...
pwndbg> x/6xg 0x7f48991e4158-8
0x7f48991e4150: 0x0000000500000003 0x3ff299999999999a
0x7f48991e4160: 0x3ff299999999999a 0x00007f4c0ecb00c0
0x7f48991e4170: 0x0000000000000000 0x0000000000000000
>>> a = [1.1,1.1,1.1]
>>> describe(a)
Object: ... with butterfly 0x7f48991e4128 ... Array, {}, ArrayWithDouble, ...
pwndbg> x/6xg 0x7f48991e4128-8
0x7f48991e4120: 0x0000000500000003 0x3ff199999999999a
0x7f48991e4130: 0x3ff199999999999a 0x3ff199999999999a
0x7f48991e4140: 0x7ff8000000000000 0x7ff8000000000000
Boxed 0x3ff299999999999a vs unboxed 0x3ff199999999999a
Element Array Holes
Unlike V8, there is no differentiation between "packed" and "holey" elements.
The encoding of "holes" in the element array changes based on the IndexingType: NaN for doubles, zeroes otherwise:
>>> a = [1.1]; a[3] = 1.1
>>> describe(a)
Object: ... with butterfly 0x7f48991e4188 ... Array, {}, ArrayWithDouble, ...
pwndbg> x/6xg 0x7f48991e4188-8
0x7f48991e4180: 0x0000000500000004 0x3ff199999999999a
0x7f48991e4190: 0x7ff8000000000000 0x7ff8000000000000
0x7f48991e41a0: 0x3ff199999999999a 0x7ff8000000000000
>>> a = [1.1]; a[3] = {}
>>> describe(a)
Object: ... with butterfly 0x7f48991e41b8 ... Array, {}, ArrayWithContiguous, ...
pwndbg> x/6xg 0x7f48991e41b8-8
0x7f48991e41b0: 0x0000000500000004 0x3ff299999999999a
0x7f48991e41c0: 0x0000000000000000 0x0000000000000000
0x7f48991e41d0: 0x00007f4c0ecb0100 0x0000000000000000
WithDouble: 0x7ff8000000000000 vs WithContiguous: 0x0000000000000000
The same value is used for unused space at the end of the vector.
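The hole encodings seen in the dumps above can be captured in a small predicate. This is a sketch based only on those dumps: the Shape enum and function names are hypothetical, and it treats any NaN bit pattern as a hole in a double array and an all-zero word as a hole otherwise.

```cpp
#include <cassert>
#include <cstdint>

enum class Shape { Int32, Double, Contiguous };

// Classifies one raw 64-bit slot from the element array.
inline bool isHole(uint64_t slotBits, Shape shape) {
    if (shape == Shape::Double) {
        // NaN: exponent all ones and a non-zero significand.
        return (slotBits & 0x7ff0000000000000ULL) == 0x7ff0000000000000ULL
            && (slotBits & 0x000fffffffffffffULL) != 0;
    }
    return slotBits == 0;
}

// Counts holes among the first publicLength slots.
inline unsigned countHoles(const uint64_t* elements, uint32_t publicLength, Shape shape) {
    assert(elements || publicLength == 0);
    unsigned holes = 0;
    for (uint32_t i = 0; i < publicLength; ++i)
        holes += isHole(elements[i], shape) ? 1 : 0;
    return holes;
}
```

Applied to the two dumps above with a publicLength of 4, both arrays report two holes (indices 1 and 2), even though the raw words differ between the WithDouble and WithContiguous cases.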
Array Storage
As in V8, if we try to store elements in a way that is too sparse, the engine will "switch modes" and use a hashmap implementation rather than continuing to use a simple vector.
>>> a=[1.1];a[10000]=1.1
>>> describe(a)
Object: 0x7f4c0ecb43a0 ... Array, {}, ArrayWithArrayStorage, ...
In the output above, the Object is using ArrayWithArrayStorage, which is essentially a wrapper for a hashmap:
struct ArrayStorage {
    ...
    WriteBarrier<SparseArrayValueMap> m_sparseMap;
    ...
};
Indexing Type Exercise
JSC JSArrayBuffer
In JSC, a JSArrayBuffer is essentially just a container class holding a reference to an ArrayBuffer*:
class JSArrayBuffer final : public JSNonFinalObject {
    ...
    Poisoned<JSArrayBufferPoison, ArrayBuffer*> m_impl;
    ...
};
The associated structure diagram:
00: [ JSCell Header ]
08: [ Butterfly* ]
10: [ ArrayBuffer* ] <--- This is a pointer to the enclosed array buffer
18: [ Inline Properties ]
JSC JSArrayBufferView
JSArrayBufferView is more interesting, as the data pointer is stored inline:
class JSArrayBufferView : public JSNonFinalObject {
    ...
    VectorPtr m_vector;
    uint32_t m_length;
    TypedArrayMode m_mode;
    ...
};
Whenever we have an inline pointer like this, it makes the object potentially useful for exploitation.
00: [ JSCell Header ]
08: [ Butterfly* ]
10: [ Backing Store* ] <--- Pointer to the backing data buffer
18: [ Length ] <--- Number of elements in the view (equals the byte size only for 1-byte element types)
1c: [ Array Mode ] <--- What type of backing allocation
20: [ Inline Properties ]
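The offsets in the diagram above can be checked against a plain struct. The sketch below uses hypothetical stand-in types (raw pointers and integers instead of JSC's wrapper classes) and assumes a typical 64-bit ABI; it models only the layout, not the real class.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for the 8-byte JSCell header described earlier.
struct ToyJSCellHeader {
    uint32_t structureID;
    uint8_t  indexingType;
    uint8_t  cellType;
    uint8_t  typeFlags;
    uint8_t  cellState;
};

// Stand-in for the JSArrayBufferView layout in the diagram above.
struct ToyArrayBufferView {
    ToyJSCellHeader header; // 0x00
    void*    butterfly;     // 0x08
    void*    vector;        // 0x10: pointer to the backing store
    uint32_t length;        // 0x18
    uint32_t mode;          // 0x1c
};

static_assert(sizeof(ToyJSCellHeader) == 8, "header is 8 bytes");
static_assert(offsetof(ToyArrayBufferView, butterfly) == 0x08, "butterfly at 0x08");
static_assert(offsetof(ToyArrayBufferView, vector) == 0x10, "vector at 0x10");
static_assert(offsetof(ToyArrayBufferView, length) == 0x18, "length at 0x18");
static_assert(offsetof(ToyArrayBufferView, mode) == 0x1c, "mode at 0x1c");
static_assert(sizeof(ToyArrayBufferView) == 0x20, "inline properties would start at 0x20");

// Element access goes straight through the inline pointer, bounded only by length.
inline uint8_t readByte(const ToyArrayBufferView& view, uint32_t index) {
    assert(index < view.length);
    return static_cast<const uint8_t*>(view.vector)[index];
}
```

Because reads and writes are resolved through vector and checked only against length, whoever controls those two fields controls where the view reads and writes, which is why this object is attractive as an exploitation primitive.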
Key Points
JSC Objects store type information via a Structure pointer indexed by the structureID
- Largely performs the function of Map from V8
JSC Structure transitions do not track value types
Properties stored with Elements in Butterfly
- Length of Elements stored in Butterfly
- Switches to ArrayStorage for sparse arrays
Indexing Type controls how elements are stored
- Analogous to Elements Kind in V8