🔗 DOM Events

In this module, we will introduce DOM Events, how they work, and why they were / are relevant for vulnerability research against browsers.

The exact types of bugs discussed here have mostly been eliminated from modern browsers. However, it is still valuable to learn about them, as DOM Event bugs can provide useful context and experience.

🔗 Event Introduction

At their most basic level, Events signal that something has occurred. They are dispatched to objects that implement the EventTarget interface. Those objects can then react to the event.

Some common types of actions that trigger an Event:

Clicking an element with the mouse
Pressing a key (keydown/keyup/etc)
Network State Changes (online/offline)
Page display actions (scrolling/resizing/fullscreen/etc)

There are many, many, many more!

🔗 Blink DOM Hierarchy

For reference, these are the classes representing Blink's DOM nodes.

C++
class CORE_EXPORT EventTarget : public ScriptWrappable {...}

// A Node is a base class for all objects in the DOM tree.
// The spec governing this interface can be found here:
// https://dom.spec.whatwg.org/#interface-node
class CORE_EXPORT Node : public EventTarget {...}

// HTMLElement class defined for Chromium
class CORE_EXPORT ContainerNode : public Node { ... }
class CORE_EXPORT Element : public ContainerNode {...}
class CORE_EXPORT HTMLElement : public Element { ... }
// All the HTML tags are subclasses of HTMLElement

// Root document class for Chromium
class CORE_EXPORT Document : public ContainerNode, ... { ... }

// The Text class defined for Chromium
class CORE_EXPORT CharacterData : public Node { ... }
class CORE_EXPORT Text : public CharacterData { ... }

🔗 DOM Event Introduction

EventTarget objects can observe events (and react to them!) by registering "Event Listeners" from JavaScript. In code, registering and listening for events looks something like the following:

javascript
someElement.addEventListener('click', (mouseEvent) => {
    console.log("someElement received a 'Click' Event!");

    if (mouseEvent.altKey) {
        console.log("Looks like the ALT key was pressed too!");

        doSomethingInteresting();
    }
});

Our callback will sit un-executed until a click event is generated by the browser, at which point our code will have an opportunity to run.

Another interesting, if less utilized, feature is the ability to dispatch events as well:

javascript
someElement.addEventListener('click', (mouseEvent) => {
    console.log("someElement received a 'Click' Event!");

    if (mouseEvent.altKey) {
        console.log("Looks like the ALT key was pressed too!");

        doSomethingInteresting();
    }
});

// Let's trigger the ALT+MouseClick functionality !!
someElement.dispatchEvent(new MouseEvent('click', {'altKey': true}));

These are known as "synthetic" events, as they were generated programmatically rather than as a result of an explicit user-action, which "normally" triggers events.

🔗 DOM Events in Exploitation

Before continuing, it is worth considering why DOM Events are interesting in an offensive context at all. Broadly speaking:

Event Handlers can run at almost any time
Dispatching events can cause the engine to jump back into JavaScript
Common source of vulnerabilities in years past

Essentially, DOM Events can trigger a series of complex behaviors, at almost any time, between at least two complicated and bug-prone browser subsystems (JS & DOM). Complexity often begets bugs, so it should be no surprise that handling such events can be tricky.

🔗 Event Vulnerability Pattern ( UAF )

Let's imagine a "simplest-case", contrived example of what a DOM Event based vulnerability may look like.

C++
void canDispatchEvent( ... ) {

    // Raw Pointer, not refcounted, this function is "on its own",
    // no other part of the browser knows about this pointer!
    HTMLElement * raw = engine.getElement(...);

    // JavaScript can run as a result of this event being dispatched!
    // What happens if the element 'raw' points to is free'd ??? 
    dispatchEvent(anEvent, eventTarget);

    // We get a classic, clean UAF Scenario! :) //
    raw->do_something_interesting();
}

This models one of the scenarios we warned about in the section about reference counting. A developer used a raw pointer in a way that resulted in the underlying object being free'd, subverting any reference counting or other memory management techniques. In this case, the free() occurred as a result of JavaScript acting on an Event.
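
The pattern can be observed without actually triggering undefined behavior. The following is a minimal sketch, not engine code: `Element`, `Engine`, and `elementFreedDuringDispatch` are hypothetical names, `std::shared_ptr` stands in for the engine's reference counting, and a `std::function` stands in for a JavaScript event handler. A `std::weak_ptr` plays the role of the raw pointer, except that it can safely report whether the object it observed is gone.

```cpp
#include <functional>
#include <memory>
#include <vector>

// Hypothetical stand-in for a DOM element.
struct Element {
    int value = 42;
};

// Hypothetical "engine": the only owner of the element.
struct Engine {
    std::shared_ptr<Element> element = std::make_shared<Element>();
    std::vector<std::function<void()>> listeners;  // stand-ins for JS handlers

    // Synchronous dispatch: every listener runs before this returns.
    void dispatchEvent() {
        for (auto& listener : listeners) listener();
    }
};

// Returns true if the element was destroyed while the event was dispatched,
// i.e. a raw pointer taken before the dispatch would now be dangling.
bool elementFreedDuringDispatch(Engine& engine) {
    std::weak_ptr<Element> observer = engine.element;  // observes, does not own
    engine.dispatchEvent();                            // "JavaScript" runs here
    return observer.expired();
}
```

Registering a listener such as `[&engine] { engine.element.reset(); }` makes the function return true. In the contrived example above, that is exactly the situation in which `raw->do_something_interesting()` would touch freed memory.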

🔗 CVE-2017-2362

This CVE is a 'real world' example of a similar situation, but we end up with an iterator invalidation instead of a traditional UAF.

The vulnerable code, which runs on resetting an HTML form:

C++
void HTMLFormElement::reset()
{
    ...

    for (auto& associatedElement : m_associatedElements) {
        if (is<HTMLFormControlElement>(*associatedElement))
            downcast<HTMLFormControlElement>(*associatedElement).reset();
    }

    ...
}

/Source/WebCore/html/HTMLFormElement.cpp

This is the kind of innocent looking but subtly dangerous code that often gets complex software into trouble. The surface level simplicity hides critical caveats. Namely:

The for-each loop creates a C++ iterator
.reset() can trigger an Event dispatch (for various mutation events e.g. DOMSubtreeModified)
- and can therefore run arbitrary JavaScript registered as an event listener

To catch this bug in source review, you would have to recognize two things: that an event dispatch is possible inside the loop, and that the arbitrary JavaScript it invokes can violate the invariant that keeps iterators valid, namely that the container being iterated is never modified during iteration.

Let's take a look at how we can abuse this scenario from JavaScript:

javascript
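// assumes the page contains a <form> with an <output id="output"> element inside it
var form = document.querySelector("form");

// register listener for output node within form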
// this will get invoked during the for-each loop when output node is reset
document.getElementById("output").addEventListener('DOMSubtreeModified', function() {
    // append bunch of elements to m_associatedElements vector *during iteration*
    for(var i=0; i<20; i++) {
        form.appendChild(document.createElement("input"));
    }
}, false);

// trigger for-each loop
form.reset();

This leads to the following chain of events:

form.reset starts the for-each loop, creating a vector iterator
- in this case, the vector iterator is a raw pointer into the vector's entries, something like &vector[idx]
In the loop, calling reset() on the output element dispatches a DOMSubtreeModified event
Within the callback, form.appendChild modifies the vector during iteration, invalidating the iterator
- adding enough elements to the vector forces it to be reallocated with a larger size
- this frees the old allocation
- original iterator is now a dangling pointer!
for-each loop continues to use the invalidated iterator pointing at the freed vector allocation, causing a UAF
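
The reallocation step in this chain can be reproduced with a plain `std::vector`. This is a sketch under stated assumptions: WebKit uses its own `WTF::Vector`, and the function names here are hypothetical. The first function iterates by index (which stays valid when the vector grows) and reports whether the backing storage moved; a range-based for loop over the same vector would have been left holding iterators into the old, freed allocation. The second function shows one common mitigation, iterating over a snapshot, without claiming that this is how the real bug was fixed.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Returns true if the vector's backing storage moved during the loop, meaning
// any iterator created before the loop would now point into freed memory.
bool storageMovedDuringIteration(std::vector<int>& elements,
                                 const std::function<void()>& listener) {
    const auto before = reinterpret_cast<std::uintptr_t>(elements.data());
    // Index-based loop: safe even if the listener grows the vector.
    for (std::size_t i = 0; i < elements.size(); ++i) {
        if (i == 0) listener();  // stand-in for reset() dispatching an event
    }
    return reinterpret_cast<std::uintptr_t>(elements.data()) != before;
}

// Mitigation sketch: iterate over a copy, so growth of the original vector
// cannot invalidate the iterators used by the loop.
std::size_t visitSnapshot(const std::vector<int>& elements,
                          const std::function<void()>& listener) {
    const std::vector<int> snapshot = elements;
    std::size_t visited = 0;
    for (int element : snapshot) {
        (void)element;  // a real loop would reset the element here
        listener();
        ++visited;
    }
    return visited;
}
```

A listener that pushes elements until `size()` exceeds the old `capacity()` forces the reallocation, mirroring what the twenty `appendChild` calls do to `m_associatedElements`.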

You can find the original bug report here.

Usually, avoiding these kinds of situations is as easy as using RefPtr when appropriate:

C++
void canDispatchEvent( ... ) {
    // HTMLElement * raw = engine.getElement(...);
    RefPtr<HTMLElement> not_raw = engine.getElement(...);

    dispatchEvent(anEvent, eventTarget);

    // The element can no longer be freed out from under us: not_raw holds a reference
    not_raw->do_something_interesting();
}

Of course, the trick, both for attackers and developers, is intuiting all the scenarios where this is required.
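
For readers who want to run the idea, here is a small self-contained sketch of the same fix using `std::shared_ptr` as a stand-in for `RefPtr`. The `Element` and `Engine` types are hypothetical, and the listener is an ordinary `std::function` rather than JavaScript.

```cpp
#include <functional>
#include <memory>

// Hypothetical stand-in for a DOM element.
struct Element {
    int value = 42;
};

// Hypothetical "engine" that owns the element and one event listener.
struct Engine {
    std::shared_ptr<Element> element = std::make_shared<Element>();
    std::function<void()> listener;  // stand-in for a JavaScript handler

    void dispatchEvent() {
        if (listener) listener();  // synchronous: runs before returning
    }
};

// Takes its own reference before dispatching, so the element outlives the
// callback even if the engine drops its reference in the meantime.
int canDispatchEvent(Engine& engine) {
    std::shared_ptr<Element> protect = engine.element;  // reference count +1
    engine.dispatchEvent();                             // may reset engine.element
    return protect->value;                              // still valid
}
```

Even when the listener calls `engine.element.reset()`, the local `protect` keeps the reference count above zero until the function returns.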

🔗 Weak `this` Pointer

Similar situations can occur implicitly due to the nature of C++ classes. Namely, the this pointer is a raw pointer by default, which can be easy to forget or miss when reading code. (If unfamiliar with C++, this refers to the instance on which a method was invoked, similar to conventional usage of self in Python).

C++
void SomeHTMLElement::someFunctionality() {
    // ... does something

    // event dispatch: potential callbacks into JavaScript, might cause this object to be free'd!
    dispatchEvent(someEventType, someTarget);

    // usage of this object in some way, potential UAF
    some_other_class_method();
}

...

some_element_instance->someFunctionality();

In this case, the this pointer is not reference counted or otherwise protected. If the object is freed during dispatchEvent, every member access that follows goes through a dangling this pointer. As before, a common fix is to protect this by wrapping it in a RefPtr:

C++
void SomeHTMLElement::someFunctionality() {
    RefPtr<Element> protect(this);
    // ... Does something

    // Potential return to JavaScript, this object might get free'd!
    dispatchEvent(someEventType, someTarget);

    // No UAF because *this* RefPtr still in scope
    some_other_class_method();
}
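
The same idea can be tried in standard C++ with `std::enable_shared_from_this`, which is only an analogue of `RefPtr<Element> protect(this)` and not what WebKit uses. `Widget` and its members are hypothetical names, and the sketch assumes the object is already owned by a `std::shared_ptr` (otherwise `shared_from_this()` throws).

```cpp
#include <functional>
#include <memory>

class Widget : public std::enable_shared_from_this<Widget> {
public:
    std::function<void()> listener;  // stand-in for a JavaScript handler
    int state = 7;

    int someFunctionality() {
        // Take a reference to ourselves before anything can call back out.
        std::shared_ptr<Widget> protect = shared_from_this();

        if (listener) listener();  // may drop every other reference to *this

        // Safe: `protect` keeps *this alive until the function returns.
        return state;
    }
};
```

If the listener resets the last outside owner, the object is destroyed only when `protect` goes out of scope, after `state` has been read.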

Although "these types of bugs" haven't been particularly prevalent in more recent years, they do occasionally still happen. The following code diff is from 2021, past the era where these sorts of issues were fairly common:

patch
void HTMLPlugInElement::swapRendererTimerFired()
{
    ASSERT(displayState() == PreparingPluginReplacement);
    if (userAgentShadowRoot())
        return;

    // Create a shadow root, which will trigger the code to add a snapshot container
    // and reattach, thus making a new Renderer.
+   Ref<HTMLPlugInElement> protectedThis(*this);
    ensureUserAgentShadowRoot();
}

source commit: 0ac896ed806ece6a3a9e2ddc40fa6f9265497bfb

🔗 RefCount Manipulation

The main thing to keep in mind when dealing with any reference counting implementation is that we want to "trick" objects into thinking they have 0 references while at least one still exists. The resulting free() and subsequent "dangling pointer" are what allow us to build UAF style exploits from a bad refcount primitive.

In practice, this means finding ways of adding or removing a reference without appropriately updating the object's reference count. This is the actual "vulnerability".

However, we must also be careful to not take additional references while we build our exploit. For example, consider the following JavaScript code:

JavaScript
let p = document.getElementById('parent');
let c = p.firstChild; // IDL WRAPPED!
p.removeChild(c);

The diagram below depicts the refcounting situation:

This code results in a reference to firstChild being stored in JavaScript. Some of the autogenerated code from the IDL will update the refCount in the DOM memory manager to reflect the fact that a reference was taken. In this manner, both browser components are "in sync" with regards to this object.

However, this is a problem when writing exploits. We must have 0 references for our target object to be free'd. Simply accessing p.firstChild immediately creates the IDL-wrapped object, taking a reference (regardless of the fact that our snippet also stored that reference in the variable c). We can avoid this IDL book-keeping by using other methods to remove references:

JavaScript
let p = document.getElementById('parent');
// Use innerHTML to remove child without ref'ing
p.innerHTML = '';

This time, the autogenerated IDL code does not run, so we end up with the following:

As a general rule of thumb, if an API does not return a reference to an object back to JavaScript, it is reasonable to assume that the API will not result in an extra reference being taken because of its use.
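
The difference between the two JavaScript snippets can be modeled with `std::shared_ptr` reference counts. This is only an analogy: real engines track wrappers through generated bindings and garbage collection, and the `Node` type and both function names below are hypothetical.

```cpp
#include <memory>
#include <vector>

// Hypothetical DOM node that owns its children.
struct Node {
    std::vector<std::shared_ptr<Node>> children;
};

// Models `let c = p.firstChild; p.removeChild(c);`
// The returned "wrapper" is an extra reference, so the child stays alive.
std::shared_ptr<Node> removeFirstChildViaWrapper(Node& parent) {
    std::shared_ptr<Node> wrapper = parent.children.front();
    parent.children.erase(parent.children.begin());
    return wrapper;
}

// Models `p.innerHTML = ''`
// No reference is handed back, so the children drop to zero references.
void clearChildren(Node& parent) {
    parent.children.clear();
}
```

Observing each child through a `std::weak_ptr` shows that it survives the first call and is destroyed by the second.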

🔗 Double Fetch / TOCTOU

Double Fetch bugs, sometimes called Time of Check to Time of Use (TOCTOU) bugs, are another relatively common pattern that can arise from Events.

Conceptually, they have a structure similar to our UAF example:

C++
void SomeHTMLElement::someFunction( ... ) {
    if (m_length > 10) {
        // ASSUME m_length > 10
        
        // JavaScript running as a result of the event dispatch
        // may be able to change the value of m_length

        if (event_should_trigger) {
            dispatchEvent(someEvent, someTarget);
        }	
    
        // Now, (m_length > 10) may no longer be true! 
        do_something_with_m_length();
    }
}

These issues arise when developers make assumptions about the mutability of particular variables. Since Events can cause arbitrary JavaScript to run, there are many opportunities to "pull the rug out from under" other code that makes incorrect assumptions about mutability.
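
One defensive shape is to re-validate after anything that can call back into script. The sketch below is illustrative only: `Buffer`, its `listener` member, and `readEleventhElement` are hypothetical, and the listener is a `std::function` standing in for a JavaScript handler.

```cpp
#include <functional>
#include <vector>

struct Buffer {
    std::vector<int> data;
    std::function<void(Buffer&)> listener;  // stand-in for a JS event handler
};

// Checks the length, dispatches, then checks again before using index 10.
// Returns -1 when the buffer is too short at either point.
int readEleventhElement(Buffer& buf) {
    if (buf.data.size() <= 10) return -1;  // time of check

    if (buf.listener) buf.listener(buf);   // callback may shrink buf.data

    if (buf.data.size() <= 10) return -1;  // re-validate after the callback
    return buf.data[10];                   // time of use
}
```

Without the second check, a listener that clears `data` would turn the final line into an out-of-bounds read.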

🔗 CVE-2015-1291

The following snippet shows a real world example of this type of bug. It happens to follow our template quite closely as well:

C++
void ContainerNode::parserRemoveChild(Node& oldChild)
{
    // Grab two siblings (ASSUME they will stay next to each other in the DOM tree)
    Node* prev = oldChild.previousSibling();
    Node* next = oldChild.nextSibling();
    
    if (oldChild.connectedSubframeCount())
        ChildFrameDisconnector(oldChild).disconnect(); // --> dispatch unload Event

    // ...
    // < JavaScript can move `next` and `prev`, reparent `oldChild`>
    //	

    // Use the two sibling nodes to perform an action
    removeBetween(prev, next, oldChild);

    // ...
}

You can read more details in the bug report.

🔗 DOM Event Vulnerability Exercise

[open exercise]

🔗 Auxiliary Event Knowledge

In this section, we will give some background on how Events are implemented. This will help provide valuable context when looking at potential bugs.

Specifically, we will take a look at:

Event Dispatching Methods
Observability

🔗 Event Dispatching

There are two strategies that can be used for Event Dispatch: synchronous and asynchronous.

When Events are dispatched:

Synchronously
- JavaScript Event Handlers run "right after" being dispatched
- (This has been the case in our examples so far)
Asynchronously
- The "dispatch event" action is placed onto a Task Queue, but does not fire immediately.

🔗 Synchronous Event Dispatch

Synchronous Event Dispatch works "in lockstep" with when Events are triggered / dispatched. Consider the following code:

C++
    // do_something

    dispatchEvent(someEventType, someTarget);

    // do_something_else

With synchronous dispatch, the following occurs:

do_something runs
someEventType is dispatched
Execution "pauses" while the JavaScript for someTarget's event listener runs
do_something_else runs
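
The four steps above can be sketched by recording the order in which things happen. `runSynchronously` is a hypothetical function, and the listener is a plain `std::function` standing in for a JavaScript handler.

```cpp
#include <functional>
#include <string>
#include <vector>

using Log = std::vector<std::string>;

// Records the execution order around a synchronous dispatch.
Log runSynchronously(const std::function<void(Log&)>& listener) {
    Log log;
    log.push_back("do_something");
    listener(log);  // synchronous dispatch: the handler runs right here
    log.push_back("do_something_else");
    return log;
}
```

Whatever the listener appends to the log always lands between the two entries written by the function itself.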

In fact, we subtly relied on this synchronous behavior when considering UAF and double-fetch bugs earlier in this module.

C++
// store raw pointer, validate some property etc

dispatchEvent(someEventType, someTarget); // --> This way to JavaScript-Land

//  [
//  | < JavaScript runs and breaks all sorts of assumptions before returning here >
//  [

// Back from JavaScript!!
// Use the now-dangling raw pointer, the now-invalid property, etc

However, Asynchronous event dispatch does not allow for this scenario.

🔗 Asynchronous Event Dispatch

Let's change dispatchEvent to use Asynchronous Event Dispatch:

C++
    // do_something 

    asyncDispatchEvent(someEventType, someTarget);

    // do_something_else

Now, we get different behavior which defers the execution of Event handlers.

do_something runs
dispatch(someEventType) is placed onto a Task Queue
do_something_else runs, the function finishes
someEventType is dispatched, JavaScript handlers run, etc.
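
A toy task queue makes the deferred ordering concrete. This is a sketch, not how any browser's event loop is implemented: `TaskQueue`, `post`, `drain`, and `someFunction` are hypothetical names.

```cpp
#include <functional>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical single-threaded task queue.
struct TaskQueue {
    std::queue<std::function<void()>> tasks;

    void post(std::function<void()> task) { tasks.push(std::move(task)); }

    // Runs queued tasks in FIFO order, as a later turn of an event loop would.
    void drain() {
        while (!tasks.empty()) {
            auto task = std::move(tasks.front());
            tasks.pop();
            task();
        }
    }
};

// The handler is queued, not run: the function finishes before it fires.
void someFunction(TaskQueue& queue, std::vector<std::string>& log) {
    log.push_back("do_something");
    queue.post([&log] { log.push_back("listener"); });  // deferred dispatch
    log.push_back("do_something_else");
}
```

After `someFunction` returns, the log holds only its own two entries; the listener's entry appears once the caller invokes `queue.drain()`.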

Now, while the function is still running, there's no way for JavaScript to:

Mutate state
Cause UAF scenarios in the middle of the function
Cause double-fetch scenarios

🔗 Event Dispatch - Which to use?

From this initial look, it may seem like asynchronous events should always be used, as they avoid some potential bug classes. However, whether Events are dispatched synchronously or asynchronously is defined by the DOM specification.

This forces browser engines to implement both methods and choose the appropriate one based on the Event Type. For example, MutationEvents are dispatched synchronously, as seen here.

In practice, the actual implementation is up to the browser vendors so the details can vary from our idealized explanation, but the general idea applies.

🔗 Event Observability

As mentioned in the previous section, browsers implement event dispatching algorithms in different ways. This raises the question of how each browser ensures that it will produce the same results as the others.

This is handled by the concept of observability. From the perspective of JavaScript, things should look "the same" no matter the browser. Browser vendors use this principle of perspective combined with the DOM specification to implement their algorithms in a consistent way.

As a side effect, this forces some events to be synchronous when it would be safer to handle them asynchronously.

🔗 Non-Bugs

Take a look at the following WebKit code.

Is this code vulnerable?
If so, how do you trigger the bug?

C++
ExceptionOr<Ref<Text>> Text::splitText(unsigned offset)
{
    if (offset > length())
        return Exception { IndexSizeError };

    EventQueueScope scope;
    auto oldData = data();
    auto newText = virtualCreate(oldData.substring(offset));
    setDataWithoutUpdate(oldData.substring(0, offset));

    dispatchModifiedEvent(oldData);

    if (auto* parent = parentNode()) {
        auto insertResult = parent->insertBefore(newText, nextSibling());
        if (insertResult.hasException())
            return insertResult.releaseException();
    }

    document().textNodeSplit(*this);

    if (renderer())
        renderer()->setTextWithOffset(data(), 0, oldData.length());
    return newText;
}

/Source/WebCore/dom/Text.cpp

This looks a lot like one of our vulnerability patterns...

At least 2 synchronous events dispatched
- dispatchModifiedEvent(oldData)
  - Emits DOMCharacterDataModified Event
- parent->insertBefore(newText, nextSibling())
  - Emits DOMSubtreeModified Event
  - Likely emits others (DOMNodeInserted)

Use of "unprotected/raw" this pointer afterwards
- document().textNodeSplit(*this)

However, near the start of the function, we also have EventQueueScope scope. While scope is "alive", events are queued for deferred dispatch (assuming they are dispatched as queued events).

C++
class EventQueueScope {
public:
    EventQueueScope() { ScopedEventQueue::singleton().incrementScopingLevel(); }
    ~EventQueueScope() { ScopedEventQueue::singleton().decrementScopingLevel(); }
};


void ScopedEventQueue::decrementScopingLevel()
{
    ASSERT(m_scopingLevel);
    --m_scopingLevel;
    if (!m_scopingLevel)
        dispatchAllEvents();
}

/Source/WebCore/dom/ScopedEventQueue.h

Events will not get dispatched until all EventQueueScope instances have gone out of scope.
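
The nesting behavior can be tried with a simplified analogue. This sketch is not WebKit's implementation: `ScopedQueue` and `QueueScope` are hypothetical, the queue is passed by reference instead of being a singleton, and events are plain `std::function` callbacks.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Hypothetical, simplified analogue of a scoped event queue.
class ScopedQueue {
public:
    void enqueueOrDispatch(std::function<void()> event) {
        if (m_scopingLevel == 0) {
            event();  // no scope active: dispatch immediately
            return;
        }
        m_pending.push_back(std::move(event));  // scope active: defer
    }

    void incrementScopingLevel() { ++m_scopingLevel; }

    void decrementScopingLevel() {
        if (--m_scopingLevel == 0) dispatchAllEvents();
    }

private:
    void dispatchAllEvents() {
        auto pending = std::move(m_pending);
        m_pending.clear();
        for (auto& event : pending) event();
    }

    unsigned m_scopingLevel = 0;
    std::vector<std::function<void()>> m_pending;
};

// RAII guard: events are deferred until the outermost guard is destroyed.
class QueueScope {
public:
    explicit QueueScope(ScopedQueue& queue) : m_queue(queue) {
        m_queue.incrementScopingLevel();
    }
    ~QueueScope() { m_queue.decrementScopingLevel(); }
    QueueScope(const QueueScope&) = delete;
    QueueScope& operator=(const QueueScope&) = delete;

private:
    ScopedQueue& m_queue;
};
```

With an outer and an inner `QueueScope` alive, leaving the inner scope dispatches nothing; both queued events fire only when the outer scope ends.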

It's easy to miss small things like this that "block" vulnerabilities (especially in unfamiliar parts of a codebase). BUT, it's just as easy for developers to forget them too.

🔗 Key Takeaways

"Jumping back to JavaScript" can be dangerous!
- Pay special attention to synchronous events.

There are many events
- And just as many actions that might trigger one, causing JS execution
- Not always obvious that a particular function can cause an event dispatch

Most "low hanging fruit" has been found and fixed
- Via RefPtrs, using asynchronous event dispatch, "scope aware" dispatch, etc