
That approach destroys and rebuilds the entire DOM on every render. This would not work for large compound components and apps.


I've always wondered why this notion is so popular (is it just because of what React does)? Wouldn't the native browser be expected to handle a DOM re-render much more efficiently than a managed JS framework running on top of the browser and emulating the DOM? Maybe in 2013 browsers really sucked at re-renders, but I have to wonder whether the myriad WebKit, Blink, and Gecko developers are so inept at their jobs that a managed framework can somehow figure out what to re-render better than the native browser can.

And yes, I understand that when you program using one of these frameworks you're explicitly pointing out what pieces of state will change and when, but in my professional experience, people just code whatever works and don't pay much attention to that. In the naive case, where developers don't really disambiguate what state changes like they're supposed to, I feel like the browser would probably beat React or any other framework on re-renders every time. Are there any benchmarks or recent blog posts/tech articles that dive into this?


I think the reason the browser is so slow is that every time you mutate something (change an attribute, add or remove an element), the browser rerenders immediately. And this is indeed slow AF. If you batched everything into a DocumentFragment or similar before attaching it to the DOM, then it'd be fast. I don't know how you do that ergonomically, though.
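
For what it's worth, a rough sketch of the DocumentFragment version (the #list element is just a placeholder):

    // Build everything off-document first; none of this triggers layout.
    const fragment = document.createDocumentFragment();
    for (let i = 0; i < 1000; i++) {
      const li = document.createElement('li');
      li.textContent = 'Item ' + i;
      fragment.appendChild(li);
    }
    // One attachment to the live DOM, so the browser lays out once, not 1000 times.
    document.querySelector('#list').appendChild(fragment);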


This is not true. The browser will only repaint when the event loop is empty. A bunch of synchronous DOM updates will result in one rerender.


It's partially true. Layout and repaint are two separate rendering phases. Repaint happens asynchronously, as you point out. But layout (i.e. calculating heights, widths, etc of each box) is more complicated. If the browser can get away with it, it will batch potential layout changes until directly before the repaint - if you do ten DOM updates in a single tick, you'll get one layout calculation (followed by one repaint).

But if you mix updates and reads, the browser needs to recalculate the layout before the read occurs, otherwise the read may not be correct. For example, if you change the font size of an element and then read the element height, the browser will need to rerun layout calculation between those two points to make sure that the change in font size hasn't updated the element height in the meantime. If these reads and writes are all synchronous, then this forces the layout calculations to happen synchronously as well.

So if you do ten DOM updates interspersed with ten DOM reads in a single tick, you'll now get ten layout calculations (followed by one repaint).

This is called layout thrashing, and it's something that can typically be solved by using a modern framework, or by using a tool like fastdom which helps with batching reads and writes so that all reads always happen before all writes in a given tick.
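
If I remember the fastdom API correctly, the batching looks roughly like this (a sketch; the .box selector is made up):

    import fastdom from 'fastdom';

    const box = document.querySelector('.box');

    // Measure callbacks are run together, then mutate callbacks, once per frame.
    fastdom.measure(() => {
      const height = box.offsetHeight; // read phase
      fastdom.mutate(() => {
        box.style.height = (height * 2) + 'px'; // write phase
      });
    });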


Modern browsers simply set a bit that may lead to repaint after the next layout pass. Things have changed a bit since React was released in 2013.


> I think the reason that the browser is so slow is that every time you mutate something, an attribute or add or remove an element, the browser rerenders immediately.

Is it really immediately? I thought that was a myth.

I thought that, given a top-level function `foo()` which calls `bar()` which calls `baz()` which makes 25 modifications to the DOM, the DOM is only rerendered once when `foo` returns, i.e. when control returns from user code.

I do know that making changes to the DOM and then immediately entering a while(1) loop doesn't show any change on the page.


Yes and no.

The browser will, as much as it can, batch together DOM changes and perform them all at once. So if `baz` looks like this:

    // Ten writes and no reads: the browser batches these and runs layout once.
    for (let i = 0; i < 10; i++) {
      elem.style.fontSize = (i + 20) + 'px';
    }
Then the browser will only recalculate the size of `elem` once, as you point out.

But if we read the state of the DOM, then the browser still needs to do all the layout calculations before it can do that read, so we break that batching effect. This is the infamous layout thrashing problem. So this would be an example of bad code:

    // Each offsetHeight read forces the pending style write to be applied and laid out first.
    for (let i = 0; i < 10; i++) {
      elem.style.fontSize = (i + 20) + 'px';
      console.log(elem.offsetHeight);
    }
Now, every time we read `offsetHeight`, the browser sees that it has a scheduled DOM modification to apply, so it has to apply that first, before it can return a correct value.

This is the reason that libraries like fastdom (https://github.com/wilsonpage/fastdom) exist - they help ensure that, in a given tick, all the reads happen first, followed by all the writes.

That said, I suspect even if you add a write followed by a read to your `while(1)` experiment, it still won't actually render anything, because painting is a separate phase of the rendering process, which always happens asynchronously. But that might not be true, and I'm on mobile and can't test it myself.


> Now, every time we read `offsetHeight`, the browser sees that it has a scheduled DOM modification to apply, so it has to apply that first, before it can return a correct value.

That makes perfect sense, except that I don't understand how using a shadow DOM helps in this specific case (A DOM write followed immediately by a DOM read).

Won't the shadow DOM have to perform the same calculations if you modify it and then immediately use a calculated value for the next modification?

I'm trying to understand how exactly a shadow DOM can perform the calculations after modifications faster than the real DOM can.


The shadow DOM doesn't help at all here, that's mainly about scope and isolation. The (in fairness confusingly named) virtual DOM helps by splitting up writes and reads.

The goal when updating the DOM is to do all the reads in one batch, followed by all the writes in a second batch, so that they never interleave, and so that the browser can be as asynchronous as possible. A virtual DOM is just one way of batching those writes together.

It works in two phases: first, you work through the component tree, and freely read anything you want from the DOM, but rather than make any updates, you instead build a new data structure (the VDOM), which is just an internal representation of what you want the DOM to look like at some point in the future. Then, you reconcile this VDOM structure with the real DOM by looking to see which attributes need to be updated and updating them. By doing this in two phases, you ensure that all the reads happen before all the writes.
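
As a toy illustration (nothing like React's actual implementation, just the shape of the two phases):

    // Phase 1: describe the desired UI as plain data; no DOM writes happen here.
    function render(state) {
      return { tag: 'button', attrs: { class: state.active ? 'on' : 'off' }, text: 'Toggle' };
    }

    // Phase 2: reconcile that description against the real element, writing only what changed.
    function reconcile(el, vnode) {
      for (const [name, value] of Object.entries(vnode.attrs)) {
        if (el.getAttribute(name) !== value) el.setAttribute(name, value);
      }
      if (el.textContent !== vnode.text) el.textContent = vnode.text;
    }

Because phase 1 never touches the real DOM, any reads it does can't interleave with the writes that phase 2 performs.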

There are other ways of doing this. SolidJS, for example, just applies all DOM mutations asynchronously (or at least, partially asynchronously, I think using microtasks), which avoids the need for a virtual DOM. I assume Svelte has some similar setup, but I'm less familiar with that framework. That's not to say that virtual DOM implementations aren't still useful, just that they are one solution with a specific set of tradeoffs - other solutions to layout thrashing exist. (And VDOMs have other benefits beyond just avoiding layout thrashing.)

So to answer your question: the virtual DOM helps because it separates reads and writes from each other. Reads happen on the real DOM, writes happen on the virtual DOM, and it's only at the end of a given tick that the virtual DOM is reconciled with the real DOM, and the real DOM is updated.


I'm gonna apologise in advance for being unusually obtuse this morning. I'm not trying to be contentious :-)

> So to answer your question: the virtual DOM helps because it separates reads and writes from each other. Reads happen on the real DOM, writes happen on the virtual DOM, and it's only at the end of a given tick that the virtual DOM is reconciled with the real DOM, and the real DOM is updated.

I still don't understand why this can't be done (or isn't currently done) by the browser engine on the real DOM.

I'm sticking to the example given: write $FOO to the DOM, causing $BAR (which is calculated from $FOO) to change to $BAZ.

Using a VDOM, if you're performing all the reads first, then the read gives you $BAR (the value prior to the change).

Doing it on the real DOM, the read will return $BAZ. Obviously $BAR is different from $BAZ, due to the writing of $FOO to the DOM.

If this is acceptable, then why can't the browser engine cache all the writes to the DOM and only perform them at the end of the given tick, while performing all the reads synchronously? You'll get the same result as using the VDOM anyway, but without the overhead.


No worries, I hope I'm not under/overexplaining something!

The answer here is the standard one though: if you write $FOO to DOM, then read $BAR, it has to return $BAZ because it always used to return $BAZ, and we can't have breaking changes. All of the APIs are designed around synchronously updating the DOM, because asynchronous execution wasn't really planned in at the beginning.

You could add new APIs that do asynchronous writes and synchronous reads, but I think in practice this isn't all that important for two reasons:

Firstly, it's already possible to separate reads from writes using microtasks and other existing APIs for forcing asynchronous execution. There's even a library (fastdom) that gives you a fairly easy API for separating reads and writes.

Secondly, there are other reasons to use a VDOM or some other DOM abstraction layer, and they usually have different tradeoffs. People will still use these abstractions even if the layout thrashing issue were solved completely somehow. So practically, it's more useful to provide the low-level generic APIs (like microtasks) and let the different tools and frameworks use them in different ways. I think there's also not a big push for change here: the big frameworks are already handling this issue fine and don't need new APIs, and smaller sites or tools (including the micro-framework that was originally posted) are rarely so complicated that they need these sorts of solutions. So while this is a real footgun that people can run into, it's not possible to remove it without breaking existing websites, and it's fairly easy to avoid if you do run into it and it starts causing problems.
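
To make the first point concrete, separating writes with a microtask can be as simple as something like this (the scheduleWrite helper and the .box selector are made up for illustration):

    const pendingWrites = [];

    function scheduleWrite(fn) {
      if (pendingWrites.length === 0) {
        // Flush once the current synchronous code (and all of its reads) has finished.
        queueMicrotask(() => {
          pendingWrites.forEach(write => write());
          pendingWrites.length = 0;
        });
      }
      pendingWrites.push(fn);
    }

    // Reads stay synchronous; the write lands at the end of the tick, after all the reads.
    const box = document.querySelector('.box');
    const height = box.offsetHeight;
    scheduleWrite(() => { box.style.height = (height + 10) + 'px'; });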


As I understand it, there are a couple of issues with naively rerendering DOM elements like this.

Firstly, the DOM is stateful, even in relatively simple cases, which means destroying and recreating a DOM node can lose information. The classic example is a text input: if you have a component with a text input, and you want to rerender that component, you need to make sure that the contents of the text input, the cursor position, any validation state, the focus, etc, are all the same as they were before the render. In React and other VDOM implementations, there is some sort of `reconcile` function that compares the virtual DOM to the real one, and makes only the changes necessary. So if there's an input field (that may or may not have text in it) and the CSS class has changed but nothing else, then the `reconcile` function can update that class in-place, rather than recreate it completely.
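
A quick way to see the difference (a sketch, assuming an #app container):

    const app = document.querySelector('#app');

    // Naive rerender: the old input is destroyed, so its value, cursor and focus are lost.
    app.innerHTML = '<input class="invalid" placeholder="email">';

    // Reconcile-style update: the existing input survives; only the changed attribute is written.
    const input = app.querySelector('input');
    if (input.className !== 'invalid') input.className = 'invalid';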

In frameworks which don't use a virtual DOM, like SolidJS or Svelte, rerendering is typically fine-grained from the start, in the sense that each change to state is mapped directly to a specific DOM mutation that changes only the relevant element. For example in SolidJS, if updating state would change the CSS class, then we can link those changes directly to the class attribute, rather than recreating the whole input field altogether.
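
Not Solid's actual internals, but the fine-grained idea looks roughly like this toy version, where a piece of state knows exactly which DOM write depends on it (the button lookup is a placeholder):

    // A tiny signal: setting the value runs only the subscribers that depend on it.
    function createSignal(initial) {
      let value = initial;
      const subscribers = new Set();
      const read = () => value;
      read.subscribe = (fn) => subscribers.add(fn);
      const write = (next) => { value = next; subscribers.forEach(fn => fn(next)); };
      return [read, write];
    }

    const button = document.querySelector('button');
    const [cls, setCls] = createSignal('off');
    cls.subscribe((next) => { button.className = next; });

    setCls('on'); // no component rerender, no diffing: one targeted attribute write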

The second issue that often comes with doing this sort of rerendering naively is layout thrashing. Rerendering is expensive in the browser not because it's hard to build a tree of DOM elements, but because it's hard to figure out the correct layout of those elements (i.e. given the contents, the padding, the surrounding elements, positioning, etc., how many pixels high will this div be?). As a result, if you make a change to the DOM, the browser typically won't recalculate the layout immediately, and instead batches changes together asynchronously so that layout gets calculated less often.

However, if I mix reads and writes together (e.g. update an element class and then immediately read the element height), then I force the layout calculation to happen synchronously. Worse, if I'm doing reads and writes multiple times in the same tick of the Javascript engine, then the browser has to make changes, recalculate the layout, return the calculated value, then immediately throw all the information away as I update the DOM again somewhere else. This is called layout thrashing, and is usually what people are talking about when they talk about bad DOM performance.
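
Concretely, the difference between the thrashing version and the batched version is just the ordering of reads and writes (the .card selector is made up):

    const items = [...document.querySelectorAll('.card')];

    // Thrashing: after the first write, every offsetHeight read forces a fresh layout pass.
    items.forEach((item) => {
      item.style.width = item.offsetHeight + 'px';
    });

    // Batched: all reads first (one layout pass), then all writes.
    const heights = items.map((item) => item.offsetHeight);
    items.forEach((item, i) => {
      item.style.width = heights[i] + 'px';
    });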

The advantage of VDOM implementations like React is that they can update everything in one fell swoop - there is no thrashing because the DOM gets updated at most once per tick. All the reads are looking at the same DOM state, so things don't need to be recalculated every time. I'm not 100% sure how Svelte handles this issue, but in SolidJS, DOM updates happen as part of the `createRenderEffect` phase, which happens asynchronously after all DOM reads for a given tick have occurred.

OP's framework is deliberately designed to be super simple, and for basic problems will be completely fine, but it does run into both of the problems I mentioned. Because the whole component is rerendered every time `html` is called, any previous DOM state will immediately be destroyed, meaning that inputs (and other stateful DOM elements) will behave unexpectedly in various situations. And because the rendering happens synchronously with an `innerHTML` assignment, it is fairly easy to run into situations where multiple components interleave synchronous reads and writes, when it would be better to do all of the reads together, followed by all of the writes.


Thanks for the info! This all makes sense to me intuitively, but I know I've been bitten in the butt several times by implementing clever caching schemes or similar that end up slowing the app down more than they speed it up. It seems like it would be simple enough to set up a test case and benchmark this (you wrote a simple for loop above that should exhibit this behavior). I'm curious how much, if anything, React actually ends up saving in cycles when a programmer writes the same code naively in React and naively in the browser. I think it would make for some interesting benchmarks at least :)



