From Canvas to DOM

For around seven years at my previous company, we built and maintained a template builder based on GrapesJS. It supported both HTML website templates and MJML email templates.

At first glance, building a web builder sounds simple:

create some HTML blocks
drag and drop them
style them
export HTML

But once real users start using the system, everything becomes much harder.

For most of those years, other responsibilities kept me from coding on the builder every day, but as a team lead I was constantly involved in architecture decisions, user feedback, technical limitations, and all the pain that came with them.

Over time, our system became much bigger than a simple GrapesJS setup. We had built a huge layer around it:

custom block manager
block registry
integration with CKEditor
event management system
large styling abstractions
different renderers for HTML and MJML
and many other custom systems

Recently, while exploring modern website builders, I started thinking about an old question again: why is the industry slowly moving toward canvas-like mental models? And more importantly, what does canvas actually solve?

My current answer is this: canvas is often a better interaction layer, but not necessarily a better final output layer. That distinction matters a lot.

DOM Is Still King

Before talking about canvas-based systems, I think we need to accept one important thing: DOM is still the core primitive of the web.

DOM-based tools such as GrapesJS still make a lot of sense for many companies, especially if a website builder is not the core product.

This approach has many advantages:

you can build and launch an MVP quickly
onboarding frontend engineers is usually easier
everything is based on real HTML and CSS
output is more semantic
responsive behavior feels more natural
complexity is usually lower in the beginning

You are building directly on top of the browser instead of trying to replace it. But things start getting ugly once you want Figma-like interactions.

Where Everything Starts Falling Apart

As long as your builder only has simple blocks, things look fine. But once you try to support advanced interactions, you enter a completely different world:

selection
transforms
snapping
zooming
hit-testing
guides
freeform interactions

These sound like small features. They are not. A feature like zooming can easily become a nightmare.

Users expect:

smooth zooming
accurate selections
correct drag & drop behavior
stable coordinates
predictable interactions

Implementing this correctly on top of DOM is much harder than it looks.
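To make "stable coordinates" concrete, here is a minimal sketch of the math behind zooming around the pointer. The `Viewport` shape and the function names are assumptions made for illustration, not taken from GrapesJS or any other library. The idea is that the point under the cursor must stay under the cursor after the zoom, which means scale and offset have to change together.

```typescript
// Hypothetical viewport model: a world point p appears on screen at p * scale + offset.
interface Viewport { scale: number; offsetX: number; offsetY: number }
interface Point { x: number; y: number }

// Convert a screen position (e.g. a pointer event) into world coordinates.
function screenToWorld(v: Viewport, p: Point): Point {
  return { x: (p.x - v.offsetX) / v.scale, y: (p.y - v.offsetY) / v.scale };
}

// Convert a world position back into screen coordinates.
function worldToScreen(v: Viewport, p: Point): Point {
  return { x: p.x * v.scale + v.offsetX, y: p.y * v.scale + v.offsetY };
}

// Zoom by `factor` while keeping the world point under the pointer fixed on screen.
function zoomAt(v: Viewport, pointer: Point, factor: number): Viewport {
  const anchor = screenToWorld(v, pointer);
  const scale = v.scale * factor;
  return {
    scale,
    offsetX: pointer.x - anchor.x * scale,
    offsetY: pointer.y - anchor.y * scale,
  };
}
```

On a canvas this is the whole story, because you own the transform. On top of the DOM, the same math has to coexist with scrolling, nested containers, and layout that the browser recomputes on its own.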

We eventually realized that many interaction-heavy features fight against the natural behavior of the web itself.

The web is built around:

document flow
flexbox
grid
responsive layouts

But users want to:

freely move elements
overlap layers
zoom infinitely
drag objects anywhere

And these two mental models do not always work nicely together.

Moving one image may accidentally:

break a flex layout
move nearby text
destroy spacing
affect nested containers

This is exactly where the industry slowly started moving toward a different mental model.

Canvas-Like Mental Models

Tools like Figma and similar design tools changed something important. They did not replace the web. They changed the interaction model.

Traditional DOM-based builders are usually code-first. You work with:

sections
divs
flexbox
nested layouts
document structure

But canvas-based tools feel more like a design tool. You draw a box. You move it anywhere. You overlap elements. You zoom. You rotate.

The system later tries to figure out how to turn that design into something the browser can render.

This is a very important difference:

DOM-based systems are usually structure-first
Canvas-based systems are usually interaction-first

One feels closer to programming. The other feels closer to design. And for many users, especially less technical ones, interaction-first is simply easier.

Why Canvas-Based Tools Feel So Good

Once you have used tools like this for a few hours, you notice it. The experience feels smoother.

zooming feels natural
dragging feels better
overlapping elements is easy
interactions feel more predictable
users feel less like they are coding

And this is one of the biggest reasons modern builders became popular. But there is also a big misunderstanding here. Canvas-based tools did not magically solve complexity. They mostly moved complexity into a different layer.

A Small Example: Cropper Interactions

An image cropper is a good example for making this concrete. At first glance, a cropper looks simple:

select an area
resize it
move it around
export the result

But under the hood, it contains many of the same hard problems that exist inside modern web builders:

selection
resizing
transforms
pointer events
hit-testing
boundaries
coordinate systems
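The "boundaries" item in the list above can be sketched in a few lines. This is an illustrative sketch only: the `Rect` shape and the function names are assumptions and do not reflect Pikaso's actual API. It assumes the crop box starts inside the bounds with room for the minimum size.

```typescript
interface Rect { x: number; y: number; width: number; height: number }

const clamp = (value: number, min: number, max: number): number =>
  Math.min(Math.max(value, min), max);

// Move the crop box by (dx, dy) but keep it fully inside the image bounds.
function moveCrop(crop: Rect, bounds: Rect, dx: number, dy: number): Rect {
  return {
    ...crop,
    x: clamp(crop.x + dx, bounds.x, bounds.x + bounds.width - crop.width),
    y: clamp(crop.y + dy, bounds.y, bounds.y + bounds.height - crop.height),
  };
}

// Drag the bottom-right handle to the pointer position.
// The box never shrinks below minSize and never leaves the bounds.
function resizeCropFromBottomRight(
  crop: Rect,
  bounds: Rect,
  pointer: { x: number; y: number },
  minSize = 20,
): Rect {
  const right = clamp(pointer.x, crop.x + minSize, bounds.x + bounds.width);
  const bottom = clamp(pointer.y, crop.y + minSize, bounds.y + bounds.height);
  return { ...crop, width: right - crop.x, height: bottom - crop.y };
}
```

Even this toy version has to agree on one coordinate system for the pointer, the crop box, and the image. A real cropper adds the other seven handles, aspect-ratio locking, and rotation on top.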

Here is a small inline demo built with Pikaso:

[Live demo: cropper interactions, built with Pikaso]

The important thing here is that canvas is not the final output. Canvas is simply a much better environment for interaction-heavy UI. That becomes much more interesting once you stop treating the editor and the renderer as the same thing.

What If Canvas Is Only the Interaction Layer?

This is the part I keep coming back to: maybe one of the biggest mistakes many older web builders made was treating the editor and renderer as the same thing.

In many DOM-based builders, the thing you drag on the screen is directly tied to the final DOM structure.

What changes once these two layers are separated? What if:

the editor is only responsible for interactions
and another layer later converts the result into real DOM

This separation is already part of how more modern builder systems tend to work. I built a small prototype with Pikaso mostly to make that architecture easier to see.

The top section is a canvas-based artboard. Changes are serialized in real time. The bottom section renders the same scene as real HTML.

[Live demo: canvas artboard on top, the same scene rendered as HTML below]

This is the important part. The canvas is not the final product. It is only the interaction layer, and the renderer is a separate system.
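A rough sketch of what such a separate renderer could look like is below. The scene schema here is hypothetical and much simpler than what a real editor (or Pikaso's own JSON export) would produce. The point is only that the renderer consumes serialized data and knows nothing about the canvas.

```typescript
// Hypothetical serialized scene, invented for this example.
type SceneNode =
  | { type: "text"; x: number; y: number; width: number; height: number; text: string }
  | { type: "image"; x: number; y: number; width: number; height: number; src: string; alt: string };

interface Scene { width: number; height: number; nodes: SceneNode[] }

// Escape user-provided strings before placing them into markup.
const escapeHtml = (s: string): string =>
  s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;").replace(/"/g, "&quot;");

// Every node becomes an absolutely positioned box inside the artboard.
function boxStyle(n: SceneNode): string {
  return `position:absolute;left:${n.x}px;top:${n.y}px;width:${n.width}px;height:${n.height}px`;
}

function renderNode(n: SceneNode): string {
  switch (n.type) {
    case "text":
      return `<p style="${boxStyle(n)}">${escapeHtml(n.text)}</p>`;
    case "image":
      return `<img style="${boxStyle(n)}" src="${escapeHtml(n.src)}" alt="${escapeHtml(n.alt)}">`;
  }
}

// The artboard becomes a relatively positioned container for its nodes.
function renderScene(scene: Scene): string {
  const children = scene.nodes.map(renderNode).join("");
  return `<div style="position:relative;width:${scene.width}px;height:${scene.height}px">${children}</div>`;
}
```

Note that this naive renderer emits absolutely positioned elements only. It is the easiest mapping from a freeform scene, and it is also the source of several of the trade-offs discussed next.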

That, to me, is the real shift. The industry did not move from DOM to canvas as much as it moved from structure-first editing to interaction-first editing.

But Canvas Systems Have Their Own Problems

This is the part people usually ignore. Canvas-based systems are beautiful, but they come with serious trade-offs.

1. Final output is usually less semantic

Once you move toward freeform layouts, the final output often becomes heavily absolute-positioned.

This can hurt:

accessibility
SEO
maintainability

2. Responsive design becomes harder

Canvas naturally thinks in fixed spaces. The web does not. The web is fluid by nature.

This is why modern builder tools had to introduce:

constraints
stacks
auto-layout
breakpoints

Interestingly, the industry ended up coming back to many original web layout concepts, just with better UX around them.
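As an illustration of what "constraints" means here, the sketch below resolves a child's horizontal position when its parent frame is resized. The constraint names are assumptions loosely modeled on what design tools expose. They are not any specific product's API.

```typescript
// Hypothetical horizontal constraints for a child inside a resizable frame.
type HConstraint = "left" | "right" | "left-right" | "scale";

interface Child { x: number; width: number; constraint: HConstraint }

// Recompute x and width when the parent goes from oldWidth to newWidth.
function resolveHorizontal(
  child: Child,
  oldWidth: number,
  newWidth: number,
): { x: number; width: number } {
  // Distance between the child's right edge and the parent's right edge.
  const rightGap = oldWidth - (child.x + child.width);
  switch (child.constraint) {
    case "left": // pinned to the left edge, size unchanged
      return { x: child.x, width: child.width };
    case "right": // pinned to the right edge, size unchanged
      return { x: newWidth - rightGap - child.width, width: child.width };
    case "left-right": // both gaps fixed, so the child stretches
      return { x: child.x, width: Math.max(0, newWidth - child.x - rightGap) };
    case "scale": { // position and size scale with the parent
      const ratio = newWidth / oldWidth;
      return { x: child.x * ratio, width: child.width * ratio };
    }
  }
}
```

Seen this way, "left-right" behaves much like a block element with fixed margins, and "scale" behaves much like percentage widths, which is the sense in which these tools rediscovered web layout.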

3. Dynamic content becomes dangerous

Static landing pages are easy. Real products are not. Once you introduce:

CMS data
localization
async content
user-generated text

many canvas compositions start breaking apart.

A slightly longer title can destroy an entire layout. A card that looked perfect in the editor can collapse the moment real content arrives.
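The longer-title case can be shown with a toy model. The `Block` shape and both functions are hypothetical and consider only the vertical axis. The first function shows what happens in a fixed, absolute layout when one block grows, and the second shows what a stack (flow) layout does instead.

```typescript
interface Block { id: string; y: number; height: number }

// Absolute layout: every block keeps its authored y.
// Returns the ids of blocks the grown block now overlaps.
function findOverlaps(blocks: Block[], grownId: string, newHeight: number): string[] {
  const grown = blocks.find((b) => b.id === grownId);
  if (!grown) return [];
  const top = grown.y;
  const bottom = grown.y + newHeight;
  return blocks
    .filter((b) => b.id !== grownId && b.y < bottom && b.y + b.height > top)
    .map((b) => b.id);
}

// Stack layout: blocks flow top to bottom in their authored order,
// so a taller block pushes everything after it further down.
function reflowStack(
  blocks: Block[],
  grownId: string,
  newHeight: number,
  gap: number,
  startY = 0,
): Block[] {
  let cursor = startY;
  return [...blocks]
    .sort((a, b) => a.y - b.y)
    .map((b) => {
      const height = b.id === grownId ? newHeight : b.height;
      const placed: Block = { ...b, y: cursor, height };
      cursor += height + gap;
      return placed;
    });
}
```

In the absolute case the editor has no way to know about the collision, because the real content only arrives at render time.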

4. Building a real rendering engine is extremely hard

This is where complexity explodes. If you want to build a serious canvas-based editor, eventually you enter engine-development territory:

scene graphs
transforms
coordinate systems
virtualization
batching
memory management
hit-testing

And this is exactly why tools like Figma spent years building a custom rendering architecture.
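To give a feel for the smallest possible version of two items from that list, here is a sketch of hit-testing over a scene graph. It is deliberately simplified and rests on stated assumptions: nodes are axis-aligned rectangles, transforms are translation only, and children are not clipped to their parent. The `GraphNode` type is invented for this example.

```typescript
interface GraphNode {
  id: string;
  x: number;             // position relative to the parent
  y: number;
  width: number;
  height: number;
  children: GraphNode[]; // later children are drawn on top of earlier ones
}

// Returns the id of the top-most node under the point (px, py), or null.
function hitTest(
  node: GraphNode,
  px: number,
  py: number,
  originX = 0,
  originY = 0,
): string | null {
  // Accumulate the parent offsets to get this node's absolute position.
  const left = originX + node.x;
  const top = originY + node.y;

  // Check children before the node itself, last-drawn first,
  // so that whatever is visually on top wins.
  for (const child of [...node.children].reverse()) {
    const hit = hitTest(child, px, py, left, top);
    if (hit !== null) return hit;
  }

  const inside =
    px >= left && px < left + node.width &&
    py >= top && py < top + node.height;
  return inside ? node.id : null;
}
```

A production engine replaces the offsets with full transform matrices to support rotation and scale, and adds a spatial index so that it does not walk every node on every pointer move.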

About Pikaso

Back in 2020, I released Pikaso as my first serious open-source project. The original idea behind Pikaso was simple: make interaction-heavy canvas systems easier to build.

Things like:

image croppers
draggable elements
zoomable artboards
selection systems

without rebuilding everything from scratch every time.

Over the years, one thing became clear to me: the same problems that show up in croppers and artboards also show up in modern builders. That is why I keep coming back to this space.

Can We Convert Canvas Back Into Real HTML?

This is the idea I currently find interesting: can we build:

a modern interaction-heavy editor
using a canvas-like mental model
while still generating real DOM output?

On paper, this sounds very attractive. We could have:

smooth interactions
flexible artboards
better UX
and still generate React or HTML later

Pikaso already supports importing and exporting scenes as JSON. So technically, we can:

serialize the scene graph
store positions and styles
and later convert everything into:
- React
- HTML
- MJML
- or something else
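As a sketch of why the target matters, here is a hypothetical converter from a simplified scene to MJML. The `EmailNode` schema is an assumption for illustration and is not Pikaso's export format. The MJML tags used (`mj-section`, `mj-column`, `mj-text`, `mj-image`) are standard MJML components. Because email layouts are stacked rather than freeform, this sketch keeps only the vertical order of nodes and drops their x positions.

```typescript
// Hypothetical, simplified scene nodes for an email target.
type EmailNode =
  | { type: "text"; y: number; text: string }
  | { type: "image"; y: number; src: string; width: number };

// Escape strings before placing them into MJML markup.
const escapeXml = (s: string): string =>
  s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;").replace(/"/g, "&quot;");

// Order nodes top to bottom and emit them as a single stacked column.
function toMjml(nodes: EmailNode[]): string {
  const rows = [...nodes]
    .sort((a, b) => a.y - b.y)
    .map((n) =>
      n.type === "text"
        ? `<mj-text>${escapeXml(n.text)}</mj-text>`
        : `<mj-image src="${escapeXml(n.src)}" width="${n.width}px" />`,
    );
  return `<mjml><mj-body><mj-section><mj-column>${rows.join("")}</mj-column></mj-section></mj-body></mjml>`;
}
```

The conversion is lossy by design: two elements placed side by side on the canvas would end up stacked here. Deciding what to preserve for each target is exactly the job of the conversion layer.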

The idea sounds great in demos. But the real question is whether it actually scales in production, or whether we are simply moving complexity somewhere else.

Honestly, I think the answer is both. You can absolutely build a better editing experience this way, but you do not escape the hard parts. You just decide where they live.

Final Thoughts

If you decide to build a commercial website builder, you are entering one of the hardest trade-off spaces in frontend engineering. There is no perfect answer.

If your product is:

highly dynamic
data-heavy
developer-focused
or deeply integrated with custom logic

DOM-based systems still make a lot of sense.

But if your product focuses more on:

visual storytelling
marketing websites
motion
design freedom
less technical users

then canvas-like systems can provide a much better experience.

If I had to make the point as clearly as possible, it would be this:

DOM is still the best delivery format for the web. Canvas is often the better interaction format for building it.

So no, I do not think the future is canvas instead of DOM. I think the more interesting future is canvas for editing, DOM for output, and a strong conversion layer sitting in between.
