How Browsers Work: From URL to Pixels on Your Screen


The Question That Starts Everything

You open your browser. You type:

https://google.com

You press Enter.

Within a fraction of a second, a fully rendered page appears.

But what actually happened?

This article breaks down the journey from URL to pixels in a way that makes sense—no overwhelming specs, just clear understanding.


What Is a Browser, Really?

A browser is not just "a thing that opens websites."

It's actually:

  • A rendering engine that turns code into visuals
  • A networking client that fetches resources
  • A JavaScript runtime that executes code
  • A user interface you interact with

Think of it as:

A translator between code (HTML, CSS, JS) and what you see on screen.

Browsers don't just display content—they interpret, process, and render it.


Main Components of a Browser (High-Level)

A browser is made of several cooperating parts:

┌─────────────────────────────────────────────────────┐
│                  User Interface                     │
│  (Address bar, tabs, back/forward buttons, etc.)    │
└─────────────────────────────────────────────────────┘
                        ↓
┌─────────────────────────────────────────────────────┐
│                  Browser Engine                     │
│     (Coordinates between UI and rendering)          │
└─────────────────────────────────────────────────────┘
                        ↓
┌──────────────────────┬──────────────────────────────┐
│   Rendering Engine   │      JavaScript Engine       │
│  (Parses HTML/CSS)   │   (Executes JS code)         │
└──────────────────────┴──────────────────────────────┘
                        ↓
┌─────────────────────────────────────────────────────┐
│                    Networking                       │
│        (Fetches HTML, CSS, JS, images, etc.)        │
└─────────────────────────────────────────────────────┘
                        ↓
┌─────────────────────────────────────────────────────┐
│                   Data Storage                      │
│         (Cookies, localStorage, cache, etc.)        │
└─────────────────────────────────────────────────────┘

Each part has a job. They work together like an assembly line.


The User Interface (What You See and Touch)

The User Interface (UI) includes:

  • Address bar
  • Back/forward buttons
  • Bookmarks
  • Tabs
  • Refresh button

Apart from the page itself, this is the only part users interact with directly.

Everything else happens behind the scenes.


Browser Engine vs Rendering Engine

These terms sound similar but mean different things.

Browser Engine

  • Coordinates between the UI and the rendering engine
  • Example: You click "back" → browser engine tells rendering engine to load previous page

Rendering Engine

  • Parses HTML and CSS
  • Builds the visual representation
  • Paints pixels on screen

Examples of rendering engines:

  • Blink (Chrome, Edge, Opera)
  • WebKit (Safari)
  • Gecko (Firefox)

Simple distinction:

Browser engine = conductor
Rendering engine = orchestra


Step 1: Networking — Fetching Resources

When you press Enter after typing a URL:

User types URL
      ↓
DNS resolution (find IP address)
      ↓
Establish TCP connection (plus a TLS handshake for HTTPS)
      ↓
Send HTTP request
      ↓
Receive HTTP response (HTML file)
      ↓
Parse HTML → find CSS, JS, image references
      ↓
Fetch those resources too

The networking layer:

  • Makes HTTP/HTTPS requests
  • Downloads HTML, CSS, JavaScript, images, fonts
  • Handles caching, cookies, redirects

Key point:
The browser doesn't get everything at once. It fetches resources progressively.
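
The first steps above can be sketched in Python. This is a minimal, offline sketch, not how any real browser is implemented: it only splits the URL and builds the text of a plain HTTP/1.1 GET request. The names build_request and DEFAULT_PORTS are made up for this example, and the DNS lookup, TCP connection, and TLS handshake are deliberately left out.

```python
from urllib.parse import urlsplit

# Assumed defaults: the port a client uses when the URL does not name one.
DEFAULT_PORTS = {"http": 80, "https": 443}


def build_request(url):
    """Return (host, port, request_text) for a simple GET of `url`."""
    parts = urlsplit(url)
    host = parts.hostname
    if host is None:
        raise ValueError(f"no host in URL: {url!r}")
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    path = parts.path or "/"          # an empty path means the site root
    if parts.query:
        path += "?" + parts.query
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"                        # blank line ends the headers
    )
    return host, port, request


host, port, request = build_request("https://google.com")
print(host, port)
print(request)
```

A real client would next resolve the host to an IP address, open a connection to (host, port), and wrap it in TLS when the scheme is https before sending the request text.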


Step 2: HTML Parsing and DOM Creation

The browser receives HTML as text:

<!DOCTYPE html>
<html>
  <head>
    <title>Example</title>
  </head>
  <body>
    <h1>Hello World</h1>
    <p>This is a paragraph.</p>
  </body>
</html>

The rendering engine parses this HTML into a tree structure called the DOM (Document Object Model).

HTML → DOM Conversion

HTML Text
    ↓
┌─────────────────────┐
│   HTML Parser       │
└─────────────────────┘
    ↓
DOM Tree:

        Document
           |
         html
           |
      ┌────┴────┐
    head       body
      |          |
    title    ┌───┴───┐
             h1      p

The DOM is:

  • A tree representation of your HTML
  • Something JavaScript can manipulate
  • The structure browsers use internally

Analogy:

HTML is the blueprint.
DOM is the actual building constructed from it.
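
To see the HTML → DOM idea in running code, here is a small sketch built on Python's standard html.parser module. The Node and TreeBuilder classes are hypothetical helpers for this example, not browser APIs, and the sketch assumes well-formed HTML with every tag closed. Real browser parsers also recover from broken markup.

```python
from html.parser import HTMLParser


class Node:
    """A tiny stand-in for a DOM node (illustrative only)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []


class TreeBuilder(HTMLParser):
    """Builds a tree of Node objects while the parser walks the HTML."""

    def __init__(self):
        super().__init__()
        self.root = Node("Document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        self.current = node           # descend into the new element

    def handle_endtag(self, tag):
        # Assumes the end tag matches the element we are currently inside.
        if self.current.parent is not None:
            self.current = self.current.parent

    def handle_data(self, data):
        text = data.strip()
        if text:                      # skip whitespace between tags
            self.current.children.append(Node(f'"{text}"', self.current))


def outline(node, depth=0):
    """Return the tree as indented lines, one per node."""
    lines = ["  " * depth + node.name]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


HTML = """<!DOCTYPE html>
<html>
  <head>
    <title>Example</title>
  </head>
  <body>
    <h1>Hello World</h1>
    <p>This is a paragraph.</p>
  </body>
</html>"""

builder = TreeBuilder()
builder.feed(HTML)
builder.close()
print("\n".join(outline(builder.root)))
```

The printed outline has the same shape as the DOM tree diagram above, with the text inside each element added as a leaf.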


Step 3: CSS Parsing and CSSOM Creation

Similarly, CSS is parsed into the CSSOM (CSS Object Model).

Example CSS:

body {
  font-size: 16px;
}

h1 {
  color: blue;
}

p {
  margin: 10px;
}

CSS → CSSOM Conversion

CSS Text
    ↓
┌─────────────────────┐
│   CSS Parser        │
└─────────────────────┘
    ↓
CSSOM Tree:

               root
                 |
      ┌──────────┼──────────┐
     body        h1         p
 (font-size)  (color)   (margin)

The CSSOM is:

  • A tree of style rules
  • Used to determine how each element should look

Why separate from DOM?
Because HTML defines structure and CSS defines style. Keeping them separate lets the browser parse each one independently and recompute styles without rebuilding the document structure.
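
A rough sketch of the CSS → rules step, using the example CSS above. The parse_css function is a made-up helper that only understands flat "selector { property: value; }" rules. Comments, at-rules, and nested braces are out of scope, and real CSS parsers are far more involved.

```python
import re

# Matches "selector { declarations }" for flat rules without nested braces.
RULE_RE = re.compile(r"([^{}]+)\{([^{}]*)\}")


def parse_css(css):
    """Return a list of (selector, {property: value}) pairs in source order."""
    rules = []
    for selector, body in RULE_RE.findall(css):
        declarations = {}
        for declaration in body.split(";"):
            if ":" in declaration:
                prop, value = declaration.split(":", 1)
                declarations[prop.strip()] = value.strip()
        rules.append((selector.strip(), declarations))
    return rules


CSS = """
body {
  font-size: 16px;
}

h1 {
  color: blue;
}

p {
  margin: 10px;
}
"""

rules = parse_css(CSS)
for selector, declarations in rules:
    print(selector, declarations)
```

Keeping the rules in source order matters because, when two rules of equal specificity set the same property, the later one wins.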


Step 4: Combining DOM and CSSOM → Render Tree

The browser now has:

  • DOM (what to display)
  • CSSOM (how to display it)

Next step: combine them into a Render Tree.

       DOM              CSSOM
        |                 |
        └────────┬────────┘
                 ↓
          Render Tree

    (Only visible elements with styles applied)

          html
           |
         body
           |
      ┌────┴────┐
     h1         p
  (blue text) (10px margin)

Important:
The Render Tree only includes visible elements.

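Elements with display: none are in the DOM but not in the Render Tree. Non-visual elements such as <head> are left out as well. Elements with visibility: hidden still take up space, so they stay in the Render Tree.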
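
Here is one way to picture the combination step in code. It is a simplified sketch under several assumptions: nodes are plain dicts, styles are matched by tag name only, and there is no inheritance or cascade. The build_render_tree function, the NON_VISUAL_TAGS set, and the extra aside rule are hypothetical and exist only to show an element being dropped.

```python
# Tags that never produce a box on screen (a partial, assumed list).
NON_VISUAL_TAGS = {"head", "title", "meta", "link", "script", "style"}


def build_render_tree(node, styles):
    """Return a render node for `node`, or None if it is not displayed."""
    if node["tag"] in NON_VISUAL_TAGS:
        return None
    style = styles.get(node["tag"], {})
    if style.get("display") == "none":
        return None                   # hidden elements and their children are skipped
    children = []
    for child in node.get("children", []):
        rendered = build_render_tree(child, styles)
        if rendered is not None:
            children.append(rendered)
    return {"tag": node["tag"], "style": style, "children": children}


dom = {"tag": "html", "children": [
    {"tag": "head", "children": [{"tag": "title", "children": []}]},
    {"tag": "body", "children": [
        {"tag": "h1", "children": []},
        {"tag": "p", "children": []},
        {"tag": "aside", "children": []},   # hypothetical extra element
    ]},
]}

styles = {
    "body": {"font-size": "16px"},
    "h1": {"color": "blue"},
    "p": {"margin": "10px"},
    "aside": {"display": "none"},           # hypothetical extra rule
}

tree = build_render_tree(dom, styles)
body = tree["children"][0]
print([child["tag"] for child in tree["children"]])
print([child["tag"] for child in body["children"]])
```

The head element and the hidden aside stay in the DOM dict but do not appear in the resulting tree.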


Step 5: Layout (Reflow) — Calculating Positions

Now the browser knows:

  • What to display
  • What styles to apply

But it doesn't yet know where to place things.

Layout (also called Reflow) calculates:

  • Position of each element
  • Size of each element
  • How elements flow together

Render Tree
     ↓
┌─────────────────────┐
│   Layout Engine     │
│  (Calculate boxes)  │
└─────────────────────┘
     ↓
Box Model with positions:

┌────────────────────────┐
│ <html>                 │
│  ┌──────────────────┐  │
│  │ <body>           │  │
│  │  ┌────────────┐  │  │
│  │  │ <h1>       │  │  │
│  │  └────────────┘  │  │
│  │  ┌────────────┐  │  │
│  │  │ <p>        │  │  │
│  │  └────────────┘  │  │
│  └──────────────────┘  │
└────────────────────────┘
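
The box calculation can be sketched as a small recursive function. This is a toy version of block layout under loud assumptions: every box is a block that fills its parent's width, leaf boxes have a fixed LINE_HEIGHT, margins are a single number applied on all sides, and margins never collapse. The layout function and its dict fields are invented for this example.

```python
LINE_HEIGHT = 20  # assumed height of a box that has no children


def layout(box, x, y, width):
    """Fill in x, y, width, and height for `box` and all its children."""
    margin = box.get("margin", 0)
    box["x"] = x + margin
    box["y"] = y + margin
    box["width"] = width - 2 * margin
    children = box.get("children", [])
    cursor = box["y"]                 # where the next child starts
    for child in children:
        layout(child, box["x"], cursor, box["width"])
        # Move below the child, including its bottom margin.
        cursor = child["y"] + child["height"] + child.get("margin", 0)
    box["height"] = (cursor - box["y"]) if children else LINE_HEIGHT
    return box


h1 = {"tag": "h1"}
p = {"tag": "p", "margin": 10}
body = {"tag": "body", "children": [h1, p]}
html = {"tag": "html", "children": [body]}

layout(html, x=0, y=0, width=800)
for box in (html, body, h1, p):
    print(box["tag"], box["x"], box["y"], box["width"], box["height"])
```

Note how a parent's height depends on its children while a child's width depends on its parent. That two-way dependency is one reason layout is expensive.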

Step 6: Painting — Drawing Pixels

The browser now knows:

  • What to display
  • How it should look
  • Where to place it

Painting is the process of filling in pixels.

Layout Information
       ↓
┌─────────────────────┐
│   Paint Engine      │
│  (Fill in pixels)   │
└─────────────────────┘
       ↓
Pixel data sent to screen

The browser:

  • Converts elements into actual pixels
  • Applies colors, borders, shadows, images
  • Handles layering (z-index)

Result:
Visual representation ready for display.
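
Painting can be imitated on a grid of characters instead of real pixels. In this sketch each box is filled with one letter, and boxes with a higher z_index are drawn later so they end up on top. The paint function and the box fields are assumptions made for illustration, and coordinates are assumed to be non-negative.

```python
def paint(boxes, width, height, background="."):
    """Draw boxes onto a character grid and return it as a list of rows."""
    canvas = [[background] * width for _ in range(height)]
    # sorted() is stable, so boxes with the same z_index keep their order.
    for box in sorted(boxes, key=lambda b: b.get("z_index", 0)):
        for row in range(box["y"], min(box["y"] + box["height"], height)):
            for col in range(box["x"], min(box["x"] + box["width"], width)):
                canvas[row][col] = box["fill"]
    return ["".join(row) for row in canvas]


boxes = [
    {"x": 0, "y": 0, "width": 6, "height": 2, "fill": "H", "z_index": 0},
    {"x": 4, "y": 1, "width": 4, "height": 2, "fill": "P", "z_index": 1},
]

for line in paint(boxes, width=8, height=3):
    print(line)
```

Where the two boxes overlap, the one with the higher z_index overwrites the other, which is the same back-to-front idea browsers use for layering.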


Step 7: Display — Pixels on Screen

Finally, the painted pixels are sent to the display.

The browser uses:

  • GPU for rendering (modern browsers)
  • Compositing layers for efficiency

You see the webpage.


The Complete Flow: URL to Pixels

Here's the full journey:

1. User enters URL
        ↓
2. DNS lookup → IP address
        ↓
3. HTTP request → Fetch HTML
        ↓
4. Parse HTML → Build DOM
        ↓
5. Fetch CSS → Parse CSS → Build CSSOM
        ↓
6. Combine DOM + CSSOM → Render Tree
        ↓
7. Layout (calculate positions)
        ↓
8. Paint (fill in pixels)
        ↓
9. Display on screen

On a fast connection, all of this typically finishes in well under a second.


What Is Parsing? (Simple Explanation)

You might wonder: What does "parsing" actually mean?

Parsing means:

Breaking down text into meaningful structure.

Example: Simple Math Expression

Input (text):

3 + 5 * 2

Parser breaks this down:

Expression Tree:

        +
       / \
      3   *
         / \
        5   2

The parser:

  • Reads the text
  • Understands the grammar (math rules)
  • Builds a tree that represents meaning

Same idea with HTML:

Input:

<div><p>Hello</p></div>

Parser output:

    div
     |
     p
     |
  "Hello"

Parsing = turning text into structure.


Simple Parsing Example: HTML

Let's see how HTML parsing actually works.

Input HTML:

<div id="container">
  <h1>Title</h1>
  <p class="text">Paragraph</p>
</div>

Parsing Steps:

Step 1: Tokenization
  <div id="container">  →  START_TAG(div, id="container")
  <h1>                  →  START_TAG(h1)
  Title                 →  TEXT("Title")
  </h1>                 →  END_TAG(h1)
  <p class="text">      →  START_TAG(p, class="text")
  Paragraph             →  TEXT("Paragraph")
  </p>                  →  END_TAG(p)
  </div>                →  END_TAG(div)

Step 2: Tree Construction

          div (id="container")
            |
      ┌─────┴─────┐
     h1           p (class="text")
      |            |
  "Title"     "Paragraph"

The parser:

  1. Tokenizes (breaks into pieces)
  2. Builds a tree (creates structure)

This tree becomes the DOM.
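
The two steps above can be written out as a short sketch. It assumes simple, well-formed HTML like the example: double-quoted attributes, no comments, no self-closing tags. The tokenize and build_tree functions are hypothetical helpers. The real HTML tokenizer is a large state machine with many more cases.

```python
import re

# Either a tag (<name ...> or </name>) or a run of text up to the next "<".
TOKEN_RE = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)((?:\s+[^<>]*)?)>|([^<]+)")
ATTR_RE = re.compile(r'([a-zA-Z-]+)="([^"]*)"')


def tokenize(html):
    """Step 1: turn the text into START_TAG, END_TAG, and TEXT tokens."""
    tokens = []
    for match in TOKEN_RE.finditer(html):
        closing, tag, raw_attrs, text = match.groups()
        if text is not None:
            if text.strip():          # ignore whitespace between tags
                tokens.append(("TEXT", text.strip()))
        elif closing:
            tokens.append(("END_TAG", tag))
        else:
            tokens.append(("START_TAG", tag, dict(ATTR_RE.findall(raw_attrs))))
    return tokens


def build_tree(tokens):
    """Step 2: use a stack of open elements to nest the tokens."""
    root = {"tag": "#root", "attrs": {}, "children": []}
    stack = [root]
    for token in tokens:
        if token[0] == "START_TAG":
            node = {"tag": token[1], "attrs": token[2], "children": []}
            stack[-1]["children"].append(node)
            stack.append(node)
        elif token[0] == "END_TAG":
            # Mismatched end tags are simply ignored in this sketch.
            if len(stack) > 1 and stack[-1]["tag"] == token[1]:
                stack.pop()
        else:
            stack[-1]["children"].append(token[1])
    return root["children"]


HTML = """<div id="container">
  <h1>Title</h1>
  <p class="text">Paragraph</p>
</div>"""

tokens = tokenize(HTML)
tree = build_tree(tokens)
print(tokens)
print(tree)
```

The stack always holds the chain of currently open elements, so a new element becomes a child of whatever is on top.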


Visual Diagram: Complete Browser Flow

┌──────────────────────────────────────────────────────────┐
│                      USER TYPES URL                       │
└────────────────────────┬─────────────────────────────────┘
                         ↓
┌────────────────────────────────────────────────────────────┐
│                   NETWORKING LAYER                         │
│  • DNS Resolution                                          │
│  • TCP Connection                                          │
│  • HTTP Request/Response                                   │
│  • Fetch HTML, CSS, JS, Images                             │
└────────────────────────┬───────────────────────────────────┘
                         ↓
         ┌───────────────┴───────────────┐
         ↓                               ↓
┌─────────────────┐            ┌──────────────────┐
│  HTML PARSER    │            │   CSS PARSER     │
│                 │            │                  │
│  Tokenize HTML  │            │  Tokenize CSS    │
│  Build Tree     │            │  Build Tree      │
└────────┬────────┘            └────────┬─────────┘
         ↓                              ↓
    ┌────────┐                    ┌─────────┐
    │  DOM   │                    │  CSSOM  │
    └────┬───┘                    └────┬────┘
         └──────────┬──────────────────┘
                    ↓
         ┌────────────────────┐
         │   RENDER TREE      │
         │ (DOM + CSSOM)      │
         └──────────┬─────────┘
                    ↓
         ┌────────────────────┐
         │      LAYOUT        │
         │  (Calculate boxes, │
         │   positions, sizes)│
         └──────────┬─────────┘
                    ↓
         ┌────────────────────┐
         │      PAINT         │
         │  (Fill in pixels)  │
         └──────────┬─────────┘
                    ↓
         ┌────────────────────┐
         │     COMPOSITE      │
         │  (Layer assembly)  │
         └──────────┬─────────┘
                    ↓
         ┌────────────────────┐
         │      DISPLAY       │
         │  (Pixels on screen)│
         └────────────────────┘

Render Tree Construction (Detailed)

DOM Tree:                      CSSOM Tree:

    html                         styles
     |                             |
   ┌─┴─┐                      ┌────┴────┐
  head body                  body      h1
   |     |                    |         |
  title ┌┴┐              font-size   color
       h1 p                16px      blue

                ↓ COMBINE ↓

         Render Tree:

            html
             |
           body (font-size: 16px)
             |
         ┌───┴───┐
        h1       p
    (color:blue)

Key insight:
Only visible elements make it to the Render Tree, each paired with its computed styles.


What Happens When You Scroll or Resize?

When you interact with a page, the browser might need to:

Reflow (Re-layout)

  • When element positions/sizes change
  • Example: Window resize, font size change
  • Expensive operation

Repaint

  • When visual properties change (color, background)
  • Less expensive than reflow

Composite

  • When only layer properties change (transform, opacity)
  • Example: scrolling, which can often reuse layers that are already painted
  • Cheapest operation (GPU-accelerated)

Performance tip:
Modern web development tries to minimize reflows and maximize compositing.
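
As a rough mental model, you can map a changed CSS property to the first pipeline stage it forces and then run every stage after it. The table below is a simplified assumption for illustration, not an authoritative list. Real engines decide this per element, and transform and opacity only skip painting when the element sits on its own compositor layer. The names PIPELINE, FIRST_STAGE, and stages_for are made up for this sketch.

```python
PIPELINE = ["layout", "paint", "composite"]

# Assumed, simplified mapping from a property to the first stage it triggers.
FIRST_STAGE = {
    "width": "layout",
    "height": "layout",
    "margin": "layout",
    "font-size": "layout",
    "color": "paint",
    "background-color": "paint",
    "box-shadow": "paint",
    "transform": "composite",
    "opacity": "composite",
}


def stages_for(prop):
    """Return the stages that rerun when `prop` changes (worst case if unknown)."""
    start = PIPELINE.index(FIRST_STAGE.get(prop, "layout"))
    return PIPELINE[start:]


for prop in ("width", "color", "transform"):
    print(prop, "->", stages_for(prop))
```

This is why animating transform is usually smoother than animating width: fewer stages have to run for each frame.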


JavaScript's Role in All This

We haven't talked much about JavaScript yet, but here's where it fits:

DOM is built
     ↓
JavaScript executes
     ↓
JavaScript can modify DOM
     ↓
DOM changes trigger re-rendering
     ↓
Layout → Paint → Display (again)

JavaScript:

  • Can manipulate the DOM
  • Can change styles (triggering CSSOM updates)
  • Can trigger reflows and repaints
  • Runs in a separate engine (V8, SpiderMonkey, JavaScriptCore)

Important:
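JavaScript execution can block rendering. A classic <script> tag without async or defer pauses HTML parsing while the script downloads and runs, which is why script tags can slow down page loads.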


Parsing Tree Example: Mathematical Expression

Let's revisit parsing with a clearer example.

Input Expression:

(3 + 5) * 2 - 4

Tokenization:

Tokens:
  LPAREN: (
  NUMBER: 3
  PLUS: +
  NUMBER: 5
  RPAREN: )
  MULTIPLY: *
  NUMBER: 2
  MINUS: -
  NUMBER: 4

Parse Tree:

           -
          / \
         *   4
        / \
       +   2
      / \
     3   5

How to Read This Tree:

  1. Start from the bottom: 3 + 5 = 8
  2. Go up: 8 * 2 = 16
  3. Go to top: 16 - 4 = 12

Result: 12
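
The same tokenize-then-build-a-tree process fits in a short Python sketch. It uses a recursive descent parser, which is one common technique among several. The token names follow the list above. The function names and the tuple-based tree are assumptions for this example, and only +, -, *, parentheses, and whole numbers are supported.

```python
import re

NAMES = {"(": "LPAREN", ")": "RPAREN", "+": "PLUS", "-": "MINUS", "*": "MULTIPLY"}


def tokenize(text):
    """Split the text into (kind, value) tokens."""
    tokens = []
    for lexeme in re.findall(r"\d+|\S", text):
        if lexeme.isdigit():
            tokens.append(("NUMBER", int(lexeme)))
        elif lexeme in NAMES:
            tokens.append((NAMES[lexeme], lexeme))
        else:
            raise ValueError(f"unexpected character: {lexeme!r}")
    return tokens


def parse(tokens):
    """Build a tree of (operator, left, right) tuples with numbers as leaves."""
    pos = 0

    def peek():
        return tokens[pos][0] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def expression():                 # expression := term (("+" | "-") term)*
        node = term()
        while peek() in ("PLUS", "MINUS"):
            op = advance()[1]
            node = (op, node, term())
        return node

    def term():                       # term := factor ("*" factor)*
        node = factor()
        while peek() == "MULTIPLY":
            advance()
            node = ("*", node, factor())
        return node

    def factor():                     # factor := NUMBER | "(" expression ")"
        if peek() == "NUMBER":
            return advance()[1]
        if peek() == "LPAREN":
            advance()
            node = expression()
            if peek() != "RPAREN":
                raise SyntaxError("expected ')'")
            advance()
            return node
        raise SyntaxError(f"unexpected token: {peek()}")

    tree = expression()
    if pos != len(tokens):
        raise SyntaxError("unexpected trailing input")
    return tree


def evaluate(node):
    """Walk the tree from the leaves up, as described above."""
    if isinstance(node, int):
        return node
    op, left, right = node
    a, b = evaluate(left), evaluate(right)
    if op == "+":
        return a + b
    if op == "-":
        return a - b
    return a * b


tree = parse(tokenize("(3 + 5) * 2 - 4"))
print(tree)
print(evaluate(tree))
```

Operator precedence comes from the grammar itself: term binds tighter than expression, so multiplication is grouped before addition and subtraction.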

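HTML and CSS parsers follow the same idea: they turn text into trees that represent meaning. The main difference is that HTML parsers are far more forgiving and recover from malformed input instead of rejecting it.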


Browser Rendering Engines at a Glance

Browser          Rendering Engine    JavaScript Engine
─────────────    ────────────────    ─────────────────
Chrome           Blink               V8
Firefox          Gecko               SpiderMonkey
Safari           WebKit              JavaScriptCore
Edge (modern)    Blink               V8
Opera            Blink               V8

You don't need to memorize this.
Just know: different browsers use different engines, but the concepts are the same.


Common Misconceptions About Browsers

Misconception 1: "The browser just displays HTML"

Reality: The browser parses, interprets, styles, lays out, and renders HTML.

Misconception 2: "JavaScript is run by the rendering engine"

Reality: JavaScript runs in a separate engine (V8, SpiderMonkey, JavaScriptCore) that is embedded in the browser and talks to the page through the DOM.

Misconception 3: "CSS is just styling"

Reality: CSS affects layout, rendering performance, and even behavior (animations, transitions).

Misconception 4: "Browsers are simple"

Reality: Modern browsers are some of the most complex software ever built.


Why Understanding This Matters

Knowing how browsers work helps you:

As a Developer:

  • Write faster websites (avoid unnecessary reflows)
  • Debug rendering issues
  • Optimize performance (critical rendering path)
  • Understand browser DevTools

As a Learner:

  • Appreciate the complexity of the web
  • Understand web standards better
  • Prepare for interviews (this is a common topic)

As a Problem Solver:

  • Diagnose why pages load slowly
  • Fix cross-browser compatibility issues
  • Make informed architectural decisions

The Critical Rendering Path (Bonus Concept)

The Critical Rendering Path is the sequence of steps browsers take to render a page.

HTML → DOM
CSS → CSSOM
DOM + CSSOM → Render Tree
Render Tree → Layout
Layout → Paint
Paint → Composite → Display

Optimization goal:
Minimize time to first meaningful paint.

Techniques:

  • Inline critical CSS
  • Defer non-critical JavaScript
  • Optimize images
  • Minimize render-blocking resources

Key Takeaways

  1. Browsers are complex systems with multiple cooperating parts
  2. DOM and CSSOM are tree representations of HTML and CSS
  3. Parsing means converting text into meaningful structure
  4. Rendering involves: Layout → Paint → Composite → Display
  5. Every pixel on screen is the result of this entire process
  6. JavaScript can modify the DOM, triggering re-rendering
  7. Understanding this flow makes you a better web developer

You Don't Need to Remember Everything

This is a lot of information.

Good news:
You don't need to memorize every step.

What matters:

  • Understanding the flow (URL → DOM → Render Tree → Pixels)
  • Knowing why things happen (parsing, layout, paint)
  • Having a mental model you can reference

The details will come with practice and experience.


Further Learning

If you want to dive deeper:

  • How Browsers Work (classic article by Tali Garsiel)
  • MDN Web Docs (rendering, performance)
  • Chrome DevTools (Performance tab)
  • Web Vitals (Core Web Vitals, FCP, LCP)
  • Browser engine source code (Chromium, Gecko)

Final Thoughts

Next time you type a URL and press Enter, remember:

You're not just "opening a website."

You're triggering:

  • DNS resolution
  • HTTP requests
  • HTML parsing
  • DOM construction
  • CSS parsing
  • CSSOM construction
  • Render tree building
  • Layout calculation
  • Pixel painting
  • GPU compositing

All in a fraction of a second.

The web is magic—but it's understandable magic.

And now you understand how browsers make it happen.


Happy browsing. Happy building. 🚀