Blog Engine

`@vlozi/blog` — Content Security Model

Last Updated: 2026-05-06
Status: Active

Audience: SDK consumers and integrators who need to reason about the trust boundary between blog content and the rendered page. This document explains where sanitization actually happens, what guarantees the SDK provides, and where consumers must add their own defenses.


TL;DR

  1. Server-side sanitization is the primary defense. Blog content is cleaned at write/publish time by apps/blog-service — that's the trust boundary. Anything reaching the SDK has already been scrubbed.
  2. The SDK's sanitizeHtml is defense-in-depth. It iterates to a hard cap of 5 passes (MAX_PASSES), strips <script>/<style>/<iframe> (except allowlisted YouTube), and removes inline event handlers. It is not a substitute for server-side sanitization or DOMPurify.
  3. Iframes are allowlisted by host. Only youtube-nocookie.com survives. Adding hosts requires threat-modeling.
  4. Custom HTML from outside the editor needs DOMPurify. The SDK assumes its input is already sanitized.

1. The trust boundary

The trust boundary is between the editor-side serializer and the blog API response. Everything downstream of that boundary treats the HTML as already clean and re-runs sanitization only defensively.

2. Server-side sanitization (primary)

Lives in apps/blog-service/src/utils/tiptap-renderer.ts. Applied when:

  • A post is saved as a draft (Tiptap JSON → HTML)
  • A draft is published (re-renders with the latest extension config)
  • A markdown import is committed (Markdown AST → Tiptap JSON → HTML)

What it does:

| Class of input | Treatment |
| --- | --- |
| Allowed tags | Whitelist of tags emitted by the Tiptap extension set (paragraphs, headings, lists, tables, callouts, code blocks, mermaid blocks, carousels, YouTube embeds, links, marks like bold/italic/code) |
| URLs in href and src | sanitizeUrl() strips anything that doesn't start with http://, https://, mailto:, /, or # |
| YouTube node attrs.src | Whatever the user pasted, escaped. The seller-dashboard's YouTubeDialog (and, as of the Phase 0 fix, the markdown importer) normalizes all forms to https://www.youtube-nocookie.com/embed/${videoId} before this point |
| Untrusted attributes | Dropped silently |
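
As an illustration of the prefix rule in the URL row, here is a minimal sketch. It is not the actual sanitizeUrl() from apps/blog-service: the name sanitizeUrlSketch, the case-insensitive comparison, the trimming, and the choice to return an empty string for rejected URLs are all assumptions made for the example.

```typescript
// Hypothetical sketch only; the real sanitizeUrl() may differ in every detail.
const ALLOWED_URL_PREFIXES = ["http://", "https://", "mailto:", "/", "#"] as const;

function sanitizeUrlSketch(raw: string): string {
    const url = raw.trim();
    const lower = url.toLowerCase();
    // Keep the URL only if it starts with one of the documented prefixes.
    const allowed = ALLOWED_URL_PREFIXES.some((prefix) => lower.startsWith(prefix));
    // Assumption: rejected URLs collapse to an empty string.
    return allowed ? url : "";
}

console.log(sanitizeUrlSketch("https://example.com/post")); // kept
console.log(sanitizeUrlSketch("javascript:alert(1)"));      // rejected
```

One property worth noticing in a pure prefix check like this: because "/" is an allowed prefix, protocol-relative URLs such as //host/path also pass.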

What it does NOT do:

  • Run DOMPurify
  • Validate that all editor-emitted nodes match the latest extension schema
  • Rewrite, or warn about, URLs whose origin is not on an allowlist

Adding a new editor extension requires updating the renderer. Without renderer support, the node falls back to rendering its inner content (graceful degradation), which can leak unexpected HTML if the new extension produces any.

3. Client-side sanitization (defense-in-depth)

Lives in packages/blog-sdk/src/react/sanitize.ts. Called by <BlogContent> and <ServerBlogPost> before dangerouslySetInnerHTML.

What it strips

  • <script>, <style>, <object>, <embed>, <meta>, <base> — full content + orphan opens/closes
  • Inline event handlers (onclick, onerror, etc.) in all three quoting forms
  • Dangerous URL protocols: javascript:, vbscript:, data:text/html (including HTML-entity-obfuscated forms like j&#97;vascript:)
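
To show why entity-obfuscated protocols need decoding before comparison, here is a hedged sketch of one way to detect them. It is not the code in sanitize.ts; the names decodeNumericEntities and hasDangerousProtocol are hypothetical, and the sketch handles only numeric character references (named entities such as &Tab; are out of scope here).

```typescript
// Hypothetical sketch; the SDK's real detection logic may work differently.
const DANGEROUS_PROTOCOLS = ["javascript:", "vbscript:", "data:text/html"] as const;

// Decode decimal (&#97;) and hexadecimal (&#x61;) character references.
function decodeNumericEntities(value: string): string {
    return value.replace(/&#(x[0-9a-f]+|[0-9]+);?/gi, (_match: string, body: string) => {
        const isHex = body.toLowerCase().startsWith("x");
        const code = parseInt(isHex ? body.slice(1) : body, isHex ? 16 : 10);
        // Out-of-range code points are dropped rather than passed to fromCodePoint.
        return code <= 0x10ffff ? String.fromCodePoint(code) : "";
    });
}

function hasDangerousProtocol(url: string): boolean {
    const normalized = decodeNumericEntities(url)
        .replace(/[\u0000-\u0020]/g, "") // whitespace and control chars can split a scheme
        .toLowerCase();
    return DANGEROUS_PROTOCOLS.some((protocol) => normalized.startsWith(protocol));
}

console.log(hasDangerousProtocol("j&#97;vascript:alert(1)")); // true
console.log(hasDangerousProtocol("https://example.com/"));    // false
```

Comparing against the raw string would miss j&#97;vascript:, because the browser decodes the entity before interpreting the URL.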

What survives

Iframes whose src starts with one of the allowlisted prefixes:

const ALLOWED_IFRAME_SRC_PREFIXES = [
    'https://www.youtube-nocookie.com/',
    'https://youtube-nocookie.com/',
] as const;

Everything else iframe-related — vimeo, generic, sourceless, on-handlers on allowlisted iframes — is dropped.
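
The prefix check itself can be sketched as follows. The constant is copied from above; the helper name isAllowedIframeSrc is hypothetical and the SDK's real matching code may differ (for example, in how it treats case or whitespace).

```typescript
const ALLOWED_IFRAME_SRC_PREFIXES = [
    "https://www.youtube-nocookie.com/",
    "https://youtube-nocookie.com/",
] as const;

// Hypothetical helper: true only when src starts with an allowlisted prefix.
function isAllowedIframeSrc(src: string | null | undefined): boolean {
    if (!src) return false; // sourceless iframes are never allowed
    return ALLOWED_IFRAME_SRC_PREFIXES.some((prefix) => src.startsWith(prefix));
}

console.log(isAllowedIframeSrc("https://www.youtube-nocookie.com/embed/abc123")); // true
console.log(isAllowedIframeSrc("https://player.vimeo.com/video/1"));              // false
```

The trailing slash in each prefix matters: without it, a look-alike host such as https://www.youtube-nocookie.com.evil.example/ would pass a startsWith check.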

Why a hard iteration cap

The sanitizer iterates to a fixed point so that nested or split tags like <scr<script>ipt> can't survive a single regex pass. The cap (MAX_PASSES = 5) is defensive against pathological input that doesn't stabilize — without it, a regex DoS construction could lock the renderer.

In practice, valid editor output stabilizes after 1-2 passes. The cap only matters for adversarial input that has already bypassed server-side sanitization, and at that point the attack is against the API rather than the SDK.
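
The capped fixed-point loop can be sketched like this. Only MAX_PASSES = 5 comes from this document; sanitizeToFixedPoint, stripScriptTagsOnce, and the returned stable flag are hypothetical, and the single pass shown is deliberately naive (it removes only script open/close tags, not their content) so the effect of iterating is visible.

```typescript
const MAX_PASSES = 5;

// Deliberately naive single pass for illustration: strips <script> tags only.
function stripScriptTagsOnce(html: string): string {
    return html.replace(/<\/?script\b[^>]*>/gi, "");
}

interface FixedPointResult {
    output: string;
    passes: number;   // how many times the pass function ran
    stable: boolean;  // true if a pass confirmed no further change within the cap
}

function sanitizeToFixedPoint(html: string, pass: (input: string) => string): FixedPointResult {
    let current = html;
    for (let i = 1; i <= MAX_PASSES; i++) {
        const next = pass(current);
        if (next === current) {
            return { output: current, passes: i, stable: true };
        }
        current = next;
    }
    // Cap reached without a confirming no-op pass. What the SDK does here is not
    // stated in this document; this sketch just reports the fact.
    return { output: current, passes: MAX_PASSES, stable: false };
}

const result = sanitizeToFixedPoint("<scr<script>ipt>alert(1)</scr</script>ipt>", stripScriptTagsOnce);
console.log(result); // { output: "alert(1)", passes: 3, stable: true }
```

A single pass over the split-tag input leaves a reassembled <script>alert(1)</script>; the second pass removes it and the third confirms the result no longer changes.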

What it is NOT

  • Not a replacement for DOMPurify on truly untrusted HTML
  • Not a CSP enforcement layer. CSP is a header-level concern; see "CSP recommendations" below
  • Not a sandbox. <BlogContent> still renders content into the same JS realm as your app

4. When you need DOMPurify

The SDK's sanitizer assumes its input has already been sanitized server-side. If you bypass that assumption — i.e. you're rendering HTML from a source the editor didn't produce — wrap with DOMPurify first:

import DOMPurify from "dompurify";
import { BlogContent } from "@vlozi/blog/react";

function ImportedPost({ html }: { html: string }) {
    const safe = DOMPurify.sanitize(html, {
        // ALLOWED_TAGS is omitted on purpose so DOMPurify's default tag allowlist applies.
        // Approximate the server-side URL allowlist (plus tel:): scheme URLs need the
        // colon, while relative ("/...") and fragment ("#...") links do not.
        ALLOWED_URI_REGEXP: /^(?:(?:https?|mailto|tel):|[\/#])/i,
    });
    return <BlogContent html={safe} />;
}

Trigger scenarios:

  • RSS-imported content
  • AI-generated drafts pasted directly into post.content
  • User-submitted HTML (comments rendered through <BlogContent>, etc. — though this is an antipattern; render comments through a different pipeline)
  • Migration-time imports from a foreign CMS

5. Iframe allowlist policy

Only youtube-nocookie.com is allowed today. The reasoning:

  • Privacy-first. Standard youtube.com embeds drop tracking cookies on first paint. The nocookie host doesn't.
  • Single host = single threat surface. Each additional iframe host expands the attack surface for click-jacking, popup spam, fullscreen abuse, and postMessage exploits.
  • Editor enforcement. The Tiptap Youtube extension forces nocookie: true (YouTubeDialog.tsx:83-85). The markdown importer (as of Phase 0 of the 2026-05-01 fix plan) enforces the same.

To add a host, you'd need to:

  1. Threat-model the embed surface (frame-src CSP, X-Frame-Options on the embedded site, postMessage handlers, fullscreen API).
  2. Update ALLOWED_IFRAME_SRC_PREFIXES.
  3. Update the editor + markdown importer to normalize incoming URLs to that host.
  4. Add iframe attribute scrubbing if the new host requires non-standard allow= permissions.
  5. Update the security model doc and the integration prompt's known-issues table.

Don't shortcut this. The default iframe-host list is intentionally conservative.

6. CSP recommendations

The SDK does not set CSP headers — that's the host site's responsibility. If you serve blog pages, the recommended CSP directives are:

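Content-Security-Policy:
    default-src 'self';
    img-src 'self' https: data:;
    media-src 'self' https:;
    frame-src https://www.youtube-nocookie.com;
    style-src 'self' 'unsafe-inline';
    script-src 'self';
    object-src 'none';
    base-uri 'self';

Notes: 'unsafe-inline' in style-src is only needed if shadcn / Tailwind emit inline styles. Tighten script-src further if you can. CSP header values have no comment syntax, so keep annotations like these out of the header itself.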
The frame-src directive is the complement to the SDK's iframe allowlist. The SDK strips iframes the CSP would also block; the CSP catches anything the SDK might have missed (or anything injected post-SDK by some other component).
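
Since the header value is a single semicolon-separated string, one way to keep it maintainable is to build it from a directive map. The sketch below is an assumption-laden example, not part of the SDK: CspDirectives, BLOG_CSP, and buildCspHeader are hypothetical names, and the directive values simply mirror the recommendations above.

```typescript
// Hypothetical helper; the SDK does not ship anything like this.
type CspDirectives = Record<string, readonly string[]>;

const BLOG_CSP: CspDirectives = {
    "default-src": ["'self'"],
    "img-src": ["'self'", "https:", "data:"],
    "media-src": ["'self'", "https:"],
    "frame-src": ["https://www.youtube-nocookie.com"], // complement to the iframe allowlist
    "style-src": ["'self'", "'unsafe-inline'"],        // drop 'unsafe-inline' if you can
    "script-src": ["'self'"],
    "object-src": ["'none'"],
    "base-uri": ["'self'"],
};

// Join each directive as "name source source" and separate directives with "; ".
function buildCspHeader(directives: CspDirectives): string {
    return Object.entries(directives)
        .map(([name, sources]) => `${name} ${sources.join(" ")}`)
        .join("; ");
}

console.log(buildCspHeader(BLOG_CSP));
```

The resulting string can be set as the Content-Security-Policy response header by whatever server or framework hosts the blog pages; how that is wired up depends on the host site.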

If you embed <BlogContent> inside an iframe yourself, audit X-Frame-Options / frame-ancestors to control where the blog can be rendered.

7. Reporting a security issue

Email hello@vlozi.app. Don't open public issues for security-sensitive findings. We'll respond within one business day with a track for coordinated disclosure.
