What is actually happening
When a PDF renderer needs to draw a character, it looks up the glyph in the embedded font. If the font has no glyph for that code point, you get a hollow rectangle (often called tofu), a question mark, or nothing at all. The text itself is correct in the PDF; the font simply cannot draw it.
This usually appears when a standard Western font (Helvetica, Times, Arial) is asked to render CJK, Arabic, Cyrillic, Hebrew, or Devanagari text, when emoji show up in user-generated content, or when accented Latin characters fall outside a font's embedded subset.
How to confirm it is a font problem
Open the PDF in Adobe Acrobat and go to File > Properties > Fonts. The dialog lists every font the document references, including subsets, and notes whether each is embedded. If a character is missing from the rendered page, the font used for that text run most likely has no glyph for its code point. To confirm, copy the text out and paste it into a plain text editor: if the characters arrive intact, the underlying data is correct and the problem is isolated to rendering.
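The same check can be scripted when you have the font file on disk. The sketch below is an illustration, not a definitive tool: it assumes the third-party fontTools package is installed, and `load_coverage` and `missing_code_points` are hypothetical helper names. It reports which characters in a string have no entry in a font's character map.

```python
import unicodedata


def load_coverage(font_path):
    """Return the set of code points the font maps to a glyph.

    Assumes the third-party fontTools package is installed.
    """
    from fontTools.ttLib import TTFont  # imported lazily; not in the stdlib

    font = TTFont(font_path)
    try:
        # getBestCmap() returns {code point: glyph name}, or None if the
        # font has no Unicode character map.
        return set((font["cmap"].getBestCmap() or {}).keys())
    finally:
        font.close()


def missing_code_points(text, coverage):
    """List (character, U+XXXX, Unicode name) for characters not in coverage."""
    missing = []
    seen = set()
    for ch in text:
        cp = ord(ch)
        if ch.isspace() or cp in coverage or ch in seen:
            continue
        seen.add(ch)
        missing.append((ch, f"U+{cp:04X}", unicodedata.name(ch, "<unnamed>")))
    return missing


# Illustration with a made-up coverage set (Basic Latin only), standing in
# for load_coverage("SomeFont.ttf"). The sample text is "Cafe" with an
# acute accent on the e, a space, and the two ideographs for "Tokyo".
basic_latin = set(range(0x20, 0x7F))
report = missing_code_points("Caf\u00e9 \u6771\u4eac", basic_latin)
for char, code, name in report:
    print(code, name)
```

Any character this reports is one the font would draw as tofu, which tells you which script your replacement or fallback font has to cover.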
How to fix it
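Switch to a font with broad Unicode coverage. No single font file covers all of Unicode, but Google's Noto family was designed specifically to eliminate tofu and, taken together, covers nearly every script. Noto Sans handles Latin, Cyrillic, and Greek. Noto Sans CJK adds Chinese, Japanese, and Korean. Noto Color Emoji adds emoji. Other scripts, such as Arabic, Hebrew, and Devanagari, have their own Noto families (for example, Noto Sans Arabic).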
Declare a fallback stack. For HTML-to-PDF tools that use a Chromium-based renderer, list your brand font first, followed by Noto Sans and Noto CJK, so the renderer falls back per glyph rather than per text run:
html body {
  font-family: 'Barlow', 'Noto Sans', 'Noto CJK', sans-serif;
}
Each name in the stack must match a family that is installed in the rendering environment or declared with @font-face. Noto's CJK fonts are usually installed under per-language names such as 'Noto Sans CJK SC', so adjust 'Noto CJK' to whatever name your environment exposes.
Register a Unicode TTF in native PDF libraries. If you are using pdf-lib, ReportLab, or PDFKit, load the TTF for the script you need and assign it before drawing the text. Subset embedding is fine for keeping production file sizes down, but the subset must include every code point your content can contain at runtime, not only the ones present at template-design time.
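Native libraries generally draw with exactly the font you assign, so mixed-script text has to be split into runs yourself. The following is a minimal sketch of that idea using ReportLab, under several assumptions: ReportLab is installed, each font file is a TrueType-outline .ttf that ReportLab can load, and you already have a set of covered code points for each font. The function names, font labels, and paths are placeholders, not part of ReportLab.

```python
def split_runs(text, fonts):
    """Split text into (font_name, run) pairs.

    fonts is an ordered list of (name, coverage_set) pairs. Each character
    goes to the first font that covers it; characters no font covers are
    assigned to None so the caller can decide what to do with them.
    """
    runs = []
    for ch in text:
        name = next((n for n, cov in fonts if ord(ch) in cov), None)
        if runs and runs[-1][0] == name:
            runs[-1] = (name, runs[-1][1] + ch)  # extend the current run
        else:
            runs.append((name, ch))              # start a new run
    return runs


def draw_with_fallback(out_path, text, font_paths, fonts):
    """Draw one line of text, switching fonts per run (ReportLab assumed).

    font_paths maps each font label to a .ttf path; fonts is the ordered
    (label, coverage_set) list that split_runs expects.
    """
    # Third-party imports stay local so split_runs works without ReportLab.
    from reportlab.pdfbase import pdfmetrics
    from reportlab.pdfbase.ttfonts import TTFont
    from reportlab.pdfgen import canvas

    for name, path in font_paths.items():
        pdfmetrics.registerFont(TTFont(name, path))

    pdf = canvas.Canvas(out_path)
    x, y, size = 72, 720, 12  # points, measured from the bottom-left corner
    for name, run in split_runs(text, fonts):
        if name is None:
            raise ValueError(f"no registered font covers {run!r}")
        pdf.setFont(name, size)   # assign the font before drawing
        pdf.drawString(x, y, run)
        x += pdfmetrics.stringWidth(run, name, size)
    pdf.save()


# Illustration with made-up coverage sets; real sets would come from the fonts.
demo_fonts = [("Latin", set(range(0x20, 0x7F))), ("CJK", {0x6771, 0x4EAC})]
demo_runs = split_runs("Tokyo \u6771\u4eac!", demo_fonts)
```

Raising on an uncovered run, rather than drawing it anyway, turns a silent tofu box into an error you can catch before the PDF ships.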
A note for HTML and CSS PDF generation
If you are generating PDFs with Anvil's HTML to PDF API, Noto Sans and Noto CJK are already the default fallbacks, so accented Latin and most CJK characters render correctly even without a custom font. Custom fonts are added through standard CSS @import or @font-face rules. Font files must be in TTF format and served with a Content-Type of font/ttf or application/x-font-ttf; otherwise the renderer ignores them.
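If you host the font files yourself, it is worth checking the Content-Type header before blaming the CSS. The sketch below uses only the Python standard library and is meant for local testing, not production hosting; the handler name and port are arbitrary choices. It forces .ttf files to be served as font/ttf.

```python
from http.server import HTTPServer, SimpleHTTPRequestHandler


class FontFriendlyHandler(SimpleHTTPRequestHandler):
    """Static file handler that serves .ttf files as font/ttf."""

    # Copy the default extension map and override the TrueType entry.
    extensions_map = {
        **SimpleHTTPRequestHandler.extensions_map,
        ".ttf": "font/ttf",
    }


def serve(port=8000):
    """Serve the current directory on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), FontFriendlyHandler).serve_forever()


# Call serve() from the directory that holds your font files, then request
# a .ttf and inspect the Content-Type response header.
```

On a real host the equivalent is usually a one-line MIME type mapping in the web server or storage bucket configuration.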