Pandoc and Unicode font fallback in LaTeX


It took me a while to figure out how to do this, so I’m documenting it in the hope that the example is useful to someone else.

The problem: I was using Pandoc to generate PDFs with a main font that lacks some of the Unicode characters in the text. (In my case, it was the name “Hồ Chí Minh”.)

In HTML and CSS this works automatically: the browser picks another font that does contain the required character(s). LaTeX doesn’t behave that way.

The most straightforward solution is the ucharclasses package, which switches fonts according to the Unicode block each character belongs to. To make it work with Pandoc I ended up with something like the following:

variables:
  mainfont: Valkyrie A
  mainfontoptions:
    # [a list of font options here]
  header-includes: |
    \usepackage[Latin]{ucharclasses}
    \newfontfamily\fallbackfont{Gentium Plus} % Or whatever font you prefer.
    \setTransitionsFor{LatinExtendedAdditional}{\fallbackfont}{\normalfont}
    % Repeat that line for any other Unicode blocks you need.
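
To check the font setup outside Pandoc, the same preamble lines can be dropped into a small standalone document. This is only a sketch: it assumes both fonts from the config above are installed, that the file is saved as UTF-8, and that it is compiled with XeLaTeX, which ucharclasses depends on. The file name test.tex is just a placeholder.

```latex
% test.tex: compile with `xelatex test.tex`
% (ucharclasses relies on XeTeX's inter-character token mechanism).
\documentclass{article}
\usepackage{fontspec}
\setmainfont{Valkyrie A} % Main font from the config above; substitute any installed font.
\usepackage[Latin]{ucharclasses} % Load only the Latin block definitions.
\newfontfamily\fallbackfont{Gentium Plus} % Fallback font; assumed to be installed.
% Switch to the fallback on entering the block, and back to the main font on leaving it.
\setTransitionsFor{LatinExtendedAdditional}{\fallbackfont}{\normalfont}
\begin{document}
% The "ồ" (U+1ED3) is in Latin Extended Additional, so it should use the fallback font.
Hồ Chí Minh
\end{document}
```

If the “ồ” is typeset in the fallback font and the rest in the main font, the transitions are working, and the same lines should behave the same way when passed through header-includes.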

Some things to note: