Hi @MWojcicki,
My understanding is that Genie space skills would not solve this issue.
The problem you described appears to be a **PDF rendering/export bug**, not something related to Genie instructions, skills, or table definitions.
Based on your technical analysis, it looks like the PDF export engine (**PDFium**) is routing **Latin Extended-A characters (U+0100–U+017F)** through a **Japanese CJK font (SourceHanSansJP)** instead of a font with proper Central/Eastern European language support. Since that font is not embedded in the generated PDF and the **ToUnicode CMap mapping appears incorrect**, Polish diacritical characters end up being rendered as incorrect ASCII characters or dropped entirely.
Since the Genie response itself is generated correctly and the corruption only appears in the exported PDF, this strongly suggests the issue lies entirely in the rendering/export layer, after content generation.
Because of that, I don’t believe any Genie space customization (skills, instructions, semantic definitions, table configs, etc.) would have any influence over this behavior.
In my opinion, this needs to be fixed by the Databricks product/engineering team, likely in how the PDF export engine handles fonts and Unicode rendering.
I’d recommend opening a support case with Databricks, including your technical findings, as they’re very well documented and should help engineering triage the issue faster:
https://help.databricks.com/
This may also affect other languages that use Latin Extended-A characters (for example Czech, Hungarian, or Turkish), not just Polish, which is worth mentioning in the support case.
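If it helps with the support case, here is a minimal Python sketch for finding which characters in a piece of text fall in the U+0100–U+017F range, so you can build small test strings for the export. The function name and the sample sentences are just illustrative (they are not from your report), and the text is normalized to NFC first on the assumption that precomposed characters are what reaches the exporter.

```python
import unicodedata

# Latin Extended-A block, U+0100 through U+017F inclusive.
LATIN_EXTENDED_A = range(0x0100, 0x0180)


def latin_extended_a_chars(text):
    """Return the unique Latin Extended-A characters in text, sorted by code point."""
    # Normalize to NFC so base letter + combining mark pairs become single code points.
    composed = unicodedata.normalize("NFC", text)
    return sorted({ch for ch in composed if ord(ch) in LATIN_EXTENDED_A})


# Illustrative sample sentences (hypothetical test strings, not from the original report).
samples = {
    "Polish": "Zażółć gęślą jaźń",
    "Czech": "Příliš žluťoučký kůň",
}

for language, sentence in samples.items():
    for ch in latin_extended_a_chars(sentence):
        print(f"{language}: U+{ord(ch):04X} {ch} {unicodedata.name(ch)}")
```

Note that ó (U+00F3) sits in the Latin-1 Supplement block rather than Latin Extended-A, so it is not listed here; it may be worth checking separately whether it survives the export.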
Wiliam Rosa
Data Engineer | Machine Learning Engineer
LinkedIn: linkedin.com/in/wiliamrosa