Monthly Archives: February 2013

TCPDF Does Not Support Indic Scripts

The other day I ran across an interesting question on Stack Overflow about using Indic fonts with TCPDF: [How can I create Malayalam PDF using TCPDF in PHP?] Languages like Tamil and Malayalam fall into this category. At first I thought it was just a simple font issue, but that was not the case. Even with font subsetting turned off, and after trying a number of different Malayalam-capable fonts, I ran into the same problem the asker was facing. I began Googling for information about Malayalam and TCPDF, but to no avail. I found out that Tamil is a related script, so I searched for that as well, and the outlook for proper rendering in TCPDF was not good.

Finally, after switching my Google queries to search specifically for Indic script support in TCPDF, I found this comment by a person called “Santhosh” explaining why TCPDF lacks it:

It is technical limitation. For TCPDF, the true type font need to be converted to afm format first, then for each script, the diacritics or ligature rules are implemented in tcpdf itself. That is what done for adding Arabic/Persian support. For complex scripts this is not a correct approach. Indic shaping engines like Pango has evolved by taking about 10 years. The shaping logic is very complex and duplicating it inside a PDF library is wrong approach. Instead the PDF library should depend on Pango or the upcoming Harfbuzz rendering engines. The PDF export library in Mediawiki uses reportlab pdf library. That also attempts to the rendering by itself. And ended up in having no support for any Indic languages and many bugs for Arabic scripts(Note that this extension is disabled in many Indian wikiprojects). Fonts are not enough for rendering, a shaping engine is also required for complex script to interpret the glyph formation rules. This is what PyPDFLib is trying to solve by using Pango for script rendering and Cairo for graphics.
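To make the "fonts are not enough" point a little more concrete, here is a toy Python sketch. It is not taken from TCPDF, Pango, or HarfBuzz, and the helper names (`describe`, `naive_visual_order`, `PRE_BASE_VOWEL_SIGNS`) are made up for illustration. It shows just one of the many rules a shaping engine has to apply: in Malayalam, some vowel signs are stored after their consonant in the text but are drawn to its left. A library that simply maps each character to a glyph in stored order will put them in the wrong place.

```python
import unicodedata

# Hypothetical illustration only: three Malayalam vowel signs (E, EE, AI)
# that are stored after the consonant but displayed before it.
PRE_BASE_VOWEL_SIGNS = {"\u0d46", "\u0d47", "\u0d48"}


def describe(text):
    """List each character's code point and Unicode name, in stored order."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


def naive_visual_order(text):
    """Toy reordering: move a pre-base vowel sign in front of the
    character that precedes it. Real shaping engines also handle
    conjuncts, split vowels, and much more; this covers one rule only."""
    out = []
    for ch in text:
        if ch in PRE_BASE_VOWEL_SIGNS and out:
            out.insert(len(out) - 1, ch)
        else:
            out.append(ch)
    return "".join(out)


# The word "Kerala" in Malayalam: KA, VOWEL SIGN EE, RA, LLA, ANUSVARA.
word = "\u0d15\u0d47\u0d30\u0d33\u0d02"

if __name__ == "__main__":
    for code_point, name in describe(word):
        print(code_point, name)
    # Show the code points in the order the glyphs would be drawn.
    print([f"U+{ord(ch):04X}" for ch in naive_visual_order(word)])
```

Even this single rule already means the drawing order differs from the stored order, and it says nothing about ligatures or conjunct forms, which is where a dedicated shaping engine earns its keep.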

I’m happy to have come across his posts, as I might’ve banged my head against a wall for a while trying to answer this Stack Overflow question. I don’t know anything about these languages, so I don’t think I could have answered the question as quickly or as thoroughly as I did without finding this.

You can read the full blog post and comment thread here: Creating a new Language ecosystem- Sourashtra as example