
A big portion of what people use Calibre for is parsing, converting, and generating HTML. The epub file format, in practice, is zipped, poorly specified, and poorly generated HTML. Handling it is far from an exact science, and minor improvements here can save countless people slight amounts of discomfort from terrible HTML handling or poorly handled OCR.

To give an example: for a long-running project I was working on at the time, I once needed to reference an out-of-print 2003 book. The publisher who had released it digitally had gone out of business, and the author's mail server had gone dark. I checked the usual suspects for a PDF or hard copy of the book, but the only remaining copy I could find was an automated Calibre epub conversion from quite a few years earlier. A few years later, I looked again for a better copy and found another Calibre epub conversion, still with no PDF in sight. This one, seemingly generated a few years after the first, was much higher in quality.

I was pleasantly surprised.


