HTML, EPUB, PDF and automated publishing.

Post by 52midnight » Tue Jul 07, 2015 11:35 pm

I'm currently preparing a large, multi-file document for publication, and intend making it available in three formats:

- HTML for reading on the Net.
- EPUB for the modern eReaders.
- PDF for printed versions.

Given that I expect several revised versions in the coming months, I want to automate the conversion process as much as possible. I've selected 'pandoc' for investigation but haven't used it yet. Being a command-line utility has some advantages, although if it succeeds it'll probably end up as the back-end for one or more GUI front-ends.

'Sigil' does a nice job of converting an HTML suite into an EPUB, and I might stick with it.

The problem is PDF. It began life as Adobe's proprietary format but was subsequently published as an open standard (ISO 32000). Like all such things it has never achieved the 'open source feel' - Microsoft's .mht is another example. The biggest problem is editing PDFs. The gold standard is supposed to be Adobe's Acrobat. It's not only a monster as regards size, but it also tries to take over your whole system; I've long avoided it.

'qpdfview' does everything I want from a reader, but none of the others I've tried for editing seems much good. The solution I'm looking at is to use pandoc to convert HTML into ODT, do the editing in OpenOffice, and export into PDF.
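The HTML-to-ODT step described above could be scripted along these lines. This is only a minimal sketch: the file names and the `build` directory are hypothetical, and the helper names `pandoc_command` and `convert_all` are made up for illustration. It relies on pandoc choosing the output format from the extension of the file given to `-o`, and it only prints the commands instead of running them.

```python
from pathlib import Path


def pandoc_command(source: Path, target: Path) -> list[str]:
    """Build the argv for converting `source` to `target` with pandoc.

    pandoc infers the output format from the target's extension
    (.odt, .epub, ...), so no explicit format flag is passed here.
    """
    return ["pandoc", str(source), "-o", str(target)]


def convert_all(sources: list[Path], out_dir: Path, extension: str) -> list[list[str]]:
    """Return one pandoc command per source file, each writing into out_dir."""
    return [
        pandoc_command(src, out_dir / src.with_suffix(extension).name)
        for src in sources
    ]


# Hypothetical chapter files, for illustration only.
chapters = [Path("chapter1.html"), Path("chapter2.html")]
for cmd in convert_all(chapters, Path("build"), ".odt"):
    # Dry run: show what would be executed.
    print(" ".join(cmd))
```

To actually run a conversion, each command list can be handed to `subprocess.run(cmd, check=True)` once pandoc is installed and the output directory exists. Changing the extension argument to `".epub"` would give the EPUB variant of the same loop.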

Does anyone have experience or suggestions in this area?

Post by ct85711 » Wed Jul 08, 2015 12:27 am

You may want to take a look at calibre. It is able to convert to all of those formats, and supports most if not all of the various ebook formats. It can also do mass conversions (it queues everything and runs a few conversions at the same time, I think around 4-5 processes).
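For scripted use, calibre also ships a command-line converter, `ebook-convert`, which takes an input file and an output file and picks the output format from the output extension. A rough sketch of building those commands follows; the input file name, the `build` directory and the helper name `ebook_convert_commands` are assumptions for illustration, and the commands are only printed.

```python
from pathlib import Path


def ebook_convert_commands(source: Path, out_dir: Path, formats: list[str]) -> list[list[str]]:
    """Build one `ebook-convert INPUT OUTPUT` argv per requested format.

    ebook-convert selects the output format from the output file's extension.
    """
    return [
        ["ebook-convert", str(source), str(out_dir / f"{source.stem}.{fmt}")]
        for fmt in formats
    ]


# Hypothetical input file; dry run that only prints the commands.
for cmd in ebook_convert_commands(Path("book.html"), Path("build"), ["epub", "pdf"]):
    print(" ".join(cmd))
```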

Post by 52midnight » Wed Jul 08, 2015 12:49 am

Been a while since I looked at Calibre. If it does PDFs then I'll definitely take a look. Thanks.

Post by miket » Wed Jul 08, 2015 3:36 am

Good for you for thinking about the need to generate multiple output formats. I wish that the people who decided to play with the Gentoo home page would have that kind of sense.

A GUI tool is nice for checking things and maybe for some authoring tasks, but for that large-scale publishing you have in mind, nothing beats tools that can be controlled by scripts.

I tried pandoc for one project I had, but it didn't work out for that application. It would likely be better suited for other tasks. If you want it to output to PDF, you have to take a trip through LaTeX. LaTeX generates beautiful text, but there is a lot of overhead through this path.
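As a rough illustration of the LaTeX route, the pandoc call for PDF output might be assembled as below. This is a sketch under assumptions: the file names are hypothetical, `pandoc_pdf_command` is a made-up helper, and the `--pdf-engine` option is the spelling used by pandoc 2.0 and later (older releases called it `--latex-engine`). A LaTeX distribution providing the chosen engine has to be installed for the command to work.

```python
def pandoc_pdf_command(source: str, target: str, engine: str = "xelatex") -> list[str]:
    """Build the argv for a pandoc PDF conversion via the given engine.

    Assumes pandoc >= 2.0, where the option is named --pdf-engine.
    """
    return ["pandoc", source, "-o", target, f"--pdf-engine={engine}"]


# Hypothetical file names; dry run that only prints the command.
print(" ".join(pandoc_pdf_command("book.html", "book.pdf")))
```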

One tool that some people have used at work and one I plan to use for a project I have coming up is wkhtmltopdf. It is quite a different entrant: it is a headless browser that generates PDFs with remarkable fidelity to the source HTML. It understands CSS and even JavaScript--though I imagine you might want to go light on the JavaScript for your purposes! From what I've seen, it's pretty fast, too. Yes, you could argue that Qt is a bigger dependency than LaTeX, but it might give you better results.
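A minimal sketch of how a multi-file HTML document might be fed to wkhtmltopdf from a script: the tool accepts several input pages followed by a single output file, with global options such as `--page-size` placed first. The chapter names and the helper name `wkhtmltopdf_command` are hypothetical, and the command is only printed here.

```python
def wkhtmltopdf_command(pages: list[str], target: str, page_size: str = "A4") -> list[str]:
    """Build the argv for rendering one or more HTML pages into a single PDF."""
    if not pages:
        raise ValueError("at least one input page is required")
    # Global options come first, then the input pages, then the output file.
    return ["wkhtmltopdf", "--page-size", page_size, *pages, target]


# Hypothetical chapter files; dry run that only prints the command.
print(" ".join(wkhtmltopdf_command(["chapter1.html", "chapter2.html"], "book.pdf")))
```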

To generate the source HTML, I could imagine using XSL transforms (my big go-to solution) and/or tools like pandoc.

Post by charles17 » Wed Jul 08, 2015 4:38 am

52midnight wrote: I'm currently preparing a large, multi-file document for publication, and intend making it available in three formats:

Sounds like generating a SAX stream from your single document source and having serializers for each output format. Have a look at Cocoon.

Post by yngwin » Wed Jul 08, 2015 1:52 pm

Write your original in reStructuredText and use dev-python/sphinx (and dev-python/rst2pdf) to convert to various formats.
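A sketch of what the Sphinx route could look like when driven from a script, one `sphinx-build -b BUILDER SOURCEDIR OUTDIR` call per output format. The directory names and the helper name `sphinx_build_commands` are hypothetical. The `html` and `epub` builders come with Sphinx; the `pdf` builder is assumed to be provided by rst2pdf, with `rst2pdf.pdfbuilder` listed under `extensions` in the project's `conf.py`.

```python
def sphinx_build_commands(source_dir: str, build_dir: str,
                          builders: tuple[str, ...] = ("html", "epub", "pdf")) -> list[list[str]]:
    """Build one sphinx-build argv per builder, each with its own output directory."""
    return [
        ["sphinx-build", "-b", builder, source_dir, f"{build_dir}/{builder}"]
        for builder in builders
    ]


# Hypothetical project layout; dry run that only prints the commands.
for cmd in sphinx_build_commands("docs", "build"):
    print(" ".join(cmd))
```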
"Those who deny freedom to others deserve it not for themselves." - Abraham Lincoln
Free Culture | Defective by Design | EFF