Gentoo Forums
[Updated 09/16] pdf2htmlEX: a high-fidelity PDF-to-HTML converter
coolwanglu
n00b


Joined: 01 Sep 2012
Posts: 6

Posted: Sat Sep 01, 2012 5:06 pm    Post subject: [Updated 09/16] pdf2htmlEX: a high-fidelity PDF-to-HTML converter

[Update 09/16]
Two new demos:
http://coolwanglu.github.com/pdf2htmlEX/demo/cheat.html
http://coolwanglu.github.com/pdf2htmlEX/demo/geneve.html

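* Removed Boost entirely
* Reduced the reliance on C++11; the minimum supported GCC is now 4.4.6
* Added hyperlink support (links within the document resolve to the target page)
* Fixed some of the font encoding issues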

Demo first: http://coolwanglu.github.com/pdf2htmlEX/demo/demo.html
And CJK, which many of you probably care about: http://coolwanglu.github.com/pdf2htmlEX/demo/chn.html

Project home page: https://github.com/coolwanglu/pdf2htmlEX

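Traditional PDF-to-HTML converters come in two kinds:
The first is essentially pdf2text plus some weak formatting, so the result is not much better than pdf2text.
The second renders everything to images and embeds them in an HTML file, so all text information is lost (no selecting, no copying) and the generated files are huge.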

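pdf2htmlEX combines the strengths of both: it preserves the text and the formatting.
Specifically, it has the following features:
1. Extracts fonts from the PDF
2. Keeps rendering accurate while optimizing for the web (reducing file size, merging text lines, re-encoding fonts so that text in the HTML can be selected, and so on)
3. Displays everything else as images
4. Single-file output: one HTML file takes care of everything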
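
As an illustration of the single-file output in item 4, here is a minimal shell sketch of a batch conversion. It assumes the pdf2htmlEX binary is on PATH and accepts an invocation of the form `pdf2htmlEX <input.pdf> <output.html>`; check `pdf2htmlEX --help` for your version before relying on it. The helper names `html_name_for` and `convert_all` are hypothetical and not part of the tool.

```shell
#!/bin/sh
# Hypothetical helper: map "dir/paper.pdf" to "paper.html".
html_name_for() {
    name=$(basename "$1")
    printf '%s.html\n' "${name%.pdf}"
}

# Hypothetical helper: convert each PDF argument into one HTML file
# in the current directory.
# Assumed invocation: pdf2htmlEX <input.pdf> <output.html>
convert_all() {
    if ! command -v pdf2htmlEX >/dev/null 2>&1; then
        echo "pdf2htmlEX not found on PATH" >&2
        return 1
    fi
    for pdf in "$@"; do
        pdf2htmlEX "$pdf" "$(html_name_for "$pdf")" || return 1
    done
}
```

Something like `convert_all docs/*.pdf` would then leave one HTML file per PDF in the current directory.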

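Download, build, and install:
Dependencies:
A recent poppler (0.20.3); if you build it yourself, remember to pass --enable-xpdf-headers
fontforge, git version required (https://github.com/fontforge/fontforge), because some of the features and bug fixes it needs were submitted by me while developing pdf2htmlEX
Boost C++ libraries; see the project home page for the specific components required
cmake and a GCC with C++11 support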
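
Before building, it can help to confirm that the installed tools meet the version floors mentioned in this post. The following is a rough sketch, not part of the project: `version_ge` and `check_dep` are hypothetical helpers, the comparison relies on `sort -V` from GNU coreutils, and the poppler lookup assumes poppler ships a pkg-config file named `poppler`.

```shell
#!/bin/sh
# version_ge A B: succeed when dotted version A is >= B.
# Relies on `sort -V` (version sort, GNU coreutils).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_dep NAME FOUND REQUIRED: print a one-line verdict.
check_dep() {
    if [ -z "$2" ]; then
        echo "$1: not found (need >= $3)"
    elif version_ge "$2" "$3"; then
        echo "$1: $2 ok (>= $3)"
    else
        echo "$1: $2 too old (need >= $3)"
    fi
}

# Minimum versions are the ones quoted in this post.
# A self-built poppler also needs --enable-xpdf-headers at configure time.
check_dep poppler "$(pkg-config --modversion poppler 2>/dev/null)" 0.20.3
check_dep gcc "$(gcc -dumpversion 2>/dev/null)" 4.4.6
```

The script only reports; it does not check fontforge, which has to come from git, or whether poppler was built with the xpdf headers.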

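If anyone thinks this little tool is decent and is willing to package it for Gentoo, please contact me. I would be very grateful!

Comments, suggestions, forks, and bug reports are all welcome.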


Last edited by coolwanglu on Sun Sep 16, 2012 2:12 pm; edited 1 time in total
Bezetek
n00b


Joined: 19 Nov 2009
Posts: 4

Posted: Thu Sep 13, 2012 12:04 pm

The results look good. You have my support.
microcai
n00b


Joined: 24 Apr 2011
Posts: 6

Posted: Sun Oct 14, 2012 12:32 am

It is now included in gentoo-zh :)
coolwanglu
n00b


Joined: 01 Sep 2012
Posts: 6

Posted: Sun Oct 14, 2012 8:05 am

microcai wrote:
It is now included in gentoo-zh :)


Thanks!
So how should I reference it on the project home page?
microcai
n00b


Joined: 24 Apr 2011
Posts: 6

Posted: Mon Oct 15, 2012 8:42 am

coolwanglu wrote:
microcai wrote:
It is now included in gentoo-zh :)


Thanks!
So how should I reference it on the project home page?


It is already in the gentoo-zh overlay.

Add the gentoo-zh overlay (if you have not added it yet):
layman -a gentoo-zh

Then install pdf2htmlEX:

emerge pdf2htmlEX
All times are GMT