Pdf to odt/docx conversion has me weeping!

Maroon@lemmy.world · 2 days ago

Pdf to odt/docx conversion has me weeping!

Botzo@lemmy.world · 2 days ago

https://pdf2docx.readthedocs.io/ seems to fit the bill. I can’t vouch for it.
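Going by its docs, basic usage looks roughly like the sketch below. I haven’t run this on real documents, so treat it as a starting point: `docx_path_for` and `convert_pdf` are names I made up, and only `Converter`, `convert`, and `close` come from the library. As far as I can tell it writes .docx only, so .odt would need a separate step.

```python
from pathlib import Path


def docx_path_for(pdf_path):
    """Return the input path with a .docx suffix (hypothetical helper)."""
    return Path(pdf_path).with_suffix(".docx")


def convert_pdf(pdf_path):
    """Convert one PDF to a .docx file next to it and return the output path."""
    # Imported lazily so docx_path_for works even without pdf2docx installed.
    from pdf2docx import Converter  # third-party: pip install pdf2docx

    out_path = docx_path_for(pdf_path)
    cv = Converter(str(pdf_path))
    try:
        cv.convert(str(out_path))  # converts all pages by default
    finally:
        cv.close()  # release the underlying PDF handle
    return out_path
```

Something like `convert_pdf("report.pdf")` should then leave a `report.docx` beside the original, assuming the layout is something the library can cope with.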

PDF is such a curse. I say this as a person currently tasked with deploying new mysteriously complex enterprise PDF conversion software for technical documents. The rabbit hole is so deep.

observantTrapezium@lemmy.ca · 2 days ago

It’s a curse because it’s used for things other than what it’s intended for. It does a good job of representing printed material, but unfortunately people very commonly expect it to be something more akin to a word processor file.

Botzo@lemmy.world · 2 days ago

This is probably my first time ever using it for an appropriate purpose as this team’s technical docs are destined for the press (and digital distribution). They just have no idea how to software, so I was brought in to build bridges between and ultimately simplify all their tools.

Treczoks@lemmy.world · 2 days ago

It is not a curse. It does exactly what it is intended to do: create an archive of a document that is universally reproducible.

It is a very well-designed cul-de-sac for exactly this purpose. Using it for anything else is asking for trouble.

mesa@piefed.social · 2 days ago

As a dev: the reason PDF is so strange is that it’s a compound format. It can be just images strung together. It can also be pure text with fonts, etc.

If you open the file as a text file, you can see this. It’s many different formats in a trenchcoat.
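If you’d rather not squint at raw bytes, here is a rough Python sketch of the same idea: scan the file for a few well-known PDF name tokens and count them. It’s a heuristic, not a parser. `sniff_pdf` and the `MARKERS` table are names I made up, and anything packed into compressed object streams won’t show up in a raw scan.

```python
import re

# Byte patterns hinting at what a PDF contains. Only a rough heuristic:
# objects stored inside compressed object streams are invisible to this scan.
MARKERS = {
    "fonts": rb"/Type\s*/Font\b",           # font dictionaries
    "images": rb"/Subtype\s*/Image\b",      # embedded raster images
    "jpeg_streams": rb"/DCTDecode\b",       # JPEG-compressed streams
    "deflate_streams": rb"/FlateDecode\b",  # zlib-compressed streams
    "javascript": rb"/JavaScript\b",        # embedded scripts
}


def sniff_pdf(data: bytes) -> dict:
    """Count marker occurrences in raw PDF bytes."""
    # Readers tolerate some junk before the header, so check the first 1 KiB.
    if b"%PDF-" not in data[:1024]:
        raise ValueError("no %PDF- header found")
    return {name: len(re.findall(pat, data)) for name, pat in MARKERS.items()}


if __name__ == "__main__":
    import sys

    # Usage: python sniff.py some_file.pdf
    with open(sys.argv[1], "rb") as fh:
        for name, count in sniff_pdf(fh.read()).items():
            print(f"{name}: {count}")
```

A scanned document tends to show lots of images and few or no fonts, while a text-heavy export shows the opposite, which is the trenchcoat in action.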

Botzo@lemmy.world · 2 days ago

Yeah, also a dev here. I’d be so happy if they’d parted ways with the 90s legacy bits at some point. Just glad there are enough parsing libraries that I’ll never need to care (right? Please tell me I’m right!).

mesa@piefed.social · 2 days ago

I hope you’re right too lol.