From a security standpoint, PDF is literally one of the worst formats I deal with daily. Very useful and predictable for the end user, yes, but very dangerous because of the capabilities it allows.
Dangerzone works like this: You give it a document that you don’t know if you can trust (for example, an email attachment). Inside of a sandbox, Dangerzone converts the document to a PDF (if it isn’t already one), and then converts the PDF into raw pixel data: a huge list of RGB color values for each page. Then, in a separate sandbox, Dangerzone takes this pixel data and converts it back into a PDF.
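To get a feel for how "huge" that list of RGB values is, here is a rough back-of-the-envelope sketch. It only does the arithmetic for uncompressed 8-bit RGB pixels; the page size and the 150 DPI figure are assumptions picked for illustration, not settings taken from Dangerzone, and `raw_rgb_bytes` is a made-up helper name.

```python
def raw_rgb_bytes(width_in: float, height_in: float, dpi: int) -> int:
    """Bytes needed to hold one page as uncompressed 8-bit RGB pixels."""
    width_px = round(width_in * dpi)    # pixels across the page
    height_px = round(height_in * dpi)  # pixels down the page
    return width_px * height_px * 3     # 3 bytes per pixel: R, G, B


if __name__ == "__main__":
    # Assumed example: a US Letter page (8.5 x 11 in) rendered at 150 DPI.
    size = raw_rgb_bytes(8.5, 11, 150)
    print(f"{size} bytes per page (~{size / 1_000_000:.1f} MB)")
```

That is the size of the intermediate pixel data only. The PDF that gets rebuilt from it will normally compress the page images, so the final file can be much smaller than this number, though still often larger than a text-based original.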
fearout ( @fearout@kbin.social ) 18•2 years ago
So it basically rasterizes it? I wonder how it affects file size.
klangcola ( @klangcola@reddthat.com ) 11•2 years ago
No mention of OCR? Copy-pasting links or data will be a joy…
gromnar ( @gromnar@beehaw.org ) 5•2 years ago
There is an optional OCR pass, from what I understand.
ASK_ME_ABOUT_LOOM ( @ASK_ME_ABOUT_LOOM@beehaw.org ) 9•2 years ago
Oh, I think you already know.
Yeah, it definitely increases the size and removes some functionality that others may rely on. But for presenting content, which is what a PDF SHOULD BE for, it has typically worked fine. I’ve been using pandoc and some home-grown scripts to do this sort of thing for a while.
GhostMagician ( @GhostMagician@beehaw.org ) English 8•2 years ago
This is looking like it’ll be a valuable tool I’ll use frequently.
Blackbird ( @Blackbird@infosec.pub ) English 6•2 years ago
Cool concept.
EastEndLatte ( @EastEndLatte@beehaw.org ) English 3•2 years ago
I don’t know the PDF format very well. Is it possible to just drop in a few commands that make it vulnerable?