Version 1.11.0 (based on MuPDF v1.11) allows exporting, importing and interrogating files embedded in a PDF.
PDF "/EmbeddedFiles" entries are similar to ZIP archives (or the Microsoft OLE technique): they allow arbitrary data to be incorporated in a PDF and to benefit from its unique features.
    import fitz                                   # = PyMuPDF
    doc = fitz.open("test.pdf")                   # open the PDF
    count = doc.embeddedFileCount
    print("number of embedded files:", count)     # shows number of embedded files

    # get decompressed content of data stored by name "my data"
    # also possible to use integer between 0 and "count - 1"
    buff = doc.embeddedFileGet("my data")

    with open("test.file", "wb") as fout:         # open output file
        fout.write(buff)
Deletion, reporting, importing, copying between PDFs and similar operations are just as simple.
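As an illustration of the bulk case, the following is a minimal sketch that exports every embedded file by looping over the integer indices mentioned above. It relies only on `embeddedFileCount` and `embeddedFileGet()`; the helper name `export_embedded_files` and the output naming scheme `embedded_<index>.bin` are assumptions made for this example, not part of PyMuPDF.

```python
import os


def export_embedded_files(doc, out_dir):
    """Write every embedded file of `doc` into `out_dir`, one file per entry.

    `doc` is assumed to be a document opened with fitz.open(). Only the
    embeddedFileCount attribute and the embeddedFileGet() method are used.
    The output names ("embedded_<index>.bin") are a hypothetical convention.
    """
    if not os.path.isdir(out_dir):
        os.makedirs(out_dir)

    written = []
    for i in range(doc.embeddedFileCount):
        buff = doc.embeddedFileGet(i)          # decompressed content of entry i
        path = os.path.join(out_dir, "embedded_%d.bin" % i)
        with open(path, "wb") as fout:
            fout.write(buff)
        written.append(path)
    return written


# Typical use (requires PyMuPDF):
#   import fitz
#   doc = fitz.open("test.pdf")
#   for path in export_embedded_files(doc, "extracted"):
#       print("wrote", path)
```

Naming the output by index keeps the sketch independent of any metadata lookup; a real utility would more likely use the stored file names.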
See here for more examples and lightweight utilities:
Both 32-bit and 64-bit Python builds are supported, and Python 3 is fully supported and tested up to and including 3.6. Platforms include at least Windows, Mac and Linux. Other platforms that are supported by both Python and MuPDF should also work.