http://lilithlela.cyberguerrilla.org/?p=568
This blog intends to augment USP no. 3: The danger of metadata [digital footprints] (original here) for *nix distros:
There is a freeware program by the name of PDF Info (http://www.bureausoft.com/pdfinfo.exe) which lets you edit not only the aforementioned Title/Author/Subject/Keywords fields, but also the PDF Producer and Creator Application fields. It doesn’t, however, let you change the file creation and modification dates and times.
The PDF Toolkit (pdftk) claims to be that all-in-one solution: the closest thing to Adobe Acrobat for Linux.
You can download pdftk as source or as a Debian or RPM package, FreeBSD port, or Gentoo Ebuild. Binaries are available for Windows and Mac OS X too. If you decide to compile pdftk, check the build notes before you begin to find out about any dependencies for your Linux distro or platform.
I am lazy, so I chose to install the package. Installing the pdftk package on Ubuntu and BackTrack requires having the "universe" repository enabled. Then:
$ sudo apt-get install pdftk
The following extra packages will be installed: gcj-4.4-base gcj-4.4-jre-lib libbcmail-java libbcmail-java-gcj libbcprov-java libgcj-bc libgcj-common libgcj10 libgnuinet-java libgnujaf-java libgnumail-java libitext-java libitext-java-gcj
After this operation, 66.5MB of additional disk space will be used.
To look at the metadata that Adobe Reader does not show by default (replace 034045 with your pdf filename):
$ pdftk 034045.pdf dump_data
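The dump also lists other items such as the page count, file IDs and bookmarks. As a small sketch for pulling out only the Info entries, the helper below (the name info_pairs is made up for this example) assumes the dump uses the InfoKey:/InfoValue: line pairs that pdftk prints:

```shell
# info_pairs: read dump_data text on stdin and print "Key: Value"
# for every Info entry. Hypothetical helper, not part of pdftk.
info_pairs() {
    awk '
        # Remember the key name (text after "InfoKey: ").
        /^InfoKey: /   { key = substr($0, 10); next }
        # Print it together with the value (text after "InfoValue: ").
        /^InfoValue: / { print key ": " substr($0, 12) }
    '
}
```

It can be fed straight from pdftk:
$ pdftk 034045.pdf dump_data | info_pairs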
To alter the metadata, first dump it to a file (replace 034045 with your pdf filename):
$ pdftk 034045.pdf dump_data output pdf-metadata
Open the pdf-metadata file and remove the data you wish scrubbed.
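If you would rather not edit by hand, here is a minimal sketch that empties the values of chosen keys. The function name blank_info_values is invented for this example, and it rests on an assumption worth checking against your pdftk version: that update_info drops a key whose InfoValue is empty, while keys missing from the file are left untouched.

```shell
# blank_info_values FILE KEY...
# Print FILE with the InfoValue of each named InfoKey emptied.
# Hypothetical helper; assumes pdftk's InfoKey:/InfoValue: line pairs.
blank_info_values() {
    file=$1; shift
    awk -v keys="$*" '
        BEGIN {
            # Build a lookup table from the space-separated key names.
            n = split(keys, k, " ")
            for (i = 1; i <= n; i++) wanted[k[i]] = 1
        }
        /^InfoKey: /   { current = substr($0, 10); print; next }
        /^InfoValue: / {
            if (current in wanted) { print "InfoValue: "; current = ""; next }
        }
        { print }
    ' "$file"
}
```

For example, to blank the usual give-aways and write the result back:
$ blank_info_values pdf-metadata Creator Producer CreationDate ModDate > pdf-metadata.tmp && mv pdf-metadata.tmp pdf-metadata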
Save the pdf-metadata file. Now you can use that data to scrub the metadata from your file (replace 034045 with your pdf filename):
$ pdftk 034045.pdf update_info pdf-metadata output 034045-no-metadata.pdf
And check the result with:
$ pdftk 034045-no-metadata.pdf dump_data
The creation date is gone and a new modification date has appeared. The iText string also gives away the use of pdftk. These values can be removed with sed:
$ sed -i 's/iText\ 2\.1\.7\ by\ 1T3XT//;s/D:20120409144213+02'\''00'\''//' 034045-no-metadata.pdf
Each '\'' closes the single-quoted string, inserts an escaped single quote, and then reopens the string.
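To see that quoting idiom on its own, here is a tiny sketch using the same timestamp (the variable name stamp is just for illustration):

```shell
# Inside single quotes nothing can be escaped, so '\'' does three things:
#   '    ends the current single-quoted string
#   \'   adds one literal single quote
#   '    starts a new single-quoted string
stamp='D:20120409144213+02'\''00'\'''
printf '%s\n' "$stamp"
# prints: D:20120409144213+02'00'
```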
PdfID0 and PdfID1 are file identifiers. Each is an MD5 hash of various information about the file, which gives the document a unique identifying string that does not depend on the filename. If you want to scrub those too, use sed as above. No geeky tricks are needed to clean those two from the metadata with sed; it's pretty straightforward.
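As a rough sketch of that sed step, the function below overwrites both IDs with zeros instead of deleting them. The same-length replacement is a choice of this example, not something the steps above require: it keeps every byte offset in the file unchanged, which matters because a PDF's cross-reference table records byte offsets. The function name zero_pdf_ids is invented, and the pattern assumes the IDs sit in the file as two uncompressed 32-digit hex strings in angle brackets, side by side. Check your own file before relying on it, and note that an encrypted PDF needs its original ID to be decrypted.

```shell
# zero_pdf_ids IN OUT
# Copy IN to OUT with the two file identifiers replaced by zeros.
# Hypothetical helper; assumes the form <32 hex digits><32 hex digits>.
zero_pdf_ids() {
    zeros=$(printf '%032d' 0)   # a string of 32 zeros
    # LC_ALL=C makes sed treat the PDF as plain bytes.
    # \1 preserves any spaces between the two IDs, so the length stays the same.
    LC_ALL=C sed -E \
        "s/<[0-9A-Fa-f]{32}>( *)<[0-9A-Fa-f]{32}>/<$zeros>\1<$zeros>/" \
        "$1" > "$2"
}
```

Usage, followed by a check that the IDs now read as zeros:
$ zero_pdf_ids 034045-no-metadata.pdf 034045-no-ids.pdf
$ pdftk 034045-no-ids.pdf dump_data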