Origami/pdfextract

From aldeid

Description

pdfextract is a command-line tool from the Origami framework. It extracts various data from a PDF document: streams, scripts, images, fonts, metadata and attachments.

Usage

Syntax

Usage: /usr/local/bin/pdfextract <PDF-file> [-afjms] [-d <output-directory>]

Options

-d, --output-dir DIR    Output directory
-s, --streams           Extracts all decoded streams
-a, --attachments       Extracts file attachments
-f, --fonts             Extracts embedded font files
-j, --js                Extracts JavaScript scripts
-m, --metadata          Extracts metadata streams
-i, --images            Extracts embedded images
-h, --help              Show this message
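
The options can be combined in a single call. As a minimal sketch, the Python helper below assembles a pdfextract command line from the flags listed above. The helper name build_pdfextract_command, the FLAGS mapping and the example paths are hypothetical and not part of Origami. It also assumes that pdfextract is on the PATH and that the PDF file comes first, as in the usage line.

```python
import shlex

# Friendly names mapped to the short flags from the options list above.
FLAGS = {
    "streams": "-s",
    "attachments": "-a",
    "fonts": "-f",
    "js": "-j",
    "metadata": "-m",
    "images": "-i",
}


def build_pdfextract_command(pdf_path, what=(), output_dir=None, binary="pdfextract"):
    """Return an argv list for pdfextract (hypothetical helper, not part of Origami)."""
    unknown = [name for name in what if name not in FLAGS]
    if unknown:
        raise ValueError(f"unknown extraction target(s): {unknown}")
    # Follow the usage line: PDF file first, then the extraction flags.
    argv = [binary, pdf_path]
    argv += [FLAGS[name] for name in what]
    if output_dir is not None:
        argv += ["-d", output_dir]
    return argv


if __name__ == "__main__":
    # Example paths are assumptions for illustration only.
    cmd = build_pdfextract_command(
        "/data/tmp/pdf1.pdf", what=["js", "attachments"], output_dir="/tmp/pdf1.out"
    )
    print(shlex.join(cmd))
```

The returned list can be passed to subprocess.run(cmd, check=True) to actually run the tool. Building a list rather than a shell string avoids quoting problems with file names that contain spaces.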

Example

Let's extract JavaScript contained in the pdf1.pdf file:

$ ./pdfextract -j /data/tmp/pdf1.pdf 
Extracted 1 scripts to 'pdf1.dump/scripts'.

The JavaScript has been dumped to the "pdf1.dump/scripts/" directory:

$ cat pdf1.dump/scripts/script_576906449.js 
function re(count,what)
{
    var v = "";
    while (--count >= 0)
        v += what;
    return v;
}
[SNIP]
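
When a document contains several scripts, it can help to list what was dumped before reading each file. The following is a minimal sketch, assuming the layout seen in the example above (a "scripts" subdirectory inside the dump directory) and a .js extension on the extracted files. The function name list_extracted_scripts is hypothetical.

```python
from pathlib import Path


def list_extracted_scripts(dump_dir):
    """Return sorted (file name, size in bytes) pairs for <dump_dir>/scripts/*.js."""
    scripts_dir = Path(dump_dir) / "scripts"
    if not scripts_dir.is_dir():
        # Nothing was extracted, or the layout differs from the assumed one.
        return []
    return sorted(
        (path.name, path.stat().st_size)
        for path in scripts_dir.glob("*.js")
        if path.is_file()
    )


if __name__ == "__main__":
    # "pdf1.dump" is the directory name reported in the example above.
    for name, size in list_extracted_scripts("pdf1.dump"):
        print(f"{size:>8}  {name}")
```

The function only reads file names and sizes. It never executes the extracted JavaScript, which matters when the PDF under analysis is suspected to be malicious.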
