The Extractor does the time-consuming, heavy-duty processing of extracting the
graphical objects from existing PDF files. It can also create markable
PDF files from arbitrary page images. The Extractor produces an intermediate
binary object file that is needed by the Universal_pdf_file_Modifier tool.
This tool can be used to:
Read a single- or multi-page PDF file and extract its PDF objects.
(It does this if it is run from the command line with a single PDF file given as the command argument.)
Or,
Read in image files, one to represent each page.
(It does this if it is run from the command line with one or more image files given as command arguments.)
Images can be in JPG, GIF, PNG, or PPM format.
The recommended resolution is about 110 pixels/inch.
Higher resolutions are fine and give sharper but larger files;
lower resolutions are also fine and give lower-quality but smaller files.
Or,
Open a graphical interface with a file browser to select a PDF file.
(It does this if it is run with no command-line arguments, such as when started by double-clicking
from your desktop interface.)
In this case, it proceeds to extract the selected PDF file into the object data file.
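For the image-file case above, it can help to know what pixel size the recommended resolution implies for a given page. The following is a minimal sketch, not part of the Extractor; the function name page_pixels is hypothetical, and it assumes page dimensions are given in inches.

```shell
# Hypothetical helper (not part of the Extractor): prints the pixel
# width and height of a page at a given resolution.
# Arguments: width in inches, height in inches, pixels per inch.
page_pixels() {
    awk -v w="$1" -v h="$2" -v ppi="$3" \
        'BEGIN { printf "%d %d\n", w * ppi + 0.5, h * ppi + 0.5 }'
}

# US Letter (8.5 x 11 inches) at 110 pixels/inch.
page_pixels 8.5 11 110    # prints "935 1210"
```

So a US Letter page image at the recommended resolution is about 935 by 1210 pixels.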
Usage:
bin/Extract_pdf_file_to_Universal_pdf_objs form.pdf
Or,
bin/Extract_pdf_file_to_Universal_pdf_objs pg1.gif pg2.gif pg3.gif
(Or any number of image files, or pages.)
Or,
bin/Extract_pdf_file_to_Universal_pdf_objs pg*.gif
Or,
bin/Extract_pdf_file_to_Universal_pdf_objs
(Without arguments, it brings up a file browser to select a PDF file.)
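The three invocation forms above can be summarized in a small sketch that reports which mode a given argument list would select. This is only an illustration: the function name extractor_mode is hypothetical, and it assumes the choice is made from the file-name extension, which may not be how the Extractor itself decides.

```shell
# Hypothetical helper (not part of the Extractor): reports which of the
# three documented modes an argument list corresponds to.
# Assumption: the mode is judged by file-name extension only.
extractor_mode() {
    # No arguments: the file browser is opened.
    if [ "$#" -eq 0 ]; then
        echo "browser"
        return 0
    fi
    # Exactly one PDF file: PDF objects are extracted.
    if [ "$#" -eq 1 ]; then
        case "$1" in
            *.pdf|*.PDF) echo "pdf"; return 0 ;;
        esac
    fi
    # Otherwise every argument must be a supported image, one per page.
    for f in "$@"; do
        case "$f" in
            *.jpg|*.JPG|*.gif|*.GIF|*.png|*.PNG|*.ppm|*.PPM) ;;
            *) echo "unknown"; return 1 ;;
        esac
    done
    echo "images"
}

extractor_mode pg1.gif pg2.gif pg3.gif
```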
By default it writes to the file "universal_pdf_obj.data", which can be renamed as needed.
For convenience, it also writes out a blank metadata file, called "blank_meta.txt",
which can be used to start the metadata file, because it has the correct number of
page boundary keywords in it, but nothing else.
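As a sketch of the renaming step, the two default output files can be given names that match the form once a run has finished. The function name keep_outputs_as and the naming scheme (base.data and base_meta.txt) are assumptions made for illustration, not conventions required by the tools.

```shell
# Hypothetical helper (not part of the Extractor): renames the default
# object file and copies the blank metadata file as a starting point.
# Argument: a base name of your choosing, for example "form".
keep_outputs_as() {
    base="$1"
    if [ ! -f universal_pdf_obj.data ]; then
        echo "universal_pdf_obj.data not found; run the Extractor first" >&2
        return 1
    fi
    mv universal_pdf_obj.data "${base}.data"
    # Keep blank_meta.txt itself and start the real metadata file from a copy.
    if [ -f blank_meta.txt ]; then
        cp blank_meta.txt "${base}_meta.txt"
    fi
}

# Typical use, in the directory where the Extractor was run:
#   keep_outputs_as form
```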
You can change the default output file name with the -o option.
For example:
On Linux systems, the Extractor tool requires the following run-time utility:
convert - Provided by ImageMagick (ImageMagick.org, ImageMagick Studio LLC).
Often installed by default on popular distros.
If you do not already have it installed, then on Debian-based distros such as Ubuntu or Linux Mint:
sudo apt-get install imagemagick
(Type your password when prompted, to give permission to install.)
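Before running the Extractor, you can check whether the convert utility is already available. This is a minimal sketch using the standard shell built-in command -v; the function name have_utility is hypothetical.

```shell
# Hypothetical helper: succeeds if the named command is found on the PATH.
have_utility() {
    command -v "$1" >/dev/null 2>&1
}

if have_utility convert; then
    echo "convert found at: $(command -v convert)"
else
    echo "convert not found; install ImageMagick before running the Extractor" >&2
fi
```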
Additionally, on some Debian systems, you need to edit the file:
* The ImageMagick utility must be downloaded and installed separately if it is not already on your Linux system.
* You will see an error message on start-up if you are missing any required utilities.
* Note that the Extractor tool is presently supplied for PCs running Microsoft Windows and for PCs running Linux operating systems, such as Linux Mint,
Ubuntu, Fedora, Red Hat Enterprise Linux, etc.
Linux binary executables will NOT operate on Apple Macs.
Linux binary executables will NOT operate on Microsoft Windows PCs.
Microsoft Windows executables will NOT operate on Linux PCs or Apple Macs.
This limited portability is usually not a hindrance, since the Extractor is typically executed only once per form, and only in the head office
where multiple computing environments are available.
In the future we may provide executables for additional operating systems.
The source code is presently intended for operation under either MS-Windows or Linux/POSIX-compliant environments.
Some modification may be necessary for full operation under other operating systems.