Config
New installations of ImageMagick sometimes don't have the best configuration out of the box. For example, the default policy can prevent you from splitting a PDF file into separate files, or ImageMagick can run into its configured memory limits. Here are some settings I like to change:
Increase Memory limits
On Debian, open /etc/ImageMagick-<version>/policy.xml and search for the lines containing domain="resource". I replaced them with the following:
<!-- <policy domain="resource" name="temporary-path" value="/tmp"/> -->
<policy domain="resource" name="memory" value="2GiB"/>
<policy domain="resource" name="map" value="4GiB"/>
<policy domain="resource" name="width" value="128KP"/>
<policy domain="resource" name="height" value="128KP"/>
<!-- <policy domain="resource" name="list-length" value="128"/> -->
<policy domain="resource" name="area" value="1.0737GP"/>
<policy domain="resource" name="disk" value="8GiB"/>
<!-- <policy domain="resource" name="file" value="768"/> -->
<!-- <policy domain="resource" name="thread" value="4"/> -->
<!-- <policy domain="resource" name="throttle" value="0"/> -->
<!-- <policy domain="resource" name="time" value="3600"/> -->
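To find the lines that need replacing, something like the following can help. It is only a small sketch: list_resource_policies is a hypothetical helper name, and the policy file path is passed in as an argument because the <version> part of the directory differs between installations.

```shell
# Sketch: print every resource-limit policy line together with its line number.
# list_resource_policies is a hypothetical helper, not part of ImageMagick.
list_resource_policies() {
    policy_file=$1   # path to policy.xml, supplied by the caller
    grep -n 'domain="resource"' "$policy_file"
}
```

Call it with the path to your own policy.xml. After editing the file, `identify -list resource` should show the limits ImageMagick actually applies.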
Allow unpacking of PDFs
By default, the policy blocks ImageMagick from reading PDF files because of a vulnerability in Ghostscript, the interpreter that ImageMagick uses for PDFs, which used to allow remote code execution. This should be patched by now, so as long as you don't plan on working with untrusted files, you should be fine. To allow PDF processing, comment out the following line:
<policy domain="coder" rights="none" pattern="PDF" />
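If you would rather not edit the file by hand, the change can be scripted with sed. This is a sketch under a few assumptions: allow_pdf_coder is a hypothetical helper name, the line in your policy.xml is written exactly as shown above (attribute order included), and the file is writable by the user running it (for files under /etc that usually means root).

```shell
# Sketch: wrap the PDF coder policy line in an XML comment.
# allow_pdf_coder is a hypothetical helper, not part of ImageMagick.
# A backup of the original file is kept next to it with a .bak suffix.
allow_pdf_coder() {
    policy_file=$1   # path to policy.xml, supplied by the caller
    sed -i.bak -E \
        's#^([[:space:]]*)(<policy domain="coder" rights="none" pattern="PDF"[[:space:]]*/>)#\1<!-- \2 -->#' \
        "$policy_file"
}
```

Running it a second time leaves the file unchanged, because the line then starts with the comment marker and no longer matches the pattern.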
Commands
Increasing contrast on (bitmap) PDF files
First, break up the PDF into separate JPG files. (ImageMagick doesn't handle PDFs natively; it delegates them to Ghostscript.)
pdfimages -j file.pdf page
This command extracts all JPEG images embedded in the PDF. If the PDF doesn't consist only of JPEG images, try this command instead:
convert file.pdf page-%03d.jpg
Now, modify the images. For example:
convert page-*.jpg -level 25% page_out-%03d.jpg
And finally, convert them back into a single PDF file:
convert page_out-*.jpg file_out.pdf
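The three steps can also be chained in one go. The following is a sketch rather than a polished tool: increase_pdf_contrast is a hypothetical helper name, it assumes pdfimages and convert are both on the PATH, and it only covers the case where the PDF consists of JPEG images. It works in a scratch directory so that the page-*.jpg globs cannot pick up leftover files from an earlier run.

```shell
# Sketch: extract, adjust and re-merge the pages of a scanned PDF.
# increase_pdf_contrast is a hypothetical helper, not an existing tool.
increase_pdf_contrast() {
    in_pdf=$1          # e.g. file.pdf
    out_pdf=$2         # e.g. file_out.pdf
    level=${3:-25%}    # argument handed to -level, 25% if omitted
    workdir=$(mktemp -d) || return 1

    # 1. Extract the embedded JPEGs as page-000.jpg, page-001.jpg, ...
    # 2. Adjust the levels of every page, writing page_out-000.jpg, ...
    # 3. Merge the adjusted pages into one PDF.
    pdfimages -j "$in_pdf" "$workdir/page" &&
        convert "$workdir"/page-*.jpg -level "$level" "$workdir/page_out-%03d.jpg" &&
        convert "$workdir"/page_out-*.jpg "$out_pdf"
    status=$?

    rm -rf "$workdir"  # clean up even if one of the steps failed
    return "$status"
}
```

For example, `increase_pdf_contrast file.pdf file_out.pdf 25%` does the same as the commands above.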
Remove vertical lines from images
The 1x100 kernel size and the list of values determine the minimum length from which a line is recognized. After playing around with the values, this worked for me to remove scanner lines:
convert input.jpg -morphology bottomhat "1x100:0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0" -negate output.jpg
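Typing the kernel by hand gets tedious when trying other lengths, so it can be generated instead. This is a minimal sketch: make_line_kernel is a hypothetical helper name, and it assumes the kernel should contain exactly as many values as its size states, with a 0 at both ends and 1 everywhere in between.

```shell
# Sketch: print a "1xN" kernel string with N values, 0 at both ends, 1 between.
# make_line_kernel is a hypothetical helper; N is expected to be at least 2.
make_line_kernel() {
    n=$1
    kernel="1x${n}:0"          # size prefix plus the leading 0
    i=2
    while [ "$i" -lt "$n" ]; do
        kernel="${kernel},1"   # one 1 for each inner position
        i=$((i + 1))
    done
    printf '%s,0\n' "$kernel"  # trailing 0 closes the kernel
}
```

With this, the command becomes `convert input.jpg -morphology bottomhat "$(make_line_kernel 100)" -negate output.jpg`, and trying another length only means changing the number.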