This is a brief look at some of the software I’m using, have mentioned, or am thinking of using in the course of building a dictionary of given names, the project documented on this blog. This page will start out as a simple list, and as time goes on I will add comments on what these programs do and how I use them.

Acrobat Pro (Adobe)

Briss (Gerhard Aigner)

FineReader Express Edition (ABBYY)

OCR software for the Mac, based on a version of ABBYY’s OCR engine released in 2005. No newer version of the engine is available on the Mac, although the software interface has been updated. The better recognition rates and additional languages added to the Windows versions over the past eight years do not appear to have made it to the Mac version. Costs $99. The newer Windows version (11) supports dictionary-assisted Hebrew OCR, which is nice, but it does not seem to be available to Mac users.

Update: I contacted ABBYY, and they told me that the Pro version for Mac, which will be caught up with the Windows version, is expected either at the end of this year (2013) or the beginning of next year (2014). I look forward to seeing what they release.

Readiris Pro (I.R.I.S.)

OCR software available on the Mac. Oddly, they have Readiris Pro 12 on the Mac App Store, but offer Readiris Pro 14 on their web site. Both cost the same, $129. It supports Hebrew, but the documentation doesn’t say whether that includes dictionary-assisted OCR. It also limits you to working on 50 pages at a time, which is a bizarre restriction for software designated ‘Pro’. It certainly makes me hesitant to plunk down $129 for their software, knowing I will need to figure out workarounds to get it to process the books I’m scanning. Want unlimited scanning? You need to buy their ‘Corporate’ version for $599. I’m thinking…no.

Revel (Adobe)

Pandoc (John MacFarlane)

A Swiss Army knife of a document converter. One of the formats it supports is ICML, the native Adobe InCopy format, which can be easily imported into InDesign. I’m thinking about outputting data from a database to a text file with simple formatting tags, perhaps Markdown, and then using Pandoc to convert that file into ICML for import into InDesign. I’m not sure it will cover everything I need, but it’s an interesting way to test out simple layouts for the book.
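Here is a rough sketch of what that database-to-Markdown-to-ICML pipeline might look like in Python. It is only an illustration under some assumptions: the sample records, the field names (`headword`, `note`), the file names, and the helper functions are all made up for this example, and it assumes Pandoc is installed and on the PATH. The only Pandoc features it relies on are the Markdown reader, the ICML writer, and the `--standalone` and `--output` options.

```python
import shutil
import subprocess
from pathlib import Path

# Hypothetical records standing in for rows pulled from the database.
ENTRIES = [
    {"headword": "Example Name", "note": "first sample entry"},
    {"headword": "Another Name", "note": "second sample entry"},
]


def entries_to_markdown(entries):
    """Render each record as a paragraph: bold headword, then the note."""
    paragraphs = [f"**{e['headword']}** {e['note']}" for e in entries]
    return "\n\n".join(paragraphs) + "\n"


def pandoc_command(source, target):
    """Build the Pandoc call; --standalone asks for a complete ICML file."""
    return [
        "pandoc",
        "--from", "markdown",
        "--to", "icml",
        "--standalone",
        "--output", str(target),
        str(source),
    ]


def convert(entries, out_dir="."):
    """Write entries.md, then convert it to entries.icml if Pandoc is installed."""
    out_dir = Path(out_dir)
    source = out_dir / "entries.md"
    target = out_dir / "entries.icml"
    source.write_text(entries_to_markdown(entries), encoding="utf-8")
    if shutil.which("pandoc") is None:
        return None  # Pandoc is not on the PATH; only the Markdown file exists.
    subprocess.run(pandoc_command(source, target), check=True)
    return target
```

Calling `convert(ENTRIES)` writes `entries.md` and, where Pandoc is available, `entries.icml`, which could then be placed into an InDesign document. Whether the bold headwords map onto usable character styles in InDesign is exactly the kind of thing this test would need to show.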

TLex (TshwaneDJe)

As far as I can tell, this is the only fully featured Dictionary Writing System (DWS) that is available on the Mac. It includes the tlCorpus module for building a corpus and the TLex module for compiling a dictionary. It can export to InDesign, although I’m not clear on what is exported other than text (e.g. styles and internal links).

VueScan (Hamrick)

My go-to scanner app. It works with almost any scanner out there, whether connected via a cable or wirelessly. Support is very good, with fast responses from the developer. It is a somewhat unusual product in that the complaints it receives are about how often updates are released. It includes OCR based on the Google Tesseract OCR engine. Tesseract supports Hebrew, but that support has not yet been integrated into VueScan. I’ve asked the developers about that, and they’ve promised to look into it. Hopefully I will be able to update this page soon to say they’ve added support for Hebrew.



