I recently got involved in a project that required an OCR (Optical Character Recognition) engine to extract text from images. After a bit of research, we decided to use Google's Tesseract.
In particular, we decided to go for version 3.0.5 because it can save the output in a nicely formatted TSV file containing, among . . .