Translate Scanned Documents with Ease

By Piotr Stojanow (Twitter: @piotrstojanow)

I sometimes need to translate text from scanned PDFs, which can be a hassle: the pages are just images, so there is no text to select and copy. So, I created this little script that’s a translation ninja:

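#!/bin/sh
# Screenshot a screen region, OCR it as Swedish, translate it to English, show the result.
IMAGE=/tmp/"$(date +%Y%m%d-%H%M%S)".png
# slurp prints the selected geometry for grim; stop if the selection is cancelled.
grim -g "$(slurp)" "$IMAGE" || exit 1
# "-" makes tesseract write the text to stdout; -l swe selects the Swedish language data.
TRANSLATED="$(tesseract "$IMAGE" - -l swe | trans --indent 0 --brief :en --no-bidi)"
# -H keeps the foot window open after echo exits; -a sets its app-id to "floating".
foot -H -a floating -e echo "$TRANSLATED"
rm -f "$IMAGE"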

Here’s how it works:

  • Select the area: bind the script to a global shortcut (mine is in Sway), then drag with slurp to choose the part of the PDF you want to translate.
  • OCR magic: grim saves the selection as a PNG, and Tesseract converts the image into text (Swedish here, because of -l swe).
  • Translate: translate-shell sends the text to Google Translate and returns a quick English translation.
  • Terminal bliss: a foot terminal window pops up with the translated text.
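
The OCR and translate steps above hard-code Swedish and English. If you read scans in more than one language, those two steps could be pulled into a small function that takes the languages as arguments. This is only a sketch: the function name translate_image and its argument order are made up for illustration, and the defaults simply mirror the script's swe and en.

```shell
# Hypothetical helper (not part of the original script).
# Usage: translate_image IMAGE [OCR_LANG] [TARGET_LANG]
#   OCR_LANG    - Tesseract language code, defaults to swe
#   TARGET_LANG - translate-shell target code, defaults to en
translate_image() {
    image=$1
    ocr_lang=${2:-swe}
    target=${3:-en}
    # Same pipeline as the script: OCR text to stdout, then a brief translation.
    tesseract "$image" - -l "$ocr_lang" |
        trans --indent 0 --brief ":$target" --no-bidi
}
```

In the script, the TRANSLATED line would then become something like TRANSLATED="$(translate_image "$IMAGE" deu)" for a German scan, provided the matching Tesseract language data is installed.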

It’s like having a personal translator at your fingertips. No more transcribing text from PDF images.
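
For the global shortcut from the first step, a Sway config fragment along these lines should do. Everything specific here is an assumption rather than my actual config: the key combination, the script path ~/bin/translate-region.sh, and the window rule that makes the foot window float by matching the app-id "floating" the script passes with -a. It also assumes $mod is already set in your config.

```
# ~/.config/sway/config (sketch; key and path are placeholders)
bindsym $mod+Shift+t exec ~/bin/translate-region.sh

# Float the result window; matches the app-id set by "foot -a floating"
for_window [app_id="floating"] floating enable
```

Reload Sway after editing the config for the binding to take effect.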