OCR PDF
Pull text out of scanned PDFs: photos of paper become editable Word documents or plain text. Runs in your browser; nothing is uploaded.
Drop or click to choose a scanned PDF
Reads text out of scanned PDFs (photos of paper). English. Up to 50 pages.
Private by design: your files stay in your browser, and the OCR engine runs there too.
Frequently asked
- What is OCR?
- Optical Character Recognition — software that reads text out of an image. Scanned PDFs are pictures of paper, so the text isn't selectable until OCR turns the pixels back into letters.
- Does the OCR run on your servers?
- No. The OCR engine (Tesseract) and your PDF both stay in your browser. The first run downloads ~15 MB, which is cached afterward. All OCR processing, including on that first run, happens locally.
- How accurate is the OCR?
- On clean scans of printed English text (laser printer, flatbed scanner): 95% or better. On phone photos with glare or low light: closer to 70-85%. Handwriting is not supported. Always proofread.
- How long does it take?
- 5-15 seconds per page on a modern laptop, 15-60 seconds per page on a phone. Each run is capped at 50 pages to avoid out-of-memory errors in mobile browsers.
- What languages are supported?
- English in v1. More languages coming — each adds another ~5 MB of language data to download.
- Can it OCR handwriting?
- No. Tesseract is built for printed text. Handwriting recognition needs a different model (we don't ship one yet).
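For a rough sense of what the timing and download figures above add up to, the arithmetic can be sketched in TypeScript. This is an illustration only, not the tool's actual code: every name here (`estimateRun`, `firstRunDownloadMb`, `MAX_PAGES`, and so on) is hypothetical, the numbers are copied from the answers above, and the sketch assumes the ~15 MB first download already includes the English data.

```typescript
// Figures quoted in the FAQ; all identifiers are hypothetical.
const MAX_PAGES = 50;              // page cap per run
const FIRST_RUN_DOWNLOAD_MB = 15;  // approximate, assumed to include English
const EXTRA_LANGUAGE_MB = 5;       // approximate, per additional language

type Device = "laptop" | "phone";

// Seconds per page as [fastest, slowest], from the FAQ.
const SECONDS_PER_PAGE: Record<Device, [number, number]> = {
  laptop: [5, 15],
  phone: [15, 60],
};

interface RunEstimate {
  minSeconds: number;
  maxSeconds: number;
}

// Estimate the total OCR time for one run, rejecting runs over the page cap.
function estimateRun(pageCount: number, device: Device): RunEstimate {
  if (!Number.isInteger(pageCount) || pageCount < 1) {
    throw new RangeError("pageCount must be a positive integer");
  }
  if (pageCount > MAX_PAGES) {
    throw new RangeError(`A run is capped at ${MAX_PAGES} pages`);
  }
  const [fast, slow] = SECONDS_PER_PAGE[device];
  return { minSeconds: pageCount * fast, maxSeconds: pageCount * slow };
}

// Estimate the one-time download size if extra languages were added.
function firstRunDownloadMb(extraLanguages: number): number {
  if (!Number.isInteger(extraLanguages) || extraLanguages < 0) {
    throw new RangeError("extraLanguages must be a non-negative integer");
  }
  return FIRST_RUN_DOWNLOAD_MB + extraLanguages * EXTRA_LANGUAGE_MB;
}
```

Under these assumptions, a full 50-page run on a phone would take somewhere between 750 and 3000 seconds (roughly 12 to 50 minutes), which is one reason to split long scans into smaller batches on mobile.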