Selecting and processing PDFs with and without text


I have a folder with 10,000 PDFs. It contains:

PDFs,

PDFs from scans with text recognition (OCR), and

PDFs from scans without text recognition (image only).

How can I select the PDFs from scans without text recognition (image only) and process them with the OCR function?

Thanks

Using Acrobat XI Pro (or an earlier version of Pro, for that matter), make use of the Preflight checks provided.

Create one or more Preflight profiles.

Check for:

--| Presence of invisible text objects.
(Finds text objects that use text rendering mode 3, i.e. invisible text.)
OCR with the Searchable Image or Searchable Image (Exact) output style uses text rendering mode 3.

--| Text can be mapped to Unicode.
All glyphs in the text can be mapped to Unicode.
(This supports find/search of the PDF's text content.)

If a scanned image of text in a PDF has had OCR applied, glyphs with text rendering mode 3 are present. Acrobat's OCR output uses glyphs from fonts that map to Unicode.
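For anyone who wants a rough pre-sort outside Acrobat, the invisible-text check can be approximated with a small script. This is only a minimal sketch and not how Preflight works internally: it assumes unencrypted PDFs whose page content streams are either uncompressed or Flate-compressed, and it simply looks for the `3 Tr` operator (text rendering mode 3) and for text-showing operators (`Tj`/`TJ`). The names `classify`, `iter_content_streams`, and `looks_like_operators` are made up for this illustration. Object streams, other filters, and encrypted files are not handled, so treat the result as a hint, not a verdict.

```python
import re
import zlib

# Body of every "stream ... endstream" section in the raw file bytes.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)\r?\n?endstream", re.DOTALL)
# "3 Tr" switches to text rendering mode 3 (invisible text).
INVISIBLE_TR_RE = re.compile(rb"(?<![\d.])3\s+Tr\b")
# A string or array operand followed by a text-showing operator.
TEXT_SHOW_RE = re.compile(rb"[)\]>]\s*T[jJ]\b")
PRINTABLE = frozenset(range(32, 127)) | {9, 10, 13}


def looks_like_operators(data, sample=4096, threshold=0.9):
    """True if the start of the stream is mostly printable ASCII.

    Image data is binary and fails this test, which keeps random
    byte sequences in scans from matching the operator patterns.
    """
    head = data[:sample]
    if not head:
        return False
    printable = sum(1 for byte in head if byte in PRINTABLE)
    return printable / len(head) >= threshold


def iter_content_streams(pdf_bytes):
    """Yield decoded streams that look like page content (assumption: Flate or raw)."""
    for match in STREAM_RE.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            data = zlib.decompressobj().decompress(raw)
        except zlib.error:
            data = raw  # not Flate-compressed; inspect it as-is
        if looks_like_operators(data):
            yield data


def classify(pdf_bytes):
    """Return 'ocr', 'text', or 'image-only' (heuristic only)."""
    streams = list(iter_content_streams(pdf_bytes))
    if any(INVISIBLE_TR_RE.search(s) for s in streams):
        return "ocr"         # hidden text layer, as written by Searchable Image OCR
    if any(TEXT_SHOW_RE.search(s) for s in streams):
        return "text"        # visible text, no invisible layer found
    return "image-only"      # no text operators found: candidate for OCR
```

A file would be checked with something like `classify(Path("scan.pdf").read_bytes())`, where `scan.pdf` is just a placeholder name. Anything reported as `image-only` should still be confirmed in Acrobat before running OCR on it.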

By browsing through the Preflight checks you will find many interesting capabilities.

 

So, the use of Preflight profiles lets you identify the PDFs that contain images of textual content and have had OCR applied.
Move the identified files. That leaves the PDF files that have not had OCR applied.
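The "move the identified files" step is easy to script once there is some way to tell which files were identified. A minimal sketch, assuming the identification is available as a function that takes a path and returns true or false (for example, a lookup in a list of file names exported from a Preflight report). `move_identified` and its parameters are hypothetical names used only for this illustration.

```python
import shutil
from pathlib import Path


def move_identified(folder, dest, is_identified):
    """Move every PDF in `folder` for which is_identified(path) is true into `dest`.

    Returns the names of the moved files; all other files stay where they are.
    Only the top level of `folder` is scanned, and only names ending in ".pdf".
    """
    folder, dest = Path(folder), Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    moved = []
    # sorted() materializes the listing before any file is moved.
    for pdf in sorted(folder.glob("*.pdf")):
        if is_identified(pdf):
            shutil.move(str(pdf), str(dest / pdf.name))
            moved.append(pdf.name)
    return moved
```

What remains in the original folder afterwards is the set of files that were not identified, which can then be run through OCR as a batch.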

 

Be well...


