OCR Variables

(Available from version 7.2 onwards.)

Use the OCR (Optical Character Recognition) variables to capture text and other information from PDF and image files, for example, scans of forms that customers and employees completed by hand.

The following main object types are provided:

  • Advanced PDF Object: For character recognition from PDF files where the text is stored as an image and not digitally as text. (Digitally stored text can instead be retrieved using the functions in the PDF Documents built-in service.)

  • Advanced Picture Object: For character recognition from image files.

The following additional object types are provided to receive information returned by the methods of the Advanced PDF and Advanced Picture objects:

  • OCR Suspicious Data Object: Stores information about a single instance of suspicious data in a PDF file or image.

  • OCR Word Data Object: Stores lists of words found using various methods of Advanced PDF Object and Advanced Picture Object variables.

Install NICE Advanced OCR

To use the OCR variable types and their methods, you must first install NICE Advanced OCR.

See Install NICE Advanced OCR.

Advanced PDF Object

Create an Advanced PDF Object variable to capture the text contents of a PDF file using an OCR engine.

The Advanced PDF Object type and its methods are intended for retrieving text from PDF files that contain images of text, for example, scanned forms that have been completed by hand.

For retrieving text from PDF files in which all data is saved as text, for example, PDF forms that were completed electronically, use the functions in the PDF Documents built-in service instead.
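The choice between the two approaches can be sketched as a simple check. The following Python sketch is illustrative only and is not part of the product. The function name `needs_ocr`, the `min_chars` threshold, and the idea of passing in text that was already extracted per page are all assumptions made for this example.

```python
from typing import List


def needs_ocr(page_texts: List[str], min_chars: int = 1) -> bool:
    """Return True if any page has fewer than min_chars non-whitespace
    characters of extractable text (hypothetical helper, not a product API)."""
    for text in page_texts:
        # Remove all whitespace so that blank pages count as empty.
        visible = "".join(text.split())
        if len(visible) < min_chars:
            return True
    return False


# A form completed electronically: every page has embedded text.
electronic_form = ["Name: Jane Smith", "Membership type: Annual"]

# A scanned form: the pages are images, so text extraction yields nothing.
scanned_form = ["", "   \n"]

print(needs_ocr(electronic_form))  # False
print(needs_ocr(scanned_form))     # True
```

In this sketch, a result of True corresponds to the case where an Advanced PDF Object is the appropriate choice, and False to the case where the PDF Documents built-in service is sufficient.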

Properties

Example Files

An example workflow is presented for every method listed below.

All examples are based on the PDF JS_MembershipForm.PDF.

The File Name property of the Advanced_PDF variable is set by default to c:/temp/JS_MembershipForm.pdf.

Methods

Advanced Picture Object

Create an Advanced Picture Object variable to capture text from image files, for example, scanned forms that were hand-filled by customers.

For an example project that uses the Advanced Picture Object, see Project: Capture Data from Scanned Forms.

Methods

OCR Suspicious Data Object

Create an OCR Suspicious Data Object variable to store information about a single instance of suspicious data in a PDF file or image.

Each variable of this type holds only one instance. To capture all instances of suspicious data from a PDF file or image, create a list variable of this type.
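The relationship between a single object and a list of objects can be sketched as follows. This Python sketch is illustrative only: the `SuspiciousData` class, its `text` and `page` fields, and the `collect_suspicious` helper are hypothetical names chosen for this example and do not reflect the actual properties of the OCR Suspicious Data Object.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SuspiciousData:
    """Hypothetical stand-in for one instance of suspicious data."""
    text: str  # the recognized text that is in doubt
    page: int  # the page on which it was found


def collect_suspicious(
    results: List[Tuple[str, int, bool]]
) -> List[SuspiciousData]:
    """Build a list with one SuspiciousData item per flagged result.

    Each input tuple is (text, page, is_suspicious).
    """
    return [SuspiciousData(text, page) for text, page, flagged in results if flagged]


# Three recognized words, two of which were flagged as suspicious.
results = [("Smith", 1, False), ("5m1th", 1, True), ("l0ndon", 2, True)]
found = collect_suspicious(results)
print(len(found))  # 2
```

A single `SuspiciousData` value could hold only one of the two flagged words, which is why the list is needed to keep both.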

Properties

OCR Word Data Object

Create an OCR Word Data Object variable to store the lists of words found using various methods of Advanced PDF Object and Advanced Picture Object variables.

Properties

Methods