Step 3: XML Export Keys

  • XML export with extended character information is included in the CLI OCR Tool.
    This information can be used to develop your own intelligent post-processing of the OCR results, for example:
    • highlighting words in customer search and database applications
    • using the character position information to check and extract specific information
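
As a rough illustration of the word-highlighting use case, the following Python sketch reads character boxes from an XML fragment and computes one bounding rectangle for a search term. The element name charParams and the attributes l, t, r, b are assumptions about the export schema, and the sample XML is invented for illustration; real exports may also use an XML namespace, which this sketch ignores.

```python
import xml.etree.ElementTree as ET

# Invented sample. Element and attribute names are assumptions about the
# export schema and should be checked against a real export.
SAMPLE_XML = """
<line>
  <charParams l="10" t="20" r="18" b="34">O</charParams>
  <charParams l="19" t="20" r="27" b="34">C</charParams>
  <charParams l="28" t="20" r="36" b="34">R</charParams>
  <charParams l="37" t="20" r="41" b="34"> </charParams>
  <charParams l="42" t="21" r="50" b="34">o</charParams>
  <charParams l="51" t="21" r="59" b="34">k</charParams>
</line>
"""

def read_chars(xml_text):
    """Return a list of (character, (left, top, right, bottom)) tuples."""
    root = ET.fromstring(xml_text)
    chars = []
    for node in root.iter("charParams"):
        box = tuple(int(node.get(key)) for key in ("l", "t", "r", "b"))
        # Keep exactly one character per element so indices line up with text.
        chars.append(((node.text or " ")[0], box))
    return chars

def find_term_box(chars, term):
    """Return the bounding box of the first occurrence of term, or None."""
    text = "".join(char for char, _ in chars)
    start = text.find(term)
    if start < 0:
        return None
    boxes = [box for _, box in chars[start:start + len(term)]]
    return (min(b[0] for b in boxes), min(b[1] for b in boxes),
            max(b[2] for b in boxes), max(b[3] for b in boxes))

chars = read_chars(SAMPLE_XML)
highlight = find_term_box(chars, "OCR")
```

The resulting rectangle could be drawn over the page image by a viewer; the search here is a plain substring match on a single line and does not handle terms that span lines.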

--xmlCharAttributesMode

  • New in CLI 11 R2
  • Specifies which character attributes are written to the XML file.
    • None
      No character attributes are to be written.
    • Ascii
      Character coordinates and character confidence are to be written.
    • Basic
      Character coordinates are to be written.
    • Extended
      Character coordinates, character confidence and extended character attributes are to be written.
      • The following extended attributes are written:
        • whether the character is the first character in a word,
        • whether the word is found in the dictionary,
        • whether the word was recognized using a standard or user-defined language and is neither a number nor an identifier,
        • whether the word is a number,
        • whether the word is an identifier,
        • the probability that the character is written in a serif font,
        • the penalty for discordance of characters in the word,
        • the mean stroke width in the RLE representation of the word image.
Key                                     Parameters  Default
-xcam                                   None        None
--xmlCharAttributesMode                 Ascii
                                        Basic
                                        Extended
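
The mode value could be validated before a command line is assembled. The Python sketch below does only that. The executable name ocr-cli and the helper xcam_args are hypothetical placeholders, and treating the parameter as case-sensitive is an assumption; only the -xcam key and its four parameters come from this section.

```python
# Mode names as listed for --xmlCharAttributesMode.
XCAM_MODES = ("None", "Ascii", "Basic", "Extended")

def xcam_args(mode="None"):
    """Return the -xcam key and its parameter as an argument list."""
    # Assumption: the parameter is matched exactly, including case.
    if mode not in XCAM_MODES:
        raise ValueError(f"unknown mode {mode!r}; expected one of {XCAM_MODES}")
    return ["-xcam", mode]

# 'ocr-cli' is a placeholder, not the real executable name.
command = ["ocr-cli"] + xcam_args("Extended")
```

Input, output, and format keys would be appended to the same list before it is handed to a process launcher.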

--xmlWriteAsciiCharAttributes

  • Note: Starting with release 2, this option is obsolete and has no effect on XML export. Use the option -xcam with the parameter Ascii instead.
  • Before release 2, this option wrote character coordinates and character confidence to the XML file.
Key                                     Parameters  Default
-xaca                                               no
--xmlWriteAsciiCharAttributes

--xmlWriteCharacterRecognitionVariants

  • New in CLI 11 R1
  • For each character, the collection of recognition variants is written to the XML file.
Key                                     Parameters  Default
-xacv                                               no
--xmlWriteCharacterRecognitionVariants

--xmlWriteCharAttributes

  • Note: Starting with release 2, this option is obsolete and has no effect on XML export. Use the option -xcam with the parameter Basic instead.
  • Before release 2, this option wrote character coordinates to the XML file.
Key                                     Parameters  Default
-xca                                                no
--xmlWriteCharAttributes

--xmlWriteExtendedCharAttributes

  • Note: Starting with release 2, this option is obsolete and has no effect on XML export. Use the option -xcam with the parameter Extended instead.
  • Before release 2, this option wrote character coordinates, character confidence and extended character attributes to the XML file.
Key                                     Parameters  Default
-xeca                                               no
--xmlWriteExtendedCharAttributes
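
Because each of the three obsolete keys maps to one -xcam parameter, existing command lines could be migrated mechanically. The following Python sketch rewrites an argument list accordingly. The helper name migrate_args is hypothetical, and two points are assumptions rather than documented behavior: that the obsolete keys are bare flags without a value, and that the richest mode should win when several obsolete keys appear together.

```python
# Obsolete key -> replacement -xcam parameter (short and full forms).
OBSOLETE_TO_MODE = {
    "-xca": "Basic", "--xmlWriteCharAttributes": "Basic",
    "-xaca": "Ascii", "--xmlWriteAsciiCharAttributes": "Ascii",
    "-xeca": "Extended", "--xmlWriteExtendedCharAttributes": "Extended",
}
# Assumed ranking by the amount of information each mode writes.
RANK = {"Basic": 1, "Ascii": 2, "Extended": 3}
NEW_KEYS = ("-xcam", "--xmlCharAttributesMode")

def migrate_args(args):
    """Replace obsolete keys (assumed to be bare flags) with one -xcam pair."""
    kept, mode = [], None
    for arg in args:
        if arg in OBSOLETE_TO_MODE:
            candidate = OBSOLETE_TO_MODE[arg]
            if mode is None or RANK[candidate] > RANK[mode]:
                mode = candidate
        else:
            kept.append(arg)
    # Leave an explicit -xcam setting untouched if one is already present.
    if mode is not None and not any(key in kept for key in NEW_KEYS):
        kept += ["-xcam", mode]
    return kept
```

For example, migrate_args(["-xca", "-xcf"]) drops the obsolete -xca flag and appends -xcam Basic while leaving -xcf in place.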

--xmlWriteCharFormatting

  • Character formatting is written to the XML file.
Key                                     Parameters  Default
-xcf                                                no
--xmlWriteCharFormatting

--xmlWriteNondeskewedCoordinates

  • Character coordinates written to the XML file are taken from the original (non-deskewed) image plane rather than from the deskewed one.
Key                                     Parameters  Default
-xnc                                                no
--xmlWriteNondeskewedCoordinates

--xmlWriteWordRecognitionVariants

  • New in CLI 11 R1
  • For each word, the collection of recognition variants is written to the XML file.
Key                                     Parameters  Default
-xwrv                                               no
--xmlWriteWordRecognitionVariants
  • Note: Full keys (the long forms that begin with two hyphens) are marked in italics.