Recogniform - Tecnologie di riconoscimento ottico

 

 

Go to the Italian website

Recogniform Desktop Reader                   

Recogniform Reader    Workgroup version     download .pdf brochure   Info    

Description
Recogniform Reader is the innovative solution for the automatic data capture (optical reading) from paper forms. The manual data input often represents the bottle-neck in the processes of forms elaboration and documents filling. Using Recogniform Reader you will immediately and simultaneously get:

  • Lower costs;
  • Higher speed;
  • Quality 100% guaranteed.

Compared to manual data-entry, our data-capture system allows you to use a reduced number of human resources, always operating at the maximum hardware speed, without performance decreasing and errors caused by personnel's tiredness and boring. Using the most advanced technologies, Recogniform Reader is able to achieve results that other products can't obtain and won't probably ever reach.

 

Fields of application
Thanks to its flexibility, Recogniform Reader is suitable for an wide range of possibilities. Here some:

 

Technology
Using the state-of-art technologies, Recogniform Reader recognizes all kind of data, thanks to Recogniform recognition engines:

  • OMR Technology
    The OMR technology allows user to read the check boxes, that is the sign affixed in predefined spaces. The
    power of this engine, compared to similar products, is the advantage of working with two operative parameters, evaluating the quantity of existing ink and the size of the sign in the box;
     
  • ICR Technology:
    With ICR system it's possible to recognize manuscript data in unconstrained or constrained mode when there is usually space among characters. The engine has been expressly trained on the European and American writing style with an high accuracy.
     

  • BCR Technology:
    The BCR Technology allows the recognition of bar codes, decoding their content. It recognizes all the standard bar codes, enclosed the pharmaceutical ones.
     
  • OCR Technology:
    It's the recognition technology for printed and typed texts. It's omnifont and it can recognize characters having any font style and size.
     
  • OCR-A Technology:
    This engine works on pre-printed OCR-A codelines of postal and banking documents;
     
  • OCR-B Technology:
    This engine works on pre-printed OCR-B codelines of postal and banking documents;
     
  • MICR CMC7 - E13B Technology:
    This engines works on pre-printed CMC7 or E13B check codelines.
     
  • CHR Technology:
    It allows user to read manuscript data in cursive written (natural writing not in capital letters): what was only imaginary a few years ago, now it's real. This feature is essential for reading forms that haven't been expressly designed for the automatic acquisition and contain unconstrained fields, freely written, without any tie.


  • AMK Technology:
    The AMR allows you to perform the assisted manual data-entry for data you can't or you don't want to read automatically;

     
  • IDE Technology:
    Through the IDE system it's possible to export page zones or whole pages in image format. It's very useful when you need to keep portions of forms containing signatures, sketches, drafts or other information, without modifying anything of their characteristics or structure.

    


Processing Steps

We can split the forms processing in 4 main steps:


1. Input
2. Recognition
3. Verify/Correction
4. Output

 

A "zero" step, prior to the production itself, it's represented by the template definition. What is a template? A template is a "project" that contains all rules and information needed to read properly a specific kind of forms. (For example: the template contains the followforms ing information: read a barcode Code39 in the upper left of the form).
It's very simple to create and develop a template, using the very easy-to-use tool called Recogniform Application Designer (included in the software): just design with your mouse the fields area on the empty form, select the kind of data you need to extract (barcode, ocr, icr, omr, etc.) and that's all. Of course you can set a lot of functions and rules using a powerful scripting language that will let you feel really the "owner" of the software:discovering its flexibility and power you will soon imagine a thousand of interesting opportunities.

Recogniform Desktop Reader allows you to work on a SINGLE STATION: all production steps (input, recognition, correction and output) are executed on the same station (pc) one after the other, sequentially.

If you wish to work with a distributed architecture client/server, with mutiple inpout/correction/output stations, please give a look to Recogniform Workgroup Reader.


2.
Images input
Recogniform Desktop Reader
allows you to import images in different ways:

  • scanner
  • files
  • directory monitoring
  • internet
  • fax

You can use monchrome, greyscale or color images. In this last event the system can filter the colors that may interfere with the recognition.  

2.1 Input from scanner
All TWAIN driver scanners are supported.

2.2 Input from files
Recogniform Desktop Reader
allows to process previously scanned images anda saved in any TIFF format, compressed or not (CCITT G4, CCITT G3, Huffman, PAckbits, JPeg), single or multip-page, as well as in JPEG, BMP, PNG format.

2.3 Input by automatic directory monitoring (with sub-directories also)
You import files by directory monitoring, giving the start to the process automaically at pre-defined interval of time or when a certain nnumber of file is available in the folder.

2.4 Input from internet
You can download from any internet server via FTP with User and password.
This means that you can acquire forms from a scanner sited in a different location.

2.5 Input from fax
Faax input is directly supported, without external applications.


3. Optical recognition engines
The power of Recognifom Desktop Reader is the possibility to work with the state-of-art Recogniform recognition engines, that can be used togheter also on the same field. As described above, you can use the following engines:

  • OMR
  • ICR
  • CHR
  • OCR
  • BCR
  • OCR-A/B
  • MICR CMC7/E13B
  • IDE
  • AMK

4. Images Pre-processing
An adequate preventive images elaboration increases the quality of the recognition process and reduces the files sizes, so we introduced a lot of functions covering every specific requirement.

  • Deskew
    Correction of the slope that sometimes is caused by the scanner when it feed with ADF.
  • Black Border Removal
    Removal of the possible black edges due to the excessive sizes of the scanning area.
  • Form Alignment and Removal
    Alignment of the form to compensate the horizontal and vertical shift;
    Removal of the pre-printed elements when the form is not printed with blind-ink.
  • Lines Removal
    Removal of horizontal and vertical lines rebuilding the characters crossed by them.
  • Despeckle
    Removal of black isolated points and spots.
  • Field Box Removal
    Removal of the box delimiting a single field, if it is not printed with blind-ink.
  • Character Box Removal
    Removal of the box delimiting the single characters of constrained fields, if it is not printed with blind-ink.
  • Color Inversion
    Inversion of colours in case of white characters on dark background.

These cleaning functions can be used on the whole image or locally, defining the specific areas to be processed.


Un'adeguata e preventiva elaborazione dell'immagine aumenta la qualità del processo di riconoscimento e riduce l'ampiezza dei file, quindi abbiamo introdotto molteplici funzioni che coprono ogni specifica esigenza.

5. Data Verify and Post-processing
Using a new validation and correction system, the major part of alphabetic fields can be verified and it's possible to solve automatically all doubts generated by ambiguous recognitions.

5.1 Trigrams
This system uses statistical information relative to all the possible combinations of three consecutive characters in the target language, analyzing some thousand of words: these combinations called Trigrams are enclosed and ready to be used: in this way all the trigrams combinations, the allowed one and the forbidden one, as well as their using frequency, are fully and clearly known. For example, if a recognition engine reads the word "SMITK", the system will correct this word changing it into "SMITH", without operator's intervention and with no risk to give a wrong result. In fact, using a new technology called CREP® (Common Recognition Errors Proofing) and with the help of an Artificial Intelligence system, Recogniform Desktop Reader is able to investigate all proofing combinations, selecting the most probable one. All this process takes place in full autonomy.

Verifica e correzione dei caratteri5.2 E.B.V. (Eye Blow Verification®)
During the definition phase of the application, it's possible to get the manuscript characters or those coming from codelines (all of them or only the suspect ones, whose confidence level is under a specific parametric threshold) to be presented to an operator for a visual verification.Thanks to the Recogniform Technologies exclusive system called
 
Eye Blow Verification®, the operator is able to verify thousands of characters a minute with minimum effort and maximum accuracy. This system is based on the principle that, inside a group of similar elements, the eye is automatically attracted from the different ones. For example, if in a group of people, everybody suits a blue t-shirt except a person who has got a red t-shirt, the last one would be fast noticed without a detailed examination of the whole group! According to these considerations, Recogniform Desktop Reader shows in the same context all the characters recognized and classified in the same manner, so the operator can immediately look and notice the presence of intruders generated by wrong classification. In this way it's possible to find and correct all the possible errors of recognition introduced by: not eliminable dirtiness, defects of scanning, ambiguity, etc. Also in case of ambiguous cases,Recogniform Desktop Reader instantly allows you to solve the problem, automatically recalling on the video the image of the whole field.

5.3 Visual Assisted Verify
After the characters verification, the phase of fields correction starts, visualizing in the same context both data and the relative image, so they can be compared directly.

5.3.1 Text-to-Speech
To save time in comparison operations, we have added an option that allows user to listen read data from the computer, using the text-to-speech technology: in this way, instead of reading data two times, the operator can read just the data on the image, listening from the computer the read data.

5.3.2 Correction and Accuracy
It's possible to decide which fields need correction: all; only those whose level of esteemed accuracy is under a certain threshold; nobody. It's also possible to set for each field a different confidence level.

5.3.3 Normalization
Before being stored, the read data can be formatted in several ways, allowing the output to adapt to any necessity: conversion to uppercase or lowercase, empty spaces deletion, characters substitution, addition of suffix or prefix, etc.

5.3.4 Lookup tables and dictionaries
In addition to database formats and .txt files, you can also use compressed dictionary .dct for the look-up functions. These files are characterized by high speed of access, small memory consuming and don't need settings of alias, etc.

To create a .dct file, there is a specific utility that receives in input an ASCII file (a word for each row). Look-up speed is very high: hundreds of thousands of comparisons a seconds. Compression is around millions of words / Mb.

The following .dct dictionary are available fot the following countries (including first names, last names, zip codes, streets, towns, etc.) :

  • Italy
  • Belgium
  • Canada
  • Finland
  • France
  • Germany
  • Gret Britain
  • Holland
  • Poland
  • Portugal
  • Spain
  • Switzerland
  • Turkey
  • U.S.A.
  • other countries available on request

Moreover you can easily create your personal dictionary.

5.3.5 Overlay
When you scan images with blind ink (red/green/blue) you may need to re-build the original image in order to get a perfect idea of the original form. This may be useful, for example, to correct check boxes fields. With Recogniform Reader you will be able to get an image with overlay, overlaying the black ink of the form scanned on a color/greuscale/black and white empty form

 


6. Output

Recogniform Desktop Reader allows a flexible use of the recognized data thanks to the possibility of exporting them to the applications in different ways. It's possible to export data in three different ways: to databse, to file or to applications.

Output to database:

  • Oracle
  • Interbase
  • Sybase
  • Microsoft SQL Server
  • DB2
  • Paradox
  • dBase
  • FoxPro
  • Access
  • ODBC compatibles

 

Output to file:

  • ASCII text (.txt) (.asc)
  • Comma Separed Value (.csv)
  • Tab Separed Value (.tsv/.tab),
  • Hypertest Markup Language (.htm/.html)
  • Extensible Markup Language (.xml)
  • Microsoft Excel (.xls)
  • Symbolic Link Interchange Format (.slk/.sylk)
  • SQL (.sql)
  • dBase (.dbf)
  • Paradox (.db)
  • Access (.mdb)
  • Portable Document Format (.pdf)

7. Customization by scripts
Recogniform Desktop Reader allows user to set in a visual way all the necessary parameters for the optical reading of data. It also has a sophisticated events oriented scripting language, useful for matching particular needs. With this language you can associate a specific procedure written by user to each scheduled event, modifying the standard behaviour of the program or integrating new features. It's very similar to Basic and to Pascal so it's very easy to learn and user-friendly. This language includes in its syntax conditional expression (IF.), cycles (FOR., REPEAT. and WHILE.) and uses thousands of integrated functions. The system can automatically set specific variables, allowing effective interaction between the application and the process of optical recognition. This feature gives you the possibility to implement new personal features like controls of squaring between fields, multiple readings of the same field with different pre-processing options, or output of personalized layouts. With this system you can get a strongly customized application starting from a standardized product. The advantages are as simple as great: small costs and better reliability and flexibility.

 

8. Statistics
Recogniform Desktop Reader allows user to record automatically all statistical data relative to each job in a log file, that can be centralized in case of multiple installations.

StatisticheThe information stored are:

  • Starting date and time
  • Workstation name
  • User name
  • Form type
  • Automatic reading time
  • Characters verification time
  • Fields correction time
  • Processed forms number
  • Number of correctly read fields
  • Number of suspects fields
  • Number of corrected fields
  • Number of correctly read characters
  • Number of suspects characters
  • Number of corrected character

Using the module Recogniform Statistics Viewer it's possible to analyze data automatically, getting printer-friendly detailed graphics containing meaningful information about the productivity and the efficiency of your processes.

The graphics can refer to all the jobs or only to a specific job. They show:

  • Number of processed forms for workstation
  • Number of processed forms for user
  • Number of processed forms per day
  • Number of processed forms per type
  • Working time for workstation
  • Working time for user
  • Working time for day
  • Working time for type
  • Distribution times of reading, verification and correction
  • Total read characters, suspects and corrected ones
  • Total fields read, suspects and corrected ones

It's also possible to use filters to analyze only the data ref. to a specific workstation and/or user and/or type and/or day.Moreover, the statistical data can be exported in .csv files to be analyzed with other applications.

Briefly, with this system it's possible to monitor users' productivity by a complete set of statistics in a way detailed and global at the same time.


© 2009 Recogniform Technologies S.p.A. - VAT No. IT02376980781 - All rights reserved