Empty template form and recognition zones

The empty form is also required for some image processing operations, such as alignment or form removal.

It is very important to use empty forms identical to the forms used in production: same colors, same size, same resolution. If you use form removal, the form must be completely empty; otherwise you risk removing part of the data to be recognized as well.

Sometimes, with forms printed in blind ink, where the preprinted part of the form is not visible in the scanned image, it is not possible to set the position of the recognition areas accurately enough. In this case you can acquire a photocopy or scan in grayscale/color, but in every case, once the template setup is complete, you have to insert the original form as acquired in production.

When defining recognition zones, follow some guidelines that help the recognition engines avoid errors. As a general rule, if the data to recognize are far apart from each other, it is a good idea to include enough white space in the recognition area, about half a centimeter around the data, so that the system can tolerate the horizontal or vertical offset introduced by the scanner. If the data are very close to each other, however, it is a good idea to size the zones so that they include just the data to recognize, so that extraneous data are not enclosed because of the scanner offset.
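As a rough illustration of the half-centimeter guideline, the sketch below converts the margin to pixels for a given scan resolution and grows a zone by that amount, without going past the page edges. The function names and the (left, top, right, bottom) pixel layout are assumptions made for this example; they are not part of the product.

```python
CM_PER_INCH = 2.54

def cm_to_pixels(cm, dpi):
    """Convert a length in centimeters to whole pixels at the given scan resolution."""
    return round(cm / CM_PER_INCH * dpi)

def pad_zone(zone, page_size, dpi, margin_cm=0.5):
    """Grow a (left, top, right, bottom) zone by margin_cm on every side.

    The result is clamped to the page, given as (width, height) in pixels.
    """
    left, top, right, bottom = zone
    width, height = page_size
    m = cm_to_pixels(margin_cm, dpi)
    return (
        max(0, left - m),
        max(0, top - m),
        min(width, right + m),
        min(height, bottom + m),
    )

# Hypothetical A4 page scanned at 300 dpi (2480 x 3508 pixels).
padded = pad_zone((100, 100, 400, 200), (2480, 3508), dpi=300)
```

At 300 dpi half a centimeter is about 59 pixels, so the same physical margin needs a different pixel value whenever the scan resolution changes.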

When you define a zone for OCR or ICR, you have to be sure that all the characters to recognize are fully enclosed in the area; otherwise the recognition result will not be correct.

When you define a zone for BCR, on the other hand, you have to check not only that the full code is inside the area but also that the area includes enough white space to the right and to the left of the barcode, about half an inch.
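The barcode check can be sketched as a small validation function. It assumes you know the approximate bounding box of the barcode on the empty form; the function name and the (left, top, right, bottom) pixel layout are hypothetical and serve only to illustrate the rule.

```python
def has_quiet_zones(zone, barcode, dpi, quiet_inch=0.5):
    """Return True if the zone encloses the barcode and leaves at least
    quiet_inch of space on its left and right sides.

    Both rectangles are (left, top, right, bottom) in pixels.
    """
    required = round(quiet_inch * dpi)
    z_left, z_top, z_right, z_bottom = zone
    b_left, b_top, b_right, b_bottom = barcode
    encloses = (
        z_left <= b_left and z_top <= b_top
        and z_right >= b_right and z_bottom >= b_bottom
    )
    return (
        encloses
        and (b_left - z_left) >= required
        and (z_right - b_right) >= required
    )
```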

Finally, when you define a zone for OMR, avoid enclosing a large space around the check box: the percentage of black pixels could turn out too low relative to the selected area and never reach the minimum threshold you have set. The area should not be too small either, otherwise you could have problems with the offset introduced by the scanner.
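The effect of the zone size on the black pixel percentage can be seen with a small calculation. The sketch below is not the algorithm used by the OMR engine; it simply assumes a binary image stored as rows of 0 (white) and 1 (black) and a threshold expressed as a fraction of the zone area.

```python
def black_ratio(image, zone):
    """Fraction of black pixels inside a (left, top, right, bottom) zone."""
    left, top, right, bottom = zone
    pixels = [image[y][x] for y in range(top, bottom) for x in range(left, right)]
    return sum(pixels) / len(pixels)

def is_marked(image, zone, threshold):
    """Treat the box as checked when the black ratio reaches the threshold."""
    return black_ratio(image, zone) >= threshold

# Hypothetical 40 x 40 pixel image with a fully filled 10 x 10 check box.
image = [
    [1 if 15 <= x < 25 and 15 <= y < 25 else 0 for x in range(40)]
    for y in range(40)
]

tight_zone = (13, 13, 27, 27)  # small margin around the box
large_zone = (0, 0, 40, 40)    # far too much white space

tight_ratio = black_ratio(image, tight_zone)  # 100 / 196, about 0.51
large_ratio = black_ratio(image, large_zone)  # 100 / 1600 = 0.0625
```

With a threshold of 0.3, the same filled box counts as checked in the tight zone but not in the large one.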

You can also define multiple recognition zones for the same data, overlapping or not, each one with a different parameter set. After recognition, you can process the multiple results with a custom script to decide what to send to the output.
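One possible way to combine the results of several zones is a simple vote. The sketch below is written in Python for illustration only: the scripting language, the data layout, and the availability of a confidence value for each result are all assumptions, so adapt the idea to what your script environment actually provides.

```python
from collections import Counter

def choose_result(results):
    """Pick one value from the results of several zones covering the same data.

    results is a list of (text, confidence) pairs, one per zone.
    The most frequent non-empty text wins; ties are broken by the
    highest confidence seen for that text. Returns None if all are empty.
    """
    candidates = [(text, conf) for text, conf in results if text]
    if not candidates:
        return None
    votes = Counter(text for text, _ in candidates)
    best_confidence = {}
    for text, conf in candidates:
        best_confidence[text] = max(conf, best_confidence.get(text, 0.0))
    return max(votes, key=lambda text: (votes[text], best_confidence[text]))
```

For example, if two zones read "12345" and a third reads "12845", the function returns "12345" even when the third zone reports a higher confidence.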