Recogniform Layout Analysis SDK
Recogniform Layout Analysis SDK analyzes the layout of any document using complex algorithms that recognize the different kinds of areas on the page with high accuracy.
Recogniform Layout Analysis SDK identifies the following types of areas:
images (pictures or drawings)
tables (rows, columns and cells)
horizontal and vertical lines
After layout analysis, it is possible to perform a sub-classification by defining rules suited to the kind of document being analyzed. For example, on a newspaper page, we could recognize a text area as a "caption" when it sits immediately below a picture, possibly centered with respect to the picture, and possibly set in a different font that is smaller than the average size of the other characters recognized as text on the page. In the same way, it is possible to recognize some text lines as a "title" according to their position on the page and/or their font size.
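The kind of rule described above can be sketched in a few lines of Python. This is only an illustration and not the SDK's API: the `Area` record, the `sub_classify` function, the `max_gap` and `title_ratio` parameters and the label strings are all assumptions made for the example. For simplicity the average font size is taken over all text areas, and the title rule looks only at font size, not position.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Area:
    """Hypothetical record for one area found by layout analysis."""
    kind: str               # "text", "image", "table" or "line" (assumed labels)
    x: int                  # left edge, in pixels
    y: int                  # top edge, in pixels
    w: int                  # width, in pixels
    h: int                  # height, in pixels
    font_size: float = 0.0  # estimated font size; only meaningful for text


def _is_just_below(text, img, max_gap):
    """True if the text starts at most max_gap pixels under the image
    and overlaps it horizontally."""
    gap = text.y - (img.y + img.h)
    overlaps = text.x < img.x + img.w and img.x < text.x + text.w
    return 0 <= gap <= max_gap and overlaps


def sub_classify(areas, max_gap=20, title_ratio=1.5):
    """Return one label per area, refining "text" into "caption" or "title"."""
    sizes = [a.font_size for a in areas if a.kind == "text"]
    avg = mean(sizes) if sizes else 0.0
    images = [a for a in areas if a.kind == "image"]
    labels = []
    for a in areas:
        label = a.kind
        if a.kind == "text":
            small = a.font_size < avg
            if small and any(_is_just_below(a, i, max_gap) for i in images):
                label = "caption"   # small text right under a picture
            elif a.font_size >= title_ratio * avg:
                label = "title"     # text much larger than the page average
        labels.append(label)
    return labels


page = [
    Area("image", 100, 100, 400, 300),
    Area("text", 150, 410, 300, 20, font_size=8.0),
    Area("text", 50, 20, 500, 40, font_size=24.0),
    Area("text", 50, 500, 500, 300, font_size=10.0),
]
labels = sub_classify(page)
```

In a real project the thresholds would be tuned per document type, and further rules (centering, font family) could be added in the same way.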
Layout analysis: why?
Usually the goal of layout analysis of any document (newspaper, magazine, contract, form, invoice, or any other kind of document) is to recognize its structure automatically, identify it, extract the areas of interest and run text recognition using optical recognition engines such as OCR, ICR or BCR, in order to convert the original image into a structured document that contains all the required information and keeps the same layout as the original. The classic example is a searchable PDF file of an old newspaper.
To get the best results from the analysis, the image to process should be of the highest possible quality. To help with this, you can use some of the Recogniform Image Processing libraries, such as:
Deskew
With high-capacity scanners, the ADF sometimes skews the paper. You can solve this problem using Recogniform Deskew SDK: you get properly aligned images without re-scanning, because the wrong inclination of the document is corrected automatically and quickly. You can deskew up to 45°, and the angle may be estimated using two methods: text analysis or finding the black border. For more information, please take a look at our Deskew SDK.
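To give an idea of how a skew angle can be estimated from the text itself, here is a minimal Python sketch of a projection-profile search, a common textbook technique. It is not necessarily the method used inside the Deskew SDK. The function name `estimate_skew`, its parameters and the pixel-list input format are assumptions for the example.

```python
import math


def estimate_skew(black_pixels, max_angle=45.0, step=0.5):
    """Estimate the skew angle in degrees from (x, y) black-pixel coordinates.

    For each candidate angle the pixels are sheared back by that angle and
    counted per row. When the candidate matches the real skew, text lines
    collapse into few rows, so the sum of squared row counts peaks.
    Positive angles mean lines descend to the right (y grows downward).
    """
    best_angle, best_score = 0.0, -1
    steps = int(round(2 * max_angle / step))
    for i in range(steps + 1):
        angle = -max_angle + i * step
        t = math.tan(math.radians(angle))
        hist = {}
        for x, y in black_pixels:
            row = round(y - x * t)
            hist[row] = hist.get(row, 0) + 1
        score = sum(count * count for count in hist.values())
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle


# Three synthetic "text lines", each 200 pixels long, skewed by 3 degrees.
slope = math.tan(math.radians(3.0))
pixels = [(x, round(y0 + x * slope)) for y0 in (50, 100, 150) for x in range(200)]
angle = estimate_skew(pixels)
```

The exhaustive search keeps the sketch short. On real scans one would typically downsample the image first and refine the angle with a finer step around the best coarse candidate.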
Despeckle and noise removal
When scanning from copies or microfilm, dust and dirt may add noise to the images. You can avoid this problem using our Recogniform Despeckle Library. You just need to specify how big a dust element can be (e.g. 2x2 pixels). For more information visit our Despeckle SDK page.
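The idea of a maximum dust size can be illustrated with a small Python sketch that clears every connected group of black pixels whose bounding box fits within the given size. This is a generic connected-component approach shown under our own assumptions, not the Despeckle Library's actual algorithm or API. The `despeckle` function, its parameters and the list-of-lists image format (1 = black, 0 = white) are hypothetical.

```python
def despeckle(image, max_w=2, max_h=2):
    """Return a copy of a binary image (1 = black) with small specks cleared.

    A speck is an 8-connected group of black pixels whose bounding box is
    at most max_w pixels wide and max_h pixels high.
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    result = [row[:] for row in image]
    seen = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Collect one connected component using an explicit stack.
            stack, component = [(r, c)], []
            seen[r][c] = True
            while stack:
                y, x = stack.pop()
                component.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
            ys = [p[0] for p in component]
            xs = [p[1] for p in component]
            height = max(ys) - min(ys) + 1
            width = max(xs) - min(xs) + 1
            if width <= max_w and height <= max_h:
                for y, x in component:
                    result[y][x] = 0  # erase the speck
    return result


# An 8x8 image with one 2x2 speck and one 5x3 block of real content.
img = [[0] * 8 for _ in range(8)]
img[0][0] = img[0][1] = img[1][0] = img[1][1] = 1
for y in range(3, 6):
    for x in range(3, 8):
        img[y][x] = 1
clean = despeckle(img)
```

The size limit has to stay below the smallest meaningful mark in the document (for example punctuation dots), otherwise real content would be erased together with the dust.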
Black border removal and auto-cropping
The Black Border Removal SDK automatically detects and removes black borders in monochrome or grayscale images. A black border appears in images acquired by scanners when the paper size is smaller than the scanning area, and in images acquired from microfilm, microfiche and aperture cards. Removing the border is a very important pre-processing step that improves the compression rate, reducing file size, and improves the visual appearance of the image. For more information visit our Black Border Removal SDK page.
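As a rough illustration of border detection and auto-cropping, the following Python sketch peels rows and columns off each side of a binary image while they are almost entirely black, then crops to what is left. Real borders are often irregular or skewed, so this is a simplified sketch under our own assumptions and not the SDK's algorithm. The names `find_page_box` and `crop` and the `black_ratio` parameter are hypothetical.

```python
def find_page_box(image, black_ratio=0.9):
    """Return (top, left, bottom, right) of the page inside a black border.

    image is a list of rows with 1 = black and 0 = white. A row or column
    counts as border when at least black_ratio of its pixels are black.
    bottom and right are exclusive.
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    top, bottom, left, right = 0, rows, 0, cols

    def row_is_border(r):
        return sum(image[r][left:right]) >= black_ratio * (right - left)

    def col_is_border(c):
        black = sum(image[r][c] for r in range(top, bottom))
        return black >= black_ratio * (bottom - top)

    # Peel border rows first, then border columns within the remaining rows.
    while top < bottom and row_is_border(top):
        top += 1
    while bottom > top and row_is_border(bottom - 1):
        bottom -= 1
    while left < right and col_is_border(left):
        left += 1
    while right > left and col_is_border(right - 1):
        right -= 1
    return top, left, bottom, right


def crop(image, box):
    """Return the sub-image described by box."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


# A 10x8 scan: 2-pixel black border around a white page with one black dot.
scan = [[1] * 10 for _ in range(8)]
for y in range(2, 6):
    for x in range(2, 8):
        scan[y][x] = 0
scan[3][4] = 1
box = find_page_box(scan)
page = crop(scan, box)
```

Cropping to the detected box removes the large black regions, which is why the step helps both compression and appearance.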
Look at the following image: Recogniform Layout Analysis will recognize all areas automatically, distinguishing between text areas, inverted text areas, images, lines, tables, etc.
As you can see from the image on the right, with Recogniform Layout Analysis all areas with the same content are recognized properly and marked with different colors. We have:
green: inverted text
You can download the Demo version before you order it. We appreciate any feedback you might have on our products. If you have a special requirement we would be happy to customize our product for your specific application. You can also download an evaluation version of this product for Visual Basic, Visual C++ or Delphi.
Pricing and ordering info
For more information about Recogniform Layout Analysis Library, please use our contacts page.