BFO adds text extraction to PDF Library

Added: (Thu Oct 27 2005)

London, England, 27 October 2005: BFO (Big Faceless Organization), a global supplier of Java reporting solutions, has strengthened the acclaimed Big Faceless PDF Library with the addition of text and image extraction.

The 2.6.2 release adds the ability to extract text and bitmap images from PDF documents, and to index PDFs using the Apache Lucene search engine. The library extracts and indexes text in Unicode from form fields, annotations and document metadata as well as from the document body, at a rate of roughly 50 pages a second for large documents.
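
To illustrate what indexing extracted text from several parts of a document involves, here is a minimal sketch of an in-memory inverted index. It is not the BFO or Lucene API: the class `ExtractedTextIndex`, its methods and the `document#field` location format are hypothetical names invented for this example, and the sketch assumes the text has already been extracted by some other component.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical illustration only: not part of the BFO or Lucene APIs. */
final class ExtractedTextIndex {
    // Maps each lower-cased term to the sorted "document#field" locations containing it.
    private final Map<String, Set<String>> postings = new HashMap<>();

    /** Indexes text already extracted from one part of a PDF (body, form field, metadata...). */
    void add(String document, String field, String text) {
        // Split on any run of characters that are not Unicode letters or digits.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                postings.computeIfAbsent(token, k -> new TreeSet<>())
                        .add(document + "#" + field);
            }
        }
    }

    /** Returns every location whose text contained the term, ignoring case. */
    Set<String> search(String term) {
        Set<String> hits = postings.get(term.toLowerCase(Locale.ROOT));
        return hits == null
                ? Collections.<String>emptySet()
                : Collections.unmodifiableSet(hits);
    }
}
```

Recording the field alongside the document name lets a search report whether a match came from the body, a form field or the metadata. A production index such as Lucene adds analysis, ranking and on-disk storage on top of this basic idea.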

The speed and accuracy of text extraction, coupled with the existing features of the PDF Library, make it a wise choice for developers working in data mining, content management systems and form processing environments, as well as in any setting that requires searching or extracting text from large numbers of PDF files.

Text and image extraction requires the Big Faceless PDF Library Extended Edition plus Viewer license, which is available from BFO’s website.

About BFO: Founded in 1998, BFO is a leading global provider of Java-based reporting solutions. It produces a range of robust Java components for the international B2B market, including Report Generator, Graph Library and PDF Library. Report Generator incorporates both libraries and converts XML to PDF documents. Using JSP, ASP or similar technology, it is possible to create dynamic PDF reports as quickly and easily as HTML.

Submitted by: Daniel Wilson