Detect and remove empty pages

Scanned documents usually include pages that are empty or contain only a small amount of information. These pages are archived along with the rest of the document and take up additional storage space. BarcodeOCR can analyze the content of each page and remove pages with a low amount of information from the destination document, which saves disk space in your archive.


Figure: possible file size reduction for duplex-scanned pages

Each page is analyzed and tested for information. Several factors are combined:

  • Ratio of black to white dots after converting the page to black and white (B/W)
  • Recognized text or symbols on the page

These values are combined into a single result, the so-called “information factor”. For each configuration, you can define the minimum factor a page must reach to be saved into the output document. A page with a lower information factor is deleted.
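The idea can be illustrated with a minimal Python sketch. It is not BarcodeOCR's actual implementation: the `Page` class, the way the two factors are combined, and the `text_weight` and `bw_threshold` parameters are all assumptions made for illustration. For simplicity, the sketch measures the share of black dots among all dots rather than the exact black-to-white ratio.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Page:
    """Hypothetical page model: grayscale pixels plus text found by OCR."""
    pixels: List[List[int]]  # grayscale values, 0 (black) to 255 (white)
    text: str = ""           # recognized text (assumed to come from an OCR step)


def black_ratio(pixels: List[List[int]], bw_threshold: int = 128) -> float:
    """Share of dots that turn black when the page is converted to B/W."""
    total = sum(len(row) for row in pixels)
    if total == 0:
        return 0.0
    black = sum(1 for row in pixels for value in row if value < bw_threshold)
    return black / total


def information_factor(page: Page, text_weight: float = 0.001) -> float:
    """Assumed combination: black share plus a small bonus per recognized character."""
    chars = sum(1 for c in page.text if not c.isspace())
    return black_ratio(page.pixels) + text_weight * chars


def keep_pages(pages: List[Page], min_factor: float) -> List[Page]:
    """Keep only pages whose information factor reaches the configured minimum."""
    return [p for p in pages if information_factor(p) >= min_factor]


# A blank page and a page with a few dark dots and two recognized characters.
blank = Page(pixels=[[255] * 10 for _ in range(10)])
content_pixels = [[255] * 10 for _ in range(10)]
content_pixels[0][0] = 0
content_pixels[0][1] = 0
content = Page(pixels=content_pixels, text="Hi")

kept = keep_pages([blank, content], min_factor=0.01)
```

With a minimum factor of 0.01, the blank page is dropped and the page with content is kept. In practice, scanner noise and show-through on duplex scans give blank pages a small non-zero factor, which is why the minimum is configurable.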

You can determine an initial value by scanning a sample PDF file.
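One way to turn sample measurements into a starting value is sketched below. This is an assumption about how such a value might be chosen, not a description of what BarcodeOCR does: the function `suggest_threshold` is hypothetical, and it expects that you already know which sample pages are empty and which contain content.

```python
from typing import Sequence


def suggest_threshold(empty_factors: Sequence[float],
                      content_factors: Sequence[float]) -> float:
    """Suggest a minimum factor halfway between the two sample groups.

    empty_factors:   information factors measured on pages known to be empty
    content_factors: information factors measured on pages that must be kept
    """
    highest_empty = max(empty_factors)
    lowest_content = min(content_factors)
    if highest_empty >= lowest_content:
        # The groups overlap, so no single value separates them cleanly.
        raise ValueError("sample factors overlap; no clean threshold exists")
    return (highest_empty + lowest_content) / 2


# Illustrative values only, not measurements from a real scan.
initial_value = suggest_threshold([0.0, 0.004], [0.02, 0.15])
```

The midpoint leaves equal margin on both sides. If losing a page with content is worse than keeping an empty one, a value closer to the highest empty-page factor is the safer choice.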