SetaPDF Demos

The $boundaryBox parameter

By default, the SetaPDF-Extractor component will extract all content on a pages content stream and the content which is positioned outside the visible area of a page. To give you some more control to this behavior you can pass a page boundary box constant to the \setasign\SetaPDF2\Extractor\Extractor::getResultByPageNumber() method.

This demo shows its behavior.

 script.php
 Page-Boundaries.pdf
 Run

PHP

copy

<?php

use setasign\SetaPDF2\Core\Document;
use setasign\SetaPDF2\Core\PageBoundaries;
use setasign\SetaPDF2\Extractor\Extractor;

// load and register the autoload function
require_once __DIR__ . '/../../../../../bootstrap.php';

$boxes = [
    PageBoundaries::MEDIA_BOX,
    PageBoundaries::CROP_BOX,
    PageBoundaries::BLEED_BOX,
    PageBoundaries::TRIM_BOX,
    PageBoundaries::ART_BOX,
];

$boundaryBox = displaySelect('Page Boundary box:', $boxes);

$path = $assetsDirectory . '/pdfs/misc/Page-Boundaries.pdf';

$document = Document::loadByFilename($path);

$extractor = new Extractor($document);

$result = $extractor->getResultByPageNumber(1, $boxes[$boundaryBox]);

echo '<pre>';
echo htmlspecialchars($result);
echo '</pre>';

 script.php
 Page-Boundaries.pdf
 Run

Phrase Search