How to use the OCR (TesseractOCR) PHP library

丶灬走出姿态 submitted on 2021-02-20 19:21:29

Question


I cloned the OCR library's Git repository using this link:

git clone git://github.com/thiagoalessio/tesseract-ocr-for-php.git

Then I simply included the required file by following this example.

Here is the example code which I am trying to run:

require_once './src/TesseractOCR.php';
$tesseract = new TesseractOCR('text.png');
$text = $tesseract->recognize();
echo "The recognized text is:", $text;

But it always throws a fatal error:

Fatal error: Uncaught Error: Call to undefined method TesseractOCR::recognize()

Edit: I tried to use run() instead of recognize():

require_once './src/TesseractOCR.php';
$tesseract = new TesseractOCR('text.png');
$text = $tesseract->run();
var_dump($text);
echo PHP_EOL, "The recognized text is:", $text, PHP_EOL;

Then the result is: string(0) "" The recognized text is:

I tried my best to find an appropriate solution but could not find a reliable one.


Answer 1:


This sample code probably comes from this article or a similar one. But I can see that it's over 1.5 years old and apparently outdated.

Take a look at their GitHub page. It looks like the method is now run() instead of recognize():

<?php
echo (new TesseractOCR('german.png'))
    ->run();
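
If run() returns an empty string, as in the edit to the question, two things are worth ruling out first: the image path may not be readable from the script's working directory, or the tesseract executable (which the PHP library wraps) may not be installed or on the PATH. Below is a minimal sketch of such a check using only PHP built-ins. The function name checkOcrSetup and its messages are hypothetical and not part of the library.

```php
<?php
/**
 * Hypothetical helper (not part of TesseractOCR): returns a list of
 * problems that could explain an empty OCR result.
 */
function checkOcrSetup($imagePath, $binary = 'tesseract')
{
    $problems = array();

    // The image must exist and be readable from the current working directory.
    if (!is_readable($imagePath)) {
        $problems[] = "Image not readable: $imagePath";
    }

    // Ask the binary for its version; a non-zero exit code usually means
    // it is missing or not on the PATH.
    $output = array();
    $exitCode = 1;
    exec(escapeshellarg($binary) . ' --version 2>&1', $output, $exitCode);
    if ($exitCode !== 0) {
        $problems[] = "Could not run '$binary' (exit code $exitCode)";
    }

    return $problems;
}

// Print whatever the check finds for the image used in the question.
foreach (checkOcrSetup('text.png') as $problem) {
    echo $problem, PHP_EOL;
}
```

An empty list does not guarantee that recognition will succeed (the image may simply contain no text Tesseract can read), but it rules out the two setup problems above.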



Answer 2:


***Get all text from images using PHP OCR***

Include the HTTP/Request2.php library.

Include **Net/URL2.php**:

<?php
class Net_URL2
{

    const OPTION_STRICT = 'strict';
    const OPTION_USE_BRACKETS = 'use_brackets';

    const OPTION_DROP_SEQUENCE = 'drop_sequence';
    const OPTION_ENCODE_KEYS = 'encode_keys';

    const OPTION_SEPARATOR_INPUT = 'input_separator';
    const OPTION_SEPARATOR_OUTPUT = 'output_separator';

    private $_options = array(
        self::OPTION_STRICT           => true,
        self::OPTION_USE_BRACKETS     => true,
        self::OPTION_DROP_SEQUENCE    => true,
        self::OPTION_ENCODE_KEYS      => true,
        self::OPTION_SEPARATOR_INPUT  => '&',
        self::OPTION_SEPARATOR_OUTPUT => '&',
        );

    private $_scheme = false;

    private $_userinfo = false;
    private $_host = false;
    private $_port = false;
    private $_path = '';
    private $_query = false;
    private $_fragment = false;

    public function __construct($url, array $options = array())
    {
        foreach ($options as $optionName => $value) {
            if (array_key_exists($optionName, $this->_options)) {
                $this->_options[$optionName] = $value;
            }
        }

        $this->parseUrl($url);
    }
    public function __set($var, $arg)
    {
        $method = 'set' . $var;
        if (method_exists($this, $method)) {
            $this->$method($arg);
        }
    }

    public function __get($var)
    {
        $method = 'get' . $var;
        if (method_exists($this, $method)) {
            return $this->$method();
        }

        return false;
    }
    public function getScheme()
    {
        return $this->_scheme;
    }
    public function setScheme($scheme)
    {
        $this->_scheme = $scheme;
        return $this;
    }
    public function getUser()
    {
        return $this->_userinfo !== false
            ? preg_replace('(:.*$)', '', $this->_userinfo)
            : false;
    }
    public function getPassword()
    {
        return $this->_userinfo !== false
            ? substr(strstr($this->_userinfo, ':'), 1)
            : false;
    }
    public function getUserinfo()
    {
        return $this->_userinfo;
    }
    public function setUserinfo($userinfo, $password = false)
    {
        if ($password !== false) {
            $userinfo .= ':' . $password;
        }

        if ($userinfo !== false) {
            $userinfo = $this->_encodeData($userinfo);
        }

        $this->_userinfo = $userinfo;
        return $this;
    }

    public function getHost()
    {
        return $this->_host;
    }
    public function setHost($host)
    {
        $this->_host = $host;
        return $this;
    }
    public function getPort()
    {
        return $this->_port;
    }

    public function setPort($port)
    {
        $this->_port = $port;
        return $this;
    }
    public function getAuthority()
    {
        if (false === $this->_host) {
            return false;
        }

        $authority = '';

        if (strlen($this->_userinfo)) {
            $authority .= $this->_userinfo . '@';
        }

        $authority .= $this->_host;

        if ($this->_port !== false) {
            $authority .= ':' . $this->_port;
        }

        return $authority;
    }
    public function setAuthority($authority)
    {
        $this->_userinfo = false;
        $this->_host     = false;
        $this->_port     = false;

        if ('' === $authority) {
            $this->_host = $authority;
            return $this;
        }

        if (!preg_match('(^(([^@]*)@)?(.+?)(:(\d*))?$)', $authority, $matches)) {
            return $this;
        }

        if ($matches[1]) {
            $this->_userinfo = $this->_encodeData($matches[2]);
        }

        $this->_host = $matches[3];

        if (isset($matches[5]) && strlen($matches[5])) {
            $this->_port = $matches[5];
        }
        return $this;
    }

    public function getPath()
    {
        return $this->_path;
    }

    public function setPath($path)
    {
        $this->_path = $path;
        return $this;
    }

    public function getQuery()
    {
        return $this->_query;
    }

    public function setQuery($query)
    {
        $this->_query = $query;
        return $this;
    }

    public function getFragment()
    {
        return $this->_fragment;
    }

    public function setFragment($fragment)
    {
        $this->_fragment = $fragment;
        return $this;
    }
    public function getQueryVariables()
    {
        $separator   = $this->getOption(self::OPTION_SEPARATOR_INPUT);
        $encodeKeys  = $this->getOption(self::OPTION_ENCODE_KEYS);
        $useBrackets = $this->getOption(self::OPTION_USE_BRACKETS);
        $return      = array();
        for ($part = strtok($this->_query, $separator);
            strlen($part);
            $part = strtok($separator)
        ) {
            list($key, $value) = explode('=', $part, 2) + array(1 => '');

            if ($encodeKeys) {
                $key = rawurldecode($key);
            }
            $value = rawurldecode($value);

            if ($useBrackets) {
                $return = $this->_queryArrayByKey($key, $value, $return);
            } else {
                if (isset($return[$key])) {
                    $return[$key]  = (array) $return[$key];
                    $return[$key][] = $value;
                } else {
                    $return[$key] = $value;
                }
            }
        }

        return $return;
    }

    /**
     * Parse a single query key=value pair into an existing php array
     *
     * @param string $key   query-key
     * @param string $value query-value
     * @param array  $array of existing query variables (if any)
     *
     * @return mixed
     */
    private function _queryArrayByKey($key, $value, array $array = array())
    {
        if (!strlen($key)) {
            return $array;
        }

        $offset = $this->_queryKeyBracketOffset($key);
        if ($offset === false) {
            $name = $key;
        } else {
            $name = substr($key, 0, $offset);
        }

        if (!strlen($name)) {
            return $array;
        }

        if (!$offset) {
            // named value
            $array[$name] = $value;
        } else {
            // array
            $brackets = substr($key, $offset);
            if (!isset($array[$name])) {
                $array[$name] = null;
            }
            $array[$name] = $this->_queryArrayByBrackets(
                $brackets, $value, $array[$name]
            );
        }

        return $array;
    }
    private function _queryArrayByBrackets($buffer, $value, array $array = null)
    {
        $entry = &$array;

        for ($iteration = 0; strlen($buffer); $iteration++) {
            $open = $this->_queryKeyBracketOffset($buffer);
            if ($open !== 0) {
                // Opening bracket [ must exist at offset 0, if not, there is
                // no bracket to parse and the value dropped.
                // if this happens in the first iteration, this is flawed, see
                // as well the second exception below.
                if ($iteration) {
                    break;
                }
                // @codeCoverageIgnoreStart
                throw new Exception(
                    'Net_URL2 Internal Error: '. __METHOD__ .'(): ' .
                    'Opening bracket [ must exist at offset 0'
                );
                // @codeCoverageIgnoreEnd
            }

            $close = strpos($buffer, ']', 1);
            if (!$close) {
                // this error condition should never be reached as this is a
                // private method and bracket pairs are checked beforehand.
                // See as well the first exception for the opening bracket.
                // @codeCoverageIgnoreStart
                throw new Exception(
                    'Net_URL2 Internal Error: '. __METHOD__ .'(): ' .
                    'Closing bracket ] must exist, not found'
                );
                // @codeCoverageIgnoreEnd
            }

            $index = substr($buffer, 1, $close - 1);
            if (strlen($index)) {
                $entry = &$entry[$index];
            } else {
                if (!is_array($entry)) {
                    $entry = array();
                }
                $entry[] = &$new;
                $entry = &$new;
                unset($new);
            }
            $buffer = substr($buffer, $close + 1);
        }

        $entry = $value;

        return $array;
    }
    private function _queryKeyBracketOffset($key)
    {
        if (false !== $open = strpos($key, '[')
            and false === strpos($key, ']', $open + 1)
        ) {
            $open = false;
        }

        return $open;
    }

    public function setQueryVariables(array $array)
    {
        if (!$array) {
            $this->_query = false;
        } else {
            $this->_query = $this->buildQuery(
                $array,
                $this->getOption(self::OPTION_SEPARATOR_OUTPUT)
            );
        }
        return $this;
    }

    public function setQueryVariable($name, $value)
    {
        $array = $this->getQueryVariables();
        $array[$name] = $value;
        $this->setQueryVariables($array);
        return $this;
    }

    public function unsetQueryVariable($name)
    {
        $array = $this->getQueryVariables();
        unset($array[$name]);
        $this->setQueryVariables($array);
    }

    public function getURL()
    {
        // See RFC 3986, section 5.3
        $url = '';

        if ($this->_scheme !== false) {
            $url .= $this->_scheme . ':';
        }

        $authority = $this->getAuthority();
        if ($authority === false && strtolower($this->_scheme) === 'file') {
            $authority = '';
        }

        $url .= $this->_buildAuthorityAndPath($authority, $this->_path);

        if ($this->_query !== false) {
            $url .= '?' . $this->_query;
        }

        if ($this->_fragment !== false) {
            $url .= '#' . $this->_fragment;
        }

        return $url;
    }

    private function _buildAuthorityAndPath($authority, $path)
    {
        if ($authority === false) {
            return $path;
        }

        $terminator = ($path !== '' && $path[0] !== '/') ? '/' : '';

        return '//' . $authority . $terminator . $path;
    }

    public function __toString()
    {
        return $this->getURL();
    }

    public function getNormalizedURL()
    {
        $url = clone $this;
        $url->normalize();
        return $url->getURL();
    }

    public function normalize()
    {
        // See RFC 3986, section 6

        // Scheme is case-insensitive
        if ($this->_scheme) {
            $this->_scheme = strtolower($this->_scheme);
        }

        // Hostname is case-insensitive
        if ($this->_host) {
            $this->_host = strtolower($this->_host);
        }

        // Remove default port number for known schemes (RFC 3986, section 6.2.3)
        if ('' === $this->_port
            || $this->_port
            && $this->_scheme
            && $this->_port == getservbyname($this->_scheme, 'tcp')
        ) {
            $this->_port = false;
        }

        // Normalize case of %XX percentage-encodings (RFC 3986, section 6.2.2.1)
        // Normalize percentage-encoded unreserved characters (section 6.2.2.2)
        $fields = array(&$this->_userinfo, &$this->_host, &$this->_path,
                        &$this->_query, &$this->_fragment);
        foreach ($fields as &$field) {
            if ($field !== false) {
                $field = $this->_normalize("$field");
            }
        }
        unset($field);

        // Path segment normalization (RFC 3986, section 6.2.2.3)
        $this->_path = self::removeDotSegments($this->_path);

        // Scheme based normalization (RFC 3986, section 6.2.3)
        if (false !== $this->_host && '' === $this->_path) {
            $this->_path = '/';
        }

        // path should start with '/' if there is authority (section 3.3.)
        if (strlen($this->getAuthority())
            && strlen($this->_path)
            && $this->_path[0] !== '/'
        ) {
            $this->_path = '/' . $this->_path;
        }
    }

    private function _normalize($mixed)
    {
        return preg_replace_callback(
            '((?:%[0-9a-fA-F]{2})+)', array($this, '_normalizeCallback'),
            $mixed
        );
    }

    private function _normalizeCallback($matches)
    {
        return self::urlencode(urldecode($matches[0]));
    }
    public function isAbsolute()
    {
        return (bool) $this->_scheme;
    }
    public function resolve($reference)
    {
        if (!$reference instanceof Net_URL2) {
            $reference = new self($reference);
        }
        if (!$reference->_isFragmentOnly() && !$this->isAbsolute()) {
            throw new Exception(
                'Base-URL must be absolute if reference is not fragment-only'
            );
        }

        // A non-strict parser may ignore a scheme in the reference if it is
        // identical to the base URI's scheme.
        if (!$this->getOption(self::OPTION_STRICT)
            && $reference->_scheme == $this->_scheme
        ) {
            $reference->_scheme = false;
        }

        $target = new self('');
        if ($reference->_scheme !== false) {
            $target->_scheme = $reference->_scheme;
            $target->setAuthority($reference->getAuthority());
            $target->_path  = self::removeDotSegments($reference->_path);
            $target->_query = $reference->_query;
        } else {
            $authority = $reference->getAuthority();
            if ($authority !== false) {
                $target->setAuthority($authority);
                $target->_path  = self::removeDotSegments($reference->_path);
                $target->_query = $reference->_query;
            } else {
                if ($reference->_path == '') {
                    $target->_path = $this->_path;
                    if ($reference->_query !== false) {
                        $target->_query = $reference->_query;
                    } else {
                        $target->_query = $this->_query;
                    }
                } else {
                    if (substr($reference->_path, 0, 1) == '/') {
                        $target->_path = self::removeDotSegments($reference->_path);
                    } else {
                        // Merge paths (RFC 3986, section 5.2.3)
                        if ($this->_host !== false && $this->_path == '') {
                            $target->_path = '/' . $reference->_path;
                        } else {
                            $i = strrpos($this->_path, '/');
                            if ($i !== false) {
                                $target->_path = substr($this->_path, 0, $i + 1);
                            }
                            $target->_path .= $reference->_path;
                        }
                        $target->_path = self::removeDotSegments($target->_path);
                    }
                    $target->_query = $reference->_query;
                }
                $target->setAuthority($this->getAuthority());
            }
            $target->_scheme = $this->_scheme;
        }

        $target->_fragment = $reference->_fragment;

        return $target;
    }

    private function _isFragmentOnly()
    {
        return (
            $this->_fragment !== false
            && $this->_query === false
            && $this->_path === ''
            && $this->_port === false
            && $this->_host === false
            && $this->_userinfo === false
            && $this->_scheme === false
        );
    }

    public static function removeDotSegments($path)
    {
        $path = (string) $path;
        $output = '';

        // Make sure not to be trapped in an infinite loop due to a bug in this
        // method
        $loopLimit = 256;
        $j = 0;
        while ('' !== $path && $j++ < $loopLimit) {
            if (substr($path, 0, 2) === './') {
                // Step 2.A
                $path = substr($path, 2);
            } elseif (substr($path, 0, 3) === '../') {
                // Step 2.A
                $path = substr($path, 3);
            } elseif (substr($path, 0, 3) === '/./' || $path === '/.') {
                // Step 2.B
                $path = '/' . substr($path, 3);
            } elseif (substr($path, 0, 4) === '/../' || $path === '/..') {
                // Step 2.C
                $path   = '/' . substr($path, 4);
                $i      = strrpos($output, '/');
                $output = $i === false ? '' : substr($output, 0, $i);
            } elseif ($path === '.' || $path === '..') {
                // Step 2.D
                $path = '';
            } else {
                // Step 2.E
                $i = strpos($path, '/', $path[0] === '/');
                if ($i === false) {
                    $output .= $path;
                    $path = '';
                    break;
                }
                $output .= substr($path, 0, $i);
                $path = substr($path, $i);
            }
        }

        if ($path !== '') {
            $message = sprintf(
                'Unable to remove dot segments; hit loop limit %d (left: %s)',
                $j, var_export($path, true)
            );
            trigger_error($message, E_USER_WARNING);
        }

        return $output;
    }

    public static function urlencode($string)
    {
        $encoded = rawurlencode($string);

        // This is only necessary in PHP < 5.3.
        $encoded = str_replace('%7E', '~', $encoded);
        return $encoded;
    }

    public static function getCanonical()
    {
        if (!isset($_SERVER['REQUEST_METHOD'])) {
            // ALERT - no current URL
            throw new Exception('Script was not called through a webserver');
        }

        // Begin with a relative URL
        $url = new self($_SERVER['PHP_SELF']);
        $url->_scheme = isset($_SERVER['HTTPS']) ? 'https' : 'http';
        $url->_host   = $_SERVER['SERVER_NAME'];
        $port = $_SERVER['SERVER_PORT'];
        if ($url->_scheme == 'http' && $port != 80
            || $url->_scheme == 'https' && $port != 443
        ) {
            $url->_port = $port;
        }
        return $url;
    }

    /**
     * Returns the URL used to retrieve the current request.
     *
     * @return  string
     */
    public static function getRequestedURL()
    {
        return self::getRequested()->getUrl();
    }

    /**
     * Returns a Net_URL2 instance representing the URL used to retrieve the
     * current request.
     *
     * @throws Exception
     * @return $this
     */
    public static function getRequested()
    {
        if (!isset($_SERVER['REQUEST_METHOD'])) {
            // ALERT - no current URL
            throw new Exception('Script was not called through a webserver');
        }

        // Begin with a relative URL
        $url = new self($_SERVER['REQUEST_URI']);
        $url->_scheme = isset($_SERVER['HTTPS']) ? 'https' : 'http';
        // Set host and possibly port
        $url->setAuthority($_SERVER['HTTP_HOST']);
        return $url;
    }

    /**
     * Returns the value of the specified option.
     *
     * @param string $optionName The name of the option to retrieve
     *
     * @return mixed
     */
    public function getOption($optionName)
    {
        return isset($this->_options[$optionName])
            ? $this->_options[$optionName] : false;
    }

    /**
     * A simple version of http_build_query in userland. The encoded string is
     * percentage encoded according to RFC 3986.
     *
     * @param array  $data      An array, which has to be converted into
     *                          QUERY_STRING. Anything is possible.
     * @param string $separator Separator {@link self::OPTION_SEPARATOR_OUTPUT}
     * @param string $key       For stacked values (arrays in an array).
     *
     * @return string
     */
    protected function buildQuery(array $data, $separator, $key = null)
    {
        $query = array();
        $drop_names = (
            $this->_options[self::OPTION_DROP_SEQUENCE] === true
            && array_keys($data) === array_keys(array_values($data))
        );
        foreach ($data as $name => $value) {
            if ($this->getOption(self::OPTION_ENCODE_KEYS) === true) {
                $name = rawurlencode($name);
            }
            if ($key !== null) {
                if ($this->getOption(self::OPTION_USE_BRACKETS) === true) {
                    $drop_names && $name = '';
                    $name = $key . '[' . $name . ']';
                } else {
                    $name = $key;
                }
            }
            if (is_array($value)) {
                $query[] = $this->buildQuery($value, $separator, $name);
            } else {
                $query[] = $name . '=' . rawurlencode($value);
            }
        }
        return implode($separator, $query);
    }

    /**
     * This method uses a regex to parse the url into the designated parts.
     *
     * @param string $url URL
     *
     * @return void
     * @uses   self::$_scheme, self::setAuthority(), self::$_path, self::$_query,
     *         self::$_fragment
     * @see    __construct
     */
    protected function parseUrl($url)
    {
        // The regular expression is copied verbatim from RFC 3986, appendix B.
        // The expression does not validate the URL but matches any string.
        preg_match(
            '(^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?)',
            $url, $matches
        );

        // "path" is always present (possibly as an empty string); the rest
        // are optional.
        $this->_scheme   = !empty($matches[1]) ? $matches[2] : false;
        $this->setAuthority(!empty($matches[3]) ? $matches[4] : false);
        $this->_path     = $this->_encodeData($matches[5]);
        $this->_query    = !empty($matches[6])
                           ? $this->_encodeData($matches[7])
                           : false
            ;
        $this->_fragment = !empty($matches[8]) ? $matches[9] : false;
    }

    /**
     * Encode characters that might have been forgotten to encode when passing
     * in an URL. Applied onto Userinfo, Path and Query.
     *
     * @param string $url URL
     *
     * @return string
     * @see parseUrl
     * @see setAuthority
     * @link https://pear.php.net/bugs/bug.php?id=20425
     */
    private function _encodeData($url)
    {
        return preg_replace_callback(
            '([\x00-\x20\x22\x3C\x3E\x7F-\xFF]+)',
            array($this, '_encodeCallback'), $url
        );
    }

    /**
     * callback for encoding character data
     *
     * @param array $matches Matches
     *
     * @return string
     * @see _encodeData
     * @SuppressWarnings(PHPMD.UnusedPrivateMethod)
     */
    private function _encodeCallback(array $matches)
    {
        return rawurlencode($matches[0]);
    }
}
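
The parseUrl() method above splits a URL with the regular expression from RFC 3986, appendix B. As an illustration, here is a minimal standalone sketch of just that step. The function name splitUrl is hypothetical, and unlike the class it does not re-encode the path or break the authority into user, host and port.

```php
<?php
/**
 * Hypothetical standalone helper mirroring the regex step of
 * Net_URL2::parseUrl(): splits a URL into its five RFC 3986 components.
 */
function splitUrl($url)
{
    // Regular expression from RFC 3986, appendix B; it matches any string.
    preg_match(
        '(^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?)',
        $url, $m
    );

    // Missing optional parts are reported as false, as in the class above.
    return array(
        'scheme'    => !empty($m[1]) ? $m[2] : false,
        'authority' => !empty($m[3]) ? $m[4] : false,
        'path'      => $m[5],
        'query'     => !empty($m[6]) ? $m[7] : false,
        'fragment'  => !empty($m[8]) ? $m[9] : false,
    );
}

// Show the components of an endpoint URL with a query string.
print_r(splitUrl('https://api.projectoxford.ai/vision/v1.0/ocr?language=unk'));
```

For that URL the sketch reports the scheme https, the authority api.projectoxford.ai, the path /vision/v1.0/ocr, the query language=unk and no fragment.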

To run the **sample.php** file:

<?php
require_once 'HTTP/Request2.php';

$request = new HTTP_Request2('https://api.projectoxford.ai/vision/v1.0/ocr');
$url = $request->getUrl();
// Note: this turns off TLS certificate checks; enable them outside of quick tests.
$request->setConfig(array(
    'ssl_verify_peer'   => FALSE,
    'ssl_verify_host'   => FALSE
));
$headers = array(
    'Content-Type' => 'application/json',
    // Replace with your own subscription key.
    'Ocp-Apim-Subscription-Key' => '08b2055a83fc4a48af161df7ef13f80f',
);
$request->setHeader($headers);
$parameters = array(
    'language' => 'unk',
    'detectOrientation' => 'true',
);

$url->setQueryVariables($parameters);
$request->setMethod(HTTP_Request2::METHOD_POST);
// Build the body with json_encode() so that it is valid JSON.
$request->setBody(json_encode(array(
    'Url' => 'http://wp.streetwise.co/wp-content/blogs.dir/2/files/2014/07/dc-drivers-license-630x407.jpg',
)));

try
{
    $response = $request->send();
    $decoded = json_decode((string) $response->getBody(), true);
    // The response nests regions -> lines -> words; print each word once.
    foreach ($decoded['regions'] as $region) {
        foreach ($region['lines'] as $line) {
            foreach ($line['words'] as $word) {
                echo $word['text'] . "<br/>";
            }
        }
    }
}
catch (HTTP_Request2_Exception $ex)
{
    // HTTP_Request2 reports transport failures with its own exception class.
    echo $ex;
}

?>
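
The response handling in sample.php walks a structure of regions, lines and words. It can be tried without calling the API by running the same traversal over a hand-written payload. This is a minimal sketch: the function name extractWords is hypothetical, and the JSON string is made up for illustration and is not real API output.

```php
<?php
/**
 * Hypothetical helper: collects every recognized word from a decoded OCR
 * response shaped like the one sample.php reads (regions -> lines -> words).
 */
function extractWords(array $decoded)
{
    $words = array();
    // An error response may have no "regions" key; treat that as no words.
    $regions = isset($decoded['regions']) ? $decoded['regions'] : array();
    foreach ($regions as $region) {
        foreach ($region['lines'] as $line) {
            foreach ($line['words'] as $word) {
                $words[] = $word['text'];
            }
        }
    }
    return $words;
}

// Made-up payload for illustration only; it is not real API output.
$json = '{"regions":[{"lines":[{"words":[{"text":"DRIVER"},{"text":"LICENSE"}]}]}]}';
echo implode(' ', extractWords(json_decode($json, true))), PHP_EOL;
```

Returning the words as an array instead of echoing them inside the loops keeps the traversal separate from the output format, so the same function can feed HTML, plain text or further processing.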


Source: https://stackoverflow.com/questions/39127736/how-to-use-the-ocr-tesseractocr-php-library
