Version 1.9.72 (Release Notes)


ocr

https://api.pixlab.io/ocr

Description

Given an input image or video frame containing human-readable characters, detect the input language and extract the text content. OCR stands for Optical Character Recognition. PixLab uses state-of-the-art processing algorithms, so expect accurate results given a good-quality image.
For more specialized tasks, such as scanning government-issued documents like passports or ID cards, use the /docscan endpoint instead. If you are dealing with PDF documents, first convert them to raw images via the /pdftoimg endpoint.

HTTP Methods

GET

Request Parameters

Required

| Field | Type | Description |
| --- | --- | --- |
| img | URL | Input image URL. To upload an image directly from your app, call store first and use the output link. Only the JPEG, PNG and BMP file formats are supported; convert is of particular help if your image is in a different format. |
| key | String | Your PixLab API Key. |

Optional

| Field | Type | Description |
| --- | --- | --- |
| lang | String | Input language code, if known. Do not set this field if you do not know the input language. The supported BCP-47 language codes as of this release are: en (English), de (German), ar (Arabic), he (Modern Hebrew), hi (Hindi), fr (French), cs (Czech), da (Danish), nl (Dutch), fi (Finnish), el (Greek), hu (Hungarian), it (Italian), ja (Japanese), ko (Korean), nb (Norwegian), pl (Polish), pt (Portuguese), zh-Hans (Chinese Simplified), zh-Hant (Chinese Traditional), ru (Russian), es (Spanish), sv (Swedish), tr (Turkish). |
| orientation | Boolean | Detect and correct text orientation in the input image. |
| nl | Boolean | Output a new line (\n) character on each detected line. |
| br | Boolean | Output an HTML line break (</br>) on each detected line. |
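
The parameters above can be assembled as in the following minimal sketch. The helper name `build_ocr_params`, the example image URL, and the choice to send boolean flags only when enabled are assumptions made for illustration, not part of the PixLab API. The helper leaves out `lang` unless the caller supplies one of the codes listed above.

```python
from urllib.parse import urlencode

OCR_ENDPOINT = "https://api.pixlab.io/ocr"

# Language codes listed in the parameter description above.
SUPPORTED_LANGS = {
    "en", "de", "ar", "he", "hi", "fr", "cs", "da", "nl", "fi", "el", "hu",
    "it", "ja", "ko", "nb", "pl", "pt", "zh-Hans", "zh-Hant", "ru", "es",
    "sv", "tr",
}


def build_ocr_params(img_url, key, lang=None, orientation=False, nl=False, br=False):
    """Hypothetical helper: assemble the query parameters for an /ocr call."""
    params = {"img": img_url, "key": key}
    # Only send 'lang' when the input language is actually known.
    if lang is not None:
        if lang not in SUPPORTED_LANGS:
            raise ValueError(f"unsupported language code: {lang!r}")
        params["lang"] = lang
    # Assumption: boolean flags are sent only when enabled.
    for name, enabled in (("orientation", orientation), ("nl", nl), ("br", br)):
        if enabled:
            params[name] = True
    return params


# Placeholder image URL and API key, for illustration only.
params = build_ocr_params("https://example.com/scan.png", "My_PixLab_Key", nl=True)
print(OCR_ENDPOINT + "?" + urlencode(params))
```

The resulting dictionary can be passed as the `params` argument of `requests.get`.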

Response

application/json

This command always returns a JSON object after each call. The response body contains the following JSON fields:

| Field | Type | Description |
| --- | --- | --- |
| status | Integer | Status code 200 indicates success; any other code indicates failure. |
| output | String | Text extracted from the input image. |
| lang | String | BCP-47 language code. |
| bbox | Array | Bounding box (i.e. rectangle) coordinates for each extracted word. |
| error | String | Error message if status != 200. |

Each entry in the bbox array is an object of the following form:
{
word: Extracted word,
x: X coordinate of the top left corner,
y: Y coordinate of the top left corner,
w: Width of the rectangle that encloses this word,
h: Height of the rectangle that encloses this word
}

Each entry in this array can be marked via drawrectangles, for example, or extracted via crop for further analysis if desired. This feature is available starting from the Prod plan.
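
As an illustration of consuming these fields, the sketch below converts each bbox entry from (x, y, w, h) into corner coordinates, a form often convenient for drawing or cropping. The function name `word_boxes` and the sample reply are made up for this example, and treating a missing bbox field as an empty list is an assumption about plans below Prod rather than documented behavior.

```python
def word_boxes(reply):
    """Return (word, left, top, right, bottom) tuples from a parsed /ocr reply."""
    if reply.get("status") != 200:
        raise RuntimeError(reply.get("error", "unknown error"))
    boxes = []
    # Assumption: bbox may be absent, so fall back to an empty list.
    for box in reply.get("bbox", []):
        left, top = box["x"], box["y"]
        boxes.append((box["word"], left, top, left + box["w"], top + box["h"]))
    return boxes


# Made-up sample reply, shaped after the response fields described above.
sample = {
    "status": 200,
    "output": "Hello world",
    "lang": "en",
    "bbox": [
        {"word": "Hello", "x": 10, "y": 20, "w": 50, "h": 12},
        {"word": "world", "x": 70, "y": 20, "w": 52, "h": 12},
    ],
}

for word, left, top, right, bottom in word_boxes(sample):
    print(word, left, top, right, bottom)
```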

Python Example

import requests

# Given an image with human-readable characters, detect the input language and extract its text content.
# See https://pixlab.io/#/cmd?id=ocr for additional information.

req = requests.get('https://api.pixlab.io/ocr', params={
    'img': 'http://quotesten.com/wp-content/uploads/2016/06/Confucius-Quote.jpg',
    'orientation': True,  # Correct text orientation
    'nl': True,  # Output new lines if any
    'key': 'My_PixLab_Key'  # Replace with your own PixLab API key
})
reply = req.json()
if reply['status'] != 200:
    # Any status other than 200 indicates failure
    print(reply['error'])
else:
    print("Input language: " + reply['lang'])
    print("Text Output: " + reply['output'])
    # Iterate over all extracted words
    for box in reply['bbox']:
        print("Word: " + box['word'])
        print("Bounding box - X: " + str(box['x']) + " Y: " + str(box['y']) + " Width: " + str(box['w']) + " Height: " + str(box['h']))

See Also