ocr
https://api.pixlab.io/ocr
Description
Given an input image or video frame containing human-readable characters, detect the input language and extract the text content. OCR stands for Optical Character Recognition; PixLab uses state-of-the-art processing algorithms, so expect accurate results given a good-quality image.
For more specialized tasks, such as scanning government-issued documents like passports or ID cards, use the /docscan endpoint instead. If you are dealing with PDF documents, first convert them to raw images via the /pdftoimg endpoint.
HTTP Methods
GET
Request Parameters
Required
Fields | Type | Description |
---|---|---|
img | URL | Input image URL. To upload an image directly from your app, call store first and use the output link. Only the JPEG, PNG and BMP file formats are supported; convert is of particular help if you have a different image format. |
key | String | Your PixLab API Key. |
Optional
Fields | Type | Description |
---|---|---|
lang | String | Input language code, if known. Do not set this field if you do not know the input language. The BCP-47 language codes supported as of this release are: en (English), de (German), ar (Arabic), he (Modern Hebrew), hi (Hindi), fr (French), cs (Czech), da (Danish), nl (Dutch), fi (Finnish), el (Greek), hu (Hungarian), it (Italian), ja (Japanese), ko (Korean), nb (Norwegian), pl (Polish), pt (Portuguese), zh-Hans (Chinese Simplified), zh-Hant (Chinese Traditional), ru (Russian), es (Spanish), sv (Swedish), tr (Turkish). |
orientation | Boolean | Detect and correct text orientation in the input image. |
nl | Boolean | Output new line (\n) character on each detected line. |
br | Boolean | Output HTML line break (</br>) on each detected line. |
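The parameters listed above can be assembled into a request before anything is sent. The following is a minimal sketch, not part of the PixLab SDK: `build_ocr_params` is a hypothetical helper, the image URL is a placeholder, and passing the boolean flags as Python `True` simply mirrors the Python example in this page rather than a documented encoding rule.

```python
from urllib.parse import urlencode

OCR_ENDPOINT = "https://api.pixlab.io/ocr"

# Language codes taken from the description of the lang parameter.
SUPPORTED_LANGS = {
    "en", "de", "ar", "he", "hi", "fr", "cs", "da", "nl", "fi", "el", "hu",
    "it", "ja", "ko", "nb", "pl", "pt", "zh-Hans", "zh-Hant", "ru", "es",
    "sv", "tr",
}


def build_ocr_params(img, key, lang=None, orientation=False, nl=False, br=False):
    """Hypothetical helper: assemble the query parameters for a GET /ocr call."""
    params = {"img": img, "key": key}
    # Only send lang when the caller actually knows the input language.
    if lang is not None:
        if lang not in SUPPORTED_LANGS:
            raise ValueError(f"unsupported language code: {lang!r}")
        params["lang"] = lang
    # Optional flags are included only when enabled (assumption: omitting
    # a flag is equivalent to leaving it off).
    for name, enabled in (("orientation", orientation), ("nl", nl), ("br", br)):
        if enabled:
            params[name] = True
    return params


# Placeholder image URL and API key, for illustration only.
params = build_ocr_params(
    "https://example.com/scan.png", "My_PixLab_Key", lang="fr", nl=True
)
url = OCR_ENDPOINT + "?" + urlencode(params)
print(url)
```

Validating `lang` locally avoids a round trip for a code the endpoint does not list; the same `params` dictionary can be handed to `requests.get(OCR_ENDPOINT, params=params)` instead of building the URL by hand.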
Response
application/json
This command always returns a JSON object after each call. The following JSON fields are returned in the response body:
Fields | Type | Description |
---|---|---|
status | Integer | Status code 200 indicates success; any other code indicates failure. |
output | String | Extracted text from the image input. |
lang | String | BCP-47 language code. |
bbox | Array | Bounding box (i.e. rectangle) coordinates for each extracted word. Each entry in this array is an object of the following form: { word: Extracted word, x: X coordinate of the top left corner, y: Y coordinate of the top left corner, w: Width of the rectangle that encloses this word, h: Height of the rectangle that encloses this word }. Each entry can be marked via drawrectangles, for example, or extracted via crop for further analysis if desired. This feature is available on the Prod plan and up. |
error | String | Error message if status != 200. |
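The response fields above can be handled as in the following sketch, which checks `status` and converts each `bbox` entry from (x, y, w, h) to corner coordinates. `word_boxes` is a hypothetical helper, the sample reply is made up to match the documented field names rather than captured from the API, and treating a missing `bbox` as an empty list on plans below Prod is an assumption.

```python
def word_boxes(reply):
    """Hypothetical helper: return (word, left, top, right, bottom) tuples."""
    if reply.get("status") != 200:
        # The error field carries the message when status != 200.
        raise RuntimeError(reply.get("error", "unknown error"))
    boxes = []
    # Assumption: bbox may be missing or empty when the plan does not include it.
    for entry in reply.get("bbox") or []:
        left, top = entry["x"], entry["y"]
        boxes.append(
            (entry["word"], left, top, left + entry["w"], top + entry["h"])
        )
    return boxes


# Made-up reply shaped like the documented fields, not real API output.
sample = {
    "status": 200,
    "output": "Hello world",
    "lang": "en",
    "bbox": [
        {"word": "Hello", "x": 10, "y": 20, "w": 50, "h": 12},
        {"word": "world", "x": 70, "y": 20, "w": 48, "h": 12},
    ],
}

for word, left, top, right, bottom in word_boxes(sample):
    print(word, left, top, right, bottom)
```

Corner coordinates are often more convenient than width and height when drawing rectangles or cropping with an imaging library.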
Python Example
import requests

# Given an image with human-readable characters, detect the input language and extract the text content.
# See https://pixlab.io/#/cmd?id=ocr for additional information.
req = requests.get('https://api.pixlab.io/ocr', params={
    'img': 'http://quotesten.com/wp-content/uploads/2016/06/Confucius-Quote.jpg',
    'orientation': True,  # Correct text orientation
    'nl': True,  # Output new lines if any
    'key': 'My_PixLab_Key'
})
reply = req.json()
if reply['status'] != 200:
    print(reply['error'])
else:
    print("Input language: " + reply['lang'])
    print("Text Output: " + reply['output'])
    # Iterate over all extracted words
    for box in reply['bbox']:
        print("Word: " + box['word'])
        print("Bounding box - X: " + str(box['x']) + " Y: " + str(box['y']) + " Width: " + str(box['w']) + " Height: " + str(box['h']))