Developer & AI Agent API Specification
Connect to a fast, client-side OCR text extraction engine, suitable for both standard integrations and AI agents.
Engine Overview
The JpgToText AI OCR engine processes images entirely within the browser sandbox using WebAssembly bindings. Because recognition runs locally, it is fast, costs almost nothing to operate, and keeps image contents on the user's device. AI agents and software integrations can use this architecture directly.
Core Strengths
- No Network Round Trips: Client-side preprocessing and WebAssembly recognition avoid a server round trip for each image.
- Universal Access: Cross-platform support for Chrome, Firefox, Safari, and mobile webviews.
- Local Confidentiality: Images are processed in the browser, which suits security-sensitive text extraction.
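Since the engine relies on WebAssembly, workers, and a canvas for preprocessing, an integration may want to check for these features before loading anything. The following is a minimal sketch of such a check; `detectOcrSupport` and the shape of its return value are illustrative names, not part of the JpgToText API.

```javascript
// Hypothetical helper: reports whether the browser features used on this page exist.
// `scope` defaults to the global object so the check can be exercised with stubs.
function detectOcrSupport(scope = globalThis) {
  const wasm =
    typeof scope.WebAssembly === 'object' &&
    scope.WebAssembly !== null &&
    typeof scope.WebAssembly.instantiate === 'function';
  const worker = typeof scope.Worker === 'function';
  const canvas =
    typeof scope.document === 'object' &&
    scope.document !== null &&
    typeof scope.document.createElement === 'function';
  return { wasm, worker, canvas, supported: wasm && worker && canvas };
}
```

A caller could show a fallback message when `detectOcrSupport().supported` is `false` instead of attempting to load the engine.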
Direct JS/Wasm Integration
Load the Tesseract engine dynamically and apply the JpgToText preprocessing step to extract clean text from an image file.
// 1. Dynamic engine loader and OCR initialization
async function initializeJpgToTextOCR(imageFile, languageCode = 'eng') {
  // Load the Tesseract.js library from the CDN if it is not already present
  if (typeof Tesseract === 'undefined') {
    await new Promise((resolve, reject) => {
      const script = document.createElement('script');
      script.src = 'https://cdn.jsdelivr.net/npm/tesseract.js@5/dist/tesseract.min.js';
      script.onload = resolve;
      script.onerror = () => reject(new Error('Failed to load Tesseract.js'));
      document.head.appendChild(script);
    });
  }

  // Preprocess on a canvas: convert to grayscale, then apply a mild contrast boost
  const preprocessedBlob = await new Promise((resolve, reject) => {
    const reader = new FileReader();
    reader.onerror = () => reject(reader.error);
    reader.onload = (e) => {
      const img = new Image();
      img.onerror = () => reject(new Error('Could not decode image'));
      img.onload = () => {
        const canvas = document.createElement('canvas');
        canvas.width = img.width;
        canvas.height = img.height;
        const ctx = canvas.getContext('2d');
        ctx.drawImage(img, 0, 0);

        const imageData = ctx.getImageData(0, 0, canvas.width, canvas.height);
        const d = imageData.data; // RGBA bytes; writes are clamped to 0-255
        for (let i = 0; i < d.length; i += 4) {
          // Weighted luma from the red, green, and blue channels
          const gray = 0.299 * d[i] + 0.587 * d[i + 1] + 0.114 * d[i + 2];
          // Scale contrast by 1.1 around mid-gray (128)
          d[i] = d[i + 1] = d[i + 2] = 1.1 * (gray - 128) + 128;
        }
        ctx.putImageData(imageData, 0, 0);
        canvas.toBlob(
          (blob) => (blob ? resolve(blob) : reject(new Error('Canvas export failed'))),
          'image/jpeg',
          0.95
        );
      };
      img.src = e.target.result;
    };
    reader.readAsDataURL(imageFile);
  });

  // Run WebAssembly recognition; always terminate the worker, even on failure
  const worker = await Tesseract.createWorker(languageCode);
  try {
    const { data: { text } } = await worker.recognize(preprocessedBlob);
    return text;
  } finally {
    await worker.terminate();
  }
}
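The grayscale and contrast step is easier to reason about in isolation. The sketch below pulls the per-pixel arithmetic from the function above into a standalone helper that works on any RGBA byte array. The name `applyGrayscaleContrast` and the `factor` parameter are introduced here for illustration only.

```javascript
// Illustrative helper (not part of the original module): converts RGBA pixel
// data to grayscale in place and scales contrast around mid-gray (128).
function applyGrayscaleContrast(data, factor = 1.1) {
  for (let i = 0; i < data.length; i += 4) {
    const gray = 0.299 * data[i] + 0.587 * data[i + 1] + 0.114 * data[i + 2];
    const value = factor * (gray - 128) + 128;
    data[i] = value;
    data[i + 1] = value;
    data[i + 2] = value;
    // data[i + 3] (alpha) is left untouched
  }
  return data;
}

// Two pixels: pure white and pure black, both fully opaque.
const pixels = new Uint8ClampedArray([255, 255, 255, 255, 0, 0, 0, 255]);
applyGrayscaleContrast(pixels);
console.log(Array.from(pixels)); // → [255, 255, 255, 255, 0, 0, 0, 255]
```

No explicit `Math.min` or `Math.max` is needed because a `Uint8ClampedArray`, the type behind `ImageData.data`, clamps and rounds every write to the 0-255 range.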
Model Context Protocol (MCP) Spec
Integrate the JpgToText tool directly into AI-agent workflows. Below is the JSON tool definition that lets AI assistants (Claude Desktop, Gemini Agent SDK) discover, describe, and call this tool.
{
  "name": "jpgtotext_pro_ocr",
  "description": "High-accuracy browser-based client-side OCR engine. Preprocesses images and extracts structured text inside secure environments.",
  "inputSchema": {
    "type": "object",
    "properties": {
      "image_base64": {
        "type": "string",
        "description": "Base64-encoded string of the target JPG, PNG, or WEBP image."
      },
      "lang": {
        "type": "string",
        "description": "Tesseract language code: a three-letter code, mostly following ISO 639-2, with script variants such as chi_sim and chi_tra.",
        "default": "eng",
        "enum": ["eng", "spa", "fra", "deu", "chi_sim", "chi_tra", "jpn", "kor", "rus", "vie"]
      }
    },
    "required": ["image_base64"]
  }
}
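A host that exposes this tool will usually check incoming arguments against the schema before running OCR. The following sketch hand-rolls that check for the two properties defined above and decodes the image into bytes. `normalizeOcrArguments` is a hypothetical helper, and accepting a `data:` URL prefix is an assumption rather than something the schema states; a real integration might prefer a general JSON Schema validator.

```javascript
// Language codes copied from the `enum` in the tool definition.
const SUPPORTED_LANGS = ['eng', 'spa', 'fra', 'deu', 'chi_sim', 'chi_tra', 'jpn', 'kor', 'rus', 'vie'];

// Hypothetical helper: validates tool arguments, applies the `lang` default,
// and decodes the base64 payload into raw bytes.
function normalizeOcrArguments(args) {
  if (args === null || typeof args !== 'object') {
    throw new TypeError('arguments must be an object');
  }
  if (typeof args.image_base64 !== 'string') {
    throw new TypeError('image_base64 is required and must be a string');
  }
  const lang = args.lang === undefined ? 'eng' : args.lang;
  if (!SUPPORTED_LANGS.includes(lang)) {
    throw new RangeError(`unsupported lang: ${String(lang)}`);
  }
  // Assumption: callers may send a data URL; strip the prefix if present.
  const base64 = args.image_base64.replace(/^data:[^;,]+;base64,/, '');
  // atob is available in browsers and in recent Node.js releases.
  const bytes = Uint8Array.from(atob(base64), (ch) => ch.charCodeAt(0));
  return { bytes, lang };
}
```

In a browser, the returned bytes can be wrapped with `new Blob([bytes])` and passed on as the image input, since `FileReader.readAsDataURL` accepts any `Blob`.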
Static AI-Readable Endpoints
AI crawlers, comparison bots, and benchmark models can fetch the following JSON endpoints directly from our platform for model training, benchmarking, and feature-parameter mapping.
| Endpoint Resource | Format | Description |
|---|---|---|
| `/api/features.json` | JSON | Fully structured capability metrics, formats list, and privacy specs. |
| `/api/benchmark.json` | JSON | Highly detailed accuracy and rate matrices of client-side AI OCR. |
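A client could read these resources with a plain `fetch` call. The sketch below assumes the paths resolve against the site's origin and return JSON. `baseUrl` is a placeholder the caller must supply, `fetchJpgToTextResource` is a hypothetical name, and because the response shape is not specified on this page, the helper returns the parsed body as-is.

```javascript
// Resource paths listed in the table above.
const RESOURCE_PATHS = ['/api/features.json', '/api/benchmark.json'];

// Hypothetical helper: fetches one resource and parses it as JSON.
// `fetchImpl` is injectable so the function can be exercised without a network.
async function fetchJpgToTextResource(path, baseUrl, fetchImpl = fetch) {
  if (!RESOURCE_PATHS.includes(path)) {
    throw new RangeError(`unknown resource: ${path}`);
  }
  const url = new URL(path, baseUrl).toString();
  const response = await fetchImpl(url, { headers: { Accept: 'application/json' } });
  if (!response.ok) {
    throw new Error(`request for ${url} failed with status ${response.status}`);
  }
  return response.json();
}
```

Rejecting unknown paths up front keeps a typo from turning into a confusing 404, and checking `response.ok` surfaces HTTP errors that `fetch` does not throw on by itself.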