🤖 AI-AGENT INFRASTRUCTURE & DEVELOPER SPEC

Developer & AI Agent API Specification

Connect to the fast, client-side OCR text extraction engine. The spec below covers both standard integrations and AI agents.

Engine Overview

The JpgToText AI OCR engine processes images entirely within the browser sandbox using WebAssembly bindings. Because images are not uploaded for recognition, there is no per-request server cost and image contents stay on the user's device. AI agents and software integrations can use this architecture directly.

Core Strengths

  • No Network Round-Trips: Client-side preprocessing and WebAssembly recognition run locally, so latency depends on the device rather than the connection.
  • Universal Access: Cross-platform support for Chrome, Firefox, Safari, and mobile webviews.
  • Confidentiality by Design: Images are processed locally, which suits security-sensitive text extraction.
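
Cross-platform support ultimately depends on WebAssembly being available in the host browser or webview. A minimal feature check might look like the sketch below; `supportsWebAssembly` is a hypothetical helper name, not part of the JpgToText API.

```javascript
// Hypothetical helper (not part of the JpgToText API): detects WebAssembly support.
function supportsWebAssembly() {
  if (typeof WebAssembly !== 'object' || typeof WebAssembly.validate !== 'function') {
    return false;
  }
  // Smallest valid module: the "\0asm" magic number followed by version 1.
  const emptyModule = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);
  return WebAssembly.validate(emptyModule);
}
```

An integration could call this before loading the engine and show a fallback message when it returns `false`.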

Direct JS/Wasm Integration

Load the Tesseract engine dynamically and apply the JpgToText preprocessing step (grayscale conversion and contrast scaling) to extract clean text from an image file.

JAVASCRIPT INTEGRATION
// 1. Dynamic engine loader and OCR initialization
async function initializeJpgToTextOCR(imageFile, languageCode = 'eng') {
  // Load Tesseract.js from the CDN if it is not already present
  if (typeof Tesseract === 'undefined') {
    await new Promise((resolve, reject) => {
      const script = document.createElement('script');
      script.src = 'https://cdn.jsdelivr.net/npm/tesseract.js@5/dist/tesseract.min.js';
      script.onload = resolve;
      // Reject on network or CSP failure so the caller does not wait forever
      script.onerror = () => reject(new Error('Failed to load tesseract.js'));
      document.head.appendChild(script);
    });
  }

  // Preprocess: draw the image to a canvas, convert to grayscale, boost contrast
  const preprocessBlob = await new Promise((resolve, reject) => {
    const reader = new FileReader();
    reader.onerror = () => reject(reader.error);
    reader.onload = (e) => {
      const img = new Image();
      img.onerror = () => reject(new Error('Could not decode image'));
      img.onload = () => {
        const canvas = document.createElement('canvas');
        // naturalWidth/naturalHeight are the intrinsic pixel dimensions
        canvas.width = img.naturalWidth;
        canvas.height = img.naturalHeight;
        const ctx = canvas.getContext('2d');
        ctx.drawImage(img, 0, 0);

        // Grayscale (Rec. 601 luma weights), then a 1.1x contrast stretch around mid-gray
        const imageData = ctx.getImageData(0, 0, canvas.width, canvas.height);
        const d = imageData.data; // Uint8ClampedArray: writes are clamped to 0..255
        for (let i = 0; i < d.length; i += 4) {
          const gray = 0.299 * d[i] + 0.587 * d[i + 1] + 0.114 * d[i + 2];
          d[i] = d[i + 1] = d[i + 2] = 1.1 * (gray - 128) + 128;
        }
        ctx.putImageData(imageData, 0, 0);
        // toBlob passes null to its callback when the export fails
        canvas.toBlob(
          (blob) => (blob ? resolve(blob) : reject(new Error('Canvas export failed'))),
          'image/jpeg',
          0.95
        );
      };
      img.src = e.target.result;
    };
    reader.readAsDataURL(imageFile);
  });

  // Execute WebAssembly recognition; always release the worker, even on failure
  const worker = await Tesseract.createWorker(languageCode);
  try {
    const { data: { text } } = await worker.recognize(preprocessBlob);
    return text;
  } finally {
    await worker.terminate();
  }
}
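
The grayscale and contrast step inside the loader can also be factored out as a pure function over an RGBA byte array, which makes it possible to unit-test without a canvas. The sketch below mirrors the arithmetic of the loop in `initializeJpgToTextOCR`; the function name `grayscaleContrast` and its `contrast` parameter are illustrative assumptions.

```javascript
// Illustrative helper: same arithmetic as the pixel loop in initializeJpgToTextOCR.
// `rgba` is laid out as [R, G, B, A, R, G, B, A, ...]; the input is not modified.
function grayscaleContrast(rgba, contrast = 1.1) {
  const out = new Uint8ClampedArray(rgba); // copy; writes are clamped to 0..255
  for (let i = 0; i + 3 < out.length; i += 4) {
    // Rec. 601 luma weights convert RGB to a single gray level
    const gray = 0.299 * out[i] + 0.587 * out[i + 1] + 0.114 * out[i + 2];
    // Stretch contrast around mid-gray (128)
    const value = contrast * (gray - 128) + 128;
    out[i] = value;
    out[i + 1] = value;
    out[i + 2] = value;
    // out[i + 3] (alpha) is left unchanged
  }
  return out;
}
```

In the browser, the result could be written back with `imageData.data.set(grayscaleContrast(imageData.data))` before calling `ctx.putImageData`.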

Model Context Protocol (MCP) Spec

Integrate the JpgToText tool directly into AI-agent workflows. Below is an MCP tool definition (name, description, and a JSON Schema for the inputs) that lets AI assistants (Claude Desktop, Gemini Agent SDK) discover, describe, and call this tool.

MCP JSON SPEC
{
  "name": "jpgtotext_pro_ocr",
  "description": "High-accuracy browser-based client-side OCR engine. Preprocesses images and extracts structured text inside secure environments.",
  "inputSchema": {
    "type": "object",
    "properties": {
      "image_base64": {
        "type": "string",
        "description": "Base64 encoded string of target JPG, PNG, or WEBP image."
      },
      "lang": {
        "type": "string",
        "description": "Tesseract language code: a three-letter ISO 639-2-style code, with script variants such as chi_sim and chi_tra.",
        "default": "eng",
        "enum": ["eng", "spa", "fra", "deu", "chi_sim", "chi_tra", "jpn", "kor", "rus", "vie"]
      }
    },
    "required": ["image_base64"]
  }
}
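
An MCP client invokes a tool with a JSON-RPC 2.0 `tools/call` request whose `params.arguments` must satisfy the `inputSchema` above. As a rough sketch, the helper below checks the arguments against that schema and builds the request. `buildOcrToolCall` and `sniffImageType` are hypothetical names, and the magic-byte check is an extra safeguard that the spec itself does not require.

```javascript
const SUPPORTED_LANGS = ['eng', 'spa', 'fra', 'deu', 'chi_sim', 'chi_tra', 'jpn', 'kor', 'rus', 'vie'];

// Hypothetical helper: identify JPG, PNG, or WEBP from the first decoded bytes.
function sniffImageType(imageBase64) {
  let bytes;
  try {
    // 16 base64 characters decode to 12 bytes, enough for all three signatures.
    bytes = atob(imageBase64.slice(0, 16));
  } catch {
    return null; // not valid base64 (for example, a data: URL prefix was left on)
  }
  if (bytes.startsWith('\xFF\xD8\xFF')) return 'image/jpeg';
  if (bytes.startsWith('\x89PNG\r\n\x1A\n')) return 'image/png';
  if (bytes.startsWith('RIFF') && bytes.slice(8, 12) === 'WEBP') return 'image/webp';
  return null;
}

// Hypothetical helper: validate arguments and build the JSON-RPC request body.
function buildOcrToolCall(id, args) {
  if (typeof args.image_base64 !== 'string' || args.image_base64.length === 0) {
    throw new TypeError('image_base64 is required and must be a string');
  }
  const lang = args.lang === undefined ? 'eng' : args.lang; // schema default
  if (!SUPPORTED_LANGS.includes(lang)) {
    throw new RangeError(`Unsupported lang: ${lang}`);
  }
  if (sniffImageType(args.image_base64) === null) {
    throw new TypeError('image_base64 does not look like a JPG, PNG, or WEBP image');
  }
  return {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: {
      name: 'jpgtotext_pro_ocr',
      arguments: { image_base64: args.image_base64, lang },
    },
  };
}
```

Validating before sending keeps malformed payloads (wrong language code, a stray `data:` prefix, an unsupported format) from reaching the OCR step at all.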

Static AI-Readable Endpoints

AI crawlers, comparison bots, and benchmarking tools can fetch static JSON endpoints directly from our platform to read capability data, benchmark results, and feature parameters.

Endpoint Resource    | Format | Description
/api/features.json   | JSON   | Fully structured capability metrics, formats list, and privacy specs.
/api/benchmark.json  | JSON   | Highly detailed accuracy and rate matrices of client-side AI OCR.
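
A client could read these resources with a plain `fetch`. The two paths come from the listing above, but the host is not stated in this section, so any base URL (such as `https://example.com` in the usage note) is a placeholder, and the function names are illustrative. The response shape is not documented here, so the sketch returns the parsed JSON unchanged.

```javascript
// Paths taken from the endpoint listing; the keys are illustrative shorthand.
const SPEC_ENDPOINTS = new Map([
  ['features', '/api/features.json'],
  ['benchmark', '/api/benchmark.json'],
]);

// Hypothetical helper: resolve a resource key against a caller-supplied base URL.
function buildEndpointUrl(baseUrl, resource) {
  const path = SPEC_ENDPOINTS.get(resource);
  if (path === undefined) {
    throw new RangeError(`Unknown resource: ${resource}`);
  }
  return new URL(path, baseUrl).toString();
}

// Hypothetical helper: fetch and parse one resource. `fetchImpl` can be swapped
// for a stub in tests; it defaults to the global fetch.
async function fetchSpecResource(baseUrl, resource, fetchImpl = fetch) {
  const response = await fetchImpl(buildEndpointUrl(baseUrl, resource), {
    headers: { Accept: 'application/json' },
  });
  if (!response.ok) {
    throw new Error(`Request failed with HTTP ${response.status}`);
  }
  return response.json();
}
```

For example, `await fetchSpecResource('https://example.com', 'features')` would request `/api/features.json` from the placeholder host and resolve to the parsed JSON body.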