Fusion AI

Documentation


Image & PDF Input

Fusion AI accepts images and documents alongside a prompt, so AI models can analyze images, extract text from PDFs, and interpret visual content.

Images

Supported Formats

JPEG, PNG, WebP, GIF

Max Size: 20MB

Visual analysis and description

OCR text extraction

Object and scene recognition

Chart and diagram interpretation

Documents

Supported Formats

PDF, Text, Markdown

Max Size: 50MB

Full text extraction

Document structure analysis

Table and form recognition

Multi-page processing
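The format and size limits above can be checked on the client before uploading. The following is a minimal sketch, not part of the Fusion AI API: the function name `classify_upload`, the extension mapping (for example `.txt` for Text and `.md` for Markdown), and the choice of 1 MB = 1024 * 1024 bytes are all assumptions made for illustration.

```python
from pathlib import PurePath

# Assumption: 1 MB is treated as 1024 * 1024 bytes; the docs do not say which unit is meant.
MB = 1024 * 1024

# Assumption: file extensions chosen to match the formats listed above.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp", ".gif"}
DOCUMENT_EXTENSIONS = {".pdf", ".txt", ".md"}

# Size limits as listed above: 20MB for images, 50MB for documents.
MAX_BYTES = {"image": 20 * MB, "document": 50 * MB}


def classify_upload(filename: str, size_bytes: int) -> str:
    """Return "image" or "document", or raise ValueError if the file is not acceptable."""
    ext = PurePath(filename).suffix.lower()
    if ext in IMAGE_EXTENSIONS:
        kind = "image"
    elif ext in DOCUMENT_EXTENSIONS:
        kind = "document"
    else:
        raise ValueError(f"unsupported file extension: {ext or '(none)'}")
    if size_bytes > MAX_BYTES[kind]:
        raise ValueError(f"{filename} exceeds the {MAX_BYTES[kind] // MB}MB {kind} limit")
    return kind
```

Rejecting a file locally avoids spending an upload on a request that would presumably be refused for size or format.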

How to Upload Files

Base64 Encoding

{
  "prompt": "Analyze this image",
  "provider": "neuroswitch",
  "files": [
    {
      "type": "image",
      "data": "data:image/jpeg;base64,/9j/4AAQ...",
      "filename": "chart.jpg"
    }
  ]
}
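The `data` value in the request above is a data URI: a MIME type followed by the Base64-encoded file bytes. One way to build such an entry in Python is sketched below. The helper name `encode_file_entry` is hypothetical, and guessing the MIME type from the filename is an assumption about what the API expects.

```python
import base64
import mimetypes


def encode_file_entry(data: bytes, filename: str, file_type: str = "image") -> dict:
    """Build one "files" entry with the bytes embedded as a Base64 data URI."""
    # Assumption: the MIME type can be inferred from the filename extension.
    mime, _ = mimetypes.guess_type(filename)
    if mime is None:
        raise ValueError(f"cannot determine MIME type for {filename!r}")
    encoded = base64.b64encode(data).decode("ascii")
    return {
        "type": file_type,
        "data": f"data:{mime};base64,{encoded}",
        "filename": filename,
    }
```

In practice the bytes would come from something like `pathlib.Path("chart.jpg").read_bytes()`. Base64 inflates the payload by roughly one third, which is worth keeping in mind next to the size limits.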

URL Reference

{
  "prompt": "Extract text from this PDF",
  "provider": "neuroswitch",
  "files": [
    {
      "type": "pdf",
      "url": "https://example.com/document.pdf",
      "filename": "contract.pdf"
    }
  ]
}
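A URL-reference entry like the one above can be assembled with a small helper that checks the URL first. This is only a sketch: `url_file_entry` is a hypothetical name, accepting only `https` URLs is a conservative assumption (the docs do not state which schemes are allowed), and deriving the filename from the URL path is a convenience, not documented behavior.

```python
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import urlparse


def url_file_entry(url: str, file_type: str, filename: Optional[str] = None) -> dict:
    """Build one "files" entry that points at a remote file by URL."""
    parsed = urlparse(url)
    # Assumption: only absolute https URLs are accepted.
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError(f"expected an absolute https URL, got {url!r}")
    # Fall back to the last path segment when no filename is given.
    name = filename or PurePosixPath(parsed.path).name
    if not name:
        raise ValueError("no filename given and none could be derived from the URL")
    return {"type": file_type, "url": url, "filename": name}
```

For example, `url_file_entry("https://example.com/document.pdf", "pdf", "contract.pdf")` yields the same `files` entry as the request above.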

Common Use Cases

Document Analysis

Extract insights from PDFs, forms, and documents

Invoice processing and data extraction

Contract review and summarization

Research paper analysis

Legal document review

Visual Understanding

Analyze images, charts, and visual content

Chart and graph interpretation

Medical image analysis

Product photo descriptions

Scene and object recognition

OCR & Text Extraction

Convert images and PDFs to searchable text

Scanned document digitization

Handwritten text recognition

Receipt and invoice parsing

Sign and label reading

Processing Examples

Image Analysis

curl -X POST https://api.mcp4.ai/chat \
  -H "Authorization: Bearer sk-fusion-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Describe what you see in this image and identify any text",
    "provider": "neuroswitch",
    "files": [
      {
        "type": "image", 
        "data": "data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQABAAD...",
        "filename": "screenshot.jpg"
      }
    ]
  }'
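The same request can be sent from Python using only the standard library. The sketch below mirrors the curl call: the endpoint, headers, and body fields are taken from the example above, while the function names `build_chat_request` and `send_chat_request` and the 60-second timeout are illustrative assumptions.

```python
import json
import urllib.request

API_URL = "https://api.mcp4.ai/chat"  # endpoint taken from the curl example above


def build_chat_request(api_key: str, prompt: str, files: list,
                       provider: str = "neuroswitch") -> urllib.request.Request:
    """Assemble the POST request without sending it."""
    body = json.dumps({"prompt": prompt, "provider": provider, "files": files})
    return urllib.request.Request(
        API_URL,
        data=body.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat_request(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON response body."""
    # Assumption: a 60-second timeout is enough; tune it for large files.
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

Separating construction from sending makes it easy to inspect or log the request before any network call is made.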

PDF Text Extraction

{
  "prompt": "Extract all the key information from this invoice",
  "provider": "neuroswitch", 
  "files": [
    {
      "type": "pdf",
      "url": "https://storage.example.com/invoice-2024-001.pdf",
      "filename": "invoice.pdf"
    }
  ],
  "max_tokens": 1000
}

Response Format

{
  "response": "This image shows a bar chart displaying quarterly sales data...",
  "provider_used": "claude-3-opus",
  "file_analysis": {
    "files_processed": 1,
    "total_pages": 3,
    "extracted_text_length": 2847,
    "processing_time_ms": 1250
  },
  "tokens_used": 456,
  "cost": 0.00234
}
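A response in this shape can be mapped onto typed objects for easier handling. The sketch below uses only the fields shown in the example above. The class and function names are hypothetical, and treating `file_analysis` as optional is an assumption (the docs do not say whether it is always present).

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileAnalysis:
    files_processed: int
    total_pages: int
    extracted_text_length: int
    processing_time_ms: int


@dataclass
class ChatResult:
    response: str
    provider_used: str
    tokens_used: int
    cost: float
    file_analysis: Optional[FileAnalysis]


def parse_chat_result(raw: str) -> ChatResult:
    """Parse a JSON response body into a ChatResult."""
    payload = json.loads(raw)
    # Assumption: "file_analysis" may be missing, for example when no files were sent.
    fa = payload.get("file_analysis")
    analysis = None
    if fa is not None:
        analysis = FileAnalysis(
            files_processed=fa["files_processed"],
            total_pages=fa["total_pages"],
            extracted_text_length=fa["extracted_text_length"],
            processing_time_ms=fa["processing_time_ms"],
        )
    return ChatResult(
        response=payload["response"],
        provider_used=payload["provider_used"],
        tokens_used=payload["tokens_used"],
        cost=payload["cost"],
        file_analysis=analysis,
    )
```

Reading `provider_used` is useful with NeuroSwitch routing, since it records which model handled the request.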

Provider Support Matrix

Feature	GPT-4V	Claude 3	Gemini Pro
Image Analysis	✅ Excellent	✅ Excellent	✅ Good
PDF Processing	✅ Native	✅ Native	🔄 Converted
OCR Accuracy	⭐⭐⭐⭐⭐	⭐⭐⭐⭐⭐	⭐⭐⭐⭐
Max File Size	20MB	20MB	20MB

Security & Privacy

Files encrypted in transit and at rest

Automatic file deletion after processing

No permanent storage of uploaded content

GDPR and SOC 2 compliant processing

Best Practices

✅ Optimization Tips

Use high-resolution images for better OCR accuracy

Compress large files to reduce processing time

Provide specific prompts for better analysis

Use NeuroSwitch for optimal model selection

⚠️ Considerations

Processing time increases with file size

Complex documents may require multiple requests

Handwritten text is recognized less accurately than printed text

Providers differ in their strengths (see the Provider Support Matrix)

Integration Examples

Quick Start

Try multimedia processing with your first API call

Tool Calling

Combine file processing with function calls

Advanced Config

Fine-tune processing parameters

Start Processing Multimedia

Upload images and PDFs to unlock powerful AI analysis capabilities. Get started with your first multimedia request today.
