ColPali

Experimental · 10 credits

ColPali combines vision and language models for searching documents where visual context matters. Ideal for documents with charts, diagrams, tables, and complex formatting.

Production Recommendation

This is a direct endpoint for development and testing. For production workloads, use the Data Intelligence Pipeline -- it provides structured Data Packages with quality metrics, is async by default, and is covered by Enterprise SLAs.

Overview

Key features:

  • Vision-language embeddings (image + text)
  • Document page understanding
  • Layout-aware retrieval
  • Works with queries and document images

API Reference

POST https://api.latence.ai/api/v1/colpali/embed
Generate vision-language embeddings

Request Parameters

Parameter  Type     Required  Default  Description
text       string                      Query text (for is_query=true)
image      string                      Base64-encoded image data
is_query   boolean                     True for text queries, false for images
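
The parameters above can be assembled into a request body as in the sketch below. It is a minimal illustration only: the helper name build_embed_payload is hypothetical, and the rule that exactly one of text or image is sent per request is an assumption drawn from the parameter descriptions, not something the reference states. No network call is made, and authentication headers are omitted because this page does not document them.

```python
import base64
import json


def build_embed_payload(text=None, image_bytes=None):
    """Build a JSON-serializable body for the embed endpoint (hypothetical helper).

    Assumption: a request carries either a text query or an image, not both.
    """
    if (text is None) == (image_bytes is None):
        raise ValueError("provide exactly one of text or image_bytes")
    if text is not None:
        # Text queries are sent with is_query=true.
        return {"text": text, "is_query": True}
    # Images are sent as Base64 text with is_query=false.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {"image": encoded, "is_query": False}


query_body = build_embed_payload(text="Find invoices from 2024")
# Placeholder bytes stand in for the contents of a real page image.
page_body = build_embed_payload(image_bytes=b"\x89PNG placeholder bytes")

print(json.dumps(query_body))
```

In practice the image bytes would come from reading a page image file in binary mode before encoding.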

Response Fields

Field  Type  Description

Response Example

200 OK (JSON)
{
  "embeddings": [[...], [...], ...],
  "shape": [196, 128],
  "encoding_format": "float",
  "success": true,
  "usage": { "credits": 1.0 }
}
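
A response like the one above can be sanity-checked before its vectors are stored. The sketch below assumes that shape is [number of vectors, vector dimension], which matches the "[patches, 128]" comment in the code example on this page. The helper name check_embed_response is hypothetical, and the sample body is a tiny stand-in (2 x 3), not real API output.

```python
import json


def check_embed_response(body):
    """Return the embedding matrix after checking it against the reported shape."""
    if not body.get("success"):
        raise RuntimeError("embed request did not succeed")
    embeddings = body["embeddings"]
    rows, dim = body["shape"]
    # Every row must exist and have the reported dimension.
    if len(embeddings) != rows or any(len(vec) != dim for vec in embeddings):
        raise ValueError("embeddings do not match the reported shape")
    return embeddings


# Stand-in response with 2 vectors of dimension 3, for illustration only.
sample = json.loads(
    '{"embeddings": [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]], "shape": [2, 3], '
    '"encoding_format": "float", "success": true, "usage": {"credits": 1.0}}'
)
vectors = check_embed_response(sample)
print(len(vectors), len(vectors[0]))
```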

Code Examples

from latence import Latence

client = Latence(api_key="YOUR_API_KEY")

# Text query embedding
result = client.experimental.colpali.embed(
    text="Find invoices from 2024",
    is_query=True
)

# Or embed a document image from file
result = client.experimental.colpali.embed(
    image_path="/path/to/document_page.png",
    is_query=False  # For indexing documents
)

print(result.embeddings)  # Float arrays
print(result.shape)       # [patches, 128]
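
This page does not say how query and document embeddings should be compared. ColPali as published scores them with late interaction (MaxSim): each query vector is matched to its most similar page vector, and those maxima are summed. The sketch below assumes this endpoint's embeddings are meant to be used that way. The function names maxsim_score and rank_pages are hypothetical, and the toy 2-dimensional vectors stand in for real embeddings.

```python
def maxsim_score(query_vectors, page_vectors):
    """Sum, over query vectors, of the best dot product against any page vector."""
    total = 0.0
    for q in query_vectors:
        total += max(sum(a * b for a, b in zip(q, p)) for p in page_vectors)
    return total


def rank_pages(query_vectors, pages):
    """Return page ids ordered from highest to lowest MaxSim score."""
    scores = {pid: maxsim_score(query_vectors, vecs) for pid, vecs in pages.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy data: in practice these would be result.embeddings from the calls above.
query = [[1.0, 0.0], [0.0, 1.0]]
pages = {
    "page_a": [[0.9, 0.1], [0.2, 0.8]],
    "page_b": [[0.5, 0.5], [0.4, 0.4]],
}

print(rank_pages(query, pages))
```

Pure Python keeps the sketch dependency-free. For real page embeddings, a vectorized matrix product would be the more practical choice.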

Explore Tutorials & Notebooks

Deep-dive examples and interactive notebooks in our GitHub repository

Looking for production-grade processing?

The Data Intelligence Pipeline chains services automatically and returns structured Data Packages.