ColBERT

Experimental · 5 credits

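ColBERT is a neural retrieval model that embeds text as one vector per token rather than a single vector per passage. Using late interaction, it scores each query token against the document's token vectors, which gives more precise ranking than traditional single-vector dense embeddings.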

Production Recommendation

This is a direct endpoint for development and testing. For production workloads, use the Data Intelligence Pipeline, which provides structured Data Packages with quality metrics, is asynchronous by default, and is covered by Enterprise SLAs.

Overview

Key features:

  • Token-level embeddings for precise matching
  • Query expansion support
  • FastPlaid engine (5,689x faster than exhaustive search)
  • Separate query/document processing modes
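
To make late interaction concrete, here is a minimal sketch of the MaxSim scoring commonly used with ColBERT-style token embeddings: each query token is matched to its most similar document token, and the per-token maxima are summed. The function names (`dot`, `maxsim_score`, `rank_documents`) and the toy 2-dimensional vectors are illustrative assumptions, not part of the Latence API. This page does not state whether the service scores with dot product or cosine similarity, so the sketch assumes dot product.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_tokens: Sequence[Vector], doc_tokens: Sequence[Vector]) -> float:
    """Late-interaction score: for each query token, take its best match
    among the document tokens, then sum those maxima."""
    if not query_tokens or not doc_tokens:
        raise ValueError("both inputs need at least one token vector")
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


def rank_documents(query_tokens: Sequence[Vector], docs: Sequence[Sequence[Vector]]) -> List[int]:
    """Return document indices sorted by descending MaxSim score."""
    scores = [maxsim_score(query_tokens, d) for d in docs]
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)


# Toy data (hypothetical): real embeddings from this endpoint are 128-dimensional.
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0]]   # matches both query tokens exactly
doc_b = [[0.5, 0.5]]               # a single token that partially matches both

print(maxsim_score(query, doc_a))
print(maxsim_score(query, doc_b))
print(rank_documents(query, [doc_b, doc_a]))
```

In practice the query vectors would come from a call with `is_query` set to true and the document vectors from calls with `is_query` set to false.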

API Reference

POST https://api.latence.ai/api/v1/colbert/embed
Generate ColBERT token-level embeddings

Request Parameters

Parameter        Type     Required  Default  Description
text             string                      Input text (1-100,000 chars)
is_query         boolean                     True for queries, false for documents
query_expansion  boolean                     Enable query expansion
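
The parameters above can be assembled into a request body as in the following sketch. It checks the documented 1-100,000 character limit on `text` before building the request. The helper names (`build_embed_payload`, `build_embed_request`) are hypothetical. The JSON `Content-Type` header is an assumption. The authentication header is not specified in this section, so the caller supplies it. Because parameter defaults are not listed, both booleans are passed explicitly.

```python
import json
import urllib.request

EMBED_URL = "https://api.latence.ai/api/v1/colbert/embed"
MAX_TEXT_CHARS = 100_000  # documented upper bound for `text`


def build_embed_payload(text: str, *, is_query: bool, query_expansion: bool) -> dict:
    """Validate the inputs and return the request body as a dict."""
    if not isinstance(text, str):
        raise TypeError("text must be a string")
    if not 1 <= len(text) <= MAX_TEXT_CHARS:
        raise ValueError(f"text must be 1-{MAX_TEXT_CHARS} characters, got {len(text)}")
    return {
        "text": text,
        "is_query": bool(is_query),
        "query_expansion": bool(query_expansion),
    }


def build_embed_request(payload: dict, headers: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for the embed endpoint.

    `headers` should carry whatever authentication the API requires;
    the Content-Type below is assumed, not confirmed by this page.
    """
    body = json.dumps(payload).encode("utf-8")
    all_headers = {"Content-Type": "application/json", **headers}
    return urllib.request.Request(EMBED_URL, data=body, headers=all_headers, method="POST")


payload = build_embed_payload(
    "What are the benefits of machine learning?",
    is_query=True,
    query_expansion=True,
)
request = build_embed_request(payload, headers={})
print(request.get_method(), request.full_url)
```

Sending the request is left to the caller, for example with `urllib.request.urlopen(request)` once the required credentials are added to `headers`.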

Response Fields

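Field            Type                   Description
embeddings       array of float arrays  One embedding vector per token
shape            array of integers      Embedding dimensions as [tokens, 128]
encoding_format  string                 Encoding of the embedding values ("float" in the example below)
success          boolean                Whether the request succeeded
usage            object                 Usage details, including the credits consumed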

Response Example

200 OK (JSON)
{
  "embeddings": [[...], [...], ...],
  "shape": [12, 128],
  "encoding_format": "float",
  "success": true,
  "usage": { "credits": 0.5 }
}
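
A response of this form can be checked before use, as in the sketch below. It confirms that `success` is true and that `shape` agrees with the actual dimensions of `embeddings`. The function name `parse_embed_response` and the small sample payload are hypothetical. The sample uses 2-dimensional vectors for brevity, whereas the endpoint returns 128 values per token.

```python
from typing import Any, Dict, List


def parse_embed_response(payload: Dict[str, Any]) -> List[List[float]]:
    """Return the token embeddings after checking them against `shape`."""
    if not payload.get("success", False):
        raise RuntimeError("embedding request did not succeed")

    embeddings = payload["embeddings"]
    n_tokens, dim = payload["shape"]

    # The first shape entry is the token count, the second the vector size.
    if len(embeddings) != n_tokens:
        raise ValueError(f"expected {n_tokens} token vectors, got {len(embeddings)}")
    for i, row in enumerate(embeddings):
        if len(row) != dim:
            raise ValueError(f"token {i} has {len(row)} values, expected {dim}")
    return embeddings


# Hypothetical sample with the same fields as the response example.
sample = {
    "embeddings": [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
    "shape": [3, 2],
    "encoding_format": "float",
    "success": True,
    "usage": {"credits": 0.5},
}

vectors = parse_embed_response(sample)
print(len(vectors), "token vectors of size", len(vectors[0]))
```

Failing fast on a shape mismatch keeps malformed responses out of downstream scoring code.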

Code Examples

from latence import Latence

client = Latence(api_key="YOUR_API_KEY")

# Generate ColBERT token-level embeddings for neural search
result = client.experimental.colbert.embed(
    text="What are the benefits of machine learning?",
    is_query=True,           # True for queries, False for documents
    query_expansion=True     # Enable query expansion
)

print(result.embeddings)  # Float arrays per token
print(result.shape)       # [tokens, 128]

Explore Tutorials & Notebooks

Deep-dive examples and interactive notebooks in our GitHub repository
