Image Retrieval by Text Guide

This document explains the technology behind retrieving images from text descriptions and shows how to use the API.

1. Technical Principles

Image retrieval by text searches for relevant images using natural-language descriptions. It consists of the following steps:

Image Feature Extraction: Using a deep learning model (such as CLIP) to extract a high-dimensional feature vector from each image
Text Feature Extraction: Using the same model to extract a feature vector from the text description
Similarity Calculation: Computing the cosine similarity between the text feature vector and each image feature vector
Result Ranking: Returning the most relevant images, sorted by similarity score in descending order
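
The similarity and ranking steps above can be sketched in a few lines of Python. This is a minimal illustration, not the service's actual implementation: the three-dimensional vectors are made-up stand-ins for real model embeddings, and the names cosine_similarity and rank_images are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)

def rank_images(text_vector, image_vectors, limit=10):
    """Score every image against the text vector and return the top matches."""
    scored = [
        (image_id, cosine_similarity(text_vector, vector))
        for image_id, vector in image_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)  # highest score first
    return scored[:limit]

# Toy vectors standing in for real embeddings (assumption for illustration).
text_vector = [1.0, 0.0, 1.0]
image_vectors = {
    "img_a": [1.0, 0.0, 1.0],  # same direction as the text vector
    "img_b": [0.0, 1.0, 0.0],  # orthogonal to the text vector
    "img_c": [1.0, 1.0, 0.0],  # partially aligned
}
ranking = rank_images(text_vector, image_vectors, limit=2)
print(ranking)
```

In a real system the vectors would come from the image and text encoders of the same model, and the image vectors would normally be precomputed and indexed rather than scored one by one.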

2. Get API Key

(Get your API key from here)

3. Search Images by Text

Send a POST request to search for images by text description:

curl -X POST \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "beach sunset with palm trees",
    "limit": 5,
    "offset": 0
  }' \
  https://your-domain.com/api/search/v1

Request Parameters:

input: Search text (required; English works best)
limit: Number of results to return (optional, default: 10)
offset: Pagination offset (optional, default: 0)
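
The same request can be built from Python using only the standard library. This is a sketch under the assumptions of the curl example: the URL is the placeholder domain shown there, and build_search_request is a hypothetical helper name. The request is only constructed here, not sent.

```python
import json
import urllib.request

# Placeholder endpoint taken from the curl example; replace with your own domain.
API_URL = "https://your-domain.com/api/search/v1"

def build_search_request(api_key, text, limit=10, offset=0):
    """Build (but do not send) a POST request for the search endpoint."""
    if not text:
        raise ValueError("input is required")
    payload = {"input": text, "limit": limit, "offset": offset}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

request = build_search_request("YOUR_API_KEY", "beach sunset with palm trees", limit=5)

# To send it against a real deployment:
# with urllib.request.urlopen(request) as response:
#     results = json.load(response)
```

To page through results, one would typically call the helper again with offset increased by limit each time.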

Response Example:

[
  {
    "id": "img_123",
    "thumbnail_url": "https://...",
    "score": 0.92,
    "tags": ["beach", "sunset", "palm"]
  }
]
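
A response of this shape could be parsed into typed records as follows. This is a sketch that assumes every item carries the four fields shown in the example; SearchResult and parse_results are hypothetical names, and the sample body simply mirrors the example above.

```python
import json
from dataclasses import dataclass
from typing import List

@dataclass
class SearchResult:
    id: str
    thumbnail_url: str
    score: float
    tags: List[str]

def parse_results(body: str) -> List[SearchResult]:
    """Convert the JSON array returned by the search endpoint into records."""
    items = json.loads(body)
    return [
        SearchResult(
            id=item["id"],
            thumbnail_url=item["thumbnail_url"],
            score=float(item["score"]),
            tags=list(item.get("tags", [])),  # tolerate a missing tags field
        )
        for item in items
    ]

# Sample body mirroring the response example (the URL is elided there too).
sample_body = """
[
  {
    "id": "img_123",
    "thumbnail_url": "https://...",
    "score": 0.92,
    "tags": ["beach", "sunset", "palm"]
  }
]
"""
results = parse_results(sample_body)
for result in results:
    print(result.id, result.score, ", ".join(result.tags))
```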

4. Advanced Search Tips

Use specific descriptions: e.g. "red sports car" works better than just "car"
Combine multiple keywords: separate terms with commas
Specify styles: e.g. "watercolor painting of mountains"
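
As a small illustration of the comma-separated keyword tip, the helper below joins several terms into one search string. The name build_query is hypothetical, and the result is simply the text you would pass as the input parameter.

```python
def build_query(terms):
    """Join non-empty search terms into one comma-separated query string."""
    cleaned = [term.strip() for term in terms if term and term.strip()]
    if not cleaned:
        raise ValueError("at least one search term is required")
    return ", ".join(cleaned)

query = build_query(["red sports car", " city street ", ""])
print(query)
```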

Notes

Keep your API key secure and do not expose it in frontend code
English text descriptions typically work better than other languages
Results are sorted by similarity score (descending)
By default, 10 results are returned (adjust this with the limit parameter)
Complex queries may require adjusting the similarity threshold used to accept results
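
One way to apply a similarity threshold is to filter the returned results by score on the client side. This is only a sketch: the guide does not document a server-side threshold parameter, the cutoff of 0.8 is an arbitrary value chosen for illustration, and the sample results are made up.

```python
def filter_by_score(results, min_score):
    """Keep only results whose similarity score is at least min_score."""
    return [result for result in results if result["score"] >= min_score]

# Made-up results in the same shape as the response example.
sample_results = [
    {"id": "img_123", "score": 0.92},
    {"id": "img_456", "score": 0.81},
    {"id": "img_789", "score": 0.55},
]

# 0.8 is an arbitrary cutoff; tune it against your own queries.
strong_matches = filter_by_score(sample_results, min_score=0.8)
print([result["id"] for result in strong_matches])
```

Because results arrive sorted by score, a stricter threshold shortens the list from the end, while a looser one admits more marginal matches.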

If you want to learn more, please visit our documentation.