Image Retrieval by Text Guide

This document explains the technology behind text-based image retrieval and how to use the API to search for images with text descriptions.

1. Technical Principles

Image Retrieval by Text searches for relevant images using natural language descriptions. It consists of the following steps:

  1. Image Feature Extraction: Using deep learning models (like CLIP) to extract high-dimensional feature vectors from images
  2. Text Feature Extraction: Using the same model to extract feature vectors from text descriptions
  3. Similarity Calculation: Computing cosine similarity between image and text feature vectors
  4. Result Ranking: Returning the most relevant images sorted by similarity score
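
Steps 3 and 4 can be illustrated with a minimal sketch. It assumes the feature vectors have already been extracted; the three-dimensional vectors, the image IDs, and the function names cosine_similarity and rank_images are made up for illustration and are not part of the API. Real model embeddings have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def rank_images(text_vector, image_vectors, limit=5):
    """Return (image_id, score) pairs, highest score first."""
    scored = [
        (image_id, cosine_similarity(text_vector, vector))
        for image_id, vector in image_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


# Toy vectors standing in for real image and text embeddings (illustrative only).
image_vectors = {
    "img_beach": [0.9, 0.1, 0.0],
    "img_city": [0.1, 0.9, 0.2],
    "img_forest": [0.0, 0.2, 0.9],
}
text_vector = [1.0, 0.0, 0.0]

ranking = rank_images(text_vector, image_vectors, limit=2)
for image_id, score in ranking:
    print(image_id, round(score, 3))
```

In practice the image vectors are usually computed once and stored, so only the text vector has to be computed at query time.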

2. Get API Key

(Get your API key from here)

3. Search Images by Text

Send a POST request to search for images by text description:

curl -X POST \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "beach sunset with palm trees",
    "limit": 5,
    "offset": 0
  }' \
  https://your-domain.com/api/search/v1

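Request Parameters:

  input (string): the text description to search for
  limit (integer): maximum number of results to return (defaults to 10)
  offset (integer): starting position within the ranked results (0 starts from the top match)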

Response Example:

[
  {
    "id": "img_123",
    "thumbnail_url": "https://...",
    "score": 0.92,
    "tags": ["beach", "sunset", "palm"]
  }
]
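
The same request can be sent from Python. The following is a minimal sketch using only the standard library. It assumes the endpoint, headers, and JSON array response shown above; the URL is the placeholder from the curl example, and the function names and the IMAGE_SEARCH_API_KEY environment variable are hypothetical.

```python
import json
import os
import urllib.request

# Placeholder URL copied from the curl example; replace it with your real domain.
SEARCH_URL = "https://your-domain.com/api/search/v1"


def build_search_request(query, api_key, limit=5, offset=0):
    """Build the POST request without sending it."""
    payload = {"input": query, "limit": limit, "offset": offset}
    return urllib.request.Request(
        SEARCH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def search_images(query, api_key, limit=5, offset=0, timeout=10):
    """Send the request and return the decoded JSON response."""
    request = build_search_request(query, api_key, limit, offset)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example usage (IMAGE_SEARCH_API_KEY is a hypothetical variable name):
#
#   api_key = os.environ["IMAGE_SEARCH_API_KEY"]
#   for item in search_images("beach sunset with palm trees", api_key):
#       print(item["id"], item["score"], item["thumbnail_url"])
```

Reading the key from an environment variable on the server side keeps it out of source files and frontend code.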

4. Advanced Search Tips

  1. Use specific descriptions: e.g. "red sports car" works better than just "car"
  2. Combine multiple keywords: separate terms with commas
  3. Specify styles: e.g. "watercolor painting of mountains"

Notes

  1. Keep your API key secure and do not expose it in frontend code
  2. English text descriptions typically work better than descriptions in other languages
  3. Results are sorted by similarity score (descending)
  4. By default, 10 results are returned (adjust this with the limit parameter)
  5. Complex scenarios may require adjusting the similarity threshold
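
The request shown in this guide has no threshold parameter, so one way to apply a similarity threshold is to filter the returned results on the client. The sketch below assumes each result carries a score field as in the response example. The function name filter_by_score, the cutoff of 0.8, and the second sample entry are illustrative assumptions, not values defined by the API.

```python
def filter_by_score(results, min_score=0.8):
    """Keep results whose score is at least min_score, preserving their order."""
    return [item for item in results if item.get("score", 0.0) >= min_score]


# Sample results shaped like the response example; img_456 is a made-up entry.
results = [
    {"id": "img_123", "score": 0.92},
    {"id": "img_456", "score": 0.75},
]

strong_matches = filter_by_score(results, min_score=0.8)
print([item["id"] for item in strong_matches])
```

A suitable cutoff depends on the queries and the image collection, so it is worth checking a few real searches before fixing a value.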

More

If you want to learn more, please visit our documentation.