Image Retrieval by Text Guide
This document explains the technology behind text-based image retrieval and how to use the API to search for images with text descriptions.
1. Technical Principles
Image Retrieval by Text searches for relevant images using natural language descriptions. It consists of the following steps:
- Image Feature Extraction: Using deep learning models (like CLIP) to extract high-dimensional feature vectors from images
- Text Feature Extraction: Using the same model to extract feature vectors from text descriptions
- Similarity Calculation: Computing cosine similarity between image and text feature vectors
- Result Ranking: Returning the most relevant images sorted by similarity score
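The similarity and ranking steps above can be sketched in a few lines of Python. This is a minimal illustration, not the service's actual implementation: the 3-dimensional vectors are toy stand-ins for real model embeddings, and the names cosine_similarity and rank_images are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction; treat as unrelated
    return dot / (norm_a * norm_b)

def rank_images(text_vector, image_vectors, limit=10):
    """Return (image_id, score) pairs sorted by descending similarity."""
    scored = [
        (image_id, cosine_similarity(text_vector, vector))
        for image_id, vector in image_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]

# Toy vectors standing in for real image and text embeddings (assumption)
text_vector = [1.0, 0.0, 1.0]
image_vectors = {
    "img_a": [1.0, 0.0, 1.0],  # same direction as the query
    "img_b": [0.0, 1.0, 0.0],  # orthogonal to the query
    "img_c": [1.0, 1.0, 0.0],  # partial overlap with the query
}
ranking = rank_images(text_vector, image_vectors, limit=2)
```

Because cosine similarity depends only on vector direction, images whose embeddings point the same way as the text embedding rank first regardless of vector length.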
2. Get API Key
(Get your API key from here)
3. Search Images by Text
Send a POST request to search for images by text description:
curl -X POST \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "beach sunset with palm trees",
    "limit": 5,
    "offset": 0
  }' \
  https://your-domain.com/api/search/v1
Request Parameters:
- input: Search text (English works best, required)
- limit: Number of results to return (optional, default: 10)
- offset: Pagination offset (optional, default: 0)
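The same request can be built from Python using only the standard library. This is a sketch under stated assumptions: the URL is the placeholder from the curl example, and build_search_request and search_images are hypothetical helper names, not part of any official client.

```python
import json
import urllib.request

# Placeholder endpoint copied from the curl example (replace with your domain)
SEARCH_URL = "https://your-domain.com/api/search/v1"

def build_search_request(api_key, text, limit=10, offset=0):
    """Build (but do not send) the POST request for a text search."""
    if not text or not text.strip():
        raise ValueError("input is required")
    payload = {"input": text, "limit": limit, "offset": offset}
    return urllib.request.Request(
        SEARCH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

def search_images(api_key, text, limit=10, offset=0):
    """Send the request and decode the JSON response body."""
    request = build_search_request(api_key, text, limit, offset)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))

# Building the request performs no network call
request = build_search_request("YOUR_API_KEY", "beach sunset with palm trees", limit=5)
```

To page through results, call search_images repeatedly and increase offset by limit each time (0, 5, 10, and so on for limit=5).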
Response Example:
[
  {
    "id": "img_123",
    "thumbnail_url": "https://...",
    "score": 0.92,
    "tags": ["beach", "sunset", "palm"]
  }
]
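A response of this shape can be turned into typed objects as sketched below. The field names come from the example above; the SearchResult class and parse_results function are hypothetical, and the sketch assumes every item carries id, thumbnail_url, and score.

```python
import json
from dataclasses import dataclass, field

@dataclass
class SearchResult:
    id: str
    thumbnail_url: str
    score: float
    tags: list = field(default_factory=list)

def parse_results(body):
    """Parse a JSON array of search hits into SearchResult objects."""
    return [
        SearchResult(
            id=item["id"],
            thumbnail_url=item["thumbnail_url"],
            score=float(item["score"]),
            tags=list(item.get("tags", [])),  # tolerate a missing tags field
        )
        for item in json.loads(body)
    ]

# Sample body mirroring the response example
sample_body = """
[
  {
    "id": "img_123",
    "thumbnail_url": "https://...",
    "score": 0.92,
    "tags": ["beach", "sunset", "palm"]
  }
]
"""
results = parse_results(sample_body)
```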
4. Advanced Search Tips
- Use specific descriptions: e.g. "red sports car" works better than just "car"
- Combine multiple keywords: separate terms with commas
- Specify styles: e.g. "watercolor painting of mountains"
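The tips above can be combined when assembling the input string. The helper below is a hypothetical convenience, not part of the API: it trims each term, drops empty ones, and joins the rest with commas.

```python
def build_query(*terms):
    """Join non-empty search terms into one comma-separated input string."""
    cleaned = [term.strip() for term in terms if term and term.strip()]
    return ", ".join(cleaned)

# A specific subject plus a style hint
query = build_query("red sports car", " watercolor painting ", "")
```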
Notes
- Keep your API Key secure and don't expose it in frontend code
- English text descriptions typically work better than other languages
- Results are sorted by similarity score (descending)
- By default, 10 results are returned (adjust this with the limit parameter)
- Complex queries may return loosely related images; you may need to adjust the similarity score threshold you accept for your use case
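The request parameters documented above do not include a threshold, so one way to apply a similarity cutoff is to filter on the client side using the score field. The sketch below assumes results in the shape of the response example; filter_by_score and the 0.5 default are illustrative choices, not values recommended by the service.

```python
def filter_by_score(results, min_score=0.5):
    """Keep only results whose similarity score meets the threshold."""
    return [item for item in results if item.get("score", 0.0) >= min_score]

# Hypothetical results, already sorted by descending score
sample_results = [
    {"id": "img_123", "score": 0.92},
    {"id": "img_456", "score": 0.61},
    {"id": "img_789", "score": 0.34},
]
strong_matches = filter_by_score(sample_results, min_score=0.6)
```

Filtering preserves the original order, so the kept results remain sorted by descending score.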
More
If you want to learn more, please visit our documentation.