Chat with Image Models
Have intelligent conversations about images using advanced AI vision models. Upload images and engage in natural language discussions about their content, analyze visual elements, extract information, and get detailed descriptions through an interactive chat interface.
Available Models
The Chat with Image API supports multiple AI models with vision capabilities:
OpenAI Models:
- gpt-4o: GPT-4o (128K context, $0.25 input / $10 output per 1K tokens)
- gpt-4o-mini: GPT-4o Mini (128K context, $0.15 input / $0.6 output per 1K tokens)
- gpt-4-turbo: GPT-4 Turbo (128K context, $10 input / $30 output per 1K tokens)
- gpt-5: GPT-5 (400K context, $1.25 input / $10 output per 1K tokens)
- gpt-5-mini: GPT-5 Mini (400K context, $0.25 input / $2 output per 1K tokens)
- gpt-5-chat-latest: GPT-5 Chat Latest (128K context, $1.25 input / $10 output per 1K tokens)
Google Models:
- gemini-1.5-pro: Gemini 1.5 Pro (2M context, $1.25 input / $5 output per 1K tokens)
- gemini-1.5-flash: Gemini 1.5 Flash (1M context, $0.075 input / $0.3 output per 1K tokens)
- gemini-2.0-flash: Gemini 2.0 Flash (1M context, $0.075 input / $0.3 output per 1K tokens)
- gemini-2.5-pro: Gemini 2.5 Pro (2M context, $1.25 input / $5 output per 1K tokens)
Anthropic Models:
- claude-3-opus-20240229: Claude 3 Opus (200K context, $15 input / $75 output per 1K tokens)
- claude-3-sonnet: Claude 3 Sonnet (200K context, $3 input / $15 output per 1K tokens)
- claude-3-5-sonnet-20240620: Claude 3.5 Sonnet (200K context, $3 input / $15 output per 1K tokens)
- claude-4-opus: Claude 4 Opus (1M context, $15 input / $75 output per 1K tokens)
Other Models:
- mistral-large-latest: Mistral Large 2 (128K context, $2 input / $6 output per 1K tokens)
- pixtral-12b: Mistral Pixtral 12B (128K context, $0.15 input / $0.15 output per 1K tokens)
- meta/llama-3.1-405b-instruct: LLaMA 3.1 405B (128K context, $5.32 input / $16 output per 1K tokens)
Request Parameters
Field Name | Type | Example | Description | Required |
---|---|---|---|---|
type | string | CHAT_WITH_IMAGE | Feature identifier | ✔️ |
model | string | gpt-4o | AI model to use | ✔️ |
conversationId | string | image-analysis-123 | Unique conversation identifier | ✔️ |
promptObject.prompt | string | What do you see in this image? | Your message or question | ✔️ |
promptObject.imageList | array | ["image1.jpg", "image2.png"] | Array of uploaded image keys | ✔️ |
promptObject.isMixed | boolean | false | Enable mixed conversation mode | ✖️ |
promptObject.webSearch | boolean | false | Enable web search for enhanced responses | ✖️ |
promptObject.numOfSite | number | 5 | Number of sites to search (if webSearch enabled) | ✖️ |
promptObject.maxWord | number | 1000 | Max words per site (if webSearch enabled) | ✖️ |
Parameter Details
imageList: Array of image keys obtained from uploading images via the Asset API. Images should be in PNG, JPEG, or WebP format.
conversationId: Unique identifier for the conversation. Reuse the same ID to continue an existing conversation; generate a new ID for each new chat session.
isMixed: When enabled, allows mixing different models within the same conversation thread.
webSearch: Enables real-time web search to enhance responses with current information related to the image content.
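The parameters described above can be assembled with a small helper. This is a minimal sketch, not part of any official client: the function name `build_chat_with_image_payload` is hypothetical, and both the empty-list check and the choice to omit `numOfSite` and `maxWord` when `webSearch` is off are assumptions based on the table marking those fields as optional.

```python
from typing import Any


def build_chat_with_image_payload(
    prompt: str,
    image_keys: list[str],
    conversation_id: str,
    model: str = "gpt-4o",
    web_search: bool = False,
    num_of_site: int = 5,
    max_word: int = 1000,
) -> dict[str, Any]:
    """Build a CHAT_WITH_IMAGE request body (hypothetical helper)."""
    # Assumption: a required imageList also means a non-empty one.
    if not image_keys:
        raise ValueError("imageList is required and must not be empty")

    prompt_object: dict[str, Any] = {
        "prompt": prompt,
        "imageList": list(image_keys),
        "webSearch": web_search,
    }
    # Assumption: the search-tuning fields only matter when webSearch is on.
    if web_search:
        prompt_object["numOfSite"] = num_of_site
        prompt_object["maxWord"] = max_word

    return {
        "type": "CHAT_WITH_IMAGE",
        "model": model,
        "conversationId": conversation_id,
        "promptObject": prompt_object,
    }
```

The returned dictionary can be passed as the JSON body of the request, for example through the `json=` argument of `requests.post`.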
Endpoint
POST https://api.1min.ai/api/features
Request Headers
Field | Value |
---|---|
API-KEY | <api-key> |
Content-Type | application/json |
Code Example
cURL
curl --location 'https://api.1min.ai/api/features' \
  --header 'API-KEY: <api-key>' \
  --header 'Content-Type: application/json' \
  --data '{
    "type": "CHAT_WITH_IMAGE",
    "model": "gpt-4o",
    "conversationId": "image-analysis-session-1",
    "promptObject": {
      "prompt": "What do you see in this image? Please describe the scene in detail.",
      "imageList": ["uploads/images/photo-123.jpg"],
      "isMixed": false,
      "webSearch": false,
      "numOfSite": 3,
      "maxWord": 500
    }
  }'
JavaScript
fetch('https://api.1min.ai/api/features', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'API-KEY': 'YOUR_API_KEY'
  },
  body: JSON.stringify({
    type: 'CHAT_WITH_IMAGE',
    model: 'gpt-4o',
    conversationId: 'image-analysis-session-1',
    promptObject: {
      prompt: 'What do you see in this image? Please describe the scene in detail.',
      imageList: ['uploads/images/photo-123.jpg'],
      isMixed: false,
      webSearch: false,
      numOfSite: 3,
      maxWord: 500
    }
  })
})
  // Parse the JSON body and log it; report network or parsing failures
  .then((response) => response.json())
  .then((data) => console.log(data))
  .catch((error) => console.error('Request failed:', error));
Python
import requests

url = "https://api.1min.ai/api/features"
headers = {
    "Content-Type": "application/json",
    "API-KEY": "YOUR_API_KEY"
}
data = {
    "type": "CHAT_WITH_IMAGE",
    "model": "gpt-4o",
    "conversationId": "image-analysis-session-1",
    "promptObject": {
        "prompt": "What do you see in this image? Please describe the scene in detail.",
        "imageList": ["uploads/images/photo-123.jpg"],
        "isMixed": False,
        "webSearch": False,
        "numOfSite": 3,
        "maxWord": 500
    }
}
# Send the request, fail fast on HTTP errors, and print the JSON body
response = requests.post(url, headers=headers, json=data)
response.raise_for_status()
print(response.json())
Response Format
{
  "status": "success",
  "message": "Streaming response will be returned",
  "data": {
    "conversationId": "image-analysis-session-1",
    "aiRecordId": "record-123",
    "streamingUrl": "/api/features/stream/record-123"
  }
}
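The `streamingUrl` in the response is a path rather than a full URL. The sketch below shows one way to turn it into an absolute URL. It assumes the path is relative to the same host as the request endpoint (`https://api.1min.ai`), which the source does not state explicitly; the function name `resolve_streaming_url` is hypothetical.

```python
from urllib.parse import urljoin

# Host taken from the request examples; assumed to also serve the stream.
API_BASE = "https://api.1min.ai"


def resolve_streaming_url(response_body: dict) -> str:
    """Return the absolute streaming URL from a response body shaped as above."""
    if response_body.get("status") != "success":
        raise RuntimeError(f"Request failed: {response_body.get('message')}")
    # Assumption: streamingUrl is a path relative to the API host.
    return urljoin(API_BASE, response_body["data"]["streamingUrl"])


sample = {
    "status": "success",
    "message": "Streaming response will be returned",
    "data": {
        "conversationId": "image-analysis-session-1",
        "aiRecordId": "record-123",
        "streamingUrl": "/api/features/stream/record-123",
    },
}
print(resolve_streaming_url(sample))
# prints https://api.1min.ai/api/features/stream/record-123
```

How the stream itself is consumed (chunked text, server-sent events, or another format) is not described here, so the sketch stops at resolving the URL.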
Conversation Examples
Basic Image Description
{
  "type": "CHAT_WITH_IMAGE",
  "model": "gpt-4o",
  "conversationId": "photo-analysis-1",
  "promptObject": {
    "prompt": "Describe this image in detail",
    "imageList": ["landscape-photo.jpg"]
  }
}
Multiple Images Comparison
{
  "type": "CHAT_WITH_IMAGE",
  "model": "claude-3-5-sonnet-20240620",
  "conversationId": "comparison-session",
  "promptObject": {
    "prompt": "Compare these two images and tell me the differences",
    "imageList": ["image1.jpg", "image2.jpg"]
  }
}
Enhanced with Web Search
{
  "type": "CHAT_WITH_IMAGE",
  "model": "gemini-1.5-pro",
  "conversationId": "research-session",
  "promptObject": {
    "prompt": "What building is this and what's its historical significance?",
    "imageList": ["building-photo.jpg"],
    "webSearch": true,
    "numOfSite": 5,
    "maxWord": 1000
  }
}
Best Practices
- Image Quality: Use clear, high-resolution images for better analysis results
- Specific Questions: Ask specific questions to get more detailed and relevant responses
- Context: Provide context about what you're looking for in the image
- Conversation Flow: Use the same conversationId to maintain context across multiple exchanges
- Model Selection: Choose models based on your needs:
  - GPT-4o: Best overall performance for detailed analysis
  - Gemini: Excellent for large context and multiple images
  - Claude: Great for creative and detailed descriptions
  - Mini models: Cost-effective for simple questions
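The conversation-flow advice above can be sketched as a small session object that fixes one `conversationId` and reuses it for every turn. The class `ImageChatSession` is a hypothetical helper, not part of the API. It assumes any unique string is an acceptable ID, and it resends `imageList` on each turn because the parameter table marks that field as required.

```python
import uuid


class ImageChatSession:
    """Holds one conversationId and builds a request body per turn (hypothetical)."""

    def __init__(self, image_keys, model="gpt-4o", conversation_id=None):
        # Assumption: any unique string is acceptable as a conversationId.
        self.conversation_id = conversation_id or f"image-chat-{uuid.uuid4()}"
        self.image_keys = list(image_keys)
        self.model = model

    def next_payload(self, prompt):
        """Return the request body for the next message in this conversation."""
        return {
            "type": "CHAT_WITH_IMAGE",
            "model": self.model,
            "conversationId": self.conversation_id,  # same ID on every turn
            "promptObject": {
                "prompt": prompt,
                # imageList is listed as required, so it is sent on each turn.
                "imageList": list(self.image_keys),
            },
        }


session = ImageChatSession(["building-photo.jpg"])
first = session.next_payload("What building is this?")
follow_up = session.next_payload("What architectural style is it?")
```

Both `first` and `follow_up` carry the same `conversationId`, so the second question is sent as a continuation of the first rather than as a new chat.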
Error Handling
Common error scenarios:
- Invalid image format: Ensure images are in PNG, JPEG, or WebP format
- Image too large: Resize images if they exceed size limits
- Missing images: Verify that imageList contains valid uploaded image keys
- Model limitations: Some models may have specific image size or format requirements
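Some of these errors can be caught before a request is sent. The following is a minimal client-side sketch under two assumptions: that image keys end in a file extension reflecting the image format (as in the examples on this page), and that an empty `imageList` is invalid. The function name `check_image_list` is hypothetical, and no size check is included because the size limit is not stated here.

```python
from pathlib import PurePosixPath

# Extensions corresponding to the supported formats: PNG, JPEG, WebP.
ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}


def check_image_list(image_keys):
    """Return a list of human-readable problems; an empty list means no issues found."""
    if not image_keys:
        return ["imageList is empty"]
    problems = []
    for key in image_keys:
        # Compare the extension case-insensitively.
        suffix = PurePosixPath(key).suffix.lower()
        if suffix not in ALLOWED_EXTENSIONS:
            problems.append(f"unsupported image format: {key}")
    return problems
```

A check like this only inspects the key's name; the server remains the authority on whether the uploaded file is actually valid.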
Authentication
This endpoint requires an API key to be provided in the API-KEY header.
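To avoid hard-coding the key in source files, the headers can be built from an environment variable. This is a minimal sketch; the variable name `ONEMIN_API_KEY` and the function name `build_headers` are hypothetical choices, not something the API prescribes.

```python
import os


def build_headers(env_var: str = "ONEMIN_API_KEY") -> dict[str, str]:
    """Build the request headers, reading the API key from the environment."""
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return {
        "API-KEY": api_key,
        "Content-Type": "application/json",
    }
```

The result can be passed directly as the `headers` argument of `requests.post`.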