Updated: 2025-12-03
Agent mode is Aitoearn’s workflow engine, which automatically invokes AI tools to complete complex tasks. The tables below list the price of each tool available in Agent mode.
Large Language Models
Prices are quoted per 1 million tokens.
| Model | Official Price | Input Modality | Output Modality | Notes |
|---|---|---|---|---|
| gemini-3.0-pro-preview | Input USD 2, Output USD 12 | Text, Image, Video | Text | Latest preview |
| sonnet-4.5 | Input USD 3, Output USD 15 | Text, Image | Text | Claude Advanced |
| opus-4.5 | Input USD 5, Output USD 25 | Text, Image | Text | Claude Flagship |
| gpt-5.1 | Input USD 1.25, Output USD 10 | Text, Image | Text | OpenAI Latest |
| gpt-5 | Input USD 1.25, Output USD 10 | Text, Image | Text | OpenAI Main |
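Because input and output tokens are priced separately, the cost of a single call is the sum of two terms. The sketch below shows that arithmetic using the official prices from the table above. The helper name `llm_cost_usd` and the dictionary layout are illustrative assumptions, not part of any Aitoearn API, and the result may differ from what Aitoearn actually bills.

```python
from decimal import Decimal

# USD per 1 million tokens, copied from the table above: (input, output).
LLM_PRICES = {
    "gemini-3.0-pro-preview": (Decimal("2"), Decimal("12")),
    "sonnet-4.5": (Decimal("3"), Decimal("15")),
    "opus-4.5": (Decimal("5"), Decimal("25")),
    "gpt-5.1": (Decimal("1.25"), Decimal("10")),
    "gpt-5": (Decimal("1.25"), Decimal("10")),
}

TOKENS_PER_UNIT = Decimal(1_000_000)


def llm_cost_usd(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Hypothetical helper: estimate the official-price cost of one call."""
    input_price, output_price = LLM_PRICES[model]
    total = input_price * input_tokens + output_price * output_tokens
    return total / TOKENS_PER_UNIT


if __name__ == "__main__":
    # 200,000 input tokens and 10,000 output tokens on sonnet-4.5.
    print(llm_cost_usd("sonnet-4.5", 200_000, 10_000))
```

For the example call, the input side contributes USD 0.60 and the output side USD 0.15, for USD 0.75 in total.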
Video Generation Models
| Model | Official Price | Notes |
|---|---|---|
| Veo-3.1 | USD 0.15 / sec | Google’s top video model |
| Sora-2 | USD 0.10 / sec | OpenAI video generation model |
Image Generation Models
| Model | Official Price | Description |
|---|---|---|
| Nano Banana Pro | USD 0.134 / image (1K-2K); USD 0.24 / image (4K) | 1024×1024 to 2048×2048 consumes 1120 tokens; supports up to 4096×4096 (2000 tokens) |
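The per-second video prices and per-image prices in the two tables above translate into simple multiplications. The following is a minimal sketch of that arithmetic. The function names and the tier labels `"1K-2K"` and `"4K"` are assumptions made for illustration, and the sketch does not model any rounding that actual billing might apply.

```python
from decimal import Decimal

# USD per second of generated video, from the video generation table.
VIDEO_PRICE_PER_SEC = {
    "Veo-3.1": Decimal("0.15"),
    "Sora-2": Decimal("0.10"),
}

# USD per image for Nano Banana Pro, keyed by resolution tier (labels are assumed).
IMAGE_PRICE_PER_IMAGE = {
    "1K-2K": Decimal("0.134"),
    "4K": Decimal("0.24"),
}


def video_generation_cost_usd(model: str, seconds) -> Decimal:
    """Hypothetical helper: price of a clip of the given length in seconds."""
    # Go through str() so float inputs such as 7.5 convert cleanly.
    return VIDEO_PRICE_PER_SEC[model] * Decimal(str(seconds))


def image_generation_cost_usd(tier: str, count: int) -> Decimal:
    """Hypothetical helper: price of `count` images at one resolution tier."""
    return IMAGE_PRICE_PER_IMAGE[tier] * count


if __name__ == "__main__":
    print(video_generation_cost_usd("Veo-3.1", 8))
    print(image_generation_cost_usd("4K", 3))
```

At these rates an 8-second Veo-3.1 clip comes to USD 1.20, and three 4K images come to USD 0.72.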
| Tool | Price | Billing Method |
|---|---|---|
| Video Understanding | USD 0.015 / min | Billed by input video duration |
| Video Editing | USD 2.5 / 1000 min | Priced for 720P output; other resolutions are converted automatically by pixel count |
| Video Style Transfer | USD 7.5 / min | Priced for 720P output; other resolutions are converted automatically by pixel count |
| Video Translation | USD 1.0 / min | Billed by output video duration |
| Highlight Clipping | USD 0.15 / min | Billed by input video duration |
| Subtitle Removal | USD 0.15 / min | Billed by output video duration |
| Script Restoration | USD 0.5 / min | Billed by input video duration |
| Video Narration | USD 0.1 / min | Billed by generated video duration |
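The tools above are all billed by duration, but Video Editing is quoted per 1000 minutes while the others are quoted per minute, so it helps to normalize the unit before multiplying. The sketch below does that. The helper name `tool_cost_usd` is hypothetical, the sketch assumes 720P output for the two resolution-dependent tools, and it does not model how partial minutes are rounded, since the table does not say.

```python
from decimal import Decimal

# (price in USD, number of minutes that price covers), from the table above.
TOOL_PRICES = {
    "Video Understanding": (Decimal("0.015"), 1),
    "Video Editing": (Decimal("2.5"), 1000),
    "Video Style Transfer": (Decimal("7.5"), 1),
    "Video Translation": (Decimal("1.0"), 1),
    "Highlight Clipping": (Decimal("0.15"), 1),
    "Subtitle Removal": (Decimal("0.15"), 1),
    "Script Restoration": (Decimal("0.5"), 1),
    "Video Narration": (Decimal("0.1"), 1),
}


def tool_cost_usd(tool: str, minutes) -> Decimal:
    """Hypothetical helper: cost for `minutes` of billable video.

    Whether `minutes` means input, output, or generated duration depends on
    the tool's billing method in the table; the caller must pass the right one.
    """
    price, per_minutes = TOOL_PRICES[tool]
    return price * Decimal(str(minutes)) / per_minutes


if __name__ == "__main__":
    print(tool_cost_usd("Video Understanding", 10))
    print(tool_cost_usd("Video Editing", 1000))
```

Ten minutes of Video Understanding comes to USD 0.15, and 1000 minutes of Video Editing comes to USD 2.5, matching the quoted rate.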