Top SitesSegmind - Media Generation Workflows for Developers

Machine Readiness

Stored receipt and evidence

Overall

20

Readable

65

Callable

0

Commerce

0

Payment

0

Machine Access

Inspect the site's MCP endpoint

Open MCP explorer

DialtoneApp can scan the stored discovery files for this domain, try the MCP initialize handshake, and show the raw protocol transcript.

Purchase boundary

read only

Control boundary

unknown

Payment rails

None

Payment providers

None

Payment methods

None

Payment protocols

None

Payment assets

None

Payment networks

None

Capabilities

None

Verified payment surface

No

Crypto only

No

Readable docs

robots, llms

Products

0

Variants

0

Priced variants

0

Currencies

0

Offers

0

Priced offers

0

Priced actions

0

Samples

Offer samples

No stored offer samples.

Samples

Action samples

No stored action samples.

Samples

Product samples

No stored product samples.

Document

robots.txt

Open robots.txt
User-agent: *
Allow: /

Sitemap: https://www.segmind.com/sitemap.xml

Document

llms.txt

Open llms.txt
# Segmind API — Model Directory

> Segmind provides serverless GPU inference APIs for 200+ generative AI models including image generation, video generation, audio, LLMs, and more. Pay-per-use pricing with no infrastructure to manage.

- **Base URL**: `https://api.segmind.com/v1/{model_slug}`
- **Authentication**: Bearer token (API key)
- **Docs**: https://docs.segmind.com

For detailed API documentation, parameters, pricing, and code examples for any model, fetch:
  `https://www.segmind.com/models/{slug}/llms.txt`

---

## Text-to-Image Generation

| Model | Slug | Description | Modality | Avg Cost | Avg Latency | API Docs |
| --- | --- | --- | --- | --- | --- | --- |
| Background Eraser | background-eraser | Background Eraser helps in flawless background removal with exceptional accuracy. | Text-to-Image Generation | $0.0006151452871512006 | 0.79262s | [llms.txt](https://www.segmind.com/models/background-eraser/llms.txt) |
| Bria 3.2 Text to Image | bria-text-to-image | Bria 3.2 AI transforms natural language into stunning visuals for diverse creative applications — with Base, Fast, and H | Text-to-Image Generation | $0.03890776699029126 | 21.90631s | [llms.txt](https://www.segmind.com/models/bria-text-to-image/llms.txt) |
| Bria Vector Graphics | bria-text-to-vector-graphics | Bria Vision enables high-quality text-to-image and text-to-vector graphic generation for versatile commercial use. | Text-to-Image Generation | $0.03927536231884058 | 17.91585s | [llms.txt](https://www.segmind.com/models/bria-text-to-vector-graphics/llms.txt) |
| Chroma  | chroma | Chroma is an open-source, 8.9B parameter text-to-image model (based on FLUX.1-schnell) designed for diverse and uncensor | Text-to-Image Generation | $0.05537459326497977 | 53.18714s | [llms.txt](https://www.segmind.com/models/chroma/llms.txt) |
| Colossus Lightning SDXL | sdxl1.0-colossus-lightning | Colossus Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images i | Text-to-Image Generation | $0.007343366528711841 | 3.41171s | [llms.txt](https://www.segmind.com/models/sdxl1.0-colossus-lightning/llms.txt) |
| Copax Timeless SDXL | sdxl1.0-timeless | The SDXL model is the official upgrade to the v1.5 model. The model is released as open-source software. | Text-to-Image Generation | $0.010301487864269121 | 5.4494s | [llms.txt](https://www.segmind.com/models/sdxl1.0-timeless/llms.txt) |
| Cyber Realistic | sd1.5-cyberrealistic | The most versatile photorealistic model that blends various models to achieve the amazing realistic images. | Text-to-Image Generation | $0.002858647598761479 | 1.49894s | [llms.txt](https://www.segmind.com/models/sd1.5-cyberrealistic/llms.txt) |
| DreamShaper Lightning SDXL | sdxl1.0-dreamshaper-lightning | DreamShaper Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px image | Text-to-Image Generation | $0.006108172840899091 | 3.18294s | [llms.txt](https://www.segmind.com/models/sdxl1.0-dreamshaper-lightning/llms.txt) |
| Dreamshaper SDXL | sdxl1.0-dreamshaper | The SDXL model is the official upgrade to the v1.5 model. The model is released as open-source software. | Text-to-Image Generation | $0.011029928226258858 | 6.46014s | [llms.txt](https://www.segmind.com/models/sdxl1.0-dreamshaper/llms.txt) |
| Dynavis Lightning SDXL | sdxl1.0-dyanvis-lightning | Dynavis Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images in | Text-to-Image Generation | $0.006759038098416354 | 3.81495s | [llms.txt](https://www.segmind.com/models/sdxl1.0-dyanvis-lightning/llms.txt) |
| Edge of Realism | sd1.5-edgeofrealism | This model corresponds to the Stable Diffusion Edge of Realism checkpoint for detailed images at the cost of a super det | Text-to-Image Generation | $0.0033575167818571824 | 1.62471s | [llms.txt](https://www.segmind.com/models/sd1.5-edgeofrealism/llms.txt) |
| Epic Realism | sd1.5-epicrealism | This model corresponds to the Stable Diffusion Epic Realism checkpoint for detailed images at the cost of a super detail | Text-to-Image Generation | $0.0035588267762241533 | 1.66889s | [llms.txt](https://www.segmind.com/models/sd1.5-epicrealism/llms.txt) |
| Fast Flux.1 Schnell | fast-flux-schnell | Fast Flux.1 Schnell by Segmind is an optimized text-to-image model designed for developers needing faster image generati | Text-to-Image Generation | $0.005469704305806988 | 2.45278s | [llms.txt](https://www.segmind.com/models/fast-flux-schnell/llms.txt) |
| Flux .1 Pro | flux-pro | Flux Pro is a state-of-the-art image generation with top of the line prompt following, visual quality, image detail and  | Text-to-Image Generation | $0.06720196622390362 | 20.40943s | [llms.txt](https://www.segmind.com/models/flux-pro/llms.txt) |
| Flux Dev Finetuned | flux-dev-finetuned | Flux is a 12 billion parameter rectified flow transformer capable of generating images from text descriptions | Text-to-Image Generation | $0.03397461959219858 | 22.17098s | [llms.txt](https://www.segmind.com/models/flux-dev-finetuned/llms.txt) |
| Flux Realism Lora with Upscale | flux-realism-lora | Flux Realism Lora  with upscale, developed by XLabs AI is a cutting-edge model designed to generate realistic images fro | Text-to-Image Generation | $0.05401312950388659 | 38.20368s | [llms.txt](https://www.segmind.com/models/flux-realism-lora/llms.txt) |
| Flux-1.1 Pro Ultra | flux-1.1-pro-ultra | Create stunning visuals effortlessly with Flux 1.1 Pro Ultra. Experience unparalleled image quality and speed. | Text-to-Image Generation | $0.07491933421374558 | 13.97401s | [llms.txt](https://www.segmind.com/models/flux-1.1-pro-ultra/llms.txt) |
| flux-pro-1.1 | flux-1.1-pro | Flux Pro 1.1 is a cutting-edge image generation tool offering exceptional speed, quality, and customization. Ideal for d | Text-to-Image Generation | $0.049941616963406175 | 14.08806s | [llms.txt](https://www.segmind.com/models/flux-1.1-pro/llms.txt) |
| Flux.1 Dev | flux-dev | Flux Dev is a 12 billion parameter rectified flow transformer capable of generating images from text descriptions | Text-to-Image Generation | $0.019617503960528127 | 20.42006s | [llms.txt](https://www.segmind.com/models/flux-dev/llms.txt) |
| Flux.1 Schnell | flux-schnell | Flux Schnell  is a state-of-the-art text-to-image generation model engineered for speed and efficiency. | Text-to-Image Generation | $0.007854434602951205 | 10.14296s | [llms.txt](https://www.segmind.com/models/flux-schnell/llms.txt) |
| GPT Image 1 | gpt-image-1 | Create high-quality AI-generated images from text prompts using OpenAI's GPT Image 1 model. Ideal for product design, co | Text-to-Image Generation | $0.17882895119128733 | 48.8478s | [llms.txt](https://www.segmind.com/models/gpt-image-1/llms.txt) |
| GPT Image 1 Mini | gpt-image-1-mini | High-quality image generation from text, fast and affordable. | Text-to-Image Generation | $0.03726354359483614 | 42.19341s | [llms.txt](https://www.segmind.com/models/gpt-image-1-mini/llms.txt) |
| GPT Image 1.5 | gpt-image-1.5 | Stunning photorealistic images with exceptional instruction-following. | Text-to-Image Generation | $0.16893354443934527 | 38.40591s | [llms.txt](https://www.segmind.com/models/gpt-image-1.5/llms.txt) |
| GPT Image 2 | gpt-image-2 | Generate photorealistic images with legible multilingual text and 2K output. | Text-to-Image Generation | $5 | - | [llms.txt](https://www.segmind.com/models/gpt-image-2/llms.txt) |
| Ideogram 2a Text To Image | ideogram-2a-txt-2-img | Create captivating designs, realistic images & innovative logos with Ideogram 2a text-to-image. | Text-to-Image Generation | $0.04999999999999993 | 12.37334s | [llms.txt](https://www.segmind.com/models/ideogram-2a-txt-2-img/llms.txt) |
| Ideogram 3.0 | ideogram-3 | Ideogram 3.0 revolutionizes content creation with photorealistic text-to-image generation and diverse aesthetic styles. | Text-to-Image Generation | $0.06539005269245597 | 10.29323s | [llms.txt](https://www.segmind.com/models/ideogram-3/llms.txt) |
| Ideogram Text To Image | ideogram-txt-2-img | Ideogram Text to Image: Turn your ideas into stunning visuals instantly with this powerful AI tool. Create captivating d | Text-to-Image Generation | $0.09999999999999962 | 21.63901s | [llms.txt](https://www.segmind.com/models/ideogram-txt-2-img/llms.txt) |
| Ideogram Turbo Text To Image | ideogram-turbo-txt-2-img | Create stunning images in seconds with Ideogram Turbo Text to Image. Fast AI model for quick ideation & text rendering. | Text-to-Image Generation | $0.06299999999999999 | 12.83101s | [llms.txt](https://www.segmind.com/models/ideogram-turbo-txt-2-img/llms.txt) |
| Imagen 3 | imagen | Imagen 3 is Google DeepMind's highest quality text-to-image model. Generates detailed images with enhanced lighting, div | Text-to-Image Generation | $0.060000000000000074 | 8.15673s | [llms.txt](https://www.segmind.com/models/imagen/llms.txt) |
| Imagen 4 | imagen-4 | Imagen 4 is Google’s most advanced AI image generation model, creating detailed, photorealistic or abstract images from  | Text-to-Image Generation | $0.059999999999999915 | 11.36461s | [llms.txt](https://www.segmind.com/models/imagen-4/llms.txt) |
| Juggernaut Final | sd1.5-juggernaut | The most versatile photorealistic model that blends various models to achieve the amazing realistic images. | Text-to-Image Generation | $0.0030143181960931693 | 1.68559s | [llms.txt](https://www.segmind.com/models/sd1.5-juggernaut/llms.txt) |
| Juggernaut Lightning Flux | juggernaut-lightning-flux |  Juggernaut Lightning Flux: Blazing fast (<300ms!) & powerful inference with enhanced visuals. | Text-to-Image Generation | $0.009213980160017302 | 6.07418s | [llms.txt](https://www.segmind.com/models/juggernaut-lightning-flux/llms.txt) |
| Juggernaut Lightning SDXL | sdxl1.0-juggernaut-lightning | Juggernaut Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images | Text-to-Image Generation | $0.0041748787772685056 | 3.19259s | [llms.txt](https://www.segmind.com/models/sdxl1.0-juggernaut-lightning/llms.txt) |
| Juggernaut Pro Flux | juggernaut-pro-flux | Juggernaut Pro FLUX: Create stunningly realistic AI images with unprecedented detail and sharpness. | Text-to-Image Generation | $0.012202491649676834 | 7.66323s | [llms.txt](https://www.segmind.com/models/juggernaut-pro-flux/llms.txt) |
| Kling V3 Text to Image | kling-3-text2image | Photorealistic, print-ready images from text prompts. | Text-to-Image Generation | $0.035 | 50.47578s | [llms.txt](https://www.segmind.com/models/kling-3-text2image/llms.txt) |
| Luma Photon Flash Text to Image | luma-photon-flash-txt-2-img | Luma Photon flash is a powerful and fast text-to-image model offering high-quality visuals with unmatched speed and prec | Text-to-Image Generation | $0.0024999999999999966 | 15.9032s | [llms.txt](https://www.segmind.com/models/luma-photon-flash-txt-2-img/llms.txt) |
| Luma Photon Text to Image | luma-photon-txt-2-img | Luma Photon is a powerful AI-driven text-to-image model offering high-quality visuals with unmatched speed and precision | Text-to-Image Generation | $0.018749999999999996 | 18.87383s | [llms.txt](https://www.segmind.com/models/luma-photon-txt-2-img/llms.txt) |
| Nano Banana | nano-banana | Gemini Image Editor preserves authentic subject identity while enabling seamless image editing and manipulation. | Text-to-Image Generation | $0.03635954763730072 | 14.27113s | [llms.txt](https://www.segmind.com/models/nano-banana/llms.txt) |
| NewReality Lightning SDXL | sdxl1.0-newreality-lightning | NewReality Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images | Text-to-Image Generation | $0.006337203805577281 | 3.04087s | [llms.txt](https://www.segmind.com/models/sdxl1.0-newreality-lightning/llms.txt) |
| NightVis Lightning SDXL | sdxl1.0-nightvis-lightning | NightVis Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images i | Text-to-Image Generation | $0.007701533149756406 | 4.27571s | [llms.txt](https://www.segmind.com/models/sdxl1.0-nightvis-lightning/llms.txt) |
| Playground V2.5 | playground-v2.5 | Playground V2.5 is a diffusion-based text-to-image generative model, designed to create highly aesthetic images based on | Text-to-Image Generation | $0.003721384771350553 | 4.01389s | [llms.txt](https://www.segmind.com/models/playground-v2.5/llms.txt) |
| ProtoVision Lightning SDXL | sdxl1.0-protovis-lightning | ProtoVision Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px image | Text-to-Image Generation | $0.0064859676707384704 | 3.63305s | [llms.txt](https://www.segmind.com/models/sdxl1.0-protovis-lightning/llms.txt) |
| Pruna P Image | p-image | High-quality text-to-image generation optimized for speed. | Text-to-Image Generation | $0.005 | 6.36064s | [llms.txt](https://www.segmind.com/models/p-image/llms.txt) |
| Qwen Image | qwen-image | Qwen-Image revolutionizes image generation and editing with seamless multilingual text integration and photorealistic de | Text-to-Image Generation | $0.12119319123750963 | 28.39442s | [llms.txt](https://www.segmind.com/models/qwen-image/llms.txt) |
| Qwen Image 2512 | qwen-image-2512 | Photorealistic image generation with precise text description following. | Text-to-Image Generation | $0.013920287425219943 | 19.16328s | [llms.txt](https://www.segmind.com/models/qwen-image-2512/llms.txt) |
| Qwen Image Fast | qwen-image-fast | Qwen-Image expertly generates stunning images with complex text integration, especially for Chinese typography. | Text-to-Image Generation | $0.01795099815209666 | 5.53044s | [llms.txt](https://www.segmind.com/models/qwen-image-fast/llms.txt) |
| RealDream Lightning | sdxl1.0-realdream-lightning | RealDream is a sophisticated image generation model utilizing SDXL Lightning architecture. It creates incredibly realist | Text-to-Image Generation | $0.002366460644945139 | 3.02763s | [llms.txt](https://www.segmind.com/models/sdxl1.0-realdream-lightning/llms.txt) |
| Realdream Pony V9 | sdxl1.0-realdream-pony-v9 | Real Dream Pony V9 is an advanced image generation model based on the Stable Diffusion XL (SDXL) architecture, excelling | Text-to-Image Generation | $0.007407028161413931 | 5.24506s | [llms.txt](https://www.segmind.com/models/sdxl1.0-realdream-pony-v9/llms.txt) |
| Realism Lightning SDXL | sdxl1.0-realism-lightning | Realism Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images in | Text-to-Image Generation | $0.007190902261951501 | 4.69687s | [llms.txt](https://www.segmind.com/models/sdxl1.0-realism-lightning/llms.txt) |
| Realistic Vision | sd1.5-realisticvision | This model corresponds to the Stable Diffusion Realistic Vision checkpoint for detailed images at the cost of a super de | Text-to-Image Generation | $0.002516299705881904 | 1.46519s | [llms.txt](https://www.segmind.com/models/sd1.5-realisticvision/llms.txt) |
| Realvis Lightning SDXL | sdxl1.0-realvis-lightning | Realvis Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images in | Text-to-Image Generation | $0.005007959419236132 | 3.0495s | [llms.txt](https://www.segmind.com/models/sdxl1.0-realvis-lightning/llms.txt) |
| Realvis SDXL | sdxl1.0-realvis | The SDXL model is the official upgrade to the v1.5 model. The model is released as open-source software. | Text-to-Image Generation | $0.01033161443048845 | 4.9386s | [llms.txt](https://www.segmind.com/models/sdxl1.0-realvis/llms.txt) |
| Recraft V3 | recraft-v3 | Recraft V3, the latest iteration of Recraft AI, offers a significant advancement in AI-driven image generation. This sta | Text-to-Image Generation | $0.05000000000000011 | 15.08788s | [llms.txt](https://www.segmind.com/models/recraft-v3/llms.txt) |
| Recraft V3 Svg | recraft-v3-svg | Recraft V3 SVG generates high-quality, customizable vector graphics with precision and ease. Perfect for logos, infograp | Text-to-Image Generation | $0.10000000000000009 | 17.66325s | [llms.txt](https://www.segmind.com/models/recraft-v3-svg/llms.txt) |
| Reliberate | sd1.5-reliberate | This model corresponds to the Stable Diffusion Reliberate checkpoint for detailed images at the cost of a super detailed | Text-to-Image Generation | $0.003285052131707925 | 1.84194s | [llms.txt](https://www.segmind.com/models/sd1.5-reliberate/llms.txt) |
| Samaritan 3D XL | sdxl1.0-samaritan-3d | Samaritan 3D XL leverages the robust capabilities of the SDXL framework, ensuring high-quality, detailed 3D character re | Text-to-Image Generation | $0.008573312925109997 | 4.13346s | [llms.txt](https://www.segmind.com/models/sdxl1.0-samaritan-3d/llms.txt) |
| Samaritan Lightning SDXL | sdxl1.0-samaritan-lightning | Samaritan Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images  | Text-to-Image Generation | $0.007016612648451131 | 4.37849s | [llms.txt](https://www.segmind.com/models/sdxl1.0-samaritan-lightning/llms.txt) |
| Seedream 3.0 t2i | seedream-v3-text-to-image | Seedream V3 generates high-resolution, bilingual images in seconds, enhancing creative workflows and marketing effective | Text-to-Image Generation | $0.0374999999999999 | 5.79861s | [llms.txt](https://www.segmind.com/models/seedream-v3-text-to-image/llms.txt) |
| Seedream 5.0 Lite: Text-to-Image | seedream-v5-lite-text-to-image | Fast, affordable instruction-following image generation. | Text-to-Image Generation | $0.03499999999999999 | 36.43476s | [llms.txt](https://www.segmind.com/models/seedream-v5-lite-text-to-image/llms.txt) |
| Segmind-Vega | segmind-vega | The Segmind-Vega Model is a distilled version of the Stable Diffusion XL (SDXL), offering a remarkable 70% reduction in  | Text-to-Image Generation | $0.002249561540861523 | 2.35792s | [llms.txt](https://www.segmind.com/models/segmind-vega/llms.txt) |
| Segmind-VegaRT | segmind-vega-rt-v1 | Segmind-VegaRT a distilled consistency adapter for Segmind-Vega that allows to reduce the number of inference steps to o | Text-to-Image Generation | $0.002000385751439731 | 1.68497s | [llms.txt](https://www.segmind.com/models/segmind-vega-rt-v1/llms.txt) |
| Simple Vector Flux Lora | Simple_Vector_Flux | Flux is a 12 billion parameter rectified flow transformer capable of generating images from text descriptions | Text-to-Image Generation | $0.0337997876486014 | 36.40649s | [llms.txt](https://www.segmind.com/models/Simple_Vector_Flux/llms.txt) |
| SSD-1B | ssd-1b | SSD-1B efficiently generates high-quality, diverse images from text prompts in real-time. | Text-to-Image Generation | $0.004170655368377181 | 2.82059s | [llms.txt](https://www.segmind.com/models/ssd-1b/llms.txt) |
| Stable Diffusion 3 Medium Text to Image | stable-diffusion-3-medium-txt2img | Stable Diffusion is a type of latent diffusion model that can generate images from text. It was created by a team of res | Text-to-Image Generation | $0.04100084691679047 | 7.37145s | [llms.txt](https://www.segmind.com/models/stable-diffusion-3-medium-txt2img/llms.txt) |
| Stable Diffusion 3.5 Large Text to Image | stable-diffusion-3.5-large-txt2img | Stable Diffusion 3.5 Large offers exceptional customizability, efficient performance on consumer hardware, and diverse i | Text-to-Image Generation | $0.013810415239805475 | 17.49587s | [llms.txt](https://www.segmind.com/models/stable-diffusion-3.5-large-txt2img/llms.txt) |
| Stable Diffusion 3.5 Turbo Text to Image | stable-diffusion-3.5-turbo-txt2img | Stable Diffusion 3.5 Turbo offers exceptional customizability, efficient performance on consumer hardware, and diverse i | Text-to-Image Generation | $0.003497373127186722 | 4.83746s | [llms.txt](https://www.segmind.com/models/stable-diffusion-3.5-turbo-txt2img/llms.txt) |
| Stable Diffusion XL 1.0 | sdxl1.0-txt2img | The SDXL model is the official upgrade to the v1.5 model. The model is released as open-source software | Text-to-Image Generation | $0.007976094449291276 | 6.26414s | [llms.txt](https://www.segmind.com/models/sdxl1.0-txt2img/llms.txt) |
| Wan 2.7 Image Generation | wan2.7-image | 2K image generation with precise multilingual text rendering. | Text-to-Image Generation | $0.0375 | 21.52606s | [llms.txt](https://www.segmind.com/models/wan2.7-image/llms.txt) |
| Wan 2.7 Image Generation Pro | wan2.7-image-pro | 4K images with chain-of-thought reasoning and multilingual text. | Text-to-Image Generation | $0.0375 | 39.03948s | [llms.txt](https://www.segmind.com/models/wan2.7-image-pro/llms.txt) |
| WildCard Lightning SDXL | sdxl1.0-wildcard-lightning | WildCard Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images i | Text-to-Image Generation | $0.005027039929891649 | 3.42103s | [llms.txt](https://www.segmind.com/models/sdxl1.0-wildcard-lightning/llms.txt) |
| Yamer's Realistic SDXL | sdxl1.0-yamers-realistic | The SDXL model is the official upgrade to the v1.5 model. The model is released as open-source software. | Text-to-Image Generation | $0.008998636289076828 | 5.70452s | [llms.txt](https://www.segmind.com/models/sdxl1.0-yamers-realistic/llms.txt) |
| Z Image Turbo | z-image-turbo | Photorealistic images in under one second, bilingual text. | Text-to-Image Generation | $0.030655437032967026 | 6.58321s | [llms.txt](https://www.segmind.com/models/z-image-turbo/llms.txt) |
| Zavychroma SDXL | sdxl1.0-zavychroma | The SDXL model is the official upgrade to the v1.5 model. The model is released as open-source software. | Text-to-Image Generation | $0.010168409796996965 | 6.03496s | [llms.txt](https://www.segmind.com/models/sdxl1.0-zavychroma/llms.txt) |

## Text Generation (LLM)

| Model | Slug | Description | Modality | Avg Cost | Avg Latency | API Docs |
| --- | --- | --- | --- | --- | --- | --- |
| Claude 3 Haiku | claude-3-haiku | Claude 3 Haiku, the fastest and most cost-effective model LLM from Anthropic, delivers instant responses and image analy | Text Generation (LLM) | $0.005565155941967141 | 4.326s | [llms.txt](https://www.segmind.com/models/claude-3-haiku/llms.txt) |
| Claude 3 Opus | claude-3-opus | Claude 3 Opus is an LLM pushing the limits of language understanding. It excels at complex tasks, generates human-qualit | Text Generation (LLM) | $0.08819386837881218 | 25.017s | [llms.txt](https://www.segmind.com/models/claude-3-opus/llms.txt) |
| Claude 3.5 Sonnet | claude-3.5-sonnet | Claude 3.5 Sonnet represents a significant advancement in AI language models, combining speed, accuracy, and visual reas | Text Generation (LLM) | $0.03664767260936965 | 11.20302s | [llms.txt](https://www.segmind.com/models/claude-3.5-sonnet/llms.txt) |
| Claude 4 Sonnet | claude-4-sonnet | Advanced coding and multi-step agentic reasoning model. | Text Generation (LLM) | $0.021433145101663584 | 11.23375s | [llms.txt](https://www.segmind.com/models/claude-4-sonnet/llms.txt) |
| Claude 4.5 Sonnet | claude-4.5-sonnet | Claude Sonnet 4.5 empowers developers with advanced coding and reasoning for complex software solutions. | Text Generation (LLM) | $0.04551593101182655 | 23.52418s | [llms.txt](https://www.segmind.com/models/claude-4.5-sonnet/llms.txt) |
| Claude Opus 4.7 | claude-opus-4.7 | Anthropic's most capable AI model excelling at agentic coding, complex reasoning, and high-resolution vision with a 1M-t | Text Generation (LLM) | $0.05 | - | [llms.txt](https://www.segmind.com/models/claude-opus-4.7/llms.txt) |
| DeepSeek Chat | deepseek-chat | DeepSeek V3 combines cutting-edge AI technology with practical usability. Featuring a 671B parameter architecture, enhan | Text Generation (LLM) | $0.0012820433904013474 | 34.10292s | [llms.txt](https://www.segmind.com/models/deepseek-chat/llms.txt) |
| DeepSeek R1 | deepseek-reasoner | DeepSeek-R1 is a cutting-edge AI reasoning model that combines reinforcement learning with supervised fine-tuning. Excel | Text Generation (LLM) | $0.04687098227507291 | 67.21856s | [llms.txt](https://www.segmind.com/models/deepseek-reasoner/llms.txt) |
| Gemini 2 Flash | gemini-2-flash-image-generation | With Gemini 2 Flash, create consistent visuals, edit images conversationally, and render text accurately. | Text Generation (LLM) | $0.05416053896353166 | 8.60779s | [llms.txt](https://www.segmind.com/models/gemini-2-flash-image-generation/llms.txt) |
| Gemini 2.5 Flash | gemini-2.5-flash | Multimodal AI with transparent reasoning, fast and affordable. | Text Generation (LLM) | $0.003799036607142857 | 11.07565s | [llms.txt](https://www.segmind.com/models/gemini-2.5-flash/llms.txt) |
| Gemini 2.5 PRO | gemini-2.5-pro | Complex multimodal reasoning across diverse inputs and formats. | Text Generation (LLM) | $0.025429340671641796 | 27.37359s | [llms.txt](https://www.segmind.com/models/gemini-2.5-pro/llms.txt) |
| Gemini 3 Pro | gemini-3-pro | Autonomous multimodal AI for complex reasoning and coding. | Text Generation (LLM) | $0.0031178443433199786 | 31.44143s | [llms.txt](https://www.segmind.com/models/gemini-3-pro/llms.txt) |
| GPT 4 | gpt-4 | GPT-4 outperforms both previous large language models and as of 2023, most state-of-the-art systems (which often have be | Text Generation (LLM) | $0.028924545905172412 | 13.14806s | [llms.txt](https://www.segmind.com/models/gpt-4/llms.txt) |
| GPT 4 turbo | gpt-4-turbo | GPT-4 outperforms both previous large language models and as of 2023, most state-of-the-art systems (which often have be | Text Generation (LLM) | $0.005317002659762601 | 9.90926s | [llms.txt](https://www.segmind.com/models/gpt-4-turbo/llms.txt) |
| GPT 4o | gpt-4o | GPT-4o (“o” for “omni”) is our most advanced model. It is multimodal (accepting text or image inputs and outputting text | Text Generation (LLM) | $0.007496491503571936 | 6.0069s | [llms.txt](https://www.segmind.com/models/gpt-4o/llms.txt) |
| GPT 5 | gpt-5 | GPT-5 automates complex coding tasks with integrated tools for seamless software development and deployment. | Text Generation (LLM) | $0.04185687687634024 | 61.68263s | [llms.txt](https://www.segmind.com/models/gpt-5/llms.txt) |
| GPT 5 Mini | gpt-5-mini | Rapid high-quality AI across text, images, and files. | Text Generation (LLM) | $0.01020766289791438 | 47.90171s | [llms.txt](https://www.segmind.com/models/gpt-5-mini/llms.txt) |
| GPT 5 Nano | gpt-5-nano | Ultra-fast LLM responses for real-time AI applications. | Text Generation (LLM) | $0.0010103470974352404 | 30.08176s | [llms.txt](https://www.segmind.com/models/gpt-5-nano/llms.txt) |
| GPT 5.1 | gpt-5.1 | Precise code review and developer workflow assistant. | Text Generation (LLM) | $0.0072179697892271666 | 9.00481s | [llms.txt](https://www.segmind.com/models/gpt-5.1/llms.txt) |
| GPT 5.2 | gpt-5.2 | Advanced reasoning with multimodal input for precise tasks. | Text Generation (LLM) | $0.012528342994100295 | 20.53192s | [llms.txt](https://www.segmind.com/models/gpt-5.2/llms.txt) |
| GPT 5.4 | gpt-5.4 | Most powerful GPT for frontier reasoning and multimodal tasks. | Text Generation (LLM) | $0.009438483055555556 | 4.55805s | [llms.txt](https://www.segmind.com/models/gpt-5.4/llms.txt) |
| GPT 5.4 Mini | gpt-5.4-mini | Fastest efficient model for coding and computer-use tasks. | Text Generation (LLM) | $0.0005428316746203904 | 5.14682s | [llms.txt](https://www.segmind.com/models/gpt-5.4-mini/llms.txt) |
| GPT 5.4 Nano | gpt-5.4-nano | Flagship-class AI for classification and extraction tasks. | Text Generation (LLM) | $0.0029356314497417857 | 3.50369s | [llms.txt](https://www.segmind.com/models/gpt-5.4-nano/llms.txt) |
| Grok 2 | grok-2 | Grok-2, xAI's latest language model, boasts superior reasoning, coding, and chat capabilities, outperforming many popula | Text Generation (LLM) | $0.0046698497191011235 | 8.51163s | [llms.txt](https://www.segmind.com/models/grok-2/llms.txt) |
| Grok 2 Vision | grok-2-vision | Grok-2, xAI's latest language model with vision understanding. | Text Generation (LLM) | $0.003865039487565939 | 5.03739s | [llms.txt](https://www.segmind.com/models/grok-2-vision/llms.txt) |
| Kimi K2 Instruct 0905 | kimi-k2-instruct-0905 | Deep contextual understanding and complex code generation. | Text Generation (LLM) | $0.001018350267379679 | 2.16009s | [llms.txt](https://www.segmind.com/models/kimi-k2-instruct-0905/llms.txt) |
| Llama 3 70b | llama-v3-70b-instruct | Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and inst | Text Generation (LLM) | $0.001026968779153933 | 2.38864s | [llms.txt](https://www.segmind.com/models/llama-v3-70b-instruct/llms.txt) |
| Llama 3 8b | llama-v3-8b-instruct | Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and inst | Text Generation (LLM) | $0.0006315522434414854 | 1.33946s | [llms.txt](https://www.segmind.com/models/llama-v3-8b-instruct/llms.txt) |
| Llama 3.1 405b | llama-v3p1-405b-instruct | Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and inst | Text Generation (LLM) | $0.006355001054333309 | 6.60671s | [llms.txt](https://www.segmind.com/models/llama-v3p1-405b-instruct/llms.txt) |
| Llama 3.1 70b | llama-v3p1-70b-instruct | Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and inst | Text Generation (LLM) | $0.0019101056600713583 | 2.70658s | [llms.txt](https://www.segmind.com/models/llama-v3p1-70b-instruct/llms.txt) |
| Llama 3.1 8b | llama-v3p1-8b-instruct | Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and inst | Text Generation (LLM) | $0.00002702273886072785 | 1.66543s | [llms.txt](https://www.segmind.com/models/llama-v3p1-8b-instruct/llms.txt) |
| Llama 4 Maverick Instruct Basic | llama4-maverick-instruct-basic | Llama 4 Maverick Instruct Basic is a 400B parameter powerhouse with 128 experts for unparalleled text and image understa | Text Generation (LLM) | $0.0010211187845303867 | 2.47979s | [llms.txt](https://www.segmind.com/models/llama4-maverick-instruct-basic/llms.txt) |
| Llama 4 Scout Instruct Basic | llama4-scout-instruct-basic | Unlock powerful multimodal AI with Llama 4 Scout basic, a 17 billion active parameters model offering leading text & ima | Text Generation (LLM) | $0.0007072039722329349 | 2.66331s | [llms.txt](https://www.segmind.com/models/llama4-scout-instruct-basic/llms.txt) |
| Mixtral 8x22b | mixtral-8x22b-instruct | Mistral MoE 8x22B Instruct v0.1 model with Sparse Mixture of Experts. Fine tuned for instruction following. | Text Generation (LLM) | $0.0005056373593842508 | 2.73773s | [llms.txt](https://www.segmind.com/models/mixtral-8x22b-instruct/llms.txt) |
| Mixtral 8x7b | mixtral-8x7b-instruct | Mistral MoE 8x7B Instruct v0.1 model with Sparse Mixture of Experts. Fine tuned for instruction following. | Text Generation (LLM) | $0.0002638041493775933 | 1.8292s | [llms.txt](https://www.segmind.com/models/mixtral-8x7b-instruct/llms.txt) |
| O4 Mini | o4-mini | OpenAI o4-mini enhances decision-making by processing text and images with advanced reasoning capabilities. | Text Generation (LLM) | $0.006826813091158328 | 12.00599s | [llms.txt](https://www.segmind.com/models/o4-mini/llms.txt) |
| OpenAI o1-mini | o1-mini | o1-mini by OpenAI provides high-performance reasoning and coding capabilities. Ideal for developers and businesses seeki | Text Generation (LLM) | $0.08289552217591578 | 30.12678s | [llms.txt](https://www.segmind.com/models/o1-mini/llms.txt) |
| OpenAI o1-preview | o1-preview | o1-preview by OpenAI,  is a powerful AI model that can tackle complex problems with exceptional accuracy and efficiency. | Text Generation (LLM) | $0.34304200481254693 | 51.96905s | [llms.txt](https://www.segmind.com/models/o1-preview/llms.txt) |
| OpenAI o3 | o3 | Frontier reasoning model for complex coding, math, and science. | Text Generation (LLM) | $0.0050085 | 10.96443s | [llms.txt](https://www.segmind.com/models/o3/llms.txt) |
| OpenAI o3 Mini | o3-mini | Cost-efficient reasoning model for coding, math, and science. | Text Generation (LLM) | $0.00138958 | 3.26736s | [llms.txt](https://www.segmind.com/models/o3-mini/llms.txt) |
| QVQ Max | qvq-max | Chain-of-thought visual reasoning for math, charts, and diagrams. | Text Generation (LLM) | $0.0075934999999999996 | 27.5905s | [llms.txt](https://www.segmind.com/models/qvq-max/llms.txt) |
| Qwen 3 Coder Flash | qwen3-coder-flash | High-volume code generation with 1M token context window. | Text Generation (LLM) | $0.0007448666666666667 | 4.72234s | [llms.txt](https://www.segmind.com/models/qwen3-coder-flash/llms.txt) |
| Qwen 3 Coder Plus | qwen3-coder-plus | Generates, debugs, and refactors entire codebases efficiently. | Text Generation (LLM) | $0.0011938 | 3.80229s | [llms.txt](https://www.segmind.com/models/qwen3-coder-plus/llms.txt) |
| Qwen 3 Max | qwen3-max | 1T-parameter LLM with hybrid reasoning and 262K context. | Text Generation (LLM) | $0.00105925 | 5.62272s | [llms.txt](https://www.segmind.com/models/qwen3-max/llms.txt) |
| Qwen 3 VL Flash | qwen3-vl-flash | Fast, affordable vision-language model with 262K context OCR. | Text Generation (LLM) | $0.00019149999999999997 | 3.97372s | [llms.txt](https://www.segmind.com/models/qwen3-vl-flash/llms.txt) |
| Qwen 3 VL Plus | qwen3-vl-plus | Powerful visual QA and document analysis from images. | Text Generation (LLM) | $0.0009447333333333334 | 7.08631s | [llms.txt](https://www.segmind.com/models/qwen3-vl-plus/llms.txt) |
| Qwen 3.5 Flash | qwen3.5-flash | Fast multimodal AI processing text, images, and video affordably. | Text Generation (LLM) | $0.0007941999999999999 | 31.56291s | [llms.txt](https://www.segmind.com/models/qwen3.5-flash/llms.txt) |
| Qwen 3.5 Plus | qwen3.5-plus | Multimodal 1M context AI for image, video, and text. | Text Generation (LLM) | $0.003051714285714286 | 17.11801s | [llms.txt](https://www.segmind.com/models/qwen3.5-plus/llms.txt) |
| Qwen Flash | qwen-flash | Fastest low-cost LLM with 1M context for high-volume tasks. | Text Generation (LLM) | $0.00011157241379310346 | 4.61847s | [llms.txt](https://www.segmind.com/models/qwen-flash/llms.txt) |
| Qwen Plus | qwen-plus | Mid-tier 1M context LLM for summarization and content tasks. | Text Generation (LLM) | $0.00014566666666666667 | 2.12671s | [llms.txt](https://www.segmind.com/models/qwen-plus/llms.txt) |
| Qwen2 VL 72B Instruct | qwen2-vl-72b-instruct | Qwen2-VL-72B-Instruct is a state-of-the-art multimodal model excelling in image and video understanding, with advanced c | Text Generation (LLM) | $0.005773370372061172 | 7.14185s | [llms.txt](https://www.segmind.com/models/qwen2-vl-72b-instruct/llms.txt) |
| QWEN2-VL-7B-Instruct | qwen2-vl-7b-instruct | The Qwen2-VL-7B-Instruct is a cutting-edge vision-language model with 7 billion parameters, offering advanced capabiliti | Text Generation (LLM) | $0.0012699288110324266 | 36.34406s | [llms.txt](https://www.segmind.com/models/qwen2-vl-7b-instruct/llms.txt) |
| Qwen2.5-VL 32B Instruct | qwen2p5-vl-32b-instruct | Qwen2.5-VL processes text and images seamlessly for advanced multimodal instruction and reasoning. | Text Generation (LLM) | $0.0026978558645024703 | 7.5514s | [llms.txt](https://www.segmind.com/models/qwen2p5-vl-32b-instruct/llms.txt) |
| QwQ Plus | qwq-plus | Deep chain-of-thought reasoning for math, code, and logic. | Text Generation (LLM) | $0.008212777777777777 | 56.4063s | [llms.txt](https://www.segmind.com/models/qwq-plus/llms.txt) |

## Image-to-Video Generation

| Model | Slug | Description | Modality | Avg Cost | Avg Latency | API Docs |
| --- | --- | --- | --- | --- | --- | --- |
| AI Face Swap (image and video) | ai-face-swap | AI Face Swap: Effortlessly replace faces online. Fine-tune swaps with advanced controls for age, gender, and resolution. | Image-to-Video Generation | $0.10309356862457175 | 29.51768s | [llms.txt](https://www.segmind.com/models/ai-face-swap/llms.txt) |
| Bytedance HuMo: Human-Centric Video Generation | bytedance-humo | HuMo generates high-quality, human-centric videos from text, images, and audio with unparalleled control and precision. | Image-to-Video Generation | $5 | - | [llms.txt](https://www.segmind.com/models/bytedance-humo/llms.txt) |
| Cog videoX Image To Video | cog-video-5b-i2v | CogVideoX image-to-video is a cutting-edge AI model that converts static images into dynamic, high-quality videos. Perfe | Image-to-Video Generation | $0.35561489718190126 | 355.73991s | [llms.txt](https://www.segmind.com/models/cog-video-5b-i2v/llms.txt) |
| Easy Animate | easy-animate | Easy Animate  is a state-of-the-art image to animation model to convert static images into dynamic animations with remar | Image-to-Video Generation | $0.7981060098555378 | 208.63524s | [llms.txt](https://www.segmind.com/models/easy-animate/llms.txt) |
| Google Veo 2 Image To Video | veo-2-image2video | Discover Google Veo 2, an AI-powered image-to-video model with 4K resolution, realistic motion, and cinematic effects fo | Image-to-Video Generation | $3.957629085982983 | 40.58279s | [llms.txt](https://www.segmind.com/models/veo-2-image2video/llms.txt) |
| Hailuo 02 Fast | hailuo-02-fast | Transform any static image into a captivating, high-quality video clip effortlessly. | Image-to-Video Generation | $0.1612957434649876 | 79.67199s | [llms.txt](https://www.segmind.com/models/hailuo-02-fast/llms.txt) |
| Hailuo 2.3 | hailuo-2.3 | Hyper-realistic videos from text with fluid character motion. | Image-to-Video Generation | $0.529004739336493 | 140.66875s | [llms.txt](https://www.segmind.com/models/hailuo-2.3/llms.txt) |
| Hailuo 2.3 Fast | hailuo-2.3-fast | Professional-quality videos from text and images at speed. | Image-to-Video Generation | $0.3867293777134588 | 114.22468s | [llms.txt](https://www.segmind.com/models/hailuo-2.3-fast/llms.txt) |
| Hallo | hallo | Hallo lets you create portrait videos from single images. | Image-to-Video Generation | $0.4217882754233411 | 303.61238s | [llms.txt](https://www.segmind.com/models/hallo/llms.txt) |
| Heygen Avatar IV | heygen-avatar-iv | Single photo into a lifelike talking avatar video. | Image-to-Video Generation | $2.1682730471698113 | 215.82418s | [llms.txt](https://www.segmind.com/models/heygen-avatar-iv/llms.txt) |
| Higgsfield Image 2 Video | higgsfield-image2video | Transform static images into dynamic, motion-rich videos with unparalleled control and creative depth. | Image-to-Video Generation | $0.6381414141414141 | 152.31308s | [llms.txt](https://www.segmind.com/models/higgsfield-image2video/llms.txt) |
| Higgsfield Speech 2 Video | higgsfield-speech2video | Transform images and audio into dynamic, lip-synced videos for engaging digital content. | Image-to-Video Generation | $1.9714583333333338 | 290.66342s | [llms.txt](https://www.segmind.com/models/higgsfield-speech2video/llms.txt) |
| HyperSwap: Video Faceswap by FaceFusion Labs | video-faceswap-by-facefusion-labs | Realistic face swapping in videos from a single image. | Image-to-Video Generation | $0.08616183774834436 | 55.44278s | [llms.txt](https://www.segmind.com/models/video-faceswap-by-facefusion-labs/llms.txt) |
| InfiniteTalk | infinite-talk | Full-body animation from images synchronized perfectly to audio. | Image-to-Video Generation | $0.46024858417659986 | 301.51975s | [llms.txt](https://www.segmind.com/models/infinite-talk/llms.txt) |
| Kling 2 | kling-2 | Kling 2.0 is an advanced AI video generator (5 and 10 seconds) that creates cinematic, dynamic videos from text or image | Image-to-Video Generation | $2.2962962962962963 | 305.2509s | [llms.txt](https://www.segmind.com/models/kling-2/llms.txt) |
| Kling 2.1 AI Video Generator | kling-2.1 | Kling 2.1 offers hyper-realistic video generation with improved motion, sharper 1080p visuals, and instant restyling cap | Image-to-Video Generation | $0.9007380707174735 | 135.89172s | [llms.txt](https://www.segmind.com/models/kling-2.1/llms.txt) |
| Kling 2.5 Turbo | kling-2.5-turbo | Kling AI 2.5 Turbo generates fluid, cinematic videos from text and images, enhancing content creation and storytelling. | Image-to-Video Generation | $0.56640350877193 | 134.74241s | [llms.txt](https://www.segmind.com/models/kling-2.5-turbo/llms.txt) |
| Kling 2.6 | kling-2.6 | Still images into immersive cinematic videos with synchronized audio. | Image-to-Video Generation | $1.1078084832904884 | 122.43048s | [llms.txt](https://www.segmind.com/models/kling-2.6/llms.txt) |
| Kling 3.0 Pro Image-to-Video | kling-3-pro-image2video | Animated 1080p videos from images with dynamic motion. | Image-to-Video Generation | $1.9405611510791374 | 291.36004s | [llms.txt](https://www.segmind.com/models/kling-3-pro-image2video/llms.txt) |
| Kling 3.0 Standard Image-to-Video | kling-3-standard-image2video | Controlled cinematic 1080p videos from starting images. | Image-to-Video Generation | $1.2414769230769234 | 150.59963s | [llms.txt](https://www.segmind.com/models/kling-3-standard-image2video/llms.txt) |
| Kling AI 1.6 Image to Video | kling-1.6-image2video | Kling AI 1.6 Image-to-Video is a powerful AI tool that transforms static images into captivating, animated videos. Creat | Image-to-Video Generation | $1.0528155725494146 | 289.3756s | [llms.txt](https://www.segmind.com/models/kling-1.6-image2video/llms.txt) |
| Kling AI Image to Video | kling-image2video | Kling AI Image-to-Video is a powerful AI tool that transforms static images into captivating, animated videos. Create hi | Image-to-Video Generation | $0.8641459524125369 | 317.54633s | [llms.txt](https://www.segmind.com/models/kling-image2video/llms.txt) |
| Kling Avatar V2 Standard | kling-v2-standard-avatar | Lifelike video avatars with precise lip synchronization. | Image-to-Video Generation | $2.779702592592593 | 498.74452s | [llms.txt](https://www.segmind.com/models/kling-v2-standard-avatar/llms.txt) |
| Kling bloombloom | kling-bloombloom | Kling AI transforms text and images into dynamic, high-quality video content with realistic motion and sound. | Image-to-Video Generation | $0.9800000000000001 | 193.06476s | [llms.txt](https://www.segmind.com/models/kling-bloombloom/llms.txt) |
| Kling dizzydizzy | kling-dizzydizzy | Kling DizzyDizzy transforms static content into dynamic, high-resolution videos, enhancing engagement and storytelling f | Image-to-Video Generation | $0.9800000000000002 | 199.27494s | [llms.txt](https://www.segmind.com/models/kling-dizzydizzy/llms.txt) |
| Kling Expansion | kling-expansion | Unleash dynamic visuals with Kling Expansion! Effortlessly inflate and stretch elements for surreal and captivating effe | Image-to-Video Generation | $0.9500000000000001 | 118.23583s | [llms.txt](https://www.segmind.com/models/kling-expansion/llms.txt) |
| Kling fuzzyfuzzy | kling-fuzzyfuzzy | Transform your photos instantly into adorable, plush-toy-like visuals with Kling fuzzyfuzzy effect. | Image-to-Video Generation | $0.9635294117647059 | 132.86003s | [llms.txt](https://www.segmind.com/models/kling-fuzzyfuzzy/llms.txt) |
| Kling Heart Gesture | kling-heart-gesture | Express affection visually with Kling AI's heart gesture effect! Input two portraits and instantly create heartwarming v | Image-to-Video Generation | $0.945 | 208.38736s | [llms.txt](https://www.segmind.com/models/kling-heart-gesture/llms.txt) |
| Kling Hug | kling-hug | Create heartwarming videos instantly with Kling hug effect! Generate tender embracing animations. | Image-to-Video Generation | $0.9800000000000001 | 206.59058s | [llms.txt](https://www.segmind.com/models/kling-hug/llms.txt) |
| Kling Kiss | kling-kiss | Create a heartfelt video in seconds with Kling kiss effect! Input two portraits and instantly generate a kissing animati | Image-to-Video Generation | $1.0006434316353892 | 235.845s | [llms.txt](https://www.segmind.com/models/kling-kiss/llms.txt) |
| Kling O1 Image 2 Video | kling-o1-image-to-video | Physics-driven animations from images for creative storytelling. | Image-to-Video Generation | $0.9085748965517243 | 143.44042s | [llms.txt](https://www.segmind.com/models/kling-o1-image-to-video/llms.txt) |
| Kling O1 Reference Image 2 Video | kling-o1-reference-image-to-video | Identity-preserving videos from static images with character reference. | Image-to-Video Generation | $1.345300374531835 | 197.35832s | [llms.txt](https://www.segmind.com/models/kling-o1-reference-image-to-video/llms.txt) |
| Kling O3 Image To Video | kling-o3-image2video | Images to cinematic videos with precise motion control. | Image-to-Video Generation | $1.8423107569721113 | 202.45185s | [llms.txt](https://www.segmind.com/models/kling-o3-image2video/llms.txt) |
| Kling Squish | kling-squish | Transform your visuals with Kling AI squish effect! Easily compress and distort images/videos for playful, exaggerated e | Image-to-Video Generation | $0.967272727272727 | 120.28359s | [llms.txt](https://www.segmind.com/models/kling-squish/llms.txt) |
| Kling V1 Pro AI Avatar | kling-v1-pro-ai-avatar | Dynamic AI avatars with synchronized speech from image. | Image-to-Video Generation | $4.475115916666667 | 732.24765s | [llms.txt](https://www.segmind.com/models/kling-v1-pro-ai-avatar/llms.txt) |
| Kling V1 Standard AI Avatar | kling-v1-standard-ai-avatar | Lifelike AI avatars with precise lip-sync for presentations. | Image-to-Video Generation | $2.2834377333333333 | 460.07325s | [llms.txt](https://www.segmind.com/models/kling-v1-standard-ai-avatar/llms.txt) |
| Kling V2 Pro Avatar | kling-v2-pro-avatar | Talking avatar videos from image and audio, high quality. | Image-to-Video Generation | $5.285169729729731 | 779.46664s | [llms.txt](https://www.segmind.com/models/kling-v2-pro-avatar/llms.txt) |
| Live Portrait | live-portrait | Live Portrait animates static images using a reference driving video through implicit key point based framework, bringin | Image-to-Video Generation | $0.054998455614000394 | 36.03831s | [llms.txt](https://www.segmind.com/models/live-portrait/llms.txt) |
| Live Portrait video to video | live-portrait-video-to-video | Experience the magic of Live Portrait’s Video-to-Video Model! Transform your static images into dynamic videos seamlessl | Image-to-Video Generation | $0.27947058954138704 | 74.44037s | [llms.txt](https://www.segmind.com/models/live-portrait-video-to-video/llms.txt) |
| LTX 2 Fast | ltx-2-fast | Fast, high-quality text-to-video generation by Lightricks. | Image-to-Video Generation | $0.5186933065217391 | 46.42703s | [llms.txt](https://www.segmind.com/models/ltx-2-fast/llms.txt) |
| LTX 2 Pro | ltx-2-pro | High-quality video generation with advanced motion control. | Image-to-Video Generation | $0.6267278331034484 | 69.73345s | [llms.txt](https://www.segmind.com/models/ltx-2-pro/llms.txt) |
| LTX Video | ltx-video | LTX-Video is the first DiT-based video generation model capable of generating high-quality videos in real-time. It produ | Image-to-Video Generation | $0.057473419067595385 | 54.94509s | [llms.txt](https://www.segmind.com/models/ltx-video/llms.txt) |
| Luma Image-to-Video | luma-img-2-video | With Luma's Dream Machine, transform your static images into dynamic videos. It offers high-fidelity video generation, r | Image-to-Video Generation | $0.9472327964860999 | 60.84224s | [llms.txt](https://www.segmind.com/models/luma-img-2-video/llms.txt) |
| Luma Modify Video | modify-video | Transform videos seamlessly with high-fidelity generative edits while preserving original actor performances. | Image-to-Video Generation | $0.6006692185185185 | 164.37768s | [llms.txt](https://www.segmind.com/models/modify-video/llms.txt) |
| Luma Ray flash 2 (720p) | ray-flash-2-720p | Generate stunning 720p videos from text with the Luma ray-flash-2-720p model. Faster & cheaper than Ray 2, offering real | Image-to-Video Generation | $0.4267106325842696 | 60.61618s | [llms.txt](https://www.segmind.com/models/ray-flash-2-720p/llms.txt) |
| Luma Ray Image to Video | luma-ray-img-2-video | With Luma's Ray2 image-to-video, transform your static images into cinematic dynamic videos. | Image-to-Video Generation | $1.6000000000000005 | 125.16943s | [llms.txt](https://www.segmind.com/models/luma-ray-img-2-video/llms.txt) |
| Minimax (Hailuo) Video-01-live | minimax-ai-live | Create stunning animations with Minimax (Hailuo) video-01-live, an AI image-to-video model perfect for Live2D, anime, an | Image-to-Video Generation | $0.625 | 167.32884s | [llms.txt](https://www.segmind.com/models/minimax-ai-live/llms.txt) |
| MiniMax AI (Hailuo) | minimax-ai | With Video-01 by MiniMax, create high-definition videos at 720p resolution and 25fps, featuring cinematic camera movemen | Image-to-Video Generation | $0.6009749412252146 | 177.54842s | [llms.txt](https://www.segmind.com/models/minimax-ai/llms.txt) |
| Minimax Hailou 2 | minimax-hailuo-2 | Generate breathtaking 1080P cinematic videos from text or images with ultra-realistic motion and physics. | Image-to-Video Generation | $0.3875000000000001 | 174.24651s | [llms.txt](https://www.segmind.com/models/minimax-hailuo-2/llms.txt) |
| Motion Control SVD | motionctrl-svd | Motion Control SVD is an innovative deep learning framework that breathes life into static images. By intelligently mana | Image-to-Video Generation | $0.08928216191843766 | 62.73815s | [llms.txt](https://www.segmind.com/models/motionctrl-svd/llms.txt) |
| Muscle Surge | muscle-surge | Instantly add muscle and strength to your videos with Pixverse Muscle Surge effect! | Image-to-Video Generation | $0.41835106382978726 | 45.50548s | [llms.txt](https://www.segmind.com/models/muscle-surge/llms.txt) |
| OVI Image To Video | ovi-i2v | Synchronized video and audio generation from text and images. | Image-to-Video Generation | $0.24981727088122607 | 41.92037s | [llms.txt](https://www.segmind.com/models/ovi-i2v/llms.txt) |
| Pixverse 4.5 Effects | pixverse-4.5-effects | PixVerse 4.5 transforms photos and text into stunning animated videos for impactful storytelling and marketing. | Image-to-Video Generation | $0.3975694444444444 | 46.91432s | [llms.txt](https://www.segmind.com/models/pixverse-4.5-effects/llms.txt) |
| Pixverse 4.5 Transition | pixverse-4.5-transition | PixVerse 4.5 transforms still images into dynamic, captivating videos with seamless transitions. | Image-to-Video Generation | $0.5813148788927336 | 57.82971s | [llms.txt](https://www.segmind.com/models/pixverse-4.5-transition/llms.txt) |
| Pixverse 4.5 Video | pixverse-4.5-video | Pixverse 4.5 transforms static images and text into dynamic, engaging videos for captivating social media content. | Image-to-Video Generation | $0.5827814569536424 | 45.92084s | [llms.txt](https://www.segmind.com/models/pixverse-4.5-video/llms.txt) |
| Pixverse 5 Extend | pixverse-5-extend | Seamlessly extend and continue AI-generated videos. | Image-to-Video Generation | $0.6569148936170213 | 93.2064s | [llms.txt](https://www.segmind.com/models/pixverse-5-extend/llms.txt) |
| Pixverse 5 Transition | pixverse-5-transition | Seamless AI-generated video transitions between scenes. | Image-to-Video Generation | $0.5 | 72.51908s | [llms.txt](https://www.segmind.com/models/pixverse-5-transition/llms.txt) |
| Pixverse 5 Video | pixverse-5-video | Cinematic videos from text and images with photorealism. | Image-to-Video Generation | $0.631159420289855 | 68.81511s | [llms.txt](https://www.segmind.com/models/pixverse-5-video/llms.txt) |
| Pixverse Image to Video | pixverse-image2video | Animate your photos effortlessly with Pixverse Image to Video AI! Upload, add motion prompts and styles. | Image-to-Video Generation | $0.5479166666666667 | 40.69088s | [llms.txt](https://www.segmind.com/models/pixverse-image2video/llms.txt) |
| Pixverse Transition | pixverse-transition | PixVerse V4 transforms static images and text into dynamic, visually stunning videos for creators across various industr | Image-to-Video Generation | $0.8727272727272727 | 57.65282s | [llms.txt](https://www.segmind.com/models/pixverse-transition/llms.txt) |
| Pixverse V6 | pixverse-v6 | 15-second AI videos with native audio and cinematic controls. | Image-to-Video Generation | $0.42328571428571427 | 63.92565s | [llms.txt](https://www.segmind.com/models/pixverse-v6/llms.txt) |
| Runway Gen 4 Turbo | runway-gen4-turbo | Generate videos faster and cheaper with Runway Gen-4 Turbo! Create high-quality text, image, and combined video generati | Image-to-Video Generation | $0.7741486399657315 | 38.03304s | [llms.txt](https://www.segmind.com/models/runway-gen4-turbo/llms.txt) |
| Runway Gen Alpha Turbo Image to Video | runway-gen3-alphaturbo | Runway Gen-3 AlphaTurbo is a cutting-edge AI tool that transforms static images into dynamic videos with exceptional fid | Image-to-Video Generation | $0.5561580559321183 | 27.38178s | [llms.txt](https://www.segmind.com/models/runway-gen3-alphaturbo/llms.txt) |
| SadTalker | sadtalker | Audio-based Lip Synchronization for Talking Head Video | Image-to-Video Generation | $0.17811345022209485 | 108.78657s | [llms.txt](https://www.segmind.com/models/sadtalker/llms.txt) |
| Seedance 1.0 lite i2v | seedance-v1-lite-image-to-video | Seedance 1.0 transforms text and images into engaging 720p dynamic videos with cinematic storytelling. | Image-to-Video Generation | $0.10684891837037773 | 39.49462s | [llms.txt](https://www.segmind.com/models/seedance-v1-lite-image-to-video/llms.txt) |
| Seedance 1.0 Pro | seedance-pro | Seedance Pro transforms text and images into engaging 720p dynamic videos with cinematic storytelling. | Image-to-Video Generation | $0.3520381860174778 | 62.21196s | [llms.txt](https://www.segmind.com/models/seedance-pro/llms.txt) |
| Seedance 1.0 Pro Fast | seedance-1.0-pro-fast | Cinematic videos from text and images at ultra speed. | Image-to-Video Generation | $0.2462202225806452 | 48.66931s | [llms.txt](https://www.segmind.com/models/seedance-1.0-pro-fast/llms.txt) |
| Seedance 1.5 Pro | seedance-1.5-pro | Synchronized video and audio generation for dynamic storytelling. | Image-to-Video Generation | $0.4061963611901681 | 99.53297s | [llms.txt](https://www.segmind.com/models/seedance-1.5-pro/llms.txt) |
| Seedance 2.0 | seedance-2.0 | Cinematic AI videos with native audio and multi-shot narratives. | Image-to-Video Generation | $1.2124749569620252 | 189.30402s | [llms.txt](https://www.segmind.com/models/seedance-2.0/llms.txt) |
| Seedance 2.0 Fast | seedance-2.0-fast | Professional-grade video creation model with native audio, similar to SeeDance 2.0 but faster and cheaper. | Image-to-Video Generation | $0.7688965999999999 | 116.85878s | [llms.txt](https://www.segmind.com/models/seedance-2.0-fast/llms.txt) |
| Sora 2 | sora-2 | Stunning dynamic videos from detailed text descriptions. | Image-to-Video Generation | $1.015919811320755 | 178.74569s | [llms.txt](https://www.segmind.com/models/sora-2/llms.txt) |
| Sora 2 Pro | sora-2-pro | Cinematic-quality videos from text with temporal consistency. | Image-to-Video Generation | $3.135135135135134 | 400.42705s | [llms.txt](https://www.segmind.com/models/sora-2-pro/llms.txt) |
| Stable Video Diffusion | svd | Takes image as input and returns a video. | Image-to-Video Generation | $0.165895590295991 | 29.6271s | [llms.txt](https://www.segmind.com/models/svd/llms.txt) |
| Tooncrafter | tooncrafter | Create videos from illustrated input images | Image-to-Video Generation | $0.12288504435102474 | 108.41335s | [llms.txt](https://www.segmind.com/models/tooncrafter/llms.txt) |
| V Express | v-express | V-Express lets you create portrait videos from single images. | Image-to-Video Generation | $0.26351073672376873 | 196.28656s | [llms.txt](https://www.segmind.com/models/v-express/llms.txt) |
| Veo 3.1 | veo-3.1 | Static images into high-quality videos with synchronized audio. | Image-to-Video Generation | $2.161127895266869 | 110.49847s | [llms.txt](https://www.segmind.com/models/veo-3.1/llms.txt) |
| Veo 3.1 Fast | veo-3.1-fast | Fast image-to-video at 1080p with native audio. | Image-to-Video Generation | $0.8591154018859455 | 99.31184s | [llms.txt](https://www.segmind.com/models/veo-3.1-fast/llms.txt) |
| Veo 3.1 Lite | veo-3.1-lite | Affordable text-to-video with audio, powered by Google. | Image-to-Video Generation | $0.8009259259259259 | 49.51556s | [llms.txt](https://www.segmind.com/models/veo-3.1-lite/llms.txt) |
| Video Faceswap | videofaceswap | Video Faceswap is  a powerful tool for creators, filmmakers, and meme enthusiasts. With this innovative technology, you  | Image-to-Video Generation | $0.4118847845862569 | 184.88331s | [llms.txt](https://www.segmind.com/models/videofaceswap/llms.txt) |
| Video Frame Interpolation | video-frame-interpolation | FILM synthesizes smooth, high-quality intermediate frames for fluid motion in videos with significant movement. | Image-to-Video Generation | $5 | - | [llms.txt](https://www.segmind.com/models/video-frame-interpolation/llms.txt) |
| Video Stitch | video-stitch | Revolutionize your video editing with the Video Stitch Model. Seamlessly stitch clips, add captivating audio, and create | Image-to-Video Generation | $0.002885776086843744 | 30.38754s | [llms.txt](https://www.segmind.com/models/video-stitch/llms.txt) |
| Video Tryon | video-tryon | Video Tryon is Segmind’s next-generation AI video model for instant virtual try-on, allowing users to visualize any outf | Image-to-Video Generation | $1.6059995424855493 | 205.12761s | [llms.txt](https://www.segmind.com/models/video-tryon/llms.txt) |
| Video Watermark Remover | video-watermark-remover | Remove watermarks from any video instantly with AI. | Image-to-Video Generation | $0.8275678268292683 | 194.82s | [llms.txt](https://www.segmind.com/models/video-watermark-remover/llms.txt) |
| Vidu Q1 Reference to Video | vidu-q1-reference-to-video | Vidu AI reference to video transforms text and images into dynamic, high-quality videos effortlessly. | Image-to-Video Generation | $0.5 | 120.68598s | [llms.txt](https://www.segmind.com/models/vidu-q1-reference-to-video/llms.txt) |
| Vidu Template | vidu-template | Transform static images into captivating videos using diverse motion templates effortlessly. | Image-to-Video Generation | $0.0625 | 125.76909s | [llms.txt](https://www.segmind.com/models/vidu-template/llms.txt) |
| Wan 2.1 480p image to video | wan2.1-i2v-480p | Create high-quality 480p videos with excellent visual quality and a broad spectrum of motion from static images. | Image-to-Video Generation | $0.5302190763528138 | 53.38121s | [llms.txt](https://www.segmind.com/models/wan2.1-i2v-480p/llms.txt) |
| Wan 2.1 720p image to video | wan2.1-i2v-720p | Create high-quality 720p videos with excellent visual quality and a broad spectrum of motion from static images. | Image-to-Video Generation | $1.4477480769691784 | 148.71608s | [llms.txt](https://www.segmind.com/models/wan2.1-i2v-720p/llms.txt) |
| Wan 2.2 Image to Video Fast | wan-2.2-i2v-fast | Transforms simple text prompts into breathtaking cinematic-quality videos in minutes. | Image-to-Video Generation | $0.08774838350014805 | 52.92964s | [llms.txt](https://www.segmind.com/models/wan-2.2-i2v-fast/llms.txt) |
| Wan 2.2 Image to Video Flash | wan-2.2-i2v-flash | Convert a single image into a coherent dynamic video. | Image-to-Video Generation | $0.1722461538461539 | 66.87629s | [llms.txt](https://www.segmind.com/models/wan-2.2-i2v-flash/llms.txt) |
| Wan 2.5 Image to Video | wan-2.5-i2v | Wan2.5-Preview creates stunning, high-resolution videos with flawless audio synchronization from multiple inputs. | Image-to-Video Generation | $0.8588397896076353 | 177.58862s | [llms.txt](https://www.segmind.com/models/wan-2.5-i2v/llms.txt) |
| Wan 2.6 Image To Video | wan-2.6-i2v | Transform images into high-quality videos with audio sync. | Image-to-Video Generation | $1.1835867724233984 | 137.1875s | [llms.txt](https://www.segmind.com/models/wan-2.6-i2v/llms.txt) |
| Wan 2.6 Text To Video | wan-2.6-t2v | Cinematic videos with synchronized audio from text prompts. | Image-to-Video Generation | $1.0112359550561798 | 180.35542s | [llms.txt](https://www.segmind.com/models/wan-2.6-t2v/llms.txt) |
| Wan 2.7 Image to Video | wan2.7-i2v | Animate any image into cinematic 1080P video with audio. | Image-to-Video Generation | $0.703125 | 483.81044s | [llms.txt](https://www.segmind.com/models/wan2.7-i2v/llms.txt) |
| Wan 2.7 Reference to Video | wan2.7-r2v | Character-consistent multi-subject videos from reference images. | Image-to-Video Generation | $0.703125 | 490.86746s | [llms.txt](https://www.segmind.com/models/wan2.7-r2v/llms.txt) |
| Wan Animate | wan-animate | Animate characters and replace video subjects seamlessly. | Image-to-Video Generation | $1.5690977389090908 | 412.48196s | [llms.txt](https://www.segmind.com/models/wan-animate/llms.txt) |
| Wan Scail | scail | Professional character animations from reference images. | Image-to-Video Generation | $1.9292771716129034 | 429.39176s | [llms.txt](https://www.segmind.com/models/scail/llms.txt) |
| Wan Video Effects | video-effects | Transform your videos with diverse video effects. Start creating captivating videos today. | Image-to-Video Generation | $0.5248923280952382 | 126.12837s | [llms.txt](https://www.segmind.com/models/video-effects/llms.txt) |
| Warmth of Jesus | warmth-of-jesus | Experience the viral "Warmth of Jesus" effect on PixVerse! Transform your images into heartwarming videos of Jesus embra | Image-to-Video Generation | $0.3907894736842105 | 50.30539s | [llms.txt](https://www.segmind.com/models/warmth-of-jesus/llms.txt) |

## videoToVideo

| Model | Slug | Description | Modality | Avg Cost | Avg Latency | API Docs |
| --- | --- | --- | --- | --- | --- | --- |
| Bria Increase Video Resolution | bria-increase-video-resolution | Transform your videos with AI-powered upscaling and seamless background removal for professional quality. | videoToVideo | $1.0636303030303031 | 192.3339s | [llms.txt](https://www.segmind.com/models/bria-increase-video-resolution/llms.txt) |
| Bria Remove Video Background | bria-remove-video-background | Bria Video AI enhances videos up to 8K while seamlessly removing backgrounds for professional quality content. | videoToVideo | $2.150271232876712 | 41.1538s | [llms.txt](https://www.segmind.com/models/bria-remove-video-background/llms.txt) |
| Bria Video Eraser | bria-erase-video | Remove unwanted objects from videos while preserving audio. | videoToVideo | $0.3192 | 120.7891s | [llms.txt](https://www.segmind.com/models/bria-erase-video/llms.txt) |
| Esrgan Video Upscaler | esrgan-video-upscaler | ESRGAN Video Upscaler: Experience sharper, clearer 4k videos with ESRGAN. This AI-powered video upscaler boosts resoluti | videoToVideo | $0.3217835050269301 | 157.28935s | [llms.txt](https://www.segmind.com/models/esrgan-video-upscaler/llms.txt) |
| FlashVSR | flashvsr | Real-time video quality enhancement for high-resolution content. | videoToVideo | $1.1762287625 | 163.0629s | [llms.txt](https://www.segmind.com/models/flashvsr/llms.txt) |
| Heygen Video Translate | heygen-video-translate | Translate videos to multiple languages with natural lip-sync. | videoToVideo | $0.48275438235294116 | 169.60772s | [llms.txt](https://www.segmind.com/models/heygen-video-translate/llms.txt) |
| Kling 2.6 Pro Motion Control | kling-2.6-pro-motion-control | Transfer motion from videos to animate custom characters. | videoToVideo | $1.7247032258064519 | 595.48319s | [llms.txt](https://www.segmind.com/models/kling-2.6-pro-motion-control/llms.txt) |
| Kling 2.6 Standard Motion Control | kling-2.6-standard-motion-control | Precise motion transfer from reference videos to characters. | videoToVideo | $0.9415 | 575.55155s | [llms.txt](https://www.segmind.com/models/kling-2.6-standard-motion-control/llms.txt) |
| Kling O1 Video 2 Video Edit | kling-o1-video-to-video-edit | Edit any video with precise natural language commands. | videoToVideo | $1.2769852941176472 | 250.29267s | [llms.txt](https://www.segmind.com/models/kling-o1-video-to-video-edit/llms.txt) |
| Kling O1 Video 2 Video Reference | kling-o1-video-to-video-reference | Video style transfer using reference character images. | videoToVideo | $1.423617391304348 | 239.7181s | [llms.txt](https://www.segmind.com/models/kling-o1-video-to-video-reference/llms.txt) |
| Kling O3 Video To Video Edit | kling-o3-video2video-edit | Text-based video editor — swap backgrounds, characters, restyle scenes. | videoToVideo | $2.3126249999999997 | 202.43342s | [llms.txt](https://www.segmind.com/models/kling-o3-video2video-edit/llms.txt) |
| Kling O3 Video To Video Reference | kling-o3-video2video-reference | Swap characters and restyle videos using reference images. | videoToVideo | $2.032916666666667 | 224.65349s | [llms.txt](https://www.segmind.com/models/kling-o3-video2video-reference/llms.txt) |
| LTX Retake Video | ltx-retake-video | Precise segment-level video edits maintaining full scene continuity. | videoToVideo | $0.7911607142857141 | 38.5012s | [llms.txt](https://www.segmind.com/models/ltx-retake-video/llms.txt) |
| Multi Video Merge | multi-video-merge | Merge multiple videos into a single combined output. | videoToVideo | $0.03265760629629629 | 98.33976s | [llms.txt](https://www.segmind.com/models/multi-video-merge/llms.txt) |
| Pixverse Lipsync | pixverse-lipsync | PixVerse Lipsync expertly synchronizes lip movements to audio for flawless video content creation. | videoToVideo | $0.31236277056277056 | 112.43345s | [llms.txt](https://www.segmind.com/models/pixverse-lipsync/llms.txt) |
| Runway Gen4 Aleph | runway-gen4-aleph | Runway Aleph revolutionizes video editing with intelligent automation for seamless object and environment manipulation. | videoToVideo | $1.125 | 171.53096s | [llms.txt](https://www.segmind.com/models/runway-gen4-aleph/llms.txt) |
| Sam V2 Video | sam-v2-video | SAM v2 Video by Meta AI, allows promptable segmentation of objects in videos.  | videoToVideo | $0.05689024780269058 | 37.56451s | [llms.txt](https://www.segmind.com/models/sam-v2-video/llms.txt) |
| Sam3 Video | sam3-video | Real-time video segmentation and multi-object tracking. | videoToVideo | $0.13449561599999998 | 117.44927s | [llms.txt](https://www.segmind.com/models/sam3-video/llms.txt) |
| Sync.so Lipsync 2 Pro | sync.so-lipsync-2-pro | Lipsync-2-Pro seamlessly synchronizes lips in videos for instant, high-quality multilingual content creation. | videoToVideo | $1.0556461855670103 | 234.90347s | [llms.txt](https://www.segmind.com/models/sync.so-lipsync-2-pro/llms.txt) |
| Sync.so React 1 | sync.so-react-1 | Edit video actors' emotions with realistic re-expression. | videoToVideo | $1.9782141935483872 | 347.64718s | [llms.txt](https://www.segmind.com/models/sync.so-react-1/llms.txt) |
| Topaz Labs Video Upscale | topaz-video-upscale | Topaz Video AI upscales, enhances, denoises, stabilizes, and increases frame rates in video footage, transforming low-qu | videoToVideo | $1.6011588831683166 | 212.58892s | [llms.txt](https://www.segmind.com/models/topaz-video-upscale/llms.txt) |
| Video Audio Merge | video-audio-merge | Effortlessly merge audio and video with our intuitive Video Audio Merge model. Create stunning multimedia content with p | videoToVideo | $0.0017616686451116241 | 20.1118s | [llms.txt](https://www.segmind.com/models/video-audio-merge/llms.txt) |
| Video Captioner | video-captioner | With Video Captioner create accurate, customizable subtitles for your videos effortlessly. | videoToVideo | $0.03830279902865469 | 68.54137s | [llms.txt](https://www.segmind.com/models/video-captioner/llms.txt) |
| Video Concatenate | video-concatenate | Merge videos with custom layouts, spacing, and audio. | videoToVideo | $0.0007790461538461538 | 31.6725s | [llms.txt](https://www.segmind.com/models/video-concatenate/llms.txt) |
| Video Loop | video-loop | Effortlessly loop videos for engaging social media & storytelling with our Video Loop. | videoToVideo | $0.0009503033488372096 | 8.14397s | [llms.txt](https://www.segmind.com/models/video-loop/llms.txt) |
| Wan 2.7 Video Editing | wan2.7-videoedit | Edit existing videos precisely using natural language text instructions. | videoToVideo | $0.6433823529411765 | 362.14922s | [llms.txt](https://www.segmind.com/models/wan2.7-videoedit/llms.txt) |

## Text-to-Video Generation

| Model | Slug | Description | Modality | Avg Cost | Avg Latency | API Docs |
| --- | --- | --- | --- | --- | --- | --- |
| Cog Video X 5B | cog-video-5b-t2v | CogVideo is a groundbreaking AI model that turns text into high-quality videos. Create realistic scenes, animations, and | Text-to-Video Generation | $0.3555572362200434 | 229.23271s | [llms.txt](https://www.segmind.com/models/cog-video-5b-t2v/llms.txt) |
| Google Veo 2 | veo-2 | Create stunning, realistic videos with Veo 2, Google's state-of-the-art AI video generation model. Experience enhanced q | Text-to-Video Generation | $4.2576142889376225 | 39.87097s | [llms.txt](https://www.segmind.com/models/veo-2/llms.txt) |
| Google Veo 3 | veo-3 | Veo 3 revolutionizes video creation with advanced text-to-video generation and realistic audio synthesis for cinematic c | Text-to-Video Generation | $5.188406933524204 | 144.36327s | [llms.txt](https://www.segmind.com/models/veo-3/llms.txt) |
| Hunyuan Video | hunyuan-video | Hunyuan AI Video is a new, state of the art, AI Video Generator that creates high-quality videos from text descriptions. | Text-to-Video Generation | $1.4642586306485041 | 211.32017s | [llms.txt](https://www.segmind.com/models/hunyuan-video/llms.txt) |
| Kling 3.0 Pro Text-to-Video | kling-3-pro-text2video | Cinematic 1080p videos with realistic audio from text. | Text-to-Video Generation | $3.3089230769230773 | 297.22698s | [llms.txt](https://www.segmind.com/models/kling-3-pro-text2video/llms.txt) |
| Kling 3.0 Standard Text-to-Video | kling-3-standard-text2video | Stunning 1080p cinematic videos from simple text prompts. | Text-to-Video Generation | $1.7481509433962263 | 145.32821s | [llms.txt](https://www.segmind.com/models/kling-3-standard-text2video/llms.txt) |
| Kling AI 1.6 Text to Video | kling-1.6-text2video | Kling AI 1.6 Text-to-Video is a cutting-edge AI tool that transforms text into stunning, lifelike videos. Create profess | Text-to-Video Generation | $0.6886923076923093 | 304.07888s | [llms.txt](https://www.segmind.com/models/kling-1.6-text2video/llms.txt) |
| Kling AI Text to Video | kling-text2video | Kling AI Text-to-Video is a cutting-edge AI tool that transforms text into stunning, lifelike videos. Create professiona | Text-to-Video Generation | $0.43168958742632646 | 320.23594s | [llms.txt](https://www.segmind.com/models/kling-text2video/llms.txt) |
| Kling O3 Text-to-Video | kling-o3-text2video | 15-second cinematic AI videos with native audio. | Text-to-Video Generation | $2.342307692307692 | 153.52772s | [llms.txt](https://www.segmind.com/models/kling-o3-text2video/llms.txt) |
| LTX-2-19B I2V | ltx-2-19b-i2v | Synchronized 4K audio-video generation from images, fast. | Text-to-Video Generation | $0.4551744523809522 | 113.17079s | [llms.txt](https://www.segmind.com/models/ltx-2-19b-i2v/llms.txt) |
| LTX-2-19B T2V | ltx-2-19b-t2v | Synchronized video and audio from text, multiple input types. | Text-to-Video Generation | $0.40987834111111116 | 100.94594s | [llms.txt](https://www.segmind.com/models/ltx-2-19b-t2v/llms.txt) |
| Luma Ray Text to Video | luma-ray-txt-2-video | Luma Ray2 text-to-video creates realistic, coherent videos from your text prompts. | Text-to-Video Generation | $1.599999999999998 | 114.21931s | [llms.txt](https://www.segmind.com/models/luma-ray-txt-2-video/llms.txt) |
| Luma Text-to-Video  | luma-txt-2-video | Luma Video (Text to Video) is an advanced AI model that turns text prompts into captivating videos. Designed for creator | Text-to-Video Generation | $0.9470967741935508 | 62.45225s | [llms.txt](https://www.segmind.com/models/luma-txt-2-video/llms.txt) |
| Minimax AI Director | minimax-ai-director | Minimax video-01-director: Create high-quality videos with control camera movements precisely using text prompts. | Text-to-Video Generation | $0.625 | 154.48488s | [llms.txt](https://www.segmind.com/models/minimax-ai-director/llms.txt) |
| Mochi 1 | mochi-1 | Mochi 1 is a cutting-edge, open-source AI model that transforms text prompts into stunning, high-fidelity videos. Create | Text-to-Video Generation | $0.26644555832420574 | 180.10768s | [llms.txt](https://www.segmind.com/models/mochi-1/llms.txt) |
| Pixverse Text to Video | pixverse-text2video | Effortlessly create captivating videos from text with Pixverse text to video AI! Customize style, duration, and more. | Text-to-Video Generation | $0.4281746031746032 | 44.22598s | [llms.txt](https://www.segmind.com/models/pixverse-text2video/llms.txt) |
| Seedance 1.0 lite t2v | seedance-v1-lite-text-to-video | Seedance V1 Lite transforms text into high-quality videos, streamlining content creation for diverse applications. | Text-to-Video Generation | $0.19834662576687118 | 46.71624s | [llms.txt](https://www.segmind.com/models/seedance-v1-lite-text-to-video/llms.txt) |
| Veo 3 Fast | veo-3-fast | Veo 3 Fast rapidly creates high-quality, 8-second videos with synchronized audio for diverse content needs. | Text-to-Video Generation | $1.6233309404163676 | 80.4446s | [llms.txt](https://www.segmind.com/models/veo-3-fast/llms.txt) |
| Wan 2.2 Text to Video Fast | wan-2.2-t2v-fast | Wan2.2 transforms text and images into high-quality video clips with cinematic flair. | Text-to-Video Generation | $0.09856787724935734 | 96.20325s | [llms.txt](https://www.segmind.com/models/wan-2.2-t2v-fast/llms.txt) |
| Wan 2.5 Text to Video | wan-2.5-t2v | Wan2.5-Preview generates synchronized multimedia content, merging text, image, video, and audio seamlessly. | Text-to-Video Generation | $0.801146643572621 | 213.47998s | [llms.txt](https://www.segmind.com/models/wan-2.5-t2v/llms.txt) |
| Wan 2.7 Text to Video | wan2.7-t2v | 1080P cinematic videos with audio sync and multi-shot control. | Text-to-Video Generation | $0.7589285714285714 | 302.8468s | [llms.txt](https://www.segmind.com/models/wan2.7-t2v/llms.txt) |
| Wan_2.1 Text to Video | wan2.1-t2v | Create visually impressive and feature varied, lifelike motion videos with Wan2.1 using text prompts. | Text-to-Video Generation | $0.8552705145173745 | 104.81362s | [llms.txt](https://www.segmind.com/models/wan2.1-t2v/llms.txt) |

## Image-to-Image Transformation

| Model | Slug | Description | Modality | Avg Cost | Avg Latency | API Docs |
| --- | --- | --- | --- | --- | --- | --- |
| AI Product Photo Editor | ai-product-photo-editor | AI Product Photo Editor leverages advanced image-based ML techniques to generate high-quality product visuals using text | Image-to-Image Transformation | $0.022241418080724867 | 15.50389s | [llms.txt](https://www.segmind.com/models/ai-product-photo-editor/llms.txt) |
| AI Product Photography  | ai-product-photography | Elevate your product imagery with our AI-powered photography model. Create stunning, professional-quality photos that bo | Image-to-Image Transformation | $0.06473440478384125 | 11.65235s | [llms.txt](https://www.segmind.com/models/ai-product-photography/llms.txt) |
| Aura Flow | aura-flow | Largest completely open sourced flow-based generation model that is capable of text-to-image generation | Image-to-Image Transformation | $0.11701278422174842 | 79.15338s | [llms.txt](https://www.segmind.com/models/aura-flow/llms.txt) |
| Automatic Mask Generator | automatic-mask-generator | Automatic Mask Generator is a powerful tool that automates the creation of precise masks for inpainting | Image-to-Image Transformation | $0.0015574721135883614 | 1.70042s | [llms.txt](https://www.segmind.com/models/automatic-mask-generator/llms.txt) |
| Background Removal | bg-removal | This model removes the background image from any image | Image-to-Image Transformation | $0.002097570303297566 | 1.65848s | [llms.txt](https://www.segmind.com/models/bg-removal/llms.txt) |
| Background Removal V2 | bg-removal-v2 | This model removes the background image from any image | Image-to-Image Transformation | $0.0008893992875745101 | 0.69606s | [llms.txt](https://www.segmind.com/models/bg-removal-v2/llms.txt) |
| Bria Blur Background | bria-blur-background | Bria AI Image Editing API v2 enables precise and context-aware image manipulation for stunning visual outcomes. | Image-to-Image Transformation | $0.053043478260869574 | 16.51626s | [llms.txt](https://www.segmind.com/models/bria-blur-background/llms.txt) |
| Bria Enhance Image | bria-enhance-image | Bria AI creates precise, high-quality image enhancements and manipulations for diverse creative applications. | Image-to-Image Transformation | $0.0404426559356137 | 24.62666s | [llms.txt](https://www.segmind.com/models/bria-enhance-image/llms.txt) |
| Bria Erase Foreground | bria-erase-foreground | Seamlessly removes foreground subjects and regenerates backgrounds for flawless image editing. | Image-to-Image Transformation | $0.042727272727272725 | 12.00002s | [llms.txt](https://www.segmind.com/models/bria-erase-foreground/llms.txt) |
| Bria Eraser | bria-eraser | AI object removal with seamless context-aware inpainting. | Image-to-Image Transformation | $0.04176470588235294 | 14.03854s | [llms.txt](https://www.segmind.com/models/bria-eraser/llms.txt) |
| Bria Expand Image | bria-expand-image | Bria Expand enables precise image manipulation and enhancement with generative AI, trained exclusively on licensed data  | Image-to-Image Transformation | $0.039004149377593327 | 13.6012s | [llms.txt](https://www.segmind.com/models/bria-expand-image/llms.txt) |
| Bria Generate Background | bria-replace-background | Transform images through advanced background editing and generative content creation for diverse applications. | Image-to-Image Transformation | $0.04142857142857145 | 18.72365s | [llms.txt](https://www.segmind.com/models/bria-replace-background/llms.txt) |
| Bria Generative Fill | bria-gen-fill | Bria AI enables precise generative image editing for seamless creative enhancements and transformations. | Image-to-Image Transformation | $0.03787878787878789 | 18.44121s | [llms.txt](https://www.segmind.com/models/bria-gen-fill/llms.txt) |
| Bria Increase Resolution | bria-increase-resolution | Seamlessly upscale and manipulate images while preserving the highest fidelity and safety standards. | Image-to-Image Transformation | $0.036757762991128026 | 12.51979s | [llms.txt](https://www.segmind.com/models/bria-increase-resolution/llms.txt) |
| Bria Lifestyle Product Shot by Text | bria-lifestyle-shot-by-text | Transform isolated product images into dynamic lifestyle scenes with AI-driven contextual realism. | Image-to-Image Transformation | $0.03861635220125788 | 25.26646s | [llms.txt](https://www.segmind.com/models/bria-lifestyle-shot-by-text/llms.txt) |
| Bria Product Cutout | bria-product-cutout | Automates precise product cutouts and background removal for professional eCommerce imagery at scale. | Image-to-Image Transformation | $0.04 | 10.92597s | [llms.txt](https://www.segmind.com/models/bria-product-cutout/llms.txt) |
| Bria Product Packshot | bria-product-packshot | Transform product photos into professional, market-ready images with intelligent enhancements and background removal. | Image-to-Image Transformation | $0.0409375 | 16.5566s | [llms.txt](https://www.segmind.com/models/bria-product-packshot/llms.txt) |
| Bria Product Shadow | bria-product-shadow | Bria Product Shadow enhances product images with realistic shadows for professional eCommerce presentations. | Image-to-Image Transformation | $0.03864197530864198 | 8.40804s | [llms.txt](https://www.segmind.com/models/bria-product-shadow/llms.txt) |
| Bria Reimagine | bria-reimagine | Bria AI Reimagine transforms reference images into detailed, styled visuals with creative flexibility. | Image-to-Image Transformation | $0.04113924050632911 | 13.27925s | [llms.txt](https://www.segmind.com/models/bria-reimagine/llms.txt) |
| Bria RMBG 2.0 | bria-remove-background | Effortlessly extract backgrounds with unmatched precision, powered by models trained exclusively on licensed data for sa | Image-to-Image Transformation | $0.017932757782101176 | 10.9207s | [llms.txt](https://www.segmind.com/models/bria-remove-background/llms.txt) |
| Caricature Style | caricature-style | Transform everyday photos into lively, whimsical caricature illustrations that highlight individual features with playfu | Image-to-Image Transformation | $0.09814871371458372 | 53.12968s | [llms.txt](https://www.segmind.com/models/caricature-style/llms.txt) |
| Clarity Upscaler | clarity-upscaler | High resolution creative image Upscaler and Enhancer. A free Magnific alternative.
  | Image-to-Image Transformation | $0.019290546262842614 | 18.27698s | [llms.txt](https://www.segmind.com/models/clarity-upscaler/llms.txt) |
| ClarityAI Creative Upscaler | clarityai-creative-upscaler | Creative image upscaling with fine detail enhancement. | Image-to-Image Transformation | $0.40403422982885084 | 131.0447s | [llms.txt](https://www.segmind.com/models/clarityai-creative-upscaler/llms.txt) |
| ClarityAI Crystal Upscaler | clarityai-crystal-upscaler | Upscale images up to 200x with enhanced detail and vibrancy. | Image-to-Image Transformation | $0.5203234265734266 | 34.75041s | [llms.txt](https://www.segmind.com/models/clarityai-crystal-upscaler/llms.txt) |
| ClarityAI Flux Upscaler | clarityai-flux-upscaler | Transform low-resolution images into stunning high-quality visuals. | Image-to-Image Transformation | $1.6839788732394365 | 323.32283s | [llms.txt](https://www.segmind.com/models/clarityai-flux-upscaler/llms.txt) |
| Codeformer | codeformer | CodeFormer is a robust face restoration algorithm for old photos or AI-generated faces. | Image-to-Image Transformation | $0.01237333921297771 | 5.92044s | [llms.txt](https://www.segmind.com/models/codeformer/llms.txt) |
| Consistent Character | consistent-character | Create images of a given character in different poses
  | Image-to-Image Transformation | $0.0845939208834292 | 61.36668s | [llms.txt](https://www.segmind.com/models/consistent-character/llms.txt) |
| Consistent Character With Pose | consistent-character-with-pose | Create images of a given character in different poses | Image-to-Image Transformation | $0.02943020679334449 | 30.9992s | [llms.txt](https://www.segmind.com/models/consistent-character-with-pose/llms.txt) |
| ControlNet Canny | sd1.5-controlnet-canny | This model corresponds to the ControlNet conditioned on Canny edges. | Image-to-Image Transformation | $0.003700848409745236 | 4.66892s | [llms.txt](https://www.segmind.com/models/sd1.5-controlnet-canny/llms.txt) |
| ControlNet Depth | sd1.5-controlnet-depth | This model corresponds to the ControlNet conditioned on Depth estimation. | Image-to-Image Transformation | $0.009929038383475731 | 12.63628s | [llms.txt](https://www.segmind.com/models/sd1.5-controlnet-depth/llms.txt) |
| ControlNet Openpose | sd1.5-controlnet-openpose | This model corresponds to the ControlNet conditioned on Human Pose Estimation. | Image-to-Image Transformation | $0.00611304704697142 | 10.00272s | [llms.txt](https://www.segmind.com/models/sd1.5-controlnet-openpose/llms.txt) |
| ControlNet Scribble | sd1.5-controlnet-scribble | This model corresponds to the ControlNet conditioned on Scribble images. | Image-to-Image Transformation | $0.0029691070185857158 | 3.80087s | [llms.txt](https://www.segmind.com/models/sd1.5-controlnet-scribble/llms.txt) |
| ControlNet Soft Edge | sd1.5-controlnet-softedge | This model corresponds to the ControlNet conditioned on Soft Edge. | Image-to-Image Transformation | $0.002842180637190888 | 3.42821s | [llms.txt](https://www.segmind.com/models/sd1.5-controlnet-softedge/llms.txt) |
| ESRGAN | esrgan | ERGAN is an Image Super-Resolution (upscaler) model that enhances images with stunning, high-quality upscaling while pre | Image-to-Image Transformation | $0.0046643526179601736 | 4.92678s | [llms.txt](https://www.segmind.com/models/esrgan/llms.txt) |
| Expression Editor | expression-editor | Expression Editor uses reference images to accurately generate new images with desired expressions. Perfect for digital  | Image-to-Image Transformation | $0.0015538020119496036 | 2.31514s | [llms.txt](https://www.segmind.com/models/expression-editor/llms.txt) |
| Face Detailer | face-detailer | Restore characters' faces to their original glory with Face Detailer. Enhance facial details, eliminate distortion, and  | Image-to-Image Transformation | $0.01488621607186064 | 16.32891s | [llms.txt](https://www.segmind.com/models/face-detailer/llms.txt) |
| face-to-many | face-to-many | Turn a face into 3D, emoji, pixel art, video game, claymation or toy | Image-to-Image Transformation | $0.024571864690289144 | 22.46034s | [llms.txt](https://www.segmind.com/models/face-to-many/llms.txt) |
| face-to-sticker | face-to-sticker | Turn a face into a sticker | Image-to-Image Transformation | $0.0865603507661326 | 65.40722s | [llms.txt](https://www.segmind.com/models/face-to-sticker/llms.txt) |
| Faceswap | sd2.1-faceswapper | Take a picture/gif and replace the face in it with a face of your choice. You only need one image of the desired face. N | Image-to-Image Transformation | $0.02248596119238562 | 36.76486s | [llms.txt](https://www.segmind.com/models/sd2.1-faceswapper/llms.txt) |
| Faceswap V2 | faceswap-v2 | Take a picture/gif and replace the face in it with a face of your choice. You only need one image of the desired face. N | Image-to-Image Transformation | $0.0042687928216279085 | 3.19248s | [llms.txt](https://www.segmind.com/models/faceswap-v2/llms.txt) |
| Faceswap V3 | faceswap-v3 | Face Swap V3 is a cutting-edge tool that empowers you to seamlessly swap faces in images. With customizable features and | Image-to-Image Transformation | $0.005808887588289018 | 4.28326s | [llms.txt](https://www.segmind.com/models/faceswap-v3/llms.txt) |
| Faceswap V3 Multifaceswap | faceswap-v3-multifaceswap | Faceswap V3 Multifaceswap enables realistic face swapping in images, preserving lighting and expressions for professiona | Image-to-Image Transformation | $0.0076014094998761925 | 6.60218s | [llms.txt](https://www.segmind.com/models/faceswap-v3-multifaceswap/llms.txt) |
| Flux 2 Flex | flux-2-flex | Consistent-style photorealistic images using reference inputs. | Image-to-Image Transformation | $0.17055739514348783 | 47.96836s | [llms.txt](https://www.segmind.com/models/flux-2-flex/llms.txt) |
| Flux 2 Max | flux-2-max | Photorealistic images with maximum consistency and fine detail. | Image-to-Image Transformation | $0.21973192019950127 | 53.66553s | [llms.txt](https://www.segmind.com/models/flux-2-max/llms.txt) |
| Flux 2 Pro | flux-2-pro | High-quality photorealistic images with cross-output consistency. | Image-to-Image Transformation | $0.07247651457055213 | 26.09079s | [llms.txt](https://www.segmind.com/models/flux-2-pro/llms.txt) |
| Flux Canny Dev | flux-canny-dev | Open-weight edge-guided image generation. Control structure and composition using Canny edge detection. | Image-to-Image Transformation | $0.03125 | 20.67341s | [llms.txt](https://www.segmind.com/models/flux-canny-dev/llms.txt) |
| Flux Canny Pro | flux-canny-pro | Professional edge-guided image generation. Control structure and composition using Canny edge detection | Image-to-Image Transformation | $0.06248089457243265 | 24.32486s | [llms.txt](https://www.segmind.com/models/flux-canny-pro/llms.txt) |
| Flux Controlnets | flux-controlnet | Flux ControlNets is a collection of models that gives you precise control over image generation. By integrating ControlN | Image-to-Image Transformation | $0.0468823807162098 | 48.66877s | [llms.txt](https://www.segmind.com/models/flux-controlnet/llms.txt) |
| Flux Depth Dev | flux-depth-dev | Open-weight depth-aware image generation. Edit images while preserving spatial relationships. | Image-to-Image Transformation | $0.03125 | 16.03276s | [llms.txt](https://www.segmind.com/models/flux-depth-dev/llms.txt) |
| Flux Depth Pro | flux-depth-pro | Professional depth-aware image generation. Edit images while preserving spatial relationships. | Image-to-Image Transformation | $0.062476545879212544 | 25.41634s | [llms.txt](https://www.segmind.com/models/flux-depth-pro/llms.txt) |
| Flux Fill Dev | flux-fill-dev | Open-weight inpainting model for editing and extending images. Guidance-distilled from FLUX.1 Fill Dev | Image-to-Image Transformation | $0.04999999999999993 | 16.62558s | [llms.txt](https://www.segmind.com/models/flux-fill-dev/llms.txt) |
| Flux Fill Pro | flux-fill-pro | Professional inpainting and outpainting model with state-of-the-art performance. Edit or extend images with natural, sea | Image-to-Image Transformation | $0.06233509590461379 | 23.56603s | [llms.txt](https://www.segmind.com/models/flux-fill-pro/llms.txt) |
| Flux Inpaint | flux-inpaint | Flux Inpainting is a powerful image editing tool designed to effortlessly edit and enhance your images. It's perfect for | Image-to-Image Transformation | $0.02438738535413639 | 28.43018s | [llms.txt](https://www.segmind.com/models/flux-inpaint/llms.txt) |
| Flux Ipadapter | flux-ipadapter | Flux IP Adapter is a cutting-edge AI model that lets you to create stunning, customized images. With its advanced style  | Image-to-Image Transformation | $0.07432834362068963 | 76.08154s | [llms.txt](https://www.segmind.com/models/flux-ipadapter/llms.txt) |
| Flux Kontext Max | flux-kontext-max | FLUX.1 Kontext [max] transforms textual descriptions into stunning, high-fidelity images with seamless typography integr | Image-to-Image Transformation | $0.10000000000000042 | 24.36245s | [llms.txt](https://www.segmind.com/models/flux-kontext-max/llms.txt) |
| Flux Kontext Pro | flux-kontext-pro | FLUX.1 Kontext Pro transforms text prompts into high-quality, customized images with remarkable efficiency and precision | Image-to-Image Transformation | $0.04999999999999987 | 21.77541s | [llms.txt](https://www.segmind.com/models/flux-kontext-pro/llms.txt) |
| Flux Krea Dev | flux-krea-dev | FLUX.1 Krea generates stunning, photorealistic images with fine-tuned aesthetic control for diverse creative application | Image-to-Image Transformation | $0.031943892432432446 | 24.82919s | [llms.txt](https://www.segmind.com/models/flux-krea-dev/llms.txt) |
| Flux Pulid | flux-pulid | Flux PuLID: Customize AI-generated images with your unique identity. Seamlessly integrate faces into text-to-image model | Image-to-Image Transformation | $0.03616867535285016 | 13.083s | [llms.txt](https://www.segmind.com/models/flux-pulid/llms.txt) |
| Flux Redux Dev | flux-redux-dev | Open-weight image variation model. Create new versions while preserving key elements of your original. | Image-to-Image Transformation | $0.03125 | 15.84067s | [llms.txt](https://www.segmind.com/models/flux-redux-dev/llms.txt) |
| Flux Redux Schnell | flux-redux-schnell | Fast, efficient image variation model for rapid iteration and experimentation. | Image-to-Image Transformation | $0.00375 | 7.67963s | [llms.txt](https://www.segmind.com/models/flux-redux-schnell/llms.txt) |
| Flux-2 Klein-4b | flux-2-klein-4b | Sub-second photorealistic image generation and editing. | Image-to-Image Transformation | $0.031852410947562096 | 10.65235s | [llms.txt](https://www.segmind.com/models/flux-2-klein-4b/llms.txt) |
| Flux-2 Klein-9b | flux-2-klein-9b | Ultra-fast photorealistic image generation on consumer GPUs. | Image-to-Image Transformation | $0.04019102186915888 | 15.39635s | [llms.txt](https://www.segmind.com/models/flux-2-klein-9b/llms.txt) |
| Flux.1 Image To Image  | flux-img2img | Flux Image-To-Image model by Black Forest Labs is an advanced deep learning tool designed for transforming images based  | Image-to-Image Transformation | $0.026470672997882975 | 25.00478s | [llms.txt](https://www.segmind.com/models/flux-img2img/llms.txt) |
| FLUX.1 Kontext [dev] | flux-kontext-dev | FLUX.1 Kontext [dev] creates coherent and editable images by integrating text and visual cues for iterative design. | Image-to-Image Transformation | $0.04000159331124784 | 10.78987s | [llms.txt](https://www.segmind.com/models/flux-kontext-dev/llms.txt) |
| Font Sheet Generator | font-sheet-generator | Transforms images into unique, custom font sets in minutes, revolutionizing typography design. | Image-to-Image Transformation | $0.09303416666666668 | 32.89264s | [llms.txt](https://www.segmind.com/models/font-sheet-generator/llms.txt) |
| Fooocus | fooocus | Fooocus enables high-quality image generation effortlessly, combining the best of Stable Diffusion and Midjourney. | Image-to-Image Transformation | $0.061833064751358115 | 20.21113s | [llms.txt](https://www.segmind.com/models/fooocus/llms.txt) |
| Fooocus Outpainting | focus-outpaint | Fooocus Outpainting transforms ordinary images into extraordinary works of art by seamlessly expanding their boundaries. | Image-to-Image Transformation | $0.024537408137992028 | 15.64931s | [llms.txt](https://www.segmind.com/models/focus-outpaint/llms.txt) |
| GPT Image 1 Edit | gpt-image-1-edit | Edit and compose images using natural language with GPT Image 1 Edit, OpenAI’s powerful inpainting and multi-reference e | Image-to-Image Transformation | $0.14067419260677325 | 57.8164s | [llms.txt](https://www.segmind.com/models/gpt-image-1-edit/llms.txt) |
| GPT Image 1 Edit Mini | gpt-image-1-edit-mini | Affordable text-driven image generation and editing. | Image-to-Image Transformation | $0.027387329232333505 | 42.33248s | [llms.txt](https://www.segmind.com/models/gpt-image-1-edit-mini/llms.txt) |
| GPT Image 1.5 Edit | gpt-image-1.5-edit | Precise image editing via natural language instructions. | Image-to-Image Transformation | $0.20793580058224165 | 51.12257s | [llms.txt](https://www.segmind.com/models/gpt-image-1.5-edit/llms.txt) |
| HiDream-I1 (Fast) | hidream-l1-fast | HiDream-I1 is a next-generation, open-source image generative foundation model designed for text-to-image synthesis, esp | Image-to-Image Transformation | $0.01521270973837209 | 10.42825s | [llms.txt](https://www.segmind.com/models/hidream-l1-fast/llms.txt) |
| Higgsfield Text 2 Image Soul | higgsfield-text2image-soul | SOUL AI transforms text into stunning, customizable visuals with unparalleled style control and precision. | Image-to-Image Transformation | $0.2175372897492859 | 40.00115s | [llms.txt](https://www.segmind.com/models/higgsfield-text2image-soul/llms.txt) |
| HyperSwap Image Faceswap by FaceFusion Labs | hyperswap-image-faceswap-by-facefusion-labs | High-quality face swapping built for real production workflows. | Image-to-Image Transformation | $0.09999999999999999 | 9.1342s | [llms.txt](https://www.segmind.com/models/hyperswap-image-faceswap-by-facefusion-labs/llms.txt) |
| Ideogram 2a Image to Image | ideogram-2a-img-2-img | Ideogram Image to Image: Transform your images with ease! Enhance, modify, or create entirely new visuals using advanced | Image-to-Image Transformation | $0.05000000000000001 | 10.11784s | [llms.txt](https://www.segmind.com/models/ideogram-2a-img-2-img/llms.txt) |
| Ideogram 3 Reframe | ideogram-3-reframe | Ideogram 3.0's Reframe effortlessly adapts images to diverse formats, enhancing visual content creation for any platform | Image-to-Image Transformation | $0.04159392672731744 | 11.31905s | [llms.txt](https://www.segmind.com/models/ideogram-3-reframe/llms.txt) |
| Ideogram 3 Remix | ideogram-3-remix | Ideogram 3 Remix enables versatile image transformation, enhancing creativity through customizable design iterations. | Image-to-Image Transformation | $0.07462871287128711 | 10.54744s | [llms.txt](https://www.segmind.com/models/ideogram-3-remix/llms.txt) |
| Ideogram 3 Replace Background | ideogram-3-replace-background | Effortlessly replace backgrounds in images, enhancing visual storytelling and creativity with precision and speed. | Image-to-Image Transformation | $0.09174859550561797 | 14.27969s | [llms.txt](https://www.segmind.com/models/ideogram-3-replace-background/llms.txt) |
| Ideogram Character | ideogram-character | Achieve perfect character consistency across multiple generations from a single reference image. | Image-to-Image Transformation | $0.2326470318559557 | 19.98518s | [llms.txt](https://www.segmind.com/models/ideogram-character/llms.txt) |
| Ideogram Image To Image | ideogram-img-2-img | Ideogram Image to Image: Transform your images with ease! Enhance, modify, or create entirely new visuals using advanced | Image-to-Image Transformation | $0.10000000000000017 | 12.67675s | [llms.txt](https://www.segmind.com/models/ideogram-img-2-img/llms.txt) |
| Ideogram Reframe | ideogram-reframe | Transform your images with Ideogram Reframe! Easily reframe square images to your chosen resolution. | Image-to-Image Transformation | $0.09999999999999998 | 23.34343s | [llms.txt](https://www.segmind.com/models/ideogram-reframe/llms.txt) |
| Ideogram Turbo Image To Image | ideogram-turbo-img-2-img | Transform images instantly with Ideogram Turbo Image to Image! Fast AI for quick edits & creative remixes. | Image-to-Image Transformation | $0.06300000000000003 | 11.0242s | [llms.txt](https://www.segmind.com/models/ideogram-turbo-img-2-img/llms.txt) |
| IDM VTON | idm-vton | Best-in-class clothing virtual try on in the wild | Image-to-Image Transformation | $0.044285033715220946 | 10.72924s | [llms.txt](https://www.segmind.com/models/idm-vton/llms.txt) |
| illusion-diffusion-hq | illusion-diffusion-hq | Monster Labs QrCode ControlNet on top of SD Realistic Vision v5.1 | Image-to-Image Transformation | $0.04254304087134279 | 60.12161s | [llms.txt](https://www.segmind.com/models/illusion-diffusion-hq/llms.txt) |
| Image Superimpose | superimpose | Superimpose model lets you to create captivating visuals by seamlessly overlaying one image on top of another. It stream | Image-to-Image Transformation | $0.000516212389380531 | 0.64096s | [llms.txt](https://www.segmind.com/models/superimpose/llms.txt) |
| Image Superimpose V2 | superimpose-v2 | Superimpose V2 elevates image editing! Seamlessly layer images with background removal, precise positioning, and flexibl | Image-to-Image Transformation | $0.0018924620997221814 | 2.24128s | [llms.txt](https://www.segmind.com/models/superimpose-v2/llms.txt) |
| Infinite You | infinite-you | InfiniteYou generates high-fidelity portraits preserving identity while aligning with creative text prompts. | Image-to-Image Transformation | $0.18616816406779663 | 164.55799s | [llms.txt](https://www.segmind.com/models/infinite-you/llms.txt) |
| Inpaint Mask Maker | inpaint-mask-maker | Real-Time Open-Vocabulary Object Detection | Image-to-Image Transformation | $0.004548485514777525 | 8.12156s | [llms.txt](https://www.segmind.com/models/inpaint-mask-maker/llms.txt) |
| Insta Depth | insta-depth | InstantID aims to generate customized images with various poses or styles from only a single reference ID image while en | Image-to-Image Transformation | $0.052145482264241795 | 15.17449s | [llms.txt](https://www.segmind.com/models/insta-depth/llms.txt) |
| InstantID | instantid | InstantID aims to generate customized images with various poses or styles from only a single reference ID image while en | Image-to-Image Transformation | $0.02513327221923347 | 7.73523s | [llms.txt](https://www.segmind.com/models/instantid/llms.txt) |
| IP-adapter Canny XL | ip-sdxl-canny | IP Adpater XL Canny is built on the SDXL framework. This model integrates the IP Adapter and Canny edge preprocessor to  | Image-to-Image Transformation | $0.011348650120825329 | 14.22441s | [llms.txt](https://www.segmind.com/models/ip-sdxl-canny/llms.txt) |
| IP-adapter Depth XL | ip-sdxl-depth | IP Adapter Depth XL is built on the SDXL framework. This model integrates the IP Adapter and Depth preprocessor to offer | Image-to-Image Transformation | $0.015832852252854652 | 21.39794s | [llms.txt](https://www.segmind.com/models/ip-sdxl-depth/llms.txt) |
| IP-adapter Openpose XL | ip-sdxl-openpose | IP Adapter XL Openpose is built on the SDXL framework. This model integrates the IP Adapter and Openpose preprocessor to | Image-to-Image Transformation | $0.013783720446760989 | 15.4171s | [llms.txt](https://www.segmind.com/models/ip-sdxl-openpose/llms.txt) |
| IPAdapter Style Transfer | style-transfer | Style & Composition Transfer with Stable Diffusion IP Adapter  | Image-to-Image Transformation | $0.025668324416977615 | 16.64547s | [llms.txt](https://www.segmind.com/models/style-transfer/llms.txt) |
| Kling O1 | kling-o1 | Text-to-video creation with precise AI-driven motion control. | Image-to-Image Transformation | $0.035 | 59.52196s | [llms.txt](https://www.segmind.com/models/kling-o1/llms.txt) |
| Kling V3 Image 2 Image | kling-3-image2image | Transform images into photorealistic, production-ready visuals. | Image-to-Image Transformation | $0.035 | 68.9175s | [llms.txt](https://www.segmind.com/models/kling-3-image2image/llms.txt) |
| Kolors | kolors | Kolors is a cutting-edge text-to-image model that bridges language and visual art. Transform your textual ideas into pho | Image-to-Image Transformation | $0.09548960481352996 | 84.60946s | [llms.txt](https://www.segmind.com/models/kolors/llms.txt) |
| Lifestyle Product Shot by Image | bria-lifestyle-shot-by-image | Transforms ordinary product images into stunning, marketing-ready visuals for eCommerce success. | Image-to-Image Transformation | $0.031020408163265307 | 20.98771s | [llms.txt](https://www.segmind.com/models/bria-lifestyle-shot-by-image/llms.txt) |
| Magic Eraser | magic-eraser | LaMA Object Removal- AI Magic Eraser | Image-to-Image Transformation | $0.000159575151144529 | 0.78151s | [llms.txt](https://www.segmind.com/models/magic-eraser/llms.txt) |
| material-transfer | material-transfer | Transfer a material from an image to a subject | Image-to-Image Transformation | $0.251970550235849 | 164.02658s | [llms.txt](https://www.segmind.com/models/material-transfer/llms.txt) |
| Minimax-image-01 | image-01 | Generate high-fidelity images from text with precise control & stunning quality with Minimax Image-01. | Image-to-Image Transformation | $0.012520856467121588 | 36.23202s | [llms.txt](https://www.segmind.com/models/image-01/llms.txt) |
| Multi Image Kontext Max | multi-image-kontext-max | FLUX.1 Kontext [max] creates stunning, photorealistic images from text prompts and input images seamlessly. | Image-to-Image Transformation | $0.08793129526854218 | 18.3156s | [llms.txt](https://www.segmind.com/models/multi-image-kontext-max/llms.txt) |
| Multi Image Kontext Pro | multi-image-kontext-pro | Transform text into stunning, professional-grade images with precise editing capabilities. | Image-to-Image Transformation | $0.049999999999999975 | 22.90112s | [llms.txt](https://www.segmind.com/models/multi-image-kontext-pro/llms.txt) |
| Nano Banana 2 | nano-banana-2 | Fast photorealistic images — ideal for marketing and ads. | Image-to-Image Transformation | $0.0923011318546232 | 39.30415s | [llms.txt](https://www.segmind.com/models/nano-banana-2/llms.txt) |
| Nano Banana Pro | nano-banana-pro | High-fidelity images with accurate multilingual text rendering. | Image-to-Image Transformation | $0.16494746411651323 | 61.14607s | [llms.txt](https://www.segmind.com/models/nano-banana-pro/llms.txt) |
| Nomos Image Upscaler 4k | nomos-upscaler | This upscaling model is ideal for enhancing amateur to professional photos, excelling with subjects like cats, hair, and | Image-to-Image Transformation | $0.01154986986276634 | 9.08879s | [llms.txt](https://www.segmind.com/models/nomos-upscaler/llms.txt) |
| Omini Control | ominicontrol | OminiControl is an innovative framework that optimizes Diffusion Transformer models for versatile image generation tasks | Image-to-Image Transformation | $0.0041856582476405115 | 4.50113s | [llms.txt](https://www.segmind.com/models/ominicontrol/llms.txt) |
| Omni Zero | omni-zero | Omni-Zero: A diffusion pipeline for zero-shot stylized portrait creation. | Image-to-Image Transformation | $0.17303785389324666 | 149.10339s | [llms.txt](https://www.segmind.com/models/omni-zero/llms.txt) |
| Profile Photo Style Transfer | become-image | Turn any image of a face into artwork using Stable Diffusion Controlnet and IPAdapter | Image-to-Image Transformation | $0.09456781571354969 | 63.7612s | [llms.txt](https://www.segmind.com/models/become-image/llms.txt) |
| Pruna P Image Edit | p-image-edit | Multi-image editing with AI-guided precision and control. | Image-to-Image Transformation | $0.010000000000000002 | 7.5032s | [llms.txt](https://www.segmind.com/models/p-image-edit/llms.txt) |
| PuLID | pulid-base | Novel tuning-free ID customization method for text-to-image generation. | Image-to-Image Transformation | $0.20444191764255643 | 70.51177s | [llms.txt](https://www.segmind.com/models/pulid-base/llms.txt) |
| Qwen Image Edit | qwen-image-edit | Transform images effortlessly through semantic context and pixel-perfect appearance changes. | Image-to-Image Transformation | $0.1993516979032258 | 48.88752s | [llms.txt](https://www.segmind.com/models/qwen-image-edit/llms.txt) |
| Qwen Image Edit Fast | qwen-image-edit-fast | Qwen-Image-Edit enables precise bilingual image editing for seamless localization and professional content creation. | Image-to-Image Transformation | $0.0364297414955695 | 8.73196s | [llms.txt](https://www.segmind.com/models/qwen-image-edit-fast/llms.txt) |
| Qwen Image Edit Plus | qwen-image-edit-plus | Multi-image editing with precise text-guided transformations. | Image-to-Image Transformation | $0.03525215415395049 | 13.64518s | [llms.txt](https://www.segmind.com/models/qwen-image-edit-plus/llms.txt) |
| Qwen Image Edit Plus Add People Lora | qwen-image-edit-plus-add-people | Generate realistic multi-character scenes with natural interactions. | Image-to-Image Transformation | $0.09862558602150537 | 23.00519s | [llms.txt](https://www.segmind.com/models/qwen-image-edit-plus-add-people/llms.txt) |
| Qwen Image Edit Plus Blend It | qwen-image-edit-plus-blend-it | Product placement into backgrounds with precise lighting match. | Image-to-Image Transformation | $0.08165602641509435 | 18.06678s | [llms.txt](https://www.segmind.com/models/qwen-image-edit-plus-blend-it/llms.txt) |
| Qwen Image Edit Plus Eigen Banana | qwen-image-edit-plus-eigen-banana | Precise text-guided image transformation and creative editing. | Image-to-Image Transformation | $0.08899248170391061 | 21.66085s | [llms.txt](https://www.segmind.com/models/qwen-image-edit-plus-eigen-banana/llms.txt) |
| Qwen Image Edit Plus Eraser | qwen-image-edit-plus-eraser | Remove unwanted objects while preserving realistic backgrounds. | Image-to-Image Transformation | $0.07835796075085324 | 19.94357s | [llms.txt](https://www.segmind.com/models/qwen-image-edit-plus-eraser/llms.txt) |
| Qwen Image Edit Plus Face To Portrait | qwen-image-edit-plus-face-to-portrait | Cropped face into full identity-preserving portrait photo. | Image-to-Image Transformation | $0.07428876108786611 | 17.85171s | [llms.txt](https://www.segmind.com/models/qwen-image-edit-plus-face-to-portrait/llms.txt) |
| Qwen Image Edit Plus Group Photo | qwen-image-edit-plus-group-photo | Merge individual portraits into realistic group photos. | Image-to-Image Transformation | $0.10533954509803922 | 23.72663s | [llms.txt](https://www.segmind.com/models/qwen-image-edit-plus-group-photo/llms.txt) |
| Qwen Image Edit Plus Multi Lora | qwen-image-edit-plus-multi-lora | Multi-image editing with superior identity and style control. | Image-to-Image Transformation | $0.0861332302359882 | 20.40219s | [llms.txt](https://www.segmind.com/models/qwen-image-edit-plus-multi-lora/llms.txt) |
| Qwen Image Edit Plus Multiple Angles | qwen-image-edit-plus-multiple-angle | Transform image perspective with natural language prompts. | Image-to-Image Transformation | $0.0958050966442953 | 22.53001s | [llms.txt](https://www.segmind.com/models/qwen-image-edit-plus-multiple-angle/llms.txt) |
| Qwen Image Edit Plus Next Scene | qwen-image-edit-plus-next-scene | Create cinematic sequences with seamless visual continuity. | Image-to-Image Transformation | $0.09455228241469817 | 22.30941s | [llms.txt](https://www.segmind.com/models/qwen-image-edit-plus-next-scene/llms.txt) |
| Qwen Image Edit Plus Product Photography | qwen-image-edit-plus-product-photography | Transform white-background products into immersive lifestyle scenes. | Image-to-Image Transformation | $0.08908184624145789 | 20.64443s | [llms.txt](https://www.segmind.com/models/qwen-image-edit-plus-product-photography/llms.txt) |
| Qwen Image Edit Plus Relight | qwen-image-edit-plus-relight | Advanced image relighting using natural language prompts. | Image-to-Image Transformation | $0.10313037307692309 | 21.31752s | [llms.txt](https://www.segmind.com/models/qwen-image-edit-plus-relight/llms.txt) |
| Qwen Image Edit Plus Remove Lighting | qwen-image-edit-plus-remove-lighting | Remove artificial lighting effects and restore natural tones. | Image-to-Image Transformation | $0.07965391666666666 | 18.8681s | [llms.txt](https://www.segmind.com/models/qwen-image-edit-plus-remove-lighting/llms.txt) |
| Qwen Image Edit Plus Texture Apply | qwen-image-edit-plus-texture-apply | Apply precise textures to images using natural language. | Image-to-Image Transformation | $0.0997037090909091 | 23.41138s | [llms.txt](https://www.segmind.com/models/qwen-image-edit-plus-texture-apply/llms.txt) |
| Qwen Image Edit Plus Texture Extract | qwen-image-edit-plus-texture-extract | Extract seamless, tileable textures from photographs. | Image-to-Image Transformation | $0.10225743999999999 | 23.38367s | [llms.txt](https://www.segmind.com/models/qwen-image-edit-plus-texture-extract/llms.txt) |
| Relighting | ic-light | Prompts to auto-magically relight your images. | Image-to-Image Transformation | $0.036346569891959085 | 30.6633s | [llms.txt](https://www.segmind.com/models/ic-light/llms.txt) |
| Runway Gen 4 Image | runway-gen4-image | Runway's Gen-4 Image API enables precise, multimodal image generation for innovative creative and technical applications | Image-to-Image Transformation | $0.1 | 30.6423s | [llms.txt](https://www.segmind.com/models/runway-gen4-image/llms.txt) |
| Sam V2 Image | sam-v2-image | SAM v2, the next-gen segmentation model from Meta AI, revolutionizes computer vision. Building on SAM's success, it exce | Image-to-Image Transformation | $0.00174482463091723 | 1.6846s | [llms.txt](https://www.segmind.com/models/sam-v2-image/llms.txt) |
| Sam3 Image | sam3-image | Precise object segmentation and tracking in images. | Image-to-Image Transformation | $0.006994602857424377 | 4.80757s | [llms.txt](https://www.segmind.com/models/sam3-image/llms.txt) |
| SD Outpainting | sd1.5-outpaint | Stable Diffusion Outpainting can extend any image in any direction | Image-to-Image Transformation | $0.01091418242906789 | 4.34742s | [llms.txt](https://www.segmind.com/models/sd1.5-outpaint/llms.txt) |
| SD3 Medium Canny Controlnet | sd3-med-canny | Stable Diffusion 3 (SD3) Medium Canny ControlNet uses Canny edge detection to provide fine-grained control over the gene | Image-to-Image Transformation | $0.006618628908091122 | 8.78311s | [llms.txt](https://www.segmind.com/models/sd3-med-canny/llms.txt) |
| SD3 Medium Pose Controlnet | sd3-med-pose | Stable Diffusion 3 (SD3) Pose ControlNet is a large generative image model tailored for generating images based on text  | Image-to-Image Transformation | $0.01660999798816568 | 18.18351s | [llms.txt](https://www.segmind.com/models/sd3-med-pose/llms.txt) |
| SD3 Medium Tile Controlnet | sd3-med-tile | SD3 Medium Tile ControlNet is a large generative image model designed for generating detailed images based on textual pr | Image-to-Image Transformation | $0.0076904571630204656 | 8.94072s | [llms.txt](https://www.segmind.com/models/sd3-med-tile/llms.txt) |
| SDXL Controlnet | sdxl-controlnet | SDXL ControlNet gives unprecedented control over text-to-image generation. SDXL ControlNet models Introduces the concept | Image-to-Image Transformation | $0.012376808212101168 | 10.90488s | [llms.txt](https://www.segmind.com/models/sdxl-controlnet/llms.txt) |
| SDXL Img2Img | sdxl-img2img | SDXL Img2Img is used for text-guided image-to-image translation. This model uses the weights from Stable Diffusion to ge | Image-to-Image Transformation | $0.019466743047955162 | 8.39108s | [llms.txt](https://www.segmind.com/models/sdxl-img2img/llms.txt) |
| SDXL-Openpose | sdxl-openpose | This model leverages SDXL to generate the images with ControlNet conditioned on Human Pose Estimation. | Image-to-Image Transformation | $0.008039886633784986 | 8.32961s | [llms.txt](https://www.segmind.com/models/sdxl-openpose/llms.txt) |
| SeedEdit 3.0 i2i | seededit-v3 | SeedEdit 3.0 enables seamless, high-quality image edits through advanced AI-driven techniques. | Image-to-Image Transformation | $0.049999999999999975 | 10.85232s | [llms.txt](https://www.segmind.com/models/seededit-v3/llms.txt) |
| Seedream 4.0 (4k) | seedream-4 | Seedream 4.0 generates high-resolution, professional-grade visuals with superior text rendering for impactful design. | Image-to-Image Transformation | $0.03506033812070034 | 20.65216s | [llms.txt](https://www.segmind.com/models/seedream-4/llms.txt) |
| Seedream 4.5 | seedream-4.5 | Photorealistic image generation with precise text understanding. | Image-to-Image Transformation | $0.040010253956318124 | 31.74427s | [llms.txt](https://www.segmind.com/models/seedream-4.5/llms.txt) |
| Seedream 5.0 Lite: Image-to-Image | seedream-v5-lite-image-to-image | Transform images intelligently with detailed text prompts. | Image-to-Image Transformation | $0.03499999999999999 | 47.05796s | [llms.txt](https://www.segmind.com/models/seedream-v5-lite-image-to-image/llms.txt) |
| Segment Anything Model | sam-img2img | The Segment Anything Model (SAM) produces high quality object masks from input prompts such as points or boxes, and it c | Image-to-Image Transformation | $0.007594711676599745 | 3.80742s | [llms.txt](https://www.segmind.com/models/sam-img2img/llms.txt) |
| Segmind Beyond: Outpaint with Ease | seg-beyond | Effortlessly expand your visuals with AI Image Extend. Intelligently add pixels to any side of your image. | Image-to-Image Transformation | $0.036383991866913115 | 25.37701s | [llms.txt](https://www.segmind.com/models/seg-beyond/llms.txt) |
| Segmind FaceSwap Comic v1 | faceswap-comic | FaceSwap Comic v1 is an AI-powered face swapping model designed to blend real faces into illustrated or cartoon-style im | Image-to-Image Transformation | $0.07460677699789497 | 21.1779s | [llms.txt](https://www.segmind.com/models/faceswap-comic/llms.txt) |
| Segmind Faceswap v4 | faceswap-v4 | Segmind FaceSwap v4 enables fast and precise face or head swapping between images with customizable options for style, o | Image-to-Image Transformation | $0.1235410645315008 | 32.87619s | [llms.txt](https://www.segmind.com/models/faceswap-v4/llms.txt) |
| Segmind Faceswap v5 | faceswap-v5 | Ultra-fast face and head swapping in images. | Image-to-Image Transformation | $0.049976330591144646 | 9.65348s | [llms.txt](https://www.segmind.com/models/faceswap-v5/llms.txt) |
| Segmind Relighting | segmind-relighting | Prompts to auto-magically relight your images. | Image-to-Image Transformation | $0.059251471825063066 | 10.33768s | [llms.txt](https://www.segmind.com/models/segmind-relighting/llms.txt) |
| Segmind Relighting V2 | segmind-relighting-v2 | Transform images with customizable, photorealistic lighting for unparalleled visual creativity and authenticity. | Image-to-Image Transformation | $0.25820446236559136 | 70.57164s | [llms.txt](https://www.segmind.com/models/segmind-relighting-v2/llms.txt) |
| Segmind SceneCraft v0.1 | segmind-scenecraft-v01 | SceneCraft transforms plain or existing product images into visually rich, photorealistic scenes. Whether starting from  | Image-to-Image Transformation | $0.33926112521739127 | 33.64033s | [llms.txt](https://www.segmind.com/models/segmind-scenecraft-v01/llms.txt) |
| Segmind SegFit v1.1 | segfit-v1.1 | Segmind's Fashion and Immersive Try-on model. SegFIT offers effortless AI virtual try-on from just a product image. No m | Image-to-Image Transformation | $0.4521986935392882 | 68.96038s | [llms.txt](https://www.segmind.com/models/segfit-v1.1/llms.txt) |
| Segmind SegFit v1.2 | segfit-v1.2 | SegFit v1.2 creates hyper-realistic virtual try-on images, transforming fashion retail engagement and conversion rates. | Image-to-Image Transformation | $0.09198869015011549 | 51.81403s | [llms.txt](https://www.segmind.com/models/segfit-v1.2/llms.txt) |
| Segmind SegFit v1.3 | segfit-v1.3 | SegFit v1.3 enables hyper-realistic virtual try-ons, enhancing online fashion retail experiences without physical photos | Image-to-Image Transformation | $0.21953233037952966 | 37.14528s | [llms.txt](https://www.segmind.com/models/segfit-v1.3/llms.txt) |
| Segmind SegSwap v0.1 | seg-swap | Swap Objects Instantly. The Segmind SegSwap v0.1 model enables dynamic and precise image editing by allowing users to re | Image-to-Image Transformation | $0.2872055589597435 | 26.39516s | [llms.txt](https://www.segmind.com/models/seg-swap/llms.txt) |
| Skin Contrast Upscaler | skin-contrast-upscaler | Enhances skin detail in images while preserving background quality for professional photography and art. | Image-to-Image Transformation | $0.013142800174367916 | 3.67532s | [llms.txt](https://www.segmind.com/models/skin-contrast-upscaler/llms.txt) |
| SSD Img2Img | ssd-img2img | This model uses SSD-1B to generate images by passing a text prompt and an initial image to condition the generation
  | Image-to-Image Transformation | $0.0033121782447356482 | 3.99695s | [llms.txt](https://www.segmind.com/models/ssd-img2img/llms.txt) |
| SSD-Canny | ssd-canny | This model leverages SSD-1B to generate the images with ControlNet conditioned on Canny Images
  | Image-to-Image Transformation | $0.006194010348729142 | 5.95972s | [llms.txt](https://www.segmind.com/models/ssd-canny/llms.txt) |
| SSD-Depth | ssd-depth | This model leverages SSD-1B to generate the images with ControlNet conditioned on Depth Estimation | Image-to-Image Transformation | $0.009080595711003319 | 10.7305s | [llms.txt](https://www.segmind.com/models/ssd-depth/llms.txt) |
| Stable Diffusion img2img | sd1.5-img2img | This model uses diffusion-denoising mechanism as first proposed by SDEdit, Stable Diffusion is used for text-guided imag | Image-to-Image Transformation | $0.0037053834433711354 | 7.69591s | [llms.txt](https://www.segmind.com/models/sd1.5-img2img/llms.txt) |
| Story Diffusion | storydiffusion | Story Diffusion turns your written narratives into stunning image sequences. | Image-to-Image Transformation | $0.16542968855203616 | 118.53637s | [llms.txt](https://www.segmind.com/models/storydiffusion/llms.txt) |
| Supir Photo-Realistic Image Restoration | supir | SUPIR restores and enhances images to stunning, photo-realistic quality with advanced AI techniques. | Image-to-Image Transformation | $5 | - | [llms.txt](https://www.segmind.com/models/supir/llms.txt) |
| Text Overlay | text-overlay | Elevate your visuals withText Overlay Model. Easily add customized text to any image, perfect for social media, marketin | Image-to-Image Transformation | $0.0011389110764587526 | 2.11004s | [llms.txt](https://www.segmind.com/models/text-overlay/llms.txt) |
| Topaz Labs Image Upscale | topaz-image-upscale | Topaz Labs image upscale is an industry-leading AI photo upscaler designed to increase the resolution of photos while pr | Image-to-Image Transformation | $0.3745939754385965 | 23.43411s | [llms.txt](https://www.segmind.com/models/topaz-image-upscale/llms.txt) |
| Transparent Background Maker | transparent-background-maker | Transform your images with Transparent Background Maker. Quickly remove backgrounds using AI technology, supporting PNG  | Image-to-Image Transformation | $0.002850137474774654 | 1.36711s | [llms.txt](https://www.segmind.com/models/transparent-background-maker/llms.txt) |
| Word2img | w2imgsd1.5-img2img | Create beautifully designed words using Segmind’s word to image for your marketing purposes | Image-to-Image Transformation | $0.0047461707694165 | 10.00899s | [llms.txt](https://www.segmind.com/models/w2imgsd1.5-img2img/llms.txt) |

## Text-to-Audio Generation

| Model | Slug | Description | Modality | Avg Cost | Avg Latency | API Docs |
| --- | --- | --- | --- | --- | --- | --- |
| 3B Orpheus TTS (0.1) | orpheus-3b-0.1 | Orpheus TTS is an open-source text-to-speech (TTS) system powered by the Llama 3B language model, designed for high-qual | Text-to-Audio Generation | $0.12419483106004443 | 117.62101s | [llms.txt](https://www.segmind.com/models/orpheus-3b-0.1/llms.txt) |
| Ace Step Music | ace-step-music | ACE-Step generates high-quality music rapidly, enhancing the creative process for developers and artists worldwide. | Text-to-Audio Generation | $0.035132896583850944 | 11.79223s | [llms.txt](https://www.segmind.com/models/ace-step-music/llms.txt) |
| Chatterbox TTS | chatterbox-tts | Chatterbox transforms text into rich, natural speech with adjustable emotional expressiveness for diverse applications. | Text-to-Audio Generation | $0.0199414375 | 18.03554s | [llms.txt](https://www.segmind.com/models/chatterbox-tts/llms.txt) |
| Chatterbox Turbo TTS | chatterbox-turbo-tts | Ultra-fast, human-quality TTS with emotional expression. | Text-to-Audio Generation | $0.0208593132664437 | 13.39185s | [llms.txt](https://www.segmind.com/models/chatterbox-turbo-tts/llms.txt) |
| Dia (Text to Speech) | dia | Dia by Nari Labs is an advanced open-weights TTS model that brings scripts to life with natural speech, emotions, and no | Text-to-Audio Generation | $0.06975758289779323 | 89.54892s | [llms.txt](https://www.segmind.com/models/dia/llms.txt) |
| Elevenlabs Dialogue | elevenlabs-dialogue | Immersive, emotionally expressive multi-speaker audio dialogue. | Text-to-Audio Generation | $0.018704464285714283 | 6.76155s | [llms.txt](https://www.segmind.com/models/elevenlabs-dialogue/llms.txt) |
| ElevenLabs Dubbing | dubbing | Instantly dubs audio and video into 29 languages while preserving each speaker's original voice. | Text-to-Audio Generation | $0.24967988200000005 | 92.70439s | [llms.txt](https://www.segmind.com/models/dubbing/llms.txt) |
| Elevenlabs Sound Generation | sound-generation | Eleven Labs' Sound Generation API provides a robust development tool for programmatically generating audio content using | Text-to-Audio Generation | $0.026501807981803144 | 7.82464s | [llms.txt](https://www.segmind.com/models/sound-generation/llms.txt) |
| Elevenlabs Text To Speech  | tts-eleven-labs | ElevenLabs TTS transforms text into captivating, human-like speech for diverse applications. | Text-to-Audio Generation | $0.09539751199014937 | 12.2895s | [llms.txt](https://www.segmind.com/models/tts-eleven-labs/llms.txt) |
| Gemini TTS 2.5 Flash | gemini-2.5-flash-tts | Fast, lifelike text-to-speech with expressive emotional tones. | Text-to-Audio Generation | $0.004979537479131886 | 17.60247s | [llms.txt](https://www.segmind.com/models/gemini-2.5-flash-tts/llms.txt) |
| Gemini TTS 2.5 Pro | gemini-2.5-pro-tts | Human-like speech synthesis with rich expressive emotional depth. | Text-to-Audio Generation | $0.020483156066945608 | 32.569s | [llms.txt](https://www.segmind.com/models/gemini-2.5-pro-tts/llms.txt) |
| Lyria 2 | lyria-2 | Lyria 2 by Google DeepMind is an advanced model that generates high-fidelity 48kHz stereo instrumental music from text p | Text-to-Audio Generation | $0.08999999999999997 | 27.23475s | [llms.txt](https://www.segmind.com/models/lyria-2/llms.txt) |
| Meta MusicGen Medium | meta-musicgen-medium | MusicGen: Transform text into music with AI. Create unique, high-quality audio from simple descriptions. Experience the  | Text-to-Audio Generation | $0.04053212388675097 | 22.29395s | [llms.txt](https://www.segmind.com/models/meta-musicgen-medium/llms.txt) |
| Minimax Music-01 | minimax-music-01 | Generate up to 60 seconds of music with both accompaniment and vocals in a single pass, with vocals from lyrics and a re | Text-to-Audio Generation | $0.07049529162790698 | 44.29378s | [llms.txt](https://www.segmind.com/models/minimax-music-01/llms.txt) |
| MyShell Text To Speech | myshell-tts | MyShell's Voice Cloning and Text to Speech - Transform your audio content with realistic, personalized voices. Experienc | Text-to-Audio Generation | $0.006335910745629597 | 7.0019s | [llms.txt](https://www.segmind.com/models/myshell-tts/llms.txt) |
| Openvoice | openvoice | OpenVoice is a versatile voice cloning model that supports multiple languages and offers precise tone replication, flexi | Text-to-Audio Generation | $0.008993625410928472 | 10.20667s | [llms.txt](https://www.segmind.com/models/openvoice/llms.txt) |
| Sam Audio Large | sam-audio-large | Isolate any described sound from mixed audio tracks. | Text-to-Audio Generation | $0.062476201587301584 | 12.91627s | [llms.txt](https://www.segmind.com/models/sam-audio-large/llms.txt) |
| Veena TTS | veena-tts | Veena transforms text into high-fidelity, expressive speech in Hindi and English for real-time applications. | Text-to-Audio Generation | $0.055781026515151516 | 45.2031s | [llms.txt](https://www.segmind.com/models/veena-tts/llms.txt) |
| VeenaMax TTS | veena-max-tts | VeenaMAX transforms text into expressive, real-time speech across multiple Indian languages for seamless communication. | Text-to-Audio Generation | $0.017146847682119208 | 12.95526s | [llms.txt](https://www.segmind.com/models/veena-max-tts/llms.txt) |

## voice

| Model | Slug | Description | Modality | Avg Cost | Avg Latency | API Docs |
| --- | --- | --- | --- | --- | --- | --- |
| Kling Create Voice | kling-create-voice | Clone any voice from a single audio sample. | voice | $0.007 | 26.02065s | [llms.txt](https://www.segmind.com/models/kling-create-voice/llms.txt) |

## imageTo3d

| Model | Slug | Description | Modality | Avg Cost | Avg Latency | API Docs |
| --- | --- | --- | --- | --- | --- | --- |
| Hunyuan-3d 2mv | hunyuan3d-2mv | Hunyuan3D-2mv is finetuned from Hunyuan3D-2 to support multiview controlled shape generation. | imageTo3d | $0.3088497100628931 | 100.38197s | [llms.txt](https://www.segmind.com/models/hunyuan3d-2mv/llms.txt) |
| Hunyuan3D-2 | hunyuan-3d-2 | Hunyuan3D 2.0 enables the creation of high-quality 3D models with intricate details. Produce assets that are visually ap | imageTo3d | $0.3450042239694657 | 36.90529s | [llms.txt](https://www.segmind.com/models/hunyuan-3d-2/llms.txt) |
| Hunyuan3d-2.1 | hunyuan3d-2.1 | Transform 2D images into photorealistic, high-fidelity 3D assets effortlessly. | imageTo3d | $0.15685504241842613 | 149.83768s | [llms.txt](https://www.segmind.com/models/hunyuan3d-2.1/llms.txt) |
| Sam 3D Body | sam-3d-body | Reconstruct 3D human body meshes from a single photo. | imageTo3d | $0.019317541176470585 | 10.12074s | [llms.txt](https://www.segmind.com/models/sam-3d-body/llms.txt) |
| Sam 3D Object | sam-3d-objects | Single 2D image into detailed 3D object models. | imageTo3d | $0.06384196985583222 | 33.30635s | [llms.txt](https://www.segmind.com/models/sam-3d-objects/llms.txt) |

## Image-to-Text (Vision)

| Model | Slug | Description | Modality | Avg Cost | Avg Latency | API Docs |
| --- | --- | --- | --- | --- | --- | --- |
| Bria Fibo | bria-fibo-generate | Photorealistic images from structured prompts with brand control. | Image-to-Text (Vision) | $0.039999999999999994 | 21.60291s | [llms.txt](https://www.segmind.com/models/bria-fibo-generate/llms.txt) |
| Bria Fibo Structured Prompt | bria-fibo-generate-structured-prompt | Convert complex inputs into structured JSON prompts for generation. | Image-to-Text (Vision) | $0.01 | 12.50573s | [llms.txt](https://www.segmind.com/models/bria-fibo-generate-structured-prompt/llms.txt) |
| Bria Mask Generator | bria-mask-generator | Bria AI Get Masks automatically generates accurate object masks for advanced image editing and enhancement. | Image-to-Text (Vision) | $0.001217757009345794 | 6.85549s | [llms.txt](https://www.segmind.com/models/bria-mask-generator/llms.txt) |
| Bria Prompt Enhancer | bria-prompt-enhancer | Bria AI generates high-quality, commercially safe images tailored to diverse creative needs. | Image-to-Text (Vision) | $0.018395061728395064 | 3.63199s | [llms.txt](https://www.segmind.com/models/bria-prompt-enhancer/llms.txt) |
| Google Translate | google-translate | Translate effortlessly with the powerful Google Translation AI model. | Image-to-Text (Vision) | $0.005777320675105487 | 0.65903s | [llms.txt](https://www.segmind.com/models/google-translate/llms.txt) |
| Ideogram Describe | ideogram-describe | Ideogram describe can effortlessly generate detailed prompts from images. Perfect for refining creations or replicating  | Image-to-Text (Vision) | $0.015000000000000003 | 3.93242s | [llms.txt](https://www.segmind.com/models/ideogram-describe/llms.txt) |
| Image Converter | image-converter | Convert images between formats instantly. | Image-to-Text (Vision) | $0.068 | 5.33057s | [llms.txt](https://www.segmind.com/models/image-converter/llms.txt) |
| Image resizer | image-resizer | Resize images to any dimension quickly and precisely. | Image-to-Text (Vision) | $0.03333333333333333 | 3.76637s | [llms.txt](https://www.segmind.com/models/image-resizer/llms.txt) |
| Json Extractor | json-extractor | Json Extractor | Image-to-Text (Vision) | $0.0001 | 0.00203s | [llms.txt](https://www.segmind.com/models/json-extractor/llms.txt) |
| LLAVA 1.6 7B | llava-v1.6 | LLaVa translates images into text descriptions & captions. | Image-to-Text (Vision) | $0.005302737698586939 | 3.58934s | [llms.txt](https://www.segmind.com/models/llava-v1.6/llms.txt) |
| Sam V2.1 Hiera Large | sam-v21-hiera-large | Meta's next-gen segmentation model for images and video. | Image-to-Text (Vision) | $0.036773737623762376 | 25.01516s | [llms.txt](https://www.segmind.com/models/sam-v21-hiera-large/llms.txt) |
| Video Speed Change | video-speed-change | Speed up or slow down any video precisely. | Image-to-Text (Vision) | $0.042207800000000004 | 30.29047s | [llms.txt](https://www.segmind.com/models/video-speed-change/llms.txt) |

## Audio-to-Text (Transcription)

| Model | Slug | Description | Modality | Avg Cost | Avg Latency | API Docs |
| --- | --- | --- | --- | --- | --- | --- |
| Elevenlabs Dialogue With Timing | elevenlabs-dialogue-with-timestamps | Multi-speaker dialogue with expressive timestamps included. | Audio-to-Text (Transcription) | $0.01445625 | 2.49052s | [llms.txt](https://www.segmind.com/models/elevenlabs-dialogue-with-timestamps/llms.txt) |
| Elevenlabs Forced Alignment | elevenlabs-forced-alignment | Precise audio-text synchronization with word-level timestamps. | Audio-to-Text (Transcription) | $0.09999999999999999 | 0.70002s | [llms.txt](https://www.segmind.com/models/elevenlabs-forced-alignment/llms.txt) |
| Elevenlabs Transcript | eleven-labs-transcript | Transcribe audio to accurate text in 99 languages with speaker diarization and word-level timestamps. | Audio-to-Text (Transcription) | $0.0034717734729493893 | 7.72543s | [llms.txt](https://www.segmind.com/models/eleven-labs-transcript/llms.txt) |
| Elevenlabs Voice Cloning | elevenlabs-voice-clone | Hyper-realistic voice cloning from short audio samples. | Audio-to-Text (Transcription) | $0.010000000000000002 | 4.67932s | [llms.txt](https://www.segmind.com/models/elevenlabs-voice-clone/llms.txt) |
| Elevenlabs Voice Design | elevenlabs-voice-design | Generate unique synthetic voices without audio samples. | Audio-to-Text (Transcription) | $0.01 | 22.92416s | [llms.txt](https://www.segmind.com/models/elevenlabs-voice-design/llms.txt) |
| TTS Elevenlabs With Timing | tts-elevenlabs-with-timestamps | Emotionally expressive TTS with word-level timestamp output. | Audio-to-Text (Transcription) | $0.05918507462686567 | 5.37361s | [llms.txt](https://www.segmind.com/models/tts-elevenlabs-with-timestamps/llms.txt) |

## audioToAudio

| Model | Slug | Description | Modality | Avg Cost | Avg Latency | API Docs |
| --- | --- | --- | --- | --- | --- | --- |
| Elevenlabs Audio Isolation | elevenlabs-audio-isolation | Extract clear speech from noisy audio and video. | audioToAudio | $0.13456178571428573 | 5.28191s | [llms.txt](https://www.segmind.com/models/elevenlabs-audio-isolation/llms.txt) |
| Elevenlabs Speech To Speech | sts-eleven-labs | Eleven Labs Speech-to-Speech offers AI-powered voice conversion for content creators, media professionals, and anyone se | audioToAudio | $0.018750861111111114 | 6.45038s | [llms.txt](https://www.segmind.com/models/sts-eleven-labs/llms.txt) |

## videoToImage

| Model | Slug | Description | Modality | Avg Cost | Avg Latency | API Docs |
| --- | --- | --- | --- | --- | --- | --- |
| Frame extractor | frame-extractor | Extract individual frames from any video as images. | videoToImage | $0.0054555857142857146 | 26.50857s | [llms.txt](https://www.segmind.com/models/frame-extractor/llms.txt) |
| Start & End Frame Extractor | start-end-frame-extractor | Extract first and last frames from any video. | videoToImage | $0.004668401041666667 | 4.92903s | [llms.txt](https://www.segmind.com/models/start-end-frame-extractor/llms.txt) |

## textToEmbed

| Model | Slug | Description | Modality | Avg Cost | Avg Latency | API Docs |
| --- | --- | --- | --- | --- | --- | --- |
| Text Embedding 3 Large | text-embedding-3-large | Text-embedding-3-large is a robust language model by OpenAI designed for generating high-dimensional text embeddings for | textToEmbed | $0.00001782294162415086 | 1.47092s | [llms.txt](https://www.segmind.com/models/text-embedding-3-large/llms.txt) |
| Text Embedding 3 Small | text-embedding-3-small | Text-embedding-3-small is a compact and efficient model developed for generating high-quality text embeddings. These emb | textToEmbed | $0.000023989964637293316 | 1.25323s | [llms.txt](https://www.segmind.com/models/text-embedding-3-small/llms.txt) |

## imageTOImage

| Model | Slug | Description | Modality | Avg Cost | Avg Latency | API Docs |
| --- | --- | --- | --- | --- | --- | --- |
| Stable Diffusion 3 Medium Image to Image | sd3-med-img2img | Stable Diffusion 3 Medium image-to-image is a cutting-edge AI tool that uses advanced image-to-image technology to trans | imageTOImage | $0.0075827782666080075 | 7.43717s | [llms.txt](https://www.segmind.com/models/sd3-med-img2img/llms.txt) |

## Image Inpainting

| Model | Slug | Description | Modality | Avg Cost | Avg Latency | API Docs |
| --- | --- | --- | --- | --- | --- | --- |
| Fooocus Inpainting | focus-inpaint | Fooocus Inpainting is a powerful image generation model that allows you to selectively edit and enhance images. | Image Inpainting | $0.024867879544198775 | 17.92299s | [llms.txt](https://www.segmind.com/models/focus-inpaint/llms.txt) |
| SDXL Inpaint | sdxl-inpaint | This model is capable of generating photo-realistic images given any text input, with the extra capability of inpainting | Image Inpainting | $0.006894747349528049 | 8.54611s | [llms.txt](https://www.segmind.com/models/sdxl-inpaint/llms.txt) |
| Stable Diffusion Inpainting | sd1.5-inpainting | Stable Diffusion Inpainting is a latent text-to-image diffusion model capable of generating photo-realistic images given | Image Inpainting | $0.001819761720020576 | 2.87848s | [llms.txt](https://www.segmind.com/models/sd1.5-inpainting/llms.txt) |
| Try-On Diffusion | try-on-diffusion | Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on | Image Inpainting | $0.011084928515494282 | 7.6227s | [llms.txt](https://www.segmind.com/models/try-on-diffusion/llms.txt) |

Document

llms-full.txt

Not stored for this site.