io.github.rsmdt

multimodal

Multi-provider media generation — images, video, audio, and transcription via a unified interface

stdio, community, service

Package Details

Transport: stdio

Environment Variables

OPENAI_API_KEY (str, secret)
OpenAI API key for image, video, and audio generation and transcription

XAI_API_KEY (str, secret)
xAI API key for image and video generation

GEMINI_API_KEY (str, secret)
Google Gemini API key for image, video, and audio generation

ELEVENLABS_API_KEY (str, secret)
ElevenLabs API key for audio generation and transcription

BFL_API_KEY (str, secret)
BFL API key for FLUX image generation and editing

MEDIA_OUTPUT_DIR (str)
Directory for saved media files (defaults to the current working directory)
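
The listing does not show how the server reads these variables, so the following is only a minimal sketch of how a client might check its environment before launching. The capability lists are copied from the descriptions above. The names PROVIDER_CAPABILITIES, available_capabilities, and resolve_output_dir are hypothetical and not part of the package.

```python
import os
from pathlib import Path

# Capabilities per key, taken from the variable descriptions in this listing.
# The mapping itself is an illustrative assumption, not the server's own code.
PROVIDER_CAPABILITIES = {
    "OPENAI_API_KEY": ("image", "video", "audio", "transcription"),
    "XAI_API_KEY": ("image", "video"),
    "GEMINI_API_KEY": ("image", "video", "audio"),
    "ELEVENLABS_API_KEY": ("audio", "transcription"),
    "BFL_API_KEY": ("image",),
}


def available_capabilities(env=None):
    """Map each capability to the key variables that are set and non-empty."""
    env = os.environ if env is None else env
    result = {}
    for var, capabilities in PROVIDER_CAPABILITIES.items():
        if not env.get(var):
            continue  # unset or empty key: treat the provider as unavailable
        for capability in capabilities:
            result.setdefault(capability, []).append(var)
    return result


def resolve_output_dir(env=None):
    """Return MEDIA_OUTPUT_DIR if set, otherwise the current working directory."""
    env = os.environ if env is None else env
    configured = env.get("MEDIA_OUTPUT_DIR")
    return Path(configured) if configured else Path.cwd()


if __name__ == "__main__":
    print(available_capabilities())
    print(resolve_output_dir())
```

Running the script with only some keys exported shows which media types would have at least one provider configured, which can help catch a missing key before starting the server.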