Mode
Serverless GPU Services
The Serverless GPU service offers two operational modes: Develop and Deploy. Each mode targets a different use case, so you can use GPU resources efficiently for your applications.
Develop Mode
Develop mode is optimized for one-time GPU tasks that return results immediately. It is suitable for batch processing and analysis tasks. You can execute tasks using this command
Limitations
Maximum execution time: 30 seconds per task
Concurrent tasks: 1 task per user
Deploy Mode
Deploy mode enables continuous GPU application deployment with API endpoint access.
After successful deployment, you will receive:
API endpoints for your application
API key for authentication
Two endpoint types are available:
Function Endpoint: The container stops automatically after the task completes.
Server Endpoint: The container remains active for the duration of the time limit (30 seconds), accepts multiple requests while it is active, and stops automatically when the time limit expires.
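As a rough illustration of calling a deployed endpoint with the API key, here is a minimal client sketch that uses only the Python standard library. The endpoint URL, the JSON payload shape, and the use of an Authorization: Bearer header are assumptions made for this example; the actual URL and authentication scheme are the ones you receive after deployment.

```python
import json
import urllib.request

# Matches the documented 30-second execution limit; used as a client-side timeout.
TIME_LIMIT_SECONDS = 30


def build_request(endpoint_url: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request for a deployed endpoint.

    The header name and the Bearer scheme are assumptions, not the documented
    authentication format of the service.
    """
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",  # hypothetical auth header
            "Content-Type": "application/json",
        },
        method="POST",
    )


def call_endpoint(endpoint_url: str, api_key: str, payload: dict) -> dict:
    """Send the request and decode the JSON response."""
    request = build_request(endpoint_url, api_key, payload)
    with urllib.request.urlopen(request, timeout=TIME_LIMIT_SECONDS) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a Server Endpoint, call_endpoint could be invoked several times while the container is still active. With a Function Endpoint, each call would correspond to one task, after which the container stops.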
Limitations
Applications must be built using the FastAPI framework.
A single API key is provided per deployment (it can be regenerated).
Maximum execution time: 30 seconds per task
Concurrent tasks: 1 task per user