Limitations

Supported Models

  • Llama (1, 2, 3, 3.1)

  • Mistral

  • Qwen, Qwen2

  • Gemma, Gemma2

Upcoming Models

  • RecurrentGemma

  • Mamba

Regions

Currently, we offer services across 5 AWS regions:

  • N. Virginia (us-east-1)

  • Oregon (us-west-2)

  • Tokyo (ap-northeast-1)

  • Sydney (ap-southeast-2)

  • Jakarta (ap-southeast-3)

GPU Types

We currently support 3 types of GPU instances:

  • NVIDIA L4

  • NVIDIA L40S

  • NVIDIA A10

Upcoming GPUs

  • NVIDIA H100

  • NVIDIA H200

Multi-GPU Support

At the moment, we do not support multi-GPU deployments. Each deployment runs on a single GPU.

Please note that regional coverage and GPU options may expand in the future, so the lists above reflect what is available today.
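The limits listed on this page can be expressed as a small pre-flight check that runs before a deployment is requested. The sketch below is illustrative only: `DeploymentRequest`, `check_limits`, and the field names are hypothetical and not part of any official SDK, and the allowed values are copied by hand from the lists above, so they would need updating whenever this page changes.

```python
from dataclasses import dataclass

# Values copied from this page (assumption: identifiers are written this way).
SUPPORTED_MODEL_FAMILIES = {"llama", "mistral", "qwen", "qwen2", "gemma", "gemma2"}
SUPPORTED_REGIONS = {
    "us-east-1",       # N. Virginia
    "us-west-2",       # Oregon
    "ap-northeast-1",  # Tokyo
    "ap-southeast-2",  # Sydney
    "ap-southeast-3",  # Jakarta
}
SUPPORTED_GPUS = {"L4", "L40S", "A10"}
MAX_GPUS_PER_DEPLOYMENT = 1  # multi-GPU deployments are not supported


@dataclass
class DeploymentRequest:
    """Hypothetical request shape, used only for this example."""
    model_family: str
    region: str
    gpu_type: str
    gpu_count: int = 1


def check_limits(req: DeploymentRequest) -> list[str]:
    """Return a list of human-readable problems; an empty list means the request fits."""
    problems = []
    if req.model_family.lower() not in SUPPORTED_MODEL_FAMILIES:
        problems.append(f"unsupported model family: {req.model_family}")
    if req.region not in SUPPORTED_REGIONS:
        problems.append(f"unsupported region: {req.region}")
    if req.gpu_type.upper() not in SUPPORTED_GPUS:
        problems.append(f"unsupported GPU type: {req.gpu_type}")
    if req.gpu_count > MAX_GPUS_PER_DEPLOYMENT:
        problems.append("multi-GPU deployments are not supported")
    return problems


if __name__ == "__main__":
    request = DeploymentRequest("Llama", "ap-southeast-3", "L40S")
    print(check_limits(request) or "request is within the documented limits")
```

Returning a list of problems rather than raising on the first one lets a caller report every mismatch at once, which is convenient when several fields are wrong.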
