Quick Start

One-Click Deploy Quick Start

Check all instances

When you first access the One-Click Deploy service, you'll be presented with a table displaying all your instances, both active and inactive.

Every account is allocated a quota of 4 GPU cards, with no restrictions on the type of GPU. To check your current quota, simply click on the "Quota" button or navigate to the service quota settings.

Learn more about service quota here

Add new instance

To start a new instance:

  1. Paste the Hugging Face model repository and token (if required), then click "Next".

  2. Review the model name and enter an instance name.

  3. Configure the instance: select a region and GPU type.

  4. Review the pricing and instance summary.

  5. Click "Start Deploy" and wait for the "Start instance successfully" notification.

  6. You are then redirected to the instance's deployment section.

We currently use basic optimization techniques. Learn more in the technical section. For model support limitations, check here.

Quickly test API

After a successful deployment, test your model using either of the following:

Instance Chat Playground

The Instance Chat Playground is a testing tool available from your instance overview, so you can interact with the deployed model immediately.

  • Default settings: temperature 0.5, max tokens 512

  • Customize parameters, system prompt, and text message via GUI
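The playground defaults can also be reproduced when building a request body by hand. The following is a minimal sketch, assuming the endpoint accepts the OpenAI-style `temperature` and `max_tokens` fields (the `/v1/chat/completions` path suggests an OpenAI-compatible schema, but this page does not confirm it). `build_chat_payload` is a hypothetical helper name, not part of any Float16 SDK.

```python
import json

# Playground defaults stated above.
DEFAULT_TEMPERATURE = 0.5
DEFAULT_MAX_TOKENS = 512


def build_chat_payload(model, user_text,
                       system_prompt="You are a helpful assistant.",
                       temperature=DEFAULT_TEMPERATURE,
                       max_tokens=DEFAULT_MAX_TOKENS):
    """Build a chat request body (hypothetical helper).

    Assumption: the API accepts OpenAI-style `temperature` and `max_tokens`.
    """
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_text},
        ],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }


if __name__ == "__main__":
    # Print the body that mirrors the playground's default settings.
    payload = build_chat_payload("<your model>", "Hello")
    print(json.dumps(payload, indent=2, ensure_ascii=False))
```

Overriding `temperature` or `max_tokens` in the call corresponds to changing the same parameters in the playground GUI.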

cURL

Or use the following command:

curl -X POST http://api.float16.cloud/dedicate/JxlkeA5y2c/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <float16-api-key>" \
  -d '{
    "model": "<your model>",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "สวัสดี"
      }
    ]
  }'
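For scripts, the same request can be issued from Python. This is a minimal sketch using only the standard library. The endpoint URL is copied from the cURL command above and should be replaced with your own instance's URL. `build_request` and `send` are hypothetical helper names, and the response is assumed to be JSON.

```python
import json
import urllib.request

# URL copied from the cURL example; replace with your own instance's endpoint.
ENDPOINT = "http://api.float16.cloud/dedicate/JxlkeA5y2c/v1/chat/completions"


def build_request(api_key, model, user_text, endpoint=ENDPOINT):
    """Build (but do not send) the same POST request as the cURL command."""
    body = {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_text},
        ],
    }
    return urllib.request.Request(
        endpoint,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and return the decoded JSON response (assumed JSON)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (performs a network call, so it is left commented out):
# reply = send(build_request("<float16-api-key>", "<your model>", "Hello"))
# print(reply)
```

Keeping request construction separate from sending makes it possible to inspect the headers and body before any network call is made.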

Find copyable API formats (including OpenAI and LangChain) in the API tab.
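Once a response comes back, the assistant's text has to be pulled out of it. The sketch below assumes the response follows the OpenAI chat-completions shape, which the request path and the mention of an OpenAI format suggest but this page does not confirm. `extract_reply` is a hypothetical helper, and the sample response is illustrative only, not output captured from a real instance.

```python
def extract_reply(response):
    """Return the assistant's text from an OpenAI-style chat completion dict.

    Assumption: the reply is at response["choices"][0]["message"]["content"].
    """
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]


# Illustrative response shape only; not captured from a real instance.
sample = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hello! How can I help?"},
        }
    ]
}

print(extract_reply(sample))
```

If your instance returns a different structure, adjust the lookup path accordingly; the API tab is the authoritative source for the exact format.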
