
Instance Detail

Your instance detail


Last updated 7 months ago

When you initiate deployment, the instance detail page becomes available. This page is divided into five sections, each providing crucial information about your deployed instance.

Overview

The Overview section contains essential instance information:

  • Model details (batch size, max input length, number of tokens)

  • Instance configuration (cloud provider, region, GPU type)

  • Endpoint and API key (visible after successful deployment)

You can regenerate the API key for security purposes. A playground is also provided for quick model testing.

Usage & Cost

This section displays real-time usage and cost information:

  • Each row represents usage and cost for a specific pricing period

  • New rows are added when pricing changes

Example: If an L4 GPU costs 1.00/hour in September and increases to 1.20/hour in October, you'll see separate rows for September and October usage.

For a comprehensive view of all instance costs, visit the payment settings.

Activity

The Activity section shows monthly instance activity.

Deployments

This section displays the deployment status, which moves through the following states:

  1. Initial: Checking the model and limitations

  2. Allocate: Allocating resources

  3. Running: Deployment successful, ready to use

  4. Terminated: Instance shut down (the page will still show three checked statuses)

If deployment fails, the system automatically terminates the instance.

Setting

Currently, the Setting section offers the option to terminate the instance.
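To make the pricing-period rows of the Usage & Cost section more concrete, here is a minimal sketch of how usage could be grouped into one row per rate. The `UsageRow` type, the `build_rows` helper, and the hour counts are hypothetical and chosen only for illustration; the 1.00 and 1.20 hourly rates are taken from the example in that section. This is not Float16's actual billing logic.

```python
from dataclasses import dataclass


@dataclass
class UsageRow:
    """One row of the usage and cost log (hypothetical structure)."""
    period: str
    rate_per_hour: float
    hours: float

    @property
    def cost(self) -> float:
        # Cost for this pricing period, rounded to two decimals.
        return round(self.rate_per_hour * self.hours, 2)


def build_rows(usage):
    """Group time-ordered (period, rate_per_hour, hours) records into rows.

    Consecutive records with the same rate are merged into one row;
    a new row starts whenever the rate changes.
    """
    rows = []
    for period, rate, hours in usage:
        if rows and rows[-1].rate_per_hour == rate:
            rows[-1].hours += hours
        else:
            rows.append(UsageRow(period, rate, hours))
    return rows


# Hypothetical usage records: the hour counts are made up.
usage = [
    ("September", 1.00, 10.0),
    ("September", 1.00, 5.0),
    ("October", 1.20, 10.0),
]
rows = build_rows(usage)
for row in rows:
    print(f"{row.period}: {row.hours} h at {row.rate_per_hour:.2f}/hour = {row.cost:.2f}")
```

With these inputs the price change in October produces a second row, so the September and October costs are reported separately.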
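The deployment states described in the Deployments section can be modeled as an ordered enumeration. The sketch below is illustrative only: `DeploymentState`, `advance`, and `checked_statuses` are hypothetical names, not part of any Float16 API. It also assumes that the three checked statuses of a terminated instance are Initial, Allocate, and Running, which is one reading of the note on the Terminated state.

```python
from enum import IntEnum


class DeploymentState(IntEnum):
    """Deployment states in the order the page lists them."""
    INITIAL = 1      # checking the model and limitations
    ALLOCATE = 2     # allocating resources
    RUNNING = 3      # deployment successful, ready to use
    TERMINATED = 4   # instance shut down


def advance(state, failed=False):
    """Return the state that follows `state`.

    A failure at any point leads to TERMINATED, mirroring the automatic
    termination on failed deployments. RUNNING is kept until the user
    terminates the instance, which this sketch does not model.
    """
    if state == DeploymentState.TERMINATED or failed:
        return DeploymentState.TERMINATED
    if state == DeploymentState.RUNNING:
        return DeploymentState.RUNNING
    return DeploymentState(state + 1)


def checked_statuses(state):
    """Return the statuses displayed as checked for `state`.

    Assumption: a terminated instance still shows the three earlier
    statuses as checked.
    """
    earlier = [
        DeploymentState.INITIAL,
        DeploymentState.ALLOCATE,
        DeploymentState.RUNNING,
    ]
    if state == DeploymentState.TERMINATED:
        return earlier
    return [s for s in earlier if s <= state]


state = DeploymentState.INITIAL
state = advance(state)               # moves on to ALLOCATE
state = advance(state, failed=True)  # a failure terminates the instance
print(state.name, [s.name for s in checked_statuses(state)])
```

An `IntEnum` is used so that states can be compared by their position in the sequence.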