
API Reference

Chat Completions

POST https://api.float16.cloud/v1/chat/completions

Headers

| Name | Type | Description |
| --- | --- | --- |
| authorization* | string | Example: float16-api-123e4567-e89b-12d3-a456-426655440000 |

Request Body

| Name | Type | Description |
| --- | --- | --- |
| model* | string | Enum ("SeaLLM-7B-v2.5", "SeaLLM-7B-v3", "SQLCoder-7B-v2") |
| message* | array of object | Each object has role (required): Enum ("system", "user", "assistant") and content (required): String |
| stream | boolean | Default: false |
| max_tokens | integer or null | Maximum number of tokens to generate (integer), or null |
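As a rough illustration of how the header and body fields above fit together, the sketch below builds (but does not send) a request with Python's standard library. Several details are assumptions rather than documented behaviour: the `build_chat_request` helper is hypothetical, the `Content-Type: application/json` header is assumed, and the key is passed in `authorization` exactly as in the example value, since this page does not say whether a prefix such as `Bearer` is expected. The body uses the field name `message` as written in the table; OpenAI's own API names this field `messages`, so it may be worth checking which one the service accepts.

```python
import json
import urllib.request

API_URL = "https://api.float16.cloud/v1/chat/completions"

# Enum values copied from the request body table above.
SUPPORTED_MODELS = {"SeaLLM-7B-v2.5", "SeaLLM-7B-v3", "SQLCoder-7B-v2"}
ALLOWED_ROLES = {"system", "user", "assistant"}


def build_chat_request(api_key, model, messages, stream=False, max_tokens=None):
    """Hypothetical helper: build a POST request without sending it."""
    if model not in SUPPORTED_MODELS:
        raise ValueError(f"unsupported model: {model}")
    for item in messages:
        if item.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"invalid role: {item.get('role')}")
        if not isinstance(item.get("content"), str):
            raise ValueError("content must be a string")

    payload = {
        "model": model,
        # Field name as written in the table; OpenAI calls it "messages".
        "message": messages,
        "stream": stream,
        "max_tokens": max_tokens,  # serialised as null when None
    }
    headers = {
        # Assumption: the key is sent as-is, with no "Bearer " prefix.
        "authorization": api_key,
        # Assumption: the endpoint expects a JSON body.
        "Content-Type": "application/json",
    }
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")


request = build_chat_request(
    api_key="float16-api-123e4567-e89b-12d3-a456-426655440000",
    model="SeaLLM-7B-v3",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello"},
    ],
    max_tokens=256,
)
```

Passing `request` to `urllib.request.urlopen` would send it; that step is left out here so the sketch runs without a valid key or network access.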

Response

{
  "id": "string",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "string",
        "role": "assistant"
      },
      "text": "string"
    }
  ],
  "created": 0,
  "model": "string",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 0,
    "prompt_tokens": 0,
    "total_tokens": 0
  }
}

Error Response

{
  "detail": [
    {
      "loc": [
        "string"
      ],
      "msg": "string",
      "type": "string"
    }
  ]
}
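The two response shapes above can be told apart by their top-level keys. The following is a minimal sketch, not the service's own client: `parse_chat_response` is a hypothetical helper, and both sample payloads are invented for illustration, following only the field layout shown on this page. It assumes an error body always carries a `detail` list and a successful body always carries at least one entry in `choices`.

```python
import json


def parse_chat_response(raw):
    """Hypothetical helper: return (assistant_text, total_tokens) or raise ValueError."""
    data = json.loads(raw)

    # Error shape: {"detail": [{"loc": [...], "msg": "...", "type": "..."}]}
    if "detail" in data:
        problems = []
        for item in data["detail"]:
            location = ".".join(str(part) for part in item["loc"])
            problems.append(f"{location}: {item['msg']} ({item['type']})")
        raise ValueError("; ".join(problems))

    # Success shape: take the first choice and the usage counters.
    first_choice = data["choices"][0]
    return first_choice["message"]["content"], data["usage"]["total_tokens"]


# Invented sample payloads that mirror the documented field layout.
sample_success = json.dumps({
    "id": "example-id",
    "choices": [{
        "finish_reason": "stop",
        "index": 0,
        "message": {"content": "Hello!", "role": "assistant"},
    }],
    "created": 0,
    "model": "SeaLLM-7B-v3",
    "object": "chat.completion",
    "usage": {"completion_tokens": 2, "prompt_tokens": 5, "total_tokens": 7},
})
sample_error = json.dumps({
    "detail": [{"loc": ["body", "model"], "msg": "example message", "type": "example_type"}]
})

text, total_tokens = parse_chat_response(sample_success)
```

Raising an exception for the error shape keeps the happy path free of status checks; a caller that prefers to inspect errors could return the `detail` list instead.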

