
Endpoint Specification

Endpoints

Available endpoints

/{dedicate}/v1/chat/completions

Method

POST

Header

Authorization: "Bearer {your-apikey}"

Request parameters

| Parameter | Description | Required |
| --- | --- | --- |
| messages | An array of message objects in the OpenAI API format; each object has a "role" ("system", "user", or "assistant") and a "content" string. | Yes |
| model | A Hugging Face model repository ID, e.g. SeaLLMs/SeaLLMs-v3-1.5B-Chat. | Yes |
| stream | Boolean. Defaults to false. | No |
| max_tokens | Integer. Defaults to 1024. | No |
| temperature | Float. Defaults to 0.7. | No |
| repetition_penalty | Float. Defaults to 1.0. | No |
| end_id | Integer. Defaults to the eos_token_id from the model repository's config.json. | No |
| top_p | Float. Defaults to 0.7. | No |
| top_k | Integer. Defaults to 40. | No |
| stop | Array. Defaults to the eos_token from the model repository's config.json. | No |
| random_sed | Integer. Defaults to 2. | No |
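A minimal request sketch in Python using the requests library. The host name, the {dedicate} path segment, and the API key are placeholders rather than real values; take the actual values from your instance detail page.

```python
import requests

# Placeholder values: read the real host, the {dedicate} path segment,
# and the API key from your instance detail page.
BASE_URL = "https://<your-instance-host>"
DEDICATE = "<dedicate>"
API_KEY = "<your-apikey>"

response = requests.post(
    f"{BASE_URL}/{DEDICATE}/v1/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "SeaLLMs/SeaLLMs-v3-1.5B-Chat",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Hello!"},
        ],
        # Optional parameters shown with their documented defaults.
        "stream": False,
        "max_tokens": 1024,
        "temperature": 0.7,
    },
    timeout=60,
)
response.raise_for_status()
print(response.json())
```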

/{dedicate}/v1/completions

Method

POST

Header

Authorization: "Bearer {your-apikey}"

Request parameters

| Parameter | Description | Required |
| --- | --- | --- |
| prompt | Raw text, sent as-is; no chat template is applied. Use a prompt when working with a base model or a coding model. When a coding model is used via the Continue.dev extension, the prompt is passed to this endpoint automatically. | Yes |
| model | A Hugging Face model repository ID, e.g. SeaLLMs/SeaLLMs-v3-1.5B-Chat. | Yes |
| stream | Boolean. Defaults to false. | No |
| max_tokens | Integer. Defaults to 1024. | No |
| temperature | Float. Defaults to 0.7. | No |
| repetition_penalty | Float. Defaults to 1.0. | No |
| end_id | Integer. Defaults to the eos_token_id from the model repository's config.json. | No |
| top_p | Float. Defaults to 0.7. | No |
| top_k | Integer. Defaults to 40. | No |
| stop | Array. Defaults to the eos_token from the model repository's config.json. | No |
| random_sed | Integer. Defaults to 2. | No |
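As above, a minimal sketch for the raw-completion endpoint; all connection values are placeholders. Note that the prompt goes to the model verbatim, which is what base models and coding models expect.

```python
import requests

# Placeholder values, as in the chat example above.
BASE_URL = "https://<your-instance-host>"
DEDICATE = "<dedicate>"
API_KEY = "<your-apikey>"

response = requests.post(
    f"{BASE_URL}/{DEDICATE}/v1/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "SeaLLMs/SeaLLMs-v3-1.5B-Chat",
        # The prompt is sent as-is: no chat template is applied.
        "prompt": "def fibonacci(n):",
        "max_tokens": 256,
        "stop": ["\n\n"],  # overrides the default eos_token stop
    },
    timeout=60,
)
response.raise_for_status()
print(response.json())
```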
