Quick Start

LLM as a service quick start


Last updated 8 months ago

Setting up API Key

After accessing LLM as a service, you need to set up an API key. Learn how to set your API key on the Set the credentials page.
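A common pattern is to keep the key out of source code by exporting it as an environment variable. A minimal sketch (the variable name `FLOAT16_API_KEY` is illustrative, not a Float16 requirement):

```shell
# Illustrative only: export your key for the current shell session.
export FLOAT16_API_KEY="<your API key>"

# Confirm it is set:
echo "$FLOAT16_API_KEY"
```

Your scripts can then read the key from the environment instead of hard-coding it.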

Quickly test API

To quickly test the API with cURL, run the following command:

curl -X POST https://api.float16.cloud/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <float16-api-key>" \
  -d '{
    "model": "seallm-7b-v3",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "สวัสดี"
      }
    ]
  }'

Paste this in your terminal to see the response.
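The same request can be issued from Python with only the standard library. A minimal sketch mirroring the cURL body above (the helper names are illustrative; the network call is commented out because it needs a valid key):

```python
import json
import urllib.request

ENDPOINT = "https://api.float16.cloud/v1/chat/completions"

def build_payload(user_message):
    # Mirrors the JSON body of the cURL example above.
    return {
        "model": "seallm-7b-v3",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_message},
        ],
    }

def chat(api_key, user_message):
    request = urllib.request.Request(
        ENDPOINT,
        data=json.dumps(build_payload(user_message)).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())

# Usage (requires a valid key):
# reply = chat("<float16-api-key>", "สวัสดี")
# print(reply["choices"][0]["message"]["content"])
```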

Using the Chat API

Our API is OpenAI-compatible, so you can integrate it into your chat UI with the OpenAI or LangChain libraries.

OpenAI

  1. Install the OpenAI package:

pip install openai

  2. Use this Python code snippet (example using the seallm-7b-v3 model):

import openai

FLOAT16_BASE_URL = "https://api.float16.cloud/v1/"
FLOAT16_API_KEY = "<your API key>"

# base_url points the OpenAI client at the Float16 endpoint.
client = openai.OpenAI(
    api_key=FLOAT16_API_KEY,
    base_url=FLOAT16_BASE_URL,
)

# Streaming chat:
messages = [{"role": "system", "content": "You are truly awesome."}]

while True:
    content = input("User: ")
    messages.append({"role": "user", "content": content})
    print("Assistant: ", end="", flush=True)
    content = ""

    for chunk in client.chat.completions.create(
        messages=messages,
        model="seallm-7b-v3",
        stream=True,
    ):
        delta_content = chunk.choices[0].delta.content
        if delta_content:
            print(delta_content, sep="", end="", flush=True)
            content += delta_content
    
    messages.append({"role": "assistant", "content": content})
    print("\n")
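The streaming loop above concatenates each chunk's `delta.content` into the full assistant reply. That accumulation step can be exercised locally with stand-in chunk objects (the `SimpleNamespace` stand-ins below only mimic the shape of streamed chunks; they are not part of the OpenAI SDK):

```python
from types import SimpleNamespace

def accumulate_stream(chunks):
    """Concatenate the delta content of each streamed chunk,
    skipping empty/None deltas, as the chat loop above does."""
    content = ""
    for chunk in chunks:
        delta_content = chunk.choices[0].delta.content
        if delta_content:
            content += delta_content
    return content

def fake_chunk(text):
    # Stand-in mimicking the shape of a streamed chat.completion chunk.
    return SimpleNamespace(
        choices=[SimpleNamespace(delta=SimpleNamespace(content=text))]
    )

chunks = [fake_chunk("Hel"), fake_chunk(None), fake_chunk("lo!")]
print(accumulate_stream(chunks))  # → Hello!
```

Skipping `None` deltas matters because the final chunk of a stream typically carries no content.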

LangChain

To use Float16.cloud with LangChain, follow these steps:

  1. Install the LangChain packages:

pip install langchain langchain_community

or

conda install langchain langchain_community -c conda-forge

  2. Use this Python code snippet (example using the seallm-7b-v3 model):

from langchain_community.chat_models import ChatOpenAI
from langchain.schema import HumanMessage

FLOAT16_BASE_URL = "https://api.float16.cloud/v1/"
FLOAT16_API_KEY = "<your API key>"

chat = ChatOpenAI(
    model="seallm-7b-v3",
    api_key=FLOAT16_API_KEY,
    base_url=FLOAT16_BASE_URL,
    streaming=True,
)

# Simple invocation:
print(chat.invoke([HumanMessage(content="Hello")]))

# Streaming invocation:
for chunk in chat.stream("Write me a blog about how to start to raise cats"):
    print(chunk.content, end="", flush=True)

For further assistance:

For more information on the OpenAI library, visit the OpenAI docs.

For more information on the LangChain library, visit the LangChain docs.

If you need additional help, feel free to contact us at support@float16.cloud.