OpenAI with Rate Limit


Use Case Overview

We will refer to the basics demonstrated in the demo created by Vulture Prime: a website for generating API keys for users who want to connect to Vulture Prime's OpenAI service. Users can generate an API key through the website automatically, without admin approval, and can choose the desired rate limit per day.

Note: In cases where there is a user dashboard, users need to log in before using the system.

User Flow

To simplify management, we created a UI page that lets users obtain an API key without contacting the admin. With a few clicks on the website, users can get an OpenAI API key and select the desired rate limit per day.

Frontend Development

Implement the Chat interface

In the process of creating the UI, we chose libraries and frameworks as follows:

  1. Next.js: Start a new project using the command: npx create-next-app my-project.

npx create-next-app my-project
cd my-project
  2. React Hook Form: Install with: yarn add react-hook-form @hookform/resolvers zod. We use Zod for input form validation.

yarn add react-hook-form @hookform/resolvers zod
  3. TanStack Query: Install TanStack Query (React Query) in the project: yarn add react-query.

yarn add react-query

Connect the frontend with the backend

Now, we will implement the Chat Interface with a Streamed Text approach to receive real-time text data from the API and display it.

useEffect(() => {
  const message = { query: watch('query') };
  const getData = async () => {
    try {
      setValue('query', '');
      const response = await fetch(
        `${API_BOT}/query?uuid=${localStorage?.session}&message=${message.query}`,
        {
          method: 'GET',
          headers: {
            Accept: 'text/event-stream',
            'x-api-key': localStorage?.apiKey, // API key
          },
        }
      );
      const reader = response.body!.getReader();
      let result = '';
      while (true) {
        const { done, value } = await reader?.read();
        if (done) {
          setStreamText('');
          setAnswer((prevState) => [
            ...prevState,
            {
              id: (prevState.length + 1).toString(),
              role: 'ai',
              message: result,
            },
          ]);
          break;
        }
        // Decode the streamed chunk once, then append it to the full result and the live text
        const chunk = new TextDecoder().decode(value);
        result += chunk;
        setStreamText((prevData) => prevData + chunk);
      }
    } catch (error: any) {
      console.error(error);
      setError('bot', {
        message: error?.response?.data?.message ?? 'Something went wrong',
      });
    }
  };
  if (isSubmitSuccessful) {
    getData();
  }
}, [submitCount, isSubmitSuccessful, setValue, watch, setError]);

This code snippet fetches data from the API using the Streamed Text approach, allowing real-time updates of chat messages.

Implementing API key generation

After setting up the UI interface, we connect it to the backend for API key generation.

// API Plan
// GET {endpoint}/plan
[
  {
    "name": "20RequestPerDay"
  },
  {
    "name": "30RequestPerDay"
  },
  {
    "name": "10RequestPerDay"
  }
]

// API Create Key
// POST {endpoint}/create_key
// Body
{
  "plan_name": "30RequestPerDay",
  "user": "example@email.com"
}

// Response 200 (OK)
{
  "value": "7wuOmY7Osz721X1bQUkAP2aGUF5oOVw28EJx7MVS"
}

After obtaining the API key, we use it in other projects by adding the header:

{
  "x-api-key": "7wuOmY7Osz721X1bQUkAP2aGUF5oOVw28EJx7MVS"
}

Implementing rate limit notification

We handle both input validation errors and errors from API usage in the stream. When usage exceeds the limit, the API returns status code 429, and we set an error message to notify the user.

if (response.status === 429) {
  setError('bot', {
    message: 'Limit Exceeded',
  });
  return;
}

This ensures that the UI displays an error message when the user exceeds the API usage limit.

Backend Development

Configuring Usage Plans

Begin by creating the usage plans that you want. Create three plans: 10RequestPerDay, 20RequestPerDay, and 30RequestPerDay. Map these plans to the API Gateway you created, indicating which plans should be used by which API Gateway.
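
The plans can be created in the API Gateway console, or programmatically. Below is a minimal boto3 sketch for one plan; the restApiId, stage name, and quota values are placeholders to replace with your own:

import boto3

client = boto3.client('apigateway', region_name='ap-southeast-1')

# Placeholder values: use your own API Gateway ID and deployed stage name
rest_api_id = 'abc123'
stage_name = 'prod'

# Create a usage plan limited to 10 requests per day and map it to the API stage
plan = client.create_usage_plan(
    name='10RequestPerDay',
    description='10 requests per day',
    apiStages=[{'apiId': rest_api_id, 'stage': stage_name}],
    quota={'limit': 10, 'period': 'DAY'}
)
print(plan['id'])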

Configuring API Gateway to Use API Key

Configure each method in API Gateway to require an API key. Edit the method request in the desired resource and check the "API Key required" option.
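
In the console this is the "API Key required" checkbox on the method request. As a sketch, the same setting can be applied with boto3 (the restApiId, resourceId, and HTTP method below are placeholders):

import boto3

client = boto3.client('apigateway', region_name='ap-southeast-1')

# Mark an existing method as requiring an API key (equivalent to ticking
# "API Key required" in the console). IDs below are placeholders.
client.update_method(
    restApiId='abc123',
    resourceId='def456',
    httpMethod='GET',
    patchOperations=[
        {'op': 'replace', 'path': '/apiKeyRequired', 'value': 'true'}
    ]
)

Note that changes to method settings only take effect after the API is redeployed to its stage.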

Setting up FastAPI Backend

Install the necessary Python libraries for FastAPI:

pip install fastapi
pip install "uvicorn[standard]"

Create a file named app.py and initialize the FastAPI application:

from fastapi import FastAPI
from fastapi.encoders import jsonable_encoder
from fastapi.responses import JSONResponse
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()

app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

@app.get("/helloworld")
async def helloworld():
    return {"message": "Hello World"}

This code initializes a FastAPI project, enables CORS for frontend communication, and sets up a simple endpoint. Run the server using:

uvicorn app:app

Connecting to AWS with boto3

Use the boto3 library to interact with AWS services. Make sure that the EC2 instance has sufficient permissions for the intended actions.

pip install boto3

Create a boto3 client for API Gateway:

import boto3

client = boto3.client(
    'apigateway',
    region_name='ap-southeast-1'
)

API Design and Deployment

Design

Create two API endpoints:

  1. To view available usage plans:

@app.get("/plan")
async def get_plan():
    # Return the names of all usage plans configured in API Gateway
    res = client.get_usage_plans()
    plan_name_list = []
    for i in res['items']:
        plan_name_list.append({
            "name": i['name']
        })
    return JSONResponse(content=plan_name_list, status_code=200)
  2. To create an API key and associate it with a usage plan:

from fastapi import HTTPException
from pydantic import BaseModel

# Request body for /create_key (see the API specification above)
class CreateKeyRequest(BaseModel):
    plan_name: str
    user: str

@app.post("/create_key")
async def create_key(create_request: CreateKeyRequest):
    # `module` holds the project's helper functions for working with usage plans
    res = module.check_user_exist_in_plan(create_request.user, create_request.plan_name)
    if res:
        raise HTTPException(status_code=409, detail="Key already exists")
    else:
        # Create a new API key in API Gateway, named after the user's email
        response = client.create_api_key(
            name=create_request.user,
            description='for use api',
            enabled=True,
            generateDistinctId=True
        )
        api_key = response['value']
        api_id = response['id']
        try:
            # Attach the new key to the requested usage plan
            module.add_key_to_plan(api_id, create_request.plan_name)
            result = {
                "value": api_key
            }
            return JSONResponse(content=result, status_code=200)
        except Exception as e:
            # Roll back: delete the key if it could not be attached to the plan
            client.delete_api_key(
                apiKey=api_id
            )
            result = {
                "message": "create failed"
            }
            return JSONResponse(content=result, status_code=500)

This endpoint creates an API key, associates it with the specified usage plan, and returns the generated API key.
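
The module helpers used above (check_user_exist_in_plan and add_key_to_plan) are not shown in the snippet. Below is a minimal sketch of how they could be implemented with the same boto3 API Gateway client; the lookup logic is an assumption:

import boto3

client = boto3.client('apigateway', region_name='ap-southeast-1')

def get_usage_plan_id(plan_name):
    # Look up a usage plan ID by its name
    for plan in client.get_usage_plans()['items']:
        if plan['name'] == plan_name:
            return plan['id']
    raise ValueError(f"Usage plan {plan_name} not found")

def check_user_exist_in_plan(user, plan_name):
    # True if an API key named after the user is already attached to the plan
    plan_id = get_usage_plan_id(plan_name)
    keys = client.get_usage_plan_keys(usagePlanId=plan_id, nameQuery=user)
    return len(keys['items']) > 0

def add_key_to_plan(api_key_id, plan_name):
    # Attach a newly created API key to the usage plan
    plan_id = get_usage_plan_id(plan_name)
    client.create_usage_plan_key(
        usagePlanId=plan_id,
        keyId=api_key_id,
        keyType='API_KEY'
    )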

Deploy

Use the uvicorn command to run the FastAPI server:

uvicorn app:app --host 0.0.0.0

This runs the server on the default port (8000); the --host 0.0.0.0 flag makes it reachable from outside the instance.

Testing API Calls with API Key

To test API calls with an API key through AWS API Gateway, include the key in the header as x-api-key:

{
  "x-api-key": "7wuOmY7Osz721X1bQUkAP2aGUFxxxxxxxx"
}

API Gateway reads this key, checks it against the associated usage plan, and forwards the request to the backend you configured earlier.
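
For example, a quick check from Python; the invoke URL below is a placeholder for your own API Gateway stage URL, and a 429 response means the daily quota of the selected usage plan has been used up:

import requests

# Placeholder invoke URL; replace with your API Gateway stage URL
url = 'https://abc123.execute-api.ap-southeast-1.amazonaws.com/prod/helloworld'

response = requests.get(
    url,
    headers={'x-api-key': '7wuOmY7Osz721X1bQUkAP2aGUFxxxxxxxx'}
)

if response.status_code == 429:
    print('Limit Exceeded')  # daily quota of the usage plan reached
else:
    print(response.status_code, response.text)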
