Multiple Agents

Use Case Overview

We have developed a Chatbot specialized in website development. The system uses a first AI as a task divider for the different agents (Frontend, Backend, and Designer), each of which answers questions in its own domain. An additional AI is responsible for summarizing the responses from all agents.

When a user asks a question, such as "create blog," the response includes information from Frontend, Backend, and Designer, presented in clear sections for easy comprehension.

User Flow

The user submits a question, the Manager Agent breaks it into tasks, the Task Processing Agents (Frontend, Backend, and Designer) each answer their assigned task, and the Manager Agent summarizes the combined answers into the final response.

AI

For the Large Language Model (LLM), we continue to use OpenAI models to create the Chatbot. However, the system is now separated into multiple agents, each with a distinct role.

Manager Agent

The first AI is the Manager, which has two main responsibilities. First, it breaks the user's prompt down and assigns a suitable task to each agent, which lets us handle user requests more effectively. Second, it collects the answers from the Task Processing Agents, summarizes and organizes them, and presents the final response to the user.

Task Processing Agent

The Task Processing segment, the part of the system responsible for generating responses, is divided into three agents:

  • Frontend: Generates responses for questions related to Frontend.

  • Backend: Handles responses in the Backend and Technical domain.

  • Designer: Generates responses related to UI and UX design.

These agents work on the tasks assigned by the Manager Agent. Once their tasks are complete, they send their responses back to the Manager Agent for consolidation.

API

For the API, we continue to use FastAPI as the framework. The demo API includes the following:

/query

We have created an API that combines the usage of all agents in a single call. The frontend calls this API once when the user asks a question; the backend then generates responses through the various agents and returns the result for display in the chat, without the need for multiple API calls.

Internally, this API performs the same steps as the following routes:

  • /breakdown: Used to break down questions into tasks for forwarding to specific agents.

  • /build: Used to generate responses.

  • /conclude: Used to collect responses and summarize them for presentation to the user.

/queryWithoutChain

For users who do not want to use multiple agents, we provide an API that connects directly to OpenAI.

Frontend Development

Implementing Input Box for User Prompts

To implement the input, we use the react-hook-form library along with @hookform/resolvers/zod and zod. First, we create a schema comprising a query field for the user's input and a bot field used to display error messages returned from the bot.

export const askScheme = z.object({
  query: z.string().trim().min(1, { message: 'Please enter your message' }),
  bot: z.string().optional(),
});

Once the schema is created, we generate a type interface for this form.

export interface IOpenAIForm extends z.infer<typeof askScheme> {}

Next, we create methods:

const methods = useForm<IOpenAIForm>({
    resolver: zodResolver(askScheme),
    shouldFocusError: true,
    defaultValues: {
      query: '',
    },
  });

const { handleSubmit, setError, setValue } = methods;

const onSubmit = async (data: IOpenAIForm) => {
  // ...TO DO Something
};

return (
  <FormProvider methods={methods} onSubmit={handleSubmit(onSubmit)}>
    {/* ... */}
  </FormProvider>
);
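
Note that FormProvider here is not react-hook-form's built-in provider used directly; it is assumed to be a small wrapper component that also renders the form element. A minimal sketch of such a wrapper might look like this:

// Hypothetical FormProvider wrapper around react-hook-form's provider
import { ReactNode } from 'react';
import { FormProvider as RHFProvider, UseFormReturn } from 'react-hook-form';

interface IFormProviderProps {
  methods: UseFormReturn<any>;
  onSubmit?: VoidFunction;
  children: ReactNode;
}

const FormProvider = ({ methods, onSubmit, children }: IFormProviderProps) => (
  <RHFProvider {...methods}>
    <form onSubmit={onSubmit}>{children}</form>
  </RHFProvider>
);

export default FormProvider;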

The schema validates user input, and if the user submits without typing anything, an error message will be displayed in the user interface: 'Please enter your message.' The input component uses the 'react-hook-form' library along with useFormContext and Controller to manage input.

import { InputHTMLAttributes } from 'react';
import { useFormContext, Controller } from 'react-hook-form';
import { twMerge } from 'tailwind-merge';

interface IProps extends InputHTMLAttributes<HTMLInputElement> {
  name: string;
  helperText?: string;
  label?: string;
}

const RHFTextField = ({
  name,
  helperText,
  label,
  className,
  ...other
}: IProps) => {
  const { control } = useFormContext();

  return (
    <Controller
      name={name}
      control={control}
      render={({ field, fieldState: { error } }) => (
        <div className='w-full flex flex-col gap-y-2 '>
          {label && <label className='text-sm font-semibold'>{label}</label>}
          <input
            {...field}
            {...other}
            className={twMerge(
              'outline-none w-full  border border-gray-200 rounded-lg px-2 py-1',
              className
            )}
          />
          {(!!error || helperText) && (
            <div className={twMerge(error?.message && 'text-rose-500 text-sm')}>
              {error?.message || helperText}
            </div>
          )}
        </div>
      )}
    />
  );
};

export default RHFTextField;

When using the input, set the name attribute to 'query' to match the variable declared in the schema.

const methods = useForm<IOpenAIForm>({
    resolver: zodResolver(askScheme),
    shouldFocusError: true,
    defaultValues: {
      query: '',
    },
  });

const {
  handleSubmit,
  setError,
  setValue,
  formState: { isSubmitting },
} = methods;

const onSubmit = async (data: IOpenAIForm) => {
  console.log(data);
};

return (
  <FormProvider methods={methods} onSubmit={handleSubmit(onSubmit)}>
    {/* ... */}
    <RHFTextField
      type='text'
      placeholder='What do you need? ...'
      className='outline-none w-full border-none'
      name='query'
    />
    <button
      type='submit'
      disabled={isSubmitting}
      className='text-gray-400 disabled:text-gray-200'
    >
      Submit
    </button>
  </FormProvider>
);

When the submit button is pressed and the input passes validation, the data logged from the onSubmit function appears in the console. This data can then be sent to the backend for further processing.

Handle API Integration for Chained-LLM Agents & Display Output from Each API

To connect to the backend, we manage various states such as loading and error, and we set a default base URL for the API.

axios.defaults.baseURL = process.env.NEXT_PUBLIC_API;
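
NEXT_PUBLIC_API should point at the FastAPI server. For local development it might be set in .env.local like this (the file name and value are assumptions; adjust them to wherever the API is hosted):

# .env.local (assumed) - points the frontend at the local FastAPI server
NEXT_PUBLIC_API=http://localhost:8000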

After setting the base URL, we connect to the API.

const onSubmit = async (data: IOpenAIForm) => {
  try {
    const { data: result } = await axios.post(
      `/query?question=${encodeURIComponent(data.query)}`,
      undefined
    );
    console.log(result); // Value obtained from the API
    // TO DO Something
  } catch (error) {
    const err = error as AxiosError<{ detail: string }>;
    setError('bot', {
      message: err?.response?.data?.detail ?? 'Something went wrong',
    });
  }
};

We use useFormContext from 'react-hook-form' to access the loading and error states.

const {
  formState: { isSubmitting, errors },
} = useFormContext();

If an API error occurs, we display the error message from the bot; while a request is in flight, we show a loading state. We use the state from useFormContext to read values such as isSubmitting and errors.

export default function ChatWidget({ answer }: { answer: ChatProps[] }) {
  const chatWindowRef = useRef<HTMLDivElement>(null);
  const {
    formState: { isSubmitting, errors },
  } = useFormContext();

  return (
    <div className='h-full flex flex-col w-full'>
      <Header />
      <ChatWindow
        messages={answer}
        isLoading={isSubmitting}
        error={errors?.bot?.message as string}
        chatWindowRef={chatWindowRef}
      />
      <ChatInput isLoading={isSubmitting} />
    </div>
  );
}
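
The ChatProps type used for the chat messages is not shown in these snippets; a minimal definition consistent with how messages are created and rendered (an assumption, not the original source) could be:

// Hypothetical ChatProps type, matching the Message shape used by ChatWindow
export interface ChatProps {
  id: string;
  role: 'user' | 'ai';
  message: string; // content rendered in the chat window
  raw: string; // raw text passed to CopyClipboard
}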

In the ChatWindow component, we manage various states to display in the UI, including the results from the API.

import { RefObject, useEffect } from 'react'
import Image from 'next/image'
import { CopyClipboard } from './CopyClipboard'

interface Message {
  role: 'user' | 'ai'
  message: string
  id: string
  raw: string
}

interface ChatWindowProps {
  messages: Message[]
  isLoading?: boolean
  error?: string
  chatWindowRef: RefObject<HTMLDivElement> | null
}

export const ChatWindow: React.FC<ChatWindowProps> = ({
  messages,
  isLoading,
  error,
  chatWindowRef,
}) => {
  useEffect(() => {
    if (
      chatWindowRef !== null &&
      chatWindowRef?.current &&
      messages.length > 0
    ) {
      chatWindowRef.current.scrollTop = chatWindowRef.current.scrollHeight
    }
  }, [messages.length, chatWindowRef])

  return (
    <div
      ref={chatWindowRef}
      className='flex-1 overflow-y-auto p-4 space-y-8'
      id='chatWindow'
    >
      {messages.map((item, index) => (
        <div key={item.id} className='w-full'>
          {item.role === 'user' ? (
            <div className='flex gap-x-8 '>
              <div className='min-w-[48px] min-h-[48px]'>
                <Image
                  src='/img/chicken.png'
                  width={48}
                  height={48}
                  alt='user'
                />
              </div>
              <div>
                <p className='font-bold'>User</p>
                <p>{item.message}</p>
              </div>
            </div>
          ) : (
            <div className='flex gap-x-8 w-full'>
              <div className='min-w-[48px] min-h-[48px]'>
                <Image
                  src='/img/robot.png'
                  width={48}
                  height={48}
                  alt='robot'
                />
              </div>
              <div className='w-full'>
                <div className='flex justify-between mb-1 w-full '>
                  <p className='font-bold'>Ai</p>
                  <div />
                  <CopyClipboard content={item.raw} />
                </div>

                <div
                  className='prose whitespace-pre-line'
                  dangerouslySetInnerHTML={{ __html: item.message }}
                />
              </div>
            </div>
          )}
        </div>
      ))}
      {isLoading && (
        <div className='flex gap-x-8 w-full mx-auto'>
          <div className='min-w-[48px] min-h-[48px]'>
            <Image src='/img/robot.png' width={48} height={48} alt='robot' />
          </div>
          <div>
            <p className='font-bold'>Ai</p>

            <div className='mt-4 flex space-x-2 items-center '>
              <p>Hang on a second </p>
              <span className='sr-only'>Loading...</span>
              <div className='h-2 w-2 bg-blue-600 rounded-full animate-bounce [animation-delay:-0.3s]'></div>
              <div className='h-2 w-2 bg-blue-600 rounded-full animate-bounce [animation-delay:-0.15s]'></div>
              <div className='h-2 w-2 bg-blue-600 rounded-full animate-bounce'></div>
            </div>
          </div>
        </div>
      )}
      {error && (
        <div className='flex gap-x-8 w-full mx-auto'>
          <div className='min-w-[48px] min-h-[48px]'>
            <Image src='/img/error.png' width={48} height={48} alt='error' />
          </div>
          <div>
            <p className='font-bold'>Ai</p>
            <p className='text-rose-500'>{error}</p>
          </div>
        </div>
      )}
    </div>
  )
}

Implement Process Continuation Functionality

After successfully connecting with the backend, we need to store the result message obtained from the API. We create a state to store the user and bot answers.

export default function ChatBotDemo() {
  const [answer, setAnswer] = useState<ChatProps[]>([])

  const methods = useForm<IOpenAIForm>({
    resolver: zodResolver(askScheme),
    shouldFocusError: true,
    defaultValues: {
      query: '',
    },
  })

  const { handleSubmit, setError, setValue } = methods

  const onSubmit = async (data: IOpenAIForm) => {
    try {
      const id = answer.length
      setAnswer((prevState) => [
        ...prevState,
        {
          id: id.toString(),
          role: 'user',
          message: data.query,
          raw: '',
        },
      ])
      setValue('query', '')
      const { data: result } = await axios.post(
        `/query?question=${encodeURIComponent(data.query)}`,
        undefined
      )
      setAnswer((prevState) => [
        ...prevState,
        {
          id: prevState.length.toString(), // next index, since the user message was already added
          role: 'ai',
          message: result.raw,
          raw: result.json.customer_need,
        },
      ])
    } catch (error) {
      const err = error as AxiosError<{ detail: string }>
      setError('bot', {
        message: err?.response?.data?.detail ?? 'Something went wrong',
      })
    }
  }

  return (
    <FormProvider methods={methods} onSubmit={handleSubmit(onSubmit)}>
      <div className='flex justify-center flex-col items-center bg-white mx-auto max-w-7xl h-screen '>
        <ChatWidget answer={answer} />
      </div>
    </FormProvider>
  )
}

Now, when we ask the AI, for example, to create a blog, the LLM agents respond with tasks for each role, such as Designer, Frontend, and Backend.

Link to Frontend Code

Backend Development

Setting Up a FastAPI Project

As in the previous examples, we use FastAPI as the framework for creating and serving our API.

Start by installing the necessary Python libraries for FastAPI:

pip install fastapi
pip install "uvicorn[standard]"

Create a file named app.py and initialize FastAPI:

from fastapi import FastAPI
from fastapi.encoders import jsonable_encoder
from fastapi.responses import JSONResponse
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()

app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

@app.get("/helloworld")
async def helloworld():
    return {"message": "Hello World"}

In this example code, we initialize a FastAPI project and enable CORS for connecting with the frontend. To run the server, use the following command:

uvicorn app:app

This command runs the FastAPI app declared in app.py; the server listens on the default port 8000.

Now, let's create another file named LocalTemplate.py to store the initial prompt templates for asking questions to the Chatbot (a sketch of this file follows the list below). These templates include:

  • Manager template: Breaks an incoming question into tasks for the individual agents.

  • Agent templates: One each for the Frontend, Backend, and Designer agents; each answers the part of the question related to its role.

  • Conclusion template: Summarizes all of the received answers into a concise overview.
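
The exact prompt wording in LocalTemplate.py is up to you; the sketch below is an assumption, and only the tag structure (<question>, <role>, <sub-question>, <step>, <task>, <description>, <conclude>, <text>) matches what the route handlers in the next sections parse.

# LocalTemplate.py - illustrative prompt templates (the wording is an assumption;
# only the tag structure matches what the API code below expects)

def get_manager():
    return (
        "You are the manager of a website development team (Frontend, Backend, Designer). "
        "Break the customer's request into one task per role, and answer with one block per task "
        "in the form <question><role>ROLE</role><sub-question>TASK</sub-question></question>.\n"
        "Customer request: {question}"
    )

def _agent_template(role_description):
    return (
        "You are a " + role_description + ". Complete the task below and answer as a list of steps, "
        "each in the form <step><task>TITLE</task><description>DETAILS</description></step>.\n"
        "Task: {task}"
    )

def get_frontend():
    return _agent_template("Frontend developer")

def get_backend():
    return _agent_template("Backend developer")

def get_designer():
    return _agent_template("UI/UX designer")

def get_conclusion():
    return (
        "Summarize the plan below for the customer in a short, clear overview, "
        "wrapped in <conclude><text>YOUR SUMMARY</text></conclude> tags.\n"
        "Customer need: {customer_need}\n"
        "Frontend tasks: {frontend_task}\n"
        "Backend tasks: {backend_task}\n"
        "Designer tasks: {designer_task}"
    )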

For the API design, we divide the API into five routes for the different functionalities, each explained in the sections that follow.

Developing the Manager Agent API

The first API we create is POST: /breakdown, which handles the breakdown of a customer's need into tasks for each agent:

@app.post('/breakdown')
def breakdown_question(customer_need: str):
    # ChatOpenAI / ChatPromptTemplate (LangChain), OPENAI_API_KEY, LocalTemplate, and the
    # remove_tag helper are assumed to be imported/defined at the top of app.py.
    model_256 = ChatOpenAI(model_name="gpt-4-1106-preview", temperature=0.3, max_tokens=256, openai_api_key=OPENAI_API_KEY)
    breakdown_chain = ChatPromptTemplate.from_template(LocalTemplate.get_manager()) | model_256
    result = breakdown_chain.invoke({"question": customer_need})

    # Each '<question>' block contains a '<role>' and a '<sub-question>' (the task for that role)
    arr = result.content.split('<question>')[1:]
    task_list = []
    for i in arr:
        full_task = remove_tag(i, ['<question>', '<role>', '</question>', '</role>', '</sub-question>']).strip()
        full_task_list = full_task.split('<sub-question>')
        role = full_task_list[0]
        task = full_task_list[1]
        task_list.append({'role': role, 'task': task})

    json_compatible_item_data = jsonable_encoder(task_list)
    return JSONResponse(content=json_compatible_item_data)

This API takes a customer's need as input and processes it to create tasks for each agent. The resulting tasks are then returned as a JSON response.
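
The remove_tag and text_to_json helpers used by these route handlers are not shown in the original code; based on how they are called, a reasonable sketch (an assumption, not the authors' exact implementation) is:

# Hypothetical helper functions, inferred from how the route handlers call them
def remove_tag(text: str, tags: list) -> str:
    """Strip every tag in `tags` from `text`."""
    for tag in tags:
        text = text.replace(tag, '')
    return text

def text_to_json(steps: list, split_tag: str, tags: list) -> list:
    """Turn '<task>...<description>...' fragments into [{'task': ..., 'description': ...}]."""
    result = []
    for step in steps:
        cleaned = remove_tag(step, tags).strip()
        parts = cleaned.split(split_tag)
        result.append({
            'task': parts[0].strip(),
            'description': parts[1].strip() if len(parts) > 1 else ''
        })
    return result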

Building Task Processing APIs

Next, we create an API POST: /build that processes the tasks for each agent:

# A request-body model for each task is assumed here (BaseModel from pydantic,
# List from typing); the original snippet did not show it.
class TaskItem(BaseModel):
    role: str
    task: str

@app.post('/build')
def build_task(task_list: List[TaskItem]):
    model_256 = ChatOpenAI(model_name="gpt-4-1106-preview", temperature=0.3, max_tokens=256,openai_api_key = OPENAI_API_KEY)
    frontend_chain = ChatPromptTemplate.from_template(LocalTemplate.get_frontend()) | model_256 
    frontend_result = frontend_chain.invoke({"task": task_list[0].task})
    
    backend_chain = ChatPromptTemplate.from_template(LocalTemplate.get_backend()) | model_256 
    backend_result = backend_chain.invoke({"task": task_list[1].task})

    designer_chain = ChatPromptTemplate.from_template(LocalTemplate.get_designer()) | model_256 
    designer_result = designer_chain.invoke({"task": task_list[2].task})
    
    frontend_text = frontend_result.content.split('<step>')[1:]
    frontend_json = text_to_json(frontend_text,'<description>',['<task>','</task>','<step>','</step>','</description>'])
    backend_text = backend_result.content.split('<step>')[1:]
    backend_json = text_to_json(backend_text,'<description>',['<task>','</task>','<step>','</step>','</description>'])
    designer_text = designer_result.content.split('<step>')[1:]
    designer_json = text_to_json(designer_text,'<description>',['<task>','</task>','<step>','</step>','</description>'])

    result = {
        'raw' : {
            "frontend_task": frontend_result.content, 
            "backend_task": frontend_result.content, 
            "designer_task": frontend_result.content
        },
        'json' : {
            "frontend_task": frontend_json, 
            "backend_task": backend_json, 
            "designer_task": designer_json
        }
    }


    json_compatible_item_data = jsonable_encoder(result)
    return JSONResponse(content=json_compatible_item_data)

This API takes the list of tasks (in the order produced by /breakdown: Frontend, Backend, Designer) and processes them for each agent, returning both the raw and JSON formats of the generated tasks.

Creating a Manager Summary API

Now, we create an API POST: /conclude for summarizing all the received information:

@app.post('/conclude')
def build_conclusion(customer_need : str, frontend_task : str, backend_task : str, designer_task : str):
    model_512 = ChatOpenAI(model_name="gpt-4-1106-preview", temperature=0.3, max_tokens=512,openai_api_key = OPENAI_API_KEY)
    customer_chain = ChatPromptTemplate.from_template(LocalTemplate.get_conclusion()) | model_512
    customer_result = customer_chain.invoke({
        "customer_need" : customer_need,
        "frontend_task": frontend_task, 
        "backend_task": backend_task, 
        "designer_task": designer_task
    })

    customer_json = remove_tag(customer_result.content,['<conclude>','</conclude>','<text>','</text>'])
    result =  {
        'raw' : customer_result.content,
        'json' : {
            'customer_need' : customer_json
        }
    }
    json_compatible_item_data = jsonable_encoder(result)
    return JSONResponse(content=json_compatible_item_data)

This API takes the initial customer need and the tasks generated by each agent and produces a summarized conclusion in both raw and JSON formats.

Implementing a Chained-LLM Wrapper API

For a more streamlined process, we create an API POST: /query that combines all the steps:

@app.post('/query')
def query_with_chain(question : str):
    customer = question
    model_256 = ChatOpenAI(model_name="gpt-4-1106-preview", temperature=0.3, max_tokens=256,openai_api_key = OPENAI_API_KEY)
    model_512 = ChatOpenAI(model_name="gpt-4-1106-preview", temperature=0.3, max_tokens=512,openai_api_key = OPENAI_API_KEY)
    PO_Final_Chain = ChatPromptTemplate.from_template(LocalTemplate.get_manager()) | model_256
    result = PO_Final_Chain.invoke({"question": customer})
    # print(result.content)
    arr = result.content.split('<question>')[1:]
    task_list = []
    for i in arr : 
        full_task = i.replace('\n','').replace('</question>','').replace('<role>','').replace('</role>','').replace('</sub-question>','').strip()
        role = full_task.split('<sub-question>')[0].strip()
        task = full_task.split('<sub-question>')[1].strip()
        task_list.append({'role':role,'task':task})

    frontend_chain = ChatPromptTemplate.from_template(LocalTemplate.get_frontend()) | model_256 
    frontend_result = frontend_chain.invoke({"task": task_list[0]['task']})

    backend_chain = ChatPromptTemplate.from_template(LocalTemplate.get_backend()) | model_256 
    backend_result = backend_chain.invoke({"task": task_list[1]['task']})

    designer_chain = ChatPromptTemplate.from_template(LocalTemplate.get_designer()) | model_256 
    designer_result = designer_chain.invoke({"task": task_list[2]['task']})

    customer_chain = ChatPromptTemplate.from_template(LocalTemplate.get_conclusion()) | model_512
    customer_result = customer_chain.invoke({
        "customer_need" : customer,
        "frontend_task": frontend_result.content, 
        "backend_task": backend_result.content, 
        "designer_task": designer_result.content
    })

    customer_json = remove_tag(customer_result.content,['<conclude>','</conclude>','<text>','</text>'])
    result =  {
        'raw' : customer_result.content,
        'json' : {
            'customer_need' : customer_json
        }
    }

    json_compatible_item_data = jsonable_encoder(result)
    return JSONResponse(content=json_compatible_item_data)

This API takes a question, processes it through the entire chain of tasks and agents, and returns the raw and JSON format of the summarized conclusion.
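
As a quick check, the combined endpoint can be called with a single request. A minimal example, assuming the server is running locally on the default port 8000, looks like this:

# Example request to the combined endpoint (assumes a local server on port 8000)
import requests

res = requests.post("http://localhost:8000/query", params={"question": "create blog"})
data = res.json()
print(data["json"]["customer_need"])  # summarized plan without tags
print(data["raw"])                    # raw model output, including the <conclude> tags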

Direct OpenAI Prompt API Integration

To have a more direct interaction with the Chatbot, we create an API POST: /queryWithoutChain:

@app.post('/queryWithoutChain')
def query_without_chain(question : str):
    model_512 = ChatOpenAI(model_name="gpt-4-1106-preview", temperature=0.3, max_tokens=512,openai_api_key = OPENAI_API_KEY)
    chain = ChatPromptTemplate.from_template('{question}') | model_512
    customer_result = chain.invoke({"question": question})
    
    result =  {
        'raw' : customer_result.content,
        'json' : {
            'customer_need' : customer_result.content
        }
    }

    json_compatible_item_data = jsonable_encoder(result)

    return JSONResponse(content=json_compatible_item_data)

This API takes a question, sends it directly to the Chatbot without going through the task and agent chain, and returns the raw and JSON format of the Chatbot's response.

Deploying and Monitoring on EC2

For deploying FastAPI on an EC2 instance:

  1. Create a session with a name of your choice:

    screen -S name
  2. Navigate to the API folder:

    cd path/to/api
  3. Start FastAPI on port 8000:

    uvicorn app:app --host 0.0.0.0
  4. Detach from the screen session:

    Ctrl+a d

The server now keeps running in the background even after you disconnect from the instance.
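
To reattach to the session later, for example to check the logs, run:

screen -r name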

Setting Up API Gateway and Implementing CORS

To integrate the API with AWS API Gateway:

First, make sure the CORS configuration is added to FastAPI (the same middleware shown in the setup section):

from fastapi.middleware.cors import CORSMiddleware

app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

This configuration enables CORS on the FastAPI server. You can then connect the server to AWS API Gateway for better management, authentication, and rate limiting.

Link to Backend Code
