Multiple Agents
We have developed a Chatbot specialized in website development. The system is designed with a first AI acting as a task divider for the different agents: Frontend, Backend, and Designer. Each agent answers questions in its own domain, and a final AI summarizes the responses from all agents.
When a user asks a question such as "create a blog," the response combines information from Frontend, Backend, and Designer, presented in clear sections for easy comprehension.
In the realm of Large Language Models (LLMs), we continue to use OpenAI as the model for creating the Chatbot. However, we have separated it into multiple agents, each with a distinct role.
Manager Agent
The initial AI is the Manager, with two main responsibilities. First, it breaks the user's prompt into tasks and assigns a suitable task to each agent, which lets us handle user requests effectively. Second, it collects the answers, summarizes the results, and organizes the data from the Task Processing Agents before presenting the response to the user.
Task Processing Agent
The Task Processing segment, the part responsible for generating responses, is divided into three agents:
Frontend: Generates responses for questions related to Frontend.
Backend: Handles responses in the Backend and Technical domain.
Designer: Generates responses related to UI and UX design.
These agents work on the tasks assigned by the Manager Agent. Once their tasks are complete, they send the responses back to the Manager Agent for consolidation.
For the API, we continue to use FastAPI as the framework. The demo API includes the following:
/query
We have created an API that combines the usage of all agents in a single call. The frontend can call this API once when a user asks a question; responses are then generated through the various agents and displayed in the chat without the need for multiple API calls.
The APIs called within this API include:
/breakdown: Used to break down questions into tasks for forwarding to specific agents.
/build: Used to generate responses.
/conclude: Used to collect responses and summarize them for presentation to the user.
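For example, assuming the backend runs locally on FastAPI's default port 8000 (as in the deployment section below), the whole chain can be exercised with a single call:
curl -X POST "http://localhost:8000/query?question=create%20a%20blog"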
/queryWithoutChain
For users who do not want to use multiple agents, we provide an API to connect directly to OpenAI.
To implement the input, we will use the react-hook-form library along with @hookform/resolvers/zod and zod. In the first step, we create a schema comprising a query field for the input and a bot field used only to hold error messages coming from the bot itself.
import { z } from 'zod';

export const askScheme = z.object({
  query: z.string().trim().min(1, { message: 'Please enter your message' }),
  bot: z.string().optional(),
});
Once the schema is created, we generate a type interface for this form.
export interface IOpenAIForm extends z.infer<typeof askScheme> {}
Next, we create methods:
import { useForm } from 'react-hook-form';
import { zodResolver } from '@hookform/resolvers/zod';

const methods = useForm<IOpenAIForm>({
  resolver: zodResolver(askScheme),
  shouldFocusError: true,
  defaultValues: {
    query: '',
  },
});
const { handleSubmit, setError, setValue } = methods;

const onSubmit = async (data: IOpenAIForm) => {
  // TODO: handle the validated form data
};

return (
  // FormProvider is assumed to be a project wrapper that spreads `methods`
  // into react-hook-form's provider and renders a <form> with onSubmit
  <FormProvider methods={methods} onSubmit={handleSubmit(onSubmit)}>
    {/* ... */}
  </FormProvider>
);
The schema validates user input; if the user submits without typing anything, the error message 'Please enter your message' is displayed in the user interface. The input component uses the react-hook-form library along with useFormContext and Controller to manage the input.
import { InputHTMLAttributes } from 'react';
import { useFormContext, Controller } from 'react-hook-form';
import { twMerge } from 'tailwind-merge';
interface IProps extends InputHTMLAttributes<HTMLInputElement> {
name: string;
helperText?: string;
label?: string;
}
const RHFTextField = ({
name,
helperText,
label,
className,
...other
}: IProps) => {
const { control } = useFormContext();
return (
<Controller
name={name}
control={control}
render={({ field, fieldState: { error } }) => (
<div className='w-full flex flex-col gap-y-2 '>
{label && <label className='text-sm font-semibold'>{label}</label>}
<input
{...field}
{...other}
className={twMerge(
'outline-none w-full border border-gray-200 rounded-lg px-2 py-1',
className
)}
/>
{(!!error || helperText) && (
<div className={twMerge(error?.message && 'text-rose-500 text-sm')}>
{error?.message || helperText}
</div>
)}
</div>
)}
/>
);
};
export default RHFTextField;
When using the input, set the name attribute to 'query' to match the field declared in the schema.
const methods = useForm<IOpenAIForm>({
resolver: zodResolver(askScheme),
shouldFocusError: true,
defaultValues: {
query: '',
},
});
const {
  handleSubmit,
  setError,
  setValue,
  formState: { isSubmitting },
} = methods;
const onSubmit = async (data: IOpenAIForm) => {
console.log(data);
};
return (
<FormProvider methods={methods} onSubmit={handleSubmit(onSubmit)}>
{/* ... */}
<RHFTextField
type='text'
placeholder='What do you need? ...'
className='outline-none w-full border-none'
name='query'
/>
<button
type='submit'
disabled={isSubmitting}
className='text-gray-400 disabled:text-gray-200'
>
Submit
</button>
</FormProvider>
);
When the submit button is pressed and the input validates, we see the logged data from the onSubmit function. This data can then be sent to the backend for further processing.
To connect with the backend, we manage various states such as loading and error. We set the default base URL for the API.
axios.defaults.baseURL = process.env.NEXT_PUBLIC_API;
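NEXT_PUBLIC_API must be defined at build time. A minimal .env.local for local development might look like this (the URL is an assumption; point it at wherever the FastAPI server runs):
# .env.local — assumed local setup
NEXT_PUBLIC_API=http://localhost:8000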
After setting the base URL, we connect to the API.
import axios, { AxiosError } from 'axios';

const onSubmit = async (data: IOpenAIForm) => {
  try {
    // Encode the question so spaces and special characters survive the query string
    const { data: result } = await axios.post(
      `/query?question=${encodeURIComponent(data.query)}`,
      undefined
    );
    console.log(result); // Value obtained from the API
    // TODO: render the result in the chat
  } catch (error) {
    const err = error as AxiosError<{ detail: string }>;
    setError('bot', {
      message: err?.response?.data?.detail ?? 'Something went wrong',
    });
  }
};
We use useFormContext from react-hook-form to manage the loading and error states.
const {
formState: { isSubmitting, errors },
} = useFormContext();
If an API error occurs, we display the error message from the bot field and show the loading state. We read values such as isSubmitting and errors from useFormContext.
import { useRef } from 'react';
import { useFormContext } from 'react-hook-form';

export default function ChatWidget({ answer }: { answer: ChatProps[] }) {
const chatWindowRef = useRef<HTMLDivElement>(null);
const {
formState: { isSubmitting, errors },
} = useFormContext();
return (
<div className='h-full flex flex-col w-full'>
<Header />
<ChatWindow
messages={answer}
isLoading={isSubmitting}
error={errors?.bot?.message as string}
chatWindowRef={chatWindowRef}
/>
<ChatInput isLoading={isSubmitting} />
</div>
);
}
In the ChatWindow component, we manage the various states displayed in the UI, including the results from the API.
import { useEffect } from 'react'
import Image from 'next/image'
import { CopyClipboard } from './CopyClipboard'
interface Message {
role: 'user' | 'ai'
message: string
id: string
raw: string
}
interface ChatWindowProps {
messages: Message[]
isLoading?: boolean
error?: string
chatWindowRef: React.RefObject<HTMLDivElement> | null
}
export const ChatWindow: React.FC<ChatWindowProps> = ({
messages,
isLoading,
error,
chatWindowRef,
}) => {
useEffect(() => {
if (
chatWindowRef !== null &&
chatWindowRef?.current &&
messages.length > 0
) {
chatWindowRef.current.scrollTop = chatWindowRef.current.scrollHeight
}
}, [messages.length, chatWindowRef])
return (
<div
ref={chatWindowRef}
className='flex-1 overflow-y-auto p-4 space-y-8'
id='chatWindow'
>
{messages.map((item, index) => (
<div key={item.id} className='w-full'>
{item.role === 'user' ? (
<div className='flex gap-x-8 '>
<div className='min-w-[48px] min-h-[48px]'>
<Image
src='/img/chicken.png'
width={48}
height={48}
alt='user'
/>
</div>
<div>
<p className='font-bold'>User</p>
<p>{item.message}</p>
</div>
</div>
) : (
<div className='flex gap-x-8 w-full'>
<div className='min-w-[48px] min-h-[48px]'>
<Image
src='/img/robot.png'
width={48}
height={48}
alt='robot'
/>
</div>
<div className='w-full'>
<div className='flex justify-between mb-1 w-full '>
<p className='font-bold'>Ai</p>
<div />
<CopyClipboard content={item.raw} />
</div>
<div
className='prose whitespace-pre-line'
dangerouslySetInnerHTML={{ __html: item.message }}
/>
</div>
</div>
)}
</div>
))}
{isLoading && (
<div className='flex gap-x-8 w-full mx-auto'>
<div className='min-w-[48px] min-h-[48px]'>
<Image src='/img/robot.png' width={48} height={48} alt='robot' />
</div>
<div>
<p className='font-bold'>Ai</p>
<div className='mt-4 flex space-x-2 items-center '>
<p>Hang on a second </p>
<span className='sr-only'>Loading...</span>
<div className='h-2 w-2 bg-blue-600 rounded-full animate-bounce [animation-delay:-0.3s]'></div>
<div className='h-2 w-2 bg-blue-600 rounded-full animate-bounce [animation-delay:-0.15s]'></div>
<div className='h-2 w-2 bg-blue-600 rounded-full animate-bounce'></div>
</div>
</div>
</div>
)}
{error && (
<div className='flex gap-x-8 w-full mx-auto'>
<div className='min-w-[48px] min-h-[48px]'>
<Image src='/img/error.png' width={48} height={48} alt='error' />
</div>
<div>
<p className='font-bold'>Ai</p>
<p className='text-rose-500'>{error}</p>
</div>
</div>
)}
</div>
)
}
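Because item.message is injected with dangerouslySetInnerHTML, it is worth sanitizing the HTML before rendering. A minimal sketch using the dompurify library (an optional hardening step, not part of the original code):
import DOMPurify from 'dompurify';

// Same div as above, but the model output is sanitized before injection
<div
  className='prose whitespace-pre-line'
  dangerouslySetInnerHTML={{ __html: DOMPurify.sanitize(item.message) }}
/>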
After successfully connecting with the backend, we need to store the result message obtained from the API. We create a state to store the user and bot answers.
export default function ChatBotDemo() {
const [answer, setAnswer] = useState<ChatProps[]>([])
const methods = useForm<IOpenAIForm>({
resolver: zodResolver(askScheme),
shouldFocusError: true,
defaultValues: {
query: '',
},
})
const { handleSubmit, setError, setValue } = methods
const onSubmit = async (data: IOpenAIForm) => {
try {
const id = answer.length
setAnswer((prevState) => [
...prevState,
{
id: id.toString(),
role: 'user',
message: data.query,
raw: '',
},
])
setValue('query', '')
// Encode the question so spaces and special characters survive the query string
const { data: result } = await axios.post(
  `/query?question=${encodeURIComponent(data.query)}`,
  undefined
)
setAnswer((prevState) => [
...prevState,
{
id: (prevState.length + 1).toString(),
role: 'ai',
message: result.raw,
raw: result.json.customer_need,
},
])
} catch (error) {
const err = error as AxiosError<{ detail: string }>
setError('bot', {
message: err?.response?.data?.detail ?? 'Something went wrong',
})
}
}
return (
<FormProvider methods={methods} onSubmit={handleSubmit(onSubmit)}>
<div className='flex justify-center flex-col items-center bg-white mx-auto max-w-7xl h-screen '>
<ChatWidget answer={answer} />
</div>
</FormProvider>
)
}
Now, when we ask the AI, for example, to create a blog, the LLMs agent will respond with tasks for each role, such as Design, Frontend, and Backend.
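Based on the response shape built in the backend section below, a /query response looks roughly like this (values are illustrative):
{
  "raw": "<conclude><text>...full tagged summary...</text></conclude>",
  "json": {
    "customer_need": "To create a blog, the Designer wireframes the layout, the Frontend builds the pages, and the Backend exposes the post API."
  }
}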
Similar to all the examples we've gone through, we choose to use FastAPI as the framework to create and use our API.
Start by installing the necessary Python libraries for FastAPI:
pip install fastapi
pip install "uvicorn[standard]"
Create a file named app.py and initialize FastAPI:
from fastapi import FastAPI
from fastapi.encoders import jsonable_encoder
from fastapi.responses import JSONResponse
from fastapi.middleware.cors import CORSMiddleware
app = FastAPI()
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)
@app.get("/helloworld")
async def helloworld():
return {"message": "Hello World"}
In this example code, we initialize a FastAPI project and enable CORS for connecting with the frontend. To run the server, use the following command:
uvicorn app:app
This command runs the FastAPI app declared in the app.py file; the server listens on the default port 8000.
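You can verify the server is up by calling the test route:
curl http://localhost:8000/helloworld
# {"message": "Hello World"}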
Now, let's create another file named LocalTemplate.py to store the initial templates for asking questions to the Chatbot. These templates include:
Manager-template: divides incoming questions into tasks to be used by each agent.
Agent-template: divided into three agents, Frontend, Backend, and Designer; each agent answers questions related to its role.
Conclusion-template: summarizes all received answers into a concise overview.
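The article does not show the template contents, but the parsing code below expects answers wrapped in tags such as <question>, <role>, and <sub-question>. A plausible sketch of what LocalTemplate.get_manager() could return (the wording is an assumption; only the tag format is inferred from the parser):
def get_manager():
    # Hypothetical template text; the tag format matches what /breakdown parses
    return """You are the manager of a website-development team.
Split the customer's question into one sub-question per role
(Frontend, Backend, Designer), answering strictly in this format:
<question><role>ROLE</role><sub-question>TASK FOR THAT ROLE</sub-question></question>

Question: {question}"""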
In the API design section, we divide the API into five routes for different functionalities, explained below.
The first API we create is POST /breakdown, which breaks a customer's need down into tasks for each agent:
@app.post('/breakdown')
def breakdown_question(customer_need: str):
    model_256 = ChatOpenAI(model_name="gpt-4-1106-preview", temperature=0.3, max_tokens=256, openai_api_key=OPENAI_API_KEY)
    breakdown_chain = ChatPromptTemplate.from_template(LocalTemplate.get_manager()) | model_256
    result = breakdown_chain.invoke({"question": customer_need})
    arr = result.content.split('<question>')[1:]
    task_list = []
    for i in arr:
        full_task = remove_tag(i, ['<question>', '<role>', '</question>', '</role>', '</sub-question>']).strip()
        full_task_list = full_task.split('<sub-question>')
        role = full_task_list[0]
        task = full_task_list[1]
        task_list.append({'role': role, 'task': task})
    json_compatible_item_data = jsonable_encoder(task_list)
    return JSONResponse(content=json_compatible_item_data)
This API takes a customer's need as input and processes it to create tasks for each agent. The resulting tasks are then returned as a JSON response.
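The helpers remove_tag and text_to_json are used throughout but not shown in the article; a minimal sketch consistent with how they are called (the names come from the article, the exact behavior is an assumption):
def remove_tag(text: str, tags: list) -> str:
    # Strip every listed tag string out of the text
    for tag in tags:
        text = text.replace(tag, '')
    return text

def text_to_json(steps: list, split_tag: str, tags: list) -> list:
    # Each step looks like 'task ...<description>description ...';
    # split on the description tag and strip the remaining tags
    result = []
    for step in steps:
        parts = step.split(split_tag)
        task = remove_tag(parts[0], tags).strip()
        description = remove_tag(parts[1], tags).strip() if len(parts) > 1 else ''
        result.append({'task': task, 'description': description})
    return result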
Next, we create an API POST /build that processes the tasks for each agent:
from typing import List
from pydantic import BaseModel

class Task(BaseModel):
    # Hypothetical Pydantic model for the request body; the original
    # annotation List[task_list] referenced an undefined name.
    role: str
    task: str

@app.post('/build')
def build_task(task_list: List[Task]):
    model_256 = ChatOpenAI(model_name="gpt-4-1106-preview", temperature=0.3, max_tokens=256, openai_api_key=OPENAI_API_KEY)
    frontend_chain = ChatPromptTemplate.from_template(LocalTemplate.get_frontend()) | model_256
    frontend_result = frontend_chain.invoke({"task": task_list[0].task})
    backend_chain = ChatPromptTemplate.from_template(LocalTemplate.get_backend()) | model_256
    backend_result = backend_chain.invoke({"task": task_list[1].task})
    designer_chain = ChatPromptTemplate.from_template(LocalTemplate.get_designer()) | model_256
    designer_result = designer_chain.invoke({"task": task_list[2].task})
    frontend_text = frontend_result.content.split('<step>')[1:]
    frontend_json = text_to_json(frontend_text, '<description>', ['<task>', '</task>', '<step>', '</step>', '</description>'])
    backend_text = backend_result.content.split('<step>')[1:]
    backend_json = text_to_json(backend_text, '<description>', ['<task>', '</task>', '<step>', '</step>', '</description>'])
    designer_text = designer_result.content.split('<step>')[1:]
    designer_json = text_to_json(designer_text, '<description>', ['<task>', '</task>', '<step>', '</step>', '</description>'])
    result = {
        'raw': {
            # Each raw field uses its own agent's output; the original
            # mistakenly reused frontend_result.content for all three.
            "frontend_task": frontend_result.content,
            "backend_task": backend_result.content,
            "designer_task": designer_result.content
        },
        'json': {
            "frontend_task": frontend_json,
            "backend_task": backend_json,
            "designer_task": designer_json
        }
    }
    json_compatible_item_data = jsonable_encoder(result)
    return JSONResponse(content=json_compatible_item_data)
This API takes a list of tasks and processes them for each agent (Frontend, Backend, Designer), returning the raw and JSON format of the generated tasks.
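The JSON half of the response then has this rough shape (values are illustrative):
{
  "raw": {
    "frontend_task": "<step>...</step>",
    "backend_task": "<step>...</step>",
    "designer_task": "<step>...</step>"
  },
  "json": {
    "frontend_task": [{"task": "Build the blog page", "description": "..."}],
    "backend_task": [{"task": "Design the post API", "description": "..."}],
    "designer_task": [{"task": "Wireframe the layout", "description": "..."}]
  }
}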
Now, we create an API POST /conclude for summarizing all the received information:
@app.post('/conclude')
def build_conclusion(customer_need: str, frontend_task: str, backend_task: str, designer_task: str):
    model_512 = ChatOpenAI(model_name="gpt-4-1106-preview", temperature=0.3, max_tokens=512, openai_api_key=OPENAI_API_KEY)
    customer_chain = ChatPromptTemplate.from_template(LocalTemplate.get_conclusion()) | model_512
    customer_result = customer_chain.invoke({
        "customer_need": customer_need,
        "frontend_task": frontend_task,
        "backend_task": backend_task,
        "designer_task": designer_task
    })
    customer_json = remove_tag(customer_result.content, ['<conclude>', '</conclude>', '<text>', '</text>'])
    result = {
        'raw': customer_result.content,
        'json': {
            'customer_need': customer_json
        }
    }
    json_compatible_item_data = jsonable_encoder(result)
    return JSONResponse(content=json_compatible_item_data)
This API takes the initial customer need and the tasks generated by each agent and produces a summarized conclusion in both raw and JSON formats.
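Since all four inputs are plain string parameters, FastAPI exposes them as query parameters; an illustrative call (values truncated):
curl -X POST "http://localhost:8000/conclude?customer_need=create%20a%20blog&frontend_task=...&backend_task=...&designer_task=..."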
For a more streamlined process, we create an API POST /query that combines all the steps:
@app.post('/query')
def query_with_chain(question: str):
    customer = question
    model_256 = ChatOpenAI(model_name="gpt-4-1106-preview", temperature=0.3, max_tokens=256, openai_api_key=OPENAI_API_KEY)
    model_512 = ChatOpenAI(model_name="gpt-4-1106-preview", temperature=0.3, max_tokens=512, openai_api_key=OPENAI_API_KEY)
    # Step 1: the Manager breaks the question into one task per role
    PO_Final_Chain = ChatPromptTemplate.from_template(LocalTemplate.get_manager()) | model_256
    result = PO_Final_Chain.invoke({"question": customer})
    arr = result.content.split('<question>')[1:]
    task_list = []
    for i in arr:
        full_task = i.replace('\n', '').replace('</question>', '').replace('<role>', '').replace('</role>', '').replace('</sub-question>', '').strip()
        role = full_task.split('<sub-question>')[0].strip()
        task = full_task.split('<sub-question>')[1].strip()
        task_list.append({'role': role, 'task': task})
    # Step 2: each agent answers its own task
    frontend_chain = ChatPromptTemplate.from_template(LocalTemplate.get_frontend()) | model_256
    frontend_result = frontend_chain.invoke({"task": task_list[0]['task']})
    backend_chain = ChatPromptTemplate.from_template(LocalTemplate.get_backend()) | model_256
    backend_result = backend_chain.invoke({"task": task_list[1]['task']})
    designer_chain = ChatPromptTemplate.from_template(LocalTemplate.get_designer()) | model_256
    designer_result = designer_chain.invoke({"task": task_list[2]['task']})
    # Step 3: the Manager summarizes all three answers for the user
    customer_chain = ChatPromptTemplate.from_template(LocalTemplate.get_conclusion()) | model_512
    customer_result = customer_chain.invoke({
        "customer_need": customer,
        "frontend_task": frontend_result.content,
        "backend_task": backend_result.content,
        "designer_task": designer_result.content
    })
    customer_json = remove_tag(customer_result.content, ['<conclude>', '</conclude>', '<text>', '</text>'])
    result = {
        'raw': customer_result.content,
        'json': {
            'customer_need': customer_json
        }
    }
    json_compatible_item_data = jsonable_encoder(result)
    return JSONResponse(content=json_compatible_item_data)
This API takes a question, processes it through the entire chain of tasks and agents, and returns the raw and JSON format of the summarized conclusion.
For more direct interaction with the Chatbot, we create an API POST /queryWithoutChain:
@app.post('/queryWithoutChain')
def query_without_chain(question: str):
    model_512 = ChatOpenAI(model_name="gpt-4-1106-preview", temperature=0.3, max_tokens=512, openai_api_key=OPENAI_API_KEY)
    # Pass the question straight through a single model, with no manager or agents
    chain = ChatPromptTemplate.from_template('{question}') | model_512
    customer_result = chain.invoke({"question": question})
    result = {
        'raw': customer_result.content,
        'json': {
            'customer_need': customer_result.content
        }
    }
    json_compatible_item_data = jsonable_encoder(result)
    return JSONResponse(content=json_compatible_item_data)
This API takes a question, sends it directly to the Chatbot without going through the task and agent chain, and returns the raw and JSON format of the Chatbot's response.
For deploying FastAPI on an EC2 instance:
Create a session with a name of your choice:
screen -S name
Navigate to the API folder:
cd path/to/api
Start FastAPI on port 8000:
uvicorn app:app --host 0.0.0.0
Detach from the screen session:
Ctrl+a d
Now, the server runs in the background even if you exit.
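To check on the server later, reattach to the session:
screen -r name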
To integrate the API with AWS API Gateway:
Add CORS configuration to FastAPI:
from fastapi.middleware.cors import CORSMiddleware
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)
This configuration enables CORS for the FastAPI server. Connect the server to AWS API Gateway for better management, authentication, and rate limiting.