I’m building an app with a custom front end, an AI agent built with Google ADK, and a FastAPI backend that connects everything. I want the agent to have persistent per-user memory, so I’m planning to use Memory Bank, the new managed memory feature in Vertex AI Agent Engine.
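For context, here is roughly how I’m wiring the pieces together today, a minimal sketch assuming recent ADK versions that ship `VertexAiMemoryBankService`; the project ID, Agent Engine ID, model name, and the `/chat` route are placeholders, not my real values:

```python
from fastapi import FastAPI
from pydantic import BaseModel
from google.genai import types
from google.adk.agents import Agent
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.adk.memory import VertexAiMemoryBankService

PROJECT_ID = "my-project"        # placeholder
LOCATION = "us-central1"         # placeholder
AGENT_ENGINE_ID = "1234567890"   # Agent Engine instance backing Memory Bank
APP_NAME = "my_app"

agent = Agent(
    name="assistant",
    model="gemini-2.0-flash",
    instruction="You are a helpful assistant with long-term user memory.",
)

# Memory Bank provides persistent, per-user memory across sessions.
memory_service = VertexAiMemoryBankService(
    project=PROJECT_ID,
    location=LOCATION,
    agent_engine_id=AGENT_ENGINE_ID,
)

runner = Runner(
    agent=agent,
    app_name=APP_NAME,
    session_service=InMemorySessionService(),
    memory_service=memory_service,
)

app = FastAPI()

class ChatRequest(BaseModel):
    user_id: str
    session_id: str
    message: str

@app.post("/chat")
async def chat(req: ChatRequest):
    # Get or create the session before running the agent.
    session = await runner.session_service.get_session(
        app_name=APP_NAME, user_id=req.user_id, session_id=req.session_id
    )
    if session is None:
        await runner.session_service.create_session(
            app_name=APP_NAME, user_id=req.user_id, session_id=req.session_id
        )

    content = types.Content(role="user", parts=[types.Part(text=req.message)])
    reply = ""
    async for event in runner.run_async(
        user_id=req.user_id, session_id=req.session_id, new_message=content
    ):
        if event.is_final_response() and event.content and event.content.parts:
            reply = event.content.parts[0].text
    return {"reply": reply}
```

So the FastAPI service currently owns the ADK `Runner` and only uses Agent Engine for Memory Bank, which is what makes me unsure about where the agent itself should live.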
For deployment, I’m unsure about the best approach:
- Should I deploy the AI agent to Vertex AI Agent Engine and host the FastAPI backend separately (e.g., on Cloud Run)?
- Or should I package the agent and the FastAPI app together and deploy them as a single service (e.g., on Cloud Run)?
What would be the best practice or most efficient setup for this kind of use case?