Jul 8, 2025

The Architect's Guide to Production Vector Search with Vertex AI and Cloud Run

Building AI apps with semantic search or retrieval-augmented generation (RAG) is easy to start; making them fast, scalable, and cost-effective is the real challenge. This Google Cloud guide helps architects and engineers do that. It explores Vertex AI's embedding models, such as gemini-embedding-001, and shows how model choice affects performance and cost. It also covers RAG and semantic search patterns, with an emphasis on optimizing embeddings for specific tasks. You'll learn about efficient vector similarity search with tools like Faiss and ScaNN, get a blueprint for deploying workloads on Google Cloud Run, and pick up best practices for minimizing latency. A FastAPI application example provides a complete, production-ready solution.

Technology

Nov 9, 2023

Hedgehogs and their technology platforms

How did successful companies apply their Hedgehog Concepts in their enterprise-scale technology implementations? Let’s discuss the one big thing I learned from my experience partnering with business leaders to deploy technology platforms for sustainable business growth.

Technology

Nov 10, 2023

Hey you! Yes, you, human being!

AI and machine learning are only as smart as the people behind them. Don’t just entrust your career development to your employer. It is a very personal thing, and you should take an active role in it, because you, not anyone else, are the one who should benefit from such a plan.

Life