Multimodal AI for Searchable Aerial Imagery at Scale

Gilbert V Lepadatu · June 22, 2026

Summary

This post describes an architecture built on Amazon Bedrock and Amazon OpenSearch Serverless for searching aerial imagery with multimodal AI at scale. It evaluates embedding models and fusion strategies and offers practical guidance for geospatial semantic search.

The authors walk through the problem space, the architecture, and an evaluation methodology that uses OpenStreetMap as ground truth. Four experiments compare embedding models, fusion strategies, captioning methods, and search methods. Amazon Nova Multimodal Embeddings achieved the highest F1 scores on the authors' benchmark queries. The post also highlights the design decisions that most affect performance, and notes that the work has been commercialized as Vexcel Intelligence, a product for searchable imagery.
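To make the F1-based evaluation concrete, here is a minimal sketch of scoring a single query against ground truth. It assumes that imagery is split into tiles and that a tile counts as relevant when OpenStreetMap marks the queried feature inside it. The tile IDs and the function name `precision_recall_f1` are hypothetical and are not taken from the original post.

```python
def precision_recall_f1(retrieved, relevant):
    """Score one query.

    retrieved: tile IDs returned by the search system (assumed format).
    relevant:  tile IDs that the ground truth marks as containing the feature.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)

    # Guard against empty sets so the function never divides by zero.
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0

    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical query result and ground-truth set, for illustration only.
retrieved_tiles = ["t01", "t02", "t03", "t04"]
ground_truth_tiles = ["t02", "t03", "t05"]

precision, recall, f1 = precision_recall_f1(retrieved_tiles, ground_truth_tiles)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Averaging this score over a fixed set of benchmark queries gives one number per model or configuration, which is one plausible way to compare the options the post evaluates.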

Why it matters

Practitioners can learn how to implement scalable multimodal search over large aerial imagery datasets. Better search can speed up data analysis and support new products in mapping and intelligence.

How to implement this in your domain

  1. Design an architecture for multimodal AI using cloud services like Amazon Bedrock and Amazon OpenSearch Serverless.
  2. Evaluate different embedding models and fusion strategies for geospatial semantic search performance.
  3. Implement an evaluation methodology based on ground truth data such as OpenStreetMap.
  4. Consider Amazon Nova Multimodal Embeddings, which achieved the highest F1 scores in the post's benchmark.
  5. Adapt the described system to create searchable imagery products or enhance existing geospatial data platforms.
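As an illustration of what a fusion strategy can look like, the sketch below combines two similarity scores per tile, one from an image embedding and one from a caption embedding, using a weighted sum. This is an assumed example of late score fusion; the summary does not say which fusion strategies the original post tested. The vectors, tile names, and the `image_weight` parameter are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def fused_ranking(query_vec, tiles, image_weight=0.5):
    """Rank tiles by a weighted sum of image and caption similarity.

    tiles: dict mapping tile ID -> (image_vec, caption_vec).
    image_weight: share of the score taken from the image embedding;
                  the caption embedding gets the remainder.
    """
    scores = {}
    for tile_id, (image_vec, caption_vec) in tiles.items():
        scores[tile_id] = (
            image_weight * cosine(query_vec, image_vec)
            + (1 - image_weight) * cosine(query_vec, caption_vec)
        )
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy 2-dimensional embeddings, for illustration only.
tiles = {
    "tile_a": ([1.0, 0.0], [0.0, 1.0]),  # image matches, caption does not
    "tile_b": ([0.6, 0.8], [0.6, 0.8]),  # both partially match
}
query = [1.0, 0.0]

ranking = fused_ranking(query, tiles, image_weight=0.5)
image_only = fused_ranking(query, tiles, image_weight=1.0)
print(ranking[0][0], image_only[0][0])
```

In this toy data the top result changes when the weight changes, which is why a fusion weight is worth tuning against ground truth instead of being fixed by guesswork.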

Who benefits

Geospatial, Defense, Urban Planning, Agriculture, Insurance

Key takeaways

  • Scalable multimodal AI can significantly enhance the searchability of aerial imagery.
  • AWS services like Bedrock and OpenSearch Serverless provide a robust foundation for such systems.
  • Careful evaluation of embedding models and fusion strategies is crucial for performance.
  • Amazon Nova Multimodal Embeddings achieved the highest F1 scores on the benchmark queries for geospatial semantic search.
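For the search side, a vector lookup in OpenSearch is typically expressed as a k-NN query. The sketch below only builds the request body and does not contact any service. The field name `embedding` and the example vector are assumptions, since the summary does not describe the index schema used in the original post.

```python
import json


def knn_query_body(query_vector, k=10, vector_field="embedding"):
    """Build an OpenSearch k-NN query body for a nearest-neighbor search.

    vector_field is a hypothetical name for the knn_vector field
    that stores each tile's embedding.
    """
    return {
        "size": k,
        "query": {
            "knn": {
                vector_field: {
                    "vector": list(query_vector),
                    "k": k,
                }
            }
        },
    }


# Toy 3-dimensional query vector; real embeddings are much larger.
body = knn_query_body([0.12, -0.05, 0.33], k=5)
print(json.dumps(body, indent=2))
```

The query vector would come from embedding the user's text query with the same model used to embed the imagery, so that both live in one vector space.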

Original post by Gilbert V Lepadatu

"In this post, we walk through the problem space, our architecture on Amazon Bedrock and Amazon OpenSearch Serverless, the evaluation methodology we built on OpenStreetMap ground truth, four experiments that compared embedding models, fusion strategies, captioning, and search meth…"

Originally posted by Gilbert V Lepadatu on X
