MAVREN: A multilayered adaptive framework for deploying vision-language models on resource-constrained unmanned aerial vehicles for autonomous search and rescue
Publisher
Engineering Applications of Artificial Intelligence
Abstract
Unmanned Aerial Vehicles (UAVs) have become indispensable in autonomous search and rescue (SAR) missions, where the ability to interpret complex visual scenes in real time is critical. When equipped with artificial intelligence (AI)-empowered Vision-Language Models (VLMs), UAVs can provide rich contextual insights, interpret their findings, and even suggest next steps. However, deploying VLMs on resource-constrained UAV platforms is severely limited by high computational demands, energy constraints, and strict latency requirements. This paper introduces MAVREN, a multilayered adaptive scheduler for VLM execution in resource-constrained UAV networks for autonomous SAR operations. Evaluations conducted on an NVIDIA Jetson Orin NX using state-of-the-art VLMs such as Large Language and Vision Assistant (LLaVA) 1.6 and Vision-Language Alignment (VILA) 7B demonstrate that MAVREN achieves up to 26.11% higher throughput, 23% lower energy consumption, 13.51% lower latency, and a 7% gain in detection accuracy compared to baseline schedulers across indoor, outdoor, and multi-UAV SAR scenarios. These gains come from integrating four components: a visual encoder for lightweight feature extraction, a block floating-point quantizer for precision-efficient representation, a bit-wise computation engine for fast arithmetic execution, and a branch-and-bound optimizer for dynamic central processing unit (CPU) scheduling. These tightly coupled components allow MAVREN to optimize the energy–latency–accuracy trade-off, making it a deployable solution for vision-language reasoning in real-world SAR missions. Our findings demonstrate MAVREN’s capability to deliver rapid, energy-efficient inference, advancing the deployment of computationally intensive VLMs on resource-constrained UAV platforms.
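The abstract names a block floating-point quantizer but does not describe its internals. As a general illustration of the technique only, and not MAVREN's implementation, the sketch below groups values into blocks that share one power-of-two exponent and stores each value as a small signed integer mantissa. The function names `bfp_quantize` and `bfp_dequantize`, the block size of 4, and the 8-bit mantissa width are all assumptions chosen for the example.

```python
import math


def bfp_quantize(values, block_size=4, mantissa_bits=8):
    """Quantize floats into blocks that each share a single exponent.

    Illustrative only: block_size and mantissa_bits are assumed values,
    not parameters taken from the paper.
    Returns a list of (exponent, [integer mantissas]) tuples.
    """
    qmax = 2 ** (mantissa_bits - 1) - 1  # largest signed mantissa (127 for 8 bits)
    blocks = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        peak = max(abs(v) for v in block)
        if peak == 0.0:
            # An all-zero block needs no scaling.
            blocks.append((0, [0] * len(block)))
            continue
        # Smallest exponent e such that peak / 2**e still fits within qmax.
        exponent = math.ceil(math.log2(peak / qmax))
        scale = 2.0 ** exponent
        # Round to the nearest integer and clamp to the signed mantissa range.
        mantissas = [max(-qmax, min(qmax, round(v / scale))) for v in block]
        blocks.append((exponent, mantissas))
    return blocks


def bfp_dequantize(blocks):
    """Reconstruct approximate floats from (exponent, mantissas) blocks."""
    return [m * 2.0 ** exponent for exponent, mantissas in blocks for m in mantissas]


values = [0.5, -1.25, 3.0, 0.01, 100.0, -50.0, 0.0, 25.0]
blocks = bfp_quantize(values)
restored = bfp_dequantize(blocks)
```

In this example the small value 0.01 shares a block with 3.0, so it rounds to a zero mantissa and is lost on reconstruction. This is the usual trade-off of the format: storing one exponent per block saves bits, at the cost of precision for values much smaller than the block's peak.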
Citation
Rashid, M. T., Siddique, M. J., & Shaqur, A. (2025). MAVREN: A multilayered adaptive framework for deploying vision-language models on resource-constrained unmanned aerial vehicles for autonomous search and rescue. Engineering Applications of Artificial Intelligence, 162, 112498.
