Opto-ViT: Architecting a Near-Sensor Region of Interest-Aware Vision Transformer Accelerator with Silicon Photonics

  • Mehrdad Morsali
  • Chengwei Zhou
  • Deniz Najafi
  • Sreetama Sarkar
  • Pietro Mercati
  • Navid Khoshavi
  • Peter Beerel
  • Mahdi Nikdast
  • Gourav Datta
  • Shaahin Angizi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Vision Transformers (ViTs) have emerged as a powerful architecture for computer vision tasks due to their ability to model long-range dependencies and global contextual relationships. However, their substantial compute and memory demands hinder efficient deployment in scenarios with strict energy and bandwidth limitations. In this work, we propose Opto-ViT, the first near-sensor, region-aware ViT accelerator leveraging silicon photonics (SiPh) for real-time and energy-efficient vision processing. Opto-ViT features a hybrid electronic-photonic architecture in which the optical core handles compute-intensive matrix multiplications using Vertical-Cavity Surface-Emitting Lasers (VCSELs) and Microring Resonators (MRs), while nonlinear functions and normalization are executed electronically. To reduce redundant computation and patch processing, we introduce a lightweight Mask Generation Network (MGNet) that identifies regions of interest in the current frame and prunes irrelevant patches before ViT encoding. We further co-optimize the ViT backbone using quantization-aware training and matrix decomposition tailored for photonic constraints. Experiments spanning device fabrication, circuit and architecture co-design, and classification, detection, and video tasks demonstrate that Opto-ViT achieves 100.4 KFPS/W and up to 84% energy savings with less than 1.6% accuracy loss, while enabling scalable and efficient ViT deployment at the edge.
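The region-of-interest pruning step described in the abstract can be illustrated with a minimal sketch. This is not the authors' MGNet: the scoring function `toy_roi_scores` (mean patch intensity), the threshold, and all other names here are hypothetical placeholders chosen only to show the idea of dropping low-relevance patches before the encoder sees them.

```python
from typing import List, Sequence, Tuple

Patch = List[List[float]]


def split_into_patches(frame: Sequence[Sequence[float]], patch: int) -> List[Patch]:
    """Split an H x W frame into non-overlapping patch x patch tiles (row-major)."""
    height, width = len(frame), len(frame[0])
    if height % patch or width % patch:
        raise ValueError("frame dimensions must be multiples of the patch size")
    tiles = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            tiles.append([list(row[left:left + patch]) for row in frame[top:top + patch]])
    return tiles


def toy_roi_scores(patches: Sequence[Patch]) -> List[float]:
    """Hypothetical stand-in for a learned mask network: mean intensity per patch."""
    return [sum(map(sum, p)) / (len(p) * len(p[0])) for p in patches]


def prune_patches(
    patches: Sequence[Patch], scores: Sequence[float], threshold: float
) -> Tuple[List[int], List[Patch]]:
    """Keep only patches whose score reaches the threshold."""
    kept = [i for i, s in enumerate(scores) if s >= threshold]
    return kept, [patches[i] for i in kept]


# 4x4 toy frame with a bright object in the top-left 2x2 patch.
frame = [
    [0.9, 0.8, 0.0, 0.0],
    [0.7, 0.9, 0.0, 0.1],
    [0.0, 0.0, 0.0, 0.0],
    [0.1, 0.0, 0.0, 0.0],
]
patches = split_into_patches(frame, patch=2)
kept_idx, kept = prune_patches(patches, toy_roi_scores(patches), threshold=0.5)
skipped_fraction = 1 - len(kept) / len(patches)
print(kept_idx, skipped_fraction)
```

In this toy frame only the first patch survives, so three of the four patches never reach the encoder. The skipped fraction is a count of patches only and should not be read as the energy savings reported in the abstract.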

Original language: English (US)
Title of host publication: 2025 IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2025 - Conference Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798331515607
State: Published - 2025
Event: 44th IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2025 - Munich, Germany
Duration: Oct 26 2025 → Oct 30 2025

Publication series

Name: IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD
ISSN (Print): 1092-3152

Conference

Conference: 44th IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2025
Country/Territory: Germany
City: Munich
Period: 10/26/25 → 10/30/25

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Science Applications
  • Computer Graphics and Computer-Aided Design
