Edge AI / TinyML · Hardware-aware ML · Embedded Systems · On-device Learning · ML for EDA · Computer Architecture
Independent Researcher, Delhi

Shivansh Pratap Singh

Software Engineer & Independent Researcher focusing on the Hardware-Software Interface


I build software that bridges the gap between high-level algorithms and physical silicon. My experience ranges from engineering industrial-scale EDA tooling in C++ at Cadence Design Systems to researching performant TinyML inference for ultra-constrained, sub-$15 hardware.

I am driven by the philosophy that software abstractions must be hardware-aware. Currently, I am extending embedded toolchains (TFLite Micro) to enable on-device learning and exploring how machine learning can optimize VLSI design flows. I am actively seeking a Master's at EPFL to deepen my expertise in computer architecture and hardware-software co-design.


ICDECT-2025 — Springer LNNS
Embedded AI Systems Research
AIoT Health: Medicine Reminder For The Elderly
Shivansh Pratap Singh & Manoj Kumar Gupta — Delhi Technological University
Designed and implemented a fully offline, privacy-preserving medicine adherence system on an ESP32-S3. Engineered a cascading TinyML pipeline where a low-power Keyword Spotting (KWS) model gates a high-fidelity Spoken Language Understanding (SLU) classifier to minimize idle-state power consumption. Validated intent recognition through multi-modal physical sensor fusion (Ultrasonic/Hall-effect). Achieved sub-100ms latency on sub-$15 hardware, demonstrating that complex NLP tasks can be decentralized and executed on the extreme edge without cloud dependency.
Performance
KWS: 96.4% accuracy
SLU: 94.1% accuracy
Latency: <100ms combined
Unit Cost (BOM): ~$15 total hardware
Key Contributions
  • Cascading TinyML pipeline — KWS gates SLU, reducing active inference time by ~70%
  • Triple-factor adherence verification: voice intent + Hall effect lid sensor + ultrasonic hand detection
  • Both models run on 330KB combined SRAM with tensor arenas allocated in PSRAM
  • Optimized architecture for resource-constrained inference: replaced the bidirectional LSTM with GlobalAveragePooling to circumvent TFLite Micro kernel limitations (ReverseV2)
  • MicroMutableOpResolver over AllOpsResolver: saves ~40KB SRAM by registering only the operator kernels the two models use instead of linking every built-in op
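The triple-factor check in the list above can be illustrated with a minimal sketch. It assumes each factor reports the time of its latest event and that a dose counts as verified only when all three events fall within one time window. The type names, the `dose_verified` function, and the window parameter are hypothetical and not taken from the published firmware.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical record of one verification factor: whether it was observed,
// and when (milliseconds since boot).
struct FactorEvent {
    bool seen;
    std::uint32_t at_ms;
};

// The three factors named above: voice intent, lid sensor, hand detection.
struct AdherenceEvidence {
    FactorEvent voice_intent;   // SLU classified a "medicine taken" style intent
    FactorEvent lid_opened;     // Hall-effect sensor saw the lid open
    FactorEvent hand_detected;  // ultrasonic sensor saw a hand near the box
};

// Returns true only when all three factors were observed and the earliest and
// latest events are at most window_ms apart. The window is an assumption.
inline bool dose_verified(const AdherenceEvidence& e, std::uint32_t window_ms) {
    if (!e.voice_intent.seen || !e.lid_opened.seen || !e.hand_detected.seen) {
        return false;
    }
    const std::uint32_t first = std::min(
        {e.voice_intent.at_ms, e.lid_opened.at_ms, e.hand_detected.at_ms});
    const std::uint32_t last = std::max(
        {e.voice_intent.at_ms, e.lid_opened.at_ms, e.hand_detected.at_ms});
    assert(first <= last);
    return (last - first) <= window_ms;
}
```

Requiring agreement between independent factors means that a spoken confirmation alone, or an accidental lid bump alone, would not be logged as a taken dose.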
System Architecture
Audio Input (16kHz, continuous) → KWS (always-on, 30ms) → trigger → SLU (on-demand, 70ms) → Intent (8 classes) → Action (FSM handler)
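The gating idea behind this pipeline can be sketched in a few lines of C++. This is an illustrative model only: the two classifiers are stand-in callbacks rather than TFLite Micro interpreters, the threshold is an assumed parameter, and the `Cascade` type is hypothetical. For brevity the same frame is passed to both stages, whereas a real system would hand the SLU stage the audio that follows the keyword.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

using AudioFrame = std::vector<std::int16_t>;

constexpr int kNumIntents = 8;  // matches the 8 intent classes listed above

// Hypothetical cascade: a cheap keyword spotter gates a costlier intent model.
struct Cascade {
    std::function<float(const AudioFrame&)> kws_score;  // runs on every frame
    std::function<int(const AudioFrame&)> slu_intent;   // runs only after a trigger
    float kws_threshold;                                // assumed tunable
    std::size_t slu_runs;                               // counts SLU invocations

    // Returns an intent id in [0, kNumIntents), or -1 when the keyword
    // spotter did not fire and the SLU model was skipped.
    int process(const AudioFrame& frame) {
        if (kws_score(frame) < kws_threshold) {
            return -1;
        }
        ++slu_runs;
        const int intent = slu_intent(frame);
        assert(intent >= 0 && intent < kNumIntents);
        return intent;
    }
};
```

Because the SLU callback is reached only on a trigger, its cost is paid per wake word rather than per frame, which is the source of the idle-state savings described above.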
2024 – 2025
Cadence Design Systems
Software Engineer I (C++) SPB (Silicon Package Board) / Allegro X

Owned the end-to-end DE-HDL (Design Entry HDL) flow for the Allegro X symbol editor. My work focused on the performance bottlenecks of industrial-scale library management, where I optimized data structures for handling 100k+ components with sub-second retrieval times.

Technical highlights
  • Memory & Data Optimization: Architected a local caching layer for asynchronous component processing, replacing legacy synchronous server-side fetches to eliminate UI blocking during large-scale symbol manipulation.
  • Production C++ at Scale: Leveraged Qt and Tcl within a massive legacy codebase, ensuring features met the rigorous stability requirements for Release 23.1.
  • Hardware-Software Awareness: Worked directly with constraints of EDA flows where software abstractions define the synthesis of physical silicon, bridging the gap between high-level GUI operations and low-level hardware description.
C++ · Performance Engineering · System Architecture · EDA · Qt
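The non-blocking lookup pattern behind the caching layer can be illustrated with a generic sketch. It is not Cadence code: `Symbol`, `SymbolCache`, `try_get`, and `drain` are hypothetical names, and the `drain` step stands in for work that a background worker would perform off the UI thread. Threads are omitted so the example stays deterministic.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <utility>

// Hypothetical component record.
struct Symbol {
    std::string name;
    int pin_count;
};

class SymbolCache {
public:
    explicit SymbolCache(std::function<Symbol(const std::string&)> fetch)
        : fetch_(std::move(fetch)) {}

    // Never blocks: returns the cached symbol, or nullptr after queueing
    // a fetch request (each id is queued at most once).
    const Symbol* try_get(const std::string& id) {
        auto it = cache_.find(id);
        if (it != cache_.end()) {
            return &it->second;
        }
        if (queued_.insert(id).second) {
            pending_.push_back(id);
        }
        return nullptr;
    }

    // Performs all queued fetches and returns how many completed.
    // In a real application this would run on a worker, not the UI thread.
    std::size_t drain() {
        std::size_t done = 0;
        while (!pending_.empty()) {
            const std::string id = pending_.front();
            pending_.pop_front();
            cache_.emplace(id, fetch_(id));
            queued_.erase(id);
            ++done;
        }
        assert(queued_.empty());
        return done;
    }

private:
    std::function<Symbol(const std::string&)> fetch_;
    std::unordered_map<std::string, Symbol> cache_;
    std::unordered_set<std::string> queued_;
    std::deque<std::string> pending_;
};
```

The caller draws a placeholder when `try_get` returns `nullptr` and retries after the fetch completes, so a slow server round trip never stalls the interface.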
2019 – 2023 | Dehradun
University of Petroleum and Energy Studies
B.Tech CSE DevOps Hons
Gained a rigorous foundation in core Computer Science, specializing in the lifecycle of complex software systems. While my formal curriculum focused on high-level abstractions and DevOps, I utilized my senior years to independently bridge the gap into hardware-aware computing. This self-directed transition involved applying systems programming principles to resource-constrained environments, eventually leading to my first-author research in TinyML.
Academic Highlights
  • Systems & Theory Mastery: Consistently top-graded in core CS theory including Advanced Data Structures (O), Formal Languages and Automata (A+), and Operating Systems (A).
  • Engineering Resilience: Demonstrated a strong upward academic trajectory, moving from foundational courses to advanced systems engineering with a focus on disciplined software delivery.
  • Independent Research Pivot: Despite curriculum limitations, independently pursued a deeper understanding of embedded systems through project-based learning, resulting in a first-author peer-reviewed publication on the ESP32-S3.
Computer Systems · Data Structures · AI/ML · Computer Architecture · Embedded Systems · CI/CD