About
I am Felix Hirwa Nshuti, an MS student in Electrical and Computer Engineering at Carnegie Mellon University
(CMU). My work sits at the intersection of computer architecture, optimizing compilers,
and theoretical machine learning. I focus on building next-generation ML infrastructure by
integrating low-level systems engineering with high-level mathematical abstractions, using
convex optimization to close the gap between algorithmic intent and architectural execution.
My research and development efforts center on compiler infrastructure for AI workloads,
exploring how intermediate representations (IR) can effectively transform computation graphs into
hardware-aware schedules. Additionally, I specialize in ML-driven signal processing, where I
apply tensor-level optimizations to ensure predictable performance and scalability in real-time inference
environments.
As an active open-source contributor and maintainer, I am dedicated to community collaboration and the
development of verifiable, efficient AI systems. Previously, I completed my Bachelor's in Computer Science
and Engineering at Pandit Deendayal Energy University (PDEU), focusing on programming languages, compilers,
and machine learning.
Research Interests
- Optimization and theory: Deeply interested in the intersection of theoretical machine
learning and convex optimization. Experienced in applying mathematical rigor to close the gap between
algorithmic intent and architectural execution.
- Signal processing: Specialized in the design of efficient, scalable infrastructure for
ML-driven signal processing, leveraging tensor-level optimizations to ensure predictable performance in
real-time inference environments.
- Systems for ML: Research and development in compiler infrastructure for AI workloads,
focusing on how intermediate representations (IR) transform computation graphs into hardware-aware
schedules.
- Objective: Build next-generation ML systems that integrate low-level systems engineering
with high-level mathematical abstractions for efficient and verifiable AI deployment.
In my free time, I play football and share memes with friends.
Selected Projects
Knowledge-Based Visual Question Answering — Python, PyTorch, Transformers, NLP, CV, VQA — Aug 2025
to Dec 2025
- Improved performance of knowledge-based VQA systems by integrating question-aware captioning.
- Built a module that generates contextually relevant captions to improve knowledge retrieval.
Time Series Forecasting with Contextual Features — Python, Pandas, NumPy, XGBoost, CatBoost,
statsmodels — Aug 2025 to Dec 2025
- Incorporated temporal, weather, and event-based features to enhance prediction accuracy.
- Applied advanced time series analysis to capture complex usage patterns.
Compiler Design and Architecture Translation — LLVM, x86, AArch64, C++, IR, CFG — Dec 2024 to
May 2025
- Built a compiler pipeline using the LLVM C++ API to translate x86 to AArch64 via LLVM IR.
- Extracted and analyzed CFGs to support instruction-level translation.
Machine Learning with Time Series — Python, PyTorch, TensorFlow, Darts, CI/CD — May 2024 to Aug 2024
- Added new classification models to sktime using PyTorch (GRU and GRU-FCNN Classifiers).
- Migrated classifier models from legacy sktime-dl to the sktime main repository.
- Implemented a modular interface to Darts regression models in sktime.