Résumé
Felix Hirwa Nshuti
I am Felix Hirwa Nshuti, an MS student in Electrical and Computer Engineering at
Carnegie Mellon University (CMU). My work sits at the intersection of
computer architecture, optimizing compilers, and theoretical machine learning.
I focus on building ML infrastructure that integrates low-level systems
engineering with high-level mathematical abstractions, using
convex optimization to close the gap between algorithmic intent
and architectural execution.
My research and development efforts center on compiler infrastructure for AI workloads,
exploring how intermediate representations (IRs) transform computation
graphs into hardware-aware schedules. I also specialize in
ML-driven signal processing, where I apply tensor-level optimizations
to keep performance predictable and scalable in real-time inference environments.
As an active open-source contributor and maintainer, I am dedicated to community
collaboration and the development of verifiable, efficient AI systems. Previously,
I completed my Bachelor's in Computer Science and Engineering at Pandit Deendayal
Energy University (PDEU), focusing on programming languages, compilers, and machine learning.
-
Optimization & Theory:
Deeply interested in the intersection of theoretical machine learning and convex optimization.
Experienced in applying mathematical rigor to close the gap between algorithmic intent and
architectural execution.
-
Signal Processing:
Specialized in the design of efficient, scalable infrastructure for ML-driven signal processing,
leveraging tensor-level optimizations to ensure predictable performance in real-time inference
environments.
-
Systems for ML:
Research and development in compiler infrastructure for AI workloads, focusing on how
intermediate representations (IR) transform computation graphs into hardware-aware schedules.
-
Objective:
To build next-generation ML systems that integrate low-level systems engineering with
high-level mathematical abstractions for efficient and verifiable AI deployment.
In my free time, I play football and share memes with friends 😎
Research
Accepted Conference Talks
-
TransISA: A Static Assembly Transpiler for Automating x86-to-ARM Migration in Scientific
Computing.
Improving Scientific Software Conference, Boulder, CO, 2026
Selected Projects
-
Enhancing Prophet with Question-Aware Captioning for Knowledge-Based VQA
Knowledge-Based Visual Question Answering — Python, PyTorch, Transformers, NLP, CV,
VQA — Aug 2025 – Dec 2025
- Aimed to improve the performance of knowledge-based VQA systems by integrating
question-aware captioning techniques into the Prophet framework.
- Developed a module that generates contextually relevant captions based on the input
question, enhancing the model's ability to retrieve and utilize external knowledge
effectively.
-
Context-Aware Demand Forecasting in Pittsburgh's Bike Share System
Time Series Forecasting with Contextual Features — Python, Pandas, NumPy, XGBoost,
CatBoost, statsmodels — Aug 2025 – Dec 2025
- Developed a context-aware demand forecasting model that incorporates temporal, weather, and
event-based features to improve prediction accuracy.
- Combined statistical time series analysis (statsmodels) with gradient-boosted models
(XGBoost, CatBoost) to capture complex patterns in bike usage data.
-
TransISA — Lightweight CISC-to-RISC Transpiler
Compiler Design and Architecture Translation — LLVM, x86, AArch64, C++, IR, CFG
— Dec 2024 – May 2025
- Designed and implemented a compiler pipeline using the LLVM C++ API to translate x86
assembly to AArch64 assembly via LLVM IR.
- Extracted and analyzed Control Flow Graphs (CFGs) of source programs to support
instruction-level translation.
-
Scaling Deep Learning Backends with sktime
Machine Learning with Time Series — Python, PyTorch, TensorFlow, Darts, CI/CD
— May 2024 – Aug 2024
- Added new PyTorch-based classification models to sktime (GRU and GRU-FCNN classifiers).
- Migrated the classifier models from the legacy sktime-dl repository into the main sktime
repository.
- Implemented a modular interface to Darts regression models in sktime.
Some Open-Source Contributions
-
Ivy
- A Unified AI Framework (Maintainer: 2022 - 2023)
-
pytorch-forecasting
- Time series forecasting with PyTorch (Maintainer: 2024 - Present)
-
sktime
- A unified framework for machine learning with time series (Maintainer: 2024 - Present)
-
pytorch-lightning
- Pretrain, finetune and deploy AI models on multiple GPUs, TPUs with zero code changes.
(Contributor: 2024 - Present)