Felix Hirwa Nshuti

About

I am Felix Hirwa Nshuti, an MS student in Electrical and Computer Engineering at Carnegie Mellon University (CMU). My work sits at the intersection of computer architecture, optimizing compilers, and theoretical machine learning. I build ML infrastructure that combines low-level systems engineering with high-level mathematical abstractions, using convex optimization to close the gap between algorithmic intent and architectural execution.

My research and development efforts center on compiler infrastructure for AI workloads, exploring how intermediate representations (IRs) transform computation graphs into hardware-aware schedules. I also work on ML-driven signal processing, where I apply tensor-level optimizations for predictable performance and scalability in real-time inference.

As an active open-source contributor and maintainer, I am committed to community collaboration and to building verifiable, efficient AI systems. Previously, I completed my Bachelor's in Computer Science and Engineering at Pandit Deendayal Energy University (PDEU), where I focused on programming languages, compilers, and machine learning.

Research Interests

Optimization and theory: Theoretical machine learning and convex optimization, applied with mathematical rigor to close the gap between algorithmic intent and architectural execution.
Signal processing: Efficient, scalable infrastructure for ML-driven signal processing, using tensor-level optimizations for predictable performance in real-time inference environments.
Systems for ML: Compiler infrastructure for AI workloads, focusing on how intermediate representations (IRs) transform computation graphs into hardware-aware schedules.
Objective: Build next-generation ML systems that integrate low-level systems engineering with high-level mathematical abstractions for efficient and verifiable AI deployment.

In my free time, I play football and share memes with friends.

Research

Accepted Papers

Comparative Analysis of Structural Code Representation with ML-Driven Embedding and Tool Support.
15th International Conference on Software Engineering and Applications (SEAS 2026), Copenhagen, Denmark — 2026

Accepted Conference Talks

TransISA: A Static Assembly Transpiler for Automating x86-to-ARM Migration in Scientific Computing.
Improving Scientific Software Conference, Boulder CO — 2026

Selected Projects

Enhancing Prophet with Question-Aware Captioning for Knowledge-Based VQA

Knowledge-Based Visual Question Answering — Python, PyTorch, Transformers, NLP, CV, VQA — Aug 2025 to Dec 2025

Improved performance of knowledge-based VQA systems by integrating question-aware captioning.
Built a module that generates contextually relevant captions to improve knowledge retrieval.

Context-Aware Demand Forecasting in Pittsburgh's Bike Share System

Time Series Forecasting with Contextual Features — Python, Pandas, NumPy, XGBoost, CatBoost, statsmodels — Aug 2025 to Dec 2025

Incorporated temporal, weather, and event-based features to enhance prediction accuracy.
Applied advanced time series analysis to capture complex usage patterns.

TransISA — Lightweight CISC-to-RISC Transpiler

Compiler Design and Architecture Translation — LLVM, x86, AArch64, C++, IR, CFG — Dec 2024 to May 2025

Built a compiler pipeline with the LLVM C++ API that translates x86 assembly to AArch64 via LLVM IR.
Extracted and analyzed control-flow graphs (CFGs) to support instruction-level translation.

Scaling Deep Learning Backends with sktime

Machine Learning with Time Series — Python, PyTorch, TensorFlow, Darts, CI/CD — May 2024 to Aug 2024

Added new PyTorch-based classification models to sktime (GRU and GRU-FCNN classifiers).
Migrated classifier models from legacy sktime-dl to the sktime main repository.
Implemented a modular interface to Darts regression models in sktime.