Tommy McMichen

Abstract

I am a PhD candidate at Northwestern University with a B.S. in Computer Engineering and Computer Science from Rose-Hulman Institute of Technology. I study compilers, specifically looking into new intermediate representations and abstractions. My research aims to broaden the optimization space of compilers through intermediate representations that grant empowering degrees of freedom through strong guarantees. I am broadly interested in static analysis, runtime system co-design, programming languages, and memory models.

Publications

Automatic Data Enumeration for Fast Collections

The paper is available here.
The source code is available on GitHub.
The artifact is available here.

Tommy McMichen and Simone Campanoni
CGO 2026
Choosing an implementation for each data collection in a program is critical for performance, memory usage and energy consumption. Specialized implementations offer significant benefits over their general-purpose counterparts, but require certain properties of the data they store, such as uniqueness or ordering. To employ them, developers must either possess domain knowledge or transform their data to exhibit the desired property. One such transformation is data enumeration, where data items are assigned unique identifiers to enable fast equality checks and compact memory layout. In this paper, we automate data enumeration in the MemOIR compiler, achieving speedups of 2.16× on average (up to 8.72×) and reducing peak memory consumption by 5.6% on average (up to 50.7%). This work shows that automated techniques can manufacture data properties to unlock specialized collection implementations, pushing the envelope of collection-oriented optimization.
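To make the idea of data enumeration concrete, here is a minimal sketch written by hand in Python. It is not the MemOIR implementation, and the names `Enumerator` and `BitSet` are hypothetical. It only illustrates how assigning each distinct item a small integer identifier lets a general-purpose set be replaced by a compact bitset with cheap integer equality checks.

```python
class Enumerator:
    """Assigns each distinct item a dense integer identifier (hypothetical helper)."""

    def __init__(self):
        self._ids = {}    # item -> identifier
        self._items = []  # identifier -> item

    def encode(self, item):
        # Reuse the identifier if the item was seen before, else mint a new one.
        ident = self._ids.get(item)
        if ident is None:
            ident = len(self._items)
            self._ids[item] = ident
            self._items.append(item)
        return ident

    def decode(self, ident):
        return self._items[ident]


class BitSet:
    """A set of small non-negative integers packed into a single int."""

    def __init__(self):
        self._bits = 0

    def add(self, ident):
        self._bits |= 1 << ident

    def __contains__(self, ident):
        return bool((self._bits >> ident) & 1)

    def __len__(self):
        return bin(self._bits).count("1")


# Usage: a set of strings becomes a bitset of identifiers.
enum = Enumerator()
seen = BitSet()
for word in ["alpha", "beta", "alpha", "gamma"]:
    seen.add(enum.encode(word))
```

In this sketch the enumeration is written manually. The paper's contribution is to have the compiler introduce it automatically.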

Representing Data Collections in an SSA Form

The paper is available here.
The source code is available on GitHub.
The artifact is available here.
For more information, see IEEE Xplore.

Tommy McMichen, Nathan Greiner, Peter Zhong, Federico Sossai, Atmn Patel, and Simone Campanoni
CGO 2024
Collection-oriented programming is a mainstay in many fields of computer science and its application. However, modern general-purpose compilers are entirely blind to the use of collections within programs. By prematurely lowering collections to their memory layout in libraries and compiler front ends, ambiguous memory behavior is presented to the compiler. Compilers are left to glean conservative information from costly analyses. All of this culminates in data collections being a barrier to optimization for the compiler. We introduce MemOIR, an SSA form for data collections, objects and their fields. At its core, MemOIR decouples the memory used to store data from the memory used to logically organize data. We use MemOIR to explore novel static analyses and transformations of collections at the element level.

Saving Energy with Per-Variable Bitwidth Selection

The paper is available here.
The artifact is available here.

Tommy McMichen, David Dlott, Panitan Wangse-ammat, Nathan Greiner, Hussain Khajanchi, Russ Joseph, and Simone Campanoni
ASPLOS 2025
Tiny devices have become ubiquitous in people’s daily lives. Their use cases dictate tight energy budgets with reasonable performance to meet user expectations. To this end, the hardware of tiny devices has been highly optimized, making further optimizations difficult. We identify a missed opportunity: the bitwidth selection of program variables. Today, the bitwidth specified in the source code is directly translated to the binary. However, we observe that most variables do not utilize their full bitwidth for the majority of execution. To leverage this, we propose BitSpec: a compiler-architecture codesign that performs fine-grained speculation on the bitwidth of program variables. Through a combination of speculative transformations in the compiler, and dedicated hardware monitoring, BitSpec reduces energy consumption of tiny devices by 14.4% on average, and up to 28.2% while ensuring correctness.
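The speculation idea can be illustrated with a small software-only sketch in Python. This is an assumption-laden toy, not BitSpec itself: BitSpec relies on compiler transformations plus dedicated hardware monitoring, whereas here the overflow check and the recovery path are plain function calls, and the names `add_narrow` and `speculative_sum` are hypothetical. Inputs are assumed to be non-negative integers.

```python
def add_narrow(a, b, bits):
    """Add two non-negative ints within `bits` bits; return (result, overflowed)."""
    mask = (1 << bits) - 1
    total = a + b
    return total & mask, total > mask


def speculative_sum(values, narrow_bits=8, wide_bits=32):
    """Sum `values` at a narrow bitwidth, re-executing at full width on overflow.

    Returns (sum, speculation_held).
    """
    acc = 0
    for v in values:
        acc, overflowed = add_narrow(acc, v, narrow_bits)
        if overflowed:
            # Misspeculation: discard the narrow result and redo the work wide.
            wide = 0
            for w in values:
                wide, _ = add_narrow(wide, w, wide_bits)
            return wide, False
    return acc, True
```

When the values stay small, the whole computation runs at the narrow width. The wide path is only taken when the speculation fails, which is what preserves correctness.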

Getting a Handle on Unmanaged Memory

The paper is available here.
The artifact is available here.
For more information, see the ACM DL.

Nick Wanninger, Tommy McMichen, Simone Campanoni, and Peter Dinda
ASPLOS 2024
The inability to relocate objects in unmanaged languages brings with it a menagerie of problems. Perhaps the most impactful is memory fragmentation, which has long plagued applications such as databases and web servers. These issues either fester or require Herculean programmer effort to address on a per-application basis because, in general, heap objects cannot be moved in unmanaged languages. In this work, we bridge this gap between unmanaged and managed languages through the use of handles, a level of indirection allowing heap object movement. Handles open the door to seamlessly employ runtime features from managed languages in existing, unmodified code written in unmanaged languages. We describe a new compiler and runtime system, Alaska, which automatically transforms pointer-based code to utilize handles, with optimizations to reduce performance impact. A codesigned runtime system manages this level of indirection and exploits heap object movement via an extensible service interface. We evaluate one such service, which eliminates fragmentation on the heap via compaction, reducing memory usage by up to 40% in Redis.
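The role of handles as a level of indirection can be sketched with a toy model in Python. This is not Alaska's design or API: the class `HandleTable` and its methods are hypothetical, and the "heap" is just a list. The point is only that code holding handles rather than raw addresses keeps working after objects are moved during compaction.

```python
class HandleTable:
    """Toy heap in which objects are reached through handles, not addresses."""

    def __init__(self):
        self._heap = []     # simulated heap slots; freed slots become holes
        self._slot_of = {}  # handle -> current slot index
        self._next = 0      # next unused handle value

    def alloc(self, value):
        handle = self._next
        self._next += 1
        self._slot_of[handle] = len(self._heap)
        self._heap.append(value)
        return handle

    def free(self, handle):
        # Leave a hole behind; this is the fragmentation compaction removes.
        self._heap[self._slot_of.pop(handle)] = None

    def deref(self, handle):
        # Every access pays one extra lookup through the handle table.
        return self._heap[self._slot_of[handle]]

    def compact(self):
        # Slide live objects together and update only the table entries.
        live = sorted(self._slot_of.items(), key=lambda kv: kv[1])
        new_heap = []
        for handle, slot in live:
            self._slot_of[handle] = len(new_heap)
            new_heap.append(self._heap[slot])
        self._heap = new_heap

    def heap_size(self):
        return len(self._heap)
```

After `compact()`, every outstanding handle still dereferences to the same object even though its slot has changed, which is the property that raw pointers in unmanaged code cannot offer.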

Program State Element Characterization

The paper is available here.
The source code is available on GitHub.
For more information, see the ACM DL.

Enrico Armenio Deiana, Brian Suchy, Michael Wilkins, Brian Homerding, Tommy McMichen, Katarzyna Dunajewski, Peter Dinda, Nikos Hardavellas, and Simone Campanoni
CGO 2023
Program State Element Characterization (PSEC) enables developers to more easily use modern language abstractions like OpenMP and C++ smart pointers. To accomplish this, PSEC provides a characterization of program state elements within a given code region. Today this process must be performed manually with no dedicated tool support. We implement PSEC in the CARMOT tool. Using CARMOT, developers are able to achieve parallel speedups that match those of hand-tuned OpenMP directives and avoid memory leaks with C++ smart pointers. We hope that PSEC and CARMOT empower developers to fully utilize the rich, expanding ecosystem of modern programming language abstractions.

NOELLE Offers Empowering LLVM Extensions

The paper is available here.
The source code is available on GitHub.
For more information, see IEEE Xplore.

Angelo Matni, Enrico Armenio Deiana, Yian Su, Lukas Gross, Souradip Ghosh, Sotiris Apostolakis, Ziyang Xu, Zujun Tan, Ishita Chaturvedi, Brian Homerding, Tommy McMichen, David I. August, and Simone Campanoni
CGO 2022
NOELLE provides abstractions to help build advanced code analyses and transformations on top of the production-quality LLVM compiler. NOELLE has been used to accelerate a diverse set of research prototypes, with a powerful automatically parallelizing compiler built upon it. It is available open source on GitHub to help accelerate your compiler research.

Fine-Grained Acceleration using Runtime Integrated Custom Execution (RICE)

The paper is available here.
For more information, see the ACM DL.

Leela Pakanati, John T. McMichen, and Zachary Estrada
CASES 2019
Runtime Integrated Custom Execution (RICE) relocates traditional peripheral reconfigurable acceleration devices into the pipeline of the processor. This relocation unlocks fine-grained acceleration previously impeded by communication overhead to a peripheral accelerator. Preliminary simulation results on a subset of the PARSEC benchmark suite show promise for RICE in HPC applications. This was published in a 'work-in-progress' paper track.

Developing Parallel Computation Architectures

Available on GitHub.

Senior design project to improve the performance and efficiency of biologically accurate neuron simulations using the Hodgkin-Huxley model and a LUT accelerator approach. Designed and developed a novel parallel architecture for parallel variable time-step integrators (VITAMIN), which achieved strong scaling on multi-core machines. Minimized the hardware area of the LUT accelerator while keeping execution time similar. Implemented software changes that reduced execution time.
