Tommy McMichen

I am a Ph.D. candidate at Northwestern University with a B.S. in Computer Engineering and Computer Science from Rose-Hulman Institute of Technology. I study compilers, specifically the design of intermediate representations. My research aims to broaden the optimization space of compilers through intermediate representations that grant empowering degrees of freedom through strong guarantees. Currently, I focus on doing this for data collections (lists, sets, maps) using MemOIR. I am broadly interested in static analysis, runtime system co-design, programming languages, and memory models.

Selected Publications

Automatic Data Enumeration for Fast Data Collections

Tommy McMichen, Simone Campanoni

Choosing an implementation for each data collection in a program is critical for performance, memory usage and energy consumption. Specialized implementations offer significant benefits over their general-purpose counterparts, but require certain properties of the data they store, such as uniqueness or ordering. To employ them, developers must either possess domain knowledge or transform their data to exhibit the desired property. One such transformation is data enumeration, where data items are assigned unique identifiers to enable fast equality checks and compact memory layout. In this paper, we automate data enumeration in the MemOIR compiler, achieving speedups of 2.16× on average (up to 8.72×) and reducing peak memory consumption by 5.6% on average (up to 50.7%). This work shows that automated techniques can manufacture data properties to unlock specialized collection implementations, pushing the envelope of collection-oriented optimization.

Saving Energy with Per-Variable Bitwidth Speculation

Tommy McMichen, David Dlott, Panitan Wongse-ammat, Nathan Greiner, Hussain Khajanchi, Russ Joseph, Simone Campanoni

Tiny devices have become ubiquitous in people's daily lives. Their applications dictate tight energy budgets, but also require reasonable performance to meet user expectations. To this end, the hardware of tiny devices has been highly optimized, making further optimizations difficult. In this work, we identify a missed opportunity: the bitwidth selection of program variables. Today's compilers directly translate the bitwidth specified in the source code to the binary. However, we observe that most variables do not utilize the full bitwidth specified in the source code for the majority of execution. To leverage this opportunity, we propose BitSpec: a system that performs fine-grained speculation on the bitwidth of program variables. BitSpec is implemented as a compiler-architecture co-design, where the compiler transparently reduces the bitwidth of program variables to their expected needs and the hardware monitors speculative variables, reporting misspeculation to the software, which re-executes at the original bitwidth, ensuring correctness. BitSpec reduces energy consumption by 9.9% on average, up to 28.2%.

Representing Data Collections in an SSA Form

Tommy McMichen, Nathan Greiner, Peter Zhong, Federico Sossai, Atmn Patel, Simone Campanoni

Choosing an implementation for each data collection in a program is critical for performance, memory usage and energy consumption. Specialized implementations offer significant benefits over their general-purpose counterparts, but require certain properties of the data they store, such as uniqueness or ordering. To employ them, developers must either possess domain knowledge or transform their data to exhibit the desired property. One such transformation is data enumeration, where data items are assigned unique identifiers to enable fast equality checks and compact memory layout. In this paper, we automate data enumeration in the MemOIR compiler, achieving speedups of 2.16× on average (up to 8.72×) and reducing peak memory consumption by 5.6% on average (up to 50.7%). This work shows that automated techniques can manufacture data properties to unlock specialized collection implementations, pushing the envelope of collection-oriented optimization.

Getting a Handle on Unmanaged Memory

Nick Wanninger, Tommy McMichen, Simone Campanoni, Peter Dinda

The inability to relocate objects in unmanaged languages brings with it a menagerie of problems. Perhaps the most impactful is memory fragmentation, which has long plagued applications such as databases and web servers. These issues either fester or require Herculean programmer effort to address on a per-application basis because, in general, heap objects cannot be moved in unmanaged languages. In this work, we bridge this gap between unmanaged and managed languages through the use of handles, a level of indirection allowing heap object movement. Handles open the door to seamlessly employ runtime features from managed languages in existing, unmodified code written in unmanaged languages. We describe a new compiler and runtime system, Alaska, which automatically transforms pointer-based code to utilize handles, with optimizations to reduce performance impact. A codesigned runtime system manages this level of indirection and exploits heap object movement via an extensible service interface. We evaluate one such service, which eliminates fragmentation on the heap via compaction, reducing memory usage by up to 40% in Redis.

1 Feel free to email me at mcmichen@u.northwestern.edu

2 Set up an appointment on my Google Calendar.

3 Follow my work on GitHub.

4 Connect with me on LinkedIn.

5 See what I'm up to on Mastodon