(Most of) Bradley C. Kuszmaul's Papers and Publications.


These papers are sorted chronologically, newest first.


[BenderDeKu23]
Increment-and-Freeze: Every Cache, Everywhere, All of the Time,
by Michael Bender, Daniel Delayo, Bradley C. Kuszmaul, William Kuszmaul, and Evan West.
In The 35th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 2023), Orlando, Florida, June 18–20, 2023.
[BenderKuKu21]
Linear Probing Revisited: Tombstones Mark the Death of Primary Clustering,
by Michael Bender, Bradley C. Kuszmaul, and William Kuszmaul.
An early (arXiv) version of a paper to appear in FOCS 2021, Denver, CO, February 2022.
[LeisersonThEm20]
There's Plenty of Room at the Top: What Will Drive Computer Performance After Moore's Law?,
by Charles E. Leiserson, Neil Thompson, Joel S. Emer, Bradley C. Kuszmaul, Butler W. Lampson, Daniel Sanchez, and Tao B. Schardl.
In Science, Volume 367, Issue 6495, 5 June, 2020.
Available through Neil's website.
[FrigoKuMa20]
Everyone Loves File: Oracle File Storage Service,
by Matteo Frigo, Bradley C. Kuszmaul, Justin Mazzola Paluska, and Alexander (Sasha) Sandler.
In ACM Transactions on Storage (TOS), Volume 16, Issue 1, Article 3, March, 2020.
https://doi.org/10.1145/3377877
An early version appeared as FrigoKuMa19.
[FrigoKuMa19]
Everyone Loves File: File Storage Service (FSS) in Oracle Cloud Infrastructure,
by Matteo Frigo, Bradley C. Kuszmaul, Justin Mazzola Paluska, and Alexander (Sasha) Sandler.
In The 2019 USENIX Annual Technical Conference (USENIX ATC '19), Renton, Washington, June 10–12, 2019, pp. 15–32.
See also FrigoKuMa20.
[SchardlDeDo17]
The CSI Framework for Compiler-Inserted Program Instrumentation,
by Tao B. Schardl, Tyler Denniston, Damon Doucet, Bradley C. Kuszmaul, I-Ting Angelina Lee, and Charles E. Leiserson.
In Proceedings of the ACM on Measurement and Analysis of Computing Systems, Volume 1, Issue 2, December 2017.
Also known as ACM SIGMETRICS, Irvine, California, June 18–22, 2018.
[YuanZhJa17]
Writes Wrought Right, and Other Adventures in File System Optimization,
by Jun Yuan, Yang Zhan, William Jannen, Prashant Pandey, Amogh Akshintala, Kanchan Chandnani, Pooja Deo, Zardosht Kasheff, Leif Walsh, Michael Bender, Martin Farach-Colton, Rob Johnson, Bradley C. Kuszmaul, and Donald E. Porter.
In ACM Transactions on Storage (TOS) (Special Issue on USENIX FAST 2016 and Regular Papers), Volume 13, Issue 1, Article 3, March 2017.
An early version appeared as YuanZhJa16.
[BenderEbHu16]
B-trees and Cache-Oblivious B-trees with Different-Sized Atomic Keys,
by Michael Bender, Roozbeh Ebrahimi, Haodong Hu, and Bradley C. Kuszmaul.
In Transactions on Database Systems, Volume 41, Number 3, Article 19 (33 pages), July, 2016.
[JannenBeFa16]
Lazy Analytics: Let Other Queries Do the Work For You,
by William Jannen, Michael Bender, Martin Farach-Colton, Rob Johnson, Bradley C. Kuszmaul, and Donald E. Porter.
In The 8th USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage '16), Denver, Colorado, June 20–21, 2016.
[DreherByHi16]
Page Rank Pipeline Benchmark: Proposal for a Holistic Benchmark for Big-Data Platforms,
by Patrick Dreher, Chansup Byun, Chris Hill, Vijay Gadepally, Bradley C. Kuszmaul, and Jeremy Kepner.
In Graph Algorithms Building Blocks (GABB '16), Chicago, Illinois, May 23, 2016.
[ChowdhuryGaTi16]
Autogen: Automatic Discovery of Cache-Oblivious Parallel Recursive Algorithms for Solving Dynamic Programs,
by Rezaul A. Chowdhury, Pramod Ganapathi, Jesmin Jahan Tithi, Charles Bachmeier, Bradley C. Kuszmaul, Charles E. Leiserson, Armando Solar-Lezama, and Yuan Tang.
In Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP '16), Barcelona, Spain, March 12–16, 2016.
[YuanZhJa16]
Optimizing Every Operation in a Write-optimized File System,
by Jun Yuan, Yang Zhan, William Jannen, Prashant Pandey, Amogh Akshintala, Kanchan Chandnani, Pooja Deo, Zardosht Kasheff, Leif Walsh, Michael Bender, Martin Farach-Colton, Rob Johnson, Bradley C. Kuszmaul, and Donald E. Porter.
In The 14th USENIX Conference on File and Storage Technologies (FAST '16), Santa Clara, California, February 22–25, 2016.
Best paper award. Also invited to USENIX ATC 2016 "best of the rest" section.
See also YuanZhJa17.
[JannenYuZh15b]
BetrFS: Write-Optimization in a Kernel File System,
by William Jannen, Jun Yuan, Yang Zhan, Amogh Akshintala, John Esmet, Yizheng Jiao, Ankur Mittal, Prashant Pandey, Phaneendra Reddy, Leif Walsh, Michael Bender, Martin Farach-Colton, Rob Johnson, Bradley C. Kuszmaul, and Donald E. Porter.
In ACM Transactions on Storage (TOS), Volume 11, Number 4, Article 18, November, 2015.
An early version appeared as JannenYuZh15.
[BenderFaJa15]
An Introduction to Bε-trees and Write-Optimization,
by Michael Bender, Martin Farach-Colton, William Jannen, Rob Johnson, Bradley C. Kuszmaul, Donald E. Porter, Jun Yuan, and Yang Zhan.
;login: The USENIX Magazine, Volume 40, Number 5, October 2015, pp. 22–28.
[Kuszmaul15]
SuperMalloc: A Super Fast Multithreaded malloc for 64-bit Machines,
by Bradley C. Kuszmaul.
In The 2015 International Symposium on Memory Management (ISMM), pp. 41–55, Portland, Oregon, June 14, 2015.
[SchardlKuLe15]
The Cilkprof Scalability Profiler,
by Tao B. Schardl, Bradley C. Kuszmaul, I-Ting Angelina Lee, William M. Leiserson, and Charles E. Leiserson.
In The 27th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 2015), pp. 89–100, Portland, Oregon, June 13–15, 2015.
[JannenYuZh15]
BetrFS: A Right-Optimized Write-Optimized File System,
by William Jannen, Jun Yuan, Yang Zhan, Amogh Akshintala, John Esmet, Yizheng Jiao, Ankur Mittal, Prashant Pandey, Phaneendra Reddy, Leif Walsh, Michael Bender, Martin Farach-Colton, Rob Johnson, Bradley C. Kuszmaul, and Donald E. Porter.
In The 13th USENIX Conference on File and Storage Technologies, pp. 301–315, Santa Clara, California, February 16–19, 2015.
Runner up, best paper award.
See also JannenYuZh15b.
[KuszmaulKu14]
Avoiding Tree Saturation in the Face of Many Hotspots with Few Buffers,
by Bradley C. Kuszmaul and William H. Kuszmaul.
In The 16th IEEE International Conference on High Performance Computing and Communications (HPCC), pp. 472–481, Paris, France, August 20–22, 2014.
[KuszmaulKu14a]
Brief Announcement: Few Buffers, Many Hot Spots, and No Tree Saturation (with High Probability),
by Bradley C. Kuszmaul and William H. Kuszmaul.
In The 26th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 2014), pp. 67–69, Prague, Czech Republic, June 23–25, 2014.
[AvniKu14]
Improving HTM Scaling with Consistency-Oblivious Programming,
by Hillel Avni and Bradley C. Kuszmaul.
In The 9th ACM SIGPLAN Workshop on Transactional Computing (TRANSACT 2014), Salt Lake City, Utah, March 2, 2014.
[BenderFaJo12]
Don't Thrash: How to Cache Your Hash on Flash,
by Michael A. Bender, Martin Farach-Colton, Rob Johnson, Russell Kraner, Bradley C. Kuszmaul, Dzejla Medjedovic, Pablo Montes, Pradeep Shetty, Richard P. Spillane, and Erez Zadok.
Proceedings of the VLDB Endowment, Volume 5, Number 11, pp. 1627–1637, July, 2012.
Original article.
[EsmetBeFa12]
The TokuFS Streaming File System,
by John Esmet, Michael A. Bender, Martin Farach-Colton, and Bradley C. Kuszmaul.
In The 4th USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage '12), Boston, Massachusetts, June 13–14, 2012.
[TangChKu11]
The Pochoir Stencil Compiler,
by Yuan Tang, Rezaul Alam Chowdhury, Bradley C. Kuszmaul, Chi-Keung Luk, and Charles E. Leiserson.
In Proceedings of the 23rd ACM Symposium on Parallelism in Algorithms and Architectures, pp. 117–128, San Jose, California, June 4–6, 2011.
[BenderKuTe11]
Optimal Cache-Oblivious Mesh Layouts,
by Michael A. Bender, Bradley C. Kuszmaul, Shang-Hua Teng, and Kebin Wang.
In Theory of Computing Systems (TOCS), Volume 48, Issue 2, pp. 269–296, February 2011.
[BenderHuKu10]
Performance Guarantees for Different-Sized Atomic Keys,
by Michael A. Bender, Haodong Hu, and Bradley C. Kuszmaul.
In Proceedings of the 29th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, pp. 305–316, Indianapolis, Indiana, June 6–11, 2010.
[KuszmaulSu06]
Concurrent Cache-Oblivious B-Trees Using Transactional Memory,
by Bradley C. Kuszmaul and Jim Sukha.
In the Workshop on Transactional Memory Workloads, Ottawa, Canada, June 10, 2006.
[BenderFaKu06]
Cache-Oblivious String B-Trees,
by Michael A. Bender, Martin Farach-Colton, and Bradley C. Kuszmaul.
In Proceedings of the 25th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS 2006), pp. 233–242, Chicago, Illinois, June 26–29, 2006.
[BenderFaHe05]
Adversarial Contention Resolution for Simple Channels,
by Michael A. Bender, Martin Farach-Colton, Simai He, Bradley C. Kuszmaul, and Charles E. Leiserson.
In Proceedings of the 17th Annual ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 2005), pp. 325–332, July 2005, Las Vegas, Nevada.
[BenderFiGi05]
Concurrent Cache-Oblivious B-Trees,
by Michael A. Bender, Jeremy T. Fineman, Seth Gilbert, and Bradley C. Kuszmaul.
In Proceedings of the 17th Annual ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 2005), pp. 228–237, July 2005, Las Vegas, Nevada.
[Kuszmaul05]
A Segmented Parallel-Prefix VLSI Circuit with Small Delays for Small Segments,
by Bradley C. Kuszmaul.
In Proceedings of the 17th Annual ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 2005) (brief announcement), page 213, July 2005, Las Vegas, Nevada.
[AnanianAsKu05]
Unbounded Transactional Memory,
by C. Scott Ananian, Krste Asanović, Bradley C. Kuszmaul, Charles E. Leiserson, and Sean Lie.
The IEEE MICRO Special Issue: Top Picks from Computer Architecture Conferences, Volume 26, Number 1, Jan/Feb 2006, pp. 59–69.
Named "one of the most industry-relevant and significant papers of the year in computer architecture".
An earlier version appeared in The 11th International Symposium on High-Performance Computer Architecture (HPCA '05), pp. 316–327, February 2005, San Francisco, California.
[Kasheff04]
Cache-Oblivious Dynamic Search Trees,
by Zardosht Kasheff.
MIT M.Eng. thesis, June 2004. (Supervised by Bradley C. Kuszmaul.)
[HenryKuLo02]
A Comparison of Scalable Superscalar Processors,
by Dana S. Henry, Bradley C. Kuszmaul, and Gabriel H. Loh.
In Theory of Computing Systems (TOCS), Volume 35, Number 2, pp. 123–150, April, 2002.
An early version appeared as KuszmaulHeLo99.
[HenryKu01]
Branch Prediction in a Speculative Dataflow Processor,
by Dana S. Henry and Bradley C. Kuszmaul.
In The Fifth Workshop on Multithreaded Execution, Architecture, and Compilation (MTEAC-5), held in conjunction with MICRO-34, Austin, Texas, December 1, 2001.
[HenryKuLo00]
Circuits for Wide-Window Superscalar Processors,
by Dana S. Henry, Bradley C. Kuszmaul, Gabriel H. Loh, and Rahul Sami.
In The 27th Annual International Symposium on Computer Architecture (ISCA-2000), pp. 236–247, Vancouver, British Columbia, June 12–14, 2000.
(Also published as Ultrascalar Memo 5.)
[KuszmaulHeLo99]
A Comparison of Scalable Superscalar Processors,
by Dana S. Henry, Bradley C. Kuszmaul, and Gabriel H. Loh.
In The Eleventh ACM Symposium on Parallel Algorithms and Architectures (SPAA '99), pp. 126–137, Saint-Malo, France, June 27–30, 1999.
See also HenryKuLo02.
[HenryKuVi99]
The Ultrascalar Processor---An Asymptotically Scalable Superscalar Microarchitecture,
by Dana S. Henry, Bradley C. Kuszmaul, and Vinod Viswanath.
In The Twentieth Anniversary Conference on Advanced Research in VLSI (ARVLSI'99), Atlanta, GA, March 21–24, 1999, pp. 256–273.
[HenryKu98a]
Cyclic Segmented Parallel Prefix,
by Dana S. Henry and Bradley C. Kuszmaul.
Ultrascalar Memo 1, November 1998.
[HenryKu98b]
An Efficient, Prioritized Scheduler Using Cyclic Prefix,
by Dana S. Henry and Bradley C. Kuszmaul.
Ultrascalar Memo 2, November, 1998.
[BernsteinKu98]
Communications-Efficient Multithreading on Wide-Area Networks,
by Michael S. Bernstein and Bradley C. Kuszmaul.
In The Tenth Annual Symposium on Parallel Algorithms and Architectures (SPAA'98) Revue, June 1998. (Short abstract.)
[BlumofeJoKu96]
Cilk: An Efficient Multithreaded Runtime System,
by Robert D. Blumofe, Christopher F. Joerg, Bradley C. Kuszmaul, Charles E. Leiserson, Keith H. Randall, and Yuli Zhou.
In The Journal of Parallel and Distributed Computing, Volume 37, Number 1, August 25, 1996, pp. 55–69.
An early version appeared in The Fifth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP '95), Santa Barbara, CA, pp. 207–216, July 1995.
[LeisersonAbDo96]
The Network Architecture of the Connection Machine CM-5,
by Charles E. Leiserson, Bradley C. Kuszmaul, Zahi S. Abuhamdeh, David C. Douglas, Carl R. Feynman, Mahesh N. Ganmukhi, Jeffrey V. Hill, W. Daniel Hillis, Margaret A. St. Pierre, David S. Wells, Monica C. Wong, Shaw-Wen Yang, and Robert Zak.
In The Journal of Parallel and Distributed Computing, Volume 33, Number 2, March 15, 1996, pp. 145–158.
(An early version appeared in the 1992 ACM Symposium on Parallel Algorithms and Architectures, pp. 272–285, June 1992, San Diego, California.)
This paper won the inaugural 2023 SPAA Test-of-Time Award.
[Kuszmaul95a]
The RACE Network Architecture,
by Bradley C. Kuszmaul.
In The 7th International Parallel Processing Symposium (IPPS '95), April, 1995, Santa Barbara, California.
[Kuszmaul95b]
The StarTech Massively Parallel Chess Program,
by Bradley C. Kuszmaul.
In the Journal of the International Computer Chess Association, Volume 18, Number 1, pp. 3–20, March, 1995.
[BlumofeFrHa95]
Cilk 1.2 (Version Beta 1) Reference Manual,
by Robert D. Blumofe, Michael Halbherr, Christopher F. Joerg, Bradley C. Kuszmaul, Charles E. Leiserson, Phil Lisiecki, Keith H. Randall, Andy Shaw, and Yuli Zhou.
February, 1995.
[JoergKu94]
Massively Parallel Chess,
by Bradley C. Kuszmaul and Chris Joerg.
Presented at the Third DIMACS Parallel Implementation Challenge, held at Rutgers University, October 16–17, 1994.
[Kuszmaul94]
Synchronized MIMD Computing,
by Bradley C. Kuszmaul.
Ph.D. thesis, May 1994.
Technical Report MIT/LCS/TR-645.
Supervised by Charles E. Leiserson.
[BrewerKu94]
How to Get Good Performance from the CM5 Data Network,
by Eric A. Brewer and Bradley C. Kuszmaul.
In The 1994 International Parallel Processing Symposium, Cancun, Mexico, April 1994, pp. 858–867.
A version was also given at the First International Connection Machine User Group Conference in Santa Fe, New Mexico on February 18, 1994. This work was also reported in my thesis.
[Kuszmaul90a]
A Glitch in the Theory of Delay-Insensitive Circuits,
by Bradley C. Kuszmaul.
In The 1990 ACM International Workshop on Timing Issues in the Specification and Synthesis of Digital Systems (Tau '90), The University of British Columbia, Vancouver, British Columbia, August 14–17, 1990.
[KuszmaulFr90]
NAP (No ALU Processor) The Great Communicator,
by Bradley C. Kuszmaul and Jeff Fried.
In The Journal of Parallel and Distributed Computing, Volume 8, Number 2, February 1990, pp. 169–179.
An early version appeared as FriedKu88.
[Kuszmaul90b]
Fast Deterministic Routing, on Hypercubes, Using Small Buffers,
by Bradley C. Kuszmaul.
In IEEE Transactions on Computers, Volume 39, Number 11, pp. 1390–1393, November 1990.
[Kuszmaul90-maspar]
Analysis of the MasPar MP-1 Architecture,
by Bradley C. Kuszmaul.
Unpublished notes from a conference talk. March 1990.
[Kuszmaul89iwarp]
Summary of "iWARP Forum",
by Bradley C. Kuszmaul.
Unpublished trip report from iWARP forum in Washington, DC, 12 September 1989.
[KahleKuHi89]
The Connection Machine Message Router,
by Brewster Kahle, Bradley C. Kuszmaul, and W. Daniel Hillis.
Unpublished manuscript describing the CM-2 router, April 1989.
[FriedKu88]
NAP (No ALU Processor) The Great Communicator,
by Jeff Fried and Bradley C. Kuszmaul.
In The Second Symposium on the Frontiers of Massively Parallel Computation (Frontiers '88), George Mason University, Fairfax, Virginia, October 10–12, 1988, pp. 383–389.
See KuszmaulFr90.
[Kuszmaul86b]
A SECD Machine for PCF,
by Bradley C. Kuszmaul.
Term paper for Logic and Semantics of Programs (MIT Course 6.830J/18.427J, Fall 1986).
December 17, 1986.
[Kuszmaul86]
Simulating Applicative Architectures on the Connection Machine,
by Bradley C. Kuszmaul.
S.M. thesis, June 1986.
Supervised by Jack B. Dennis.
(Some figures are missing in this version. Illustrate is dead. Long live postscript.)
[Kuszmaul84]
Type Checking in VimVal,
by Bradley C. Kuszmaul.
S.B. thesis, June 1984.
Supervised by Jack B. Dennis.
Runner-up for the Martin award for best thesis.

Note: The URIs for the papers are constructed from the last name of the first author, the first two letters of the second author's last name, the first two letters of the third author's last name, and the last two digits of the year. E.g., FriedKu88 is a paper written by Jeff Fried and Bradley C. Kuszmaul in 1988. A lowercase letter is added to the end to disambiguate duplicates.
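The naming rule in the note can be sketched in a few lines of Python. This is only an illustration of the rule as stated, not the script used to build this page: the function name `paper_key` and its `suffix` parameter are hypothetical, and a few keys in the list (e.g., Kuszmaul90-maspar) do not follow the rule exactly.

```python
def paper_key(last_names, year, suffix=""):
    """Build a key such as FriedKu88 from author last names and a year.

    Hypothetical helper: `suffix` is the optional lowercase letter
    used to disambiguate duplicates (e.g., "b").
    """
    if not last_names:
        raise ValueError("at least one author is required")
    first = last_names[0]
    # Only the second and third authors contribute, two letters each.
    abbreviations = [name[:2] for name in last_names[1:3]]
    # Zero-pad the two-digit year so that 2006 becomes "06".
    return f"{first}{''.join(abbreviations)}{year % 100:02d}{suffix}"


print(paper_key(["Fried", "Kuszmaul"], 1988))
```

For example, the authors Jannen, Yuan, and Zhan with year 2015 and suffix "b" give JannenYuZh15b, which matches the key used above for the journal version of the BetrFS paper.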


bradley@mit.edu
