Computer Architecture (1996-2021)
Chair of Computer Architecture Group
History and Research
The Computer Architecture Group (CAG) dedicated its research and teaching activities to the understanding and design of complex hardware/software systems, with a special interest in parallel computer architecture and high performance computing. As system architects, we covered not only the operating principles but also the technology and software needed to build working prototype systems. The group’s expertise covered the design space analysis and hardware design of processors, devices and interconnection networks, as well as the development of the corresponding software ecosystem (e.g. kernel drivers and communication libraries).
The CAG’s research interests covered all levels of system design, starting at the application programming interface (e.g. MPI), continuing through the efficient design of device drivers, and finishing at custom-built hardware devices based on standard cells and FPGAs. The goals of the applied research activities were to cover a broad range of methodologies for the design of complete high performance systems, with the possibility of optimizing every level, and to educate students on various real-world topics. The CAG concluded its research activity in March 2021.
Selection of Past Research Projects
EXTOLL - Extreme Low Latency Interconnect
EXTOLL introduces a new interconnection network architecture for High-Performance Computing, which brings a rich set of features to the HPC application space. The EXTOLL solution integrates host-interface, network interface controller and network functions within a single chip. This single chip solution enables power efficiency and lower total cost of ownership.
GreenICE - Immersion cooled electronics
The GreenICE System is based on the principle of 2-phase immersion cooling. The electronic assemblies, such as server modules, are arranged standing up in a chassis filled with 3M Novec™ 649 Engineered Fluid.
openHMC - An open-source Hybrid Memory Cube controller
openHMC is a vendor-agnostic, AXI-4 compliant Hybrid Memory Cube (HMC) controller that can be parameterized to different data-widths, external lane-width requirements, and clock speeds depending on speed and area requirements. The main objective of developing the HMC controller is to lower the barrier for others to experiment with the HMC, without the risks of using commercial solutions. Project data and documentation can be found here.
DEEP - Extreme Scale Technologies (DEEP-EST)
The Modular Supercomputer Architecture (MSA) developed in the DEEP-EST research project is a blueprint for heterogeneous HPC systems that supports the divergent computation and data processing requirements of high performance computing and data analytics with the highest efficiency and scalability. The MSA is based on the Cluster-Booster architecture developed in the previous DEEP and DEEP-ER projects. The MSA interconnection network is based on EXTOLL, encapsulated in a dense form factor box called FabriCube. In cooperation with EXTOLL, the Computer Architecture Group was responsible for the design and implementation of two key components, which serve two different use cases:
(1) Network Attached Memory (NAM) provides network-speed access to up to 64 TB of non-volatile storage. It can serve as an intermediate storage target, for example to hold machine learning training data. Its functionality is complemented by application-specific processing elements that execute small compute kernels.
(2) The Global Collective Engine (GCE) is equipped with two DDR4 memory DIMMs. These function as a buffer for data elements required to process global collective operations “in the network”. Collective operations are among the most commonly used communication patterns in HPC applications. Hence, accelerating these operations will likely improve application runtimes and system efficiency. The DEEP-EST project officially started in July 2017 and concluded its work in March 2021. Further information is available here.
The Human Brain Project
The Human Brain Project (HBP) aims to understand how the inconceivably efficient system of the human brain works. For this purpose, it uses the method of synthesis biology. This means that it tries to understand the biological system bottom-up instead of using the conventional analytic top-down methodology. The BrainScaleS system at the Kirchhoff Institute for Physics (KIP) in Heidelberg is part of the HBP and pursues this goal by developing a neuromorphic analog hardware system in combination with a conventional computing cluster. Up to now, the communication FPGAs have been connected through an Ethernet network using USB 3.0 cables. In collaboration with the Computer Architecture Group, the KIP is developing a new network interface for the FPGAs that control the data communication between the neuromorphic hardware chips and the conventional digital system. This new network interface will utilize the benefits of the EXTOLL network technology, a high-performance interconnection network optimized for low latency and high message rates. Further information is available here.
Cadence Academic Network
The Cadence Academic Network was launched in 2007. The aim is to promote the proliferation of leading-edge technologies and methodologies at universities renowned for their engineering and design excellence. A knowledge network among selected universities, research institutes, industry advisors, and Cadence facilitates the sharing of technology expertise in the areas of verification, design, and implementation of microelectronic systems. Heidelberg University, specifically the Computer Architecture Group, is a member of this network as the Lead Institution for Advanced Verification Methodology.
- Lars Rzymianowicz, Ulrich Brüning, Jörg Kluge, Patrick Schulz and Mathias Waack.
ATOLL: A Network on a Chip
Cluster Computing Technical Session (CC-TEA) of the PDPTA'99 conference, June 28–July 1 1999, in Las Vegas, NV.
- Ulrich Brüning, Jörg Kluge, Lars Rzymianowicz, and Mathias Waack.
FSMDesigner: Combining a Powerful Graphical FSM Editor and Efficient HDL Code Generation with Synthesis in Mind
8th International HDL Conference and Exhibition HDLCON'99, April 6–9, Santa Clara, CA.
- Jörg Kluge, Ulrich Brüning, Markus Fischer, Lars Rzymianowicz, Patrick Schulz and Mathias Waack.
The ATOLL approach for a fast and reliable System Area Network
Third Intl. Workshop on Advanced Parallel Processing Technologies (APPT'99) conference, October 19–21 1999, in Changsha, P.R. China.
- Markus Fischer, Ulrich Brüning, Jörg Kluge, Lars Rzymianowicz, Patrick Schulz and Mathias Waack.
ATOLL, a new switched, high speed Interconnect in Comparison to Myrinet and SCI
IPDPS 2000, PC NOW Workshop, May 1–5 2000, Cancun, Mexico.
- Jörg Kluge, Ulrich Brüning, Markus Fischer, Lars Rzymianowicz, Patrick Schulz and Mathias Waack.
ATOLL - A Next Generation System Area Network
HPC Asia 2000, The Fourth International Conference/Exhibition on High Performance Computing in Asia-Pacific Region, May 14–17 2000, Beijing, P.R. China.
- Markus Fischer, Ulrich Brüning, Jörg Kluge, Lars Rzymianowicz, Patrick Schulz and Mathias Waack.
Impact of Configurable Network Features in ATOLL
APSCC 2000, May 14–17, 2000, Beijing, P.R. China.
- Lars Rzymianowicz, Mathias Waack, Ulrich Brüning, Markus Fischer, Jörg Kluge and Patrick Schulz.
Clustering SMP Nodes with the ATOLL Network: A Look into the Future of System Area Networks
HPCN 2000, April 1–5 2000, Amsterdam, NL.
- Ulrich Brüning, Holger Fröning, Patrick R. Schulz, Lars Rzymianowicz
ATOLL: Performance and Cost Optimization of a SAN Interconnect
IASTED Conference: Parallel and Distributed Computing and Systems (PDCS), Nov. 4–6, 2002, Cambridge, USA.
- David Slogsnat, Markus Fischer, Andres Bruhn, Joachim Weickert, Ulrich Brüning
Low Level Parallelization of Nonlinear Diffusion Filtering Algorithms for Cluster Computing Environments
International Conference on Parallel and Distributed Computing (Euro-PAR), August 26–29, 2003, Klagenfurt, Austria.
- David Slogsnat, Patrick R. Schulz, Ulrich Brüning
Lessons Learned from Using Superlog, SystemVerilog's Predecessor
Forum on Specification & Design Languages (FDL), September 23–26 2003, Frankfurt, Germany.
- Patrick R. Schulz, Ulrich Brüning, Gunter Strube
SEED2002: Support of Educational course for Electronic Design
IEEE International Conference on Microelectronic Systems Education (MSE), June 1–2, 2003, Anaheim CA, USA.
- David Slogsnat, Patrick R. Haspel, Holger Fröning and Ulrich Bruening
The ATOLL System Area Network (SAN)
IEEE Task Force Cluster Computing Newsletter, September 2003.
- Ulrich Bruening, Ulrich Krackhardt
Systeme fuer eine hocheffiziente elektrische und optische Kurzstreckenuebertragung im SAN-Bereich
it - Information Technology 02/2003, Oldenbourg Verlag, pp. 65–71, April 2003.
- Ulrich Bruening, Wolfgang Giloi
Future Building Blocks for Parallel Architectures
Proceedings of the 2004 International Conference on Parallel Processing (ICPP.04), Montreal, CA, 2004.
- Mondrian Nüssle, Holger Fröning, Ulrich Brüning
SWORDFISH: A Simulator for High-Performance Networks
IASTED Conference: Parallel and Distributed Computing and Systems (PDCS), Nov. 14–16, 2005, Phoenix, AZ, USA.
- Holger Fröning, Mondrian Nüssle, David Slogsnat, Patrick R. Haspel, Ulrich Brüning
Performance Evaluation of the ATOLL Interconnect
IASTED Conference: Parallel and Distributed Computing and Networks (PDCN), Feb. 15–17, 2005, Innsbruck, Austria.
- Holger Fröning, Mondrian Nüssle, David Slogsnat, Heiner Litz, Ulrich Brüning
The HTX-Board: A Rapid Prototyping Station
3rd annual FPGAworld Conference, Nov. 16, 2006, Stockholm, Sweden.
- David Slogsnat, Alexander Giese and Ulrich Bruening
A versatile, low latency HyperTransport core
Fifteenth ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, Monterey, California, February 2007.
- David Slogsnat, Alexander Giese and Ulrich Bruening
Leveraging HyperTransport on Xilinx FPGAs
Xilinx Xcell Journal, Issue 61, July 2007.
- Mondrian Nüssle, Holger Fröning, Alexander Giese, Heiner Litz, David Slogsnat, Ulrich Brüning
A Hypertransport based low-latency reconfigurable testbed for message-passing developments
2. Workshop Kommunikation in Clusterrechnern und Clusterverbundsystemen (KiCC'07), TU Chemnitz, February 2007.
- Holger Fröning, Heiner Litz, Ulrich Brüning
A new Ultra-low Latency Message Transfer Mechanism
IASTED Conference: Communication Systems and Networks (CSN 2007), Aug. 29–31, 2007, Palma de Mallorca, Spain.
- David Slogsnat, Alexander Giese, Mondrian Nüssle, Ulrich Brüning
An Open-Source HyperTransport Core
ACM Transactions on Reconfigurable Technology and Systems (TRETS), Vol. 1, Issue 3, p. 1–21, Sept. 2008.
- Ulrich Brüning, Holger Fröning
High Performance Computing und die Technologie der Verbindungen
- Heiner Litz, Holger Fröning, Mondrian Nüssle, Ulrich Brüning
VELO: A Novel Communication Engine for Ultra-low Latency Message Transfers
This paper received the best paper award of this conference.
- Heiner Litz, Holger Fröning, Ulrich Brüning
A HyperTransport 3 Physical Layer Interface for FPGAs
5th International Workshop on Applied Reconfigurable Computing (ARC 2009), March 16–18, 2009, Karlsruhe, Germany.
- Jose Duato, Federico Silla, Brian Holden, Paul Miranda, Jeff Underhill, Mario Cavalli, Sudha Yalamanchili, Ulrich Brüning and Holger Fröning
Scalable Computing - Why and How
HyperTransport Consortium White Papers, 2009.
- Benjamin Kalisch, Alexander Giese, Heiner Litz, Ulrich Brüning
HyperTransport 3 Core: A Next Generation Host Interface with Extremely High Bandwidth
First International Workshop on HyperTransport Research and Applications (WHTRA), February 12th, Mannheim, Germany.
- Heiner Litz, Holger Fröning, Maximilian Thürmer, Ulrich Brüning
An FPGA based Verification Platform for HyperTransport 3.x
19th International Conference on Field Programmable Logic and Applications (FPL 2009), August 31–September 2, 2009, Prague, Czech Republic.
- Mondrian Nüssle, Benjamin Geib, Holger Fröning, Ulrich Brüning
An FPGA-based custom high performance interconnection network
2009 International Conference on ReConFigurable Computing and FPGAs, December 9–11, Cancun, Mexico.
- Holger Fröning, Heiner Litz, Ulrich Brüning
Efficient Virtualization of Network Interfaces
The Eighth International Conference on Networks (ICN 2009), March 1–6, 2009, Guadeloupe/France.
- Frank Lemke, David Slogsnat, Niels Burkhardt, Ulrich Bruening
A Unified Interconnection Network with Precise Time Synchronization for the CBM DAQ-System
16th IEEE NPSS Real Time Conference 2009 (RT 09), May 10–15, Beijing, China.
- Mondrian Nüssle, Martin Scherer, Ulrich Brüning
A resource optimized remote-memory-access architecture for low-latency communication
The 38th International Conference on Parallel Processing (ICPP-2009), September 22–25, Vienna, Austria.
- Holger Fröning, Mondrian Nüssle, Heiner Litz and Ulrich Brüning
A Case for FPGA based Accelerated Communication
The 9th International Conference on Networks (ICN 2010), April 12–16, 2010, Menuires, France. This paper received the best paper award of this conference.
- Holger Fröning and Heiner Litz
Efficient Hardware Support for the Partitioned Global Address Space
10th Workshop on Communication Architecture for Clusters (CAC2010), co-located with 24th International Parallel and Distributed Processing Symposium (IPDPS 2010), April 19, 2010, Atlanta, Georgia.
- Hector Montaner, Federico Silla, Holger Fröning, Jose Duato
Getting Rid of Coherency Overhead for Memory-Hungry Applications
IEEE International Conference on Cluster Computing 2010, September 20–24, 2010, Heraklion, Crete, Greece.
- Frank Lemke, David Slogsnat, Niels Burkhardt, Ulrich Bruening
A Unified DAQ Interconnection Network with Precise Time Synchronization
IEEE Transactions on Nuclear Science (TNS), Journal Paper, Vol. 57, No. 2, April 2010.
- Heiner Litz, Holger Fröning, Ulrich Bruening
HTAX: A Novel Framework for Flexible and High Performance Networks-on-Chip
Fourth Workshop on Interconnection Network Architectures: On-Chip, Multi-Chip (INA-OCMC) in conjunction with HiPEAC, January 25–27, 2010, Pisa, Italy.
- Heiner Litz, Maximilian Thürmer, Ulrich Brüning
A Cluster Architecture Utilizing the Processor Host Interface as a Network Interconnect
CLUSTER 2010, September 20–24, 2010, Heraklion, Greece.
- Holger Fröning
Network Interfaces
David Padua (Ed.): Encyclopedia of Parallel Computing, Springer, New York, ISBN 978-0-387-09765-7, to appear, 2011.
- Hector Montaner, Federico Silla, Holger Fröning, Jose Duato
MEMSCALE: a Scalable Environment for Databases
13th IEEE International Conference on High Performance Computing and Communications (HPCC-2011), Sept. 2–4, 2011, Banff, Canada.
- Hector Montaner, Federico Silla, Holger Fröning, Jose Duato
Unleash your Memory-Constrained Applications: a 32-node Non-coherent Distributed-memory Prototype Cluster
13th IEEE International Conference on High Performance Computing and Communications (HPCC-2011), Sept. 2–4, 2011, Banff, Canada.
- Hector Montaner, Federico Silla, Holger Fröning, Jose Duato
A New Degree of Freedom for Memory Allocation in Clusters
Cluster Computing: Special Issue (HPDC 2010), Springer, accepted for publication in 2011.
- Holger Fröning, Hector Montaner, Federico Silla, Jose Duato
On Memory Relaxations for Extreme Manycore System Scalability
2nd Workshop on New Directions in Computer Architecture (NDCA-2), held in conjunction with the 38th International Symposium on Computer Architecture (ISCA-38), San Jose, California, June 5th, 2011.
- Holger Fröning, Alexander Giese, Hector Montaner, Federico Silla, Jose Duato
Highly Scalable Barriers for Future High-Performance Computing Clusters
18th annual IEEE International Conference on High Performance Computing (HiPC 2011), Dec. 18–21, 2011, Bangalore, India.
- Hector Montaner, Federico Silla, Holger Fröning, Jose Duato
MEMSCALE: In-Cluster-Memory Databases
20th ACM Conference on Information and Knowledge Management (CIKM2011), demo session, Oct. 24–28, 2011, Glasgow, UK.
- Denis Wohlfeld, Frank Lemke, Holger Fröning, Sven Schenk, Ulrich Brüning
High Density Active Optical Cable: from a Concept to a Prototype
SPIE Photonics West, Optoelectronic Interconnects and Component Integration XI, January 22–27, 2011, San Francisco, California.
- Frank Lemke, Sven Kapferer, Alexander Giese, Holger Fröning, Ulrich Brüning
A HT3 Platform for Rapid Prototyping and High Performance Reconfigurable Computing
Second International Workshop on HyperTransport Research and Applications, Feb. 9th, 2011, Mannheim, Germany.
- Bastian Mohr, Niklas Zimmermann, Yifan Wang, Björn Thorsten Thiel, Renato Negra, Stefan Heinen; Frank Lemke, Sven Schenk, Richard Leys, Ulrich Brüning
Implementation of an RF-DAC based Multistandard Transmitter System
CDNLIVE! 2011, Academic Track, May 3–5, 2011, Munich, Germany.
- Heiner Litz, Christian Leber, Benjamin Geib
DSL Programmable Engine for High Frequency Trading Acceleration
4th Workshop on High Performance Computational Finance at SC11 (WHPCF 2011).
- Christian Leber, Benjamin Geib, Heiner Litz
High Frequency Trading Acceleration using FPGAs
21st International Conference on Field Programmable Logic and Applications (FPL 2011), September 5–7, 2011, Chania, Greece.
- Frank Lemke, Ulrich Brüning
Design Concepts for a Hierarchical Synchronized Data Acquisition Network for CBM
IEEE 18th Real-Time Conference 2012 (RT12), June 11–15, 2012, Berkeley, CA, USA.
- B. Mohr, N. Zimmermann, B.T. Thiel, J.H. Mueller, Y. Wang, Y. Zang, F. Lemke, R. Leys, S. Schenk, U. Brüning, R. Negra, S. Heinen
An RFDAC Based Reconfigurable Multistandard Transmitter in 65nm CMOS
IEEE 2012 RFIC Symposium, June 17–19, 2012, Montreal, Canada.
- Javier Prades, Federico Silla, Jose Duato, Holger Fröning, Mondrian Nüssle
A New End-to-End Flow-Control Mechanism for High Performance Computing Clusters
IEEE International Conference on Cluster Computing, September 24–28, 2012, Beijing, China.
- Mondrian Nüssle, Holger Fröning, Sven Kapferer, Ulrich Brüning
Accelerate Communication, not Computation!
High Performance Computing Using FPGAs, p. 507–542, Vanderbauwhede, Wim; Benkrid, Khaled (Eds.), Springer, 2013.
- Holger Fröning, Mondrian Nüssle, Heiner Litz, Christian Leber and Ulrich Brüning
On Achieving High Message Rates
13th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), May 13–16, 2013, Delft, The Netherlands.
- Sven Schatral, Frank Lemke, Ulrich Brüning
Design of a deterministic link initialization mechanism for serial LVDS interconnects
TWEPP 2013 - Topical Workshop on Electronics for Particle Physics, Sept. 23–27, 2013, Perugia, Italy.
- Frank Lemke, Ulrich Brüning
A Hierarchical Synchronized Data Acquisition Network for CBM
IEEE Transactions on Nuclear Science (TNS), Journal Paper, Vol. 60, No. 5, Part II, Oct. 2013.
- Sven Schatral, Frank Lemke and Ulrich Brüning
Design of a deterministic link initialization mechanism for serial LVDS interconnects
Journal of Instrumentation, doi:10.1088/1748-0221/9/03/C03022, Vol. 9, No. 03, pages C03022, March 2014.
- Sven Kapferer, Markus Müller, Ulrich Brüning
Implementation of a Complex Network ASIC in an Academic Environment
CDNLive EMEA 2014, Academic Track, May 19–21, 2014, Munich, Germany.
- Sarah Neuwirth, Dirk Frey, Mondrian Nüssle, Ulrich Brüning
Scalable Communication Architecture for Network-Attached Accelerators
21st IEEE International Symposium on High Performance Computer Architecture (HPCA 2015), Feb. 7–11, 2015, Bay Area, California, USA. (acceptance rate: 21%, 51/231, ranking: core A*)
- Ulrich Bruening, Mondrian Nuessle, Dirk Frey
An Immersive Cooled Implementation of a DEEP Booster
Intel European Exascale Labs Annual Report 2014, July 2015.
- Juri Schmidt, Ulrich Brüning
openHMC - A Configurable Open-Source Hybrid Memory Cube Controller
10th IEEE International Conference on ReConFigurable Computing and FPGAs, Dec. 7–9, 2015, Mayan Riviera, Mexico.
- Sarah Neuwirth, Dirk Frey, Ulrich Bruening
Communication Models for Distributed Intel Xeon Phi Coprocessors
21st IEEE International Conference on Parallel and Distributed Systems (ICPADS 2015), Dec. 14–17, 2015, Melbourne, Australia.
- Juri Schmidt, Holger Fröning, Ulrich Brüning
Exploring Time and Energy for Complex Accesses to a Hybrid Memory Cube
The International Symposium on Memory Systems (MEMSYS 2016), Oct. 3–6, 2016, Washington, D.C.
- Sarah Neuwirth, Feiyi Wang, Sarp Oral, Sudharshan Vazhkudai, James Rogers, Ulrich Brüning
Using Balanced Data Placement to Address I/O Contention in Production Environments
28th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD 2016), Oct. 26–28, 2016, Los Angeles, California, USA. Best Paper Award.
- Sarah Neuwirth, Feiyi Wang, Sarp Oral, Ulrich Bruening
Automatic and Transparent Resource Contention Mitigation for Improving Large-scale Parallel File System Performance
23rd IEEE International Conference on Parallel and Distributed Systems (ICPADS 2017), Dec. 15–17, 2017, Shenzhen, China.
- Stefan Kosnac and Ulrich Bruening
Design Flow Automation for On-Chip Inductors
Cadence User Conference 2018 (CDNLive EMEA 2018), Academic Track, May 7–9, 2018, Munich, Germany.
- Felix Kaiser, Stefan Kosnac and Ulrich Bruening
Implementation of a RISC-V-Conform Fused Multiply-Add Floating Point Unit
Supercomputing Frontiers Europe 2019, March 11–14, 2019, Warsaw, Poland.
- Tobias Markus, Markus Mueller, Ulrich Bruening
Schematic Generation Framework in a Mixed Signal Top Down Design Flow
22nd IEEE International Symposium on Design and Diagnostics of Electronic Circuits and Systems (DDECS 2019), April 24–26, 2019, Cluj-Napoca, Romania.
- Bharti Wadhwa, Arnab Paul, Sarah Neuwirth, Feiyi Wang, Sarp Oral, Ali Butt, Jon Bernard, Kirk Cameron
iez: Resource Contention Aware Load Balancing for Large-Scale Parallel File Systems
33rd IEEE International Parallel and Distributed Processing Symposium (IPDPS 2019), May 20–24, 2019, Rio de Janeiro, Brazil.
Further Information
The Chairholder Prof. Dr. Ulrich Brüning can be reached by e-mail via