Computer Architecture (1996-2021)

Chair of Computer Architecture

History and Research

The Computer Architecture Group (CAG) dedicated its research and teaching activities to the understanding and design of complex hardware/software systems, with a special interest in parallel computer architecture and high-performance computing. As system architects, we covered not only the operating principles but also the technology and software needed to build working prototype systems. The group’s expertise covered the design space analysis and hardware design of processors, devices and interconnection networks, as well as the development of the corresponding software ecosystem (e.g. kernel drivers and communication libraries).

The CAG’s research interests covered all levels of system design, starting at the application programming interface (e.g. MPI), continuing through the efficient design of device drivers, and finishing at custom-built hardware devices based on standard cells and FPGAs. The goals of the applied research activities were to cover a broad range of methodologies for the design of complete high-performance systems, with the possibility to optimize every level, and to educate students on various real-world topics. The CAG concluded its research activity in March 2021.

Selection of Past Research Projects

EXTOLL - Extreme Low Latency Interconnect

EXTOLL introduces a new interconnection network architecture for High-Performance Computing, which brings a rich set of features to the HPC application space. The EXTOLL solution integrates host-interface, network interface controller and network functions within a single chip. This single chip solution enables power efficiency and lower total cost of ownership.

GreenICE - Immersion cooled electronics

The GreenICE system is based on the principle of two-phase immersion cooling. The electronic assemblies, such as server modules, are arranged upright in a chassis filled with 3M Novec™ 649 Engineered Fluid.

openHMC - An open-source Hybrid Memory Cube controller

openHMC is a vendor-agnostic, AXI-4 compliant Hybrid Memory Cube (HMC) controller that can be parameterized for different data widths, external lane-width requirements, and clock speeds, depending on speed and area requirements. The main objective of developing the HMC controller is to lower the barrier for others to experiment with the HMC without the risks of using commercial solutions. Project data and documentation can be found here.

DEEP - Extreme Scale Technologies (DEEP-EST)

The Modular Supercomputer Architecture (MSA) developed in the DEEP-EST research project is a blueprint for heterogeneous HPC systems that support the divergent computation and data processing requirements of high-performance computing and data analytics with the highest efficiency and scalability. The MSA is based on the Cluster-Booster architecture developed in the previous DEEP and DEEP-ER projects. The MSA interconnection network is based on EXTOLL, encapsulated in a dense form factor box called FabriCube. In cooperation with EXTOLL, the Computer Architecture Group is responsible for the design and implementation of two key components, which serve two different use cases:

(1) Network Attached Memory (NAM) provides network-speed access to up to 64 TB of non-volatile storage. It can serve as an intermediate storage target, for example to hold machine learning training data. Its functionality is complemented by application-specific processing elements that execute small compute kernels.

(2) The Global Collective Engine (GCE) is equipped with two DDR4 memory DIMMs. These function as a buffer for the data elements required to process global collective operations “in the network”. Collective operations are among the most commonly used communication patterns in HPC applications. Hence, accelerating these operations will likely improve application runtimes and system efficiency.

The DEEP-EST project officially started in July 2017 and concluded its work in March 2021. Further information is available here.

The Human Brain Project

The Human Brain Project (HBP) aims to understand how the inconceivably efficient system of the human brain works. For this purpose, it uses the method of synthesis biology. This means that it tries to understand the biological system from the bottom up instead of using the conventional analytic top-down methodology. The BrainScaleS system at the Kirchhoff Institute for Physics (KIP) in Heidelberg is part of the HBP and pursues this goal by developing a neuromorphic analog hardware system in combination with a conventional computing cluster. Up to now, the communication FPGAs have been connected through an Ethernet network using USB 3.0 cables. In collaboration with the Computer Architecture Group, the KIP is developing a new network interface for the FPGAs that control the data communication between the neuromorphic hardware chips and the conventional digital system. This new network interface will utilize the benefits of the EXTOLL network technology, a high-performance interconnection network that is optimized for low latency and high message rates. Further information is available here.

Cadence Academic Network

The Cadence Academic Network was launched in 2007. Its aim is to promote the proliferation of leading-edge technologies and methodologies at universities renowned for their engineering and design excellence. A knowledge network among selected universities, research institutes, industry advisors, and Cadence facilitates the sharing of technology expertise in the areas of verification, design, and implementation of microelectronic systems. Heidelberg University, specifically the Computer Architecture Group, is a member of this network as the Lead Institution for Advanced Verification Methodology.

Publications

1999-2009

  1. Lars Rzymianowicz, Ulrich Brüning, Jörg Kluge, Patrick Schulz and Mathias Waack.
    ATOLL: A Network on a Chip
    Cluster Computing Technical Session (CC-TEA) of the PDPTA'99 conference, June 28–July 1 1999, in Las Vegas, NV.
  3. Ulrich Brüning, Jörg Kluge, Lars Rzymianowicz, and Mathias Waack.
    FSMDesigner: Combining a Powerful Graphical FSM Editor and Efficient HDL Code Generation with Synthesis in Mind
    8th International HDL Conference and Exhibition HDLCON'99, April 6–9, Santa Clara, CA.
  4. Jörg Kluge, Ulrich Brüning, Markus Fischer, Lars Rzymianowicz, Patrick Schulz and Mathias Waack.
    The ATOLL approach for a fast and reliable System Area Network
    Third Intl. Workshop on Advanced Parallel Processing Technologies (APPT'99) conference, October 19–21 1999, in Changsha, P.R. China.
  5. Markus Fischer, Ulrich Brüning, Jörg Kluge, Lars Rzymianowicz, Patrick Schulz and Mathias Waack.
    ATOLL, a new switched, high speed Interconnect in Comparison to Myrinet and SCI
    IPDPS 2000, PC NOW Workshop, May 1–5 2000, Cancun, Mexico.
  6. Jörg Kluge, Ulrich Brüning, Markus Fischer, Lars Rzymianowicz, Patrick Schulz and Mathias Waack.
    ATOLL - A Next Generation System Area Network
    HPC Asia 2000, The Fourth International Conference/Exhibition on High Performance Computing in Asia-Pacific Region, May 14–17 2000, Beijing, P.R. China.
  7. Markus Fischer, Ulrich Brüning, Jörg Kluge, Lars Rzymianowicz, Patrick Schulz and Mathias Waack.
    Impact of Configurable Network Features in ATOLL
    APSCC 2000, May 14–17, 2000, Beijing, P.R. China.
  8. Lars Rzymianowicz, Mathias Waack, Ulrich Brüning, Markus Fischer, Jörg Kluge and Patrick Schulz.
    Clustering SMP Nodes with the ATOLL Network: A Look into the Future of System Area Networks
    HPCN 2000, April 1–5 2000, Amsterdam, NL.
  9. Ulrich Brüning, Holger Fröning, Patrick R. Schulz, Lars Rzymianowicz
    ATOLL: Performance and Cost Optimization of a SAN Interconnect
    IASTED Conference: Parallel and Distributed Computing and Systems (PDCS), Nov. 4–6, 2002, Cambridge, USA.
  10. David Slogsnat, Markus Fischer, Andres Bruhn, Joachim Weickert, Ulrich Brüning
    Low Level Parallelization of Nonlinear Diffusion Filtering Algorithms for Cluster Computing Environments
    International Conference on Parallel and Distributed Computing (Euro-PAR), August 26–29, 2003, Klagenfurt, Austria.
  11. David Slogsnat, Patrick R. Schulz, Ulrich Brüning
    Lessons Learned from Using Superlog, SystemVerilog's Predecessor
    Forum on Specification & Design Languages (FDL), September 23–26 2003, Frankfurt, Germany.
  12. Patrick R. Schulz, Ulrich Brüning, Gunter Strube
    SEED2002: Support of Educational course for Electronic Design
    IEEE International Conference on Microelectronic Systems Education (MSE), June 1–2, 2003, Anaheim, CA, USA.
  13. David Slogsnat, Patrick R. Haspel, Holger Fröning and Ulrich Bruening
    The ATOLL System Area Network (SAN)
    IEEE Task Force Cluster Computing Newsletter, September 2003.
  14. Ulrich Bruening, Ulrich Krackhardt
    Systeme fuer eine hocheffiziente elektrische und optische Kurzstreckenuebertragung im SAN-Bereich
    it - Information Technology 02/2003, Oldenbourg Verlag, pp. 65–71, April 2003.
  15. Ulrich Bruening, Wolfgang Giloi
    Future Building Blocks for Parallel Architectures
    Proceedings of the 2004 International Conference on Parallel Processing (ICPP'04), Montreal, Canada, 2004.
  16. Mondrian Nüssle, Holger Fröning, Ulrich Brüning
    SWORDFISH: A Simulator for High-Performance Networks
    IASTED Conference: Parallel and Distributed Computing and Systems (PDCS), Nov. 14–16, 2005, Phoenix, AZ, USA.
  17. Holger Fröning, Mondrian Nüssle, David Slogsnat, Patrick R. Haspel, Ulrich Brüning
    Performance Evaluation of the ATOLL Interconnect
    IASTED Conference: Parallel and Distributed Computing and Networks (PDCN), Feb. 15–17, 2005, Innsbruck, Austria.
  18. Holger Fröning, Mondrian Nüssle, David Slogsnat, Heiner Litz, Ulrich Brüning
    The HTX-Board: A Rapid Prototyping Station
    3rd annual FPGAworld Conference, Nov. 16, 2006, Stockholm, Sweden.
  19. David Slogsnat, Alexander Giese and Ulrich Bruening
    A versatile, low latency HyperTransport core
    Fifteenth ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, Monterey, California, February 2007.
  20. David Slogsnat, Alexander Giese and Ulrich Bruening
    Leveraging HyperTransport on Xilinx FPGAs
    Xilinx Xcell Journal, Issue 61, July 2007.
  21. Mondrian Nüssle, Holger Fröning, Alexander Giese, Heiner Litz, David Slogsnat, Ulrich Brüning
    A Hypertransport based low-latency reconfigurable testbed for message-passing developments
    2. Workshop Kommunikation in Clusterrechnern und Clusterverbundsystemen (KiCC'07), TU Chemnitz, February 2007.
  22. Holger Fröning, Heiner Litz, Ulrich Brüning
    A new Ultra-low Latency Message Transfer Mechanism
    IASTED Conference: Communication Systems and Networks (CSN 2007), Aug. 29–31, 2007, Palma de Mallorca, Spain.
  23. David Slogsnat, Alexander Giese, Mondrian Nüssle, Ulrich Brüning
    An Open-Source HyperTransport Core
    ACM Transactions on Reconfigurable Technology and Systems (TRETS), Vol. 1, Issue 3, pp. 1–21, Sept. 2008.
  24. Ulrich Brüning, Holger Fröning
    High Performance Computing und die Technologie der Verbindungen
  25. Heiner Litz, Holger Fröning, Mondrian Nüssle, Ulrich Brüning
    VELO: A Novel Communication Engine for Ultra-low Latency Message Transfers
    37th International Conference on Parallel Processing (ICPP-08), Sept. 8–12, 2008, Portland, Oregon, USA.
    This paper received the best paper award at this conference.
  26. Heiner Litz, Holger Fröning, Ulrich Brüning
    A HyperTransport 3 Physical Layer Interface for FPGAs
    5th International Workshop on Applied Reconfigurable Computing (ARC 2009), March 16–18, 2009, Karlsruhe, Germany.
  27. Jose Duato, Federico Silla, Brian Holden, Paul Miranda, Jeff Underhill, Mario Cavalli, Sudha Yalamanchili, Ulrich Brüning and Holger Fröning
    Scalable Computing - Why and How
    HyperTransport Consortium White Papers, 2009.
  28. Benjamin Kalisch, Alexander Giese, Heiner Litz, Ulrich Brüning
    HyperTransport 3 Core: A Next Generation Host Interface with Extremely High Bandwidth
    First International Workshop on HyperTransport Research and Applications (WHTRA), February 12th, Mannheim, Germany.
  29. Heiner Litz, Holger Fröning, Maximilian Thürmer, Ulrich Brüning
    An FPGA based Verification Platform for HyperTransport 3.x
    19th International Conference on Field Programmable Logic and Applications (FPL 2009), August 31–September 2, 2009, Prague, Czech Republic.
  30. Mondrian Nüssle, Benjamin Geib, Holger Fröning, Ulrich Brüning
    An FPGA-based custom high performance interconnection network
    2009 International Conference on ReConFigurable Computing and FPGAs, December 9–11, Cancun, Mexico.
  31. Holger Fröning, Heiner Litz, Ulrich Brüning
    Efficient Virtualization of Network Interfaces
    The Eighth International Conference on Networks (ICN 2009), March 1–6, 2009, Guadeloupe, France.
  32. Frank Lemke, David Slogsnat, Niels Burkhardt, Ulrich Bruening
    A Unified Interconnection Network with Precise Time Synchronization for the CBM DAQ-System
    16th IEEE NPSS Real Time Conference 2009 (RT 09), May 10–15, Beijing, China.
  33. Mondrian Nüssle, Martin Scherer, Ulrich Brüning
    A resource optimized remote-memory-access architecture for low-latency communication
    The 38th International Conference on Parallel Processing (ICPP-2009), September 22–25, Vienna, Austria.

2010-2019

  1. Holger Fröning, Mondrian Nüssle, Heiner Litz and Ulrich Brüning
    A Case for FPGA based Accelerated Communication
    The 9th International Conference on Networks (ICN 2010), April 12-16, 2010, Menuires, France.
    This paper received the best paper award at this conference.
  2. Holger Fröning and Heiner Litz
    Efficient Hardware Support for the Partitioned Global Address Space
    10th Workshop on Communication Architecture for Clusters (CAC2010), co-located with 24th International Parallel and Distributed Processing Symposium (IPDPS 2010), April 19, 2010, Atlanta, Georgia.
  3. Hector Montaner, Federico Silla, Holger Fröning, Jose Duato
    Getting Rid of Coherency Overhead for Memory-Hungry Applications
    IEEE International Conference on Cluster Computing 2010, September 20–24, 2010, Heraklion, Crete, Greece.
  4. Frank Lemke, David Slogsnat, Niels Burkhardt, Ulrich Bruening
    A Unified DAQ Interconnection Network with Precise Time Synchronization
    IEEE Transactions on Nuclear Science (TNS), Journal Paper, VOL. 57, No. 2, APRIL 2010.
  5. Heiner Litz, Holger Fröning, Ulrich Bruening
    HTAX : A Novel Framework for Flexible and High Performance Networks-on-Chip
    Fourth Workshop on Interconnection Network Architectures: On-Chip, Multi-Chip (INA-OCMC), in conjunction with HiPEAC, January 25–27, 2010, Pisa, Italy.
  6. Heiner Litz, Maximilian Thürmer, Ulrich Brüning
    A Cluster Architecture Utilizing the Processor Host Interface as a Network Interconnect
    CLUSTER 2010, September 20–24, 2010, Heraklion, Greece.
  7. Holger Fröning
    Network Interfaces
    David Padua (Ed.): Encyclopedia of Parallel Computing, Springer, New York, ISBN 978-0-387-09765-7, to appear, 2011.
  8. Hector Montaner, Federico Silla, Holger Fröning, Jose Duato
    MEMSCALE: a Scalable Environment for Databases
    13th IEEE International Conference on High Performance Computing and Communications (HPCC-2011), Sept. 2–4, 2011, Banff, Canada.
  9. Hector Montaner, Federico Silla, Holger Fröning, Jose Duato
    Unleash your Memory-Constrained Applications: a 32-node Non-coherent Distributed-memory Prototype Cluster
    13th IEEE International Conference on High Performance Computing and Communications (HPCC-2011), Sept. 2–4, 2011, Banff, Canada.
  10. Hector Montaner, Federico Silla, Holger Fröning, Jose Duato
    A New Degree of Freedom for Memory Allocation in Clusters
    Cluster Computing: Special Issue (HPDC 2010), Springer, accepted for publication in 2011.
  11. Holger Fröning, Hector Montaner, Federico Silla, Jose Duato
    On Memory Relaxations for Extreme Manycore System Scalability
    2nd Workshop on New Directions in Computer Architecture (NDCA-2), held in conjunction with the 38th International Symposium on Computer Architecture (ISCA-38), San Jose, California, June 5th, 2011.
  12. Holger Fröning, Alexander Giese, Hector Montaner, Federico Silla, Jose Duato
    Highly Scalable Barriers for Future High-Performance Computing Clusters
    18th annual IEEE International Conference on High Performance Computing (HiPC 2011), Dec. 18-21, 2011, Bangalore, India.
  13. Hector Montaner, Federico Silla, Holger Fröning, Jose Duato
    MEMSCALE: In-Cluster-Memory Databases
    20th ACM Conference on Information and Knowledge Management (CIKM2011), demo session, Oct. 24–28, 2011, Glasgow, UK.
  14. Denis Wohlfeld, Frank Lemke, Holger Fröning, Sven Schenk, Ulrich Brüning
    High Density Active Optical Cable: from a Concept to a Prototype
    SPIE Photonics West, Optoelectronic Interconnects and Component Integration XI, January 22–27, 2011, San Francisco, California.
  15. Frank Lemke, Sven Kapferer, Alexander Giese, Holger Fröning, Ulrich Brüning
    A HT3 Platform for Rapid Prototyping and High Performance Reconfigurable Computing
    Second International Workshop on HyperTransport Research and Applications, Feb. 9th, 2011, Mannheim, Germany.
  16. Bastian Mohr, Niklas Zimmermann, Yifan Wang, Björn Thorsten Thiel, Renato Negra, Stefan Heinen, Frank Lemke, Sven Schenk, Richard Leys, Ulrich Brüning
    Implementation of an RF-DAC based Multistandard Transmitter System
    CDNLIVE! 2011, Academic Track, May 3–5th, 2011, Munich, Germany.
  17. Heiner Litz, Christian Leber, Benjamin Geib
    DSL Programmable Engine for High Frequency Trading Acceleration
    4th Workshop on High Performance Computational Finance at SC11 (WHPCF 2011).
  18. Christian Leber, Benjamin Geib, Heiner Litz
    High Frequency Trading Acceleration using FPGAs
    21st International Conference on Field Programmable Logic and Applications (FPL 2011), September 5–7, 2011, Chania, Greece.
  19. Frank Lemke, Ulrich Brüning
    Design Concepts for a Hierarchical Synchronized Data Acquisition Network for CBM
    IEEE 18th Real-Time Conference 2012 (RT12), June 11–15, 2012, Berkeley, CA, USA.
  20. B. Mohr, N. Zimmermann, B. T. Thiel, J. H. Mueller, Y. Wang, Y. Zang, F. Lemke, R. Leys, S. Schenk, U. Brüning, R. Negra, S. Heinen
    An RFDAC Based Reconfigurable Multistandard Transmitter in 65nm CMOS
    IEEE 2012 RFIC Symposium, June 17–19, 2012, Montreal, Canada.
  21. Javier Prades, Federico Silla, Jose Duato, Holger Fröning, Mondrian Nüssle
    A New End-to-End Flow-Control Mechanism for High Performance Computing Clusters
    IEEE International Conference on Cluster Computing, September 24-28, 2012, Beijing, China.
  22. Mondrian Nüssle, Holger Fröning, Sven Kapferer, Ulrich Brüning
    Accelerate Communication, not Computation!
    High Performance Computing Using FPGAs, p. 507-542, Vanderbauwhede, Wim; Benkrid, Khaled (Eds.), Springer, 2013.
  23. Holger Fröning, Mondrian Nüssle, Heiner Litz, Christian Leber and Ulrich Brüning
    On Achieving High Message Rates
    13th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), May 13-16, 2013, Delft, The Netherlands.
  24. Sven Schatral, Frank Lemke, Ulrich Brüning
    Design of a deterministic link initialization mechanism for serial LVDS interconnects
    TWEPP 2013 - Topical Workshop on Electronics for Particle Physics, Sept. 23-27, 2013, Perugia, Italy.
  25. Frank Lemke, Ulrich Brüning
    A Hierarchical Synchronized Data Acquisition Network for CBM
    IEEE Transactions on Nuclear Science (TNS), Journal Paper, VOL. 60, No. 5, Part II, Oct. 2013.
  26. Sven Schatral, Frank Lemke and Ulrich Brüning
    Design of a deterministic link initialization mechanism for serial LVDS interconnects
    Journal of Instrumentation, doi:10.1088/1748-0221/9/03/C03022, VOL. 9, No. 03, pages C03022, March 2014.
  27. Sven Kapferer, Markus Müller, Ulrich Brüning
    Implementation of a Complex Network ASIC in an Academic Environment
    CDNLive EMEA 2014, Academic Track, May 19-21, 2014, Munich, Germany.
  28. Sarah Neuwirth, Dirk Frey, Mondrian Nüssle, Ulrich Brüning
    Scalable Communication Architecture for Network-Attached Accelerators
    21st IEEE International Symposium on High Performance Computer Architecture (HPCA 2015), Feb. 7-11, 2015, Bay Area, California, USA. (acceptance rate: 21%, 51/231, ranking: core A*)
  29. Ulrich Bruening, Mondrian Nuessle, Dirk Frey
    An Immersive Cooled Implementation of a DEEP Booster
    Intel European Exascale Labs Annual Report 2014, July 2015.
  30. Juri Schmidt, Ulrich Brüning
    openHMC - A Configurable Open-Source Hybrid Memory Cube Controller
    10th IEEE International Conference on ReConFigurable Computing and FPGAs, Dec. 7-9, 2015, Mayan Riviera, Mexico.
  31. Sarah Neuwirth, Dirk Frey, Ulrich Bruening
    Communication Models for Distributed Intel Xeon Phi Coprocessors
    21st IEEE International Conference on Parallel and Distributed Systems (ICPADS 2015), Dec. 14-17, 2015, Melbourne, Australia.
  32. Juri Schmidt, Holger Fröning, Ulrich Brüning
    Exploring Time and Energy for Complex Accesses to a Hybrid Memory Cube
    The International Symposium on Memory Systems (MEMSYS 2016), Oct. 3-6, 2016, Washington, D.C.
  33. Sarah Neuwirth, Feiyi Wang, Sarp Oral, Sudharshan Vazhkudai, James Rogers, Ulrich Brüning
    Using Balanced Data Placement to Address I/O Contention in Production Environments
    28th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD 2016), Oct. 26-28, 2016, Los Angeles, California, USA.
    - Best Paper Award -
  34. Sarah Neuwirth, Feiyi Wang, Sarp Oral, Ulrich Bruening
    Automatic and Transparent Resource Contention Mitigation for Improving Large-scale Parallel File System Performance
    23rd IEEE International Conference on Parallel and Distributed Systems (ICPADS 2017), Dec. 15-17, 2017, Shenzhen, China.
  35. Stefan Kosnac and Ulrich Bruening
    Design Flow Automation for On-Chip Inductors
    Cadence User Conference 2018 (CDNLive EMEA 2018), Academic Track, May 7-9, 2018, Munich, Germany.
  36. Felix Kaiser, Stefan Kosnac and Ulrich Bruening
    Implementation of a RISC-V-Conform Fused Multiply-Add Floating Point Unit
    Supercomputing Frontiers Europe 2019, March 11-14, 2019, Warsaw, Poland.
  37. Tobias Markus, Markus Mueller, Ulrich Bruening
    Schematic Generation Framework in a Mixed Signal Top Down Design Flow
    22nd IEEE International Symposium on Design and Diagnostics of Electronic Circuits and Systems (DDECS 2019), April 24-26, 2019, Cluj-Napoca, Romania.
  38. Bharti Wadhwa, Arnab Paul, Sarah Neuwirth, Feiyi Wang, Sarp Oral, Ali Butt, Jon Bernard, Kirk Cameron
    iez: Resource Contention Aware Load Balancing for Large-Scale Parallel File Systems
    33rd IEEE International Parallel and Distributed Processing Symposium (IPDPS 2019), May 20-24, 2019, Rio de Janeiro, Brazil.

Further Information

The Chairholder Prof. Dr. Ulrich Brüning can be reached by e-mail via ulrich.bruening@ziti.uni-heidelberg.de.