MEEP will provide a foundation for building European-based chips and infrastructure. It will enable rapid prototyping through a library of IPs and, via the FPGA shell, a standard set of interfaces to the host CPU and to other FPGAs in the system. In addition to improving the RISC-V architecture and hardware ecosystem, MEEP will also improve the RISC-V software ecosystem with an improved and extended software toolchain and software stack, including a suite of HPC and HPDA (High Performance Data Analytics) applications.
From a high-level perspective, MEEP is defined as a layered structure, as depicted below. Consecutive horizontal layers depend directly on each other, whereas the vertical layer is transversal to all the rest.
The structure above sets out the scope of each layer and the close relationship between contiguous layers. HPC applications (the top layer) need software tools (second layer) to transform programs into something that can be read and executed on the emulation platform (third layer), according to the characteristics of the accelerator architecture (fourth layer).
Each layer has its own responsibility, and all of them can be analyzed using profiling and performance monitoring tools. In particular:
- The HPC Applications layer studies and analyzes different kinds of applications, considering their workloads, data distribution and dependencies, level of parallelism, programming languages, etc.
- The Software toolchain layer provides the required software ecosystem to exploit hardware capabilities. It includes the software stack, from the operating system (at the bottom) to several HPC runtimes (at the top).
- The Emulation Platform layer makes it possible to execute HPC applications on top of the Accelerator Architecture, using the information received from the Software toolchain layer.
- The Accelerator Architecture layer defines the accelerator components, their connectivity and functionality.
- The Profiling and performance monitoring layer provides feedback on the other layers, for example on the memory access patterns of applications.
Read more details on the MEEP platform layers here.
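The layered structure described above can be sketched as a small data model. This is only an illustrative sketch: the layer names and their top-to-bottom order come from the text, while the function names (`direct_dependencies`, `monitored_layers`) are hypothetical and not part of any MEEP tooling.

```python
# Horizontal layers, listed top to bottom as described in the text.
STACK = [
    "HPC Applications",
    "Software toolchain",
    "Emulation Platform",
    "Accelerator Architecture",
]

# The single vertical layer, transversal to all the others.
TRANSVERSAL = "Profiling and performance monitoring"


def direct_dependencies(stack):
    """Return (upper, lower) pairs: each layer depends on the one right below it."""
    return list(zip(stack, stack[1:]))


def monitored_layers(stack):
    """Map the transversal layer to every horizontal layer it can analyze."""
    return {TRANSVERSAL: list(stack)}


if __name__ == "__main__":
    for upper, lower in direct_dependencies(STACK):
        print(f"{upper} -> {lower}")
    print(monitored_layers(STACK))
```

Only contiguous layers are paired, which mirrors the statement that direct dependencies exist between consecutive layers, while the transversal layer is attached to all of them.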
MEEP development phases
To manage the complexity and scale of MEEP, its development will be carried out in two main phases:
Phase 1: It will work with a portion of the architecture on a single FPGA, that is, a single instance of the emulation platform. The development of this phase will use Xilinx Alveo U280 data center accelerator cards. The aim of this experimental work is to demonstrate the viability of MEEP as an FPGA-based emulation platform.
MEEP teams have been making progress on the different layers involved in the project, from software to hardware.
Regarding the Software Stack, most of the effort has focused on the Operating System, since it is the foundation on which all the other software layers are built. The same is true from the hardware perspective, where the Operating System opens the door to executing HPC applications on the hardware architecture. Thus, the final goal of the emulation platform, and of the emulated accelerator, is to support the requirements imposed by the OS.
From the hardware stack perspective, the effort has focused on three areas:
Emulation Platform: The goal here has been to create an FPGA Shell that enables communication between the host and the emulated accelerator. More information can be found on Emulation Platform.
Modelling ACME accelerator architecture: The goal here has been to refine the ACME accelerator specifications by following a data-driven decision procedure. Coyote plays the main role as a performance modelling simulator. It models the different components, and the corresponding simulation results are used to check initial assumptions and to refine and validate features and behaviors. More information can be found on Accelerator Architecture.
ACME as an emulated accelerator: Since the ACME accelerator SoC will be developed further after the life cycle of MEEP, the hardware stack proceeds step by step, starting from a first approximation of the architecture and then refining the design to be as faithful as possible to the proposed ACME architecture specifications. In the end, the emulated accelerator will be a proof of concept for the MEEP project, on top of which HPC applications can be executed. More information can be found on Accelerator Architecture/corresponding subsection.
Phase 2: A bigger and more complex system will be developed, in which up to 8 cards will be placed in a system or node. Multiple systems will be connected via a high-speed switch. The FPGAs will be able to communicate within the node or between nodes, which supports the development of FPGA-to-FPGA communication in two ways: local and remote. The idea is to scale the system, using a denser form factor, beyond what can be done today with normal off-the-shelf platforms. The prototype platform will therefore enable mappings within an FPGA, across multiple FPGAs in a node, and across multiple FPGAs in multiple nodes.
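The three mapping scopes of Phase 2 can be illustrated with a short sketch. The limit of 8 cards per node and the local/remote distinction come from the text. The `FpgaId` type, its `node` and `card` fields, and the `mapping_scope` function are hypothetical names introduced only for this example, not part of the MEEP platform.

```python
from dataclasses import dataclass

MAX_CARDS_PER_NODE = 8  # "up to 8 cards" per node, as stated in the text


@dataclass(frozen=True)
class FpgaId:
    """Hypothetical address of one FPGA card: node index plus slot in that node."""
    node: int
    card: int

    def __post_init__(self):
        if self.node < 0:
            raise ValueError("node index must be non-negative")
        if not 0 <= self.card < MAX_CARDS_PER_NODE:
            raise ValueError(f"card slot must be in 0..{MAX_CARDS_PER_NODE - 1}")


def mapping_scope(src: FpgaId, dst: FpgaId) -> str:
    """Classify where traffic between two FPGAs would travel."""
    if src == dst:
        return "within-fpga"   # mapping inside a single FPGA
    if src.node == dst.node:
        return "local"         # FPGA-to-FPGA inside the same node
    return "remote"            # FPGA-to-FPGA across nodes, via the switch


if __name__ == "__main__":
    a, b, c = FpgaId(0, 0), FpgaId(0, 7), FpgaId(1, 0)
    print(mapping_scope(a, a), mapping_scope(a, b), mapping_scope(a, c))
```

The classification only compares node indices, so it says nothing about the actual interconnect or switch topology, which the text does not specify.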