Peer Provider Composability in libfabric
Sean Hefty, Intel Corp.; Alexia Ingerson, Intel Corp.; Jianxin Xiong, Intel Corp.

libfabric defines low-level communication APIs. An implementation of those APIs over a specific network technology is known as a provider. There are providers for RDMA NICs, shared memory, standard Ethernet NICs, and customized high-performance NICs. An application running over libfabric typically uses a single provider for all communication. This model works well when there is a single fabric connecting all communicating peers, and an application written to libfabric can migrate between different network technologies with minimal effort.

However, such a simple model cannot extract maximum performance from current and future system configurations. For example, Intel's DSA (Data Streaming Accelerator) offers significant benefits for communication between processes within a single operating system domain. Similarly, GPUs have their own back-end fabrics for communicating between devices, which offer significant bandwidth improvements over standard networks. GPU fabrics are expected to attach to devices under the control of different operating systems, and that use case will become more common. Additionally, more traditional HPC networks (e.g. InfiniBand) offer in-network accelerations for communication, such as switch-based collectives. It is conceivable that such accelerations could exist in GPU fabrics, or even within the local node using custom devices (e.g. FPGAs or PCIe/CXL plug-in devices).

To achieve the best performance, an application must be able to leverage all of these components well: local node accelerations, GPU fabrics, GPU fabric switches, HPC NICs, HPC switches, and other attached devices. A significant difficulty in doing so is that these components may come from different vendors, and the application must work across a variety of evolving hardware and network transport configurations.

To support this anticipated complexity, libfabric has introduced a new concept known as peer providers and peer APIs. The peer APIs target independent development and maintenance of highly focused providers, which can then be assembled to present themselves to an application as a single entity. This allows mixing and matching providers from different vendors for separate purposes, as long as they support the peer APIs. (A minimal sketch of one such peer handoff follows this abstract.)

This talk will discuss the peer provider architecture, its current status, and the peer API design. It comprises three related presentations, listed as one submission for ease of review. Together, the three presentations will require 60-75 minutes total. Sample presentations for each section are attached to the submission, but may show up as different presentation versions. However, each presentation is separate. The presentations are:

1. Introduction to the peer provider architecture and API design.
2. Pairing the shared memory provider with scale-out (i.e. HPC NIC) providers. Two separate, complementary methods are discussed.
3. Using the peer architecture to support highly focused providers alongside a scale-out provider. In this example, we will look at integrating support for a provider focused on offloading collective operations onto switches, and how it can be paired with core providers.
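To make the peer API idea slightly more concrete, below is a minimal sketch of one building block of that design: a peer provider importing a completion queue owned by another provider, so that completions from both land in the single queue the application polls. It loosely follows the fi_peer(3) interfaces found in recent libfabric releases; the function name import_owner_cq is purely illustrative, and the header path, struct layouts, and flag usage are assumptions that may differ between libfabric versions.

/*
 * Sketch: a peer provider importing a completion queue owned by another
 * provider, loosely following the fi_peer(3) interfaces. Header path,
 * struct layouts, and flag usage are assumptions and may vary across
 * libfabric releases.
 */
#include <rdma/fabric.h>
#include <rdma/fi_domain.h>
#include <rdma/providers/fi_peer.h>   /* peer API definitions (assumed path) */

/*
 * The owner provider (e.g. a scale-out NIC provider) creates the CQ that
 * the application sees and exports it as a struct fid_peer_cq. The peer
 * provider (e.g. shared memory) then opens "its" CQ against that object
 * instead of allocating a private one.
 */
static int import_owner_cq(struct fid_domain *peer_domain,
                           struct fid_peer_cq *owner_cq,
                           struct fid_cq **peer_cq)
{
    struct fi_peer_cq_context peer_ctx = {
        .size = sizeof(peer_ctx),
        .cq = owner_cq,
    };

    /* FI_PEER indicates that the context argument carries a
     * fi_peer_cq_context and that the peer provider must report its
     * completions through the owner's CQ rather than a private queue. */
    struct fi_cq_attr attr = {
        .format = FI_CQ_FORMAT_TAGGED,
        .flags = FI_PEER,
    };

    return fi_cq_open(peer_domain, &attr, peer_cq, &peer_ctx);
}

Analogous peer constructs exist for other queue-like objects (for example, shared receive contexts), and it is this sharing of application-visible objects that lets independently developed providers, such as shared memory paired with an HPC NIC provider, appear to the application as a single entity.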