Topologies in Decentralized Federated Learning.

A detailed description of the topologies used in Decentralized Federated Learning.

In the cross-silo setting, data silos are almost always available, have high-speed connectivity comparable to the orchestrator's, and can often exchange information with some other silos faster than with the orchestrator. An orchestrator-centered communication topology is therefore potentially inefficient: it ignores these fast inter-silo links and makes the orchestrator a likely point of congestion. A recent trend is to replace communication between each silo and the orchestrator with peer-to-peer communication among silos, which then perform local partial aggregations of model updates themselves. In this section, we examine this scenario and how to design the communication topology.

Three main types of topology are currently used in Decentralized Federated Learning: the Underlay, the Connectivity Graph, and the Overlay.

Underlay

Fig-1

Figure 1: Underlay $\mathcal{G}_u = (\mathcal{V} \cup \mathcal{V}', \mathcal{E}_u)$.

FL silos are connected by a so-called underlay, i.e., a communication infrastructure such as the Internet or a private network. The underlay can be represented as a directed graph (digraph) $\mathcal{G}_u = (\mathcal{V} \cup \mathcal{V}', \mathcal{E}_u)$, where $\mathcal{V}$ denotes the set of silos, $\mathcal{V}'$ is the set of other nodes (e.g., routers) in the network, and $\mathcal{E}_u$ is the set of communication links. For simplicity, we assume that each silo $i \in \mathcal{V}$ is connected to the rest of the network through a single link $(i, i')$, where $i' \in \mathcal{V}'$, with uplink capacity $C_{UP}(i)$ and downlink capacity $C_{DN}(i)$ (see Figure 1 for an example).
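To make the definition concrete, the following is a minimal Python sketch of an underlay as a directed graph. The class name `Underlay`, its methods, the node names, and the capacity values (in bit/s) are all hypothetical and chosen only for illustration; the sketch assumes, as in the text, that each silo has a single access link.

```python
from dataclasses import dataclass, field


@dataclass
class Underlay:
    """Directed graph G_u = (V ∪ V', E_u); all names here are illustrative."""
    silos: set = field(default_factory=set)      # V: the FL silos
    routers: set = field(default_factory=set)    # V': other nodes, e.g. routers
    links: dict = field(default_factory=dict)    # E_u: (src, dst) -> capacity in bit/s

    def add_silo(self, silo, access_router, c_up, c_dn):
        """Attach silo i to the network through its single access link (i, i')."""
        self.silos.add(silo)
        self.routers.add(access_router)
        self.links[(silo, access_router)] = c_up   # uplink capacity C_UP(i)
        self.links[(access_router, silo)] = c_dn   # downlink capacity C_DN(i)

    def add_core_link(self, a, b, capacity):
        """Add a link in both directions between two non-silo nodes."""
        self.routers.update((a, b))
        self.links[(a, b)] = capacity
        self.links[(b, a)] = capacity

    def nodes(self):
        """Return the full node set V ∪ V'."""
        return self.silos | self.routers


# Two silos, each behind its own access router (made-up capacities).
underlay = Underlay()
underlay.add_silo("silo_A", "router_A", c_up=1e9, c_dn=1e9)
underlay.add_silo("silo_B", "router_B", c_up=5e8, c_dn=1e9)
underlay.add_core_link("router_A", "router_B", capacity=1e10)
```

Storing one dictionary entry per direction keeps the graph directed, so asymmetric uplink and downlink capacities such as those of `silo_B` are represented naturally.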

Connectivity Graph

Fig-2

Figure 2: Connectivity Graph $\mathcal{G}_c = (\mathcal{V}, \mathcal{E}_c)$.

The connectivity graph, denoted by $\mathcal{G}_c = (\mathcal{V}, \mathcal{E}_c)$, captures the possible direct communications among silos. Often the connectivity graph is fully connected, but specific NAT or firewall configurations may prevent some pairs of silos from communicating. If transmission is allowed, the message experiences a delay that is the sum of two contributions: 1) an end-to-end delay accounting for link latencies and queueing delays along the path, and 2) a term depending on the model size and the available bandwidth. We assume that in the stable cross-silo setting these quantities do not vary, or vary slowly, so that the topology is recomputed only occasionally, if at all.
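The two-term delay described above can be sketched in Python as follows. This is an illustration under assumptions: the function names, the dictionary-based inputs, and the form of the second term (model size divided by available bandwidth) are choices made for this sketch, and all numeric values are made up.

```python
def message_delay(latency_s, model_size_bits, bandwidth_bps):
    """Delay = end-to-end latency + transfer time (model size / available bandwidth)."""
    return latency_s + model_size_bits / bandwidth_bps


def build_connectivity_graph(silos, latency_s, bandwidth_bps, blocked, model_size_bits):
    """Return E_c as {(i, j): delay in seconds} for every allowed ordered pair."""
    edges = {}
    for i in silos:
        for j in silos:
            # Skip self-loops and pairs blocked by NAT or firewall rules.
            if i == j or (i, j) in blocked:
                continue
            edges[(i, j)] = message_delay(
                latency_s[(i, j)], model_size_bits, bandwidth_bps[(i, j)]
            )
    return edges


# Three silos; A and C cannot talk directly (hypothetical firewall rule).
silos = ["A", "B", "C"]
blocked = {("A", "C"), ("C", "A")}
latency_s = {("A", "B"): 0.02, ("B", "A"): 0.02, ("B", "C"): 0.05, ("C", "B"): 0.05}
bandwidth_bps = {("A", "B"): 1e9, ("B", "A"): 1e9, ("B", "C"): 5e8, ("C", "B"): 5e8}

# A model of 10^8 bits: the A -> B delay is about 0.02 + 0.1 = 0.12 s.
edges = build_connectivity_graph(silos, latency_s, bandwidth_bps, blocked, 1e8)
```

Because latency and bandwidth are assumed to be stable, a table like `edges` would only need to be rebuilt occasionally.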

Overlay

Fig-3

Figure 3: Overlay $\mathcal{G}_o = (\mathcal{V}, \mathcal{E}_o)$.

Thanks to the development of decentralized training algorithms, we do not need to use all potential connections. Hence, the orchestrator can select a connected subgraph of $\mathcal{G}_c$, the so-called overlay $\mathcal{G}_o = (\mathcal{V}, \mathcal{E}_o)$. Only silos that are directly connected by an edge in $\mathcal{E}_o$ exchange messages.