An Introduction to Federated Learning: Decentralized Data, Centralized Intelligence

In many real-world applications, training machine learning models on client data is difficult: moving the data to a central server can be costly or impractical, and collecting it raises user privacy concerns. To address these problems, McMahan et al. introduced federated learning in 2016 [1].

Definition of Federated Learning

Federated learning is a machine learning approach where a model is trained across multiple clients under the orchestration of a central server. Instead of sharing raw data, clients share only model weight updates with the server. The server then aggregates these updates to improve the global model.
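To make the aggregation step concrete, here is a minimal sketch in plain Python. It assumes each client reports its update as a flat list of floats together with the number of local training examples, and that the server combines them with an example-weighted average, in the spirit of the federated averaging described in [1]. The function name `aggregate_updates` and the toy numbers are illustrative assumptions, not part of any library.

```python
from typing import List, Sequence


def aggregate_updates(
    updates: Sequence[Sequence[float]],
    num_examples: Sequence[int],
) -> List[float]:
    """Average client updates, weighting each by its local example count."""
    total = sum(num_examples)
    dim = len(updates[0])
    return [
        sum(update[i] * n for update, n in zip(updates, num_examples)) / total
        for i in range(dim)
    ]


# Hypothetical toy setup: two clients, a model with two weights.
global_weights = [0.0, 0.0]
client_updates = [[1.0, 2.0], [3.0, 4.0]]  # weight deltas, never raw data
client_sizes = [1, 3]                      # local training examples per client

avg_update = aggregate_updates(client_updates, client_sizes)

# The server applies the averaged update to the global model.
global_weights = [w + d for w, d in zip(global_weights, avg_update)]
print(global_weights)
```

Weighting by example count means a client with more data pulls the global model further than a client with little data; an unweighted mean is a reasonable alternative when client sizes are similar or unknown.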

Figure: The federated learning lifecycle. Clients' mobile devices train locally and send model updates, not raw data, to the server; engineers and analysts then test the model, and it is finally deployed and distributed to users globally.

The diagram is taken from [3].

How Federated Learning Works

The server orchestrates the training process by repeatedly following these steps:

  1. Select Clients: The server selects a sample of clients that meet the eligibility criteria.

  2. Distribute Model Weights: Clients download the current model weights from the server.

  3. Local Training: Clients train the model locally on their own data.

  4. Collect Updates: The server collects model updates from the clients.

  5. Aggregate Updates: The server combines the collected updates into a single aggregate update, typically by averaging them.

  6. Update Global Model: The server updates the shared global model based on the aggregated updates.
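The six steps above can be sketched as a small single-process simulation. This is only an illustration under stated assumptions: the "model" is a single weight `w` in `y = w * x`, each client is a list of `(x, y)` pairs, eligibility simply means "has local data", and aggregation is an example-weighted average. The names `local_train` and `run_round`, the learning rate, and the synthetic data are all hypothetical; a real deployment would add networking, secure aggregation, and failure handling.

```python
import random
from typing import List, Tuple

Dataset = List[Tuple[float, float]]  # (x, y) pairs that never leave the client


def local_train(w: float, data: Dataset, lr: float = 0.01, epochs: int = 5) -> float:
    """Step 3: a few epochs of gradient descent on y = w * x using local data."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def run_round(global_w: float, clients: List[Dataset], k: int, rng: random.Random) -> float:
    # Step 1: select k clients among those that are eligible (here: have data).
    eligible = [data for data in clients if data]
    selected = rng.sample(eligible, k)

    # Steps 2-4: each selected client starts from the current global weight,
    # trains locally, and reports only its weight delta and example count.
    updates = [(local_train(global_w, data) - global_w, len(data)) for data in selected]

    # Step 5: aggregate the deltas, weighted by local example count.
    total = sum(n for _, n in updates)
    avg_delta = sum(delta * n for delta, n in updates) / total

    # Step 6: apply the aggregate to the shared global model.
    return global_w + avg_delta


# Synthetic clients whose data follows y = 3x; the last client has no data.
clients: List[Dataset] = [
    [(float(x), 3.0 * x) for x in range(1, n + 1)] for n in (2, 3, 4, 5)
] + [[]]

rng = random.Random(0)
global_w = 0.0
for _ in range(30):
    global_w = run_round(global_w, clients, k=2, rng=rng)

print(round(global_w, 3))  # should end up close to the true slope of 3
```

Note that the server only ever sees weight deltas and example counts; the `(x, y)` pairs are read exclusively inside `local_train`, which stands in for on-device computation.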

Real-Life Usage

Google uses federated learning extensively in its Gboard mobile keyboard, Pixel phone features, and Android Messages. Apple has also adopted this technology in iOS 13 for the QuickType keyboard and the "Hey Siri" vocal classifier.

For more information about federated learning, see the references below.

References

  1. Communication-Efficient Learning of Deep Networks from Decentralized Data

  2. What is federated learning? (IBM Research)

  3. Advances and Open Problems in Federated Learning