Tian Li et al., Federated Learning: Challenges, Methods, and Future Directions
Intro
- Edge Computing is getting big
- Fog Computing
- Recent trend: train models centrally, run inference locally on devices
- Federated learning (FL) challenges this by training models directly on the devices where the data is generated
Challenges
- Expensive communication
- Reduce the number of communication rounds
- Reduce the size of each transmitted message
- Systems Heterogeneity
- Devices differ widely in hardware, network connectivity, and power
- Statistical Heterogeneity
- Privacy Concerns
Communication Efficiency in Federated Networks
Communication is a primary bottleneck in federated learning.
Local Updating
- Mini-batch optimization methods are the classical approach to distributed learning.
- Newer methods allow a flexible amount of local updating on each machine, trading local computation for fewer communication rounds.
- In federated settings, local updating is most commonly done via Federated Averaging (FedAvg).
- FedAvg is effective in practice but is not guaranteed to converge, particularly when data is heterogeneous (see the sketch below).
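Rough sketch of the FedAvg pattern described above, assuming a toy least-squares problem and a made-up data partition across devices; every name and hyperparameter here is illustrative, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_device_data(num_devices=10, samples_per_device=50, dim=5):
    """Toy setup: each 'device' holds its own (X, y) shard for linear regression."""
    w_true = rng.normal(size=dim)
    devices = []
    for _ in range(num_devices):
        X = rng.normal(size=(samples_per_device, dim))
        y = X @ w_true + 0.1 * rng.normal(size=samples_per_device)
        devices.append((X, y))
    return devices

def local_update(w_global, X, y, lr=0.01, local_epochs=5):
    """Run several local gradient steps starting from the current global model."""
    w = w_global.copy()
    for _ in range(local_epochs):
        grad = X.T @ (X @ w - y) / len(y)   # least-squares gradient
        w -= lr * grad
    return w

def fedavg(devices, rounds=50, clients_per_round=5, dim=5):
    w_global = np.zeros(dim)
    for _ in range(rounds):
        # Only a subset of devices participates in each round.
        chosen = rng.choice(len(devices), size=clients_per_round, replace=False)
        updates, weights = [], []
        for k in chosen:
            X, y = devices[k]
            updates.append(local_update(w_global, X, y))
            weights.append(len(y))
        # Server averages the returned models, weighted by local data size.
        w_global = np.average(updates, axis=0, weights=weights)
    return w_global

print(fedavg(make_device_data()))
```

Running several local steps per round cuts the number of communication rounds relative to sending a gradient after every mini-batch, which is exactly the trade-off the bullets above describe.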
Compression Schemes
- Model compression schemes such as sparsification, subsampling, and quantization reduce the size of each message (a top-k sparsification sketch follows this list).
- These methods face added challenges in federated settings due to variability in device participation.
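Minimal sketch of one compression idea, top-k sparsification: keep only the largest-magnitude entries of an update and transmit them as (index, value) pairs. This is a generic illustration, not a specific scheme from the paper:

```python
import numpy as np

def sparsify_topk(update, k):
    """Keep only the k largest-magnitude entries; send them as (indices, values)."""
    idx = np.argsort(np.abs(update))[-k:]
    return idx, update[idx]

def densify(indices, values, dim):
    """Server side: reconstruct a full-size (mostly zero) update."""
    out = np.zeros(dim)
    out[indices] = values
    return out

update = np.random.default_rng(0).normal(size=1000)
idx, vals = sparsify_topk(update, k=50)              # ~20x fewer values on the wire
recovered = densify(idx, vals, dim=update.size)
```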
Decentralized Training
- Federated learning often uses a star network topology.
- Decentralized topologies can reduce communication costs when the central server becomes a bottleneck (see the gossip-averaging sketch after this list).
- Hierarchical communication patterns leverage edge servers for data aggregation.
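Minimal sketch of a decentralized round: instead of reporting to a central server, each device averages model parameters with its neighbors via a mixing matrix. The ring topology and uniform weights below are illustrative assumptions:

```python
import numpy as np

def ring_mixing_matrix(n):
    """Doubly stochastic mixing matrix for a ring: each node averages with its two neighbors."""
    W = np.zeros((n, n))
    for i in range(n):
        W[i, i] = W[i, (i - 1) % n] = W[i, (i + 1) % n] = 1 / 3
    return W

def gossip_round(local_models, W):
    """One communication round: node i's new model is a weighted average of its neighbors' models."""
    return W @ local_models            # local_models has shape (num_nodes, dim)

rng = np.random.default_rng(0)
models = rng.normal(size=(8, 5))       # 8 devices, 5-dimensional models
W = ring_mixing_matrix(8)
for _ in range(20):                    # repeated gossip drives the rows toward consensus
    models = gossip_round(models, W)
```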
Systems Heterogeneity in Federated Settings
Federated settings exhibit significant systems variability: hardware, network, battery power.
Asynchronous Communication
- Asynchronous schemes mitigate stragglers but face challenges in federated settings with unpredictable delays.
Active Sampling
- Only a subset of devices participate in each training round.
- Active selection of devices can influence outcomes.
- Methods can actively sample devices based on systems resources or data quality (a resource-weighted sampling sketch follows this list).
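Small sketch of biased device selection: sample participants with probability proportional to a resource score. The score here is a made-up stand-in for signals like battery level, bandwidth, or local data quality:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_devices(resource_scores, num_selected):
    """Sample devices without replacement, with probability proportional to a resource score."""
    scores = np.asarray(resource_scores, dtype=float)
    return rng.choice(len(scores), size=num_selected, replace=False, p=scores / scores.sum())

# e.g. scores combining battery level, connection speed, and dataset size (all hypothetical)
chosen = sample_devices([0.9, 0.2, 0.7, 0.5, 0.95, 0.1], num_selected=3)
```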
Fault Tolerance
- Essential in federated settings due to potential device dropouts.
- Coded computation introduces redundancy to tolerate failures.
Statistical Heterogeneity in Federated Models
Statistical heterogeneity creates challenges in federated models because data is not identically distributed across devices. This heterogeneity affects both how the data is modeled and the analysis of the convergence behavior of training procedures.
Modeling Heterogeneous Data
Literature Background
Extensive research models statistical heterogeneity through meta-learning and multi-task learning. These concepts have recently been extended to federated learning.
- MOCHA: an optimization framework designed for the federated setting; it allows personalization by learning separate but related models for each device while leveraging a shared representation via multi-task learning.
- Bayesian networks: model the star topology of the federated network as a Bayesian network and perform variational inference during learning.
- Meta-learning: uses multi-task information to adapt the within-task learning rate.
- Transfer learning: explores personalization by training a global model and then adapting it to each device (a fine-tuning sketch follows this list).
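Minimal sketch of the transfer-learning flavor of personalization: take the trained global model and run a few local gradient steps on each device's own data. The least-squares loss and all names are illustrative:

```python
import numpy as np

def personalize(w_global, X_local, y_local, lr=0.01, steps=10):
    """Fine-tune the shared global model on one device's data to obtain a personalized model."""
    w = w_global.copy()
    for _ in range(steps):
        grad = X_local.T @ (X_local @ w - y_local) / len(y_local)   # local least-squares gradient
        w -= lr * grad
    return w
```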
Challenges
Despite advancements, there are still obstacles in ensuring the methods are robust, scalable, and automated.
Fairness in Federated Data
Fairness needs to be considered alongside accuracy: the global model can end up favoring devices that have more data or participate more frequently.
Modified Modeling
Some approaches modify the training objective to reduce the variance of model performance across devices (a loss-reweighting sketch follows).
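One illustrative way to trade a little average accuracy for lower variance is to reweight per-device losses so that worse-off devices count more, in the spirit of q-fair objectives that raise each device's loss to a power. A sketch, not the exact method from any specific paper:

```python
import numpy as np

def q_weighted_objective(device_losses, q=0.0):
    """Fairness-tilted objective: larger q emphasizes devices with higher loss.

    q = 0 recovers a plain average of device losses; increasing q pushes training
    toward the worst-off devices, which tends to shrink the spread of final accuracy.
    """
    losses = np.asarray(device_losses, dtype=float)
    return np.mean(losses ** (q + 1) / (q + 1))

plain = q_weighted_objective([0.2, 0.3, 1.5], q=0.0)   # ordinary average
fair  = q_weighted_objective([0.2, 0.3, 1.5], q=2.0)   # dominated by the struggling device
```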
Convergence Guarantees for Non-IID Data
Challenge
Non-IID data, i.e., data that is not independent and identically distributed across devices, makes it difficult to analyze the convergence behavior of methods in federated settings.
- FedAvg: has been observed to diverge in practice when data is heterogeneous.
- FedProx: modifies FedAvg by adding a proximal term to the local objective, and comes with convergence guarantees in heterogeneous settings (see the sketch below).
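FedProx's key change is the proximal term added to each device's local objective, penalizing drift from the current global model. A minimal sketch of the modified objective and its gradient, assuming a generic differentiable local loss and an illustrative proximal coefficient mu:

```python
import numpy as np

def fedprox_local_loss(w, w_global, local_loss, mu=0.1):
    """Local objective with a proximal penalty keeping w close to the global model."""
    return local_loss(w) + (mu / 2.0) * np.sum((w - w_global) ** 2)

def fedprox_local_grad(w, w_global, local_grad, mu=0.1):
    """Gradient of the proximal objective, given the local loss gradient."""
    return local_grad(w) + mu * (w - w_global)

# One local step (illustrative): w = w - lr * fedprox_local_grad(w, w_global, local_grad)
```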
Other Approaches
Some approaches tackle statistical heterogeneity by sharing local device data or server-side proxy data. However, this could lead to issues with privacy.
Privacy
Privacy is a primary motivation for federated learning: raw data remains local on each device. However, information shared during training, such as model updates, can still reveal sensitive information.
Privacy in General Machine Learning
Strategies
- Differential Privacy: popular for its strong guarantees, simplicity, and small systems overhead; it ensures that changing a single input record does not drastically alter the output distribution (a Gaussian-mechanism sketch follows this list).
- Homomorphic Encryption: allows computation to be performed directly on encrypted data.
- Secure Multiparty Computation (SMC): enables multiple parties to collaboratively compute a function without revealing their individual inputs.
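Minimal sketch of differential privacy via the Gaussian mechanism: bound a statistic's sensitivity, then add Gaussian noise calibrated to (epsilon, delta). The calibration below is the standard Gaussian-mechanism formula; the concrete numbers are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_mechanism(value, sensitivity, epsilon, delta):
    """Release `value` with (epsilon, delta)-differential privacy.

    sensitivity: the maximum L2 change in `value` caused by altering one input record.
    """
    sigma = np.sqrt(2.0 * np.log(1.25 / delta)) * sensitivity / epsilon
    return value + rng.normal(scale=sigma, size=np.shape(value))

# e.g. privately release an average whose per-record contribution is bounded by 1/n
n = 1000
private_avg = gaussian_mechanism(0.42, sensitivity=1.0 / n, epsilon=1.0, delta=1e-5)
```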
Trade-offs
A balance between privacy and model accuracy must be struck. The introduction of noise enhances privacy but can compromise accuracy.
Privacy in Federated Learning
Challenges
The federated context requires cost-effective methods, efficient communication, and resilience to device dropouts without sacrificing accuracy.
- Privacy definitions: global privacy (model updates are private from all untrusted third parties other than the central server) and local privacy (updates are also private from the central server).
Approaches
- Secure Multi-Party Computation (SMC): protects individual model updates so that the server only sees the aggregate.
- Differential Privacy: applied to federated learning to provide global differential privacy; some methods use adaptive gradient clipping to improve the privacy/utility trade-off (see the clipping-and-noise sketch after this list).
- Local Privacy: relaxed variants of local differential privacy have been proposed that achieve better accuracy than the strict definition.
- Combining methods: differential privacy can be combined with model compression to obtain both privacy and communication benefits.
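In federated training, global differential privacy is commonly obtained by clipping each device's update and adding noise to the server-side aggregate (the pattern used by DP-FedAvg-style schemes). A rough sketch with illustrative parameter values:

```python
import numpy as np

rng = np.random.default_rng(0)

def clip_update(update, clip_norm):
    """Bound each device's influence by clipping its update to a maximum L2 norm."""
    norm = np.linalg.norm(update)
    return update * min(1.0, clip_norm / (norm + 1e-12))

def private_aggregate(updates, clip_norm=1.0, noise_multiplier=1.0):
    """Average clipped updates and add Gaussian noise at the server (global DP)."""
    clipped = [clip_update(u, clip_norm) for u in updates]
    avg = np.mean(clipped, axis=0)
    sigma = noise_multiplier * clip_norm / len(updates)
    return avg + rng.normal(scale=sigma, size=avg.shape)

updates = [rng.normal(size=10) for _ in range(100)]   # one update per participating device
noisy_global_update = private_aggregate(updates)
```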
Future Directions and Themes
Communication in Federated Learning
Extreme Communication Schemes
- Understand how much communication is actually necessary for optimization in federated learning.
- Analyze how tolerant optimization methods are to imprecise or low-precision communication, and whether this imprecision can even benefit generalization.
- Evaluation of one-shot/few-shot heuristics in federated settings.
Communication Reduction and Pareto Frontier
- Techniques for reducing communication in federated training, e.g., local updating and model compression.
- Analysis of the trade-off between accuracy and communication.
- Assess techniques for their efficiency in achieving the best accuracy under a communication budget.
Novel Models of Asynchrony
- Comparison between synchronous and asynchronous communication in distributed optimization.
- Consideration of device-centric communication schemes in federated networks.
Heterogeneity in Federated Learning
Heterogeneity Diagnostics
- Methods to quantify statistical heterogeneity.
- Development of diagnostics to gauge the levels of heterogeneity before training.
- Explore how information about heterogeneity could be exploited to improve the convergence of federated optimization.
Privacy Concerns
Granular Privacy Constraints
- Examination of local and global privacy definitions in federated networks.
- Proposing methods that respect device-specific or sample-specific privacy constraints.
Extending the Scope of Federated Learning
Beyond Supervised Learning
- Addressing scenarios where data in federated networks is unlabeled or weakly labeled.
- Addressing tasks other than model fitting, such as exploratory data analysis, aggregate statistics, or reinforcement learning.
Operational Challenges
Productionizing Federated Learning
- Handling practical concerns like concept drift, diurnal variations, and cold start problems.
Standards and Benchmarks
Benchmarks
- Emphasis on grounding federated learning research in real-world settings and datasets.
- Enhancing existing benchmarking tools to encourage reproducibility and the dissemination of solutions.