
Qiang Yang et al., Federated Machine Learning: Concept and Applications

Intro

  • Data exists in isolated islands
  • More security required

First FML framework by Google, 2016

  • Horizontal FL
  • Vertical FL
  • Federated Transfer Learning

Traditional Data Processing Models

  • aka Simple Data Transaction Model
  • 3 parties
    • Data Collector
    • Data Sanitizer
    • ML Trainer
  • Privacy concerns

Overview

  • The authors aim to extend the term to all privacy-preserving decentralized collaborative machine learning techniques.
  • Simple definition
    • N parties federate their data without exposing it to each other, attaining performance closely comparable to a model trained as if all the data were gathered in one place.
  • Secure Multi-Party Computation: parties jointly compute a function over their inputs while keeping those inputs private.
  • Differential Privacy: add calibrated random noise to the data or aggregates, making it difficult to identify any individual's data in the results (see the sketch after this list).
  • Homomorphic Encryption: allows computation on encrypted data without requiring decryption.
  • Indirect information leakage
    • Model updates can still leak information; blockchained FL architectures have been considered to address this.
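
A minimal sketch of the differential-privacy idea above, assuming a NumPy setting; `dp_sum` and its parameters are illustrative, not from the paper:

```python
# Hedged sketch: release an aggregate with calibrated Laplace noise so that no
# single record can be identified from the result (epsilon-differential privacy).
import numpy as np

def dp_sum(values, epsilon, sensitivity=1.0, rng=None):
    """Noisy sum; assumes each record contributes at most `sensitivity`."""
    rng = rng or np.random.default_rng()
    return float(np.sum(values)) + rng.laplace(scale=sensitivity / epsilon)

# Example: a count query (sensitivity 1) released with epsilon = 0.5.
print(dp_sum(np.ones(1000), epsilon=0.5))
```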

Categorization of Federated Learning

Architecture

Horizontal Federated Learning

  • Participants train on their local data and send model updates to a central server.
  • The server aggregates the updates into a global model and distributes it back to the participants.
  • Participants update their local models with the global model and repeat (see the sketch after this list).
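
A minimal FedAvg-style sketch of this loop, assuming a linear model trained with NumPy; the client data, learning rate, and round count are illustrative rather than taken from the paper:

```python
# Horizontal FL round: local training on private data, then weighted averaging
# of the resulting models on the server (FedAvg-style).
import numpy as np

rng = np.random.default_rng(0)
DIM, CLIENTS, ROUNDS, LR = 5, 4, 20, 0.1

# Each participant holds its own (features, labels) partition with the same schema.
local_data = [(rng.normal(size=(50, DIM)), rng.normal(size=50)) for _ in range(CLIENTS)]

def local_update(theta, X, y, lr=LR):
    """One local gradient step on a participant's private data (never shared)."""
    grad = X.T @ (X @ theta - y) / len(y)
    return theta - lr * grad

theta_global = np.zeros(DIM)
for _ in range(ROUNDS):
    # Participants compute locally; only model parameters leave the device.
    local_models = [local_update(theta_global, X, y) for X, y in local_data]
    # Server aggregates (data-size-weighted average) and redistributes.
    sizes = [len(y) for _, y in local_data]
    theta_global = np.average(local_models, axis=0, weights=sizes)
print(theta_global)
```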

Vertical Federated Learning

  • An intermediary collaborator is needed (the protocol is sketched after this list).
  • Step 1: Collaborator C creates encryption key pairs and sends the public key to A and B.
  • Step 2: A and B encrypt and exchange the intermediate results needed for gradient and loss calculations.
  • Step 3: A and B compute their encrypted gradients and each adds a random mask; B also computes the encrypted loss; both send the encrypted values to C.
  • Step 4: C decrypts and sends the gradients and loss back to A and B; A and B remove their masks and update their model parameters accordingly.
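
A toy walk-through of steps 1-4 for a linear model, assuming the python-paillier package (`phe`); only party A's gradient exchange is shown, and the data shapes, mask, and learning rate are illustrative assumptions rather than the paper's exact protocol:

```python
# Vertical FL gradient exchange: A holds features X_A, B holds features X_B and
# labels y; C only handles keys and decryption of masked values.
import numpy as np
from phe import paillier

rng = np.random.default_rng(0)
n = 8
X_A, theta_A = rng.normal(size=(n, 3)), rng.normal(size=3)
X_B, theta_B = rng.normal(size=(n, 2)), rng.normal(size=2)
y = rng.normal(size=n)

# Step 1: collaborator C creates the key pair and shares the public key.
pub, priv = paillier.generate_paillier_keypair(n_length=1024)

# Step 2: B encrypts its intermediate result (u_B - y) and sends it to A.
u_A = X_A @ theta_A
u_B = X_B @ theta_B
enc_residual_B = [pub.encrypt(float(v)) for v in (u_B - y)]

# A adds its own intermediate result to obtain the encrypted residual d.
enc_d = [e + float(u) for e, u in zip(enc_residual_B, u_A)]

# Step 3: A computes its encrypted gradient X_A^T d, adds a random mask,
# and sends the masked ciphertexts to C.
mask = rng.normal(size=X_A.shape[1])
enc_grad_A = []
for j in range(X_A.shape[1]):
    acc = enc_d[0] * float(X_A[0, j])
    for i in range(1, n):
        acc = acc + enc_d[i] * float(X_A[i, j])
    enc_grad_A.append(acc + float(mask[j]))

# Step 4: C decrypts the masked gradient and returns it; A removes the mask.
grad_A = np.array([priv.decrypt(g) for g in enc_grad_A]) - mask
assert np.allclose(grad_A, X_A.T @ (u_A + u_B - y))  # matches plaintext gradient
theta_A -= 0.1 * grad_A / n  # A's local parameter update
```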

Privacy-preserving Machine Learning

Privacy-preserving machine learning is designed to perform learning while keeping data private.

Techniques

  • Secure multi-party computation, differential privacy, and homomorphic encryption (see the Overview above).

Federated Learning vs. Other Concepts

Distributed Machine Learning

  • Distributed ML focuses on distributed storage and computation.
  • Uses tools such as the Parameter Server to store and compute efficiently.
  • Differences from federated learning:
    • In distributed ML, a central node controls the data partitioning and the worker nodes; in federated learning, each data owner retains full autonomy over its local data and device.

Edge Computing

  • Federated learning can serve as the operating protocol in edge computing.
  • To optimize learning under limited resources, the key question is the trade-off between the number of local updates and the frequency of global aggregation (illustrated after this list).
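
A back-of-the-envelope illustration of that trade-off; the helper function and cost numbers below are assumptions, not measurements from the paper:

```python
# Illustrative only: tau = local steps per aggregation round. Larger tau cuts
# communication per unit of compute, but the global model is synced less often.
def rounds_within_budget(budget, tau, local_step_cost=1.0, comm_cost=20.0):
    """Aggregation rounds that fit in `budget` when each round does tau local
    steps plus one communication (all costs here are assumed)."""
    return int(budget // (tau * local_step_cost + comm_cost))

for tau in (1, 5, 20):
    r = rounds_within_budget(1000, tau)
    print(f"tau={tau:2d}: {r} aggregation rounds, {tau * r} total local steps")
```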

Federated Database Systems

  • These systems integrate multiple databases.
  • Differences from federated learning:
    • Federated database interactions do not involve privacy-preserving mechanisms.
    • Federated learning aims to build a unified model across different data owners while preserving privacy.

Applications of Federated Learning

Smart Retail

  • Personalize services such as product recommendations.
  • Challenges: Data privacy, security, and heterogeneity across different entities (e.g., banks, social networks, e-shops).
  • Solution: Federated learning can train models without sharing raw data, thus overcoming privacy barriers.

Finance

  • Detect multi-party borrowing, which is a risk to the financial industry.
  • Federated learning can help find malicious borrowers without exposing user lists.

Smart Healthcare

  • Challenges: Sensitive medical data scattered across isolated centers.
  • Solution: Federated learning combined with transfer learning can share model insights without sharing patient data.

Federated Learning as a Business Model

Traditional Approach

Aggregate the data, train models in the cloud, then use the results.

With Federated Learning

  • Data stays where it is; only model insights are shared.
  • Privacy and data security are prioritized.
  • Offers a new paradigm for big data applications.
  • Can use blockchain for profit allocation in a data alliance.
  • Calls for establishing federated learning standards to speed up adoption.