Collective Learning — Update with Code release
Feb 19, 2021
The collective learning protocol allows learners to collaborate on training a model without requiring trust between the participants. Learners vote on each proposed update to the model, and only updates that win enough votes to pass the approval threshold are accepted. This makes the system robust to participants who try to degrade the model by submitting bad updates.
Colearn is a library that enables privacy-preserving decentralized machine learning tasks on the FET network.
This blockchain-mediated collective learning system enables multiple stakeholders to build a shared machine learning model without needing to rely on a central authority, and without revealing sensitive information about their dataset to the other stakeholders.
The aim of Colearn is to enable collaboration between parties that currently cannot collaborate effectively, whether because of privacy concerns, lack of trust, or lack of communication. Colearn aims to be a simple but powerful framework that bridges the gap between disparate stakeholders.
A Colearn experiment begins when a group of entities, referred to as participants, decide on a model architecture and begin learning. Together they train a single global model. The goal is to train a model that performs better than any model a single learner could produce by training on its private data set alone.
The core components are:
- Learner: Each participant is a learner in the experiment. A learner represents a unique private dataset and machine learning system.
- Global Model: The result of a collective learning experiment: a machine learning model that is trained collectively by the learners. Currently we support neural network architectures.
- Fetch AI Blockchain: The underlying blockchain and smart contracts that permit coordination and governance in a secure and auditable way.
- Data Layer: A decentralized data layer based on IPFS that enables the sharing of machine learning weights between the learners.
How Training Works
Training occurs in rounds; during each round the learners attempt to improve the performance of the shared global model. To do so, in each round one update to the global model (for example, a new set of weights for a neural network) is proposed. The learners then evaluate the update and decide whether the new model is better than the current global model.
If enough learners approve the update, the global model is updated. Once an update has been approved or rejected, a new round begins.
The detailed steps of a round updating a global model M are as follows:
- One of the learners is selected and proposes a new updated model M’
- The rest of the learners validate M’
  - If M’ performs better than M on a learner's private data set, that learner votes to approve
  - If not, the learner votes to reject
- The total votes are tallied
- If the share of learners who approve exceeds a threshold (typically 50%), M’ becomes the new global model. If not, M remains the global model
- A new round begins.
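The round described above can be sketched in a few lines of Python. This is a toy simulation under stated assumptions, not the Colearn API: the `Learner` class, `accept_update`, and `run_round` are hypothetical names, the model is a two-parameter linear fit rather than a neural network, the proposer is chosen round-robin (the protocol only says that one learner is selected), and only the validating learners are counted in the tally.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Model = List[float]  # [w, b] for the toy model y = w * x + b


@dataclass
class Learner:
    """A participant holding a private data set (hypothetical class)."""
    name: str
    data: Sequence[Tuple[float, float]]  # private (x, y) pairs

    def loss(self, model: Model) -> float:
        """Mean squared error of the model on this learner's private data."""
        w, b = model
        return sum((w * x + b - y) ** 2 for x, y in self.data) / len(self.data)

    def propose(self, model: Model, lr: float = 0.05) -> Model:
        """Propose M' by taking one gradient step on the private data."""
        w, b = model
        n = len(self.data)
        grad_w = sum(2 * (w * x + b - y) * x for x, y in self.data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in self.data) / n
        return [w - lr * grad_w, b - lr * grad_b]

    def vote(self, current: Model, proposed: Model) -> bool:
        """Approve only if M' beats M on this learner's private data."""
        return self.loss(proposed) < self.loss(current)


def accept_update(validators: Sequence[Learner], current: Model,
                  proposed: Model, threshold: float = 0.5) -> bool:
    """Tally the votes; accept if strictly more than `threshold` approve."""
    approvals = sum(v.vote(current, proposed) for v in validators)
    return approvals > threshold * len(validators)


def run_round(learners: Sequence[Learner], model: Model, round_number: int,
              threshold: float = 0.5) -> Tuple[Model, bool]:
    """Run one round and return (global model after the round, accepted?)."""
    # Assumption: round-robin selection of the proposer.
    proposer_index = round_number % len(learners)
    proposed = learners[proposer_index].propose(model)
    # Assumption: only the other learners validate and vote.
    validators = [l for i, l in enumerate(learners) if i != proposer_index]
    accepted = accept_update(validators, model, proposed, threshold)
    return (proposed if accepted else model), accepted


# Three learners whose private data all follow y = 2x + 1.
learners = [
    Learner("a", [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]),
    Learner("b", [(-1.0, -1.0), (0.5, 2.0), (3.0, 7.0)]),
    Learner("c", [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]),
]
model = [0.0, 0.0]
for round_number in range(30):
    model, accepted = run_round(learners, model, round_number)
    print(round_number, accepted, model)
```

Because the comparison is strict, a 50% threshold with two validators requires both of them to approve. A proposer who submits a deliberately bad model is outvoted as long as most validators see worse performance on their own data.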
By using a decentralized ledger (a blockchain), this learning process can be run in a completely decentralized, secure and auditable way. Further security can be provided by using differential privacy, so that a learner avoids exposing its private data set when generating an update.
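As an illustration of the differential privacy idea, one widely used pattern is to clip the norm of an update and add Gaussian noise before sharing it. The sketch below assumes that pattern; the text above does not say which mechanism Colearn uses, and the function name `clip_and_noise` and its parameters are hypothetical. Choosing the noise level to meet a specific privacy budget is not covered here.

```python
import math
import random
from typing import List, Optional


def clip_and_noise(update: List[float], clip_norm: float = 1.0,
                   noise_multiplier: float = 1.0,
                   rng: Optional[random.Random] = None) -> List[float]:
    """Clip the update to an L2 norm of at most clip_norm, then add noise.

    Hypothetical helper: the noise standard deviation is
    noise_multiplier * clip_norm, applied independently to each coordinate.
    """
    rng = rng or random.Random()
    norm = math.sqrt(sum(u * u for u in update))
    # Scale down only when the update is larger than the clipping bound.
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    sigma = noise_multiplier * clip_norm
    return [u * scale + rng.gauss(0.0, sigma) for u in update]


# Example: privatize a weight delta before proposing it to the other learners.
delta = [3.0, 4.0]
private_delta = clip_and_noise(delta, clip_norm=1.0, noise_multiplier=0.5,
                               rng=random.Random(0))
print(private_delta)
```

Clipping bounds how much any single update can reveal, and the noise masks what remains. The trade-off is that noisier updates are less likely to be approved by the other learners.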
We have published the Collective Learning code in our GitHub repository, so you can simulate Collective Learning yourself. For more detail, you can also watch the Collective Learning video below.