
DeML

Blockchain marketplace where people get paid for helping train ML models and don’t need to share their data with anyone.
Hackathon Project
AI/ML

Team

Kun Qiu
Bill Bai
g05x5#0
Ulysses Kee
ukcw#0
Jason Sun
sunapi386#0

About

Data is the new oil. Powerful machine learning models are trained on people’s data. Today, people usually share their private data with a centralized entity in exchange for some freemium service. They are not paid for sharing the private data that powers these applications, and they must rely on the central authority to protect their privacy.

Imagine a world where people don’t need to share their local data with a central entity but can still contribute to training a global machine learning model by using their data locally and, even better, get paid for doing so. In this project, we use blockchain to achieve that.

How it's Made

In this project, we created a blockchain model marketplace where model sponsors, who want to train ML models using people’s data (for example, health data to predict health conditions), can post their jobs. A posting essentially says: "I need this type of data to train a specific model, and in exchange I will pay X tokens." In the listing, the model sponsor provides the initial model to be trained, a model trainer executable (which clients run on their local node to update the model using their local data), and the reward for the job.
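To make the listing contents concrete, here is a minimal sketch of what a job posting could look like as a data structure. The class name, field names, and the checks in `validate_listing` are hypothetical and chosen only for illustration; the project's actual listing format is not described above.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass(frozen=True)
class JobListing:
    """Hypothetical shape of a marketplace listing (field names are assumptions)."""
    sponsor: str             # wallet address of the model sponsor
    data_description: str    # the type of local data the sponsor is asking for
    initial_model_cid: str   # IPFS CID of the initial model to be trained
    trainer_cid: str         # IPFS CID of the model trainer executable
    reward_tokens: int       # the "X tokens" paid for a completed job

    def to_json(self) -> str:
        # Serialize with sorted keys so the same listing always yields the same text.
        return json.dumps(asdict(self), sort_keys=True)


def validate_listing(listing: JobListing) -> List[str]:
    """Return a list of problems; an empty list means the listing looks complete."""
    problems = []
    if not listing.initial_model_cid:
        problems.append("missing initial model CID")
    if not listing.trainer_cid:
        problems.append("missing trainer executable CID")
    if listing.reward_tokens <= 0:
        problems.append("reward must be a positive number of tokens")
    return problems


# Example values are placeholders, not real addresses or CIDs.
example = JobListing(
    sponsor="0xSPONSOR_PLACEHOLDER",
    data_description="health data to predict health conditions",
    initial_model_cid="placeholder-initial-model-cid",
    trainer_cid="placeholder-trainer-cid",
    reward_tokens=10,
)
```

A client browsing the marketplace could run a check like `validate_listing(example)` before accepting a job, so that incomplete postings are skipped early.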

Clients (nodes) can browse the listings in the marketplace and decide whether to participate. A client that decides to participate accepts the job from the marketplace, loads the initial model and the trainer executable from IPFS (Filecoin), and runs the trainer executable on their local node with their local data. Once the job is completed, a new model is written to IPFS.

We use Cartesi to run a Python service for model training and validation, given {model_cid, data_cid}, where each cid is a unique IPFS file reference string. The Cartesi Machine listens to the network and, upon receiving the message (sent by creating a blockchain transaction), the corresponding reward is issued to the client’s wallet. We also tried Filecoin’s Lilypad for distributed execution of the ML compute but got stuck on creating the smart contract part with Lilypad. We still think Lilypad would work well.

Note that model training happens on the client side and the client data never leaves the client node. Only the new model weights are delivered to the model sponsor in exchange for the reward. This protects user data privacy while rewarding the client for their effort in training the model with their data.
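The flow above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the project's code: the model is a one-variable linear regression, the hex-encoded JSON payload is only one plausible way to carry {model_cid, data_cid} in a transaction, and the rule "pay only if validation loss improves" is a hypothetical validation policy. All function names are invented for the example.

```python
import json
from typing import Dict, List, Tuple

Weights = Tuple[float, float]          # (w, b) for the toy model y = w * x + b
Dataset = List[Tuple[float, float]]    # (x, y) pairs


def local_training_step(weights: Weights, local_data: Dataset, lr: float = 0.05) -> Weights:
    """One gradient-descent step on the client's data; only weights are returned."""
    w, b = weights
    n = len(local_data)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in local_data) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in local_data) / n
    return (w - lr * grad_w, b - lr * grad_b)


def mse(weights: Weights, data: Dataset) -> float:
    """Mean squared error of the toy model on a dataset."""
    w, b = weights
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def encode_payload(model_cid: str, data_cid: str) -> str:
    """Pack {model_cid, data_cid} as hex-encoded JSON (encoding is an assumption)."""
    body = json.dumps({"model_cid": model_cid, "data_cid": data_cid})
    return "0x" + body.encode("utf-8").hex()


def decode_payload(payload_hex: str) -> Dict[str, str]:
    """Inverse of encode_payload, as a validation service might apply it."""
    return json.loads(bytes.fromhex(payload_hex[2:]).decode("utf-8"))


def reward_for(old: Weights, new: Weights, validation_data: Dataset, reward_tokens: int) -> int:
    """Hypothetical policy: issue the reward only if the new model validates better."""
    return reward_tokens if mse(new, validation_data) < mse(old, validation_data) else 0


# Client side: the raw (x, y) pairs stay local; only new_weights would be published.
private_data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
initial_weights = (0.0, 0.0)
new_weights = local_training_step(initial_weights, private_data)

# Message announcing the result; both CIDs here are placeholders.
payload = encode_payload("placeholder-new-model-cid", "placeholder-data-cid")
```

The point of the sketch is the boundary: `private_data` is used only inside `local_training_step`, while the message that leaves the client contains references (CIDs) and the trained weights, never the raw records.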

For wallet connection and authentication, we used WalletConnect (super easy to use and integrate). After completing the task, the client can see their reward in their wallet. We also integrated web3inbox so that model sponsors can easily communicate with potential clients about any questions.

Last updated: Oct 20, 2023
Anyone is free to submit information about their project. Do your own research and use your best judgment when using or interacting with any of the projects listed in this directory. Being listed in this directory is not an endorsement from the Cartesi Foundation or any other related entity.

Explore similar projects

Biometrics classifier
Proof of concept

This DApp uses machine learning, computer vision, and feature extraction to perform decentralized biometric spoof detection on-chain. Beyond verifying whose fingerprints were used, this program checks for spoofing.

Last updated: Oct 20, 2023
ChainGPT
Hackathon Project

Decentralised & verifiable chat AI, backed by the blockchain: a port of the Alpaca LLM leveraging Cartesi app-specific rollups.

Last updated: Oct 20, 2023
Teach AI
Hackathon Project

The application helps to curate high-quality datasets by providing a framework to incentivize RLHF, with the LLM being fully verifiable and hosted on-chain.

Last updated: Oct 20, 2023