DVC

Data Version Control. Git for data and ML models. Track experiments and build pipelines.

About DVC

DVC (Data Version Control) is an open source tool for data science and machine learning projects. It versions large files, datasets, and machine learning models with Git-like commands, keeping the data itself outside the Git repository (in a local cache or in remote storage) while Git tracks only small metadata files.
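
To make the Git-like idea concrete, here is a toy Python sketch of content-addressed storage, the general technique this kind of tool builds on: a large file is copied into a cache under a name derived from its hash, and only a small pointer file would be committed to Git. The helper names `file_md5` and `track`, the cache layout, and the JSON pointer format are illustrative assumptions, not DVC's actual implementation or file format; in practice the equivalent step is the `dvc add` command.

```python
import hashlib
import json
import shutil
from pathlib import Path


def file_md5(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so a large file never sits fully in memory."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def track(data_file: Path, cache_dir: Path) -> Path:
    """Copy data_file into a hash-addressed cache and write a pointer file.

    Hypothetical helper for illustration only.
    """
    md5 = file_md5(data_file)
    # Assumed layout: first two hex characters form a subdirectory.
    cached = cache_dir / md5[:2] / md5[2:]
    cached.parent.mkdir(parents=True, exist_ok=True)
    if not cached.exists():  # identical content is stored only once
        shutil.copy2(data_file, cached)
    # The small pointer file is what would be committed to Git.
    pointer = data_file.with_name(data_file.name + ".pointer.json")
    pointer.write_text(json.dumps({"md5": md5, "path": data_file.name}))
    return pointer
```

Because the cache key depends only on file content, two identical datasets occupy the space of one, and switching between versions only requires swapping the pointer file.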

Key Features

Git-like versioning for data & models
ML pipeline management
Experiment tracking & comparison
Remote storage (S3, GCS, Azure, etc.)
Metrics & plots
Language & framework agnostic
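
The pipeline feature in the list above rests on a simple idea: a stage is rerun only when its inputs have changed. The following is a minimal Python sketch of that idea under stated assumptions; `deps_fingerprint`, `run_stage`, and the JSON lock file are hypothetical names and formats chosen for illustration, and do not reproduce DVC's own `dvc.yaml` / `dvc.lock` handling or the `dvc repro` command.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable, Iterable


def deps_fingerprint(deps: Iterable[Path]) -> str:
    """Combine the names and contents of all dependency files into one hash."""
    digest = hashlib.sha256()
    for dep in sorted(deps):
        digest.update(dep.name.encode())
        digest.update(dep.read_bytes())
    return digest.hexdigest()


def run_stage(
    name: str,
    deps: Iterable[Path],
    command: Callable[[], None],
    lock_file: Path,
) -> bool:
    """Run command only if the stage's dependencies changed since the last run.

    Returns True if the stage ran and False if it was skipped.
    """
    lock = json.loads(lock_file.read_text()) if lock_file.exists() else {}
    fingerprint = deps_fingerprint(deps)
    if lock.get(name) == fingerprint:
        return False  # inputs unchanged, nothing to do
    command()
    lock[name] = fingerprint  # remember what this run was based on
    lock_file.write_text(json.dumps(lock, indent=2))
    return True
```

Calling `run_stage` twice with unchanged inputs runs the command once; editing any dependency file makes the next call run it again.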

Why choose DVC?

DVC is an open source alternative to Git LFS and Pachyderm. Licensed under Apache-2.0, it gives you full access to the source code and the freedom to modify, self-host, and contribute. It runs as a command-line tool alongside Git and can also be used as a Python library.