Engineered to make data science

Explainable. Repeatable. Scalable.

Our free and source-available version of Pachyderm is built and backed by a community of experts. With Pachyderm Community Edition, you can quickly and easily build, train, and deploy your data science workloads on whatever Kubernetes deployment you call home.

What is Pachyderm?

Pachyderm is a data science platform that combines Data Lineage with End-to-End Pipelines on Kubernetes, engineered for the enterprise.

What’s your Pachyderm use case?

Pachyderm brings together version control for data with the tools to build scalable end-to-end ML/AI pipelines, while letting users develop their code in any language, framework, or tool of their choice. Teams have chosen Pachyderm time and time again as the foundation for solving real-world AI and ML problems reliably.

Learn more


All over the world, data scientists and ML engineers are discovering how much better applied data science can be when Pachyderm is involved. Here are just a few examples of what they're saying.

"A Pachyderm Hub cluster equals Data Scientist autonomy!"

Raanan Hadar, Data Scientist

"Setting up and provisioning a Kubernetes cluster can be a huge pain, so seeing a cluster spin up immediately on Hub was immensely satisfying."

Matt Usifer, Software Engineer

"We use Pachyderm as our data pipeline orchestrator. For us, the fact that you can deploy it so easily to a k8s cluster, and use language-agnostic, container-based workloads are absolute killer features."

Guilherme Caminha, Senior Software Engineer - Precis Digital

"Pachyderm makes it easy to organise and run complicated pipelines. It gets you up and running in a matter of seconds."

Samanvay Karambhe, Data Scientist - Nearmap

Companies who use Pachyderm

LogMeIn, Agbiome, Digital Reasoning, General Fusion