Pachyderm has been acquired by Hewlett Packard Enterprise.

Building Models on Your Data Warehouse

Use your data warehouse for machine learning (ML) projects such as churn analysis or customer lifetime value projections.

Using Pachyderm and the Data Warehouse

Combine the structured data in your data warehouse with unstructured data to give data scientists a comprehensive view.

Data-Centric Processing

Pachyderm’s pipelines use automated versioning to drive incremental processing and data deduplication, which shorten processing times and reduce storage costs.

Complex Data Workflows

With Pachyderm, you can build complex workflows that support the most advanced ML applications, then manage and monitor them visually in the Pachyderm Console UI.

Scales to the Job

Pachyderm scales to petabytes of data with autoscaling and data-driven parallel processing. Our approach to version control and file processing automates scaling while controlling compute costs.

Fully Reproducible

Pachyderm automatically versions all data and code changes across your data workflow, including intermediate transformations, so you always have full reproducibility and lineage for your ML models.

Language and Data Agnostic

Use any language or library in your Pachyderm pipelines, such as Python, R, Scala, or Bash. If you can get it into a container, Pachyderm can run it as a pipeline. Easily process both structured and unstructured data.
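
To make the container idea concrete, here is a minimal sketch of a script that could be packaged into a container image and used as a pipeline step. It assumes Pachyderm's convention of mounting input data under /pfs/<input name> and collecting results from /pfs/out. The input name "data", the function summarize, and the summary.csv output file are hypothetical and chosen only for illustration.

```python
import csv
from pathlib import Path

# Assumed mount points: an input named "data" (hypothetical) and the
# output directory that Pachyderm collects results from.
PFS_INPUT = Path("/pfs/data")
PFS_OUT = Path("/pfs/out")


def summarize(input_dir: Path, output_dir: Path) -> int:
    """Count the data rows in each CSV file and write a summary CSV.

    Returns the number of input files that were summarized.
    """
    output_dir.mkdir(parents=True, exist_ok=True)
    results = []
    for path in sorted(input_dir.glob("*.csv")):
        with path.open(newline="") as f:
            reader = csv.reader(f)
            next(reader, None)  # skip the header row, if there is one
            results.append((path.name, sum(1 for _ in reader)))

    with (output_dir / "summary.csv").open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["file", "rows"])
        writer.writerows(results)
    return len(results)


# Only run automatically when the mounted input directory exists, so the
# file can also be imported and tested outside a pipeline.
if __name__ == "__main__" and PFS_INPUT.is_dir():
    summarize(PFS_INPUT, PFS_OUT)
```

The script knows nothing about Pachyderm beyond the two directories it reads and writes, so the same pattern should carry over to R, Scala, or Bash.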

Native Integration

Getting data into and out of your data warehouse is as simple as writing a SQL query.
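
As a rough illustration of the query-to-file pattern, the sketch below runs a SQL query and writes the result set to a CSV file that a downstream pipeline step could read. It uses Python's built-in sqlite3 module as a stand-in for a real warehouse connection and is not Pachyderm's own ingest mechanism, which is covered in the documentation. The plays table, its columns, and the helper names are hypothetical.

```python
import csv
import sqlite3
from pathlib import Path


def export_query(conn: sqlite3.Connection, query: str, out_path: Path) -> int:
    """Run a query and write its result set, with a header row, to a CSV file.

    Returns the number of data rows written.
    """
    cursor = conn.execute(query)
    header = [column[0] for column in cursor.description]
    rows = cursor.fetchall()

    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)
    return len(rows)


def make_demo_db() -> sqlite3.Connection:
    """Build a small in-memory database standing in for the warehouse."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE plays (user_id INTEGER, track TEXT)")
    conn.executemany(
        "INSERT INTO plays VALUES (?, ?)",
        [(1, "intro"), (1, "outro"), (2, "intro")],
    )
    return conn
```

For example, export_query(make_demo_db(), "SELECT user_id, COUNT(*) AS n FROM plays GROUP BY user_id ORDER BY user_id", Path("out/plays.csv")) writes one row per user. With a real warehouse, the connection object would come from that warehouse's own Python driver instead.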

Recommended Reading

Read the Docs

Ingesting Data with SQL

Learn how to use SQL data sources in your ML pipelines on our documentation website.

GitHub Example

Churn Prediction with Snowflake

Build a churn analysis model for a music streaming service with Pachyderm and Snowflake, using the data warehouse integration.

Read the Blog

Speed Up Your Pipeline Development

Go beyond the limitations of SQL and use Python to speed up development and insight with Snowflake.

Want to see Pachyderm Data Pipelines in action? Book a demo with one of our solution engineers!