
Pachyderm 2.2 Release

We’re excited to announce the 2.2 release of Pachyderm, which brings several improvements to pipeline debugging. We’ve also made it easier to back up and restore Pachyderm clusters with a new pachctl command. Lastly, we’ve released several performance fixes and enhancements.

Pipeline Debugging 

We’ve included a host of enhancements to make pipeline debugging faster and easier, including:

  • A more intuitive and informative job details view in Console:
    • A new side panel provides a more compact, easier-to-read job summary.
    • Additional transform details give better insight into user-defined code, which in turn allows more effective debugging and remediation.
    • An interactive (expandable/collapsible) view of the YAML allows easier navigation within a pipeline spec.
    • A Global ID filter streamlines debugging within Console by visually highlighting the relationship between your DAG’s commits and pipeline jobs.
  • A new view in Console that makes it easy to see which files are new or changed between commits
  • The ability to view the Docker image SHA on datums for better lineage tracking
  • join_on and group_by values exposed as environment variables on pipeline jobs

Backup and Restore 

To make backing up and restoring Pachyderm easier, we now provide a pachctl command for pausing and unpausing Pachyderm clusters. Pause/unpause via pachctl is an Enterprise Edition feature. For more information, please see our documentation.

Performance Fixes & Enhancements

A variety of performance fixes and enhancements are included in 2.2. Changes include:

  • Content-defined chunking is now decoupled from file batching. This generally enables more files to be copied by reference during copy file operations and compaction.
  • The concat step during compaction is now purely a metadata operation. Previously, it required rewriting an amount of data proportional to the total volume of data.
  • All compaction tasks now run through the task service, which offloads work from the PFS master. 
  • Added a compaction task cache which allows recovery of compaction work when transient errors occur during compaction.
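
For readers unfamiliar with content-defined chunking, here is a minimal Python sketch of the general technique. It is not Pachyderm’s implementation: the window size, the mask, and the use of CRC32 (recomputed per window for clarity rather than rolled incrementally) are all assumptions chosen for illustration. Because a boundary depends only on the bytes just before it, an edit near the start of a file leaves later chunks byte-for-byte identical, which is what makes copying by reference possible.

```python
import random
import zlib

WINDOW = 16   # bytes of context that decide a boundary (illustrative value)
MASK = 0xFF   # cut when the low 8 bits of the hash are zero (~256-byte average)


def chunk(data: bytes) -> list[bytes]:
    """Split data at positions chosen by content, not by fixed offsets."""
    chunks, start = [], 0
    for end in range(WINDOW, len(data) + 1):
        # The boundary decision looks only at the last WINDOW bytes.
        if zlib.crc32(data[end - WINDOW:end]) & MASK == 0:
            chunks.append(data[start:end])
            start = end
    if start < len(data):
        chunks.append(data[start:])  # trailing bytes after the last boundary
    return chunks


# Hypothetical demo data: 20,000 pseudo-random bytes with a fixed seed.
data = random.Random(0).randbytes(20_000)
original = chunk(data)

# Insert a few bytes at the front; fixed-size chunking would shift every chunk.
edited = chunk(b"hello" + data)

# Every original chunk except the first reappears unchanged after the edit.
reused = set(original[1:]) & set(edited)
print(f"{len(reused)} of {len(set(original))} distinct chunks reusable by reference")
```

A production chunker would typically use an incremental rolling hash and enforce minimum and maximum chunk sizes; those are left out here to keep the boundary rule easy to follow.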

Pachyderm 1.13 EOL & Docker Hub Updates

To give customers more time to migrate from Pachyderm 1.x to 2.x, we’re going to continue to support 1.13 through the 2.2 release. If you’ve not already started planning your 2.x migration, please contact Pachyderm.

Please note that we’ll be removing unsupported images from Docker Hub starting on May 26th. This includes any Pachyderm versions earlier than 1.13.