Natural Language Processing (NLP)
NLP use cases are characterized by large unstructured data sets that can create performance bottlenecks as they scale.
In the example below, we show a market sentiment NLP implementation in Pachyderm. In it, we use transfer learning to fine-tune a BERT language model to classify text for financial sentiment.
It shows how to combine inputs from separate sources, incorporates data labeling, model training, and data visualization.
FinBERT is a pre-trained NLP model that is adapted to analyze the sentiment of financial text. The original BERT-based language model was trained with a large corpus of Reuters and training code
Key Features of Pachyderm
Pachyderm is cost-effective at scale and enables data engineering teams to automate complex pipelines with sophisticated data transformations
Deliver reliable results faster maximizes dev efficiency.
Automated diff-based data-driven pipelines.
Deduplication of data saves infrastructure costs.
Immutable data lineage ensures compliance.
Data versioning of all data types and metadata.
Familiar git-like structure of commits, branches, & repos.
Leverage existing infrastructure investment.
Language agnostic - use any language to process data
Data agnostic - unstructured, structured, batch, & streaming
See Pachyderm In Action
Watch a short 5-minute demo which outlines the product in action