The Filter Pipeline in Syncraft acts as a gatekeeper in your data processing flow. It evaluates data against specified conditions, so that only data meeting them continues through the pipeline. Unlike pipelines that retrieve or transform data, the Filter Pipeline focuses exclusively on controlling data flow based on predefined criteria.
Let's break down how the Filter Pipeline operates using a simple example.
The condition structure is the core of the filter pipeline. It simulates a model query, enabling you to test data against certain conditions.
{
  "condition": {
    "id": {
      "lt": 3
    },
    "name": {
      "includes": "ob"
    }
  }
}
In the above snippet:
- The id field must be less than 3.
- The name field must include the substring "ob".

Now, let's evaluate a piece of data against this condition.
{
  "id": 4,
  "name": "Bob"
}
Here, the data:
- id is 4, which does not meet the condition of being less than 3.
- name is "Bob", which meets the condition of including the substring "ob".

Since one of the conditions is not met, this piece of data would not pass through the filter.
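The evaluation just described can be sketched in a few lines of Python. This is an illustrative sketch only, not Syncraft's implementation: the `matches` function and the `OPERATORS` table are hypothetical names, only the operators that appear on this page (`lt`, `eq`, `includes`) are covered, and the sketch assumes that `includes` is case-sensitive and that a missing field fails the condition.

```python
from typing import Any, Callable, Dict

# Hypothetical operator table covering only the operators shown on this page.
OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "lt": lambda value, operand: value < operand,
    "eq": lambda value, operand: value == operand,
    "includes": lambda value, operand: operand in value,  # assumed case-sensitive
}


def matches(condition: Dict[str, Dict[str, Any]], record: Dict[str, Any]) -> bool:
    """Return True only if the record satisfies every check on every field."""
    for field, checks in condition.items():
        if field not in record:
            return False  # assumption: a missing field fails the condition
        for operator, operand in checks.items():
            if not OPERATORS[operator](record[field], operand):
                return False
    return True


condition = {"id": {"lt": 3}, "name": {"includes": "ob"}}

print(matches(condition, {"id": 4, "name": "Bob"}))  # False: id is not less than 3
print(matches(condition, {"id": 2, "name": "Bob"}))  # True: both checks hold
```

The first call reproduces the example above: the name check succeeds, but the id check fails, so the record is rejected.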
In a pipeline definition, a filter stage can follow a stage that retrieves data, for example:
[
  {
    "query": {
      "id": {
        "eq": 2
      }
    },
    "selection": {
      "id": true,
      "name": true
    },
    "root": "Users"
  },
  {
    "condition": {
      "id": {
        "lt": 3
      },
      "name": {
        "includes": "ob"
      }
    }
  }
]
The syntax for defining conditions in the Filter Pipeline draws from the Syncraft Sync Definition. However, the primary distinction is that this evaluation occurs without interacting with a data source. It's a self-contained check within the pipeline, ensuring that only data meeting the defined criteria proceeds to the subsequent stages of the pipeline.
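To illustrate the point that the check is self-contained, here is a rough Python sketch of a filter stage that works purely on records already in memory. It is a sketch under assumptions rather than Syncraft's actual behavior: `passes`, `filter_stage`, and the sample rows in `query_output` are all made up for illustration, and only the `lt`, `eq`, and `includes` operators are handled.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]
Condition = Dict[str, Dict[str, Any]]


def passes(condition: Condition, record: Record) -> bool:
    """Check one record against a condition using lt, eq, and includes only."""
    for field, checks in condition.items():
        value = record.get(field)
        if value is None:
            return False  # assumption: a missing field fails the condition
        for operator, operand in checks.items():
            if operator == "lt":
                ok = value < operand
            elif operator == "eq":
                ok = value == operand
            elif operator == "includes":
                ok = operand in value
            else:
                raise ValueError(f"unsupported operator in this sketch: {operator}")
            if not ok:
                return False
    return True


def filter_stage(condition: Condition, records: List[Record]) -> List[Record]:
    """Keep only the records that pass; no data source is consulted here."""
    return [record for record in records if passes(condition, record)]


# Stand-in for what an earlier query stage might have handed over (made-up rows).
query_output = [
    {"id": 2, "name": "Bob"},
    {"id": 2, "name": "Alice"},
    {"id": 5, "name": "Bobby"},
]

condition = {"id": {"lt": 3}, "name": {"includes": "ob"}}
surviving = filter_stage(condition, query_output)
print(surviving)  # [{'id': 2, 'name': 'Bob'}]
```

The filter function receives only the records and the condition, which mirrors the idea that the stage decides what proceeds without issuing any query of its own.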
The Filter Pipeline gives you precise control over data flow: only data that meets your criteria is processed further, which improves the efficiency and accuracy of data processing within Syncraft.