How to use DataWeave to filter data

DataWeave is a powerful tool for filtering, transforming, and manipulating data in MuleSoft’s Anypoint Platform. It is a functional programming language with a simple syntax that can handle complex data structures. In this blog, we will focus on how to use DataWeave to filter data.

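DataWeave provides several functions for filtering data. The filter function selects elements of an array, and the filterObject function selects key-value pairs of an object. Functions such as map, pluck, and reduce do not filter by themselves, but they are often combined with filter to reshape or aggregate the result.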

Why is DataWeave filter needed?

DataWeave filter allows you to selectively extract or remove data from a complex data structure based on a specific condition. There are several reasons why you might need to use DataWeave filter in your integration projects:

  • Selective extraction: Often, you need only a subset of the data from a complex data structure. DataWeave filter lets you keep the data that matches a certain condition and discard the rest. This reduces the amount of data that later steps need to process.
  • Data validation: DataWeave filter can be used to validate the data before processing it further. For example, you can use filter to remove any invalid or incomplete records from the input, and only process the data that meets certain criteria. This helps ensure that the data being processed is accurate and reliable.
  • Data transformation: DataWeave filter is often one step in a larger transformation. You can filter out unwanted fields or objects, and then rearrange the remaining data with functions such as map to meet the requirements of the downstream systems.
  • Performance optimization: By filtering out data that is not required early in a flow, you reduce the work done by every later step, which can shorten processing time for the integration as a whole.
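
As a rough illustration of the data validation case above, the following sketch drops incomplete records before any further processing. The orders variable and its id, email, and total fields are invented for this example and do not come from a real payload.

```dataweave
%dw 2.0
output application/json
// Hypothetical sample data; the field names are assumptions for illustration.
var orders = [
  { id: 1, email: "a@example.com", total: 120 },
  { id: 2, email: null, total: 80 },
  { id: 3, email: "c@example.com", total: 0 }
]
---
// Keep only orders that have an email and a positive total.
orders filter ((order) -> (order.email != null) and (order.total > 0))
```

With this sample data, only the order with id 1 passes both checks.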

Let’s take a look at some examples of how to use DataWeave to filter data.

Example 1: Filtering a List of Numbers

Suppose we have a list of numbers and we want to remove the even numbers, keeping only the odd ones. We can use the filter function to achieve this.

%dw 2.0
output application/json
---
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10] filter ((item) -> (item mod 2) != 0)

In the above example, we have used the filter function to remove the even numbers. The filter function takes a lambda expression as an argument, which defines the condition an item must satisfy to be kept. In this case, we have used the mod function, which returns the remainder of a division, to check if the number is odd or even. The parentheses around item mod 2 make the order of evaluation explicit.

The output of the above code will be:

[1, 3, 5, 7, 9]
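
The same condition can also be written with DataWeave's anonymous parameters instead of a named lambda. The sketch below is a minimal variation on the example above: $ stands for the current item and $$ for its zero-based index. The variable name numbers and the output keys odd and firstThree are chosen only for illustration.

```dataweave
%dw 2.0
output application/json
var numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
---
{
  // Same condition as the named lambda, using $ for the current item.
  odd: numbers filter (($ mod 2) != 0),
  // $$ is the zero-based index, so this keeps the first three items.
  firstThree: numbers filter ($$ < 3)
}
```

Here odd should contain 1, 3, 5, 7, 9 and firstThree should contain 1, 2, 3.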

Example 2: Filtering a List of Objects

Suppose we have a list of objects, and we want to keep only the objects that meet a certain condition. We can use the filter function along with the map function to achieve this.

%dw 2.0
output application/json
---
[
  { name: "John", age: 25 },
  { name: "Emily", age: 30 },
  { name: "David", age: 40 },
  { name: "Sarah", age: 20 },
  { name: "Peter", age: 35 }
] filter ($.age > 30) map ($.name)

In the above example, we have used the filter function to keep only the objects where the age is greater than 30 (Emily, at exactly 30, is excluded). We have also used the map function to extract the names of the filtered objects. Because the result of filter is an array of objects, the .name selector can also be applied to it instead of the map function.

The output of the above code will be:

["David", "Peter"]
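
As a sketch of the selector-based alternative to map, the script below applies the .name selector directly to the filtered array. The people variable is introduced here only to keep the script self-contained.

```dataweave
%dw 2.0
output application/json
var people = [
  { name: "John", age: 25 },
  { name: "Emily", age: 30 },
  { name: "David", age: 40 },
  { name: "Sarah", age: 20 },
  { name: "Peter", age: 35 }
]
---
// Applying .name to an array of objects returns an array of the name values.
// The parentheses ensure the selector is applied to the filtered result.
(people filter ($.age > 30)).name
```

This should produce the same two names, David and Peter, as the map version.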

Example 3: Filtering a JSON Object

Suppose we have a JSON object and we want to remove certain key-value pairs based on a condition. We can use the filterObject function along with the pluck function to achieve this.

%dw 2.0
output application/json
---
{
  name: "John",
  age: 25,
  address: {
    city: "New York",
    state: "NY"
  },
  phone: "123-456-7890"
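} filterObject ((value, key) -> (key as String) != "phone") pluck $

In the above example, we have used the filterObject function to remove the phone element from the JSON object. The lambda receives each value together with its key, and the key is coerced to a String before it is compared with "phone". We have also used the pluck function to collect the remaining values into an array.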

The output of the above code will be:

[
  "John",
  25,
  {
    "city": "New York",
    "state": "NY"
  }
]
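
If the goal is to keep an object rather than an array of values, pluck can be left out. The sketch below shows two ways to drop a single key while preserving the object shape: filterObject with a key test, and the - operator, which removes a key by name. The person variable and the output keys withoutPhone and alsoWithoutPhone are names made up for this illustration.

```dataweave
%dw 2.0
output application/json
var person = {
  name: "John",
  age: 25,
  address: {
    city: "New York",
    state: "NY"
  },
  phone: "123-456-7890"
}
---
{
  // filterObject on its own returns an object with the matching pairs.
  withoutPhone: (person filterObject ((value, key) -> (key as String) != "phone")),
  // The - operator removes the key-value pair with the given key.
  alsoWithoutPhone: (person - "phone")
}
```

Both entries should contain name, age, and address, with phone removed.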

Example 4: Filter objects based on a specific value in a nested object

Suppose you have a list of objects that contain a nested object with a “status” field, and you want to keep only the objects where the status is “active”, which filters out the “inactive” ones. Here’s how you can do it using DataWeave:

%dw 2.0
output application/json
---
payload filter ($.details.status == "active")

In the above example, we have used the filter function to check the “status” field of the “details” object inside each object in the payload. If the status is “active”, the object will be included in the output.
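
To make the nested case concrete, here is a small self-contained sketch with inline sample data in place of payload. The accounts variable and its id field are assumptions for illustration. The third record has no details object at all; selecting a missing key yields null in DataWeave, so that record simply fails the comparison and is dropped.

```dataweave
%dw 2.0
output application/json
// Hypothetical sample data standing in for the payload.
var accounts = [
  { id: 1, details: { status: "active" } },
  { id: 2, details: { status: "inactive" } },
  { id: 3 }
]
---
// Keep only records whose nested status is "active".
accounts filter ((account) -> account.details.status == "active")
```

With this data, only the record with id 1 should remain.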

Example 5: Filter a list of strings based on a regular expression

Suppose you have a list of strings, and you want to filter out the strings that do not match a certain regular expression. Here’s how you can do it using DataWeave:

%dw 2.0
output application/json
---
payload filter ($ matches /pattern/)

In the above example, we have used the filter function together with the matches function to check if each element in the payload matches the regular expression specified in the condition. The matches function succeeds only when the entire string matches the pattern. If the element matches the pattern, it will be included in the output.
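
The following sketch uses a concrete pattern and contrasts whole-string matching with partial matching. The codes variable, the INV- prefix convention, and the output keys are invented for this example. It uses matches, which requires the whole string to match, and contains with a regular expression, which accepts a match anywhere in the string.

```dataweave
%dw 2.0
output application/json
// Hypothetical sample strings.
var codes = ["INV-001", "ORD-17", "INV-22x", "INV-300"]
---
{
  // The whole string must match: "INV-22x" fails because of the trailing x.
  exact: codes filter ($ matches /INV-\d+/),
  // A match anywhere in the string is enough: "INV-22x" is kept.
  partial: codes filter ($ contains /INV-\d+/)
}
```

Here exact should hold INV-001 and INV-300, while partial should additionally include INV-22x.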

In conclusion, DataWeave provides a simple yet powerful way to filter data. The filter and filterObject functions select the data to keep, and functions such as map, pluck, and reduce can be combined with them to shape the result. With the examples provided above, you should now have a good understanding of how to use DataWeave to filter data in MuleSoft’s Anypoint Platform. However, there are some additional features of DataWeave that are worth mentioning.

For more details, check the official MuleSoft documentation for the filter and filterObject functions.

Partition Function in DataWeave

DataWeave also provides the partition function, which can be used to split a list into two lists based on a condition. It is defined in the dw::core::Arrays module, so it has to be imported before it can be used. The partition function takes a lambda expression as an argument, which defines the condition for splitting.

Let’s take a look at an example of how to use the partition function:

%dw 2.0
import * from dw::core::Arrays
output application/json
---
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10] partition ((item) -> (item mod 2) != 0)

In the above example, we have used the partition function to split a list of numbers into two lists, one for odd numbers and another for even numbers. The result is an object whose success key holds the items that satisfy the condition and whose failure key holds the rest. The output of the above code will be:

{
  "success": [1, 3, 5, 7, 9],
  "failure": [2, 4, 6, 8, 10]
}
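
As a further sketch, partition can be applied to a list of objects, and the two groups can then be read from the success and failure keys of the result. The people and byAge variables and the output keys over30 and others are names chosen for illustration only.

```dataweave
%dw 2.0
import partition from dw::core::Arrays
output application/json
var people = [
  { name: "John", age: 25 },
  { name: "Emily", age: 30 },
  { name: "David", age: 40 },
  { name: "Sarah", age: 20 },
  { name: "Peter", age: 35 }
]
// success holds the items that satisfy the condition, failure the rest.
var byAge = people partition ((person) -> person.age > 30)
---
{
  over30: byAge.success.name,
  others: byAge.failure.name
}
```

With this data, over30 should list David and Peter, and others should list John, Emily, and Sarah. A single partition call avoids running two separate filter passes with opposite conditions.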

Another feature of DataWeave is the conditional expression, written as if (condition) valueIfTrue else valueIfFalse. DataWeave 2.0 does not have a ?: ternary operator, so if/else plays that role, and it can be used inside a filter condition.

Let’s take a look at an example of how to use a conditional expression:

%dw 2.0
output application/json
---
[
  { name: "John", age: 25 },
  { name: "Emily", age: 30 },
  { name: "David", age: 40 },
  { name: "Sarah", age: 20 },
  { name: "Peter", age: 35 }
] filter (if ($.age > 30) true else false) map ($.name)

In the above example, we have used an if/else expression to keep only the objects where the age is greater than 30. Because the comparison $.age > 30 already evaluates to true or false, the if/else wrapper is redundant here and mainly illustrates the syntax. We have also used the map function to extract the names of the filtered objects.

The output of the above code will be:

["David", "Peter"]
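
A conditional expression becomes more useful when it decides whether to filter at all. DataWeave writes conditionals as if/else, and the sketch below uses one to apply an age filter only when a threshold is configured. The minAge variable is a hypothetical setting introduced for this example; setting it to null would return every name.

```dataweave
%dw 2.0
output application/json
var people = [
  { name: "John", age: 25 },
  { name: "Emily", age: 30 },
  { name: "David", age: 40 },
  { name: "Sarah", age: 20 },
  { name: "Peter", age: 35 }
]
// Hypothetical optional threshold; null means "do not filter".
var minAge = 30
---
// Filter only when a threshold is set, otherwise pass the list through.
(if (minAge != null) (people filter ($.age > minAge)) else people) map ($.name)
```

With minAge set to 30, the result should again be David and Peter.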

In summary, DataWeave provides several ways to filter data based on different conditions. The filter and filterObject functions, the partition function from dw::core::Arrays, and if/else conditional expressions cover most filtering scenarios, and map, pluck, and reduce help shape the result. With these tools, you can efficiently filter, transform, and manipulate data in MuleSoft’s Anypoint Platform.
