
Aggregation in MongoDB

Authors
  • Hieu Cao

Introduction

MongoDB's aggregation framework is a powerful tool for processing and transforming data. It allows you to perform complex data analysis tasks such as filtering, grouping, sorting, and calculating aggregated results directly within the database.

In this post, we'll explore the basics of MongoDB's aggregation framework: its core components, the most common stages, and a worked example that combines them.


Prerequisites

Before diving into aggregation, ensure you have:

  • MongoDB installed and running.
  • Access to the MongoDB shell or a database management tool.

What Is Aggregation?

Aggregation in MongoDB processes the documents in a collection and returns computed results. It is used to analyze data patterns, extract insights, and prepare summarized reports. The aggregation framework is built around a pipeline: documents pass through a sequence of stages, and the output of one stage becomes the input of the next.

Key Components of the Aggregation Framework:

  • Pipeline: An ordered array of stages, passed to db.collection.aggregate().
  • Stages: Operations such as $match, $group, and $project, applied to the documents in the order they appear in the pipeline.
  • Expressions: Field references (such as "$age") and operators (such as $sum or $multiply) that compute values inside a stage.
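To make these three terms concrete, here is a minimal sketch in plain JavaScript. It is only a conceptual model, not how MongoDB evaluates pipelines internally, and the names runPipeline, matchStage, and projectStage are hypothetical helpers invented for this illustration.

```javascript
// Conceptual model only: a pipeline is an ordered list of stages, and each
// stage transforms the array of documents handed to it by the previous one.
// runPipeline, matchStage and projectStage are hypothetical helpers.
function runPipeline(docs, stages) {
  return stages.reduce((current, stage) => stage(current), docs);
}

// A stage that keeps only documents satisfying a predicate (similar to $match).
const matchStage = (predicate) => (docs) => docs.filter(predicate);

// A stage that reshapes each document (similar to $project with an expression).
const projectStage = (shape) => (docs) => docs.map(shape);

// Made-up sample documents.
const sampleUsers = [
  { name: "An", age: 25 },
  { name: "Binh", age: 41 },
  { name: "Chi", age: 35 },
];

const modelResult = runPipeline(sampleUsers, [
  matchStage((user) => user.age > 30),
  projectStage((user) => ({ name: user.name, ageInMonths: user.age * 12 })),
]);

console.log(modelResult);
```

The filter runs first, so the reshaping stage only ever sees the two users older than 30. That ordering effect is the main thing to carry over to real pipelines.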

Basic Aggregation Stages

1. $match: Filtering Documents

Passes along only the documents that satisfy the given conditions. The conditions use the same query syntax as find().

Example:

Retrieve users older than 30:

> db.users.aggregate([
  { $match: { age: { $gt: 30 } } }
]);

2. $group: Grouping Data

Groups documents by a key expression (the _id field of the stage) and computes accumulators such as $sum and $avg for each group. The output contains one document per distinct key.

Example:

Group users by isActive status and count them:

> db.users.aggregate([
  { $group: { _id: "$isActive", count: { $sum: 1 } } }
]);
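The count above uses $sum: 1, which adds 1 for every document in a group. As a sketch of the "average" case, the pipeline below also computes the mean age per group with $avg. It assumes the users documents carry the age field used in the $match example; groupPipeline and averageAge are names chosen for this illustration. The guard lets the same file load in Node, where no db object exists.

```javascript
// One output document per distinct isActive value.
const groupPipeline = [
  {
    $group: {
      _id: "$isActive",              // grouping key
      count: { $sum: 1 },            // adds 1 per document in the group
      averageAge: { $avg: "$age" },  // mean of the age field within the group
    },
  },
];

// Run against the database only inside mongosh, where `db` is defined.
if (typeof db !== "undefined") {
  printjson(db.users.aggregate(groupPipeline).toArray());
}
```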

3. $project: Shaping the Output

Includes or excludes fields (1 to keep, 0 to drop) and can add new fields computed from expressions.

Example:

Show only user names and ages:

> db.users.aggregate([
  { $project: { name: 1, age: 1, _id: 0 } }
]);
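$project can also emit a computed field alongside the selected ones. The following is a small sketch under the assumption that age is numeric; projectPipeline and isOver30 are hypothetical names made up for this example.

```javascript
// Keep name and age, drop _id, and add a boolean computed from age.
const projectPipeline = [
  {
    $project: {
      _id: 0,
      name: 1,
      age: 1,
      isOver30: { $gt: ["$age", 30] }, // expression form: [left operand, right operand]
    },
  },
];

// Run against the database only inside mongosh, where `db` is defined.
if (typeof db !== "undefined") {
  printjson(db.users.aggregate(projectPipeline).toArray());
}
```

Note that inside $project the comparison uses the expression form $gt: [a, b], not the query form { age: { $gt: 30 } } used by $match.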

4. $sort: Sorting Documents

Sorts documents by one or more fields, using 1 for ascending and -1 for descending order.

Example:

Sort users by age in descending order:

> db.users.aggregate([
  { $sort: { age: -1 } }
]);

5. $limit and $skip: Paginating Results

  • $limit: Caps the number of documents passed to the next stage.
  • $skip: Discards the first n documents and passes the rest along.

Example:

Retrieve the 5 oldest users:

> db.users.aggregate([
  { $sort: { age: -1 } },
  { $limit: 5 }
]);
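The example above only uses $limit. For actual pagination, $skip goes between $sort and $limit. Here is a minimal sketch; pagePipeline is a hypothetical helper written for this post, and it assumes pages are numbered from 1.

```javascript
// pagePipeline is a hypothetical helper: it builds the stages for one page
// of users, oldest first. Pages are numbered starting at 1.
function pagePipeline(page, pageSize) {
  return [
    { $sort: { age: -1, _id: 1 } },    // _id breaks ties so page contents stay stable
    { $skip: (page - 1) * pageSize },  // drop the documents of earlier pages
    { $limit: pageSize },              // keep one page worth of documents
  ];
}

const secondPage = pagePipeline(2, 5);
console.log(JSON.stringify(secondPage));

// Run against the database only inside mongosh, where `db` is defined.
if (typeof db !== "undefined") {
  printjson(db.users.aggregate(secondPage).toArray());
}
```

Sorting on age alone can return users of the same age in a different order between calls, which is why the sketch adds _id as a tie-breaker. For deep pages, $skip still has to walk past every skipped document, so it may become slow on large collections.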

Combining Stages: A Full Example

Suppose we have the following sales collection:

{
  product: "Laptop",
  category: "Electronics",
  price: 1200,
  quantity: 10
},
{
  product: "Phone",
  category: "Electronics",
  price: 800,
  quantity: 20
},
{
  product: "Chair",
  category: "Furniture",
  price: 150,
  quantity: 15
}

Task:

Calculate total sales revenue for each category.

Solution:

> db.sales.aggregate([
  { $group: { _id: "$category", totalRevenue: { $sum: { $multiply: ["$price", "$quantity"] } } } },
  { $sort: { totalRevenue: -1 } }
]);

Output:

{
  _id: "Electronics",
  totalRevenue: 28000
},
{
  _id: "Furniture",
  totalRevenue: 2250
}
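To check the numbers by hand, the same computation can be reproduced in plain JavaScript without a database. This is only an illustrative equivalent of the $group and $sort stages for the three sample documents, not what the server runs; totals and revenueByCategory are names chosen for this sketch.

```javascript
// The three sample documents from the sales collection above.
const sales = [
  { product: "Laptop", category: "Electronics", price: 1200, quantity: 10 },
  { product: "Phone", category: "Electronics", price: 800, quantity: 20 },
  { product: "Chair", category: "Furniture", price: 150, quantity: 15 },
];

// Equivalent of $group with $sum over $multiply: ["$price", "$quantity"].
const totals = new Map();
for (const { category, price, quantity } of sales) {
  totals.set(category, (totals.get(category) ?? 0) + price * quantity);
}

// Equivalent of $sort: { totalRevenue: -1 }.
const revenueByCategory = [...totals]
  .map(([_id, totalRevenue]) => ({ _id, totalRevenue }))
  .sort((a, b) => b.totalRevenue - a.totalRevenue);

console.log(revenueByCategory);
```

Electronics comes to 1200 × 10 + 800 × 20 = 12000 + 16000 = 28000, and Furniture to 150 × 15 = 2250, matching the pipeline output.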

Tips for Using Aggregation Effectively

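  1. Indexing: A $match stage can use an index when it is the first stage of the pipeline, so index the fields you filter on. The same applies to a leading $sort.
  2. Pipeline Order: Place $match as early as possible to reduce the number of documents later stages must process. Place $limit early only when that does not change the result: a $limit before $sort or $group restricts the input, not the final output.
  3. Test Incrementally: Build your pipeline step by step to debug issues more easily.
  4. Aggregation Expressions: Use accumulators such as $sum, $avg, $min, and $max inside $group, and arithmetic operators such as $multiply to compute values from fields.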
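The tips above can be combined in one short sketch. It assumes the users collection from the earlier examples, with age and isActive fields. The names tunedPipeline and pipelinePrefixes are hypothetical, and the single-field index on age is just one plausible choice.

```javascript
// $match comes first so it can use an index on age and shrink the input
// before the more expensive $group and $sort stages run.
const tunedPipeline = [
  { $match: { age: { $gt: 30 } } },
  { $group: { _id: "$isActive", count: { $sum: 1 }, averageAge: { $avg: "$age" } } },
  { $sort: { count: -1 } },
];

// Incremental testing: the first stage alone, then the first two, and so on.
const pipelinePrefixes = tunedPipeline.map((_, i) => tunedPipeline.slice(0, i + 1));

// Run against the database only inside mongosh, where `db` is defined.
if (typeof db !== "undefined") {
  db.users.createIndex({ age: 1 });

  // Inspect the plan to see whether the index is used.
  printjson(db.users.explain().aggregate(tunedPipeline));

  // Look at the intermediate result after each additional stage.
  for (const prefix of pipelinePrefixes) {
    printjson(db.users.aggregate(prefix).toArray());
  }
}
```

Running each prefix in turn shows exactly which stage introduces an unexpected result, which is usually faster than staring at the full pipeline.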

Conclusion

The MongoDB aggregation framework lets you filter, group, reshape, and sort data inside the database instead of in application code. Once you are comfortable with the stages and expressions covered here, you can combine them into longer pipelines for reporting and analysis. Experiment with the examples provided and then explore more advanced use cases.

Happy Coding!