Aggregation in MongoDB
- Author: Hieu Cao
Introduction
MongoDB's aggregation framework is a powerful tool for processing and transforming data. It allows you to perform complex data analysis tasks such as filtering, grouping, sorting, and calculating aggregated results directly within the database.
In this post, we’ll explore the basics of MongoDB’s aggregation framework, including its core components and practical examples.
Prerequisites
Before diving into aggregation, ensure you have:
- MongoDB installed and running.
- Access to the MongoDB shell or a database management tool.
What Is Aggregation?
Aggregation in MongoDB involves processing data records and returning computed results. It is used to analyze data patterns, extract insights, and prepare summarized reports. The aggregation framework consists of a pipeline that processes data in stages.
Key Components of the Aggregation Framework:
- Pipeline: A sequence of stages that process documents.
- Stages: Operations like $match, $group, $project, etc., applied sequentially to the input documents.
- Expressions: Specify calculations or transformations within stages.
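To make the pipeline idea concrete, here is a minimal sketch in plain JavaScript that runs in Node.js without a database. The helpers `match`, `sortBy`, `limit`, and `runPipeline`, along with the sample users, are hypothetical stand-ins invented for illustration. They are not MongoDB APIs, and the server's real implementation is different. The only point is that each stage receives the output of the stage before it.

```javascript
// Hypothetical in-memory stand-ins for three stages. Each one takes an array
// of documents and returns a new array, which is how a pipeline chains stages.
const match = (predicate) => (docs) => docs.filter(predicate);
const sortBy = (field, direction) => (docs) =>
  [...docs].sort((a, b) => direction * (a[field] - b[field]));
const limit = (n) => (docs) => docs.slice(0, n);

// A pipeline is an ordered list of stages applied one after another.
function runPipeline(docs, stages) {
  return stages.reduce((current, stage) => stage(current), docs);
}

// Sample data (assumed for this sketch).
const users = [
  { name: "An", age: 25 },
  { name: "Binh", age: 41 },
  { name: "Chi", age: 35 },
  { name: "Dung", age: 52 },
];

const result = runPipeline(users, [
  match((u) => u.age > 30), // plays the role of { $match: { age: { $gt: 30 } } }
  sortBy("age", -1),        // plays the role of { $sort: { age: -1 } }
  limit(2),                 // plays the role of { $limit: 2 }
]);

console.log(result.map((u) => u.name)); // [ 'Dung', 'Binh' ]
```

Swapping the order of the stages in the array changes the result, which is the main thing to keep in mind when reading the stage descriptions that follow.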
Basic Aggregation Stages
1. $match: Filtering Documents
Filters the documents based on specified conditions.
Example:
Retrieve users aged above 30:
> db.users.aggregate([
{ $match: { age: { $gt: 30 } } }
]);
2. $group: Grouping Data
Groups documents by a specified field and performs aggregation operations like sum, average, etc.
Example:
Group users by isActive status and count them:
> db.users.aggregate([
{ $group: { _id: "$isActive", count: { $sum: 1 } } }
]);
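If the accumulator syntax feels opaque, the following plain JavaScript sketch mimics what that grouping computes. It runs in Node.js on a small in-memory array; the sample users are made up for illustration, and this is not how MongoDB evaluates the stage internally.

```javascript
// Sample data (assumed for this sketch).
const users = [
  { name: "An", isActive: true },
  { name: "Binh", isActive: false },
  { name: "Chi", isActive: true },
];

// One bucket per distinct isActive value, as _id: "$isActive" would create.
const counts = new Map();
for (const user of users) {
  // Counterpart of count: { $sum: 1 }: add 1 for every document in the bucket.
  counts.set(user.isActive, (counts.get(user.isActive) ?? 0) + 1);
}

// Shape the buckets like the documents $group returns.
const grouped = [...counts].map(([id, count]) => ({ _id: id, count }));
console.log(grouped); // two active users, one inactive user
```

Note that MongoDB does not guarantee the order of the documents that $group returns, so add a $sort stage after it if the order matters.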
3. $project: Shaping the Output
Selects specific fields and transforms the data structure.
Example:
Show only user names and ages:
> db.users.aggregate([
{ $project: { name: 1, age: 1, _id: 0 } }
]);
4. $sort: Sorting Documents
Sorts documents by specified fields in ascending (1) or descending (-1) order.
Example:
Sort users by age in descending order:
> db.users.aggregate([
{ $sort: { age: -1 } }
]);
5. $limit and $skip: Paginating Results
- $limit: Restricts the number of documents passed to the next stage.
- $skip: Skips a specified number of documents.
Example:
Retrieve the top 5 oldest users:
> db.users.aggregate([
{ $sort: { age: -1 } },
{ $limit: 5 }
]);
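The example above uses only $limit. For actual pagination, $skip and $limit are usually combined after a $sort. Here is a small sketch of that idea; `pageStages` is a hypothetical helper name, and the 1-based page numbering and the `_id` tie-breaker are assumptions of this sketch rather than anything MongoDB requires.

```javascript
// Hypothetical helper: builds the stages for one page of users, oldest first.
// Pages are 1-based in this sketch.
function pageStages(page, pageSize) {
  return [
    // _id is added as a tie-breaker so users with the same age keep a
    // consistent order from one page request to the next.
    { $sort: { age: -1, _id: 1 } },
    // Skip all documents that belong to earlier pages.
    { $skip: (page - 1) * pageSize },
    // Keep only the documents for the requested page.
    { $limit: pageSize },
  ];
}

const secondPage = pageStages(2, 5);
console.log(JSON.stringify(secondPage));

// In the MongoDB shell the result could be passed straight to aggregate:
// db.users.aggregate(pageStages(2, 5));
```

For page 2 with a page size of 5, the pipeline skips the first 5 documents and returns the next 5. Keep in mind that large $skip values still make the server walk past the skipped documents.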
Combining Stages: A Full Example
Suppose we have the following sales collection:
{
product: "Laptop",
category: "Electronics",
price: 1200,
quantity: 10
},
{
product: "Phone",
category: "Electronics",
price: 800,
quantity: 20
},
{
product: "Chair",
category: "Furniture",
price: 150,
quantity: 15
}
Task:
Calculate total sales revenue for each category.
Solution:
> db.sales.aggregate([
{ $group: { _id: "$category", totalRevenue: { $sum: { $multiply: ["$price", "$quantity"] } } } },
{ $sort: { totalRevenue: -1 } }
]);
Output:
{
_id: "Electronics",
totalRevenue: 28000
},
{
_id: "Furniture",
totalRevenue: 2250
}
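As a sanity check on these numbers, the same calculation can be reproduced in plain JavaScript on the three sample documents. This is only an in-memory sketch for verifying the arithmetic in Node.js, not a replacement for the pipeline.

```javascript
// The three sample documents from the sales collection above.
const sales = [
  { product: "Laptop", category: "Electronics", price: 1200, quantity: 10 },
  { product: "Phone", category: "Electronics", price: 800, quantity: 20 },
  { product: "Chair", category: "Furniture", price: 150, quantity: 15 },
];

// Counterpart of the $group stage: sum price * quantity per category.
const revenueByCategory = {};
for (const { category, price, quantity } of sales) {
  revenueByCategory[category] =
    (revenueByCategory[category] ?? 0) + price * quantity;
}

// Counterpart of the $sort stage: highest revenue first.
const output = Object.entries(revenueByCategory)
  .map(([category, totalRevenue]) => ({ _id: category, totalRevenue }))
  .sort((a, b) => b.totalRevenue - a.totalRevenue);

console.log(output);
```

Electronics comes to 1200 × 10 + 800 × 20 = 28000 and Furniture to 150 × 15 = 2250, matching the output above.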
Tips for Using Aggregation Effectively
- Indexing: A $match stage at the start of the pipeline can use an index on the fields it filters, so index those fields.
- Pipeline Order: Place $match, and $limit where it does not change the result, early in the pipeline to reduce the dataset size.
- Test Incrementally: Build your pipeline step by step to debug issues more easily.
- Aggregation Expressions: Leverage expressions like $sum, $avg, $min, $max, and $multiply for advanced calculations.
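One way to follow the "test incrementally" tip is to run growing prefixes of the pipeline and inspect the output after each added stage. The sketch below is plain JavaScript; `pipelinePrefixes` is a hypothetical helper name, and the pipeline is just the stages used earlier in this post.

```javascript
// The stages used earlier in this post, collected into one pipeline.
const pipeline = [
  { $match: { age: { $gt: 30 } } },
  { $sort: { age: -1 } },
  { $limit: 5 },
];

// Hypothetical helper: returns [stage1], [stage1, stage2], and so on,
// so each prefix can be run and checked on its own.
function pipelinePrefixes(stages) {
  return stages.map((_, i) => stages.slice(0, i + 1));
}

for (const prefix of pipelinePrefixes(pipeline)) {
  // In the MongoDB shell you would run: db.users.aggregate(prefix)
  console.log(prefix.map((stage) => Object.keys(stage)[0]).join(" -> "));
}
```

Run as-is, this only prints the stage names of each prefix. Against a real collection, passing each prefix to aggregate shows exactly which stage first produces an unexpected result.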
Conclusion
The MongoDB aggregation framework is a versatile and powerful feature for data processing and analysis. By mastering its stages and expressions, you can perform a wide range of data operations efficiently. Experiment with the examples provided and explore more advanced use cases to unlock its full potential.
Happy Coding!