MongoDB: Let Your Database Do the Heavy Lifting 🚀
Why Are We Talking About This?
Imagine you run a food recommendation system, and you need to fetch all the food recommendations a user has made, grouped by foodId.
Now, you could:
- Fetch all recommendations and group them in Java (slow 🐌).
- Let MongoDB do the grouping before sending the data (fast ⚡).
If you chose option 1, we need to talk. Let’s fix that with the MongoDB Aggregation Framework and get things moving at the speed of light. 🚀
Problem: How Not to Handle Grouping
Your first instinct might be to fetch all recommendations and then use Java Streams to group them:
    User currentUser = userService.getCurrentUser();
    List<Recommendation> recommendations = recommendationRepository
            .findByFromUserId(currentUser.getId(), pageable)
            .getContent().stream().toList();
    Map<String, List<Recommendation>> groupedRecommendations = recommendations.stream()
            .collect(Collectors.groupingBy(Recommendation::getFoodId));

Looks clean, right? But under the hood, it’s slow and inefficient. Why?
- Pulling Too Much Data → You fetch raw documents and then process them in Java instead of in MongoDB.
- Double Iteration → First .getContent().stream().toList() and then another .stream(), meaning unnecessary looping.
- Not Leveraging MongoDB’s Strengths → MongoDB is a document database, not just a storage box. Let it do the work for you!
Solution: MongoDB Aggregation to the Rescue 🚑
Instead of fetching everything and processing it in Java, let the MongoDB Aggregation Framework do the work for you.
MongoDB’s Aggregation Framework is a powerful tool for processing large datasets by passing them through a series of stages, known as a pipeline.
This framework allows you to filter, sort, group, and modify documents in a flexible and efficient manner.
Key Concepts
- Data Input: The pipeline starts with data from a collection.
- Stage Execution: Each stage processes its input documents and passes the result to the next stage.
- Output: The final stage produces the desired output, which can be aggregated data, reshaped documents, or even new collections.
Common Aggregation Stages:
- $match: Filters documents based on conditions.
- $group: Groups documents and performs aggregation operations like sum or average.
- $sort: Orders the documents based on specified fields.
- $project: Reshapes documents by adding, removing, or modifying fields.
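To make the "each stage feeds the next" idea concrete, here is a tiny in-memory model of a pipeline in plain Java. It is only a sketch for building intuition, not MongoDB code: the class PipelineModel and its methods match, groupCount, sortBy, and run are made-up names, and the real stages run inside the database.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// In-memory model of a pipeline: NOT MongoDB code, only an illustration of
// how each stage consumes the previous stage's output. All names are hypothetical.
class PipelineModel {

    // A "stage" maps a list of documents to a new list of documents.
    interface Stage extends Function<List<Map<String, Object>>, List<Map<String, Object>>> {}

    // Rough analogue of $match: keep documents whose field equals the given value.
    static Stage match(String field, Object value) {
        return docs -> docs.stream()
                .filter(d -> value.equals(d.get(field)))
                .collect(Collectors.toList());
    }

    // Rough analogue of $group with a count: one output document per distinct key.
    static Stage groupCount(String field) {
        return docs -> {
            Map<Object, Long> counts = new LinkedHashMap<>();
            for (Map<String, Object> d : docs) {
                counts.merge(d.get(field), 1L, Long::sum);
            }
            List<Map<String, Object>> out = new ArrayList<>();
            counts.forEach((key, n) -> {
                Map<String, Object> doc = new LinkedHashMap<>();
                doc.put("_id", key);
                doc.put("count", n);
                out.add(doc);
            });
            return out;
        };
    }

    // Rough analogue of $sort: ascending order by the string form of a field.
    static Stage sortBy(String field) {
        return docs -> docs.stream()
                .sorted(Comparator.comparing(d -> String.valueOf(d.get(field))))
                .collect(Collectors.toList());
    }

    // Run the stages in order, feeding each result into the next stage.
    static List<Map<String, Object>> run(List<Map<String, Object>> input, List<Stage> stages) {
        List<Map<String, Object>> current = input;
        for (Stage stage : stages) {
            current = stage.apply(current);
        }
        return current;
    }
}
```

Running match("fromUserId", "u1"), then groupCount("foodId"), then sortBy("_id") over four sample documents (three from u1, two of them for pizza and one for burger) yields two documents: burger with count 1, then pizza with count 2. The point is the shape of the data flow; in the real thing, MongoDB runs these steps before anything reaches your JVM.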
Here’s how we can group the recommendations using MongoTemplate Aggregation:
    // Imports come from org.springframework.data.mongodb.core (MongoTemplate),
    // ...core.aggregation.*, ...core.query.Criteria, and org.springframework.data.domain.*
    @Autowired
    private MongoTemplate mongoTemplate;

    public Page<Map<String, Object>> getGroupedRecommendations(String userId, Pageable pageable) {
        // Filter stage: keep only this user's recommendations
        MatchOperation matchStage = Aggregation.match(Criteria.where("fromUserId").is(userId));

        // Grouping stage: one document per foodId, with the full documents in an array
        GroupOperation groupStage = Aggregation.group("foodId")
                .push("$$ROOT").as("recommendations");
        // Alternatively, count the documents per group instead of collecting them:
        // GroupOperation groupStage = Aggregation.group("foodId")
        //         .count().as("recommendations");

        // Sorting stage: fall back to _id so the page order is deterministic
        SortOperation sortStage = Aggregation.sort(
                pageable.getSort().isSorted() ? pageable.getSort() : Sort.by("_id"));

        // Facet stage: the requested page and the total group count in one round trip
        FacetOperation facetStage = Aggregation.facet(
                        Aggregation.skip((long) pageable.getPageNumber() * pageable.getPageSize()),
                        Aggregation.limit(pageable.getPageSize()))
                .as("pagedResults")
                .and(Aggregation.count().as("count")).as("totalCount");

        // Create and run the aggregation pipeline
        Aggregation aggregation = Aggregation.newAggregation(matchStage, groupStage, sortStage, facetStage);
        AggregationResults<Map> results = mongoTemplate.aggregate(aggregation, "recommendations", Map.class);

        // $facet always returns exactly one document holding both sub-pipeline results
        List<Map<String, Object>> pagedResults = (List<Map<String, Object>>) results.getMappedResults().get(0).get("pagedResults");
        List<Map<String, Object>> totalCountList = (List<Map<String, Object>>) results.getMappedResults().get(0).get("totalCount");

        // The count may arrive as an Integer, so a direct (long) cast can throw ClassCastException
        long total = totalCountList.isEmpty() ? 0 : ((Number) totalCountList.get(0).get("count")).longValue();
        return new PageImpl<>(pagedResults, pageable, total);
    }

How It Works (Without Giving You a Headache) 🤕
1. $match → Filter Recommendations by User
Instead of fetching everything, we filter right away:
    { "$match": { "fromUserId": "12345" } }
💡 Why? This ensures MongoDB only considers relevant recommendations.
2. $group → Group Recommendations by foodId
    {
      "$group": {
        "_id": "$foodId",
        "recommendations": { "$push": "$$ROOT" }
      }
    }
💡 Why? Instead of getting hundreds of individual documents, we get one entry per foodId with all its recommendations inside an array.
- Grouping: The _id field in the $group stage specifies that documents should be grouped by the foodId field.
- Collecting Documents: The $push operator collects all the documents ($$ROOT) that belong to each group into an array called recommendations.
Besides $push, other notable accumulators include $sum, $avg, $max, $min, $count, and $addToSet.
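3. $sort → Keep It Organized
    { "$sort": { "_id": 1 } }
💡 Why? $group does not guarantee any particular output order, so without an explicit sort the pages produced by $skip and $limit could overlap or miss groups, and things could break on the frontend. Sorting fixes that.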
4. $facet → Apply Pagination
    {
      "$facet": {
        "pagedResults": [
          { "$skip": 10 },
          { "$limit": 10 }
        ],
        "totalCount": [
          { "$count": "count" }
        ]
      }
    }
💡 Why?
- pagedResults → Extracts only the page we need.
- totalCount → Gets the total number of grouped results, so pagination works properly.
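If you are wondering where the numbers in $skip and $limit come from, and how the totalCount facet turns back into a page count, here is a small sketch of the arithmetic. FacetPaging and its methods are hypothetical helpers (they are not part of Spring Data), and the sketch assumes zero-based page numbers, which is how Spring Data’s Pageable counts. It also reads the count through Number, because the driver may hand the value back as an Integer rather than a Long.

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of Spring Data) showing the arithmetic around $facet paging.
class FacetPaging {

    // Documents to skip for a zero-based page number, as used by { "$skip": ... }.
    static long skip(int pageNumber, int pageSize) {
        return (long) pageNumber * pageSize;
    }

    // Read the "count" field from the totalCount facet, which is an empty
    // list when no groups matched. Going through Number accepts both
    // Integer and Long values.
    static long readTotal(List<Map<String, Object>> totalCount) {
        if (totalCount.isEmpty()) {
            return 0L;
        }
        Object count = totalCount.get(0).get("count");
        return ((Number) count).longValue();
    }

    // Number of pages needed to show `total` groups, rounding up.
    static long totalPages(long total, int pageSize) {
        return (total + pageSize - 1) / pageSize;
    }
}
```

With these assumptions, the { "$skip": 10 }, { "$limit": 10 } pair above is what you get for page number 1 with a page size of 10, and a total of 25 groups means 3 pages.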
Why Is This Approach Faster? 🏎️
✅ Less Data Transfer → Only the requested page of groups crosses the network, so Java doesn’t have to receive and loop over thousands of records.
✅ Less Memory Usage → MongoDB does the heavy lifting, freeing up your JVM.
✅ Faster Response Times → Your API doesn’t waste time filtering and grouping in Java.
Be Smart, Let MongoDB Work for You
When working with MongoDB, avoid thinking like a SQL developer. Instead of fetching raw data and processing in Java, use MongoDB Aggregation to do it in one efficient query.
🚀 Rule of Thumb:
“If MongoDB can do it, let MongoDB do it.”
Got questions or different use cases? Let’s discuss in the comments! 😊
Happy coding.
