mongodb - mongo: aggregate - $match before $project
I have a MongoDB collection of around 100 GB of data, and each field used in the $match expression has a single-field index.

I tried aggregate() and wrote $project as the first stage of the pipeline, with $match after it. The aggregation runs and returns the right results, but it takes hours! Does MongoDB only process the filtered ($match) data, or does it aggregate over the total range of data and filter afterwards? In my test case, $match filters the input down to around 150 MB (instead of the total data size of 100 GB).

By accident I changed the order and wrote $match before $project in the pipeline definition. That way, the aggregation was done within a few seconds.

When does MongoDB cut down the input data, and how does it make use of the indexes on the fields in $match?
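For illustration, a minimal sketch of both pipeline orders in the mongo shell; the collection name events and the field names status, user and value are hypothetical stand-ins for the real ones:

    // Slow: $project is applied to every document in the 100 GB
    // collection before $match gets a chance to filter anything.
    db.events.aggregate([
        { $project: { user: 1, status: 1, value: 1 } },
        { $match: { status: "active" } }
    ])

    // Fast: $match comes first, so the single-field index on
    // status can be used and later stages only see ~150 MB.
    db.events.aggregate([
        { $match: { status: "active" } },
        { $project: { user: 1, value: 1 } }
    ])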
As you have noticed, the order of pipeline operators is crucial when dealing with a big collection. If done incorrectly, you can run out of memory or, at the very least, be left with a single process taking a very long time. As noted in the docs:
The following pipeline operators take advantage of an index when they occur at the beginning of the pipeline:
$match, $sort, $limit, $skip.
So as long as $match comes at the front, the index can be used. Also noted in the docs:
The MongoDB aggregation pipeline streams MongoDB documents from one pipeline operator to the next to process the documents. Pipeline operators can be repeated in the pipeline.
That means $project only sees a fraction of the entire collection if it is preceded by $match.
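A quick way to confirm that the reordered pipeline actually uses the index is to ask for the execution plan. A minimal sketch, again assuming the hypothetical events collection with an indexed status field:

    // Single-field index on the $match field (as in the question).
    db.events.createIndex({ status: 1 })

    // With $match at the beginning of the pipeline, the explain
    // output should show an IXSCAN rather than a full COLLSCAN.
    db.events.aggregate(
        [
            { $match: { status: "active" } },
            { $project: { user: 1, value: 1 } }
        ],
        { explain: true }
    )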