mongodb - mongo: aggregate - $match before $project -

I have a MongoDB collection of around 100 GB of data, and every field used in the $match expression has a single-field index.

I tried aggregate() and put $project as the first stage of the pipeline, with $match after it.

The aggregation runs and returns the right results, but it takes hours! Does MongoDB only process the filtered ($match) data, or does it aggregate over the total range of data and filter afterwards?

In my test case, $match filters the data down to around 150 MB (instead of the total data size of 100 GB).

By accident, I changed the order and put $match before $project in the pipeline definition. That way, it was done within a few seconds.

When does MongoDB cut down the input data, and how does it use the indexes on the fields in $match?
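A minimal sketch of the two orderings, assuming a hypothetical collection with an indexed field named status (the collection, field, and value names are illustrative, not taken from the question):

```javascript
// Slow: $project reshapes the whole collection first, so the $match
// that follows cannot use the single-field index on "status".
const slowPipeline = [
  { $project: { status: 1, amount: 1 } },
  { $match: { status: "ERROR" } },
];

// Fast: $match is the first stage, so MongoDB can use the index on
// "status" and only matching documents flow into $project.
const fastPipeline = [
  { $match: { status: "ERROR" } },
  { $project: { status: 1, amount: 1 } },
];

// In the mongo shell either would be run the same way, e.g.:
// db.logs.aggregate(fastPipeline)
```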

As you have noticed, the order of pipeline operators is crucial when dealing with a big collection. If done incorrectly, you can run out of memory, or at least end up with a very long-running process. As noted in the docs:

The following pipeline operators take advantage of an index when they occur at the beginning of the pipeline:

$match, $sort, $limit, $skip

So as long as $match comes first, the index can be used. Also noted in the docs:

The MongoDB aggregation pipeline streams MongoDB documents from one pipeline operator to the next to process the documents. Pipeline operators can be repeated in the pipe.

That means $project only sees a fraction of the entire collection if it is preceded by $match.
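The streaming behaviour can be illustrated with plain JavaScript arrays (a simplified model of the pipeline, not MongoDB internals; the documents and field names are made up for illustration):

```javascript
// A toy model: each pipeline stage is a function over a list of docs.
const docs = [
  { status: "ERROR", amount: 5, note: "x" },
  { status: "OK",    amount: 7, note: "y" },
  { status: "ERROR", amount: 2, note: "z" },
];

// $match: keep only documents that satisfy the predicate.
const match = (ds) => ds.filter((d) => d.status === "ERROR");
// $project: reshape every document that reaches this stage.
const project = (ds) => ds.map(({ status, amount }) => ({ status, amount }));

// $match first: $project only touches the 2 matching documents.
const fast = project(match(docs));

// $project first: every document is reshaped before filtering.
const slow = match(project(docs));

// Both orders give the same result here, but the amount of work
// done by the projection stage differs dramatically at scale.
```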

