hadoop - How to count the number of unique users with Pig




The following piece of code doesn't return what I am trying to compute: the number of unique users. Any idea?

data = LOAD 'input_initial' AS (user_id, item_id, rating, timestamp);
data = FOREACH data GENERATE user_id, item_id;
STORE data INTO 'input_final';
data_users = FOREACH data GENERATE user_id;
group_users = GROUP data_users BY user_id;
count_users = FOREACH group_users GENERATE COUNT(data_users);
STORE count_users INTO 'count_users';

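Your final grouping operation needs to act on ALL rather than on an individual field. Grouping by user_id produces one row per user, so COUNT(data_users) returns the number of records for each user, not the number of users. Group that result once more with ALL and count its rows: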

group_users = GROUP data_users BY user_id;
grp_all = GROUP group_users ALL;
count_users = FOREACH grp_all GENERATE COUNT(group_users);
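
The same count can also be sketched with DISTINCT instead of the first GROUP. This is an alternative, not part of the original answer. It assumes the input file and field names from the question and the default loader (tab-delimited text); the alias uniq_users is a name made up for this illustration.

```pig
-- Assumption: 'input_initial' is tab-delimited with the four fields from the question.
data = LOAD 'input_initial' AS (user_id, item_id, rating, timestamp);

-- Keep only the user column, then drop duplicate rows so each user appears once.
data_users = FOREACH data GENERATE user_id;
uniq_users = DISTINCT data_users;   -- uniq_users is a hypothetical alias

-- GROUP ... ALL collects every remaining row into a single bag,
-- so COUNT returns one number: the number of unique users.
grp_all = GROUP uniq_users ALL;
count_users = FOREACH grp_all GENERATE COUNT(uniq_users);

STORE count_users INTO 'count_users';
```

Note that COUNT skips tuples whose first field is null, so rows with a missing user_id are not counted in either version.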

hadoop apache-pig
