hadoop - How to count the number of unique users with PIG
The following piece of code doesn't return what I'm trying to compute: the number of unique users. Any idea?
    data = LOAD 'input_initial' AS (user_id, item_id, rating, timestamp);
    data = FOREACH data GENERATE user_id, item_id;
    STORE data INTO 'input_final';
    data_users = FOREACH data GENERATE user_id;
    group_users = GROUP data_users BY user_id;
    count_users = FOREACH group_users GENERATE COUNT(data_users);
    STORE count_users INTO 'count_users';
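You need to add a final grouping operation that acts on 'all' rather than on an individual field. Grouping by user_id produces one row per user, so COUNT there returns the number of records for each user, not the number of users. Grouping that result again with ALL collects every per-user row into a single bag, and counting that bag gives the number of unique users: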
    group_users = GROUP data_users BY user_id;
    grp_all = GROUP group_users ALL;
    count_users = FOREACH grp_all GENERATE COUNT(group_users);
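As an alternative sketch (not part of the original answer), the same count can be obtained with DISTINCT followed by a single GROUP ... ALL. This assumes the input path and field names from the question; the aliases uniq_users and uniq_all are made up for illustration, and the default tab-delimited loader is assumed.

```pig
-- Load the ratings file (path and schema taken from the question).
data = LOAD 'input_initial' AS (user_id, item_id, rating, timestamp);

-- Keep only the user column, then drop duplicate user ids.
data_users = FOREACH data GENERATE user_id;
uniq_users = DISTINCT data_users;    -- hypothetical alias: one row per user

-- Collapse all rows into a single bag and count its tuples.
uniq_all = GROUP uniq_users ALL;     -- hypothetical alias
count_users = FOREACH uniq_all GENERATE COUNT(uniq_users);

STORE count_users INTO 'count_users';
```

Both versions should yield a single row containing the number of unique users. Note that COUNT skips tuples whose first field is null, so records with a missing user_id are not counted in either approach.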
hadoop apache-pig