Hello,
I am trying to follow the instructions here: https://github.com/facebookresearch/Horizon/blob/master/docs/usage.md
When I run this script:
/usr/local/spark/bin/spark-submit \
  --class com.facebook.spark.rl.Preprocessor preprocessing/target/rl-preprocessing-1.1.jar \
  "`cat ml/rl/workflow/sample_configs/discrete_action/timeline.json`"
I am getting:
2019-02-27 00:57:03 INFO HiveMetaStore:746 - 0: get_database: global_temp
2019-02-27 00:57:03 INFO audit:371 - ugi=root ip=unknown-ip-addr cmd=get_database: global_temp
2019-02-27 00:57:03 WARN ObjectStore:568 - Failed to get database global_temp, returning NoSuchObjectException
Exception in thread "main" org.apache.spark.sql.AnalysisException: grouping expressions sequence is empty, and 'source_table.mdp_id
' is not an aggregate function. Wrap '()' in windowing function(s) or wrap 'source_table.mdp_id
' in first() (or first_value) if you don't care which value you get.;;
'Sort ['HASH('mdp_id, 'sequence_number) ASC NULLS FIRST], false
+- 'RepartitionByExpression ['HASH('mdp_id, 'sequence_number)], 200
+- 'Project [mdp_id#1, state_features#4, action#5, action_probability#3, reward#6, next_state_features#24, next_action#25, sequence_number#2, sequence_number_ordinal#26, time_diff#27, possible_actions#7, possible_next_actions#28, metrics#8]
+- 'Project [mdp_id#1, state_features#4, action#5, action_probability#3, reward#6, sequence_number#2, possible_actions#7, metrics#8, next_state_features#24, next_action#25, sequence_number_ordinal#26, _we3#30, possible_next_actions#28, next_state_features#24, next_action#25, sequence_number_ordinal#26, (coalesce(_we3#30, sequence_number#2) - sequence_number#2) AS time_diff#27, possible_next_actions#28]
+- 'Window [lead(state_features#4, 1, null) windowspecdefinition(mdp_id#1, mdp_id#1 ASC NULLS FIRST, sequence_number#2 ASC NULLS FIRST, specifiedwindowframe(RowFrame, 1, 1)) AS next_state_features#24, lead(action#5, 1, null) windowspecdefinition(mdp_id#1, mdp_id#1 ASC NULLS FIRST, sequence_number#2 ASC NULLS FIRST, specifiedwindowframe(RowFrame, 1, 1)) AS next_action#25, row_number() windowspecdefinition(mdp_id#1, mdp_id#1 ASC NULLS FIRST, sequence_number#2 ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS sequence_number_ordinal#26, lead(sequence_number#2, 1, null) windowspecdefinition(mdp_id#1, mdp_id#1 ASC NULLS FIRST, sequence_number#2 ASC NULLS FIRST, specifiedwindowframe(RowFrame, 1, 1)) AS _we3#30, lead(possible_actions#7, 1, null) windowspecdefinition(mdp_id#1, mdp_id#1 ASC NULLS FIRST, sequence_number#2 ASC NULLS FIRST, specifiedwindowframe(RowFrame, 1, 1)) AS possible_next_actions#28], [mdp_id#1], [mdp_id#1 ASC NULLS FIRST, sequence_number#2 ASC NULLS FIRST]
+- 'Filter isnotnull('next_state_features)
+- Aggregate [mdp_id#1, state_features#4, action#5, action_probability#3, reward#6, sequence_number#2, possible_actions#7, metrics#8]
+- SubqueryAlias source_table
+- Project [mdp_id#1, state_features#4, action#5, action_probability#3, reward#6, sequence_number#2, possible_actions#7, metrics#8]
+- Filter ((ds#0 >= 2019-01-01) && (ds#0 <= 2019-01-01))
+- SubqueryAlias cartpole_discrete
+- Relation[ds#0,mdp_id#1,sequence_number#2,action_probability#3,state_features#4,action#5,reward#6,possible_actions#7,metrics#8] json
I tried these steps after manually installing HBase. (That step is missing from the documentation; please let me know if you would like me to add it.)
I am following the Docker on Mac instructions (https://github.com/facebookresearch/Horizon/blob/master/docs/installation.md) to get set up. Can anyone please help me move forward?