Oracle PGX on Yarn - 404 on WebService

























I'm running Yarn on Oracle BDA X7-2, specs:



  • Cloudera Enterprise 5.14.3

  • Java 1.8.0_171

  • PGX 2.7.1

I'm trying to run PGX on Yarn following this manual:
https://docs.oracle.com/cd/E56133_01/2.5.0/tutorials/yarn.html



I managed to run the installation script and completed the config file it generated with the following:

{
  "pgx_yarn_jar_hdfs_path": "hdfs:/user/pgx/pgx-yarn-2.7.1.jar",
  "pgx_war_hdfs_path": "hdfs:/user/pgx/pgx-webapp-2.7.1.war",
  "pgx_conf_hdfs_path": "hdfs:/user/pgx/pgx.conf",
  "pgx_log4j_conf_hdfs_path": "hdfs:/user/pgx/log4j2.xml",
  "pgx_dist_log4j_conf_hdfs_path": "hdfs:/user/pgx/dist_log4j.xml",
  "pgx_cluster_host_hdfs_path": "hdfs:/user/pgx/cluster-host.tgz",
  "zookeeper_connect_string": "bda1node05,bda1node06,bda1node07",
  "standard_library_path": "/usr/lib64/gcc/4.8.2",
  "min_heap_size": "512m",
  "max_heap_size": "12g",
  "container_cores": 9,
  "container_memory": 0,
  "container_priority": 0,
  "num_machines": 1
}



YARN shows a pgx-service application in the RUNNING state with no errors in stderr, and the log says the service is running at:



http://bda1node06:7007



The Linux Java process is running with the following command:



/usr/java/default/bin/java -Xms512m -Xmx12g oracle.pgx.yarn.PgxService bda1node06 /u11/hadoop/yarn/nm/usercache/root/appcache/application_1539869144089_2070/container_e22_1539869144089_2070_01_000002/pgx-server.war 7007 bda1node05,bda1node06,bda1node07 /pgx-8eef44e2-1657-403a-8193-0102f5266680



When I then run the PGX client to test the connection:



$PGX_HOME/bin/pgx --base_url http://bda1node06:7007



I get:



java.util.concurrent.ExecutionException: java.lang.IllegalStateException: cannot connect to server; requested http://bda1node06:7007/version?extendedInfo=true and expected status 200, got 404 instead; response body = ""
at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
at oracle.pgx.api.PgxFuture.get(PgxFuture.java:99)
at oracle.pgx.api.ServerInstance.createSession(ServerInstance.java:559)
at oracle.pgx.shell.Console.initSession(Console.java:280)
at oracle.pgx.shell.Console.<init>(Console.java:153)
at oracle.pgx.shell.Console.main(Console.java:296)
Caused by: java.lang.IllegalStateException: cannot connect to server; requested http://bda1node06:7007/version?extendedInfo=true and expected status 200, got 404 instead; response body = ""
at oracle.pgx.api.ClientApiProvider.lambda$versionCheck$2(ClientApiProvider.java:189)
at oracle.pgx.client.RemoteUtils.lambda$asyncRequest$5(RemoteUtils.java:278)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)


I have no idea how to debug this or how to check whether an extra path is needed in the connection URL.



How may I proceed to debug?



Thanks in advance!
































  • Is there any useful output when running yarn logs -applicationId <appId>?

    – Korbi
    Nov 16 '18 at 20:33






  • Another thing you can try is adjusting the container_cores and container_memory settings in yarn.conf: set them to small values to make sure YARN doesn't request more capacity than is available, which could cause the service to never be deployed. I think setting them to 0 means the maximum available capacity.

    – Korbi
    Nov 20 '18 at 19:52











  • @Korbi thank you so much for your suggestions. Sorry for the delayed response; I was busy over the last few weeks and was not following this thread. Today we had a meeting with Adriano (from Oracle Brazil) and Albert (Oracle PO); I think you may know them. I'll check your suggestions first thing tomorrow morning. Thanks in advance!

    – Samamba
    Dec 6 '18 at 20:13











  • Just to confirm: when you start the PGX server manually, using the pgx/bin/start-server script, does the server start successfully? And are you then able to connect from the client when it also runs on the BDA?

    – Albert Godfrind
    Dec 7 '18 at 18:13











  • @AlbertGodfrind following our meeting, I have just done what you recommended: opened the Groovy shell, connected to the HBase database, created some vertices and edges, successfully instantiated a PGX analyst, and ran some basic operations (count triangles, etc.); everything worked fine. I then edited the conf/server.conf file, disabled TLS and authentication, and started the PGX server. It seems to be running and listening on port 7007, and now I'm struggling a little to connect without SSL using the bin/pgx client. Everything is on the Oracle BDA.

    – Samamba
    Dec 10 '18 at 19:45



































bigdata yarn cloudera cloudera-manager oracle-spatial
















asked Nov 13 '18 at 15:23









Samamba
























2 Answers
































By default, PGX has a base path of /pgx, which means you should connect as follows:



$PGX_HOME/bin/pgx --base_url http://bda1node06:7007/pgx
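One way to check which base path the server actually answers on is to probe the version endpoint that the client itself requests (`/version?extendedInfo=true`, taken from the error message) under a few candidate prefixes. The sketch below is a hypothetical helper, not part of PGX: the function names and the list of candidate paths are assumptions, and only the host, port, and endpoint come from the question.

```python
import urllib.error
import urllib.request


def version_url(base_url):
    """Build the version-check URL the PGX client requests for a base URL."""
    return base_url.rstrip("/") + "/version?extendedInfo=true"


def probe(base_url, timeout=5):
    """Return the HTTP status code of the version check, or None if unreachable."""
    try:
        with urllib.request.urlopen(version_url(base_url), timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 404 when nothing is mapped to that path
    except (urllib.error.URLError, OSError):
        return None  # connection refused, DNS failure, timeout, ...


def candidate_base_urls(host, port, paths=("", "/pgx")):
    """List base URLs to try; the candidate paths are illustrative guesses."""
    return ["http://{}:{}{}".format(host, port, p) for p in paths]
```

Calling `probe(u)` for each entry of `candidate_base_urls("bda1node06", 7007)` shows which prefix, if any, returns 200. If every candidate returns 404, the problem is more likely that the web application is not deployed inside the container than that the URL is wrong.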




























  • I get a similar response: $PGX_HOME/bin/pgx --base_url bda1node06:7007/pgx java.util.concurrent.ExecutionException: java.lang.IllegalStateException: cannot connect to server; requested bda1node06:7007/pgx/version?extendedInfo=true and expected status 200, got 404 instead; response body = ""

    – Samamba
    Nov 13 '18 at 17:07
































Here is a short follow-up.

We've managed to start a PGX server and manipulate an HBase graph! :D



PGX "Hello World"



We wrote a small script that inserts vertices and edges, instantiates PGX, and runs a simple example:



cfg = GraphConfigBuilder.forPropertyGraphHbase().setName('sinapse').setZkQuorum('bda1node05').build()
opg = OraclePropertyGraph.getInstance(cfg)

a = opg.addVertex()
a.setProperty('nome', 'Felipe')

b = opg.addVertex()
b.setProperty('nome', 'Rhenan')

c = opg.addVertex()
c.setProperty('nome', 'Hugo')

opg.addEdge(a, b, 'Pai de')
opg.addEdge(b, c, 'Pai de')
opg.addEdge(a, c, 'Avo de')

opg.commit()

session = Pgx.createSession('sinapsepgx')
analyst = session.createAnalyst()
pgxGraph = session.readGraphWithProperties(opg.getConfig(), true)
analyst.countTriangles(pgxGraph, true)


And that worked just fine!



Client - Server architecture



As the next step, we moved to client/server mode by running the start-server script.
We managed to do that just fine too!
These are our config files:



server.conf

{
  "port": 7007,
  "enable_tls": false,
  "enable_client_authentication": false
}



pgx.conf

{
  "allow_idle_timeout_overwrite": true,
  "allow_local_filesystem": false,
  "allow_task_timeout_overwrite": true,
  "enable_gm_compiler": true,
  "enterprise_scheduler_config": {
    "analysis_task_config": {
      "priority": "MEDIUM",
      "weight": 12,
      "max_threads": 12
    },
    "fast_analysis_task_config": {
      "priority": "HIGH",
      "weight": 1,
      "max_threads": 12
    },
    "num_io_threads_per_task": 12
  },
  "preload_graphs": [{
    "path": "graphs/sinapse_conf.json",
    "name": "sinapse"
  }],
  "max_active_sessions": 1024,
  "max_queue_size_per_session": -1,
  "max_snapshot_count": 0,
  "memory_cleanup_interval": 600,
  "path_to_gm_compiler": null,
  "release_memory_threshold": 0.85,
  "session_idle_timeout_secs": 0,
  "session_task_timeout_secs": 0,
  "strict_mode": true,
  "tmp_dir": "/tmp"
}



sinapse_conf.json

{
  "edge_props": [
    {
      "name": "relacao",
      "type": "string"
    }
  ],
  "db_engine": "HBASE",
  "vertex_props": [
    {
      "name": "nome",
      "type": "string"
    },
    {
      "name": "cpf",
      "type": "string"
    }
  ],
  "format": "pg",
  "name": "sinapse",
  "error_handling": {},
  "vertex_id_type": "long",
  "attributes": {},
  "loading": {},
  "zk_quorum": "bda1node05,bda1node06,bda1node07"
}




The start-server script ran fine with these settings and preloaded our HBase graph. It works like a charm.



Connected to the server using the pgx client:



./bin/pgx -b http://localhost:7007


And managed to do the same we did in the groovy shell.
That's awesome.



PGX on Yarn



Now we are back to our original challenge: running and managing PGX on YARN.

We copied our pgx.conf file to HDFS, like this:



hdfs://user/pgx/pgx.conf

{
  "allow_idle_timeout_overwrite": true,
  "allow_local_filesystem": false,
  "allow_task_timeout_overwrite": true,
  "enable_gm_compiler": true,
  "enterprise_scheduler_config": {
    "analysis_task_config": {
      "priority": "MEDIUM",
      "weight": 12,
      "max_threads": 12
    },
    "fast_analysis_task_config": {
      "priority": "HIGH",
      "weight": 1,
      "max_threads": 12
    },
    "num_io_threads_per_task": 12
  },
  "preload_graphs": [{
    "path": "graphs/sinapse_conf.json",
    "name": "sinapse"
  }],
  "max_active_sessions": 1024,
  "max_queue_size_per_session": -1,
  "max_snapshot_count": 0,
  "memory_cleanup_interval": 600,
  "path_to_gm_compiler": null,
  "release_memory_threshold": 0.85,
  "session_idle_timeout_secs": 0,
  "session_task_timeout_secs": 0,
  "strict_mode": true,
  "tmp_dir": "/tmp"
}



/opt/oracle/oracle-spatial-graph/property_graph/pgx/yarn/conf/yarn.conf

{
  "pgx_yarn_jar_hdfs_path": "hdfs://mpmapas-ns/user/pgx/pgx-yarn-2.7.1.jar",
  "pgx_war_hdfs_path": "hdfs://mpmapas-ns/user/pgx/pgx-webapp-2.7.1.war",
  "pgx_conf_hdfs_path": "hdfs://mpmapas-ns/user/pgx/pgx.conf",
  "pgx_log4j_conf_hdfs_path": "hdfs://mpmapas-ns/user/pgx/log4j2.xml",
  "pgx_dist_log4j_conf_hdfs_path": "hdfs://mpmapas-ns/user/pgx/dist_log4j.xml",
  "pgx_cluster_host_hdfs_path": "hdfs://mpmapas-ns/user/pgx/cluster-host.tgz",
  "zookeeper_connect_string": "bda1node05.pgj.rj.gov.br,bda1node06.pgj.rj.gov.br,bda1node07.pgj.rj.gov.br",
  "standard_library_path": "/usr/lib64/gcc/4.8.2",
  "min_heap_size": "512m",
  "max_heap_size": "12g",
  "container_cores": 9,
  "container_memory": 0,
  "container_priority": 0,
  "num_machines": 1
}

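The HDFS locations in this thread appear in three shapes: `hdfs:/user/pgx/...` (in the question), `hdfs://mpmapas-ns/user/pgx/...` (in yarn.conf), and `hdfs://user/pgx/...` (in the labels of this answer). Under generic URI parsing, the last form treats `user` as the authority (name service or host) rather than as a directory. The sketch below uses Python's standard `urllib.parse` only to illustrate that split; how Hadoop itself resolves each form is a separate matter, and the helper name is hypothetical.

```python
from urllib.parse import urlparse


def split_hdfs_uri(uri):
    """Return (authority, path) as a generic URI parser sees them."""
    parsed = urlparse(uri)
    return parsed.netloc, parsed.path


for uri in (
    "hdfs:/user/pgx/pgx.conf",
    "hdfs://mpmapas-ns/user/pgx/pgx.conf",
    "hdfs://user/pgx/pgx.conf",
):
    authority, path = split_hdfs_uri(uri)
    print("{!r}: authority={!r} path={!r}".format(uri, authority, path))
```

Only the first two forms keep `/user/pgx/pgx.conf` as the path, so it may be worth double-checking that every path in yarn.conf uses one of them.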


Also, @albert recommended that we remove log4j2.xml from the server/shared-mem/pgx-webapp-2.7.1.war file so that log4j logging is controlled only by the file placed in our HDFS folder.

So we unpacked the WAR file, removed log4j2.xml, repacked it, and edited the log4j2.xml file on HDFS like this:



hdfs://user/pgx/log4j2.xml

<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="WARN">
  <Appenders>
    <Console name="Console" target="SYSTEM_OUT">
      <PatternLayout pattern="%d{HH:mm:ss,SSS} %p %C{1} - %m%n"/>
    </Console>
    <File name="LogFile" fileName="file:/tmp/pg_trace.log">
      <PatternLayout pattern="%d{HH:mm:ss.SSS} [%t] %-5level %logger{36} - %msg%n"/>
    </File>
  </Appenders>
  <Loggers>
    <Root level="debug">
      <AppenderRef ref="LogFile"/>
    </Root>
    <Logger name="oracle.pgx.engine.admin.Ctrl" level="debug">
      <AppenderRef ref="LogFile"/>
    </Logger>
    <Logger name="pgx.dist.cluster_host" level="debug">
      <AppenderRef ref="LogFile"/>
    </Logger>
  </Loggers>
</Configuration>

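A quick sanity check on a log4j2 configuration like the one above is to compare the appenders it defines with the ones its loggers actually reference. The following is a minimal sketch using Python's standard XML parser; the function name is hypothetical and the embedded config is a trimmed copy of the file above, kept only to make the example self-contained.

```python
import xml.etree.ElementTree as ET


def appender_usage(xml_text):
    """Return (defined, referenced) appender name sets from a log4j2 XML config."""
    root = ET.fromstring(xml_text.strip())
    defined = {el.get("name") for el in root.findall("./Appenders/*")}
    referenced = {el.get("ref") for el in root.iter("AppenderRef")}
    return defined, referenced


# Trimmed copy of the configuration, without the layout elements.
config = """
<Configuration status="WARN">
  <Appenders>
    <Console name="Console" target="SYSTEM_OUT"/>
    <File name="LogFile" fileName="file:/tmp/pg_trace.log"/>
  </Appenders>
  <Loggers>
    <Root level="debug">
      <AppenderRef ref="LogFile"/>
    </Root>
  </Loggers>
</Configuration>
"""

defined, referenced = appender_usage(config)
print("unused appenders:", sorted(defined - referenced))
print("undefined references:", sorted(referenced - defined))
```

In this configuration the Console appender is defined but never referenced, so once the file is picked up, all output would go to the LogFile appender on the container host rather than to the stdout that YARN collects.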

Finally, we ran the YARN start command:



yarn jar yarn/pgx-yarn-2.7.1.jar yarn/conf/yarn.conf


The end of the log output looks promising:



18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux
18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Client environment:os.arch=amd64
18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Client environment:os.version=4.1.12-124.14.1.el7uek.x86_64
18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Client environment:user.name=root
18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Client environment:user.home=/root
18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Client environment:user.dir=/opt/oracle/oracle-spatial-graph/property_graph/pgx
18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=bda1node05.pgj.rj.gov.br,bda1node06.pgj.rj.gov.br,bda1node07.pgj.rj.gov.br sessionTimeout=10000 watcher=oracle.pgx.yarn.ClientZkClient@32da97fd
18/12/11 16:25:03 INFO zookeeper.ClientCnxn: Opening socket connection to server bda1node07.pgj.rj.gov.br/192.168.8.7:2181. Will not attempt to authenticate using SASL (unknown error)
18/12/11 16:25:03 INFO zookeeper.ClientCnxn: Socket connection established, initiating session, client: /192.168.8.5:33299, server: bda1node07.pgj.rj.gov.br/192.168.8.7:2181
18/12/11 16:25:03 INFO zookeeper.ClientCnxn: Session establishment complete on server bda1node07.pgj.rj.gov.br/192.168.8.7:2181, sessionid = 0x3668759ae4553df, negotiated timeout = 10000
18/12/11 16:25:05 INFO yarn.StartService: waiting for PGX service (yarn appId == 'application_1539869144089_2555') to come up ...
18/12/11 16:25:10 INFO yarn.StartService: retrieved PGX host: http://bda1node07.pgj.rj.gov.br:7007
18/12/11 16:25:10 INFO yarn.StartService: to connect a remote shell to this host, run '$PGX_HOME/bin/pgx --base_url http://bda1node07.pgj.rj.gov.br:7007'
18/12/11 16:25:10 INFO yarn.StartService: to shut the PGX service down, run 'yarn application -kill application_1539869144089_2555'
18/12/11 16:25:10 INFO zookeeper.ZooKeeper: Session: 0x3668759ae4553df closed
18/12/11 16:25:10 INFO zookeeper.ClientCnxn: EventThread shut down


But connecting to it still returns 404 ;(



The last piece of information I can give you is the YARN stderr log, which also shows that log4j is not being configured correctly:



SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-5.14.2-1.cdh5.14.2.p0.3/jars/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/u09/hadoop/yarn/nm/filecache/890/pgx-yarn-2.7.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the console. Set system property 'log4j2.debug' to show Log4j2 internal initialization logging.
18/12/11 16:25:06 INFO yarn.AppMaster: register app
18/12/11 16:25:06 INFO yarn.AppMaster: RM response = [queue=root.users.root,maxCap=<memory:65536, vCores:9>]
18/12/11 16:25:06 INFO yarn.AppMaster: max capability of cluster: <memory:65536, vCores:9>
18/12/11 16:25:06 INFO yarn.AppMaster: attempting to allocate 1 containers
18/12/11 16:25:06 INFO yarn.AppMaster: attempt 1: got 0 containers. Available: <memory:194560, vCores:180>
18/12/11 16:25:06 INFO yarn.AppMaster: attempt 2: got 0 containers. Available: <memory:194560, vCores:180>
18/12/11 16:25:06 INFO yarn.AppMaster: attempt 3: got 1 containers. Available: <memory:129024, vCores:171>
18/12/11 16:25:06 INFO yarn.AppMaster: copy hdfs://mpmapas-ns/user/pgx/pgx-yarn-2.7.1.jar into pgx-yarn.jar
18/12/11 16:25:06 INFO yarn.AppMaster: copy hdfs://mpmapas-ns/user/pgx/pgx-webapp-2.7.1.war into pgx-server.war
18/12/11 16:25:06 INFO yarn.AppMaster: copy hdfs://mpmapas-ns/user/pgx/pgx.conf into conf/pgx.conf
18/12/11 16:25:06 INFO yarn.AppMaster: copy hdfs://mpmapas-ns/user/pgx/log4j2.xml into conf/log4j2.xml
18/12/11 16:25:07 INFO yarn.AppMaster: server env = CLASSPATH=conf:pgx-server/WEB-INF/lib/*:pgx-yarn.jar:$HADOOP_CONF_DIR
18/12/11 16:25:07 INFO yarn.AppMaster: server command = $JAVA_HOME/bin/java -Xms512m -Xmx12g oracle.pgx.yarn.PgxService bda1node07.pgj.rj.gov.br $PWD/pgx-server.war 7007 bda1node05.pgj.rj.gov.br,bda1node06.pgj.rj.gov.br,bda1node07.pgj.rj.gov.br /pgx-37a121ce-e028-432c-8761-104027126c3b 1><LOG_DIR>/stdout 2><LOG_DIR>/stderr;
18/12/11 16:25:07 INFO yarn.AppMaster: check for completion
18/12/11 16:25:08 INFO yarn.AppMaster: check for completion
18/12/11 16:25:08 INFO yarn.AppMaster: check for completion
18/12/11 16:25:09 INFO yarn.AppMaster: check for completion
18/12/11 16:25:09 INFO yarn.AppMaster: check for completion
18/12/11 16:25:10 INFO yarn.AppMaster: check for completion
18/12/11 16:25:10 INFO yarn.AppMaster: check for completion
18/12/11 16:25:11 INFO yarn.AppMaster: check for completion
18/12/11 16:25:11 INFO yarn.AppMaster: check for completion
18/12/11 16:25:12 INFO yarn.AppMaster: check for completion
.
.
.
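To make the two logging symptoms in the stderr excerpt above easier to spot in longer logs, here is a minimal sketch, assuming Python 3 is available on the node. The helper name `summarize_logging_warnings` is hypothetical (it is not part of PGX or YARN), and it only pattern-matches the SLF4J and StatusLogger lines quoted above; it does not diagnose why the configuration file was not picked up.

```python
import re

# Matches the "Found binding in [jar:file:<path>!/...]" lines that SLF4J prints.
BINDING_RE = re.compile(r"SLF4J: Found binding in \[jar:file:([^!\]]+)!")
# Text of the Log4j2 StatusLogger message quoted in the stderr log.
NO_CONFIG_MARKER = "No log4j2 configuration file found"


def summarize_logging_warnings(stderr_text):
    """Return the jars that ship an SLF4J binding and whether Log4j2 fell back to defaults."""
    jars = BINDING_RE.findall(stderr_text)
    return {
        "binding_jars": jars,
        "multiple_bindings": len(jars) > 1,
        "log4j2_config_missing": NO_CONFIG_MARKER in stderr_text,
    }
```

Fed the stderr text above, this should list the CDH `slf4j-log4j12-1.7.5.jar` and `pgx-yarn-2.7.1.jar` as the two competing bindings and flag the missing log4j2 configuration.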


This is the farthest we've managed to go.



We can start our work now! That's really exciting.
Now I know how to properly start a service and how to preload, insert, and manage data, so we will import our existing graph database into it and do some experimentation.



It would be lovely to have this running on YARN in production.



Thank you all for the extreme dedication and attention.






    2 Answers
    By default, PGX has a base path of /pgx, which means you should connect as follows:



    $PGX_HOME/bin/pgx --base_url http://bda1node06:7007/pgx
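One way to check which base path the server actually answers on is to request the version endpoint under each candidate. This is a minimal sketch, assuming Python 3 on a machine that can reach the node. The `/version?extendedInfo=true` path is taken from the client's error message; the helper names `candidate_version_urls` and `probe` are hypothetical and not part of PGX.

```python
import urllib.error
import urllib.request


def candidate_version_urls(host_url, base_paths=("", "/pgx")):
    """Build the version-check URL for each candidate base path."""
    root = host_url.rstrip("/")
    return [f"{root}{path}/version?extendedInfo=true" for path in base_paths]


def probe(url, timeout=5):
    """Return the HTTP status code for url, or None if the host cannot be reached."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404.
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, or timeout.
        return None
```

Calling `probe(url)` for each entry of `candidate_version_urls("http://bda1node06:7007")` shows whether either path returns 200. If both return 404, the web application is probably not deployed under either path, and the base URL is not the problem.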




























    • I get a similar response: $PGX_HOME/bin/pgx --base_url bda1node06:7007/pgx java.util.concurrent.ExecutionException: java.lang.IllegalStateException: cannot connect to server; requested bda1node06:7007/pgx/version?extendedInfo=true and expected status 200, got 404 instead; response body = ""

      – Samamba
      Nov 13 '18 at 17:07















    answered Nov 13 '18 at 16:39

    Martijn
































































        0












        0








        0







        Here is a short follow-up.



        We've managed to start a PGX server and manipulate an HBase graph! :D



        PGX "Hello World"



        We wrote a short script that inserts vertices and edges, creates a PGX session, and runs a simple example:



        // Connect to the HBase-backed property graph named 'sinapse'
        cfg = GraphConfigBuilder.forPropertyGraphHbase().setName('sinapse').setZkQuorum('bda1node05').build()
        opg = OraclePropertyGraph.getInstance(cfg)

        // Three vertices, each with a 'nome' (name) property
        a = opg.addVertex()
        a.setProperty('nome', 'Felipe')

        b = opg.addVertex()
        b.setProperty('nome', 'Rhenan')

        c = opg.addVertex()
        c.setProperty('nome', 'Hugo')

        // Edge labels: 'Pai de' = 'father of', 'Avo de' = 'grandfather of'
        opg.addEdge(a, b, 'Pai de')
        opg.addEdge(b, c, 'Pai de')
        opg.addEdge(a, c, 'Avo de')

        opg.commit()

        // Load the graph into PGX and run a built-in algorithm on it
        session = Pgx.createSession('sinapsepgx')
        analyst = session.createAnalyst()
        pgxGraph = session.readGraphWithProperties(opg.getConfig(), true)
        analyst.countTriangles(pgxGraph, true)


        And that worked just fine!
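As a sanity check on what that last call should report, here is a minimal plain-Java sketch that counts triangles in the same three-vertex graph, treating the edges as undirected. It is not PGX's implementation, and the class and method names (`TriangleCountSketch`, `countTriangles`) are made up for this illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class TriangleCountSketch {

    // Counts triangles in an undirected graph given as {from, to} pairs.
    // Each triangle is counted once by only visiting vertex triples in sorted order.
    static int countTriangles(String[][] edges) {
        Map<String, Set<String>> adj = new HashMap<>();
        for (String[] e : edges) {
            adj.computeIfAbsent(e[0], k -> new HashSet<>()).add(e[1]);
            adj.computeIfAbsent(e[1], k -> new HashSet<>()).add(e[0]);
        }
        int count = 0;
        for (String u : adj.keySet()) {
            for (String v : adj.get(u)) {
                if (u.compareTo(v) >= 0) continue;      // enforce u < v
                for (String w : adj.get(v)) {
                    if (v.compareTo(w) >= 0) continue;  // enforce v < w
                    if (adj.get(u).contains(w)) count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Same vertices and edges as the script above (labels omitted).
        String[][] edges = {
            {"Felipe", "Rhenan"},
            {"Rhenan", "Hugo"},
            {"Felipe", "Hugo"}
        };
        System.out.println("triangles = " + countTriangles(edges));
    }
}
```

The three edges close exactly one triangle, so a count of 1 is what one would expect if the graph was loaded completely.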



        Client - Server architecture



        Next, we moved to client/server mode by running the start-server script.
        That worked fine too!
        These are our config files:



        server.conf




        {
          "port": 7007,
          "enable_tls": false,
          "enable_client_authentication": false
        }



        pgx.conf




        {
          "allow_idle_timeout_overwrite": true,
          "allow_local_filesystem": false,
          "allow_task_timeout_overwrite": true,
          "enable_gm_compiler": true,
          "enterprise_scheduler_config": {
            "analysis_task_config": {
              "priority": "MEDIUM",
              "weight": 12,
              "max_threads": 12
            },
            "fast_analysis_task_config": {
              "priority": "HIGH",
              "weight": 1,
              "max_threads": 12
            },
            "num_io_threads_per_task": 12
          },
          "preload_graphs": [
            {
              "path": "graphs/sinapse_conf.json",
              "name": "sinapse"
            }
          ],
          "max_active_sessions": 1024,
          "max_queue_size_per_session": -1,
          "max_snapshot_count": 0,
          "memory_cleanup_interval": 600,
          "path_to_gm_compiler": null,
          "release_memory_threshold": 0.85,
          "session_idle_timeout_secs": 0,
          "session_task_timeout_secs": 0,
          "strict_mode": true,
          "tmp_dir": "/tmp"
        }



        sinapse_conf.json




        {
          "edge_props": [
            {
              "name": "relacao",
              "type": "string"
            }
          ],
          "db_engine": "HBASE",
          "vertex_props": [
            {
              "name": "nome",
              "type": "string"
            },
            {
              "name": "cpf",
              "type": "string"
            }
          ],
          "format": "pg",
          "name": "sinapse",
          "error_handling": {},
          "vertex_id_type": "long",
          "attributes": {},
          "loading": {},
          "zk_quorum": "bda1node05,bda1node06,bda1node07"
        }




        The start-server script ran fine with these files and preloaded our HBase graph. It works like a charm.



        We then connected to the server using the PGX client:



        ./bin/pgx -b http://localhost:7007


        We managed to do the same things we had done in the Groovy shell.
        That's awesome.



        PGX on Yarn



        Now we are back to our original challenge: running and managing PGX on Yarn.



        We copied our pgx.conf file to HDFS, like this:



        hdfs://mpmapas-ns/user/pgx/pgx.conf




        {
          "allow_idle_timeout_overwrite": true,
          "allow_local_filesystem": false,
          "allow_task_timeout_overwrite": true,
          "enable_gm_compiler": true,
          "enterprise_scheduler_config": {
            "analysis_task_config": {
              "priority": "MEDIUM",
              "weight": 12,
              "max_threads": 12
            },
            "fast_analysis_task_config": {
              "priority": "HIGH",
              "weight": 1,
              "max_threads": 12
            },
            "num_io_threads_per_task": 12
          },
          "preload_graphs": [
            {
              "path": "graphs/sinapse_conf.json",
              "name": "sinapse"
            }
          ],
          "max_active_sessions": 1024,
          "max_queue_size_per_session": -1,
          "max_snapshot_count": 0,
          "memory_cleanup_interval": 600,
          "path_to_gm_compiler": null,
          "release_memory_threshold": 0.85,
          "session_idle_timeout_secs": 0,
          "session_task_timeout_secs": 0,
          "strict_mode": true,
          "tmp_dir": "/tmp"
        }



        /opt/oracle/oracle-spatial-graph/property_graph/pgx/yarn/conf/yarn.conf




        {
          "pgx_yarn_jar_hdfs_path": "hdfs://mpmapas-ns/user/pgx/pgx-yarn-2.7.1.jar",
          "pgx_war_hdfs_path": "hdfs://mpmapas-ns/user/pgx/pgx-webapp-2.7.1.war",
          "pgx_conf_hdfs_path": "hdfs://mpmapas-ns/user/pgx/pgx.conf",
          "pgx_log4j_conf_hdfs_path": "hdfs://mpmapas-ns/user/pgx/log4j2.xml",
          "pgx_dist_log4j_conf_hdfs_path": "hdfs://mpmapas-ns/user/pgx/dist_log4j.xml",
          "pgx_cluster_host_hdfs_path": "hdfs://mpmapas-ns/user/pgx/cluster-host.tgz",
          "zookeeper_connect_string": "bda1node05.pgj.rj.gov.br,bda1node06.pgj.rj.gov.br,bda1node07.pgj.rj.gov.br",
          "standard_library_path": "/usr/lib64/gcc/4.8.2",
          "min_heap_size": "512m",
          "max_heap_size": "12g",
          "container_cores": 9,
          "container_memory": 0,
          "container_priority": 0,
          "num_machines": 1
        }
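The HDFS paths in this thread appear in more than one shape (`hdfs:/user/...` in the question, `hdfs://mpmapas-ns/user/...` here), so it can help to see how a generic URI parser splits them. The following is a minimal sketch using only `java.net.URI`; the class name `HdfsUriParts` is made up for this example, and how Hadoop itself resolves each form is not shown here.

```java
import java.net.URI;

final class HdfsUriParts {

    // Splits a URI string into scheme, authority (the name service or host), and path.
    static String describe(String raw) {
        URI uri = URI.create(raw);
        return "scheme=" + uri.getScheme()
                + " authority=" + uri.getAuthority()
                + " path=" + uri.getPath();
    }

    public static void main(String[] args) {
        String[] samples = {
            "hdfs://mpmapas-ns/user/pgx/pgx.conf", // name service given explicitly
            "hdfs:/user/pgx/pgx.conf",             // no authority at all
            "hdfs://user/pgx/pgx.conf"             // "user" is parsed as the authority
        };
        for (String sample : samples) {
            System.out.println(sample + " => " + describe(sample));
        }
    }
}
```

Note that with two slashes and no name service, the first path segment is taken as the authority and drops out of the path, which is an easy mistake to make when typing these paths by hand.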



        Also, @albert recommended that we remove log4j2.xml from the server/shared-mem/pgx-webapp-2.7.1.war file, so that log4j logging is controlled only by the file placed in our HDFS folder.



        So we unpacked the WAR file, removed log4j2.xml, repacked it, and edited the log4j2.xml file on HDFS like this:



        hdfs://mpmapas-ns/user/pgx/log4j2.xml



        <?xml version="1.0" encoding="UTF-8"?>
        <Configuration status="WARN">
          <Appenders>
            <Console name="Console" target="SYSTEM_OUT">
              <PatternLayout pattern="%d{HH:mm:ss,SSS} %p %C{1} - %m%n"/>
            </Console>
            <File name="LogFile" fileName="file:/tmp/pg_trace.log">
              <PatternLayout pattern="%d{HH:mm:ss.SSS} [%t] %-5level %logger{36} - %msg%n"/>
            </File>
          </Appenders>
          <Loggers>
            <Root level="debug">
              <AppenderRef ref="LogFile"/>
            </Root>
            <Logger name="oracle.pgx.engine.admin.Ctrl" level="debug">
              <AppenderRef ref="LogFile"/>
            </Logger>
            <Logger name="pgx.dist.cluster_host" level="debug">
              <AppenderRef ref="LogFile"/>
            </Logger>
          </Loggers>
        </Configuration>


        Finally, we ran the Yarn start command:



        yarn jar yarn/pgx-yarn-2.7.1.jar yarn/conf/yarn.conf


        The tail of the log file looks really promising:



        18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
        18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
        18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux
        18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Client environment:os.arch=amd64
        18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Client environment:os.version=4.1.12-124.14.1.el7uek.x86_64
        18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Client environment:user.name=root
        18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Client environment:user.home=/root
        18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Client environment:user.dir=/opt/oracle/oracle-spatial-graph/property_graph/pgx
        18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=bda1node05.pgj.rj.gov.br,bda1node06.pgj.rj.gov.br,bda1node07.pgj.rj.gov.br sessionTimeout=10000 watcher=oracle.pgx.yarn.ClientZkClient@32da97fd
        18/12/11 16:25:03 INFO zookeeper.ClientCnxn: Opening socket connection to server bda1node07.pgj.rj.gov.br/192.168.8.7:2181. Will not attempt to authenticate using SASL (unknown error)
        18/12/11 16:25:03 INFO zookeeper.ClientCnxn: Socket connection established, initiating session, client: /192.168.8.5:33299, server: bda1node07.pgj.rj.gov.br/192.168.8.7:2181
        18/12/11 16:25:03 INFO zookeeper.ClientCnxn: Session establishment complete on server bda1node07.pgj.rj.gov.br/192.168.8.7:2181, sessionid = 0x3668759ae4553df, negotiated timeout = 10000
        18/12/11 16:25:05 INFO yarn.StartService: waiting for PGX service (yarn appId == 'application_1539869144089_2555') to come up ...
        18/12/11 16:25:10 INFO yarn.StartService: retrieved PGX host: http://bda1node07.pgj.rj.gov.br:7007
        18/12/11 16:25:10 INFO yarn.StartService: to connect a remote shell to this host, run '$PGX_HOME/bin/pgx --base_url http://bda1node07.pgj.rj.gov.br:7007'
        18/12/11 16:25:10 INFO yarn.StartService: to shut the PGX service down, run 'yarn application -kill application_1539869144089_2555'
        18/12/11 16:25:10 INFO zookeeper.ZooKeeper: Session: 0x3668759ae4553df closed
        18/12/11 16:25:10 INFO zookeeper.ClientCnxn: EventThread shut down


        But connecting to it still returns 404 ;(
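To reproduce the 404 without the PGX client in the way, one option is to request the same endpoint the client asked for in the error message from the question (`/version?extendedInfo=true`). The following is a minimal sketch using only the JDK's `HttpURLConnection`; the class name `PgxVersionProbe`, its method names, and the 5-second timeouts are assumptions made for this example.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

final class PgxVersionProbe {

    // Builds <base>/version?extendedInfo=true, tolerating one trailing slash on the base URL.
    static String versionUrl(String baseUrl) {
        String trimmed = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        return trimmed + "/version?extendedInfo=true";
    }

    // Sends a plain GET and returns the HTTP status code.
    static int status(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(5000);
        conn.setReadTimeout(5000);
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass the base URL printed by yarn.StartService as the first argument.
        String base = args.length > 0 ? args[0] : "http://localhost:7007";
        String url = versionUrl(base);
        System.out.println(url + " -> HTTP " + status(url));
    }
}
```

Running it against both the standalone server (`http://localhost:7007`) and the Yarn-launched one makes it easy to compare the two status codes side by side.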



        The last piece of information I can give you is the Yarn stderr log, which also shows that log4j is not being configured correctly:



        SLF4J: Class path contains multiple SLF4J bindings.
        SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-5.14.2-1.cdh5.14.2.p0.3/jars/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
        SLF4J: Found binding in [jar:file:/u09/hadoop/yarn/nm/filecache/890/pgx-yarn-2.7.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
        SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
        SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
        ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the console. Set system property 'log4j2.debug' to show Log4j2 internal initialization logging.
        18/12/11 16:25:06 INFO yarn.AppMaster: register app
        18/12/11 16:25:06 INFO yarn.AppMaster: RM response = [queue=root.users.root,maxCap=<memory:65536, vCores:9>]
        18/12/11 16:25:06 INFO yarn.AppMaster: max capability of cluster: <memory:65536, vCores:9>
        18/12/11 16:25:06 INFO yarn.AppMaster: attempting to allocate 1 containers
        18/12/11 16:25:06 INFO yarn.AppMaster: attempt 1: got 0 containers. Available: <memory:194560, vCores:180>
        18/12/11 16:25:06 INFO yarn.AppMaster: attempt 2: got 0 containers. Available: <memory:194560, vCores:180>
        18/12/11 16:25:06 INFO yarn.AppMaster: attempt 3: got 1 containers. Available: <memory:129024, vCores:171>
        18/12/11 16:25:06 INFO yarn.AppMaster: copy hdfs://mpmapas-ns/user/pgx/pgx-yarn-2.7.1.jar into pgx-yarn.jar
        18/12/11 16:25:06 INFO yarn.AppMaster: copy hdfs://mpmapas-ns/user/pgx/pgx-webapp-2.7.1.war into pgx-server.war
        18/12/11 16:25:06 INFO yarn.AppMaster: copy hdfs://mpmapas-ns/user/pgx/pgx.conf into conf/pgx.conf
        18/12/11 16:25:06 INFO yarn.AppMaster: copy hdfs://mpmapas-ns/user/pgx/log4j2.xml into conf/log4j2.xml
        18/12/11 16:25:07 INFO yarn.AppMaster: server env = CLASSPATH=conf:pgx-server/WEB-INF/lib/*:pgx-yarn.jar:$HADOOP_CONF_DIR
        18/12/11 16:25:07 INFO yarn.AppMaster: server command = $JAVA_HOME/bin/java -Xms512m -Xmx12g oracle.pgx.yarn.PgxService bda1node07.pgj.rj.gov.br $PWD/pgx-server.war 7007 bda1node05.pgj.rj.gov.br,bda1node06.pgj.rj.gov.br,bda1node07.pgj.rj.gov.br /pgx-37a121ce-e028-432c-8761-104027126c3b 1><LOG_DIR>/stdout 2><LOG_DIR>/stderr;
        18/12/11 16:25:07 INFO yarn.AppMaster: check for completion
        18/12/11 16:25:08 INFO yarn.AppMaster: check for completion
        18/12/11 16:25:08 INFO yarn.AppMaster: check for completion
        18/12/11 16:25:09 INFO yarn.AppMaster: check for completion
        18/12/11 16:25:09 INFO yarn.AppMaster: check for completion
        18/12/11 16:25:10 INFO yarn.AppMaster: check for completion
        18/12/11 16:25:10 INFO yarn.AppMaster: check for completion
        18/12/11 16:25:11 INFO yarn.AppMaster: check for completion
        18/12/11 16:25:11 INFO yarn.AppMaster: check for completion
        18/12/11 16:25:12 INFO yarn.AppMaster: check for completion
        .
        .
        .
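The stderr above says no log4j2 configuration file was found, even though the AppMaster reports copying log4j2.xml into conf/ and lists conf first in CLASSPATH. Log4j 2 looks for a file named log4j2.xml on the classpath by default, so one way to narrow this down is to ask the class loader directly. This is a minimal sketch under the assumption that it is launched with the same classpath as the server process; the class name `ClasspathResourceCheck` is made up for this example.

```java
import java.net.URL;

final class ClasspathResourceCheck {

    // Returns the location of a classpath resource, or null when it is not visible.
    static String locate(String resourceName) {
        URL url = ClassLoader.getSystemResource(resourceName);
        return url == null ? null : url.toString();
    }

    public static void main(String[] args) {
        // Default resource name is log4j2.xml; another name can be passed as an argument.
        String name = args.length > 0 ? args[0] : "log4j2.xml";
        String location = locate(name);
        System.out.println(location == null
                ? name + " is NOT visible on the classpath"
                : name + " found at " + location);
    }
}
```

If the file is reported as not visible, the conf directory is probably not where the classpath entry expects it relative to the container's working directory; that would explain the log4j message, though not necessarily the 404.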


        This is the farthest we've managed to go.



        We can start our work now! That's really exciting.
        Now I know how to properly start a service and how to preload, insert, and manage data. We will import our existing graph database into it and do some experimentation.



        It would be lovely to have this running on Yarn at the production level.



        Thank you all for the extreme dedication and attention.
















        answered Dec 11 '18 at 18:32



        Samamba


























