Tableau Graceful Restarter Application: PET Restart
Last week I had the opportunity to show some of our solutions at ISATUG (International Server Administration Tableau User Group). One of my topics was graceful restart because, yes, I still find it so useful that I want to share this goodie as much as I can. And to share something that actually works in practice, I wrote a small plugin for our very best Palette Enterprise Tabadmin (we call it pet) distribution. It works standalone and can do the graceful restart magic that I explained in Graceful Restart Part 1 and Part 2.
The Video with the Star (uhm, yes, with me)
This video shows the basic use case: we have a nice, simple server, and we’d just like to switch the vizql processes to debug mode to investigate some interesting behaviour. Previously this wasn’t possible without a full Tableau Server restart, but with this free, open source tool you can do it without interrupting your users.
Now, let’s see how you can get it and what it can do for you.
Where can I get it?
The project’s GitHub page is https://github.com/palette-software/pet-restart, but if you are not ready to build it from source, just grab the binary jar file from https://github.com/palette-software/pet-restart/releases. You need Java 8 to run it, but hey, this is 2016, you should have Java 8 already anyway.
How can I use it?
Enable Balancer Manager
First of all, you need to fix / enable the balancer-manager portal in httpd.conf.templ, located in (installdir)\Tableau\Tableau Server\(version number)\templates\ . Even though it seems to be enabled by default, don’t be fooled; you need to change this part:
<Location /balancer-manager>
    SetHandler balancer-manager
    Require host 127.0.0.1
</Location>
to this:
<Location /balancer-manager>
    SetHandler balancer-manager
    <RequireAny>
        Require ip ::1
        Require ip 127.0.0.1
    </RequireAny>
</Location>
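If you have to make this change on more than one machine, it can be scripted. The following is only a minimal sketch: the function name patch_balancer_manager is mine, and it assumes the old rule sits on its own line inside the balancer-manager block, which may not match your template exactly. It works on the template text only, so reading and writing the file (and keeping a backup) is left to you.

```python
import re

# The single directive that has to go, with its indentation and an optional
# carriage return captured so the replacement keeps the file's style.
OLD_RULE = re.compile(
    r"^([ \t]*)Require host 127\.0\.0\.1[ \t]*(\r?)$", re.MULTILINE
)

# Only the balancer-manager block is touched; other <Location> blocks are left alone.
LOCATION_BLOCK = re.compile(
    r"(<Location /balancer-manager>)(.*?)(</Location>)", re.DOTALL
)


def patch_balancer_manager(template_text: str) -> str:
    """Return template_text with the balancer-manager access rule replaced.

    If the old 'Require host 127.0.0.1' line is not found inside the
    balancer-manager block, the text is returned unchanged.
    """

    def expand(rule):
        indent, cr = rule.group(1), rule.group(2)
        lines = [
            indent + "<RequireAny>",
            indent + "  Require ip ::1",
            indent + "  Require ip 127.0.0.1",
            indent + "</RequireAny>",
        ]
        return (cr + "\n").join(lines) + cr

    def fix_block(match):
        opening, body, closing = match.groups()
        return opening + OLD_RULE.sub(expand, body) + closing

    return LOCATION_BLOCK.sub(fix_block, template_text)
```

Running it twice is harmless, because the second pass no longer finds the old rule.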
As I wrote in this article, you always need to issue a tabadmin configure and a restart after you change a template. In practice, though, once tabadmin configure has run it’s enough to kill the Apache process to pick up the new configuration.
Now, if you navigate to your Tableau Server’s /balancer-manager URL, you should see the balancer manager portal, where you can control your worker routes.
You also need JMX, since the tool reads the number of active connections from the JMX ports. It can be enabled with:
tabadmin set service.jmx_enabled true
The whole process and its impact are described here: https://onlinehelp.tableau.com/current/server/en-us/ports_jmx.htm
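Before running the tool, it can be worth checking that the JMX ports actually accept connections. Here is a small sketch of such a check. The function name is made up for this example, and you need to look up the host and port for each process yourself. Note that it only tests whether the TCP port is open; it does not speak JMX.

```python
import socket


def jmx_port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable host names.
        return False
```

A False result here usually means JMX is not enabled for that process yet, or that a firewall is in the way.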
The process is the same as described in Graceful Restart Part 2. For web apps, we put the worker in drain mode and wait until JMX reports no active sessions. When that happens, we just restart the worker process, which then uses the new configuration values. For other processes like Redis (cache), Apache (gateway) or Postgres (repository) we use other graceful reload methods, as described in Part 1.
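The drain-and-wait logic for a single web app worker can be sketched roughly like this. This is not the tool's actual code, just an illustration of the idea: the three callables stand in for the balancer-manager call, the JMX query and the process restart, and the polling interval and timeout are assumptions of mine.

```python
import time
from typing import Callable


def graceful_restart(
    set_draining: Callable[[bool], None],
    active_sessions: Callable[[], int],
    restart_worker: Callable[[], None],
    poll_seconds: float = 5.0,
    max_wait_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Drain one worker, wait for its sessions to end, then restart it.

    Returns True if the worker had no active sessions when it was restarted,
    False if the restart happened because max_wait_seconds ran out.
    """
    set_draining(True)  # the balancer stops routing new sessions to this worker
    waited = 0.0
    drained = active_sessions() == 0
    while not drained and waited < max_wait_seconds:
        sleep(poll_seconds)
        waited += poll_seconds
        drained = active_sessions() == 0
    try:
        restart_worker()
    finally:
        set_draining(False)  # always put the worker back into rotation
    return drained
```

Passing the three steps in as functions keeps the loop easy to try out with fake workers before pointing it at a real server.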
It’s also advised to test the tool in simulation mode first. A simulation doesn’t restart any of the processes; it just goes through all the steps that come before an actual restart, so possible failures show up before you issue any real restart command. Here is an example where JMX is disabled on vizqlserver:
C:\Users\palette\Java\pet-restart>java -jar target/pet-restart-1.1-SNAPSHOT.jar -s -r
Running simulation.
Restarting Repository
Restarting Cache Server(s)
There are 2 ports
Restarting Cache server at port 6379
Restarting Cache server at port 6380
Locating local-vizportal workers from balancer-manager
vizqlserver null http://localhost:8600
vizqlserver null http://localhost:8601
Restarting worker
Switching worker to Draining mode
Sending stop signal to process 164764
Switch worker to Non-disabled mode
Restart complete
Restarting worker
Switching worker to Draining mode
Sending stop signal to process 164764
Switch worker to Non-disabled mode
Restart complete
Locating vizqlserver-cluster workers from balancer-manager
JMX connection error. Retrying after 60 seconds...
JMX connection error. Retrying after 60 seconds...
JMX connection error. Retrying after 60 seconds...
Failed to retrieve RMIServer stub: javax.naming.ServiceUnavailableException [Root exception is java.rmi.ConnectException: Connection refused to host: 192.168.224.137; nested exception is: java.net.ConnectException: Connection refused: connect]
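If you want to run the simulation from a script, for example as a check before a scheduled restart, a sketch like the following can flag the failure lines. The patterns are taken from the sample output above and are surely not a complete list of what the tool can print. The helper names and the jar path are placeholders, and since I make no assumption about which stream the tool writes to, both stdout and stderr are scanned.

```python
import re
import subprocess
from typing import List

# Failure markers seen in the sample simulation output; extend as needed.
FAILURE_PATTERNS = [
    re.compile(r"JMX connection error"),
    re.compile(r"Failed to retrieve RMIServer stub"),
    re.compile(r"Connection refused"),
]


def simulation_problems(output: str) -> List[str]:
    """Return the lines of simulation output that match a failure pattern."""
    return [
        line.strip()
        for line in output.splitlines()
        if any(pattern.search(line) for pattern in FAILURE_PATTERNS)
    ]


def run_simulation(jar_path: str) -> List[str]:
    """Run the tool with -s -r (simulated restart) and return suspicious lines."""
    result = subprocess.run(
        ["java", "-jar", jar_path, "-s", "-r"],
        capture_output=True,
        text=True,
    )
    return simulation_problems(result.stdout + "\n" + result.stderr)
```

An empty list from run_simulation does not prove that a real restart will succeed, but a non-empty one is a good reason to stop and look.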
If all is good, you can start restarting your services. All command line options are listed if you just start the tool without arguments:
java -jar pet-restart-1.0.jar
Gracefully restart VizQL Workers:
java -jar pet-restart-1.0.jar -rv
Non-gracefully restart VizQL Workers as fast as possible:
java -jar pet-restart-1.0.jar -rv -f --wait 1
Reload the Repository’s configuration file:
java -jar pet-restart-1.0.jar -pg
For more options check the docs here: https://github.com/palette-software/pet-restart#switches