Tableau Graceful Restarter Application: PET Restart

Last week I had the opportunity to show some of our solutions at ISATUG (International Server Administration Tableau User Group). One of my topics was graceful restart, because, yes, I still find it so useful that I want to share this goodie as much as I can. And to share something that actually works in practice, I wrote a small plugin for our very best Palette Enterprise Tabadmin distribution (we call it pet). It works standalone and performs the graceful restart magic that I explained in Graceful Restart Part1 and Part2.

The Video with the Star (uhm, yes, with me)

This video shows the basic use case: we have a nice, simple server, and we’d just like to switch the VizQL processes to debug mode to investigate some interesting behaviour. Previously this wasn’t possible without a full Tableau Server restart, but with this free, open source tool you can do it without interrupting your users.

Now, let’s see how you can get it and what it can do for you.

Where can I get it?

The project’s GitHub page is https://github.com/palette-software/pet-restart but if you are not ready to build it from source, just grab the binary jar file from https://github.com/palette-software/pet-restart/releases. You need Java 8 to run it but hey, this is 2016, you should have Java 8 already anyway.

How can I use it?

Enable Balancer Manager

First of all, you need to fix / enable the balancer-manager portal in httpd.conf.templ, located in (installdir)\Tableau\Tableau Server\(version number)\templates\. Even though it seems to be enabled by default, don’t be fooled. You need to change this part:

<Location /balancer-manager>
SetHandler balancer-manager
Require host 127.0.0.1
</Location>

to this:

<Location /balancer-manager>
SetHandler balancer-manager
<RequireAny>
Require ip ::1
Require ip 127.0.0.1
</RequireAny>
</Location>

As I wrote in this article, you always need to issue a tabadmin configure and a restart after you change a template. In fact, after tabadmin restart it’s enough to kill the Apache process to use this new configuration.

Now, if you navigate to your Tableau Server’s /balancer-manager URL, you should see the balancer manager portal, where you can control your worker routes.
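
What the portal does by hand can also be scripted. Here is a minimal Java sketch that only builds the form body for switching a worker to drain mode. It assumes the Apache 2.4 balancer-manager form fields (b, w, nonce, w_status_N); those names come from mod_proxy_balancer, not from pet-restart, so check them against your Apache version. The nonce value is a placeholder: the real one has to be read from the balancer-manager page first.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

final class BalancerDrainForm {

    // URL-encode one form value as UTF-8 (Java 8 compatible signature).
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    // Build the application/x-www-form-urlencoded body to POST to /balancer-manager.
    // Field names are an assumption based on Apache 2.4's mod_proxy_balancer.
    static String drainForm(String balancer, String workerUrl, String nonce, boolean drain) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("b", balancer);                   // balancer name, e.g. vizqlserver-cluster
        fields.put("w", workerUrl);                  // worker route, e.g. http://localhost:8600
        fields.put("nonce", nonce);                  // anti-CSRF token taken from the portal page
        fields.put("w_status_N", drain ? "1" : "0"); // 1 = draining, 0 = back to normal

        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(encode(field.getKey())).append('=').append(encode(field.getValue()));
        }
        return body.toString();
    }

    public static void main(String[] args) {
        // "placeholder-nonce" is hypothetical; a real nonce must come from the portal.
        System.out.println(drainForm("vizqlserver-cluster", "http://localhost:8600",
                "placeholder-nonce", true));
    }
}
```

Keeping the form construction separate from the HTTP call makes it easy to inspect the request before sending anything to a live gateway.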

Prerequisites

You need JMX, since the tool reads the number of active connections from the JMX ports. It can be enabled with:

tabadmin set service.jmx_enabled true

The whole process and its impact are described here: https://onlinehelp.tableau.com/current/server/en-us/ports_jmx.htm
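
To check that JMX is really reachable before running the tool, a small probe along these lines may help. It is a sketch using the standard javax.management.remote API, not the code pet-restart itself uses. The port 9400 in main is only a placeholder (look up the real JMX ports on the page linked above), and the MBean and attribute names are parameters because they are not given here.

```java
import java.net.MalformedURLException;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

final class JmxProbe {

    // Standard RMI-based JMX service URL for a given host and port.
    static JMXServiceURL serviceUrl(String host, int port) {
        try {
            return new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi");
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Bad JMX host or port", e);
        }
    }

    // Connect, read a single attribute from one MBean, and close the connection.
    // The caller supplies the MBean and attribute names; none are assumed here.
    static Object readAttribute(String host, int port, String mbean, String attribute)
            throws Exception {
        try (JMXConnector connector = JMXConnectorFactory.connect(serviceUrl(host, port))) {
            MBeanServerConnection connection = connector.getMBeanServerConnection();
            return connection.getAttribute(new ObjectName(mbean), attribute);
        }
    }

    public static void main(String[] args) {
        // 9400 is a placeholder port, not a value taken from this article.
        System.out.println(serviceUrl("localhost", 9400));
    }
}
```

If JMX is disabled, the connect call fails with a connection error much like the one in the simulation output further down.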

Restart processes

The process is the same as described in Graceful Restart part2. In the case of web apps, we put the worker in drain mode and wait until JMX reports no active sessions. When that happens, we just restart the worker process using the new configuration values. For other processes like Redis (cache), Apache (gateway) or Postgres (repository), we use other graceful reload methods, as described in part1.
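
The "wait until JMX reports no active sessions" step can be sketched as a simple polling loop. This is an illustration of the idea, not pet-restart's actual implementation: the session counter is passed in as a supplier (in real life it would wrap a JMX read), and the poll count and interval are hypothetical parameters.

```java
import java.util.function.IntSupplier;

final class DrainWaiter {

    // Poll the session counter until it reaches zero or we run out of attempts.
    // Returns true if the worker became idle, false on timeout or interruption.
    static boolean waitUntilIdle(IntSupplier activeSessions, int maxPolls, long pollMillis) {
        for (int attempt = 1; attempt <= maxPolls; attempt++) {
            if (activeSessions.getAsInt() <= 0) {
                return true; // no sessions left: safe to restart the worker
            }
            if (attempt < maxPolls) {
                try {
                    Thread.sleep(pollMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt flag
                    return false;
                }
            }
        }
        return false; // still busy after maxPolls checks
    }

    public static void main(String[] args) {
        // Fake counter that loses one session per poll, standing in for a JMX read.
        int[] sessions = {3};
        boolean idle = waitUntilIdle(() -> sessions[0]--, 10, 100L);
        System.out.println(idle ? "Worker drained, restarting" : "Timed out while draining");
    }
}
```

A caller would switch the worker to drain mode first, run this loop, restart the process, and then switch the worker back to its normal state, which is the order shown in the simulation log below.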

Test the tool in simulation mode first. Simulation doesn’t restart any of the processes; it only goes through all the steps that precede an actual restart, so running one before issuing any restart command helps you avoid failures. An example where JMX is disabled on vizqlserver:

C:\Users\palette\Java\pet-restart>java -jar target/pet-restart-1.1-SNAPSHOT.jar -s -r
Running simulation.
Restarting Repository
Restarting Cache Server(s)
There are 2 ports
Restarting Cache server at port 6379
Restarting Cache server at port 6380
Locating local-vizportal workers from balancer-manager
vizqlserver null http://localhost:8600
vizqlserver null http://localhost:8601
Restarting worker
Switching worker to Draining mode
Sending stop signal to process 164764
Switch worker to Non-disabled mode
Restart complete
Restarting worker
Switching worker to Draining mode
Sending stop signal to process 164764
Switch worker to Non-disabled mode
Restart complete
Locating vizqlserver-cluster workers from balancer-manager
JMX connection error.
Retrying after 60 seconds...
JMX connection error.
Retrying after 60 seconds...
JMX connection error.
Retrying after 60 seconds...
Failed to retrieve RMIServer stub: javax.naming.ServiceUnavailableException [Root exception is java.rmi.ConnectException: Connection refused to host: 192.168.224.137; nested exception is:
java.net.ConnectException: Connection refused: connect]

If all is good, you can start restarting your services. To list all command line options, just start the tool without arguments:

java -jar pet-restart-1.0.jar  

Gracefully restart VizQL Workers:

java -jar pet-restart-1.0.jar -rv

Non-gracefully restart VizQL Workers as fast as possible:

java -jar pet-restart-1.0.jar -rv -f --wait 1

Reload the Repository’s configuration file:

java -jar pet-restart-1.0.jar -pg

For more options check the docs here: https://github.com/palette-software/pet-restart#switches 

Limitations / TODOs

In its current form, pet-restart only works on a single-node configuration. Don’t panic, we’re working to make it multi-node compatible, using Tomcat shutdown ports instead of local process kills.