Debug SlapOS Node

This document is a collection of tips and tricks for trying to understand issues when working with SlapOS and software deployed on it. Make sure you have read the SlapOS Basics and SlapOS Architecture design documents to get a basic understanding of how SlapOS is built and what the different components are doing.

It is expected you have command line access to the node or server you want to debug. The SlapOS core command line documentation (partially outdated) may also prove helpful.

Table of Contents

  • General Debugging
  • Specific Software Issues
  • Webrunner Debugging
 

General Debugging

This chapter contains information and basic commands to run when analysing a SlapOS node.

Installation Issues

Software installation does not always finish correctly. If you run into problems during installation and the error is not evident from the output you receive, you can archive the node logs like so:

$ sudo su
# tar czf slapos-log.tar.gz /opt/slapos/log/slapos-node-*

and from another terminal, call:

$ scp debian@xx.xx.xx.xxx:/home/debian/slapos-log.tar.gz ./

to download the log file to your local machine. The log will contain more information on what may have caused errors. If you cannot resolve the issue, get in touch on the Vifib Forum and ask for help.
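
If you collect logs often, the archiving step above can be wrapped in a small helper. The following is only a sketch: the function name collect_slapos_logs and the timestamped archive name are assumptions for illustration, not part of SlapOS.

```shell
# Sketch: bundle all slapos-node-* logs into one timestamped archive.
# Arguments (both optional, both assumptions): log directory, output directory.
collect_slapos_logs() {
    log_dir="${1:-/opt/slapos/log}"
    out_dir="${2:-.}"
    archive="$out_dir/slapos-log-$(date +%Y%m%d-%H%M%S).tar.gz"
    # Change into the log directory so the archive holds bare file names.
    ( cd "$log_dir" && tar czf - slapos-node-* ) > "$archive" || {
        rm -f "$archive"    # do not leave a broken archive behind
        return 1
    }
    # Print the archive path so it can be passed on to scp.
    echo "$archive"
}
```

Called without arguments from a root shell, it reads /opt/slapos/log and writes the archive to the current directory, printing the resulting file name.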

Where are Logs?

There are different logs which you can find in the log directory.

$ sudo su
# ls /opt/slapos/log/
slapos-node-collect.log   slapos-node-software.log   slapos-node-report.log   slapos-node-instance.log
slapos-node-format.log

The slapos-node-instance.log is the most important one, giving information on all partitions and processes run on an instance, while the slapos-node-software.log contains information on the status of the software being compiled and running on the instance.

To see which scheduled commands write into each of the logs, you can check the cron configuration:

# cat /etc/cron.d/slapos-node

To see the log itself:

# tail /opt/slapos/log/slapos-node-software.log
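
Because these logs can be long, it may help to look only at lines that resemble errors. The helper below is a minimal sketch: the name recent_errors and the search pattern are assumptions, so extend the pattern with whatever your software actually logs.

```shell
# Sketch: show the most recent lines of a log that look like errors.
# Arguments (optional): log file, number of matching lines to show.
recent_errors() {
    log_file="${1:-/opt/slapos/log/slapos-node-software.log}"
    count="${2:-20}"
    # -n prefixes each match with its line number in the log,
    # -i ignores case, -E enables the alternation in the pattern.
    grep -n -i -E 'error|traceback|failed' "$log_file" | tail -n "$count"
}
```

The line numbers in the output make it easier to open the full log at the right place afterwards.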

SlapOS Node Status

$ sudo su
# slapos node

To see the status of running processes call slapos node status or just slapos node (full list of commands).

# slapos node status
slappart8:bootstrap-monitor                EXITED    Mar 20 02:28 PM
slappart8:certificate_authority-on-watch   RUNNING   pid 12137, uptime 0:33:54
slappart8:crond-on-watch                   RUNNING   pid 12125, uptime 0:33:54
slappart8:frontend-apache-safe-graceful    EXITED    Mar 20 02:28 PM
slappart8:frontend-nginx-safe-graceful     EXITED    Mar 20 02:28 PM
slappart8:frontend_apache-on-watch         EXITED    Mar 20 02:28 PM
slappart8:frontend_nginx-on-watch          RUNNING   pid 12136, uptime 0:33:54
slappart8:monitor-httpd-graceful           EXITED    Mar 20 02:28 PM
slappart8:monitor-httpd-on-watch           RUNNING   pid 12128, uptime 0:33:54
slappart8:trafficserver-on-watch           RUNNING   pid 12134, uptime 0:33:54
slappart8:trafficserver-reload             EXITED    Mar 20 02:28 PM
slappart9:bootstrap-monitor                EXITED    Mar 20 02:29 PM
slappart9:certificate_authority-on-watch   RUNNING   pid 11866, uptime 0:36:16
slappart9:crond                            RUNNING   pid 11867, uptime 0:36:16
slappart9:monitor-httpd-graceful           EXITED    Mar 20 02:29 PM
slappart9:monitor-httpd-on-watch           RUNNING   pid 11865, uptime 0:36:16
watchdog                                   RUNNING   pid 24426, uptime 7 days, 21:46:29

In the above example you can see that the frontend_apache-on-watch process has exited. The on-watch processes are the long-running services, while the safe-graceful processes are only triggered when the configuration of a service is updated (for example via the XML parameter form of the service in the SlapOS Dashboard). This graceful mechanism is used to minimize downtime of the actual service.

To investigate the status of this or any other process on the list, you can use # slapos node tail slappart8:frontend_apache-on-watch.

To restart a service you can use slapos node restart slappart8:frontend_apache-on-watch.

To grep processes you can use

# slapos node | grep apache
slappart8:frontend-apache-safe-graceful    EXITED    Mar 20 02:38 PM
slappart8:frontend_apache-on-watch         RUNNING   pid 19230, uptime 0:07:10
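
Since EXITED is a normal state for the graceful helpers, a plain grep for EXITED is noisy. As a sketch, the filter below prints only on-watch services that are not RUNNING. It assumes the column layout of the status listing shown earlier (name, state, details), and the function name stopped_on_watch is made up for illustration.

```shell
# Sketch: list on-watch services that are not in the RUNNING state.
# Reads "slapos node status" output on standard input.
stopped_on_watch() {
    # Column 1 is the process name, column 2 its state.
    awk '$1 ~ /-on-watch$/ && $2 != "RUNNING" { print $1 }'
}
```

Usage from a root shell: slapos node status | stopped_on_watch. Services whose name does not end in -on-watch (such as slappart9:crond in the listing) are deliberately ignored by this sketch.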

SlapOS Instance/Software

$ sudo su
# slapos node [instance/software]

Using slapos node instance or slapos node software will give you information on their respective current status:

$ sudo su
# slapos node instance
2018-06-26 09:11:33 slapos[17326] INFO New slapos process started, but another 
  slapos process is aleady running with pid 15852, exiting.
# slapos node software
2018-06-26 09:11:50 slapos[17493] INFO Processing software releases...
2018-06-26 09:11:50 slapos[17493] INFO Finished software releases.

When the software is idle, as in the example above, there isn't much to see, but during installation the output is equivalent to the progress log entries seen inside a Webrunner when installing or instantiating.

Specific Software Issues

This section will cover common issues encountered when working with software available through SlapOS.

SlapOS Master

The following tips and tricks might be useful:

  • If SlapProxy does not start automatically, slapos proxy start >& /dev/null & can manually start it.
  • Use # erp5-show -s to see if compilation of SlapOS ERP5 has finished.

Software: Apache Frontend

When working with an Apache Frontend instance, any of the following are good points to start debugging:

  • What is the DNS status? See for example pingdom for info.
  • SSL certificate validity? See ssl-checker.
  • Socat Binding still working? Use # ps aux | grep "socat". If the binding is up, the listing includes a socat process; a line like the following is only the grep command itself:
    root      3234  0.0  0.0  12728  2180 pts/0    S+   09:02   0:00 grep socat
  • Frontend still running? Check # slapos node status and restart using slapos node restart slappart[x]:frontend_apache-on-watch
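
When checking for socat with ps and grep, the grep command itself always shows up in the listing, which makes the result easy to misread. A minimal sketch of a stricter filter follows; it assumes the usual ps aux layout in which the command is the 11th column, and the function name socat_processes is hypothetical.

```shell
# Sketch: keep only lines of "ps aux" output whose command is socat.
# Reads the ps listing on standard input.
socat_processes() {
    # Column 11 is the executable; accept "socat" or any path ending in /socat.
    awk '$11 ~ /(^|\/)socat$/'
}
```

Usage: ps aux | socat_processes. Empty output then means that no socat process is running.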

Webrunner Debugging

The Webrunner is both a development IDE for software (with editor, ssh access, service dashboard and logs) and a mini SlapOS Master hosting a single software instance. It is useful for deploying production systems because it provides resiliency, allows running isolated, patched instances of a software and gives ssh access to the underlying file system, as is often required.

When building software with Buildout in the Webrunner, you can run into a number of issues. This section provides debugging tips and tricks for them.

Software and Instance Logs

Debug SlapOS Node - Webrunner Interface - Webrunner Instance Logs

When building software, the left-hand window displays the software and instance logs. Each is accessible separately via the Logs menu, or through the Editor by clicking on "This project" in the subheader to switch to "Working Dir".

Note that the software log reports on the progress of Buildout running the software.cfg profile to build the software release. Errors during the build should therefore be found in the software profile. After the software has been built, the Webrunner tries to instantiate it using instance.cfg.in. Errors during instantiation are thus normally found in the instance profile.

Note also that every section in both the software.cfg and instance.cfg.in profiles should show up in the Webrunner's log. If a section is missing, you might have forgotten to add it to the parts.
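
To spot a section that was never added to the parts, you can compare the section headers of a profile with its parts list. The following is a rough sketch, not a Buildout parser: the function name sections_not_in_parts is hypothetical, it only reads one file, and it also prints sections that are legitimately installed through a ${section:option} reference or an extended profile, so treat its output as a hint.

```shell
# Sketch: print sections of a Buildout profile that are not listed in
# the parts option of its [buildout] section.
sections_not_in_parts() {
    awk '
        function add_parts(text,    words, count, i) {
            count = split(text, words, " ")
            for (i = 1; i <= count; i++) listed[words[i]] = 1
        }
        /^\[/ {                          # section header, e.g. [nginx]
            name = $0
            sub(/^\[/, "", name); sub(/\].*$/, "", name)
            section = name; in_parts = 0
            if (name != "buildout") order[++n] = name
            next
        }
        section == "buildout" && /^parts[ \t]*[+]?=/ {
            in_parts = 1                 # value may continue on indented lines
            line = $0; sub(/^[^=]*=/, "", line)
            add_parts(line); next
        }
        in_parts && /^[ \t]+[^ \t]/ { add_parts($0); next }
        /^[^ \t#;]/ { in_parts = 0 }     # next option ends the parts value
        END {
            for (i = 1; i <= n; i++)
                if (!(order[i] in listed)) print order[i]
        }
    ' "$1"
}
```

Usage: sections_not_in_parts software.cfg. Each printed name is a section that does not appear in the parts list of that file.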

Templates

Debug SlapOS Node - Webrunner Interface - Software Profile Templates

Templates in SlapOS can be rendered by Buildout or Jinja2. Jinja2 templates use double curly braces, e.g. {{ dict['parameter'] }}, while Buildout uses "$" and single curly braces, e.g. ${buildout:directory}. You will encounter $${:parameter} only when instance.cfg.in is rendered by software.cfg using Buildout, and only for variables of the instance itself.

Hidden .instance.cfg blocking Rebuilds

Debug SlapOS Node - Webrunner Interface - Hidden .instance.cfg

Once a build and instantiation have finished, you should find a completed .instance.cfg in the instance/slappart0/ directory. You can use the Terminal to access the file (or use shellinabox by going to https://[your_instance_url]/shellinabox/). Open the file and verify that no variables are left unreplaced, which can happen when $$ is used in the wrong place and can, for example, return $ instead of the variable value.
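
Rather than reading the whole file by eye, you can search it for template markers that should not have survived rendering. This is a sketch under assumptions: it treats a leftover $${, {{ or {% as suspicious, which may not cover every case in your profile, and the function name leftover_placeholders is made up. Plain ${section:option} references are not reported, since the rendered file is itself a Buildout profile.

```shell
# Sketch: report lines of a rendered instance profile that still contain
# escaped Buildout ($${) or Jinja2 ({{ and {%) markers.
leftover_placeholders() {
    cfg="${1:-instance/slappart0/.instance.cfg}"
    # -n prints the line number of each suspicious line.
    grep -n -E '\$\$\{|\{\{|\{%' "$cfg"
}
```

No output means that none of these markers were found in the file.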

If you have fixed your issues and the build keeps failing with the same message, remove the hidden .instance.cfg by hand and rebuild. Be careful, you should know what you are doing!
