Debugging Arvados deployed with Salt

As discussed on Gitter, I’m having some trouble getting workflows to run from the Workbench. I have a crunch-dispatch-local.service as follows:

########################################################################
# File managed by Salt at <salt://arvados/dispatcher/service/files/default/crunch-dispatch-local-service.tmpl>.
# Your changes will be overwritten.
########################################################################
[Unit]
Description=Arvados Crunch Dispatcher for LOCAL service
Documentation=https://doc.arvados.org/
After=network.target

# systemd==229 (ubuntu:xenial) obeys StartLimitInterval in the [Unit] section
StartLimitInterval=0

# systemd>=230 (debian:9) obeys StartLimitIntervalSec in the [Unit] section
StartLimitIntervalSec=0

[Service]
Type=simple
EnvironmentFile=-/etc/arvados/environment
ExecStart=/usr/bin/crunch-dispatch-local -poll-interval=1 -crunch-run-command=/usr/bin/crunch-run
# Set a reasonable default for the open file limit
LimitNOFILE=65536
Restart=always
RestartSec=1
LimitNOFILE=1000000

# systemd<=219 (centos:7, debian:8, ubuntu:trusty) obeys StartLimitInterval in the [Service] section
StartLimitInterval=0

[Install]
WantedBy=multi-user.target

and /etc/arvados/environment as follows:

ARVADOS_API_HOST=sanbi.arvados.sanbi.ac.za
ARVADOS_API_HOST_INSECURE=1
ARVADOS_API_TOKEN=changeme_system_root_token 

and the error that keeps cycling in my syslog is:

Nov 18 18:24:06 arvados systemd[1]: Started Arvados Crunch Dispatcher for LOCAL service.
Nov 18 18:24:06 arvados crunch-dispatch-local[9796]: {"level":"info","msg":"crunch-dispatch-local 2.1.0 started","time":"2020-11-18T18:24:06.668847709Z"}
Nov 18 18:24:06 arvados arvados-controller[3028]: {"PID":3028,"RequestID":"req-4khgiez4l7ozjrnwoeef","level":"info","msg":"request","remoteAddr":"127.0.0.1:45358","reqBytes":0,"reqForwardedFor":"127.0.0.1","reqHost":"sanbi.arvados.sanbi.ac.za","reqMethod":"GET","reqPath":"arvados/v1/api_client_authorizations/current","reqQuery":"","time":"2020-11-18T18:24:06.678752178Z"}
Nov 18 18:24:06 arvados arvados-controller[3028]: {"PID":3028,"RequestID":"req-4khgiez4l7ozjrnwoeef","level":"info","msg":"response","remoteAddr":"127.0.0.1:45358","reqBytes":0,"reqForwardedFor":"127.0.0.1","reqHost":"sanbi.arvados.sanbi.ac.za","reqMethod":"GET","reqPath":"arvados/v1/api_client_authorizations/current","reqQuery":"","respBody":"{\"errors\":[\"Not logged in (req-4khgiez4l7ozjrnwoeef)\"],\"error_token\":\"1605723846+4bd3eab8\"}","respBytes":91,"respStatus":"Unauthorized","respStatusCode":401,"time":"2020-11-18T18:24:06.693454796Z","timeToStatus":0.014530,"timeTotal":0.014659,"timeWriteBody":0.000129}
Nov 18 18:24:06 arvados crunch-dispatch-local[9796]: {"level":"fatal","msg":"\"error getting my token UUID: arvados API server error: Not logged in (req-4khgiez4l7ozjrnwoeef) (401: 401 Unauthorized) returned by sanbi.arvados.sanbi.ac.za\"","time":"2020-11-18T18:24:06.694966035Z"}
Nov 18 18:24:06 arvados systemd[1]: crunch-dispatch-local.service: Main process exited, code=exited, status=1/FAILURE
Nov 18 18:24:06 arvados systemd[1]: crunch-dispatch-local.service: Failed with result 'exit-code'.
Nov 18 18:24:07 arvados systemd[1]: crunch-dispatch-local.service: Service hold-off time over, scheduling restart.
Nov 18 18:24:07 arvados systemd[1]: crunch-dispatch-local.service: Scheduled restart job, restart counter is at 25.
Nov 18 18:24:07 arvados systemd[1]: Stopped Arvados Crunch Dispatcher for LOCAL service.
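
For picking the failing entries out of a cycling log like the one above, something along these lines may help. It is only a sketch: the names `json_payload` and `problems` are made up for illustration, and it assumes each structured entry is a single JSON object at the end of the syslog line, as in the excerpt.

```python
import json

# The fatal line from the syslog excerpt above, copied verbatim.
FATAL_LINE = r'Nov 18 18:24:06 arvados crunch-dispatch-local[9796]: {"level":"fatal","msg":"\"error getting my token UUID: arvados API server error: Not logged in (req-4khgiez4l7ozjrnwoeef) (401: 401 Unauthorized) returned by sanbi.arvados.sanbi.ac.za\"","time":"2020-11-18T18:24:06.694966035Z"}'


def json_payload(syslog_line):
    """Return the JSON object embedded in a syslog line, or None if there is none."""
    start = syslog_line.find("{")
    if start == -1:
        return None  # plain systemd message, nothing structured to parse
    try:
        return json.loads(syslog_line[start:])
    except json.JSONDecodeError:
        return None


def problems(lines):
    """Yield (level, status, msg) for fatal/error entries and 4xx/5xx responses."""
    for line in lines:
        entry = json_payload(line)
        if entry is None:
            continue
        status = entry.get("respStatusCode", 0)  # only present on response entries
        if entry.get("level") in ("fatal", "error") or status >= 400:
            yield entry.get("level"), status, entry.get("msg")


for level, status, msg in problems([FATAL_LINE]):
    print(level, status, msg)
```

Feeding it the lines from syslog (or journalctl output) should narrow things down to the 401 response from the controller and the fatal message from the dispatcher.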

btw the workflow that I am testing is https://github.com/pvanheus/lukasa/blob/main/protein_evidence_mapping.cwl with appropriate inputs…

ARVADOS_API_TOKEN=changeme_system_root_token 

@pvanheus, after some research, you might have discovered a bug. Please try a system root token that is 41 characters long and alphanumeric only (e.g. CvlebKKXnykZwE3f3n7NdMRQI2iSjlobf87EUebpW).
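
If it helps, here is a minimal sketch for generating and checking a token of that shape. The function names `make_token` and `is_plain_token` are hypothetical, and the 41-character, letters-and-digits-only rule is simply the suggestion above, not something taken from the Arvados code.

```python
import secrets
import string

# Letters and digits only, per the suggestion above (no underscores or symbols).
ALPHANUMERIC = string.ascii_letters + string.digits


def make_token(length=41):
    """Return a random alphanumeric token of the given length."""
    return "".join(secrets.choice(ALPHANUMERIC) for _ in range(length))


def is_plain_token(token, length=41):
    """True if the token has exactly `length` ASCII letters/digits and nothing else."""
    return len(token) == length and token.isascii() and token.isalnum()


print(make_token())
print(is_plain_token("changeme_system_root_token"))  # False: too short, contains underscores
```

The generated value would then replace the placeholder in /etc/arvados/environment and wherever else the deployment configures the system root token.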

pvanheus: I uploaded and merged a few changes to the provision.sh script and salt formula:

I hope you can give it a try and let me know if you have further issues.

Hi! Salt deploy is now working.

The SSL certificates in use by the services are signed by an issuer identified as ArvadosFormula. Is the cert for this CA available on the server somewhere so that I can import it into my browser?

Thanks,
Peter

Hi @pvanheus. The cert is a classic self-signed certificate generated with a salt script.

If I’m not wrong, in most browsers, when you get a warning about the certificate being self-signed, you go to ‘advanced / accept certificate’ and that would be it. The browser will remember your choice and won’t ask you again for the site.

Hi! Unfortunately that is not quite enough. The workbench talks to keep, and even after accepting the certificate in Firefox, uploads will fail because the certificate for the keep server is not accepted. My workaround for that has been to examine the Firefox console at upload time, find the URL which Firefox refuses to connect to (the keep one) and open that in the browser so that an exception can be made. Then everything works.
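
As a rough alternative to hunting through the browser console, the certificate a service presents can be downloaded and inspected or imported by hand. This is only a sketch: `host_and_port` and `fetch_certificate_pem` are made-up helper names, and the URL in the usage comment is a placeholder, not the real keep address of this install.

```python
import ssl
from urllib.parse import urlsplit


def host_and_port(url):
    """Split an https URL into (hostname, port), defaulting to port 443."""
    parts = urlsplit(url)
    return parts.hostname, parts.port or 443


def fetch_certificate_pem(url):
    """Return the server's certificate as PEM text.

    ssl.get_server_certificate does not validate the certificate unless
    ca_certs is given, so it also works for self-signed certificates.
    """
    return ssl.get_server_certificate(host_and_port(url))


# Usage (placeholder URL; substitute the one Firefox refuses to connect to):
#   pem = fetch_certificate_pem("https://keep.example.org/")
#   with open("keep.pem", "w") as f:
#       f.write(pem)
```

The saved PEM file can then be reviewed before deciding whether to trust it in the browser.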

Peter