Challenges getting lukasa protein evidence mapping flow to work in Arvados

Hi there

I am trying to get the lukasa workflow to work in Arvados. I have hit a few problems:

  1. The workflow uses format annotation extensively. There is some sample input here in FASTA format. When I upload that to Arvados I cannot see a way to associate format info with the files.

  2. The workflow fails at the metaeuk step, as far as I can tell because it is running with the default memory limit of 1GB RAM. I have tried to increase the memory limit through a ResourceRequirement (see below) but it does not lead to more RAM being allocated. Here is the start of the adapted workflow:

#!/usr/bin/env cwl-runner
cwlVersion: v1.0
class: Workflow

inputs:
  contigs_fasta:
    type: File
  proteins_fasta:
    type: File
  species_table:
    type: string?
outputs:
  spaln_out:
    type: File
    outputSource:
      process_spaln_output/combined_spaln_output
requirements:
  ResourceRequirement:
    ramMin: 4096
    ramMax: 4096
    coresMin: 1
    coresMax: 4

The environment I am running this on is a VM with 16 GB RAM, 4 vCPUs running Ubuntu 18.04. Arvados was installed using the single-host Salt method.

Peter

BTW out-of-memory behaviour is the same on Arvados Playground, see this invocation pirca-xvhdp-jmsi42qh548vc2q

I was just about to suggest running that workflow on the Arvados Playground 🙂

You might need to adjust the sharing preferences for your Arvados Playground invocation; I’m not able to view it.

I have tried to share with all users - here is another invocation after I set sharing to All and Anonymous users: https://workbench.pirca.arvadosapi.com/container_requests/pirca-xvhdp-xhdn9hvodxhskhd

@pvanheus oh perfect, now we can all debug it together.

I see “$(runtime.tmpdir)” on the Arvados command line. That’s a problem because it should have been expanded to a real path already.

Here’s the problem:

              {
                "default": "$(runtime.tmpdir)",
                "id": "#metaeuk_easy_predict.cwl/temp_dir",
                "inputBinding": {
                  "position": 40
                },
                "type": "string"
              }

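As discussed in the CWL issue tracker, using an expression such as “$(runtime.tmpdir)” as an input default has been proposed but is not currently valid CWL, so the string is passed through unexpanded.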

You want something like this:

              {
                "default": false,
                "id": "#metaeuk_easy_predict.cwl/temp_dir",
                "inputBinding": {
                  "position": 40,
                  "valueFrom": "$(self ? self : runtime.tmpdir)"
                },
                "type": "string"
              }

@pvanheus Did you say you got this workflow to run correctly on a non-Arvados CWL runner? Which one?

Yes, it works on cwltool.

Thanks, I changed the tool section in question to:

  temp_dir:
    type: string
    default: ""
    inputBinding:
      valueFrom: "$(self ? self : runtime.tmpdir)"
      position: 40

(a default of “false” didn’t work for me)

Then I had to add InlineJavascriptRequirement to the tool, and now both the tool and the workflow work.
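
For anyone hitting the same issue, here is a minimal sketch of how the two pieces fit together in one tool file. It is not the actual metaeuk tool: the baseCommand (echo) and the empty outputs are placeholders I am assuming so the snippet can be tried on its own, and only the temp_dir input and the requirement come from this thread.

```yaml
#!/usr/bin/env cwl-runner
cwlVersion: v1.0
class: CommandLineTool

# Placeholder command (an assumption for illustration); the real tool runs metaeuk.
baseCommand: echo

requirements:
  # Needed because the valueFrom below is a JavaScript expression
  # (a ternary), not a plain parameter reference.
  InlineJavascriptRequirement: {}

inputs:
  temp_dir:
    type: string
    # An empty string is falsy in JavaScript, so the expression
    # falls back to runtime.tmpdir when no value is supplied.
    default: ""
    inputBinding:
      position: 40
      valueFrom: "$(self ? self : runtime.tmpdir)"

# Placeholder: the real tool declares its own outputs.
outputs: []
```

Run without a temp_dir value, the command line should carry the runner's temporary directory; passing temp_dir explicitly overrides it.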