Data deletion issue

Jerome_Jean_Verleyen · 14 June 2022 00:00

Dear all,
I have an Arvados system installed with Salt, version 2.4.0. I’m currently in a testing phase.
I am testing the keepstore functionality by uploading two collections through the Workbench server. Everything ran perfectly: the two collections total 520 MB (this is only a test). I then trashed one of them. After one day, the trashed collection disappeared from the trash (as I configured it, so all good!). When I mount my home on my Linux client, I see just one collection, occupying 287 MB.
But one week later, my two keepstores still hold 520 MB of data: the blocks of the trashed collection have not been deleted, and I don’t know where the problem is…

My trash and blob settings are as follows:

arvados-server config-dump | grep -i blob

  MaxKeepBlobBuffers: 128
  BlobDeleteConcurrency: 4
  BlobMissingReport: ""
  BlobReplicateConcurrency: 2
  BlobSigning: true
  BlobSigningKey: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
  BlobSigningTTL: 24h
  BlobTrash: true
  BlobTrashCheckInterval: 24h
  BlobTrashConcurrency: 4
  BlobTrashLifetime: 24h
      BlobContainer: ""
  LocalKeepBlobBuffersPerVCPU: 1

arvados-server config-dump | grep -i trash

  BlobTrash: true
  BlobTrashCheckInterval: 24h
  BlobTrashConcurrency: 4
  BlobTrashLifetime: 24h
  DefaultTrashLifetime: 48h
  TrashSweepInterval: 1m

Normally, after 2 days, the trashed data on each keepstore should be permanently deleted, correct?

Thanks!

tetron · 14 June 2022 17:45

Is the keep-balance service running? That is the service responsible for identifying blocks that are no longer referenced and actually telling the keepstore servers to get rid of them. If it is running, check the logs and post them here (it includes a report on the number of referenced blocks).

Also the keepstore servers themselves have a separate waiting period where blocks are moved into a “trash” location where they are kept a little while longer before finally being deleted.

(Arvados is very defensive with your data, so it gives you multiple opportunities to recover if someone makes a mistake or something goes wrong).
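
To see whether a keepstore has actually moved anything into its trash, one option is to scan the volume directory for trashed block files. The sketch below assumes a filesystem-backed volume where a trashed block is renamed in place to `<block hash>.trash.<unix deadline>`; that naming pattern is an assumption here, so check it against your keepstore version before relying on it. The function name `find_trashed_blocks` is made up for this example.

```python
import os
import re
from datetime import datetime, timezone

# ASSUMPTION: a trashed block is stored as <32 hex digits>.trash.<unix timestamp>,
# next to the live blocks on the same volume. Verify this for your keepstore version.
TRASH_NAME = re.compile(r"^([0-9a-f]{32})\.trash\.(\d+)$")


def find_trashed_blocks(volume_root):
    """Return a sorted list of (path, size_in_bytes, deadline) for trash-like files."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(volume_root):
        for name in filenames:
            match = TRASH_NAME.match(name)
            if match is None:
                continue  # a live block or an unrelated file
            path = os.path.join(dirpath, name)
            # The numeric suffix is read as the time after which the file may be deleted.
            deadline = datetime.fromtimestamp(int(match.group(2)), tz=timezone.utc)
            found.append((path, os.path.getsize(path), deadline))
    return sorted(found)
```

Calling `find_trashed_blocks()` on the directory configured as the volume root returns an empty list if nothing has been trashed, which would suggest the trash requests from keep-balance never reached that keepstore.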

The total time from when you hit the trash button to when the data is deleted can be up to:

DefaultTrashLifetime + BalancePeriod + BlobTrashLifetime + BlobTrashCheckInterval

From your configuration it seems like a week should be long enough, so you will need to see what the logs say. It is also possible the data is still being referenced, and you don’t realize it.
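
As a rough check of that upper bound, here is a minimal sketch that adds up the four settings. Three of the values come from the config dump above; `BalancePeriod` does not appear in that output, so the 10 minutes used here is only an assumed placeholder and should be replaced with the value from your own `arvados-server config-dump`.

```python
from datetime import timedelta

# Values taken from the config dump posted above.
DEFAULT_TRASH_LIFETIME = timedelta(hours=48)      # DefaultTrashLifetime: 48h
BLOB_TRASH_LIFETIME = timedelta(hours=24)         # BlobTrashLifetime: 24h
BLOB_TRASH_CHECK_INTERVAL = timedelta(hours=24)   # BlobTrashCheckInterval: 24h

# ASSUMPTION: BalancePeriod is not in the posted output; 10 minutes is a placeholder.
BALANCE_PERIOD = timedelta(minutes=10)


def worst_case_deletion_delay(default_trash, balance_period, blob_trash, blob_check):
    """Upper bound on the time between trashing a collection and block deletion."""
    return default_trash + balance_period + blob_trash + blob_check


delay = worst_case_deletion_delay(
    DEFAULT_TRASH_LIFETIME,
    BALANCE_PERIOD,
    BLOB_TRASH_LIFETIME,
    BLOB_TRASH_CHECK_INTERVAL,
)
print(f"Worst-case delay: {delay.total_seconds() / 3600:.1f} hours")
```

With these numbers the bound is a little over four days, so a full week without any deletion points to something other than the waiting periods.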

Jerome_Jean_Verleyen · 14 June 2022 21:40

Dear Peter,
Thanks for your answer.
I installed the keep-balance service on the API server, as the Salt documentation indicates. The service is running:

ps aux|grep balance

root 435 0.0 0.6 1161568 27300 ? SNsl Jun06 0:26 /usr/bin/keep-balance -commit-pulls -commit-trash

The trash location on a keepstore server is on the same disk defined for the data, right?
I tried to check the log files, but I must admit I’m a bit confused about where to look…

Thanks

Jerome_Jean_Verleyen · 16 June 2022 23:00

I did the following to check whether the “trashed” files were marked as such.
On each keepstore server, I stopped the service, deleted all the block subdirectories, and restarted the keepstore service. After a few minutes, the “good” blocks were copied back into place. I then did the same on the other keepstore server.
I know that’s not the right way to do it. The problem seems to be at the keepstore level, which never decides to delete the files… I don’t know why…
I hope someone can help me.
Regards.