Error when applying readRDS function to Arvados Collection connection in R

I get an error when I try to read an R object from an RDS file stored on an Arvados platform. First, I use the following code to connect to Arvados and get a file listing:



# Connect to Arvados
arv = Arvados$new()

# Get collection object
coll = Collection$new(arv, "<collection_uuid>")

# List collection content

When I read a tab delimited file xyz.txt with

df.a = read.delim(coll$get("xyz.txt")$connection("r"))

it works perfectly fine. However, when I try to read an RDS file abc.RDS with

df.b = readRDS(coll$get("abc.RDS")$connection("r"))

I get the following error:

Error in readRDS(coll$get("abc.RDS")$connection("r")) : 
   unknown input format

Reading abc.RDS from a local drive works fine. According to the help page, readRDS accepts “a connection or the name of the file where the R object is saved to or read from”.

What version of ArvadosR are you using?

Perhaps it needs to be opened in binary mode. Could you try $connection("rb")?

I am using ArvadosR_0.0.6.

$connection("rb") returns exactly the same error.

There’s a 0.0.7 release of the ArvadosR package with some bug fixes; it might be worth a try. Details here:


It also doesn’t work with the latest release. The error remains for both “r” and “rb”.

I could make it work with readRDS(gzcon(coll$get("abc.RDS")$connection("rb"))).

In help(readRDS) I noticed this paragraph:

“Compression is handled by the connection opened when file is a file name, so is only possible when file is a connection if handled by the connection. So e.g. url connections will need to be wrapped in a call to gzcon.”
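For anyone who wants to see the effect without an Arvados account, here is a minimal local sketch. It assumes the RDS file was written with the saveRDS() defaults (gzip compression), and it uses a rawConnection as a stand-in for any connection that hands back the bytes without decompressing them. The names obj, path and bytes are only illustrative.

```r
# Local stand-in for a remote binary connection (no Arvados involved).
obj <- data.frame(id = 1:3, label = c("a", "b", "c"))

path <- tempfile(fileext = ".RDS")
saveRDS(obj, path)  # saveRDS() gzip-compresses by default (compress = TRUE)

# The raw bytes, as a connection that does not decompress would deliver them
bytes <- readBin(path, what = "raw", n = file.size(path))

# gzip streams start with the magic bytes 1f 8b
is_gzip <- identical(bytes[1:2], as.raw(c(0x1f, 0x8b)))

# Reading the still-compressed stream directly fails; keep the error message
con_plain <- rawConnection(bytes)
plain_result <- tryCatch(readRDS(con_plain),
                         error = function(e) conditionMessage(e))
close(con_plain)

# Wrapping the same bytes in gzcon() decompresses them on the fly
con_gz <- gzcon(rawConnection(bytes))
restored <- readRDS(con_gz)
close(con_gz)  # closing the wrapper also closes the wrapped connection

stopifnot(identical(obj, restored))
```

If the file had been written with saveRDS(obj, path, compress = FALSE), the gzcon() wrapper should not be needed, since the stream would then begin with the serialization header instead of the gzip magic bytes.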

Maybe you could add an example for reading an RDS file from Arvados to the ArvadosR help pages? This would have saved me a lot of time.

I’m glad you were able to figure it out. I will see about adding an example to the help package.

Thanks a lot! Also for your help.