Using the CLI

Before you can use the CLI to download Datasets and Compendia, you need to create and activate a Token.

See Setting up Tokens for information on setting up Tokens using the CLI.

Or you can go to Setting up Tokens for a tutorial on setting up a Token using Python.

Setting up Tokens

pyrefinebio provides the CLI command create-token which can automatically create, activate, and save a Token for you.

By default, create-token will prompt you before activating and saving the Token it creates.

Alternatively, you can pass the flag --silent (or -s) to bypass the prompts.

If you use the silent flag, the created Token will be automatically activated and saved to the config file located by default at ~/.refinebio.yaml. You must save the Token in order for it to be used in future CLI commands.

For more information about pyrefinebio’s Config see Config.

Here’s an example of creating, activating, and saving a Token with prompts:

$ refinebio create-token
Please review the refine.bio Terms of Use: https://www.refine.bio/terms and Privacy Policy: https://www.refine.bio/privacy
Do you understand and accept both documents? (y/N)y
Would you like to save your Token to the Config file for future use? (y/N)y

Here’s an example of creating, activating, and saving a Token without prompts:

$ refinebio create-token -s
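
Because later commands only pick up a Token that has been saved, a script may want to check for the config file before creating a new Token. The following is a minimal sketch, assuming the default location ~/.refinebio.yaml mentioned above. The helper name has_refinebio_config is hypothetical and not part of pyrefinebio, and it only checks that the file exists, not that it holds a valid Token.

```shell
# Hypothetical helper: succeeds if a config file exists at the given path.
# Defaults to ~/.refinebio.yaml, the default location named above.
# It does not read the file, so it cannot tell whether a Token is stored in it.
has_refinebio_config() {
    config_file="${1:-$HOME/.refinebio.yaml}"
    [ -f "$config_file" ]
}

if has_refinebio_config; then
    echo "Config file found; a saved Token may already be available."
else
    echo "No config file found; consider running: refinebio create-token"
fi
```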

Downloading Datasets

After you set up and activate a Token you can use the CLI to start creating and downloading Datasets.

See Setting up Tokens for information on setting up Tokens using the CLI.

Or you can go to Setting up Tokens for a tutorial on setting up a Token using Python.

pyrefinebio provides the CLI command download-dataset for creating and downloading Datasets. It will automatically handle every part of the creation and download process for you. You will receive the Dataset as a zip file.

download-dataset requires the option email-address and either experiments or dataset-dict.

email-address - The email address that will be notified when the Dataset is finished processing.

The options experiments and dataset-dict both control which Experiments and Samples will be a part of the Dataset.

experiments can be used when you want to add specific Experiments to your Dataset. All the downloadable samples associated with the Experiments that you pass in will be added to the Dataset.

The experiments option is just a space separated list of Experiment accession codes. Here’s an example:

$ refinebio download-dataset --experiments "<Experiment 1 Accession Code> <Experiment 2 Accession Code>"

dataset-dict should be used when you want to choose the individual Samples that are included in the Dataset. You can also pass in “ALL” instead of a list of Sample accession codes to add all downloadable Samples associated with that Experiment to the Dataset.

The dataset-dict option is a JSON object in the following format:

$ dataset_json='{"<Experiment 1 Accession Code>": ["<Sample 1 Accession Code>", "<Sample 2 Accession Code>"], "<Experiment 2 Accession Code>": ["ALL"]}'
$ refinebio download-dataset --dataset-dict "$dataset_json"
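
A malformed dataset-dict is easy to produce when quoting JSON in a shell, so it can help to check the string before passing it on. The following is a minimal sketch, assuming python3 is on your PATH. The helper name is_valid_json is hypothetical and not part of pyrefinebio, and it only checks that the string is well-formed JSON, not that the accession codes exist. Shell variable names cannot contain hyphens, so the sketch stores the JSON in dataset_json.

```shell
# Hypothetical helper: succeeds if its first argument parses as JSON.
# Assumes python3 is available; json.tool exits non-zero on invalid input.
is_valid_json() {
    printf '%s' "$1" | python3 -m json.tool > /dev/null 2>&1
}

# Placeholder accession codes, in the same format as shown above.
dataset_json='{"<Experiment 1 Accession Code>": ["<Sample 1 Accession Code>"], "<Experiment 2 Accession Code>": ["ALL"]}'

if is_valid_json "$dataset_json"; then
    echo "dataset-dict is well-formed JSON"
    # At this point the value could be passed on, for example:
    # refinebio download-dataset --email-address "foo@bar.com" --dataset-dict "$dataset_json"
else
    echo "dataset-dict is not valid JSON" >&2
fi
```

Quoting "$dataset_json" when you pass it keeps the shell from splitting the JSON at its spaces.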

You can also pass in other optional command options to alter the Dataset itself and to alter how the download process works.

path - The path that the Dataset will be downloaded to. You can specify a path to a zip file or a directory. If you pass in a path to a directory, the name of the zip file will be automatically generated in the format dataset-<dataset_id>.zip. By default, path will be set to the current directory.
aggregation - Can be used to change how the Dataset is aggregated. The default is “EXPERIMENT”, and the other available choices are “SPECIES” and “ALL”. For more information about Dataset aggregation check out Aggregations.
transformation - Can be used to change the transformation of the Dataset. The default is “NONE”, and the other available choices are “MINMAX” and “STANDARD”. For more information on Dataset transformation check out Gene transformations.
skip-quantile-normalization - Can be used to choose whether or not quantile normalization is skipped for RNA-seq Samples. For more information check out Quantile normalization.
extract - Can be used to choose whether the downloaded zip file should be automatically extracted. It is extracted to the same location that you passed in as path: if path is a zip file such as ./path/to/dataset.zip, it is extracted to the directory ./path/to/dataset/; if path is a directory such as ./path/to/dir/, it is extracted to ./path/to/dir/[generated-file-name]/. By default, extract is False.
prompt - Can be used to choose whether or not you should be prompted before downloading if the Dataset zip file is larger than 1 gigabyte. By default, prompt is True.
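
The extraction rule described for extract can be sketched as a small shell function, which may be useful if a script needs to know where the files will end up. This is only an illustration of the rule as stated above, not pyrefinebio's own code. The helper name extraction_dir is hypothetical, and the generated file name has to be supplied by the caller because it is not known in advance.

```shell
# Hypothetical helper mirroring the extraction rule described for "extract".
# $1: the value passed as path
# $2: the generated file name (only used when path is a directory)
extraction_dir() {
    case "$1" in
        *.zip) printf '%s/\n' "${1%.zip}" ;;           # zip file: strip the .zip suffix
        *)     printf '%s/%s/\n' "${1%/}" "$2" ;;      # directory: append the generated name
    esac
}

extraction_dir "./path/to/dataset.zip"                   # prints ./path/to/dataset/
extraction_dir "./path/to/dir/" "[generated-file-name]"  # prints ./path/to/dir/[generated-file-name]/
```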

Below is a simple example of downloading a Dataset using experiments:

$ refinebio download-dataset --path "~/path/to/dataset/dir/" --email-address "foo@bar.com" --experiments "GSE74410 GSE24528"

Below is a simple example of downloading a Dataset using dataset-dict:

$ dataset_json='{"GSE74410": ["ALL"], "GSE24528": ["GSM604796", "GSM604797"]}'
$ refinebio download-dataset --path "./path/to/dataset.zip" --email-address "foo@bar.com" --dataset-dict "$dataset_json"
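
The aggregation and transformation options each accept a fixed set of values, so a wrapper script can reject a typo before a Dataset is requested. The following is a minimal sketch that checks against the choices listed above. The helper names are hypothetical, and the --aggregation and --transformation spellings in the message are assumed from the option names rather than confirmed here.

```shell
# Hypothetical helpers: succeed only for the values listed in the option descriptions.
is_valid_aggregation() {
    case "$1" in
        EXPERIMENT|SPECIES|ALL) return 0 ;;
        *) return 1 ;;
    esac
}

is_valid_transformation() {
    case "$1" in
        NONE|MINMAX|STANDARD) return 0 ;;
        *) return 1 ;;
    esac
}

aggregation="SPECIES"
transformation="MINMAX"

if is_valid_aggregation "$aggregation" && is_valid_transformation "$transformation"; then
    # Flag spellings are an assumption based on the option names.
    echo "Options look valid: --aggregation $aggregation --transformation $transformation"
else
    echo "Unsupported aggregation or transformation value" >&2
fi
```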

Downloading Compendia

You can start using the CLI to download Compendia after you set up and activate a Token.

See Setting up Tokens for information on setting up Tokens using the CLI.

Or you can go to Setting up Tokens for a tutorial on setting up a Token using Python.

pyrefinebio provides the CLI command download-compendium for downloading Compendium results. It will automatically search for Compendia based on organisms and download the results. You will receive the Compendium as a zip file.

download-compendium requires that you pass in the parameter organism.

organism - The scientific name of the Organism for the Compendium that you want to download.

You can also pass in other optional parameters to alter the type of Compendium you download.

path - The path that the Compendium will be downloaded to. You can specify a path to a zip file or a directory. If you pass in a path to a directory, the name of the zip file will be automatically generated in the format compendium-<compendium_id>.zip. By default, path will be set to the current directory.
version - The Compendium version. The default is None which will get the latest version.
quant-sf-only - Can be used to choose whether the Compendium is quantile normalized. Pass in True for RNA-seq Sample Compendium results or False for quantile normalized Compendium results. By default, quant-sf-only is False. For more information on normalized vs RNA-seq compendia check out refine.bio Compendia.
extract - Can be used to choose whether the downloaded zip file should be automatically extracted. It is extracted to the same location that you passed in as path: if path is a zip file such as ./path/to/compendium.zip, it is extracted to the directory ./path/to/compendium/; if path is a directory such as ./path/to/dir/, it is extracted to ./path/to/dir/[generated-file-name]/. By default, extract is False.
prompt - Can be used to choose whether or not you should be prompted before downloading if the Compendium zip file is larger than 1 gigabyte. By default, prompt is True.

Below is a simple example of downloading a Compendium result:

$ refinebio download-compendium --path "~/path/to/dir/for/compendium/" --organism "HOMO_SAPIENS"
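
The example above writes the organism as "HOMO_SAPIENS". If your script starts from a scientific name written in the usual way, such as "Homo sapiens", a small helper can produce that form. This is a minimal sketch, assuming the CLI expects the upper-case, underscore-separated spelling used in the example. The helper name to_organism_name is hypothetical and not part of pyrefinebio.

```shell
# Hypothetical helper: "Homo sapiens" -> "HOMO_SAPIENS".
# Assumes the upper-case, underscore-separated form shown in the example is what the CLI expects.
to_organism_name() {
    printf '%s\n' "$1" | tr '[:lower:]' '[:upper:]' | tr ' ' '_'
}

organism="$(to_organism_name "Homo sapiens")"
echo "$organism"   # prints HOMO_SAPIENS
```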

pyrefinebio also provides the CLI command download-quantfile-compendium which is equivalent to using the command download-compendium with the option quant-sf-only set to True.

You can use this command when you want to be explicit to future users of your script that you are downloading quantfile Compendium results.

Below is a simple example of downloading a Compendium result using download-quantfile-compendium:

$ refinebio download-quantfile-compendium --path "~/path/to/dir/for/compendium/" --organism "HOMO_SAPIENS"

Getting Information About pyrefinebio Classes and Functions

If you are re-reading a script that you wrote and have forgotten what a pyrefinebio function or class does, or if you just want more information about one, you can use the describe command. It exposes pyrefinebio’s help() function and can print out information about any pyrefinebio class or function.

To get information about a function or class, just pass its name as the first argument to the command.

Here’s an example:

$ refinebio describe download_dataset

This will print out information about the pyrefinebio download_dataset() function.

To get information about a class method, just pass in <Class>.<method> as the first argument to the command.

Here’s an example:

$ refinebio describe Sample.search

This will print out information about the pyrefinebio class Sample’s search method.