Tool Usage

TDF includes the following submodes:

  • promotertest: The promoter test evaluates the association between the given lncRNA and the target promoters.

  • regiontest: The genomic region test evaluates the association between the given lncRNA and the target regions by randomization.

  • get_ttss: Get the TTSs in BED format from a single BED file.

  • integrate: Integrate the project’s links and generate project-level statistics.

Here we first introduce the parameters common to the two main tests (promoter test and genomic region test) and then describe their test-specific parameters. The remaining submodes are introduced afterward.

Common Inputs for both tests

TDF can be executed with the following command:

rgt-TDF {promotertest,regiontest} [required inputs] [options]

Where:

  • {promotertest,regiontest}: Select the test to apply, either the promoter test or the genomic region test.

  • required inputs: Required input files and paths.

  • options: Additional input parameters or output options. The inputs common to both tests are shown below:

Required Input for both tests

Option Name   Type    Description
-h, --help            Show the help message and exit
-r            PATH    Input file name for the RNA sequence (in FASTA format)
-rn           String  Define the RNA name
-o            PATH    Output directory name for all the results and temporary files
-organism     String  Define the organism (hg19, hg38, mm9, mm10, etc.)
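
As a minimal sketch, the required inputs above can be assembled into a command line as follows. The helper build_tdf_command is hypothetical and only builds the argument list without running rgt-TDF; the file names, RNA name, and output directory are placeholders chosen for illustration.

```python
import shlex


def build_tdf_command(test, rna_fasta, rna_name, out_dir, organism):
    """Assemble an rgt-TDF argument list from the required inputs.

    Hypothetical helper: it only builds the command and does not run it.
    """
    if test not in ("promotertest", "regiontest"):
        raise ValueError("test must be 'promotertest' or 'regiontest'")
    return [
        "rgt-TDF", test,
        "-r", rna_fasta,        # RNA sequence in FASTA format
        "-rn", rna_name,        # RNA name
        "-o", out_dir,          # output directory
        "-organism", organism,  # e.g. hg19, hg38, mm9, mm10
    ]


# Placeholder paths and names, not values from this documentation.
cmd = build_tdf_command("promotertest", "lncRNA.fa", "MyRNA", "results", "hg38")
print(shlex.join(cmd))
```

Each test also needs its own target definition in addition to these common inputs, as described in the test-specific sections.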

Options

Option Name      Type     Default   Description
-t               String   RNA name  Define the title for the results under the output name.
-a               Float    0.05      Define the significance level for rejecting the null hypothesis
-ccf             Integer  20        Define the cut-off value for promoter counts
-rt              Boolean  False     Remove temporary files (fa, txp, etc.)
-log             Boolean  False     Set the plots in log scale
-ac              PATH     None      Input file for RNA accessibility
-accf            Integer  500       Define the cut-off value for RNA accessibility
-obed            Boolean  False     Output the BED files for DNA binding sites.
-showpa          Boolean  False     Show parallel and antiparallel bindings in the plot separately
-filter_havana   Boolean  False     Apply filtering to remove HAVANA entries.
-protein_coding  Boolean  False     Apply filtering to get only protein-coding genes.
-known_only      Boolean  False     Apply filtering to get only known genes.
-nofile          Boolean  False     Don’t save any files in the output folder, except the statistics.

Options for TRIPLEXES

The arguments passed to TRIPLEXES can be adjusted with the options below.

Option Name  Type     Default  Description
-l           Integer  20       Define the minimum length of triplex
-e           Integer  20       Set the maximal error rate in % tolerated
-c           Integer  2        Set the tolerated number of consecutive errors with respect to the canonical triplex rules, as such errors were found to greatly destabilize triplexes in vitro.
-fr          String   off      Activate the filtering of low-complexity regions and repeats in the sequence data
-fm          Integer  0        Method to quickly discard non-hits (default: 0). 0 = greedy approach; 1 = q-gram filtering.
-of          Integer  1        Define output formats of Triplexator
-mf          Boolean  False    Merge overlapping features into a cluster and report the spanning region.
-rm          Boolean  False    Set the multiprocessing
-par         String   False    Define other parameters for TRIPLEXES. Omit the first “-” and replace spaces with underscores. For example, to add “-G 80 -g 20”, use “-par G_80_-g_20”.
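
The encoding rule for -par can be sketched in a few lines. The helper encode_par is hypothetical; it applies only the rule stated above (drop the first “-”, replace spaces with underscores) and does not check whether the parameters are valid for TRIPLEXES.

```python
def encode_par(extra_args):
    """Encode extra TRIPLEXES arguments for the -par option.

    Follows the documented rule: drop the first "-" and replace
    spaces with underscores. Hypothetical helper for illustration.
    """
    # Collapse any run of whitespace into a single underscore.
    joined = "_".join(extra_args.split())
    # Remove only the first leading dash; later dashes are kept.
    return joined[1:] if joined.startswith("-") else joined


print(encode_par("-G 80 -g 20"))  # G_80_-g_20
```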

Particular Inputs for promoter test

Required Input for promoter test

The target promoters can be defined in two ways:

  • A gene list, which contains gene symbols or Ensembl IDs, one gene per line in plain text format. The argument, -de, should be used;

  • Two BED files containing the regions of target promoters and non-target promoters (background). Two arguments, -bed and -bg, should be used together.

Option Name  Type  Description
-de          PATH  Input file for gene list (gene symbols or Ensembl IDs)
-bed         PATH  Input BED file of the promoter regions of genes
-bg          PATH  Input BED file of the promoter regions of background genes
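
The two ways of defining target promoters can be expressed as a small check before the command is built. This is a sketch only: promoter_target_args is a hypothetical helper, the file names are placeholders, and rejecting a mix of both modes is an assumption of this sketch rather than documented behaviour of rgt-TDF.

```python
def promoter_target_args(gene_list=None, target_bed=None, background_bed=None):
    """Return the promoter-test arguments for one of the two input modes.

    Hypothetical helper; it does not run rgt-TDF.
    """
    if gene_list is not None:
        # Assumption: the two modes are treated as mutually exclusive here.
        if target_bed is not None or background_bed is not None:
            raise ValueError("use either -de, or -bed together with -bg")
        return ["-de", gene_list]
    if target_bed is not None and background_bed is not None:
        # -bed and -bg must be given together.
        return ["-bed", target_bed, "-bg", background_bed]
    raise ValueError("provide -de, or both -bed and -bg")


# Placeholder file names, chosen only for illustration.
print(promoter_target_args(gene_list="genes.txt"))
print(promoter_target_args(target_bed="targets.bed", background_bed="background.bed"))
```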

Options for promoter test

Option Name  Type     Default  Description
-pl          Integer  1000     Define the promoter length
-score       Boolean  False    Load score column from input gene list or BED file for analysis.
-scoreh      Boolean  False    Use the header of scores from the given gene list or BED file.

Particular Inputs for region set test

Required Input for region set test

Option Name  Type  Description
-bed         PATH  Input BED file of the regions of interest on the DNA

Options for region set test

Option Name  Type     Default  Description
-mp          Integer  0        Define the number of threads for multiprocessing.
-n           Integer  10000    Number of iterations (randomization)
-f           PATH     None     Input BED file as mask for randomization
-score       Boolean  False    Load score column from input BED file
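
A genomic region test invocation combining the common inputs with the options above might be assembled as sketched below. The helper region_test_command is hypothetical, the file names are placeholders, and passing Boolean options such as -score as bare flags is an assumption of this sketch.

```python
import shlex


def region_test_command(rna_fasta, rna_name, out_dir, organism, regions_bed,
                        iterations=10000, mask_bed=None, use_score=False):
    """Assemble an `rgt-TDF regiontest` argument list (hypothetical helper)."""
    cmd = [
        "rgt-TDF", "regiontest",
        "-r", rna_fasta,
        "-rn", rna_name,
        "-o", out_dir,
        "-organism", organism,
        "-bed", regions_bed,     # regions of interest on the DNA
        "-n", str(iterations),   # number of randomization iterations
    ]
    if mask_bed is not None:
        cmd += ["-f", mask_bed]  # mask for randomization
    if use_score:
        cmd.append("-score")     # assumption: Boolean options are bare flags
    return cmd


# Placeholder inputs, not values from this documentation.
region_cmd = region_test_command("lncRNA.fa", "MyRNA", "results", "hg38",
                                 "peaks.bed", iterations=1000, mask_bed="mask.bed")
print(shlex.join(region_cmd))
```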

get_ttss

Get the TTSs of the given RNA sequence from a single BED file.

rgt-TDF get_ttss [options]

Option Name  Type     Default  Description
-h, --help                     Show this help message and exit
-i           PATH              Input BED file of the target regions
-tts         PATH              Output BED file of the TTSs
-tfo         PATH              Output BED file of the TFOs
-r           PATH              Input FASTA file of the RNA
-organism    String            Define the organism
-l           Integer  20       Triplexes: Define the minimum length of triplex
-e           Integer  20       Triplexes: Set the maximal error rate in % tolerated
-c           Integer  2        Triplexes: Set the tolerated number of consecutive errors with respect to the canonical triplex rules, as such errors were found to greatly destabilize triplexes in vitro
-fr          on/off   off      Triplexes: Activate the filtering of low-complexity regions and repeats in the sequence data
-fm          Integer  0        Triplexes: Method to quickly discard non-hits (default: 0). 0 = greedy approach; 1 = q-gram filtering.
-of          Integer  1        Triplexes: Define output formats of Triplexes
-mf          Boolean  False    Triplexes: Merge overlapping features into a cluster and report the spanning region.
-rm          Integer  1        Triplexes: Set the multiprocessing
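
As an illustration, a get_ttss call using the options above could be assembled as follows. The helper get_ttss_command is hypothetical and does not run rgt-TDF; the file names are placeholders, and which of these options are mandatory is not stated here, so the choice of required arguments in the sketch is an assumption.

```python
import shlex


def get_ttss_command(regions_bed, rna_fasta, organism, tts_bed, tfo_bed=None,
                     min_length=20, error_rate=20):
    """Assemble an `rgt-TDF get_ttss` argument list (hypothetical helper)."""
    cmd = [
        "rgt-TDF", "get_ttss",
        "-i", regions_bed,       # input BED file of the target regions
        "-r", rna_fasta,         # input FASTA file of the RNA
        "-organism", organism,
        "-tts", tts_bed,         # output BED file of the TTSs
        "-l", str(min_length),   # minimum triplex length
        "-e", str(error_rate),   # maximal error rate in %
    ]
    if tfo_bed is not None:
        cmd += ["-tfo", tfo_bed]  # output BED file of the TFOs
    return cmd


# Placeholder inputs, not values from this documentation.
ttss_cmd = get_ttss_command("targets.bed", "lncRNA.fa", "hg38", "tts.bed",
                            tfo_bed="tfo.bed")
print(shlex.join(ttss_cmd))
```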

integrate

Integrate the project’s links and generate project-level statistics.

rgt-TDF integrate [options]

Option Name  Type  Default  Description
-h, --help                  Show this help message and exit
-path        PATH           Define the path of the project.
-exp         PATH           Include expression score for ranking.