Tool Usage
TDF includes the following submodes:
promotertest: Promoter test evaluates the association between the given lncRNA to the target promoters.
regiontest: Genomic region test evaluates the association between the given lncRNA to the target regions by randomization.
get_dbss: Get TTSs in BED format from the single BED file
integrate: Integrate the project’s links and generate project-level statistics.
Here we introduce the common parameters for two main tests (promoter test and genomic region test) first and then describe their test-specific parameters. The last three scripts as the tools are introduced afterward.
Common Inputs for both tests
TDF can be executed with the following command:
rgt-TDF {promotertest,regiontest} [required inputs] [options]
Where:
{promotertest,regiontest}: Define the applying test, either promoter test, or genomic region test.
required inputs: Required inputs files and paths.
options: Additional input parameters or output options. There are some inputs common for both tests shown below:
Required Input for both tests
Option Name |
Type |
Description |
---|---|---|
-h, –help |
Show the help message and exit |
|
-r |
PATH |
Input file name for RNA sequence (in fasta format) |
-rn |
String |
Define the RNA name |
-o |
PATH |
Output directory name for all the results and temporary files |
-organism |
String |
Define the organism (hg19, hg38, mm9, mm10… etc) |
Options
Option Name |
Type |
Default |
Description |
---|---|---|---|
-t |
String |
RNA name |
Define the title name for the results under the Output name. |
-a |
Float |
0.05 |
Define significance level for rejection null hypothesis |
-ccf |
Integer |
20 |
Define the cut off value for promoter counts |
-rt |
Boolean |
False |
Remove temporary files (fa, txp…etc) |
-log |
Boolean |
False |
Set the plots in log scale |
-ac |
PATH |
None |
Input file for RNA accecibility |
-accf |
Integer |
500 |
Define the cut off value for RNA accecibility |
-obed |
Boolean |
False |
Output the BED files for DNA binding sites. |
-showpa |
Boolean |
False |
Show parallel and antiparallel bindings in the plot separately |
-filter_havana |
Boolean |
False |
Apply filtering to remove HAVANA entries. |
-protein_coding |
Boolean |
False |
Apply filtering to get only protein coding genes. |
-known_only |
Boolean |
False |
Apply filtering to get only known genes. |
-nofile |
Boolean |
False |
Don’t save any files in the output folder, except the statistics. |
Options for TRIPLEXES
The arguments of the TRIPLEXES can be adjusted by the options below.
Option Name |
Type |
Default |
Description |
---|---|---|---|
-l |
Integer |
20 |
Define the minimum length of triplex |
-e |
Integer |
20 |
Set the maximal error-rate in % tolerated |
-c |
Integer |
2 |
Sets the tolerated number of consecutive errors with respect to the canonical triplex rules as were found to greatly destabilize triplexes in vitro. |
-fr |
String |
off |
Activates the filtering of low complexity regions and repeats in the sequence data |
-fm |
Integer |
0 |
Method to quickly discard non-hits (Default 0).’0′ = greedy approach; ‘1’ = q-gram filtering. |
-of |
Integer |
1 |
Define output formats of Triplexator |
-mf |
Boolean |
False |
Merge overlapping features into a cluster and report the spanning region. |
-rm |
Boolean |
False |
Set the multiprocessing |
-par |
String |
False |
Define other parameters for TRIPLEXES. Please ignore the first “-” and replace space with underline. For example, when you want to add “-G 80 -g 20”, please do “-par G_80_-g_20”. |
Particular Inputs for promoter test
Required Input for promoter test
The target promoters can be defined in two ways:
A gene list, which contains gene symbols or Ensembl IDs, one gene per line in plain text format. The argument, -de, should be used;
Two BED files containing the regions of target promoters and non-target promoters (background). Two arguments, -bed and -bg, should be used together.
Option Name |
Type |
Description |
---|---|---|
-de |
PATH |
Input file for gene list (gene symbols or Ensembl ID) |
-bed |
PATH |
Input BED file of the promoter regions of genes |
-bg |
PATH |
Input BED file of the promoter regions of background genes |
Options for promoter test
Option Name |
Type |
Default |
Description |
---|---|---|---|
-pl |
Integer |
1000 |
Define the promotor length |
-score |
Boolean |
False |
Load score column from input gene list of BED file for analysis. |
-scoreh |
Boolean |
False |
Use the header of scores from the given gene list or BED file. |
Particular Inputs for region set test
Required Input for region set test
Option Name |
Type |
Description |
---|---|---|
-bed |
PATH |
Input BED file for interesting regions on DNA |
Options for region set test
-mp Integer 0Define the number of threads for multiprocessing.
Option Name |
Type |
Default |
Description |
---|---|---|---|
-n |
Integer |
10000 |
Number iterations (randomization) |
-f |
PATH |
None |
Input BED file as mask for randomization |
-score |
Boolean |
False |
Load score column from input BED file |
get_ttss
Get TTSs of the given RNA sequence with the single BED file.
rgt-TDF get_ttss [options]
Option Name |
Type |
Default |
Description |
---|---|---|---|
-h, |
–help |
show this help message and exit |
|
-i |
PATH |
Input BED file of the target regions |
|
-tts |
PATH |
Output BED file of the TTSs |
|
-tfo |
PATH |
Output BED file of the TFOs |
|
-tfo |
PATH |
Output BED file of the TFOs |
|
-r |
PATH |
Input FASTA file of the RNA |
|
-organism |
PATH |
Define the organism |
|
-l |
Integer |
20 |
Triplexes Define the minimum length of triplex |
-e |
Integer |
20 |
Triplexes Set the maximal error-rate in % tolerated |
-c |
Integer |
2 |
Triplexes Sets the tolerated number of consecutive errors with respect to the canonical triplex rules as such were found to greatly destabilize triplexes in vitro |
-fr |
on/off |
off |
Triplexes Activates the filtering of low complexity regions and repeats in the sequence data |
-fm |
Integer |
0 |
Triplexes Method to quickly discard non-hits (default: 0).’0′ = greedy approach; ‘1’ = q-gram filtering. |
-of |
Integer |
1 |
Triplexes Define output formats of Triplexes |
-mf |
Boolean |
False |
Triplexes Merge overlapping features into a cluster and report the spanning region. |
-rm |
Integer |
1 |
Triplexes Set the multiprocessing |
integrate
Integrate the project’s links and generate project-level statistics.
rgt-TDF integrate [options]
Option Name |
Type |
Default |
Description |
---|---|---|---|
-h, –help |
show this help message and exit |
||
-path |
PATH |
Define the path of the project. |
|
-exp |
PATH |
Include expression score for ranking. |