Usage:
A usage example and a full list of options are provided for each tool below:
AME Use Case:
By default, AME performs unconstrained partition maximisation. With --verbose set to 3, however, it also outputs a constrained partition maximisation score for each sequence, where the positive set consists of all sequences from the head of the file up to and including that sequence. To generate the fluorescence-based constrained partition maximisation plots in the paper, a Perl script was used to convert the sequence numbers in the verbose output into fluorescence thresholds.
Additionally, the linreg mode in AME reports raw values only. For the full approach used in the paper, please use the tool RAMEN, which is also included in this download. RAMEN also supports simulation of p-values.
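The conversion from sequence numbers to fluorescence thresholds mentioned above can be sketched in Python. This is not the Perl script used for the paper, and it does not read AME's verbose output, whose format is not shown here. It assumes only that the fluorescence scores are available as a list in the same order as the sequences in the FASTA file; the function name and the sample scores are hypothetical.

```python
def fluorescence_threshold(scores, seq_number):
    """Return the fluorescence score of the seq_number-th sequence (1-based).

    Assumption: `scores` lists one fluorescence score per sequence, in the
    same order as the sequences appear in the input FASTA file. A positive
    set made of the first `seq_number` sequences then corresponds to a
    threshold at the score of its last member.
    """
    if not 1 <= seq_number <= len(scores):
        raise ValueError("seq_number must be between 1 and len(scores)")
    return scores[seq_number - 1]


# Hypothetical scores for five sequences, in file order.
scores = [0.0001, 0.0004, 0.002, 0.03, 0.2]

# Threshold that corresponds to "the top 3 sequences are positive".
threshold = fluorescence_threshold(scores, 3)
```

Each sequence number reported in the verbose output would be passed through such a lookup to obtain the x-axis value for a plot.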
Suppose we wish to run an unconstrained partition maximisation MHG analysis using essentially the default options. We invoke AME with the following command:
./ame --method mhg --scoring totalhits --bgformat 0 macissac_yeast.v1.meme ARO80_YPD.fsa
The following output is produced:
ame (Analysis of Motif Enrichment): Compiled on Nov 30 2009
------------------------------
Copyright (C) Robert McLeay <r.mcleay@imb.uq.edu.au> & Timothy Bailey <t.bailey@imb.uq.edu.au>, 2009.

1. MultiHG p-value of motif RGT1 top 13 seqs: 0.0002104 (Corrected p-value: 0.006084)
2. MultiHG p-value of motif ARO80 top 3 seqs: 0.000821 (Corrected p-value: 0.02354)
3. MultiHG p-value of motif UME6 top 3 seqs: 0.001642 (Corrected p-value: 0.04654)
4. MultiHG p-value of motif AZF1 top 14 seqs: 0.003042 (Corrected p-value: 0.08457)
5. MultiHG p-value of motif GLN3 top 3 seqs: 0.005747 (Corrected p-value: 0.1539)
...Truncated additional lines...
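For downstream processing, the MHG result entries shown above can be pulled into structured records. The following Python sketch is based only on the sample output in this section; the pattern and the function name are illustrative, and other AME methods or versions may print their results differently.

```python
import re

# Pattern for one MHG result entry, modelled on the sample output above.
AME_ENTRY = re.compile(
    r"(?P<rank>\d+)\.\s+MultiHG p-value of motif (?P<motif>\S+)\s+"
    r"top (?P<top>\d+) seqs:\s+(?P<p>\S+)\s+"
    r"\(Corrected p-value:\s+(?P<corrected>[^)\s]+)\)"
)


def parse_ame_output(text):
    """Return one dict per MHG result entry found in `text`."""
    results = []
    for m in AME_ENTRY.finditer(text):
        results.append({
            "rank": int(m.group("rank")),
            "motif": m.group("motif"),
            "top_seqs": int(m.group("top")),
            "p_value": float(m.group("p")),
            "corrected_p_value": float(m.group("corrected")),
        })
    return results


# Two entries copied from the sample output.
sample = (
    "1. MultiHG p-value of motif RGT1 top 13 seqs: 0.0002104 "
    "(Corrected p-value: 0.006084)\n"
    "2. MultiHG p-value of motif ARO80 top 3 seqs: 0.000821 "
    "(Corrected p-value: 0.02354)\n"
)
hits = parse_ame_output(sample)
```

Lines that do not match the pattern, such as the banner and the truncation notice, are simply skipped.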
RAMEN Use Case:
Suppose we wish to run a linear regression analysis using the default options. We invoke RAMEN with the following command:
./ramen --bgformat 0 macissac_yeast.v1.meme ARO80_YPD.fsa
The following output is produced:
ramen (Regression Analysis of Motif ENrichment): Compiled on Nov 30 2009
------------------------------
Copyright (C) Robert McLeay <r.mcleay@imb.uq.edu.au> & Timothy Bailey <t.bailey@imb.uq.edu.au>, 2009.

Options Invoked:
----------------
Background Format: Uniform
Motif Format: MEME
y-axis: Log_e of Fluorescence Scores
x-axis: PWM Scores
Motif Scoring Function: RMA (normalised motif scores)
Sampling Repetitions for p-values: 10000
Pseudocount: 0.25
Motif File: macissac_yeast.v1.meme
Sequence File: ARO80_YPD.fsa

Results:
========
Showing all motifs with p-value <= 0.05
Fitting motifs to y: = mx + b

Over-represented Motifs:
------------------------
Rank  Motif  MSE      p-value (adj)  p-value (raw)  m             b
----  -----  ---      -------------  -------------  -             -
1     ARO80  1.14741  0.0244974      0.0002         -9.95713e+06  -13.6832

Under-represented Motifs:
-------------------------
Rank  Motif  MSE      p-value (adj)  p-value (raw)  m             b
----  -----  ---      -------------  -------------  -             -
---
Elapsed wall clock time: 3 seconds
Elapsed CPU time: 2.340000 seconds
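A result row from the RAMEN tables above, such as the ARO80 row, can be read into named fields with a few lines of Python. This sketch assumes the seven whitespace-separated columns seen in the sample output (Rank, Motif, MSE, adjusted p-value, raw p-value, m, b); the function name is illustrative and the column layout may differ in other versions.

```python
def parse_ramen_row(line):
    """Parse one RAMEN result row; return None for non-result lines.

    Assumption: a result row has exactly seven whitespace-separated fields,
    in the column order shown in the sample output.
    """
    fields = line.split()
    if len(fields) != 7:
        return None
    try:
        return {
            "rank": int(fields[0]),
            "motif": fields[1],
            "mse": float(fields[2]),
            "p_adj": float(fields[3]),
            "p_raw": float(fields[4]),
            "m": float(fields[5]),  # slope of the fitted line y = mx + b
            "b": float(fields[6]),  # intercept of the fitted line
        }
    except ValueError:
        # Rows of dashes and other decoration are not result rows.
        return None


row = parse_ramen_row("1 ARO80 1.14741 0.0244974 0.0002 -9.95713e+06 -13.6832")
```

With the default options shown above, y is the log_e of the fluorescence score and x is the PWM score, so `m` and `b` describe the fitted line in those units.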
AME Options:
ame: Compiled on Nov 30 2009
Error: Must specify a motif file and sequence file.
USAGE: ame [options] <motif file> <sequence file>
Key Options:
--method [fisher|mhg|4dmhg|ranksum|linreg|spearman] Select the association function for motif significance
--scoring [avg|max|totalhits] Motif-to-sequence affinity function:
Hints: Use avg (recommended) or max for ranksum, linreg, spearman methods.
Use totalhits for fisher, mhg, 4dmhg (and possibly other) methods.
File format options:
--bgformat [0|1|2] Source used to determine background frequencies
0: uniform background
1: MEME motif file
2: Background file
--bgfile <background> File containing background frequencies
--motif-format [meme|tamo|regexp] Format of input motif file (default meme)
Ranksum-specific options:
--rsmethod [better|quick] Whether to use a slower and more accurate ranksum method or a quicker one
--poslist [fl (default)|pwm] For partition max., threshold on either X (pwm) or Y (fluorescence)
LR- and Spearman- specific options:
--log-fscores Regress on the log_e of the fluorescence scores
--log-pwmscores Regress on the log_e of the PWM scores
--normalise-linreg Normalise the motif scores so that the motifs are comparable
--linreg-switchxy Make the x-points fluorescence scores and the y-points PWM scores
Fisher, MHG, 4D-MHG, Ranksum in TOTALHITS affinity mode options:
--length-correction Correct for length bias by subtracting expected hits
--pvalue-threshold <float, default=2e-4> Threshold to consider a single motif hit significant
Fisher Test with either AVG or MAX affinity (undefined results in TOTALHITS mode) options:
--fl-threshold <float, default=1e-3> (Requires --poslist fl) Max fluorescence p-value to consider a 'positive'
--pwm-threshold <float, default=1> (Requires --poslist pwm) Min PWM score to call a sequence a 'positive'
--poslist [fl (default)|pwm] For partition max., threshold on either X (pwm) or Y (fluorescence)
Hints: Be careful when switching the poslist. In the case of the Fisher test, it switches between
using X and Y for determining true positives in the contingency matrix, in addition to switching
which of X and Y is used for partition maximisation.
Miscellaneous Options:
--pseudocount <float, default = 0.25> Pseudocount for motif affinity scan
--verbose <1...5> Integer describing verbosity. Best used as first argument in list.
--help Show this message again
Note:
By default, this tool performs unconstrained partition maximisation. With verbose set to 3, however, it will
output a constrained partition maximisation score for each sequence in the input set. To generate the fluorescence
based constrained partition maximisation plots in the paper, a Perl script was used to convert sequence numbers
into fluorescence thresholds from the verbose output.
WARNING:
This tool will not re-sort input sequences. It assumes that input FASTA files are sorted from most-likely to be bound
to least likely to be bound in descending order.
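Because the tool relies on the input order, it may help to sort the sequences before writing the FASTA file. The Python sketch below is one way to do that under stated assumptions: each record is a hypothetical (header, sequence, score) tuple, and by default a smaller score (for example a fluorescence p-value) is taken to mean "more likely to be bound". If your scores run the other way, pass smaller_is_more_bound=False. How the score is stored in the real FASTA headers is not covered here.

```python
def sort_for_ame(records, smaller_is_more_bound=True):
    """Order records from most likely bound to least likely bound.

    Assumption: each record is a (header, sequence, score) tuple. The
    direction of the score is controlled by `smaller_is_more_bound`.
    """
    return sorted(records, key=lambda rec: rec[2],
                  reverse=not smaller_is_more_bound)


def to_fasta(records):
    """Render records as FASTA text, one header line and one sequence line each."""
    return "".join(f">{header}\n{sequence}\n" for header, sequence, _ in records)


# Hypothetical records: names, sequences and scores are made up.
records = [
    ("seqB", "ACGTTGCA", 0.02),
    ("seqA", "TTGACGGA", 0.0003),
    ("seqC", "GGCATCGA", 0.4),
]
ordered = sort_for_ame(records)
fasta_text = to_fasta(ordered)
```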
Citing ame:
If ame is of use to you in your research, please cite:
Robert C. McLeay, Timothy L. Bailey.
"Motif Enrichment Analysis: a unified framework and an evaluation on ChIP data."
BMC Bioinformatics 2010, 11:165, doi:10.1186/1471-2105-11-165.
Contact the authors:
You can contact the authors via email:
Robert McLeay <r.mcleay@imb.uq.edu.au>, and
Timothy Bailey <t.bailey@imb.uq.edu.au>.
Bug reports should be directed to Robert McLeay.
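The hints in the AME option list about pairing --method with --scoring can be checked before a run is launched. The following Python sketch builds the argument list for the command used in the AME use case and reports whether the chosen pairing follows those hints. The helper name and its return convention are illustrative only; because the option list also describes other combinations (for example Ranksum in TOTALHITS mode), a pairing outside the hints is flagged rather than rejected.

```python
# Scoring functions suggested for each method by the hints in the option list.
HINTED_SCORING = {
    "fisher": {"totalhits"},
    "mhg": {"totalhits"},
    "4dmhg": {"totalhits"},
    "ranksum": {"avg", "max"},
    "linreg": {"avg", "max"},
    "spearman": {"avg", "max"},
}
SCORINGS = {"avg", "max", "totalhits"}


def build_ame_command(method, scoring, bgformat, motif_file, sequence_file,
                      exe="./ame"):
    """Return (argv, follows_hint) for an AME run.

    `argv` is suitable for subprocess.run(argv); `follows_hint` is False when
    the method/scoring pairing is not the one suggested in the hints.
    """
    if method not in HINTED_SCORING:
        raise ValueError(f"unknown method: {method}")
    if scoring not in SCORINGS:
        raise ValueError(f"unknown scoring: {scoring}")
    if bgformat not in (0, 1, 2):
        raise ValueError("bgformat must be 0, 1 or 2")
    argv = [exe, "--method", method, "--scoring", scoring,
            "--bgformat", str(bgformat), motif_file, sequence_file]
    return argv, scoring in HINTED_SCORING[method]


argv, follows_hint = build_ame_command(
    "mhg", "totalhits", 0, "macissac_yeast.v1.meme", "ARO80_YPD.fsa")
```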
RAMEN Options:
USAGE: ramen [options] <motif file> <sequence file>
Linear Regression Options:
--log-fscores [on|off] Regression on the log_e of the fluorescence scores
on: (Default) Use the log_e(fluorescence) in the regression.
off: Use the score directly provided in the sequence file.
--log-pwmscores [on|off] Regression on the log_e of the PWM scores
on: Use the log_e(RMA or AMA Score) in the regression.
off: (Default) Use the RMA/AMA score directly.
--normalise-motifs [on|off] Normalise the motif scores so that the motifs are comparable
on: (Default) Normalise motifs for comparison (Use RMA score).
off: Use raw AMA score (Not recommended).
--linreg-switchxy [on|off] Switch the x and y axis for the linear regression
on: y-points are PWM scores, x-points are fluorescence scores.
off: (Default) y-points are fluorescence scores, x-points are PWM scores.
--linreg-dumpdir Dump (R-format) TSV files of each regression.
P-Value Simulation Options:
--repeats (default=10,000) Number of times to sample for p-value determination.
--pvalue-cutoff (default=0.05) Only show results with p-value <= this cutoff
File format options:
--bgformat [0|1|2] source used to determine background frequencies
0: uniform background
1: MEME motif file
2: Background file
--bgfile file containing background frequencies
--motif-format [meme|tamo|regexp] format of input motif file (default meme)
Miscellaneous Options:
--pseudocount Pseudocount for motif affinity scan
--verbose <1...5> Integer describing verbosity. Best used as first argument in list.
--help Show this message again
Citing ramen (Regression Analysis of Motif ENrichment):
If ramen is of use to you in your research, please cite:
Robert C. McLeay, Timothy L. Bailey.
"Motif Enrichment Analysis: a unified framework and an evaluation on ChIP data."
BMC Bioinformatics 2010, 11:165, doi:10.1186/1471-2105-11-165.
Contact the authors:
You can contact the authors via email:
Robert McLeay <r.mcleay@imb.uq.edu.au>, and
Timothy Bailey <t.bailey@imb.uq.edu.au>.
Bug reports should be directed to Robert McLeay.