wavClusteR-package |
A comprehensive pipeline for the analysis of PAR-CLIP data. PAR-CLIP-induced transitions are first discriminated from sequencing errors, SNPs and other non-experimental sources by a non-parametric mixture model. The protein binding sites (clusters) are then resolved at high resolution, and cluster statistics are estimated using a rigorous Bayesian framework. The package also provides post-processing of the results, data export for UCSC genome browser visualization, and export of cluster sequences for motif search analysis. In addition, RNA-Seq data can be integrated to estimate the False Discovery Rate of cluster detection. Key functions support parallel multicore computing. Note: while wavClusteR was designed for PAR-CLIP data analysis, it can also be applied to other NGS data obtained from experimental procedures that induce nucleotide substitutions (e.g. BisSeq). |
annotateClusters |
Annotate clusters with respect to transcript features |
estimateFDR |
Estimate False Discovery Rate within the relative substitution frequency support by integrating PAR-CLIP data and RNA-Seq data |
exportClusters |
Export clusters as BED track |
exportCoverage |
Export coverage as BigWig track |
exportHighConfSub |
Export high-confidence substitutions as BED track |
exportSequences |
Export cluster sequences for motif search analysis |
filterClusters |
Merge clusters and compute all relevant cluster statistics |
fitMixtureModel |
Fit a non-parametric mixture model from all identified substitutions |
getAllSub |
Identify all substitutions observed across genomic positions exhibiting a specified minimum coverage |
getClusters |
Identify clusters containing high-confidence substitutions and resolve boundaries at high resolution |
getExpInterval |
Identify the interval of relative substitution frequencies dominated by experimental induction. |
getHighConfSub |
Classify substitutions based on the identified RSF interval and return high-confidence transitions
getMetaCoverage |
Compute and plot distribution of average coverage or relative log-odds as metagene profile using identified clusters |
getMetaGene |
Compute and plot metagene profile using identified clusters |
getMetaTSS |
Compute and plot read densities in genomic regions around transcription start sites |
model |
Components of the non-parametric mixture model fitted on Ago2 PAR-CLIP data
plotSizeDistribution |
Plot the distribution of cluster sizes |
plotStatistics |
Pairs plot visualization of cluster statistics
plotSubstitutions |
Barplot visualization of the number of genomic positions exhibiting a given substitution and, if a model is provided, additional diagnostic plots.
readSortedBam |
Load a sorted BAM file |
wavClusteR |
A comprehensive pipeline for the analysis of PAR-CLIP data. PAR-CLIP-induced transitions are first discriminated from sequencing errors, SNPs and other non-experimental sources by a non-parametric mixture model. The protein binding sites (clusters) are then resolved at high resolution, and cluster statistics are estimated using a rigorous Bayesian framework. The package also provides post-processing of the results, data export for UCSC genome browser visualization, and export of cluster sequences for motif search analysis. In addition, RNA-Seq data can be integrated to estimate the False Discovery Rate of cluster detection. Key functions support parallel multicore computing. Note: while wavClusteR was designed for PAR-CLIP data analysis, it can also be applied to other NGS data obtained from experimental procedures that induce nucleotide substitutions (e.g. BisSeq). |
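
The functions indexed above are typically chained in the order the package description outlines. The following is a minimal sketch of that order, not a definitive workflow. It assumes that the package ships an example BAM file under `extdata/example.bam`, that T-to-C is the substitution of interest, and that a minimum coverage of 10 and a threshold of 1 are reasonable; all of these values are illustrative choices, and the argument names should be checked against the help page of each function.

```r
library(wavClusteR)

# Assumption: an example sorted BAM file is installed with the package at this path
filename <- system.file("extdata", "example.bam", package = "wavClusteR")
Bam <- readSortedBam(filename = filename)

# Substitutions at positions with at least 10 reads (illustrative cutoff)
countTable <- getAllSub(Bam, minCov = 10)

# fitMixtureModel(countTable, substitution = "TC") would fit the mixture model
# from the data; here the pre-computed `model` dataset listed in the index is
# loaded instead to keep the sketch fast
data(model)

# Interval of relative substitution frequencies dominated by experimental induction
support <- getExpInterval(model, bayes = TRUE)

# High-confidence T-to-C transitions within that interval
highConfSub <- getHighConfSub(countTable, support = support, substitution = "TC")

# Per-position read coverage, then cluster detection around the transitions
bamCoverage <- coverage(Bam)
clusters <- getClusters(highConfSub = highConfSub,
                        coverage    = bamCoverage,
                        sortedBam   = Bam,
                        threshold   = 1,
                        cores       = 1)
```

The resulting clusters would then normally be passed to filterClusters, which additionally requires a reference genome object, before export or annotation.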