Metadata-Version: 2.4
Name: cadd-threshold-app
Version: 0.0.4
Summary: Shiny-for-Python app for exploring ClinVar distributions across CADD score thresholds
Author-email: Cora Leifheit <cora.leifheit@bih-charite.de>, Max Schubach <max.schubach@bih-charite.de>
License-Expression: MIT
Project-URL: Homepage, https://github.com/kircherlab/CADD_threshold_app
Project-URL: Repository, https://github.com/kircherlab/CADD_threshold_app
Keywords: shiny,genomics,cadd,clinvar,visualization
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Operating System :: OS Independent
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: anywidget
Requires-Dist: click
Requires-Dist: numpy
Requires-Dist: pandas
Requires-Dist: plotly
Requires-Dist: requests
Requires-Dist: scikit-learn
Requires-Dist: shiny
Requires-Dist: shinywidgets
Dynamic: license-file

# CADD Threshold APP

[![DOI](https://zenodo.org/badge/1008289329.svg)](https://doi.org/10.5281/zenodo.18863535)
[![GitHub License](https://img.shields.io/github/license/kircherlab/CADD_threshold_app)](https://github.com/kircherlab/CADD_threshold_app/blob/master/LICENSE)
[![GitHub Release](https://img.shields.io/github/v/release/kircherlab/CADD_threshold_app)](https://github.com/kircherlab/CADD_threshold_app/releases/latest)
[![PyPI version](https://badge.fury.io/py/cadd-threshold-app.svg)](https://badge.fury.io/py/cadd-threshold-app)
[![Bioconda Version](https://img.shields.io/conda/vn/bioconda/cadd-threshold-app?label=bioconda)](https://bioconda.github.io/recipes/cadd-threshold-app/README.html)
[![Tests](https://github.com/kircherlab/CADD_threshold_app/actions/workflows/tests.yml/badge.svg?branch=master)](https://github.com/kircherlab/CADD_threshold_app/actions/workflows/tests.yml)
[![GitHub Issues](https://img.shields.io/github/issues/kircherlab/CADD_threshold_app)](https://github.com/kircherlab/CADD_threshold_app/issues)
[![GitHub Pull Requests](https://img.shields.io/github/issues-pr/kircherlab/CADD_threshold_app)](https://github.com/kircherlab/CADD_threshold_app/pulls)


A Shiny-for-Python web application to explore and compare distributions of ClinVar
variants across different CADD PHRED-score thresholds, filter by gene lists or panels, and
export per-gene/per-panel or filtered annotation summaries. The app is primarily intended for investigating the score distribution of known pathogenic and benign variants for different CADD PHRED-score thresholds.

This README explains the repository layout and how to run the app locally (pip/conda).

**Highlights**
- Interactive visualizations of CADD PHRED-score distributions
- Compare distributions across CADD/ClinVar versions and genome builds
- Per-gene filtering (paste a list or upload a file) and exportable summaries
- Per-panel filtering using panels from PanelApp and exportable summaries

## Requirements
- Python 3.10+ (3.12 recommended)
- See `requirements.txt` or `environment.yml` for full dependencies
- Docker (optional) — a `Dockerfile` is included for containerized runs

## Installation

### Data preparation
If you install the app as a package from Bioconda or pip, the underlying data for the CADD-ThresholdApp must be downloaded separately. The data can be downloaded [here](https://zenodo.org/records/19204078?token=eyJhbGciOiJIUzUxMiJ9.eyJpZCI6IjU4NjI1Njg2LTczM2MtNGY5Ni1hNzJkLTQ0Y2I3NzU5ZmZlYyIsImRhdGEiOnt9LCJyYW5kb20iOiJmNmM0N2YzZGJkMjk3ZDI1OWRjOTA4NjYwOTU4MDRmMCJ9.-WD2-pTxlVoJItfjOUYqAY4163l1jUHYHftcvSaSYTasGJ6-7AZSPXfZRFmPUohAOkrtHkCuAmRBUxbma6ioUw). The data is versioned separately from the packages. You can also preprocess your own data for the app using this Snakemake workflow: https://github.com/kircherlab/CADD_threshold_analysis.

### Data overview
- `data/` - contains preprocessed tables, panel summaries and metrics used by the app.
  - `paneldata/` - CSVs summarizing panels and versions used by the UI
  - `panel_metrics/` - generated metrics stored by date/version

Notes:
- Large raw annotation files are typically not tracked in the repository. The app
  expects prepared/normalized CSV inputs. Use https://github.com/kircherlab/CADD_threshold_analysis to regenerate these CSV inputs, or use the `modules/panelapp/` utilities to regenerate panel CSVs from PanelApp.
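
Before launching the app, it can help to confirm that a data directory has the layout listed above. The following is a minimal sketch and not part of the package: the helper name `missing_subdirs` is hypothetical, and it only checks for the two subdirectories named in this README (`paneldata/` and `panel_metrics/`). The app itself may require additional files.

```python
from pathlib import Path

# Subdirectories named in the data overview of this README.
# Assumption: this list is illustrative; the app may need more than these.
EXPECTED_SUBDIRS = ("paneldata", "panel_metrics")


def missing_subdirs(data_dir, expected=EXPECTED_SUBDIRS):
    """Return the expected subdirectories that are absent from data_dir."""
    root = Path(data_dir)
    return [name for name in expected if not (root / name).is_dir()]
```

Calling `missing_subdirs("data")` returns an empty list when both subdirectories exist, and otherwise the names that are missing.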

### Pre-compiled packages

Using conda

```bash
conda create -n cadd_threshold_app -c bioconda -c conda-forge cadd-threshold-app
conda activate cadd_threshold_app
cadd-threshold-app --data </path/to/data>
```

Using pip

```bash
pip install cadd-threshold-app
cadd-threshold-app --data </path/to/data>
```

### From source

```bash
git clone https://github.com/kircherlab/CADD_threshold_app.git
cd CADD_threshold_app
pip install .
cadd-threshold-app --data data
```

Install as an editable package (recommended for development)

```bash
pip install -e .
```

## Run the app


Option A: run via the package entry point

This requires installing the project as a package (e.g. `pip install -e .`).

```bash
cadd-threshold-app --data </path/to/data>
```

As an alternative to the CLI option `--data`, you can set the `CADD_THRESHOLD_APP_DATA_DIR` environment variable.

```bash
export CADD_THRESHOLD_APP_DATA_DIR=data
cadd-threshold-app
```
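
The fallback from a command-line option to an environment variable can be sketched as follows. This is an illustration only, not the package's actual implementation: the function name `resolve_data_dir` is hypothetical, and the precedence shown (an explicit `--data` value wins over the environment variable) is an assumption that this README does not confirm.

```python
import os

ENV_VAR = "CADD_THRESHOLD_APP_DATA_DIR"


def resolve_data_dir(cli_value=None, environ=None):
    """Pick the data directory: explicit CLI value first, then the env var.

    Assumption: CLI-over-environment precedence; the real app may differ.
    """
    env = os.environ if environ is None else environ
    if cli_value:
        return cli_value
    value = env.get(ENV_VAR)
    if value:
        return value
    raise ValueError(f"No data directory given: pass --data or set {ENV_VAR}")
```

Passing the environment as a parameter keeps the function easy to test without modifying the real process environment.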

Further CLI options are available to configure host and port. Run `cadd-threshold-app --help` for details.

Option B: run from the repository root. Please set the `CADD_THRESHOLD_APP_DATA_DIR` environment variable to point to your data directory (e.g. `data/` in the repository) before running.

```bash
export CADD_THRESHOLD_APP_DATA_DIR=data
python -m shiny run cadd_threshold_app.app:app
```

Then open http://localhost:8080 in your browser.


## Key files and modules
- `app.py` - Shiny app entrypoint and UI wiring
- `server_logic.py` - main server-side reactive logic and handlers
- `data_loader.py` - helpers to load and preprocess annotation tables
- `ui_components.py` - UI component definitions
- `modules/` - plotting helpers, utilities and gene-list/panel parsing helpers
  - `basic_plot.py`, `basic_bar_plot.py`, `compare_basic_plot.py` - plotting factories
  - `functions_server_helpers.py`, `read_genes_from_list_or_file_functions.py` - utilities
  - `panelapp/` - scripts to interact with PanelApp (CSV generation, comparison)

## Development notes
- To extend plots: add a factory under `modules/` and register it in server logic
- To add data sources: update `data_loader.py` and ensure column names match the
  plotting/metric code paths
- Linting/tests: None included by default. Add unit tests for critical data parsing
  when making larger refactors.
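
As a starting point for the column-name check and the unit tests suggested above, a header validation could look like the sketch below. The column names in `REQUIRED_COLUMNS`, the sample row, and the helper `missing_columns` are all hypothetical placeholders; replace them with the columns that the plotting and metric code actually reads.

```python
import csv
import io

# Hypothetical column names, for illustration only.
REQUIRED_COLUMNS = {"gene", "phred", "clinvar_class"}


def missing_columns(handle, required=REQUIRED_COLUMNS):
    """Return required column names absent from the CSV header, sorted."""
    header = next(csv.reader(handle), [])
    return sorted(set(required) - set(header))


def test_missing_columns():
    # Made-up sample table that lacks the "clinvar_class" column.
    sample = io.StringIO("gene,phred\nGENE1,25.3\n")
    assert missing_columns(sample) == ["clinvar_class"]
```

A test runner such as pytest would collect `test_missing_columns` automatically if the file name follows its `test_*.py` convention.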

## Docker
- The included `Dockerfile` builds a minimal image running the app on port 8080.

## License & contact
- See `LICENSE` for licensing terms.
- For questions about data sources, interpretation, or contributions, contact the
  repository maintainers or open an issue.
