Add training guide

This commit is contained in:
Michael Hansen
2023-06-22 11:15:02 -05:00
parent fda64e7a51
commit 0937969729


## Training
See the [training guide](TRAINING.md) and the [source code](src/python).
Pretrained checkpoints are available on [Hugging Face](https://huggingface.co/datasets/rhasspy/piper-checkpoints/tree/main).
Start by installing system dependencies:
``` sh
sudo apt-get install python3-dev
```
Then create a virtual environment:
``` sh
cd piper/src/python
python3 -m venv .venv
source .venv/bin/activate
pip3 install --upgrade pip
pip3 install --upgrade wheel setuptools
pip3 install -r requirements.txt
```
Ensure you have [espeak-ng](https://github.com/espeak-ng/espeak-ng/) installed (`sudo apt-get install espeak-ng`), then run the `build_monotonic_align.sh` script in the `src/python` directory to build the monotonic alignment extension.
Next, preprocess your dataset:
``` sh
python3 -m piper_train.preprocess \
--language en-us \
--input-dir /path/to/ljspeech/ \
--output-dir /path/to/training_dir/ \
--dataset-format ljspeech \
--sample-rate 22050
```
Datasets must either be in the [LJSpeech](https://keithito.com/LJ-Speech-Dataset/) format (pipe-delimited, with either `id|text` or `id|speaker|text` columns) or come from [Mimic Recording Studio](https://github.com/MycroftAI/mimic-recording-studio) (`--dataset-format mycroft`).
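For illustration, here is a minimal Python sketch of reading those pipe-delimited columns. The helper name `parse_line` is hypothetical; `piper_train.preprocess` does its own parsing:

``` python
# Hypothetical sketch: split one LJSpeech-style metadata line into its columns.
def parse_line(line: str):
    """Return (utterance_id, speaker, text) from `id|text` or `id|speaker|text`."""
    parts = line.rstrip("\n").split("|")
    if len(parts) == 2:                       # id|text (single speaker)
        return parts[0], None, parts[1]
    # id|speaker|text -- re-join the tail in case the text itself contains '|'
    return parts[0], parts[1], "|".join(parts[2:])

print(parse_line("utt_0001|The quick brown fox."))
# ('utt_0001', None, 'The quick brown fox.')
```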
Finally, you can train:
``` sh
python3 -m piper_train \
--dataset-dir /path/to/training_dir/ \
--accelerator 'gpu' \
--devices 1 \
--batch-size 32 \
--validation-split 0.05 \
--num-test-examples 5 \
--max_epochs 10000 \
--precision 32
```
Training uses [PyTorch Lightning](https://www.pytorchlightning.ai/). Run `tensorboard --logdir /path/to/training_dir/lightning_logs` to monitor progress, and see `python3 -m piper_train --help` for many additional options.
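As an aside, `--validation-split 0.05` holds out 5% of the utterances for validation. A rough, hypothetical sketch of what such a split means (not piper's actual implementation):

``` python
import random

def split_dataset(utterance_ids, validation_split=0.05, seed=0):
    """Hold out a fraction of utterances for validation; the rest train."""
    ids = list(utterance_ids)
    random.Random(seed).shuffle(ids)          # deterministic shuffle for the sketch
    n_val = max(1, int(len(ids) * validation_split))
    return ids[n_val:], ids[:n_val]           # (train, validation)

train, val = split_dataset([f"utt_{i:04d}" for i in range(100)])
print(len(train), len(val))  # 95 5
```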
It is highly recommended to train with the following `Dockerfile`:
``` dockerfile
FROM nvcr.io/nvidia/pytorch:22.03-py3
RUN pip3 install \
'pytorch-lightning'
ENV NUMBA_CACHE_DIR=.numba_cache
```
See the various `infer_*` and `export_*` scripts in [src/python/piper_train](src/python/piper_train) to test and export your voice from the checkpoint in `lightning_logs`. For quick testing, pipe lines from the `dataset.jsonl` file in your training directory into `python3 -m piper_train.infer`:
``` sh
head -n5 /path/to/training_dir/dataset.jsonl | \
python3 -m piper_train.infer \
--checkpoint lightning_logs/path/to/checkpoint.ckpt \
--sample-rate 22050 \
--output-dir wavs
```
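The `dataset.jsonl` file is in JSON Lines format, one record per line. A small Python sketch of the `head -n5` step above, using made-up sample data (the real field names in `dataset.jsonl` may differ):

``` python
import io
import json

# Hypothetical dataset.jsonl content; field names here are illustrative only.
jsonl = io.StringIO(
    '{"id": "utt_0001", "text": "hello world"}\n'
    '{"id": "utt_0002", "text": "good morning"}\n'
    '{"id": "utt_0003", "text": "how are you"}\n'
)

# Equivalent of `head -n5`: take at most the first five records.
records = []
for i, line in enumerate(jsonl):
    if i >= 5:
        break
    records.append(json.loads(line))

print(len(records))  # 3
```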
## Running in Python