Update README.md
parent 814a5edb42
commit a50b425863
@@ -2,9 +2,9 @@

Supports fine-tuning training with custom CSV data using configuration files

## Quick Start
## 1. Quick Start

### 1. Configuration Setup
### Configuration Setup

First edit the `config.yaml` file to set the correct paths and parameters:

@@ -24,7 +24,7 @@ model_paths:
# ... other paths
```

### 2. Run Training
### Run Training

Using train_sequential

@@ -58,44 +58,16 @@ DDP Training
DIST_BACKEND=nccl \
torchrun --standalone --nproc_per_node=8 train_sequential.py --config configs/config_ali09988_candle-5min.yaml
```
## 2. Training Results

## Configuration Description

### Main Configuration Items

- **data**: Data-related configuration
  - `data_path`: CSV data file path
  - `lookback_window`: Lookback window size
  - `predict_window`: Prediction window size
  - `train_ratio/val_ratio/test_ratio`: Dataset split ratios

- **training**: Training-related configuration
  - `epochs`: Number of training epochs
  - `batch_size`: Batch size
  - `tokenizer_learning_rate`: Tokenizer learning rate
  - `predictor_learning_rate`: Predictor learning rate

- **model_paths**: Model path configuration
  - `pretrained_tokenizer`: Pre-trained tokenizer path
  - `pretrained_predictor`: Pre-trained predictor path
  - `base_save_path`: Model save root directory
  - `finetuned_tokenizer`: Fine-tuned tokenizer path (for basemodel training)

- **experiment**: Experiment control
  - `train_tokenizer`: Whether to train tokenizer
  - `train_basemodel`: Whether to train basemodel
  - `skip_existing`: Whether to skip existing models

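The configuration items listed above can be sanity-checked before a long training run. The following is a minimal sketch, not the project's own validation: the key names come from the list, while the example values, the `check_config` helper, and the rule that the three split ratios should sum to 1 are assumptions made for illustration.

```python
import math

# Hypothetical values; only the key names are taken from the list above.
config = {
    "data": {
        "data_path": "data/example.csv",
        "lookback_window": 512,
        "predict_window": 48,
        "train_ratio": 0.9,
        "val_ratio": 0.1,
        "test_ratio": 0.0,
    },
}


def check_config(cfg):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    data = cfg.get("data", {})

    ratios = [data.get(key, 0.0) for key in ("train_ratio", "val_ratio", "test_ratio")]
    if any(r < 0 for r in ratios):
        problems.append("split ratios must be non-negative")
    # Assumption: the three ratios are meant to partition the whole dataset.
    if not math.isclose(sum(ratios), 1.0, abs_tol=1e-6):
        problems.append("split ratios should sum to 1.0")

    for key in ("lookback_window", "predict_window"):
        value = data.get(key)
        if not isinstance(value, int) or value <= 0:
            problems.append(f"data.{key} must be a positive integer")

    return problems


if __name__ == "__main__":
    for problem in check_config(config):
        print("config problem:", problem)
```

In practice the dictionary would come from parsing `config.yaml` with whatever loader the training script uses; a plain dict is used here so the sketch has no third-party dependencies.
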
## Training Process

1. **Tokenizer Fine-tuning Stage**
   - Load pre-trained tokenizer
   - Fine-tune on custom data
   - Save fine-tuned tokenizer to `{base_save_path}/tokenizer/best_model/`

2. **Basemodel Fine-tuning Stage**
   - Load fine-tuned tokenizer and pre-trained predictor
   - Fine-tune on custom data
   - Save fine-tuned basemodel to `{base_save_path}/basemodel/best_model/`

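The two stages and their save locations can be sketched as below. This is an illustrative outline only: `best_model_dirs` and `planned_stages` are hypothetical helpers, not functions from `train_sequential.py`, and treating a missing flag as `False` is an assumption. The directory layout follows the `{base_save_path}/.../best_model/` patterns stated in the list.

```python
from pathlib import PurePosixPath


def best_model_dirs(base_save_path):
    """Build the two output directories described in the stage list."""
    base = PurePosixPath(base_save_path)
    return {
        "tokenizer": str(base / "tokenizer" / "best_model"),
        "basemodel": str(base / "basemodel" / "best_model"),
    }


def planned_stages(experiment):
    """List the stages to run, tokenizer first.

    The order matters because the basemodel stage loads the fine-tuned
    tokenizer. Assumption: a missing flag means the stage is skipped.
    """
    stages = []
    if experiment.get("train_tokenizer", False):
        stages.append("tokenizer")
    if experiment.get("train_basemodel", False):
        stages.append("basemodel")
    return stages


if __name__ == "__main__":
    dirs = best_model_dirs("outputs/run1")  # hypothetical base_save_path
    for stage in planned_stages({"train_tokenizer": True, "train_basemodel": True}):
        print(stage, "->", dirs[stage])
```
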


**Data Format**: Ensure CSV file contains the following columns: `timestamps`, `open`, `high`, `low`, `close`, `volume`, `amount`

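A quick header check can catch a missing column before training starts. The sketch below uses only the standard library; the column names come from the Data Format note, while `missing_columns` and the sample rows are hypothetical and made up for illustration.

```python
import csv
import io

# Column names taken from the Data Format note above.
REQUIRED_COLUMNS = ["timestamps", "open", "high", "low", "close", "volume", "amount"]


def missing_columns(csv_file):
    """Return the required columns that are absent from the CSV header row."""
    reader = csv.reader(csv_file)
    header = next(reader, [])  # an empty file yields an empty header
    present = {name.strip() for name in header}
    return [column for column in REQUIRED_COLUMNS if column not in present]


if __name__ == "__main__":
    # Made-up sample data; a real file would be opened with open(path, newline="").
    sample = io.StringIO(
        "timestamps,open,high,low,close,volume\n"
        "2024-01-01 09:30:00,1.0,1.2,0.9,1.1,100\n"
    )
    print("missing:", missing_columns(sample))
```
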