Hyperion: Speaker Recognition Toolkit
Hyperion is a speaker recognition toolkit based on PyTorch and NumPy. It provides:
- x-vector architectures: ResNet, Res2Net, Spine2Net, ECAPA-TDNN, EfficientNet, Transformers and others
- Embedding preprocessing tools: PCA, LDA, NAP, Centering/Whitening, Length Normalization, CORAL
- Several flavours of PLDA back-ends: Full-rank PLDA, Simplified PLDA, PLDA
- Calibration and fusion tools
- Recipes for popular datasets: VoxCeleb, NIST-SRE, VOiCES

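As a rough illustration of what two of the embedding preprocessing steps listed above do, the sketch below applies centering/whitening followed by length normalization to a matrix of embeddings. It uses plain NumPy only and does not use Hyperion's own classes; the function names `fit_center_whiten` and `length_normalize` and the random data are assumptions made for this example.

```python
import numpy as np


def fit_center_whiten(x):
    """Estimate the mean and a whitening matrix from embeddings x of shape (n, d).

    Hypothetical helper for illustration; not part of Hyperion's API.
    """
    mu = x.mean(axis=0)
    cov = np.cov(x - mu, rowvar=False)
    # Eigendecomposition of the symmetric covariance matrix.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Scale each eigenvector (column) by 1/sqrt(eigenvalue);
    # the small floor guards against division by zero.
    w = eigvecs / np.sqrt(np.maximum(eigvals, 1e-10))
    return mu, w


def length_normalize(x):
    """Project each row of x onto the unit sphere."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, 1e-10)


# Synthetic, correlated "embeddings" standing in for real x-vectors.
rng = np.random.default_rng(0)
emb = rng.normal(size=(500, 8)) @ rng.normal(size=(8, 8))

mu, w = fit_center_whiten(emb)
white = (emb - mu) @ w          # zero mean, identity covariance
unit = length_normalize(white)  # every row has unit Euclidean norm
```

In a typical back-end, the mean and whitening matrix would be estimated on a training set and then applied unchanged to enrollment and test embeddings before PLDA scoring.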
- Getting Started with Hyperion
- NumPy Models and Tools
- PyTorch Models and Tools
- Layers
- Activation Function Layers
- Normalization Layers
- Dropout Layers
- Attention Layers
- Pooling Layers
- Acoustic Feature Extraction Layers
- Feature Normalization Layers
- Feature Augmentation Layers
- Large Margin Loss Layers
- Probability Density Function Layers
- Vector Quantization Layers
- Upsampling Layers
- Positional Encoders
- Calibration
- Layer Blocks
- Torch Models and Model Loader
- Neural Architectures
- Models
- Losses
- Adversarial Attacks
- Trainers
- Datasets, Data Loaders and Samplers
- Optimizers
- Learning Rate Schedulers
- Metrics
- Loggers
- Utils
- Layers
- Input/Output Utilities
- Utils