Numpy Models and Tools
Hyperion provides several models and feature extractors based on numpy.
Feature Extraction and Voice Activity Detection
Feature Extraction Classes
Copyright 2018 Johns Hopkins University (Author: Jesus Villalba) Apache 2.0 (http://www.apache.org/licenses/LICENSE-2.0)
- class hyperion.feats.mfcc.MFCCSteps(value)[source]
Steps in the MFCC pipeline
- WAVE = 0
- FFT = 1
- SPEC = 2
- LOG_SPEC = 3
- LOGFB = 4
- MFCC = 5
- class hyperion.feats.mfcc.MFCC(sample_frequency=16000, frame_length=25, frame_shift=10, fft_length=512, remove_dc_offset=True, preemphasis_coeff=0.97, window_type='povey', use_fft2=True, dither=1, fb_type='mel_kaldi', low_freq=20, high_freq=0, num_filters=23, norm_filters=False, num_ceps=13, snip_edges=True, energy_floor=0, raw_energy=True, use_energy=True, cepstral_lifter=22, input_step='wave', output_step='mfcc')[source]
Compute MFCC features.
- sample_frequency
Waveform data sample frequency (must match the waveform file, if specified there) (default = 16000)
- frame_length
Frame length in milliseconds (default = 25)
- frame_shift
Frame shift in milliseconds (default = 10)
- fft_length
Length of FFT (default = 512)
- remove_dc_offset
Subtract mean from waveform on each frame (default = True)
- preemphasis_coeff
Coefficient for use in signal preemphasis (default = 0.97)
- window_type
Type of window (“hamming”|“hanning”|“povey”|“rectangular”|“blackmann”) (default = ‘povey’)
- use_fft2
If true, it uses |X(f)|^2, if false, it uses |X(f)|, (default = True)
- dither
Dithering constant (0.0 means no dither) (default = 1)
- fb_type
Filter-bank type: mel_kaldi, mel_etsi, mel_librosa, mel_librosa_htk, linear (default = ‘mel_kaldi’)
- low_freq
Low cutoff frequency for mel bins (default = 20)
- high_freq
High cutoff frequency for mel bins (if < 0, offset from Nyquist) (default = 0)
- num_filters
Number of triangular mel-frequency bins (default = 23)
- norm_filters
Normalize filter coefficients to sum to 1; for librosa filter banks, Slaney normalization is used (default = False)
- num_ceps
Number of cepstra in MFCC computation (including C0) (default = 13)
- snip_edges
If true, end effects will be handled by outputting only frames that completely fit in the file, and the number of frames depends on the frame-length. If false, the number of frames depends only on the frame-shift, and we reflect the data at the ends. (default = True)
- energy_floor
Floor on energy (absolute, not relative) in MFCC computation (default = 0)
- raw_energy
If true, compute energy before preemphasis and windowing (default = True)
- use_energy
Use energy (not C0) in MFCC computation (default = True)
- cepstral_lifter
Constant that controls scaling of MFCCs (default = 22)
- input_step
It can continue computation from any step: wave, fft, spec, logfb (default = ‘wave’)
- output_step
It can return intermediate result: fft, spec, logfb, mfcc (default = ‘mfcc’)
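The interaction between frame_length, frame_shift and snip_edges can be illustrated with a frame-count calculation. The sketch below follows the Kaldi framing convention, which these options appear to be modelled on; whether Hyperion rounds in exactly the same way is an assumption, and the helper name num_frames is hypothetical.

```python
def num_frames(num_samples, sample_frequency=16000, frame_length=25,
               frame_shift=10, snip_edges=True):
    """Number of analysis frames for a waveform (Kaldi-style convention, assumed)."""
    # Convert the window length and shift from milliseconds to samples.
    win = int(sample_frequency * frame_length / 1000)
    shift = int(sample_frequency * frame_shift / 1000)
    if snip_edges:
        # Only frames that fit completely inside the signal are kept.
        if num_samples < win:
            return 0
        return 1 + (num_samples - win) // shift
    # Otherwise the count depends only on the shift; the signal is
    # reflected at the ends to fill the incomplete frames.
    return (num_samples + shift // 2) // shift


one_second = 16000
frames_snip = num_frames(one_second)                       # 98
frames_no_snip = num_frames(one_second, snip_edges=False)  # 100
```

With the defaults, one second of 16 kHz audio gives 98 frames when snip_edges is True and 100 frames when it is False.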
- __init__(sample_frequency=16000, frame_length=25, frame_shift=10, fft_length=512, remove_dc_offset=True, preemphasis_coeff=0.97, window_type='povey', use_fft2=True, dither=1, fb_type='mel_kaldi', low_freq=20, high_freq=0, num_filters=23, norm_filters=False, num_ceps=13, snip_edges=True, energy_floor=0, raw_energy=True, use_energy=True, cepstral_lifter=22, input_step='wave', output_step='mfcc')[source]
- static make_lifter(N, Q)[source]
Makes the liftering function
- Parameters
N – Number of cepstral coefficients.
Q – Liftering parameter
- Returns
Liftering vector.
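For illustration, a liftering vector in the Kaldi style can be written in a few lines. The formula 1 + (Q/2) sin(pi n / Q) is the one Kaldi uses; that make_lifter computes exactly this is an assumption, and the standalone function below is a sketch rather than the library code.

```python
import math


def make_lifter(num_ceps, lifter_q):
    """Kaldi-style liftering weights: 1 + (Q / 2) * sin(pi * n / Q)."""
    if lifter_q <= 0:
        # Assumed convention: a non-positive Q disables liftering.
        return [1.0] * num_ceps
    return [
        1.0 + 0.5 * lifter_q * math.sin(math.pi * n / lifter_q)
        for n in range(num_ceps)
    ]


lifter = make_lifter(13, 22)

# Liftering is an element-wise product with each frame of cepstra.
frame = [1.0] * 13
liftered = [c * w for c, w in zip(frame, lifter)]
```

The weight of C0 is always 1, and with Q = 22 the weights grow up to 12 at n = 11, which rescales the higher cepstra to a range comparable to the lower ones.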
- compute_raw_logE(x)[source]
Computes log-energy before preemphasis filter
- Parameters
x – wave signal
- Returns
Log-energy
- compute(x, return_fft=False, return_spec=False, return_logfb=False)[source]
Evaluates the MFCC pipeline.
- Parameters
x – Wave, stft, spectrogram or log-filter-bank depending on input_step.
return_fft – If true, it also returns short-time fft.
return_spec – If true, it also returns short-time magnitude spectrogram.
return_logfb – If true, it also returns log-filter-bank.
- Returns
STFT, spectrogram, log-filter-bank or MFCC depending on output_step.
- static filter_args(**kwargs)[source]
Filters MFCC args from arguments dictionary.
- Parameters
kwargs – Arguments dictionary.
- Returns
Dictionary with MFCC options.
- static add_class_args(parser, prefix=None)[source]
Adds MFCC options to parser.
- Parameters
parser – Arguments parser
prefix – Options prefix.
- static add_argparse_args(parser, prefix=None)
Adds MFCC options to parser.
- Parameters
parser – Arguments parser
prefix – Options prefix.
- class hyperion.feats.filter_banks.FilterBankFactory[source]
- static create(filter_bank_type, num_filters, fft_length, fs, low_freq, high_freq, norm_filters)[source]
- static make_mel_librosa(num_filters, fft_length, fs, low_freq, high_freq, htk=False, norm_filters=False)[source]
- static add_argparse_args(parser, prefix=None)
Feature Normalization Classes
Copyright 2019 Johns Hopkins University (Author: Jesus Villalba) Apache 2.0 (http://www.apache.org/licenses/LICENSE-2.0)
- class hyperion.feats.feature_normalization.MeanVarianceNorm(norm_mean=True, norm_var=False, left_context=None, right_context=None)[source]
Class to perform mean and variance normalization
- norm_mean
normalize mean
- norm_var
normalize variance
- left_context
past context of the sliding window, if None all past frames.
- right_context
future context of the sliding window, if None all future frames.
If left_context==right_context==None, it will apply global mean/variance normalization.
- normalize_conv(x)[source]
Normalizes the features in x using the convolution operator.
- Parameters
x – Input feature matrix.
- Returns
Normalized feature matrix.
- normalize_cumsum(x)[source]
Normalizes the features in x using cumulative sums (cumsum).
- Parameters
x – Input feature matrix.
- Returns
Normalized feature matrix.
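The cumulative-sum approach can be sketched on a single feature dimension in plain Python. This is a minimal illustration, not the library implementation: the function name sliding_mvn is hypothetical, and truncating the window at the start and end of the sequence is an assumption about edge handling.

```python
import math


def sliding_mvn(x, left_context=None, right_context=None, norm_var=False):
    """Sliding-window mean (and optionally variance) normalization of a 1-D sequence."""
    n = len(x)
    # Prefix sums let each window sum be read off as a difference of two entries.
    csum, csum2 = [0.0], [0.0]
    for v in x:
        csum.append(csum[-1] + v)
        csum2.append(csum2[-1] + v * v)
    out = []
    for t in range(n):
        # None means "all past frames" or "all future frames".
        lo = 0 if left_context is None else max(0, t - left_context)
        hi = n if right_context is None else min(n, t + right_context + 1)
        count = hi - lo
        mean = (csum[hi] - csum[lo]) / count
        y = x[t] - mean
        if norm_var:
            var = (csum2[hi] - csum2[lo]) / count - mean * mean
            y /= math.sqrt(max(var, 1e-10))  # floor to avoid division by zero
        out.append(y)
    return out


global_norm = sliding_mvn([1.0, 2.0, 3.0, 4.0])
local_norm = sliding_mvn([1.0, 2.0, 3.0, 4.0], left_context=1, right_context=1)
```

With both contexts set to None the window covers the whole sequence, which reproduces global mean normalization.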
- static filter_args(**kwargs)[source]
Filters ST-CMVN args from arguments dictionary.
- Parameters
prefix – Options prefix.
kwargs – Arguments dictionary.
- Returns
Dictionary with ST-CMVN options.
- static add_class_args(parser, prefix=None)[source]
Adds ST-CMVN options to parser.
- Parameters
parser – Arguments parser
prefix – Options prefix.
- static add_argparse_args(parser, prefix=None)
Adds ST-CMVN options to parser.
- Parameters
parser – Arguments parser
prefix – Options prefix.
Voice Activity Detection Classes
- class hyperion.feats.energy_vad.EnergyVAD(sample_frequency=16000, frame_length=25, frame_shift=10, dither=1, snip_edges=True, vad_energy_mean_scale=0.5, vad_energy_threshold=5, vad_frames_context=0, vad_proportion_threshold=0.6)[source]
Compute VAD based on Kaldi Energy VAD method.
- sample_frequency
Waveform data sample frequency (must match the waveform file, if specified there) (default = 16000)
- frame_length
Frame length in milliseconds (default = 25)
- frame_shift
Frame shift in milliseconds (default = 10)
- dither
Dithering constant (0.0 means no dither) (default = 1)
- snip_edges
If true, end effects will be handled by outputting only frames that completely fit in the file, and the number of frames depends on the frame-length. If false, the number of frames depends only on the frame-shift, and we reflect the data at the ends. (default = True)
- vad_energy_mean_scale
If this is set to s, to get the actual threshold we let m be the mean log-energy of the file, and use s*m + vad-energy-threshold (float, default = 0.5)
- vad_energy_threshold
Constant term in energy threshold for MFCC0 for VAD (also see --vad-energy-mean-scale) (float, default = 5)
- vad_frames_context
Number of frames of context on each side of central frame, in window for which energy is monitored (int, default = 0)
- vad_proportion_threshold
Parameter controlling the proportion of frames within the window that need to have more energy than the threshold (float, default = 0.6)
- __init__(sample_frequency=16000, frame_length=25, frame_shift=10, dither=1, snip_edges=True, vad_energy_mean_scale=0.5, vad_energy_threshold=5, vad_frames_context=0, vad_proportion_threshold=0.6)[source]
- compute(x, return_loge=False)[source]
Evaluates the VAD.
- Parameters
x – Wave
return_loge – If true, it also returns the log-energy.
- Returns
Binary VAD
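The decision rule described by the vad_* options can be sketched as follows, starting from per-frame log-energies. The logic mirrors the Kaldi energy VAD that this class is based on; the function name energy_vad and the plain-list interface are assumptions made for illustration.

```python
def energy_vad(log_energy, mean_scale=0.5, threshold=5.0,
               frames_context=0, proportion_threshold=0.6):
    """Frame-level speech/non-speech decisions from log-energies (Kaldi-style sketch)."""
    n = len(log_energy)
    if n == 0:
        return []
    # Actual threshold: s * m + vad_energy_threshold, with m the mean log-energy.
    thr = threshold + mean_scale * sum(log_energy) / n
    above = [e > thr for e in log_energy]
    vad = []
    for t in range(n):
        # Window of frames_context frames on each side, clipped at the edges.
        lo = max(0, t - frames_context)
        hi = min(n, t + frames_context + 1)
        num_above = sum(above[lo:hi])
        # Speech if a large enough proportion of the window exceeds the threshold.
        vad.append(num_above >= (hi - lo) * proportion_threshold)
    return vad


decisions = energy_vad([0.0, 0.0, 10.0, 10.0, 0.0])
```

In this example the mean log-energy is 4, so the threshold is 0.5 * 4 + 5 = 7 and only the two frames with log-energy 10 are marked as speech.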
- static filter_args(**kwargs)[source]
Filters VAD args from arguments dictionary.
- Parameters
kwargs – Arguments dictionary.
- Returns
Dictionary with VAD options.
- static add_class_args(parser, prefix=None)[source]
Adds VAD options to parser.
- Parameters
parser – Arguments parser
prefix – Options prefix.
- static add_argparse_args(parser, prefix=None)
Adds VAD options to parser.
- Parameters
parser – Arguments parser
prefix – Options prefix.
Feature Extraction Functions
Speech Augmentation
Combined Speech Augmentation Class
- class hyperion.augment.speech_augment.SpeechAugment(speed_aug=None, reverb_aug=None, noise_aug=None)[source]
Class to add noise and reverberation on-the-fly when training neural networks.
- speed_aug
SpeedAugment object
- reverb_aug
ReverbAugment object
- noise_aug
NoiseAugment object
- property max_reverb_context
Noise Augmentation Classes
- class hyperion.augment.noise_augment.NoiseAugment(noise_prob, noise_types, random_seed=112358, rng=None)[source]
Class to augment speech with additive noise of multiple types, e.g., music, babble, … It randomly chooses which noise type to add.
- noise_prob
probability of adding noise
- noise_types
dictionary of options with one entry per noise type. Each entry is also a dictionary with the following entries: weight, max_snr, min_snr, noise_path. The weight parameter is proportional to how often we want to sample a given noise type.
- rng
Random number generator returned by np.random.RandomState (optional)
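A noise_types dictionary and the weighted choice of a noise type might look like the sketch below. The entry names, SNR ranges and file paths are hypothetical, and Python's random.Random stands in for the np.random.RandomState object the class expects, so this only illustrates the sampling idea.

```python
import random

# Hypothetical configuration: one entry per noise type.
noise_types = {
    "music": {"weight": 1.0, "min_snr": 5.0, "max_snr": 15.0,
              "noise_path": "data/music/wav.scp"},
    "babble": {"weight": 2.0, "min_snr": 13.0, "max_snr": 20.0,
               "noise_path": "data/babble/wav.scp"},
}


def sample_noise(noise_prob, noise_types, rng):
    """Decide whether to add noise; if so, pick a noise type and an SNR."""
    if rng.random() >= noise_prob:
        return None  # leave the signal clean
    names = list(noise_types)
    weights = [noise_types[name]["weight"] for name in names]
    # Types with a larger weight are drawn proportionally more often.
    name = rng.choices(names, weights=weights, k=1)[0]
    cfg = noise_types[name]
    snr = rng.uniform(cfg["min_snr"], cfg["max_snr"])
    return name, snr


rng = random.Random(112358)
choice = sample_noise(1.0, noise_types, rng)
```

With these weights, babble would be selected about twice as often as music.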
- class hyperion.augment.noise_augment.SingleNoiseAugment(noise_type, noise_path, min_snr, max_snr, random_seed=112358, rng=None)[source]
Class to augment speech with additive noise of a single type, e.g., music, babble, …
- noise_type
string label indicating the noise type.
- noise_path
path to Kaldi style wav.scp file indicating the path to the noise wav files.
- min_snr
minimum SNR (dB) to sample from.
- max_snr
maximum SNR(dB) to sample from.
- rng
Random number generator returned by np.random.RandomState (optional)
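Once an SNR has been drawn between min_snr and max_snr, the noise has to be scaled so that the mixture reaches that SNR. The following is a minimal sketch of the standard gain computation; the function names are hypothetical and the class may handle details such as noise cropping or silence differently.

```python
import math


def power(x):
    """Average power of a signal."""
    return sum(v * v for v in x) / len(x)


def mix_at_snr(speech, noise, snr_db):
    """Scale the noise so that speech power / noise power equals snr_db, then add it."""
    target_noise_power = power(speech) / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / power(noise))
    noisy = [s + gain * n for s, n in zip(speech, noise)]
    return noisy, gain


speech = [1.0, -1.0, 1.0, -1.0]
noise = [0.5, 0.5, -0.5, -0.5]
noisy, gain = mix_at_snr(speech, noise, snr_db=10.0)

# SNR actually achieved by the scaled noise, in dB.
achieved_snr = 10 * math.log10(power(speech) / power([gain * n for n in noise]))
```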
Reverberation Augmentation Classes
- class hyperion.augment.reverb_augment.ReverbAugment(reverb_prob, rir_types, max_reverb_context=0, random_seed=112358, rng=None)[source]
Class to augment speech with reverberation using RIRs of multiple types, e.g., small room, medium room, large room. It randomly chooses which RIR type to add.
- reverb_prob
probability of adding reverberation
- rir_types
dictionary of options with one entry per RIR type. Each entry is also a dictionary with the following entries: weight, rir_norm, comp_delay, rir_path. The weight parameter is proportional to how often we want to sample a given RIR type.
- max_reverb_context
number of samples required as left context for the convolution operation.
- rng
Random number generator returned by np.random.RandomState (optional)
- classmethod create(cfg, random_seed=112358, rng=None)[source]
Creates a ReverbAugment object from options dictionary or YAML file.
- Parameters
cfg – YAML file path or dictionary with reverb options.
rng – Random number generator returned by np.random.RandomState (optional)
- Returns
ReverbAugment object
- class hyperion.augment.reverb_augment.SingleReverbAugment(rir_type, rir_path, rir_norm=None, comp_delay=True, preload_rirs=True, random_seed=112358, rng=None)[source]
Class to augment speech with reverberation using RIRs of a single type, e.g., small room, medium room, large room.
- rir_type
string label indicating the RIR type.
- rir_path
Kaldi style rspecifier to Ark or H5 file containing RIRs
- rir_norm
RIR normalization method between None, ‘max’ or ‘energy’
- comp_delay
compensate the delay introduced by the RIR if any, this delay will happen if the maximum of the RIR is not in its first sample.
- preload_rirs
if True, all RIRs are loaded into RAM
- rng
Random number generator returned by np.random.RandomState (optional)
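The effect of rir_norm and comp_delay can be sketched with a direct convolution. This is an illustrative reading of the two options, not the library code: the helper names are hypothetical, and taking the delay as the position of the largest RIR sample follows the description of comp_delay above.

```python
import math


def normalize_rir(h, rir_norm=None):
    """Normalize an RIR by its peak ('max') or by its energy ('energy')."""
    if rir_norm == "max":
        scale = max(abs(v) for v in h)
    elif rir_norm == "energy":
        scale = math.sqrt(sum(v * v for v in h))
    else:
        return list(h)
    return [v / scale for v in h]


def reverberate(x, h, comp_delay=True):
    """Convolve x with h and return an output of the same length as x."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xv in enumerate(x):
        for j, hv in enumerate(h):
            y[i + j] += xv * hv
    # The delay is the index of the RIR peak; skipping it re-aligns the output.
    delay = max(range(len(h)), key=lambda k: abs(h[k])) if comp_delay else 0
    return y[delay:delay + len(x)]


rir = normalize_rir([0.0, 0.0, 2.0, 1.0], rir_norm="max")
aligned = reverberate([1.0, 2.0, 3.0], rir, comp_delay=True)
delayed = reverberate([1.0, 2.0, 3.0], rir, comp_delay=False)
```

Here the RIR peak is at sample 2, so without compensation the output starts with two zeros, while with compensation the first output sample lines up with the first input sample.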
Speed Augmentation Classes
- class hyperion.augment.speed_augment.SpeedAugment(speed_prob, speed_ratios=[0.9, 1.1], keep_length=False, random_seed=112358, rng=None)[source]
Class to augment speech with speed perturbation
- speed_prob
probability of applying speed perturbation
- speed_ratios
list of speed perturbation ratios
- keep_length
applies padding or cropping to keep the length of the signal
- random_seed
random seed for random number generator
- rng
Random number generator returned by np.random.RandomState (optional)
- __init__(speed_prob, speed_ratios=[0.9, 1.1], keep_length=False, random_seed=112358, rng=None)[source]
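To illustrate what a speed ratio and keep_length do to the signal length, the sketch below resamples by linear interpolation and then pads or crops. Production code would normally use a proper resampler, so the interpolation, the zero padding and the function name change_speed are all assumptions made for illustration.

```python
def change_speed(x, ratio, keep_length=False):
    """Speed up (ratio > 1) or slow down (ratio < 1) a signal by linear interpolation."""
    n_out = max(1, int(round(len(x) / ratio)))
    last = len(x) - 1
    y = []
    for k in range(n_out):
        pos = k * ratio          # fractional position in the input signal
        i = int(pos)
        frac = pos - i
        a = x[min(i, last)]
        b = x[min(i + 1, last)]
        y.append((1.0 - frac) * a + frac * b)
    if keep_length:
        # Crop a longer output, or pad a shorter one with zeros (assumed padding).
        y = y[:len(x)] + [0.0] * max(0, len(x) - len(y))
    return y


signal = [float(n) for n in range(100)]
faster = change_speed(signal, 1.1)
slower = change_speed(signal, 0.9)
same_length = change_speed(signal, 1.1, keep_length=True)
```

A ratio of 1.1 shortens the 100-sample signal to 91 samples and a ratio of 0.9 lengthens it to 111; with keep_length the output always has 100 samples.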
Hyperion Numpy Models
All numpy ML models in Hyperion derive from the same base class.
Probability Density Functions
These classes define different probability density functions, such as GMMs and PLDA.
Core PDF Classes
- class hyperion.pdfs.core.pdf.PDF(x_dim=1, **kwargs)[source]
- copy()
- abstract fit(x, sample_weights=None, x_val=None, sample_weights_val=None)
- abstract fit_generator(x, x_val=None)
- init_to_false()
- abstract initialize()
- property is_init
- classmethod load(file_path)
- classmethod load_config(file_path)
- static load_config_from_json(json_str)
- classmethod load_params(f, config)
- abstract save(file_path)
- abstract save_params(f)
- to_json(**kwargs)
- class hyperion.pdfs.core.exp_family.ExpFamily(eta=None, **kwargs)[source]
- property is_init
- copy()
- eval_llk(x)
- abstract fit_generator(x, x_val=None)
- generate(num_samples, **kwargs)
- get_config()
- init_to_false()
- abstract initialize()
- classmethod load(file_path)
- classmethod load_config(file_path)
- static load_config_from_json(json_str)
- classmethod load_params(f, config)
- abstract sample(num_samples)
- abstract save(file_path)
- abstract save_params(f)
- to_json(**kwargs)
- class hyperion.pdfs.core.normal_diag_cov.NormalDiagCov(mu=None, Lambda=None, var_floor=1e-05, update_mu=True, update_Lambda=True, **kwargs)[source]
- __init__(mu=None, Lambda=None, var_floor=1e-05, update_mu=True, update_Lambda=True, **kwargs)[source]
- property logLambda
- property cholLambda
- property Sigma
- Estep(x, u_x=None, sample_weight=None, batch_size=None)
- accum_log_h(x, sample_weight=None)
- accum_suff_stats(x, u_x=None, sample_weight=None, batch_size=None)
- add_suff_stats(N, u_x)
- copy()
- elbo(x, u_x=None, N=1, log_h=None, sample_weight=None, batch_size=None)
- eval_llk(x)
- fit(x, sample_weight=None, x_val=None, sample_weight_val=None, batch_size=None)
- abstract fit_generator(x, x_val=None)
- generate(num_samples, **kwargs)
- init_to_false()
- property is_init
- classmethod load(file_path)
- classmethod load_config(file_path)
- static load_config_from_json(json_str)
- log_h(x)
- log_prob(x, u_x=None, method='nat')
- log_prob_nat(x, u_x=None)
- abstract save(file_path)
- to_json(**kwargs)
PLDA Classes
- class hyperion.pdfs.plda.plda_base.PLDABase(y_dim=None, mu=None, update_mu=True, **kwargs)[source]
- fit(x, class_ids=None, ptheta=None, sample_weight=None, x_val=None, class_ids_val=None, ptheta_val=None, sample_weight_val=None, epochs=20, ml_md='ml+md', md_epochs=None)[source]
- fit_adapt_weighted_avg_model(x, class_ids=None, ptheta=None, sample_weight=None, x_val=None, class_ids_val=None, ptheta_val=None, sample_weight_val=None, epochs=20, ml_md='ml+md', md_epochs=None, plda0=None, w_mu=1, w_B=0.5, w_W=0.5)[source]
- fit_adapt(x, class_ids=None, ptheta=None, sample_weight=None, x0=None, class_ids0=None, ptheta0=None, sample_weight0=None, x_val=None, class_ids_val=None, ptheta_val=None, sample_weight_val=None, epochs=20, ml_md='ml+md', md_epochs=None)[source]
- copy()
- eval_llk(x)
- abstract fit_generator(x, x_val=None)
- generate(num_samples, **kwargs)
- init_to_false()
- property is_init
- classmethod load(file_path)
- classmethod load_config(file_path)
- static load_config_from_json(json_str)
- classmethod load_params(f, config)
- abstract log_prob(x)
- abstract save(file_path)
- abstract save_params(f)
- to_json(**kwargs)
- class hyperion.pdfs.plda.frplda.FRPLDA(mu=None, B=None, W=None, fullcov_W=True, update_mu=True, update_B=True, update_W=True, **kwargs)[source]
- __init__(mu=None, B=None, W=None, fullcov_W=True, update_mu=True, update_B=True, update_W=True, **kwargs)[source]
- property is_init
- static center_stats(D, mu)
- static compute_stats_hard(x, class_ids, sample_weight=None, scale_factor=None)
- static compute_stats_hard_v0(x, class_ids, sample_weight=None, scal_factor=None)
- static compute_stats_soft(x, p_theta, sample_weight=None, scal_factor=None)
- copy()
- eval_llk(x)
- fit(x, class_ids=None, ptheta=None, sample_weight=None, x_val=None, class_ids_val=None, ptheta_val=None, sample_weight_val=None, epochs=20, ml_md='ml+md', md_epochs=None)
- fit_adapt(x, class_ids=None, ptheta=None, sample_weight=None, x0=None, class_ids0=None, ptheta0=None, sample_weight0=None, x_val=None, class_ids_val=None, ptheta_val=None, sample_weight_val=None, epochs=20, ml_md='ml+md', md_epochs=None)
- fit_adapt_weighted_avg_model(x, class_ids=None, ptheta=None, sample_weight=None, x_val=None, class_ids_val=None, ptheta_val=None, sample_weight_val=None, epochs=20, ml_md='ml+md', md_epochs=None, plda0=None, w_mu=1, w_B=0.5, w_W=0.5)
- abstract fit_generator(x, x_val=None)
- generate(num_samples, **kwargs)
- init_to_false()
- llr_Nvs1(x1, x2, ids1=None, method='vavg-lnorm')
- llr_Nvs1_savg(x1, ids1, x2)
- llr_Nvs1_vavg(D1, x2, do_lnorm=True)
- llr_NvsM(x1, x2, ids1=None, ids2=None, method='vavg-lnorm')
- llr_NvsM_savg(x1, ids1, x2, ids2)
- llr_NvsM_vavg(D1, D2, do_lnorm=True)
- classmethod load(file_path)
- classmethod load_config(file_path)
- static load_config_from_json(json_str)
- abstract log_prob(x)
- abstract save(file_path)
- to_json(**kwargs)
- abstract weigthed_avg_model(plda)
- weigthed_avg_params(mu, w_mu)
- class hyperion.pdfs.plda.splda.SPLDA(y_dim=None, mu=None, V=None, W=None, fullcov_W=True, update_mu=True, update_V=True, update_W=True, **kwargs)[source]
- __init__(y_dim=None, mu=None, V=None, W=None, fullcov_W=True, update_mu=True, update_V=True, update_W=True, **kwargs)[source]
- property is_init
- static center_stats(D, mu)
- static compute_stats_hard(x, class_ids, sample_weight=None, scale_factor=None)
- static compute_stats_hard_v0(x, class_ids, sample_weight=None, scal_factor=None)
- static compute_stats_soft(x, p_theta, sample_weight=None, scal_factor=None)
- copy()
- eval_llk(x)
- fit(x, class_ids=None, ptheta=None, sample_weight=None, x_val=None, class_ids_val=None, ptheta_val=None, sample_weight_val=None, epochs=20, ml_md='ml+md', md_epochs=None)
- fit_adapt(x, class_ids=None, ptheta=None, sample_weight=None, x0=None, class_ids0=None, ptheta0=None, sample_weight0=None, x_val=None, class_ids_val=None, ptheta_val=None, sample_weight_val=None, epochs=20, ml_md='ml+md', md_epochs=None)
- fit_adapt_weighted_avg_model(x, class_ids=None, ptheta=None, sample_weight=None, x_val=None, class_ids_val=None, ptheta_val=None, sample_weight_val=None, epochs=20, ml_md='ml+md', md_epochs=None, plda0=None, w_mu=1, w_B=0.5, w_W=0.5)
- abstract fit_generator(x, x_val=None)
- generate(num_samples, **kwargs)
- init_to_false()
- llr_Nvs1(x1, x2, ids1=None, method='vavg-lnorm')
- llr_Nvs1_savg(x1, ids1, x2)
- llr_Nvs1_vavg(D1, x2, do_lnorm=True)
- llr_NvsM(x1, x2, ids1=None, ids2=None, method='vavg-lnorm')
- llr_NvsM_savg(x1, ids1, x2, ids2)
- llr_NvsM_vavg(D1, D2, do_lnorm=True)
- classmethod load(file_path)
- classmethod load_config(file_path)
- static load_config_from_json(json_str)
- abstract log_prob(x)
- abstract save(file_path)
- to_json(**kwargs)
- abstract weigthed_avg_model(plda)
- weigthed_avg_params(mu, w_mu)
- class hyperion.pdfs.plda.plda.PLDA(y_dim=None, z_dim=None, mu=None, V=None, U=None, D=None, floor_iD=1e-05, update_mu=True, update_V=True, update_U=True, update_D=True, **kwargs)[source]
- __init__(y_dim=None, z_dim=None, mu=None, V=None, U=None, D=None, floor_iD=1e-05, update_mu=True, update_V=True, update_U=True, update_D=True, **kwargs)[source]
- property is_init
- static center_stats(D, mu)
- static compute_stats_hard(x, class_ids, sample_weight=None, scale_factor=None)
- static compute_stats_hard_v0(x, class_ids, sample_weight=None, scal_factor=None)
- static compute_stats_soft(x, p_theta, sample_weight=None, scal_factor=None)
- copy()
- eval_llk(x)
- fit(x, class_ids=None, ptheta=None, sample_weight=None, x_val=None, class_ids_val=None, ptheta_val=None, sample_weight_val=None, epochs=20, ml_md='ml+md', md_epochs=None)
- fit_adapt(x, class_ids=None, ptheta=None, sample_weight=None, x0=None, class_ids0=None, ptheta0=None, sample_weight0=None, x_val=None, class_ids_val=None, ptheta_val=None, sample_weight_val=None, epochs=20, ml_md='ml+md', md_epochs=None)
- fit_adapt_weighted_avg_model(x, class_ids=None, ptheta=None, sample_weight=None, x_val=None, class_ids_val=None, ptheta_val=None, sample_weight_val=None, epochs=20, ml_md='ml+md', md_epochs=None, plda0=None, w_mu=1, w_B=0.5, w_W=0.5)
- abstract fit_generator(x, x_val=None)
- generate(num_samples, **kwargs)
- init_to_false()
- llr_Nvs1(x1, x2, ids1=None, method='vavg-lnorm')
- llr_Nvs1_savg(x1, ids1, x2)
- llr_Nvs1_vavg(D1, x2, do_lnorm=True)
- llr_NvsM(x1, x2, ids1=None, ids2=None, method='vavg-lnorm')
- llr_NvsM_savg(x1, ids1, x2, ids2)
- llr_NvsM_vavg(D1, D2, do_lnorm=True)
- classmethod load(file_path)
- classmethod load_config(file_path)
- static load_config_from_json(json_str)
- abstract log_prob(x)
- abstract save(file_path)
- to_json(**kwargs)
- abstract weigthed_avg_model(plda)
- weigthed_avg_params(mu, w_mu)
Mixture Models
- class hyperion.pdfs.mixtures.exp_family_mixture.ExpFamilyMixture(num_comp=1, pi=None, eta=None, min_N=0, update_pi=True, **kwargs)[source]
- property is_init
- property log_pi
- fit_generator(generator, train_steps, epochs=10, val_data=None, val_steps=0, max_queue_size=10, workers=1, use_multiprocessing=False)[source]
- accum_suff_stats_sorttime(x, frame_length, frame_shift, u_x=None, sample_weight=None, batch_size=None)[source]
- Estep_generator(generator, num_steps, return_log_h, max_queue_size=10, workers=1, use_multiprocessin=False)[source]
- copy()
- eval_llk(x)
- generate(num_samples, **kwargs)
- init_to_false()
- abstract initialize()
- classmethod load(file_path)
- classmethod load_config(file_path)
- static load_config_from_json(json_str)
- classmethod load_params(f, config)
- abstract sample(num_samples)
- abstract save(file_path)
- abstract save_params(f)
- to_json(**kwargs)
- class hyperion.pdfs.mixtures.gmm.GMM(mu=None, Lambda=None, var_floor=0.001, update_mu=True, update_Lambda=True, **kwargs)[source]
- __init__(mu=None, Lambda=None, var_floor=0.001, update_mu=True, update_Lambda=True, **kwargs)[source]
- property logLambda
- property cholLambda
- property Sigma
- Estep(x, u_x=None, sample_weight=None, batch_size=None)
- Estep_generator(generator, num_steps, return_log_h, max_queue_size=10, workers=1, use_multiprocessin=False)
- accum_log_h(x, sample_weight=None)
- accum_suff_stats(x, u_x=None, sample_weight=None, batch_size=None)
- accum_suff_stats_segments(x, segments, u_x=None, sample_weight=None, batch_size=None)
- accum_suff_stats_segments_prob(x, prob, u_x=None, sample_weight=None, batch_size=None)
- accum_suff_stats_sorttime(x, frame_length, frame_shift, u_x=None, sample_weight=None, batch_size=None)
- compute_log_pz(x, u_x=None, mode='nat')
- compute_pz(x, u_x=None, mode='nat')
- compute_pz_nat(x, u_x=None)
- compute_pz_std(x)
- copy()
- elbo(x, u_x=None, N=1, log_h=None, sample_weight=None, batch_size=None)
- eval_llk(x)
- fit(x, sample_weight=None, x_val=None, sample_weight_val=None, epochs=10, batch_size=None)
- fit_generator(generator, train_steps, epochs=10, val_data=None, val_steps=0, max_queue_size=10, workers=1, use_multiprocessing=False)
- generate(num_samples, **kwargs)
- init_to_false()
- property is_init
- classmethod load(file_path)
- classmethod load_config(file_path)
- static load_config_from_json(json_str)
- log_h(x)
- property log_pi
- log_prob(x, u_x=None, mode='nat')
- log_prob_nat(x, u_x=None)
- log_prob_nbest(x, u_x=None, mode='nat', nbest_mode='master', nbest=1)
- log_prob_nbest_nat(x, u_x=None, nbest_mode='master', nbest=1)
- abstract log_prob_nbest_std(x, nbest_mode='master', nbest=1)
- abstract save(file_path)
- sum_suff_stats(N, u_x)
- to_json(**kwargs)
- static tuple2data(data)
- class hyperion.pdfs.mixtures.gmm_diag_cov.GMMDiagCov(mu=None, Lambda=None, var_floor=0.001, update_mu=True, update_Lambda=True, **kwargs)[source]
- __init__(mu=None, Lambda=None, var_floor=0.001, update_mu=True, update_Lambda=True, **kwargs)[source]
- property logLambda
- property cholLambda
- property Sigma
- Estep(x, u_x=None, sample_weight=None, batch_size=None)
- Estep_generator(generator, num_steps, return_log_h, max_queue_size=10, workers=1, use_multiprocessin=False)
- accum_log_h(x, sample_weight=None)
- accum_suff_stats(x, u_x=None, sample_weight=None, batch_size=None)
- accum_suff_stats_segments(x, segments, u_x=None, sample_weight=None, batch_size=None)
- accum_suff_stats_segments_prob(x, prob, u_x=None, sample_weight=None, batch_size=None)
- accum_suff_stats_sorttime(x, frame_length, frame_shift, u_x=None, sample_weight=None, batch_size=None)
- compute_log_pz(x, u_x=None, mode='nat')
- compute_pz(x, u_x=None, mode='nat')
- compute_pz_nat(x, u_x=None)
- compute_pz_std(x)
- copy()
- elbo(x, u_x=None, N=1, log_h=None, sample_weight=None, batch_size=None)
- eval_llk(x)
- fit(x, sample_weight=None, x_val=None, sample_weight_val=None, epochs=10, batch_size=None)
- fit_generator(generator, train_steps, epochs=10, val_data=None, val_steps=0, max_queue_size=10, workers=1, use_multiprocessing=False)
- generate(num_samples, **kwargs)
- init_to_false()
- property is_init
- classmethod load(file_path)
- classmethod load_config(file_path)
- static load_config_from_json(json_str)
- log_h(x)
- property log_pi
- log_prob(x, u_x=None, mode='nat')
- log_prob_nat(x, u_x=None)
- log_prob_nbest(x, u_x=None, mode='nat', nbest_mode='master', nbest=1)
- log_prob_nbest_nat(x, u_x=None, nbest_mode='master', nbest=1)
- abstract log_prob_nbest_std(x, nbest_mode='master', nbest=1)
- abstract save(file_path)
- sum_suff_stats(N, u_x)
- to_json(**kwargs)
- static tuple2data(data)
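As a point of reference for what log_prob evaluates in a diagonal-covariance GMM, the log-likelihood of a single vector can be sketched with the log-sum-exp trick. The sketch is parameterized by variances (the inverse of the precisions Lambda) and uses standard parameters rather than the natural parameters the classes work with; the function names are hypothetical.

```python
import math


def log_normal_diag(x, mu, var):
    """Log-density of a Gaussian with diagonal covariance."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mu, var)
    )


def gmm_log_prob(x, weights, means, variances):
    """log sum_k pi_k N(x | mu_k, diag(var_k)), computed stably."""
    terms = [
        math.log(w) + log_normal_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    # Subtract the largest term before exponentiating to avoid underflow.
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


# One standard-normal component evaluated at the mean.
llk = gmm_log_prob([0.0], [1.0], [[0.0]], [[1.0]])
```

For a single standard-normal component evaluated at its mean, the result is -0.5 log(2 pi).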
- class hyperion.pdfs.mixtures.gmm_tied_diag_cov.GMMTiedDiagCov(mu=None, Lambda=None, var_floor=0.001, update_mu=True, update_Lambda=True, **kwargs)[source]
- __init__(mu=None, Lambda=None, var_floor=0.001, update_mu=True, update_Lambda=True, **kwargs)[source]
- Estep(x, u_x=None, sample_weight=None, batch_size=None)
- Estep_generator(generator, num_steps, return_log_h, max_queue_size=10, workers=1, use_multiprocessin=False)
- property Sigma
- accum_log_h(x, sample_weight=None)
- accum_suff_stats(x, u_x=None, sample_weight=None, batch_size=None)
- accum_suff_stats_segments(x, segments, u_x=None, sample_weight=None, batch_size=None)
- accum_suff_stats_segments_prob(x, prob, u_x=None, sample_weight=None, batch_size=None)
- accum_suff_stats_sorttime(x, frame_length, frame_shift, u_x=None, sample_weight=None, batch_size=None)
- property cholLambda
- static compute_A_nat(eta)
- static compute_A_std(mu, Lambda)
- compute_log_pz(x, u_x=None, mode='nat')
- compute_pz(x, u_x=None, mode='nat')
- compute_pz_nat(x, u_x=None)
- compute_pz_std(x)
- static compute_suff_stats(x)
- copy()
- elbo(x, u_x=None, N=1, log_h=None, sample_weight=None, batch_size=None)
- eval_llk(x)
- fit(x, sample_weight=None, x_val=None, sample_weight_val=None, epochs=10, batch_size=None)
- fit_generator(generator, train_steps, epochs=10, val_data=None, val_steps=0, max_queue_size=10, workers=1, use_multiprocessing=False)
- generate(num_samples, **kwargs)
- get_config()
- init_to_false()
- initialize(x=None)
- property is_init
- classmethod load(file_path)
- classmethod load_config(file_path)
- static load_config_from_json(json_str)
- classmethod load_from_kaldi(file_path)
- classmethod load_params(f, config)
- property logLambda
- log_h(x)
- property log_pi
- log_prob(x, u_x=None, mode='nat')
- log_prob_nat(x, u_x=None)
- log_prob_nbest(x, u_x=None, mode='nat', nbest_mode='master', nbest=1)
- log_prob_nbest_nat(x, u_x=None, nbest_mode='master', nbest=1)
- abstract log_prob_nbest_std(x, nbest_mode='master', nbest=1)
- norm_suff_stats(N, u_x, return_order2=False)
- abstract save(file_path)
- save_params(f)
- stack_suff_stats(F, S=None)
- sum_suff_stats(N, u_x)
- to_json(**kwargs)
- static tuple2data(data)
- unstack_suff_stats(stats)
- validate()
Classifiers and Calibrators
Gaussian Classifiers
- class hyperion.classifiers.linear_gbe.LinearGBE(mu=None, W=None, update_mu=True, update_W=True, x_dim=1, num_classes=None, balance_class_weight=True, beta=None, nu=None, prior=None, prior_beta=None, prior_nu=None, post_beta=None, post_nu=None, **kwargs)[source]
- __init__(mu=None, W=None, update_mu=True, update_W=True, x_dim=1, num_classes=None, balance_class_weight=True, beta=None, nu=None, prior=None, prior_beta=None, prior_nu=None, post_beta=None, post_nu=None, **kwargs)[source]
- static filter_train_args(**kwargs)
- static add_argparse_args(parser, prefix=None)
- static add_argparse_train_args(parser, prefix=None)
- static add_argparse_eval_args(parser, prefix=None)
- copy()
- abstract fit_generator(x, x_val=None)
- init_to_false()
- abstract initialize()
- property is_init
- classmethod load(file_path)
- classmethod load_config(file_path)
- static load_config_from_json(json_str)
- abstract save(file_path)
- to_json(**kwargs)
SVM Classifiers
- class hyperion.classifiers.linear_svmc.LinearSVMC(A=None, b=None, penalty='l2', C=1.0, loss='squared_hinge', use_bias=True, bias_scaling=1, class_weight=None, random_state=None, max_iter=100, dual=True, tol=0.0001, multi_class='ovr', verbose=0, balance_class_weight=True, lr_seed=1024, **kwargs)[source]
- __init__(A=None, b=None, penalty='l2', C=1.0, loss='squared_hinge', use_bias=True, bias_scaling=1, class_weight=None, random_state=None, max_iter=100, dual=True, tol=0.0001, multi_class='ovr', verbose=0, balance_class_weight=True, lr_seed=1024, **kwargs)[source]
- property A
- property b
- static add_argparse_train_args(parser, prefix=None)
- static add_argparse_eval_args(parser, prefix=None)
- copy()
- abstract fit_generator(x, x_val=None)
- init_to_false()
- abstract initialize()
- property is_init
- classmethod load(file_path)
- classmethod load_config(file_path)
- static load_config_from_json(json_str)
- abstract save(file_path)
- to_json(**kwargs)
Logistic Regression Classifiers and Calibrators
- class hyperion.classifiers.logistic_regression.LogisticRegression(A=None, b=None, penalty='l2', lambda_reg=1e-05, use_bias=True, bias_scaling=1, priors=None, random_state=None, solver='lbfgs', max_iter=100, dual=False, tol=0.0001, multi_class='multinomial', verbose=0, warm_start=True, num_jobs=1, lr_seed=1024, **kwargs)[source]
- __init__(A=None, b=None, penalty='l2', lambda_reg=1e-05, use_bias=True, bias_scaling=1, priors=None, random_state=None, solver='lbfgs', max_iter=100, dual=False, tol=0.0001, multi_class='multinomial', verbose=0, warm_start=True, num_jobs=1, lr_seed=1024, **kwargs)[source]
Wrapper for scikit-learn logistic regression.
- penalty : str, ‘l1’ or ‘l2’, default: ‘l2’
Used to specify the norm used in the penalization. The ‘newton-cg’, ‘sag’ and ‘lbfgs’ solvers support only l2 penalties.
New in scikit-learn version 0.19: l1 penalty with SAGA solver (allowing ‘multinomial’ + L1)
- dual : bool, default: False
Dual or primal formulation. Dual formulation is only implemented for l2 penalty with liblinear solver. Prefer dual=False when n_samples > n_features.
- tol : float, default: 1e-4
Tolerance for stopping criteria.
- lambda_reg : float, default: 1e-5
Regularization strength; must be a positive float.
- use_bias : bool, default: True
Specifies if a constant (a.k.a. bias or intercept) should be added to the decision function.
- bias_scalingfloat, default 1.
Useful only when the solver ‘liblinear’ is used and use_bias is set to True. In this case, x becomes [x, bias_scaling], i.e. a “synthetic” feature with constant value equal to intercept_scaling is appended to the instance vector. The intercept becomes intercept_scaling * synthetic_feature_weight. Note! the synthetic feature weight is subject to l1/l2 regularization as all other features. To lessen the effect of regularization on synthetic feature weight (and therefore on the intercept) bias_scaling has to be increased.
- priorsdict or ‘balanced’ default: None
Weights associated with classes in the form {class_label: weight}. If not given, all classes are supposed to have weight one. The “balanced” mode uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data as n_samples / (n_classes * np.bincount(y)). Note that these weights will be multiplied with sample_weight (passed through the fit method) if sample_weight is specified.
- random_stateint, RandomState instance or None, optional, default: None
The seed of the pseudo random number generator to use when shuffling the data. If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; . Used when solver == ‘sag’ or ‘liblinear’.
- solver : {‘newton-cg’, ‘lbfgs’, ‘liblinear’, ‘sag’, ‘saga’}, default: ‘lbfgs’
Algorithm to use in the optimization problem. For small datasets, ‘liblinear’ is a good choice, whereas ‘sag’ and ‘saga’ are faster for large ones. For multiclass problems, only ‘newton-cg’, ‘sag’, ‘saga’ and ‘lbfgs’ handle multinomial loss; ‘liblinear’ is limited to one-versus-rest schemes. ‘newton-cg’, ‘lbfgs’ and ‘sag’ only handle L2 penalty, whereas ‘liblinear’ and ‘saga’ handle L1 penalty. Note that fast convergence of ‘sag’ and ‘saga’ is only guaranteed on features with approximately the same scale. New in version 0.17: Stochastic Average Gradient descent solver. New in version 0.19: SAGA solver.
- max_iter : int, default: 100
Useful only for the newton-cg, sag and lbfgs solvers. Maximum number of iterations taken for the solvers to converge.
- multi_class : str, {‘ovr’, ‘multinomial’}, default: ‘multinomial’
Multiclass option can be either ‘ovr’ or ‘multinomial’. If the option chosen is ‘ovr’, then a binary problem is fit for each label. Otherwise, the loss minimised is the multinomial loss fit across the entire probability distribution. ‘multinomial’ does not work for the liblinear solver. New in version 0.18: Stochastic Average Gradient descent solver for ‘multinomial’ case.
- verbose : int, default: 0
For the liblinear and lbfgs solvers, set verbose to any positive number for verbosity.
- warm_start : bool, default: True
When set to True, reuse the solution of the previous call to fit as initialization; otherwise, just erase the previous solution. Has no effect for the liblinear solver. New in version 0.17: warm_start to support lbfgs, newton-cg, sag, saga solvers.
- num_jobs : int, default: 1
Number of CPU cores used when parallelizing over classes if multi_class=‘ovr’. This parameter is ignored when the solver is set to ‘liblinear’, regardless of whether multi_class is specified or not. If given a value of -1, all cores are used.
- property A
- property b
- static filter_train_args(prefix=None, **kwargs)
- static add_argparse_args(parser, prefix=None)
- static add_argparse_train_args(parser, prefix=None)
- static add_argparse_eval_args(parser, prefix=None)
- copy()
- abstract fit_generator(x, x_val=None)
- init_to_false()
- abstract initialize()
- property is_init
- classmethod load(file_path)
- classmethod load_config(file_path)
- static load_config_from_json(json_str)
- abstract save(file_path)
- to_json(**kwargs)
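The “balanced” mode described for the priors parameter of LogisticRegression can be reproduced with plain NumPy. The following is a minimal sketch of the formula n_samples / (n_classes * np.bincount(y)); the helper name balanced_class_weights is hypothetical and not part of Hyperion, and it assumes integer labels in which every class from 0 to the maximum label appears at least once.

```python
import numpy as np


def balanced_class_weights(y):
    """Return n_samples / (n_classes * np.bincount(y)) for integer labels y.

    Hypothetical helper for illustration; assumes every class id between 0
    and max(y) occurs at least once, otherwise a division by zero occurs.
    """
    y = np.asarray(y, dtype=int)
    counts = np.bincount(y)  # number of samples per class
    return len(y) / (len(counts) * counts)


# Made-up labels: class 0 appears three times, class 1 once.
weights = balanced_class_weights([0, 0, 0, 1])
print(weights)
```

The rarer class receives the larger weight, so both classes contribute equally to the training loss.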
- class hyperion.classifiers.binary_logistic_regression.BinaryLogisticRegression(A=None, b=None, penalty='l2', lambda_reg=1e-06, use_bias=True, bias_scaling=1, prior=0.5, random_state=None, solver='liblinear', max_iter=100, dual=False, tol=0.0001, verbose=0, warm_start=True, lr_seed=1024, **kwargs)[source]
- __init__(A=None, b=None, penalty='l2', lambda_reg=1e-06, use_bias=True, bias_scaling=1, prior=0.5, random_state=None, solver='liblinear', max_iter=100, dual=False, tol=0.0001, verbose=0, warm_start=True, lr_seed=1024, **kwargs)[source]
Wrapper for sklearn logistic regression.
- penalty : str, ‘l1’ or ‘l2’, default: ‘l2’
Used to specify the norm used in the penalization. The ‘newton-cg’, ‘sag’ and ‘lbfgs’ solvers support only l2 penalties. New in version 0.19: l1 penalty with SAGA solver (allowing ‘multinomial’ + L1).
- dual : bool, default: False
Dual or primal formulation. Dual formulation is only implemented for l2 penalty with liblinear solver. Prefer dual=False when n_samples > n_features.
- tol : float, default: 1e-4
Tolerance for stopping criteria.
- lambda_reg : float, default: 1e-6
Regularization strength; must be a positive float.
- use_bias : bool, default: True
Specifies if a constant (a.k.a. bias or intercept) should be added to the decision function.
- bias_scaling : float, default: 1
Useful only when the solver ‘liblinear’ is used and use_bias is set to True. In this case, x becomes [x, bias_scaling], i.e. a “synthetic” feature with constant value equal to bias_scaling is appended to the instance vector. The intercept becomes bias_scaling * synthetic_feature_weight. Note that the synthetic feature weight is subject to l1/l2 regularization like all other features. To lessen the effect of regularization on the synthetic feature weight (and therefore on the intercept), increase bias_scaling.
- priors : dict or ‘balanced’, default: None
Weights associated with classes in the form {class_label: weight}. If not given, all classes are supposed to have weight one. The “balanced” mode uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data as n_samples / (n_classes * np.bincount(y)). Note that these weights will be multiplied with sample_weight (passed through the fit method) if sample_weight is specified.
- random_state : int, RandomState instance or None, optional, default: None
The seed of the pseudo-random number generator to use when shuffling the data. If int, random_state is the seed used by the random number generator; if RandomState instance, random_state is the random number generator. Used when solver == ‘sag’ or ‘liblinear’.
- solver : {‘newton-cg’, ‘lbfgs’, ‘liblinear’, ‘sag’, ‘saga’}, default: ‘liblinear’
Algorithm to use in the optimization problem. For small datasets, ‘liblinear’ is a good choice, whereas ‘sag’ and ‘saga’ are faster for large ones. For multiclass problems, only ‘newton-cg’, ‘sag’, ‘saga’ and ‘lbfgs’ handle multinomial loss; ‘liblinear’ is limited to one-versus-rest schemes. ‘newton-cg’, ‘lbfgs’ and ‘sag’ only handle L2 penalty, whereas ‘liblinear’ and ‘saga’ handle L1 penalty. Note that fast convergence of ‘sag’ and ‘saga’ is only guaranteed on features with approximately the same scale. New in version 0.17: Stochastic Average Gradient descent solver. New in version 0.19: SAGA solver.
- max_iter : int, default: 100
Useful only for the newton-cg, sag and lbfgs solvers. Maximum number of iterations taken for the solvers to converge.
- multi_class : str, {‘ovr’, ‘multinomial’}, default: ‘ovr’
Multiclass option can be either ‘ovr’ or ‘multinomial’. If the option chosen is ‘ovr’, then a binary problem is fit for each label. Otherwise, the loss minimised is the multinomial loss fit across the entire probability distribution. ‘multinomial’ does not work for the liblinear solver. New in version 0.18: Stochastic Average Gradient descent solver for ‘multinomial’ case.
- verbose : int, default: 0
For the liblinear and lbfgs solvers, set verbose to any positive number for verbosity.
- warm_start : bool, default: True
When set to True, reuse the solution of the previous call to fit as initialization; otherwise, just erase the previous solution. Has no effect for the liblinear solver. New in version 0.17: warm_start to support lbfgs, newton-cg, sag, saga solvers.
- n_jobs : int, default: 1
Number of CPU cores used when parallelizing over classes if multi_class=‘ovr’. This parameter is ignored when the solver is set to ‘liblinear’, regardless of whether multi_class is specified or not. If given a value of -1, all cores are used.
- property prior
- static add_argparse_args(parser, prefix=None)
- property A
- static add_argparse_eval_args(parser, prefix=None)
- static add_argparse_train_args(parser, prefix=None)
- static add_eval_args(parser, prefix=None)
- property b
- copy()
- static filter_args(prefix=None, **kwargs)
- static filter_eval_args(prefix, **kwargs)
- fit(x, class_ids, sample_weight=None)
- abstract fit_generator(x, x_val=None)
- init_to_false()
- abstract initialize()
- property is_init
- classmethod load(file_path)
- classmethod load_config(file_path)
- static load_config_from_json(json_str)
- classmethod load_params(f, config)
- abstract save(file_path)
- save_params(f)
- to_json(**kwargs)
Clustering Tools
- class hyperion.clustering.kmeans.KMeans(num_clusters, mu=None, rtol=0.001, **kwargs)[source]
- copy()
- abstract fit_generator(x, x_val=None)
- abstract get_config()
- init_to_false()
- abstract initialize()
- property is_init
- classmethod load(file_path)
- classmethod load_config(file_path)
- static load_config_from_json(json_str)
- classmethod load_params(f, config)
- abstract save(file_path)
- abstract save_params(f)
- to_json(**kwargs)
- class hyperion.clustering.ahc.AHC(method='average', metric='llr', **kwargs)[source]
- copy()
- abstract fit_generator(x, x_val=None)
- abstract get_config()
- init_to_false()
- abstract initialize()
- property is_init
- classmethod load(file_path)
- classmethod load_config(file_path)
- static load_config_from_json(json_str)
- classmethod load_params(f, config)
- abstract save(file_path)
- abstract save_params(f)
- to_json(**kwargs)
Score Normalization
- class hyperion.score_norm.score_norm.ScoreNorm(std_floor=1e-05, **kwargs)[source]
Base class for score normalization
- copy()
- abstract fit(x, sample_weights=None, x_val=None, sample_weights_val=None)
- abstract fit_generator(x, x_val=None)
- abstract get_config()
- init_to_false()
- abstract initialize()
- property is_init
- classmethod load(file_path)
- classmethod load_config(file_path)
- static load_config_from_json(json_str)
- classmethod load_params(f, config)
- abstract save(file_path)
- abstract save_params(f)
- to_json(**kwargs)
- class hyperion.score_norm.t_norm.TNorm(std_floor=1e-05, **kwargs)[source]
Class for T-Norm score normalization.
- __init__(std_floor=1e-05, **kwargs)
- copy()
- abstract fit(x, sample_weights=None, x_val=None, sample_weights_val=None)
- abstract fit_generator(x, x_val=None)
- abstract get_config()
- init_to_false()
- abstract initialize()
- property is_init
- classmethod load(file_path)
- classmethod load_config(file_path)
- static load_config_from_json(json_str)
- classmethod load_params(f, config)
- abstract save(file_path)
- abstract save_params(f)
- to_json(**kwargs)
- class hyperion.score_norm.z_norm.ZNorm(std_floor=1e-05, **kwargs)[source]
Class for Z-Norm score normalization.
- __init__(std_floor=1e-05, **kwargs)
- copy()
- abstract fit(x, sample_weights=None, x_val=None, sample_weights_val=None)
- abstract fit_generator(x, x_val=None)
- abstract get_config()
- init_to_false()
- abstract initialize()
- property is_init
- classmethod load(file_path)
- classmethod load_config(file_path)
- static load_config_from_json(json_str)
- classmethod load_params(f, config)
- abstract save(file_path)
- abstract save_params(f)
- to_json(**kwargs)
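The TNorm and ZNorm entries give only the class names and the std_floor parameter, so the sketch below illustrates the textbook definitions rather than Hyperion's code. It assumes a score matrix of shape (num_enroll, num_test); the function names z_norm and t_norm and the argument names scores_enr_coh and scores_coh_test are chosen for illustration. Z-Norm standardizes each enrollment row with the statistics of that model scored against a cohort, while T-Norm standardizes each test column with the statistics of the cohort scored against that test segment.

```python
import numpy as np


def z_norm(scores, scores_enr_coh, std_floor=1e-5):
    """Z-Norm sketch: per-enrollment statistics over the cohort axis."""
    mu = scores_enr_coh.mean(axis=1, keepdims=True)
    s = np.maximum(scores_enr_coh.std(axis=1, keepdims=True), std_floor)
    return (scores - mu) / s


def t_norm(scores, scores_coh_test, std_floor=1e-5):
    """T-Norm sketch: per-test-segment statistics over the cohort axis."""
    mu = scores_coh_test.mean(axis=0, keepdims=True)
    s = np.maximum(scores_coh_test.std(axis=0, keepdims=True), std_floor)
    return (scores - mu) / s


# Synthetic scores, for illustration only.
rng = np.random.default_rng(0)
scores = rng.normal(size=(3, 4))            # enroll x test
scores_enr_coh = rng.normal(size=(3, 50))   # enroll x cohort
scores_coh_test = rng.normal(size=(50, 4))  # cohort x test

z = z_norm(scores, scores_enr_coh)
t = t_norm(scores, scores_coh_test)
print(z.shape, t.shape)
```

In this sketch std_floor only guards against division by a near-zero standard deviation when the cohort scores are almost constant.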
- class hyperion.score_norm.zt_norm.ZTNorm(**kwargs)[source]
Class for ZT-Norm score normalization.
- predict(scores, scores_coh_test, scores_enr_coh, scores_coh_coh, mask_coh_test=None, mask_enr_coh=None, mask_coh_coh=None)[source]
- copy()
- abstract fit(x, sample_weights=None, x_val=None, sample_weights_val=None)
- abstract fit_generator(x, x_val=None)
- abstract get_config()
- init_to_false()
- abstract initialize()
- property is_init
- classmethod load(file_path)
- classmethod load_config(file_path)
- static load_config_from_json(json_str)
- classmethod load_params(f, config)
- abstract save(file_path)
- abstract save_params(f)
- to_json(**kwargs)
- class hyperion.score_norm.tz_norm.TZNorm(**kwargs)[source]
Class for TZ-Norm score normalization.
- predict(scores, scores_coh_test, scores_enr_coh, scores_coh_coh, mask_coh_test=None, mask_enr_coh=None, mask_coh_coh=None)[source]
- copy()
- abstract fit(x, sample_weights=None, x_val=None, sample_weights_val=None)
- abstract fit_generator(x, x_val=None)
- abstract get_config()
- init_to_false()
- abstract initialize()
- property is_init
- classmethod load(file_path)
- classmethod load_config(file_path)
- static load_config_from_json(json_str)
- classmethod load_params(f, config)
- abstract save(file_path)
- abstract save_params(f)
- to_json(**kwargs)
- class hyperion.score_norm.s_norm.SNorm(**kwargs)[source]
Class for S-Norm, symmetric score normalization.
- copy()
- abstract fit(x, sample_weights=None, x_val=None, sample_weights_val=None)
- abstract fit_generator(x, x_val=None)
- abstract get_config()
- init_to_false()
- abstract initialize()
- property is_init
- classmethod load(file_path)
- classmethod load_config(file_path)
- static load_config_from_json(json_str)
- classmethod load_params(f, config)
- abstract save(file_path)
- abstract save_params(f)
- to_json(**kwargs)
- class hyperion.score_norm.adapt_s_norm.AdaptSNorm(nbest=100, nbest_discard=0, **kwargs)[source]
Class for adaptive S-Norm
- copy()
- abstract fit(x, sample_weights=None, x_val=None, sample_weights_val=None)
- abstract fit_generator(x, x_val=None)
- abstract get_config()
- init_to_false()
- abstract initialize()
- property is_init
- classmethod load(file_path)
- classmethod load_config(file_path)
- static load_config_from_json(json_str)
- classmethod load_params(f, config)
- abstract save(file_path)
- abstract save_params(f)
- to_json(**kwargs)
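AdaptSNorm exposes nbest and nbest_discard, but the reference does not describe how they are used. One common formulation of adaptive S-Norm, sketched below under that assumption, keeps only the nbest highest cohort scores on each side (after skipping the first nbest_discard) and averages the resulting Z- and T-normalized scores. The function name and array layout are illustrative and may differ from Hyperion's implementation.

```python
import numpy as np


def adapt_s_norm(scores, scores_coh_test, scores_enr_coh,
                 nbest=100, nbest_discard=0, std_floor=1e-5):
    """Adaptive S-Norm sketch (assumed formulation).

    scores:          (num_enroll, num_test) raw trial scores
    scores_coh_test: (num_cohort, num_test) cohort vs. test scores
    scores_enr_coh:  (num_enroll, num_cohort) enrollment vs. cohort scores
    """
    lo, hi = nbest_discard, nbest_discard + nbest

    # Z side: for each enrollment model keep its highest cohort scores.
    enr_top = -np.sort(-scores_enr_coh, axis=1)[:, lo:hi]
    mu_z = enr_top.mean(axis=1, keepdims=True)
    s_z = np.maximum(enr_top.std(axis=1, keepdims=True), std_floor)

    # T side: for each test segment keep its highest cohort scores.
    test_top = -np.sort(-scores_coh_test, axis=0)[lo:hi, :]
    mu_t = test_top.mean(axis=0, keepdims=True)
    s_t = np.maximum(test_top.std(axis=0, keepdims=True), std_floor)

    # Symmetric combination of both normalizations.
    return 0.5 * ((scores - mu_z) / s_z + (scores - mu_t) / s_t)


# Synthetic scores, for illustration only.
rng = np.random.default_rng(0)
scores = rng.normal(size=(3, 4))
scores_coh_test = rng.normal(size=(50, 4))
scores_enr_coh = rng.normal(size=(3, 50))
normed = adapt_s_norm(scores, scores_coh_test, scores_enr_coh, nbest=10)
print(normed.shape)
```

When nbest covers the whole cohort and nbest_discard is 0, this reduces to plain S-Norm, the average of Z-Norm and T-Norm.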
Feature Transformations
These are classes to apply feature transformations/projections like PCA, LDA, etc.
Transform Classes
- class hyperion.transforms.pca.PCA(mu=None, T=None, update_mu=True, update_T=True, pca_dim=None, pca_var_r=None, pca_min_dim=2, whiten=False, **kwargs)[source]
Class to do principal component analysis
- __init__(mu=None, T=None, update_mu=True, update_T=True, pca_dim=None, pca_var_r=None, pca_min_dim=2, whiten=False, **kwargs)[source]
- static add_argparse_args(parser, prefix=None)
- copy()
- abstract fit_generator(x, x_val=None)
- init_to_false()
- abstract initialize()
- property is_init
- classmethod load(file_path)
- classmethod load_config(file_path)
- static load_config_from_json(json_str)
- abstract save(file_path)
- to_json(**kwargs)
- class hyperion.transforms.lda.LDA(mu=None, T=None, lda_dim=None, update_mu=True, update_T=True, **kwargs)[source]
Class to do linear discriminant analysis.
- copy()
- abstract fit_generator(x, x_val=None)
- init_to_false()
- abstract initialize()
- property is_init
- classmethod load(file_path)
- classmethod load_config(file_path)
- static load_config_from_json(json_str)
- abstract save(file_path)
- to_json(**kwargs)
- class hyperion.transforms.cent_whiten.CentWhiten(mu=None, T=None, update_mu=True, update_T=True, **kwargs)[source]
Class to do centering and whitening of i-vectors.
- static add_argparse_args(parser, prefix=None)
- copy()
- abstract fit_generator(x, x_val=None)
- init_to_false()
- abstract initialize()
- property is_init
- classmethod load(file_path)
- classmethod load_config(file_path)
- static load_config_from_json(json_str)
- abstract save(file_path)
- to_json(**kwargs)
- class hyperion.transforms.lnorm.LNorm(mu=None, T=None, update_mu=True, update_T=True, **kwargs)[source]
Class to do length normalization.
- __init__(mu=None, T=None, update_mu=True, update_T=True, **kwargs)
- static add_argparse_args(parser, prefix=None)
- static add_class_args(parser, prefix=None)
- copy()
- static filter_args(**kwargs)
- fit(x=None, sample_weight=None, mu=None, S=None)
- abstract fit_generator(x, x_val=None)
- get_config()
- init_to_false()
- abstract initialize()
- property is_init
- classmethod load(file_path)
- classmethod load_config(file_path)
- static load_config_from_json(json_str)
- classmethod load_mat(file_path)
- classmethod load_params(f, config)
- abstract save(file_path)
- save_mat(file_path)
- save_params(f)
- to_json(**kwargs)
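As a rough illustration of what length normalization does, the sketch below optionally centers and projects the vectors with mu and T, then rescales each row to a fixed Euclidean norm. Scaling to sqrt(dim) instead of 1 is an assumption made here, as is the small constant that avoids division by zero; the function name length_norm is hypothetical and not Hyperion's API.

```python
import numpy as np


def length_norm(x, mu=None, T=None):
    """Length-normalization sketch: center, project, then fix the row norm."""
    x = np.asarray(x, dtype=float)
    if mu is not None:
        x = x - mu  # centering
    if T is not None:
        x = x @ T   # whitening or other linear projection
    norms = np.linalg.norm(x, axis=1, keepdims=True) + 1e-10
    # Assumption: rows are scaled to norm sqrt(dim) rather than 1.
    return np.sqrt(x.shape[1]) * x / norms


# Two made-up 2-dimensional vectors.
y = length_norm([[3.0, 4.0], [0.0, 2.0]])
print(np.linalg.norm(y, axis=1))
```

After the transform every row has the same norm, so only the direction of each vector carries information.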
- class hyperion.transforms.coral.CORAL(mu=None, T_col=None, T_white=None, update_mu=True, update_T=True, alpha_mu=1, alpha_T=1, **kwargs)[source]
Class to do CORAL
- __init__(mu=None, T_col=None, T_white=None, update_mu=True, update_T=True, alpha_mu=1, alpha_T=1, **kwargs)[source]
- copy()
- abstract fit_generator(x, x_val=None)
- init_to_false()
- abstract initialize()
- property is_init
- classmethod load(file_path)
- classmethod load_config(file_path)
- static load_config_from_json(json_str)
- abstract save(file_path)
- to_json(**kwargs)
- class hyperion.transforms.gaussianizer.Gaussianizer(max_vectors=None, r=None, **kwargs)[source]
Class to make i-vector distribution standard Normal.
- static add_arparse_args(parser, prefix=None)
- copy()
- abstract fit_generator(x, x_val=None)
- init_to_false()
- abstract initialize()
- property is_init
- classmethod load(file_path)
- classmethod load_config(file_path)
- static load_config_from_json(json_str)
- abstract save(file_path)
- to_json(**kwargs)
- class hyperion.transforms.nap.NAP(U=None, **kwargs)[source]
Class to do nuisance attribute projection.
- copy()
- abstract fit_generator(x, x_val=None)
- abstract get_config()
- init_to_false()
- abstract initialize()
- property is_init
- classmethod load(file_path)
- classmethod load_config(file_path)
- static load_config_from_json(json_str)
- abstract save(file_path)
- to_json(**kwargs)
- class hyperion.transforms.nda.NDA(mu=None, T=None, **kwargs)[source]
Class to do nearest-neighbors discriminant analysis
- copy()
- abstract fit_generator(x, x_val=None)
- abstract get_config()
- init_to_false()
- abstract initialize()
- property is_init
- classmethod load(file_path)
- classmethod load_config(file_path)
- static load_config_from_json(json_str)
- abstract save(file_path)
- to_json(**kwargs)
- class hyperion.transforms.mvn.MVN(mu=None, s=None, **kwargs)[source]
Class to do global mean and variance normalization.
- copy()
- abstract fit_generator(x, x_val=None)
- abstract get_config()
- init_to_false()
- abstract initialize()
- property is_init
- classmethod load(file_path)
- classmethod load_config(file_path)
- static load_config_from_json(json_str)
- abstract save(file_path)
- to_json(**kwargs)
- class hyperion.transforms.skl_tsne.SklTSNE(tsne_dim=2, perplexity=30.0, early_exaggeration=12.0, lr=200.0, num_iter=1000, num_iter_without_progress=300, min_grad_norm=1e-07, metric='euclidean', init='random', verbose=0, rng=None, rng_seed=1234, method='barnes_hut', angle=0.5, num_jobs=None, **kwargs)[source]
Wrapper class for sklearn TSNE manifold learner
- tsne_dim
dimension of the embedded space.
- perplexity
the perplexity is related to the number of nearest neighbors that is used in other manifold learning algorithms. Larger datasets usually require a larger perplexity. Consider selecting a value between 5 and 50.
- early_exaggeration
controls how tight natural clusters in the original space are in the embedded space and how much space will be between them.
- lr
the learning rate for t-SNE is usually in the range [10.0, 1000.0].
- num_iter
maximum number of iterations for the optimization.
- num_iter_without_progress
maximum number of iterations without progress before we abort the optimization
- min_grad_norm
if the gradient norm is below this threshold, the optimization will be stopped.
- metric
the metric to use when calculating distance between instances in [‘cosine’, ‘euclidean’, ‘l1’, ‘l2’, ‘precomputed’] or callable function.
- init
initialization method in [‘random’, ‘pca’] or embedding matrix of shape (num_samples, num_comp)
- verbose
verbosity level.
- rng
RandomState instance
- rng_seed
seed for random number generator
- method
gradient calculation method in [‘barnes_hut’, ‘exact’]
- angle
angle theta in Barnes-Hut TSNE
- num_jobs
number of parallel jobs to run for neighbors search.
- __init__(tsne_dim=2, perplexity=30.0, early_exaggeration=12.0, lr=200.0, num_iter=1000, num_iter_without_progress=300, min_grad_norm=1e-07, metric='euclidean', init='random', verbose=0, rng=None, rng_seed=1234, method='barnes_hut', angle=0.5, num_jobs=None, **kwargs)[source]
- property tsne_dim
- property perplexity
- property early_exaggeration
- property lr
- property num_iter
- property num_iter_without_progress
- property min_grad_norm
- property metric
- property init
- property method
- property angle
- property num_jobs
- static add_argparse_args(parser, prefix=None)
- copy()
- abstract fit_generator(x, x_val=None)
- init_to_false()
- abstract initialize()
- property is_init
- classmethod load(file_path)
- classmethod load_config(file_path)
- static load_config_from_json(json_str)
- abstract save(file_path)
- to_json(**kwargs)
Sequence of Transformations Classes
- class hyperion.transforms.transform_list.TransformList(transforms, **kwargs)[source]
Class to perform a list of transformations
- copy()
- abstract fit(x, sample_weights=None, x_val=None, sample_weights_val=None)
- abstract fit_generator(x, x_val=None)
- init_to_false()
- abstract initialize()
- property is_init
- classmethod load(file_path)
- classmethod load_config(file_path)
- static load_config_from_json(json_str)
- abstract save(file_path)
- to_json(**kwargs)
Auxiliary Classes
- class hyperion.transforms.sb_sw.SbSw(Sb=None, Sw=None, mu=None, num_classes=0, **kwargs)[source]
Class to compute between and within class matrices
- copy()
- abstract fit_generator(x, x_val=None)
- abstract get_config()
- init_to_false()
- abstract initialize()
- property is_init
- classmethod load_config(file_path)
- static load_config_from_json(json_str)
- classmethod load_params(f, config)
- abstract save(file_path)
- to_json(**kwargs)
Metrics
Metric Functions
These are functions to compute performance metrics used in speaker identification and verification.
Copyright 2018 Johns Hopkins University (Author: Jesus Villalba) Apache 2.0 (http://www.apache.org/licenses/LICENSE-2.0)
- hyperion.metrics.eer.compute_eer(tar, non)[source]
Computes equal error rate.
- Parameters
tar – Scores of target trials.
non – Scores of non-target trials.
- Returns
EER
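The equal error rate is the operating point where the miss rate equals the false-alarm rate. The sketch below finds it with a brute-force threshold sweep over all observed scores; it is a simple illustration and not necessarily the method used by compute_eer, which may interpolate differently. The name simple_eer is hypothetical.

```python
import numpy as np


def simple_eer(tar, non):
    """Brute-force EER sketch: sweep thresholds over all observed scores."""
    tar = np.asarray(tar, dtype=float)
    non = np.asarray(non, dtype=float)
    thresholds = np.sort(np.concatenate([tar, non]))
    # A trial is accepted as target when its score is >= the threshold.
    p_miss = np.array([(tar < t).mean() for t in thresholds])
    p_fa = np.array([(non >= t).mean() for t in thresholds])
    # Take the threshold where both error rates are closest.
    idx = np.argmin(np.abs(p_miss - p_fa))
    return 0.5 * (p_miss[idx] + p_fa[idx])


# Made-up scores: one target and one non-target fall on the wrong side.
eer = simple_eer([0.5, 2.0, 3.0, 4.0], [-1.0, 0.0, 1.0, 2.5])
print(eer)
```

With few trials the two error curves rarely cross exactly, which is why the sketch averages the two rates at the closest point.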
- hyperion.metrics.eer.compute_prbep(tar, non)[source]
- Computes precision-recall break-even point
where #FA == #Miss
- Parameters
tar – Scores of target trials.
non – Scores of non-target trials.
- Returns
PRBEP value
- hyperion.metrics.dcf.compute_dcf(p_miss, p_fa, prior, normalize=True)[source]
- Computes detection cost function
DCF = prior*p_miss + (1-prior)*p_fa
- Parameters
p_miss – Vector of miss probabilities.
p_fa – Vector of false alarm probabilities.
prior – Target prior or vector of target priors.
normalize – if true, return normalized DCF, else unnormalized.
- Returns
Matrix of DCF for each pair of (p_miss, p_fa) and each value of prior. [len(prior) x len(p_miss)]
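The DCF formula and the [len(prior) x len(p_miss)] output shape follow directly from broadcasting. The sketch below is an illustration, not Hyperion's code; in particular, dividing by min(prior, 1 - prior) when normalize is true is the usual NIST-style normalization and is assumed here. The name dcf_matrix is hypothetical.

```python
import numpy as np


def dcf_matrix(p_miss, p_fa, prior, normalize=True):
    """DCF = prior*p_miss + (1-prior)*p_fa for every prior and error pair."""
    p_miss = np.atleast_1d(np.asarray(p_miss, dtype=float))
    p_fa = np.atleast_1d(np.asarray(p_fa, dtype=float))
    # Column vector of priors so broadcasting yields len(prior) x len(p_miss).
    prior = np.atleast_1d(np.asarray(prior, dtype=float))[:, None]
    dcf = prior * p_miss + (1 - prior) * p_fa
    if normalize:
        # Assumed normalization: cost of the best trivial system.
        dcf = dcf / np.minimum(prior, 1 - prior)
    return dcf


# Two made-up operating points evaluated at two priors.
dcf = dcf_matrix([0.1, 0.2], [0.05, 0.01], prior=[0.5, 0.01])
print(dcf)
```

A normalized DCF above 1 means the system does worse at that operating point than always accepting or always rejecting.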
- hyperion.metrics.dcf.compute_min_dcf(tar, non, prior, normalize=True)[source]
- Computes minimum DCF
min_DCF = min_t prior*p_miss(t) + (1-prior)*p_fa(t)
where t is the decision threshold.
- Parameters
tar – Target scores.
non – Non-target scores.
prior – Target prior or vector of target priors.
normalize – if true, return normalized DCF, else unnormalized.
- Returns
Vector of minimum DCF for each prior. Vector of P_miss corresponding to each min DCF. Vector of P_fa corresponding to each min DCF.
- hyperion.metrics.dcf.compute_act_dcf(tar, non, prior, normalize=True)[source]
- Computes actual DCF by making decisions assuming that scores
are calibrated to act as log-likelihood ratios.
- Parameters
tar – Target scores.
non – Non-target scores.
prior – Target prior or vector of target priors.
normalize – if true, return normalized DCF, else unnormalized.
- Returns
Vector of actual DCF for each prior. Vector of P_miss corresponding to each act DCF. Vector of P_fa corresponding to each act DCF.
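If scores behave as log-likelihood ratios, the Bayes decision for a target prior is to accept when the score exceeds log((1 - prior) / prior). The sketch below computes the actual DCF for a single scalar prior under that rule; the name act_dcf_sketch is hypothetical, and the normalization by min(prior, 1 - prior) is an assumption.

```python
import numpy as np


def act_dcf_sketch(tar, non, prior, normalize=True):
    """Actual DCF sketch for one scalar prior, assuming calibrated LLRs."""
    tar = np.asarray(tar, dtype=float)
    non = np.asarray(non, dtype=float)
    # Bayes threshold on the log-likelihood ratio for this prior.
    threshold = np.log((1 - prior) / prior)
    p_miss = np.mean(tar < threshold)
    p_fa = np.mean(non >= threshold)
    dcf = prior * p_miss + (1 - prior) * p_fa
    if normalize:
        dcf = dcf / min(prior, 1 - prior)
    return dcf, p_miss, p_fa


# Made-up LLR scores; with prior = 0.5 the threshold is 0.
act, p_miss, p_fa = act_dcf_sketch([-1.0, 2.0, 3.0, 4.0],
                                   [-3.0, -2.0, -1.0, 1.0], prior=0.5)
print(act, p_miss, p_fa)
```

The gap between the actual DCF and the minimum DCF indicates how much is lost to miscalibration.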
- hyperion.metrics.dcf.fast_eval_dcf_eer(tar, non, prior, normalize_dcf=True, return_probs=False)[source]
Computes actual DCF, minimum DCF, EER and PRBEP all together
- Parameters
tar – Target scores.
non – Non-target scores.
prior – Target prior or vector of target priors.
normalize_dcf – if true, return normalized DCF, else unnormalized.
- Returns
Vector of minimum DCF for each prior. Vector of actual DCF for each prior. EER value. PRBEP value.
- hyperion.metrics.cllr.compute_cllr(tar, non)[source]
- CLLR: Measure of goodness of log-likelihood-ratio detection output. This measure reflects both:
The quality of the score (over the whole DET curve), and
The quality of the calibration
- Parameters
tar – Scores of target trials.
non – Scores of non-target trials.
- Returns
CLLR
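Cllr has a closed form: the average of log2(1 + exp(-llr)) over target trials and of log2(1 + exp(llr)) over non-target trials, each weighted by one half. The following sketch uses np.logaddexp for numerical stability; it illustrates the standard definition and the name cllr_sketch is hypothetical.

```python
import numpy as np


def cllr_sketch(tar, non):
    """Cllr sketch: 0.5 * (mean log2(1+e^-tar) + mean log2(1+e^non))."""
    tar = np.asarray(tar, dtype=float)
    non = np.asarray(non, dtype=float)
    # logaddexp(0, x) = log(1 + exp(x)), computed without overflow.
    c_tar = np.mean(np.logaddexp(0.0, -tar)) / np.log(2.0)
    c_non = np.mean(np.logaddexp(0.0, non)) / np.log(2.0)
    return 0.5 * (c_tar + c_non)


# Uninformative scores (LLR = 0 everywhere) give Cllr = 1 bit.
cllr_flat = cllr_sketch([0.0, 0.0], [0.0, 0.0])
# Well-separated, well-calibrated scores give Cllr close to 0.
cllr_good = cllr_sketch([10.0, 10.0], [-10.0, -10.0])
print(cllr_flat, cllr_good)
```

Values above 1 indicate scores that are worse than a system that always outputs an LLR of zero.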
- hyperion.metrics.roc.compute_roc(true_scores, false_scores)[source]
- Computes the (observed) miss/false_alarm probabilities
for a set of detection output scores.
- Parameters
true_scores – Detection output scores for a set of detection trials, given that the target hypothesis is true.
false_scores – Detection output scores for a set of detection trials, given that the target hypothesis is false.
(By convention, the more positive the score, the more likely is the target hypothesis.)
- Returns
The miss/false_alarm error probabilities
- hyperion.metrics.roc.compute_rocch(tar_scores, non_scores)[source]
Computes ROCCH: ROC Convex Hull.
- Parameters
tar_scores – scores for target trials
non_scores – scores for non-target trials
- Returns
pmiss and pfa contain the coordinates of the vertices of the ROC Convex Hull.
- hyperion.metrics.roc.rocch2eer(p_miss, p_fa)[source]
Calculates the equal error rate (eer) from pmiss and pfa vectors. Note: pmiss and pfa contain the coordinates of the vertices of the ROC Convex Hull. Use compute_rocch to convert target and non-target scores to pmiss and pfa values.
- hyperion.metrics.roc.filter_roc(p_miss, p_fa)[source]
Removes redundant points from the sequence of points (p_fa, p_miss) so that plotting an ROC or DET curve will be faster. The output ROC curve will be identical to the one plotted from the input vectors. All points internal to straight (horizontal or vertical) sections on the ROC curve are removed, i.e. only the points at the start and end of line segments in the curve are retained. Since the plotting code draws straight lines between points, the resulting plot will be the same as the original.
- Parameters
p_miss – The coordinates of the vertices of the ROC Convex Hull. m for misses and fa for false alarms.
p_fa – The coordinates of the vertices of the ROC Convex Hull. m for misses and fa for false alarms.
- Returns
new_p_miss, new_p_fa – Vectors containing selected values from the input vectors.
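The pruning rule described for filter_roc (drop every point that lies inside a horizontal or vertical run) can be sketched in a few lines. This is an illustrative version with a hypothetical name, not the library's implementation, and it uses exact floating-point equality to detect runs.

```python
import numpy as np


def filter_roc_points(p_miss, p_fa):
    """Keep only the end points of horizontal/vertical ROC segments."""
    p_miss = np.asarray(p_miss, dtype=float)
    p_fa = np.asarray(p_fa, dtype=float)
    n = len(p_miss)
    keep = np.ones(n, dtype=bool)  # first and last points are always kept
    for i in range(1, n - 1):
        same_miss = p_miss[i - 1] == p_miss[i] == p_miss[i + 1]
        same_fa = p_fa[i - 1] == p_fa[i] == p_fa[i + 1]
        # An interior point of a straight run adds nothing to the plot.
        keep[i] = not (same_miss or same_fa)
    return p_miss[keep], p_fa[keep]


# Made-up staircase ROC with three redundant interior points.
new_p_miss, new_p_fa = filter_roc_points(
    [0.0, 0.0, 0.0, 0.25, 0.5, 0.5, 0.5, 1.0],
    [1.0, 0.75, 0.5, 0.5, 0.5, 0.25, 0.0, 0.0],
)
print(new_p_miss, new_p_fa)
```

Only the corners of the staircase remain, so a line plot through the filtered points traces the same curve.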
- hyperion.metrics.roc.compute_area_under_rocch(p_miss, p_fa)[source]
Calculates area under the ROC convex hull given p_miss, p_fa.
- Parameters
p_miss – Miss probabilities vector obtained from compute_rocch
p_fa – False alarm probabilities vector
- Returns
AUC
- hyperion.metrics.acc.compute_accuracy(y_true, y_pred, normalize=True, sample_weight=None)[source]
Computes accuracy
- Parameters
y_true – 1d array-like, or label indicator array / sparse matrix. Ground truth (correct) labels.
y_pred – 1d array-like, or label indicator array / sparse matrix. Predicted labels, as returned by a classifier.
normalize – If False, return the number of correctly classified samples. Otherwise, return the fraction of correctly classified samples.
sample_weight – Sample weights.
- Returns
Accuracy or number of correctly classified samples.
- hyperion.metrics.confusion_matrix.compute_confusion_matrix(y_true, y_pred, labels=None, normalize=True, sample_weight=None)[source]
Computes confusion matrix.
- Parameters
y_true – Ground truth.
y_pred – Estimated labels.
labels – List of labels to index the matrix. This may be used to reorder or select a subset of labels. If none is given, those that appear at least once in y_true or y_pred are used in sorted order.
sample_weight – Sample weights.
- Returns
Confusion matrix (num_classes x num_classes)
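A confusion matrix with rows for true classes and columns for predicted classes can be built by counting label pairs. The sketch below uses a hypothetical function name and assumes that normalize=True divides each row by its total, so that rows sum to one; the library's normalization may differ.

```python
import numpy as np


def confusion_matrix_sketch(y_true, y_pred, labels=None, normalize=True):
    """Count (true, predicted) label pairs; rows = true, cols = predicted."""
    y_true = np.asarray(y_true).tolist()
    y_pred = np.asarray(y_pred).tolist()
    if labels is None:
        # Labels that appear at least once, in sorted order.
        labels = sorted(set(y_true) | set(y_pred))
    index = {label: i for i, label in enumerate(labels)}
    C = np.zeros((len(labels), len(labels)))
    for t, p in zip(y_true, y_pred):
        if t in index and p in index:  # skip labels outside the subset
            C[index[t], index[p]] += 1
    if normalize:
        # Assumed convention: each row sums to 1 (empty rows stay 0).
        C = C / np.maximum(C.sum(axis=1, keepdims=True), 1)
    return C


# Made-up labels: one class-0 sample is predicted as class 1.
C = confusion_matrix_sketch([0, 0, 1, 1], [0, 1, 1, 1])
print(C)
```

Passing labels explicitly reorders the matrix or restricts it to a subset of classes, mirroring the labels parameter described above.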
- hyperion.metrics.confusion_matrix.compute_xlabel_confusion_matrix(y_true, y_pred, labels_train=None, labels_test=None, normalize=True, sample_weight=None)[source]
- Computes confusion matrix when the labels used to train the classifier are
different than those of the test set.
- Parameters
y_true – Ground truth.
y_pred – Estimated labels.
labels_train – List of labels used to train the classifier. This may be used to reorder or select a subset of labels. If none is given, those that appear at least once in y_pred are used in sorted order.
labels_test – List of labels of the test set. This may be used to reorder or select a subset of labels. If none is given, those that appear at least once in y_true are used in sorted order.
sample_weight – Sample weights.
- Returns
Confusion matrix (num_classes_test x num_classes_train)
- hyperion.metrics.confusion_matrix.plot_confusion_matrix(C, labels_true, labels_pred=None, title='Confusion matrix', cmap=<matplotlib.colors.LinearSegmentedColormap object>, fmt=None)[source]
Plots a confusion matrix in a figure.
- Parameters
C – 2D numpy array with confusion matrix.
labels_true – Labels of the true classes (rows).
labels_pred – Labels of the predicted classes. If None, it is equal to labels_true.
title – Title for the figure.
cmap – Color map.
- hyperion.metrics.confusion_matrix.write_confusion_matrix(f, C, labels_true, labels_pred=None, fmt=None)[source]
Writes confusion matrix to file.
- Parameters
f – Python file handle.
C – 2D numpy array with confusion matrix.
labels_true – Labels of the true classes (rows).
labels_pred – Labels of the predicted classes. If None, it is equal to labels_true.
- hyperion.metrics.confusion_matrix.print_confusion_matrix(C, labels_true, labels_pred=None, fmt=None)[source]
Prints confusion matrix to std output.
- Parameters
C – 2D numpy array with confusion matrix.
labels_true – Labels of the true classes (rows).
labels_pred – Labels of the predicted classes. If None, it is equal to labels_true.
Utility functions to evaluate performance
- hyperion.metrics.utils.effective_prior(p_tar, c_miss, c_fa)[source]
This function adjusts a given prior probability of target, p_tar, to incorporate the effects of a cost of miss, c_miss, and a cost of false alarm, c_fa.
- Parameters
p_tar – target prior
c_miss – cost of miss
c_fa – cost of false alarm
- Returns
Effective prior
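The usual way to fold costs into a prior is to scale the prior odds by the cost ratio and convert back to a probability. The sketch below shows that computation; the function name is hypothetical and the formula is the standard one from the detection-cost literature, assumed to match what effective_prior returns.

```python
def effective_prior_sketch(p_tar, c_miss, c_fa):
    """Effective prior = p_tar*c_miss / (p_tar*c_miss + (1-p_tar)*c_fa)."""
    # Prior odds weighted by the cost ratio.
    beta = p_tar * c_miss / ((1.0 - p_tar) * c_fa)
    return beta / (1.0 + beta)


# A 1% target prior with a miss ten times as costly as a false alarm.
p_eff = effective_prior_sketch(0.01, 10.0, 1.0)
print(p_eff)
```

With equal costs the effective prior equals the original prior, so a single operating point can stand in for any combination of prior and costs.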
- hyperion.metrics.utils.pavx(y)[source]
PAV: Pool Adjacent Violators algorithm. Non-parametric optimization subject to monotonicity.
ghat = pav(y) fits a vector ghat with nondecreasing components to the data vector y such that sum((y - ghat).^2) is minimal. (Pool-adjacent-violators algorithm).
- Author: This code is an adaptation from the Bosaris Toolkit and
it is a simplified version of the ‘IsoMeans.m’ code made available by Lutz Duembgen at:
- Parameters
y – uncalibrated scores
- Returns
Calibrated scores.
Width: widths of the PAV bins, from left to right (the number of bins is data dependent).
Height: corresponding heights of bins (in increasing order).
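The pooling procedure described above can be sketched in a few lines of numpy. This is a minimal illustration of the pool-adjacent-violators idea, written under the assumption that the three return values are the fitted vector, the bin widths, and the bin heights; the name `pav_sketch` is hypothetical and the code is not the Hyperion or Bosaris implementation.

```python
import numpy as np


def pav_sketch(y):
    """Least-squares fit of a nondecreasing vector to y by pooling violators."""
    sums, counts = [], []  # running sum and size of each pooled bin
    for v in np.asarray(y, dtype=float):
        # Start a new bin holding only the current sample.
        sums.append(v)
        counts.append(1)
        # Merge backwards while the previous bin's mean exceeds the last bin's mean.
        while len(sums) > 1 and sums[-2] / counts[-2] > sums[-1] / counts[-1]:
            s, c = sums.pop(), counts.pop()
            sums[-1] += s
            counts[-1] += c
    widths = np.array(counts)
    heights = np.array(sums) / widths
    ghat = np.repeat(heights, widths)  # expand bin means back to one value per sample
    return ghat, widths, heights


# Usage: the pairs (3, 2) and (4, 3.5) violate monotonicity and are pooled to their means.
ghat, widths, heights = pav_sketch([1.0, 3.0, 2.0, 4.0, 3.5, 5.0])
```

Each bin height is the mean of the samples pooled into it, which is what makes the fit optimal in the squared-error sense for a fixed partition.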
- hyperion.metrics.utils.opt_loglr(tar, non, method='laplace')[source]
Non-parametric optimization of score to log-likelihood-ratio mapping.
- Taken from Bosaris toolkit.
Niko Brummer and Johan du Preez, Application-Independent Evaluation of Speaker Detection, Computer Speech and Language, 2005
- Parameters
tar – target scores.
non – non-target scores.
method – "laplace" (default, avoids infinite log-LR) or "raw".
- Returns
Calibrated target and non-target log-LRs.
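The score-to-log-LR mapping can be sketched as follows, following the general recipe described by Brummer and du Preez: sort the pooled scores, fit a monotone posterior with PAV to the ideal 0/1 labels, and subtract the log prior odds of the data. The function names `_pav` and `opt_loglr_sketch` and the boolean `laplace` argument are hypothetical, and any further refinements in the library are not reproduced here.

```python
import numpy as np


def _pav(y):
    """Nondecreasing least-squares fit via pool-adjacent-violators."""
    sums, counts = [], []
    for v in np.asarray(y, dtype=float):
        sums.append(v)
        counts.append(1)
        while len(sums) > 1 and sums[-2] / counts[-2] > sums[-1] / counts[-1]:
            s, c = sums.pop(), counts.pop()
            sums[-1] += s
            counts[-1] += c
    return np.repeat(np.array(sums) / np.array(counts), counts)


def opt_loglr_sketch(tar, non, laplace=True):
    """Map scores to log-LRs with a monotone, non-parametric fit (assumed recipe)."""
    tar = np.asarray(tar, dtype=float)
    non = np.asarray(non, dtype=float)
    n_tar, n_non = len(tar), len(non)
    scores = np.concatenate([tar, non])
    p_ideal = np.concatenate([np.ones(n_tar), np.zeros(n_non)])
    order = np.argsort(scores, kind="stable")
    p_sorted = p_ideal[order]
    if laplace:
        # Pad with one target/non-target pair at each end so that the fitted
        # posterior never reaches exactly 0 or 1 (avoids infinite log-LRs).
        p_sorted = np.concatenate([[1.0, 0.0], p_sorted, [1.0, 0.0]])
    p_opt = _pav(p_sorted)
    if laplace:
        p_opt = p_opt[2:-2]
    with np.errstate(divide="ignore"):  # the raw variant may produce +/-inf
        post_log_odds = np.log(p_opt) - np.log(1.0 - p_opt)
    llr_sorted = post_log_odds - np.log(n_tar / n_non)  # remove the data prior odds
    llr = np.empty_like(llr_sorted)
    llr[order] = llr_sorted  # undo the sort
    return llr[:n_tar], llr[n_tar:]


# Usage with three target and three non-target scores.
tar_llr, non_llr = opt_loglr_sketch([2.0, 3.0, 0.5], [-1.0, 0.0, 1.0])
```

The Laplace padding trades a small bias for finite outputs; with `laplace=False`, perfectly separated score regions in this sketch map to infinite log-LRs.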
Helper Code Blocks
Classes and code blocks that are reused in several scripts.
- class hyperion.helpers.vector_reader.VectorReader(v_file, key_file, preproc=None, vlist_sep=' ')[source]
Class to load data to train PCA, centering, whitening.
- static add_argparse_args(parser, prefix=None)
- class hyperion.helpers.vector_class_reader.VectorClassReader(v_file, key_file, preproc=None, vlist_sep=' ', class2int_file=None, min_spc=1, max_spc=None, spc_pruning_mode='random', csplit_min_spc=1, csplit_max_spc=None, csplit_mode='random', csplit_overlap=0, vcr_seed=1024, csplit_once=True)[source]
Class to load data to train LDA, PLDA, PDDA.
- __init__(v_file, key_file, preproc=None, vlist_sep=' ', class2int_file=None, min_spc=1, max_spc=None, spc_pruning_mode='random', csplit_min_spc=1, csplit_max_spc=None, csplit_mode='random', csplit_overlap=0, vcr_seed=1024, csplit_once=True)[source]
- property class_names
- property samples_per_class
- property max_samples_per_class
- static add_argparse_args(parser, prefix=None)
- class hyperion.helpers.trial_data_reader.TrialDataReader(v_file, ndx_file, enroll_file, test_file=None, preproc=None, model_part_idx=1, num_model_parts=1, seg_part_idx=1, num_seg_parts=1, eval_set='enroll-test', tlist_sep=' ')[source]
Loads Ndx, enroll file and x-vectors to evaluate PLDA.
- __init__(v_file, ndx_file, enroll_file, test_file=None, preproc=None, model_part_idx=1, num_model_parts=1, seg_part_idx=1, num_seg_parts=1, eval_set='enroll-test', tlist_sep=' ')[source]
- static add_argparse_args(parser, prefix=None)
- class hyperion.helpers.multi_test_trial_data_reader.MultiTestTrialDataReader(v_file, ndx_file, enroll_file, test_file, test_subseg2orig_file, preproc, tlist_sep=' ', model_idx=1, num_model_parts=1, seg_idx=1, num_seg_parts=1, eval_set='enroll-test')[source]
Loads Ndx, enroll file and x-vectors to evaluate PLDA.
- __init__(v_file, ndx_file, enroll_file, test_file, test_subseg2orig_file, preproc, tlist_sep=' ', model_idx=1, num_model_parts=1, seg_idx=1, num_seg_parts=1, eval_set='enroll-test')[source]
- static add_argparse_args(parser, prefix=None)
- class hyperion.helpers.multi_test_trial_data_reader_v2.MultiTestTrialDataReaderV2(enroll_v_file, test_v_file, ndx_file, enroll_file, test_file, preproc=None, tlist_sep=' ', model_idx=1, num_model_parts=1, seg_idx=1, num_seg_parts=1, eval_set='enroll-test')[source]
Loads Ndx, enroll file and x-vectors to evaluate PLDA.