Utils

The hyperion.utils module contains several utility classes and functions.

Trial Management Classes

These utilities handle trial indices, keys and scores. They are based on the MATLAB implementations in the BOSARIS Toolkit.

class hyperion.utils.trial_key.TrialKey(model_set=None, seg_set=None, tar=None, non=None, model_cond=None, seg_cond=None, trial_cond=None, model_cond_name=None, seg_cond_name=None, trial_cond_name=None)[source]
Contains the trial key for speaker recognition trials.

Bosaris compatible Key.

model_set

List of model names.

seg_set

List of test segment names.

tar

Boolean matrix with target trials set to True (num_models x num_segments).

non

Boolean matrix with non-target trials set to True (num_models x num_segments).

model_cond

Conditions related to the model.

seg_cond

Conditions related to the test segment.

trial_cond

Conditions related to the combination of model and test segment.

model_cond_name

String list with the names of the model conditions.

seg_cond_name

String list with the names of the segment conditions.

trial_cond_name

String list with the names of the trial conditions.
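
To make the attribute layout concrete, the following is a minimal sketch in plain Python (nested lists instead of the matrices used by the class) of how a list of labelled trials could map onto model_set, seg_set, tar and non. The helper name build_key_masks and the sample names are assumptions made for illustration; they are not part of hyperion.

```python
def build_key_masks(trials):
    """Build (model_set, seg_set, tar, non) from (model, segment, is_target) tuples.

    Hypothetical helper for illustration; not part of hyperion.
    """
    model_set = sorted({m for m, _, _ in trials})
    seg_set = sorted({s for _, s, _ in trials})
    m_idx = {m: i for i, m in enumerate(model_set)}
    s_idx = {s: j for j, s in enumerate(seg_set)}
    # One row per model, one column per test segment.
    tar = [[False] * len(seg_set) for _ in model_set]
    non = [[False] * len(seg_set) for _ in model_set]
    for m, s, is_target in trials:
        if is_target:
            tar[m_idx[m]][s_idx[s]] = True
        else:
            non[m_idx[m]][s_idx[s]] = True
    return model_set, seg_set, tar, non


trials = [
    ("spk1", "utt_a", True),
    ("spk1", "utt_b", False),
    ("spk2", "utt_b", True),
]
model_set, seg_set, tar, non = build_key_masks(trials)
```

A cell that is False in both tar and non (here spk2 vs. utt_a) is a trial that is simply not part of the key.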

__init__(model_set=None, seg_set=None, tar=None, non=None, model_cond=None, seg_cond=None, trial_cond=None, model_cond_name=None, seg_cond_name=None, trial_cond_name=None)[source]
property num_models
property num_tests
copy()[source]

Makes a copy of the object

sort()[source]

Sorts the object by model and test segment names.

save(file_path)[source]

Saves object to txt/h5 file.

Parameters

file_path – File to write the list.

save_h5(file_path)[source]

Saves object to h5 file.

Parameters

file_path – File to write the list.

save_txt(file_path)[source]

Saves object to txt file.

Parameters

file_path – File to write the list.

classmethod load(file_path)[source]

Loads object from txt/h5 file

Parameters

file_path – File to read the list.

Returns

TrialKey object.

classmethod load_h5(file_path)[source]

Loads object from h5 file

Parameters

file_path – File to read the list.

Returns

TrialKey object.

classmethod load_txt(file_path)[source]

Loads object from txt file

Parameters

file_path – File to read the list.

Returns

TrialKey object.

classmethod merge(key_list)[source]

Merges several key objects.

Parameters

key_list – List of TrialKey objects.

Returns

Merged TrialKey object.
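
As a rough illustration of what merging involves, the sketch below takes the union of the model and segment names and combines the masks on that larger grid. It is a plain-Python sketch under the assumption that merged names are sorted; merge_keys is a hypothetical helper, and for a key the same procedure would be applied to tar and non separately.

```python
def merge_keys(parts):
    """Merge (model_set, seg_set, mask) triples into one triple.

    Hypothetical helper for illustration; not part of hyperion.
    """
    model_set = sorted({m for models, _, _ in parts for m in models})
    seg_set = sorted({s for _, segs, _ in parts for s in segs})
    m_idx = {m: i for i, m in enumerate(model_set)}
    s_idx = {s: j for j, s in enumerate(seg_set)}
    mask = [[False] * len(seg_set) for _ in model_set]
    for models, segs, part_mask in parts:
        for i, m in enumerate(models):
            for j, s in enumerate(segs):
                # A trial is active in the result if any part marks it.
                if part_mask[i][j]:
                    mask[m_idx[m]][s_idx[s]] = True
    return model_set, seg_set, mask


part1 = (["m1"], ["s1", "s2"], [[True, False]])
part2 = (["m2"], ["s2", "s3"], [[True, True]])
merged_models, merged_segs, merged_mask = merge_keys([part1, part2])
```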

filter(model_set, seg_set, keep=True)[source]

Removes elements from TrialKey object.

Parameters
  • model_set – List of models to keep or remove.

  • seg_set – List of test segments to keep or remove.

  • keep – If True, we keep the elements in model_set/seg_set, if False, we remove the elements in model_set/seg_set.

Returns

Filtered TrialKey object.
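
The keep flag can be read as "select the listed names" versus "select everything except the listed names". The sketch below shows that behaviour on nested lists; filter_trials is a hypothetical helper, and it assumes that the original ordering of the retained names is preserved, which may differ from the library.

```python
def filter_trials(model_set, seg_set, mask, models, segs, keep=True):
    """Keep (keep=True) or remove (keep=False) the listed models and segments.

    Hypothetical helper for illustration; not part of hyperion.
    """
    models, segs = set(models), set(segs)
    # (name in listed) == keep selects listed names when keeping,
    # and unlisted names when removing.
    m_sel = [i for i, m in enumerate(model_set) if (m in models) == keep]
    s_sel = [j for j, s in enumerate(seg_set) if (s in segs) == keep]
    new_mask = [[mask[i][j] for j in s_sel] for i in m_sel]
    return [model_set[i] for i in m_sel], [seg_set[j] for j in s_sel], new_mask


all_models = ["m1", "m2", "m3"]
all_segs = ["s1", "s2"]
all_mask = [[True, False], [False, True], [True, True]]

kept = filter_trials(all_models, all_segs, all_mask, ["m1", "m3"], ["s2"], keep=True)
removed = filter_trials(all_models, all_segs, all_mask, ["m1", "m3"], ["s2"], keep=False)
```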

split(model_idx, num_model_parts, seg_idx, num_seg_parts)[source]
Splits the TrialKey into num_model_parts x num_seg_parts parts and returns part (model_idx, seg_idx).

Parameters
  • model_idx – Model index of the part to return from 1 to num_model_parts.

  • num_model_parts – Number of parts to split the model list.

  • seg_idx – Segment index of the part to return from 1 to num_seg_parts.

  • num_seg_parts – Number of parts to split the test segment list.

Returns

Subpart of the TrialKey
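
The 1-based part indices can be illustrated with a small sketch. It assumes contiguous, near-equal parts computed with integer arithmetic; the actual partitioning rule used by the library is not stated here and may differ. split_bounds and split_trials are hypothetical names.

```python
def split_bounds(num_items, idx, num_parts):
    """Return [start, end) of part idx (1-based) out of num_parts contiguous parts."""
    if not 1 <= idx <= num_parts:
        raise ValueError("idx must be in 1..num_parts")
    start = (idx - 1) * num_items // num_parts
    end = idx * num_items // num_parts
    return start, end


def split_trials(model_set, seg_set, mask, model_idx, num_model_parts, seg_idx, num_seg_parts):
    """Return the (model_idx, seg_idx) block of a trial grid (assumed partitioning)."""
    m0, m1 = split_bounds(len(model_set), model_idx, num_model_parts)
    s0, s1 = split_bounds(len(seg_set), seg_idx, num_seg_parts)
    sub_mask = [row[s0:s1] for row in mask[m0:m1]]
    return model_set[m0:m1], seg_set[s0:s1], sub_mask


grid_models = ["m1", "m2", "m3", "m4"]
grid_segs = ["s1", "s2", "s3"]
grid_mask = [
    [True, False, True],
    [False, True, True],
    [True, True, False],
    [False, False, True],
]
# Second of 2 model parts, third of 3 segment parts.
block = split_trials(grid_models, grid_segs, grid_mask, 2, 2, 3, 3)
```

Splitting this way lets each job in a parallel scoring run process one block independently; the blocks can later be recombined with merge.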

to_ndx()[source]

Converts TrialKey object into TrialNdx object.

Returns

TrialNdx object.

validate()[source]

Validates the attributes of the TrialKey object.

__eq__(other)[source]

Equal operator

__ne__(other)[source]

Non-equal operator

__cmp__(other)[source]

Comparison operator

test()[source]
class hyperion.utils.trial_ndx.TrialNdx(model_set=None, seg_set=None, trial_mask=None)[source]
Contains the trial index to run speaker recognition trials.

Bosaris compatible Ndx.

model_set

List of model names.

seg_set

List of test segment names.

trial_mask

Boolean matrix with the trials to execute set to True (num_models x num_segments).

__init__(model_set=None, seg_set=None, trial_mask=None)[source]
property num_models
property num_tests
copy()[source]

Makes a copy of the object

sort()[source]

Sorts the object by model and test segment names.

save(file_path)[source]

Saves object to txt/h5 file.

Parameters

file_path – File to write the list.

save_h5(file_path)[source]

Saves object to h5 file.

Parameters

file_path – File to write the list.

save_txt(file_path)[source]

Saves object to txt file.

Parameters

file_path – File to write the list.

classmethod load(file_path)[source]

Loads object from txt/h5 file

Parameters

file_path – File to read the list.

Returns

TrialNdx object.

classmethod load_h5(file_path)[source]

Loads object from h5 file

Parameters

file_path – File to read the list.

Returns

TrialNdx object.

classmethod load_txt(file_path)[source]

Loads object from txt file

Parameters

file_path – File to read the list.

Returns

TrialNdx object.

classmethod merge(ndx_list)[source]

Merges several index objects.

Parameters

ndx_list – List of TrialNdx objects.

Returns

Merged TrialNdx object.

static parse_eval_set(ndx, enroll, test=None, eval_set='enroll-test')[source]

Prepares the data structures required for evaluation.

Parameters
  • ndx – TrialNdx object containing the trials for the main evaluation.

  • enroll – Utt2Info where the keys are file_ids and the second column contains the model names.

  • test – Utt2Info where the keys are the test segment names. Needed in the cases enroll-coh and coh-coh.

  • eval_set – Type of evaluation. enroll-test: main evaluation of enrollment vs. test segments. enroll-coh: enrollment vs. cohort segments. coh-test: cohort vs. test segments. coh-coh: cohort vs. cohort segments.

Returns

ndx: TrialNdx object. enroll: SCPList.

filter(model_set, seg_set, keep=True)[source]

Removes elements from TrialNdx object.

Parameters
  • model_set – List of models to keep or remove.

  • seg_set – List of test segments to keep or remove.

  • keep – If True, we keep the elements in model_set/seg_set, if False, we remove the elements in model_set/seg_set.

Returns

Filtered TrialNdx object.

split(model_idx, num_model_parts, seg_idx, num_seg_parts)[source]
Splits the TrialNdx into num_model_parts x num_seg_parts parts and returns part (model_idx, seg_idx).

Parameters
  • model_idx – Model index of the part to return from 1 to num_model_parts.

  • num_model_parts – Number of parts to split the model list.

  • seg_idx – Segment index of the part to return from 1 to num_seg_parts.

  • num_seg_parts – Number of parts to split the test segment list.

Returns

Subpart of the TrialNdx

validate()[source]

Validates the attributes of the TrialNdx object.

apply_segmentation_to_test(segment_list)[source]

Splits each test segment into multiple sub-segments. Useful to create an ndx for speaker diarization or tracking.

Parameters

segment_list – ExtSegmentList object with mapping of file_id to ext_segment_id

Returns

New TrialNdx object with segment_ids in test instead of file_id.
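
The effect on the trial mask can be sketched as a column expansion: every sub-segment inherits the trials of the file it was cut from. The sketch below uses a plain dictionary in place of the ExtSegmentList object; expand_test_segments and the identifiers are assumptions made for illustration.

```python
def expand_test_segments(seg_set, trial_mask, file_to_segments):
    """Replace each file_id column by one column per sub-segment.

    file_to_segments is a dict stand-in for the file_id to ext_segment_id mapping.
    Hypothetical helper for illustration; not part of hyperion.
    """
    new_seg_set = []
    source_col = []  # for each new column, the column it was copied from
    for j, file_id in enumerate(seg_set):
        for sub_id in file_to_segments[file_id]:
            new_seg_set.append(sub_id)
            source_col.append(j)
    new_mask = [[row[j] for j in source_col] for row in trial_mask]
    return new_seg_set, new_mask


file_segs = {"f1": ["f1-0", "f1-1"], "f2": ["f2-0"]}
new_seg_set, new_mask = expand_test_segments(
    ["f1", "f2"], [[True, False], [True, True]], file_segs
)
```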

__eq__(other)[source]

Equal operator

__ne__(other)[source]

Non-equal operator

__cmp__(other)[source]

Comparison operator

test()[source]
class hyperion.utils.trial_scores.TrialScores(model_set=None, seg_set=None, scores=None, score_mask=None)[source]
Contains the scores for the speaker recognition trials.

Bosaris compatible Scores.

model_set

List of model names.

seg_set

List of test segment names.

scores

Matrix with the scores (num_models x num_segments).

score_mask

Boolean matrix with the trials with valid scores set to True (num_models x num_segments).

__init__(model_set=None, seg_set=None, scores=None, score_mask=None)[source]
property num_models
property num_tests
copy()[source]

Makes a copy of the object

sort()[source]

Sorts the object by model and test segment names.

save(file_path)[source]

Saves object to txt/h5 file.

Parameters

file_path – File to write the list.

save_h5(file_path)[source]

Saves object to h5 file.

Parameters

file_path – File to write the list.

save_txt(file_path)[source]

Saves object to txt file.

Parameters

file_path – File to write the list.

classmethod load(file_path)[source]

Loads object from txt/h5 file

Parameters

file_path – File to read the list.

Returns

TrialScores object.

classmethod load_h5(file_path)[source]

Loads object from h5 file

Parameters

file_path – File to read the list.

Returns

TrialScores object.

classmethod load_txt(file_path)[source]

Loads object from txt file

Parameters

file_path – File to read the list.

Returns

TrialScores object.

classmethod merge(scr_list)[source]

Merges several score objects.

Parameters

scr_list – List of TrialScores objects.

Returns

Merged TrialScores object.

filter(model_set, seg_set, keep=True, raise_missing=True)[source]

Removes elements from TrialScores object.

Parameters
  • model_set – List of models to keep or remove.

  • seg_set – List of test segments to keep or remove.

  • keep – If True, we keep the elements in model_set/seg_set, if False, we remove the elements in model_set/seg_set.

  • raise_missing – Raises exception if there are elements in model_set or seg_set that are not in the object.

Returns

Filtered TrialScores object.

split(model_idx, num_model_parts, seg_idx, num_seg_parts)[source]
Splits the TrialScores into num_model_parts x num_seg_parts parts and returns part (model_idx, seg_idx).

Parameters
  • model_idx – Model index of the part to return from 1 to num_model_parts.

  • num_model_parts – Number of parts to split the model list.

  • seg_idx – Segment index of the part to return from 1 to num_seg_parts.

  • num_seg_parts – Number of parts to split the test segment list.

Returns

Subpart of the TrialScores

validate()[source]

Validates the attributes of the TrialScores object.

align_with_ndx(ndx, raise_missing=True)[source]

Aligns scores, model_set and seg_set with TrialNdx or TrialKey.

Parameters
  • ndx – TrialNdx or TrialKey object.

  • raise_missing – Raises exception if there are trials in ndx that are not in the score object.

Returns

Aligned TrialScores object.
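
Alignment amounts to reordering the rows and columns of the score matrix so they follow the model and segment order of the reference object. The sketch below is a simplified illustration: it treats every (model, segment) pair of the reference as a requested trial and ignores the reference trial mask, which the real method presumably takes into account. align_scores is a hypothetical name.

```python
def align_scores(model_set, seg_set, scores, ref_models, ref_segs, raise_missing=True):
    """Reorder scores to follow ref_models x ref_segs (simplified sketch).

    Returns the aligned matrix and a mask of the cells that had a score.
    """
    m_idx = {m: i for i, m in enumerate(model_set)}
    s_idx = {s: j for j, s in enumerate(seg_set)}
    aligned = [[0.0] * len(ref_segs) for _ in ref_models]
    mask = [[False] * len(ref_segs) for _ in ref_models]
    for i, m in enumerate(ref_models):
        for j, s in enumerate(ref_segs):
            if m in m_idx and s in s_idx:
                aligned[i][j] = scores[m_idx[m]][s_idx[s]]
                mask[i][j] = True
            elif raise_missing:
                raise KeyError(f"missing score for trial ({m}, {s})")
    return aligned, mask


# Scores stored in a different order than the reference, and without segment "z".
aligned, aligned_mask = align_scores(
    ["b", "a"], ["y", "x"], [[1.0, 2.0], [3.0, 4.0]],
    ["a", "b"], ["x", "y", "z"], raise_missing=False,
)
```

With raise_missing=True the same call would raise, because no score exists for segment "z".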

get_tar_non(key)[source]

Returns target and non target scores.

Parameters

key – TrialKey object.

Returns

Numpy array with target scores. Numpy array with non-target scores.
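
Conceptually, the target and non-target scores are the entries of the score matrix selected by the tar and non masks of the key. A minimal sketch on nested lists, assuming the scores are already aligned with the key (select_tar_non is a hypothetical name):

```python
def select_tar_non(scores, tar, non):
    """Collect the scores at target and non-target positions, in row-major order."""
    tar_scores = [
        s for s_row, t_row in zip(scores, tar) for s, t in zip(s_row, t_row) if t
    ]
    non_scores = [
        s for s_row, n_row in zip(scores, non) for s, n in zip(s_row, n_row) if n
    ]
    return tar_scores, non_scores


tar_scores, non_scores = select_tar_non(
    [[2.5, -1.0], [-0.5, 3.0]],
    [[True, False], [False, True]],
    [[False, True], [True, False]],
)
```

The two flat lists are the usual input for metrics such as EER or minimum DCF.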

set_missing_to_value(ndx, val)[source]

Aligns the scores with a TrialNdx and sets the trials with missing scores to the same value.

Parameters
  • ndx – TrialNdx or TrialKey object.

  • val – Value for the missing scores.

Returns

Aligned TrialScores object.
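
The filling step can be pictured as follows: any trial requested by the index but lacking a valid score receives val and is then marked as valid. This sketch assumes the matrices are already aligned; fill_missing is a hypothetical helper, not the library implementation.

```python
def fill_missing(scores, score_mask, trial_mask, val):
    """Give val to trials requested in trial_mask that have no valid score."""
    filled = [row[:] for row in scores]
    new_mask = [row[:] for row in score_mask]
    for i, row in enumerate(trial_mask):
        for j, wanted in enumerate(row):
            if wanted and not score_mask[i][j]:
                filled[i][j] = val
                new_mask[i][j] = True
    return filled, new_mask


filled, filled_mask = fill_missing(
    [[1.0, 0.0], [0.0, 2.0]],
    [[True, False], [False, True]],
    [[True, True], [False, True]],
    -10.0,
)
```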

transform(f)[source]

Applies a function to the valid scores of the object.

Parameters

f – function handle.
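
A typical use would be score calibration, where an affine function is applied to every valid score. The sketch below shows the idea on nested lists, leaving entries outside the mask untouched; transform_valid and the calibration parameters are assumptions made for illustration.

```python
def transform_valid(scores, score_mask, f):
    """Apply f only where score_mask is True; other entries are left unchanged."""
    return [
        [f(s) if ok else s for s, ok in zip(s_row, m_row)]
        for s_row, m_row in zip(scores, score_mask)
    ]


def calibrate(s):
    """Hypothetical affine calibration with scale 2 and offset 1."""
    return 2 * s + 1


calibrated = transform_valid(
    [[0.0, 5.0], [2.0, 0.0]],
    [[True, False], [True, True]],
    calibrate,
)
```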

__eq__(other)[source]

Equal operator

__ne__(other)[source]

Non-equal operator

__cmp__(other)[source]

Comparison operator

test()[source]
class hyperion.utils.trial_stats.TrialStats(df_stats)[source]

Contains ancillary statistics from the trial, such as quality measures like SNR.

This class was created to store statistics about adversarial attacks like SNR (signal-to-perturbation ratio), Linf, L2 norms of the perturbation, etc.

df_stats

pandas dataframe containing the stats. The dataframe needs to include the modelid and segmentid columns

__init__(df_stats)[source]
classmethod load(file_path)[source]

Loads stats file

Parameters

file_path – stats file in csv format

Returns

TrialStats object.

save_h5(file_path)[source]

Saves object to file.

Parameters

file_path – CSV format file

get_stats_mat(stat_name, ndx, raise_missing=True)[source]

Returns a matrix of trial statistics sorted to match a given Ndx or Key object.

Parameters
  • stat_name – name of the statistic (e.g. snr, linf), as given in the column name of the dataframe.

  • ndx – Ndx or Key object

Returns

Stat matrix (n_models x n_tests)
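
The conversion from a long table (one row per trial) to a matrix ordered like the index can be sketched as a lookup by (modelid, segmentid). Here a list of dictionaries stands in for the pandas dataframe, and stats_matrix and the missing argument are hypothetical.

```python
def stats_matrix(rows, stat_name, model_set, seg_set, missing=None):
    """Arrange one statistic as a (models x segments) matrix.

    rows is a list of dicts with 'modelid', 'segmentid' and the statistic column.
    """
    lookup = {(r["modelid"], r["segmentid"]): r[stat_name] for r in rows}
    return [[lookup.get((m, s), missing) for s in seg_set] for m in model_set]


stat_rows = [
    {"modelid": "spk1", "segmentid": "utt_a", "snr": 30.0},
    {"modelid": "spk2", "segmentid": "utt_a", "snr": 25.0},
    {"modelid": "spk1", "segmentid": "utt_b", "snr": 12.5},
]
snr_mat = stats_matrix(stat_rows, "snr", ["spk1", "spk2"], ["utt_a", "utt_b"])
```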

reset_stats_mats()[source]
class hyperion.utils.sparse_trial_key.SparseTrialKey(model_set=None, seg_set=None, tar=None, non=None, model_cond=None, seg_cond=None, trial_cond=None, model_cond_name=None, seg_cond_name=None, trial_cond_name=None)[source]
Contains the trial key for speaker recognition trials.

Bosaris compatible Key.

model_set

List of model names.

seg_set

List of test segment names.

tar

Boolean matrix with target trials set to True (num_models x num_segments).

non

Boolean matrix with non-target trials set to True (num_models x num_segments).

model_cond

Conditions related to the model.

seg_cond

Conditions related to the test segment.

trial_cond

Conditions related to the combination of model and test segment.

model_cond_name

String list with the names of the model conditions.

seg_cond_name

String list with the names of the segment conditions.

trial_cond_name

String list with the names of the trial conditions.

__init__(model_set=None, seg_set=None, tar=None, non=None, model_cond=None, seg_cond=None, trial_cond=None, model_cond_name=None, seg_cond_name=None, trial_cond_name=None)[source]
save_h5(file_path)[source]

Saves object to h5 file.

Parameters

file_path – File to write the list.

save_txt(file_path)[source]

Saves object to txt file.

Parameters

file_path – File to write the list.

classmethod load_h5(file_path)[source]

Loads object from h5 file

Parameters

file_path – File to read the list.

Returns

TrialKey object.

classmethod load_txt(file_path)[source]

Loads object from txt file

Parameters

file_path – File to read the list.

Returns

TrialKey object.

classmethod merge(key_list)[source]

Merges several key objects.

Parameters

key_list – List of TrialKey objects.

Returns

Merged TrialKey object.

to_ndx()[source]

Converts TrialKey object into TrialNdx object.

Returns

TrialNdx object.

validate()[source]

Validates the attributes of the TrialKey object.

classmethod from_trial_key(key)[source]
__eq__(other)[source]

Equal operator

__cmp__(other)

Comparison operator

__ne__(other)

Non-equal operator

copy()

Makes a copy of the object

filter(model_set, seg_set, keep=True)

Removes elements from TrialKey object.

Parameters
  • model_set – List of models to keep or remove.

  • seg_set – List of test segments to keep or remove.

  • keep – If True, we keep the elements in model_set/seg_set, if False, we remove the elements in model_set/seg_set.

Returns

Filtered TrialKey object.

classmethod load(file_path)

Loads object from txt/h5 file

Parameters

file_path – File to read the list.

Returns

TrialKey object.

property num_models
property num_tests
save(file_path)

Saves object to txt/h5 file.

Parameters

file_path – File to write the list.

sort()

Sorts the object by model and test segment names.

split(model_idx, num_model_parts, seg_idx, num_seg_parts)
Splits the TrialKey into num_model_parts x num_seg_parts parts and returns part (model_idx, seg_idx).

Parameters
  • model_idx – Model index of the part to return from 1 to num_model_parts.

  • num_model_parts – Number of parts to split the model list.

  • seg_idx – Segment index of the part to return from 1 to num_seg_parts.

  • num_seg_parts – Number of parts to split the test segment list.

Returns

Subpart of the TrialKey

test()
class hyperion.utils.sparse_trial_scores.SparseTrialScores(model_set=None, seg_set=None, scores=None, score_mask=None)[source]
Contains the scores for the speaker recognition trials.

Bosaris compatible Scores.

model_set

List of model names.

seg_set

List of test segment names.

scores

Matrix with the scores (num_models x num_segments).

score_mask

Boolean matrix with the trials with valid scores set to True (num_models x num_segments).

__init__(model_set=None, seg_set=None, scores=None, score_mask=None)[source]
save_h5(file_path)[source]

Saves object to h5 file.

Parameters

file_path – File to write the list.

save_txt(file_path)[source]

Saves object to txt file.

Parameters

file_path – File to write the list.

classmethod load_h5(file_path)[source]

Loads object from h5 file

Parameters

file_path – File to read the list.

Returns

TrialScores object.

classmethod load_txt(file_path)[source]

Loads object from txt file

Parameters

file_path – File to read the list.

Returns

SparseTrialScores object.

classmethod merge(scr_list)[source]

Merges several score objects.

Parameters

scr_list – List of TrialScores objects.

Returns

Merged TrialScores object.

split(model_idx, num_model_parts, seg_idx, num_seg_parts)[source]
Splits the TrialScores into num_model_parts x num_seg_parts parts and returns part (model_idx, seg_idx).

Parameters
  • model_idx – Model index of the part to return from 1 to num_model_parts.

  • num_model_parts – Number of parts to split the model list.

  • seg_idx – Segment index of the part to return from 1 to num_seg_parts.

  • num_seg_parts – Number of parts to split the test segment list.

Returns

Subpart of the TrialScores

validate()[source]

Validates the attributes of the SparseTrialScores object.

filter(model_set, seg_set, keep=True, raise_missing=True)[source]

Removes elements from TrialScores object.

Parameters
  • model_set – List of models to keep or remove.

  • seg_set – List of test segments to keep or remove.

  • keep – If True, we keep the elements in model_set/seg_set, if False, we remove the elements in model_set/seg_set.

  • raise_missing – Raises exception if there are elements in model_set or seg_set that are not in the object.

Returns

Filtered TrialScores object.

align_with_ndx(ndx, raise_missing=True)[source]

Aligns scores, model_set and seg_set with TrialNdx or TrialKey.

Parameters
  • ndx – TrialNdx or TrialKey object.

  • raise_missing – Raises exception if there are trials in ndx that are not in the score object.

Returns

Aligned TrialScores object.

get_tar_non(key)[source]

Returns target and non target scores.

Parameters

key – TrialKey object.

Returns

Numpy array with target scores. Numpy array with non-target scores.

classmethod from_trial_scores(scr)[source]
set_missing_to_value(ndx, val)[source]

Aligns the scores with a TrialNdx and sets the trials with missing scores to the same value.

Parameters
  • ndx – TrialNdx or TrialKey object.

  • val – Value for the missing scores.

Returns

Aligned SparseTrialScores object.

__eq__(other)[source]

Equal operator

__cmp__(other)

Comparison operator

__ne__(other)

Non-equal operator

copy()

Makes a copy of the object

classmethod load(file_path)

Loads object from txt/h5 file

Parameters

file_path – File to read the list.

Returns

TrialScores object.

property num_models
property num_tests
save(file_path)

Saves object to txt/h5 file.

Parameters

file_path – File to write the list.

sort()

Sorts the object by model and test segment names.

test()
transform(f)

Applies a function to the valid scores of the object.

Parameters

f – function handle.

Kaldi Data Directory Manipulation Classes

These are classes to manipulate Kaldi data directory files like wav.scp, utt2spk, segments, rttm.

class hyperion.utils.scp_list.SCPList(key, file_path, offset=None, range_spec=None)[source]

Class to manipulate script lists.

key

segment key name.

file_path

path to the file on hard drive, wav, ark or hdf5 file.

offset

Byte in Ark file where the data is located.

range_spec

range of frames (rows) to read.

key_to_index

Dictionary that returns the position of a key in the list.

__init__(key, file_path, offset=None, range_spec=None)[source]
validate()[source]

Validates the attributes of the SCPList object.

copy()[source]

Makes a copy of the object.

__len__()[source]

Returns the number of elements in the list.

len()[source]

Returns the number of elements in the list.

_create_dict()[source]

Creates dictionary that returns the position of a segment in the list.

get_index(key)[source]

Returns the position of key in the list.

__contains__(key)[source]

Returns True if the list contains the key

__getitem__(key)[source]
Accesses the data in the list by key or by index, as in a dictionary.

If the input is a string key:

    scp = SCPList(keys, file_paths, offsets, ranges)
    file_path, offset, range = scp['data1']

If the input is an index:

    key, file_path, offset, range = scp[0]

Parameters

key – String key or integer index.

Returns

If key is a string: file_path, offset and range_spec given the key. If key is the index in the key list: key, file_path, offset and range_spec given the index.
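
The dual lookup (string key versus integer position) relies on a key_to_index dictionary. The toy class below illustrates that pattern only; TinyScp is a hypothetical stand-in that stores just keys and file paths, not the SCPList implementation.

```python
class TinyScp:
    """Toy key/path list illustrating lookup by key or by position."""

    def __init__(self, keys, file_paths):
        self.keys = list(keys)
        self.file_paths = list(file_paths)
        # Maps each key to its position, so key lookup avoids a linear scan.
        self.key_to_index = {k: i for i, k in enumerate(self.keys)}

    def __len__(self):
        return len(self.keys)

    def __contains__(self, key):
        return key in self.key_to_index

    def __getitem__(self, key):
        if isinstance(key, str):
            # String key: return only the data stored for that key.
            return self.file_paths[self.key_to_index[key]]
        # Integer index: return the key together with its data.
        return self.keys[key], self.file_paths[key]


tiny_scp = TinyScp(["utt1", "utt2"], ["a.wav", "b.wav"])
by_key = tiny_scp["utt2"]
by_index = tiny_scp[0]
```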

add_prefix_to_filepath(prefix)[source]

Adds a prefix to the file path

sort()[source]

Sorts the list by key

save(file_path, sep=' ', offset_sep=':')[source]

Saves script list to text file.

Parameters
  • file_path – File to write the list.

  • sep – Separator between the key and file_path in the text file.

  • offset_sep – Separator between file_path and offset.

static parse_script(script, offset_sep)[source]

Parses the parts of the second field of the scp text file.

Parameters
  • script – Second column of scp file.

  • offset_sep – Separator between file_path and offset.

Returns

file_path, offset and range_spec.
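
In Kaldi script files, the second field typically looks like feats.ark:1234 or feats.ark:1234[0:100], that is, a path, an optional byte offset and an optional range in square brackets. The following is a minimal sketch of such a parser written from that convention. parse_scp_field is a hypothetical function; it returns the range as a raw string, which may differ from the form returned by parse_script.

```python
import re


def parse_scp_field(script, offset_sep=":"):
    """Split 'path<sep>offset[range]' into (file_path, offset, range_spec).

    Sketch only: offset and range are optional, the range is kept as a string.
    """
    range_spec = None
    match = re.match(r"^(.*)\[(.*)\]$", script)
    if match:
        script, range_spec = match.group(1), match.group(2)
    file_path, offset = script, None
    head, sep, tail = script.rpartition(offset_sep)
    # Only treat the tail as an offset when it is purely numeric.
    if sep and tail.isdigit():
        file_path, offset = head, int(tail)
    return file_path, offset, range_spec


full = parse_scp_field("feats.ark:1234[0:100]")
no_range = parse_scp_field("feats.ark:42")
plain = parse_scp_field("audio.wav")
```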

classmethod load(file_path, sep=' ', offset_sep=':', is_wav=False)[source]

Loads script list from text file.

Parameters
  • file_path – File to read the list.

  • sep – Separator between the key and file_path in the text file.

  • offset_sep – Separator between file_path and offset.

Returns

SCPList object.

split(idx, num_parts, group_by_key=True)[source]

Splits SCPList into num_parts and returns part idx.

Parameters
  • idx – Part to return from 1 to num_parts.

  • num_parts – Number of parts to split the list.

  • group_by_key – If True, all the lines with the same key go to the same part.

Returns

Sub SCPList
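
The group_by_key option matters when a key appears on several lines: the split is then made over the unique keys, so no key is divided between parts. The sketch below returns the line positions selected for a part, assuming contiguous near-equal parts; split_positions is a hypothetical helper and the library's exact partitioning may differ.

```python
def split_positions(keys, idx, num_parts, group_by_key=True):
    """Return the positions of the lines in part idx (1-based)."""
    if group_by_key:
        uniq = list(dict.fromkeys(keys))  # unique keys in first-seen order
        start = (idx - 1) * len(uniq) // num_parts
        end = idx * len(uniq) // num_parts
        selected = set(uniq[start:end])
        return [i for i, k in enumerate(keys) if k in selected]
    start = (idx - 1) * len(keys) // num_parts
    end = idx * len(keys) // num_parts
    return list(range(start, end))


line_keys = ["a", "a", "b", "c", "c", "c"]
grouped_1 = split_positions(line_keys, 1, 2)
grouped_2 = split_positions(line_keys, 2, 2)
ungrouped_1 = split_positions(line_keys, 1, 2, group_by_key=False)
```

Without grouping, the first part would end in the middle of the run of lines that follows key "a", which is what the option is meant to avoid for keys that span several lines.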

classmethod merge(scp_lists)[source]

Merges several SCPList.

Parameters

scp_lists – List of SCPLists

Returns

SCPList object concatenating the scp_lists.

filter(filter_key, keep=True)[source]

Removes elements from SCPList object by key

Parameters
  • filter_key – List with the keys of the elements to keep or remove.

  • keep – If True, we keep the elements in filter_key; if False, we remove the elements in filter_key.

Returns

SCPList object.

filter_paths(filter_key, keep=True)[source]

Removes elements of SCPList by file_path

Parameters
  • filter_key – List with the file_path of the elements to keep or remove.

  • keep – If True, we keep the elements in filter_key; if False, we remove the elements in filter_key.

Returns

SCPList object.

filter_index(index, keep=True)[source]

Removes elements of SCPList by index

Parameters
  • index – List with the indices of the elements to keep or remove.

  • keep – If True, we keep the elements in index; if False, we remove the elements in index.

Returns

SCPList object.

shuffle(seed=1024, rng=None)[source]

Shuffles the elements of the list.

Parameters
  • seed – Seed for random number generator.

  • rng – numpy random number generator object.

Returns

Index used to shuffle the list.
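
Returning the permutation index is useful because the same index can be applied to parallel lists so they stay in step. The sketch below shows this with Python's standard random module as a stand-in; the method itself is documented as taking a numpy random number generator, so shuffle_with_index is only an illustrative, hypothetical helper.

```python
import random


def shuffle_with_index(items, seed=1024, rng=None):
    """Shuffle items and return (shuffled_items, index).

    Uses random.Random as a stand-in for a numpy generator.
    """
    if rng is None:
        rng = random.Random(seed)
    index = list(range(len(items)))
    rng.shuffle(index)
    return [items[i] for i in index], index


utt_keys = ["utt1", "utt2", "utt3", "utt4"]
utt_paths = ["a.wav", "b.wav", "c.wav", "d.wav"]
shuffled_keys, perm = shuffle_with_index(utt_keys)
# Reuse the same index so the paths stay paired with their keys.
shuffled_paths = [utt_paths[i] for i in perm]
_, perm_again = shuffle_with_index(utt_keys)
```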

__eq__(other)[source]

Equal operator

__ne__(other)[source]

Non-equal operator

__cmp__(other)[source]

Comparison operator

class hyperion.utils.utt2info.Utt2Info(utt_info)[source]

Class to manipulate utt2spk, utt2lang, etc. files.

key

segment key name.

info
key_to_index

Dictionary that returns the position of a key in the list.

__init__(utt_info)[source]
validate()[source]

Validates the attributes of the Utt2Info object.

classmethod create(key, info)[source]
property num_info_fields
property key
property info
copy()[source]

Makes a copy of the object.

__len__()[source]

Returns the number of elements in the list.

len()[source]

Returns the number of elements in the list.

_create_dict()[source]

Creates dictionary that returns the position of a segment in the list.

get_index(key)[source]

Returns the position of key in the list.

__contains__(key)[source]

Returns True if the list contains the key

__getitem__(key)[source]
Accesses the data in the list by key or by index, as in a dictionary.

If the input is a string key:

    utt2spk = Utt2Info(info)
    spk_id = utt2spk['data1']

If the input is an index:

    key, spk_id = utt2spk[0]

Parameters

key – String key or integer index.

Returns

If key is a string: info corresponding to the key. If key is the index in the key list: key and info given the index.

sort(field=0)[source]

Sorts the list by key

save(file_path, sep=' ')[source]

Saves uttinfo to text file.

Parameters
  • file_path – File to write the list.

  • sep – Separator between the key and the info fields in the text file.

classmethod load(file_path, sep=' ', dtype={0: str, 1: str})[source]

Loads utt2info list from text file.

Parameters
  • file_path – File to read the list.

  • sep – Separator between the key and the info fields in the text file.

  • dtype – Dictionary with the dtypes of each column.

Returns

Utt2Info object

split(idx, num_parts, group_by_field=0)[source]

Splits Utt2Info into num_parts and returns part idx.

Parameters
  • idx – Part to return from 1 to num_parts.

  • num_parts – Number of parts to split the list.

  • group_by_field – All the lines with the same value in column group_by_field go to the same part.

Returns

Sub Utt2Info object

classmethod merge(info_lists)[source]

Merges several Utt2Info tables.

Parameters

info_lists – List of Utt2Info

Returns

Utt2Info object concatenating the info_lists.

filter(filter_key, keep=True)[source]

Removes elements from Utt2Info object by key

Parameters
  • filter_key – List with the keys of the elements to keep or remove.

  • keep – If True, we keep the elements in filter_key; if False, we remove the elements in filter_key.

Returns

Utt2Info object.

filter_info(filter_key, field=1, keep=True)[source]

Removes elements of Utt2Info by info value

Parameters
  • filter_key – List with the info values of the elements to keep or remove.

  • field – Field number corresponding to the info to filter.

  • keep – If True, we keep the elements in filter_key; if False, we remove the elements in filter_key.

Returns

Utt2Info object.

filter_index(index, keep=True)[source]

Removes elements of Utt2Info by index

Parameters
  • index – List with the indices of the elements to keep or remove.

  • keep – If True, we keep the elements in index; if False, we remove the elements in index.

Returns

Utt2Info object.

shuffle(seed=1024, rng=None)[source]

Shuffles the elements of the list.

Parameters
  • seed – Seed for random number generator.

  • rng – numpy random number generator object.

Returns

Index used to shuffle the list.

__eq__(other)[source]

Equal operator

__ne__(other)[source]

Non-equal operator

__cmp__(other)[source]

Comparison operator

class hyperion.utils.segment_list.SegmentList(segments, index_by_file=True)[source]

Class to manipulate segment files

segments

Pandas dataframe.

_index_by_file

if True, the df is indexed by file name; if False, by segment id.

iter_idx

index of the current element for the iterator.

uniq_file_id

unique file names.

__init__(segments, index_by_file=True)[source]
classmethod create(segment_id, file_id, tbeg, tend, index_by_file=True)[source]
validate()[source]

Validates the attributes of the SegmentList object.

property index_by_file
property file_id
property segment_id
property tbeg
property tend
copy()[source]

Makes a copy of the object.

segments_ids_from_file(file_id)[source]

Returns segments_ids corresponding to a given file_id

__len__()[source]

Returns the number of segments in the list.

__contains__(key)[source]

Returns True if the segments contains the key

getitem_by_key(key)[source]
Accesses the segments by file_id or segment_id, as in a dictionary.

If the input is a string key:

    segments = SegmentList(…)
    segment, tbeg, tend = segments.getitem_by_key('file')

Parameters

key – Segment or file key.

Returns

If index_by_file is True, it returns the segments of a given file_id in SegmentList format; otherwise, it returns a DataFrame.

getitem_by_index(index)[source]
Accesses the segments by index.

If the input is an index:

    segments = SegmentList(…)
    segment, tbeg, tend = segments.getitem_by_index(0)

Parameters

index – Integer index.

Returns

If index_by_file is True, it returns the segments of a given file_id in SegmentList format; otherwise, it returns a DataFrame.

__getitem__(key)[source]
Accesses the segments by file_id or segment_id, as in a dictionary.

If the input is a string key:

    segments = SegmentList(…)
    segment, tbeg, tend = segments['file']

Parameters

key – Segment or file key.

Returns

If index_by_file is True, it returns the segments of a given file_id in SegmentList format; otherwise, it returns a DataFrame.

save(file_path, sep=' ')[source]

Saves segments to text file.

Parameters
  • file_path – File to write the list.

  • sep – Separator between the fields

classmethod load(file_path, sep=' ', index_by_file=True)[source]

Loads segment list from text file.

Parameters
  • file_path – File to read the list.

  • sep – Separator between the fields.

Returns

SegmentList object.

filter(filter_key, keep=True)[source]
split(idx, num_parts)[source]
classmethod merge(segment_lists, index_by_file=True)[source]
to_bin_vad(key, frame_shift=10, num_frames=None)[source]

Converts segments to binary VAD

Parameters
  • key – Segment or file key

  • frame_shift – frame_shift in milliseconds

  • num_frames – number of frames of file corresponding to key, if None it takes the maximum tend for file

Returns

If index_by_file is True, it returns the VAD joining all segments of one file; otherwise, it returns the VAD for one given segment.
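
The conversion from time segments to a frame-level VAD can be sketched as marking every frame whose index falls between the rounded start and end of a segment. The rounding convention and the helper name segments_to_bin_vad are assumptions made for illustration; segment times are taken to be in seconds and frame_shift in milliseconds, as in the parameter description.

```python
def segments_to_bin_vad(segments, frame_shift=10, num_frames=None):
    """Convert (tbeg, tend) pairs in seconds to a list of per-frame booleans.

    Sketch only: frame boundaries are obtained by rounding time / frame_shift.
    """
    shift_s = frame_shift / 1000.0
    if num_frames is None:
        # Default: cover up to the latest segment end.
        num_frames = int(round(max(tend for _, tend in segments) / shift_s))
    vad = [False] * num_frames
    for tbeg, tend in segments:
        first = int(round(tbeg / shift_s))
        last = min(int(round(tend / shift_s)), num_frames)
        for t in range(first, last):
            vad[t] = True
    return vad


vad = segments_to_bin_vad([(0.0, 0.03), (0.05, 0.08)])
vad_padded = segments_to_bin_vad([(0.0, 0.03), (0.05, 0.08)], num_frames=10)
```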

__eq__(other)[source]

Equal operator

__ne__(other)[source]

Non-equal operator

__cmp__(other)[source]

Comparison operator

class hyperion.utils.rttm.RTTM(segments, index_by_file=True)[source]

Class to manipulate rttm files

df

Pandas dataframe.

_index_by_file

if True the df is indexed by file name, if False by segment id.

iter_idx

index of the current element for the iterator.

unique_file_key

unique file names.

__init__(segments, index_by_file=True)[source]
classmethod create(segment_type, file_id, chnl=None, tbeg=None, tdur=None, ortho=None, stype=None, name=None, conf=None, slat=None, index_by_file=True)[source]
classmethod create_spkdiar(file_id, tbeg, tdur, spk_id, conf=None, chnl=None, index_by_file=True, prepend_file_id=False)[source]
classmethod create_spkdiar_single_file(file_id, tbeg, tdur, spk_id, conf=None, chnl=None, index_by_file=True, prepend_file_id=False)[source]
classmethod create_spkdiar_from_segments(segments, spk_id, conf=None, chnl=None, index_by_file=True, prepend_file_id=False)[source]
classmethod create_spkdiar_from_ext_segments(ext_segments, chnl=None, index_by_file=True, prepend_file_id=False)[source]
validate()[source]

Validates the attributes of the RTTM object.

property index_by_file
property file_id
property tbeg
property tdur
property name
copy()[source]

Makes a copy of the object.

property num_files
property total_num_spks
property num_spks_per_file
property avg_num_spks_per_file
__len__()[source]

Returns the number of segments in the list.

__contains__(key)[source]

Returns True if the segments contains the key

__getitem__(key)[source]
Accesses the segments by file_id or segment, as in a dictionary.

If the input is a string key:

    segments = SegmentList(…)
    segment, tbeg, tend = segments['file']

Parameters

key – Segment or file key.

Returns

If index_by_file is True, it returns the segments of a given file_id in SegmentList format; otherwise, it returns a DataFrame.

save(file_path, sep=' ')[source]

Saves segments to text file.

Parameters
  • file_path – File to write the list.

  • sep – Separator between the fields

classmethod load(file_path, sep=' ', index_by_file=True)[source]

Loads RTTM from text file.

Parameters
  • file_path – File to read the list.

  • sep – Separator between the fields.

Returns

RTTM object.

filter(filter_key, keep=True)[source]
split(idx, num_parts)[source]
classmethod merge(rttm_list, index_by_file=True)[source]
merge_adjacent_segments(t_margin=0)[source]
__eq__(other)[source]

Equal operator

__ne__(other)[source]

Non-equal operator

__cmp__(other)[source]

Comparison operator

get_segment_names_from_timestamps(file_id, timestamps, segment_type='SPEAKER', min_seg_dur=0.1)[source]
get_files_with_names_diff_to_file(file_id, segment_type='SPEAKER')[source]
prepend_file_id_to_name(segment_type='SPEAKER')[source]
get_segments_from_file(file_id)[source]
get_uniq_names_for_file(file_id=None)[source]
get_bin_frame_mask_for_spk(file_id, name, frame_length=0.025, frame_shift=0.01, snip_edges=False, signal_length=None, max_frames=None)[source]

Returns binary mask of a given speaker to select feature frames

Parameters
  • file_id – file identifier

  • name – speaker id

  • frame_length – frame-length used to compute the VAD

  • frame_shift – frame-shift used to compute the VAD

  • snip_edges – if True, computing VAD used snip-edges option

  • signal_length – total duration of the signal, if None it takes it from the last timestamp

  • max_frames – expected number of frames, if None it computes automatically

Returns

Binary VAD np.array

get_bin_sample_mask_for_spk(file_id, name, fs, signal_length=None, max_samples=None)[source]

Returns binary mask of a given speaker to select waveform samples

Parameters
  • file_id – file identifier

  • name – speaker id

  • fs – sampling frequency

  • signal_length – total duration of the signal, if None it takes it from the last timestamp

  • max_samples – expected number of samples, if None it computes automatically

Returns

Binary mask np.array
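
A minimal sketch of how a sample-level mask can be derived from segment start times and durations. It does not call hyperion; the function name `sample_mask_from_segments` is hypothetical, and treating max_samples as the forced length of the output is an assumption.

```python
import numpy as np

def sample_mask_from_segments(tbeg, tdur, fs, signal_length=None, max_samples=None):
    """Boolean mask over waveform samples that fall inside any segment."""
    tbeg = np.asarray(tbeg, dtype=float)
    tend = tbeg + np.asarray(tdur, dtype=float)
    if signal_length is None:
        # Fall back to the last end time, as the parameter description suggests.
        signal_length = float(tend.max())
    num_samples = int(round(signal_length * fs))
    if max_samples is not None:
        # Assumption: max_samples fixes the length of the returned mask.
        num_samples = max_samples
    mask = np.zeros(num_samples, dtype=bool)
    for b, e in zip(tbeg, tend):
        start = int(round(b * fs))
        stop = min(int(round(e * fs)), num_samples)
        mask[start:stop] = True
    return mask

# Two 0.25 s segments of one speaker in a 1 s signal sampled at 8 Hz.
mask = sample_mask_from_segments([0.0, 0.5], [0.25, 0.25], fs=8, signal_length=1.0)
```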

compute_stats(nbins_dur=None)[source]
to_segment_list()[source]
sort()[source]
tbeg_is_sorted()[source]

Kaldi Matrix Read/Write Classes

These are classes to read/write text and binary matrices from ARK files. They support the compression methods in Kaldi ARK files.

class hyperion.utils.kaldi_matrix.KaldiMatrix(data)[source]

Class to read/write uncompressed kaldi matrices/vectors.

When compressed matrix is found in file, it calls KaldiCompressedMatrix class automatically to uncompress.

data

numpy array with the matrix/vector values.

__init__(data)[source]
to_ndarray()[source]
Returns

numpy array containing the matrix/vector

property num_rows
property num_cols
classmethod read(f, binary, row_offset=0, num_rows=0, sequential_mode=True)[source]

Reads kaldi matrix/vector from file.

Parameters
  • f – Python file object

  • binary – True if we read from binary file and False if we read from text file.

  • row_offset – Reads matrix starting from a given row instead of row 0.

  • num_rows – Number of rows to read; if 0, it reads all the rows.

  • sequential_mode – True if we are reading the ark file sequentially and False if we are using random access.

Returns

KaldiMatrix object.

write(f, binary)[source]

Writes matrix/vector to ark file.

Parameters
  • f – Python file object.

  • binary – True if we write in binary file and False if we write to text file.

static read_shape(f, binary, sequential_mode=True)[source]

Reads the shape of the current matrix/vector in the ark file.

Parameters
  • f – Python file object

  • binary – True if we read from binary file and False if we read from text file.

  • sequential_mode – True if we are reading the ark file sequentially and False if we are using random access. In sequential_mode=True it moves the file pointer to the next matrix.

Returns

Tuple object with shape.

class hyperion.utils.kaldi_matrix.KaldiCompressedMatrix(data=None)[source]

Class to read/write compressed kaldi matrices.

When compressed matrix is found in file, it calls KaldiCompressedMatrix class automatically to uncompress.

data

numpy byte array with the compressed coded matrix.

data_format

{1, 2, 3, 4}

min_value

Minimum value in the matrix.

data_range

max_value - min_value

num_rows

Number of rows in the matrix

num_columns

Number of columns in the matrix

__init__(data=None)[source]
get_data_attrs()[source]
Returns

Coded matrix values in 2D format. Dictionary object with data attributes: data_format, min_value, data_range, percentiles.

classmethod build_from_data_attrs(data, attrs)[source]

Builds object from coded values and attributes

Parameters
  • data – Coded matrix values in 2D format.

  • attrs – Dictionary object with data attributes: data_format, min_value, data_range, percentiles.

Returns

KaldiCompressedMatrix object.

_unpack_header()[source]

Unpacks attributes from header

_pack_header()[source]

Creates header from the object attributes

scale(alpha)[source]

Multiplies matrix by alpha

_compute_global_header(mat, method)[source]

Computes the header

Parameters
  • mat – numpy array with the uncompressed matrix.

  • method – Compression method.

Returns

Byte array with header.

static _get_read_info(header, row_offset=0, num_rows=0)[source]

Gets info needed to read the matrix from file

static _data_size(header)[source]
Returns

Number of bytes of the coded matrix.

classmethod compress(mat, method='auto')[source]

Creates compressed matrix from uncompressed numpy matrix.

Parameters
  • mat – numpy array with the uncompressed matrix.

  • method – Compression method.

Returns

KaldiCompressedMatrix object.

_compute_column_header(v)[source]

Creates the column headers for the speech-feat compression.

Parameters

v – numpy array with the column to compress.

Returns

Byte array with the header of the column containing the 0, 25, 75 and 100 percentile values.

_compress_column(v)[source]

Compress column for the speech-feat compression.

Parameters

v – numpy array with the column to compress.

Returns

  • Byte array with the header of the column containing the 0, 25, 75 and 100 percentile values.

  • Byte array with the coded column.

_uncompress_column(col_header, col_data)[source]

Uncompresses column for the speech-feat compression.

Parameters
  • col_header – Byte array with the header of the column containing the 0, 25, 75 and 100 percentile values.

  • col_data – Byte array with the coded column.

Returns

numpy array with the uncompressed column

static _float_to_char(v, p0, p25, p75, p100)[source]

Codes the column from float to bytes using the given percentiles

static _char_to_float(v, p0, p25, p75, p100)[source]

Decodes the column from bytes to float using the given percentiles
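
The sketch below illustrates percentile-based byte coding of a column. It follows the piecewise-linear scheme of Kaldi's compressed matrices as I understand it: codes 0 to 64 span [p0, p25], 64 to 192 span [p25, p75], and 192 to 255 span [p75, p100]. The function names and the exact rounding are assumptions, not hyperion's code, and the sketch assumes p0 < p25 < p75 < p100.

```python
import numpy as np

def float_to_char(v, p0, p25, p75, p100):
    """Codes floats to bytes, with finer resolution between p25 and p75."""
    v = np.asarray(v, dtype=float)
    out = np.empty(v.shape, dtype=np.uint8)
    lo = v < p25
    hi = v >= p75
    mid = ~(lo | hi)
    out[lo] = np.clip(np.floor((v[lo] - p0) / (p25 - p0) * 64 + 0.5), 0, 64).astype(np.uint8)
    out[mid] = np.clip(64 + np.floor((v[mid] - p25) / (p75 - p25) * 128 + 0.5), 64, 192).astype(np.uint8)
    out[hi] = np.clip(192 + np.floor((v[hi] - p75) / (p100 - p75) * 63 + 0.5), 192, 255).astype(np.uint8)
    return out

def char_to_float(c, p0, p25, p75, p100):
    """Inverse mapping: linear interpolation inside each percentile band."""
    c = np.asarray(c, dtype=float)
    low = p0 + (p25 - p0) * c / 64
    middle = p25 + (p75 - p25) * (c - 64) / 128
    high = p75 + (p100 - p75) * (c - 192) / 63
    return np.where(c <= 64, low, np.where(c <= 192, middle, high))

col = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
codes = float_to_char(col, 0.0, 1.0, 3.0, 4.0)
decoded = char_to_float(codes, 0.0, 1.0, 3.0, 4.0)
```

Values that fall between code points are recovered only approximately; the error is bounded by half a code step within each band.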

to_ndarray()[source]

Uncompresses matrix to numpy array.

Returns

numpy array with uncompressed matrix.

to_matrix()[source]

Uncompresses matrix to KaldiMatrix object.

Returns

KaldiMatrix with uncompressed matrix.

classmethod read(f, binary, row_offset=0, num_rows=0, sequential_mode=True)[source]

Reads kaldi compressed matrix/vector from file.

Parameters
  • f – Python file object

  • binary – True if we read from binary file and False if we read from text file.

  • row_offset – Reads matrix starting from a given row instead of row 0.

  • num_rows – Number of rows to read; if 0, it reads all the rows.

  • sequential_mode – True if we are reading the ark file sequentially and False if we are using random access.

Returns

KaldiCompressedMatrix object.

write(f, binary)[source]

Writes matrix/vector to ark file.

Parameters
  • f – Python file object.

  • binary – True if we write in binary file and False if we write to text file.

static read_shape(f, binary, sequential_mode=True)[source]

Reads the shape of the current matrix/vector in the ark file.

Parameters
  • f – Python file object

  • binary – True if we read from binary file and False if we read from text file.

  • sequential_mode – True if we are reading the ark file sequentially and False if we are using random access. In sequential_mode=True it moves the file pointer to the next matrix.

Returns

Tuple object with shape.

Kaldi I/O Functions

Utils to read/write binary ARK files

Copyright 2018 Johns Hopkins University (Author: Jesus Villalba) Apache 2.0 (http://www.apache.org/licenses/LICENSE-2.0)

Functions to write and read kaldi files

hyperion.utils.kaldi_io_funcs.init_kaldi_output_stream(f, binary)[source]

Writes Kaldi Ark file binary marker.

hyperion.utils.kaldi_io_funcs.init_kaldi_input_stream(f)[source]

Reads Kaldi Ark file binary marker.

hyperion.utils.kaldi_io_funcs.check_token(token)[source]

Checks that token doesn’t have spaces.

hyperion.utils.kaldi_io_funcs.is_token(token)[source]

Checks if token is a valid token.

hyperion.utils.kaldi_io_funcs.read_token(f, binary)[source]

Reads next token from Ark file.

hyperion.utils.kaldi_io_funcs.write_token(f, binary, token)[source]

Writes token to Ark file.

hyperion.utils.kaldi_io_funcs.peek(f, binary, num_bytes=1)[source]

Peeks num_bytes from Ark file.

hyperion.utils.kaldi_io_funcs.read_int32(f, binary)[source]

Reads Int32 from Ark file.

hyperion.utils.kaldi_io_funcs.write_int32(f, binary, val)[source]

Writes Int32 val to Ark file.
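
For orientation, here is a self-contained sketch of the binary conventions these helpers deal with, as used by Kaldi: a binary stream starts with the two bytes "\0B", a token is written followed by a space, and an int32 is written as a size byte (4) followed by four little-endian bytes. The helper names mirror the functions above but are re-implemented here for illustration only; they are not hyperion's code and cover only the binary case.

```python
import io
import struct

def write_binary_marker(f):
    f.write(b"\0B")  # marks the stream as binary

def read_binary_marker(f):
    return f.read(2) == b"\0B"

def write_token(f, token):
    if " " in token:
        raise ValueError("tokens must not contain spaces")
    f.write(token.encode("ascii") + b" ")

def read_token(f):
    chars = []
    while True:
        c = f.read(1)
        if c in (b" ", b""):  # a space or end of file terminates the token
            break
        chars.append(c)
    return b"".join(chars).decode("ascii")

def write_int32(f, val):
    f.write(struct.pack("<bi", 4, val))  # size byte, then the value

def read_int32(f):
    size, val = struct.unpack("<bi", f.read(5))
    if size != 4:
        raise ValueError("expected a 4-byte integer")
    return val

buf = io.BytesIO()
write_binary_marker(buf)
write_token(buf, "FM")
write_int32(buf, 3)
raw = buf.getvalue()
```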

VAD Utils

Functions to manipulate VAD output, convert from binary to timestamps, intersect VADs, etc.

Copyright 2020 Johns Hopkins University (Author: Jesus Villalba) Apache 2.0 (http://www.apache.org/licenses/LICENSE-2.0)

hyperion.utils.vad_utils.merge_vad_timestamps(in_timestamps, tol=0.001)[source]

Merges vad timestamps that are contiguous

Parameters
  • in_timestamps – original time-stamps in start-time, end-time format

  • tol – tolerance; segments separated by less than tol will be merged

Returns

Merged timestamps
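
A minimal sketch of merging contiguous segments, assuming the input is an array of (start, end) rows sorted by start time. The name `merge_contiguous` is hypothetical and the sketch is independent of hyperion's implementation.

```python
import numpy as np

def merge_contiguous(timestamps, tol=0.001):
    """Merges segments whose gap to the previous segment is smaller than tol."""
    ts = np.asarray(timestamps, dtype=float).reshape(-1, 2)
    if len(ts) == 0:
        return ts
    merged = [ts[0].copy()]
    for start, end in ts[1:]:
        if start - merged[-1][1] < tol:
            # Extend the previous segment instead of opening a new one.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append(np.array([start, end]))
    return np.stack(merged)

out = merge_contiguous([[0.0, 1.0], [1.0005, 2.0], [3.0, 4.0]])
```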

hyperion.utils.vad_utils.bin_vad_to_timestamps(vad, frame_length, frame_shift, snip_edges=False, merge_tol=0.001)[source]

Converts binary VAD to a list of start end time stamps

Parameters
  • vad – Binary VAD

  • frame_length – frame-length used to compute the VAD

  • frame_shift – frame-shift used to compute the VAD

  • snip_edges – if True, computing VAD used snip-edges option

  • merge_tol – tolerance to merge contiguous segments

Returns

VAD time stamps referred to the beginning of the file
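
The conversion can be sketched by locating runs of active frames. The sketch assumes that frame i spans [i * frame_shift, i * frame_shift + frame_length], which corresponds to a snip-edges style alignment; how hyperion aligns frames when snip_edges=False is not covered here. The function name is hypothetical.

```python
import numpy as np

def frames_to_timestamps(vad, frame_length, frame_shift):
    """Turns a binary frame mask into (start, end) rows, one per run of ones."""
    vad = np.asarray(vad, dtype=bool)
    padded = np.concatenate(([False], vad, [False])).astype(int)
    change = np.diff(padded)
    starts = np.flatnonzero(change == 1)   # first frame of each run
    stops = np.flatnonzero(change == -1)   # one past the last frame of each run
    tbeg = starts * frame_shift
    tend = (stops - 1) * frame_shift + frame_length
    return np.stack([tbeg, tend], axis=1)

ts = frames_to_timestamps([0, 1, 1, 0, 0, 1], frame_length=0.025, frame_shift=0.01)
```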

hyperion.utils.vad_utils.vad_timestamps_to_bin(in_timestamps, frame_length, frame_shift, snip_edges=False, signal_length=None, max_frames=None)[source]

Converts VAD time-stamps to a binary vector

Parameters
  • in_timestamps – vad timestamps

  • frame_length – frame-length used to compute the VAD

  • frame_shift – frame-shift used to compute the VAD

  • snip_edges – if True, computing VAD used snip-edges option

  • signal_length – total duration of the signal, if None it takes it from the last timestamp

  • max_frames – expected number of frames, if None it computes automatically

Returns

Binary VAD np.array
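
A sketch of the opposite direction, under two stated assumptions: only complete frames are counted (snip-edges style), and a frame is marked as speech when its center falls inside a segment. Neither is confirmed to match hyperion's behavior, and the function name is hypothetical.

```python
import numpy as np

def timestamps_to_frame_mask(timestamps, frame_length, frame_shift,
                             signal_length=None, max_frames=None):
    ts = np.asarray(timestamps, dtype=float).reshape(-1, 2)
    if signal_length is None:
        signal_length = ts[:, 1].max()  # last timestamp
    # Number of complete frames that fit in the signal.
    num_frames = max(int(np.floor((signal_length - frame_length) / frame_shift)) + 1, 0)
    if max_frames is not None:
        num_frames = max_frames  # assumption: forces the output length
    centers = np.arange(num_frames) * frame_shift + frame_length / 2
    mask = np.zeros(num_frames, dtype=bool)
    for start, end in ts:
        mask |= (centers >= start) & (centers < end)
    return mask

frame_mask = timestamps_to_frame_mask([[0.25, 0.5]], frame_length=0.25,
                                      frame_shift=0.125, signal_length=1.0)
```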

hyperion.utils.vad_utils.timestamps_wrt_vad_to_absolute_timestamps(in_timestamps, vad_timestamps)[source]
Converts time stamps relative to a signal with silence removed to absolute time stamps in the original signal.

VAD is also provided in start-end timestamps format.

Parameters
  • in_timestamps – time stamps relative to a signal with silence removed

  • vad_timestamps – vad timestamps used to remove silence from signal

Returns

Absolute VAD time-stamps
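
The mapping can be sketched as follows: each VAD segment occupies a known interval on the silence-free time axis, so a relative segment is shifted by the offset of the VAD segment it overlaps. In this sketch a relative segment that crosses a silence gap is split into several absolute segments; whether hyperion splits or keeps such segments whole is an assumption. The function name is hypothetical.

```python
import numpy as np

def relative_to_absolute(in_timestamps, vad_timestamps):
    vad = np.asarray(vad_timestamps, dtype=float).reshape(-1, 2)
    dur = vad[:, 1] - vad[:, 0]
    # Position of each VAD segment on the silence-free time axis.
    rel_start = np.concatenate(([0.0], np.cumsum(dur)[:-1]))
    rel_end = rel_start + dur
    out = []
    for a, b in np.asarray(in_timestamps, dtype=float).reshape(-1, 2):
        for k in range(len(vad)):
            lo, hi = max(a, rel_start[k]), min(b, rel_end[k])
            if hi > lo:
                offset = vad[k, 0] - rel_start[k]
                out.append((lo + offset, hi + offset))
    return np.array(out, dtype=float).reshape(-1, 2)

# Speech at 1-2 s and 4-6 s; the relative segment 0.5-1.5 s crosses the gap.
abs_ts = relative_to_absolute([[0.5, 1.5]], [[1.0, 2.0], [4.0, 6.0]])
```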

hyperion.utils.vad_utils.timestamps_wrt_bin_vad_to_absolute_timestamps(in_timestamps, vad, frame_length, frame_shift, snip_edges=False)[source]
Converts time stamps relative to a signal with silence removed to absolute time stamps in the original signal.

VAD is provided in binary format.

Parameters
  • in_timestamps – time stamps relative to a signal with silence removed

  • vad – Binary VAD

  • frame_length – frame-length used to compute the VAD

  • frame_shift – frame-shift used to compute the VAD

  • snip_edges – if True, computing VAD used snip-edges option

Returns

Absolute VAD time-stamps

hyperion.utils.vad_utils.intersect_segment_timestamps_with_vad(in_timestamps, vad_timestamps)[source]
Intersects a list of segment timestamps with VAD timestamps.

It returns only the segments that contain speech, modifying the start and end times to remove silence from the segments.

Parameters
  • in_timestamps – time stamps of a list of segments referred to time 0.

  • vad_timestamps – vad timestamps

Returns

  • Boolean array indicating which input segments contain speech.

  • Array of output segments with silence removed.

  • Array of indices, one per output segment, indicating which input speech segment it corresponds to. The indices refer to the input segments after removing those that contain only silence.
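
A minimal sketch of the three outputs described above, using plain interval intersection. It assumes both inputs are arrays of (start, end) rows; the function name is hypothetical and the sketch is not hyperion's implementation.

```python
import numpy as np

def intersect_with_vad(in_timestamps, vad_timestamps):
    segs = np.asarray(in_timestamps, dtype=float).reshape(-1, 2)
    vad = np.asarray(vad_timestamps, dtype=float).reshape(-1, 2)
    has_speech = np.zeros(len(segs), dtype=bool)
    out, src = [], []
    for i, (a, b) in enumerate(segs):
        for c, d in vad:
            lo, hi = max(a, c), min(b, d)
            if hi > lo:  # non-empty overlap with this speech region
                has_speech[i] = True
                out.append((lo, hi))
                src.append(i)
    # Re-index the sources so they count only the segments that contain speech.
    new_index = np.cumsum(has_speech) - 1
    out_idx = new_index[np.array(src, dtype=int)]
    return has_speech, np.array(out, dtype=float).reshape(-1, 2), out_idx

speech, out_segs, out_idx = intersect_with_vad(
    [[0.0, 1.0], [2.0, 3.0], [5.0, 7.0]], [[2.5, 6.0]])
```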

Math Functions

Copyright 2018 Johns Hopkins University (Author: Jesus Villalba) Apache 2.0 (http://www.apache.org/licenses/LICENSE-2.0)

Some math functions.

hyperion.utils.math.logdet_pdmat(A)[source]

Log determinant of positive definite matrix.

hyperion.utils.math.invert_pdmat(A, right_inv=False, return_logdet=False, return_inv=False)[source]
Inversion of positive definite matrices.

Returns lambda function f that multiplies the inverse of A times a vector.

Parameters
  • A – Positive definite matrix

  • right_inv – If False, f(v)=A^{-1}v; if True f(v)=v’ A^{-1}

  • return_logdet – If True, it also returns the log determinant of A.

  • return_inv – If True, it also returns A^{-1}

Returns

  • Lambda function that multiplies A^{-1} times vector.

  • Cholesky transform of A (upper triangular).

  • Log determinant of A (if return_logdet is True).

  • A^{-1} (if return_inv is True).
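
The idea of returning a function instead of an explicit inverse can be sketched with a Cholesky factorization. This is an illustration in plain NumPy, not hyperion's code: the name `pd_solver` is hypothetical, the factor used here is lower triangular, and `np.linalg.solve` is a general solver that does not exploit the triangular structure (a dedicated triangular solver would be cheaper).

```python
import numpy as np

def pd_solver(A, right_inv=False):
    """Returns f with f(v) = A^{-1} v (or v A^{-1}) and log|A|."""
    L = np.linalg.cholesky(A)  # lower triangular, A = L @ L.T
    logdet = 2.0 * np.sum(np.log(np.diag(L)))

    def left(v):
        # Solve L y = v, then L.T x = y, so that A x = v.
        return np.linalg.solve(L.T, np.linalg.solve(L, v))

    if right_inv:
        # A is symmetric, so v A^{-1} = (A^{-1} v.T).T
        return (lambda v: left(v.T).T), logdet
    return left, logdet

A = np.array([[4.0, 2.0], [2.0, 3.0]])
f, logdet = pd_solver(A)
x = f(np.array([1.0, 2.0]))
```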

hyperion.utils.math.invert_trimat(A, lower=False, right_inv=False, return_logdet=False, return_inv=False)[source]
Inversion of triangular matrices.

Returns lambda function f that multiplies the inverse of A times a vector.

Parameters
  • A – Triangular matrix.

  • lower – if True A is lower triangular, else A is upper triangular.

  • right_inv – If False, f(v)=A^{-1}v; if True f(v)=v’ A^{-1}

  • return_logdet – If True, it also returns the log determinant of A.

  • return_inv – If True, it also returns A^{-1}

Returns

  • Lambda function that multiplies A^{-1} times vector.

  • Log determinant of A (if return_logdet is True).

  • A^{-1} (if return_inv is True).

hyperion.utils.math.softmax(r, axis=-1)[source]
Returns

y = exp(r)/sum(exp(r))

hyperion.utils.math.logsumexp(r, axis=-1)[source]
Returns

y = log sum(exp(r))
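
Evaluating these formulas directly overflows for large inputs. A common remedy, sketched here in plain NumPy and not necessarily what hyperion does, is to subtract the maximum before exponentiating. The names `stable_logsumexp` and `stable_softmax` are hypothetical.

```python
import numpy as np

def stable_logsumexp(r, axis=-1):
    r = np.asarray(r, dtype=float)
    r_max = np.max(r, axis=axis, keepdims=True)
    out = r_max + np.log(np.sum(np.exp(r - r_max), axis=axis, keepdims=True))
    return np.squeeze(out, axis=axis)

def stable_softmax(r, axis=-1):
    r = np.asarray(r, dtype=float)
    # softmax(r) = exp(r - logsumexp(r))
    return np.exp(r - np.expand_dims(stable_logsumexp(r, axis), axis))

r = np.array([[1000.0, 1000.0], [0.0, np.log(3.0)]])
lse = stable_logsumexp(r)
p = stable_softmax(r)
```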

hyperion.utils.math.logsigmoid(x)[source]
Returns

y = log(sigmoid(x))

hyperion.utils.math.neglogsigmoid(x)[source]
Returns

y = -log(sigmoid(x))

hyperion.utils.math.sigmoid(x)[source]
Returns

y = sigmoid(x)
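
The sigmoid family can be evaluated without overflow through the identity log(sigmoid(x)) = -log(1 + exp(-x)). The sketch below uses `np.logaddexp` for that; the function names are hypothetical and hyperion's own implementation may differ.

```python
import numpy as np

def stable_logsigmoid(x):
    # log(sigmoid(x)) = -log(exp(0) + exp(-x))
    return -np.logaddexp(0.0, -np.asarray(x, dtype=float))

def stable_neglogsigmoid(x):
    return np.logaddexp(0.0, -np.asarray(x, dtype=float))

def stable_sigmoid(x):
    return np.exp(stable_logsigmoid(x))

x = np.array([-1000.0, 0.0, 1000.0])
ls = stable_logsigmoid(x)
s = stable_sigmoid(x)
```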

hyperion.utils.math.fisher_ratio(mu1, Sigma1, mu2, Sigma2)[source]

Computes the Fisher ratio between two classes from the class means and covariances.

hyperion.utils.math.fisher_ratio_with_precs(mu1, Lambda1, mu2, Lambda2)[source]

Computes the Fisher ratio between two classes from the class means and precisions.
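
The documentation does not state the formula. One common multivariate definition is (mu1 - mu2)' (Sigma1 + Sigma2)^{-1} (mu1 - mu2); the sketch below implements that definition as an assumption and should not be read as hyperion's exact computation. The function names are hypothetical.

```python
import numpy as np

def fisher_ratio_sketch(mu1, Sigma1, mu2, Sigma2):
    """Squared mean distance normalized by the summed covariances (assumed definition)."""
    d = np.asarray(mu1, dtype=float) - np.asarray(mu2, dtype=float)
    S = np.asarray(Sigma1, dtype=float) + np.asarray(Sigma2, dtype=float)
    return float(d @ np.linalg.solve(S, d))

def fisher_ratio_with_precs_sketch(mu1, Lambda1, mu2, Lambda2):
    # A precision matrix is the inverse of the covariance.
    return fisher_ratio_sketch(mu1, np.linalg.inv(Lambda1), mu2, np.linalg.inv(Lambda2))

I2 = np.eye(2)
fr = fisher_ratio_sketch([0.0, 0.0], I2, [2.0, 0.0], I2)
fr_prec = fisher_ratio_with_precs_sketch([0.0, 0.0], I2, [2.0, 0.0], I2)
```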

hyperion.utils.math.symmat2vec(A, lower=False, diag_factor=None)[source]

Puts a symmetric matrix into a vector.

Parameters
  • A – Symmetric matrix.

  • lower – If True, it uses the lower triangular part of the matrix. If False, it uses the upper triangular part of the matrix.

  • diag_factor – It multiplies the diagonal of A by diag_factor.

Returns

Vector with the upper or lower triangular part of A.

hyperion.utils.math.vec2symmat(v, lower=False, diag_factor=None)[source]

Puts a vector back into a symmetric matrix.

Parameters
  • v – Vector with the upper or lower triangular part of A.

  • lower – If True, v contains the lower triangular part of the matrix. If False, v contains the upper triangular part of the matrix.

  • diag_factor – It multiplies the diagonal of A by diag_factor.

Returns

Symmetric matrix.
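
A round trip between a symmetric matrix and its packed triangle can be sketched with NumPy's triangular index helpers. The ordering of the packed elements and the handling of diag_factor below are assumptions made for the illustration; the function names are hypothetical.

```python
import numpy as np

def pack_symmat(A, lower=False, diag_factor=None):
    A = np.array(A, dtype=float)  # copy, so the caller's matrix is untouched
    if diag_factor is not None:
        A[np.diag_indices_from(A)] *= diag_factor
    idx = np.tril_indices(A.shape[0]) if lower else np.triu_indices(A.shape[0])
    return A[idx]

def unpack_symmat(v, lower=False, diag_factor=None):
    v = np.asarray(v, dtype=float)
    # A d x d triangle holds d (d + 1) / 2 values; solve for d.
    d = int(round((np.sqrt(8 * len(v) + 1) - 1) / 2))
    idx = np.tril_indices(d) if lower else np.triu_indices(d)
    A = np.zeros((d, d))
    A[idx] = v
    A = A + A.T - np.diag(np.diag(A))  # mirror the triangle
    if diag_factor is not None:
        A[np.diag_indices_from(A)] *= diag_factor
    return A

A = np.array([[1.0, 2.0], [2.0, 3.0]])
v = pack_symmat(A)
A_back = unpack_symmat(v)
```

With this convention, packing with diag_factor=0.5 and unpacking with diag_factor=2 recovers the original matrix.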

hyperion.utils.math.trimat2vec(A, lower=False)[source]

Puts a triangular matrix into a vector.

Parameters
  • A – Triangular matrix.

  • lower – If True, it uses the lower triangular part of the matrix. If False, it uses the upper triangular part of the matrix.

Returns

Vector with the upper or lower triangular part of A.

hyperion.utils.math.vec2trimat(v, lower=False)[source]

Puts a vector back into a triangular matrix.

Parameters
  • v – Vector with the upper or lower triangular part of A.

  • lower – If True, v contains the lower triangular part of the matrix. If False, v contains the upper triangular part of the matrix.

Returns

Triangular matrix.

hyperion.utils.math.fullcov_varfloor(S, F, F_is_chol=False, lower=False)[source]

Variance flooring for full covariance matrices.

Parameters
  • S – Covariance.

  • F – Minimum covariance or its Cholesky decomposition

  • F_is_chol – If True, F is a Cholesky decomposition

  • lower – True if cholF is lower triangular, False otherwise

Returns

Floored covariance
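
A standard way to floor a full covariance S against a minimum covariance F is to whiten S with the Cholesky factor of F, raise every eigenvalue below 1 to 1, and undo the whitening. The sketch below follows that recipe in plain NumPy; it is an illustration under that assumption rather than a copy of hyperion's code, and the function name is hypothetical.

```python
import numpy as np

def floor_full_covariance(S, F):
    L = np.linalg.cholesky(F)            # F = L @ L.T
    L_inv = np.linalg.inv(L)
    T = L_inv @ S @ L_inv.T              # S in coordinates where F is the identity
    w, V = np.linalg.eigh(T)
    w = np.maximum(w, 1.0)               # floor: no direction may fall below F
    T_floored = (V * w) @ V.T
    return L @ T_floored @ L.T

S = np.diag([0.1, 4.0])
F = np.eye(2)
S_floored = floor_full_covariance(S, F)
```

After flooring, S_floored - F has no negative eigenvalues, which is the property the floor is meant to guarantee.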

hyperion.utils.math.fullcov_varfloor_from_cholS(cholS, cholF, lower=False)[source]
Variance flooring for full covariance matrices using Cholesky decomposition as input/output.

Parameters
  • cholS – Cholesky decomposition of the covariance.

  • cholF – Cholesky decomposition of the minimum covariance.

  • lower – True if matrices are lower triangular, False otherwise

Returns

Cholesky decomposition of the floored covariance

hyperion.utils.math.int2onehot(class_ids, num_classes=None)[source]

Integer to 1-hot vector.

Parameters
  • class_ids – Numpy array of integers.

  • num_classes – Maximum number of classes.

Returns

1-hot Numpy array.
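
A minimal sketch of the conversion for a 1-D array of class indices. Inferring num_classes from the largest index and using float32 output are assumptions; the function name is hypothetical.

```python
import numpy as np

def to_one_hot(class_ids, num_classes=None):
    class_ids = np.asarray(class_ids, dtype=int)
    if num_classes is None:
        num_classes = int(class_ids.max()) + 1  # assumption: infer from the data
    onehot = np.zeros((class_ids.size, num_classes), dtype=np.float32)
    onehot[np.arange(class_ids.size), class_ids] = 1.0
    return onehot

y = to_one_hot([0, 2, 1])
```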

hyperion.utils.math.cosine_scoring(x1, x2)[source]

Miscellaneous Functions

Copyright 2018 Johns Hopkins University (Author: Jesus Villalba) Apache 2.0 (http://www.apache.org/licenses/LICENSE-2.0)

Miscellaneous functions

hyperion.utils.misc.generate_data(g)[source]
hyperion.utils.misc.str2bool(s)[source]

Convert string to bool for argparse
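
Such a converter is needed because argparse's `type=bool` treats any non-empty string, including "False", as True. The sketch below shows the usual pattern; the accepted spellings and the name `str_to_bool` are assumptions and may not match hyperion's function, and the --use-gpu option is only an example.

```python
import argparse

def str_to_bool(s):
    if isinstance(s, bool):
        return s
    value = s.strip().lower()
    if value in ("true", "t", "yes", "y", "1"):
        return True
    if value in ("false", "f", "no", "n", "0"):
        return False
    raise argparse.ArgumentTypeError(f"boolean value expected, got {s!r}")

parser = argparse.ArgumentParser()
parser.add_argument("--use-gpu", type=str_to_bool, default=False)
args = parser.parse_args(["--use-gpu", "yes"])
```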

hyperion.utils.misc.apply_gain_logx(x, AdB)[source]

Applies A dB gain to log(x)

hyperion.utils.misc.apply_gain_logx2(x, AdB)[source]

Applies A dB gain to log(x^2)

hyperion.utils.misc.apply_gain_x(x, AdB)[source]

Applies A dB gain to x

hyperion.utils.misc.apply_gain_x2(x, AdB)[source]

Applies A dB gain to x^2
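
The four variants follow from the definition of a gain in dB. Assuming x is an amplitude (so A dB corresponds to a factor 10^(A/20) on x and 10^(A/10) on x^2) and that log is the natural logarithm, they can be sketched as below. Both assumptions, and the function names, are mine rather than taken from the source.

```python
import numpy as np

def gain_x(x, AdB):
    return x * 10.0 ** (AdB / 20.0)

def gain_x2(x2, AdB):
    return x2 * 10.0 ** (AdB / 10.0)

def gain_logx(logx, AdB):
    # log(x * 10^(A/20)) = log(x) + A * ln(10) / 20
    return logx + AdB * np.log(10.0) / 20.0

def gain_logx2(logx2, AdB):
    # log(x^2 * 10^(A/10)) = log(x^2) + A * ln(10) / 10
    return logx2 + AdB * np.log(10.0) / 10.0

x = np.array([1.0, 2.0])
y = gain_x(x, 20.0)
```

Applying the gain in one domain and converting is equivalent to converting first and applying the matching variant, which is a quick consistency check for the constants.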

hyperion.utils.misc.apply_gain(x, feat_type, AdB)[source]
hyperion.utils.misc.energy_vad(P)[source]
hyperion.utils.misc.compute_snr(x, n, axis=-1)[source]
hyperion.utils.misc.filter_args(valid_args, kwargs)[source]

Filters arguments from a dictionary

Parameters
  • valid_args – list/tuple of valid arguments

  • kwargs – dictionary containing program config arguments

Returns

Dictionary containing only the valid_args keys that exist in kwargs
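
The behavior described above amounts to a dictionary comprehension. The sketch below is an equivalent illustration, not the library's code; the name `keep_valid_args` and the example keys are hypothetical.

```python
def keep_valid_args(valid_args, kwargs):
    """Keeps only the entries of kwargs whose key is listed in valid_args."""
    return {k: kwargs[k] for k in valid_args if k in kwargs}

config = {"lr": 0.1, "num_epochs": 10, "verbose": True}
filtered = keep_valid_args(("lr", "momentum"), config)
```

This pattern is useful for passing a shared configuration dictionary to a constructor that accepts only a subset of its keys.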