Utils

The hyperion.utils module contains several utility classes and functions.

Trial Management Classes

These are a series of utils to handle Trial Indices, Keys and Scores. These are based on the MATLAB implementations in the BOSARIS Toolkit.

class hyperion.utils.trial_key.TrialKey(model_set=None, seg_set=None, tar=None, non=None, model_cond=None, seg_cond=None, trial_cond=None, model_cond_name=None, seg_cond_name=None, trial_cond_name=None)[source]

Contains the trial key for speaker recognition trials. Bosaris-compatible Key.

model_set: List of model names.

seg_set: List of test segment names.

tar: Boolean matrix with target trials to True (num_models x num_segments).

non: Boolean matrix with non-target trials to True (num_models x num_segments).

model_cond: Conditions related to the model.

seg_cond: Conditions related to the test segment.

trial_cond: Conditions related to the combination of model and test segment.

model_cond_name: String list with the names of the model conditions.

seg_cond_name: String list with the names of the segment conditions.

trial_cond_name: String list with the names of the trial conditions.

__init__(model_set=None, seg_set=None, tar=None, non=None, model_cond=None, seg_cond=None, trial_cond=None, model_cond_name=None, seg_cond_name=None, trial_cond_name=None)[source]

property num_models

property num_tests

copy()[source]: Makes a copy of the object

sort()[source]: Sorts the object by model and test segment names.

save(file_path)[source]

Saves object to txt/h5 file.

Parameters: file_path – File to write the list.

save_h5(file_path)[source]

Saves object to h5 file.

Parameters: file_path – File to write the list.

save_txt(file_path)[source]

Saves object to txt file.

Parameters: file_path – File to write the list.

classmethod load(file_path)[source]

Loads object from txt/h5 file

Parameters: file_path – File to read the list.
Returns: TrialKey object.

classmethod load_h5(file_path)[source]

Loads object from h5 file

Parameters: file_path – File to read the list.
Returns: TrialKey object.

classmethod load_txt(file_path)[source]

Loads object from txt file

Parameters: file_path – File to read the list.
Returns: TrialKey object.

classmethod merge(key_list)[source]

Merges several key objects.

Parameters: key_list – List of TrialKey objects.
Returns: Merged TrialKey object.

filter(model_set, seg_set, keep=True)[source]

Removes elements from TrialKey object.

Parameters

model_set – List of models to keep or remove.
seg_set – List of test segments to keep or remove.
keep – If True, we keep the elements in model_set/seg_set, if False, we remove the elements in model_set/seg_set.

Returns

Filtered TrialKey object.
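
The keep flag can be pictured with a small stand-alone sketch. The helper filter_key below is hypothetical and works on plain lists; it also assumes the original ordering of models and segments is preserved, which the documentation above does not state.

```python
def filter_key(model_set, seg_set, mat, models, segs, keep=True):
    """Keep (keep=True) or drop (keep=False) the listed models and segments."""
    rows = [i for i, m in enumerate(model_set) if (m in models) == keep]
    cols = [j for j, s in enumerate(seg_set) if (s in segs) == keep]
    return (
        [model_set[i] for i in rows],
        [seg_set[j] for j in cols],
        [[mat[i][j] for j in cols] for i in rows],
    )


model_set = ["m1", "m2", "m3"]
seg_set = ["s1", "s2"]
tar = [[True, False], [False, True], [True, False]]

kept = filter_key(model_set, seg_set, tar, ["m1", "m3"], ["s1"], keep=True)
dropped = filter_key(model_set, seg_set, tar, ["m1", "m3"], ["s1"], keep=False)
print(kept)     # (['m1', 'm3'], ['s1'], [[True], [True]])
print(dropped)  # (['m2'], ['s2'], [[True]])
```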

split(model_idx, num_model_parts, seg_idx, num_seg_parts)[source]

Splits the TrialKey into num_model_parts x num_seg_parts and returns part: (model_idx, seg_idx).

Parameters

model_idx – Model index of the part to return from 1 to num_model_parts.
num_model_parts – Number of parts to split the model list.
seg_idx – Segment index of the part to return from 1 to num_seg_parts.
num_seg_parts – Number of parts to split the test segment list.

Returns

Subpart of the TrialKey
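
The 1-based part indices can be illustrated with a sketch that cuts the model and segment lists into contiguous, near-equal chunks. The helpers part_bounds and split_block are hypothetical, and the chunking rule is an assumption; the library may distribute elements differently.

```python
def part_bounds(n_items, idx, num_parts):
    """Half-open [start, stop) range of part idx (1-based) out of num_parts."""
    if not 1 <= idx <= num_parts:
        raise ValueError("idx must be between 1 and num_parts")
    base, extra = divmod(n_items, num_parts)
    # The first `extra` parts receive one additional element.
    start = (idx - 1) * base + min(idx - 1, extra)
    stop = start + base + (1 if idx <= extra else 0)
    return start, stop


def split_block(model_set, seg_set, mat, model_idx, num_model_parts, seg_idx, num_seg_parts):
    r0, r1 = part_bounds(len(model_set), model_idx, num_model_parts)
    c0, c1 = part_bounds(len(seg_set), seg_idx, num_seg_parts)
    return model_set[r0:r1], seg_set[c0:c1], [row[c0:c1] for row in mat[r0:r1]]


models = ["m1", "m2", "m3", "m4"]
segs = ["s1", "s2", "s3"]
mat = [[10 * i + j for j in range(3)] for i in range(4)]

# Second half of the models, first part of the segments.
part = split_block(models, segs, mat, 2, 2, 1, 2)
print(part)  # (['m3', 'm4'], ['s1', 's2'], [[20, 21], [30, 31]])
```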

to_ndx()[source]

Converts TrialKey object into TrialNdx object.

Returns: TrialNdx object.
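
In Bosaris-style tools, the index derived from a key typically marks every trial that is labeled either target or non-target. The sketch below illustrates that idea with nested lists; it is an assumption about the behavior, not code taken from hyperion.

```python
tar = [[True, False, False], [False, True, False]]
non = [[False, True, False], [True, False, False]]

# A trial is scheduled for scoring when it is labeled target or non-target.
trial_mask = [[t or n for t, n in zip(trow, nrow)] for trow, nrow in zip(tar, non)]
print(trial_mask)  # [[True, True, False], [True, True, False]]
```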

validate()[source]: Validates the attributes of the TrialKey object.

__eq__(other)[source]: Equal operator

__ne__(other)[source]: Non-equal operator

__cmp__(other)[source]: Comparison operator

test()[source]

class hyperion.utils.trial_ndx.TrialNdx(model_set=None, seg_set=None, trial_mask=None)[source]

Contains the trial index to run speaker recognition trials. Bosaris-compatible Ndx.

model_set: List of model names.

seg_set: List of test segment names.

trial_mask: Boolean matrix with the trials to execute to True (num_models x num_segments).

__init__(model_set=None, seg_set=None, trial_mask=None)[source]

property num_models

property num_tests

copy()[source]: Makes a copy of the object

sort()[source]: Sorts the object by model and test segment names.

save(file_path)[source]

Saves object to txt/h5 file.

Parameters: file_path – File to write the list.

save_h5(file_path)[source]

Saves object to h5 file.

Parameters: file_path – File to write the list.

save_txt(file_path)[source]

Saves object to txt file.

Parameters: file_path – File to write the list.

classmethod load(file_path)[source]

Loads object from txt/h5 file

Parameters: file_path – File to read the list.
Returns: TrialNdx object.

classmethod load_h5(file_path)[source]

Loads object from h5 file

Parameters: file_path – File to read the list.
Returns: TrialNdx object.

classmethod load_txt(file_path)[source]

Loads object from txt file

Parameters: file_path – File to read the list.
Returns: TrialNdx object.

classmethod merge(ndx_list)[source]

Merges several index objects.

Parameters: ndx_list – List of TrialNdx objects.
Returns: Merged TrialNdx object.

static parse_eval_set(ndx, enroll, test=None, eval_set='enroll-test')[source]

Prepares the data structures required for evaluation.

Parameters

ndx – TrialNdx object containing the trials for the main evaluation.
enroll – Utt2Info where the keys are file_ids and the second column contains the model names.
test – Utt2Info where the keys are the test segment names. Needed in the cases enroll-coh and coh-coh.
eval_set – Type of evaluation. enroll-test: main evaluation of enrollment vs test segments. enroll-coh: enrollment vs cohort segments. coh-test: cohort vs test segments. coh-coh: cohort vs cohort segments.

Returns

ndx – TrialNdx object.
enroll – SCPList.

filter(model_set, seg_set, keep=True)[source]

Removes elements from TrialNdx object.

Parameters

model_set – List of models to keep or remove.
seg_set – List of test segments to keep or remove.
keep – If True, we keep the elements in model_set/seg_set, if False, we remove the elements in model_set/seg_set.

Returns

Filtered TrialNdx object.

split(model_idx, num_model_parts, seg_idx, num_seg_parts)[source]

Splits the TrialNdx into num_model_parts x num_seg_parts and returns part: (model_idx, seg_idx).

Parameters

model_idx – Model index of the part to return from 1 to num_model_parts.
num_model_parts – Number of parts to split the model list.
seg_idx – Segment index of the part to return from 1 to num_model_parts.
num_seg_parts – Number of parts to split the test segment list.

Returns

Subpart of the TrialNdx

validate()[source]: Validates the attributes of the TrialKey object.

apply_segmentation_to_test(segment_list)[source]

Splits each test segment into multiple sub-segments. Useful to create an ndx for speaker diarization or tracking.

Parameters: segment_list – ExtSegmentList object with mapping of file_id to ext_segment_id
Returns: New TrialNdx object with segment_ids in test instead of file_id.

__eq__(other)[source]: Equal operator

__ne__(other)[source]: Non-equal operator

__cmp__(other)[source]: Comparison operator

test()[source]

class hyperion.utils.trial_scores.TrialScores(model_set=None, seg_set=None, scores=None, score_mask=None)[source]

Contains the scores for the speaker recognition trials. Bosaris-compatible Scores.

model_set: List of model names.

seg_set: List of test segment names.

scores: Matrix with the scores (num_models x num_segments).

score_mask: Boolean matrix with the trials with valid scores to True (num_models x num_segments).

__init__(model_set=None, seg_set=None, scores=None, score_mask=None)[source]

property num_models

property num_tests

copy()[source]: Makes a copy of the object

sort()[source]: Sorts the object by model and test segment names.

save(file_path)[source]

Saves object to txt/h5 file.

Parameters: file_path – File to write the list.

save_h5(file_path)[source]

Saves object to h5 file.

Parameters: file_path – File to write the list.

save_txt(file_path)[source]

Saves object to txt file.

Parameters: file_path – File to write the list.

classmethod load(file_path)[source]

Loads object from txt/h5 file

Parameters: file_path – File to read the list.
Returns: TrialScores object.

classmethod load_h5(file_path)[source]

Loads object from h5 file

Parameters: file_path – File to read the list.
Returns: TrialScores object.

classmethod load_txt(file_path)[source]

Loads object from txt file

Parameters: file_path – File to read the list.
Returns: TrialScores object.

classmethod merge(scr_list)[source]

Merges several score objects.

Parameters: scr_list – List of TrialScores objects.
Returns: Merged TrialScores object.

filter(model_set, seg_set, keep=True, raise_missing=True)[source]

Removes elements from TrialScores object.

Parameters

model_set – List of models to keep or remove.
seg_set – List of test segments to keep or remove.
keep – If True, we keep the elements in model_set/seg_set, if False, we remove the elements in model_set/seg_set.
raise_missing – Raises exception if there are elements in model_set or seg_set that are not in the object.

Returns

Filtered TrialScores object.

split(model_idx, num_model_parts, seg_idx, num_seg_parts)[source]

Splits the TrialScores into num_model_parts x num_seg_parts and returns part: (model_idx, seg_idx).

Parameters

model_idx – Model index of the part to return from 1 to num_model_parts.
num_model_parts – Number of parts to split the model list.
seg_idx – Segment index of the part to return from 1 to num_seg_parts.
num_seg_parts – Number of parts to split the test segment list.

Returns

Subpart of the TrialScores

validate()[source]: Validates the attributes of the TrialScores object.

align_with_ndx(ndx, raise_missing=True)[source]

Aligns scores, model_set and seg_set with TrialNdx or TrialKey.

Parameters

ndx – TrialNdx or TrialKey object.
raise_missing – Raises exception if there are trials in ndx that are not in the score object.

Returns

Aligned TrialScores object.
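
Alignment can be sketched as reordering the rows and columns of the score matrix so that they follow the model and segment order of the index. The function align_scores below is a hypothetical, simplified stand-in: it ignores the trial mask of the index and fills missing entries with 0.0, both of which are assumptions.

```python
def align_scores(model_set, seg_set, scores, ndx_models, ndx_segs, raise_missing=True):
    """Reorder scores so rows/columns follow ndx_models/ndx_segs."""
    row_of = {m: i for i, m in enumerate(model_set)}
    col_of = {s: j for j, s in enumerate(seg_set)}
    missing = [m for m in ndx_models if m not in row_of]
    missing += [s for s in ndx_segs if s not in col_of]
    if missing and raise_missing:
        raise KeyError(f"not found in scores: {missing}")
    aligned, mask = [], []
    for m in ndx_models:
        found = [m in row_of and s in col_of for s in ndx_segs]
        aligned.append([scores[row_of[m]][col_of[s]] if ok else 0.0
                        for s, ok in zip(ndx_segs, found)])
        mask.append(found)  # True where a score was actually available
    return aligned, mask


scores = [[0.2, 0.4], [1.0, -1.0]]  # rows: m2, m1; columns: s1, s2
aligned, mask = align_scores(["m2", "m1"], ["s1", "s2"], scores, ["m1", "m2"], ["s2", "s1"])
print(aligned)  # [[-1.0, 1.0], [0.4, 0.2]]
```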

get_tar_non(key)[source]

Returns target and non-target scores.

Parameters: key – TrialKey object.
Returns: Numpy array with target scores. Numpy array with non-target scores.
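
Conceptually, the key's tar and non masks select which entries of the score matrix go into each output. The sketch below uses nested lists rather than NumPy arrays and assumes the scores are already aligned with the key; the function name mirrors the method only for readability.

```python
def get_tar_non(scores, tar, non):
    """Collect the scores selected by the target and non-target masks."""
    tar_scores = [s for srow, trow in zip(scores, tar) for s, t in zip(srow, trow) if t]
    non_scores = [s for srow, nrow in zip(scores, non) for s, n in zip(srow, nrow) if n]
    return tar_scores, non_scores


scores = [[2.5, -1.0, 0.3], [-0.7, 1.8, -2.2]]
tar = [[True, False, False], [False, True, False]]
# The trial (model 0, segment 2) is in neither mask, so its score is ignored.
non = [[False, True, False], [True, False, True]]

tar_scores, non_scores = get_tar_non(scores, tar, non)
print(tar_scores)  # [2.5, 1.8]
print(non_scores)  # [-1.0, -0.7, -2.2]
```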

set_missing_to_value(ndx, val)[source]

Aligns the scores with a TrialNdx and sets the trials with missing scores to the same value.

Parameters

ndx – TrialNdx or TrialKey object.
val – Value for the missing scores.

Returns

Aligned TrialScores object.

transform(f)[source]

Applies a function to the valid scores of the object.

Parameters: f – function handle.
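
The idea of applying f only to valid scores can be sketched as follows. The helper transform_valid is hypothetical and returns a new matrix; whether the library method modifies the object in place is not stated above.

```python
def transform_valid(scores, score_mask, f):
    """Apply f to entries whose mask is True; leave the others untouched."""
    return [[f(s) if ok else s for s, ok in zip(srow, mrow)]
            for srow, mrow in zip(scores, score_mask)]


scores = [[1.0, 0.0], [-2.0, 3.0]]
score_mask = [[True, False], [True, True]]

# Example function: an affine rescaling of the scores.
calibrated = transform_valid(scores, score_mask, lambda s: 2.0 * s + 1.0)
print(calibrated)  # [[3.0, 0.0], [-3.0, 7.0]]
```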

__eq__(other)[source]: Equal operator

__ne__(other)[source]: Non-equal operator

__cmp__(other)[source]: Comparison operator

test()[source]

class hyperion.utils.trial_stats.TrialStats(df_stats)[source]

Contains ancillary statistics from the trials, such as quality measures like SNR.

This class was created to store statistics about adversarial attacks like SNR (signal-to-perturbation ratio), Linf, L2 norms of the perturbation etc.

df_stats: pandas dataframe containing the stats. The dataframe needs to include the modelid and segmentid columns

__init__(df_stats)[source]

classmethod load(file_path)[source]

Loads stats file

Parameters: file_path – stats file in csv format
Returns: TrialStats object.

save_h5(file_path)[source]

Saves object to file.

Parameters: file_path – CSV format file

get_stats_mat(stat_name, ndx, raise_missing=True)[source]

Returns a matrix of trial statistics sorted to match a given Ndx or Key object.

Parameters

stat_name – name of the statistic (e.g. snr, linf), as given in the column name of the dataframe.
ndx – Ndx or Key object

Returns

Stat matrix (n_models x n_tests)
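
A minimal sketch of the table-to-matrix step is shown below. It replaces the pandas dataframe with a list of dictionaries that carry the modelid and segmentid columns, and the helper stats_matrix is hypothetical; missing entries become None when raise_missing is False, which is an assumption.

```python
def stats_matrix(rows, stat_name, model_set, seg_set, raise_missing=True):
    """Arrange one statistic as a matrix ordered like model_set x seg_set."""
    table = {(r["modelid"], r["segmentid"]): r[stat_name] for r in rows}
    mat = []
    for m in model_set:
        out = []
        for s in seg_set:
            if (m, s) in table:
                out.append(table[(m, s)])
            elif raise_missing:
                raise KeyError(f"no {stat_name} for trial ({m}, {s})")
            else:
                out.append(None)
        mat.append(out)
    return mat


rows = [
    {"modelid": "m1", "segmentid": "s1", "snr": 30.0},
    {"modelid": "m1", "segmentid": "s2", "snr": 25.0},
    {"modelid": "m2", "segmentid": "s1", "snr": 40.0},
]
snr = stats_matrix(rows, "snr", ["m2", "m1"], ["s1"])
print(snr)  # [[40.0], [30.0]]
```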

reset_stats_mats()[source]

class hyperion.utils.sparse_trial_key.SparseTrialKey(model_set=None, seg_set=None, tar=None, non=None, model_cond=None, seg_cond=None, trial_cond=None, model_cond_name=None, seg_cond_name=None, trial_cond_name=None)[source]

Contains the trial key for speaker recognition trials. Bosaris-compatible Key.

model_set: List of model names.

seg_set: List of test segment names.

tar: Boolean matrix with target trials to True (num_models x num_segments).

non: Boolean matrix with non-target trials to True (num_models x num_segments).

model_cond: Conditions related to the model.

seg_cond: Conditions related to the test segment.

trial_cond: Conditions related to the combination of model and test segment.

model_cond_name: String list with the names of the model conditions.

seg_cond_name: String list with the names of the segment conditions.

trial_cond_name: String list with the names of the trial conditions.

__init__(model_set=None, seg_set=None, tar=None, non=None, model_cond=None, seg_cond=None, trial_cond=None, model_cond_name=None, seg_cond_name=None, trial_cond_name=None)[source]

save_h5(file_path)[source]

Saves object to h5 file.

Parameters: file_path – File to write the list.

save_txt(file_path)[source]

Saves object to txt file.

Parameters: file_path – File to write the list.

classmethod load_h5(file_path)[source]

Loads object from h5 file

Parameters: file_path – File to read the list.
Returns: TrialKey object.

classmethod load_txt(file_path)[source]

Loads object from txt file

Parameters: file_path – File to read the list.
Returns: TrialKey object.

classmethod merge(key_list)[source]

Merges several key objects.

Parameters: key_list – List of TrialKey objects.
Returns: Merged TrialKey object.

to_ndx()[source]

Converts TrialKey object into TrialNdx object.

Returns: TrialNdx object.

validate()[source]: Validates the attributes of the TrialKey object.

classmethod from_trial_key(key)[source]

__eq__(other)[source]: Equal operator

__cmp__(other): Comparison operator

__ne__(other): Non-equal operator

copy(): Makes a copy of the object

filter(model_set, seg_set, keep=True)

Removes elements from TrialKey object.

Parameters

model_set – List of models to keep or remove.
seg_set – List of test segments to keep or remove.
keep – If True, we keep the elements in model_set/seg_set, if False, we remove the elements in model_set/seg_set.

Returns

Filtered TrialKey object.

classmethod load(file_path)

Loads object from txt/h5 file

Parameters: file_path – File to read the list.
Returns: TrialKey object.

property num_models

property num_tests

save(file_path)

Saves object to txt/h5 file.

Parameters: file_path – File to write the list.

sort(): Sorts the object by model and test segment names.

split(model_idx, num_model_parts, seg_idx, num_seg_parts)

Splits the TrialKey into num_model_parts x num_seg_parts and returns part: (model_idx, seg_idx).

Parameters

model_idx – Model index of the part to return from 1 to num_model_parts.
num_model_parts – Number of parts to split the model list.
seg_idx – Segment index of the part to return from 1 to num_seg_parts.
num_seg_parts – Number of parts to split the test segment list.

Returns

Subpart of the TrialKey

test()

class hyperion.utils.sparse_trial_scores.SparseTrialScores(model_set=None, seg_set=None, scores=None, score_mask=None)[source]

Contains the scores for the speaker recognition trials. Bosaris-compatible Scores.

model_set: List of model names.

seg_set: List of test segment names.

scores: Matrix with the scores (num_models x num_segments).

score_mask: Boolean matrix with the trials with valid scores to True (num_models x num_segments).

__init__(model_set=None, seg_set=None, scores=None, score_mask=None)[source]

save_h5(file_path)[source]

Saves object to h5 file.

Parameters: file_path – File to write the list.

save_txt(file_path)[source]

Saves object to txt file.

Parameters: file_path – File to write the list.

classmethod load_h5(file_path)[source]

Loads object from h5 file

Parameters: file_path – File to read the list.
Returns: TrialScores object.

classmethod load_txt(file_path)[source]

Loads object from txt file

Parameters: file_path – File to read the list.
Returns: SparseTrialScores object.

classmethod merge(scr_list)[source]

Merges several score objects.

Parameters: scr_list – List of TrialScores objects.
Returns: Merged TrialScores object.

split(model_idx, num_model_parts, seg_idx, num_seg_parts)[source]

Splits the TrialScores into num_model_parts x num_seg_parts and returns part: (model_idx, seg_idx).

Parameters

model_idx – Model index of the part to return from 1 to num_model_parts.
num_model_parts – Number of parts to split the model list.
seg_idx – Segment index of the part to return from 1 to num_seg_parts.
num_seg_parts – Number of parts to split the test segment list.

Returns

Subpart of the TrialScores

validate()[source]: Validates the attributes of the TrialKey object.

filter(model_set, seg_set, keep=True, raise_missing=True)[source]

Removes elements from TrialScores object.

Parameters

model_set – List of models to keep or remove.
seg_set – List of test segments to keep or remove.
keep – If True, we keep the elements in model_set/seg_set, if False, we remove the elements in model_set/seg_set.
raise_missing – Raises exception if there are elements in model_set or seg_set that are not in the object.

Returns

Filtered TrialScores object.

align_with_ndx(ndx, raise_missing=True)[source]

Aligns scores, model_set and seg_set with TrialNdx or TrialKey.

Parameters

ndx – TrialNdx or TrialKey object.
raise_missing – Raises exception if there are trials in ndx that are not in the score object.

Returns

Aligned TrialScores object.

get_tar_non(key)[source]

Returns target and non-target scores.

Parameters: key – TrialKey object.
Returns: Numpy array with target scores. Numpy array with non-target scores.

classmethod from_trial_scores(scr)[source]

set_missing_to_value(ndx, val)[source]

Aligns the scores with a TrialNdx and sets the trials with missing scores to the same value.

Parameters

ndx – TrialNdx or TrialKey object.
val – Value for the missing scores.

Returns

Aligned SparseTrialScores object.

__eq__(other)[source]: Equal operator

__cmp__(other): Comparison operator

__ne__(other): Non-equal operator

copy(): Makes a copy of the object

classmethod load(file_path)

Loads object from txt/h5 file

Parameters: file_path – File to read the list.
Returns: TrialScores object.

property num_models

property num_tests

save(file_path)

Saves object to txt/h5 file.

Parameters: file_path – File to write the list.

sort(): Sorts the object by model and test segment names.

test()

transform(f)

Applies a function to the valid scores of the object.

Parameters: f – function handle.

Kaldi Data Directory Manipulation Classes

These are classes to manipulate Kaldi data directory files like wav.scp, utt2spk, segments, rttm.

class hyperion.utils.scp_list.SCPList(key, file_path, offset=None, range_spec=None)[source]

Class to manipulate script lists.

key: segment key name.

file_path: path to the file on hard drive, wav, ark or hdf5 file.

offset: Byte in Ark file where the data is located.

range_spec: range of frames (rows) to read.

key_to_index: Dictionary that returns the position of a key in the list.

__init__(key, file_path, offset=None, range_spec=None)[source]

validate()[source]: Validates the attributes of the SCPList object.

copy()[source]: Makes a copy of the object.

__len__()[source]: Returns the number of elements in the list.

len()[source]: Returns the number of elements in the list.

_create_dict()[source]: Creates dictionary that returns the position of a segment in the list.

get_index(key)[source]: Returns the position of key in the list.

__contains__(key)[source]: Returns True if the list contains the key

__getitem__(key)[source]

Allows access to the data in the list by key or index, as in a dictionary. If the input is a string key:

scp = SCPList(keys, file_paths, offsets, ranges)
file_path, offset, range = scp['data1']

If the input is an index:

key, file_path, offset, range = scp[0]

Parameters

key – String key or integer index.

Returns

If key is a string: file_path, offset and range_spec given the key.

If key is the index in the key list: key, file_path, offset and range_spec given the index.
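
The dual key/index access described above can be mimicked with a small stand-alone class. TinyScp is a hypothetical toy, not the real SCPList, and the file names and offsets are invented for the example.

```python
class TinyScp:
    """Toy script list supporting lookup by string key or integer index."""

    def __init__(self, key, file_path, offset=None, range_spec=None):
        n = len(key)
        self.key = list(key)
        self.file_path = list(file_path)
        self.offset = list(offset) if offset is not None else [None] * n
        self.range_spec = list(range_spec) if range_spec is not None else [None] * n
        # Maps each key to its position in the list.
        self.key_to_index = {k: i for i, k in enumerate(self.key)}

    def __len__(self):
        return len(self.key)

    def __contains__(self, key):
        return key in self.key_to_index

    def __getitem__(self, key):
        if isinstance(key, str):
            i = self.key_to_index[key]
            return self.file_path[i], self.offset[i], self.range_spec[i]
        # Integer index: the key itself is returned as well.
        return self.key[key], self.file_path[key], self.offset[key], self.range_spec[key]


scp = TinyScp(["data1", "data2"], ["a.ark", "b.ark"], [10, 250], [None, "0:100"])
print(scp["data1"])  # ('a.ark', 10, None)
print(scp[1])        # ('data2', 'b.ark', 250, '0:100')
```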

add_prefix_to_filepath(prefix)[source]: Adds a prefix to the file path

sort()[source]: Sorts the list by key

save(file_path, sep=' ', offset_sep=':')[source]

Saves script list to text file.

Parameters

file_path – File to write the list.
sep – Separator between the key and file_path in the text file.
offset_sep – Separator between file_path and offset.

static parse_script(script, offset_sep)[source]

Parses the parts of the second field of the scp text file.

Parameters

script – Second column of scp file.
offset_sep – Separator between file_path and offset.

Returns

file_path, offset and range_spec.
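
For orientation, the second field of a Kaldi-style scp line often looks like feats.ark:1234 or feats.ark:1234[10:20]. The sketch below parses that shape; it is an assumption about the input format and returns the range as a raw string, which may differ from how the library represents range_spec.

```python
def parse_script(script, offset_sep=":"):
    """Split 'path', 'path:offset' or 'path:offset[range]' into its parts."""
    range_spec = None
    if script.endswith("]") and "[" in script:
        # Strip the trailing ']' and split off the bracketed range.
        script, range_spec = script[:-1].rsplit("[", 1)
    offset = None
    path, sep, tail = script.rpartition(offset_sep)
    if sep and tail.isdigit():
        file_path, offset = path, int(tail)
    else:
        # No numeric offset present: the whole field is the path.
        file_path = script
    return file_path, offset, range_spec


print(parse_script("feats.ark:1234[10:20]"))  # ('feats.ark', 1234, '10:20')
print(parse_script("audio/utt1.wav"))         # ('audio/utt1.wav', None, None)
```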

classmethod load(file_path, sep=' ', offset_sep=':', is_wav=False)[source]

Loads script list from text file.

Parameters

file_path – File to read the list.
sep – Separator between the key and file_path in the text file.
offset_sep – Separator between file_path and offset.

Returns

SCPList object.

split(idx, num_parts, group_by_key=True)[source]

Splits SCPList into num_parts and returns part idx.

Parameters

idx – Part to return from 1 to num_parts.
num_parts – Number of parts to split the list.
group_by_key – If True, all the lines with the same key go to the same part.

Returns

Sub SCPList
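
The effect of group_by_key can be seen in a sketch that returns the positions belonging to one part. The helper split_positions is hypothetical, and the contiguous, near-equal chunking it uses is an assumption rather than the library's documented rule.

```python
def split_positions(keys, idx, num_parts, group_by_key=True):
    """Positions of the elements that fall in part idx (1-based)."""
    # With grouping, the unit being split is the unique key, not the line.
    units = list(dict.fromkeys(keys)) if group_by_key else list(range(len(keys)))
    base, extra = divmod(len(units), num_parts)
    start = (idx - 1) * base + min(idx - 1, extra)
    stop = start + base + (1 if idx <= extra else 0)
    if group_by_key:
        chosen = set(units[start:stop])
        return [i for i, k in enumerate(keys) if k in chosen]
    return units[start:stop]


keys = ["a", "a", "a", "a", "b", "c"]
print(split_positions(keys, 1, 2))                      # [0, 1, 2, 3, 4]
print(split_positions(keys, 2, 2))                      # [5]
print(split_positions(keys, 1, 2, group_by_key=False))  # [0, 1, 2]
```

Without grouping, the lines that share key "a" end up in different parts.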

classmethod merge(scp_lists)[source]

Merges several SCPList.

Parameters: scp_lists – List of SCPLists
Returns: SCPList object concatenating the scp_lists.

filter(filter_key, keep=True)[source]

Removes elements from SCPList object by key

Parameters

filter_key – List with the keys of the elements to keep or remove.
keep – If True, we keep the elements in filter_key; if False, we remove the elements in filter_key.

Returns

SCPList object.

filter_paths(filter_key, keep=True)[source]

Removes elements of SCPList by file_path

Parameters

filter_key – List with the file_path of the elements to keep or remove.
keep – If True, we keep the elements in filter_key; if False, we remove the elements in filter_key;

Returns

SCPList object.

filter_index(index, keep=True)[source]

Removes elements of SCPList by index

Parameters

index – List with the indices of the elements to keep or remove.
keep – If True, we keep the elements in index; if False, we remove the elements in index.

Returns

SCPList object.

shuffle(seed=1024, rng=None)[source]

Shuffles the elements of the list.

Parameters

seed – Seed for random number generator.
rng – numpy random number generator object.

Returns

Index used to shuffle the list.
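
Returning the permutation index makes it possible to reorder parallel data in the same way. The sketch below shows the pattern with the standard library's random.Random standing in for the NumPy generator; shuffle_list is a hypothetical helper, not the library method.

```python
import random


def shuffle_list(items, seed=1024, rng=None):
    """Return the shuffled items and the index that was used to shuffle them."""
    if rng is None:
        rng = random.Random(seed)  # seeded generator for reproducibility
    index = list(range(len(items)))
    rng.shuffle(index)
    return [items[i] for i in index], index


keys = ["utt1", "utt2", "utt3", "utt4"]
paths = ["a.wav", "b.wav", "c.wav", "d.wav"]

shuffled_keys, index = shuffle_list(keys)
# Reuse the same index so that paths stay paired with their keys.
shuffled_paths = [paths[i] for i in index]
```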

__eq__(other)[source]: Equal operator

__ne__(other)[source]: Non-equal operator

__cmp__(other)[source]: Comparison operator

class hyperion.utils.utt2info.Utt2Info(utt_info)[source]

Class to manipulate utt2spk, utt2lang, etc. files.

key: segment key name.

info

key_to_index: Dictionary that returns the position of a key in the list.

__init__(utt_info)[source]

validate()[source]: Validates the attributes of the Utt2Info object.

classmethod create(key, info)[source]

property num_info_fields

property key

property info

copy()[source]: Makes a copy of the object.

__len__()[source]: Returns the number of elements in the list.

len()[source]: Returns the number of elements in the list.

_create_dict()[source]: Creates dictionary that returns the position of a segment in the list.

get_index(key)[source]: Returns the position of key in the list.

__contains__(key)[source]: Returns True if the list contains the key

__getitem__(key)[source]

Allows access to the data in the list by key or index, as in a dictionary. If the input is a string key:

utt2spk = Utt2Info(info)
spk_id = utt2spk['data1']

If the input is an index:

key, spk_id = utt2spk[0]

Parameters

key – String key or integer index.

Returns

If key is a string: info corresponding to key.

If key is the index in the key list: key, info given the index.

sort(field=0)[source]: Sorts the list by key

save(file_path, sep=' ')[source]

Saves uttinfo to text file.

Parameters

file_path – File to write the list.
sep – Separator between the key and file_path in the text file.

classmethod load(file_path, sep=' ', dtype={0: <class 'str'>, 1: <class 'str'>})[source]

Loads utt2info list from text file.

Parameters

file_path – File to read the list.
sep – Separator between the key and file_path in the text file.
dtype – Dictionary with the dtypes of each column.

Returns

Utt2Info object

split(idx, num_parts, group_by_field=0)[source]

Splits Utt2Info into num_parts and returns part idx.

Parameters

idx – Part to return from 1 to num_parts.
num_parts – Number of parts to split the list.
group_by_field – All the lines with the same value in column group_by_field go to the same part

Returns

Sub Utt2Info object

classmethod merge(info_lists)[source]

Merges several Utt2Info tables.

Parameters: info_lists – List of Utt2Info
Returns: Utt2Info object concatenating the info_lists.

filter(filter_key, keep=True)[source]

Removes elements from Utt2Info object by key

Parameters

filter_key – List with the keys of the elements to keep or remove.
keep – If True, we keep the elements in filter_key; if False, we remove the elements in filter_key;

Returns

Utt2Info object.

filter_info(filter_key, field=1, keep=True)[source]

Removes elements of Utt2Info by info value

Parameters

filter_key – List with the info values of the elements to keep or remove.
field – Field number corresponding to the info to filter
keep – If True, we keep the elements in filter_key; if False, we remove the elements in filter_key;

Returns

Utt2Info object.

filter_index(index, keep=True)[source]

Removes elements of Utt2Info by index

Parameters

index – List with the indices of the elements to keep or remove.
keep – If True, we keep the elements in index; if False, we remove the elements in index.

Returns

Utt2Info object.

shuffle(seed=1024, rng=None)[source]

Shuffles the elements of the list.

Parameters

seed – Seed for random number generator.
rng – numpy random number generator object.

Returns

Index used to shuffle the list.

__eq__(other)[source]: Equal operator

__ne__(other)[source]: Non-equal operator

__cmp__(other)[source]: Comparison operator

class hyperion.utils.segment_list.SegmentList(segments, index_by_file=True)[source]

Class to manipulate segment files

segments: Pandas dataframe.

_index_by_file: if True the df is index by file name, if False by segment id.

iter_idx: index of the current element for the iterator.

uniq_file_id: unique file names.

__init__(segments, index_by_file=True)[source]

classmethod create(segment_id, file_id, tbeg, tend, index_by_file=True)[source]

validate()[source]: Validates the attributes of the SegmentList object.

property index_by_file

property file_id

property segment_id

property tbeg

property tend

copy()[source]: Makes a copy of the object.

segments_ids_from_file(file_id)[source]: Returns segments_ids corresponding to a given file_id

__len__()[source]: Returns the number of segments in the list.

__contains__(key)[source]: Returns True if the segment list contains the key

getitem_by_key(key)[source]

Accesses the segments by file_id or segment_id, as in a dictionary. If the input is a string key:

segments = SegmentList(…)
segment, tbeg, tend = segments.getitem_by_key('file')

Parameters: key – Segment or file key
Returns: If index_by_file is True, it returns the segments of a given file_id in SegmentList format; otherwise it returns a DataFrame.

getitem_by_index(index)[source]

Accesses the segments by index, e.g.:

segments = SegmentList(…)
segment, tbeg, tend = segments.getitem_by_index(0)

Parameters: index – Index of the element to access
Returns: If index_by_file is True, it returns the segments of a given file_id in SegmentList format; otherwise it returns a DataFrame.

__getitem__(key)[source]

Accesses the segments by file_id or segment_id, as in a dictionary. If the input is a string key:

segments = SegmentList(…)
segment, tbeg, tend = segments['file']

Parameters: key – Segment or file key
Returns: If index_by_file is True, it returns the segments of a given file_id in SegmentList format; otherwise it returns a DataFrame.

save(file_path, sep=' ')[source]

Saves segments to text file.

Parameters

file_path – File to write the list.
sep – Separator between the fields

classmethod load(file_path, sep=' ', index_by_file=True)[source]

Loads segment list from text file.

Parameters

file_path – File to read the list.
sep – Separator between the fields in the text file.

Returns

SegmentList object.

filter(filter_key, keep=True)[source]

split(idx, num_parts)[source]

classmethod merge(segment_lists, index_by_file=True)[source]

to_bin_vad(key, frame_shift=10, num_frames=None)[source]

Converts segments to binary VAD

Parameters

key – Segment or file key
frame_shift – frame_shift in milliseconds
num_frames – number of frames of file corresponding to key, if None it takes the maximum tend for file

Returns

If index_by_file is True, it returns the VAD joining all segments of one file; otherwise it returns the VAD for one given segment.
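
The conversion from time segments to a frame-level binary VAD can be sketched as follows. The helper segments_to_bin_vad is hypothetical; it assumes segment times in seconds and rounds them to the nearest frame boundary, which may not match the library's exact framing rule.

```python
def segments_to_bin_vad(segments, frame_shift=10, num_frames=None):
    """Mark the frames covered by (tbeg, tend) segments given in seconds."""
    # frame_shift is in milliseconds, so seconds * 1000 / frame_shift = frames.
    bounds = [(int(round(tbeg * 1000 / frame_shift)),
               int(round(tend * 1000 / frame_shift)))
              for tbeg, tend in segments]
    if num_frames is None:
        # Fall back to the largest end time among the segments.
        num_frames = max((end for _, end in bounds), default=0)
    vad = [False] * num_frames
    for beg, end in bounds:
        for k in range(beg, min(end, num_frames)):
            vad[k] = True
    return vad


vad = segments_to_bin_vad([(0.0, 0.03), (0.05, 0.07)])
print(vad)  # [True, True, True, False, False, True, True]
```

Passing num_frames pads or truncates the output, for example to match the number of feature frames of the file.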

__eq__(other)[source]: Equal operator

__ne__(other)[source]: Non-equal operator

__cmp__(other)[source]: Comparison operator

class hyperion.utils.rttm.RTTM(segments, index_by_file=True)[source]

Class to manipulate rttm files

df: Pandas dataframe.

_index_by_file: if True the df is indexed by file name, if False by segment id.

iter_idx: index of the current element for the iterator.

unique_file_key: unique file names.

__init__(segments, index_by_file=True)[source]

classmethod create(segment_type, file_id, chnl=None, tbeg=None, tdur=None, ortho=None, stype=None, name=None, conf=None, slat=None, index_by_file=True)[source]

classmethod create_spkdiar(file_id, tbeg, tdur, spk_id, conf=None, chnl=None, index_by_file=True, prepend_file_id=False)[source]

classmethod create_spkdiar_single_file(file_id, tbeg, tdur, spk_id, conf=None, chnl=None, index_by_file=True, prepend_file_id=False)[source]

classmethod create_spkdiar_from_segments(segments, spk_id, conf=None, chnl=None, index_by_file=True, prepend_file_id=False)[source]

classmethod create_spkdiar_from_ext_segments(ext_segments, chnl=None, index_by_file=True, prepend_file_id=False)[source]

validate()[source]: Validates the attributes of the RTTM object.

property index_by_file

property file_id

property tbeg

property tdur

property name

copy()[source]: Makes a copy of the object.

property num_files

property total_num_spks

property num_spks_per_file

property avg_num_spks_per_file

__len__()[source]: Returns the number of segments in the list.

__contains__(key)[source]: Returns True if the segment list contains the key

__getitem__(key)[source]

Allows access to the segments by file_id or segment, as in a dictionary. If the input is a string key:

segments = SegmentList(…)
segment, tbeg, tend = segments['file']

Parameters: key – Segment or file key
Returns: If index_by_file is True, it returns the segments of a given file_id in SegmentList format; otherwise it returns a DataFrame.

save(file_path, sep=' ')[source]

Saves segments to text file.

Parameters

file_path – File to write the list.
sep – Separator between the fields

classmethod load(file_path, sep=' ', index_by_file=True)[source]

Loads RTTM from text file.

Parameters

file_path – File to read the list.
sep – Separator between the fields in the text file.

Returns

RTTM object.

filter(filter_key, keep=True)[source]

split(idx, num_parts)[source]

classmethod merge(rttm_list, index_by_file=True)[source]

merge_adjacent_segments(t_margin=0)[source]

__eq__(other)[source]: Equal operator

__ne__(other)[source]: Non-equal operator

__cmp__(other)[source]: Comparison operator

get_segment_names_from_timestamps(file_id, timestamps, segment_type='SPEAKER', min_seg_dur=0.1)[source]

get_files_with_names_diff_to_file(file_id, segment_type='SPEAKER')[source]

prepend_file_id_to_name(segment_type='SPEAKER')[source]

get_segments_from_file(file_id)[source]

get_uniq_names_for_file(file_id=None)[source]

get_bin_frame_mask_for_spk(file_id, name, frame_length=0.025, frame_shift=0.01, snip_edges=False, signal_length=None, max_frames=None)[source]

Returns binary mask of a given speaker to select feature frames

Parameters

file_id – file identifier
name – speaker id
frame_length – frame-length used to compute the VAD
frame_shift – frame-shift used to compute the VAD
snip_edges – if True, computing VAD used snip-edges option
signal_length – total duration of the signal, if None it takes it from the last timestamp
max_frames – expected number of frames, if None it computes automatically

Returns

Binary VAD np.array

get_bin_sample_mask_for_spk(file_id, name, fs, signal_length=None, max_samples=None)[source]

Returns binary mask of a given speaker to select waveform samples

Parameters

file_id – file identifier
name – speaker id
fs – sampling frequency
signal_length – total duration of the signal, if None it takes it from the last timestamp
max_samples – expected number of samples, if None it computes automatically

Returns

Binary mask np.array

compute_stats(nbins_dur=None)[source]

to_segment_list()[source]

sort()[source]

tbeg_is_sorted()[source]

Kaldi Matrix Read/Write Classes

These are classes to read/write text and binary matrices from ARK files. They support the compression methods in Kaldi ARK files.

class hyperion.utils.kaldi_matrix.KaldiMatrix(data)[source]

Class to read/write uncompressed kaldi matrices/vectors.

When compressed matrix is found in file, it calls KaldiCompressedMatrix class automatically to uncompress.

data: numpy array with the matrix/vector values.

__init__(data)[source]

to_ndarray()[source]

Returns: numpy array containing the matrix/vector

property num_rows

property num_cols

classmethod read(f, binary, row_offset=0, num_rows=0, sequential_mode=True)[source]

Reads kaldi matrix/vector from file.

Parameters

f – Python file object
binary – True if we read from binary file and False if we read from text file.
row_offset – Reads matrix starting from a given row instead of row 0.
num_rows – Number of rows to read; if 0, it reads all the rows.
sequential_mode – True if we are reading the ark file sequentially and False if we are using random access.

Returns

KaldiMatrix object.

write(f, binary)[source]

Writes matrix/vector to ark file.

Parameters

f – Python file object.
binary – True if we write in binary file and False if we write to text file.

static read_shape(f, binary, sequential_mode=True)[source]

Reads the shape of the current matrix/vector in the ark file.

Parameters

f – Python file object
binary – True if we read from binary file and False if we read from text file.
sequential_mode – True if we are reading the ark file sequentially and False if we are using random access. In sequential_mode=True it moves the file pointer to the next matrix.

Returns

Tuple object with shape.

class hyperion.utils.kaldi_matrix.KaldiCompressedMatrix(data=None)[source]

Class to read/write compressed kaldi matrices.

When compressed matrix is found in file, it calls KaldiCompressedMatrix class automatically to uncompress.

data: numpy byte array with the compressed coded matrix.

data_format: {1, 2, 3, 4}

min_value: Minimum value in the matrix.

data_range: max_value - min_value

num_rows: Number of rows in the matrix

num_columns: Number of columns in the matrix

__init__(data=None)[source]

get_data_attrs()[source]

Returns: Coded matrix values in 2D format, and a dictionary object with the data attributes: data_format, min_value, data_range, percentiles.

classmethod build_from_data_attrs(data, attrs)[source]

Builds object from coded values and attributes

Parameters

data – Coded matrix values in 2D format.
attrs – Dictionary object with data attributes: data_format, min_value, data_range, percentiles.

Returns

KaldiCompressedMatrix object.

_unpack_header()[source]: Unpacks attributes from header

_pack_header()[source]: Creates header from the object attributes

scale(alpha)[source]: Multiplies matrix by alpha

_compute_global_header(mat, method)[source]

Computes the header

Parameters

mat – numpy array with the uncompressed matrix.
method – Compression method.

Returns

Byte array with header.

static _get_read_info(header, row_offset=0, num_rows=0)[source]: Gets info needed to read the matrix from file

static _data_size(header)[source]

Returns: Number of bytes of the coded matrix.

classmethod compress(mat, method='auto')[source]

Creates compressed matrix from uncompressed numpy matrix.

Parameters

mat – numpy array with the uncompressed matrix.
method – Compression method.

Returns

KaldiCompressedMatrix object.

_compute_column_header(v)[source]

Creates the column headers for the speech-feat compression.

Parameters: v – numpy array with the column to compress.
Returns: Byte array with the header of the column containing the 0, 25, 75 and 100 percentile values.

_compress_column(v)[source]

Compresses a column for the speech-feat compression.

Parameters: v – numpy array with the column to compress.
Returns: Byte array with the header of the column containing the 0, 25, 75 and 100 percentile values, and byte array with the coded column.

_uncompress_column(col_header, col_data)[source]

Uncompresses a column for the speech-feat compression.

Parameters

col_header – Byte array with the header of the column containing the 0, 25, 75 and 100 percentile values.
col_data – Byte array with the coded column.

Returns

numpy array with the uncompressed column

static _float_to_char(v, p0, p25, p75, p100)[source]: Codes the column from float to bytes using the given percentiles

static _char_to_float(v, p0, p25, p75, p100)[source]: Decodes the column from bytes to float using the given percentiles
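The percentile-based coding can be illustrated with a minimal sketch. It assumes the scheme mirrors Kaldi's speech-feature compression, where each value is quantized to one byte with a piecewise-linear map: [p0, p25) onto 0..64, [p25, p75) onto 64..192 and [p75, p100] onto 192..255. The breakpoints and the scalar helper names below are assumptions for illustration; the methods above operate on whole columns and may differ in detail.

```python
def float_to_char(x, p0, p25, p75, p100):
    """Quantize one float to a byte value (0..255) given the column percentiles."""
    if x < p25:
        c = int((x - p0) / (p25 - p0) * 64 + 0.5)
        return min(max(c, 0), 64)
    if x < p75:
        c = 64 + int((x - p25) / (p75 - p25) * 128 + 0.5)
        return min(max(c, 64), 192)
    c = 192 + int((x - p75) / (p100 - p75) * 63 + 0.5)
    return min(max(c, 192), 255)


def char_to_float(c, p0, p25, p75, p100):
    """Map a byte value back to an approximate float."""
    if c <= 64:
        return p0 + (p25 - p0) * c / 64
    if c <= 192:
        return p25 + (p75 - p25) * (c - 64) / 128
    return p75 + (p100 - p75) * (c - 192) / 63


# Hypothetical percentiles of one feature column (must be strictly increasing).
percentiles = (-2.0, -0.5, 0.5, 2.0)
code = float_to_char(0.1, *percentiles)
decoded = char_to_float(code, *percentiles)
```

The middle range gets half of the available codes, so values between the 25th and 75th percentiles are stored with the finest resolution.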

to_ndarray()[source]: Uncompresses matrix to numpy array. :returns: numpy array with uncompressed matrix.

to_matrix()[source]: Uncompresses matrix to KaldiMatrix object. :returns: KaldiMatrix with uncompressed matrix.

classmethod read(f, binary, row_offset=0, num_rows=0, sequential_mode=True)[source]

Reads kaldi compressed matrix/vector from file.

Parameters

f – Python file object
binary – True if we read from binary file and False if we read from text file.
row_offset – Reads matrix starting from a given row instead of row 0.
num_rows – Number of rows to read; if 0, it reads all the rows.
sequential_mode – True if we are reading the ark file sequentially and False if we are using random access.

Returns

KaldiCompressedMatrix object.

write(f, binary)[source]

Writes matrix/vector to ark file.

Parameters

f – Python file object.
binary – True if we write in binary file and False if we write to text file.

static read_shape(f, binary, sequential_mode=True)[source]

Reads the shape of the current matrix/vector in the ark file.

Parameters

f – Python file object
binary – True if we read from binary file and False if we read from text file.
sequential_mode – True if we are reading the ark file sequentially and False if we are using random access. In sequential_mode=True it moves the file pointer to the next matrix.

Returns

Tuple object with shape.

Kaldi I/O Functions

Utils to read/write binary ARK files

Functions to write and read kaldi files

hyperion.utils.kaldi_io_funcs.init_kaldi_output_stream(f, binary)[source]: Writes Kaldi Ark file binary marker.

hyperion.utils.kaldi_io_funcs.init_kaldi_input_stream(f)[source]: Reads Kaldi Ark file binary marker.

hyperion.utils.kaldi_io_funcs.check_token(token)[source]: Checks that token doesn’t have spaces.

hyperion.utils.kaldi_io_funcs.is_token(token)[source]: Checks if token is a valid token.

hyperion.utils.kaldi_io_funcs.read_token(f, binary)[source]: Reads next token from Ark file.

hyperion.utils.kaldi_io_funcs.write_token(f, binary, token)[source]: Writes token to Ark file.

hyperion.utils.kaldi_io_funcs.peek(f, binary, num_bytes=1)[source]: Peeks num_bytes from Ark file.

hyperion.utils.kaldi_io_funcs.read_int32(f, binary)[source]: Reads Int32 from Ark file.

hyperion.utils.kaldi_io_funcs.write_int32(f, binary, val)[source]: Writes Int32 val to Ark file.
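As a rough illustration of what these functions read and write, the sketch below reproduces the binary layout with the standard library only. It assumes the usual Kaldi conventions: a key token followed by a space, the two-byte binary marker "\0B", and each int32 stored as a size byte (4) followed by four little-endian bytes. The helper names are hypothetical and are not the hyperion functions themselves.

```python
import io
import struct


def put_token(f, token):
    """Write a token followed by a single space; tokens cannot contain spaces."""
    if " " in token:
        raise ValueError("token must not contain spaces")
    f.write(token.encode("ascii") + b" ")


def get_token(f):
    """Read bytes up to the next space (or end of stream)."""
    chars = []
    while True:
        c = f.read(1)
        if c in (b" ", b""):
            break
        chars.append(c)
    return b"".join(chars).decode("ascii")


def put_int32(f, val):
    """Write the size byte (4) and then the value as little-endian int32."""
    f.write(b"\x04" + struct.pack("<i", val))


def get_int32(f):
    """Read an int32 written by put_int32."""
    if f.read(1) != b"\x04":
        raise ValueError("expected a 4-byte integer")
    return struct.unpack("<i", f.read(4))[0]


buf = io.BytesIO()
put_token(buf, "utt1")   # key of the entry
buf.write(b"\0B")        # binary-mode marker
put_int32(buf, 42)

buf.seek(0)
key = get_token(buf)
marker = buf.read(2)
value = get_int32(buf)
```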

VAD Utils

Functions to manipulate VAD output, convert from binary to timestamps, intersect VADs, etc.

hyperion.utils.vad_utils.merge_vad_timestamps(in_timestamps, tol=0.001)[source]

Merges vad timestamps that are contiguous

Parameters

in_timestamps – original time-stamps in start-time, end-time format
tol – tolerance, segments separated by less than tol will be merged

Returns

Merged timestamps
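The merging rule can be sketched in plain Python as follows. This is only an illustration of the idea under the assumption that timestamps are (start, end) pairs in seconds; the helper name is hypothetical and the library function works on NumPy arrays.

```python
def merge_contiguous(timestamps, tol=0.001):
    """Merge (start, end) pairs whose gap is smaller than tol seconds."""
    merged = []
    for start, end in sorted(timestamps):
        if merged and start - merged[-1][1] < tol:
            # Gap is within the tolerance: extend the previous segment.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(seg) for seg in merged]


segments = [(0.0, 1.0), (1.0005, 2.0), (2.5, 3.0)]
merged = merge_contiguous(segments)
```

Here the first two segments are 0.5 ms apart, so they collapse into (0.0, 2.0), while the third one stays separate.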

hyperion.utils.vad_utils.bin_vad_to_timestamps(vad, frame_length, frame_shift, snip_edges=False, merge_tol=0.001)[source]

Converts binary VAD to a list of start end time stamps

Parameters

vad – Binary VAD
frame_length – frame-length used to compute the VAD
frame_shift – frame-shift used to compute the VAD
snip_edges – if True, computing VAD used snip-edges option
merge_tol – tolerance to merge contiguous segments

Returns

VAD time stamps referred to the beginning of the file
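A simplified sketch of the conversion is shown below. It assumes that frame i covers the interval [i * frame_shift, (i + 1) * frame_shift), so it ignores frame_length, snip_edges and the merge tolerance; the helper name is hypothetical.

```python
def bin_to_timestamps(vad, frame_shift):
    """Return one (start, end) pair per run of consecutive speech frames."""
    timestamps = []
    run_start = None
    for i, is_speech in enumerate(vad):
        if is_speech and run_start is None:
            run_start = i  # a speech run begins
        elif not is_speech and run_start is not None:
            timestamps.append((run_start * frame_shift, i * frame_shift))
            run_start = None
    if run_start is not None:
        # The signal ends while still in speech.
        timestamps.append((run_start * frame_shift, len(vad) * frame_shift))
    return timestamps


vad = [0, 1, 1, 1, 0, 0, 1, 1]
speech = bin_to_timestamps(vad, frame_shift=0.01)
```

With a 10 ms shift, the two runs of ones become two segments, roughly 0.01 to 0.04 s and 0.06 to 0.08 s.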

hyperion.utils.vad_utils.vad_timestamps_to_bin(in_timestamps, frame_length, frame_shift, snip_edges=False, signal_length=None, max_frames=None)[source]

Converts VAD time-stamps to a binary vector

Parameters

in_timestamps – vad timestamps
frame_length – frame-length used to compute the VAD
frame_shift – frame-shift used to compute the VAD
snip_edges – if True, computing VAD used snip-edges option
signal_length – total duration of the signal, if None it takes it from the last timestamp
max_frames – expected number of frames, if None it computes automatically

Returns

Binary VAD np.array
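The inverse direction can be sketched in the same simplified way, assuming again that frame i covers [i * frame_shift, (i + 1) * frame_shift) and leaving out frame_length, snip_edges and signal_length. The helper name is hypothetical and it returns a Python list rather than an np.array.

```python
def timestamps_to_bin(timestamps, frame_shift, max_frames=None):
    """Return one boolean per frame, True for frames inside any segment."""
    if max_frames is None:
        # Without an expected length, use the last end time.
        max_frames = round(max(end for _, end in timestamps) / frame_shift)
    vad = [False] * max_frames
    for start, end in timestamps:
        first = round(start / frame_shift)
        last = min(round(end / frame_shift), max_frames)
        for i in range(first, last):
            vad[i] = True
    return vad


vad = timestamps_to_bin([(0.01, 0.04), (0.06, 0.08)], frame_shift=0.01)
```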

hyperion.utils.vad_utils.timestamps_wrt_vad_to_absolute_timestamps(in_timestamps, vad_timestamps)[source]

Converts time stamps relative to a signal with silence removed to absolute time stamps in the original signal.

The VAD is also provided in start-end timestamps format.

Parameters

in_timestamps – time stamps relative to a signal with silence removed
vad_timestamps – vad timestamps used to remove silence from signal

Returns

Absolute VAD time-stamps
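One way to picture this mapping: the silence-free signal is the concatenation of the VAD segments, so a relative time is shifted by the start of the VAD segment it falls into. The sketch below is a plain-Python illustration with a hypothetical helper name; it assumes that a relative segment spanning several VAD segments is split into one output segment per VAD segment, which may differ from the library's behavior.

```python
def relative_to_absolute(rel_timestamps, vad_timestamps):
    """Map (start, end) pairs on the silence-free signal to the original timeline."""
    out = []
    for rel_start, rel_end in rel_timestamps:
        offset = 0.0  # speech duration accumulated before the current VAD segment
        for vad_start, vad_end in vad_timestamps:
            dur = vad_end - vad_start
            # Part of the relative segment that lies inside this VAD segment.
            lo = max(rel_start, offset)
            hi = min(rel_end, offset + dur)
            if lo < hi:
                out.append((vad_start + lo - offset, vad_start + hi - offset))
            offset += dur
    return out


vad_timestamps = [(1.0, 3.0), (5.0, 6.0)]  # 3 s of speech in total
absolute = relative_to_absolute([(0.5, 2.5)], vad_timestamps)
```

The relative segment from 0.5 to 2.5 s crosses the boundary between the two speech regions, so it maps to (1.5, 3.0) and (5.0, 5.5) in the original signal.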

hyperion.utils.vad_utils.timestamps_wrt_bin_vad_to_absolute_timestamps(in_timestamps, vad, frame_length, frame_shift, snip_edges=False)[source]

Converts time stamps relative to a signal with silence removed to absolute time stamps in the original signal.

The VAD is provided in binary format.

Parameters

in_timestamps – time stamps relative to a signal with silence removed
vad – Binary VAD
frame_length – frame-length used to compute the VAD
frame_shift – frame-shift used to compute the VAD
snip_edges – if True, computing VAD used snip-edges option

Returns

Absolute VAD time-stamps

hyperion.utils.vad_utils.intersect_segment_timestamps_with_vad(in_timestamps, vad_timestamps)[source]

Intersects a list of segment timestamps with VAD timestamps. It returns only the segments that contain speech, modifying the start and end times to remove silence from the segments.

Parameters

in_timestamps – time stamps of a list of segments referred to time 0.
vad_timestamps – vad timestamps

Returns

Boolean array indicating which input segments contain speech.
Array of output segments with silence removed.
Array of indices, one for each output segment, indicating which input speech segment it corresponds to. The indices refer to the input segments after removing those that only contain silence.
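The three outputs described above can be illustrated with a small plain-Python sketch. The helper name is hypothetical, it uses lists instead of arrays, and it assumes that a segment overlapping several VAD regions produces one output piece per region.

```python
def intersect_with_vad(segments, vad_timestamps):
    """Clip each segment to the speech regions and track where pieces come from."""
    has_speech = []    # one flag per input segment
    out_segments = []  # clipped pieces with silence removed
    out_index = []     # for each piece, index of its segment among the kept ones
    for seg_start, seg_end in segments:
        pieces = []
        for vad_start, vad_end in vad_timestamps:
            lo, hi = max(seg_start, vad_start), min(seg_end, vad_end)
            if lo < hi:
                pieces.append((lo, hi))
        has_speech.append(bool(pieces))
        if pieces:
            kept_idx = sum(has_speech) - 1  # position after dropping silent segments
            out_segments.extend(pieces)
            out_index.extend([kept_idx] * len(pieces))
    return has_speech, out_segments, out_index


segments = [(0.0, 1.0), (2.0, 6.0), (7.0, 8.0)]
vad_timestamps = [(2.5, 3.0), (5.0, 7.5)]
has_speech, out_segments, out_index = intersect_with_vad(segments, vad_timestamps)
```

The first input segment contains only silence, so the indices 0 and 1 in out_index refer to the second and third input segments.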

Math Functions

Some math functions.

hyperion.utils.math.logdet_pdmat(A)[source]: Log determinant of positive definite matrix.

hyperion.utils.math.invert_pdmat(A, right_inv=False, return_logdet=False, return_inv=False)[source]

Inversion of positive definite matrices. Returns a lambda function f that multiplies the inverse of A times a vector.

Parameters

A – Positive definite matrix
right_inv – If False, f(v)=A^{-1}v; if True f(v)=v’ A^{-1}
return_logdet – If True, it also returns the log determinant of A.
return_inv – If True, it also returns A^{-1}

Returns

Lambda function that multiplies A^{-1} times a vector.
Cholesky transform of A (upper triangular).
Log determinant of A.
A^{-1}.
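The idea of returning a function instead of an explicit inverse can be sketched with a Cholesky factorization: once A = L L^T is known, A^{-1} v is obtained by two triangular solves and log|A| is twice the sum of the logs of the diagonal of L. The pure-Python sketch below is only illustrative. It uses a lower-triangular factor, whereas the function above reports an upper-triangular one, and the helper names are hypothetical.

```python
import math


def cholesky_lower(A):
    """Return lower-triangular L with A = L L^T for a positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L


def make_pd_solver(A):
    """Return a function f with f(v) = A^{-1} v, and the log determinant of A."""
    L = cholesky_lower(A)
    n = len(A)
    logdet = 2.0 * sum(math.log(L[i][i]) for i in range(n))

    def solve(v):
        # Forward substitution: L y = v
        y = [0.0] * n
        for i in range(n):
            y[i] = (v[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
        # Back substitution: L^T x = y
        x = [0.0] * n
        for i in reversed(range(n)):
            x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
        return x

    return solve, logdet


A = [[4.0, 2.0], [2.0, 3.0]]
inv_A_times, logdet = make_pd_solver(A)
x = inv_A_times([2.0, 1.0])
```

Reusing the factorization this way avoids forming A^{-1} explicitly when only products with a few vectors are needed.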

hyperion.utils.math.invert_trimat(A, lower=False, right_inv=False, return_logdet=False, return_inv=False)[source]

Inversion of triangular matrices. Returns a lambda function f that multiplies the inverse of A times a vector.

Parameters

A – Triangular matrix.
lower – if True A is lower triangular, else A is upper triangular.
right_inv – If False, f(v)=A^{-1}v; if True f(v)=v’ A^{-1}
return_logdet – If True, it also returns the log determinant of A.
return_inv – If True, it also returns A^{-1}

Returns

Lambda function that multiplies A^{-1} times a vector.
Log determinant of A.
A^{-1}.

hyperion.utils.math.softmax(r, axis=-1)[source]

Returns: y = exp(r)/sum(exp(r))

hyperion.utils.math.logsumexp(r, axis=-1)[source]

Returns: y = log sum(exp(r))

hyperion.utils.math.logsigmoid(x)[source]

Returns: y = log(sigmoid(x))

hyperion.utils.math.neglogsigmoid(x)[source]

Returns: y = -log(sigmoid(x))

hyperion.utils.math.sigmoid(x)[source]

Returns: y = sigmoid(x)
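Evaluating these formulas literally overflows for large inputs, so they are usually computed in a numerically stable form. The sketch below shows the standard tricks on plain Python lists and scalars; it is an illustration with the same names for readability, not the library code, which operates on NumPy arrays along an axis.

```python
import math


def logsumexp(r):
    """log(sum(exp(r))), subtracting the maximum to avoid overflow."""
    m = max(r)
    return m + math.log(sum(math.exp(v - m) for v in r))


def softmax(r):
    """exp(r) / sum(exp(r)), computed through logsumexp."""
    lse = logsumexp(r)
    return [math.exp(v - lse) for v in r]


def logsigmoid(x):
    """log(sigmoid(x)) = min(x, 0) - log(1 + exp(-|x|)), safe for large |x|."""
    return min(x, 0.0) - math.log1p(math.exp(-abs(x)))


def neglogsigmoid(x):
    """-log(sigmoid(x))."""
    return -logsigmoid(x)


scores = [1000.0, 1001.0, 1002.0]  # exp(1000.0) alone would overflow
probs = softmax(scores)
```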

hyperion.utils.math.fisher_ratio(mu1, Sigma1, mu2, Sigma2)[source]: Computes the Fisher ratio between two classes from the class means and covariances.

hyperion.utils.math.fisher_ratio_with_precs(mu1, Lambda1, mu2, Lambda2)[source]: Computes the Fisher ratio between two classes from the class means and precisions.

hyperion.utils.math.symmat2vec(A, lower=False, diag_factor=None)[source]

Puts a symmetric matrix into a vector.

Parameters

A – Symmetric matrix.
lower – If True, it uses the lower triangular part of the matrix. If False, it uses the upper triangular part of the matrix.
diag_factor – It multiplies the diagonal of A by diag_factor.

Returns

Vector with the upper or lower triangular part of A.

hyperion.utils.math.vec2symmat(v, lower=False, diag_factor=None)[source]

Puts a vector back into a symmetric matrix.

Parameters

v – Vector with the upper or lower triangular part of A.
lower – If True, v contains the lower triangular part of the matrix. If False, v contains the upper triangular part of the matrix.
diag_factor – It multiplies the diagonal of A by diag_factor.

Returns

Symmetric matrix.
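A round trip between the two representations can be sketched as follows. The sketch assumes that the upper triangle is stored row by row (the ordering of NumPy's triu_indices) and that diag_factor simply scales the diagonal entries; both are assumptions, and the helper names are hypothetical.

```python
import math


def symmat_to_vec(A, diag_factor=None):
    """Stack the upper triangle of a symmetric matrix row by row."""
    n = len(A)
    v = []
    for i in range(n):
        for j in range(i, n):
            x = A[i][j]
            if i == j and diag_factor is not None:
                x *= diag_factor
            v.append(x)
    return v


def vec_to_symmat(v, diag_factor=None):
    """Rebuild the symmetric matrix; n is recovered from len(v) = n (n + 1) / 2."""
    n = (math.isqrt(8 * len(v) + 1) - 1) // 2
    A = [[0.0] * n for _ in range(n)]
    k = 0
    for i in range(n):
        for j in range(i, n):
            x = v[k]
            if i == j and diag_factor is not None:
                x *= diag_factor
            A[i][j] = A[j][i] = x  # mirror across the diagonal
            k += 1
    return A


A = [[2.0, 1.0, 0.0],
     [1.0, 3.0, 4.0],
     [0.0, 4.0, 5.0]]
v = symmat_to_vec(A)
A_back = vec_to_symmat(v)
```

A 3 x 3 symmetric matrix needs only 6 stored values, which is the reason for using this packed form.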

hyperion.utils.math.trimat2vec(A, lower=False)[source]

Puts a triangular matrix into a vector.

Parameters

A – Triangular matrix.
lower – If True, it uses the lower triangular part of the matrix. If False, it uses the upper triangular part of the matrix.

Returns

Vector with the upper or lower triangular part of A.

hyperion.utils.math.vec2trimat(v, lower=False)[source]

Puts a vector back into a triangular matrix.

Parameters

v – Vector with the upper or lower triangular part of A.
lower – If True, v contains the lower triangular part of the matrix. If False, v contains the upper triangular part of the matrix.

Returns

Triangular matrix.

hyperion.utils.math.fullcov_varfloor(S, F, F_is_chol=False, lower=False)[source]

Variance flooring for full covariance matrices.

Parameters

S – Covariance.
F – Minimum covariance or its Cholesky decomposition.
F_is_chol – If True, F is a Cholesky decomposition.
lower – True if cholF is lower triangular, False otherwise

Returns

Floored covariance

hyperion.utils.math.fullcov_varfloor_from_cholS(cholS, cholF, lower=False)[source]

Variance flooring for full covariance matrices, using Cholesky decompositions as input/output.

Parameters

cholS – Cholesky decomposition of the covariance.
cholF – Cholesky decomposition of the minimum covariance.
lower – True if matrices are lower triangular, False otherwise

Returns

Cholesky decomposition of the floored covariance

hyperion.utils.math.int2onehot(class_ids, num_classes=None)[source]

Integer to 1-hot vector.

Parameters

class_ids – Numpy array of integers.
num_classes – Maximum number of classes.

Returns

1-hot Numpy array.
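As a small illustration, the conversion can be written in plain Python as below. The sketch assumes that, when num_classes is None, the number of classes is taken as the largest class id plus one; that default and the helper name are assumptions, and the result is a nested list rather than a Numpy array.

```python
def int_to_onehot(class_ids, num_classes=None):
    """Return one row per class id, with a single 1.0 at the id's position."""
    if num_classes is None:
        num_classes = max(class_ids) + 1
    return [[1.0 if c == k else 0.0 for k in range(num_classes)]
            for c in class_ids]


onehot = int_to_onehot([0, 2, 1])
```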

hyperion.utils.math.cosine_scoring(x1, x2)[source]

Miscellaneous Functions

Miscellaneous functions

hyperion.utils.misc.generate_data(g)[source]

hyperion.utils.misc.str2bool(s)[source]: Convert string to bool for argparse

hyperion.utils.misc.apply_gain_logx(x, AdB)[source]: Applies A dB gain to log(x)

hyperion.utils.misc.apply_gain_logx2(x, AdB)[source]: Applies A dB gain to log(x^2)

hyperion.utils.misc.apply_gain_x(x, AdB)[source]: Applies A dB gain to x

hyperion.utils.misc.apply_gain_x2(x, AdB)[source]: Applies A dB gain to x^2
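The four gain functions are related by the usual decibel identities. The sketch below assumes that AdB is an amplitude gain (a factor of 10^(AdB/20) on x, hence 10^(AdB/10) on x^2) and that log is the natural logarithm; both are assumptions about the conventions, and the helper names are hypothetical.

```python
import math


def gain_x(x, a_db):
    """Scale an amplitude by a_db decibels."""
    return x * 10 ** (a_db / 20)


def gain_x2(x2, a_db):
    """Scale a power (squared amplitude) by a_db decibels."""
    return x2 * 10 ** (a_db / 10)


def gain_logx(logx, a_db):
    """Same gain applied in the log-amplitude domain (natural log)."""
    return logx + a_db * math.log(10) / 20


def gain_logx2(logx2, a_db):
    """Same gain applied in the log-power domain (natural log)."""
    return logx2 + a_db * math.log(10) / 10


louder = gain_x(1.0, 20.0)  # +20 dB multiplies the amplitude by 10
```

In the log domains the gain becomes an additive offset, which is why separate functions exist for each feature type.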

hyperion.utils.misc.apply_gain(x, feat_type, AdB)[source]

hyperion.utils.misc.energy_vad(P)[source]

hyperion.utils.misc.compute_snr(x, n, axis=-1)[source]

hyperion.utils.misc.filter_args(valid_args, kwargs)[source]

Filters arguments from a dictionary

Parameters

valid_args – list/tuple of valid arguments
kwargs – dictionary containing program config arguments

Returns: Dictionary with only the valid_args keys, if they exist
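The behavior described above amounts to a dictionary comprehension. The following is a minimal sketch with a hypothetical helper name and made-up argument names, not the library source:

```python
def filter_args_sketch(valid_args, kwargs):
    """Keep only the entries of kwargs whose key is listed in valid_args."""
    return {k: kwargs[k] for k in valid_args if k in kwargs}


config = {"lr": 0.1, "epochs": 10, "verbose": True}
opt_args = filter_args_sketch(("lr", "momentum"), config)
```

Keys listed in valid_args but missing from the dictionary, such as "momentum" here, are skipped rather than raising an error.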