Utils
The hyperion.utils module contains several utility classes and functions.
Trial Management Classes
These utilities handle trial indices, keys and scores. They are based on the MATLAB implementations in the BOSARIS Toolkit.
- class hyperion.utils.trial_key.TrialKey(model_set=None, seg_set=None, tar=None, non=None, model_cond=None, seg_cond=None, trial_cond=None, model_cond_name=None, seg_cond_name=None, trial_cond_name=None)[source]
- Contains the trial key for speaker recognition trials.
Bosaris compatible Key.
- model_set
List of model names.
- seg_set
List of test segment names.
- tar
Boolean matrix with target trials to True (num_models x num_segments).
- non
Boolean matrix with non-target trials to True (num_models x num_segments).
- model_cond
Conditions related to the model.
- seg_cond
Conditions related to the test segment.
- trial_cond
Conditions related to the combination of model and test segment.
- model_cond_name
String list with the names of the model conditions.
- seg_cond_name
String list with the names of the segment conditions.
- trial_cond_name
String list with the names of the trial conditions.
- __init__(model_set=None, seg_set=None, tar=None, non=None, model_cond=None, seg_cond=None, trial_cond=None, model_cond_name=None, seg_cond_name=None, trial_cond_name=None)[source]
- property num_models
- property num_tests
- save(file_path)[source]
Saves object to txt/h5 file.
- Parameters
file_path – File to write the list.
- save_txt(file_path)[source]
Saves object to txt file.
- Parameters
file_path – File to write the list.
- classmethod load(file_path)[source]
Loads object from txt/h5 file
- Parameters
file_path – File to read the list.
- Returns
TrialKey object.
- classmethod load_h5(file_path)[source]
Loads object from h5 file
- Parameters
file_path – File to read the list.
- Returns
TrialKey object.
- classmethod load_txt(file_path)[source]
Loads object from txt file
- Parameters
file_path – File to read the list.
- Returns
TrialKey object.
- classmethod merge(key_list)[source]
Merges several key objects.
- Parameters
key_list – List of TrialKey objects.
- Returns
Merged TrialKey object.
- filter(model_set, seg_set, keep=True)[source]
Removes elements from TrialKey object.
- Parameters
model_set – List of models to keep or remove.
seg_set – List of test segments to keep or remove.
keep – If True, we keep the elements in model_set/seg_set, if False, we remove the elements in model_set/seg_set.
- Returns
Filtered TrialKey object.
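The keep/remove semantics described above can be sketched in plain Python. This is a minimal illustration, not hyperion's implementation: the helper names (`select_indices`, `submatrix`, `filter_key`) are hypothetical, the matrices are nested lists rather than numpy arrays, and the sketch assumes the original ordering of models and segments is preserved.

```python
def select_indices(names, filter_names, keep=True):
    """Indices of `names` that survive filtering against `filter_names`."""
    wanted = set(filter_names)
    # keep=True selects the listed names, keep=False selects all the others.
    return [i for i, name in enumerate(names) if (name in wanted) == keep]


def submatrix(mat, row_idx, col_idx):
    """Rows and columns of a nested-list matrix, in the given order."""
    return [[mat[i][j] for j in col_idx] for i in row_idx]


def filter_key(model_set, seg_set, tar, non, filt_models, filt_segs, keep=True):
    """Hypothetical stand-in for a key filter working on plain lists."""
    mod_idx = select_indices(model_set, filt_models, keep)
    seg_idx = select_indices(seg_set, filt_segs, keep)
    return (
        [model_set[i] for i in mod_idx],
        [seg_set[j] for j in seg_idx],
        submatrix(tar, mod_idx, seg_idx),
        submatrix(non, mod_idx, seg_idx),
    )


models = ["m1", "m2", "m3"]
segs = ["s1", "s2"]
tar = [[True, False], [False, True], [False, False]]
non = [[False, True], [True, False], [True, True]]

kept = filter_key(models, segs, tar, non, ["m1", "m3"], ["s2"], keep=True)
removed = filter_key(models, segs, tar, non, ["m1", "m3"], ["s2"], keep=False)
```

With `keep=True` the result contains only models m1 and m3 and segment s2; with `keep=False` it contains what is left, model m2 and segment s1.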
- split(model_idx, num_model_parts, seg_idx, num_seg_parts)[source]
- Splits the TrialKey into num_model_parts x num_seg_parts parts and returns part (model_idx, seg_idx).
- Parameters
model_idx – Model index of the part to return from 1 to num_model_parts.
num_model_parts – Number of parts to split the model list.
seg_idx – Segment index of the part to return from 1 to num_seg_parts.
num_seg_parts – Number of parts to split the test segment list.
num_seg_parts – Number of parts to split the test segment list.
- Returns
Subpart of the TrialKey
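A rough sketch of the 2-D split, using 1-based part indices as documented. The helper names are hypothetical, and the way the remainder is distributed (the first parts get one extra item) is an assumption; hyperion may partition differently.

```python
def split_range(num_items, part_idx, num_parts):
    """Return (start, stop) of part `part_idx` (1-based) out of `num_parts`."""
    if not 1 <= part_idx <= num_parts:
        raise ValueError("part_idx must be in [1, num_parts]")
    base, extra = divmod(num_items, num_parts)
    # Assumption: the first `extra` parts receive one additional item.
    start = (part_idx - 1) * base + min(part_idx - 1, extra)
    stop = start + base + (1 if part_idx <= extra else 0)
    return start, stop


def split_key(model_set, seg_set, tar, non,
              model_idx, num_model_parts, seg_idx, num_seg_parts):
    """Return the (model_idx, seg_idx) block of a key stored as nested lists."""
    m0, m1 = split_range(len(model_set), model_idx, num_model_parts)
    s0, s1 = split_range(len(seg_set), seg_idx, num_seg_parts)
    sub_tar = [row[s0:s1] for row in tar[m0:m1]]
    sub_non = [row[s0:s1] for row in non[m0:m1]]
    return model_set[m0:m1], seg_set[s0:s1], sub_tar, sub_non


models = ["m1", "m2", "m3"]
segs = ["s1", "s2", "s3", "s4"]
tar = [[True, False, False, False],
       [False, True, False, False],
       [False, False, True, True]]
non = [[not t for t in row] for row in tar]

# Second of 2 model parts, first of 2 segment parts.
part = split_key(models, segs, tar, non, 2, 2, 1, 2)
```

Splitting this way lets each job of a parallel scoring run load only its own block of trials.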
- class hyperion.utils.trial_ndx.TrialNdx(model_set=None, seg_set=None, trial_mask=None)[source]
- Contains the trial index to run speaker recognition trials.
Bosaris compatible Ndx.
- model_set
List of model names.
- seg_set
List of test segment names.
- trial_mask
Boolean matrix with the trials to execute to True (num_models x num_segments).
- property num_models
- property num_tests
- save(file_path)[source]
Saves object to txt/h5 file.
- Parameters
file_path – File to write the list.
- save_txt(file_path)[source]
Saves object to txt file.
- Parameters
file_path – File to write the list.
- classmethod load(file_path)[source]
Loads object from txt/h5 file
- Parameters
file_path – File to read the list.
- Returns
TrialNdx object.
- classmethod load_h5(file_path)[source]
Loads object from h5 file
- Parameters
file_path – File to read the list.
- Returns
TrialNdx object.
- classmethod load_txt(file_path)[source]
Loads object from txt file
- Parameters
file_path – File to read the list.
- Returns
TrialNdx object.
- classmethod merge(ndx_list)[source]
Merges several index objects.
- Parameters
ndx_list – List of TrialNdx objects.
- Returns
Merged TrialNdx object.
- static parse_eval_set(ndx, enroll, test=None, eval_set='enroll-test')[source]
Prepares the data structures required for evaluation.
- Parameters
ndx – TrialNdx object containing the trials for the main evaluation.
enroll – Utt2Info where the keys are file_ids and the second column contains the model names.
test – Utt2Info where the keys are test segment names. Needed in the cases enroll-coh and coh-coh.
eval_set – Type of evaluation. enroll-test: main evaluation of enrollment vs. test segments. enroll-coh: enrollment vs. cohort segments. coh-test: cohort vs. test segments. coh-coh: cohort vs. cohort segments.
- Returns
ndx – TrialNdx object.
enroll – SCPList.
- filter(model_set, seg_set, keep=True)[source]
Removes elements from TrialNdx object.
- Parameters
model_set – List of models to keep or remove.
seg_set – List of test segments to keep or remove.
keep – If True, we keep the elements in model_set/seg_set, if False, we remove the elements in model_set/seg_set.
- Returns
Filtered TrialNdx object.
- split(model_idx, num_model_parts, seg_idx, num_seg_parts)[source]
- Splits the TrialNdx into num_model_parts x num_seg_parts parts and returns part (model_idx, seg_idx).
- Parameters
model_idx – Model index of the part to return from 1 to num_model_parts.
num_model_parts – Number of parts to split the model list.
seg_idx – Segment index of the part to return from 1 to num_seg_parts.
num_seg_parts – Number of parts to split the test segment list.
- Returns
Subpart of the TrialNdx
- apply_segmentation_to_test(segment_list)[source]
Splits each test segment into multiple sub-segments. Useful to create an ndx for speaker diarization or tracking.
- Parameters
segment_list – ExtSegmentList object with mapping of file_id to ext_segment_id
- Returns
New TrialNdx object with segment_ids in test instead of file_id.
- class hyperion.utils.trial_scores.TrialScores(model_set=None, seg_set=None, scores=None, score_mask=None)[source]
- Contains the scores for the speaker recognition trials.
Bosaris compatible Scores.
- model_set
List of model names.
- seg_set
List of test segment names.
- scores
Matrix with the scores (num_models x num_segments).
- score_mask
Boolean matrix with the trials with valid scores to True (num_models x num_segments).
- property num_models
- property num_tests
- save(file_path)[source]
Saves object to txt/h5 file.
- Parameters
file_path – File to write the list.
- save_txt(file_path)[source]
Saves object to txt file.
- Parameters
file_path – File to write the list.
- classmethod load(file_path)[source]
Loads object from txt/h5 file
- Parameters
file_path – File to read the list.
- Returns
TrialScores object.
- classmethod load_h5(file_path)[source]
Loads object from h5 file
- Parameters
file_path – File to read the list.
- Returns
TrialScores object.
- classmethod load_txt(file_path)[source]
Loads object from txt file
- Parameters
file_path – File to read the list.
- Returns
TrialScores object.
- classmethod merge(scr_list)[source]
Merges several score objects.
- Parameters
scr_list – List of TrialScores objects.
- Returns
Merged TrialScores object.
- filter(model_set, seg_set, keep=True, raise_missing=True)[source]
Removes elements from TrialScores object.
- Parameters
model_set – List of models to keep or remove.
seg_set – List of test segments to keep or remove.
keep – If True, we keep the elements in model_set/seg_set, if False, we remove the elements in model_set/seg_set.
raise_missing – Raises exception if there are elements in model_set or seg_set that are not in the object.
- Returns
Filtered TrialScores object.
- split(model_idx, num_model_parts, seg_idx, num_seg_parts)[source]
- Splits the TrialScores into num_model_parts x num_seg_parts parts and returns part (model_idx, seg_idx).
- Parameters
model_idx – Model index of the part to return from 1 to num_model_parts.
num_model_parts – Number of parts to split the model list.
seg_idx – Segment index of the part to return from 1 to num_seg_parts.
num_seg_parts – Number of parts to split the test segment list.
- Returns
Subpart of the TrialScores
- align_with_ndx(ndx, raise_missing=True)[source]
Aligns scores, model_set and seg_set with TrialNdx or TrialKey.
- Parameters
ndx – TrialNdx or TrialKey object.
raise_missing – Raises exception if there are trials in ndx that are not in the score object.
- Returns
Aligned TrialScores object.
- get_tar_non(key)[source]
Returns target and non target scores.
- Parameters
key – TrialKey object.
- Returns
Numpy array with target scores. Numpy array with non-target scores.
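Conceptually, the target and non-target scores are the entries of the score matrix selected by the tar and non masks of the key. The sketch below shows this with nested lists instead of numpy arrays; the function is a hypothetical stand-in and assumes the scores are already aligned with the key (same model and segment order).

```python
def get_tar_non(scores, tar, non):
    """Select target and non-target scores with boolean masks of the same shape."""
    tar_scores = [s for s_row, t_row in zip(scores, tar)
                  for s, t in zip(s_row, t_row) if t]
    non_scores = [s for s_row, n_row in zip(scores, non)
                  for s, n in zip(s_row, n_row) if n]
    return tar_scores, non_scores


scores = [[2.5, -1.0],
          [-0.5, 3.0]]
tar = [[True, False], [False, True]]
non = [[False, True], [True, False]]

tar_scores, non_scores = get_tar_non(scores, tar, non)
```

The two lists are what an EER or DCF computation would consume.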
- set_missing_to_value(ndx, val)[source]
Aligns the scores with a TrialNdx and sets the trials with missing scores to the same value.
- Parameters
ndx – TrialNdx or TrialKey object.
val – Value for the missing scores.
- Returns
Aligned TrialScores object.
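The alignment-and-fill behaviour can be pictured as follows. This is only a sketch under stated assumptions: the function name and the plain-list data layout are hypothetical, and filling entries that are not trials with 0.0 is an arbitrary choice made here, not something the documentation specifies.

```python
def align_and_fill(model_set, seg_set, scores, score_mask,
                   ndx_models, ndx_segs, trial_mask, val):
    """Reorder scores to the ndx layout; trials without a valid score get `val`."""
    mod_pos = {m: i for i, m in enumerate(model_set)}
    seg_pos = {s: j for j, s in enumerate(seg_set)}
    aligned = []
    for i, m in enumerate(ndx_models):
        row = []
        for j, s in enumerate(ndx_segs):
            mi, sj = mod_pos.get(m), seg_pos.get(s)
            has_score = mi is not None and sj is not None and score_mask[mi][sj]
            if has_score:
                row.append(scores[mi][sj])
            elif trial_mask[i][j]:
                row.append(val)   # trial requested by the ndx but never scored
            else:
                row.append(0.0)   # not a trial; placeholder value (assumption)
        aligned.append(row)
    return aligned


aligned = align_and_fill(
    model_set=["m2", "m1"], seg_set=["s1"],
    scores=[[1.0], [2.0]], score_mask=[[True], [True]],
    ndx_models=["m1", "m2"], ndx_segs=["s1", "s2"],
    trial_mask=[[True, True], [True, False]],
    val=-10.0,
)
```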
- class hyperion.utils.trial_stats.TrialStats(df_stats)[source]
Contains ancillary statistics of the trials, such as quality measures like SNR.
This class was created to store statistics about adversarial attacks, such as SNR (signal-to-perturbation ratio) and the Linf and L2 norms of the perturbation.
- df_stats
pandas dataframe containing the stats. The dataframe needs to include the modelid and segmentid columns
- classmethod load(file_path)[source]
Loads stats file
- Parameters
file_path – stats file in csv format
- Returns
TrialStats object.
- get_stats_mat(stat_name, ndx, raise_missing=True)[source]
Returns a matrix of trial statistics sorted to match a given Ndx or Key object.
- Parameters
stat_name – name of the statistic (e.g. snr, linf), as given in the column name of the dataframe.
ndx – Ndx or Key object
- Returns
Stat matrix (n_models x n_tests)
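The reshaping from a per-trial table to a models x tests matrix can be sketched without pandas. The modelid and segmentid column names come from the description of df_stats above; everything else (the function name, rows as dictionaries, None for missing entries) is an assumption made for illustration.

```python
def stats_matrix(rows, stat_name, model_set, seg_set, raise_missing=True):
    """Arrange one statistic into a matrix ordered like model_set x seg_set."""
    lookup = {(r["modelid"], r["segmentid"]): r[stat_name] for r in rows}
    mat = []
    for m in model_set:
        row = []
        for s in seg_set:
            if (m, s) in lookup:
                row.append(lookup[(m, s)])
            elif raise_missing:
                raise KeyError(f"no {stat_name} value for trial ({m}, {s})")
            else:
                row.append(None)  # placeholder for a missing trial (assumption)
        mat.append(row)
    return mat


rows = [
    {"modelid": "m1", "segmentid": "s1", "snr": 20.0},
    {"modelid": "m1", "segmentid": "s2", "snr": 15.0},
    {"modelid": "m2", "segmentid": "s1", "snr": 30.0},
]
snr = stats_matrix(rows, "snr", ["m2", "m1"], ["s1", "s2"], raise_missing=False)
```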
- class hyperion.utils.sparse_trial_key.SparseTrialKey(model_set=None, seg_set=None, tar=None, non=None, model_cond=None, seg_cond=None, trial_cond=None, model_cond_name=None, seg_cond_name=None, trial_cond_name=None)[source]
- Contains the trial key for speaker recognition trials.
Bosaris compatible Key.
- model_set
List of model names.
- seg_set
List of test segment names.
- tar
Boolean matrix with target trials to True (num_models x num_segments).
- non
Boolean matrix with non-target trials to True (num_models x num_segments).
- model_cond
Conditions related to the model.
- seg_cond
Conditions related to the test segment.
- trial_cond
Conditions related to the combination of model and test segment.
- model_cond_name
String list with the names of the model conditions.
- seg_cond_name
String list with the names of the segment conditions.
- trial_cond_name
String list with the names of the trial conditions.
- __init__(model_set=None, seg_set=None, tar=None, non=None, model_cond=None, seg_cond=None, trial_cond=None, model_cond_name=None, seg_cond_name=None, trial_cond_name=None)[source]
- save_txt(file_path)[source]
Saves object to txt file.
- Parameters
file_path – File to write the list.
- classmethod load_h5(file_path)[source]
Loads object from h5 file
- Parameters
file_path – File to read the list.
- Returns
TrialKey object.
- classmethod load_txt(file_path)[source]
Loads object from txt file
- Parameters
file_path – File to read the list.
- Returns
TrialKey object.
- classmethod merge(key_list)[source]
Merges several key objects.
- Parameters
key_list – List of TrialKey objects.
- Returns
Merged TrialKey object.
- __cmp__(other)
Comparison operator
- __ne__(other)
Non-equal operator
- copy()
Makes a copy of the object
- filter(model_set, seg_set, keep=True)
Removes elements from TrialKey object.
- Parameters
model_set – List of models to keep or remove.
seg_set – List of test segments to keep or remove.
keep – If True, we keep the elements in model_set/seg_set, if False, we remove the elements in model_set/seg_set.
- Returns
Filtered TrialKey object.
- classmethod load(file_path)
Loads object from txt/h5 file
- Parameters
file_path – File to read the list.
- Returns
TrialKey object.
- property num_models
- property num_tests
- save(file_path)
Saves object to txt/h5 file.
- Parameters
file_path – File to write the list.
- sort()
Sorts the object by model and test segment names.
- split(model_idx, num_model_parts, seg_idx, num_seg_parts)
- Splits the TrialKey into num_model_parts x num_seg_parts parts and returns part (model_idx, seg_idx).
- Parameters
model_idx – Model index of the part to return from 1 to num_model_parts.
num_model_parts – Number of parts to split the model list.
seg_idx – Segment index of the part to return from 1 to num_seg_parts.
num_seg_parts – Number of parts to split the test segment list.
- Returns
Subpart of the TrialKey
- test()
- class hyperion.utils.sparse_trial_scores.SparseTrialScores(model_set=None, seg_set=None, scores=None, score_mask=None)[source]
- Contains the scores for the speaker recognition trials.
Bosaris compatible Scores.
- model_set
List of model names.
- seg_set
List of test segment names.
- scores
Matrix with the scores (num_models x num_segments).
- score_mask
Boolean matrix with the trials with valid scores to True (num_models x num_segments).
- save_txt(file_path)[source]
Saves object to txt file.
- Parameters
file_path – File to write the list.
- classmethod load_h5(file_path)[source]
Loads object from h5 file
- Parameters
file_path – File to read the list.
- Returns
TrialScores object.
- classmethod load_txt(file_path)[source]
Loads object from txt file
- Parameters
file_path – File to read the list.
- Returns
SparseTrialScores object.
- classmethod merge(scr_list)[source]
Merges several score objects.
- Parameters
scr_list – List of TrialScores objects.
- Returns
Merged TrialScores object.
- split(model_idx, num_model_parts, seg_idx, num_seg_parts)[source]
- Splits the TrialScores into num_model_parts x num_seg_parts parts and returns part (model_idx, seg_idx).
- Parameters
model_idx – Model index of the part to return from 1 to num_model_parts.
num_model_parts – Number of parts to split the model list.
seg_idx – Segment index of the part to return from 1 to num_seg_parts.
num_seg_parts – Number of parts to split the test segment list.
- Returns
Subpart of the TrialScores
- filter(model_set, seg_set, keep=True, raise_missing=True)[source]
Removes elements from TrialScores object.
- Parameters
model_set – List of models to keep or remove.
seg_set – List of test segments to keep or remove.
keep – If True, we keep the elements in model_set/seg_set, if False, we remove the elements in model_set/seg_set.
raise_missing – Raises exception if there are elements in model_set or seg_set that are not in the object.
- Returns
Filtered TrialScores object.
- align_with_ndx(ndx, raise_missing=True)[source]
Aligns scores, model_set and seg_set with TrialNdx or TrialKey.
- Parameters
ndx – TrialNdx or TrialKey object.
raise_missing – Raises exception if there are trials in ndx that are not in the score object.
- Returns
Aligned TrialScores object.
- get_tar_non(key)[source]
Returns target and non target scores.
- Parameters
key – TrialKey object.
- Returns
Numpy array with target scores. Numpy array with non-target scores.
- set_missing_to_value(ndx, val)[source]
Aligns the scores with a TrialNdx and sets the trials with missing scores to the same value.
- Parameters
ndx – TrialNdx or TrialKey object.
val – Value for the missing scores.
- Returns
Aligned SparseTrialScores object.
- __cmp__(other)
Comparison operator
- __ne__(other)
Non-equal operator
- copy()
Makes a copy of the object
- classmethod load(file_path)
Loads object from txt/h5 file
- Parameters
file_path – File to read the list.
- Returns
TrialScores object.
- property num_models
- property num_tests
- save(file_path)
Saves object to txt/h5 file.
- Parameters
file_path – File to write the list.
- sort()
Sorts the object by model and test segment names.
- test()
- transform(f)
Applies a function to the valid scores of the object.
- Parameters
f – function handle.
Kaldi Data Directory Manipulation Classes
These are classes to manipulate Kaldi data directory files like wav.scp, utt2spk, segments, rttm.
- class hyperion.utils.scp_list.SCPList(key, file_path, offset=None, range_spec=None)[source]
Class to manipulate script lists.
- key
segment key name.
- file_path
path to the file on hard drive, wav, ark or hdf5 file.
- offset
Byte in Ark file where the data is located.
- range_spec
range of frames (rows) to read.
- key_to_index
Dictionary that returns the position of a key in the list.
- __getitem__(key)[source]
- Accesses the data in the list by key or index, as in a dictionary.
If the input is a string key:
scp = SCPList(keys, file_paths, offsets, ranges)
file_path, offset, range = scp['data1']
If the input is an index:
key, file_path, offset, range = scp[0]
- Parameters
key – String key or integer index.
- Returns
If key is a string: file_path, offset and range_spec given the key.
If key is the index in the key list: key, file_path, offset and range_spec given the index.
- save(file_path, sep=' ', offset_sep=':')[source]
Saves script list to text file.
- Parameters
file_path – File to write the list.
sep – Separator between the key and file_path in the text file.
offset_sep – Separator between file_path and offset.
- static parse_script(script, offset_sep)[source]
Parses the parts of the second field of the scp text file.
- Parameters
script – Second column of scp file.
offset_sep – Separator between file_path and offset.
- Returns
file_path, offset and range_spec.
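For orientation, a Kaldi-style scp entry has the form `key file_path[:offset][range]`, for example `utt1 feats.ark:1234`. The sketch below parses such a second field. It is an illustrative assumption, not hyperion's parser: the function name, the `[start:end]` range syntax and the returned tuple layout are hypothetical, and how the range is interpreted is not specified here.

```python
import re


def parse_script_field(script, offset_sep=":"):
    """Split 'file_path[:offset][[start:end]]' into its parts."""
    range_spec = None
    # Optional trailing range such as "[10:20]" (assumed syntax).
    match = re.match(r"^(.*)\[(\d+):(\d+)\]$", script)
    if match:
        script = match.group(1)
        range_spec = (int(match.group(2)), int(match.group(3)))
    # Optional byte offset after the last separator.
    file_path, sep, offset = script.rpartition(offset_sep)
    if sep and offset.isdigit():
        return file_path, int(offset), range_spec
    return script, None, range_spec


with_offset = parse_script_field("feats.ark:1234")
plain_wav = parse_script_field("audio/file1.wav")
with_range = parse_script_field("feats.ark:42[10:20]")
```

Using `rpartition` keeps any earlier separator characters in the path itself intact.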
- classmethod load(file_path, sep=' ', offset_sep=':', is_wav=False)[source]
Loads script list from text file.
- Parameters
file_path – File to read the list.
sep – Separator between the key and file_path in the text file.
offset_sep – Separator between file_path and offset.
- Returns
SCPList object.
- split(idx, num_parts, group_by_key=True)[source]
Splits SCPList into num_parts and returns part idx.
- Parameters
idx – Part to return from 1 to num_parts.
num_parts – Number of parts to split the list.
group_by_key – If True, all the lines with the same key go to the same part.
- Returns
Sub SCPList
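The effect of group_by_key can be sketched as splitting the unique keys rather than the lines, so repeated keys are never separated. The function below is hypothetical and returns line indices instead of an SCPList; the balancing of part sizes is an assumption.

```python
def split_by_key(keys, idx, num_parts):
    """Indices of the entries in part `idx` (1-based), keeping equal keys together."""
    uniq = list(dict.fromkeys(keys))  # unique keys in order of first appearance
    base, extra = divmod(len(uniq), num_parts)
    start = (idx - 1) * base + min(idx - 1, extra)
    stop = start + base + (1 if idx <= extra else 0)
    selected = set(uniq[start:stop])
    return [i for i, key in enumerate(keys) if key in selected]


keys = ["a", "a", "b", "c", "c"]
part1 = split_by_key(keys, 1, 2)
part2 = split_by_key(keys, 2, 2)
```

Here both lines with key "a" land in part 1 and both lines with key "c" land in part 2.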
- classmethod merge(scp_lists)[source]
Merges several SCPList.
- Parameters
scp_lists – List of SCPLists
- Returns
SCPList object with the concatenation of the scp_lists.
- filter(filter_key, keep=True)[source]
Removes elements from SCPList object by key
- Parameters
filter_key – List with the keys of the elements to keep or remove.
keep – If True, we keep the elements in filter_key; if False, we remove the elements in filter_key.
- Returns
SCPList object.
- filter_paths(filter_key, keep=True)[source]
Removes elements of SCPList by file_path
- Parameters
filter_key – List with the file_path of the elements to keep or remove.
keep – If True, we keep the elements in filter_key; if False, we remove the elements in filter_key.
- Returns
SCPList object.
- filter_index(index, keep=True)[source]
Removes elements of SCPList by index
- Parameters
index – List with the indices of the elements to keep or remove.
keep – If True, we keep the elements in index; if False, we remove the elements in index.
- Returns
SCPList object.
- class hyperion.utils.utt2info.Utt2Info(utt_info)[source]
Class to manipulate utt2spk, utt2lang, etc. files.
- key
segment key name.
- info
- key_to_index
Dictionary that returns the position of a key in the list.
- property num_info_fields
- property key
- property info
- __getitem__(key)[source]
- Accesses the data in the list by key or index, as in a dictionary.
If the input is a string key:
utt2spk = Utt2Info(info)
spk_id = utt2spk['data1']
If the input is an index:
key, spk_id = utt2spk[0]
- Parameters
key – String key or integer index.
- Returns
If key is a string: info corresponding to the key.
If key is the index in the key list: key and info given the index.
- save(file_path, sep=' ')[source]
Saves uttinfo to text file.
- Parameters
file_path – File to write the list.
sep – Separator between the fields in the text file.
- classmethod load(file_path, sep=' ', dtype={0: <class 'str'>, 1: <class 'str'>})[source]
Loads utt2info list from text file.
- Parameters
file_path – File to read the list.
sep – Separator between the fields in the text file.
dtype – Dictionary with the dtypes of each column.
- Returns
Utt2Info object
- split(idx, num_parts, group_by_field=0)[source]
Splits Utt2Info into num_parts and returns part idx.
- Parameters
idx – Part to return from 1 to num_parts.
num_parts – Number of parts to split the list.
group_by_field – All the lines with the same value in column group_by_field go to the same part.
- Returns
Sub Utt2Info object
- classmethod merge(info_lists)[source]
Merges several Utt2Info tables.
- Parameters
info_lists – List of Utt2Info
- Returns
Utt2Info object with the concatenation of the info_lists.
- filter(filter_key, keep=True)[source]
Removes elements from Utt2Info object by key
- Parameters
filter_key – List with the keys of the elements to keep or remove.
keep – If True, we keep the elements in filter_key; if False, we remove the elements in filter_key.
- Returns
Utt2Info object.
- filter_info(filter_key, field=1, keep=True)[source]
Removes elements of Utt2Info by info value
- Parameters
filter_key – List with the info values of the elements to keep or remove.
field – Field number corresponding to the info to filter.
keep – If True, we keep the elements in filter_key; if False, we remove the elements in filter_key.
- Returns
Utt2Info object.
- filter_index(index, keep=True)[source]
Removes elements of Utt2Info by index
- Parameters
index – List with the indices of the elements to keep or remove.
keep – If True, we keep the elements in index; if False, we remove the elements in index.
- Returns
Utt2Info object.
- class hyperion.utils.segment_list.SegmentList(segments, index_by_file=True)[source]
Class to manipulate segment files
- segments
Pandas dataframe.
- _index_by_file
if True the df is indexed by file name, if False by segment id.
- iter_idx
index of the current element for the iterator.
- uniq_file_id
unique file names.
- property index_by_file
- property file_id
- property segment_id
- property tbeg
- property tend
- getitem_by_key(key)[source]
- Accesses the segments by file_id or segment_id, as in a dictionary. If the input is a string key:
segments = SegmentList(…)
segment, tbeg, tend = segments.getitem_by_key('file')
- Parameters
key – Segment or file key
- Returns
If index_by_file is True, it returns the segments of the given file_id in SegmentList format; otherwise it returns a DataFrame.
- getitem_by_index(index)[source]
- Accesses the segments by index, e.g.:
segments = SegmentList(…)
segment, tbeg, tend = segments.getitem_by_index(0)
- Parameters
index – Integer index of the segment or file.
- Returns
If index_by_file is True, it returns the segments of the given file_id in SegmentList format; otherwise it returns a DataFrame.
- __getitem__(key)[source]
- Accesses the segments by file_id or segment_id, as in a dictionary. If the input is a string key:
segments = SegmentList(…)
segment, tbeg, tend = segments['file']
- Parameters
key – Segment or file key
- Returns
If index_by_file is True, it returns the segments of the given file_id in SegmentList format; otherwise it returns a DataFrame.
- save(file_path, sep=' ')[source]
Saves segments to text file.
- Parameters
file_path – File to write the list.
sep – Separator between the fields
- classmethod load(file_path, sep=' ', index_by_file=True)[source]
Loads segment list from text file.
- Parameters
file_path – File to read the list.
sep – Separator between the fields.
- Returns
SegmentList object.
- to_bin_vad(key, frame_shift=10, num_frames=None)[source]
Converts segments to binary VAD
- Parameters
key – Segment or file key
frame_shift – frame_shift in milliseconds
num_frames – number of frames of file corresponding to key, if None it takes the maximum tend for file
- Returns
If index_by_file is True, it returns the VAD joining all the segments of one file; otherwise it returns the VAD for the given segment.
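The conversion from segments to a frame-level binary VAD can be sketched as below. This is a simplified illustration with a hypothetical function name: segments are (tbeg, tend) pairs in seconds, frame_shift is in milliseconds as documented, and the rounding of segment boundaries to frames is an assumption.

```python
def segments_to_bin_vad(segments, frame_shift=10, num_frames=None):
    """Mark the frames covered by any (tbeg, tend) segment as speech."""
    shift = frame_shift / 1000.0  # milliseconds to seconds
    if not segments:
        return [False] * (num_frames or 0)
    if num_frames is None:
        # Use the largest end time, as the documentation describes.
        num_frames = int(round(max(tend for _, tend in segments) / shift))
    vad = [False] * num_frames
    for tbeg, tend in segments:
        first = int(round(tbeg / shift))
        last = min(int(round(tend / shift)), num_frames)
        for i in range(first, last):
            vad[i] = True
    return vad


vad = segments_to_bin_vad([(0.0, 0.03), (0.05, 0.07)], frame_shift=10)
```

Passing num_frames explicitly pads or truncates the mask so it matches the length of an existing feature matrix.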
- class hyperion.utils.rttm.RTTM(segments, index_by_file=True)[source]
Class to manipulate rttm files
- df
Pandas dataframe.
- _index_by_file
if True the df is indexed by file name, if False by segment id.
- iter_idx
index of the current element for the iterator.
- unique_file_key
unique file names.
- classmethod create(segment_type, file_id, chnl=None, tbeg=None, tdur=None, ortho=None, stype=None, name=None, conf=None, slat=None, index_by_file=True)[source]
- classmethod create_spkdiar(file_id, tbeg, tdur, spk_id, conf=None, chnl=None, index_by_file=True, prepend_file_id=False)[source]
- classmethod create_spkdiar_single_file(file_id, tbeg, tdur, spk_id, conf=None, chnl=None, index_by_file=True, prepend_file_id=False)[source]
- classmethod create_spkdiar_from_segments(segments, spk_id, conf=None, chnl=None, index_by_file=True, prepend_file_id=False)[source]
- classmethod create_spkdiar_from_ext_segments(ext_segments, chnl=None, index_by_file=True, prepend_file_id=False)[source]
- property index_by_file
- property file_id
- property tbeg
- property tdur
- property name
- property num_files
- property total_num_spks
- property num_spks_per_file
- property avg_num_spks_per_file
- __getitem__(key)[source]
- Accesses the segments by file_id or segment, as in a dictionary. If the input is a string key:
segments = SegmentList(…)
segment, tbeg, tend = segments['file']
- Parameters
key – Segment or file key
- Returns
If index_by_file is True, it returns the segments of the given file_id in SegmentList format; otherwise it returns a DataFrame.
- save(file_path, sep=' ')[source]
Saves segments to text file.
- Parameters
file_path – File to write the list.
sep – Separator between the fields
- classmethod load(file_path, sep=' ', index_by_file=True)[source]
Loads RTTM from text file.
- Parameters
file_path – File to read the list.
sep – Separator between the fields.
- Returns
RTTM object.
- get_segment_names_from_timestamps(file_id, timestamps, segment_type='SPEAKER', min_seg_dur=0.1)[source]
- get_bin_frame_mask_for_spk(file_id, name, frame_length=0.025, frame_shift=0.01, snip_edges=False, signal_length=None, max_frames=None)[source]
Returns binary mask of a given speaker to select feature frames
- Parameters
file_id – file identifier
name – speaker id
frame_length – frame-length used to compute the VAD
frame_shift – frame-shift used to compute the VAD
snip_edges – if True, computing VAD used snip-edges option
signal_length – total duration of the signal, if None it takes it from the last timestamp
max_frames – expected number of frames, if None it computes automatically
- Returns
Binary VAD np.array
- get_bin_sample_mask_for_spk(file_id, name, fs, signal_length=None, max_samples=None)[source]
Returns binary mask of a given speaker to select waveform samples
- Parameters
file_id – file identifier
name – speaker id
fs – sampling frequency
signal_length – total duration of the signal, if None it takes it from the last timestamp
max_samples – expected number of samples, if None it computes automatically
- Returns
Binary mask np.array
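A sample-level mask follows the same idea as a frame-level one, with the sampling frequency converting times to sample positions. The sketch below is hypothetical: it takes the (tbeg, tdur) pairs of one speaker in one file directly, and it assumes that max_samples, when given, fixes the mask length.

```python
def bin_sample_mask(spk_segments, fs, signal_length=None, max_samples=None):
    """Boolean mask over waveform samples for one speaker's (tbeg, tdur) segments."""
    if signal_length is None:
        # Fall back to the last timestamp of the speaker's segments.
        signal_length = max(tbeg + tdur for tbeg, tdur in spk_segments)
    num_samples = int(round(signal_length * fs))
    if max_samples is not None:
        num_samples = max_samples  # assumption: the expected length wins
    mask = [False] * num_samples
    for tbeg, tdur in spk_segments:
        first = int(round(tbeg * fs))
        last = min(int(round((tbeg + tdur) * fs)), num_samples)
        for i in range(first, last):
            mask[i] = True
    return mask


mask = bin_sample_mask([(0.2, 0.3)], fs=10, signal_length=1.0)
```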
Kaldi Matrix Read/Write Classes
These are classes to read/write text and binary matrices from ARK files. They support the compression methods in Kaldi ARK files.
- class hyperion.utils.kaldi_matrix.KaldiMatrix(data)[source]
Class to read/write uncompressed kaldi matrices/vectors.
When compressed matrix is found in file, it calls KaldiCompressedMatrix class automatically to uncompress.
- data
numpy array with the matrix/vector values.
- property num_rows
- property num_cols
- classmethod read(f, binary, row_offset=0, num_rows=0, sequential_mode=True)[source]
Reads kaldi matrix/vector from file.
- Parameters
f – Python file object
binary – True if we read from binary file and False if we read from text file.
row_offset – Reads matrix starting from a given row instead of row 0.
num_rows – Number of rows to read; if 0, it reads all the rows.
sequential_mode – True if we are reading the ark file sequentially and False if we are using random access.
- Returns
KaldiMatrix object.
- write(f, binary)[source]
Writes matrix/vector to ark file.
- Parameters
f – Python file object.
binary – True if we write in binary file and False if we write to text file.
- static read_shape(f, binary, sequential_mode=True)[source]
Reads the shape of the current matrix/vector in the ark file.
- Parameters
f – Python file object
binary – True if we read from binary file and False if we read from text file.
sequential_mode – True if we are reading the ark file sequentially and False if we are using random access. In sequential_mode=True it moves the file pointer to the next matrix.
- Returns
Tuple object with shape.
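As a rough picture of what reading a shape involves, the sketch below parses the header of an uncompressed binary Kaldi matrix, assuming the standard Kaldi layout: a type token ("FM " for float, "DM " for double) followed by two int32 values, each preceded by a size byte equal to 4. The function is hypothetical, reads from the current file position, and does not handle vectors, text mode or compressed matrices.

```python
import io
import struct


def read_binary_matrix_shape(f):
    """Read (num_rows, num_cols) from an uncompressed binary Kaldi matrix header."""
    token = f.read(3)
    if token not in (b"FM ", b"DM "):
        raise ValueError(f"unsupported matrix token: {token!r}")

    def read_int32():
        # Kaldi writes the integer size (4) in one byte before the value.
        if f.read(1) != b"\x04":
            raise ValueError("expected a 4-byte integer")
        return struct.unpack("<i", f.read(4))[0]

    num_rows = read_int32()
    num_cols = read_int32()
    return num_rows, num_cols


# Build an in-memory header for a 3 x 5 float matrix and read it back.
header = b"FM " + b"\x04" + struct.pack("<i", 3) + b"\x04" + struct.pack("<i", 5)
shape = read_binary_matrix_shape(io.BytesIO(header))
```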
- class hyperion.utils.kaldi_matrix.KaldiCompressedMatrix(data=None)[source]
Class to read/write compressed kaldi matrices.
When compressed matrix is found in file, it calls KaldiCompressedMatrix class automatically to uncompress.
- data
numpy byte array with the compressed coded matrix.
- data_format
{1, 2, 3, 4}
- min_value
Minimum value in the matrix.
- data_range
max_value - min_value
- num_rows
Number of rows in the matrix
- num_columns
Number of columns in the matrix
- get_data_attrs()[source]
- Returns
Coded matrix values in 2D format. Dictionary object with data attributes: data_format, min_value, data_range, percentiles.
- classmethod build_from_data_attrs(data, attrs)[source]
Builds object from coded values and attributes
- Parameters
data – Coded matrix values in 2D format.
attrs – Dictionary object with data attributes: data_format, min_value, data_range, percentiles.
- Returns
KaldiCompressedMatrix object.
- _compute_global_header(mat, method)[source]
Computes the header
- Parameters
mat – numpy array with the uncompressed matrix.
method – Compression method.
- Returns
Byte array with header.
- static _get_read_info(header, row_offset=0, num_rows=0)[source]
Gets info needed to read the matrix from file
- classmethod compress(mat, method='auto')[source]
Creates compressed matrix from uncompressed numpy matrix.
- Parameters
mat – numpy array with the uncompressed matrix.
method – Compression method.
- Returns
KaldiCompressedMatrix object.
- _compute_column_header(v)[source]
Creates the column headers for the speech-feat compression.
- Parameters
v – numpy array with the column to compress.
- Returns
Byte array with the header of the column containing the 0, 25, 75 and 100 percentile values.
- _compress_column(v)[source]
Compress column for the speech-feat compression.
- Parameters
v – numpy array with the column to compress.
- Returns
Byte array with the header of the column containing the 0, 25, 75 and 100 percentile values. Byte array with the coded column.
- _uncompress_column(col_header, col_data)[source]
Uncompresses column for the speech-feat compression.
- Parameters
col_header – Byte array with the header of the column containing the 0, 25, 75 and 100 percentile values.
col_data – Byte array with the coded column.
- Returns
numpy array with the uncompressed column
- static _float_to_char(v, p0, p25, p75, p100)[source]
Codes the column from float to bytes using the given percentiles
- static _char_to_float(v, p0, p25, p75, p100)[source]
Decodes the column from bytes to float using the given percentiles
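The percentile-based coding can be illustrated with the piecewise-linear scheme used by Kaldi's compressed matrices, where codes 0-64 cover [p0, p25), 64-192 cover [p25, p75) and 192-255 cover [p75, p100]. The scalar functions below are a sketch of that scheme with hypothetical names; they assume p0 < p25 < p75 < p100, and hyperion's vectorized implementation may differ in rounding details.

```python
def float_to_char(value, p0, p25, p75, p100):
    """Quantize one float to a byte code in [0, 255] using column percentiles."""
    if value < p25:
        code = int((value - p0) / (p25 - p0) * 64 + 0.5)
        return min(max(code, 0), 64)
    if value < p75:
        code = 64 + int((value - p25) / (p75 - p25) * 128 + 0.5)
        return min(max(code, 64), 192)
    code = 192 + int((value - p75) / (p100 - p75) * 63 + 0.5)
    return min(max(code, 192), 255)


def char_to_float(code, p0, p25, p75, p100):
    """Map a byte code back to an approximate float value."""
    if code <= 64:
        return p0 + (p25 - p0) * code / 64.0
    if code <= 192:
        return p25 + (p75 - p25) * (code - 64) / 128.0
    return p75 + (p100 - p75) * (code - 192) / 63.0


percentiles = (0.0, 1.0, 3.0, 4.0)
code = float_to_char(2.0, *percentiles)
decoded = char_to_float(code, *percentiles)
```

The middle range gets half of the codes, so values between the 25th and 75th percentiles are represented most finely.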
- to_ndarray()[source]
Uncompresses matrix to numpy array.
- Returns
numpy array with uncompressed matrix.
- to_matrix()[source]
Uncompresses matrix to KaldiMatrix object.
- Returns
KaldiMatrix with uncompressed matrix.
- classmethod read(f, binary, row_offset=0, num_rows=0, sequential_mode=True)[source]
Reads kaldi compressed matrix/vector from file.
- Parameters
f – Python file object
binary – True if we read from binary file and False if we read from text file.
row_offset – Reads matrix starting from a given row instead of row 0.
num_rows – Number of rows to read; if 0, it reads all the rows.
sequential_mode – True if we are reading the ark file sequentially and False if we are using random access.
- Returns
KaldiCompressedMatrix object.
- write(f, binary)[source]
Writes matrix/vector to ark file.
- Parameters
f – Python file object.
binary – True if we write in binary file and False if we write to text file.
- static read_shape(f, binary, sequential_mode=True)[source]
Reads the shape of the current matrix/vector in the ark file.
- Parameters
f – Python file object
binary – True if we read from binary file and False if we read from text file.
sequential_mode – True if we are reading the ark file sequentially and False if we are using random access. In sequential_mode=True it moves the file pointer to the next matrix.
- Returns
Tuple object with shape.
Kaldi I/O Functions
Utils to read/write binary ARK files
Copyright 2018 Johns Hopkins University (Author: Jesus Villalba) Apache 2.0 (http://www.apache.org/licenses/LICENSE-2.0)
Functions to write and read kaldi files
- hyperion.utils.kaldi_io_funcs.init_kaldi_output_stream(f, binary)[source]
Writes Kaldi Ark file binary marker.
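For reference, Kaldi marks binary-mode data with the two bytes `\0B`. A minimal sketch of writing that marker, assuming nothing is written in text mode (the function name is hypothetical and this is not the hyperion source):

```python
import io

def write_binary_marker(f, binary):
    """Write the Kaldi binary-mode marker if `binary` is True."""
    if binary:
        f.write(b"\0B")  # a NUL byte followed by the letter 'B'

buf = io.BytesIO()
write_binary_marker(buf, binary=True)
marker = buf.getvalue()
```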
VAD Utils
Functions to manipulate VAD output, convert from binary to timestamps, intersect VADs, etc.
Copyright 2020 Johns Hopkins University (Author: Jesus Villalba) Apache 2.0 (http://www.apache.org/licenses/LICENSE-2.0)
- hyperion.utils.vad_utils.merge_vad_timestamps(in_timestamps, tol=0.001)[source]
Merges vad timestamps that are contiguous
- Parameters
in_timestamps – original time-stamps in start-time, end-time format
tol – tolerance; segments separated by less than tol will be merged
- Returns
Merged timestamps
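The merging rule can be sketched in a few lines of plain Python. This is an illustration rather than the hyperion code: it assumes the segments are sorted by start time, and the function name is hypothetical.

```python
def merge_contiguous(timestamps, tol=0.001):
    """Merge (start, end) segments whose gap to the previous one is below tol."""
    merged = []
    for start, end in timestamps:
        if merged and start - merged[-1][1] < tol:
            # Gap smaller than the tolerance: extend the previous segment
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged

merged = merge_contiguous([[0.0, 1.0], [1.0005, 2.0], [2.5, 3.0]])
```

Here the first two segments are 0.5 ms apart and are joined, while the third starts 0.5 s later and stays separate.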
- hyperion.utils.vad_utils.bin_vad_to_timestamps(vad, frame_length, frame_shift, snip_edges=False, merge_tol=0.001)[source]
Converts binary VAD to a list of start and end time stamps
- Parameters
vad – Binary VAD
frame_length – frame-length used to compute the VAD
frame_shift – frame-shift used to compute the VAD
snip_edges – if True, the VAD was computed with the snip-edges option
merge_tol – tolerance to merge contiguous segments
- Returns
VAD time stamps referred to the beginning of the file
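One way to turn a frame-level decision into segments is to group runs of consecutive speech frames. The sketch below does that under an assumed frame convention (frame i starts at i * frame_shift with snip_edges, and is shifted back by half the frame overlap otherwise); neither the convention nor the function name comes from this page. Runs separated by only a few non-speech frames can still overlap in time when frame_length exceeds frame_shift, so a merge pass may be needed afterwards.

```python
def binary_vad_to_segments(vad, frame_length, frame_shift, snip_edges=False):
    """Group runs of speech frames into (start, end) segments in seconds."""
    # Assumed convention for the start time of frame 0
    offset = 0.0 if snip_edges else -(frame_length - frame_shift) / 2
    segments = []
    run_start = None
    # A trailing non-speech sentinel closes a run that reaches the last frame
    for i, is_speech in enumerate(list(vad) + [0]):
        if is_speech and run_start is None:
            run_start = i
        elif not is_speech and run_start is not None:
            start = max(0.0, offset + run_start * frame_shift)
            end = offset + (i - 1) * frame_shift + frame_length
            segments.append((start, end))
            run_start = None
    return segments

segments = binary_vad_to_segments([0, 1, 1, 0, 0, 1], 0.025, 0.010, snip_edges=True)
```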
- hyperion.utils.vad_utils.vad_timestamps_to_bin(in_timestamps, frame_length, frame_shift, snip_edges=False, signal_length=None, max_frames=None)[source]
Converts VAD time-stamps to a binary vector
- Parameters
in_timestamps – vad timestamps
frame_length – frame-length used to compute the VAD
frame_shift – frame-shift used to compute the VAD
snip_edges – if True, the VAD was computed with the snip-edges option
signal_length – total duration of the signal, if None it takes it from the last timestamp
max_frames – expected number of frames, if None it computes automatically
- Returns
Binary VAD np.array
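The reverse conversion can be illustrated by marking a frame as speech when its centre falls inside a segment. This is a simplified sketch: it assumes frame i starts at i * frame_shift, takes the number of frames as an argument instead of deriving it from signal_length, and uses a hypothetical function name. The library may use a different rule.

```python
def segments_to_binary_vad(timestamps, frame_length, frame_shift, num_frames):
    """Return a list of 0/1 decisions, one per frame."""
    vad = []
    for i in range(num_frames):
        centre = i * frame_shift + frame_length / 2
        # Speech if the frame centre lies inside any [start, end) segment
        vad.append(int(any(start <= centre < end for start, end in timestamps)))
    return vad

vad = segments_to_binary_vad([(0.1, 0.2)], frame_length=0.05,
                             frame_shift=0.05, num_frames=6)
```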
- hyperion.utils.vad_utils.timestamps_wrt_vad_to_absolute_timestamps(in_timestamps, vad_timestamps)[source]
Converts time stamps relative to a signal with silence removed to absolute time stamps in the original signal. The VAD is also provided in start-end timestamps format.
- Parameters
in_timestamps – time stamps relative to a signal with silence removed
vad_timestamps – vad timestamps used to remove silence from signal
- Returns
Absolute VAD time-stamps
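The mapping can be pictured as laying the VAD segments end to end on a silence-free time axis and locating each input segment on that axis. The sketch below follows that idea and splits an input segment that spans several VAD segments. Whether the library splits segments in this way is not stated on this page, so treat the behaviour and the function name as assumptions.

```python
def relative_to_absolute(in_timestamps, vad_timestamps):
    """Map segments on the silence-free axis back to the original time axis."""
    out = []
    offset = 0.0  # speech time accumulated before the current VAD segment
    for vad_start, vad_end in vad_timestamps:
        dur = vad_end - vad_start
        for start, end in in_timestamps:
            # Part of [start, end] that falls inside this VAD segment
            lo = max(start, offset)
            hi = min(end, offset + dur)
            if lo < hi:
                out.append((vad_start + lo - offset, vad_start + hi - offset))
        offset += dur
    return out

absolute = relative_to_absolute([(0.5, 1.5)], [(1.0, 2.0), (4.0, 6.0)])
```

In this example the input covers the last half second of the first speech region and the first half second of the second one.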
- hyperion.utils.vad_utils.timestamps_wrt_bin_vad_to_absolute_timestamps(in_timestamps, vad, frame_length, frame_shift, snip_edges=False)[source]
Converts time stamps relative to a signal with silence removed to absolute time stamps in the original signal. The VAD is provided in binary format.
- Parameters
in_timestamps – time stamps relative to a signal with silence removed
vad – Binary VAD
frame_length – frame-length used to compute the VAD
frame_shift – frame-shift used to compute the VAD
snip_edges – if True, the VAD was computed with the snip-edges option
- Returns
Absolute VAD time-stamps
- hyperion.utils.vad_utils.intersect_segment_timestamps_with_vad(in_timestamps, vad_timestamps)[source]
Intersects a list of segment timestamps with VAD timestamps. It returns only the segments that contain speech, modifying the start and end times to remove silence from the segments.
- Parameters
in_timestamps – time stamps of a list of segments referred to time 0.
vad_timestamps – vad timestamps
- Returns
Boolean array indicating which input segments contain speech.
Array of output segments with silence removed.
Array of indices, one for each output segment, indicating which input speech segment it corresponds to. The indices refer to the input segments after removing those that contain only silence.
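The three return values are easier to follow on a small example. The sketch below is a plain-Python illustration of the described behaviour with a hypothetical function name; it is not the hyperion implementation and returns lists rather than arrays.

```python
def intersect_with_vad(in_timestamps, vad_timestamps):
    """Clip each input segment to the VAD segments it overlaps."""
    has_speech = []    # one flag per input segment
    out_segments = []  # clipped pieces that contain speech
    out_index = []     # index of the speech-bearing input segment per piece
    kept = 0           # counts only the input segments that contain speech
    for start, end in in_timestamps:
        pieces = [(max(start, vs), min(end, ve))
                  for vs, ve in vad_timestamps
                  if max(start, vs) < min(end, ve)]
        has_speech.append(bool(pieces))
        if pieces:
            out_segments.extend(pieces)
            out_index.extend([kept] * len(pieces))
            kept += 1
    return has_speech, out_segments, out_index

segs = [(0.0, 1.0), (1.0, 2.0), (2.0, 4.0)]
vad = [(0.5, 0.75), (2.5, 3.0), (3.5, 5.0)]
has_speech, out_segments, out_index = intersect_with_vad(segs, vad)
```

The second input segment is all silence, so the third input segment is numbered 1 in the index list, and it contributes two output pieces.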
Math Functions
Copyright 2018 Johns Hopkins University (Author: Jesus Villalba) Apache 2.0 (http://www.apache.org/licenses/LICENSE-2.0)
Some math functions.
- hyperion.utils.math.invert_pdmat(A, right_inv=False, return_logdet=False, return_inv=False)[source]
Inversion of positive definite matrices.
Returns lambda function f that multiplies the inverse of A times a vector.
- Parameters
A – Positive definite matrix
right_inv – If False, f(v)=A^{-1}v; if True f(v)=v’ A^{-1}
return_logdet – If True, it also returns the log determinant of A.
return_inv – If True, it also returns A^{-1}
- Returns
Lambda function that multiplies A^{-1} times a vector.
Cholesky transform of A (upper triangular).
Log determinant of A (if return_logdet=True).
A^{-1} (if return_inv=True).
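The idea of returning a function instead of an explicit inverse can be sketched with NumPy alone. This is an illustration under stated assumptions, not the library code: the function name is hypothetical, it always returns the log determinant, and it uses the general solver np.linalg.solve where a dedicated triangular solver would be more efficient.

```python
import numpy as np

def invert_pd(A):
    """Return (f, R, logdet) with f(v) = A^{-1} v and A = R^T R, R upper triangular."""
    L = np.linalg.cholesky(A)  # lower-triangular factor, A = L L^T

    def apply_inverse(v):
        # Solve L y = v, then L^T x = y, so that A x = v
        y = np.linalg.solve(L, v)
        return np.linalg.solve(L.T, y)

    # det(A) = prod(diag(L))^2, hence the factor of 2
    logdet = 2.0 * np.sum(np.log(np.diag(L)))
    return apply_inverse, L.T, logdet

A = np.array([[4.0, 2.0], [2.0, 3.0]])
v = np.array([1.0, 2.0])
f, R, logdet = invert_pd(A)
x = f(v)
```

Working from the Cholesky factor avoids forming A^{-1} explicitly and gives the log determinant at almost no extra cost.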
- hyperion.utils.math.invert_trimat(A, lower=False, right_inv=False, return_logdet=False, return_inv=False)[source]
Inversion of triangular matrices.
Returns lambda function f that multiplies the inverse of A times a vector.
- Parameters
A – Triangular matrix.
lower – if True A is lower triangular, else A is upper triangular.
right_inv – If False, f(v)=A^{-1}v; if True f(v)=v’ A^{-1}
return_logdet – If True, it also returns the log determinant of A.
return_inv – If True, it also returns A^{-1}
- Returns
Lambda function that multiplies A^{-1} times a vector.
Log determinant of A (if return_logdet=True).
A^{-1} (if return_inv=True).
- hyperion.utils.math.fisher_ratio(mu1, Sigma1, mu2, Sigma2)[source]
Computes the Fisher ratio between two classes from the class means and covariances.
- hyperion.utils.math.fisher_ratio_with_precs(mu1, Lambda1, mu2, Lambda2)[source]
Computes the Fisher ratio between two classes from the class means and precisions.
- hyperion.utils.math.symmat2vec(A, lower=False, diag_factor=None)[source]
Puts a symmetric matrix into a vector.
- Parameters
A – Symmetric matrix.
lower – If True, it uses the lower triangular part of the matrix. If False, it uses the upper triangular part of the matrix.
diag_factor – It multiplies the diagonal of A by diag_factor.
- Returns
Vector with the upper or lower triangular part of A.
- hyperion.utils.math.vec2symmat(v, lower=False, diag_factor=None)[source]
Puts a vector back into a symmetric matrix.
- Parameters
v – Vector with the upper or lower triangular part of A.
lower – If True, v contains the lower triangular part of the matrix. If False, v contains the upper triangular part of the matrix.
diag_factor – It multiplies the diagonal of A by diag_factor.
- Returns
Symmetric matrix.
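As an illustration of what symmat2vec and vec2symmat do, the sketch below packs the upper triangle of a symmetric matrix into a vector and rebuilds the matrix from it. It assumes row-major ordering of the upper triangle and ignores diag_factor; the element ordering used by the library is not specified on this page, and the function names are hypothetical.

```python
import numpy as np

def sym_to_vec(A):
    """Pack the upper triangle (including the diagonal) into a 1-D vector."""
    return A[np.triu_indices(A.shape[0])]

def vec_to_sym(v):
    """Rebuild the symmetric matrix from its packed upper triangle."""
    # len(v) = d (d + 1) / 2, solved for the dimension d
    d = int(round((np.sqrt(8 * len(v) + 1) - 1) / 2))
    A = np.zeros((d, d), dtype=v.dtype)
    A[np.triu_indices(d)] = v
    # Mirror the strictly upper part into the lower triangle
    return A + np.triu(A, 1).T

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 5.0],
              [3.0, 5.0, 6.0]])
v = sym_to_vec(A)
A_back = vec_to_sym(v)
```

A d x d symmetric matrix has only d(d+1)/2 free entries, so the packed form roughly halves the storage.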
- hyperion.utils.math.trimat2vec(A, lower=False)[source]
Puts a triangular matrix into a vector.
- Parameters
A – Triangular matrix.
lower – If True, it uses the lower triangular part of the matrix. If False, it uses the upper triangular part of the matrix.
- Returns
Vector with the upper or lower triangular part of A.
- hyperion.utils.math.vec2trimat(v, lower=False)[source]
Puts a vector back into a triangular matrix.
- Parameters
v – Vector with the upper or lower triangular part of A.
lower – If True, v contains the lower triangular part of the matrix. If False, v contains the upper triangular part of the matrix.
- Returns
Triangular matrix.
- hyperion.utils.math.fullcov_varfloor(S, F, F_is_chol=False, lower=False)[source]
Variance flooring for full covariance matrices.
- Parameters
S – Covariance.
F – Minimum covariance or its Cholesky decomposition.
F_is_chol – If True, F is a Cholesky decomposition.
lower – True if the Cholesky factor is lower triangular, False otherwise.
- Returns
Floored covariance
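A common way to floor a full covariance is to whiten S with the floor F, clamp the eigenvalues of the whitened matrix at 1, and map the result back. The sketch below follows that recipe with NumPy; it is an assumption about the technique, not a transcription of the hyperion code, and the function name is hypothetical.

```python
import numpy as np

def floor_full_cov(S, F):
    """Return a covariance S' such that S' - F is positive semi-definite."""
    R = np.linalg.cholesky(F).T  # upper factor, F = R^T R
    R_inv = np.linalg.inv(R)
    # In the whitened space the floor F becomes the identity
    T = R_inv.T @ S @ R_inv
    d, U = np.linalg.eigh(T)
    d = np.maximum(d, 1.0)       # clamp eigenvalues below the floor
    T_floored = (U * d) @ U.T
    # Undo the whitening
    return R.T @ T_floored @ R

S = np.diag([0.5, 4.0])
F = np.eye(2)
S_floored = floor_full_cov(S, F)
```

With an identity floor this reduces to clamping the eigenvalues of S at 1, so the first variance is raised from 0.5 to 1 and the second is left at 4.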
- hyperion.utils.math.fullcov_varfloor_from_cholS(cholS, cholF, lower=False)[source]
Variance flooring for full covariance matrices using Cholesky decomposition as input/output.
- Parameters
cholS – Cholesky decomposition of the covariance.
cholF – Cholesky decomposition of the minimum covariance.
lower – True if matrices are lower triangular, False otherwise
- Returns
Cholesky decomposition of the floored covariance
Miscellaneous Functions
Copyright 2018 Johns Hopkins University (Author: Jesus Villalba) Apache 2.0 (http://www.apache.org/licenses/LICENSE-2.0)
Miscellaneous functions