Input/Output Utilities

The hyperion.io module contains several classes to read and write audio files and features.

Audio Read/Write Classes

Audio Reader Classes

These are classes to read audio files.

class hyperion.io.audio_reader.AudioReader(file_path, segments_path=None, wav_scale=32767)[source]

Class to read audio files from wav, flac or pipe

file_path: scp file with formant file_key wavspecifier (audio_file/pipe) or SCPList object.

segments_path: segments file with format: segment_id file_id tbeg tend

wav_scale: multiplies signal by scale factor

__init__(file_path, segments_path=None, wav_scale=32767)[source]

property keys

__enter__()[source]

Function required when entering constructions of type

    with AudioReader('file.h5') as f:
        keys, data = f.read()

__exit__(exc_type, exc_value, traceback)[source]

Function required when exiting from constructions of type

    with AudioReader('file.h5') as f:
        keys, data = f.read()
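The __enter__/__exit__ pair is what lets a reader be used in a with statement. The sketch below shows the protocol on a hypothetical ToyReader class that only tracks whether it is open; it is a generic Python illustration under that assumption, not the AudioReader implementation.

```python
class ToyReader:
    """Hypothetical stand-in for a reader; it only tracks open/closed state."""

    def __init__(self, file_path):
        self.file_path = file_path
        self.is_open = True

    def read(self):
        if not self.is_open:
            raise ValueError("reader is closed")
        # Dummy payload: one key and one short waveform.
        return ["key1"], [[0.0, 0.1]]

    def close(self):
        self.is_open = False

    def __enter__(self):
        # Called when the with-block starts; the return value is bound to `f`.
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Called when the block ends, even if it raised; release resources here.
        # Returning None lets any exception from the block propagate.
        self.close()


with ToyReader("file.scp") as f:
    keys, data = f.read()

print(keys, f.is_open)
```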

static read_wavspecifier(wavspecifier, scale=32768, time_offset=0, time_dur=0)[source]

Reads a wavspecifier (audio file or pipe). It reads from a pipe or from any file format supported by libsndfile <http://www.mega-nerd.com/libsndfile/#Features>

Parameters

wavspecifier – A pipe, wav, flac, ogg file etc.
scale – Multiplies signal by scale factor
time_offset – float indicating the start time to read in the utterance.
time_dur – float indicating the number of seconds to read from the utterance; if 0, it reads until the end.
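To make the time_offset, time_dur and scale parameters concrete, the sketch below converts a start time and duration into a frame range and applies a scale factor to a few samples. The helper names frame_range and apply_scale are hypothetical, and the rounding and clamping choices are assumptions; the source does not state how read_wavspecifier handles them.

```python
def frame_range(fs, num_frames, time_offset=0.0, time_dur=0.0):
    """Returns (first_frame, frames_to_read) for a file with num_frames samples."""
    # Assumption: times are rounded to the nearest sample index.
    first = min(int(round(time_offset * fs)), num_frames)
    if time_dur == 0:
        # A duration of 0 means "read until the end of the file".
        count = num_frames - first
    else:
        # Assumption: requests past the end are clamped to the available frames.
        count = min(int(round(time_dur * fs)), num_frames - first)
    return first, count


def apply_scale(samples, scale=32768):
    """Multiplies every sample by the scale factor."""
    return [scale * x for x in samples]


# A 10-second file sampled at 16 kHz has 160000 frames.
print(frame_range(16000, 160000, time_offset=1.5, time_dur=2.0))
print(frame_range(16000, 160000, time_offset=9.0))
print(apply_scale([0.5, -1.0], scale=32768))
```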

static read_pipe(wavspecifier, scale=32768)[source]

Reads wave file from a pipe.

Parameters

wavspecifier – Shell command with pipe output.
scale – Multiplies signal by scale factor.

_read_segment(segment, time_offset=0, time_dur=0)[source]

Reads a wave segment

Parameters: segment – pandas DataFrame (segment_id, file_id, tbeg, tend)
Returns: Wave, sampling frequency

read()[source]

class hyperion.io.audio_reader.SequentialAudioReader(file_path, segments_path=None, wav_scale=32767, part_idx=1, num_parts=1)[source]

__init__(file_path, segments_path=None, wav_scale=32767, part_idx=1, num_parts=1)[source]

__iter__()[source]: Needed to build an iterator, e.g.: r = SequentialAudioReader(…) for key, s, fs in r:

print(key) process(s)

__next__()[source]: Needed to build an iterator, e.g.: r = SequentialAudioReader(…) for key , s, fs in r:

process(s)

next()[source]: __next__ for Python 2

reset()[source]: Returns the file pointer to the begining of the dataset, then we can start reading the features again.

eof()[source]

End of file.

Returns: True, when we have read all the recordings in the dataset.
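The way __iter__, __next__, eof and reset fit together can be sketched with a small in-memory class. ToySequentialReader and its sample records are hypothetical; the sketch only illustrates the iteration protocol described above and assumes nothing about how SequentialAudioReader stores or loads its data.

```python
class ToySequentialReader:
    """Hypothetical reader that iterates over in-memory (key, signal, fs) tuples."""

    def __init__(self, records):
        self._records = list(records)
        self._pos = 0

    def __iter__(self):
        # The reader is its own iterator.
        return self

    def __next__(self):
        if self.eof():
            raise StopIteration
        record = self._records[self._pos]
        self._pos += 1
        return record

    def eof(self):
        # True once every record has been returned.
        return self._pos >= len(self._records)

    def reset(self):
        # Move the read position back to the first record.
        self._pos = 0


r = ToySequentialReader([("utt1", [0.1, 0.2], 16000), ("utt2", [0.3], 8000)])
first_pass = [key for key, s, fs in r]
at_end = r.eof()
r.reset()
second_pass = [key for key, s, fs in r]
print(first_pass, at_end, second_pass)
```

Without the call to reset, the second loop would produce nothing, because the reader is already at the end of the dataset.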

read(num_records=0, time_offset=0, time_durs=0)[source]

Reads next num_records audio files

Parameters

num_records – Number of audio files to read.
time_offset – List of floats indicating the start time to read in the utterance.
time_durs – List of floats indicating the number of seconds to read from each utterance

Returns

key – List of recording names.
data – List of waveforms.
fs – List of sampling frequencies.

static filter_args(**kwargs)[source]

static add_class_args(parser, prefix=None)[source]

static add_argparse_args(parser, prefix=None)

__enter__()

Function required when entering constructions of type

    with AudioReader('file.h5') as f:
        keys, data = f.read()

__exit__(exc_type, exc_value, traceback)

Function required when exiting from constructions of type

    with AudioReader('file.h5') as f:
        keys, data = f.read()

_read_segment(segment, time_offset=0, time_dur=0)

Reads a wave segment

Parameters: segment – pandas DataFrame (segment_id, file_id, tbeg, tend)
Returns: Wave, sampling frequency

property keys

static read_pipe(wavspecifier, scale=32768): Reads wave file from a pipe :param wavspecifier: Shell command with pipe output :param scale: Multiplies signal by scale factor

static read_wavspecifier(wavspecifier, scale=32768, time_offset=0, time_dur=0)

Reads a wavspecifier (audio file or pipe). It reads from a pipe or from any file format supported by libsndfile <http://www.mega-nerd.com/libsndfile/#Features>

Parameters

wavspecifier – A pipe, wav, flac, ogg file etc.
scale – Multiplies signal by scale factor
time_offset – float indicating the start time to read in the utterance.
time_dur – float indicating the number of seconds to read from the utterance; if 0, it reads until the end.

class hyperion.io.audio_reader.RandomAccessAudioReader(file_path, segments_path=None, wav_scale=32767)[source]

__init__(file_path, segments_path=None, wav_scale=32767)[source]

_read(keys, time_offset=0, time_durs=0)[source]

Reads the waveforms for the recordings in keys.

Parameters: keys – List of recording names or segment ids.
Returns: data – List of waveforms.

__enter__()

Function required when entering constructions of type

    with AudioReader('file.h5') as f:
        keys, data = f.read()

__exit__(exc_type, exc_value, traceback)

Function required when exiting from constructions of type

    with AudioReader('file.h5') as f:
        keys, data = f.read()

_read_segment(segment, time_offset=0, time_dur=0)

Reads a wave segment

Parameters: segment – pandas DataFrame (segment_id, file_id, tbeg, tend)
Returns: Wave, sampling frequency

property keys

read(keys, time_offset=0, time_durs=0)[source]

Reads the waveforms for the recordings in keys.

Parameters: keys – List of recording names or segment ids.
Returns: data – List of waveforms. fs – List of sampling frequencies.

static read_pipe(wavspecifier, scale=32768): Reads wave file from a pipe :param wavspecifier: Shell command with pipe output :param scale: Multiplies signal by scale factor

static read_wavspecifier(wavspecifier, scale=32768, time_offset=0, time_dur=0)

Reads a wavspecifier (audio file or pipe). It reads from a pipe or from any file format supported by libsndfile <http://www.mega-nerd.com/libsndfile/#Features>

Parameters

wavspecifier – A pipe, wav, flac, ogg file etc.
scale – Multiplies signal by scale factor
time_offset – float indicating the start time to read in the utterance.
time_dur – float indicating the number of seconds to read from the utterance; if 0, it reads until the end.

static filter_args(**kwargs)[source]

static add_class_args(parser, prefix=None)[source]

static add_argparse_args(parser, prefix=None)

Audio Writer Classes

These are classes to write audio files.

class hyperion.io.audio_writer.AudioWriter(output_path, script_path=None, audio_format='wav', audio_subtype=None, scp_sep=' ')[source]

Abstract base class to write audio files.

output_path: output data file path.

script_path: optional output scp file.

audio_format: audio file format

audio_subtype: subtype of audio in [PCM_16, PCM_32, FLOAT, DOUBLE, …], if None, it uses soundfile defaults (recommended)

scp_sep: Separator for scp files (default ‘ ‘).

__init__(output_path, script_path=None, audio_format='wav', audio_subtype=None, scp_sep=' ')[source]

__enter__()[source]

Function required when entering constructions of type

    with AudioWriter('./path') as f:
        f.write(key, data)

__exit__(exc_type, exc_value, traceback)[source]

Function required when exiting from constructions of type

    with AudioWriter('./path') as f:
        f.write(key, data)

close()[source]: Closes the script file if open

write(keys, data, fs)[source]

Writes waveform to audio file.

Parameters

keys – List of recording names.
data – List of waveforms.
fs – Sampling frequency.

static filter_args(**kwargs)[source]

static add_class_args(parser, prefix=None)[source]

static add_argparse_args(parser, prefix=None)

Features Read/Write Classes

These are classes to read and write feature files in ARK or HDF5 format.

Feature Reader/Writer Factory Classes

These are factory classes that generate data reader or writer objects.

class hyperion.io.data_rw_factory.DataWriterFactory[source]

Class to create objects that write data to hdf5/ark files.

static create(wspecifier, compress=False, compression_method='auto', scp_sep=' ')[source]

static filter_args(**kwargs)[source]

static add_class_args(parser, prefix=None)[source]

class hyperion.io.data_rw_factory.SequentialDataReaderFactory[source]

static create(rspecifier, path_prefix=None, scp_sep=' ', **kwargs)[source]

static filter_args(**kwargs)[source]

static add_class_args(parser, prefix=None)[source]

class hyperion.io.data_rw_factory.RandomAccessDataReaderFactory[source]

static create(rspecifier, path_prefix=None, transform=None, scp_sep=' ')[source]

static filter_args(**kwargs)[source]

static add_class_args(parser, prefix=None)[source]

static add_argparse_args(parser, prefix=None)

Functions to write and read Kaldi files

class hyperion.io.rw_specifiers.ArchiveType(value)[source]

Types of archive: hdf5, Kaldi Ark or packed-audio files.

H5 = 0

ARK = 1

AUDIO = 2

SEGMENT_LIST = 3

RTTM = 4

class hyperion.io.rw_specifiers.WSpecType(value)[source]

Types of Kaldi-style write specifiers.

NO = 0

ARCHIVE = 1

SCRIPT = 2

BOTH = 3

class hyperion.io.rw_specifiers.WSpecifier(spec_type, archive, script, archive_type=ArchiveType.H5, binary=True, flush=False, permissive=False)[source]

Class to parse Kaldi-style write specifiers.

spec_type: WSpecType object describing the type of specfier: ARCHIVE: Specifier contains Ark or hdf5 file. SCRIPT: Specifier contains scp file. BOTH: Specifier contains Ark/hdf5 file and scp file.

archive: output data file path.

script: optional output scp file.

archive_type: type of data files. ARK: Kaldi Ark file. H5: hdf5 file.

binary: True if the the Ark file is binary, False if it is text file.

flush: If True, it flushes the output after writing each feature matrix.

permissive: when writing to an scp file only: will ignore missing scp entries

__init__(spec_type, archive, script, archive_type=ArchiveType.H5, binary=True, flush=False, permissive=False)[source]

classmethod create(wspecifier)[source]

Creates WSpecifier object from string.

Parameters: wspecifier – Write specifier string, e.g.:

    file.h5
    h5:file.h5
    ark:file.ark
    h5,scp:file.h5,file.scp
    ark,scp:file.ark,file.scp
Returns: WSpecifier object.
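To show how the example strings above decompose, here is a minimal sketch that splits a write specifier into an archive type, an archive path and an optional script path. The function parse_wspecifier and the dictionary it returns are hypothetical; the sketch handles only the five forms listed above, infers the type of a bare path from its extension as an assumption, and ignores the binary, flush and permissive options. It is not the parsing code used by WSpecifier.create.

```python
import os


def parse_wspecifier(wspecifier):
    """Splits a write specifier string into its parts (illustrative only)."""
    if ":" not in wspecifier:
        # Bare path such as "file.h5": assume the extension names the archive type.
        archive_type = os.path.splitext(wspecifier)[1].lstrip(".")
        return {"archive_type": archive_type, "archive": wspecifier, "script": None}

    options, paths = wspecifier.split(":", 1)
    options = options.split(",")
    paths = paths.split(",")
    if "scp" in options[1:]:
        # "h5,scp:file.h5,file.scp": archive path first, script path second.
        archive, script = paths
    else:
        # "h5:file.h5": a single archive path and no script file.
        (archive,) = paths
        script = None
    return {"archive_type": options[0], "archive": archive, "script": script}


examples = [
    "file.h5",
    "h5:file.h5",
    "ark:file.ark",
    "h5,scp:file.h5,file.scp",
    "ark,scp:file.ark,file.scp",
]
for spec in examples:
    print(spec, "->", parse_wspecifier(spec))
```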

__eq__(other)[source]: Equal operator.

__ne__(other)[source]: Non-equal operator.

__cmp__(other)[source]: Comparison operator.

class hyperion.io.rw_specifiers.RSpecType(value)[source]

An enumeration.

NO = 0

ARCHIVE = 1

SCRIPT = 2

class hyperion.io.rw_specifiers.RSpecifier(spec_type, archive, archive_type=ArchiveType.H5, once=False, is_sorted=False, called_sorted=False, permissive=False, background=False)[source]

__init__(spec_type, archive, archive_type=ArchiveType.H5, once=False, is_sorted=False, called_sorted=False, permissive=False, background=False)[source]

property script

classmethod create(rspecifier)[source]

Feature Reader Classes

ARK Feature Reader Classes

class hyperion.io.ark_data_reader.SequentialArkDataReader(file_path, **kwargs)[source]

Abstract base class to read Ark feature files in sequential order.

Attributes:

file_path: ark or scp file to read.

transform: TransformList object, applies a transformation to the features after reading them from disk.

part_idx: It splits the input into num_parts and reads only part part_idx, where part_idx=1,…,num_parts.

num_parts: Number of parts to split the input data.

split_by_key: If True, all the elements with the same key go to the same part.
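The part_idx/num_parts pair is typically used to let several jobs each process one slice of a dataset. The sketch below shows one possible way to pick the slice for a 1-based part_idx. The function select_part is hypothetical, the contiguous split is an assumption, and split_by_key is not modeled; the source does not state which splitting strategy the readers use.

```python
def select_part(keys, part_idx, num_parts):
    """Returns the part_idx-th (1-based) of num_parts contiguous chunks of keys."""
    if not 1 <= part_idx <= num_parts:
        raise ValueError("part_idx must be in 1..num_parts")
    n = len(keys)
    # Integer arithmetic spreads the remainder so part sizes differ by at most one.
    start = (part_idx - 1) * n // num_parts
    end = part_idx * n // num_parts
    return keys[start:end]


keys = ["utt%d" % i for i in range(1, 8)]
parts = [select_part(keys, i, 3) for i in (1, 2, 3)]
for i, part in enumerate(parts, start=1):
    print(i, part)
```

Every key lands in exactly one part, so concatenating the parts in order reproduces the original list.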

__init__(file_path, **kwargs)[source]

Abstract base class to read Ark or hdf5 feature files.

file_path: h5, ark or scp file to read.

transform: TransformList object, applies a transformation to the features after reading them from disk.

permissive: If True, if the data that we want to read is not in the file it returns an empty matrix, if False it raises an exception.

close()[source]: Closes input file.

_seek(offset)[source]

Moves the pointer of the input file.

Parameters: offset – Byte where we want to put the pointer.

_open_archive(file_path, offset=0)[source]

Opens the current file if it is not open and moves the file pointer to a given position. Closes previously open Ark files.

Parameters

file_path – File from which we want to read the next feature matrix.
offset – Byte position where feature matrix is in the file.

read_num_rows(num_records=0, assert_same_dim=True)[source]

Reads the number of rows in the feature matrices of the dataset.

Parameters

num_records – How many matrix shapes to read; if num_records=0, it reads all the matrices in the dataset.
assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

List of num_records recording names. Integer numpy array with num_records number of rows.

read_dims(num_records=0, assert_same_dim=True)[source]

Reads the number of columns in the feature matrices of the dataset.

Parameters

num_records – How many matrix shapes to read; if num_records=0, it reads all the matrices in the dataset.
assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

List of num_records recording names. Integer numpy array with num_records number of columns.

__enter__()

Function required when entering constructions of type

    with DataReader('file.h5') as f:
        keys, data = f.read()

__exit__(exc_type, exc_value, traceback)

Function required when exiting from constructions of type

    with DataReader('file.h5') as f:
        keys, data = f.read()

__iter__(): Needed to build an iterator, e.g.: r = SequentialDataReader(…) for key, data in r:

print(key, data)

__next__(): Needed to build an iterator, e.g.: r = SequentialDataReader(…) for key, data in r:

print(key, data)

static _apply_range_to_shape(shape, row_offset, num_rows)

Modifies shape given the user-defined row_offset and num_rows to read. If we are reading a matrix of shape (100,4) and row_offset=10, num_rows=20, it returns (20,4). If row_offset=20, num_rows=0, it returns (80,4).

Parameters

shape – Original shape of the feature matrix.
row_offset – User defined row_offset, first frame to read.
num_rows – User defined num_rows, number of frames to read.

Returns

2D tuple with modified shape.
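The two cases in the description can be reproduced with a few lines of Python. The function below is a hypothetical reimplementation written for illustration only; clamping num_rows to the rows that remain and raising ValueError for an out-of-range offset are assumptions, since the source does not describe the error handling of _apply_range_to_shape.

```python
def apply_range_to_shape(shape, row_offset=0, num_rows=0):
    """Returns the shape of the submatrix selected by row_offset and num_rows."""
    rows = shape[0]
    if not 0 <= row_offset <= rows:
        raise ValueError("row_offset is outside the matrix")
    # Rows left after skipping the first row_offset rows.
    remaining = rows - row_offset
    if num_rows > 0:
        # Assumption: never report more rows than are actually available.
        remaining = min(remaining, num_rows)
    return (remaining,) + tuple(shape[1:])


print(apply_range_to_shape((100, 4), row_offset=10, num_rows=20))
print(apply_range_to_shape((100, 4), row_offset=20, num_rows=0))
```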

static _combine_ranges(read_range, row_offset, num_rows)

Combines two frame ranges.

One is the range in the scp file, e.g., in the scp file

    recording1 file1.ark:34[3:40]
    recording2 file1.ark:100[5:20]

[3:40] and [5:20] are frame ranges.

The user can decide to read just a submatrix of that, e.g., read 10 rows starting at row_offset 1. If we combine that with the range [3:40], the function returns row_offset=4 (3+1) and num_rows=10.

Parameters

read_range – Frame range from scp file. It is a tuple with the first row and number of rows to read.
row_offset – User defined row_offset.
num_rows – User defined number of rows to read; if it is 0, we read all the rows defined in the scp read_range.

Returns

Combined row_offset, first row of the recording to read. Combined number of rows (frames) to read.
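The combination rule can be sketched as follows, taking read_range as a (first row, number of rows) tuple as stated in the parameter description. The function is a hypothetical reimplementation for illustration, the ValueError checks are assumptions, and the tuple (3, 38) is an illustrative value rather than a claim about how the scp range [3:40] is converted internally.

```python
def combine_ranges(read_range, row_offset=0, num_rows=0):
    """Combines an scp frame range with a user-requested sub-range."""
    first_row, range_rows = read_range
    if row_offset >= range_rows:
        raise ValueError("row_offset is outside the scp range")
    # The user offset is relative to the start of the scp range.
    combined_offset = first_row + row_offset
    available = range_rows - row_offset
    if num_rows == 0:
        # 0 means "everything that is left in the scp range".
        combined_rows = available
    elif num_rows > available:
        raise ValueError("requested more rows than the scp range contains")
    else:
        combined_rows = num_rows
    return combined_offset, combined_rows


print(combine_ranges((3, 38), row_offset=1, num_rows=10))
print(combine_ranges((3, 38), row_offset=1, num_rows=0))
```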

static _squeeze(data, permissive=False)

Converts a list of matrices to a 3D numpy array, or a list of vectors to a 2D numpy array.

Parameters

data – List of matrices or vectors.
permissive – If True, an empty matrix/vector in data is replaced by an all-zero matrix/vector. If False, it raises an exception.

Returns

2D or 3D numpy array.

abstract eof()

End of file.

Returns: True, when we have read all the recordings in the dataset.

next(): __next__ for Python 2

abstract read(num_records=0, squeeze=False, offset=0, num_rows=0)

Reads next num_records feature matrices/vectors.

Parameters

num_records – Number of feature matrices to read.
squeeze – If True, it converts the list of matrices/vectors to a 3D/2D numpy array. All matrices need to have the same number of rows.
offset – List of integers or numpy array with the first row to read from each feature matrix.
num_rows – List of integers or numpy array with the number of rows to read from each feature matrix. If 0, it reads all the rows.

Returns

key – List of recording names.
data – List of feature matrices/vectors or 3D/2D numpy array.

abstract read_shapes(num_records=0, assert_same_dim=True)

Reads the shapes in the feature matrices of the dataset.

Parameters

num_records – How many matrix shapes to read; if num_records=0, it reads all the matrices in the dataset.
assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

List of num_records recording names. List of tuples with num_records shapes.

abstract reset(): Returns the file pointer to the begining of the dataset, then we can start reading the features again.

class hyperion.io.ark_data_reader.SequentialArkFileDataReader(file_path, **kwargs)[source]

Class to read feature matrices/vectors in sequential order from a single Ark file.

Attributes:

file_path: Ark file to read.

transform: TransformList object, applies a transformation to the features after reading them from disk.

part_idx: It splits the input into num_parts and reads only part part_idx, where part_idx=1,…,num_parts.

num_parts: Number of parts to split the input data.

split_by_key: If True, all the elements with the same key go to the same part.

__init__(file_path, **kwargs)[source]

Abstract base class to read Ark or hdf5 feature files.

file_path: h5, ark or scp file to read.

transform: TransformList object, applies a transformation to the features after reading them from disk.

permissive: If True, if the data that we want to read is not in the file it returns an empty matrix, if False it raises an exception.

reset()[source]: Puts the file pointer back to the begining of the file

eof()[source]: Returns True when it reaches the end of the ark file.

property keys

read_shapes(num_records=0, assert_same_dim=True)[source]

Reads the shapes in the feature matrices of the dataset.

Parameters

num_records – How many matrix shapes to read; if num_records=0, it reads all the matrices in the dataset.
assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

List of num_records recording names. List of tuples with num_records shapes.

read(num_records=0, squeeze=False, row_offset=0, num_rows=0)[source]

Reads next num_records feature matrices/vectors.

Parameters

num_records – Number of feature matrices to read.
squeeze – If True, it converts the list of matrices/vectors to a 3D/2D numpy array. All matrices need to have the same number of rows.
row_offset – List of integers or numpy array with the first row to read from each feature matrix.
num_rows – List of integers or numpy array with the number of rows to read from each feature matrix. If 0, it reads all the rows.

Returns

key – List of recording names.
data – List of feature matrices/vectors or 3D/2D numpy array.

__enter__()

Function required when entering constructions of type

    with DataReader('file.h5') as f:
        keys, data = f.read()

__exit__(exc_type, exc_value, traceback)

Function required when exiting from constructions of type

    with DataReader('file.h5') as f:
        keys, data = f.read()

__iter__(): Needed to build an iterator, e.g.: r = SequentialDataReader(…) for key, data in r:

print(key, data)

__next__(): Needed to build an iterator, e.g.: r = SequentialDataReader(…) for key, data in r:

print(key, data)

static _apply_range_to_shape(shape, row_offset, num_rows)

Modifies shape given the user-defined row_offset and num_rows to read. If we are reading a matrix of shape (100,4) and row_offset=10, num_rows=20, it returns (20,4). If row_offset=20, num_rows=0, it returns (80,4).

Parameters

shape – Original shape of the feature matrix.
row_offset – User defined row_offset, first frame to read.
num_rows – User defined num_rows, number of frames to read.

Returns

2D tuple with modified shape.

static _combine_ranges(read_range, row_offset, num_rows)

Combines two frame ranges.

One is the range in the scp file, e.g., in the scp file

    recording1 file1.ark:34[3:40]
    recording2 file1.ark:100[5:20]

[3:40] and [5:20] are frame ranges.

The user can decide to read just a submatrix of that, e.g., read 10 rows starting at row_offset 1. If we combine that with the range [3:40], the function returns row_offset=4 (3+1) and num_rows=10.

Parameters

read_range – Frame range from scp file. It is a tuple with the first row and number of rows to read.
row_offset – User defined row_offset.
num_rows – User defined number of rows to read; if it is 0, we read all the rows defined in the scp read_range.

Returns

Combined row_offset, first row of the recording to read. Combined number of rows (frames) to read.

_open_archive(file_path, offset=0)

Opens the current file if it is not open and moves the file pointer to a given position. Closes previously open Ark files.

Parameters

file_path – File from which we want to read the next feature matrix.
offset – Byte position where feature matrix is in the file.

_seek(offset)

Moves the pointer of the input file.

Parameters: offset – Byte where we want to put the pointer.

static _squeeze(data, permissive=False)

Converts a list of matrices to a 3D numpy array, or a list of vectors to a 2D numpy array.

Parameters

data – List of matrices or vectors.
permissive – If True, an empty matrix/vector in data is replaced by an all-zero matrix/vector. If False, it raises an exception.

Returns

2D or 3D numpy array.

close(): Closes input file.

next(): __next__ for Python 2

read_dims(num_records=0, assert_same_dim=True)

Reads the number of columns in the feature matrices of the dataset.

Parameters

num_records – How many matrix shapes to read; if num_records=0, it reads all the matrices in the dataset.
assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

List of num_records recording names. Integer numpy array with num_records number of columns.

read_num_rows(num_records=0, assert_same_dim=True)

Reads the number of rows in the feature matrices of the dataset.

Parameters

num_records – How many matrix shapes to read; if num_records=0, it reads all the matrices in the dataset.
assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

List of num_records recording names. Integer numpy array with num_records number of rows.

class hyperion.io.ark_data_reader.SequentialArkScriptDataReader(file_path, path_prefix=None, scp_sep=' ', **kwargs)[source]

Class to read Ark feature files indexed by a scp file in sequential order.

Attributes:

file_path: scp file to read.

path_prefix: If input_spec is a scp file, it prepends the path_prefix string to the second column of the scp file. This is useful when data is read from a directory different from the one where it was created.

scp_sep: Separator for scp files (default ' ').

transform: TransformList object, applies a transformation to the features after reading them from disk.

part_idx: It splits the input into num_parts and reads only part part_idx, where part_idx=1,…,num_parts.

num_parts: Number of parts to split the input data.

split_by_key: If True, all the elements with the same key go to the same part.
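For orientation, the sketch below splits one scp entry into its key, archive path, byte offset and frame range, and shows where path_prefix and scp_sep come into play. The function parse_scp_line is hypothetical and the sample entry follows the file.ark:offset[first:last] pattern; prepending path_prefix by plain string concatenation is an assumption, and the frame range is returned exactly as written in the entry without interpreting it.

```python
def parse_scp_line(line, path_prefix=None, scp_sep=" "):
    """Splits an scp entry into (key, path, byte_offset, frame_range)."""
    key, spec = line.strip().split(scp_sep, 1)

    # Optional trailing frame range, e.g. "[3:40]".
    frame_range = None
    if spec.endswith("]"):
        spec, range_str = spec[:-1].rsplit("[", 1)
        first, last = range_str.split(":")
        frame_range = (int(first), int(last))

    # Optional byte offset after the last colon, e.g. ":34".
    offset = None
    path, sep, tail = spec.rpartition(":")
    if sep and tail.isdigit():
        offset = int(tail)
    else:
        path = spec

    if path_prefix is not None:
        # Assumption: the prefix is prepended as a plain string.
        path = path_prefix + path
    return key, path, offset, frame_range


print(parse_scp_line("recording1 file1.ark:34[3:40]"))
print(parse_scp_line("recording2 file1.ark:100", path_prefix="/mnt/feats/"))
```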

__init__(file_path, path_prefix=None, scp_sep=' ', **kwargs)[source]

Abstract base class to read Ark or hdf5 feature files.

file_path: h5, ark or scp file to read.

transform: TransformList object, applies a transformation to the features after reading them from disk.

permissive: If True, if the data that we want to read is not in the file it returns an empty matrix, if False it raises an exception.

property keys

reset()[source]: Closes all the open Ark files and puts the read pointer pointing to the first element in the scp file.

eof()[source]: Returns True when all the elements in the scp have been read.

read_shapes(num_records=0, assert_same_dim=True)[source]

Reads the shapes in the feature matrices of the dataset.

Parameters

num_records – How many matrix shapes to read; if num_records=0, it reads all the matrices in the dataset.
assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

List of num_records recording names. List of tuples with num_records shapes.

read(num_records=0, squeeze=False, row_offset=0, num_rows=0)[source]

Reads next num_records feature matrices/vectors.

Parameters

num_records – Number of feature matrices to read.
squeeze – If True, it converts the list of matrices/vectors to a 3D/2D numpy array. All matrices need to have the same number of rows.
row_offset – List of integers or numpy array with the first row to read from each feature matrix.
num_rows – List of integers or numpy array with the number of rows to read from each feature matrix. If 0, it reads all the rows.

Returns

key – List of recording names.
data – List of feature matrices/vectors or 3D/2D numpy array.

__enter__()

Function required when entering constructions of type

    with DataReader('file.h5') as f:
        keys, data = f.read()

__exit__(exc_type, exc_value, traceback)

Function required when exiting from constructions of type

    with DataReader('file.h5') as f:
        keys, data = f.read()

__iter__(): Needed to build an iterator, e.g.: r = SequentialDataReader(…) for key, data in r:

print(key, data)

__next__(): Needed to build an iterator, e.g.: r = SequentialDataReader(…) for key, data in r:

print(key, data)

static _apply_range_to_shape(shape, row_offset, num_rows)

Modifies shape given the user-defined row_offset and num_rows to read. If we are reading a matrix of shape (100,4) and row_offset=10, num_rows=20, it returns (20,4). If row_offset=20, num_rows=0, it returns (80,4).

Parameters

shape – Original shape of the feature matrix.
row_offset – User defined row_offset, first frame to read.
num_rows – User defined num_rows, number of frames to read.

Returns

2D tuple with modified shape.

static _combine_ranges(read_range, row_offset, num_rows)

Combines two frame ranges.

One is the range in the scp file, e.g., in the scp file

    recording1 file1.ark:34[3:40]
    recording2 file1.ark:100[5:20]

[3:40] and [5:20] are frame ranges.

The user can decide to read just a submatrix of that, e.g., read 10 rows starting at row_offset 1. If we combine that with the range [3:40], the function returns row_offset=4 (3+1) and num_rows=10.

Parameters

read_range – Frame range from scp file. It is a tuple with the first row and number of rows to read.
row_offset – User defined row_offset.
num_rows – User defined number of rows to read; if it is 0, we read all the rows defined in the scp read_range.

Returns

Combined row_offset, first row of the recording to read. Combined number of rows (frames) to read.

_open_archive(file_path, offset=0)

Opens the current file if it is not open and moves the file pointer to a given position. Closes previously open Ark files.

Parameters

file_path – File from which we want to read the next feature matrix.
offset – Byte position where feature matrix is in the file.

_seek(offset)

Moves the pointer of the input file.

Parameters: offset – Byte where we want to put the pointer.

static _squeeze(data, permissive=False)

Converts a list of matrices to a 3D numpy array, or a list of vectors to a 2D numpy array.

Parameters

data – List of matrices or vectors.
permissive – If True, an empty matrix/vector in data is replaced by an all-zero matrix/vector. If False, it raises an exception.

Returns

2D or 3D numpy array.

close(): Closes input file.

next(): __next__ for Python 2

read_dims(num_records=0, assert_same_dim=True)

Reads the number of columns in the feature matrices of the dataset.

Parameters

num_records – How many matrix shapes to read; if num_records=0, it reads all the matrices in the dataset.
assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

List of num_records recording names. Integer numpy array with num_records number of columns.

read_num_rows(num_records=0, assert_same_dim=True)

Reads the number of rows in the feature matrices of the dataset.

Parameters

num_records – How many matrix shapes to read; if num_records=0, it reads all the matrices in the dataset.
assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

List of num_records recording names. Integer numpy array with num_records number of rows.

class hyperion.io.ark_data_reader.RandomAccessArkDataReader(file_path, path_prefix=None, transform=None, permissive=False, scp_sep=' ')[source]

Class to read Ark files in random order, using scp file to index the Ark files.

Attributes:

file_path: scp file to read.

path_prefix: If input_spec is a scp file, it prepends the path_prefix string to the second column of the scp file. This is useful when data is read from a directory different from the one where it was created.

transform: TransformList object, applies a transformation to the features after reading them from disk.

permissive: If True, it returns an empty matrix when the requested data is not in the file; if False, it raises an exception.

scp_sep: Separator for scp files (default ' ').

__init__(file_path, path_prefix=None, transform=None, permissive=False, scp_sep=' ')[source]

Abstract base class to read Ark or hdf5 feature files in random order.

file_path: h5 or scp file to read.

transform: TransformList object, applies a transformation to the features after reading them from disk.

permissive: If True, if the data that we want to read is not in the file it returns an empty matrix, if False it raises an exception.

property keys

close()[source]: Closes all the open Ark files.

_open_archive(key_idx, offset=0)[source]

Opens the Ark file corresponding to a given feature matrix if it is not already open and moves the file pointer to the point where we can read that feature matrix.

If the file was already open, it only moves the file pointer.

Parameters

key_idx – Integer position of the feature matrix in the scp file.
offset – Byte where we can find the feature matrix in the Ark file.

Returns

Python file object.
threading.Lock object corresponding to the file.

read_num_rows(keys, assert_same_dim=True)[source]

Reads the number of rows in the feature matrices of the dataset.

Parameters

keys – List of recording names from which we want to retrieve the number of rows.
assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

Integer numpy array with the number of rows for the recordings in keys.

read_dims(keys, assert_same_dim=True)[source]

Reads the number of columns in the feature matrices of the dataset.

Parameters

keys – List of recording names from which we want to retrieve the number of columns.
assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

Integer numpy array with the number of columns for the recordings in keys

read_shapes(keys, assert_same_dim=True)[source]

Reads the shapes in the feature matrices of the dataset.

Parameters

keys – List of recording names from which we want to retrieve the shapes.
assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

List of tuples with the shapes for the recordings in keys.

__enter__()

Function required when entering constructions of type

    with DataReader('file.h5') as f:
        keys, data = f.read()

__exit__(exc_type, exc_value, traceback)

Function required when exiting from constructions of type

    with DataReader('file.h5') as f:
        keys, data = f.read()

static _apply_range_to_shape(shape, row_offset, num_rows)

Modifies shape given the user-defined row_offset and num_rows to read. If we are reading a matrix of shape (100,4) and row_offset=10, num_rows=20, it returns (20,4). If row_offset=20, num_rows=0, it returns (80,4).

Parameters

shape – Original shape of the feature matrix.
row_offset – User defined row_offset, first frame to read.
num_rows – User defined num_rows, number of frames to read.

Returns

2D tuple with modified shape.
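
The rule described above can be written in a few lines. This is a sketch that reproduces the two documented examples; the function name is hypothetical and no bounds checking is done, which the real method may handle differently.

```python
def apply_range_to_shape(shape, row_offset, num_rows):
    """Return the shape obtained when reading num_rows rows from row_offset.

    num_rows == 0 means "read from row_offset up to the last row".
    """
    shape = list(shape)
    if num_rows == 0:
        # Read everything that is left after skipping row_offset rows.
        shape[0] -= row_offset
    else:
        shape[0] = num_rows
    return tuple(shape)


print(apply_range_to_shape((100, 4), 10, 20))  # (20, 4)
print(apply_range_to_shape((100, 4), 20, 0))   # (80, 4)
```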

static _combine_ranges(read_range, row_offset, num_rows)

Combines two frame ranges.

One is the range given in the scp file. For example, in the scp file

recording1 file1.ark:34[3:40]
recording2 file1.ark:100[5:20]

[3:40] and [5:20] are frame ranges.

The other is the range requested by the user, who can decide to read just a submatrix of that, e.g., 10 rows starting at row_offset 1. Combining that with the range [3:40], the function returns row_offset=4 (3+1) and num_rows=10.

Parameters

read_range – Frame range from the scp file. It is a tuple with the first row and the number of rows to read.
row_offset – User-defined row_offset.
num_rows – User-defined number of rows to read. If it is 0, we read all the rows defined in the scp read_range.

Returns

Combined row_offset, first row of the recording to read.
Combined number of rows (frames) to read.
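
A minimal sketch of how two such ranges might be combined is shown below. It assumes read_range is a (first_row, num_rows) tuple as stated in the parameter list, and that a row count of 0 means "up to the end". The function name and the error handling are hypothetical.

```python
def combine_ranges(read_range, row_offset=0, num_rows=0):
    """Combine the scp range with the user range into (first_row, num_rows)."""
    first_row, range_rows = read_range
    if range_rows > 0 and row_offset >= range_rows:
        raise ValueError("row_offset is beyond the range defined in the scp file")
    combined_offset = first_row + row_offset
    if num_rows == 0:
        # Read whatever is left of the scp range (0 keeps meaning "to the end").
        combined_rows = range_rows - row_offset if range_rows > 0 else 0
    else:
        if range_rows > 0 and row_offset + num_rows > range_rows:
            raise ValueError("requested rows exceed the range in the scp file")
        combined_rows = num_rows
    return combined_offset, combined_rows


# scp range starting at row 3 with 38 rows; user asks for 10 rows from offset 1.
print(combine_ranges((3, 38), row_offset=1, num_rows=10))  # (4, 10)
```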

static _squeeze(data, permissive=False)

Converts a list of matrices to a 3D numpy array, or a list of vectors to a 2D numpy array.

Parameters

data – List of matrices or vectors.
permissive – If True, an empty matrix/vector in data is replaced by an all-zeros matrix/vector. If False, an exception is raised.

Returns

2D or 3D numpy array.
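
The stacking behaviour, including the permissive replacement of empty items, can be illustrated with plain numpy. This is a sketch under the assumption that the first non-empty item defines the reference shape; the function name is hypothetical and the library may choose the reference differently.

```python
import numpy as np


def squeeze(data, permissive=False):
    """Stack equally shaped arrays along a new leading axis."""
    # Use the first non-empty item as the reference for shape and dtype.
    ref = next((x for x in data if x.size > 0), None)
    if ref is None:
        raise ValueError("all the items are empty")
    stacked = []
    for i, x in enumerate(data):
        if x.size == 0:
            if not permissive:
                raise ValueError(f"item {i} is empty")
            x = np.zeros_like(ref)  # substitute an all-zeros item
        elif x.shape != ref.shape:
            raise ValueError(f"item {i} has shape {x.shape}, expected {ref.shape}")
        stacked.append(x)
    return np.stack(stacked)


mats = [np.ones((5, 4)), np.array([]), np.ones((5, 4))]
print(squeeze(mats, permissive=True).shape)  # (3, 5, 4)
```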

read(keys, squeeze=False, row_offset=0, num_rows=0)[source]

Reads the feature matrices/vectors for the recordings in keys.

Parameters

keys – List of recording names from which we want to retrieve the feature matrices/vectors.
squeeze – If True, it converts the list of matrices/vectors to a 3D/2D numpy array. All matrices need to have the same number of rows.
row_offset – List of integers or numpy array with the first row to read from each feature matrix.
num_rows – List of integers or numpy array with the number of rows to read from each feature matrix. If 0, it reads all the rows.

Returns

data – List of feature matrices/vectors or 3D/2D numpy array.

HDF5 Feature Reader Classes

Classes to read data from hdf5 files.

hyperion.io.h5_data_reader._read_h5_data(dset, row_offset=0, num_rows=0, transform=None)[source]

Auxiliary function to read the feature matrix from an hdf5 dataset. It decompresses the data if it was compressed.

Parameters

dset – hdf5 dataset corresponding to a feature matrix/vector.
row_offset – First row to read from each feature matrix.
num_rows – Number of rows to read from the feature matrix. If 0 it reads all the rows.
transform – TransformList object, applies a transformation to the features after reading them from disk.

Returns

Numpy array with feature matrix/vector.
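
The row selection and the optional transform can be sketched with a numpy array standing in for the hdf5 dataset (h5py datasets accept the same slice syntax). The function name is hypothetical, the decompression step is omitted, and treating transform as a plain callable is an assumption; the TransformList object may expose a different interface.

```python
import numpy as np


def read_rows(dset, row_offset=0, num_rows=0, transform=None):
    """Read num_rows rows starting at row_offset; 0 means all remaining rows."""
    if num_rows == 0:
        data = dset[row_offset:]
    else:
        data = dset[row_offset:row_offset + num_rows]
    data = np.asarray(data)
    if transform is not None:
        # Assumption: the transform is applied after reading, as a callable.
        data = transform(data)
    return data


feats = np.arange(40, dtype=float).reshape(10, 4)
print(read_rows(feats, row_offset=2, num_rows=3).shape)  # (3, 4)
```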

class hyperion.io.h5_data_reader.SequentialH5DataReader(file_path, **kwargs)[source]

Abstract base class to read hdf5 feature files in sequential order.

Attributes:

file_path: hdf5 or scp file to read.

transform: TransformList object, applies a transformation to the features after reading them from disk.

part_idx: It splits the input into num_parts and reads only part part_idx, where part_idx=1,…,num_parts.

num_parts: Number of parts to split the input data.

split_by_key: If True, all the elements with the same key go to the same part.
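
To make the part_idx/num_parts convention concrete, the sketch below splits a list of keys into contiguous parts with a 1-based part index. The exact partitioning used by the library is not stated here, so the contiguous split and the function name are assumptions; split_by_key is not modelled.

```python
def select_part(keys, part_idx=1, num_parts=1):
    """Return the keys that belong to part part_idx (1-based) of num_parts."""
    if not 1 <= part_idx <= num_parts:
        raise ValueError("part_idx must be in 1,...,num_parts")
    n = len(keys)
    # Integer arithmetic gives contiguous parts whose sizes differ by at most 1.
    start = (part_idx - 1) * n // num_parts
    end = part_idx * n // num_parts
    return keys[start:end]


keys = ["a", "b", "c", "d", "e", "f", "g"]
for idx in (1, 2, 3):
    print(idx, select_part(keys, part_idx=idx, num_parts=3))
```

Running several jobs with the same num_parts and different part_idx values then covers every key exactly once.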

__init__(file_path, **kwargs)[source]

Abstract base class to read Ark or hdf5 feature files.

file_path: h5, ark or scp file to read.

transform: TransformList object, applies a transformation to the features after reading them from disk.

permissive: If True, if the data that we want to read is not in the file it returns an empty matrix, if False it raises an exception.

close()[source]: Closes current hdf5 file.

_open_archive(file_path)[source]: Opens the hdf5 file where the next matrix/vector is if it is not open. If there was another hdf5 file open, it closes it.

read_num_rows(num_records=0, assert_same_dim=True)[source]

Reads the number of rows in the feature matrices of the dataset.

Parameters

num_records – Number of matrix shapes to read. If num_records=0, it reads all the matrices in the dataset.
assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

List of num_records recording names.
Integer numpy array of length num_records with the number of rows.

read_dims(num_records=0, assert_same_dim=True)[source]

Reads the number of columns in the feature matrices of the dataset.

Parameters

num_records – Number of matrix shapes to read. If num_records=0, it reads all the matrices in the dataset.
assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

List of num_records recording names.
Integer numpy array of length num_records with the number of columns.

__enter__()

Function required when entering constructions of the type:

with DataReader('file.h5') as f:
    keys, data = f.read()

__exit__(exc_type, exc_value, traceback)

Function required when exiting constructions of the type:

with DataReader('file.h5') as f:
    keys, data = f.read()

__iter__(): Needed to build an iterator, e.g.: r = SequentialDataReader(…) for key, data in r:

print(key, data)

__next__(): Needed to build an iterator, e.g.: r = SequentialDataReader(…) for key, data in r:

print(key, data)
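
How the context manager, the iterator protocol, eof() and reset() fit together can be seen in a toy reader over an in-memory list. This is only an illustrative sketch: ToySequentialReader is a hypothetical class, not part of hyperion, and it ignores squeeze, row ranges and transforms.

```python
class ToySequentialReader:
    """Sequential reader over an in-memory list of (key, data) pairs."""

    def __init__(self, items):
        self._items = list(items)
        self._pos = 0
        self.closed = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()

    def close(self):
        self.closed = True

    def reset(self):
        self._pos = 0  # start reading again from the first record

    def eof(self):
        return self._pos >= len(self._items)

    def read(self, num_records=0):
        # num_records == 0 reads all the remaining records.
        if num_records == 0:
            num_records = len(self._items) - self._pos
        chunk = self._items[self._pos:self._pos + num_records]
        self._pos += len(chunk)
        return [k for k, _ in chunk], [d for _, d in chunk]

    def __iter__(self):
        return self

    def __next__(self):
        if self.eof():
            raise StopIteration
        keys, data = self.read(1)
        return keys[0], data[0]


with ToySequentialReader([("utt1", [1.0]), ("utt2", [2.0])]) as r:
    for key, data in r:
        print(key, data)
```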

static _apply_range_to_shape(shape, row_offset, num_rows)

Modifies the shape given the user-defined row_offset and num_rows to read. For example, if we are reading a matrix of shape (100,4) with row_offset=10 and num_rows=20, it returns (20,4). With row_offset=20 and num_rows=0, it returns (80,4).

Parameters

shape – Original shape of the feature matrix.
row_offset – User defined row_offset, first frame to read.
num_rows – User defined num_rows, number of frames to read.

Returns

2D tuple with modified shape.

static _combine_ranges(read_range, row_offset, num_rows)

Combines two frame ranges.

One is the range given in the scp file. For example, in the scp file

recording1 file1.ark:34[3:40]
recording2 file1.ark:100[5:20]

[3:40] and [5:20] are frame ranges.

The other is the range requested by the user, who can decide to read just a submatrix of that, e.g., 10 rows starting at row_offset 1. Combining that with the range [3:40], the function returns row_offset=4 (3+1) and num_rows=10.

Parameters

read_range – Frame range from the scp file. It is a tuple with the first row and the number of rows to read.
row_offset – User-defined row_offset.
num_rows – User-defined number of rows to read. If it is 0, we read all the rows defined in the scp read_range.

Returns

Combined row_offset, first row of the recording to read.
Combined number of rows (frames) to read.

static _squeeze(data, permissive=False)

Converts a list of matrices to a 3D numpy array, or a list of vectors to a 2D numpy array.

Parameters

data – List of matrices or vectors.
permissive – If True, an empty matrix/vector in data is replaced by an all-zeros matrix/vector. If False, an exception is raised.

Returns

2D or 3D numpy array.

abstract eof()

End of file.

Returns: True, when we have read all the recordings in the dataset.

next(): __next__ for Python 2

abstract read(num_records=0, squeeze=False, offset=0, num_rows=0)

Reads next num_records feature matrices/vectors.

Parameters

num_records – Number of feature matrices to read.
squeeze – If True, it converts the list of matrices/vectors to a 3D/2D numpy array. All matrices need to have the same number of rows.
offset – List of integers or numpy array with the first row to read from each feature matrix.
num_rows – List of integers or numpy array with the number of rows to read from each feature matrix. If 0, it reads all the rows.

Returns

key – List of recording names.
data – List of feature matrices/vectors or 3D/2D numpy array.

abstract read_shapes(num_records=0, assert_same_dim=True)

Reads the shapes in the feature matrices of the dataset.

Parameters

num_records – Number of matrix shapes to read. If num_records=0, it reads all the matrices in the dataset.
assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

List of num_records recording names.
List of tuples with num_records shapes.

abstract reset(): Returns the file pointer to the begining of the dataset, then we can start reading the features again.

class hyperion.io.h5_data_reader.SequentialH5FileDataReader(file_path, **kwargs)[source]

Class to read feature matrices/vectors in sequential order from a single hdf5 file.

Attributes:

file_path: hdf5 file to read.

transform: TransformList object, applies a transformation to the features after reading them from disk.

part_idx: It splits the input into num_parts and reads only part part_idx, where part_idx=1,…,num_parts.

num_parts: Number of parts to split the input data.

split_by_key: If True, all the elements with the same key go to the same part.

__init__(file_path, **kwargs)[source]

Abstract base class to read Ark or hdf5 feature files.

file_path: h5, ark or scp file to read.

transform: TransformList object, applies a transformation to the features after reading them from disk.

permissive: If True, if the data that we want to read is not in the file it returns an empty matrix, if False it raises an exception.

property keys

reset()[source]: Puts the file pointer back to the begining of the file

eof()[source]: Returns True when it reaches the end of the ark file.

read_shapes(num_records=0, assert_same_dim=True)[source]

Reads the shapes in the feature matrices of the dataset.

Parameters

num_records – Number of matrix shapes to read. If num_records=0, it reads all the matrices in the dataset.
assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

List of num_records recording names.
List of tuples with num_records shapes.

read(num_records=0, squeeze=False, row_offset=0, num_rows=0)[source]

Reads next num_records feature matrices/vectors.

Parameters

num_records – Number of feature matrices to read.
squeeze – If True, it converts the list of matrices/vectors to a 3D/2D numpy array. All matrices need to have the same number of rows.
row_offset – List of integers or numpy array with the first row to read from each feature matrix.
num_rows – List of integers or numpy array with the number of rows to read from each feature matrix. If 0, it reads all the rows.

Returns

key – List of recording names.
data – List of feature matrices/vectors or 3D/2D numpy array.

__enter__()

Function required when entering constructions of the type:

with DataReader('file.h5') as f:
    keys, data = f.read()

__exit__(exc_type, exc_value, traceback)

Function required when exiting constructions of the type:

with DataReader('file.h5') as f:
    keys, data = f.read()

__iter__(): Needed to build an iterator, e.g.: r = SequentialDataReader(…) for key, data in r:

print(key, data)

__next__(): Needed to build an iterator, e.g.: r = SequentialDataReader(…) for key, data in r:

print(key, data)

static _apply_range_to_shape(shape, row_offset, num_rows)

Modifies the shape given the user-defined row_offset and num_rows to read. For example, if we are reading a matrix of shape (100,4) with row_offset=10 and num_rows=20, it returns (20,4). With row_offset=20 and num_rows=0, it returns (80,4).

Parameters

shape – Original shape of the feature matrix.
row_offset – User defined row_offset, first frame to read.
num_rows – User defined num_rows, number of frames to read.

Returns

2D tuple with modified shape.

static _combine_ranges(read_range, row_offset, num_rows)

Combines two frame ranges.

One is the range given in the scp file. For example, in the scp file

recording1 file1.ark:34[3:40]
recording2 file1.ark:100[5:20]

[3:40] and [5:20] are frame ranges.

The other is the range requested by the user, who can decide to read just a submatrix of that, e.g., 10 rows starting at row_offset 1. Combining that with the range [3:40], the function returns row_offset=4 (3+1) and num_rows=10.

Parameters

read_range – Frame range from the scp file. It is a tuple with the first row and the number of rows to read.
row_offset – User-defined row_offset.
num_rows – User-defined number of rows to read. If it is 0, we read all the rows defined in the scp read_range.

Returns

Combined row_offset, first row of the recording to read.
Combined number of rows (frames) to read.

_open_archive(file_path): Opens the hdf5 file where the next matrix/vector is if it is not open. If there was another hdf5 file open, it closes it.

static _squeeze(data, permissive=False)

Converts a list of matrices to a 3D numpy array, or a list of vectors to a 2D numpy array.

Parameters

data – List of matrices or vectors.
permissive – If True, an empty matrix/vector in data is replaced by an all-zeros matrix/vector. If False, an exception is raised.

Returns

2D or 3D numpy array.

close(): Closes current hdf5 file.

next(): __next__ for Python 2

read_dims(num_records=0, assert_same_dim=True)

Reads the number of columns in the feature matrices of the dataset.

Parameters

num_records – Number of matrix shapes to read. If num_records=0, it reads all the matrices in the dataset.
assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

List of num_records recording names.
Integer numpy array of length num_records with the number of columns.

read_num_rows(num_records=0, assert_same_dim=True)

Reads the number of rows in the feature matrices of the dataset.

Parameters

num_records – Number of matrix shapes to read. If num_records=0, it reads all the matrices in the dataset.
assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

List of num_records recording names.
Integer numpy array of length num_records with the number of rows.

class hyperion.io.h5_data_reader.SequentialH5ScriptDataReader(file_path, path_prefix=None, scp_sep=' ', **kwargs)[source]

Class to read features from multiple hdf5 files where a scp file indicates which hdf5 file contains each feature matrix.

Attributes:

file_path: scp file to read.

path_prefix: If input_spec is a scp file, it prepends the path_prefix string to the second column of the scp file. This is useful when data is read from a different directory than the one where it was created.

scp_sep: Separator for scp files (default ' ').

transform: TransformList object, applies a transformation to the features after reading them from disk.

part_idx: It splits the input into num_parts and reads only part part_idx, where part_idx=1,…,num_parts.

num_parts: Number of parts to split the input data.

split_by_key: If True, all the elements with the same key go to the same part.
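
The roles of path_prefix and scp_sep can be illustrated by parsing a two-column scp listing by hand. This is a sketch only: parse_scp is a hypothetical helper, the file names are made up, and joining the prefix with os.path.join is an assumption (the library may simply concatenate the strings).

```python
import os


def parse_scp(lines, path_prefix=None, scp_sep=" "):
    """Parse 'key<sep>path' lines into parallel lists of keys and paths."""
    keys, paths = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Split only once so the path column may itself contain the separator.
        key, path = line.split(scp_sep, 1)
        if path_prefix is not None:
            path = os.path.join(path_prefix, path)
        keys.append(key)
        paths.append(path)
    return keys, paths


scp_lines = ["recording1 feats/file1.h5", "recording2 feats/file2.h5"]
keys, paths = parse_scp(scp_lines, path_prefix="/new/location")
print(keys, paths)
```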

__init__(file_path, path_prefix=None, scp_sep=' ', **kwargs)[source]

Abstract base class to read Ark or hdf5 feature files.

file_path: h5, ark or scp file to read.

transform: TransformList object, applies a transformation to the features after reading them from disk.

permissive: If True, if the data that we want to read is not in the file it returns an empty matrix, if False it raises an exception.

property keys

reset()[source]: Closes all the open hdf5 files and puts the read pointer pointing to the first element in the scp file.

eof()[source]: Returns True when all the elements in the scp have been read.

read_shapes(num_records=0, assert_same_dim=True)[source]

Reads the shapes in the feature matrices of the dataset.

Parameters

num_records – Number of matrix shapes to read. If num_records=0, it reads all the matrices in the dataset.
assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

List of num_records recording names.
List of tuples with num_records shapes.

read(num_records=0, squeeze=False, row_offset=0, num_rows=0)[source]

Reads next num_records feature matrices/vectors.

Parameters

num_records – Number of feature matrices to read.
squeeze – If True, it converts the list of matrices/vectors to a 3D/2D numpy array. All matrices need to have the same number of rows.
row_offset – List of integers or numpy array with the first row to read from each feature matrix.
num_rows – List of integers or numpy array with the number of rows to read from each feature matrix. If 0, it reads all the rows.

Returns

key – List of recording names.
data – List of feature matrices/vectors or 3D/2D numpy array.

__enter__()

Function required when entering constructions of the type:

with DataReader('file.h5') as f:
    keys, data = f.read()

__exit__(exc_type, exc_value, traceback)

Function required when exiting constructions of the type:

with DataReader('file.h5') as f:
    keys, data = f.read()

__iter__(): Needed to build an iterator, e.g.: r = SequentialDataReader(…) for key, data in r:

print(key, data)

__next__(): Needed to build an iterator, e.g.: r = SequentialDataReader(…) for key, data in r:

print(key, data)

static _apply_range_to_shape(shape, row_offset, num_rows)

Modifies the shape given the user-defined row_offset and num_rows to read. For example, if we are reading a matrix of shape (100,4) with row_offset=10 and num_rows=20, it returns (20,4). With row_offset=20 and num_rows=0, it returns (80,4).

Parameters

shape – Original shape of the feature matrix.
row_offset – User defined row_offset, first frame to read.
num_rows – User defined num_rows, number of frames to read.

Returns

2D tuple with modified shape.

static _combine_ranges(read_range, row_offset, num_rows)

Combines two frame ranges.

One is the range given in the scp file. For example, in the scp file

recording1 file1.ark:34[3:40]
recording2 file1.ark:100[5:20]

[3:40] and [5:20] are frame ranges.

The other is the range requested by the user, who can decide to read just a submatrix of that, e.g., 10 rows starting at row_offset 1. Combining that with the range [3:40], the function returns row_offset=4 (3+1) and num_rows=10.

Parameters

read_range – Frame range from the scp file. It is a tuple with the first row and the number of rows to read.
row_offset – User-defined row_offset.
num_rows – User-defined number of rows to read. If it is 0, we read all the rows defined in the scp read_range.

Returns

Combined row_offset, first row of the recording to read.
Combined number of rows (frames) to read.

_open_archive(file_path): Opens the hdf5 file where the next matrix/vector is if it is not open. If there was another hdf5 file open, it closes it.

static _squeeze(data, permissive=False)

Converts a list of matrices to a 3D numpy array, or a list of vectors to a 2D numpy array.

Parameters

data – List of matrices or vectors.
permissive – If True, an empty matrix/vector in data is replaced by an all-zeros matrix/vector. If False, an exception is raised.

Returns

2D or 3D numpy array.

close(): Closes current hdf5 file.

next(): __next__ for Python 2

read_dims(num_records=0, assert_same_dim=True)

Reads the number of columns in the feature matrices of the dataset.

Parameters

num_records – Number of matrix shapes to read. If num_records=0, it reads all the matrices in the dataset.
assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

List of num_records recording names.
Integer numpy array of length num_records with the number of columns.

read_num_rows(num_records=0, assert_same_dim=True)

Reads the number of rows in the feature matrices of the dataset.

Parameters

num_records – Number of matrix shapes to read. If num_records=0, it reads all the matrices in the dataset.
assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

List of num_records recording names.
Integer numpy array of length num_records with the number of rows.

class hyperion.io.h5_data_reader.RandomAccessH5DataReader(file_path, transform=None, permissive=False)[source]

Abstract base class to read hdf5 feature files in random order.

Attributes:

file_path: hdf5 or scp file to read.

transform: TransformList object, applies a transformation to the features after reading them from disk.

permissive: If True, if the data that we want to read is not in the file it returns an empty matrix, if False it raises an exception.

__init__(file_path, transform=None, permissive=False)[source]

Abstract base class to read Ark or hdf5 feature files in random order.

file_path: h5 or scp file to read.

transform: TransformList object, applies a transformation to the features after reading them from disk.

permissive: If True, if the data that we want to read is not in the file it returns an empty matrix, if False it raises an exception.

read_num_rows(keys, assert_same_dim=True)[source]

Reads the number of rows in the feature matrices of the dataset.

Parameters

keys – List of recording names from which we want to retrieve the number of rows.
assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

Integer numpy array with the number of rows for the recordings in keys.

read_dims(keys, assert_same_dim=True)[source]

Reads the number of columns in the feature matrices of the dataset.

Parameters

keys – List of recording names from which we want to retrieve the number of columns.
assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

Integer numpy array with the number of columns for the recordings in keys

__enter__()

Function required when entering constructions of the type:

with DataReader('file.h5') as f:
    keys, data = f.read()

__exit__(exc_type, exc_value, traceback)

Function required when exiting constructions of the type:

with DataReader('file.h5') as f:
    keys, data = f.read()

static _apply_range_to_shape(shape, row_offset, num_rows)

Modifies the shape given the user-defined row_offset and num_rows to read. For example, if we are reading a matrix of shape (100,4) with row_offset=10 and num_rows=20, it returns (20,4). With row_offset=20 and num_rows=0, it returns (80,4).

Parameters

shape – Original shape of the feature matrix.
row_offset – User defined row_offset, first frame to read.
num_rows – User defined num_rows, number of frames to read.

Returns

2D tuple with modified shape.

static _combine_ranges(read_range, row_offset, num_rows)

Combines two frame ranges.

One is the range given in the scp file. For example, in the scp file

recording1 file1.ark:34[3:40]
recording2 file1.ark:100[5:20]

[3:40] and [5:20] are frame ranges.

The other is the range requested by the user, who can decide to read just a submatrix of that, e.g., 10 rows starting at row_offset 1. Combining that with the range [3:40], the function returns row_offset=4 (3+1) and num_rows=10.

Parameters

read_range – Frame range from the scp file. It is a tuple with the first row and the number of rows to read.
row_offset – User-defined row_offset.
num_rows – User-defined number of rows to read. If it is 0, we read all the rows defined in the scp read_range.

Returns

Combined row_offset, first row of the recording to read.
Combined number of rows (frames) to read.

static _squeeze(data, permissive=False)

Converts a list of matrices to a 3D numpy array, or a list of vectors to a 2D numpy array.

Parameters

data – List of matrices or vectors.
permissive – If True, an empty matrix/vector in data is replaced by an all-zeros matrix/vector. If False, an exception is raised.

Returns

2D or 3D numpy array.

abstract close(): Closes input file.

abstract read(keys, squeeze=False, offset=0, num_rows=0)

Reads the feature matrices/vectors for the recordings in keys.

Parameters

keys – List of recording names from which we want to retrieve the feature matrices/vectors.
squeeze – If True, it converts the list of matrices/vectors to a 3D/2D numpy array. All matrices need to have the same number of rows.
offset – List of integers or numpy array with the first row to read from each feature matrix.
num_rows – List of integers or numpy array with the number of rows to read from each feature matrix. If 0, it reads all the rows.

Returns

data – List of feature matrices/vectors or 3D/2D numpy array.

abstract read_shapes(keys=None, assert_same_dim=True)

Reads the shapes in the feature matrices of the dataset.

Parameters

keys – List of recording names from which we want to retrieve the shapes.
assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

List of tuples with the shapes for the recordings in keys.

class hyperion.io.h5_data_reader.RandomAccessH5FileDataReader(file_path, **kwargs)[source]

Class to read from a single hdf5 file in random order

file_path: scp file to read.

transform: TransformList object, applies a transformation to the features after reading them from disk.

permissive: If True, if the data that we want to read is not in the file it returns an empty matrix, if False it raises an exception.

__init__(file_path, **kwargs)[source]

Abstract base class to read Ark or hdf5 feature files in random order.

file_path: h5 or scp file to read.

transform: TransformList object, applies a transformation to the features after reading them from disk.

permissive: If True, if the data that we want to read is not in the file it returns an empty matrix, if False it raises an exception.

close()[source]: Closes the hdf5 files.

_open_archive(file_path)[source]: Open the hdf5 file it it is not open.

property keys

read_shapes(keys, assert_same_dim=True)[source]

Reads the shapes in the feature matrices of the dataset.

Parameters

keys – List of recording names from which we want to retrieve the shapes.
assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

List of tuples with the shapes for the recordings in keys.

read(keys, squeeze=False, row_offset=0, num_rows=0)[source]

Reads the feature matrices/vectors for the recordings in keys.

Parameters

keys – List of recording names from which we want to retrieve the feature matrices/vectors.
squeeze – If True, it converts the list of matrices/vectors to a 3D/2D numpy array. All matrices need to have the same number of rows.
row_offset – List of integers or numpy array with the first row to read from each feature matrix.
num_rows – List of integers or numpy array with the number of rows to read from each feature matrix. If 0, it reads all the rows.

Returns

data – List of feature matrices/vectors or 3D/2D numpy array.
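
Per-recording row ranges can be pictured as follows. The sketch accepts either a single integer or one value per matrix for the offset and the row count; that broadcasting behaviour and the function name are assumptions made for illustration, and numpy arrays stand in for the stored feature matrices.

```python
import numpy as np


def read_ranges(mats, row_offset=0, num_rows=0):
    """Slice each matrix with its own first row and number of rows."""
    n = len(mats)
    # A scalar is repeated for every matrix; a list must have one entry each.
    offsets = np.broadcast_to(np.asarray(row_offset, dtype=int), (n,))
    counts = np.broadcast_to(np.asarray(num_rows, dtype=int), (n,))
    out = []
    for m, o, c in zip(mats, offsets, counts):
        out.append(m[o:] if c == 0 else m[o:o + c])
    return out


mats = [np.zeros((10, 2)), np.zeros((6, 2))]
out = read_ranges(mats, row_offset=[1, 2], num_rows=[3, 0])
print([x.shape for x in out])  # [(3, 2), (4, 2)]
```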

__enter__()

Function required when entering constructions of the type:

with DataReader('file.h5') as f:
    keys, data = f.read()

__exit__(exc_type, exc_value, traceback)

Function required when exiting constructions of the type:

with DataReader('file.h5') as f:
    keys, data = f.read()

static _apply_range_to_shape(shape, row_offset, num_rows)

Modifies the shape given the user-defined row_offset and num_rows to read. For example, if we are reading a matrix of shape (100,4) with row_offset=10 and num_rows=20, it returns (20,4). With row_offset=20 and num_rows=0, it returns (80,4).

Parameters

shape – Original shape of the feature matrix.
row_offset – User defined row_offset, first frame to read.
num_rows – User defined num_rows, number of frames to read.

Returns

2D tuple with modified shape.

static _combine_ranges(read_range, row_offset, num_rows)

Combines two frame ranges.

One is the range given in the scp file. For example, in the scp file

recording1 file1.ark:34[3:40]
recording2 file1.ark:100[5:20]

[3:40] and [5:20] are frame ranges.

The other is the range requested by the user, who can decide to read just a submatrix of that, e.g., 10 rows starting at row_offset 1. Combining that with the range [3:40], the function returns row_offset=4 (3+1) and num_rows=10.

Parameters

read_range – Frame range from the scp file. It is a tuple with the first row and the number of rows to read.
row_offset – User-defined row_offset.
num_rows – User-defined number of rows to read. If it is 0, we read all the rows defined in the scp read_range.

Returns

Combined row_offset, first row of the recording to read.
Combined number of rows (frames) to read.

static _squeeze(data, permissive=False)

Converts a list of matrices to a 3D numpy array, or a list of vectors to a 2D numpy array.

Parameters

data – List of matrices or vectors.
permissive – If True, an empty matrix/vector in data is replaced by an all-zeros matrix/vector. If False, an exception is raised.

Returns

2D or 3D numpy array.

read_dims(keys, assert_same_dim=True)

Reads the number of columns in the feature matrices of the dataset.

Parameters

keys – List of recording names from which we want to retrieve the number of columns.
assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

Integer numpy array with the number of columns for the recordings in keys

read_num_rows(keys, assert_same_dim=True)

Reads the number of rows in the feature matrices of the dataset.

Parameters

keys – List of recording names from which we want to retrieve the number of rows.
assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

Integer numpy array with the number of rows for the recordings in keys.

class hyperion.io.h5_data_reader.RandomAccessH5ScriptDataReader(file_path, path_prefix=None, scp_sep=' ', **kwargs)[source]

Class to read multiple hdf5 files in random order, where a scp file indicates which hdf5 file contains each feature matrix.

file_path: scp file to read.

path_prefix: If input_spec is a scp file, it pre-appends path_prefix string to the second column of the scp file. This is useful when data is read from a different directory of that it was created.

transform: TransformList object, applies a transformation to the features after reading them from disk.

permissive: If True, if the data that we want to read is not in the file it returns an empty matrix, if False it raises an exception.

scp_sep: Separator for scp files (default ‘ ‘).

__init__(file_path, path_prefix=None, scp_sep=' ', **kwargs)[source]

Abstract base class to read Ark or hdf5 feature files in random order.

file_path: h5 or scp file to read.

transform: TransformList object, applies a transformation to the features after reading them from disk.

permissive: If True, if the data that we want to read is not in the file it returns an empty matrix, if False it raises an exception.

close()[source]: Closes all the open hdf5 files.

property keys

__enter__()

Function required when entering constructions of the type:

with DataReader('file.h5') as f:
    keys, data = f.read()

__exit__(exc_type, exc_value, traceback)

Function required when exiting constructions of the type:

with DataReader('file.h5') as f:
    keys, data = f.read()

static _apply_range_to_shape(shape, row_offset, num_rows)

Modifies the shape given the user-defined row_offset and num_rows to read. For example, if we are reading a matrix of shape (100,4) with row_offset=10 and num_rows=20, it returns (20,4). With row_offset=20 and num_rows=0, it returns (80,4).

Parameters

shape – Original shape of the feature matrix.
row_offset – User defined row_offset, first frame to read.
num_rows – User defined num_rows, number of frames to read.

Returns

2D tuple with modified shape.

static _combine_ranges(read_range, row_offset, num_rows)

Combines two frame ranges.

One is the range given in the scp file. For example, in the scp file

recording1 file1.ark:34[3:40]
recording2 file1.ark:100[5:20]

[3:40] and [5:20] are frame ranges.

The other is the range requested by the user, who can decide to read just a submatrix of that, e.g., 10 rows starting at row_offset 1. Combining that with the range [3:40], the function returns row_offset=4 (3+1) and num_rows=10.

Parameters

read_range – Frame range from the scp file. It is a tuple with the first row and the number of rows to read.
row_offset – User-defined row_offset.
num_rows – User-defined number of rows to read. If it is 0, we read all the rows defined in the scp read_range.

Returns

Combined row_offset, first row of the recording to read.
Combined number of rows (frames) to read.

_open_archive(key_idx)[source]

Opens the hdf5 file corresponding to a given feature matrix, if it is not already open.

Parameters: key_idx – Integer position of the feature matrix in the scp file.
Returns: Python file object.

static _squeeze(data, permissive=False)

Converts list of matrices to 3D numpy array or list of vectors to 2D numpy array.

Parameters

data – List of matrices or vectors.
permissive – If True, an empty matrix/vector in data is replaced by a matrix/vector of all zeros. If False, it raises an exception.

Returns

2D or 3D numpy array.
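The conversion can be sketched with `numpy.stack`. `squeeze_list` is a hypothetical function written from the description above; in particular, taking the shape and dtype of the zero replacement from the first non-empty item is an assumption.

```python
import numpy as np


def squeeze_list(data, permissive=False):
    """Hypothetical sketch: stack equally shaped arrays along a new first axis."""
    # Use the first non-empty item as the template for shape and dtype.
    template = next((x for x in data if x.size > 0), None)
    if template is None:
        raise ValueError("all items are empty")
    filled = []
    for x in data:
        if x.size == 0:
            if not permissive:
                raise ValueError("empty matrix/vector in data")
            # Assumed replacement: zeros with the template's shape and dtype.
            x = np.zeros_like(template)
        filled.append(x)
    # np.stack raises ValueError if the shapes differ.
    return np.stack(filled, axis=0)


matrices = [np.ones((5, 4)), np.ones((5, 4))]
vectors = [np.ones(4), np.ones(4)]
stacked_3d = squeeze_list(matrices)  # shape (2, 5, 4)
stacked_2d = squeeze_list(vectors)   # shape (2, 4)
```

Stacking fails when the items have different shapes, which is why squeezing requires matrices with the same number of rows.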

read_dims(keys, assert_same_dim=True)

Reads the number of columns in the feature matrices of the dataset.

Parameters

keys – List of recording names from which we want to retrieve the number of columns.
assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

Integer numpy array with the number of columns for the recordings in keys

read_num_rows(keys, assert_same_dim=True)

Reads the number of rows in the feature matrices of the dataset.

Parameters

keys – List of recording names from which we want to retrieve the number of rows.
assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

Integer numpy array with the number of rows for the recordings in keys.

read_shapes(keys, assert_same_dim=True)[source]

Reads the shapes in the feature matrices of the dataset.

Parameters

keys – List of recording names from which we want to retrieve the shapes.
assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

List of tuples with the shapes for the recordings in keys.

read(keys, squeeze=False, row_offset=0, num_rows=0)[source]

Reads the feature matrices/vectors for the recordings in keys.

Parameters

keys – List of recording names from which we want to retrieve the feature matrices/vectors.
squeeze – If True, it converts the list of matrices/vectors to 3D/2D numpy array. All matrices need to have same number of rows.
row_offset – List of integers or numpy array with the first row to read from each feature matrix.
num_rows – List of integers or numpy array with the number of rows to read from each feature matrix. If 0, it reads all the rows.

Returns

List of feature matrices/vectors or 3D/2D numpy array.

Return type

data

Feature Writer Classes

ARK Feature Writer Classes

class hyperion.io.ark_data_writer.ArkDataWriter(archive_path, script_path=None, binary=True, **kwargs)[source]

Class to write Ark feature files.

archive_path: output data file path.

script_path: optional output scp file.

binary: True if the the Ark file is binary, False if it is text file.

flush[source]: If True, it flushes the output after writing each feature file.

compress: It True, it uses Kaldi compression.

compression_method: Kaldi compression method: {auto (default), speech_feat,

2byte-auto, 2byte-signed-integer, 1byte-auto, 1byte-unsigned-integer, 1byte-0-1}.

scp_sep: Separator for scp files (default ‘ ‘).

__init__(archive_path, script_path=None, binary=True, **kwargs)[source]

__exit__(exc_type, exc_value, traceback)[source]

Function required when exiting from constructions of type

    with ArkDataWriter('file.h5') as f:
        f.write(key, data)

It closes the output file.

close()[source]: Closes the output file

flush()[source]: Flushes the file

__enter__()

Function required when entering constructions of type

    with DataWriter('file.h5') as f:
        f.write(key, data)

_convert_data(data)[source]: Converts the feature matrix from numpy array to KaldiMatrix or KaldiCompressedMatrix.

write(keys, data)[source]

Writes data to file.

Parameters

keys – List of recording names.
data – List of Feature matrices or vectors. If all the matrices have the same dimension it can be a 3D numpy array. If they are vectors, it can be a 2D numpy array.
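The accepted layouts for data can be illustrated with plain numpy. `pair_keys_with_data` is a hypothetical helper that only shows how a stacked array corresponds to a per-recording list; it is not how the writer handles its input internally.

```python
import numpy as np


def pair_keys_with_data(keys, data):
    """Hypothetical helper: returns (key, array) pairs for list or stacked input."""
    if isinstance(data, np.ndarray):
        # The first axis indexes recordings: 3D -> matrices, 2D -> vectors.
        data = [data[i] for i in range(data.shape[0])]
    if len(keys) != len(data):
        raise ValueError("keys and data must have the same length")
    return list(zip(keys, data))


keys = ["recording1", "recording2"]
as_list = pair_keys_with_data(keys, [np.zeros((5, 4)), np.zeros((7, 4))])
as_3d = pair_keys_with_data(keys, np.zeros((2, 5, 4)))
as_2d = pair_keys_with_data(keys, np.zeros((2, 4)))
```

The list form is the only one that can hold matrices with different numbers of rows; the stacked forms require every item to have the same dimension.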

HDF5 Feature Writer Classes

class hyperion.io.h5_data_writer.H5DataWriter(archive_path, script_path=None, **kwargs)[source]

Class to write hdf5 feature files.

archive_path: output data file path.

script_path: optional output scp file.

flush[source]: If True, it flushes the output after writing each feature file.

compress: It True, it uses Kaldi compression.

compression_method: Kaldi compression method: {auto (default), speech_feat,

2byte-auto, 2byte-signed-integer, 1byte-auto, 1byte-unsigned-integer, 1byte-0-1}.

scp_sep: Separator for scp files (default ‘ ‘).

__init__(archive_path, script_path=None, **kwargs)[source]

__exit__(exc_type, exc_value, traceback)[source]

Function required when exiting from constructions of type

    with H5DataWriter('file.h5') as f:
        f.write(key, data)

It closes the output file.

close()[source]: Closes the output file

flush()[source]: Flushes the file

_convert_data(data)[source]

Converts data to the format for saving. Compresses the data if needed.

Parameters: data – Numpy array feature matrix/vector.

Returns: Numpy array to save in h5 file. Attributes for the hdf5 dataset with information about the compression.

__enter__()

Function required when entering constructions of type

    with DataWriter('file.h5') as f:
        f.write(key, data)

write(keys, data)[source]

Writes data to file.

Parameters

keys – List of recording names.
data – List of Feature matrices or vectors. If all the matrices have the same dimension it can be a 3D numpy array. If they are vectors, it can be a 2D numpy array.

VAD Read/Write Classes

VAD Reader Factory Classes

These are Factory Classes that generate VAD Reader objects.

class hyperion.io.vad_rw_factory.VADReaderFactory[source]

static create(rspecifier, path_prefix=None, scp_sep=' ', frame_length=25, frame_shift=10, snip_edges=False)[source]

static filter_args(**kwargs)[source]

static add_class_args(parser, prefix=None)[source]

static add_argparse_args(parser, prefix=None)

VAD Reader Classes

class hyperion.io.bin_vad_reader.BinVADReader(rspecifier, path_prefix=None, scp_sep=' ', frame_length=25, frame_shift=10, snip_edges=False)[source]

__init__(rspecifier, path_prefix=None, scp_sep=' ', frame_length=25, frame_shift=10, snip_edges=False)[source]

read_num_frames(keys)[source]

read(keys, squeeze=False, offset=0, num_frames=0, frame_length=25, frame_shift=10, snip_edges=False, signal_lengths=None)[source]

read_timestamps(keys, merge_tol=0.001)[source]

__enter__()

Function required when entering constructions of type

    with VADReader('file.h5') as f:
        keys, data = f.read()

__exit__(exc_type, exc_value, traceback)

Function required when exiting from constructions of type

    with VADReader('file.h5') as f:
        keys, data = f.read()

close(): Closes input file.

class hyperion.io.segment_vad_reader.SegmentVADReader(segments_file, permissive=False)[source]

__init__(segments_file, permissive=False)[source]

read(keys, squeeze=False, offset=0, num_frames=0, frame_length=25, frame_shift=10, snip_edges=False, signal_lengths=None)[source]

read_timestamps(keys, merge_tol=0)[source]

__enter__()

Function required when entering constructions of type

    with VADReader('file.h5') as f:
        keys, data = f.read()

__exit__(exc_type, exc_value, traceback)

Function required when exiting from constructions of type

    with VADReader('file.h5') as f:
        keys, data = f.read()

close(): Closes input file.