Input/Output Utilities

The hyperion.io module contains several classes to read and write audio files and features.

Audio Read/Write Classes

Audio Reader Classes

These are classes to read audio files.

Copyright 2018 Johns Hopkins University (Author: Jesus Villalba) Apache 2.0 (http://www.apache.org/licenses/LICENSE-2.0)

class hyperion.io.audio_reader.AudioReader(file_path, segments_path=None, wav_scale=32767)[source]

Class to read audio files from wav, flac or pipe

file_path

scp file with format: file_key wavspecifier (audio_file/pipe), or an SCPList object.

segments_path

segments file with format: segment_id file_id tbeg tend

wav_scale

multiplies signal by scale factor
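The segments file layout described above (segment_id file_id tbeg tend) can be illustrated with a small parser. This is only a sketch using the standard library: the names `Segment` and `parse_segments` are hypothetical and not part of hyperion, and it assumes whitespace-separated columns with tbeg and tend given as numbers.

```python
from typing import Dict, NamedTuple


class Segment(NamedTuple):
    """One row of a segments file (hypothetical helper type)."""
    file_id: str
    tbeg: float
    tend: float


def parse_segments(text: str) -> Dict[str, Segment]:
    """Parse lines of the form: segment_id file_id tbeg tend."""
    segments = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        segment_id, file_id, tbeg, tend = line.split()
        segments[segment_id] = Segment(file_id, float(tbeg), float(tend))
    return segments


example = """\
rec1-seg1 rec1 0.00 2.50
rec1-seg2 rec1 2.50 7.25
"""
segments = parse_segments(example)
print(segments["rec1-seg2"])
```

Each segment_id maps back to the file_id of the recording it was cut from, which is what lets a reader look up the wavspecifier for that recording in the scp file.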

__init__(file_path, segments_path=None, wav_scale=32767)[source]
property keys
__enter__()[source]

Function required when entering constructions of type

with AudioReader('file.h5') as f:
    keys, data = f.read()

__exit__(exc_type, exc_value, traceback)[source]

Function required when exiting from constructions of type

with AudioReader('file.h5') as f:
    keys, data = f.read()

static read_wavspecifier(wavspecifier, scale=32768, time_offset=0, time_dur=0)[source]
Reads an audiospecifier (audio_file/pipe)

It reads from pipe or from all the files that can be read by libsndfile <http://www.mega-nerd.com/libsndfile/#Features>

Parameters
  • wavspecifier – A pipe, wav, flac, ogg file etc.

  • scale – Multiplies signal by scale factor

  • time_offset – float indicating the start time to read in the utterance.

  • time_dur – float indicating the number of seconds to read from the utterance; if 0, it reads until the end
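To make the meaning of time_offset and time_dur concrete, the sketch below converts them into sample indices. It is not hyperion's implementation: `time_range_to_samples` is a hypothetical helper, and rounding to the nearest sample and clipping at the end of the file are assumptions.

```python
def time_range_to_samples(fs, num_samples, time_offset=0.0, time_dur=0.0):
    """Return (start, stop) sample indices for a read window.

    fs: sampling frequency in Hz.
    num_samples: total number of samples in the recording.
    time_offset: start time in seconds.
    time_dur: seconds to read; 0 means read until the end.
    """
    start = min(int(round(time_offset * fs)), num_samples)
    if time_dur > 0:
        stop = min(start + int(round(time_dur * fs)), num_samples)
    else:
        stop = num_samples  # time_dur == 0: read until the end
    return start, stop


# 10 s recording at 16 kHz
print(time_range_to_samples(16000, 160000, time_offset=1.5, time_dur=2.0))
print(time_range_to_samples(16000, 160000, time_offset=8.0))
```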

static read_pipe(wavspecifier, scale=32768)[source]

Reads wave file from a pipe

Parameters
  • wavspecifier – Shell command with pipe output

  • scale – Multiplies signal by scale factor

_read_segment(segment, time_offset=0, time_dur=0)[source]

Reads a wave segment

Parameters

segment – pandas DataFrame (segment_id , file_id, tbeg, tend)

Returns

Wave, sampling frequency

read()[source]
class hyperion.io.audio_reader.SequentialAudioReader(file_path, segments_path=None, wav_scale=32767, part_idx=1, num_parts=1)[source]
__init__(file_path, segments_path=None, wav_scale=32767, part_idx=1, num_parts=1)[source]
__iter__()[source]

Needed to build an iterator, e.g.:

r = SequentialAudioReader(…)
for key, s, fs in r:
    print(key)
    process(s)

__next__()[source]

Needed to build an iterator, e.g.:

r = SequentialAudioReader(…)
for key, s, fs in r:
    process(s)

next()[source]

__next__ for Python 2

reset()[source]

Returns the file pointer to the beginning of the dataset, so that we can start reading the features again.

eof()[source]

End of file.

Returns

True, when we have read all the recordings in the dataset.
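The interplay of `__iter__`, `__next__`, `reset` and `eof` can be sketched with an in-memory stand-in. `ToySequentialReader` is a hypothetical class written for illustration only; it reads nothing from disk and just mimics the protocol described above.

```python
class ToySequentialReader:
    """Hypothetical in-memory stand-in for a sequential reader."""

    def __init__(self, items):
        self._items = list(items)  # (key, signal, fs) tuples
        self._pos = 0              # index of the next item to read

    def __iter__(self):
        return self

    def __next__(self):
        if self.eof():
            raise StopIteration
        item = self._items[self._pos]
        self._pos += 1
        return item

    def reset(self):
        """Go back to the first item so the data can be read again."""
        self._pos = 0

    def eof(self):
        """True once every item has been read."""
        return self._pos >= len(self._items)


reader = ToySequentialReader([("utt1", [0.1, 0.2], 16000), ("utt2", [0.3], 8000)])
keys = [key for key, s, fs in reader]
at_end = reader.eof()
reader.reset()
first_key, _, _ = next(reader)
print(keys, at_end, first_key)
```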

read(num_records=0, time_offset=0, time_durs=0)[source]

Reads next num_records audio files

Parameters
  • num_records – Number of audio files to read.

  • time_offset – List of floats indicating the start time to read in the utterance.

  • time_durs – List of floats indicating the number of seconds to read from each utterance

Returns
  • key – List of recording names.

  • data – List of waveforms.

  • fs – List of sampling frequencies.

static filter_args(**kwargs)[source]
static add_class_args(parser, prefix=None)[source]
static add_argparse_args(parser, prefix=None)
__enter__()

Function required when entering constructions of type

with AudioReader('file.h5') as f:
    keys, data = f.read()

__exit__(exc_type, exc_value, traceback)

Function required when exiting from constructions of type

with AudioReader('file.h5') as f:
    keys, data = f.read()

_read_segment(segment, time_offset=0, time_dur=0)

Reads a wave segment

Parameters

segment – pandas DataFrame (segment_id , file_id, tbeg, tend)

Returns

Wave, sampling frequency

property keys
static read_pipe(wavspecifier, scale=32768)

Reads wave file from a pipe

Parameters
  • wavspecifier – Shell command with pipe output

  • scale – Multiplies signal by scale factor

static read_wavspecifier(wavspecifier, scale=32768, time_offset=0, time_dur=0)
Reads an audiospecifier (audio_file/pipe)

It reads from pipe or from all the files that can be read by libsndfile <http://www.mega-nerd.com/libsndfile/#Features>

Parameters
  • wavspecifier – A pipe, wav, flac, ogg file etc.

  • scale – Multiplies signal by scale factor

  • time_offset – float indicating the start time to read in the utterance.

  • time_dur – float indicating the number of seconds to read from the utterance; if 0, it reads until the end

class hyperion.io.audio_reader.RandomAccessAudioReader(file_path, segments_path=None, wav_scale=32767)[source]
__init__(file_path, segments_path=None, wav_scale=32767)[source]
_read(keys, time_offset=0, time_durs=0)[source]

Reads the waveforms for the recordings in keys.

Parameters

keys – List of recording/segment_ids names.

Returns

data – List of waveforms.

__enter__()

Function required when entering constructions of type

with AudioReader('file.h5') as f:
    keys, data = f.read()

__exit__(exc_type, exc_value, traceback)

Function required when exiting from constructions of type

with AudioReader('file.h5') as f:
    keys, data = f.read()

_read_segment(segment, time_offset=0, time_dur=0)

Reads a wave segment

Parameters

segment – pandas DataFrame (segment_id , file_id, tbeg, tend)

Returns

Wave, sampling frequency

property keys
read(keys, time_offset=0, time_durs=0)[source]

Reads the waveforms for the recordings in keys.

Parameters

keys – List of recording/segment_ids names.

Returns
  • data – List of waveforms.

  • fs – List of sampling frequencies.

static read_pipe(wavspecifier, scale=32768)

Reads wave file from a pipe

Parameters
  • wavspecifier – Shell command with pipe output

  • scale – Multiplies signal by scale factor

static read_wavspecifier(wavspecifier, scale=32768, time_offset=0, time_dur=0)
Reads an audiospecifier (audio_file/pipe)

It reads from pipe or from all the files that can be read by libsndfile <http://www.mega-nerd.com/libsndfile/#Features>

Parameters
  • wavspecifier – A pipe, wav, flac, ogg file etc.

  • scale – Multiplies signal by scale factor

  • time_offset – float indicating the start time to read in the utterance.

  • time_dur – float indicating the number of seconds to read from the utterance; if 0, it reads until the end

static filter_args(**kwargs)[source]
static add_class_args(parser, prefix=None)[source]
static add_argparse_args(parser, prefix=None)

Audio Writer Classes

These are classes to write audio files.

class hyperion.io.audio_writer.AudioWriter(output_path, script_path=None, audio_format='wav', audio_subtype=None, scp_sep=' ')[source]

Abstract base class to write audio files.

output_path

output data file path.

script_path

optional output scp file.

audio_format

audio file format

audio_subtype

subtype of audio in [PCM_16, PCM_32, FLOAT, DOUBLE, …], if None, it uses soundfile defaults (recommended)

scp_sep

Separator for scp files (default ' ').

__init__(output_path, script_path=None, audio_format='wav', audio_subtype=None, scp_sep=' ')[source]
__enter__()[source]

Function required when entering constructions of type

with AudioWriter('./path') as f:
    f.write(key, data)

__exit__(exc_type, exc_value, traceback)[source]

Function required when exiting from constructions of type

with AudioWriter('./path') as f:
    f.write(key, data)

close()[source]

Closes the script file if open
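The way `__enter__`, `__exit__` and `close` typically cooperate in a writer can be sketched as follows. `ToyScriptWriter` is a hypothetical class, not hyperion's AudioWriter: it only writes "key path" lines to a script-like file object, and the line format with `scp_sep` between the two columns is an assumption for illustration.

```python
import io


class ToyScriptWriter:
    """Hypothetical writer that records key/path pairs in a script file."""

    def __init__(self, script_file, scp_sep=" "):
        self.f_script = script_file
        self.scp_sep = scp_sep

    def __enter__(self):
        # Returning self lets the caller use "with ... as w".
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Always close, even if the with-body raised an exception.
        self.close()

    def close(self):
        """Closes the script file if it is open."""
        if self.f_script is not None and not self.f_script.closed:
            self.f_script.close()

    def write(self, key, path):
        self.f_script.write(f"{key}{self.scp_sep}{path}\n")


buffer = io.StringIO()
with ToyScriptWriter(buffer) as w:
    w.write("utt1", "./path/utt1.wav")
    written = buffer.getvalue()  # read before the buffer is closed
print(written, end="")
print(buffer.closed)
```

Putting the cleanup in `close` and calling it from `__exit__` means the same code path runs whether the writer is used in a with-statement or closed by hand.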

write(keys, data, fs)[source]

Writes waveform to audio file.

Parameters
  • keys – List of recording names.

  • data – List of waveforms

  • fs – Sampling frequency of the waveforms.

static filter_args(**kwargs)[source]
static add_class_args(parser, prefix=None)[source]
static add_argparse_args(parser, prefix=None)

Features Read/Write Classes

These are classes to read and write feature files in ARK or HDF5 format.

Feature Reader/Writer Factory Classes

These are Factory Classes that generate Data Reader or Writer objects

class hyperion.io.data_rw_factory.DataWriterFactory[source]

Class to create objects that write data to hdf5/ark files.

static create(wspecifier, compress=False, compression_method='auto', scp_sep=' ')[source]
static filter_args(**kwargs)[source]
static add_class_args(parser, prefix=None)[source]
class hyperion.io.data_rw_factory.SequentialDataReaderFactory[source]
static create(rspecifier, path_prefix=None, scp_sep=' ', **kwargs)[source]
static filter_args(**kwargs)[source]
static add_class_args(parser, prefix=None)[source]
class hyperion.io.data_rw_factory.RandomAccessDataReaderFactory[source]
static create(rspecifier, path_prefix=None, transform=None, scp_sep=' ')[source]
static filter_args(**kwargs)[source]
static add_class_args(parser, prefix=None)[source]
static add_argparse_args(parser, prefix=None)

Functions to write and read Kaldi files.

class hyperion.io.rw_specifiers.ArchiveType(value)[source]

Types of archive: hdf5, Kaldi Ark or packed-audio files.

H5 = 0
ARK = 1
AUDIO = 2
SEGMENT_LIST = 3
RTTM = 4
class hyperion.io.rw_specifiers.WSpecType(value)[source]

Types of Kaldi-style write specifiers.

NO = 0
ARCHIVE = 1
SCRIPT = 2
BOTH = 3
class hyperion.io.rw_specifiers.WSpecifier(spec_type, archive, script, archive_type=ArchiveType.H5, binary=True, flush=False, permissive=False)[source]

Class to parse Kaldi style write specifier.

spec_type

WSpecType object describing the type of specifier:
  • ARCHIVE – Specifier contains Ark or hdf5 file.

  • SCRIPT – Specifier contains scp file.

  • BOTH – Specifier contains Ark/hdf5 file and scp file.

archive

output data file path.

script

optional output scp file.

archive_type

type of data files. ARK: Kaldi Ark file. H5: hdf5 file.

binary

True if the Ark file is binary, False if it is a text file.

flush

If True, it flushes the output after writing each feature matrix.

permissive

when writing to an scp file only: will ignore missing scp entries

__init__(spec_type, archive, script, archive_type=ArchiveType.H5, binary=True, flush=False, permissive=False)[source]
classmethod create(wspecifier)[source]

Creates WSpecifier object from string.

Parameters

wspecifier – Write specifier string, e.g.:

file.h5
h5:file.h5
ark:file.ark
h5,scp:file.h5,file.scp
ark,scp:file.ark,file.scp

Returns

WSpecifier object.
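The specifier strings listed above follow an "options:paths" pattern, which the sketch below splits apart. It is a simplified illustration and not the code behind `WSpecifier.create`: `parse_wspecifier` is a hypothetical function, it ignores flags such as binary, flush and permissive, and treating a bare path as an hdf5 archive is an assumption based on the default archive_type shown in the signature.

```python
def parse_wspecifier(wspecifier):
    """Split a write specifier into archive type, archive path and script path."""
    if ":" not in wspecifier:
        # Bare path such as "file.h5": assume an hdf5 archive with no script.
        return {"archive_type": "h5", "archive": wspecifier, "script": None}

    options, paths = wspecifier.split(":", 1)
    options = options.split(",")
    paths = paths.split(",")

    has_script = "scp" in options
    archive_type = "ark" if "ark" in options else "h5"
    if has_script and len(paths) == 2:
        archive, script = paths          # e.g. ark,scp:file.ark,file.scp
    elif has_script:
        archive, script = None, paths[0]  # script only
    else:
        archive, script = paths[0], None  # archive only
    return {"archive_type": archive_type, "archive": archive, "script": script}


for spec in ["file.h5", "h5:file.h5", "ark:file.ark",
             "h5,scp:file.h5,file.scp", "ark,scp:file.ark,file.scp"]:
    print(spec, "->", parse_wspecifier(spec))
```

Note that this sketch splits on the first colon, so it would mishandle paths that themselves contain a colon before the option prefix is stripped.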

__eq__(other)[source]

Equal operator.

__ne__(other)[source]

Non-equal operator.

__cmp__(other)[source]

Comparison operator.

class hyperion.io.rw_specifiers.RSpecType(value)[source]

An enumeration.

NO = 0
ARCHIVE = 1
SCRIPT = 2
class hyperion.io.rw_specifiers.RSpecifier(spec_type, archive, archive_type=ArchiveType.H5, once=False, is_sorted=False, called_sorted=False, permissive=False, background=False)[source]
__init__(spec_type, archive, archive_type=ArchiveType.H5, once=False, is_sorted=False, called_sorted=False, permissive=False, background=False)[source]
property script
classmethod create(rspecifier)[source]

Feature Reader Classes

ARK Feature Reader Classes

class hyperion.io.ark_data_reader.SequentialArkDataReader(file_path, **kwargs)[source]

Abstract base class to read Ark feature files in sequential order.

Attributes:

file_path

ark or scp file to read.

transform

TransformList object, applies a transformation to the features after reading them from disk.

part_idx

It splits the input into num_parts and reads only part part_idx, where part_idx=1,…,num_parts.

num_parts

Number of parts to split the input data.

split_by_key

If True, all the elements with the same key go to the same part.
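The part_idx/num_parts convention (1-based part index, input divided into num_parts pieces) can be sketched as below. The contiguous, near-equal split is an assumption made for illustration; hyperion may divide the data differently, and `select_part` is a hypothetical helper that ignores split_by_key.

```python
def select_part(keys, part_idx=1, num_parts=1):
    """Return the slice of keys that belongs to part part_idx (1-based)."""
    if not 1 <= part_idx <= num_parts:
        raise ValueError("part_idx must be in 1,...,num_parts")
    n = len(keys)
    # Integer arithmetic keeps the parts contiguous and covers every key once.
    start = (part_idx - 1) * n // num_parts
    stop = part_idx * n // num_parts
    return keys[start:stop]


keys = [f"utt{i}" for i in range(1, 8)]  # 7 keys
parts = [select_part(keys, i, 3) for i in (1, 2, 3)]
for i, part in enumerate(parts, start=1):
    print(i, part)
```

Running one job per part_idx lets several processes read disjoint portions of the same input in parallel.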

__init__(file_path, **kwargs)[source]

Abstract base class to read Ark or hdf5 feature files.

file_path

h5, ark or scp file to read.

transform

TransformList object, applies a transformation to the features after reading them from disk.

permissive

If True, if the data that we want to read is not in the file it returns an empty matrix, if False it raises an exception.

close()[source]

Closes input file.

_seek(offset)[source]

Moves the pointer of the input file.

Parameters

offset – Byte where we want to put the pointer.

_open_archive(file_path, offset=0)[source]
Opens the current file if it is not open and moves the file pointer to a given position. Closes previous open Ark files.

Parameters
  • file_path – File from which we want to read the next feature matrix.

  • offset – Byte position where feature matrix is in the file.

read_num_rows(num_records=0, assert_same_dim=True)[source]

Reads the number of rows in the feature matrices of the dataset.

Parameters
  • num_records – How many matrices shapes to read, if num_records=0 it reads all the matrices in the dataset.

  • assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

List of num_records recording names. Integer numpy array with num_records number of rows.

read_dims(num_records=0, assert_same_dim=True)[source]

Reads the number of columns in the feature matrices of the dataset.

Parameters
  • num_records – How many matrix shapes to read; if num_records=0, it reads all the matrices in the dataset.

  • assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

List of num_records recording names. Integer numpy array with num_records number of columns.

__enter__()

Function required when entering constructions of type

with DataReader('file.h5') as f:
    keys, data = f.read()

__exit__(exc_type, exc_value, traceback)

Function required when exiting from constructions of type

with DataReader('file.h5') as f:
    keys, data = f.read()

__iter__()

Needed to build an iterator, e.g.:

r = SequentialDataReader(…)
for key, data in r:
    print(key, data)

__next__()

Needed to build an iterator, e.g.:

r = SequentialDataReader(…)
for key, data in r:
    print(key, data)

static _apply_range_to_shape(shape, row_offset, num_rows)
Modifies shape given the user defined row_offset and num_rows to read.

If we are reading a matrix of shape (100,4) and row_offset=10, num_rows=20, it returns (20,4). If row_offset=20, num_rows=0, it returns (80,4).

Parameters
  • shape – Original shape of the feature matrix.

  • row_offset – User defined row_offset, first frame to read.

  • num_rows – User defined num_rows, number of frames to read.

Returns

2D tuple with modified shape.
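The two examples given above can be reproduced with a few lines. This is a sketch of the described behaviour, not the library's source: `apply_range_to_shape` is a hypothetical stand-alone function and it performs no bounds checking.

```python
def apply_range_to_shape(shape, row_offset=0, num_rows=0):
    """Return the shape obtained after applying row_offset and num_rows."""
    remaining = shape[0] - row_offset  # rows left after skipping row_offset
    rows = num_rows if num_rows > 0 else remaining  # 0 means "all remaining"
    return (rows,) + tuple(shape[1:])


print(apply_range_to_shape((100, 4), row_offset=10, num_rows=20))
print(apply_range_to_shape((100, 4), row_offset=20, num_rows=0))
```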

static _combine_ranges(read_range, row_offset, num_rows)
Combines two frame ranges.

One is the range in the scp file, e.g., in the scp file

recording1 file1.ark:34[3:40]
recording2 file1.ark:100[5:20]

[3:40] and [5:20] are frame ranges.

The user can decide to read just a submatrix of that, e.g., read 10 rows starting at row_offset 1. If we combine that with the range [3:40], the function returns row_offset=4 (3+1) and num_rows=10.

Parameters
  • read_range – Frame range from scp file. It is a tuple with the first row and number of rows to read.

  • row_offset – User defined row_offset.

  • num_rows – User defined number of rows to read; if it is 0, we read all the rows defined in the scp read_range.

Returns

Combined row_offset, first row of the recording to read. Combined number of rows (frames) to read.
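The combination rule described above can be written out as a short sketch. `combine_ranges` here is a hypothetical stand-alone function, not the library's code. The row count 38 used for the range [3:40] is illustrative only, and returning the rows left in the scp range when num_rows is 0 is an assumption.

```python
def combine_ranges(read_range, row_offset=0, num_rows=0):
    """Combine an scp frame range with a user-requested sub-range.

    read_range: (first_row, number_of_rows) from the scp file, or None.
    """
    if read_range is None:
        return row_offset, num_rows  # nothing to combine with
    first_row, range_rows = read_range
    combined_offset = first_row + row_offset
    if num_rows == 0:
        # Read whatever is left of the scp range after skipping row_offset.
        combined_rows = range_rows - row_offset
    else:
        combined_rows = num_rows
    return combined_offset, combined_rows


# scp range starting at row 3; user asks for 10 rows starting at row_offset 1
print(combine_ranges((3, 38), row_offset=1, num_rows=10))
print(combine_ranges((3, 38), row_offset=1, num_rows=0))
```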

static _squeeze(data, permissive=False)
Converts list of matrices to 3D numpy array or list of vectors to 2D numpy array.

Parameters
  • data – List of matrices or vectors.

  • permissive – If True, when one of the matrices/vectors in data is empty, it substitutes it with an all-zeros matrix/vector. If False, it raises an exception.

Returns

2D or 3D numpy array.

abstract eof()

End of file.

Returns

True, when we have read all the recordings in the dataset.

next()

__next__ for Python 2

abstract read(num_records=0, squeeze=False, offset=0, num_rows=0)

Reads next num_records feature matrices/vectors.

Parameters
  • num_records – Number of feature matrices to read.

  • squeeze – If True, it converts the list of matrices/vectors to 3D/2D numpy array. All matrices need to have same number of rows.

  • offset – List of integers or numpy array with the first row to read from each feature matrix.

  • num_rows – List of integers or numpy array with the number of rows to read from each feature matrix. If 0, it reads all the rows.

Returns
  • key – List of recording names.

  • data – List of feature matrices/vectors or 3D/2D numpy array.

abstract read_shapes(num_records=0, assert_same_dim=True)

Reads the shapes in the feature matrices of the dataset.

Parameters
  • num_records – How many matrix shapes to read; if num_records=0, it reads all the matrices in the dataset.

  • assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

List of num_records recording names. List of tuples with num_records shapes.

abstract reset()

Returns the file pointer to the beginning of the dataset, so that we can start reading the features again.

class hyperion.io.ark_data_reader.SequentialArkFileDataReader(file_path, **kwargs)[source]

Class to read feature matrices/vectors in sequential order from a single Ark file.

Attributes:

file_path

Ark file to read.

transform

TransformList object, applies a transformation to the features after reading them from disk.

part_idx

It splits the input into num_parts and reads only part part_idx, where part_idx=1,…,num_parts.

num_parts

Number of parts to split the input data.

split_by_key

If True, all the elements with the same key go to the same part.

__init__(file_path, **kwargs)[source]

Abstract base class to read Ark or hdf5 feature files.

file_path

h5, ark or scp file to read.

transform

TransformList object, applies a transformation to the features after reading them from disk.

permissive

If True, if the data that we want to read is not in the file it returns an empty matrix, if False it raises an exception.

reset()[source]

Puts the file pointer back to the beginning of the file.

eof()[source]

Returns True when it reaches the end of the ark file.

property keys
read_shapes(num_records=0, assert_same_dim=True)[source]

Reads the shapes in the feature matrices of the dataset.

Parameters
  • num_records – How many matrix shapes to read; if num_records=0, it reads all the matrices in the dataset.

  • assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

List of num_records recording names. List of tuples with num_records shapes.

read(num_records=0, squeeze=False, row_offset=0, num_rows=0)[source]

Reads next num_records feature matrices/vectors.

Parameters
  • num_records – Number of feature matrices to read.

  • squeeze – If True, it converts the list of matrices/vectors to 3D/2D numpy array. All matrices need to have same number of rows.

  • row_offset – List of integers or numpy array with the first row to read from each feature matrix.

  • num_rows – List of integers or numpy array with the number of rows to read from each feature matrix. If 0, it reads all the rows.

Returns
  • key – List of recording names.

  • data – List of feature matrices/vectors or 3D/2D numpy array.

__enter__()

Function required when entering constructions of type

with DataReader('file.h5') as f:
    keys, data = f.read()

__exit__(exc_type, exc_value, traceback)

Function required when exiting from constructions of type

with DataReader('file.h5') as f:
    keys, data = f.read()

__iter__()

Needed to build an iterator, e.g.:

r = SequentialDataReader(…)
for key, data in r:
    print(key, data)

__next__()

Needed to build an iterator, e.g.:

r = SequentialDataReader(…)
for key, data in r:
    print(key, data)

static _apply_range_to_shape(shape, row_offset, num_rows)
Modifies shape given the user defined row_offset and num_rows to read.

If we are reading a matrix of shape (100,4) and row_offset=10, num_rows=20, it returns (20,4). If row_offset=20, num_rows=0, it returns (80,4).

Parameters
  • shape – Original shape of the feature matrix.

  • row_offset – User defined row_offset, first frame to read.

  • num_rows – User defined num_rows, number of frames to read.

Returns

2D tuple with modified shape.

static _combine_ranges(read_range, row_offset, num_rows)
Combines two frame ranges.

One is the range in the scp file, e.g., in the scp file

recording1 file1.ark:34[3:40]
recording2 file1.ark:100[5:20]

[3:40] and [5:20] are frame ranges.

The user can decide to read just a submatrix of that, e.g., read 10 rows starting at row_offset 1. If we combine that with the range [3:40], the function returns row_offset=4 (3+1) and num_rows=10.

Parameters
  • read_range – Frame range from scp file. It is a tuple with the first row and number of rows to read.

  • row_offset – User defined row_offset.

  • num_rows – User defined number of rows to read; if it is 0, we read all the rows defined in the scp read_range.

Returns

Combined row_offset, first row of the recording to read. Combined number of rows (frames) to read.

_open_archive(file_path, offset=0)
Opens the current file if it is not open and moves the file pointer to a given position. Closes previous open Ark files.

Parameters
  • file_path – File from which we want to read the next feature matrix.

  • offset – Byte position where feature matrix is in the file.

_seek(offset)

Moves the pointer of the input file.

Parameters

offset – Byte where we want to put the pointer.

static _squeeze(data, permissive=False)
Converts list of matrices to 3D numpy array or list of vectors to 2D numpy array.

Parameters
  • data – List of matrices or vectors.

  • permissive – If True, when one of the matrices/vectors in data is empty, it substitutes it with an all-zeros matrix/vector. If False, it raises an exception.

Returns

2D or 3D numpy array.

close()

Closes input file.

next()

__next__ for Python 2

read_dims(num_records=0, assert_same_dim=True)

Reads the number of columns in the feature matrices of the dataset.

Parameters
  • num_records – How many matrix shapes to read; if num_records=0, it reads all the matrices in the dataset.

  • assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

List of num_records recording names. Integer numpy array with num_records number of columns.

read_num_rows(num_records=0, assert_same_dim=True)

Reads the number of rows in the feature matrices of the dataset.

Parameters
  • num_records – How many matrices shapes to read, if num_records=0 it reads all the matrices in the dataset.

  • assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

List of num_records recording names. Integer numpy array with num_records number of rows.

class hyperion.io.ark_data_reader.SequentialArkScriptDataReader(file_path, path_prefix=None, scp_sep=' ', **kwargs)[source]

Class to read Ark feature files indexed by a scp file in sequential order.

Attributes:

file_path

scp file to read.

path_prefix

If input_spec is a scp file, it prepends the path_prefix string to the second column of the scp file. This is useful when data is read from a different directory than the one where it was created.

scp_sep

Separator for scp files (default ' ').

transform

TransformList object, applies a transformation to the features after reading them from disk.

part_idx

It splits the input into num_parts and reads only part part_idx, where part_idx=1,…,num_parts.

num_parts

Number of parts to split the input data.

split_by_key

If True, all the elements with the same key go to the same part.
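How scp_sep and path_prefix act on an scp line can be sketched with a small parser for entries such as `recording1 file1.ark:34[3:40]`. This is an illustration, not hyperion's parser: `parse_scp_line` is a hypothetical function, prepending the prefix by plain string concatenation is an assumption, and the frame range is returned as an unparsed string.

```python
import re

# path, then optional ":byte_offset", then optional "[range]"
_SCP_SPEC = re.compile(
    r"^(?P<path>.+?)(?::(?P<offset>\d+))?(?:\[(?P<range>[^\]]*)\])?$"
)


def parse_scp_line(line, path_prefix=None, scp_sep=" "):
    """Return (key, path, byte_offset, frame_range) for one scp line."""
    key, spec = line.strip().split(scp_sep, 1)
    match = _SCP_SPEC.match(spec.strip())
    path = match.group("path")
    if path_prefix is not None:
        path = path_prefix + path  # assumed: plain string concatenation
    offset = match.group("offset")
    offset = int(offset) if offset is not None else None
    return key, path, offset, match.group("range")


print(parse_scp_line("recording1 file1.ark:34[3:40]", path_prefix="/data/"))
print(parse_scp_line("recording2 file1.ark:100[5:20]"))
```

Keeping the prefix out of the scp file itself is what makes the file reusable after the archives are moved to another directory.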

__init__(file_path, path_prefix=None, scp_sep=' ', **kwargs)[source]

Abstract base class to read Ark or hdf5 feature files.

file_path

h5, ark or scp file to read.

transform

TransformList object, applies a transformation to the features after reading them from disk.

permissive

If True, if the data that we want to read is not in the file it returns an empty matrix, if False it raises an exception.

property keys
reset()[source]

Closes all the open Ark files and puts the read pointer pointing to the first element in the scp file.

eof()[source]

Returns True when all the elements in the scp have been read.

read_shapes(num_records=0, assert_same_dim=True)[source]

Reads the shapes in the feature matrices of the dataset.

Parameters
  • num_records – How many matrix shapes to read; if num_records=0, it reads all the matrices in the dataset.

  • assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

List of num_records recording names. List of tuples with num_records shapes.

read(num_records=0, squeeze=False, row_offset=0, num_rows=0)[source]

Reads next num_records feature matrices/vectors.

Parameters
  • num_records – Number of feature matrices to read.

  • squeeze – If True, it converts the list of matrices/vectors to 3D/2D numpy array. All matrices need to have same number of rows.

  • row_offset – List of integers or numpy array with the first row to read from each feature matrix.

  • num_rows – List of integers or numpy array with the number of rows to read from each feature matrix. If 0, it reads all the rows.

Returns
  • key – List of recording names.

  • data – List of feature matrices/vectors or 3D/2D numpy array.

__enter__()

Function required when entering constructions of type

with DataReader('file.h5') as f:
    keys, data = f.read()

__exit__(exc_type, exc_value, traceback)

Function required when exiting from constructions of type

with DataReader('file.h5') as f:
    keys, data = f.read()

__iter__()

Needed to build an iterator, e.g.:

r = SequentialDataReader(…)
for key, data in r:
    print(key, data)

__next__()

Needed to build an iterator, e.g.:

r = SequentialDataReader(…)
for key, data in r:
    print(key, data)

static _apply_range_to_shape(shape, row_offset, num_rows)
Modifies shape given the user defined row_offset and num_rows to read.

If we are reading a matrix of shape (100,4) and row_offset=10, num_rows=20, it returns (20,4). If row_offset=20, num_rows=0, it returns (80,4).

Parameters
  • shape – Original shape of the feature matrix.

  • row_offset – User defined row_offset, first frame to read.

  • num_rows – User defined num_rows, number of frames to read.

Returns

2D tuple with modified shape.

static _combine_ranges(read_range, row_offset, num_rows)
Combines two frame ranges.

One is the range in the scp file, e.g., in the scp file

recording1 file1.ark:34[3:40]
recording2 file1.ark:100[5:20]

[3:40] and [5:20] are frame ranges.

The user can decide to read just a submatrix of that, e.g., read 10 rows starting at row_offset 1. If we combine that with the range [3:40], the function returns row_offset=4 (3+1) and num_rows=10.

Parameters
  • read_range – Frame range from scp file. It is a tuple with the first row and number of rows to read.

  • row_offset – User defined row_offset.

  • num_rows – User defined number of rows to read; if it is 0, we read all the rows defined in the scp read_range.

Returns

Combined row_offset, first row of the recording to read. Combined number of rows (frames) to read.

_open_archive(file_path, offset=0)
Opens the current file if it is not open and moves the file pointer to a given position. Closes previous open Ark files.

Parameters
  • file_path – File from which we want to read the next feature matrix.

  • offset – Byte position where feature matrix is in the file.

_seek(offset)

Moves the pointer of the input file.

Parameters

offset – Byte where we want to put the pointer.

static _squeeze(data, permissive=False)
Converts list of matrices to 3D numpy array or list of vectors to 2D numpy array.

Parameters
  • data – List of matrices or vectors.

  • permissive – If True, when one of the matrices/vectors in data is empty, it substitutes it with an all-zeros matrix/vector. If False, it raises an exception.

Returns

2D or 3D numpy array.

close()

Closes input file.

next()

__next__ for Python 2

read_dims(num_records=0, assert_same_dim=True)

Reads the number of columns in the feature matrices of the dataset.

Parameters
  • num_records – How many matrix shapes to read; if num_records=0, it reads all the matrices in the dataset.

  • assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

List of num_records recording names. Integer numpy array with num_records number of columns.

read_num_rows(num_records=0, assert_same_dim=True)

Reads the number of rows in the feature matrices of the dataset.

Parameters
  • num_records – How many matrices shapes to read, if num_records=0 it reads all the matrices in the dataset.

  • assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

List of num_records recording names. Integer numpy array with num_records number of rows.

class hyperion.io.ark_data_reader.RandomAccessArkDataReader(file_path, path_prefix=None, transform=None, permissive=False, scp_sep=' ')[source]

Class to read Ark files in random order, using scp file to index the Ark files.

Attributes:

file_path

scp file to read.

path_prefix

If input_spec is a scp file, it prepends the path_prefix string to the second column of the scp file. This is useful when data is read from a different directory than the one where it was created.

transform

TransformList object, applies a transformation to the features after reading them from disk.

permissive

If True, if the data that we want to read is not in the file it returns an empty matrix, if False it raises an exception.

scp_sep

Separator for scp files (default ' ').

__init__(file_path, path_prefix=None, transform=None, permissive=False, scp_sep=' ')[source]
Abstract base class to read Ark or hdf5 feature files in random order.

file_path

h5 or scp file to read.

transform

TransformList object, applies a transformation to the features after reading them from disk.

permissive

If True, if the data that we want to read is not in the file it returns an empty matrix, if False it raises an exception.

property keys
close()[source]

Closes all the open Ark files.

_open_archive(key_idx, offset=0)[source]
Opens the Ark file corresponding to a given feature matrix if it is not already open and moves the file pointer to the point where we can read that feature matrix.

If the file was already open, it only moves the file pointer.

Parameters
  • key_idx – Integer position of the feature matrix in the scp file.

  • offset – Byte where we can find the feature matrix in the Ark file.

Returns

Python file object. threading.Lock object corresponding to the file

read_num_rows(keys, assert_same_dim=True)[source]

Reads the number of rows in the feature matrices of the dataset.

Parameters
  • keys – List of recording names from which we want to retrieve the number of rows.

  • assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

Integer numpy array with the number of rows for the recordings in keys.

read_dims(keys, assert_same_dim=True)[source]

Reads the number of columns in the feature matrices of the dataset.

Parameters
  • keys – List of recording names from which we want to retrieve the number of columns.

  • assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

Integer numpy array with the number of columns for the recordings in keys

read_shapes(keys, assert_same_dim=True)[source]

Reads the shapes in the feature matrices of the dataset.

Parameters
  • keys – List of recording names from which we want to retrieve the shapes.

  • assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

List of tuples with the shapes for the recordings in keys.
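The relation between read_shapes, read_num_rows, read_dims and the assert_same_dim flag can be sketched as follows. This is a minimal illustration, not the library code: collect_shapes and the shapes_by_key dictionary are hypothetical stand-ins for the information stored in the feature files.

```python
def collect_shapes(shapes_by_key, keys, assert_same_dim=True):
    """Return the shapes for keys, optionally checking the column count."""
    shapes = [tuple(shapes_by_key[key]) for key in keys]
    if assert_same_dim and shapes:
        # The last axis holds the feature dimension (number of columns).
        num_cols = {shape[-1] for shape in shapes}
        if len(num_cols) > 1:
            raise ValueError(
                f"matrices have different numbers of columns: {sorted(num_cols)}"
            )
    return shapes


# Hypothetical shapes: two matrices with 4 columns, one with 5.
shapes_by_key = {"utt1": (100, 4), "utt2": (80, 4), "utt3": (60, 5)}

shapes = collect_shapes(shapes_by_key, ["utt1", "utt2"])
num_rows = [shape[0] for shape in shapes]  # the information read_num_rows reports
dims = [shape[-1] for shape in shapes]     # the information read_dims reports
print(shapes, num_rows, dims)
```

With assert_same_dim=False the check is skipped, so keys whose matrices differ in the number of columns can still be queried together.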

__enter__()

Function required when entering constructions of the type:

    with DataReader('file.h5') as f:
        keys, data = f.read()

__exit__(exc_type, exc_value, traceback)

Function required when exiting from constructions of the type:

    with DataReader('file.h5') as f:
        keys, data = f.read()

static _apply_range_to_shape(shape, row_offset, num_rows)
Modifies shape given the user defined row_offset and num_rows to read.

If we are reading a matrix of shape (100,4) and row_offset=10, num_rows=20, it returns (20,4). If row_offset=20, num_rows=0, it returns (80,4).

Parameters
  • shape – Original shape of the feature matrix.

  • row_offset – User defined row_offset, first frame to read.

  • num_rows – User defined num_rows, number of frames to read.

Returns

2D tuple with modified shape.
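The arithmetic in the example above can be reproduced with a short sketch. The function below is an illustrative reimplementation written for this page, assuming that num_rows=0 means "all remaining rows"; the error handling is an assumption and may differ from the library.

```python
def apply_range_to_shape(shape, row_offset=0, num_rows=0):
    """Return the shape left after skipping row_offset rows and keeping num_rows rows."""
    rows_left = shape[0] - row_offset
    if rows_left < 0:
        raise ValueError("row_offset is past the end of the matrix")
    if num_rows > 0:
        if num_rows > rows_left:
            raise ValueError("num_rows exceeds the rows available after row_offset")
        rows_left = num_rows
    # Only the first axis (frames) changes; the other axes are kept.
    return (rows_left,) + tuple(shape[1:])


print(apply_range_to_shape((100, 4), row_offset=10, num_rows=20))  # (20, 4)
print(apply_range_to_shape((100, 4), row_offset=20, num_rows=0))   # (80, 4)
```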

static _combine_ranges(read_range, row_offset, num_rows)
Combines two frame ranges.
One is the range in the scp file, e.g., in the scp file

    recording1 file1.ark:34[3:40]
    recording2 file1.ark:100[5:20]

[3:40] and [5:20] are frame ranges.

The user can decide to read just a submatrix of that, e.g., read 10 rows starting at row_offset 1. If we combine that with the range [3:40], the function returns row_offset=4 (3+1) and num_rows=10.

Parameters
  • read_range – Frame range from scp file. It is a tuple with the first row and number of rows to read.

  • row_offset – User defined row_offset.

  • num_rows – User defined number of rows to read; if it is 0, we read all the rows defined in the scp read_range.

Returns

Combined row_offset, first row of the recording to read. Combined number of rows (frames) to read.
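A minimal sketch of how the two ranges could be combined is shown below. It assumes read_range has already been parsed into a (first_row, num_rows) tuple, so the scp range [3:40] becomes (3, 38) if both ends are inclusive, and that a row count of 0 means "until the end". Both points are assumptions made for illustration, and combine_ranges is not the library function.

```python
def combine_ranges(read_range, row_offset=0, num_rows=0):
    """Combine an scp (first_row, num_rows) range with a user-defined sub-range."""
    if read_range is None:
        # No range in the scp file: the user range applies directly.
        return row_offset, num_rows

    scp_first_row, scp_num_rows = read_range
    combined_offset = scp_first_row + row_offset

    if scp_num_rows == 0:
        # The scp range is open-ended, so only the user limit applies.
        return combined_offset, num_rows

    rows_left = scp_num_rows - row_offset
    if rows_left <= 0:
        raise ValueError("row_offset is past the end of the scp range")
    if num_rows == 0:
        return combined_offset, rows_left
    if num_rows > rows_left:
        raise ValueError("num_rows exceeds the rows left in the scp range")
    return combined_offset, num_rows


# The example from the text: scp range starting at row 3, user reads 10 rows from offset 1.
print(combine_ranges((3, 38), row_offset=1, num_rows=10))  # (4, 10)
```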

static _squeeze(data, permissive=False)
Converts a list of matrices to a 3D numpy array, or a list of vectors to a 2D numpy array.

Parameters
  • data – List of matrices or vectors.

  • permissive – If True, an empty matrix/vector in data is replaced by an all-zeros matrix/vector. If False, it raises an exception.

Returns

2D or 3D numpy array.
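The stacking behaviour described above can be sketched with NumPy. This is an illustrative version written for this page: the choice of the first non-empty item as the reference shape, and the exceptions raised, are assumptions rather than the library's exact behaviour.

```python
import numpy as np


def squeeze(data, permissive=False):
    """Stack a list of equally shaped matrices (or vectors) into one array."""
    arrays = [np.asarray(x) for x in data]
    # Use the first non-empty item as the reference shape and dtype.
    ref = next((a for a in arrays if a.size > 0), None)
    if ref is None:
        raise ValueError("every item in data is empty")
    stacked = []
    for i, a in enumerate(arrays):
        if a.size == 0:
            if not permissive:
                raise ValueError(f"item {i} is empty")
            # Permissive mode: substitute the missing item with zeros.
            a = np.zeros(ref.shape, dtype=ref.dtype)
        elif a.shape != ref.shape:
            raise ValueError(f"item {i} has shape {a.shape}, expected {ref.shape}")
        stacked.append(a)
    return np.stack(stacked, axis=0)


matrices = [np.ones((5, 4)), np.empty((0, 4)), np.ones((5, 4))]
batch = squeeze(matrices, permissive=True)
print(batch.shape)  # (3, 5, 4)
```

A list of vectors is handled the same way and yields a 2D array with one row per vector.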

read(keys, squeeze=False, row_offset=0, num_rows=0)[source]

Reads the feature matrices/vectors for the recordings in keys.

Parameters
  • keys – List of recording names from which we want to retrieve the feature matrices/vectors.

  • squeeze – If True, it converts the list of matrices/vectors to a 3D/2D numpy array. All matrices need to have the same number of rows.

  • row_offset – List of integers or numpy array with the first row to read from each feature matrix.

  • num_rows – List of integers or numpy array with the number of rows to read from each feature matrix. If 0, it reads all the rows.

Returns

data: List of feature matrices/vectors or 3D/2D numpy array.

HDF5 Feature Reader Classes

Copyright 2018 Johns Hopkins University (Author: Jesus Villalba) Apache 2.0 (http://www.apache.org/licenses/LICENSE-2.0)

Classes to read data from hdf5 files.

hyperion.io.h5_data_reader._read_h5_data(dset, row_offset=0, num_rows=0, transform=None)[source]
Auxiliary function to read the feature matrix from hdf5 dataset.

It decompresses the data if it was compressed.

Parameters
  • dset – hdf5 dataset corresponding to a feature matrix/vector.

  • row_offset – First row to read from each feature matrix.

  • num_rows – Number of rows to read from the feature matrix. If 0 it reads all the rows.

  • transform – TransformList object, applies a transformation to the features after reading them from disk.

Returns

Numpy array with feature matrix/vector.

class hyperion.io.h5_data_reader.SequentialH5DataReader(file_path, **kwargs)[source]

Abstract base class to read hdf5 feature files in sequential order.

Attributes:

file_path: hdf5 or scp file to read.

transform: TransformList object, applies a transformation to the features after reading them from disk.

part_idx: It splits the input into num_parts and reads only part part_idx, where part_idx=1,…,num_parts.

num_parts: Number of parts to split the input data.

split_by_key: If True, all the elements with the same key go to the same part.
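The effect of part_idx, num_parts and split_by_key can be illustrated on a plain list of keys. The sketch below assumes contiguous, nearly equal parts; the helper names part_bounds and select_part are hypothetical, and the library may divide the data differently.

```python
def part_bounds(n, part_idx, num_parts):
    """Return the [first, last) slice bounds of part part_idx (1-based) of n items."""
    first = (part_idx - 1) * n // num_parts
    last = part_idx * n // num_parts
    return first, last


def select_part(keys, part_idx=1, num_parts=1, split_by_key=False):
    """Return the keys that belong to part part_idx out of num_parts."""
    if not 1 <= part_idx <= num_parts:
        raise ValueError("part_idx must be in 1,...,num_parts")
    if split_by_key:
        # Split the distinct keys, so repeated keys always land in the same part.
        unique_keys = list(dict.fromkeys(keys))
        first, last = part_bounds(len(unique_keys), part_idx, num_parts)
        selected = set(unique_keys[first:last])
        return [key for key in keys if key in selected]
    first, last = part_bounds(len(keys), part_idx, num_parts)
    return keys[first:last]


keys = ["a", "b", "c", "d", "e"]
print(select_part(keys, part_idx=1, num_parts=2))  # ['a', 'b']
print(select_part(keys, part_idx=2, num_parts=2))  # ['c', 'd', 'e']
```

Running one job per value of part_idx lets each job process a disjoint share of the input.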

__init__(file_path, **kwargs)[source]

Abstract base class to read Ark or hdf5 feature files.

file_path

h5, ark or scp file to read.

transform

TransformList object, applies a transformation to the features after reading them from disk.

permissive

If True, it returns an empty matrix when the requested data is not in the file; if False, it raises an exception.

close()[source]

Closes current hdf5 file.

_open_archive(file_path)[source]

Opens the hdf5 file that contains the next matrix/vector, if it is not already open. If another hdf5 file was open, it closes it.

read_num_rows(num_records=0, assert_same_dim=True)[source]

Reads the number of rows in the feature matrices of the dataset.

Parameters
  • num_records – Number of matrix shapes to read; if num_records=0, it reads all the matrices in the dataset.

  • assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

List of num_records recording names. Integer numpy array with num_records number of rows.

read_dims(num_records=0, assert_same_dim=True)[source]

Reads the number of columns in the feature matrices of the dataset.

Parameters
  • num_records – Number of matrix shapes to read; if num_records=0, it reads all the matrices in the dataset.

  • assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

List of num_records recording names. Integer numpy array with num_records number of columns.

__enter__()

Function required when entering constructions of the type:

    with DataReader('file.h5') as f:
        keys, data = f.read()

__exit__(exc_type, exc_value, traceback)

Function required when exiting from constructions of the type:

    with DataReader('file.h5') as f:
        keys, data = f.read()

__iter__()

Needed to build an iterator, e.g.:

    r = SequentialDataReader(…)
    for key, data in r:
        print(key, data)

__next__()

Needed to build an iterator, e.g.:

    r = SequentialDataReader(…)
    for key, data in r:
        print(key, data)

static _apply_range_to_shape(shape, row_offset, num_rows)
Modifies shape given the user defined row_offset and num_rows to read.

If we are reading a matrix of shape (100,4) and row_offset=10, num_rows=20, it returns (20,4). If row_offset=20, num_rows=0, it returns (80,4).

Parameters
  • shape – Original shape of the feature matrix.

  • row_offset – User defined row_offset, first frame to read.

  • num_rows – User defined num_rows, number of frames to read.

Returns

2D tuple with modified shape.

static _combine_ranges(read_range, row_offset, num_rows)
Combines two frame ranges.
One is the range in the scp file, e.g., in the scp file

    recording1 file1.ark:34[3:40]
    recording2 file1.ark:100[5:20]

[3:40] and [5:20] are frame ranges.

The user can decide to read just a submatrix of that, e.g., read 10 rows starting at row_offset 1. If we combine that with the range [3:40], the function returns row_offset=4 (3+1) and num_rows=10.

Parameters
  • read_range – Frame range from scp file. It is a tuple with the first row and number of rows to read.

  • row_offset – User defined row_offset.

  • num_rows – User defined number of rows to read; if it is 0, we read all the rows defined in the scp read_range.

Returns

Combined row_offset, first row of the recording to read. Combined number of rows (frames) to read.

static _squeeze(data, permissive=False)
Converts a list of matrices to a 3D numpy array, or a list of vectors to a 2D numpy array.

Parameters
  • data – List of matrices or vectors.

  • permissive – If True, an empty matrix/vector in data is replaced by an all-zeros matrix/vector. If False, it raises an exception.

Returns

2D or 3D numpy array.

abstract eof()

End of file.

Returns

True, when we have read all the recordings in the dataset.

next()

__next__ for Python 2

abstract read(num_records=0, squeeze=False, offset=0, num_rows=0)

Reads next num_records feature matrices/vectors.

Parameters
  • num_records – Number of feature matrices to read.

  • squeeze – If True, it converts the list of matrices/vectors to a 3D/2D numpy array. All matrices need to have the same number of rows.

  • offset – List of integers or numpy array with the first row to read from each feature matrix.

  • num_rows – List of integers or numpy array with the number of rows to read from each feature matrix. If 0, it reads all the rows.

Returns

key: List of recording names.

data: List of feature matrices/vectors or 3D/2D numpy array.

abstract read_shapes(num_records=0, assert_same_dim=True)

Reads the shapes in the feature matrices of the dataset.

Parameters
  • num_records – Number of matrix shapes to read; if num_records=0, it reads all the matrices in the dataset.

  • assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

List of num_records recording names. List of tuples with num_records shapes.

abstract reset()

Returns the file pointer to the beginning of the dataset, so that the features can be read again.
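The sequential protocol formed by read, eof, reset, __iter__/__next__ and the with statement can be sketched with a toy reader over an in-memory dictionary. ToySequentialReader is a hypothetical class written for this page; it performs no file access and is not part of hyperion.

```python
class ToySequentialReader:
    """In-memory stand-in that mimics the sequential reader protocol."""

    def __init__(self, records):
        self._items = list(records.items())
        self._pos = 0
        self.closed = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()

    def close(self):
        self.closed = True

    def eof(self):
        # True once every record has been read.
        return self._pos >= len(self._items)

    def reset(self):
        # Move the read pointer back to the first record.
        self._pos = 0

    def read(self, num_records=0):
        # num_records=0 means "read everything that is left".
        if num_records == 0:
            num_records = len(self._items) - self._pos
        chunk = self._items[self._pos:self._pos + num_records]
        self._pos += len(chunk)
        keys = [key for key, _ in chunk]
        data = [value for _, value in chunk]
        return keys, data

    def __iter__(self):
        return self

    def __next__(self):
        if self.eof():
            raise StopIteration
        keys, data = self.read(1)
        return keys[0], data[0]


with ToySequentialReader({"utt1": [1, 2], "utt2": [3]}) as reader:
    first_keys, first_data = reader.read(1)  # reads only the first record
    remaining = list(reader)                 # iterates over what is left
    at_end = reader.eof()
    reader.reset()
    after_reset = reader.eof()
print(first_keys, remaining, at_end, after_reset)
```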

class hyperion.io.h5_data_reader.SequentialH5FileDataReader(file_path, **kwargs)[source]

Class to read feature matrices/vectors in sequential order from a single hdf5 file.

Attributes:

file_path: hdf5 file to read.

transform: TransformList object, applies a transformation to the features after reading them from disk.

part_idx: It splits the input into num_parts and reads only part part_idx, where part_idx=1,…,num_parts.

num_parts: Number of parts to split the input data.

split_by_key: If True, all the elements with the same key go to the same part.

__init__(file_path, **kwargs)[source]

Abstract base class to read Ark or hdf5 feature files.

file_path

h5, ark or scp file to read.

transform

TransformList object, applies a transformation to the features after reading them from disk.

permissive

If True, it returns an empty matrix when the requested data is not in the file; if False, it raises an exception.

property keys
reset()[source]

Puts the file pointer back to the beginning of the file.

eof()[source]

Returns True when it reaches the end of the hdf5 file.

read_shapes(num_records=0, assert_same_dim=True)[source]

Reads the shapes in the feature matrices of the dataset.

Parameters
  • num_records – Number of matrix shapes to read; if num_records=0, it reads all the matrices in the dataset.

  • assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

List of num_records recording names. List of tuples with num_records shapes.

read(num_records=0, squeeze=False, row_offset=0, num_rows=0)[source]

Reads next num_records feature matrices/vectors.

Parameters
  • num_records – Number of feature matrices to read.

  • squeeze – If True, it converts the list of matrices/vectors to a 3D/2D numpy array. All matrices need to have the same number of rows.

  • row_offset – List of integers or numpy array with the first row to read from each feature matrix.

  • num_rows – List of integers or numpy array with the number of rows to read from each feature matrix. If 0, it reads all the rows.

Returns

key: List of recording names.

data: List of feature matrices/vectors or 3D/2D numpy array.

__enter__()

Function required when entering constructions of the type:

    with DataReader('file.h5') as f:
        keys, data = f.read()

__exit__(exc_type, exc_value, traceback)

Function required when exiting from constructions of the type:

    with DataReader('file.h5') as f:
        keys, data = f.read()

__iter__()

Needed to build an iterator, e.g.:

    r = SequentialDataReader(…)
    for key, data in r:
        print(key, data)

__next__()

Needed to build an iterator, e.g.:

    r = SequentialDataReader(…)
    for key, data in r:
        print(key, data)

static _apply_range_to_shape(shape, row_offset, num_rows)
Modifies shape given the user defined row_offset and num_rows to read.

If we are reading a matrix of shape (100,4) and row_offset=10, num_rows=20, it returns (20,4). If row_offset=20, num_rows=0, it returns (80,4).

Parameters
  • shape – Original shape of the feature matrix.

  • row_offset – User defined row_offset, first frame to read.

  • num_rows – User defined num_rows, number of frames to read.

Returns

2D tuple with modified shape.

static _combine_ranges(read_range, row_offset, num_rows)
Combines two frame ranges.
One is the range in the scp file, e.g., in the scp file

    recording1 file1.ark:34[3:40]
    recording2 file1.ark:100[5:20]

[3:40] and [5:20] are frame ranges.

The user can decide to read just a submatrix of that, e.g., read 10 rows starting at row_offset 1. If we combine that with the range [3:40], the function returns row_offset=4 (3+1) and num_rows=10.

Parameters
  • read_range – Frame range from scp file. It is a tuple with the first row and number of rows to read.

  • row_offset – User defined row_offset.

  • num_rows – User defined number of rows to read; if it is 0, we read all the rows defined in the scp read_range.

Returns

Combined row_offset, first row of the recording to read. Combined number of rows (frames) to read.

_open_archive(file_path)

Opens the hdf5 file that contains the next matrix/vector, if it is not already open. If another hdf5 file was open, it closes it.

static _squeeze(data, permissive=False)
Converts a list of matrices to a 3D numpy array, or a list of vectors to a 2D numpy array.

Parameters
  • data – List of matrices or vectors.

  • permissive – If True, an empty matrix/vector in data is replaced by an all-zeros matrix/vector. If False, it raises an exception.

Returns

2D or 3D numpy array.

close()

Closes current hdf5 file.

next()

__next__ for Python 2

read_dims(num_records=0, assert_same_dim=True)

Reads the number of columns in the feature matrices of the dataset.

Parameters
  • num_records – Number of matrix shapes to read; if num_records=0, it reads all the matrices in the dataset.

  • assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

List of num_records recording names. Integer numpy array with num_records number of columns.

read_num_rows(num_records=0, assert_same_dim=True)

Reads the number of rows in the feature matrices of the dataset.

Parameters
  • num_records – Number of matrix shapes to read; if num_records=0, it reads all the matrices in the dataset.

  • assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

List of num_records recording names. Integer numpy array with num_records number of rows.

class hyperion.io.h5_data_reader.SequentialH5ScriptDataReader(file_path, path_prefix=None, scp_sep=' ', **kwargs)[source]

Class to read features from multiple hdf5 files where a scp file indicates which hdf5 file contains each feature matrix.

Attributes:

file_path: scp file to read.

path_prefix: If input_spec is a scp file, it prepends the path_prefix string to the second column of the scp file. This is useful when data is read from a different directory from the one where it was created.

scp_sep: Separator for scp files (default ' ').

transform: TransformList object, applies a transformation to the features after reading them from disk.

part_idx: It splits the input into num_parts and reads only part part_idx, where part_idx=1,…,num_parts.

num_parts: Number of parts to split the input data.

split_by_key: If True, all the elements with the same key go to the same part.
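How path_prefix and scp_sep act on the lines of a scp file can be sketched as follows. The sketch assumes the simplest layout, one "key path" pair per line, and that the prefix is joined by plain string concatenation, as the wording "prepends the path_prefix string" suggests. parse_scp is a hypothetical helper, not the parser used by the library, and it ignores byte offsets and frame ranges.

```python
def parse_scp(lines, path_prefix=None, scp_sep=" "):
    """Parse 'key<sep>path' lines and prepend path_prefix to each path."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split only once, so the path itself may contain the separator.
        key, file_path = line.split(scp_sep, 1)
        if path_prefix is not None:
            file_path = path_prefix + file_path
        entries.append((key, file_path))
    return entries


# Hypothetical scp content with relative paths.
scp_lines = ["utt1 feats/a.h5", "utt2 feats/b.h5"]
entries = parse_scp(scp_lines, path_prefix="/data/")
print(entries)
```

Because the prefix is concatenated as a string in this sketch, it has to end with a path separator when it names a directory.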

__init__(file_path, path_prefix=None, scp_sep=' ', **kwargs)[source]

Abstract base class to read Ark or hdf5 feature files.

file_path

h5, ark or scp file to read.

transform

TransformList object, applies a transformation to the features after reading them from disk.

permissive

If True, it returns an empty matrix when the requested data is not in the file; if False, it raises an exception.

property keys
reset()[source]

Closes all the open hdf5 files and moves the read pointer to the first element in the scp file.

eof()[source]

Returns True when all the elements in the scp have been read.

read_shapes(num_records=0, assert_same_dim=True)[source]

Reads the shapes in the feature matrices of the dataset.

Parameters
  • num_records – Number of matrix shapes to read; if num_records=0, it reads all the matrices in the dataset.

  • assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

List of num_records recording names. List of tuples with num_records shapes.

read(num_records=0, squeeze=False, row_offset=0, num_rows=0)[source]

Reads next num_records feature matrices/vectors.

Parameters
  • num_records – Number of feature matrices to read.

  • squeeze – If True, it converts the list of matrices/vectors to a 3D/2D numpy array. All matrices need to have the same number of rows.

  • row_offset – List of integers or numpy array with the first row to read from each feature matrix.

  • num_rows – List of integers or numpy array with the number of rows to read from each feature matrix. If 0, it reads all the rows.

Returns

key: List of recording names.

data: List of feature matrices/vectors or 3D/2D numpy array.

__enter__()

Function required when entering constructions of the type:

    with DataReader('file.h5') as f:
        keys, data = f.read()

__exit__(exc_type, exc_value, traceback)

Function required when exiting from constructions of the type:

    with DataReader('file.h5') as f:
        keys, data = f.read()

__iter__()

Needed to build an iterator, e.g.:

    r = SequentialDataReader(…)
    for key, data in r:
        print(key, data)

__next__()

Needed to build an iterator, e.g.:

    r = SequentialDataReader(…)
    for key, data in r:
        print(key, data)

static _apply_range_to_shape(shape, row_offset, num_rows)
Modifies shape given the user defined row_offset and num_rows to read.

If we are reading a matrix of shape (100,4) and row_offset=10, num_rows=20, it returns (20,4). If row_offset=20, num_rows=0, it returns (80,4).

Parameters
  • shape – Original shape of the feature matrix.

  • row_offset – User defined row_offset, first frame to read.

  • num_rows – User defined num_rows, number of frames to read.

Returns

2D tuple with modified shape.

static _combine_ranges(read_range, row_offset, num_rows)
Combines two frame ranges.
One is the range in the scp file, e.g., in the scp file

    recording1 file1.ark:34[3:40]
    recording2 file1.ark:100[5:20]

[3:40] and [5:20] are frame ranges.

The user can decide to read just a submatrix of that, e.g., read 10 rows starting at row_offset 1. If we combine that with the range [3:40], the function returns row_offset=4 (3+1) and num_rows=10.

Parameters
  • read_range – Frame range from scp file. It is a tuple with the first row and number of rows to read.

  • row_offset – User defined row_offset.

  • num_rows – User defined number of rows to read; if it is 0, we read all the rows defined in the scp read_range.

Returns

Combined row_offset, first row of the recording to read. Combined number of rows (frames) to read.

_open_archive(file_path)

Opens the hdf5 file that contains the next matrix/vector, if it is not already open. If another hdf5 file was open, it closes it.

static _squeeze(data, permissive=False)
Converts a list of matrices to a 3D numpy array, or a list of vectors to a 2D numpy array.

Parameters
  • data – List of matrices or vectors.

  • permissive – If True, an empty matrix/vector in data is replaced by an all-zeros matrix/vector. If False, it raises an exception.

Returns

2D or 3D numpy array.

close()

Closes current hdf5 file.

next()

__next__ for Python 2

read_dims(num_records=0, assert_same_dim=True)

Reads the number of columns in the feature matrices of the dataset.

Parameters
  • num_records – Number of matrix shapes to read; if num_records=0, it reads all the matrices in the dataset.

  • assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

List of num_records recording names. Integer numpy array with num_records number of columns.

read_num_rows(num_records=0, assert_same_dim=True)

Reads the number of rows in the feature matrices of the dataset.

Parameters
  • num_records – Number of matrix shapes to read; if num_records=0, it reads all the matrices in the dataset.

  • assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

List of num_records recording names. Integer numpy array with num_records number of rows.

class hyperion.io.h5_data_reader.RandomAccessH5DataReader(file_path, transform=None, permissive=False)[source]

Abstract base class to read hdf5 feature files in random order.

Attributes:

file_path: hdf5 or scp file to read.

transform: TransformList object, applies a transformation to the features after reading them from disk.

permissive: If True, it returns an empty matrix when the requested data is not in the file; if False, it raises an exception.

__init__(file_path, transform=None, permissive=False)[source]
Abstract base class to read Ark or hdf5 feature files in random order.

file_path

h5 or scp file to read.

transform

TransformList object, applies a transformation to the features after reading them from disk.

permissive

If True, it returns an empty matrix when the requested data is not in the file; if False, it raises an exception.

read_num_rows(keys, assert_same_dim=True)[source]

Reads the number of rows in the feature matrices of the dataset.

Parameters
  • keys – List of recording names from which we want to retrieve the number of rows.

  • assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

Integer numpy array with the number of rows for the recordings in keys.

read_dims(keys, assert_same_dim=True)[source]

Reads the number of columns in the feature matrices of the dataset.

Parameters
  • keys – List of recording names from which we want to retrieve the number of columns.

  • assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

Integer numpy array with the number of columns for the recordings in keys

__enter__()

Function required when entering constructions of the type:

    with DataReader('file.h5') as f:
        keys, data = f.read()

__exit__(exc_type, exc_value, traceback)

Function required when exiting from constructions of the type:

    with DataReader('file.h5') as f:
        keys, data = f.read()

static _apply_range_to_shape(shape, row_offset, num_rows)
Modifies shape given the user defined row_offset and num_rows to read.

If we are reading a matrix of shape (100,4) and row_offset=10, num_rows=20, it returns (20,4). If row_offset=20, num_rows=0, it returns (80,4).

Parameters
  • shape – Original shape of the feature matrix.

  • row_offset – User defined row_offset, first frame to read.

  • num_rows – User defined num_rows, number of frames to read.

Returns

2D tuple with modified shape.

static _combine_ranges(read_range, row_offset, num_rows)
Combines two frame ranges.
One is the range in the scp file, e.g., in the scp file

    recording1 file1.ark:34[3:40]
    recording2 file1.ark:100[5:20]

[3:40] and [5:20] are frame ranges.

The user can decide to read just a submatrix of that, e.g., read 10 rows starting at row_offset 1. If we combine that with the range [3:40], the function returns row_offset=4 (3+1) and num_rows=10.

Parameters
  • read_range – Frame range from scp file. It is a tuple with the first row and number of rows to read.

  • row_offset – User defined row_offset.

  • num_rows – User defined number of rows to read; if it is 0, we read all the rows defined in the scp read_range.

Returns

Combined row_offset, first row of the recording to read. Combined number of rows (frames) to read.

static _squeeze(data, permissive=False)
Converts a list of matrices to a 3D numpy array, or a list of vectors to a 2D numpy array.

Parameters
  • data – List of matrices or vectors.

  • permissive – If True, an empty matrix/vector in data is replaced by an all-zeros matrix/vector. If False, it raises an exception.

Returns

2D or 3D numpy array.

abstract close()

Closes input file.

abstract read(keys, squeeze=False, offset=0, num_rows=0)

Reads the feature matrices/vectors for the recordings in keys.

Parameters
  • keys – List of recording names from which we want to retrieve the feature matrices/vectors.

  • squeeze – If True, it converts the list of matrices/vectors to a 3D/2D numpy array. All matrices need to have the same number of rows.

  • offset – List of integers or numpy array with the first row to read from each feature matrix.

  • num_rows – List of integers or numpy array with the number of rows to read from each feature matrix. If 0, it reads all the rows.

Returns

data: List of feature matrices/vectors or 3D/2D numpy array.

abstract read_shapes(keys=None, assert_same_dim=True)

Reads the shapes in the feature matrices of the dataset.

Parameters
  • keys – List of recording names from which we want to retrieve the shapes.

  • assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

List of tuples with the shapes for the recordings in keys.

class hyperion.io.h5_data_reader.RandomAccessH5FileDataReader(file_path, **kwargs)[source]

Class to read from a single hdf5 file in random order

file_path

hdf5 file to read.

transform

TransformList object, applies a transformation to the features after reading them from disk.

permissive

If True, it returns an empty matrix when the requested data is not in the file; if False, it raises an exception.

__init__(file_path, **kwargs)[source]
Abstract base class to read Ark or hdf5 feature files in random order.

file_path

h5 or scp file to read.

transform

TransformList object, applies a transformation to the features after reading them from disk.

permissive

If True, it returns an empty matrix when the requested data is not in the file; if False, it raises an exception.

close()[source]

Closes the hdf5 files.

_open_archive(file_path)[source]

Opens the hdf5 file if it is not open.

property keys
read_shapes(keys, assert_same_dim=True)[source]

Reads the shapes in the feature matrices of the dataset.

Parameters
  • keys – List of recording names from which we want to retrieve the shapes.

  • assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

List of tuples with the shapes for the recordings in keys.

read(keys, squeeze=False, row_offset=0, num_rows=0)[source]

Reads the feature matrices/vectors for the recordings in keys.

Parameters
  • keys – List of recording names from which we want to retrieve the feature matrices/vectors.

  • squeeze – If True, it converts the list of matrices/vectors to a 3D/2D numpy array. All matrices need to have the same number of rows.

  • row_offset – List of integers or numpy array with the first row to read from each feature matrix.

  • num_rows – List of integers or numpy array with the number of rows to read from each feature matrix. If 0, it reads all the rows.

Returns

data: List of feature matrices/vectors or 3D/2D numpy array.

__enter__()

Function required when entering constructions of the type:

    with DataReader('file.h5') as f:
        keys, data = f.read()

__exit__(exc_type, exc_value, traceback)

Function required when exiting from constructions of the type:

    with DataReader('file.h5') as f:
        keys, data = f.read()

static _apply_range_to_shape(shape, row_offset, num_rows)
Modifies shape given the user defined row_offset and num_rows to read.

If we are reading a matrix of shape (100,4) and row_offset=10, num_rows=20, it returns (20,4). If row_offset=20, num_rows=0, it returns (80,4).

Parameters
  • shape – Original shape of the feature matrix.

  • row_offset – User defined row_offset, first frame to read.

  • num_rows – User defined num_rows, number of frames to read.

Returns

2D tuple with modified shape.

static _combine_ranges(read_range, row_offset, num_rows)
Combines two frame ranges.
One is the range in the scp file, e.g., in the scp file

    recording1 file1.ark:34[3:40]
    recording2 file1.ark:100[5:20]

[3:40] and [5:20] are frame ranges.

The user can decide to read just a submatrix of that, e.g., read 10 rows starting at row_offset 1. If we combine that with the range [3:40], the function returns row_offset=4 (3+1) and num_rows=10.

Parameters
  • read_range – Frame range from scp file. It is a tuple with the first row and number of rows to read.

  • row_offset – User defined row_offset.

  • num_rows – User defined number of rows to read; if it is 0, we read all the rows defined in the scp read_range.

Returns

Combined row_offset, first row of the recording to read. Combined number of rows (frames) to read.

static _squeeze(data, permissive=False)
Converts a list of matrices to a 3D numpy array, or a list of vectors to a 2D numpy array.

Parameters
  • data – List of matrices or vectors.

  • permissive – If True, an empty matrix/vector in data is replaced by an all-zeros matrix/vector. If False, it raises an exception.

Returns

2D or 3D numpy array.

read_dims(keys, assert_same_dim=True)

Reads the number of columns in the feature matrices of the dataset.

Parameters
  • keys – List of recording names from which we want to retrieve the number of columns.

  • assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

Integer numpy array with the number of columns for the recordings in keys

read_num_rows(keys, assert_same_dim=True)

Reads the number of rows in the feature matrices of the dataset.

Parameters
  • keys – List of recording names from which we want to retrieve the number of rows.

  • assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

Integer numpy array with the number of rows for the recordings in keys.

class hyperion.io.h5_data_reader.RandomAccessH5ScriptDataReader(file_path, path_prefix=None, scp_sep=' ', **kwargs)[source]

Class to read multiple hdf5 files in random order, where a scp file indicates which hdf5 file contains each feature matrix.

file_path

scp file to read.

path_prefix

If input_spec is a scp file, it prepends the path_prefix string to the second column of the scp file. This is useful when data is read from a different directory from the one where it was created.

transform

TransformList object, applies a transformation to the features after reading them from disk.

permissive

If True, it returns an empty matrix when the requested data is not in the file; if False, it raises an exception.

scp_sep

Separator for scp files (default ' ').

__init__(file_path, path_prefix=None, scp_sep=' ', **kwargs)[source]
Abstract base class to read Ark or hdf5 feature files in random order.

file_path

h5 or scp file to read.

transform

TransformList object, applies a transformation to the features after reading them from disk.

permissive

If True, it returns an empty matrix when the requested data is not in the file; if False, it raises an exception.

close()[source]

Closes all the open hdf5 files.

property keys
__enter__()

Function required when entering constructions of the type:

    with DataReader('file.h5') as f:
        keys, data = f.read()

__exit__(exc_type, exc_value, traceback)

Function required when exiting from constructions of the type:

    with DataReader('file.h5') as f:
        keys, data = f.read()

static _apply_range_to_shape(shape, row_offset, num_rows)
Modifies shape given the user defined row_offset and num_rows to read.

If we are reading a matrix of shape (100,4) and row_offset=10, num_rows=20, it returns (20,4). If row_offset=20, num_rows=0, it returns (80,4).

Parameters
  • shape – Original shape of the feature matrix.

  • row_offset – User defined row_offset, first frame to read.

  • num_rows – User defined num_rows, number of frames to read.

Returns

2D tuple with modified shape.
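
The two cases described above can be reproduced with a short sketch. This is an illustrative reimplementation under the assumption that num_rows=0 means "all rows from row_offset to the end"; it does no bounds checking and is not the library's code.

```python
def apply_range_to_shape(shape, row_offset, num_rows):
    """Returns the shape of the submatrix selected by row_offset and num_rows.

    Sketch only: num_rows == 0 is taken to mean "all rows after row_offset".
    """
    shape = list(shape)
    # Rows that remain once the first row_offset rows are skipped.
    shape[0] -= row_offset
    if num_rows > 0:
        # An explicit number of rows overrides the remainder.
        shape[0] = num_rows
    return tuple(shape)


print(apply_range_to_shape((100, 4), row_offset=10, num_rows=20))
print(apply_range_to_shape((100, 4), row_offset=20, num_rows=0))
```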

static _combine_ranges(read_range, row_offset, num_rows)
Combines two frame ranges.
One is the range in the scp file, e.g., in the scp file

recording1 file1.ark:34[3:40]
recording2 file1.ark:100[5:20]

[3:40] and [5:20] are frame ranges.

The user can decide to read just a submatrix of that, e.g., 10 rows starting at row_offset 1. If we combine that with the range [3:40], the function returns row_offset=4 (3+1) and num_rows=10.

Parameters
  • read_range – Frame range from scp file. It is a tuple with the first row and number of rows to read.

  • row_offset – User defined row_offset.

  • num_rows – User defined number of rows to read; if it is 0, we read all the rows defined in the scp read_range.

Returns

Combined row_offset, first row of the recording to read. Combined number of rows (frames) to read.
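
A minimal sketch of this combination follows. It assumes read_range is a (first_row, number_of_rows) tuple with a positive number of rows, and that num_rows=0 selects whatever remains of the scp range after row_offset. The tuple (3, 38) is what [3:40] would give if its bounds are inclusive, which is also an assumption here. The function is illustrative, not the library's implementation.

```python
def combine_ranges(read_range, row_offset=0, num_rows=0):
    """Combines an scp frame range with a user-defined row_offset/num_rows.

    Sketch only: read_range is (first_row, number_of_rows), number_of_rows > 0.
    """
    first_row, range_rows = read_range
    if num_rows == 0:
        # Read everything left in the scp range after skipping row_offset rows.
        num_rows = range_rows - row_offset
    if row_offset + num_rows > range_rows:
        raise ValueError("requested rows fall outside the scp range")
    # The user offset is relative to the first row of the scp range.
    return first_row + row_offset, num_rows


print(combine_ranges((3, 38), row_offset=1, num_rows=10))
```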

_open_archive(key_idx)[source]
Opens the hdf5 file corresponding to a given feature/matrix if it is not already open.

Parameters

key_idx – Integer position of the feature matrix in the scp file.

Returns

Python file object.

static _squeeze(data, permissive=False)
Converts a list of matrices to a 3D numpy array or a list of vectors to a 2D numpy array.

Parameters
  • data – List of matrices or vectors.

  • permissive – If True, an empty matrix/vector in data is replaced by an all-zeros matrix/vector. If False, it raises an exception.

Returns

2D or 3D numpy array.
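
The stacking behavior can be sketched with NumPy as follows. This is an illustrative version, not the library's code: it assumes that an empty item takes the shape of the first non-empty item when permissive=True, and it relies on np.stack to reject items whose shapes differ.

```python
import numpy as np


def squeeze(data, permissive=False):
    """Stacks a list of equally shaped matrices/vectors into one array.

    Sketch only: empty items become zeros when permissive is True.
    """
    arrays = [np.asarray(x) for x in data]
    non_empty = [x for x in arrays if x.size > 0]
    if not non_empty:
        raise ValueError("all items in data are empty")
    ref = non_empty[0]  # reference shape and dtype for zero filling
    filled = []
    for x in arrays:
        if x.size == 0:
            if not permissive:
                raise ValueError("empty matrix/vector found in data")
            x = np.zeros_like(ref)
        filled.append(x)
    # np.stack raises ValueError if the shapes do not all match.
    return np.stack(filled, axis=0)


mats = [np.ones((3, 2)), np.ones((3, 2))]
print(squeeze(mats).shape)
```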

read_dims(keys, assert_same_dim=True)

Reads the number of columns in the feature matrices of the dataset.

Parameters
  • keys – List of recording names from which we want to retrieve the number of columns.

  • assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

Integer numpy array with the number of columns for the recordings in keys

read_num_rows(keys, assert_same_dim=True)

Reads the number of rows in the feature matrices of the dataset.

Parameters
  • keys – List of recording names from which we want to retrieve the number of rows.

  • assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

Integer numpy array with the number of rows for the recordings in keys.

read_shapes(keys, assert_same_dim=True)[source]

Reads the shapes in the feature matrices of the dataset.

Parameters
  • keys – List of recording names from which we want to retrieve the shapes.

  • assert_same_dim – If True, it raises an exception if not all the matrices have the same number of columns.

Returns

List of tuples with the shapes for the recordings in keys.

read(keys, squeeze=False, row_offset=0, num_rows=0)[source]

Reads the feature matrices/vectors for the recordings in keys.

Parameters
  • keys – List of recording names from which we want to retrieve the feature matrices/vectors.

  • squeeze – If True, it converts the list of matrices/vectors to a 3D/2D numpy array. All matrices need to have the same number of rows.

  • row_offset – List of integers or numpy array with the first row to read from each feature matrix.

  • num_rows – List of integers or numpy array with the number of rows to read from each feature matrix. If 0, it reads all the rows.

Returns

List of feature matrices/vectors or 3D/2D numpy array.

Return type

data

Feature Writer Classes

ARK Feature Writer Classes

Copyright 2018 Johns Hopkins University (Author: Jesus Villalba) Apache 2.0 (http://www.apache.org/licenses/LICENSE-2.0)

class hyperion.io.ark_data_writer.ArkDataWriter(archive_path, script_path=None, binary=True, **kwargs)[source]

Class to write Ark feature files.

archive_path

output data file path.

script_path

optional output scp file.

binary

True if the Ark file is binary, False if it is a text file.

flush[source]

If True, it flushes the output after writing each feature file.

compress

If True, it uses Kaldi compression.

compression_method

Kaldi compression method: {auto (default), speech_feat, 2byte-auto, 2byte-signed-integer, 1byte-auto, 1byte-unsigned-integer, 1byte-0-1}.

scp_sep

Separator for scp files (default ‘ ‘).

__init__(archive_path, script_path=None, binary=True, **kwargs)[source]
__exit__(exc_type, exc_value, traceback)[source]

Function required when exiting from constructions of type

with ArkDataWriter('file.h5') as f:

f.write(key, data)

It closes the output file.

close()[source]

Closes the output file

flush()[source]

Flushes the file

__enter__()

Function required when entering constructions of type

with DataWriter('file.h5') as f:

f.write(key, data)

_convert_data(data)[source]

Converts the feature matrix from numpy array to KaldiMatrix or KaldiCompressedMatrix.

write(keys, data)[source]

Writes data to file.

Parameters
  • keys – List of recording names.

  • data – List of Feature matrices or vectors. If all the matrices have the same dimension it can be a 3D numpy array. If they are vectors, it can be a 2D numpy array.

HDF5 Feature Writer Classes

Copyright 2018 Johns Hopkins University (Author: Jesus Villalba) Apache 2.0 (http://www.apache.org/licenses/LICENSE-2.0)

class hyperion.io.h5_data_writer.H5DataWriter(archive_path, script_path=None, **kwargs)[source]

Class to write hdf5 feature files.

archive_path

output data file path.

script_path

optional output scp file.

flush[source]

If True, it flushes the output after writing each feature file.

compress

If True, it uses Kaldi compression.

compression_method

Kaldi compression method: {auto (default), speech_feat, 2byte-auto, 2byte-signed-integer, 1byte-auto, 1byte-unsigned-integer, 1byte-0-1}.

scp_sep

Separator for scp files (default ‘ ‘).

__init__(archive_path, script_path=None, **kwargs)[source]
__exit__(exc_type, exc_value, traceback)[source]

Function required when exiting from constructions of type

with H5DataWriter('file.h5') as f:

f.write(key, data)

It closes the output file.

close()[source]

Closes the output file

flush()[source]

Flushes the file

_convert_data(data)[source]

Converts data to the format for saving. Compresses the data if needed.

Parameters

data – Numpy array feature matrix/vector.

Returns

Numpy array to save in h5 file. Attributes for the hdf5 dataset with information about the compression.

__enter__()

Function required when entering constructions of type

with DataWriter('file.h5') as f:

f.write(key, data)

write(keys, data)[source]

Writes data to file.

Parameters
  • keys – List of recording names.

  • data – List of Feature matrices or vectors. If all the matrices have the same dimension it can be a 3D numpy array. If they are vectors, it can be a 2D numpy array.

VAD Read/Write Classes

VAD Reader Factory Classes

These are Factory Classes that generate VAD Reader objects.

Copyright 2018 Johns Hopkins University (Author: Jesus Villalba) Apache 2.0 (http://www.apache.org/licenses/LICENSE-2.0)

class hyperion.io.vad_rw_factory.VADReaderFactory[source]
static create(rspecifier, path_prefix=None, scp_sep=' ', frame_length=25, frame_shift=10, snip_edges=False)[source]
static filter_args(**kwargs)[source]
static add_class_args(parser, prefix=None)[source]
static add_argparse_args(parser, prefix=None)

VAD Reader Classes

Copyright 2019 Johns Hopkins University (Author: Jesus Villalba) Apache 2.0 (http://www.apache.org/licenses/LICENSE-2.0)

class hyperion.io.bin_vad_reader.BinVADReader(rspecifier, path_prefix=None, scp_sep=' ', frame_length=25, frame_shift=10, snip_edges=False)[source]
__init__(rspecifier, path_prefix=None, scp_sep=' ', frame_length=25, frame_shift=10, snip_edges=False)[source]
read_num_frames(keys)[source]
read(keys, squeeze=False, offset=0, num_frames=0, frame_length=25, frame_shift=10, snip_edges=False, signal_lengths=None)[source]
read_timestamps(keys, merge_tol=0.001)[source]
__enter__()

Function required when entering constructions of type

with VADReader('file.h5') as f:

keys, data = f.read()

__exit__(exc_type, exc_value, traceback)

Function required when exiting from constructions of type

with VADReader('file.h5') as f:

keys, data = f.read()

close()

Closes input file.

Copyright 2018 Johns Hopkins University (Author: Jesus Villalba) Apache 2.0 (http://www.apache.org/licenses/LICENSE-2.0)

class hyperion.io.segment_vad_reader.SegmentVADReader(segments_file, permissive=False)[source]
__init__(segments_file, permissive=False)[source]
read(keys, squeeze=False, offset=0, num_frames=0, frame_length=25, frame_shift=10, snip_edges=False, signal_lengths=None)[source]
read_timestamps(keys, merge_tol=0)[source]
__enter__()

Function required when entering constructions of type

with VADReader('file.h5') as f:

keys, data = f.read()

__exit__(exc_type, exc_value, traceback)

Function required when exiting from constructions of type

with VADReader('file.h5') as f:

keys, data = f.read()

close()

Closes input file.