Nabu-asr
input_pipeline.py File Reference

contains the methodology for creating the input pipeline

Functions

def nabu.processing.input_pipeline.get_filenames (dataconfs)
 create a list of filenames to put into the queue
 
def nabu.processing.input_pipeline.input_pipeline (data_queue, batch_size, numbuckets, dataconfs, variable_batch_size=False, allow_smaller_final_batch=False, name=None)
 create the input pipeline
 
def nabu.processing.input_pipeline.bucket_boundaries (histogram, numbuckets)
 determine the bucket boundaries to uniformly divide the number of elements over the buckets
 

Detailed Description

contains the methodology for creating the input pipeline

Function Documentation

§ bucket_boundaries()

def nabu.processing.input_pipeline.bucket_boundaries (   histogram,
  numbuckets 
)

determine the bucket boundaries to uniformly divide the number of elements over the buckets

this is a greedy algorithm and does not guarantee an optimal solution
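
The greedy idea can be illustrated with a minimal sketch. This is not the Nabu implementation: the function name greedy_bucket_boundaries is hypothetical, and it assumes that histogram[i] holds the number of sequences of length i and that each boundary is an exclusive upper edge. It walks the histogram once and closes a bucket as soon as it holds at least total/numbuckets elements.

```python
def greedy_bucket_boundaries(histogram, numbuckets):
    """Hypothetical sketch, not the Nabu code.

    Assumes histogram[i] is the number of sequences with length i.
    Returns numbuckets - 1 (or fewer) exclusive upper edges.
    """
    total = sum(histogram)
    target = total / numbuckets  # ideal number of elements per bucket
    boundaries = []
    count = 0
    for length, n in enumerate(histogram):
        count += n
        # greedily close the bucket once it reaches the target size;
        # the last bucket stays open and takes whatever is left
        if count >= target and len(boundaries) < numbuckets - 1:
            boundaries.append(length + 1)
            count = 0
    return boundaries


# 16 sequences over 2 buckets: lengths below 3 go to the first bucket
print(greedy_bucket_boundaries([0, 4, 4, 2, 2, 4], 2))
```

Because a bucket is closed at the first length that reaches the target, a single very frequent length can overfill one bucket and leave later buckets short, which is why a greedy pass like this does not guarantee the most even split.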

§ get_filenames()

def nabu.processing.input_pipeline.get_filenames (   dataconfs)

create a list of filenames to put into the queue

Parameters
dataconfs  the database configurations as a list of lists
Returns
  • a list containing all the filenames
  • a list containing the names

§ input_pipeline()

def nabu.processing.input_pipeline.input_pipeline (   data_queue,
  batch_size,
  numbuckets,
  dataconfs,
  variable_batch_size = False,
  allow_smaller_final_batch = False,
  name = None 
)

create the input pipeline

Parameters
data_queue                 the data queue where the filenames are queued
batch_size                 the desired batch size
numbuckets                 the number of data buckets
dataconfs                  the database configuration sections that should be read, as a list of lists
variable_batch_size        bool, if True the batch size changes from bucket to bucket: buckets with a higher seq_length use a smaller batch size
allow_smaller_final_batch  if set to True, a smaller final batch is allowed
name                       name of the pipeline
Returns
  • the data elements as a list of [batch_size x ...] tensors
  • the sequence lengths as a list of [batch_size] tensors
  • the number of steps in each epoch
  • the maximal length a sequence can be
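
One way to picture the effect of variable_batch_size is the sketch below. It is an assumption, not the rule Nabu applies: the helper bucket_batch_sizes is hypothetical, and it supposes that the shortest bucket gets the desired batch_size and that longer buckets are scaled down so every batch holds roughly the same number of sequence steps.

```python
def bucket_batch_sizes(batch_size, boundaries, max_length):
    """Hypothetical helper, not part of Nabu: one batch size per bucket."""
    # upper edge of every bucket; the last bucket ends at max_length
    edges = list(boundaries) + [max_length]
    # budget of sequence steps per batch, fixed by the shortest bucket
    budget = batch_size * edges[0]
    # longer buckets get proportionally fewer sequences, but at least one
    return [max(1, budget // edge) for edge in edges]


# three buckets with upper edges 100, 200 and 400
print(bucket_batch_sizes(32, [100, 200], 400))
```

Keeping the step budget constant is one common motivation for shrinking the batch size with sequence length, since the memory used by a padded batch grows with batch size times the longest sequence in it.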