Detailed API Doc

ML_Logger API

class ml_logger.ML_Logger(prefix='', *prefixae, root='/home/docs/checkouts/readthedocs.org/user_builds/ml-logger/checkouts/stable/ml_logger/docs', user=None, access_token=None, buffer_size=2048, max_workers=None, asynchronous=None, summary_cache_opts: dict = None)[source]

ML_Logger, a logging utility for ML training.

Async(clean=False, **kwargs)[source]

Returns a context in which the logger logs [a]synchronously. The new asynchronous request pool is cached on the logging client, so this context can be entered repeatedly without creating a runaway number of parallel threads.

The context object can only be used once because it is created through a generator using the @contextmanager decorator.

Parameters:
  • clean – boolean flag for removing the thread pool after __exit__. Used to enforce single-use AsyncContexts.
  • max_workers – future_sessions.Session pool max_workers field
Returns:

context object
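
The pool caching and the single-use restriction described above can be illustrated with a small stand-alone sketch. It does not use ml_logger: SketchClient, async_context and pools_created are hypothetical names, and a ThreadPoolExecutor stands in for the request pool.

```python
from concurrent.futures import ThreadPoolExecutor
from contextlib import contextmanager


class SketchClient:
    """Hypothetical stand-in for a logging client; not ml_logger's LogClient."""

    def __init__(self):
        self.pool = None            # cached worker pool, created lazily
        self.pools_created = 0

    @contextmanager
    def async_context(self, clean=False, max_workers=2):
        if self.pool is None:       # later entries reuse the cached pool
            self.pool = ThreadPoolExecutor(max_workers=max_workers)
            self.pools_created += 1
        try:
            yield self.pool
        finally:
            if clean:               # drop the pool so the next context rebuilds it
                self.pool.shutdown(wait=True)
                self.pool = None


client = SketchClient()
results = []
for _ in range(3):
    # calling async_context() again builds a fresh context object each time
    with client.async_context() as pool:
        results.append(pool.submit(str.upper, "logged").result())

ctx = client.async_context()
with ctx:
    pass
try:
    with ctx:                       # re-entering the same context object fails
        pass
    reused = True
except (RuntimeError, AttributeError):
    reused = False

client.pool.shutdown()
```

In this sketch three contexts share one pool, while entering the same generator-based context object a second time raises an error.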

AsyncContext(clean=False, **kwargs)

Returns a context in which the logger logs [a]synchronously. The new asynchronous request pool is cached on the logging client, so this context can be entered repeatedly without creating a runaway number of parallel threads.

The context object can only be used once because it is created through a generator using the @contextmanager decorator.

Parameters:
  • clean – boolean flag for removing the thread pool after __exit__. Used to enforce single-use AsyncContexts.
  • max_workers – future_sessions.Session pool max_workers field
Returns:

context object

Prefix(*praefixa, metrics=None, sep='/')[source]

Returns a context in which the prefix of the logger is set to the given prefix.

Parameters:
  • praefixa – the new prefix
Returns:

context object

PrefixContext(*praefixa, metrics=None, sep='/')

Returns a context in which the prefix of the logger is set to the given prefix.

Parameters:
  • praefixa – the new prefix
Returns:

context object

Sync(clean=False, **kwargs)[source]

Returns a context in which the logger logs synchronously. The new synchronous request pool is cached on the logging client, so this context can be entered repeatedly without creating a runaway number of parallel threads.

The context object can only be used once because it is created through a generator using the @contextmanager decorator.

Parameters:
  • clean – boolean flag for removing the thread pool after __exit__. Used to enforce single-use SyncContexts.
  • max_workers – urllib3 session pool max_workers field
Returns:

context object

SyncContext(clean=False, **kwargs)

Returns a context in which the logger logs synchronously. The new synchronous request pool is cached on the logging client, so this context can be entered repeatedly without creating a runaway number of parallel threads.

The context object can only be used once because it is created through a generator using the @contextmanager decorator.

Parameters:
  • clean – boolean flag for removing the thread pool after __exit__. Used to enforce single-use SyncContexts.
  • max_workers – urllib3 session pool max_workers field
Returns:

context object

abspath(*paths)[source]

returns the absolute path w.r.t the logging directory.

print(logger.abspath("some", "path"))

# /home/ge/some/path
Parameters:
  • *paths – positional arguments for each segment of the path.
Returns:

absolute path w.r.t. the logging directory (excluding the prefix)

configure(prefix=None, *prefixae, root: str = None, user=None, access_token=None, asynchronous=None, max_workers=None, buffer_size=None, summary_cache_opts: dict = None, register_experiment=None, silent=False)[source]

Configure an existing logger with updated configurations.

# LogClient Behavior

The logger.client would be re-constructed if

  • root_dir is changed
  • max_workers is not None
  • asynchronous is not None

Because the http LogClient contains http thread pools, one shouldn’t call this configure function in a loop. Instead, use the logger.(A)syncContext() contexts. That context caches the pool so that you don’t create new thread pools again and again.

# Cache Behavior

Both key-value cache and the summary cache would be cleared if summary_cache_opts is set to not None. A new summary cache would be created, whereas the old key-value cache would be cleared.

# Print Buffer Behavior

If configure is called with a buffer_size that is not None, the old print buffer is cleared.

todo: I’m considering also clearing this buffer when the summary cache is updated. The use case for changing print_buffer_size is pretty small. Should probably just deprecate this.

# Registering New Experiment

This is a convenient default for new users. It prints out a link to the dashboard.

todo: the table at the moment seems a bit verbose. I’m considering making this just a single-line print.
Parameters:
  • prefix – the first prefix
  • *prefixae – a list of prefix segments
  • root
  • user
  • access_token
  • buffer_size
  • summary_cache_opts
  • asynchronous
  • max_workers
  • register_experiment
  • silent – bool, True to turn off the print.
Returns:

diff(diff_directory='.', diff_filename='index.diff', ref='HEAD', verbose=False)[source]

example usage:

from ml_logger import logger

logger.diff()  # => this writes a diff file to the root of your logging directory.
Parameters:
  • ref – the ref to diff against. Defaults to HEAD.
  • diff_directory – the root directory in which to call git diff. Defaults to the current directory.
  • diff_filename – The file key for saving the diff file.
  • verbose – if True, print out the command.
Returns:

string containing the content of the patch

every(n=1, key='default', start_on=0)[source]

returns True every n counts. Use the key to count different intervals.

Example:

for i in range(100):
    if logger.every(10):
        print('every tenth count!')
    if logger.every(100, "hundred"):
        print('every 100th count!')
    if logger.every(10, "hundred", start_on=1):
        print('every 10th count starting from the first call: i =', i)
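
The counting rule can be sketched without ml_logger as follows. This is only an illustration under the assumption that each key owns an independent call counter and that start_on shifts which call fires first; EveryCounter is a hypothetical name, not the library's implementation.

```python
from collections import defaultdict


class EveryCounter:
    """Hypothetical stand-in for the counter behind logger.every."""

    def __init__(self):
        self.counts = defaultdict(int)    # one independent counter per key

    def every(self, n=1, key="default", start_on=0):
        self.counts[key] += 1             # count this call
        # assumed rule: fire when the shifted call count is a multiple of n
        return (self.counts[key] - start_on) % n == 0


counter = EveryCounter()
default_hits = [i for i in range(30) if counter.every(10, key="a")]
shifted_hits = [i for i in range(30) if counter.every(10, key="b", start_on=1)]
print(default_hits)  # [9, 19, 29]
print(shifted_hits)  # [0, 10, 20]
```

With start_on=0 the tenth call is the first to return True, whereas start_on=1 fires on the very first call, which matches the [0, 10, 20] versus [9, 19, …] pattern mentioned in the parameter description.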
Parameters:
  • n
  • key
  • start_on – start on this call. Use start_on=1 for tail mode [0, 10, 20] instead of [9, 19, …]
Returns:

flush(cache=None, file=None)[source]

Flushes the key_value cache and the print buffer

static fn_info(fn)[source]

logs information of the caller’s stack (module, filename etc)

Parameters:
  • fn
Returns:

info = dict(name=_['__name__'], doc=_['__doc__'], module=_['__module__'], file=_['__globals__']['__file__'])

get_dataframe(*keys, x_key=None, path='metrics.pkl', wd=None, num_bins=None, bin_size=1, silent=False, default=None, collect='std', verbose=False)

Returns a Pandas.DataFrame object that contains metrics from all files.

Parameters:
  • keys

    if none are passed, returns the entire dataframe. If 1 key is passed, return that column. If multiple keys are passed, return individual columns.

    If you want to get the joined table for multiple keys, directly filter after this call.

  • bin – binOption(xKey, n, steps)
  • path – can contain glob patterns, will return concatenated dataframe from all paths found with the pattern.
  • silent
  • default – Default value for columns. Not widely used.
  • collect – One of [ “std”, True, False ]
  • kwargs – Not used besides the default argument.
Returns:

pandas.DataFrame or None when no metric file is found.

get_parameters(*keys, path='parameters.pkl', not_exist_ok=False, **kwargs)[source]

utility to obtain the hyperparameters as a flattened dictionary.

  1. returns a dot-flattened dictionary if no keys are passed.
  2. returns a single value if only one key is passed.
  3. returns a list of values if multiple keys are passed.

If keys are passed, returns an array with each item corresponding to those keys

lr, global_metric = logger.get_parameters('Args.lr', 'Args.global_metric')
print(lr, global_metric)

this returns:

0.03 'ResNet18L2'

Raises a FileNotFound error if the parameter file pointed to by the path is empty. To avoid this, add a default keyword value to the call:

param = logger.get_parameters('does_not_exist', default=None)
assert param is None, "should be the default value: None"
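
The three return shapes and the default fallback can be illustrated with plain dictionaries. The sketch below assumes that "dot-flattened" means nested namespaces are joined with "."; flatten and pick are hypothetical helpers, not part of ml_logger.

```python
_MISSING = object()   # sentinel so that default=None can be told apart from "no default"


def flatten(nested, parent=""):
    """Flatten nested dicts into a single dict with dot-joined keys."""
    flat = {}
    for key, value in nested.items():
        full_key = f"{parent}.{key}" if parent else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key))
        else:
            flat[full_key] = value
    return flat


def pick(flat, *keys, default=_MISSING):
    """Return the whole dict, a single value, or a list, depending on the keys."""
    def lookup(key):
        if key in flat:
            return flat[key]
        if default is _MISSING:
            raise KeyError(key)
        return default

    if not keys:
        return flat
    if len(keys) == 1:
        return lookup(keys[0])
    return [lookup(key) for key in keys]


params = flatten({"Args": {"lr": 0.03, "global_metric": "ResNet18L2"}})
lr, global_metric = pick(params, "Args.lr", "Args.global_metric")
print(lr, global_metric)  # 0.03 ResNet18L2
```

Passing default=None to pick mirrors the documented way of avoiding an exception for a missing key.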
Parameters:
  • *keys

    A list of strings to specify the parameter keys

  • silent – bool, prevents raising an exception.
  • path – Path to the parameters.pkl file. Keyword argument, default to parameters.pkl.
  • default – Undefined. If the default key is present, return default when param is missing.
Returns:

git_rev(branch)[source]

Helper function used by `logger.__head__` that returns the git revision hash of the branch that you pass in.

Full reference here: https://stackoverflow.com/a/949391. The show-ref and for-each-ref commands both show a list of refs. We only need the ref hash for the revision, not the entire branch or tag.

glob(query, wd=None, recursive=True, start=None, stop=None)[source]

Globs files under the work directory (wd). Note that wd affects the file paths being returned. The default is the current logging prefix. Use an absolute path (with a leading slash, /) to escape the logging prefix. Use two leading slashes for an absolute path on the host of the logging server.

with logger.PrefixContext("<your-run-prefix>"):
    runs = logger.glob('**/metrics.pkl')
    for _ in runs:
        exp_log = logger.load_pkl(_)
Parameters:
  • query
  • wd

    defaults to the current prefix. When a truthy value is given, uses: wd = pJoin(self.prefix, wd)

    if you want root of the logging server instance, use abs path headed by /. If you want root of the server file system, double slash: //home/directory-name-blah.

  • recursive
  • start
  • stop
Returns:

None if the directory does not exist (internal FileNotFoundError)

glob_gs(query='', wd=None, max_results=1000, **kwargs)[source]

Does not support wildcard or pagination, but we could add it in the future.

Parameters:
  • query
  • wd
  • max_results – default is 1000
Returns:

glob_s3(query='*', wd=None, max_keys=1000, **KWargs)[source]

Does not support wildcard or pagination, but we could add it in the future.

Parameters:
  • query
  • wd
  • max_keys – default is 1000 as in boto3
Returns:

iload_pkl(key, **kwargs)[source]

load a pkl file as an iterator.

for chunk in logger.iload_pkl("episodeyang/weights.pkl"):
    print(chunk)

or alternatively just read a single data file:

data, = logger.iload_pkl("episodeyang/weights.pkl")

when key starts with a single slash as in “/debug/some-run”, the leading slash is removed and the remaining path is pathJoin’ed with the data_dir of the server.

So if you want to access an absolute path of the filesystem that the logging server is in, you should prepend two leading slashes. This way, when the leading slash is removed, the remaining path is still an absolute path and joining it with the data_dir has no effect.

“//home/ubuntu/ins-runs/debug/some-other-run” would point to the system absolute path.
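
The slash rule above can be reproduced with the standard library. The sketch assumes POSIX path joining on the server; resolve_on_server and the data directory are hypothetical and only illustrate the description.

```python
import posixpath


def resolve_on_server(key, data_dir):
    """Hypothetical helper mirroring the leading-slash rule described above."""
    if key.startswith("/"):
        key = key[1:]                       # drop exactly one leading slash
    # joining with a path that is still absolute discards data_dir
    return posixpath.join(data_dir, key)


data_dir = "/var/ml-logger-data"            # made-up server data directory
print(resolve_on_server("/debug/some-run", data_dir))
# /var/ml-logger-data/debug/some-run
print(resolve_on_server("//home/ubuntu/ins-runs/debug/some-other-run", data_dir))
# /home/ubuntu/ins-runs/debug/some-other-run
```

The same rule applies to the other load_* methods that repeat this note.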

Parameters:
  • key – path string.
  • start – Starting index for the chunks None means from the beginning.
  • stop – Stop index for the chunks. None means to the end of the file.
  • tries – (int) The number of tries for the request. The last one does not catch errors.
  • delay – (float) the delay multiplier between the retries. Multiplied (in seconds) with a random float in [0, 1).
Returns:

an iterator.

load_file(*keys, path=None)[source]

return the binary stream, most versatile.

todo: check handling of line-separated files

when key starts with a single slash as in “/debug/some-run”, the leading slash is removed and the remaining path is pathJoin’ed with the data_dir of the server.

So if you want to access an absolute path of the filesystem that the logging server is in, you should prepend two leading slashes. This way, when the leading slash is removed, the remaining path is still an absolute path and joining it with the data_dir has no effect.

“//home/ubuntu/ins-runs/debug/some-other-run” would point to the system absolute path.

Parameters:
  • *keys – path string fragments that are joined together
Returns:

a tuple of each one of the data chunks logged into the file.

load_module(module, path='weights.pkl', wd=None, stream=True, tries=5, matcher=None, map_location=None)[source]

Load torch module from file.

Now supports:

  • streaming mode: where multiple segments of the same model are saved as chunks in a pickle file.
  • partial, or prefixed load with matcher.
  • multiple tries: on unreliable networks (coffee shop!)

To manipulate the prefix of a checkpoint file, you can pass in a matcher.

Using Matcher for Partial or Prefixed load

Imagine you are trying to load weights from a different module that is missing a prefix for its keys. (For example, you have an L2 metric function and are trying to load from a VAE embedding function baseline, which is only half of the network.)

from ml_logger import logger

net = models.ResNet()
logger.load_module(
    net,
    path="/checkpoint/geyang/resnet.pkl",
    # strip the 'embed.' prefix from the module key before looking it up in the checkpoint
    matcher=lambda d, k, p: d[k.replace('embed.', '')])

To fill-in if there are missing keys:

from ml_logger import logger

net = models.ResNet()
logger.load_module(
       net,
       path="/checkpoint/geyang/resnet.pkl",
       matcher=lambda d, k, p: d[k] if k in d else p[k])
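
How a matcher of this shape behaves can be tried on plain dictionaries. The sketch assumes the matcher is called once per key of the current module as matcher(checkpoint_dict, key, current_dict); apply_matcher is a hypothetical helper, not the code path used by load_module.

```python
def apply_matcher(checkpoint, current, matcher=None):
    """Build the values to load, asking the matcher for each key of `current`."""
    if matcher is None:
        matcher = lambda d, k, p: d[k]          # plain lookup by identical key
    return {key: matcher(checkpoint, key, current) for key in current}


checkpoint = {"conv.weight": 1.0, "conv.bias": 2.0}

# Prefixed load: the module keys carry an 'embed.' prefix the checkpoint lacks.
prefixed_module = {"embed.conv.weight": 0.0, "embed.conv.bias": 0.0}
loaded = apply_matcher(
    checkpoint, prefixed_module,
    matcher=lambda d, k, p: d[k.replace("embed.", "")])

# Partial load: keep the module's own value when the checkpoint has no entry.
partial_module = {"conv.weight": 0.0, "head.weight": 9.0}
filled = apply_matcher(
    checkpoint, partial_module,
    matcher=lambda d, k, p: d[k] if k in d else p[k])

print(loaded)  # {'embed.conv.weight': 1.0, 'embed.conv.bias': 2.0}
print(filled)  # {'conv.weight': 1.0, 'head.weight': 9.0}
```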
Parameters:
  • module – target torch module you want to load
  • path – the weight file containing the weights
  • stream
  • tries
  • matcher

    function to remove a prefix, repeat keys, or load partially. Should take in two or three arguments:

    def matcher(checkpoint_dict, key, current_dict):
    
Returns:

None

load_np(*keys)[source]

load a np file

when key starts with a single slash as in “/debug/some-run”, the leading slash is removed and the remaining path is pathJoin’ed with the data_dir of the server.

So if you want to access an absolute path of the filesystem that the logging server is in, you should prepend two leading slashes. This way, when the leading slash is removed, the remaining path is still an absolute path and joining it with the data_dir has no effect.

“//home/ubuntu/ins-runs/debug/some-other-run” would point to the system absolute path.

Parameters:
  • keys – path strings
Returns:

a tuple of each one of the data chunks logged into the file.

load_pkl(*keys, start=None, stop=None, tries=1, delay=1)[source]

load a pkl file as a tuple. By default, each file would contain 1 data item.

data, = logger.load_pkl("episodeyang/weights.pkl")

You could also load a particular data chunk by index:

data_chunks = logger.load_pkl("episodeyang/weights.pkl", start=10)

when key starts with a single slash as in “/debug/some-run”, the leading slash is removed and the remaining path is pathJoin’ed with the data_dir of the server.

So if you want to access an absolute path of the filesystem that the logging server is in, you should prepend two leading slashes. This way, when the leading slash is removed, the remaining path is still an absolute path and joining it with the data_dir has no effect.

“//home/ubuntu/ins-runs/debug/some-other-run” would point to the system absolute path.

Because loading is usually synchronous, we can encounter connection errors. We don’t want to halt our training session because of these errors without retrying a few times.

For this reason, logger.load_pkl (and iload_pkl alike) takes a tries argument and a delay argument. The delay argument is multiplied by a random number, to avoid a synchronized DDoS attack on your instrumentation server.
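
A retry loop of this kind might look like the sketch below. It is an illustration only: load_with_retries and flaky_request are hypothetical, the random factor follows the [1, 1.5) range given in the parameter list of this method, and catching ConnectionError is an assumption about which errors are retried.

```python
import random
import time


def load_with_retries(request, tries=3, delay=0.01):
    """Call `request` up to `tries` times; only the last failure propagates."""
    for _ in range(tries - 1):
        try:
            return request()
        except ConnectionError:
            # randomized back-off so that many workers do not retry in lockstep
            time.sleep(delay * random.uniform(1.0, 1.5))
    return request()                        # final attempt: no error is caught


attempts = []


def flaky_request():
    """Simulated request that fails twice before succeeding."""
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("simulated network failure")
    return ("data",)


result = load_with_retries(flaky_request, tries=3)
print(result, len(attempts))  # ('data',) 3
```

With tries=1 the loop body never runs, so the first connection error is raised directly.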

Parameters:
  • *keys

    path string fragments

  • start – Starting index for the chunks None means from the beginning.
  • stop – Stop index for the chunks. None means to the end of the file.
  • tries – (int) The number of tries for the request. The last one does not catch errors.
  • delay – (float) the delay multiplier between the retries. Multiplied (in seconds) with a random float in [1, 1.5).
Returns:

a tuple of each one of the data chunks logged into the file.

load_text(*keys)[source]

return the text content of the file (in a single chunk)

todo: check handling of line-separated files

when key starts with a single slash as in “/debug/some-run”, the leading slash is removed and the remaining path is pathJoin’ed with the data_dir of the server.

So if you want to access an absolute path of the filesystem that the logging server is in, you should prepend two leading slashes. This way, when the leading slash is removed, the remaining path is still an absolute path and joining it with the data_dir has no effect.

“//home/ubuntu/ins-runs/debug/some-other-run” would point to the system absolute path.

Parameters:
  • *keys – path string fragments
Returns:

a tuple of each one of the data chunks logged into the file.

load_variables(path, variables=None)[source]

load the saved value from a pickle file into tensorflow variables.

The variables that are loaded are the intersection between the tf.global_variables() list and the variables saved in the weight_dict. When a variable in the weight_dict is not present in the current session’s computation graph, no error is reported. When a variable present in the global variables list is not present in the weight_dict, no exception is raised.

The variables argument overrides the global variable list. When a variable present in this list doesn’t exist in the weight list, an exception should be raised.
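
The selection rule in the two paragraphs above can be written down with plain names and dictionaries. This is a sketch of the described logic only; select_weights is hypothetical, it works on name strings rather than TensorFlow variables, and raising KeyError is an assumption about the exception type.

```python
def select_weights(weight_dict, global_names, variables=None):
    """Pick the saved values to restore, following the rule described above."""
    if variables is None:
        # intersection: names missing on either side are silently skipped
        return {name: weight_dict[name] for name in global_names if name in weight_dict}
    missing = [name for name in variables if name not in weight_dict]
    if missing:
        # an explicit list is strict: every requested name must have been saved
        raise KeyError(f"not found in the checkpoint: {missing}")
    return {name: weight_dict[name] for name in variables}


weight_dict = {"w": 1.0, "b": 2.0, "unused": 3.0}
print(select_weights(weight_dict, ["w", "b", "extra"]))   # {'w': 1.0, 'b': 2.0}
print(select_weights(weight_dict, ["w", "b"], variables=["w"]))   # {'w': 1.0}
```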

Parameters:
  • path – path to the saved checkpoint pickle file.
  • variables – None or a list of tensorflow variables. When this list is supplied, every variable’s truncated name has to exist inside the loaded weight_dict.
Returns:

log(*args, metrics=None, silent=False, sep=' ', end='\n', flush=None, cache=None, file=None, _prefix=None, **_key_values) → None[source]

log dictionaries of data, key=value pairs at step == step.

logs *args as a line and kwargs as key / value pairs

Parameters:
  • args – (str) strings or objects to be printed.
  • metrics – (dict) a dictionary of key/value pairs to be saved in the key_value_cache
  • sep – (str) separator between the strings in *args
  • end – (str) string to use for the end of line. Defaults to “\n”
  • silent – (boolean) whether to also print to stdout or just log to file
  • flush – (boolean) whether to flush the text logs
  • cache – optional (str) a specific cache key, useful for scoped reporting
  • kwargs – key/value arguments
Returns:

log_data(data, path=None, overwrite=False)[source]

Append data to the file located at the path specified.

Parameters:
  • data – python data object to be saved
  • path – path for the object, relative to the root logging directory.
  • overwrite – boolean flag to switch between ‘appending’ mode and ‘overwrite’ mode.
Returns:

None

log_line(*args, sep=' ', end='\n', flush=True, file=None, **kwargs)[source]

this is similar to the print function. It logs *args with a default EOL postfix in the end.

n = 10
logger.log_line("Mary", "has", n, "sheep.", color="green")

This outputs:

>>> "Mary has 10 sheep" (colored green)
Parameters:
  • *args

    List of objects to be converted to string and printed out.

  • sep – Same as the sep kwarg in regular print statements
  • end – Same as the end kwarg in regular print statements
  • flush – bool, whether the output is flushed. Default to True
  • file – file object to which the line is written
  • color – str, color of the line. We use termcolor.colored as our color library. See list of colors here: termcolor: https://pypi.org/project/termcolor/
Returns:

None

log_metrics(metrics=None, _prefix=None, silent=None, cache: Optional[str] = None, file: Optional[str] = None, flush=None, **_key_values) → None[source]
Parameters:
  • metrics – (mapping) key/values of metrics to be logged. Overwrites previous value if exist.
  • cache – optional KeyValueCache object to be passed in
  • flush
  • _key_values
Returns:

log_metrics_summary(key_values: dict = None, cache: str = None, key_stats: dict = None, default_stats=None, silent=False, flush: bool = True, _prefix=None, **_key_modes) → None[source]

logs the statistical properties of the stored metrics, and clears the summary_cache if under tiled mode, and keeps the data otherwise (under rolling mode).
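
The difference between tiled and rolling mode can be pictured with a toy cache. Everything in this sketch is an assumption made for illustration: SummaryCache, its mode argument, and the "<key>/mean" naming are hypothetical, and only the mean statistic is shown.

```python
from collections import defaultdict
from statistics import mean


class SummaryCache:
    """Toy metric store: values accumulate until a summary is requested."""

    def __init__(self, mode="tiled"):
        self.mode = mode
        self.data = defaultdict(list)

    def store(self, **key_values):
        for key, value in key_values.items():
            self.data[key].append(value)

    def summarize(self):
        summary = {f"{key}/mean": mean(values) for key, values in self.data.items()}
        if self.mode == "tiled":
            self.data.clear()       # tiled: each summary covers a fresh window
        return summary              # rolling: stored values are kept


tiled = SummaryCache(mode="tiled")
tiled.store(loss=1.0)
tiled.store(loss=3.0)
tiled_summary = tiled.summarize()
print(tiled_summary, dict(tiled.data))      # {'loss/mean': 2.0} {}

rolling = SummaryCache(mode="rolling")
rolling.store(loss=1.0)
rolling.store(loss=3.0)
rolling_summary = rolling.summarize()
print(rolling_summary, dict(rolling.data))  # {'loss/mean': 2.0} {'loss': [1.0, 3.0]}
```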

To enable explicit mode without specifying *only_keys, set get_only to True

Modes for the Statistics:

key_mode would be one of:
  • mean:
  • min_max:
  • std_dev:
  • quantile:
  • histogram(bins=10):
Parameters:
  • key_values – extra key (and values) to log together with summary such as timestep, epoch, etc.
  • cache – (dict) An optional cache object from which the summary is made.
  • key_stats – (dict) a dictionary for the key and the statistic modes to be returned.
  • default_stats – (one of [‘mean’, ‘min_max’, ‘std_dev’, ‘quantile’, ‘histogram’])
  • silent – (bool) a flag to turn the printing On/Off
  • flush – (bool) flush the key_value cache if trueful.
  • _key_modes – (**) key value pairs, as a short hand for the key_modes dictionary.
Returns:

None

log_params(path='parameters.pkl', silent=False, **kwargs)[source]

Log namespaced parameters in a list.

Examples:

logger.log_params(some_namespace=dict(layer=10, learning_rate=0.0001))

generates a table that looks like:

══════════════════════════════════════════
   some_namespace
────────────────────┬─────────────────────
       layer        │ 10
   learning_rate    │ 0.0001
════════════════════╧═════════════════════
Parameters:
  • path – the file to which we save these parameters
  • silent – do not print out
  • kwargs – list of key/value pairs, each key representing the name of the namespace, and the namespace itself.
Returns:

None

log_text(text: str = None, filename=None, dedent=False, overwrite=False)[source]

logging and printing a string object.

This does not log to the buffer. It calls the low-level log_text method right away without buffering.

logger.log_text('''
    some text
    with indent''', dedent=True)

This logs without the indentation at the beginning of the text.

Parameters:
  • text
  • filename – file name to which the string is logged.
  • dedent – boolean flag for dedenting the multi-line string
Returns:

ping(status='running', interval=None)[source]

pings the instrumentation server to stay alive. Gets a control signal in return. The background thread is responsible for making the call. This method just returns the buffered signal synchronously.

Returns:

tuple signals

static plt2data(fig)[source]

Convert a Matplotlib figure to a 3D numpy array with RGBA channels and return it.

Parameters:
  • fig – a matplotlib figure
Returns:

a numpy 3D array of RGBA values

read_metrics(*keys, x_key=None, path='metrics.pkl', wd=None, num_bins=None, bin_size=1, silent=False, default=None, collect='std', verbose=False)[source]

Returns a Pandas.DataFrame object that contains metrics from all files.

Parameters:
  • keys

    if none are passed, returns the entire dataframe. If 1 key is passed, return that column. If multiple keys are passed, return individual columns.

    If you want to get the joined table for multiple keys, directly filter after this call.

  • bin – binOption(xKey, n, steps)
  • path – can contain glob patterns, will return concatenated dataframe from all paths found with the pattern.
  • silent
  • default – Default value for columns. Not widely used.
  • collect – One of [ “std”, True, False ]
  • kwargs – Not used besides the default argument.
Returns:

pandas.DataFrame or None when no metric file is found.

read_params(*keys, path='parameters.pkl', not_exist_ok=False, **kwargs)

utility to obtain the hyperparameters as a flattened dictionary.

  1. returns a dot-flattened dictionary if no keys are passed.
  2. returns a single value if only one key is passed.
  3. returns a list of values if multiple keys are passed.

If keys are passed, returns an array with each item corresponding to those keys

lr, global_metric = logger.get_parameters('Args.lr', 'Args.global_metric')
print(lr, global_metric)

this returns:

0.03 'ResNet18L2'

Raises a FileNotFound error if the parameter file pointed to by the path is empty. To avoid this, add a default keyword value to the call:

param = logger.get_parameters('does_not_exist', default=None)
assert param is None, "should be the default value: None"
Parameters:
  • *keys

    A list of strings to specify the parameter keys

  • silent – bool, prevents raising an exception.
  • path – Path to the parameters.pkl file. Keyword argument, default to parameters.pkl.
  • default – Undefined. If the default key is present, return default when param is missing.
Returns:

remove(*paths)[source]

removes files and folders by path

Parameters:
  • *paths
Returns:

save_image(image, key: str, cmap=None, normalize=None)[source]

Log a single image.

Parameters:
  • image – numpy object Size(w, h, 3)
  • key – example: “figures/some_fig_name.png”, the file key to which the image is saved.
save_images(stack, key, n_rows=None, n_cols=None, cmap=None, normalize=None, background=1)[source]

Log images as a composite of a grid. Images input as a 4-D stack.

Parameters:
  • stack – Size(n, w, h, c)
  • key – the filename for the composite image.
  • n_rows – number of rows
  • n_cols – number of columns
  • cmap – OneOf([str, matplotlib.cm.ColorMap])
  • normalize – default None. OneOf[None, ‘individual’, ‘row’, ‘column’, ‘grid’]. Only ‘grid’ and ‘individual’ are implemented.
Returns:

None

save_module(module, path='weights.pkl', tries=3, backup=3.0)[source]

Save torch module. Overwrites existing file.

Now Supports nn.DataParallel modules. First try to access the state dict, if not available try the module.module attribute.

module = nn.DataParallel(lenet)
logger.save_module(module, "checkpoint.pk")

When the model is large, this function uploads the weight dictionary (state_dict) in chunks. You can specify the size for the chunks, measured in number of tensors.

The conversion convention for the upload chunks is roughly 32bit, or 8 bytes for each np.float32 entry, so the upload size for chunk = 100,000 is roughly

100_000 * 8 * <base56 encoding ratio> ~ 960k.
Parameters:
  • module – the PyTorch module to be saved.
  • path – filename to which we save the module.
Returns:

None

save_pkl(data, *keys, path=None, append=False, use_dill=False)[source]

Save data in pkl format

Note: We use dill so that we can save lambda functions, but we use pure pickle when saving nn.Modules.
Parameters:
  • data – python data object to be saved
  • path – path for the object, relative to the root logging directory.
  • append – default to False – overwrite by default
Returns:

None

save_pyplot(path='plot.png', fig=None, format=None, **kwargs)[source]
Saves matplotlib figure. The interface of this method emulates the matplotlib.pyplot.savefig method.
Parameters:
  • key – (str) file name to which the plot is saved.
  • fig – optional matplotlib figure object. When omitted, just saves the current figure.
  • format – One of the output formats [‘pdf’, ‘png’, ‘svg’ etc]. Defaults to the extension given by the key argument in savefig().
  • **kwargs – other optional arguments that are passed into matplotlib.pyplot.savefig: https://matplotlib.org/api/_as_gen/matplotlib.pyplot.savefig.html
Returns:

(str) path to which the figure is saved.

save_variables(variables, path='variables.pkl', keys=None)[source]

save tensorflow variables in a dictionary

Parameters:
  • variables – A Tuple (Array) of TensorFlow Variables.
  • path – default: ‘variables.pkl’, filepath to the pkl file, with which we save the variable values.
  • namespace – A folder name for the saved variable. Default to ./checkpoints to keep things organized.
  • keys – None or Array(size=len(variables)). When it is an array, the length has to be the same as that of the list of variables. This parameter allows you to overwrite the keys we use to save the variables.

    By default, we generate the keys from the variable name, without the :[0-9] at the end that points to the tensor (from the variable itself).
Returns:

None
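
The default key generation can be approximated with a regular expression. variable_key and the variable names below are hypothetical; the pattern also accepts multi-digit suffixes, which is a small generalization of the :[0-9] wording above.

```python
import re


def variable_key(variable_name):
    """Drop a trailing ':<digits>' tensor suffix such as ':0' from a variable name."""
    return re.sub(r":[0-9]+$", "", variable_name)


names = ["dense/kernel:0", "dense/bias:0"]      # made-up variable names
keys = [variable_key(name) for name in names]
print(keys)  # ['dense/kernel', 'dense/bias']
```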

save_video(frame_stack, key, format=None, fps=20, **imageio_kwargs)[source]

Let’s do the compression here. Video frames are first written to a temporary file and the file containing the compressed data is sent over as a file buffer.

Save a stack of images to

Parameters:
  • frame_stack – the stack of video frames
  • key – the file key to which the video is logged.
  • format – Supports ‘mp4’, ‘gif’, ‘apng’ etc.
  • imageio_kwargs – (map) optional keyword arguments for imageio.mimsave.
Returns:

savefig(key, fig=None, format=None, **kwargs)[source]
Saves matplotlib figure. The interface of this method emulates the matplotlib.pyplot.savefig method.
Parameters:
  • key – (str) file name to which the plot is saved.
  • fig – optional matplotlib figure object. When omitted, just saves the current figure.
  • format – One of the output formats [‘pdf’, ‘png’, ‘svg’ etc]. Defaults to the extension given by the key argument in savefig().
  • **kwargs – other optional arguments that are passed into matplotlib.pyplot.savefig: https://matplotlib.org/api/_as_gen/matplotlib.pyplot.savefig.html
Returns:

(str) path to which the figure is saved.

since(*keys)[source]

returns a float in seconds when 1 key is passed, or a list of floats when multiple keys are passed in. The returned values are in seconds, measured by the delta in perf_counter.

Automatically de-dupes the keys, but will return the same number of intervals. Duplicates will receive the same result.

Note: This is idempotent.

from ml_logger import logger

logger.start('loop', 'iter')
it = 0
for i in range(10):
    it += logger.split('iter')
print('iteration', it / 10)
print('loop', logger.since('loop'))
Parameters:
  • *keys – positional arguments are timed together.
Returns:

float (in seconds)

split(*keys)[source]

returns a float in seconds when 1 key is passed, or a list of floats when multiple keys are passed-in.

Automatically de-dupes the keys, but will return the same number of intervals. duplicates will receive the same result.

Note: This is Not idempotent, which is why it is not a property.
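
The contrast between since (idempotent) and split (not idempotent) can be made concrete with a small timer. Timers is a hypothetical class, not ml_logger's implementation; the injectable clock exists only so the example is deterministic.

```python
import time


class Timers:
    """Keyed timers: `since` only reads a tick, `split` reads and resets it."""

    def __init__(self, clock=time.perf_counter):
        self.clock = clock
        self.ticks = {}

    def start(self, key):
        self.ticks[key] = self.clock()
        return self.ticks[key]

    def since(self, key):
        return self.clock() - self.ticks[key]    # the stored tick is left untouched

    def split(self, key):
        now = self.clock()
        delta = now - self.ticks[key]
        self.ticks[key] = now                     # the next split measures from here
        return delta


fake_time = iter([0.0, 1.0, 2.0, 3.0, 5.0])       # scripted clock readings
timers = Timers(clock=lambda: next(fake_time))
timers.start("loop")                # tick stored at 0.0
first = timers.since("loop")        # clock reads 1.0 -> 1.0
second = timers.since("loop")       # clock reads 2.0 -> 2.0, still measured from 0.0
lap = timers.split("loop")          # clock reads 3.0 -> 3.0, tick moves to 3.0
next_lap = timers.split("loop")     # clock reads 5.0 -> 2.0
print(first, second, lap, next_lap)  # 1.0 2.0 3.0 2.0
```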

from ml_logger import logger

logger.split('loop', 'iter')
it = 0
for i in range(10):
    it += logger.split('iter')
print('iteration', it / 10)
print('loop', logger.split('loop'))
Parameters:
  • *keys – positional arguments are timed together.
Returns:

float (in seconds)

start(*keys)[source]

starts a timer, saved in float in seconds. The returned perf_counter does not have meaning on its own. Only differences between two perf_counters make sense as time delta.

Automatically de-dupes the keys, but will return the same number of intervals. duplicates will receive the same result.

from ml_logger import logger

logger.start('loop', 'iter')
it = 0
for i in range(10):
    it += logger.split('iter')
print('iteration', it / 10)
print('loop', logger.since('loop'))
Parameters:
  • *keys – positional arguments are timed together.
Returns:

float (in seconds)

stem(path)[source]

returns the stem of the filename in the path, removes the extension

path = "/Users/geyang/some-proj/experiments/rope-cnn.py"
logger.stem(path)

returns:

"/Users/geyang/some-proj/experiments/rope-cnn"

You can use this in combination with the truncate function.

_ = logger.truncate(path, 4)
_ = logger.stem(_)
"experiments/rope-cnn"

This is useful for saving the relative path of your main script.

Parameters:
  • path – “learning-to-learn/experiments/run.py”
Returns:

“run”

store(metrics=None, silent=None, cache: Optional[str] = None, _prefix=None, **key_values)

Store the metric data (with the default summary cache) for making the summary later. This allows the logging/saving of training metrics asynchronously from the logging.

Parameters:
  • metrics (*) – a mapping of metrics. Will be destructured and appended to the data store one key/value at a time.
  • silent – bool flag for silencing the keys stored in this call.
  • cache
  • key_values (**) – key/value arguments, each being a metric key / metric value pair.
Returns:

None

store_key_value(key: str, value: Any, silent=None, cache: Optional[str] = None) → None[source]

store the key: value awaiting future summary.

Parameters:
  • key – str, can be / separated.
  • value – numerical value
  • silent
  • cache
Returns:

store_metrics(metrics=None, silent=None, cache: Optional[str] = None, _prefix=None, **key_values)[source]

Store the metric data (with the default summary cache) for making the summary later. This allows the logging/saving of training metrics asynchronously from the logging.

Parameters:
  • metrics (*) – a mapping of metrics. Will be destructured and appended to the data store one key/value at a time.
  • silent – bool flag for silencing the keys stored in this call.
  • cache
  • key_values (**) – key/value arguments, each being a metric key / metric value pair.
Returns:

None

truncate(path, depth=-1)[source]

truncates the path’s parent directories w.r.t. given depth. By default, returns the filename of the path.

path = "/Users/geyang/some-proj/experiments/rope-cnn.py"
logger.truncate(path, -1)
"rope-cnn.py"
logger.truncate(path, 4)
"experiments/rope-cnn.py"

This is useful for saving the relative path of your main script.
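
The examples given for truncate and stem are consistent with simple string operations on "/"-separated paths. The functions below are stand-alone sketches that reproduce those documented examples; they are not the library's code, and treating depth as a slice index over the "/"-split segments is an assumption inferred from the examples.

```python
import posixpath


def truncate(path, depth=-1):
    """Keep the path segments from index `depth` onward (assumed slice semantics)."""
    return "/".join(path.split("/")[depth:])


def stem(path):
    """Remove the file extension, keeping any parent directories."""
    return posixpath.splitext(path)[0]


path = "/Users/geyang/some-proj/experiments/rope-cnn.py"
print(truncate(path, -1))        # rope-cnn.py
print(truncate(path, 4))         # experiments/rope-cnn.py
print(stem(truncate(path, 4)))   # experiments/rope-cnn
```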

Parameters:
  • path – “learning-to-learn/experiments/run.py”
  • depth – 1, 2… when 1 it picks only the file name.
Returns:

“run”

upload_dir(dir_path, target, excludes=(), archive='tar', temp_dir=None)[source]

upload dir to gs, s3, and ml-logger.

Parameters:
  • dir_path – this is the path to the dir
  • target – this is the target location
  • excludes – NotImplemented
  • archive – is the archive format: one of “zip”, “tar”, “gztar”, “bztar”, or “xztar”. Or any other registered format.
  • temp_dir – NotImplemented, should allow override of temp folder in case storage limits exist.
Returns:

upload_file(file_path: str = None, target_path: str = 'files/', once=True) → None[source]

uploads a file (through a binary byte string) to a target_folder. Default target is “files”

Parameters:
  • file_path – the path to the file to be uploaded
  • target_path – the target folder for the file, preserving the filename of the file. If it ends with /, the original file name is used.
Returns:

None
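
The trailing-slash rule for target_path can be sketched as follows. resolve_upload_target is hypothetical, and the branch for targets that do not end in a slash (using the target as the full file key) is an assumption rather than documented behavior.

```python
import posixpath


def resolve_upload_target(file_path, target_path="files/"):
    """Hypothetical helper: decide the key under which an uploaded file is stored."""
    if target_path.endswith("/"):
        # a trailing slash marks a folder, so the original file name is kept
        return target_path + posixpath.basename(file_path)
    return target_path              # assumed: the target already names the file


print(resolve_upload_target("experiments/run.py"))                  # files/run.py
print(resolve_upload_target("experiments/run.py", "code/main.py"))  # code/main.py
```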

static upload_s3(source_path, *keys, path=None)[source]

Upload a file to an S3 bucket

Parameters:
  • source_path – File name for the file to be uploaded
  • path – path to an S3 bucket to upload to
Returns:

True if file was uploaded, else False