PYME.IO.cluster_directory module

Work in progress … refactor some of the listing code out of clusterIO in order to:

reduce the size of clusterIO
reduce duplication between, e.g. locate and listdirectory
ultimately allow pluggable directory management / caching, e.g. to enable a central directory server (unclear if this would solve current performance issues, but potentially worth a try).

class PYME.IO.cluster_directory.DirectoryInfoManager(ns=None, serverfilter='')

Bases: object

cglob(pattern)

Find files matching a given glob on the cluster. Analogous to the python glob.glob function.

Parameters

Returns
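As an illustration of the glob-style matching described for cglob, the following is a minimal sketch that uses only the standard library. It does not call PYME; `CLUSTER_FILES` and `glob_listing` are hypothetical names invented for this example, and the hard-coded list stands in for paths that would really come from the cluster. Note that `fnmatch` lets `*` match path separators, which differs slightly from `glob.glob`.

```python
import fnmatch

# Hypothetical flat listing of cluster paths (an assumption for illustration;
# real paths would be discovered on the cluster, not hard-coded).
CLUSTER_FILES = [
    "user/2020_01_01/series_a.h5",
    "user/2020_01_01/series_b.h5",
    "user/2020_01_01/notes.txt",
    "user/2020_01_02/series_c.h5",
]


def glob_listing(pattern, paths=CLUSTER_FILES):
    """Return the paths that match a shell-style wildcard pattern."""
    return [p for p in paths if fnmatch.fnmatchcase(p, pattern)]


matches = glob_listing("user/2020_01_01/*.h5")
print(matches)
```

With the real class, the equivalent call would presumably be `cglob(pattern)` on a DirectoryInfoManager instance, with the pattern relative to the cluster root.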

property dataservers: Find all the data servers belonging to the cluster, caching the results

exists(name)

Test whether a file exists on the cluster. Analogous to os.path.exists for local files.

Parameters

Returns

isdir(name): Tests if a given path on the cluster is a directory. Analogous to os.path.isdir

list_single_node_dir(dirurl, nRetries=1, timeout=10, strict_caching=False)

List the directory on a single node

Parameters

Returns

listdir(dirname): Lists the contents of a directory on the cluster. Similar to os.listdir, but directories are indicated by a trailing slash
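The trailing-slash convention makes it possible to separate directories from files without a further call per entry. The sketch below assumes the listing is a sequence of plain strings; `split_listing` and `example_listing` are hypothetical names used only for illustration and are not part of PYME.

```python
def split_listing(entries):
    """Split a listing into (directories, files) using the trailing slash."""
    dirs = [e for e in entries if e.endswith("/")]
    files = [e for e in entries if not e.endswith("/")]
    return dirs, files


# Hypothetical listing in the style described for listdir (an assumption).
example_listing = ["raw/", "analysis/", "series_a.h5", "notes.txt"]

dirs, files = split_listing(example_listing)
print(dirs)
print(files)
```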

listdirectory(dirname, timeout=5)

Lists the contents of a directory on the cluster.

Returns a dictionary mapping filenames to clusterListing.FileInfo named tuples.
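A dictionary of named tuples can be consumed as in the sketch below. The fields of clusterListing.FileInfo are not documented in this section, so the example uses a stand-in tuple, `FakeFileInfo`, whose `is_dir` and `size` fields are assumptions made purely for illustration; check the real definition before relying on any field name.

```python
from collections import namedtuple

# Stand-in for clusterListing.FileInfo. The field names here are hypothetical;
# the real named tuple may define different fields.
FakeFileInfo = namedtuple("FakeFileInfo", ["is_dir", "size"])

# Hypothetical result in the shape described: filename -> info tuple.
listing = {
    "raw/": FakeFileInfo(is_dir=True, size=0),
    "series_a.h5": FakeFileInfo(is_dir=False, size=2048),
    "notes.txt": FakeFileInfo(is_dir=False, size=512),
}

# Sum the sizes of the regular files and collect their names in sorted order.
total_size = sum(info.size for info in listing.values() if not info.is_dir)
file_names = sorted(name for name, info in listing.items() if not info.is_dir)

print(total_size, file_names)
```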

locate_file(filename, return_first_hit=False)

Searches the cluster to find which server(s) a given file is stored on

Parameters

filename (str): The file name
return_first_hit (bool): Whether to try and find all locations, or return when we find the first copy

Returns
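The effect of return_first_hit can be illustrated with a minimal sketch over an in-memory stand-in for the cluster. This is not the PYME implementation: `locate`, the `servers` mapping, and the server names are all hypothetical, and the return type of the real method is not documented in this section.

```python
def locate(filename, servers, return_first_hit=False):
    """Return the names of the servers that hold `filename`.

    `servers` maps a server name to the set of files stored on it
    (a hypothetical structure used only for this illustration).
    """
    hits = []
    for server, files in servers.items():
        if filename in files:
            hits.append(server)
            if return_first_hit:
                # Stop searching as soon as one copy has been found.
                break
    return hits


# Hypothetical cluster with one file replicated on two servers.
servers = {
    "node1": {"a.h5", "b.h5"},
    "node2": {"b.h5"},
    "node3": {"c.h5"},
}

all_hits = locate("b.h5", servers)
first_hit = locate("b.h5", servers, return_first_hit=True)
print(all_hits, first_hit)
```

Stopping at the first hit avoids querying the remaining servers, which is useful when the caller only needs one copy to read from.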

register_file(filename, url, size): Call after uploading a new file so we can update our caches