datalab.storage Module

Google Cloud Platform library - Cloud Storage Functionality.

class datalab.storage.Bucket(name, info=None, context=None)

Represents a Cloud Storage bucket.

Initializes an instance of a Bucket object.

Parameters:
  • name – the name of the bucket.
  • info – the information about the bucket if available.
  • context – an optional Context object providing project_id and credentials. If a specific project id or credentials are unspecified, the default ones configured at the global level are used.
create(project_id=None)

Creates the bucket.

Parameters: project_id – the project in which to create the bucket.
Returns: The bucket.
Raises: Exception if there was an error creating the bucket.
delete()

Deletes the bucket.

Raises: Exception if there was an error deleting the bucket.
exists()

Checks if the bucket exists.
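
The lifecycle methods above compose into a common "create if missing" pattern. The following is a minimal sketch, assuming `bucket` is a `datalab.storage.Bucket` or any object exposing the same `exists()` and `create()` methods. The helper name `ensure_bucket` and the bucket name in the comment are hypothetical and not part of the library.

```python
def ensure_bucket(bucket, project_id=None):
    """Create the bucket only if it is missing, then return it.

    `bucket` is assumed to expose exists() and create(project_id=None)
    as documented above. This helper is illustrative, not a library API.
    """
    if not bucket.exists():
        bucket.create(project_id=project_id)
    return bucket


# Possible use in a notebook (needs credentials, so left commented out):
# import datalab.storage as storage
# bucket = ensure_bucket(storage.Bucket('my-sample-bucket'))
```

Because `create()` raises on failure, callers that need to tolerate a concurrent creation would have to wrap the call in their own error handling.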

item(key)

Retrieves an Item object for the specified key in this bucket.

The item need not exist.

Parameters: key – the key of the item within the bucket.
Returns: An Item instance representing the specified key.
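
Since `item(key)` returns an Item whether or not the object is stored, existence has to be checked separately. A small sketch of that pattern follows. The helper name `item_if_present` is hypothetical, and `bucket` is assumed to behave like the Bucket class described here.

```python
def item_if_present(bucket, key):
    """Return the Item for `key`, or None when no such object exists.

    Illustrative helper: relies only on Bucket.item() and Item.exists()
    as documented in this module.
    """
    item = bucket.item(key)  # a handle; the item need not exist
    return item if item.exists() else None
```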
items(prefix=None, delimiter=None)

Gets an iterator for the items within this bucket.

Parameters:
  • prefix – an optional prefix to match items.
  • delimiter – an optional string to simulate directory-like semantics. The returned items will be those whose names do not contain the delimiter after the prefix. For the remaining items, the names will be returned truncated after the delimiter with duplicates removed (i.e. as pseudo-directories).
Returns:

An iterable list of items within this bucket.
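
The prefix and delimiter semantics described above can be easier to follow on a concrete set of keys. The sketch below simulates them over an in-memory list. It is not the library's implementation and makes no claim about how `items()` surfaces the truncated names; the function name `list_with_delimiter` and the sample keys are assumptions made for illustration.

```python
def list_with_delimiter(keys, prefix=None, delimiter=None):
    """Simulate prefix/delimiter listing over an in-memory list of keys.

    Returns a tuple (items, pseudo_dirs):
      items       - keys matching the prefix with no delimiter after it
      pseudo_dirs - remaining keys truncated after the first delimiter
                    that follows the prefix, with duplicates removed
    """
    prefix = prefix or ''
    items, pseudo_dirs = [], []
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter and delimiter in rest:
            # Truncate just after the first delimiter following the prefix.
            cut = len(prefix) + rest.index(delimiter) + len(delimiter)
            pseudo = key[:cut]
            if pseudo not in pseudo_dirs:
                pseudo_dirs.append(pseudo)
        else:
            items.append(key)
    return items, pseudo_dirs


keys = ['logs/a.txt', 'logs/2016/jan.txt', 'logs/2016/feb.txt', 'readme.md']
print(list_with_delimiter(keys, prefix='logs/', delimiter='/'))
# prints (['logs/a.txt'], ['logs/2016/'])
```

With `prefix='logs/'` and `delimiter='/'`, only `logs/a.txt` is a direct item; the two keys under `logs/2016/` collapse into a single pseudo-directory.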

metadata

Retrieves metadata about the bucket.

Returns: A BucketMetadata instance with information about this bucket.
Raises: Exception if there was an error requesting the bucket’s metadata.
name

The name of the bucket.

class datalab.storage.Buckets(project_id=None, context=None)

Represents a list of Cloud Storage buckets for a project.

Initializes an instance of a Buckets object.

Parameters:
  • project_id – an optional project whose buckets we want to manipulate. If None, the project id is taken from the context.
  • context – an optional Context object providing project_id and credentials. If a specific project id or credentials are unspecified, the default ones configured at the global level are used.
contains(name)

Checks if the specified bucket exists.

Parameters: name – the name of the bucket to look up.
Returns: True if the bucket exists; False otherwise.
Raises: Exception if there was an error requesting information about the bucket.
create(name)

Creates a new bucket.

Parameters: name – a unique name for the new bucket.
Returns: The newly created bucket.
Raises: Exception if there was an error creating the bucket.
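
`contains()` and `create()` are often used together to avoid creating a bucket that already exists. A minimal sketch, assuming `buckets` is a `datalab.storage.Buckets` instance or an object with the same two methods; `create_if_absent` is a hypothetical helper name.

```python
def create_if_absent(buckets, name):
    """Create bucket `name` unless it already exists.

    Returns the newly created bucket, or None if a bucket with that
    name was already present. Illustrative helper, not a library API.
    """
    if buckets.contains(name):
        return None
    return buckets.create(name)
```

The check and the creation are two separate requests, so this does not guard against another client creating the same bucket in between.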
class datalab.storage.Item(bucket, key, info=None, context=None)

Represents a Cloud Storage object within a bucket.

Initializes an instance of an Item.

Parameters:
  • bucket – the name of the bucket containing the item.
  • key – the key of the item.
  • info – the information about the item if available.
  • context – an optional Context object providing project_id and credentials. If a specific project id or credentials are unspecified, the default ones configured at the global level are used.
copy_to(new_key, bucket=None)

Copies this item to the specified new key.

Parameters:
  • new_key – the new key to copy this item to.
  • bucket – the bucket of the new item; if None (the default) use the same bucket.
Returns:

An Item corresponding to the new key.

Raises:

Exception if there was an error copying the item.
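
As an illustration of `copy_to`, the sketch below derives a backup key from the item's own `key` property. The helper name `backup_item` and the `.bak` suffix are assumptions chosen for the example, not library conventions.

```python
def backup_item(item, suffix='.bak', bucket=None):
    """Copy `item` to a key formed by appending `suffix` to its key.

    `bucket` is passed through to copy_to(); None keeps the copy in the
    same bucket. Returns the Item for the new key.
    """
    return item.copy_to(item.key + suffix, bucket=bucket)
```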

delete()

Deletes this item from its bucket.

Raises: Exception if there was an error deleting the item.
exists()

Checks if the item exists.

key

Returns the key of the item.

metadata

Retrieves metadata about the item.

Returns: An ItemMetadata instance with information about this item.
Raises: Exception if there was an error requesting the item’s metadata.
read_from(start_offset=0, byte_count=None)

Reads the content of this item as text.

Parameters:
  • start_offset – the start offset of bytes to read.
  • byte_count – the number of bytes to read. If None, it reads to the end.
Returns:

The text content within the item.

Raises:

Exception if there was an error requesting the item’s content.
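
`read_from` takes an offset and a count rather than a start and end position. The sketch below shows one way to map a half-open range `[start, stop)` onto those parameters. `read_range` is a hypothetical helper, and rejecting empty ranges is a choice made here because the behavior of a zero `byte_count` is not stated in this reference.

```python
def read_range(item, start, stop=None):
    """Read the half-open byte range [start, stop) of an item as text.

    stop=None reads to the end, matching byte_count=None in read_from().
    Empty or negative ranges are rejected up front (an assumption of
    this sketch, not documented library behavior).
    """
    if start < 0 or (stop is not None and stop <= start):
        raise ValueError('expected 0 <= start < stop')
    byte_count = None if stop is None else stop - start
    return item.read_from(start_offset=start, byte_count=byte_count)
```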

read_lines(max_lines=None)

Reads the content of this item as text and returns a list of lines, up to an optional maximum.

Parameters: max_lines – the maximum number of lines to return. If None, all lines are returned.
Returns: The text content of the item as a list of lines.
Raises: Exception if there was an error requesting the item’s content.
uri

Returns the gs:// URI for the item.

write_to(content, content_type)

Writes text content to this item.

Parameters:
  • content – the text content to be written.
  • content_type – the type of text content.
Raises:

Exception if there was an error writing the item’s content.
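
To show how `content` and `content_type` fit together, here is a minimal sketch that stores a Python object as JSON text and reads it back. The helper names `write_json` and `read_json` are hypothetical; `application/json` is the standard media type for JSON and is passed as `content_type`.

```python
import json


def write_json(item, obj):
    """Serialize `obj` as JSON text and write it to `item`."""
    item.write_to(json.dumps(obj, sort_keys=True), 'application/json')


def read_json(item):
    """Read the full text content of `item` and parse it as JSON."""
    return json.loads(item.read_from())
```

Both helpers propagate any exception raised by `write_to` or `read_from`, so callers see storage errors unchanged.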

class datalab.storage.Items(bucket, prefix, delimiter, context=None)

Represents a list of Cloud Storage objects within a bucket.

Initializes an instance of an Items object.

Parameters:
  • bucket – the name of the bucket containing the items.
  • prefix – an optional prefix to match items.
  • delimiter – an optional string to simulate directory-like semantics. The returned items will be those whose names do not contain the delimiter after the prefix. For the remaining items, the names will be returned truncated after the delimiter with duplicates removed (i.e. as pseudo-directories).
  • context – an optional Context object providing project_id and credentials. If a specific project id or credentials are unspecified, the default ones configured at the global level are used.
contains(key)

Checks if the specified item exists.

Parameters: key – the key of the item to look up.
Returns: True if the item exists; False otherwise.
Raises: Exception if there was an error requesting information about the item.