undr.remote
Low-level implementation of resource download.
Overview
Classes
Download | Retrieves data from a remote server.
DownloadFile | Retrieves data from a remote server and saves it to a file.
NullServer | A placeholder server that raises an exception when used.
Progress | Message that reports download progress.
Server | Represents a remote server.
Module Contents
- class undr.remote.Download(path_id: pathlib.PurePosixPath, suffix: str | None, server: Server, stream: bool)
Bases: undr.task.Task
Retrieves data from a remote server.
This is an abstract task that calls its methods (lifecycle callbacks) as follows:
on_begin() is called before contacting the server. This function can be used to create write resources and must return an offset in bytes. The download resumes from that offset if it is positive. If the offset is negative, the task assumes that the download is complete and calls on_end() immediately.
on_range_failed() is called if on_begin() returned a positive offset and the server does not honor the range request (that is, it does not answer with HTTP 206 Partial Content). It can be used to clean up ‘append’ resources and replace them with ‘write’ resources. The actual download starts after on_range_failed() as if on_begin() had returned 0.
on_response_ready() is called when the response is ready for iteration. The subclass must call requests.Response.close() after reading the response (and, in most cases, on_end()).
This lifecycle allows users to yield on response chunks (see undr.path.File._chunks() for an example).
- Parameters:
path_id (pathlib.PurePosixPath) – The resource’s unique path id.
suffix (Optional[str]) – Added to the file name while it is being downloaded.
server (Server) – The remote server.
stream (bool) – Whether to download the file in chunks (slightly slower for small files, reduces memory usage for large files).
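The callback order described above can be sketched without a network connection. The classes below are hypothetical stand-ins (`FakeResponse`, `LifecycleSketch`, and the two `fetch` helpers are not part of undr), the task manager and session are omitted, and the check against status 206 is an assumption about how a successful range request is detected.

```python
from typing import Callable, List, Optional


class FakeResponse:
    """Hypothetical stand-in for requests.Response."""

    def __init__(self, status_code: int):
        self.status_code = status_code
        self.closed = False

    def close(self) -> None:
        self.closed = True


class LifecycleSketch:
    """Records the order in which the lifecycle callbacks fire."""

    def __init__(self, offset: int):
        self.offset = offset
        self.calls: List[str] = []

    def on_begin(self) -> int:
        self.calls.append("on_begin")
        return self.offset

    def on_range_failed(self) -> None:
        self.calls.append("on_range_failed")

    def on_response_ready(self, response: FakeResponse) -> None:
        self.calls.append("on_response_ready")
        response.close()  # the subclass is responsible for closing
        self.on_end()

    def on_end(self) -> None:
        self.calls.append("on_end")

    def run(self, fetch: Callable[[Optional[int]], FakeResponse]) -> None:
        offset = self.on_begin()
        if offset < 0:
            # Negative offset: nothing to download.
            self.on_end()
            return
        response = None
        if offset > 0:
            # Positive offset: try to resume with a range request.
            response = fetch(offset)
            if response.status_code != 206:
                response.close()
                self.on_range_failed()
                response = None
        if response is None:
            # Full download, as if on_begin() had returned 0.
            response = fetch(None)
        self.on_response_ready(response)


def server_with_range(offset: Optional[int]) -> FakeResponse:
    return FakeResponse(200 if offset is None else 206)


def server_without_range(offset: Optional[int]) -> FakeResponse:
    return FakeResponse(200)
```

Running `LifecycleSketch(offset=10).run(server_without_range)` records `on_begin`, `on_range_failed`, `on_response_ready`, `on_end`, in that order.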
- abstract on_begin(manager: undr.task.Manager) → int
Called before contacting the server.
This function must return an offset in bytes.
0 indicates that the file is not downloaded yet.
Positive values indicate the number of bytes already downloaded.
Negative values indicate that the download is already complete and must be skipped.
- Parameters:
manager (task.Manager) – The task manager for reporting updates.
- Returns:
Number of bytes already downloaded.
- Return type:
int
- abstract on_end(manager: undr.task.Manager) → None
Called when the download task completes.
This function is called automatically if the byte offset returned by on_begin() is negative. Implementations should call it after consuming the response in on_response_ready().
- Parameters:
manager (task.Manager) – The task manager for reporting updates.
- abstract on_range_failed(manager: undr.task.Manager) → None
Called if the HTTP range request fails.
The HTTP range request asks the server to resume the download at a given byte offset. It is used when on_begin() returns a positive value. Range requests are not supported by every server. This function should reset counters and prepare the local file system for a standard (full) download.
- Parameters:
manager (task.Manager) – The task manager for reporting updates.
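As a rough illustration of the range mechanism, the helpers below build the `Range` header for a given offset and decide whether the server honored it. Both function names are hypothetical, and treating any status other than 206 as a failed range request is an assumption, not a statement about undr's implementation.

```python
from typing import Dict


def range_headers(offset: int) -> Dict[str, str]:
    """Returns the headers for resuming a download at `offset` bytes."""
    if offset <= 0:
        # Nothing downloaded yet: request the whole resource.
        return {}
    # Open-ended byte range: from `offset` to the end of the resource.
    return {"Range": f"bytes={offset}-"}


def range_honored(status_code: int) -> bool:
    """206 (Partial Content) means the server sent only the requested range."""
    return status_code == 206
```

With requests, such headers would be passed as `session.get(url, headers=range_headers(offset), stream=True)`.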
- abstract on_response_ready(response: requests.Response, manager: undr.task.Manager) → None
Called when the HTTP response object is ready.
The response object can be used to download the remote file.
- Parameters:
response (requests.Response) – HTTP response object.
manager (task.Manager) – The task manager for reporting updates.
- run(session: requests.Session, manager: undr.task.Manager)
- class undr.remote.DownloadFile(path_root: pathlib.Path, path_id: pathlib.PurePosixPath, suffix: str | None, server: Server, force: bool, expected_size: int | None, expected_hash: str | None)
Bases: Download
Retrieves data from a remote server and saves it to a file.
- on_begin(manager: undr.task.Manager) → int
Opens the local file before starting the download.
If the file exists, this function opens it in append mode and returns its size in bytes.
- Parameters:
manager (task.Manager) – The task manager for reporting updates.
- Returns:
Number of bytes already downloaded.
- Return type:
int
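The append-or-create behaviour can be sketched as follows. `open_for_resume` is a hypothetical helper, and creating missing parent directories is an assumption made to keep the example self-contained.

```python
import pathlib
import tempfile
from typing import BinaryIO, Tuple


def open_for_resume(path: pathlib.Path) -> Tuple[BinaryIO, int]:
    """Opens `path` for download and returns (file, bytes already present)."""
    if path.is_file():
        # Partial file found: keep its content and append to it.
        return open(path, "ab"), path.stat().st_size
    path.parent.mkdir(parents=True, exist_ok=True)
    return open(path, "wb"), 0


def demo() -> bytes:
    """Writes a file in two passes, resuming on the second pass."""
    with tempfile.TemporaryDirectory() as directory:
        path = pathlib.Path(directory) / "dataset" / "file.bin.download"
        for payload in (b"abc", b"de"):
            file, _offset = open_for_resume(path)
            file.write(payload)
            file.close()
        return path.read_bytes()
```

The second pass in `demo()` finds the three bytes written by the first pass and appends to them instead of overwriting.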
- on_end(manager: undr.task.Manager)
Checks the hash and closes the file.
- Parameters:
manager (task.Manager) – The task manager for reporting updates.
- Raises:
exception.HashMismatch – if the provided and effective hashes are different.
exception.SizeMismatch – if the provided and effective sizes are different.
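A minimal sketch of such a check is shown below. The exception classes are local stand-ins for the ones named above, `verify` is a hypothetical helper, and the hash algorithm (`sha256` here) is a placeholder since this page does not state which one is used.

```python
import hashlib
from typing import Optional


class SizeMismatch(Exception):
    """Local stand-in for exception.SizeMismatch."""


class HashMismatch(Exception):
    """Local stand-in for exception.HashMismatch."""


def verify(
    data: bytes,
    expected_size: Optional[int],
    expected_hash: Optional[str],
    algorithm: str = "sha256",  # placeholder, not necessarily undr's choice
) -> None:
    """Raises if the downloaded bytes do not match the expected metadata."""
    if expected_size is not None and len(data) != expected_size:
        raise SizeMismatch(f"expected {expected_size} bytes, got {len(data)}")
    if expected_hash is not None:
        actual_hash = hashlib.new(algorithm, data).hexdigest()
        if actual_hash != expected_hash:
            raise HashMismatch(f"expected {expected_hash}, got {actual_hash}")
```

For large files, the hash object would more likely be updated chunk by chunk during the download rather than computed over the whole payload at the end.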
- on_range_failed(manager: undr.task.Manager)
Re-opens the file in write mode.
- Parameters:
manager (task.Manager) – The task manager for reporting updates.
- on_response_ready(response: requests.Response, manager: undr.task.Manager) → None
Iterates over the file chunks and writes them to the file.
- Parameters:
response (requests.Response) – HTTP response object.
manager (task.Manager) – The task manager for reporting updates.
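The chunk loop might look like the sketch below. `write_chunks` is a hypothetical helper that accepts any iterable of byte chunks; with requests, the chunks would typically come from `response.iter_content(chunk_size=...)`. Progress reporting through the task manager is left out.

```python
import io
from typing import BinaryIO, Iterable


def write_chunks(
    chunks: Iterable[bytes], file: BinaryIO, initial_bytes: int = 0
) -> int:
    """Writes each chunk to `file` and returns the total size in bytes."""
    total = initial_bytes
    for chunk in chunks:
        if not chunk:
            # Skip empty chunks.
            continue
        file.write(chunk)
        total += len(chunk)
    return total


buffer = io.BytesIO()
total = write_chunks([b"ab", b"", b"cde"], buffer, initial_bytes=10)
```

Here `initial_bytes` plays the role of the offset returned by on_begin(), so `total` counts both the resumed part and the newly written bytes.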
- class undr.remote.NullServer
Bases: Server
A placeholder server that raises an exception when used.
Some functions and classes require a server to download resources that are not available locally. If the resources are known to be local, this server can be used to detect download attempts.
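The idea can be illustrated with a small stand-in class. `PlaceholderServer` is hypothetical, and the choice of `NotImplementedError` is an assumption, since this page only says that an exception is raised.

```python
import pathlib


class PlaceholderServer:
    """Hypothetical server that fails loudly on any download attempt."""

    def path_id_to_url(self, path_id: pathlib.PurePosixPath) -> str:
        # Reaching this point means that a resource assumed to be local
        # was about to be downloaded.
        raise NotImplementedError(f"unexpected download attempt for {path_id}")
```

Passing such an object where a server is required turns a silent network access into an immediate, traceable error.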
- abstract path_id_to_url(path_id: pathlib.PurePosixPath)
Calculates a resource URL from its path ID.
- Parameters:
path_id (pathlib.PurePosixPath) – The resource’s path ID, including the dataset name.
- Returns:
The resource’s remote URL.
- Return type:
- class undr.remote.Progress
Message that reports download progress.
- initial_bytes: int
Number of bytes of the remote resource that were already downloaded when the current download began.
- path_id: pathlib.PurePosixPath
Path ID of the associated resource.
- class undr.remote.Server
Represents a remote server.
- url: str
The server’s base URL.
Resource URLs are calculated by appending the file’s path ID to the server URL. A slash is inserted before the path ID if the server’s URL does not end with one.
- path_id_to_url(path_id: pathlib.PurePosixPath) → str
Calculates a resource URL from its path ID.
- Parameters:
path_id (pathlib.PurePosixPath) – The resource’s path ID, including the dataset name.
- Returns:
The resource’s remote URL.
- Return type:
str
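The URL rule stated for `url` (append the path ID, inserting a slash only when the base URL lacks a trailing one) can be sketched as a free function. The function below and the example URL are illustrative only, and the sketch does not percent-encode the path.

```python
import pathlib


def path_id_to_url(base_url: str, path_id: pathlib.PurePosixPath) -> str:
    """Joins a server base URL and a resource path ID."""
    separator = "" if base_url.endswith("/") else "/"
    return f"{base_url}{separator}{path_id.as_posix()}"


path_id = pathlib.PurePosixPath("dataset/train/file.bin")
without_slash = path_id_to_url("https://example.org/data", path_id)
with_slash = path_id_to_url("https://example.org/data/", path_id)
```

Both calls yield `https://example.org/data/dataset/train/file.bin`.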