dep_tools.task module
Tasks form the core of the DEP scaling procedure. Each task orchestrates a run by loading data, processing it, and writing the output. Tasks can be generic but are sometimes fine-tuned for specific processing.
- class dep_tools.task.Task(task_id, loader, processor, writer, logger)
Bases: ABC
The abstract base for Task objects.
Task objects load data, process it, and write output. They are reusable for the same task operating on new data.
- Parameters:
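The load, process, write flow described for Task can be illustrated with a minimal sketch. This is not the dep_tools implementation: `SketchTask`, `LoadProcessWriteTask`, the `run` method name, and the use of plain functions as loader, processor, and writer are all assumptions made for illustration.

```python
import logging
from abc import ABC, abstractmethod


class SketchTask(ABC):
    """Hypothetical stand-in for an abstract task base (not dep_tools code)."""

    def __init__(self, task_id, loader, processor, writer, logger=None):
        self.task_id = task_id
        self.loader = loader
        self.processor = processor
        self.writer = writer
        self.logger = logger or logging.getLogger(__name__)

    @abstractmethod
    def run(self):
        """Run the task. The method name `run` is an assumption."""


class LoadProcessWriteTask(SketchTask):
    """Concrete task: load, then process, then write."""

    def run(self):
        data = self.loader(self.task_id)
        output = self.processor(data)
        return self.writer(output)


# Plain functions stand in for loader, processor, and writer objects.
storage = []


def load(task_id):
    return [1, 2, 3]


def double(data):
    return [2 * value for value in data]


def write(output):
    storage.append(output)
    return len(storage) - 1  # index of the stored output


task = LoadProcessWriteTask("tile-1", load, double, write)
result = task.run()
```

Because the three collaborators are passed in rather than hard-coded, the same task class can be reused with a different loader or writer for new data, which matches the reuse described above.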
- class dep_tools.task.AreaTask(id, area, loader, processor, writer, logger=<RootLogger root (WARNING)>)
Bases: Task
An AreaTask adds an area property to a basic Task.
Most other arguments are as for Task.
- Parameters:
area (GeoDataFrame) – An area for use by the loader and/or processor. For instance, it can be used to clip data.
- class dep_tools.task.StacTask(id, area, searcher, loader, processor, writer, post_processor=None, stac_creator=None, stac_writer=None, logger=<RootLogger root (WARNING)>)
Bases: AreaTask
A StacTask extends AreaTask by adding a searcher and optional post-processor, STAC creator, and STAC writer.
Most arguments (id, area, loader, processor, writer, logger) are as for AreaTask.
- Parameters:
searcher (Searcher) – The searcher searches for data, typically on the basis of the id and/or the area.
post_processor (Optional[Processor]) – A Processor that can prep data for writing, for example scaling or data type conversions.
stac_creator (Optional[StacCreator]) – Creates a STAC Item from the data.
stac_writer (Optional[Writer]) – Writes the STAC Item to storage.
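One plausible ordering of the StacTask components is sketched below. The step order, the function name `run_stac_style_pipeline`, the callable signatures, and the dict used in place of a STAC Item are assumptions for illustration; the actual class may sequence or invoke its components differently.

```python
def run_stac_style_pipeline(task_id, area, searcher, loader, processor, writer,
                            post_processor=None, stac_creator=None,
                            stac_writer=None):
    """Hypothetical search -> load -> process -> write flow with optional steps."""
    items = searcher(area)
    data = loader(items)
    output = processor(data)
    if post_processor is not None:
        # Optional preparation for writing, e.g. scaling or type conversion.
        output = post_processor(output)
    path = writer(output, task_id)
    if stac_creator is not None and stac_writer is not None:
        stac_writer(stac_creator(task_id, output))
    return path


calls = []          # records the order in which steps run
written_stac = []   # stands in for STAC storage


def searcher(area):
    calls.append("search")
    return ["item-a", "item-b"]


def loader(items):
    calls.append("load")
    return [0.5, 0.25]


def processor(data):
    calls.append("process")
    return [value * 2 for value in data]


def post_processor(output):
    calls.append("post_process")
    return [int(value * 100) for value in output]  # scale, then cast to int


def writer(output, task_id):
    calls.append("write")
    return f"{task_id}.tif"  # hypothetical output path


def stac_creator(task_id, output):
    calls.append("create_stac")
    return {"id": task_id, "values": output}  # dict stands in for a STAC Item


def stac_writer(item):
    calls.append("write_stac")
    written_stac.append(item)


path = run_stac_style_pipeline(
    "tile-1", "area-placeholder", searcher, loader, processor, writer,
    post_processor=post_processor, stac_creator=stac_creator,
    stac_writer=stac_writer,
)
```

Leaving the three optional arguments at None reduces this sketch to a plain search, load, process, write flow.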
- class dep_tools.task.AwsStacTask(itempath, id, area, searcher, loader, processor, post_processor=None, logger=<RootLogger root (WARNING)>, **kwargs)
Bases: StacTask
A convenience class with values of writer, stac_creator, and stac_writer set to sensible defaults for writing to S3.
By default, an AwsDsCogWriter is used as the primary writer, an AwsStacWriter is used to write STAC Items, and the base StacCreator is used to create the STAC object.
All other arguments are as for StacTask.
- Parameters:
**kwargs – Additional arguments passed to StacTask.
- class dep_tools.task.ItemStacTask(id, item, loader, processor, writer, post_processor=None, stac_creator=None, stac_writer=None, logger=<RootLogger root (WARNING)>)
Bases: Task
A task for a single STAC item.
Most arguments are as for StacTask, except area is dropped.
- Parameters:
item (Item) – A pystac.Item representing the input data.
- class dep_tools.task.ErrorCategoryAreaTask(id, area, loader, processor, writer, logger=<RootLogger root (WARNING)>)
Bases: AreaTask
An AreaTask with extra logging.
- Errors logged include:
EmptyCollectionError from loader: logged as “no items for areas”
Other Exception from loader: logged as “load error”
Exception from processor: logged as “processor error”
Empty processor output: logged as “no output from processor”
Writer error: logged as “error” (could be from writer or something beforehand if using dask)
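The error categories listed above could be produced by wrapping each stage in its own try/except, as in the following sketch. It is not the library's code: `run_with_error_categories` is a hypothetical function, `EmptyCollectionError` is redefined locally as a stand-in, treating None as "empty" processor output is an assumption, and the "complete" return value is invented for the example.

```python
import logging


class EmptyCollectionError(Exception):
    """Local stand-in for the exception named in the list above."""


def run_with_error_categories(task_id, loader, processor, writer, logger):
    """Run load/process/write and return the category that was logged."""
    try:
        data = loader(task_id)
    except EmptyCollectionError:
        logger.error("%s: no items for areas", task_id)
        return "no items for areas"
    except Exception:
        logger.exception("%s: load error", task_id)
        return "load error"

    try:
        output = processor(data)
    except Exception:
        logger.exception("%s: processor error", task_id)
        return "processor error"

    if output is None:  # assumption: None means the processor produced nothing
        logger.error("%s: no output from processor", task_id)
        return "no output from processor"

    try:
        writer(output)
    except Exception:
        # With lazy (dask) computation, earlier failures may surface here.
        logger.exception("%s: error", task_id)
        return "error"
    return "complete"  # hypothetical success marker


logger = logging.getLogger("error_category_sketch")
logger.addHandler(logging.NullHandler())  # keep the demo quiet
logger.propagate = False


def empty_loader(task_id):
    raise EmptyCollectionError()


def good_loader(task_id):
    return [1, 2]


def failing_processor(data):
    raise ValueError("bad data")


def noop_writer(output):
    return None


empty_result = run_with_error_categories("a", empty_loader, sum, noop_writer, logger)
processor_result = run_with_error_categories(
    "b", good_loader, failing_processor, noop_writer, logger
)
ok_result = run_with_error_categories("c", good_loader, sum, noop_writer, logger)
```

Catching the specific exception before the general `Exception` clause is what lets an empty search result be distinguished from any other loader failure.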
- class dep_tools.task.MultiAreaTask(ids, areas, logger, task_class, fail_on_error=True, **kwargs)
Bases: object
A “Task” object that iterates over multiple IDs and runs a task for each.
This class is useful when running multiple short tasks where the time to build the run environment (for instance, if running on a pod) adds considerably to overall processing time.
- Parameters:
ids (list[str]) – A list of IDs.
areas (GeoDataFrame) – A geopandas.GeoDataFrame with index corresponding to the IDs.
task_class (type[AreaTask]) – The AreaTask subclass to use for each task.
fail_on_error (bool) – If True, will exit on error. Otherwise, will log the full exception and continue.
logger – A logger.
**kwargs – Additional arguments to the task_class constructor.
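The iteration and fail_on_error behaviour described above can be sketched as follows. This is an illustration under stated assumptions rather than the MultiAreaTask implementation: `run_for_each_id` and `DemoAreaTask` are hypothetical, a plain dict replaces the GeoDataFrame of areas, the per-task `run` method is assumed, and "exit on error" is modelled by re-raising the exception.

```python
import logging


def run_for_each_id(ids, areas, task_class, logger, fail_on_error=True, **kwargs):
    """Build and run one task per ID; return the IDs that completed."""
    completed = []
    for task_id in ids:
        try:
            # Extra keyword arguments are forwarded to the task constructor.
            task = task_class(task_id, areas[task_id], logger=logger, **kwargs)
            task.run()
            completed.append(task_id)
        except Exception:
            if fail_on_error:
                raise  # stop at the first failure
            # Log the full traceback and move on to the next ID.
            logger.exception("Task %s failed; continuing", task_id)
    return completed


class DemoAreaTask:
    """Hypothetical task class; fails when its area is missing."""

    def __init__(self, id, area, logger, multiplier=1):
        self.id = id
        self.area = area
        self.logger = logger
        self.multiplier = multiplier

    def run(self):
        if self.area is None:
            raise ValueError(f"no area for {self.id}")
        return self.area * self.multiplier


logger = logging.getLogger("multi_area_sketch")
logger.addHandler(logging.NullHandler())  # keep the demo quiet
logger.propagate = False

areas = {"a": 10, "b": None, "c": 30}  # dict stands in for a GeoDataFrame
completed = run_for_each_id(
    ["a", "b", "c"], areas, DemoAreaTask, logger,
    fail_on_error=False, multiplier=2,
)
```

With fail_on_error=False the failing ID is logged and skipped, so a single bad area does not discard the work already done in the same run environment.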