Hello
I keep getting an error when downloading a specific S2 date range. I suspect some image corruption may have occurred, because other time periods work just fine.
I get a pretty huge stacktrace, but these are the highlights:
{ 'id': '[1749817012387, 17005]',
  'level': 'info',
  'message': 'load_collection: Creating raster datacube for '
             "TERRASCOPE_S2_TOC_V2 with arguments {'temporal_extent': "
             "('2025-05-26', '2025-06-02'), 'spatial_extent': {'west': "
             "2.530489164377812, 'south': 50.646159831055506, 'east': "
             "5.952960980443944, 'north': 51.503991645411425, 'crs': "
             "'EPSG:4326'}, 'global_extent': {'west': 466800, 'south': "
             "5610580, 'east': 708770, 'north': 5710010, 'crs': "
             "'EPSG:32631'}, 'bands': ['B02', 'B03', 'B04', 'B08', "
             "'B11', 'B12', 'SCL'], 'properties': {'eo:cloud_cover': "
             "{'process_graph': {'lte1': {'result': True, 'process_id': "
             "'lte', 'arguments': {'x': {'from_parameter': 'value'}, "
             "'y': 80}}}}}, 'aggregate_spatial_geometries': None, "
             "'sar_backscatter': None, 'process_types': "
             "{<ProcessType.GLOBAL_TIME: 4>}, 'custom_mask': {'method': "
             "'mask_scl_dilation', 'scl_band_name': None}, 'data_mask': "
             "None, 'target_crs': None, 'target_resolution': None, "
             "'resample_method': 'near', 'pixel_buffer': None}, "
             "environment: {'vault_token': None, "
             "'sentinel_hub_client_alias': 'default', "
             "'max_soft_errors_ratio': 0.1, 'dependencies': [], "
             "'pyramid_levels': 'highest', 'require_bounds': True, "
             "'correlation_id': 'j-250613121558405e905bcf390fb6547e', "
             "'user': User('xxx@egi.eu', None), "
             "'openeo_api_version': '1.2'}",
  'time': '2025-06-13T12:16:52.387Z'},
...
{ 'id': '[1749819025201, 45095]',
  'level': 'warning',
  'message': 'A part of the process graph failed, and will be retried, '
             'the reason was: "Job aborted due to stage failure: Task 0 '
             'in stage 11.0 failed 4 times, most recent failure: Lost '
             'task 0.3 in stage 11.0 (TID 1142) (epod189.vgt.vito.be '
             'executor 23): ExecutorLostFailure (executor 23 exited '
             'caused by one of the running tasks) Reason: Container from '
             'a bad node: container_e5156_1749151415540_18817_01_000034 '
             'on host: epod189.vgt.vito.be. Exit status: 143. '
             'Diagnostics: [2025-06-13 14:50:24.695]Container killed on '
             'request. Exit code is 143\n'
             '[2025-06-13 14:50:24.716]Container exited with a non-zero '
             'exit code 143. \n'
             '[2025-06-13 14:50:24.716]Killed by external signal\n'
             '.\n'
             'Driver stacktrace:"\n'
             'Your job may still complete if the failure was caused by a '
             'transient error, but will take more time. A common cause '
             'of transient errors is too little executor memory '
             '(overhead). Too low executor-memory can be seen by a high '
             "'garbage collection' time, which was: 0.008 seconds.\n",
  'time': '2025-06-13T12:50:25.201Z'},
..
{ 'id': '[1749819071990, 25576]',
  'level': 'error',
  'message': 'OpenEO batch job failed: Your batch job failed because '
             'workers used too much memory. The same task was attempted '
             'multiple times. Consider increasing executor-memory, '
             'python-memory or executor-memoryOverhead or contact the '
             'developers to investigate.',
  'time': '2025-06-13T12:51:11.990Z'}]
Thanks!
Comments
Hi Joeri
Thanks for reporting the problem. We'll take a look and update you when we find out more.
memory
Hi, you could try with higher memory options when running the task. It is a relatively large task, and the last message suggests the job ran out of memory. This might help: datacube.create_job(job_options={"python-memory": "8G"})
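For context, here is a minimal sketch of how that option can be attached to a batch job with the openEO Python client. The backend URL is an assumption (adjust it to the backend you connect to), and the load_collection arguments are just shortened/rounded versions of what appears in your log, so treat this as illustrative rather than a definitive recipe:

import openeo

# Assumed backend URL; replace with the openEO backend you normally use.
connection = openeo.connect("openeo.vito.be").authenticate_oidc()

# Rebuild the datacube roughly as in the log above (extent rounded, arguments shortened).
datacube = connection.load_collection(
    "TERRASCOPE_S2_TOC_V2",
    spatial_extent={"west": 2.53, "south": 50.65, "east": 5.95, "north": 51.50},
    temporal_extent=("2025-05-26", "2025-06-02"),
    bands=["B02", "B03", "B04", "B08", "B11", "B12", "SCL"],
    max_cloud_cover=80,
)

# Create the batch job with extra Python worker memory, run it and download the results.
job = datacube.create_job(job_options={"python-memory": "8G"})
job.start_and_wait()
job.get_results().download_files("output")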
If that is not enough, you can try with:

job_options = {
    "driver-memory": "6G",
    "driver-memoryOverhead": "8G",
    "executor-memory": "6G",
    "executor-memoryOverhead": "8G",
}
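That dict can then be passed in the same way as above (same assumptions as the earlier sketch):

job = datacube.create_job(job_options=job_options)
job.start_and_wait()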
But try to aim for the least memory possible, to avoid long queue times and higher credit costs.
Emile