Script Layering for large-scale number crunching in Python
By Catherine Moroney

Python's scripting capabilities, combined with its number-crunching utility, make it very easy to gather all the data, submit thousands of jobs, and then monitor their progress, all with a single command. One script submits a single orbit, a second calls the first to run a month's or a year's worth of data, and a third calls the second to run the entire processing chain. Script layering!

Python's number-crunching and plotting prowess, combined with its scripting capabilities, makes it easy to submit a massive number of scientific jobs in a fully automated fashion. I often need to run hundreds or thousands of orbits through the processing software. By dividing the work among multiple scripts, each calling the one below it, I have a customizable and automated way of finding the input files, caching them to a local disk, creating the input files, submitting each orbit to our batch-processing system, and monitoring its progress. The entire chain of 10-15 executables can then be run in order with a single command.
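The layering described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual scripts from the talk: the function names, the `DryRunBatch` class, the executable names, and the `--orbit` flag are all hypothetical, and the stand-in batch system only records commands instead of running them. File discovery, caching, and job monitoring are left out.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List


@dataclass
class DryRunBatch:
    """Hypothetical stand-in for a batch system: records jobs, runs nothing."""
    submitted: List[str] = field(default_factory=list)

    def submit(self, command: str) -> int:
        self.submitted.append(command)
        return len(self.submitted)  # fake job id: position in the queue


def run_orbit(executable: str, orbit: int, batch: DryRunBatch) -> int:
    """Layer 1: submit one executable for a single orbit."""
    # The command-line format here is an assumption made for illustration.
    return batch.submit(f"{executable} --orbit {orbit}")


def run_orbits(executable: str, orbits: Iterable[int],
               batch: DryRunBatch) -> List[int]:
    """Layer 2: call layer 1 for every orbit in a range (a month, a year)."""
    return [run_orbit(executable, orbit, batch) for orbit in orbits]


def run_chain(executables: Iterable[str], orbits: Iterable[int],
              batch: DryRunBatch) -> Dict[str, List[int]]:
    """Layer 3: call layer 2 once per executable, in chain order."""
    orbit_list = list(orbits)  # reuse the same orbits for every stage
    # A real chain would wait for each stage to finish before starting
    # the next; that dependency handling is omitted from this sketch.
    return {exe: run_orbits(exe, orbit_list, batch) for exe in executables}
```

With this structure, the single top-level command amounts to one call to `run_chain` with a list of executables and a range of orbit numbers. Each layer stays small enough to run and test on its own, for example by running a single orbit through one stage before launching a full month.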

Catherine Moroney

Physicist and software engineer at the Jet Propulsion Laboratory, studying clouds and climate by analyzing satellite data. I help develop the algorithms that generate scientific data products from the raw data, analyze their performance, and then turn them into production code. I use Python very heavily for the heavy-duty number crunching, for plotting the results, and for a lot of scripting to submit and run all the jobs in batch-processing fashion.
