TaskVine is a task scheduler for building large-scale, data-intensive dynamic workflows that run on HPC clusters, GPU clusters, and commercial clouds. As tasks access external data sources and produce their own outputs, more and more data is pulled into local storage on workers. This data is used to accelerate future tasks and avoid recomputing existing results. Data gradually grows “like a vine” through the cluster. TaskVine is our third-generation workflow system, built on our twenty years of experience creating scalable applications in fields such as high energy physics, bioinformatics, molecular dynamics, and machine learning.
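The idea of data accumulating on workers and being reused by later tasks can be illustrated with a toy model. The sketch below is not the TaskVine API and does not claim to reproduce its scheduling policy; the `Worker` class, the `pick_worker` and `run_task` helpers, and the file names are all hypothetical, chosen only to show how retaining inputs and outputs on a worker can remove transfers for later tasks.

```python
from dataclasses import dataclass, field


@dataclass
class Worker:
    """A hypothetical worker with a local cache of named data objects."""
    name: str
    cache: set = field(default_factory=set)


def pick_worker(workers, inputs):
    """Choose the worker that already holds the most of the task's inputs.

    Ties go to the earliest worker in the list (an arbitrary toy policy).
    """
    return max(workers, key=lambda w: len(w.cache & set(inputs)))


def run_task(workers, inputs, output):
    """Place a task, fetch its missing inputs, and keep its output locally.

    Returns (worker name, number of inputs that had to be transferred).
    """
    worker = pick_worker(workers, inputs)
    missing = set(inputs) - worker.cache
    worker.cache |= missing    # fetched data stays on the worker
    worker.cache.add(output)   # so does the task's own output
    return worker.name, len(missing)


workers = [Worker("w1"), Worker("w2")]

# The first task must transfer its input from outside the cluster.
first = run_task(workers, ["dataset.root"], "hist1")

# The second task reuses both the cached input and the earlier output.
second = run_task(workers, ["dataset.root", "hist1"], "hist2")

print(first, second)
```

In this toy run the first task transfers one input, while the second transfers none because everything it needs is already on the same worker. A real system also has to handle limited worker storage, worker-to-worker transfers, and failures, which this sketch leaves out.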
Related Publications
Reshaping High Energy Physics Applications for Near-Interactive Execution Using TaskVine
Barry Sly-Delgado, Ben Tovar, Jin Zhou, and Douglas Thain
In ACM/IEEE Supercomputing, 2024
@inproceedings{reshaping-sc-2024,author={Sly-Delgado, Barry and Tovar, Ben and Zhou, Jin and Thain, Douglas},title={{Reshaping High Energy Physics Applications for Near-Interactive Execution Using TaskVine}},booktitle={{ACM/IEEE Supercomputing}},pages={1--11},year={2024},cclpaperid={996},keywords={taskvine},doi={10.1109/SC41406.2024.00068}}
Poster: Leveraging Intermediate Data Management with Parsl/TaskVine
Colin Thomas and Douglas Thain
In Greater Chicago Area Systems Research Workshop, 2024
@inproceedings{data-gcasr-2024,author={Thomas, Colin and Thain, Douglas},title={{Poster: Leveraging Intermediate Data Management with Parsl/TaskVine}},booktitle={{Greater Chicago Area Systems Research Workshop}},pages={1},year={2024},cclpaperid={998},keywords={taskvine},}
Poster: Reshaping High Energy Physics Applications for Near-Interactive Execution Using TaskVine
Barry Sly-Delgado and Douglas Thain
In Greater Chicago Area Systems Research Workshop, 2024
@inproceedings{reshaping-gcasr-2024,author={Sly-Delgado, Barry and Thain, Douglas},title={{Poster: Reshaping High Energy Physics Applications for Near-Interactive Execution Using TaskVine}},booktitle={{Greater Chicago Area Systems Research Workshop}},pages={1},year={2024},cclpaperid={1002},keywords={taskvine},}
TaskVine: Managing In-Cluster Storage for High-Throughput Data Intensive Workflows
Barry Sly-Delgado, Thanh Son Phung, Colin Thomas, David Simonetti, Andrew Hennessee, Ben Tovar, and Douglas Thain
In 18th Workshop on Workflows in Support of Large-Scale Science, 2023
@inproceedings{taskvine-works-2023,author={Sly-Delgado, Barry and Phung, Thanh Son and Thomas, Colin and Simonetti, David and Hennessee, Andrew and Tovar, Ben and Thain, Douglas},title={{TaskVine: Managing In-Cluster Storage for High-Throughput Data Intensive Workflows}},booktitle={{18th Workshop on Workflows in Support of Large-Scale Science}},year={2023},cclpaperid={991},keywords={taskvine},}
Poster: TaskVine: A User-Level Framework for Data Intensive Scientific Applications
Douglas Thain
In CSSI PI Meeting, 2023
@inproceedings{taskvine-cssi-2023,author={Thain, Douglas},title={{Poster: TaskVine: A User-Level Framework for Data Intensive Scientific Applications}},booktitle={{CSSI PI Meeting}},year={2023},cclpaperid={989},keywords={taskvine},}
Poster: Mixed Modality Workflows in TaskVine
David Simonetti, Ben Tovar, and Douglas Thain
In ACM High Performance Distributed Computing, 2023
@inproceedings{mixed-hpdc-2023,author={Simonetti, David and Tovar, Ben and Thain, Douglas},title={{Poster: Mixed Modality Workflows in TaskVine}},booktitle={{ACM High Performance Distributed Computing}},pages={331--332},year={2023},cclpaperid={988},keywords={taskvine},}
Acknowledgement
This work was supported in part by grant OAC #1931348 "CSSI Elements: Data Swarm: A User-Level Framework for Data Intensive Scientific Computing".