Publications
2025
2024
2023
- LANDLORD: Coordinating Dynamic Software Environments to Reduce Container Sprawl. IEEE Transactions on Parallel and Distributed Systems, 2023. doi: 10.1109/TPDS.2023.3241598
2022
- Dynamic Task Shaping for High Throughput Data Analysis Applications in High Energy Physics. In IEEE International Parallel and Distributed Processing Symposium, 2022. doi: 10.1109/IPDPS53621.2022.00041
2021
- Lightweight Function Monitors for Fine-Grained Management in Large Scale Python Applications. In IEEE International Parallel and Distributed Processing Symposium, 2021. doi: 10.1109/IPDPS49936.2021.00088
- Harnessing HPC resources for CMS jobs using a Virtual Private Network. In 25th International Conference on Computing in High Energy and Nuclear Physics (CHEP), 2021. doi: 10.1051/epjconf/202125102032
- A Community Roadmap for Scientific Workflows Research and Development. In IEEE Workshop on Workflows in Support of Large-Scale Science (WORKS), 2021. doi: 10.1109/WORKS54523.2021.00016
2020
- Solving the Container Explosion Problem for Distributed High Throughput Computing. In International Parallel and Distributed Processing Symposium, 2020. doi: 10.1109/IPDPS47924.2020.00048
2019
- Flexible Partitioning of Scientific Workflows Using the JX Workflow Language. In Practice and Experience in Advanced Research Computing (PEARC), 2019. doi: 10.1145/3332186.3338100
2018
- A Lightweight Model for Right-Sizing Master-Worker Applications. In ACM/IEEE Supercomputing (SC), 2018. doi: 10.1109/SC.2018.00042
- An Algebra for Robust Workflow Transformations. In IEEE International Conference on e-Science, 2018. doi: 10.1109/eScience.2018.00031
- Poster: A First Look at the JX Workflow Language. In IEEE International Conference on e-Science, 2018. doi: 10.1109/eScience.2018.00094
- Automatic Dependency Management for Scientific Applications on Clusters. In IEEE International Conference on Cloud Engineering (IC2E), 2018. doi: 10.1109/IC2E.2018.00026
- A Workflow Management System to Facilitate Reproducibility of Scientific Computing Applications. 2018
- A Job Sizing Strategy for High-Throughput Scientific Workflows. IEEE Transactions on Parallel and Distributed Systems, 2018. doi: 10.1109/TPDS.2017.2762310
2017
- Poster: Wharf: Sharing Docker Images across Hosts from a Distributed Filesystem. In IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, 2017
- Enabling Implementation and Optimization of Scientific Algorithms via Graphics Processing Units. 2017
- Facilitating the Reproducibility of Scientific Workflows with Execution Environment Specifications. In The 17th International Conference on Computational Science (ICCS), 2017. doi: 10.1016/j.procs.2017.05.116
- Deploying High Throughput Scientific Workflows on Container Schedulers with Makeflow and Mesos. In 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2017), 2017. doi: 10.1109/CCGRID.2017.9
2016
- An Analysis of Reproducibility and Non-Determinism in HEP Software and ROOT Data. In International Conference on Computing in High Energy and Nuclear Physics, 2016. doi: 10.1088/1742-6596/898/10/102007
- DistIA: A Cost-Effective Dynamic Impact Analysis for Distributed Programs. In IEEE/ACM International Conference on Automated Software Engineering, 2016. doi: 10.1145/2970276.2970352
- Balancing push and pull in Confuga, an active storage cluster file system for scientific workflows. Concurrency and Computation: Practice and Experience, 2016. doi: 10.1002/cpe.3834
2015
- A Case Study in Preserving a High Energy Physics Application with Parrot. In Journal of Physics: Conference Series (CHEP 2015), 2015. doi: 10.1088/1742-6596/664/3/032022
- Adapting Collaborative Software Development Techniques to Structural Engineering. IEEE/AIP Computing in Science and Engineering, 2015. doi: 10.1109/MCSE.2015.88
- Lessons Learned from Crowdsourcing Complex Engineering Tasks. PLOS One, 2015. doi: 10.1371/journal.pone.0134978
2014
2013
- Design of an Active Storage Cluster File System for DAG Workflows. In International Workshop on Data-Intensive Scalable Computing Systems, 2013. doi: 10.1145/2534645.2534656
- Right-sizing Resource Allocations for Scientific Applications in Clusters, Grids, and Clouds. 2013
2012
- A Framework for Scalable Genome Assembly on Clusters, Clouds, and Grids. IEEE Transactions on Parallel and Distributed Systems, 2012. doi: 10.1109/TPDS.2012.80
- A System for Management of Computational Fluid Dynamics Simulations for Civil Engineering. In 8th IEEE International Conference on eScience, 2012. doi: 10.1109/eScience.2012.6404433
- Fine-Grained Access Control in the Chirp Distributed File System. In IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, 2012. doi: 10.1109/CCGrid.2012.128
- Makeflow: A Portable Abstraction for Data Intensive Computing on Clusters, Clouds, and Grids. In Workshop on Scalable Workflow Enactment Engines and Technologies (SWEET) at ACM SIGMOD, 2012. doi: 10.1145/2443416.2443417
- Data Intensive Computing with Clustered Chirp Servers. In Data Intensive Distributed Computing: Challenges and Solutions for Large Scale Information Management, 2012. isbn: 9781615209712
2011
- Converting a High Performance Application to an Elastic Cloud Application. In The 3rd IEEE International Conference on Cloud Computing Technology and Science (CloudCom 2011), 2011
- An Introduction to Open-Source IaaS Cloud Middleware. In Cloud Computing: Methodology, Systems, and Applications, 2011. isbn: 978-1-4398-5641-3
2010
- A Comparison and Critique of Eucalyptus, OpenNebula and Nimbus. In IEEE International Conference on Cloud Computing Technology and Science, 2010. doi: 10.1109/CloudCom.2010.42
- Taming Complex Bioinformatics Workflows with Weaver, Makeflow, and Starch. In Workshop on Workflows in Support of Large Scale Science, 2010. doi: 10.1109/WORKS.2010.5671858
- Environmentally Opportunistic Computing: Transforming the Data Center for Economic and Environmental Sustainability. In IEEE Green Computing Conference, 2010. doi: 10.1109/GREENCOMP.2010.5598289
- Abstractions for Cloud Computing with Condor. In Cloud Computing and Software Services: Theory and Techniques, 2010. isbn: 9781439803158
- ROARS: A Scalable Repository for Data Intensive Scientific Computing. In The Third International Workshop on Data Intensive Distributed Computing at ACM HPDC 2010, 2010. doi: 10.1145/1851476.1851587
- Weaver: Integrating Distributed Computing Abstractions into Scientific Workflows using Python. In Challenges of Large Applications in Distributed Environments at ACM HPDC 2010, 2010. doi: 10.1145/1851476.1851570
- Towards Long Term Data Quality in a Large Scale Biometrics Experiment. In Managing Data Quality for Collaborative Science at ACM HPDC 2010, 2010. doi: 10.1145/1851476.1851559
- Biocompute: Toward a Collaborative Workspace for Data Intensive Bio-Science. In Workshop on Emerging Computational Methods for Life Sciences at ACM HPDC 2010, 2010. doi: 10.1145/1851476.1851547
- All-Pairs: An Abstraction for Data Intensive Computing on Campus Grids. IEEE Transactions on Parallel and Distributed Systems, 2010. doi: 10.1109/TPDS.2009.49
- Visualizing Massively Multithreaded Applications with ThreadScope. Concurrency and Computation: Practice and Experience, 2010. doi: 10.1002/cpe.1469
2009
- Towards Data Intensive Many Task Computing. In Data Intensive Distributed Computing: Challenges and Solutions for Large-Scale Information Management, 2009
- Coordination of Access to Large-scale Datasets in Distributed Environments. In Scientific Data Management: Challenges, Existing Technology, and Deployment, 2009. isbn: 978-1420069808
- Experience with BXGrid: A Data Repository and Computing Grid for Biometrics Research. Journal of Cluster Computing, 2009. doi: 10.1007/s10586-009-0098-7
2008
- Poster: BXGrid: A Data Repository and Workflow Abstraction for Biometrics Research. 2008. doi: 10.1109/eScience.2008.135
- Biomolecular Committor Probability Calculation Enabled by Processing in Network Storage. Journal of Parallel Computing, 2008. doi: 10.1016/j.parco.2008.08.001
- Making the Best of a Bad Situation: Prioritized Storage Management in GEMS. Future Generation Computing Systems, 2008. doi: 10.1016/j.future.2007.04.003
2007
- On Demand Transient Storage and Backup in Mobile Systems. In IEEE Military Communications Conference, 2007. doi: 10.1109/MILCOM.2007.4454917
- Lessons Learned Building TeamTrak: An Urban/Outdoor Mobile Testbed. In International Conference on Wireless Architectures Systems and Applications, 2007. doi: 10.1109/WASA.2007.35
- Work in Progress: Integrating Undergraduate Research and Education via the TeamTrak Mobile Computing Framework. In IEEE Frontiers in Education, 2007. doi: 10.1109/FIE.2007.4418007
- Poster: Lockdown: Distributed Policy Analysis and Enforcement within the Enterprise Network. In USENIX Security Symposium, 2007
- Biomolecular Path Sampling Enabled by Processing in Network Storage. In Workshop on High Performance Computational Biology at IEEE IPDPS, 2007. doi: 10.1109/IPDPS.2007.370446
2006
- Access Control for a Replica Management Database. In ACM Workshop on Storage Security and Survivability at ACM CCS, 2006. doi: 10.1145/1179559.1179567
- Experience with a Literate Approach to Computer Science. In IEEE Frontiers in Education, 2006. doi: 10.1109/FIE.2006.322405
- Applying Feedback Control to a Replica Management System. In IEEE Southeastern Symposium on System Theory, 2006. doi: 10.1109/SSST.2006.1619125
- Using Condor Glide-Ins and Parrot to Move from Dedicated Resources to the Grid. Lecture Notes in Informatics, 2006
2005
- Separating Abstractions from Resources in a Tactical Storage System. In IEEE/ACM Supercomputing, 2005. doi: 10.1109/SC.2005.64
- Work in Progress: A Literate Approach to Graduate Computer Science Education. In IEEE Frontiers in Education, 2005. doi: 10.1109/FIE.2005.1612087
- Parrot: An Application Environment for Data-Intensive Computing. Scalable Computing: Practice and Experience, 2005
- Poster: Identity Boxing: Secure User-Level Containment for the Grid. 2005. doi: 10.1109/HPDC.2005.1520984
- Generosity and Gluttony in GEMS: Grid Enabled Molecular Simulations. In IEEE Symposium on High Performance Distributed Computing, 2005. doi: 10.1109/HPDC.2005.1520959
- Distributed Computing in Practice: The Condor Experience. Concurrency and Computation: Practice and Experience, 2005. doi: 10.1002/cpe.v17:2/4
2004
- Building Reliable Clients and Servers. In Grid: Blueprint for a New Computing Infrastructure, 2004. isbn: 1-55860-933-4
- Explicit Control in a Batch Aware Distributed File System. In USENIX Networked Systems Design and Implementation (NSDI), 2004
2003
- Condor and the Grid. In Grid Computing: Making the Global Infrastructure a Reality, 2003. isbn: 0-470-85319-0
2002
2001
- Gathering at the Well: Creating Communities for Grid I/O. In IEEE/ACM Supercomputing, 2001. doi: 10.1109/SC.2001.10023
- The Kangaroo Approach to Data Movement on the Grid. In IEEE High Performance Distributed Computing, 2001. doi: 10.1109/HPDC.2001.945200
- Multiple Bypass: Interposition Agents for Distributed Computing. Journal of Cluster Computing, 2001. doi: 10.1023/A:1011412209850
2000
- Bypass: A Tool for Building Split Execution Systems. In IEEE High Performance Distributed Computing, 2000. doi: 10.1109/HPDC.2000.868637