TY - GEN
T1 - FlexQuery
T2 - 15th IEEE International Conference on Cluster Computing, CLUSTER 2013
AU - Zou, Hongbo
AU - Schwan, Karsten
AU - Slawinska, Magdalena
AU - Wolf, Matt
AU - Eisenhauer, Greg
AU - Zheng, Fang
AU - Dayal, Jai
AU - Logan, Jeremy
AU - Liu, Qing
AU - Klasky, Scott
AU - Bode, Tanja
AU - Clark, Michael
AU - Kinsey, Matt
PY - 2013
Y1 - 2013
AB - The remote visual exploration of live data generated by scientific simulations is useful for scientific discovery, performance monitoring, and online validation of the simulation results. Online visualization methods are challenged, however, by the continued growth in the volume of simulation output data that has to be transferred from its source - the simulation running on the high-end machine - to where it is analyzed, visualized, and displayed. A specific challenge in this context is the limited communication bandwidth between data source(s) and sinks. Previous work places queries 'near' data sources, exploiting their data reduction capabilities, but such work does not address the common scenario in which scientists make multiple different queries on the data being produced. This paper considers the general case in which science users are interested in different (sub)sets of the data produced by a high-end simulation. We offer the FlexQuery online data query system that can deploy and execute data queries 'along' the I/O and analytics pipelines. FlexQuery carefully extends such analytics pipelines, using online performance monitoring and data location tracking, to realize data queries in ways that minimize additional data movement and offer low latency in data query execution. Using a real-world scientific application - the Maya astrophysics code and its analytics workflow - we demonstrate FlexQuery's ability to dynamically deploy queries for low-latency remote data visualization.
KW - data reduction
KW - online query
KW - remote visualization
UR - http://www.scopus.com/inward/record.url?scp=84893540205&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84893540205&partnerID=8YFLogxK
U2 - 10.1109/CLUSTER.2013.6702635
DO - 10.1109/CLUSTER.2013.6702635
M3 - Conference contribution
AN - SCOPUS:84893540205
SN - 9781479908981
T3 - Proceedings - IEEE International Conference on Cluster Computing, ICCC
BT - 2013 IEEE International Conference on Cluster Computing, CLUSTER 2013
Y2 - 23 September 2013 through 27 September 2013
ER -