Interaction Designer, Researcher, Software Developer
Research, Prototyping, Evaluation, Pilot Tool Development
May 2016 - June 2016
Chris Chung, Lyle Klyne, Ankit Potdar
Improving the visualization analysis workflow for an oceanographic data set
Salinity Explorer is an interactive data visualization tool for uncovering insights in an oceanographic data set collected to understand how freshwater mixes with the ocean.
Densely populated coastal areas are often situated at river mouths, so understanding how river water mixes with ocean water can be key to understanding coastal pollutant, nutrient, and sediment transport in these areas. Our sponsors, researchers Sam Kastner, Alex Horner-Devine, and Jim Thomson of the Civil and Environmental Engineering and Applied Physics Lab at the University of Washington, collected data from the Fraser River plume to understand the main forcing factors that govern how fresh water mixes with the ocean.
Kastner and his group rely on data visualization to explore their high-dimensional data set, but had limited expertise in developing analysis tools. We set out to improve and standardize their visualization toolset through better data encoding and workflow.
The Fraser River plume data was collected by specialized drifting buoys developed by Jim Thomson's lab, called Surface Wave Instrument Floats with Tracking, or SWIFTs. The array of sensors on the SWIFTs, and the high-dimensional data they produce, make analysis complicated. Additionally, the methods previously used to process, encode, and ultimately understand the data had significant room for improvement.
Understanding Goals and Existing Workflow
We started with a broad discussion with our collaborators to understand the questions they had for their data set and their current workflow. The SWIFT data was collected to understand, broadly, how physical forcing conditions such as wind speed and direction and tidal phase and magnitude influence the mixing of fresh and ocean water. The data set Kastner’s lab is analyzing contains 24 parameters collected from the SWIFT buoys. We reviewed the data by clarifying encodings, units, expected values, and expected relationships. The previous visualization process required custom MATLAB scripts and Google Earth map configuration for every visualization. This multistep process was time consuming, and the resulting visualizations were difficult to interpret.
- Surface salinity and depth salinity were the first dimensions examined to sanity check the validity of the data set.
- Many data parameters that held constant, including atmospheric pressure, were ignored.
- Rainbow color schemes were used liberally to encode data, even though these encodings are often not appropriate.
Like many of the visualizations used before our redesign, the original visualization used multiple colors to encode quantitative data. Rainbow colors are a poor choice for representing quantitative values because hues do not have an inherent order. We corrected this by using a single-color scale so that comparisons can be made more intuitively.
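To make the single-color idea concrete, here is a minimal sketch rather than the code from our tool. It maps a value onto a one-hue ramp that runs from light to dark, so larger values always read as darker. The function name, the hue, and the lightness limits are illustrative assumptions, and it uses the HLS model from Python's standard library purely for brevity; a perceptual color space would be a better basis for a production scale.

```python
import colorsys

def sequential_color(value, vmin, vmax, hue=0.58):
    """Map a value to a hex color on a single-hue light-to-dark ramp.

    The hue (0.58, a blue) and the lightness limits are assumptions
    chosen for illustration only.
    """
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    # Clamp the value into the domain and normalize it to [0, 1].
    t = min(max((value - vmin) / (vmax - vmin), 0.0), 1.0)
    # Low values are light (0.90) and high values are dark (0.25).
    lightness = 0.90 - t * (0.90 - 0.25)
    r, g, b = colorsys.hls_to_rgb(hue, lightness, 0.6)
    return "#{:02x}{:02x}{:02x}".format(
        round(r * 255), round(g * 255), round(b * 255)
    )

# Hypothetical values on a 0 to 30 scale.
for v in (0, 10, 20, 30):
    print(v, sequential_color(v, 0, 30))
```

Because only lightness changes, the order of the colors matches the order of the values, which is the property a rainbow scheme lacks.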
We additionally found that surface salinity and depth salinity were the first dimensions examined to sanity check the validity of the data set. In the final tool we present the relationship of surface and depth salinity prominently. Many data parameters that held constant, including atmospheric pressure, were ignored. This helped us to reduce the number of significant data relationships to present.
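A screening step like this can also be automated. The sketch below is an assumption about how one might do it, not a description of what we ran: it flags columns whose range is tiny relative to their mean magnitude. The column names, the sample values, and the 1% tolerance are all made up for illustration.

```python
def near_constant_columns(rows, rel_tol=0.01):
    """Return names of numeric columns whose spread is negligible.

    A column counts as near constant when (max - min) is at most
    rel_tol times the magnitude of its mean. rel_tol is an assumed
    threshold, not a value from the project.
    """
    constant = []
    for name in rows[0]:
        values = [row[name] for row in rows if row[name] is not None]
        if not values:
            continue
        spread = max(values) - min(values)
        # Guard against division by zero for columns centered on 0.
        scale = max(abs(sum(values) / len(values)), 1e-12)
        if spread / scale <= rel_tol:
            constant.append(name)
    return constant

# Made-up sample rows; real SWIFT data has 24 parameters.
samples = [
    {"surface_salinity": 12.1, "depth_salinity": 27.4, "air_pressure": 1013.2},
    {"surface_salinity": 18.6, "depth_salinity": 28.9, "air_pressure": 1013.4},
    {"surface_salinity": 24.3, "depth_salinity": 29.5, "air_pressure": 1013.1},
]
print(near_constant_columns(samples))
```

Columns flagged this way are candidates to leave out of the default views, subject to a domain expert confirming that the lack of variation is expected.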
Before considering layout, we explored the data in Tableau to find the best way to visualize individual graphs. Our teammate, Ankit, spearheaded this exploration, experimenting with different encoding techniques and data processing to create visualizations that best support pattern finding. We determined that maps and scatterplots would be the most suitable fit.
The next step was to investigate layout options that best support the questions of Kastner’s lab. We considered two main methods of visualization: small multiples and parallel coordinates. Both methods support the analysis of many different variables simultaneously. Ultimately we chose small multiples because they are more flexible and because they support broad exploratory analysis, whereas parallel coordinates are better suited to optimization.
The calendar-style date selector was overkill for Kastner’s lab, as data sets are typically collected over a period of about 10 days. The initial left-side controls took up valuable horizontal screen space that could be better allotted to the map. Kastner also wanted the ability to examine differences across buoy deployments, which we later supported by including small multiples of the Map view.
Encoding With Color
As mentioned before, using multiple colors to encode a single dimension is problematic. Not only do colors lack a clear order, but the typical RGB color format is also optimized for hardware reproduction, not human perception.
In one of our graphs we needed an additional encoding to represent wind direction. Our initial encoding used arrows indicating the cardinal direction of the wind, but our collaborators had difficulty finding clusters of wind direction. We realized that although wind direction is a quantitative data type, during initial analysis the ability to group samples was more important than evaluating changes in direction. Color encodings are a good fit for categorical or nominal data. For this reason we used a discrete color encoding for wind direction, using the LAB and HCL color spaces rather than RGB.
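The sketch below illustrates this kind of encoding under stated assumptions; it is not the code from our tool. It bins a direction in degrees into eight compass sectors, then gives each sector a hue spaced evenly around the HCL color wheel at a fixed lightness and chroma, so no sector looks more prominent than another. The number of sectors, the lightness of 65, the chroma of 50, and all function names are illustrative choices. The conversion uses the standard CIELAB to sRGB formulas with a D65 white point, and clamps colors that fall outside the sRGB gamut.

```python
import math

SECTORS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]

def wind_sector(degrees):
    """Bin a direction in degrees into one of eight 45-degree sectors."""
    index = int(((degrees % 360.0) + 22.5) // 45.0) % 8
    return SECTORS[index]

def _lab_f_inverse(t):
    # Inverse of the CIELAB companding function.
    delta = 6.0 / 29.0
    return t ** 3 if t > delta else 3.0 * delta ** 2 * (t - 4.0 / 29.0)

def _gamma_encode(c):
    # Clamp out-of-gamut values first, then apply the sRGB transfer curve.
    c = min(max(c, 0.0), 1.0)
    return 12.92 * c if c <= 0.0031308 else 1.055 * c ** (1.0 / 2.4) - 0.055

def hcl_to_hex(hue_deg, chroma, lightness):
    """Convert an HCL color (polar CIELAB, D65) to an sRGB hex string."""
    a = chroma * math.cos(math.radians(hue_deg))
    b = chroma * math.sin(math.radians(hue_deg))
    fy = (lightness + 16.0) / 116.0
    x = 0.95047 * _lab_f_inverse(fy + a / 500.0)
    y = 1.00000 * _lab_f_inverse(fy)
    z = 1.08883 * _lab_f_inverse(fy - b / 200.0)
    linear = (
        3.2406 * x - 1.5372 * y - 0.4986 * z,
        -0.9689 * x + 1.8758 * y + 0.0415 * z,
        0.0557 * x - 0.2040 * y + 1.0570 * z,
    )
    return "#" + "".join(
        "{:02x}".format(round(_gamma_encode(c) * 255)) for c in linear
    )

# One hue per sector, 45 degrees apart, at equal lightness and chroma.
PALETTE = {name: hcl_to_hex(i * 45.0, 50.0, 65.0) for i, name in enumerate(SECTORS)}

def wind_color(degrees):
    """Return the categorical color for a direction in degrees."""
    return PALETTE[wind_sector(degrees)]

print(wind_sector(350), wind_color(350))
```

Holding lightness and chroma constant is the reason to work in HCL here: evenly spaced RGB hues vary widely in perceived brightness, which can make some categories appear more important than others.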
Our final design iteration incorporated cross-filtered graphs and maps, implemented primarily using D3.js and MapBox.js. Ankit and I did most of the development. We focused first on implementing the features that would have the greatest impact for our collaborators. Some of our internal reach goals included a responsive layout, a flexible script to normalize and clean raw SWIFT data, and an interface to create custom scatter plots.
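For readers unfamiliar with cross filtering, the following is a minimal sketch of the concept, written in Python for brevity rather than taken from our D3.js implementation. Each view brushes a range on one field, and every view then shows only the records that fall inside all active brushes. The class, the field names, and the records are hypothetical.

```python
class CrossFilter:
    """Keep one range brush per field and report the records inside all of them."""

    def __init__(self, records):
        self.records = list(records)
        self.ranges = {}  # field name -> (low, high), both inclusive

    def brush(self, field, low, high):
        """Set or replace the brushed range for one field."""
        self.ranges[field] = (low, high)

    def clear(self, field):
        """Remove the brush on one field, if any."""
        self.ranges.pop(field, None)

    def selected(self, ignore=None):
        """Return records inside every active brush.

        Pass ignore=<field> so a view can skip its own brush and keep
        drawing the points around the brushed region.
        """
        active = {f: r for f, r in self.ranges.items() if f != ignore}
        return [
            rec for rec in self.records
            if all(lo <= rec[f] <= hi for f, (lo, hi) in active.items())
        ]

# Hypothetical records standing in for buoy samples.
records = [
    {"id": 1, "salinity": 10, "wind_speed": 3},
    {"id": 2, "salinity": 20, "wind_speed": 8},
    {"id": 3, "salinity": 28, "wind_speed": 9},
]
views = CrossFilter(records)
views.brush("salinity", 15, 30)
views.brush("wind_speed", 0, 8.5)
print([rec["id"] for rec in views.selected()])
```

In an interactive tool, each brush event would call `brush` and then redraw every graph and map from `selected`.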
Had it not been for our extremely responsive and communicative sponsors, we would not have been able to develop a useful and usable tool so quickly. Our final design was extremely well received by our sponsors' lab, and it also piqued the interest of Oceanography PhD candidates from the University of Washington. Our sponsors have enthusiastically asked us to expand on the original project to standardize and develop the tool further. Suggestions for the new version include customizable scatter plots, a more precise method for selecting data points for cross filtering, and an improved workflow for importing new data.
Additional information about the project can be found in our research paper, which was written largely by our teammate Chris Chung.
Try out the tool here!