datafist: Exploration And Analysis
In this post I introduce datafist, an in-browser tool for visually exploring your data.
First, A Quote
Are we analyzing data? Then we should be manipulating the data themselves; or if we are designing an analysis of data, we should be manipulating the analytic structures themselves.
The Sixfold Path Of Self-Tracking
Self-tracking is a complex process. It can be broken down into stages:
- Intent: our initial goals or motivations in deciding to self-track.
- Tools: the devices or methods we use for tracking.
- Measurement: collecting our data using those tools.
- Analysis: extracting insights from our data.
- Interpretation: creating personal meaning from those insights.
- Action: responding to that personal meaning.
Thanks to smartphones and cheap sensors, many of us already have the tools necessary for measurement. The psychology of intent and action is rapidly being explored through Habit Design and Captology, with startups like Lift and Beeminder harnessing the findings to great effect.
What are the technologies of analysis and interpretation?
Visualization
Visualizations, even interactive ones, are usually designed to answer specific questions, such as "How did Obama win re-election?" Within the context of those questions, they help us understand our data intuitively. Outside that context, however, they are often useless.
Statistical/Mathematical Software
Environments like R and Mathematica allow you to explore your data in meticulous detail. In the hands of people like Stephen Wolfram, they are the holy grail of data analysis. For the less technically inclined, they remain hopelessly unintuitive.
A Middle Road
What we need is something between the two, a hybrid that exposes the exploratory power of the latter through the intuitive interface of the former. Such a tool would give us the opportunity to explore our data as we see fit. We could ask our data questions, iterating quickly on those questions until we reach useful insights.
Until we can all converse with our data with the fluency of Hans Rosling, there's still room for improvement!
datafist
datafist tries to bridge this gap by providing visual and gestural actions for data manipulation. This is probably easier to demonstrate than describe, so here's a screencast that shows an early development version of datafist in action:
As you use datafist, you're constantly modifying the analysis itself through visual and gestural actions: you move channels to the viewer, drag across a range of time to zoom in on it, and draw regions around interesting clusters. The view updates in real time as you do this, so you can see the effects of your actions immediately.
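To make those gestures a little more concrete, here is a rough sketch of what two of them might amount to underneath: dragging out a time range becomes a filter on timestamps, and drawing a region becomes a filter on points in a scatter view. This is not datafist's actual code or API. The function names (`zoom`, `pair_up`, `select_region`), the list-of-samples channel model, and the sample data are all made up for illustration.

```python
from datetime import datetime

# Hypothetical model: a channel is a list of (timestamp, value) samples.
# None of these names come from datafist itself.

def zoom(channel, start, end):
    """Keep only samples whose timestamp falls in [start, end]."""
    return [(t, v) for t, v in channel if start <= t <= end]

def pair_up(xs, ys):
    """Join two channels on shared timestamps into (x, y) points."""
    ys_by_time = dict(ys)
    return [(v, ys_by_time[t]) for t, v in xs if t in ys_by_time]

def select_region(points, x_range, y_range):
    """Keep points inside a rectangular region drawn on a scatter view."""
    (x_lo, x_hi), (y_lo, y_hi) = x_range, y_range
    return [(x, y) for x, y in points
            if x_lo <= x <= x_hi and y_lo <= y <= y_hi]

# Made-up sample data: hours slept and a 1-5 mood score per day.
sleep = [(datetime(2013, 1, d), h)
         for d, h in [(1, 6.0), (2, 7.5), (3, 8.0), (4, 5.5)]]
mood = [(datetime(2013, 1, d), m)
        for d, m in [(1, 3.0), (2, 4.0), (3, 5.0), (4, 2.0)]]

# "Drag out" January 2-4, plot sleep against mood, then "draw" a region
# around the long-sleep, good-mood cluster.
zoomed = zoom(sleep, datetime(2013, 1, 2), datetime(2013, 1, 4))
points = pair_up(zoomed, mood)
cluster = select_region(points, (7.0, 9.0), (3.5, 5.5))
print(cluster)
```

Each gesture maps to one small, composable step, which is what lets the view be recomputed every time you adjust a range or region.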
Try datafist Out!
I'm hosting a version of datafist here at savageevan.com. Note that this is still a very early development version!
Contribute to datafist!
If you're interested in making datafist better, fork me on GitHub! Bug reports should be submitted via the issue tracker. In particular, if you have a CSV file that won't import properly, please attach it for testing purposes!
Inspiration
- AudioMulch is an awesome graphical audio synthesis tool.
- PureData and Max/MSP are visual signal processing languages.
- Direct Manipulation Interfaces is a classic paper on the design of natural-seeming interfaces. You'll recognize some of the interface concepts from the analysis package design mocks at the beginning.
- FAKEGRIMLOCK is a fountain of poorly-Englished wisdom on entrepreneurship.