Monday, January 11, 2016

Teaching a Semester of D3.js

I spent last semester frantically putting together a course on D3.js for journalism students at the University of Miami at the same time as teaching it and grading it. Wow, teaching a semester course is hard. Teaching coding, especially to non-CS students, is a special challenge. I was lucky to have a small class of very patient and motivated guinea pigs (er, students) for the first semester.

The class was meant to be a portfolio-builder, focused on journalistic interactive visualization. We used data from UNICEF in the first semester, visible in the examples and projects. This coming semester has fewer journalism students, which means changing the content a little, a process I'm still going through in the repo. This post is a recap of what we did and what was hard about it. Next post (in a week) will show some of my students' work.

Interactivity and "Journalistic" Vis

Why teach D3? At least one friend teaching journalism students said he'd never do that again. I heard this right before I started on the adventure. But this course was meant to be on interactive data visualization, which means a chart does more than behave like a static bar chart and readers do more than look at the bars. I have talk slides here about designing for interactivity in vis, and primarily the examples I show are built in D3. This is the current lay of the land!

There is still no better library than D3 for building custom data-driven designs, with custom interactions, and integrating them with the web page DOM. I did show Highcharts, and one of the first homeworks was to use it for a few charts. But the animated transitions in D3 (and open palette of design options) are what sell it, and all my students wanted to do fancy artistic animations in their final projects: animated maps, animated lines, animated lines on maps, synchronized lines and maps that animate over time, you name it if it involved lines or maps apparently. :) (It pushed me hard too, to help them figure all that out.)

When I was trying to learn D3, I wanted to know how to hook up a chart to UI elements and make things move, but the books out there didn't get into anything that fancy, sticking mostly to how to create static charts in isolation. Static charts are usually much easier to create with other tools than D3 (unless it's an "unusual" chart type). So for my class I focused a lot on the UI interaction aspects of D3 coding. D3 can do a lot of fancy things, like networks, parallel coordinates, sankey diagrams... But I stuck to the "basics" for journalistic vis in this class:

  • Tables and heatmaps
  • Bars, vertical and horizontal
  • Lines, including handling lots and lots of lines
  • Stream/Area charts
  • Stacked and grouped bars
  • Scatterplots
  • Small multiples
  • Maps

We also covered a lot of key interaction features: animated transitions; swapping out a dataset and animating in a new one; hooking up UI elements like select menus, buttons, and sliders; making complex tooltips; linking two charts together with a toggle switch or a click/mouseover; annotating particular data points; and adding legends. In Javascript, important data concepts included sorting, getting top 10's (or N's), and creating calculated variables.
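Here is a minimal sketch of those three data concepts in plain Javascript. The rows, field names, and numbers are invented for illustration (they are not from the class data), and withRate and topN are hypothetical helper names.

```javascript
// Hypothetical rows; the field names and numbers are invented for illustration.
const countries = [
  { name: "A", births: 1000, deaths: 50 },
  { name: "B", births: 400, deaths: 10 },
  { name: "C", births: 2000, deaths: 160 },
  { name: "D", births: 800, deaths: 16 }
];

// Calculated variable: deaths per 1,000 births, added to a copy of each row
// so the original rows are left untouched.
function withRate(rows) {
  return rows.map(function (d) {
    return Object.assign({}, d, { rate: (d.deaths / d.births) * 1000 });
  });
}

// Top N: sort a copy in descending order by the given field, then slice.
function topN(rows, field, n) {
  return rows
    .slice()
    .sort(function (a, b) { return b[field] - a[field]; })
    .slice(0, n);
}

// The two rows with the highest rate (C, then A, for the numbers above).
const top2 = topN(withRate(countries), "rate", 2);
console.log(top2.map(function (d) { return d.name; }));
```

Sorting a copy (slice() before sort()) matters because sort() reorders the array in place, and other charts on the page may be relying on the original order.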

Setting up the Tools: Github and Servers, Oh My

Getting folks set up on day one with a server and Github was a challenge, but luckily most of them had encountered a little bit of git before. However, most students did not know how to use the command line, and two of them had Windows machines, so this was "challenging" for all including me. (I totally forgot that not everyone automatically knows the Unix and Windows command lines. Really threw me for a loop.) I probably oversold how useful "git stash" is when they had conflicts, but I feel no regret. Before too long they were git pulling every week and had learned how to make gists.

Gists are the building blocks of a portfolio of bl.ocks, a key component of the D3 community eco-system. Also, they were required for easier grading and debugging on my part — especially now that Ian Johnson (@enjalot) has released, which made debugging a lot simpler.

For some reason, using a server really stumps new web programmers. (After watching people struggle, I've put a bunch of documentation on setting them up in the nascent drafty d3-faq.) Folks who have done only static web design have usually not got a good understanding of why you need to use a server to view and render code. Unlearning that they can just double click on their file to view it takes a lot of time. No, the URL really has to start with "http://localhost", not "file://". The source of many bugs for the first few weeks was folks not having loaded their page using the server, even after they had set one up. (And note: That's an example of an issue that's harder to debug by email remotely than it is when you're looking over their shoulder. There were a lot like this. My office hours were sometimes busy.)
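One way to catch this early is a small guard at the top of a script. This is only a sketch, not something from the class repo: servedOverHttp is a hypothetical helper name, and the port 8000 and file path in the usage below are assumptions. The reason it matters is that D3's data loaders (d3.csv, d3.json) make requests that browsers generally restrict for pages opened from "file://".

```javascript
// Returns true when the given page URL was loaded through a web server.
function servedOverHttp(href) {
  const protocol = new URL(href).protocol; // e.g. "http:", "https:", "file:"
  return protocol === "http:" || protocol === "https:";
}

// In a browser, you might call it like this before loading any data:
// if (!servedOverHttp(window.location.href)) {
//   console.warn("Open this page through your local server, not by double-clicking the file.");
// }
```

For example, servedOverHttp("http://localhost:8000/index.html") is true, while servedOverHttp("file:///Users/me/chart/index.html") is false.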

Javascript with D3

My class came in with required background in HTML and CSS, but little to no Javascript. Heck, this is how a lot of people learn D3, so why not? Well, anyone (like me) who has gone this route knows that the Javascript part is the thing that trips you up the most, even after you start to "get" the D3 paradigm. Just understanding the D3 examples out there (especially Mike Bostock's) requires a fairly advanced understanding of Javascript.

For all data visualization, data "munging" is hard and sometimes very data-set specific. You can "munge" in a tool outside Javascript (I recommended and showed Excel), but at a certain point, you need to get a grip on the munging that's close to the vis code itself. Structuring your data to make it easy to get at certain values during interaction in the UI is pretty important. Merging data sets, looping through them to do calculations or to create subsets of data, learning and using a functional coding style with forEach and map: these things were hard for everyone, even the students with some programming background. I gave a few pure Javascript homeworks, on topics like debugging and data manipulation, but honestly, I should have given more of them. (OTOH, these are harder to grade, because each one usually requires careful, eyes-on review. Meh.)
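As a rough illustration of that kind of munging, here is a sketch that merges two tables on a shared key and then takes a subset. The data, field names, and variable names are all made up for the example.

```javascript
// Hypothetical data: two tables that share a country code.
const population = [
  { code: "BRA", population: 200 },
  { code: "KEN", population: 45 },
  { code: "NPL", population: 28 }
];
const regions = [
  { code: "BRA", region: "Latin America" },
  { code: "KEN", region: "Africa" }
];

// forEach: build a lookup from country code to its region row.
const regionByCode = {};
regions.forEach(function (d) {
  regionByCode[d.code] = d;
});

// map: return a new merged array; rows without a match get region: null.
const merged = population.map(function (d) {
  const match = regionByCode[d.code];
  return {
    code: d.code,
    population: d.population,
    region: match ? match.region : null
  };
});

// filter: create a subset of the merged rows.
const african = merged.filter(function (d) {
  return d.region === "Africa";
});
```

Building the lookup object first avoids a loop inside a loop, and it gives you the same structure you will want later when a mouseover needs to look up a row by its code.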

I also should have buckled down on teaching data manipulation tools earlier. In an attempt to be "easier" on them, I didn't teach d3.nest() right away, and helped one poor student (hi Luis!) write a laborious loop in JS to nest his data... After that hour, I realized, "Teach all the tools. Teach the nest()." Students need to know about the helper functions, which will save them time down the road. A homework on nesting data followed. I'll introduce lodash.js this spring semester, too.
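To show what that laborious loop is doing, here is a plain Javascript sketch of a one-level nest. nestByKey is a hypothetical helper written for this post, not part of D3, and the rows are invented. In the D3 versions that include d3.nest(), a call like d3.nest().key(function (d) { return d.region; }).entries(rows) gives back the same shape: an array of objects with a key and a values array.

```javascript
// Group rows by one field, returning [{ key, values }, ...]
// in order of each key's first appearance.
function nestByKey(rows, keyField) {
  const groups = new Map();
  rows.forEach(function (d) {
    const key = String(d[keyField]); // keys are strings, as in d3.nest() output
    if (!groups.has(key)) {
      groups.set(key, { key: key, values: [] });
    }
    groups.get(key).values.push(d);
  });
  return Array.from(groups.values());
}

// Hypothetical rows for illustration.
const rows = [
  { region: "Africa", country: "Kenya", year: 2010 },
  { region: "Asia", country: "Nepal", year: 2010 },
  { region: "Africa", country: "Ghana", year: 2010 }
];

const nested = nestByKey(rows, "region");
// nested[0] is { key: "Africa", values: [Kenya row, Ghana row] }
// nested[1] is { key: "Asia", values: [Nepal row] }
```

This shape is handy for charts with many lines or small multiples: bind the nested array to one group element per key, then draw each group from its values.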

A Lack of "Complex" Examples To Teach From

Many of the D3 examples, books, and tutorials are basic or even "toy" (abstracted from realistic frames, not using real data, etc). There's a role for the basic: the best intro book is Scott Murray's very simple, unscary starter book, Interactive Data Vis for the Web. We started there, of course, but as we got into complex animations and transitions, there were fewer and fewer good working examples and tutorials out there to inspire class materials.

The big exceptions are the tutorials of Jim Vallandingham and Nathan Yau on Flowing Data; both do "journalistic" vis how-to's on their sites. I borrowed and adapted several of theirs, for small multiples and maps in particular. Jim's code tends towards more "advanced" and I simplified some of it — which I have mixed feelings about and may undo; Nathan's code I sometimes updated when it was using older D3 style or could be made more functional. Scott Murray's intro examples I also updated to use more D3-common conventions (e.g., adding the margin object convention, removing for-loops).

Even after seeing how to use functions for update patterns in D3, when project time came, everyone struggled to organize their code. When I asked people to just make a page combining 3 charts on it, all hell broke loose in the global scope conflict space. While I was quite clear that projects were judged on end-user experience, not code quality, code structure issues made it much harder for the students to modify and debug their own code. I'll be focusing more on code structure this semester.
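One framework-free way to keep several charts from colliding is to wrap each chart's variables in a function, so nothing lives in the global scope. The sketch below is an assumption about how that could look, not the structure used in the class repo. makeBarChart and its options are hypothetical names, and the DOM drawing is left out so only the state handling is shown.

```javascript
// Each call creates its own private width, data, and scale.
function makeBarChart(options) {
  const width = options.width;
  let data = [];

  // Linear scale from [0, max value] to [0, width], based on the current data.
  function scale(value) {
    const max = Math.max.apply(null, data.map(function (d) { return d.value; }));
    return max > 0 ? (value / max) * width : 0;
  }

  return {
    // Swap in a new dataset; returns the chart so calls can be chained.
    update: function (rows) {
      data = rows.slice();
      return this;
    },
    // Compute the bar widths a drawing step would use.
    layout: function () {
      return data.map(function (d) {
        return { label: d.label, barWidth: scale(d.value) };
      });
    }
  };
}

// Two charts on one page, with no shared variables between them.
const chartA = makeBarChart({ width: 300 }).update([
  { label: "x", value: 5 },
  { label: "y", value: 10 }
]);
const chartB = makeBarChart({ width: 100 }).update([
  { label: "z", value: 4 }
]);
```

Because width, data, and scale are local to each call, chartA.layout() and chartB.layout() cannot overwrite each other, which is exactly what goes wrong when three example scripts are pasted into one page.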

Unfortunately, there are more examples online of how to use Angular or React to structure big projects, rather than pure Javascript. Obviously those frameworks solve a lot of organizational and architectural issues, but this is a challenge for everyone teaching D3, I feel. I don't want to inflict a framework on students who are just learning Javascript and D3.

Finding a Data Story Is Hard

Almost all of the class had had a static infographics class (from Alberto Cairo), but the practice of finding a story in data is hard, and I considered it outside the scope of the course. I recommended and demoed Excel and Tableau to a few students who were struggling, and luckily several had already had experience using Tableau. (I tried PowerBI briefly and was also very impressed by it!) Nevertheless, data "stories" for their projects were in flux until the very end. It's notoriously difficult to "design" for data vis without using the real data (sketching by hand only gets you so far), and a lack of proficiency with exploratory tools probably impaired some of them.

With a class next semester that's less journalistic, I'll expand the project grading to allow for less data-driven stories and allow a broader range of data visualization. I'll also be exploring a design process that starts with data exploration, then moves to UI sketches, then moves to phased development and feedback cycles.

Debugging is Also Hard

I knew I should teach debugging, and I did, but I think you can only teach it to a point. It's boring to watch someone else doing it, but it's also necessary. Getting students to learn how to use breakpoints in the Chrome console is a necessary evil, as is walking back through the stack trace.

One of the harder aspects of debugging is that you have to have a lot of experience with what can go wrong to be able to guess what it might be this time. It's about hours spent doing it. This is hard to teach; it just requires practice time.

Students Will Find and Replicate All Your Bugs

Because the general practice of learning D3 in the wild is to take examples and modify them to fit your own data, I wanted to support that in my class. I made examples and then had the class plug in their own data (hopefully on the topic of their final project!). This means that code sloppiness, errors, and bad habits in my code ended up replicated and magnified over and over. Including bad UI design — one example with unfortunate bar coloring showed up in a couple of projects.

My homework is to fix all that in the repo and try not to introduce too many new ones.

Thanks for Content I Borrowed, Linked To, or Adapted

People whose work contributed a lot to this repo include Mike Bostock, Scott Murray, Jim Vallandingham, Nathan Yau, Mike Freeman, Ian Johnson.

Course Materials

The repo (that will keep evolving this semester) is here. I expect to be adding more examples — such as for canvas, crossfilter/dc.js, and perhaps other layouts. There might even be data "art." I will post links and examples from student projects for the fall in another week or so!


Curran Kelleher said...

Awesome retrospective! Very inspiring and useful collection of successes and pain points. Thank you for this post.

Shawn Day said...

Thanks for taking the time to reflect on the success and challenges. Have to applaud you for undertaking it in the first place and rolling with the issues that emerged. These are some nicely generalised observations that help us to iteratively determine the most effective ways to teach coding to non-cs students. Thanks for sharing.

Boris Dahav said...

After reading this post and the way it is written, I envy your students. Thank You.

Micah Stubbs said...

Solid recap, thanks for sharing this! I too have struggled to find complex d3 examples that are only d3 and vanilla Javascript. Elijah Meeks' blocks and D3.js in Action book are the closest thing I've found so far.

To keep learning, I've decided to get proficient with the libraries you commonly see alongside d3. For me that's been mostly jQuery, lodash, Angular, React, and crossfilter. Build tools like grunt, gulp, and webpack showed up in interesting projects and were also worth learning at some point.

Lynn said...

Hi Micah -
Yes, I recommended Elijah's book to the class. I had some jQuery in my class last semester, and this semester I'm adding lodash and some crossfilter examples. I don't want to teach a framework, though :( And I don't feel like I have time to do grunt et al. either! I feel like a good class on professional js dev should also be required and cover that stuff.


Mike Ward said...

I'm an experienced D3 programmer. I'm betting I would have learned much from your course. Very inspiring. Thanks for sharing and posting the materials.

Metti_Hoof said...

Thanks for sharing this!!

Generalizing european said...

I'm just finishing up a very similar course and my observations are very similar. We did go into the data story part quite a bit, and less into the technicalities of data munging. My students are in the sciences (primarily computer science) so I assumed they could do that already.

I wish I'd had this blog post 3 months ago.

Chris Saden said...

Great reading your reflections on teaching D3.js, JavaScript, and interactive graphics.

I created a course on D3.js and data visualization basics which some of your students might find interesting. The first 2 lessons cover a lot of the basics that students would get from a static visualization course and choosing graphical forms for the data at hand.

You might also like the book, Visual Storytelling with D3.js by Ritchie S. King.

I agree that it is hard to find more advanced and well-organized examples of D3 visualizations in the wild.

Francis Kim said...

I keep hearing about D3.js. I think I need to check it out.