Understanding D3

This is a first draft. I will likely re­vise (multiple times) in the fu­ture for clar­ity. But hope­fully, even in its cur­rent state, you might find this use­ful.

The strange case of D3

Dr. Jekyll: D3 is fa­mous for be­ing an in­cred­i­bly ver­sa­tile vi­su­al­iza­tion li­brary.

Mr. Hyde: D3 is infamous for having a steep learning curve and being difficult to understand beyond copying and pasting examples.

I have experienced the dark side of D3, but recently I feel as if I have had a minor breakthrough, so hopefully I can in turn provide you with some sort of lightbulb moment.

But just as a dis­claimer: in no way do I pro­fess to be an ex­pert (or any­where close). I’ve still a lot to learn. I just think some­thing clicked for me.

The big idea (and why it’s so hard)

Having gained some un­der­stand­ing, I now know this is the mantra you must re­peat to your­self.

D3 gives you func­tions to work with data bound to el­e­ments.

Does that sound kind of fa­mil­iar? It prob­a­bly should. Here’s the first sen­tence on the D3 web­site.

D3.js is a JavaScript li­brary for ma­nip­u­lat­ing doc­u­ments based on data

and later on…

D3 al­lows you to bind ar­bi­trary data to a Document Object Model (DOM), and then ap­ply data-dri­ven trans­for­ma­tions to the doc­u­ment

That leads me to be­lieve D3 is dif­fi­cult to learn be­cause it re­lies on a very sim­ple and pow­er­ful con­cept — too sim­ple to war­rant un­der­stand­ing when just start­ing out and too pow­er­ful to ap­pre­ci­ate with­out build­ing and see­ing lots of ex­am­ples.

Let’s start with just a line

We will be­gin with the three ba­sic parts to work­ing with D3. We will have an el­e­ment in the DOM (element). We will have some data, with which we’ll draw a line (data). And we’ll use D3 to put those things to­gether (functions).

D3 has a lot of con­ven­tions. These are use­ful to learn as a step­ping stone to learn­ing the con­cepts, so I will try to use a few of them here.

This is what the re­sult looks like.

Figure: Just a Line Plot

Set up

Let’s start with the eas­i­est part to set up: the DOM. This is all the HTML I have:

<figure class="example">
<h2 class="figure__title">Figure: Just a Line Plot</h2>
<svg>
</svg>
</figure>

We have a sim­ple dataset (5 points to form a line).

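const X = [1, 2, 3, 4, 5];
const Y = [3, 5, 1, 6, 9];
// pair the coordinates up as [x, y]; the chart code below reads d[0] and d[1]
const data = X.map((x, i) => [x, Y[i]]);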

Again, noth­ing spe­cial.

The Code

This is the in­ter­est­ing part, but don’t be in­tim­i­dated. In fact, this chart is made with only 16 lines of ac­tual code.

First, let’s de­fine some vari­ables we want to keep track of:

let width = 600;
let height = 300;
// this is the margin convention
let margin = {top: 5, right: 0, bottom: 30, left: 30};

Remember our mantra: data, func­tion, el­e­ments. Again, let’s start with el­e­ments.

Elements

// Create elements
// select SVG
let svg = d3.select("figure.example > svg")
    .attr("width", width)
    .attr("height", height);

// create chart
let chart = svg.append("g")
    .attr("transform", `translate(${margin.left}, ${margin.top})`);
let chart__line = chart.append("path")
    .attr("fill", "none")
    .attr("stroke", "black");
let chart__xaxis = chart.append("g")
    .attr("class", "g__xaxis")
    .attr("transform", `translate(0, ${height - margin.top - margin.bottom})`);
let chart__yaxis = chart.append("g")
    .attr("class", "g__yaxis");

I won’t go too far into the details here, but the basic idea is that we create all of our chart elements under a single “chart” group. This follows the convention of defining a size for the SVG, defining margins, and adjusting the size of your “canvas” accordingly.
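To make the margin convention a little more concrete, here is a small sketch of the arithmetic behind it. The names chartWidth, chartHeight, chartTransform, and xAxisTransform are my own labels for illustration; the code above computes the same values inline.

```javascript
// Outer SVG size and margins, copied from the variables defined earlier.
const width = 600;
const height = 300;
const margin = {top: 5, right: 0, bottom: 30, left: 30};

// Hypothetical names (not part of D3): the drawable area that is left
// once the margins are subtracted from the outer SVG size.
const chartWidth = width - margin.left - margin.right;   // 570
const chartHeight = height - margin.top - margin.bottom; // 265

// The "chart" group is shifted by the left and top margins, so (0, 0)
// inside the group is the top-left corner of the drawable area.
const chartTransform = `translate(${margin.left}, ${margin.top})`;

// The x axis group is moved down to the bottom edge of the drawable area.
const xAxisTransform = `translate(0, ${chartHeight})`;

console.log(chartWidth, chartHeight);  // 570 265
console.log(chartTransform);           // translate(30, 5)
console.log(xAxisTransform);           // translate(0, 265)
```

The bottom and left margins are the larger ones because that is where the axes need room for their ticks and labels.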

I want to stress that this sets up our chart, but we’ve yet to ac­tu­ally draw any­thing. We have added the el­e­ments, but we haven’t set up any func­tions or bound any data. So let’s do some of that now.

Functions

// Create functions
let xScale = d3.scaleLinear()
                .domain(d3.extent(data.map(d => d[0])))
                .range([0, width - margin.left - margin.right]);
let yScale = d3.scaleLinear()
                .domain(d3.extent(data.map(d => d[1])))
                .range([height - margin.top - margin.bottom, 0]);
let line = d3.line().x(d => xScale(d[0])).y(d => yScale(d[1]));
let xAxis = d3.axisBottom().scale(xScale);
let yAxis = d3.axisLeft().scale(yScale);

Here, we set up our scales, our axes, and the line generator. This doesn’t draw anything either, because we’ve yet to put the functions and elements together. Luckily for us, that step is pretty easy.
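If the scales and the line generator feel like magic, it may help to see roughly what they compute. The sketch below is a hand-rolled stand-in in plain JavaScript, not D3’s actual implementation. The names extent, linearScale, and linePath are hypothetical and only imitate the basic behavior of d3.extent, d3.scaleLinear, and d3.line for this dataset.

```javascript
// The five points from the dataset, as [x, y] pairs.
const points = [[1, 3], [2, 5], [3, 1], [4, 6], [5, 9]];

// Stand-in for d3.extent: the [min, max] of an array of numbers.
const extent = values => [Math.min(...values), Math.max(...values)];

// Stand-in for d3.scaleLinear: returns a function that maps a value in
// `domain` (data units) to the matching position in `range` (pixels).
function linearScale(domain, range) {
  const [d0, d1] = domain;
  const [r0, r1] = range;
  return value => r0 + ((value - d0) / (d1 - d0)) * (r1 - r0);
}

// 570 x 265 is the drawable area implied by the width, height, and margins.
const x = linearScale(extent(points.map(d => d[0])), [0, 570]);
// The y range is flipped because SVG's y coordinate grows downward.
const y = linearScale(extent(points.map(d => d[1])), [265, 0]);

// Stand-in for a line generator: turn the data into an SVG path string,
// one "move to" (M) command followed by "line to" (L) commands.
function linePath(data) {
  return data
    .map((d, i) => `${i === 0 ? "M" : "L"}${x(d[0])},${y(d[1])}`)
    .join("");
}

console.log(x(3));             // 285
console.log(y(9));             // 0
console.log(linePath(points)); // M0,198.75L142.5,132.5L285,265L427.5,99.375L570,0
```

Notice that all three are just functions from data to something drawable (a pixel position or a path string). None of them touches the DOM, which is why nothing appears on screen yet.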

Data

// Bind data and call functions
chart__line.datum(data)
    .attr("d", line);

chart__xaxis.call(xAxis);
chart__yaxis.call(yAxis);

I want to fo­cus on two things here. First, no­tice how we now bind our data to the chart__line el­e­ment. We use datum be­cause we are draw­ing one line, so we’re bind­ing this one piece of data (singular: da­tum). Now, when we set the d attribute, we can pass in a func­tion that takes the da­tum (our ar­ray of co­or­di­nates) as in­put and re­turns the line path as out­put.
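A toy model may make the datum and attr pairing less mysterious. The sketch below is plain JavaScript built around a hypothetical makeSelection helper. It is not D3’s code, only an imitation of the two behaviors described above: datum stores a value on the element, and attr, when handed a function, calls that function with the stored value.

```javascript
// Hypothetical stand-in for a one-element selection (not D3's implementation).
function makeSelection(element) {
  return {
    // Store the data on the element and return the selection for chaining.
    datum(value) {
      element.__data__ = value;
      return this;
    },
    // If `value` is a function, call it with the bound datum and use the
    // result; otherwise use `value` as-is.
    attr(name, value) {
      element.attributes[name] =
        typeof value === "function" ? value(element.__data__) : value;
      return this;
    },
  };
}

// A plain object standing in for the <path> element.
const pathElement = {attributes: {}};

// A tiny stand-in for the line generator: coordinates in, path string out.
const toPath = coords => "M" + coords.map(c => c.join(",")).join("L");

makeSelection(pathElement)
  .datum([[0, 10], [5, 0]])
  .attr("stroke", "black")
  .attr("d", toPath);

console.log(pathElement.attributes.d); // M0,10L5,0
```

The point is that the function passed to attr never needs to know where the data lives. It receives whatever is currently bound to the element.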

Second, no­tice how we’re draw­ing the axes. We set up the xAxis and yAxis functions ear­lier, and the .call is es­sen­tially pass­ing our se­lec­tion as in­put to the func­tion. It’s equiv­a­lent to xAxis(chart__xaxis), for ex­am­ple, so don’t let the syn­tax scare you.
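Here is a minimal sketch of that equivalence in plain JavaScript. The selection object and fakeAxis function are hypothetical stand-ins, not D3 objects; the only behavior modeled is that call hands the selection to the function and then returns the selection.

```javascript
// Hypothetical stand-in for a selection; only `call` is modeled here.
const selection = {
  name: "chart__xaxis",
  call(fn, ...args) {
    fn(this, ...args); // hand the selection itself to the function
    return this;       // return the selection so calls can be chained
  },
};

// Stand-in for an axis function: it just records what it was asked to draw on.
const drawnOn = [];
const fakeAxis = target => drawnOn.push(target.name);

const returned = selection.call(fakeAxis); // method style
fakeAxis(selection);                       // direct style, same effect

console.log(drawnOn);                // [ 'chart__xaxis', 'chart__xaxis' ]
console.log(returned === selection); // true
```

The one practical difference is the return value: the method style gives you the selection back, so you can keep chaining.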

Using .call is con­ve­nient for chain­ing func­tions, and this pat­tern is rec­om­mended for cre­at­ing reusable charts. The idea is you can have a se­lec­tion of svgs with bound data, call your chart on that se­lec­tion, and draw charts us­ing the bound data for each el­e­ment in your se­lec­tion.
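As a rough illustration of that reusable chart idea, here is a sketch in plain JavaScript. Everything in it is hypothetical: labelChart is a made-up chart that only writes a text label, and the selection object is a simplified stand-in whose each passes the element itself, which is not how D3’s own each passes its arguments.

```javascript
// Hypothetical chart factory: configuration lives in the closure, and the
// returned function draws into whatever selection it is given.
function labelChart() {
  let prefix = "values: ";

  function chart(selection) {
    // Draw once per element, using the data bound to that element.
    selection.each(element => {
      element.text = prefix + element.__data__.join(", ");
    });
  }

  // Getter/setter so the chart can be configured by chaining.
  chart.prefix = function (value) {
    if (!arguments.length) return prefix;
    prefix = value;
    return chart;
  };

  return chart;
}

// Simplified stand-in for a selection of two elements with bound data.
const elements = [{__data__: [1, 2]}, {__data__: [3, 4]}];
const selection = {
  each(fn) { elements.forEach(element => fn(element)); return this; },
  call(fn) { fn(this); return this; },
};

selection.call(labelChart().prefix("y = "));

console.log(elements.map(e => e.text)); // [ 'y = 1, 2', 'y = 3, 4' ]
```

One chart function, one call, and every element in the selection gets drawn from its own bound data.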

Why this way?

It’s pos­si­ble to cre­ate D3 graph­ics with­out fol­low­ing this pat­tern. I’d bet most peo­ple (myself in­cluded) did not start cre­at­ing graph­ics with this pat­tern. So why take the ex­tra ef­fort to learn and im­ple­ment these ideas?

There are a few reasons, which go hand in hand.

  1. This is the way D3 was designed to be used. I will qualify this statement by saying that, since I didn’t design D3, technically I don’t know if this is true. That said, having read a lot of Mike Bostock’s (and others’) writing on best practices for D3, this seems to be the “right way”.
  2. Because of this, when you implement things this way, things are more likely to “just work”.
  3. Also be­cause of this, de­bug­ging be­comes eas­ier be­cause you’re not try­ing to re­solve any con­flicts be­tween your men­tal model and the model D3 adopts.

I’ll provide an example of how you might be tempted to do things, and how that ends up causing headaches later on. The tl;dr is that problems arise when you don’t separate elements, data, and functions effectively.

In the past, I used to do this a lot:

// draw line and axes
g.append("path").datum(data).attr("d", line)
g.append("g").call(xAxis)
g.append("g").call(yAxis)

Why is this bad? We don’t save our se­lec­tions, and you can think of this as treat­ing the path el­e­ment as bound to this spe­cific datum.

What does that mean? If we want to change our data, we will update data and run that code again, but since we don’t keep a pointer to the path element, we end up appending another path. Instead of redrawing the line, we draw a new line. Additionally, if we wanted our graphic to be responsive (that is, to resize when the browser window resizes), we would need to update and redraw our axes. That means we should also keep pointers to the axes elements.

Moral of the story? Set up your el­e­ments once, then use D3 func­tions to bind data and draw many times. Since you’re not cre­at­ing a new el­e­ment every up­date, D3 can keep track of the cur­rent bound data and up­date ac­cord­ingly. This also lets you use D3 tran­si­tions more eas­ily.
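To see the difference without a browser, here is a small sketch in plain JavaScript. The makeGroup and append helpers are hypothetical stand-ins for an SVG group and for appending to it; no D3 is involved, only the bookkeeping that matters for this argument.

```javascript
// Stand-in for an SVG group: just a list of child elements.
function makeGroup() {
  return {children: []};
}

// Stand-in for appending an element to a group; returns the new element.
function append(group, tag) {
  const element = {tag, attributes: {}};
  group.children.push(element);
  return element;
}

// Tempting version: append inside the draw function, keep no reference.
const naiveGroup = makeGroup();
function drawNaive(data) {
  append(naiveGroup, "path").attributes.d = data.join(" ");
}

// Set up once, draw many times: keep a reference and update it in place.
const group = makeGroup();
const lineElement = append(group, "path");
function draw(data) {
  lineElement.attributes.d = data.join(" ");
}

// Simulate two updates of the data with each approach.
drawNaive([1, 2]);
drawNaive([3, 4]);
draw([1, 2]);
draw([3, 4]);

console.log(naiveGroup.children.length); // 2 (a stale path is left behind)
console.log(group.children.length);      // 1
console.log(lineElement.attributes.d);   // 3 4
```

In the chart from this post, the same idea would mean wrapping the “bind data and call functions” step in a function and calling it whenever the data or the size changes, while the elements themselves are created only once.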