THE DAIR ZINE LIBRARY

Cover: Every dataset has a perspective

Page description: The title "Every DATASET has a PERSPECTIVE" is in white letters on a black background. "Every", "has", and "a" are in cursive. "DATASET" is in a stack of boxes resembling computer vision labelling boxes, and "PERSPECTIVE" is in a speech bubble. At the bottom of the page are two eyes with the irises and pupils replaced with two slightly different 2D illustrations of worlds.

Page 2

This is because when you turn something into "data", you have to make a lot of choices.

Take any object, like this tree. It seems simple, but it can mean many things in many ways.

What's important to record, measure, or describe here? Let's give it a shot. So...

Page description: The three blocks of text above are in a square, a bubble pointing to a coniferous tree, and a hill respectively. "record" is on a stack of rectangles, "measure" is on a ruler, and "describe" is in a speech bubble. The rest of the background is black.

Page 3

We could describe it as: SPRUCE ADULT HEALTHY

Or maybe: EDIBLE SPRUCE TIPS LUMBER MEDICINAL NEEDLES FIREWOOD FLEXIBLE INNER BARK

Or maybe: 26 FT TALL 8 FT WIDE 2000 LBS 907 KG

Or maybe even PREMIUM-GRADE CHRISTMAS TREE $250-300 BOUGHT/SOLD

What fits? What do we leave out?

Page description: The text is all on a rectangular column pointing to the other half of the coniferous tree. The capitalized text above is in boxes mimicking data tags. The last line of text is underlined. The background behind the coniferous tree is black.

Page 4

Each of these choices tells a story.

Page description: "Story" is underlined. Below is an illustration composed of small cartoons, each paired with a speech bubble and a number of data "tags". From top to bottom, left to right, the cartoons are:

  1. A young person lying in bed with a mouth thermometer and cold compress. The speech bubble says: 12 Y/O, 90 LBS, ASTHMA, 60 INCHES, BLOND, FLU ?

  2. A young person kicking a soccer ball under the sun. The speech bubble says: SOCCER, FORWARD, FAST ?, SLOW ?

  3. A young person writing a test, with their tongue sticking out of the corner of their mouth in concentration. Behind them is a board with a clock and a report card, from which a speech bubble says: A-, 88, GIFTED? 3.45

  4. A young person at an immigration booth. Inside the booth are their passport and some documents. The speech bubble says: DOCUMENTED, ✓, MINOR, VALID VISA ?, EXPIRED ?

  5. A young person holding a sign saying CLASS PRESIDENT while giving a thumbs up.

  6. A young person with their father and several photographs of people behind them. A speech bubble from a stack of documents says: 100K HOUSEHOLD INCOME, SINGLE FATHER, ADOPTED ?

Page 5

The bits of data we see shape our perspective on the world around us.

Page description: "Perspective" is underlined, and the quote above is at the bottom of the page. In the illustration above it, a large humanoid figure with eyes that have worlds as pupils looks down on a black board, with its hands hanging over the edge. Inside the board are various circles with tags and an associated icon. These include:

  1. A flag tagged with ALLY, MEMBER

  2. A person with a ? on them, tagged with FRIEND, CITIZEN, WORTHY, REFUGEE

  3. A plant with a ? on it, tagged with USEFUL, EDIBLE, WEED

  4. A house with a board across the door and smoke coming out of the chimney, tagged SAFE, LOWER-INCOME, OBSTRUCTS CONSTRUCTION

  5. A group of three people with ? on each of them, tagged COMMUNITY, LOVING, EXCLUSIVE?

  6. A music note tagged with WEIRD, NOISE

  7. A person with a ? on them tagged with WORKER, FRIEND

  8. A book tagged with PRIVILEGE, NEED, RIGHT

  9. A pair of glasses tagged with SLOW, SMART

Below all of these, on the lower half of the board, are "discarded" or "loose" tags like TRUE, OPEN, $70K, LEGAL, etc.

Page 6

So whenever you see data (or things like AI models trained on a whole lot of it), ask yourself:

Page description: This quote is in a box, and "data" and "AI models" are underlined. Below this quote is an illustration. On the left are two people in front of a truck, which is presumably unloading boxes of tags. In the middle, a person has climbed up a ladder to pour a pail of tags into a funnel that leads to a machine labelled "LLM". This machine's output arm then powers a large computer and its accessories on the far right. The rest of the background is black.

How was it made? Where did it come from? What choices were involved? Why?

Page description: This quote is in a box. A humanoid figure holds a pair of binoculars to look at a pile of tags that is under a spotlight. They are surrounded by several white question marks. The rest of the background is black. The quote below is also in a box. All in all, this page looks like five boxes stacked on top of each other, alternating white and black.

Whose perspective is it showing? What are the scope and limits of this point of view?

Page 7

And most importantly,

What stories does it tell?

What might be missing?

Page description: The above text is in a rectangle placed in front of a coniferous tree illustration. Two smaller coniferous trees are on the left and right, and all of them are on a hill. Above the trees are ten silhouettes of birds. The rest of the background is black.

Back cover: DAIR - dair-institute.org

Page description: A hand-drawn illustration of the DAIR logo with the dair-institute.org link written on the bottom.

THE DAIR INSTITUTE