The Data Bubble

Jul 31, 2010

It didn’t happen in 2010, but it will in 2016.

This post ran on my blog almost six years ago. I was wrong about the timing, but not about the turning, because it’s about to happen this month at the Computer History Museum in Silicon Valley. More about that below the post.

The tide turned today. Mark it: 31 July 2010.

That’s when The Wall Street Journal published The Web’s Gold Mine: Your Secrets, subtitled A Journal investigation finds that one of the fastest-growing businesses on the Internet is the business of spying on consumers. First in a series. It has ten links to other sections of today’s report.

It’s pretty freaking amazing — and amazingly freaky, when you dig down to the business assumptions behind it. Here’s the gist:

The Journal conducted a comprehensive study that assesses and analyzes the broad array of cookies and other surveillance technology that companies are deploying on Internet users. It reveals that the tracking of consumers has grown both far more pervasive and far more intrusive than is realized by all but a handful of people in the vanguard of the industry.

It gets worse:

In between the Internet user and the advertiser, the Journal identified more than 100 middlemen — tracking companies, data brokers and advertising networks — competing to meet the growing demand for data on individual behavior and interests.

The data on Ms. Hayes-Beaty’s film-watching habits, for instance, is being offered to advertisers on BlueKai Inc., one of the new data exchanges. “It is a sea change in the way the industry works,” says Omar Tawakol, CEO of BlueKai. “Advertisers want to buy access to people, not Web pages.”

The Journal examined the 50 most popular U.S. websites, which account for about 40% of the Web pages viewed by Americans. (The Journal also tested its own site, WSJ.com.) It then analyzed the tracking files and programs these sites downloaded onto a test computer. As a group, the top 50 sites placed 3,180 tracking files in total on the Journal’s test computer. Nearly a third of these were innocuous, deployed to remember the password to a favorite site or tally most-popular articles. But over two-thirds — 2,224 — were installed by 131 companies, many of which are in the business of tracking Web users to create rich databases of consumer profiles that can be sold.
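A quick sanity check on those numbers, as a minimal sketch: the two totals come straight from the Journal’s passage above, while the innocuous count is my own subtraction, not a figure the Journal reports. The variable names are only for illustration.

```python
# Figures quoted from the Journal's report above.
total_files = 3180      # tracking files placed by the top 50 sites
tracking_files = 2224   # files installed by the 131 tracking companies

# Assumption: everything else is the "innocuous" remainder.
# This count is derived by subtraction; the Journal does not state it.
innocuous_files = total_files - tracking_files

tracking_share = tracking_files / total_files
innocuous_share = innocuous_files / total_files

print(f"tracking:  {tracking_files} files, {tracking_share:.1%}")
print(f"innocuous: {innocuous_files} files, {innocuous_share:.1%}")
```

The tracking share comes out at roughly 69.9 percent and the remainder at roughly 30.1 percent, which squares with “over two-thirds” and “nearly a third.”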

Here’s what’s delusional about all this: There is no demand for tracking by individual customers. All the demand comes from advertisers — or from companies selling to advertisers. For now.

Here is the difference between an advertiser and an ordinary company just trying to sell stuff to customers: nothing. If a better way to sell stuff comes along — especially if customers like it better than this crap the Journal is reporting on — advertising is in trouble.

Here is the difference between an active customer who wants to buy stuff and a consumer targeted by secretive tracking bullshit: everything.

Two things are going to happen here. One is that we’ll stop putting up with it. The other is that we’ll find better ways for demand and supply to meet — ways that don’t involve tracking or the guesswork called advertising.

Improving a pain in the ass doesn’t make it a kiss. The frontier here is on the demand side, not the supply side.

Advertising may pay for lots of great stuff (such as search) that we take for granted, but advertising even at its best is guesswork. It flourishes in the absence of more efficient and direct demand-supply interactions.

The idea of making advertising perfectly personal has been a holy grail of the business since Day Alpha. Now that Day Omega is approaching, thanks to creepy shit like this, the advertising business is going to crash up against a harsh fact: “consumers” are real people, and most real people are creeped out by this stuff.

Rough impersonal guesswork is tolerable. Totally personalized guesswork is not.

Trust me, even if I had exposed every possible action in my life this past week, including every word I wrote, every click I made, everything I ate and smelled and heard and looked at, the guesswork engine has not been built that could tell any seller the next thing I’ll actually want. (Even Amazon, widely regarded as the best at this stuff, sucks to some degree.)

Meanwhile I have money ready to spend on about eight things, right now, that I’d be glad to let the right sellers know, provided that information is confined to my relationship with those sellers, and that it doesn’t feed into anybody’s guesswork mill. I’m ready to share that information on exactly those conditions.

Tools to do that will be far more leveraged in the ready-to-spend economy than any guesswork system. (And we’re working on those tools.) Chris Locke put it best in Cluetrain eleven years ago. He said, if you only have time for one clue this year, this is the one to get…

Thanks to the Wall Street Journal, that dealing may finally come in 2010.

[Later…] Jeff Jarvis thinks the Journal is being silly. I love Jeff, and I agree that the Journal may be blurring some concerns, off-base on some of the tech and even a bit breathless; but I also think they’re on to something, and I’m glad they’re on it.

Most people don’t know how much they’re being followed, and I think what the Journal’s doing here really does mark a turning point.

I also think, as I said, that the deeper story is the market for advertising, which is actually threatened by absolute personalization. (The future market for real engagement, however, is enormous. But that’s a different business than advertising — and it’s no less thick with data… just data that’s voluntarily shared with trusted limits to use by others.)

[Later still…] TechCrunch had some fun throwing Eric Clemons and Danny Sullivan together. Steel Cage Debate On The Future Of Online Advertising: Danny Sullivan Vs. Eric Clemons, says the headline. Eric’s original is Why Advertising is Failing on the Internet. Danny’s reply is at that first link. As you might guess, I lean toward Eric on this one. But this post is a kind of corollary to Eric’s case, which is compressed here (at the first link again):

I stand by my earlier points:

Users don’t trust ads
Users don’t want to view ads
Users don’t need ads
Ads cannot be the sole source of funding for the internet
Ad revenue will diminish because of brutal competition brought on by an oversupply of inventory, and it will be replaced in many instances by micropayments and subscription payments for content.
There are numerous other business models that will work on the net, that will be tried, and that will succeed.

The last point, actually, seemed to be the most important. It was really the intent of the article, and the original title was “Business Models for Monetizing the Internet: Surely There Must Be Something Other Than Advertising.” This point got lost in the fury over the title of the article and in rage over the idea that online advertising might lose its importance.

My case is that advertisers themselves will tire of the guesswork business when something better comes along. Whether or not that “something better” funds Web sites and services is beside the points I am making, though it could hardly be a more important topic.

For what it’s worth, I believe that the Googles of the world are well positioned to take advantage of a new economy in which demand drives supply at least as well as supply drives demand. So, in fact, are some of those back-end data companies. (Disclosure: I currently consult for one of them.)

Look at it this way…

What if all that collected data were yours and not just theirs?
What if you could improve that data voluntarily?
What if there were standard ways you could get that data back, and use it in your own ways?
What if those same companies were in the business of helping you buy stuff, and not just helping sellers target you?

Those questions are all on the table now.

___________________

9 April 2016 — The What They Know series ran in The Wall Street Journal until 2012. Since then the tracking economy has grown into a monster that Shoshana Zuboff calls The Big Other, and Surveillance Capitalism.

The tide against surveillance began to turn with the adoption of ad blockers and tracking blockers. But, while those provide a measure of relief, they don’t fix the problem. For that we need tools that engage the publishers and advertisers of the world, in ways that work for them as well.

They might think tracking-based advertising is working for them today, but it clearly isn’t, and this has been apparent for a long time.

In Identity and the Independent Web, published in October 2010, John Battelle said “the fact is, the choices provided to us as we navigate are increasingly driven by algorithms modeled on the service’s understanding of our identity. We know this, and we’re cool with the deal.”

In The Data Bubble II (also in October 2010) I replied,

In fact we don’t know, we’re not cool with it, and it isn’t a deal.
If we knew, The Wall Street Journal wouldn’t have a reason to clue us in at such length.
We’re cool with it only to the degree that we are uncomplaining about it — so far.
And it isn’t a “deal” because nothing was ever negotiated.

To have a deal, both parties need to come to the table with terms the other can understand and accept. For example, we could come with a term that says, Just show me ads that aren’t based on tracking me. (In other words, Just show me the kind of advertising we’ve always had in the offline world — and in the online one before the surveillance-based “interactive” kind gave brain cancer to Madison Avenue.)

And that’s how we turn the tide. This month. We’ll prepare the work on VRM Day (25 April), and then hammer it into code at IIW (26–28 April). By the end of that week we’ll post the term and the code at Customer Commons (which was designed for that purpose, on the Creative Commons model).

Having this term (which needs a name; help us think of one) is a good deal for advertisers, not only because non-tracking-based ads are perfectly understood and good at doing what they’ve always done, but also because they are actually worth more (thank you, Don Marti) than the tracking-based kind.

It’s a good deal for high-reputation publishers, because it gets them out of a shitty business that tracks their readers to low-reputation sites where placing ads is cheaper. And it lets them keep publishing ads that readers can appreciate because the ads clearly support the publication. (Bet they can charge more for the ads too, simply because they are worth more.)

It’s even good for the “interactive” advertising business because it allows the next round of terms to support advertising based on tracking that the reader actually welcomes. If there is such a thing, however, it needs to be on terms the reader asserts, and not on labor-intensive industry-run opt-out systems such as Ad Choices.

If you have a stake in these outcomes, come to VRM Day and IIW and help us make it happen. VRM Day is free, and IIW is very cheap compared to most other conferences. It is also an unconference. That means it has no keynotes or panels. Instead it’s about getting stuff done, over three days of breakouts, all on topics chosen by you, me and anybody else who shows up.

When we’re done, the Data Bubble will start bursting for real. It won’t mean that data goes away, however. It will just mean that data gets put to better uses than the icky ones we’ve put up with for at least six years too long.


Written by Doc Searls