My Pigeons & Planes Feature (and an interview!)
The article where I tell you to go read a different article
Last month I hit a major milestone with my first commissioned article, published on Pigeons & Planes. P&P is a blog I’ve been visiting to find emerging artists since I first discovered music blogs as a freshman in high school, and I credit them with helping shape my taste in music as I first ventured away from mainly listening to artists I heard on the radio. Getting a piece published there was a real full circle moment for me.
The article came about after a chance reply on Twitter. In the midst of an overnight shift, I saw P&P founder Jacob Moore quote tweet an article about bot streaming, reflecting on the depressing reality of it all. I saw it as an opportunity to share some strange data I had found on the Rolling Stone charts last summer and hadn’t known what to do with.
It may have only gotten seven likes, but that tweet led to DMs, which led to emails, which eventually led to this article on how faked numbers can be spotted from a fan’s perspective (spoiler: it’s almost impossible to know for sure, but if you see a lot of these signs in succession, you can be pretty confident some kind of foul play is going on). Click the headline to read the full piece; underneath it I’ve included an update on the case study, plus some fascinating excerpts from my interview with Eric Drott, whose research is cited in the feature.
When Numbers Lie: How to Spot Fake Data in Music and Why It Matters
Case Study Update
While the feature is only a few weeks old, the data collection and writing for the case study section happened back in April 2021. So now, four months later, I went back to see how our mystery artist has been doing streaming-wise since then.
When I made the case study they were hovering around 17,000 monthly listeners on Spotify; now, in August, they’ve surpassed 42k. They’ve released a new single as well, but it has a marginal number of streams compared to that debut single. The Spotify top cities are still suspicious, just in different ways now. New Taipei, TW, their top city when I wrote the feature, has been traded in for Jakarta, ID. Again, this is suspicious because the artist is US based and has no reason (that I could find) to be earning legitimate listens overseas before getting any in their home country. If they had done a series of shows in those cities, or had a song on a movie soundtrack that was popular in those countries, any kind of connection to the locations, then it would make sense. But as far as I could find, nothing like that applies here. Their top five cities are now rounded out by another Indonesian city, Amsterdam, a city in Brazil, and Los Angeles! L.A. is where the artist is based, so that one actually makes sense. Their debut song is doing pretty well on a playlist of alternative rock classics from the ’90s and 2000s, and they’ve also tripled their Spotify followers. The social accounts look largely the same, and to anyone wondering, they have surpassed the single like on their Facebook page (all the way up to 28 now!). As I say in the article, I can’t say for certain that it was bot streams (pay-for-play playlists are also a possibility), but whatever they’re doing is clearly continuing to generate numbers for them.
The scary part is that if they’re able to keep gaining numbers this way, it could potentially lead to some reputable outlets or even official Spotify playlists biting on the numbers and covering their songs. The “fake” numbers would then transition into actual organic listeners and start to look real. If that happens, there would be no way to know from looking at their profiles that all of it (potentially) started from fraudulent streams. You can see the signs of this beginning to happen with Los Angeles creeping into their top cities.
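For anyone who wants to run the same kind of eyeball test on an artist they find suspicious, here’s a rough sketch of the checks I’ve been doing by hand. To be clear, everything in it is invented for illustration (the profile fields, the numbers, the thresholds, even the example cities); it isn’t pulled from any Spotify API, just the sort of information you can read off a public artist page and their socials.

```python
# Rough, purely illustrative sketch of the "smell tests" described above.
# All field names, numbers, thresholds, and cities are made up; this is not
# the actual artist's profile and doesn't come from any official API.

ARTIST_HOME_COUNTRY = "US"

profile = {
    "monthly_listeners": 42_000,
    "spotify_followers": 1_200,   # hypothetical
    "facebook_likes": 28,
    "top_cities": [               # invented example cities, not the real list
        "Jakarta, ID", "Surabaya, ID", "Amsterdam, NL",
        "Sao Paulo, BR", "Los Angeles, US",
    ],
    # Countries where the artist has toured, charted, or been promoted
    "plausible_countries": {"US"},
}

def red_flags(p, home=ARTIST_HOME_COUNTRY):
    flags = []

    # Flag 1: listeners wildly out of proportion to any social following
    biggest_following = max(p["spotify_followers"], p["facebook_likes"])
    if p["monthly_listeners"] > 20 * biggest_following:
        flags.append("monthly listeners dwarf social following")

    # Flag 2: most top cities sit in countries with no plausible connection
    explained = p["plausible_countries"] | {home}
    unexplained = [
        city for city in p["top_cities"]
        if city.rsplit(", ", 1)[-1] not in explained
    ]
    if len(unexplained) >= 3:
        flags.append(f"unexplained top cities: {unexplained}")

    return flags

print(red_flags(profile))
# No single flag proves anything; it's several of them in succession that
# matches the pattern described in the feature.
```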
Eric Drott is an associate professor at the Butler School of Music at The University of Texas at Austin. His journal article, Fake Streams, Listening Bots, and Click Farms: Counterfeiting Attention in the Streaming Music Economy, was incredibly helpful while I was researching my piece. Back in April he generously took the time to give me further insight as I was putting together my Pigeons & Planes feature. Though he’d probably decline the title, he’s someone I consider an ‘expert’ in the relatively new field of bot streams. Below are some selected excerpts from our conversation that didn’t make it into the feature, edited for clarity.
Brian Harrington: You had a point in your article where you said “where cheap technologies of fraud detection prove most effective is in combatting equally cheap methods of generating fake streams.” Why wouldn't Spotify or other streaming services want to invest more in fraud detection? Do you think it's just too difficult to actually know when it's happening? Or do you think they have anything to gain from bot streaming on their end?
Eric Drott: I'll answer that in two parts. First of all, to understand the incentives of streaming platforms, I would highly recommend one book in general, Lying for Money by Dan Davies, especially the first chapter. I only read this one recently, after having written the article you’re referring to, the research for which is actually a few years old, even though it just came out in the fall. I can't remember the exact phrase Davies uses, but he makes the point early on in the book that there's a certain kind of economic rationality at work in allowing a certain amount of fraud to exist in any kind of economic system. This is because the costs of verifying every transaction would be so prohibitive otherwise. So every economic system, including the streaming economy, tolerates a certain amount of fraudulence, because it would cost Spotify a ton of money to invest in the labor or technology necessary to monitor all the streaming that takes place on their service. And besides, such technology is spotty to begin with, so you can't really trust it to distinguish between “real” and “fake” listening. I think there's a good analog here with content moderation on YouTube or Facebook. There are a number of scholars who've written on this, such as Sarah Roberts and Tarleton Gillespie, and one of the points they make is that it's very expensive for those companies to moderate content effectively, and wherever they can save money by outsourcing it to users or hiring low-wage content moderators, they will do so.
You think it would be more expensive than the revenue they are losing that just goes to fraudulent streams?
Yeah, yeah, probably. I think the problem for someone like Spotify is that, if this happens around the margins, it doesn't really affect their bottom line. Because hypothetically if a certain percentage of streams on Spotify are fake, what does that mean, ultimately? Well, what it means will depend on what percentage of those come from accounts that are on the ad-supported free tier or some kind of discounted tier, or the $10 a month subscription tier. If some of those are coming from the subscription tier, then Spotify is still making money, right? Because it doesn't really matter to them where the streams are going, since they are still receiving their $10 a month subscription fee. But even if fraudulent activity is going on in the ad-supported tier, so long as it goes undetected, it doesn't really affect them because they're still selling advertising space. So, from Spotify's perspective, who gives a shit? If nobody knows for sure, it's not hurting them.
Who it does hurt are artists, and this is on account of the sort of pro-rata or market share model of revenue distribution that Spotify and most streaming platforms have used up until now to pay out royalties. According to this model, artists get paid a proportion of the revenue that corresponds to the share of streams that their music is responsible for generating within a given reporting period. So let’s say there's some spammer who pays for a bot that drives 1% of Spotify's total streams in that month to their music.
Well, that just means that there is now 1% less revenue that's going to be shared out with everybody else across the platform. Now, realistically, this doesn't add up to much for most artists, because it is diluted so massively. And it’s not like most of them are getting paid that much to begin with. So, all a scam like this does is change how the pot of money is getting distributed among artists. What it doesn't change is the size of the pot of money that Spotify is receiving or paying out to rights holders.
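To make the pro-rata arithmetic concrete, here is a toy sketch with invented figures (the real royalty pool, per-stream rates, and label splits are far messier). It shows why a bot that grabs 1% of streams only shifts money between artists without changing what Spotify pays out.

```python
# Toy model of the pro-rata (market-share) payout Drott describes.
# The pool size and stream counts are invented for illustration.

royalty_pool = 100_000_000        # dollars paid to rights holders in a month
total_streams = 20_000_000_000    # all streams on the platform that month

def pro_rata_payout(artist_streams):
    """Each artist's payout is the pool times their share of total streams."""
    return royalty_pool * (artist_streams / total_streams)

# A spammer's bot drives 1% of all streams to their own catalogue...
bot_streams = 0.01 * total_streams
print(f"Spammer collects: ${pro_rata_payout(bot_streams):,.0f}")   # $1,000,000

# ...so 1% of the pool no longer reaches everyone else. The pool itself,
# what Spotify actually pays out, doesn't change at all.
print(f"Left for everyone else: ${royalty_pool - pro_rata_payout(bot_streams):,.0f}")
```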
You talked about how a user-centric model would be different because artists would only be getting money from people that actually stream them, but do you feel like that's a realistic solution?
Well, it's a realistic solution to certain problems and not others. There's a lot of talk about how this is going to be the thing that fixes streaming, and I don't buy that. One problem it would help fix relates to fake streams. There is a case I cite, what's sometimes called the Bulgarian Spotify scam, where clearly the people paid for premium subscriptions. There were some scammers, who weren’t necessarily Bulgarian, but who used Bulgarian IP addresses. They paid for 1,200 premium accounts on Spotify, and used these to boost the streaming numbers for a couple of playlists composed of music to which they held the royalties. So even though they gamed the system and made a fair bit of money, from Spotify’s perspective that’s still great. The scam only works because all the money raised from paid subscriptions goes into a single pot, and is shared out from there. What it means is that subscribers who listen less in a given month actually subsidize the activity of listeners who listen to Spotify more. So, if you're somebody who only listens to 10 tracks a month on Spotify, a lot of your money is effectively being redirected to help pay other people's preferred artists. And so what these people do is take advantage of that kind of loophole.
Under the user-centric model, the money from a particular subscriber's account would go to whichever artists they happen to listen to and to no others. It would be earmarked for those artists, which means that if you're the scammers, what would have happened is, you buy 1,200 accounts, and then you stream all this music on them, and you get exactly the money from those 1,200 accounts and nothing more. And so, unless you manage to get other people to listen to it (which obviously wasn't what they were trying to do), you've basically come out of this neither losing money nor making any. Well, actually you probably lost $3 on every subscription because Spotify will take its cut. So, you've lost $3,000 or something like that, if you’re paying the standard $10 a month price. So, the only way this scam works is through the existence of the current pro-rata model. Or at least that’s true for this particular scam; as I said, Spotify is still getting their cut regardless.
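Here is the same kind of back-of-the-envelope math for the 1,200-account scam under both models. The $10 subscription and the roughly $3 platform cut are the figures Drott mentions; the royalty pool and the bots’ share of total streams are invented for illustration.

```python
# Back-of-the-envelope comparison of the 1,200-account scam under a
# user-centric model vs. the current pro-rata model. The $10 subscription
# and ~$3 platform cut come from the conversation above; the royalty pool
# and the bots' share of total streams are invented for illustration.

accounts = 1_200
sub_price = 10          # dollars per month
platform_cut = 3        # roughly what Spotify keeps per subscription

money_in = accounts * sub_price                       # what the scammers pay Spotify
to_rights_holders = accounts * (sub_price - platform_cut)

# User-centric: each account's royalties go only to the artists that account
# actually streamed. The scammers stream only their own catalogue, so the most
# they can get back is their own accounts' share.
print(f"User-centric: net {to_rights_holders - money_in:+,} dollars")   # a loss

# Pro-rata: all subscription money goes into one pot, and the scammers' streams
# pull out a share of *everyone's* money. Say their bots rack up 0.1% of all
# streams against a $100M monthly pool:
royalty_pool = 100_000_000
scam_share_of_streams = 0.001
pro_rata_return = royalty_pool * scam_share_of_streams
print(f"Pro-rata: collect ${pro_rata_return:,.0f} on a ${money_in:,} outlay")
```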
Another thing you touched on was the arms race between bot services and anti-bot technology, how they're just going to keep one-upping each other. I don't know how much your research has continued on that since the article, but have you seen that play out at all? Are there new tricks going on?
Not that I'm aware of, that was speculative on my part, though there are some indications that that’s what’s going on behind the scenes. There's much more extensive literature on email spam, and that's where researchers have documented this kind of arms race. It’s a phrase I borrow from Finn Brunton, who uses it in his book Spam, where the more sophisticated the filtering program, the more sophisticated the response. I saw some evidence of that with the one software package that I described in the article, Spotviewbot. When it was first launched, it was pretty primitive. Then two years later its developers had introduced all these randomizer functions so that it would not be so obviously algorithmically generated. And I imagine that’s in order to foil algorithmic detection on the other end.
I think that so long as there's a market for this kind of service, there's this sort of incentive to improve the technology. But I think that it's easy and tempting to get caught up on the technological angle. Again, I don’t have hard empirical evidence, but I would imagine that the more effective way of procuring streams is through the click farms that I talked about at the very end of my article. And one of the things I touch on is how, if you go to these so-called black-hat marketplaces and try to buy 10,000 streams or 50,000 streams or 100,000 streams or whatever, a lot of the traffic on those platforms is between people that have been contracted to generate streams, who then contract with other people to generate streams. And so, you get this reticulated network, because what ends up happening is, let’s say a person like me in Texas buys 100,000 streams from some guy in, say, Italy, who then buys them from somebody in Kenya and Mexico and Indonesia. And each of those people buy from three other people in South Africa, Israel, Australia and the Philippines. And so, it becomes in a way a more effective way of disguising where these things are coming from and the artificiality of the listening behind it.
There was some evidence of this in a footnote I had to cut from the article. Namely, I actually found an email exchange in the user forum for one of these websites, where certain users were complaining about another seller. In particular, there was this one guy who was complaining about how he had been contracted to generate a certain number of streams. And so, he subcontracted it out to this other guy on the same platform who then defrauded him. So, it's basically just like one con artist having been conned and complaining about it.
And he was warning everybody on the site, "Hey, don't work with this guy, he's unreliable, he's an asshole," or whatever. And he uploaded the email chain as a .pdf. When I read this .pdf, I thought to myself, "Oh, this is great! It really shows how this works." Obviously for an academic paper, this is not a credible source, so I had to remove the footnote. But it really was revealing of the underlying mechanics of how that kind of traffic works. Once you pay for 100,000 streams, what happens? Well, chances are the person who's delivering them isn't the person who generated them. They were probably working through a number of other intermediary links in this long and distended supply chain.
It's all super fucked up, but that last section of your article where you cite an interview with a man who worked in a click farm just felt very dystopic.
Yes, it is pretty dystopic. The thing is, though, one of the reasons I wanted to talk about that at the end of the article is a point that critical scholars of automation have made: a lot of things are passed off to us as having been automated which really aren't. The work is just being performed by cheap labor that's hidden from view.
You know, it's much cheaper for a lot of major corporations to pay piece wages, to take advantage of global labor arbitrage and pay poverty wages to somebody in Bangladesh to do this kind of click work, and then present it to people in the global North as if it were done by a bot or a script. So, it's important to bear in mind that there's a lot of what Astra Taylor calls fauxtomation, that is, faux-automation. I don't know what the exact proportions are, but I would imagine that most of the fake streams come from people clicking.
Is your research on bot streaming continuing?
Not necessarily on the bot streaming per se. I'm finishing a book on the political economy of streaming platforms. So, it'll probably come out in a year, knock on wood. That's the big thing I've been laboring away at for the past few years. The bot stuff is an important chunk of it. And another thing I talk about is people trying to hijack users’ attention by posting soundalike cover versions of popular songs, angling to get users to accidentally click on them. The idea is to fool listeners, usually by exploiting the fact that there is some song somebody might have heard on the radio or in a shop one time, but they can’t remember the name of the track or the artist. So they search for a song title and something similar pops up. But this sort of phenomenon also keeps evolving. And there’s a kind of arms race in this regard as well.
For example, another thing people have talked about is how a big thing now is exploiting movie soundtracks, because there are a lot of movie soundtracks where the score and the pre-existing music get separated out into separate albums. So, individuals will make these playlists and combine the two, but then they’ll tuck into these playlists a couple of their own tracks in hopes that people listening won’t notice the difference. What they are hoping is that listeners think to themselves “was that in the movie? I don’t remember, but maybe it was.” But the bigger issue is that there’s a kind of more fundamental fraud that undergirds all these sorts of incidental - or to use a fancy word, epiphenomenal - kinds of fraudulent activity, like people boosting stream counts. The larger kind of imposter fraud, in my opinion, is the one platforms engage in: the idea that the numbers that they tally actually have a real, tangible, and direct link to the activity that they're supposed to symbolize or index, that somehow a stream count corresponds to somebody actually listening to music on the other side.
Which raises the question, how does the platform know that? They don't, right? Something pressed play and didn't skip, that's all the platform knows. There was a signal that was sent from a terminal or a device of some sort, and that's it. You don't know what prompted it and whether or not the listener was paying attention, whether they were even in the room. That, for me, is the bigger issue. For a streaming platform like Spotify, this is critical because if all of a sudden people start to doubt the veracity of their streams, it’s a problem. Like I said, they're willing to tolerate a little bit of funny business around the margins because it's too costly to clamp down on all of it. But if all of a sudden digital advertisers, for instance, start to think, "Oh, well, their numbers are bullshit." Then their business model is seriously fucked.
It really starts to hurt their bottom line then.
Yeah. If investors start to think that, "Oh, this isn't as popular a site because 10% or 20% of the streams are generated by bots, this actually doesn't have as big an audience. It doesn't have the growth potential we thought it did." That's a big problem for a company like Spotify.