The Curse of Self-Service

These days, we seem to be high on data and data-related trends. My opinion on Big Data should be well known to my readers: it has to be carefully managed, and it is largely a fad for all but a select few companies.

With data being the new black, similar trends grab the attention of modern managers. One of these is Self Service. It seems like such a logical consequence of our advanced data visualisation: democratise the data.

It’s worth noting that the notion of humans making better decisions when well served with information is rather old. Thomas Jefferson said: “whenever the people are well-informed, they can be trusted with their own government”. But what exactly does it mean to be well-informed? Another great statesman, Churchill, said: “The best argument against democracy is a five-minute conversation with the average voter”.

In this blog entry, I will argue that it does not follow that humans will make better decisions if we just give them access to more data. In fact, allowing people to self-service their data can be outright harmful.

Process, Wisdom and Courage

We humans have a curious behaviour when we participate in systems that exceed our ability to comprehend them: we implement process to defend ourselves against the fear of the unknown.

We observe it clearly in people with obsessive-compulsive disorder (who doesn’t have a bit of that?). When you suffer from OCD, you apply a forced behaviour to counteract frustrating attributes of your environment. Likely, this pattern is pre-programmed into our genetic structure. B. F. Skinner’s experiment “‘Superstition’ in the pigeon” shows that pigeons can easily be conditioned to react strongly to patterns in their surroundings that don’t exist.

At a more harmless scale, we see this happen when we speak about the weather. Since moving to the UK, I have learned that the weather forecast here is significantly more unreliable than in Denmark. Largely, I theorise, because of the more complex weather pattern of this island nation. You never know when to bring an umbrella here. The effect on the average Englishman you meet on the Tube is almost too stereotypical: people here compulsively moan about the weather, as if moaning had any effect on it.

Unfortunately, this curious evolutionary perk isn’t restricted to simple pigeon experiments, forced behaviour of individuals and idle chatter on the Tube. We see it all too clearly at macroscopic scales too – where it is causing significant harm to the planet. A great place to observe this in action is our western economic system. At this point, it should be relatively clear to any unbiased observer that the stock market is no longer correlated with the prosperity, flourishing and happiness of nations and companies. Since the crash of 2008, we have followed our evolutionary pattern of adding MORE rules to a system we already fail to comprehend. Instead of going over this argument in detail, I will refer you to the excellent talk by Barry Schwartz: “Using our Practical Wisdom”.

Another place to observe this pre-programmed response is in the National Health Service (NHS) in the UK. The NHS is one of the top 10 employers in the world by size (source: http://www.bbc.co.uk/news/magazine-17429786) – just under McDonald’s (ironic and probably largely explaining the weight difference between the average US and UK citizen). The UK is in economic trouble these days, yet the NHS still maintains its admirable promise for healthcare: “free at the point of delivery”. Yet, under the pressure of creating more efficient healthcare, doing more with less, the NHS management is responding with an all too predictable pattern: “More rules, more process”. The effect is equally predictable: Malpractice, burnout among the staff and a pessimistic outlook on the future with no light in sight (source: The King’s Fund 2012 report). If the NHS were a private company, now would be a good time to jump ship.

Implementing more rules doesn’t seem to work – so why do we do it? My theory is that this is driven by our fear of the unknown. We are afraid of the dark, of death, of losing control of our life. We believe in Bronze Age legends because we fear death. We implement Basel because we fear that the stock market is not a good way to run our society and we don’t know what else to do. We worship empty-headed celebrities, shallow philosophies and management “theories” because we don’t have the courage to follow the idols who tell us that life is difficult, finding your purpose is hard and that humans are complicated.

Is the cure for our fear of the dark more data? If only data were free and accessible, would we make better choices?

Your honour, exhibit number one: Fox News.

Fox News – Are people in the US stupid?

I have had a hard time finding accurate, unbiased numbers on the viewership of Fox News. However, from the stats I have been able to find (http://tvbythenumbers.zap2it.com/2013/08/16/cable-news-ratings-for-thursday-august-15-2013/197950/), Fox News has more viewers than CNN. Even considering the already poor, populist quality of CNN news (and I use the term “news” lightly), remember that Fox News is a channel which regularly features programs which deny global warming, treat creationism as a reasonable alternative to evolution, maintain that the war in the Middle East is going well and still refuse to see that the power balance in the world is shifting East and that the American dream never was. Now, consider that the facts of all these “controversial” issues are easily available. You can self-service your way to the data. Yet, there are people in the world, a significant number of them in fact, who consider themselves well informed by the gibberish that Fox News spoon-feeds them. Are Americans simply dumber or more gullible than Europeans?

Your honour, exhibit number two: The MMR vaccine.

MMR vaccine – Why do mothers intentionally harm their children?

In 1998 Andrew Wakefield, a doctor now barred from practising medicine in the UK, published a controversial study linking the life-saving MMR vaccine to autism spectrum disorder. The media reacted very strongly and the herd immunity conferred by the vaccine was compromised as mothers refused to vaccinate their children. The article was later discovered to be fraudulent and The Lancet retracted it. The history of this case is well documented in Wikipedia and the reference section in the entry provides ample information from peer-reviewed journals to make the conclusion obvious: There is no link between autism and the MMR vaccine.

In November 2012, an outbreak of measles (one of the three diseases the MMR vaccine provides immunity to) in Wales caused more than 1000 children and young people to fall ill. Nearly a hundred people were hospitalised and a young man of only 25 was tragically killed by the disease. The NHS made a heroic effort to provide the MMR vaccine to the region to stop the outbreak. This loss of life and human wellbeing is completely preventable.

In the meantime, the media are still giving people like Jenny McCarthy a platform. At this point, shouldn’t we know better? Shouldn’t we be able to self-service our way to the data to make the right decision?

Ah, but recall, the media are not doing this for truth, they are doing it for profit. And here is the catch: If the data is easily accessible, they (the media that is) can simply point to it and say: “Hey, make up your own mind, we just represent an opinion”.

And that brings us to the outsourcing of work and responsibility, away from the people who we pay or elect to make hard decisions, to the people on the factory floor of civilisation – the consumer.

Your honour, I call to the stand: The average consumer, me.

Outsourcing – Let the Consumer do the hard work

In Europe, the minimum wage is constantly growing – while the distribution of wealth is becoming even more skewed. I believe that technological progress, increased productivity per worker, is a reasonable explanation for some of this effect. However, I don’t believe this is the full story.

Consider the case of supermarkets; I HATE supermarkets. On the one hand, I appreciate the symbolic value of the supermarket, the amazing logistical equation that has to be solved to bring me a wide selection of fresh goods. On the other hand, I can’t stand BEING in the supermarket; the endless queues, the organisation of goods in ways that I find counterintuitive (and which the supermarket has carefully calculated will make me buy more), the noise and the muzak – it’s just too much.

Supermarkets know this, so they have provided two “solutions” to the problem.

First, they allow me to shop online and will deliver to the door. I consider this a brilliant idea, a win-win. I get the ease of browsing goods the way I want to and I don’t have to physically visit the supermarket. The supermarket does not even NEED a store, which saves them money and we both save on logistics and CO2 (since the supermarket can plan the delivery route carefully, saving total fuel consumption). This is self service at its best: I get to pick from a range of pre-configured goods and someone delivers it to me when I want it.

Second, the supermarket has invented a new way to “self-service”. They have removed the checkout assistant and replaced him with a machine where I can scan my own goods. I HATE those machines, and I refuse to use them. What is happening here? The supermarket, always pushed to maximise profit, has found a way to remove a job (the checkout assistant) and steal a bit of my time instead. Instead of solving the logistical problem (as with the first solution) they are outsourcing the solution to me. This takes time away from me, and gives it to the supermarket’s bottom line. It scales very well for them, but not very well for me or anyone else – there is a hidden cost to society here. It also eliminates a perfectly viable, low skilled job and replaces it with my work instead – contributing to unemployment rates in the process. I am morally opposed to these machines.

Banks and telcos have taken this idea one step further. Not only do they take time-consuming processes and give them away to the “end user”, they outsource the organisation of those processes to the end user. A telco I recently used has outsourced part of their customer sales support to the Philippines, their technical support to India, the routing of my request to a machine and the communication between the two departments to me. There are a lot of issues with this. First of all, it wastes my time when I have an issue to resolve. Second, it is disrespectful to the poor call centre employees who have to listen to my frustrations. Third, it removes low-skilled labour from the West and moves it East. While there is something ironically “fair” about boosting foreign economies by moving money around like this – it sends a very poor signal to both the outsourcer and the outsourced. I am willing to pay money, as I indeed have done lately, to avoid providers who do this.


Summary – The case against self service

Summarising the observations so far:

Even when good data is easily accessible, people don’t act on it. We only need to look at the world around us. The facts are everywhere, but even people with scientific training fail to act on them. Action has to be taken by people with courage to make tough decisions. Left to their own devices and with the existing education system, the crowd is stupid, not smart.

Once a belief has been established, future data, no matter how easily accessible, alone won’t change it: As we saw with the case of the MMR vaccine and indeed with most religious beliefs – new data won’t change a perception once it has been placed there. In the words of Gordon Livingston: "It’s difficult to remove by logic an idea not placed there by logic in the first place".

Self-servicing the data supply problem does not solve it, it just distributes the cost of the inefficiency: The solution to the supermarket problem of distribution is not to let the customer do the hard work of scanning goods – it is to solve the logistical challenges of distributing the goods efficiently in the first place. But self-servicing data by passing the work to the end user makes it APPEAR as if the problem has been solved – at least in the eyes of the data provider (the supermarket). The fact that supermarkets have found another way than self-service to solve the logistics, and that we as computer science data modelers can’t, is a massive failure on our part and something we need to address. Democratising the process isn’t the solution; good design patterns are.

Now, it may sound like I am advocating a benign dictatorship here, maybe even a data intelligentsia. I am doing no such thing. However, I would like to point out that our modern, western civilisation and education system has largely failed to address the basics of supplying people with good, reliable facts and making those facts widely known and accepted. Organisations like Gapminder, The Gates Foundation and TED are doing an amazing job here, but we have so far to go.

In the field of IT, we have tried to throw “solutions” like Big Data, Self Service, MPP and a plethora of data “modeling” tools (like ERwin and self-service ETL tools) at the problem. But those are all techie fixes; they don’t address the underlying inability of humans to comprehend data and act logically in large crowds. Until we address this by providing a strong foundation of scientific method in our schools, I am afraid we won’t make much progress. I am as guilty as anyone: I found statistics to be a horribly boring subject in university, despite my teacher’s best attempts to make it interesting. The essence of the argument is: Data is hard work!

We live in a world that needs to be data driven more than ever and the data supply and analysis problem is simply not being solved fast enough. Stem cell engineering, human rejuvenation, genetic engineering, artificial brains and robotic warfare are on our doorstep. Regurgitating decade-old technologies like column stores and promoting self service as the solution to the data problem is just not good enough; in fact, it trivialises the problem instead of solving it. It also makes us guilty of conspiring with the people who WANT to dumb down our data and the population to maintain their power base (both in government and in companies). We need to train data modelers and statisticians and find ways to communicate difficult statistical concepts to the masses. As computer scientists, we need to solve the logistical problem of data distribution MUCH better than we do today. We need champions in the political system who can understand, and act on the data, and people who independently validate the quality of that data. For obvious reasons, something the self-service crowd seems to miss, the people who do the data quality checks can’t be the same people who rely on the data to support their opinions.

We are going to need to accept some pretty uncomfortable “hard facts” and find ways to communicate them emphatically. Throwing masses of people who are “empowered” with self service and pretty graphs at this challenge isn’t the solution…

Your honour, I rest my case…

  1. mat young
    August 19, 2013 at 00:03

    Light you say, nothing light about this. A good read.

  2. Djeepy1
    August 19, 2013 at 08:01

    Interesting article. I do not share all your ideas, but it made me think.
    Without wanting to start a debate, my point of view is based on the Pareto Principle. Even if the “crowd” is globally stupid, you need the whole crowd to reach the 20% who could do fantastic work with data if you free it for them.

    • Thomas Kejser
      August 19, 2013 at 08:17

      There are two problems with the Pareto approach here

      1) democracy does not grant leadership to the 20% – it grants leadership to the most common view
      2) if the rest of the 80% are ill informed and unable to accept what the data says – you are just going to burn all your time convincing them to act

      I would claim that the world is currently suffering from both of these pathologies. Dumbing down the problem isn’t the way to address it – we tried that, it didn’t work

  3. Djeepy1
    August 19, 2013 at 11:56

    I do not know about other countries, but in the companies I work for in France, the 20% who are leaders run the company. The other 80% just follow; they do not waste any time (or very little).
    (Note: I do not say they are useless, but their individual contributions are small)

    • Thomas Kejser
      August 26, 2013 at 08:30

      That is a rather interesting observation.

      Let’s just for a moment grant it to be true that in France:

      1) people follow their leaders
      2) those leaders are not wasting any time

      Frequent strikes over trivial matters, near bankruptcy, massive unemployment rates, extremely low happiness and one of the weakest economies in Europe (for a country that size) aside, does it still follow that those French leaders make better decisions when they have access to self service? Or, if they don’t, does it follow that those leaders would make MORE “good” decisions if only they had more time (implying that self service is more time efficient)?

      I am not entirely against the idea that a company run by an intellectual elite can serve as a stepping stone to something greater. Indeed, I think some form of benign, elitist dictatorship may even be applicable at the country scale, as a temporary form of government, for countries not (yet) well versed and educated in the principles of democracy. However, I would be extremely skeptical about the efficiency of such a construction if it was not carefully monitored by checks and balances and transparency that prevents the abuse of power… Indeed, there is an argument FOR self service here if the “mob” was granted access to the same information as its leaders.

  4. August 22, 2013 at 19:31

    Definitely a thought provoking piece, and I think I agree with most of your takeaway…especially this part:

    >>> “We need to train data modelers and statisticians and find ways to communicate difficult statistical concepts to the masses. As computer scientists, we need to solve the logistical problem of data distribution MUCH better than we do today. We need champions in the political system who can understand, and act on the data, and people who independently validate the quality of that data.”

    However, there were a few areas that felt a bit “strawman”-ish. For example:

    >>> “promoting self service as the solution to the data problem is just not good enough”

    The idea behind self-service (and Big Data for that matter) is to provide access to data and tools for those “interested, capable, and properly incentivized” to make the best decision. It is not a panacea for solving the world’s problems in and of itself.

    If a person is not capable of doing the work whether that be due to lack of education, interest, and/or incentives and therefore fails to solve the problem at hand…this does not speak to the failure and/or short-comings of self-service BI.

    In my opinion, self-service is more about operational efficiency. Subject-matter experts on the business side will be able to leverage self-service tools to make decisions faster than waiting on me to build, test, and deploy. But there are underlying assumptions that simply cannot be hand-waved away…this subject-matter expert must have the education (both business knowledge and data-proficiency)…the data must be clean…and ultimately the business needs to accept the results and be capable of taking action on them.

    • Thomas Kejser
      August 26, 2013 at 08:10

      Hi Bill

      Sorry for the late reply, I have been on vacation.

      Let us focus for a moment on the BI aspects of the self service problem.

      I agree that even given the right tools and education of the “self served user” – the business will still need to learn to accept the facts. However, there are two problems with this:

      1) if the business is large, and therefore has the scale problem that self service is meant to solve, it is only rarely driven by facts (humans in large groups act on mob mentality and power dynamics, not on what is best for them)
      2) if the business is small and not tainted by politics, one wonders why traditional reporting isn’t good enough to get by and why self service is an issue in the first place.

      … Self service BI seems like a good idea in an ideal world where humans generally understand data and where people can interpret data objectively. But I don’t think I am guilty of a straw man argument to point out that this is not the world we currently live in.

      Note that I am not saying that we can’t do better with technology today. Indeed, I think a lot of “traditional” BI has been tainted by poor implementation, models, report tools (Reporting Services being a particularly grim example), low agility and general misestimation of the effort required to clean up data for proper use. Despite there being reasonably good sources of information on how to overcome these issues (Kimball for example), this has not happened on a big scale. Microsoft is a good case study here: several good initiatives (SSAS dimensional, project Barcelona, project data dude, SSIS) had the opportunity to become de facto solutions to the problem of BI/DW at scale. Yet even MS listened to the siren’s song of smaller players, when tying yourself to the mast and stuffing your ears might have saved the ship.

      To summarise my other battle with self service: I think it is naive to believe that crowd sourcing the solution makes the problem go away. Indeed, I think it blurs the responsibility for data gathering and analysis. It should be relatively easy to see how this blurring can be a (malicious) political goal in itself.

      • williamanton
        August 27, 2013 at 13:22

        (sorry for the double post – wordpress log-in issues)

        yes – the political factors can definitely be a barrier to progress/solutions…good point.

        and you’re also right on the part about “people generally not understanding data enough to use it properly”…in fact that can likely be extrapolated out to a more general educational issue plaguing the global society.

        i somehow got the impression that you were evaluating the success/failure of self-service BI on the basis of whether it could be a tool for uneducated users – preventing them from making incorrect use of the data…(the example that comes to mind is how Tableau recommends different visualizations based on data selected). I see now that my first impression was incorrect – so consider the straw-man comment retracted 😉

      • Thomas Kejser
        August 27, 2013 at 13:28

        Your comment pointed out a lack of clarity in my argument. Appreciate you pointing it out, thanks.

    • Thomas Kejser
      August 26, 2013 at 08:33

      A question I forgot to ask Bill:

      When you say that a business has to wait for build, test, deploy – how long does that process take in your experience?

      • williamanton
        August 27, 2013 at 13:23

        assuming some form of agile methodology in place… the delivery period should be a matter of days to weeks (depending on the complexity required). For example, to iterate through a few lines on a bus-matrix, maybe a few weeks….to pop out a single report based on existing data in the warehouse…a day or 2 depending on the complexity.

        (don’t get me started when the business doesn’t know exactly what they want but they absolutely must have it yesterday…those can drag out)

      • Thomas Kejser
        August 27, 2013 at 13:27

        A few days, if not even a few hours, in an automated build environment is my ballpark figure here… It seems we are aligned on the ballpark figure.

        Assuming that number, I have a hard time seeing how anything faster than that serves any actual need except the vanity of the end user.

        And yes, data quality and users not knowing what they need. Hard problem (and one that I would claim self service only makes worse)

  5. williamanton
    August 22, 2013 at 19:38

    Definitely a thought provoking piece, and I think I agree with most of your takeaway…especially this part:

    >>> “We need to train data modelers and statisticians and find ways to communicate difficult statistical concepts to the masses. As computer scientists, we need to solve the logistical problem of data distribution MUCH better than we do today. We need champions in the political system who can understand, and act on the data, and people who independently validate the quality of that data.”

    However, there were a few areas that felt a bit “strawman”-ish. For example:

    >>> “promoting self service as the solution to the data problem is just not good enough”

    Anyone who promotes self-service as the solution to the data problem is likely a self-interested vendor.

    • Thomas Kejser
      August 26, 2013 at 08:11

      Hi Bill

      See my other reply.

      And yes, I agree on last point here 🙂

  6. WillBeFine
    September 2, 2013 at 12:07

    The quote, “There are lies, damned lies and statistics” sums up the problem with data. It can be manipulated either through ignorance or wilful purpose. Self service is like a buffet meal. It is cheaper, offers lots of choice and meets your needs quickly. If you go to an expensive restaurant it costs more, a limited choice, slower service but of higher quality. Personally, I like both offerings and restricting yourself to one or other has drawbacks. In the end, it is difficult to avoid the statistics!

  7. Stephen Morris
    October 5, 2013 at 08:52

    I agree with your premise that self-service is basically unsound and unsafe; strangely enough, I formed this opinion more than 20 years ago whilst working as an information officer in the National Health Service. What the end user (normally a Dr, cleverer and better educated than me) thought they wanted – their data request, and what they actually wanted – the information provided, was the majority of the time very different. The key to being a good information officer was to understand the way the data was collected, its reliability, definition and its correct interpretation. This knowledge takes time to acquire and is a speciality in itself. For complex organisations like the NHS having someone who can highlight the pitfalls and unreliability of the raw data is essential if they are to avoid the end users’ initial assumptions leading to important decisions being taken on incorrect information.

    • October 5, 2013 at 09:12

      Stephen, thanks for this valuable insight. It’s fascinating that even an organisation like the NHS, which is supposedly “science driven”, still has this problem of people self servicing/harming.

      Overall, I think this self service wave is confusing “easy access to data” with “intelligent use”. I am all for making data access as simple as possible – but only AFTER the data has been cleansed, categorised and properly validated. Providing users with self service is a bit like firing the librarians in your public library and piling all the books up on the floor.
