Skepticism about science and medicine

In search of disinterested science

Knowledge, understanding — but then there’s Wikipedia

Posted by Henry Bauer on 2014/07/17

I’ve had much occasion to comment on the unreliability of Wikipedia on any topic where viewpoints differ (The Wiles of Wiki; Health, Wikipedia, and Common Sense; Lowest common denominator — Wikipedia and its ilk; The unqualified (= without qualifications) gurus of Wikipedia; Another horror story about Wikipedia; The Fairy-Tale Cult of Wikipedia; Beware the Internet: “reviews”, Wikipedia, and other sources of misinformation).

However, yesterday morning’s Public Radio warned me that I should question Wikipedia’s reliability even over what might seem to be objective factual data. Many media ran the same story, for instance the Sydney Morning Herald.

The revelation was that 8.5% of all Wiki articles, some 2.7 million of them, were “written” by one individual, Sverker Johansson: “On a good day the output can be as high as 10,000 articles”. “His claims to authorship are contested however, as they were created by a computer generated software algorithm, otherwise known as a bot. Johansson has named his Lsjbot”.

The Public Radio piece included comments from Jimmy Wales, Wiki’s founder, who said that this was actually nothing new, that “bots” had been used to “create” “content” from the very time Wiki was first established.

Johansson said that his motive is to bring knowledge to the widest possible audience.
An obvious question would have been, what is meant by “knowledge”?

A primitive answer might be, knowledge consists of facts, things that are indisputably so.
For example?
Well . . . . That all humans are mortal?
Hard to quarrel with that one, though quibblers might suggest a dependence on the definition of “human” and on the status of gods who sometimes take human shape.
So how about “the Earth is not flat”?
No quibbling there, provided we ignore as irrelevant any technicalities that concern only topologists and their multiple dimensions. But such negatives are not particularly informative, and surely “knowledge” implies being informative.
So should we have said “the Earth is spherical”? No, because quite important characteristics and phenomena depend on the fact that the Earth is not exactly spherical.

The point is, I suggest, that there’s no such thing as purely factual knowledge, because that isn’t informative. Data have meaning only in some context.
One might say that there are two kinds of knowledge, map-like and story-like. Maps tell you how to go somewhere, but give no reason for doing so, no meaningful context. Stories, on the other hand, may not be factually accurate in every respect, but they convey meaning, understanding. As Steven Weinberg has put it, “The more the universe seems comprehensible, the more it also seems pointless”. Pure facts, data, convey nothing that’s meaningful for us human beings; context, relationships, emotions, ethics, morality are what give meaning to facts.

Bots, robots, computer programs, “artificial intelligence”, “information technology” are inherently incapable of delivering meaningful knowledge, or of judging whether or not certain data are meaningful or whether they are nonsensical.
It follows that Wikipedia ought to restrict itself to things that matter to computers, automata, robots, bots.

The usefulness of Wikipedia — of anything that claims to be informative — depends inescapably on the inescapably human judgment that went into selecting and vetting what is presented as “knowledge”, even if that has the appearance of purely “objective” data.
In the earliest days of the computer-obsessed era, a principle was recognized that contemporary computeroids like Wales should re-learn: GIGO, garbage in = garbage out.

There are no databases or other repositories of supposed fact that can be relied on not to contain errors and misleading “facts”, and only human intelligence, common sense, and judgment are capable of detecting them. I learned about that early in my research career, when I was studying photolysis of organic iodine compounds. Nitric oxide, NO, could be used to combine with iodine atoms, so I searched for information about NOI, nitrosyl iodide, in the index of Chemical Abstracts, the universal source of information about chemical matters in pre-computer days. I was astonished to find that a cited source turned out to be an article not about NOI but about NaI, sodium iodide. I assumed that whoever had “written” that article had dictated to a secretary and then failed to proofread. I doubt that such errors have stopped occurring, though nowadays they may stem from flawed speech-recognition software rather than from secretaries.
Beyond that, how is a computer or a bot to figure out whether or not the Earth should be described as spherical?
And how much more misleading would a bot be about more complex matters?
Could a bot recognize that the conclusions of a published, peer-reviewed article are not to be believed because the statistics were incompetent, or the protocol inappropriate?

Automated procedures cannot deliver reliable information. They can search databases, but they may just be collecting Garbage Input. Imagine what “purely factual” information computers would glean about HIV/AIDS, say, since just about everything in the mainstream literature has been misinterpreted (The Case against HIV).

Sadly, the computeroid nonsense doesn’t stop with Wikipedia. Books are “written” in the same way:
“Phil Parker, who is purported to be the most published author in history, has successfully published over 85,000 physical books on niche topics such as childhood acute lymphoblastic leukaemia. Each book takes less than an hour to ‘write’. In fact the writing is carried out by patented algorithms which enable computers to do all the heavy lifting.” “The books — typically non-fiction and on extremely niche topics — are compiled on-demand, based on publicly available information found on the internet and in offline sources” (Automaton author writes up a storm).
Not everyone would agree that this technique can produce non-fiction, something that is not fictional.

“Bots may also be writing the journalist out of the future of journalism. Ken Schwencke, a reporter on the Los Angeles Times, has created ‘Quakebot’, an algorithm which automatically creates and publishes a story on the newspaper’s website every time an earthquake is detected in California” (This is how Sverker Johansson wrote 8.5 per cent of everything published on Wikipedia).

This is how the world will end, not with a bang, not with a whimper *, but through the abandonment of thinking under the spell of computeroids and their bots.


* See “The Hollow Men” by T. S. Eliot


3 Responses to “Knowledge, understanding — but then there’s Wikipedia”

  1. Mark said

    “Could a bot recognize that the conclusions of a published, peer-reviewed article are not to be believed because the statistics were incompetent, or the protocol inappropriate?”

    Yes. Yes it could.

    I don’t like some of this negative stuff that you’re saying about artificial intelligence. I think that more of a “wait and see” attitude would be more appropriate.

    • Henry Bauer said

      We disagree. I can’t conceive of any algorithm capable of doing those things because they depend on making judgments.
      In general, I think all algorithms suffer from the limitations that were recognized when learning machines and the like became popular: To create an algorithm that can handle every conceivable possibility is impossible. But human intelligence can handle novel situations.

  2. Pol Dubart said

    Yes, I quite agree with you and I think we are living more and more in some kind of an Orwellian science-fiction world that prevents people from having any critical judgement about what is told them by media and scientific or political “authorities”… Never have people been so gullible as in our so disinformed modern world…
