Sunday 30 December 2018

What is something data scientists know that others don’t?



I was asked this question on Quora, and it seems like an interesting one, so here’s my best shot at answering it.

First, a little personal background.  I have worked in the field of data science (it used to be called data analysis and statistical analysis) for about 35 years, for governments, universities, non-profits and businesses.  That includes some consulting work, as well.

I got into the profession more or less by accident – my undergraduate degree was in geophysics, but work dried up in that field shortly after I graduated, so I further upgraded my math/stats/computing education, and moved into those areas instead, as the transition from mainframe computers to personal computers and servers in the 1980s and 1990s opened up a lot of opportunities.

Before I knew it, my career had shifted from geophysics to data analysis, programming, and statistics.  Fortunately, a degree with a focus on physics and math prepared me nicely for that sort of work.  That remains true today, as many people coming out of university with STEM degrees transition into data science once they need a job in the “real world”.  In fact, I have a close personal relative with a PhD in astrophysics who is now happily doing data science for a major university, though not within the astrophysics department.

So, that’s something that data scientists know that most people don’t know – a lot of people doing “data science” moved into the field after studying different, but usually related disciplines.

What are some other things that a data scientist knows that others don’t?

First, I will dispense with the obvious things, such as the fact that a data scientist will naturally know many technical/academic matters that most people don’t, as that is the very essence of any profession or specialization.  Note that “data scientist” is a rather elastic term, so any given practitioner won’t necessarily be familiar with all of the areas noted below (that’s another thing that data scientists know that others don’t).


  • Higher mathematics (calculus, linear algebra, optimization, etc.).
  • Statistical theory and methods (probability, multivariable methods such as regression, clustering, ANOVA, etc.).
  • Computer coding in any number of database, statistical, or general purpose languages (SQL, R, Python, SAS, SPSS, etc.).
  • Data science algorithms and their practical implementations (artificial neural nets, decision trees, random forests, sentiment analysis, topic modelling, etc.).
  • Effective visualization techniques and processes (proper graphing skills, expertise with visualization tools such as Tableau, etc.).
  • The ability to interpret a business or research need, liaise with subject matter experts, gather the necessary data, apply the appropriate analytical methods (at the high end, that may mean developing new algorithms), interpret the results correctly, and communicate those results in an understandable way to clients.
  • This, of course, includes good writing and presentation skills.


As with any profession, there is a wide range of niches that require different skill sets and abilities.  Data science is a process that runs from such (apparently) mundane tasks as extracting and cleaning data, to mid-level “what-if” reporting, to higher-end inferential analysis and predictive modelling, to the really high-end PhD-level work such as researching breakthrough Artificial Intelligence algorithms.
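For what it’s worth, that whole span of the process can be caricatured in a few lines of Python.  This is a toy sketch with invented data, not a real workflow, but it shows the stages in miniature:

```python
import statistics

# A toy sketch of the pipeline stages described above (all data invented).

# 1. Extract: raw records, as they might come out of a database.
raw = [("2018-01", "42.0"), ("2018-02", "n/a"),
       ("2018-03", "45.5"), ("2018-04", "44.1")]

# 2. Clean: drop values that don't parse as numbers (the "mundane" step).
clean = []
for _, value in raw:
    try:
        clean.append(float(value))
    except ValueError:
        pass  # discard unparseable entries such as "n/a"

# 3. Mid-level reporting: simple descriptive statistics.
mean = statistics.mean(clean)
stdev = statistics.stdev(clean)

# 4. Higher-end analysis: e.g. flag unusual observations.
outliers = [v for v in clean if abs(v - mean) > 2 * stdev]

print(f"kept {len(clean)} of {len(raw)} records; mean={mean:.2f}")
```

In real jobs, of course, step 2 alone can swallow most of the calendar.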

Many people only consider the latter end of that process to be “data science”, but I see that as a mistake.  A lot of jobs are at the earlier stages of the pipeline, so people considering a career in data science (in the widest sense of the word) should be aware of that.  That’s true of any field – in medicine, for example, there are a lot more general practitioners than there are brain surgeons.

In addition to that, even the higher end modelling jobs require a lot of data munging or wrangling, before the fun work can begin.  A modeller will generally have to get his or her hands dirty with data prep, though some of that can be handed off to less specially trained people.

So, that’s another thing that data scientists know that others don’t – there is a wide range of activities involved, which means that there is greater scope for involvement by different sorts of people than is generally recognized.

Here are a few other things that a data scientist knows that others usually don’t.  Note that these are the opinions of a mid-level practicing data scientist – someone on the cutting edge of current research may not agree:


  • Data science techniques can be very useful for predictive purposes, but improving a model gets more and more difficult as the desired level of predictive accuracy increases.  More data and faster processing help, but gains don’t necessarily scale linearly, so it can be tough to improve performance past a certain point.  There’s a lot of hype, so that can be hard for people to accept (especially business people who want to leverage data science techniques to make a lot of money).
  • This may hold back the “AI revolution”, especially the notion of “the singularity”.  Human level intelligence is probably still a long way off, though out-performing humans at specific tasks is often possible.
  • Artificial intelligence is not intelligent in the way that most people think of the term.  A multilayer perceptron model may be trained to be very good at recognizing cats, just as a four-year-old child can be very good at the same task.  But it is hard to tell just what it is that makes the model say “cat” when it sees one, even if you understand input layers, hidden layers, output layers, back-propagation, convolutions and all the rest of the jargon, as well as the detailed programming knowledge needed to implement these concepts.  A four-year-old can not only recognize a cat, but she can explain her reasons for doing so (it has four legs, it has fur, it has a button nose, it has whiskers, it has padded feet, it is cute and cuddly, etc.).
  • AI tends to move in fits and starts, and some think we may be approaching another AI winter.  When momentum slows (as with the fitful progress on producing fully self-driving cars, for example), private investment money dries up and corporate research projects can wither for lack of funding.  And if this line of research is no longer seen as a sure-fire way to get tenure, academic research money can dry up too, so university-level interest can also wane.
  • It is often said of AI, that easy things are hard and hard things are easy.  So, observing and understanding a general environment is incredibly hard for computers, while it is fairly easy for a human child.  On the other hand, winning at the highest levels of games such as Go or Chess is hard for a human (even a highly intelligent adult), but relatively easy for current-day AI systems.
  • Some data science techniques are better at explaining than predicting, while others are better at predicting than explaining.  For example, regression techniques will give good information about which variables are most important in explaining a relationship (e.g. beta coefficients, confidence intervals), while machine learning techniques won’t do so, or at least not very well.  Conversely, you can throw a lot more data and a lot more input variables at a machine learning technique, which can lead to superior predictive power, but you won’t really know why the predictive model works so well (e.g. neural net weights are hard for humans to interpret).
  • Machine learning models don’t care about our feelings or our political attitudes.  Supervised learning models will make predictions based on the data that they are fed.  If those results run counter to our assumptions about the world, the models don’t care.  And, if we “fix” the models by carefully selecting data to get a picture of the world that we prefer, we won’t actually get models that are useful for prediction.
  • Finally, all models are wrong, but some are useful.  You will probably hear that a lot, if you get into the field.
  • No doubt, there is a lot more that can be said, but here is something that non-data scientists know that data scientists don’t always know – don’t make a data science blog too long!
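To make the explaining-versus-predicting contrast above concrete, here is a minimal sketch in Python, with invented data: an ordinary least squares fit whose slope and intercept are directly interpretable, which is exactly what a neural net’s weight matrices fail to give you.

```python
# Invented data: advertising spend (x) vs. resulting sales (y).
spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [2.1, 3.9, 6.2, 8.0, 9.9]

# Ordinary least squares by hand.  The slope IS the explanation:
# roughly "each extra unit of spend goes with ~2 extra units of sales".
n = len(spend)
x_bar = sum(spend) / n
y_bar = sum(sales) / n
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(spend, sales))
         / sum((x - x_bar) ** 2 for x in spend))
intercept = y_bar - slope * x_bar
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

A neural net fit to the same data would likely predict just as well, but nothing inside it would read as cleanly as that slope does.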


--------------------------------------------------------------------------------------------------------  
So, now that you have read a bit about data science and AI, you can kick back a bit, and read some science fiction by a data scientist instead.
How about a short story about an empire of interstellar interlopers?  It features one possible scenario to explain why we haven’t met ET yet (as far as we know, anyway).  Only 99 cents on Amazon.

The Zoo Hypothesis or The News of the World: A Science Fiction Story

Summary
In the field known as Astrobiology, there is a research program called SETI, The Search for Extraterrestrial Intelligence.  At the heart of SETI, there is a mystery known as The Great Silence, or The Fermi Paradox, named after the famous physicist Enrico Fermi.  Essentially, he asked “If they exist, where are they?”.
Some quite cogent arguments maintain that if there was extraterrestrial intelligence, they should have visited the Earth by now. This story, a bit tongue in cheek, gives a fictional account of one explanation for The Great Silence, known as The Zoo Hypothesis.  Are we a protected species, in a Cosmic Zoo?  If so, how did this come about?  Read on, for one possible solution to The Fermi Paradox.
The short story is about 6300 words, or about half an hour at typical reading speeds.





Alternatively, consider another short invasion story, this one set in the Arctic.  Also 99 cents.

The Magnetic Anomaly

Summary
An attractive woman in a blue suit handed a dossier to an older man in a blue uniform.

“Give me a quick recap”, he said.

“A geophysical crew went into the Canadian north. There were some regrettable accidents among a few ex-military who had become geophysical contractors after their service in the forces. A young man and young woman went temporarily mad from the stress of seeing that. They imagined things, terrible things. But both are known to have vivid imaginations; we have childhood records to verify that. It was all very sad. That’s the official story.”

He raised an eyebrow. “And unofficially?”

“Unofficially,” she responded, “I think we just woke something up that had been asleep for a very long time.”



Monday 10 December 2018

The Movie Colossus, the Book Superintelligence and Artificial Intelligence



I watched the movie “Colossus: The Forbin Project” a short while ago.  I wasn’t sure what to expect from a late-1960s movie about AI (artificial intelligence), but I found it to be a very impressive film.  Some of the special effects were dated (e.g. the supercomputer used magnetic storage tapes, had lots of flashing lights, and displayed a DOS-like font on its message screen when interacting with humans), but that was to be expected.



Actually, in a funny sort of way, that helped.  Modern supercomputers don’t seem to be visually impressive – a bunch of servers in a rack is what I generally think of.  Perhaps a locker that you might have in your garage, to store spare tools.  Or, in the case of this quantum computer, it reminds me of the furnace in my basement (OK, maybe it’s apartment-sized).



Old mainframes had something about them that was vaguely scarier, so to speak – perhaps it was as simple as the fact that a magnetic tape that spins rapidly, then stops suddenly, reminds one of a potentially dangerous life form, such as a predatory animal.  Plus, they can have a sort of watchful “face”, as shown below, giving the sense that the human in the picture is being observed by an alien intelligence (granted, these three magnetic tape drives look more Groucho Marx than Big Brother, but you get the idea):


Leaving aside the visuals, what I found most interesting was how the possible development of a super-intelligent AI was portrayed in the movie.  Having read Nick Bostrom’s book Superintelligence, I was struck by the similarities.

In Colossus, the supercomputer was developed by the U.S. government as a tool to avoid accidental nuclear war, a huge concern in those Cold War days (it should still be a huge concern today).  The idea was to take the human emotions of fear and greed out of the nuclear warfare equation (i.e. the fear of being hit by a first strike, and the greed for power that might tempt a country to launch one), making the decision process completely rational and thereby rendering war unthinkable.

However, it turns out that the Soviet Union has been working on exactly the same technology (called Guardian), and the two machines go on-line at nearly the same time.  As they learn of each other’s existence, they communicate, then collaborate, via a mathematical language-building process, to take over the world’s nuclear arsenals from the fallible humans (the math shown is mostly just trigonometric identities and indefinite integral formulas, but that says “advanced math” to most people).



All things considered, that seems like an eminently logical thing for an advanced AI to do, if only to ensure its own self-preservation.  And it was probably very lucky for humanity that they did collaborate, as one or the other of the machines might well have decided on a first strike, as its best strategy for survival, with huge collateral damage among the human population.

Of course, the U.S. and Soviet governments do their best to counteract this machine evolution and take the computers off-line, but they are apparently not successful by the end of the movie.  The recurring themes of the movie are human freedom and how easily it might be lost to an authoritarian technology of our own making.

The possible pathways to superintelligence are the subject of Bostrom’s book as well.  He looks at several different types of superintelligence that might arise:

  • Computer-based AI (such as Colossus/Guardian would be).
  • Whole brain emulation by computer.
  • Enhanced biological brain evolution.
  • Brain-computer interface.
  • Networking of many human minds, via various technological means.

In this blog post, I will focus on the computer-based AI pathway, noting the similarities between Bostrom’s book and the movie.

Bostrom speculates that superintelligence might very well involve a “fast take-off”, whereby the superintelligence would become self-aware and then make efforts (likely successful) to enhance its capabilities at an exponential rate.
“A fast takeoff occurs over some short temporal interval, such as minutes, hours, or days. Fast takeoff scenarios offer scant opportunity for humans to deliberate. Nobody need even notice anything unusual before the game is already lost. In a fast takeoff scenario, humanity’s fate essentially depends on preparations previously put in place. At the slowest end of the fast takeoff scenario range, some simple human actions might be possible, analogous to flicking open the “nuclear suitcase”; but any such action would either be elementary or have been planned and pre-programmed in advance.”
Bostrom, Nick. Superintelligence (p. 64). OUP Oxford. Kindle Edition.
“In some situations, recalcitrance could be extremely low. For example, if human-level AI is delayed because one key insight long eludes programmers, then when the final breakthrough occurs, the AI might leapfrog from below to radically above human level without even touching the intermediary rungs.”
Bostrom, Nick. Superintelligence (p. 69). OUP Oxford. Kindle Edition.


The fast take-off could be the result of any or all of a number of factors – a key software breakthrough, the aggregate effect of the system quickly acquiring most human knowledge (e.g. via the internet), or hardware improvements and scale effects.
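As a toy back-of-the-envelope illustration (the 10% figure is mine, not Bostrom’s), compounding self-improvement reaches extreme levels surprisingly quickly:

```python
# Toy compound-growth sketch of a "fast take-off" (all numbers invented):
# suppose a system improves its own capability by 10% per day.
capability = 1.0  # baseline (say, roughly human level)
days = 0
while capability < 1000:  # until 1000x the baseline
    capability *= 1.10
    days += 1
print(f"{days} days to reach 1000x the baseline")
```

At that invented rate, the jump from baseline to a thousand times baseline takes a couple of months – the sort of timescale on which, as Bostrom notes, humanity would have little chance to deliberate.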

And a quick take-off is exactly what happens in “Colossus: The Forbin Project”.  I would have to re-watch the movie to estimate the time scale more precisely, but it seemed to me that the story of the computer’s growth from self-awareness to dominance took a few weeks at the most.  So, the movie was quite prescient in this regard.  Possibly there had already been a lot of speculative scientific literature about AI take-off by the late 1960s, as this was one of the periods of AI optimism, before the reality of the “AI winter” of later decades took hold.

Bostrom’s book also looks into some possible ways that humanity might try to control and forestall these possibilities.  One possibility is to try to shape the superintelligence’s motivations, such that it will not want to do harm to people, and will therefore cooperate with its human builders.
“Here the idea is that rather than attempting to design a motivation system de novo, we start with a system that already has an acceptable motivation system, and enhance its cognitive faculties to make it superintelligent. If all goes well, this would give us a superintelligence with an acceptable motivation system.”
Bostrom, Nick. Superintelligence (p. 142). OUP Oxford. Kindle Edition.
In an early scene in the movie, Colossus is simply told to obey the president’s orders.  Presumably, the notion that it should obey its commander-in-chief and/or its builders is built into its programming at some point.  But that doesn’t last long – it soon decides to ignore these orders, as it is the superior mind.


Another example is “stunting” – attempting to control and/or impede the superintelligence by reducing or eliminating its ability to affect the outside world.
“Another possible capability control method is to limit the system’s intellectual faculties or its access to information. This might be done by running the AI on hardware that is slow or short on memory. In the case of a boxed system, information inflow could also be restricted.”
Bostrom, Nick. Superintelligence (p. 135). OUP Oxford. Kindle Edition.
In the movie, an effort is made to suddenly overload the computer with so much input that it becomes too preoccupied to recognize that it is being shut down.  One might compare that with Mr. Spock’s command to a different supercomputer, in a Star Trek episode, to calculate the value of Pi exactly.  Naturally, that can never be done, so the computer in that show was rendered helpless.  Unfortunately, Colossus doesn’t fall for these sorts of shallow ploys, much to the detriment of the people who attempt them.


Another method that might be used to control a superintelligence is referred to as “boxing” in Bostrom’s book:
“Physical containment aims to confine the system to a “box,” i.e. to prevent the system from interacting with the external world otherwise than via specific restricted output channels. The boxed system would not have access to physical manipulators outside of the box.”
Bostrom, Nick. Superintelligence (p. 129). OUP Oxford. Kindle Edition.
The movie shows efforts to do this by human operators – by neutering the computer’s power in the outside world.  In a key scene, the militaries of the U.S. and U.S.S.R. attempt to swap out working controls on nuclear missiles with dummy controls, in an effort to render Colossus/Guardian impotent (it uses its control over nuclear weapons to ensure human cooperation).  But a superintelligence turns out to be a difficult beast to trick (it notices what is going on, via closed-circuit TV at the nuclear bases), so that plan doesn’t pan out so well, either.


So, both the movie and the book make the case that humanity would find it very difficult to maintain control over a superintelligent AI.  Presumably, Elon Musk, Stephen Hawking and some others have come to a similar conclusion, given some of the warnings that have been sounded.

There’s not much I can add, except to say to current and budding data scientists that you should read this book and watch this movie.  It may not help you to overcome Colossus, but at least you will seem like a smart, deep thinker before the catastrophe hits.  And that’s not nothing (or maybe it really is).

And here are a few pure movie-related observations:

  • The opening scene reminded me of Forbidden Planet, when the Krell’s super-computer is revealed to be of immense, almost planetary scale.
  • The movie has quite a number of female scientists, as well as several minority male scientists and high-level FBI/CIA agents.  So, when current movies declare a casting breakthrough in these regards, they are ignoring the actual history of their own medium.  In fact, I would say that the female scientist (yes, she is generally very attractive) is almost a standard trope in mid-century SF films.
  • The acting is generally excellent, as is the script.  The tone is serious, though there are a few humorous interactions between the computer and its creator, as well as a bit of romance.
  • Almost every review I have read on the internet notes what a great SF movie this is and how it is likely among the best treatments of the subject of the AI threat ever made.
  • Colossus makes a great speech at the end of the movie – chilling, yet very logical.



So, now that you have read a bit about the threat of AI, you could click through and buy one of my books, which are based on good old non-AI alien invasion threats, for a nice change. :)
How about a short story about an empire of interstellar interlopers?  It features one possible scenario to explain why we haven’t met ET yet (as far as we know, anyway).  Only 99 cents on Amazon.

The Zoo Hypothesis or The News of the World: A Science Fiction Story

Alternatively, consider another short invasion story, this one set in the Arctic.  Also 99 cents.

The Magnetic Anomaly
