
How the SP Theory of Intelligence May Help Solve Nine Problems with Big Data

I’ve spent several years developing a theory of intelligence, expressed in a computer model, which is designed to simplify and integrate ideas across AI and related fields. [1] The result is a system that demonstrates versatility in aspects of intelligence and versatility in the representation of diverse forms of knowledge. This versatility is due largely to part of the theory called SP-multiple-alignment, with the added benefit that it facilitates seamless integration of diverse aspects of intelligence and diverse forms of knowledge, in any combination.

Compression of information is central to the theory, which is why it is called “SP”: compression of information may be understood as Simplification of information by reducing repetition while, at the same time, retaining as much as possible of its expressive or descriptive Power. Because compression is central in the SP system, it was suggested that the system might help to reduce the size of big data. But it came as a surprise to find at least eight other ways in which the SP system might help with problems associated with big data. These nine potential benefits, which are described in a peer-reviewed paper,[2] are what this article is about.

Overcoming the problem of variety in big data

When we are trying to extract value from big data, the many different formats and formalisms for data are a headache, because any one method of analysis rarely works with more than one or two forms of data.

The SP system may help because of its versatility in aspects of intelligence and in the representation of knowledge. It could, with some more development, become a universal framework for the representation and processing of diverse kinds of knowledge (UFK).

With such a framework, there is potential to translate diverse forms of knowledge into the UFK form. But if the UFK becomes widely adopted, then big data may start life in the UFK form, thus eliminating the need for translation.

Learning and discovery

Big data offers the prospect of valuable insights from processes of learning and discovery, but owing to the problem of variety in big data and shortcomings in current methods of analysis, many of the potential benefits are yet to be realised. There is potential for the SP system to assist in this area, partly because of its potential to help solve the problem of variety in big data, and partly because unsupervised learning is an integral part of the system. Also, learning in the SP system is much faster than in, for example, deep learning.[3]

Interpretation of data

In a similar way, big data offers the prospect of valuable insights from various forms of analysis or interpretation of the data, but it is difficult to turn the potential into reality, mainly because of the jumble of different forms for knowledge.

As with learning and discovery, the SP system can be helpful in this area, partly via its potential to reduce the problem of variety in big data, and partly because of its strengths in other aspects of intelligence. These include pattern recognition, information retrieval, parsing and production of natural language, translation from one representation to another, several kinds of reasoning, planning and problem solving.

Velocity: analysis of streaming data

Many conventional techniques for the analysis of data have each been developed to process a finite batch of data and to deliver a finite result. But this does not work well with much of big data, where there is typically a more-or-less continuous stream of data that demands a correspondingly continuous process of analysis.

It is true that, up to now, in the process of developing the SP computer model, much testing has been done with relatively small batches. But the basic concept of the SP system envisages a more-or-less continuous input of data, with a correspondingly continuous process of analysis, in much the same way that people are constantly trying to interpret and learn from new sensory information, as it is received.
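The contrast between batch and continuous analysis can be illustrated with a small sketch. It is not part of the SP computer model; it is a generic incremental calculation (Welford's method for a running mean and variance), and the class name `RunningStats` and the sample readings are assumptions made purely for illustration. The point is only that each new item updates the result at once, without storing or reprocessing the whole stream.

```python
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Illustrative incremental statistics (Welford's method), not SP code."""
    count: int = 0
    mean: float = 0.0
    _m2: float = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        # Fold one new observation into the summary; no history is kept.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        # Sample variance; defined as 0.0 until two items have arrived.
        return self._m2 / (self.count - 1) if self.count > 1 else 0.0


stats = RunningStats()
for reading in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:  # stand-in for a stream
    stats.update(reading)
    print(f"n={stats.count} mean={stats.mean:.3f} variance={stats.variance:.3f}")
```

A result is available after every item, which is the property that a batch method lacks.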

Volume: making big data smaller

As indicated at the beginning of this article, the SP system has clear potential to yield benefits with big data by making it smaller. Reducing the size of any large body of data is likely to reduce problems in its storage and transmission.
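As a rough illustration of how reducing repetition makes data smaller, the following sketch uses Python's standard `zlib` module. This is a conventional lossless compressor, not the SP system, and the sample record is invented for the example; it simply shows that highly repetitive data can shrink dramatically while remaining fully recoverable.

```python
import zlib


def compression_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data, level=9)) / len(data)


# Invented, highly repetitive records of the kind found in logs or sensor feeds.
repetitive = b"sensor=42;status=OK;temp=21.5\n" * 1000

ratio = compression_ratio(repetitive)

# Lossless: decompressing gives back exactly the original bytes.
roundtrip_ok = zlib.decompress(zlib.compress(repetitive)) == repetitive

print(f"compressed to {ratio:.1%} of original size; lossless: {roundtrip_ok}")
```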

Additional economies in the transmission of data

There is potential for additional, and possibly very substantial, economies in the transmission of data through judicious separation of ‘encoding’ and ‘grammar’, a technique called ‘model-based coding’. When John Pierce first proposed this in 1961, it was not feasible. But it should fall within the scope of the SP system when it is more mature, at least in a simplified form.
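The general idea of separating ‘grammar’ from ‘encoding’ can be sketched in a toy form. The example below is an assumption-laden illustration, not Pierce's proposal or the SP system's method: the `GRAMMAR` table, the messages, and the `encode` and `decode` functions are all hypothetical. Sender and receiver are assumed to hold the same grammar already, so only short codes need to travel.

```python
# Hypothetical shared "grammar": patterns that both sender and receiver
# are assumed to hold already, each identified by a small integer code.
GRAMMAR = {
    0: "temperature reading normal",
    1: "temperature reading high",
    2: "door opened",
}
PATTERN_TO_CODE = {pattern: code for code, pattern in GRAMMAR.items()}


def encode(messages: list[str]) -> bytes:
    """Sender side: replace each known pattern with its one-byte code."""
    return bytes(PATTERN_TO_CODE[m] for m in messages)


def decode(payload: bytes) -> list[str]:
    """Receiver side: expand each code using the shared grammar."""
    return [GRAMMAR[code] for code in payload]


messages = [
    "temperature reading normal",
    "temperature reading normal",
    "door opened",
    "temperature reading high",
]
payload = encode(messages)
raw_size = sum(len(m.encode("utf-8")) for m in messages)

print(f"raw text: {raw_size} bytes, transmitted encoding: {len(payload)} bytes")
print(decode(payload) == messages)
```

The saving depends on the grammar being transmitted rarely (or once) while encodings are transmitted often; this toy version fails on any message that is not in the grammar.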

Energy, speed, and bulk

With the SP system, there is potential for big cuts in the use of energy in computing, for greater speed of processing with a given computational resource, and for corresponding reductions in the size and weight of computers.

Veracity: managing errors and uncertainties in data

With the SP system, there is clear potential to identify possible errors or uncertainties in data, to suggest possible corrections or interpolations, and to calculate associated probabilities.
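To make the idea of suggesting corrections with associated probabilities more concrete, here is a deliberately crude sketch. It does not reproduce how the SP system works: it uses string similarity from Python's standard `difflib` weighted by how often each known value has been seen, and then normalises the scores so they sum to one. The function `suggest_corrections`, the threshold, and the place-name counts are all assumptions for illustration.

```python
from collections import Counter
from difflib import SequenceMatcher


def suggest_corrections(value, known_counts, min_similarity=0.5):
    """Rank known values as possible corrections for a suspect value.

    Each candidate's score is similarity * frequency; scores are then
    normalised into rough, heuristic probabilities (an assumption of
    this sketch, not a calibrated model).
    """
    scores = {}
    for candidate, count in known_counts.items():
        similarity = SequenceMatcher(None, value, candidate).ratio()
        if similarity >= min_similarity:
            scores[candidate] = similarity * count
    total = sum(scores.values())
    ranked = ((candidate, score / total) for candidate, score in scores.items())
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Invented frequencies of values already seen in a data set.
known = Counter({"London": 50, "Londonderry": 5, "Leeds": 20})

suggestions = suggest_corrections("Lodnon", known)
for candidate, probability in suggestions:
    print(f"{candidate}: {probability:.2f}")
```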


Visualisation

A major potential benefit of the SP system, which contrasts with some other systems, especially ‘deep learning’, is that knowledge structures created by the system, and inferential processes in the system, are all transparent and open to inspection. They lend themselves to display with static and moving images.


[1] Further information about the SP system may be found in “Introduction to the SP theory of intelligence” (PDF).

[2] The paper is “Big data and the SP theory of intelligence” (IEEE Access, 2, 301-315, 2014).

[3] See Sections V-D and V-E in “The SP theory of intelligence: distinctive features and advantages” (IEEE Access, 4, 216-246, 2016).

About the Author

Dr Gerry Wolff PhD CEng MIEEE MBCS is Director of Cognition Research. Previously, he held academic posts in the School of Informatics, University of Wales, Bangor, the Department of Psychology, University of Dundee, and the University Hospital of Wales, Cardiff. His first degree at Cambridge University was in Natural Sciences (specialising in Experimental Psychology) and his PhD at the University of Wales, Cardiff, was in the area of Cognitive Science. Up to 1987, his main research interests were in developing computer models of language learning. Between 1987 and 2005 his research focused on the development of the SP theory. Between early 2006 and late 2012, he was engaged full time on environmental campaigning (climate change) but is now concentrating on the development of the SP system, and raising awareness of the SP research. Dr Wolff has numerous publications in a wide range of journals, collected papers and conference proceedings.






