Digital’s Hidden Corruption

On the title page of this section, drawing on original research into some of the mathematical principles that underpin the construction of digital algorithms and digital information systems, I have tried to draw attention to what that research suggests is a critical problem affecting the exchange of data across digital domains – a problem that has not been acknowledged or addressed by the AI designers and systems engineers who develop and promote those systems.

Digital information systems have been with us for a good while already; indeed many of us are young enough not to have experienced a time when digital technologies did not play a decisive role in the forms of our social and economic organisation. Hence, there is already considerable industrial and economic momentum in favour of the successful roll-out of those technologies, together with the appearance of a growing alliance between corporate and state power – one that depends for its essential stability upon that deployment continuing without significant interruption.

Nevertheless, those technologies remain by and large experimental in terms of their extended effects in practice, and while there have been important criticisms raised in response to some of those effects (for example by the 2019 AI Now Report¹), these have been necessarily reactive (i.e., as important critical responses to some of the negative unintended consequences of the digital experiment), and have not to my knowledge addressed themselves to the kind of foundational criticisms I have tried to draw attention to on this website. The particular focus of these foundational criticisms is an emphasis upon the specific problem of the logical inconsistency of data when exchanged across digital domains.

In The Limits of Rationality, I have tried to show why it is that the logic that informs the meaning and interpretation of data within its domain of origin should not be considered as freely transferable outside of that domain, and why the torrid exchange of data across domains without regard to that issue is a potential source of global confusion and disarray. It has nevertheless been a tacit assumption of information scientists that computational logic does somehow transcend the limits of digital domains, and that the meaning and representative value of digital information is somehow retained integrally within the data itself as it is liberally exchanged beyond its domain of origin.

That assumption is an error-in-principle. I have argued that in order for any digital data to retain its logical consistency it cannot be considered independently from the particular set of algorithmic rules under which it was derived; that those rules exhibit no universal applicability, and that all further uses of the data outside its domain of origin must be fully qualified in terms of those original rules; i.e., with respect to the original purposes and intents of the data. It has indeed been the purpose in part of the recent General Data Protection Regulation (‘GDPR’) to establish regulatory limits upon the reprocessing of subject data that arbitrarily exceeds its original intents and purposes.² However, the particular problem of logical inconsistency I have highlighted is not limited to ethical issues concerning the integrity of individuals’ personal data, but one that potentially impinges upon the logical consistency of all data universally.
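
As a loose illustration only (an analogy drawn from everyday data handling, not the mathematical argument made in The Limits of Rationality), the following Python sketch shows a value whose meaning depends entirely on the rule of its domain of origin. The domain names, the `DOMAIN_RULES` table, and the `QualifiedValue` class are hypothetical, introduced purely for this example.

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical record: a bare date string exported without its format rule.
raw = "03/04/2021"

# Two hypothetical domains, each reading dates under its own rule.
DOMAIN_RULES = {
    "uk_registry": "%d/%m/%Y",  # day first
    "us_billing": "%m/%d/%Y",   # month first
}


def interpret(value: str, domain: str) -> datetime:
    """Read a raw value under the rule of the named domain."""
    return datetime.strptime(value, DOMAIN_RULES[domain])


# The same characters yield two different dates, and neither parse fails.
uk = interpret(raw, "uk_registry")  # 3 April 2021
us = interpret(raw, "us_billing")   # 4 March 2021


@dataclass(frozen=True)
class QualifiedValue:
    """A value that carries the rule under which it was originally derived."""
    value: str
    origin_format: str

    def read(self) -> datetime:
        # Always interpret under the original rule, wherever the data travels.
        return datetime.strptime(self.value, self.origin_format)


qualified = QualifiedValue(raw, DOMAIN_RULES["uk_registry"])
```

Here `uk` and `us` disagree silently, whereas `qualified.read()` returns the originally intended date in any receiving domain, because the rule of origin travels with the data.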

We may have experienced a range of failings and irregularities in our use of digital technologies which, if the faults were not anticipated to lie within the data itself, we were accustomed to attribute to human or systemic errors in the management and processing of the data, or to weaknesses in the security of its storage. There is a sense in which it appears that errors ‘just happen’ due to an essential incompatibility between the technology itself and the ways in which we are accustomed to work with it. While I do not wish to claim that all of these problems might be attributable to the problem of logical inconsistency I have highlighted in these pages, in drawing attention to that particular issue as a problem inherent in data-sharing, but one that has yet to be openly acknowledged by the industry itself, there may now be less reassurance in the idea that any data problem might be eradicated by removing the factor of human error from it, or by simply throwing more resources at it.

Whether it is associated with the problem of inherent logical inconsistency or otherwise, it seems to me that all digital data is at least potentially redundant (out-of-date or simply incorrect) as soon as it is compiled. This is in the nature of data produced and stored digitally, as it is essentially static and resistant to change. How often have you come across personal or other data through the Internet which is incorrect in one or more essential details, but for which there seems to be no available means to amend it, nor any shared interest in maintaining its accuracy – the absence of which implies that the misinformation promises to persist thereafter indelibly? It is not too wild an extrapolation to project that the petty confusion and helplessness provoked in the researcher by such misinformation is only the microcosm of a related global data-disarray, one shared by literally billions of users privately tapping incredulous queries into their mendacious devices, and who are lucky if they can act upon fifty percent of what they find there.

While there are clearly variations in the reliability between different categories of data (according to the relative integrity of its sources), serious and unforeseen vulnerabilities arise from the sheer ubiquity of data and the expectations placed upon it, in terms of its ability to faithfully retain its ontological value. A particular weakness is the hidden tenuousness of the value attached to subject-provided data (which somehow assumes that individuals never knowingly or unknowingly provide incorrect information on forms). Whichever way you look at it, there is inevitable disagreement between the body of data (however its limits are conceived) and its reference points, which is generally unanticipated and whose scale cannot be estimated. This should be understood thermodynamically in terms of increasing entropy in the system (i.e., as a tendency towards increasing disorder in the system).

With the highlighted problem of logical inconsistency in mind, we should firstly consider: Is there any real ontological value in the sharing of any digital data outside its domain of origin? This must be the point at which the integrity of the data is first compromised and its vulnerabilities exposed – where its exchange-value outstrips its use-value (the latter conceived as the original representative value of the data – its capacity to speak accurately about the world). Exchange-value and use-value work quasi-independently, according to different logics, so that newly emerging exchange-value of data is calculated on the basis of a somewhat mythical (redundant) conception of its original use-value. Any new use of the data is both promiscuous and precarious, as it is too remote from that original use-value – the data no longer speaks accurately about the world, as it is being exploited for purposes incompatible with those it was intended for.

In light of these objections, we should anticipate that the much-vaunted recent expansion in tools for generative AI production, such as ChatGPT and its kind, which exploit access to vast resources of online data with characteristic disregard for the kind of logical inconsistencies I have highlighted, will significantly aggravate the problem of thermodynamic entropy in the aggregate of globally shared information and misinformation, exacerbating everyone’s difficulty in isolating one from the other. That should be understood as the real threat posed by naive and reckless expansion in AI technology, in place of the irritating warnings about a looming prospect of AI surpassing human intelligence, coming from those who, after all, are major stakeholders in the technology itself. Those warnings should be seen for what they are: insincere and indulgent science-fictional hyperbole, intended to disempower criticism and distract it from the fundamental weaknesses at the heart of digital technology.

Expert fallibility and technical inertia

There is a serious imbalance, generally speaking, between the level of reprocessing done to data and the work done in evaluating it; so that while data may enjoy unwarranted liquidity in the degree to which it is exchanged as a commodity, it nevertheless remains static and resistant to change. Computational systems are imbued with imaginary super-human capabilities, which promise to do all the work for us. A not entirely unintended consequence of rapid digital innovation has been the marginalisation of human engagement and concern in the granular management of all kinds of information, because digital technology frees us in varying degrees from the labour of that engagement. Unfortunately however, at the same time it encourages us to dispense with the methods and wisdom through which we had previously exercised such essential critical engagement and concern.

It is important to point out a factor which I’m sure every person with the least experience of digital encoding has felt, but the significance of which has not been fully appreciated by experts in the field – that there is a ‘top-heavy’ relationship between the degree of coding, testing, and hence debugging required to manage the distributed effects of deploying any particular digital procedure, and the limited practical needs intended to be served by that procedure. In practical terms, coding is always more complex than anticipated because of unexpected concomitant effects that follow down-the-line from instantiations of new code or changes to existing code. This imposes an unforgiving gradient of required effort and attention upon software engineers, and in turn produces a backlog of inertia and failure in information systems, so that at times end-users are led to raise the question: If it’s broke, why not fix it? As the effects of these failures are largely down-the-line and remote from their sources, their causes will frequently be imperceptible, which means that the expert’s honest response to that question, if it were ever openly given, is likely to be: I’m sorry but we don’t actually know how to. In place of that honest response, as there may be no perceivable link between the error and its cause, the expert may simply deny responsibility on behalf of software design and point to end-user human error instead.³

If the causes of failures in data processes tend to remain opaque to us, and also to those who design and manage those systems, the problem is already well out of control. Will those failures ever indeed be fully remediable, either on the basis of improvements in the technology itself, or in our methods of applying it? To answer in the affirmative is to express some underlying faith in the idea that information technology is essentially unmotivated, neutral, and impartial – that it is implicitly benign, and effectively at our service, if only we could learn how to design it or to manipulate it appropriately. This needs some unpacking. The belief is firstly quite oblivious to the prospect that, aside from the effects of any human input, digital processes might in themselves be inherently responsible for the generation of inconsistencies and failures in the systems they populate. That prospect may once have remained occult and easily dismissible. However, I have attempted on these pages to draw attention to a source of hidden corruption in digital information systems (the consequence of an historical oversight with regard to their founding mathematical principles), and thereby to expose the very real, unforeseen problem of inherent logical inconsistency in those systems. In light of that, such confidence is at least undermined.

New data tools tend to be marketed on the basis of their seductiveness as novel solutions to well-established problems. This strategy engenders a wide-eyed approach to problem-solving that is prepared to abandon established and proven methodologies in favour of revolutionary and unprecedented solutions – the prescriptive recklessness of “move fast and break things” – an approach which is insensitive to the prospect of the unforeseen deleterious consequences that tend to arise, with apparent inevitability, from the use of these novel technologies. It is a recklessness born of the idea that any use of technology (by virtue of the fact that it displaces human involvement) cannot in itself be the cause of error, because by the nature of technology it is unmotivated and impartial, and in that sense implicitly benign. The error must therefore result from the fact that we have employed the technology in some unrefined manner – that we are in effect infants in the use of this technology which is itself in its infancy. We have committed ourselves to a very steep learning-curve, abandoning previous wisdoms and skills, in exchange for a naive expectation that technology will ultimately provide some form of complete solution; while we remain resistant to the realisation that efforts at digital transformation tend to create as many new weird and intractable problems as those they offer to solve.

[continues]

4 April 2021
(revised: 27 February 2025)

Footnotes:

¹ The 2019 AI Now Report, produced by the AI Now Institute, New York University. This Report addresses a range of socially regressive effects that follow from the use of advanced AI technologies, particularly within the labour market with respect to the ‘gig economy’ and the use of zero-hours contracts – practices which depend upon the widespread divestment of employment rights from workers that have been described under the attribute of “techno-feudalism”; suggesting that the standard of rights enjoyed by gig-economy workers represents their subjection to a sort of motorised medieval serfdom. The Report is also concerned over regressive social consequences following the rapid expansion of public surveillance technologies, particularly in the area of facial-recognition systems, and their implications for individual privacy. From a majority feminine perspective, the report emphasises a tendency for AI technologies to create inherent algorithmic biases, typically entrenching existing patterns of inequality and discrimination, and resulting in the further consolidation of power amongst the already powerful, through the “private automation of public infrastructure”. CITATION: Kate Crawford, Roel Dobbe, Theodora Dryer, Genevieve Fried, Ben Green, Elizabeth Kaziunas, Amba Kak, Varoon Mathur, Erin McElroy, Andrea Nill Sánchez, Deborah Raji, Joy Lisi Rankin, Rashida Richardson, Jason Schultz, Sarah Myers West, and Meredith Whittaker; AI Now 2019 Report. New York: AI Now Institute, 2019: https://ainowinstitute.org/AI_Now_2019_Report.html.
² Article 5 §1(b) of GDPR states:
“Personal data shall be:
[…]
collected for specified, explicit and legitimate purposes and not further processed in a manner that is incompatible with those purposes; further processing for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes shall, in accordance with Article 89(1), not be considered to be incompatible with the initial purposes (‘purpose limitation’);”
It should be noted however that the exemption from regulation with regard to reprocessing of data for public interest and general research purposes suggests that the European Commission sees no potential inconsistency (‘incompatibility’) in the reprocessing of personal data for those purposes. Hence, the concern of the Regulation is purely for subjects’ rights of privacy regarding the integrity of their personal data under risk of exploitation by third parties (typically, unsolicited commercial interests). The Regulation is complacent and remains inattentive regarding the technical problem discussed here of the logical inconsistency of digital data made subject to algorithmic reprocessing for public interest or research purposes not envisaged in its origination.
³ A particularly poignant example of this problem – that the causes of software and systems failures tend to remain opaque to a large proportion not only of users, but also to those who manage those systems – is the catalogue of systemic errors experienced by sub-postmasters in the UK following the Post Office’s rolling-out of its Horizon branch accounting IT system, which began as a pilot scheme in 1996. The system was the cause of widespread shortfalls in the accounts being submitted by the company’s sub-post offices. These shortfalls were first reported in the year 2000. The Post Office had failed to investigate the problem initially, instead pursuing spurious allegations of false-accounting, fraud, and theft against as many as 900 sub-postmasters, 736 of whom were successfully prosecuted, with many being either jailed or bankrupted as a result. A team of forensic accountants, Second Sight, appointed by the Post Office in 2013, declared the Horizon system as “not fit for purpose”, and reported that it regularly failed to track certain specific forms of transaction. The Post Office, at that point already committed to private prosecutions against hundreds of innocent sub-postmasters, dismissed Second Sight’s critical report, and five senior Post Office executives declared, with spectacular arrogance: “We cannot conceive of there being failings in our Horizon system”. The Post Office has since generally relied upon confidentiality clauses as a means of deterring further enquiry and investigation into this monumental miscarriage of justice, and the false convictions of 736 sub-postmasters have only since 2021 begun to be overturned by the Court of Appeal, when 39 of the convictions were quashed, thanks in part to the work of the Justice For Subpostmasters Alliance (JFSA). See also Nick Wallis’ extensive blog on the case at: https://www.postofficetrial.com.