By Siddharth Pai

The word “algorithm” made its way out of the lexicons of mathematicians and computer scientists and into popular use a few years ago. I see the word being used freely by the laity today, but I sometimes wonder whether those using it have a sufficient understanding of what it means.

At its heart, an algorithm is simple. It is a rule used to automate how a piece of data is handled. In other words, it forms the basis of the classical if/then/else logic used in computing: if “z” happens, do “x”; else, do “y”. At the base level, a computer program is just an agglomeration of many such algorithms, strung together in a logical sequence, that produces a certain result from the data being manipulated. These are simple operations on data sets. Computers seem all-powerful not because they are intelligent, but simply because they can perform such automated algorithmic functions at much greater speed than the human mind.
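To make that idea concrete, here is a minimal sketch in Python of one such if/then/else rule. The scenario, names and values are purely illustrative assumptions, not drawn from the column.

```python
# A hypothetical if/then/else rule: decide what to do with a support ticket
# based on one piece of data (its priority).
def route_ticket(priority: str) -> str:
    if priority == "urgent":                  # if "z" happens...
        return "page the on-call engineer"    # ...do "x"
    else:                                     # else...
        return "add to the weekly queue"      # ...do "y"

# A program is just many such rules strung together in a logical sequence.
for p in ("urgent", "routine"):
    print(p, "->", route_ticket(p))
```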

Over the last few years, however, the word “algorithm” has taken on new meaning. We speak of workers at gig-work companies such as Uber and Zomato as being “managed by an algorithm”. Google’s algorithms allow for targeted advertising. Facebook serves up content based on algorithms that attempt to predict what the viewer would like to see. Medico-radiological images are examined, and diagnoses made, based on such algorithms. Generative Artificial Intelligence (AI)’s prediction methodologies rest on such algorithms. Entire sets of start-ups are funded solely on their claimed ability to “beat” the algorithms of large engines such as YouTube or Instagram.

As the word has passed into the common lexicon, the understanding of its simple meaning has changed. We think that the firms that create such algorithms also have the ability to change them. Witness the criticism of Google Gemini’s image-generation capabilities last month, which led Google co-founder Sergey Brin to confess that the firm had “definitely messed up”. But the genie was out of the bottle. Gemini showed Popes and American “Founding Fathers” who were black, and Chinese women and other people of colour in World War II German Wehrmacht uniforms. Other parts of Gemini, which handled text responses, claimed they couldn’t differentiate between the harm caused by Elon Musk’s memes and that caused by Adolf Hitler. So much for useful generative AI.

When we criticise the results of such new-age algorithms, we assume that if a Google or a Facebook created an algorithm, it stands to reason that they can fix it. This is true insofar as overall responsibility for the results of algorithms is concerned, but we miss an important point. What we don’t understand in the world of AI is that while a firm may set off an initial algorithm, the algorithm then learns and changes itself.

While their owners may periodically interfere with their functioning, as Google plainly did with Gemini so that it could appear “politically correct”, they cannot completely control an algorithm that is off and running, calling on various databases, operating systems and so on to “better” itself. The functioning of these AI algorithms becomes a “black-box”.

Little wonder, then, that academics and practitioners alike are working hard on methods to audit, and thereby potentially modify, these black-boxes. But as I have pointed out above, and as Google has demonstrated, such interval-based audits may not be sufficient to rein in their ability to go off by themselves and produce unacceptable results.

In a recent exploration of the effectiveness of AI audits, a team led by Stephen Casper, with collaborators from institutions such as MIT and Harvard, highlighted the limitations of black-box audits and advocated more comprehensive access models to ensure rigorous evaluation of AI systems (bit.ly/3wScqTT). Black-box audits, in which auditors can interact with the system only through its inputs and outputs, severely limit the ability to uncover deep-seated issues within AI systems, according to them. The team argues for white-box access (complete access to the system’s inner workings) and outside-the-box access (information about the system’s development and deployment) for more effective scrutiny. They stress that without these broader access levels, evaluations remain superficial.

Cathy O’Neil, a well-known critic of opaque algorithmic systems, disagrees with Casper et al (bit.ly/48QLhhd). While she acknowledges that she makes money from conducting such audits, she takes umbrage at the five main points the academics make about them, and rebuts each one with a counterpoint, each instructive in its own way. The first point she takes on is that black-boxes are not suited to a generalised understanding; according to her, this misses the point: regardless of its innards, how an algorithm treats people is always patent, and this provides a sufficient basis for inspection. She then turns to Casper et al’s assertion that black-boxes can produce misleading results (as we have plainly seen with Gemini). Her argument is that while this may be true, it is certainly possible to design a test (or, in fact, a battery of tests) to check whether the black-box’s outputs consistently pass discrimination tests. In other words, the logic inside the black-box is irrelevant as long as it consistently produces acceptable results.
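To illustrate the kind of output-only check O’Neil has in mind, here is a hedged sketch: the black_box stand-in, the sample data and the 5% tolerance are assumptions made for this example, not anything described by her or by Casper et al.

```python
# Output-only audit sketch: we never look inside the model, we only compare
# how its decisions fall across two groups of applicants.

def black_box(applicant: dict) -> bool:
    """Stand-in for an opaque system we can only query; a real audit would call its API."""
    return applicant["score"] >= 600

def approval_rate(applicants) -> float:
    decisions = [black_box(a) for a in applicants]
    return sum(decisions) / len(decisions)

def passes_discrimination_test(group_a, group_b, tolerance=0.05) -> bool:
    """Flag the system if approval rates differ between groups by more than the tolerance."""
    return abs(approval_rate(group_a) - approval_rate(group_b)) <= tolerance

# Illustrative, synthetic audit data for two demographic groups.
group_a = [{"score": s} for s in (580, 610, 650, 700)]
group_b = [{"score": s} for s in (560, 590, 620, 640)]

print("Passes discrimination test:", passes_discrimination_test(group_a, group_b))
```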

The implication is clear—Gemini’s algorithmic results were simply not subjected to rigorous testing and audit before they were released by Google. And the culpability for the results still lies with the firm, as Brin has accepted. Let’s see what’s next.

The author is Siddharth Pai, technology consultant and venture capitalist. Views are personal.
