Securing AI systems layer by layer

02-12-2024

AI layers for building systems

We agree with ENISA that a generic label "AI" is not very useful to determine the risks and countermeasures, and that a more refined structure is required to address these risks. We propose a layered structure to facilitate pinpointing the risks for the major types of AI applications.

The trigger was the ENISA document discussing different AI types. When lining up several of the types from the ENISA document, and elaborating on them with general knowledge about AI systems, a layered structure emerged. The following items showcase parts of the structure.

  • Computer vision is only a subset of image recognition and image generation, where images can be stills or video.
  • Speech recognition and speech generation are linked. Natural language systems provide speech understanding, speech generation, and translation based on language models.
  • Expert systems (a term from the 90s) are a form of decision support systems, or even systems taking autonomous decisions.
  • Machine learning, and deep learning in particular, is the foundation of most recent AI systems.
  • Statistics as an explanation engine or oracle is widespread.
  • Multi-agent systems are linked to the IoT world, with various Things both providing input to an AI system and responding to output from that AI system. Input can be images or speech; output to these systems can be speech, images or directives.
  • Robots are integrated sets of Things. They build upon the image, speech and decision-support capabilities.


Layering AI systems

The diagram presents a layering of the AI space. 

The four layers are, from bottom to top:

  • The AI foundation layer

Contains the "learning" systems, with a prominent role for deep learning, and including statistics-based options

  • The core AI functions layer

Delivers the core use cases built on the results of deep learning

  • The combination layer

Combines core AI functions with each other to provide additional functions

  • The application layer

Integrates elements of AI into all sorts of applications

Each layer must understand the properties of the one underneath it, and accept or compensate for weaknesses that impact the properties of that layer.

The foundation layer

Machine learning variants

The base layer contains machine learning. This is not a new domain. A reduced list shows some of the alternative approaches, all based on solid mathematical grounds:

  • Supervised learning
      • Linear regression, logistic regression
      • Decision trees and random forests
      • Bayesian methods: naive Bayes and Bayesian networks
  • Unsupervised learning
      • K-means clustering and hierarchical clustering
      • Principal component analysis (PCA)
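
A minimal sketch of how a few of these classic approaches look in practice. It assumes Python with scikit-learn and uses synthetic data; both are our choice of illustration, not part of the original list:

    # Classic machine learning in a few lines, assuming scikit-learn is installed.
    # The dataset is synthetic; real use requires proper data and validation.
    import numpy as np
    from sklearn.linear_model import LogisticRegression   # supervised
    from sklearn.cluster import KMeans                    # unsupervised
    from sklearn.decomposition import PCA                 # dimensionality reduction

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 4))                # 200 samples, 4 features
    y = (X[:, 0] + X[:, 1] > 0).astype(int)      # a simple synthetic label

    clf = LogisticRegression().fit(X, y)         # supervised: learn labels from data
    print("training accuracy:", clf.score(X, y))

    clusters = KMeans(n_clusters=2, n_init=10).fit_predict(X)  # unsupervised grouping
    reduced = PCA(n_components=2).fit_transform(X)             # compress to 2 dimensions
    print(clusters[:10], reduced.shape)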

Other learning systems attempt to extract human-readable and understandable rules, properties, categories and the like from the raw data.

The dominant and rather successful variant, deep learning, has its roots in machine learning based on neural networks. Deep learning demonstrated a giant step forward, impacting all things built on top of it. Deep learning rose on the availability of monumental volumes of data, the technical capability to store and process that data, and the computing power required to achieve, first, self-configuration of models and, second, the use of those very large models for generating content.

Deep learning itself is not the main concern; its capabilities and limitations are known. Everything built on top of it must take these limitations into account. We must understand this technology: what it promises, what it delivers, and what is mere perception.

This angle is different from looking at the malicious use cases that can also be built on top of these systems, like deepfakes. In that case the objective is malicious. Malicious use cases are covered in another document.

Deep learning concerns

Machine learning systems, including deep learning systems, share some common properties:

  • GiGo

The main one is GiGo: garbage in, garbage out. Systems based on statistics will often recognize outliers in the inputs and discard them. While this is an old concern, it deserves more focus in the large models that result from deep learning on massive data.

  • Imperfection

Most machine learning systems make mistakes. In classic statistics-based machine learning systems, the theory can provide reliable measures for the level of assurance. Deep learning systems have unpredictable failure behavior: when and why they fail can come as a complete surprise.

Before building systems on top of these learning engines, one needs to understand both the GiGo and imperfection risks: is the negative impact on the upper layers acceptable, or can it be compensated for?
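
As a toy illustration of the GiGo point, here is a robust (median-based) outlier screen in Python; the 3.5 cutoff on the modified z-score is a common rule of thumb and our assumption, not a figure from this text:

    # Toy GiGo screening: a robust (median-based) outlier filter.
    # The 3.5 cutoff on the modified z-score is a rule of thumb, assumed here.
    import numpy as np

    data = np.array([9.8, 10.1, 10.0, 9.9, 57.0, 10.2])  # 57.0 is a garbage input
    med = np.median(data)
    mad = np.median(np.abs(data - med))                   # median absolute deviation
    robust_z = 0.6745 * np.abs(data - med) / mad          # modified z-score
    clean = data[robust_z < 3.5]                          # discard extreme outliers
    print(clean)                                          # the 57.0 sample is gone

Screens like this catch crude garbage; they do not catch subtly wrong data, which is the harder problem in massive training sets.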

The core functions layer

Deep learning is a generic method, disconnected from specific use cases. That independence is the root of the wide applicability of deep learning. On top of the models sit the major foundational AI functions. For any function, the two main use cases of AI are analysis and generation. The main domains are image (including video), speech, text, and decisions.

Generating images, speech and text

"Just" content generation is in principle the safest. Even if errors creep in, humans will be able to spot these and correct them. Especially speech and text are more safe options as there is not that much variation and good samples are abundant.

In text generation the so-called hallucination problem lurks: the text is based on the source data, but details like numbers, names and other elements may be "fabricated", indistinguishable from real data.

It is interesting to note that image generation, and specifically picturing people, is often considered less "correct" than the other two. The probable cause is that the level of detail humans observe about people is high. For other subjects, the laws of physics are ingrained in our perceptions.

A demonstrated danger with image generation is bias towards gender, age, race and religion. Cultural and geographical aspects may further influence details of the generated images and their sensitivity.

The training data may contain copyrighted material or items with intellectual property right restrictions. Any generated content may violate these restrictions. It does not have to be a perfect copy, just similar enough to the protected content.

Interpreting images and sounds

The interpretation of images and sounds remains a very interesting objective, while at the same time it is very challenging. The better the context is defined, the more likely the interpretation is accurate. Simple sorting robots can perform excellently. Computer vision for driverless trucks in closed environments works. Table tennis robots are quite an achievement. However, automated cars deployed in real traffic conditions remain hard.

Today, voice recognition is common, although many systems still need to be trained on the voice(s) of the user(s). Producing transcripts works, with errors; yet the productivity gain is significant. One just needs to keep in mind that errors will be made, and the more uncommon the domain of the conversation, the more likely errors become.

Such mistakes may take on a life of their own. Once in a while transcript or auto-complete errors are very funny. Not catching them shows sloppiness, or a sense of humor. If your last name becomes "bad demon" … If meeting minutes on access control contain "airbag" a lot, it may take a few seconds to recognize that RBAC was intended.

Great care is required when wrong interpretation of images or sound is unacceptable. It is complicated to get sufficient assurance, as deep learning models tend not to expose their weaknesses easily. Seemingly immaterial details may be crucial in practical use cases.

A desirable property for systems is graceful degradation: the impact of a mistake should be proportional to the exceptionality of the case. AI systems may lack this property.

Decisions: supporting or taking

Systems that influence or take decisions carry elevated risks because of their heavy impact: deciding on a loan, a mortgage, or the diagnosis for a patient, or launching a missile, is serious business. For such cases a decision support system is required: the actual decision must rest with a human expert. Unless …

The human expert should understand that modifying the proposed decision is to be done thoughtfully. Evidence suggests that overruling the AI does more damage than good on average (no emotions, no convictions, no ego involved in the AI decision).

In case time does not allow a human in the loop, the risk analysis needs to be very thorough, balancing the need for speed and the trust level in the decisions.

It is mandatory to check against the laws, regulations and ethical evaluations whether a system with autonomous decision-making is a proper, legal and authorized solution.

The combination layer

In this layer we combine different AI solutions into a foundation for specific problem categories.

There are many possible basic combinations. We list some of the common text and speech combinations.

Text-to-text:

  • chatbots: from prompt to answer, a well-known case
  • summaries: identify the major topics and generate a summary on that basis
  • translation: using a "world model" that is independent of the language is an old idea to support translation; the deep learning model could be considered such a model.

Speech-to-text:

  • Transcribing: texts of meetings or radio broadcasts can be created alongside the speech recording. This can assist the hearing-impaired as well.
  • Dictating: instead of recording speech and then producing a text by hand later, the AI produces a draft text immediately, while the content is still fresh in memory, and immediate correction is an option.

Text-to-speech:

  • "podcast" a text
  • create voice-over, also: aid for the visually-impaired
  • helping people with reduced vision capabilities

Speech-to-text-to-text-to-speech:

  • on-the-spot translation: the original speech in language 1 is converted into text in language 1, which is used to produce the text in language 2, which is then converted to speech in language 2.
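
A minimal sketch of that chain in Python. The three stage functions are hypothetical placeholders, not a real API; only the composition of the stages is the point:

    # Sketch of the speech-to-text-to-text-to-speech chain.
    # All stage functions are hypothetical stubs; plug in real models to use it.
    def speech_to_text(audio: bytes, lang: str) -> str:
        raise NotImplementedError("plug in a speech recognition model")

    def translate(text: str, source: str, target: str) -> str:
        raise NotImplementedError("plug in a translation model")

    def text_to_speech(text: str, lang: str) -> bytes:
        raise NotImplementedError("plug in a speech synthesis model")

    def interpret(audio: bytes, source: str, target: str) -> bytes:
        text_1 = speech_to_text(audio, source)      # speech (lang 1) -> text (lang 1)
        text_2 = translate(text_1, source, target)  # text (lang 1) -> text (lang 2)
        return text_to_speech(text_2, target)       # text (lang 2) -> speech (lang 2)

Note that each stage carries its own error rate, and errors compound along the chain.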

The application layer

This layer represents solutions for specific problems. It is a hybrid layer, combining AI technologies and classical IT technologies into a solution. The final application interface may expose the underlying AI technologies, or wrap them for stronger control and a simpler user experience.

Examples (fictitious)

Case 1: medical report from sound recording

Goal

The physician dictates his findings while examining the patient or when analyzing the medical images that were created. There is no forced order that must be followed, but in the end a standardized report is created with the data in the appropriate fields.

Approach

The verbal reporting is transcribed by an AI system. The transcription is enhanced by also training the AI on existing medical reports and terminology.

The data for the report is derived from the transcript using prompts that target the specific report fields.

The report is provided to the physician at the end of a session for validation, with facilitated access to the relevant transcript parts.
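
A minimal sketch of this two-stage pipeline, under the assumption that transcription and field extraction are separate AI calls; the function names and the field list are illustrative, not an existing system:

    # Sketch of the dictation-to-report pipeline described above.
    # Function names and REPORT_FIELDS are illustrative assumptions.
    REPORT_FIELDS = ["patient_id", "findings", "diagnosis"]   # hypothetical schema

    def transcribe(audio: bytes) -> str:
        raise NotImplementedError("speech-to-text model, tuned on medical vocabulary")

    def extract_field(transcript: str, field: str) -> str:
        raise NotImplementedError("prompt an LLM for one specific report field")

    def build_report(audio: bytes) -> dict:
        transcript = transcribe(audio)
        report = {f: extract_field(transcript, f) for f in REPORT_FIELDS}
        report["transcript"] = transcript   # keep the source for physician validation
        return report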

AI stack:

Analysis

  • Step 0

The first question when this idea is launched is whether this case is legal, and compatible with the sensitivity to error in the specific medical domain.

  • Step 1

As reliability is key, we need to check each step of the way. The first step is to check whether the transcript will be "correct". A standard speech-to-text system is not bad for general texts. It may fail due to the specific vocabulary used.

This is lesson 1: the general performance in a non-specialist situation is not a good indication of the performance in very specific cases. It may be necessary to train specifically for the vocabulary, increasing the cost.

The review of the transcript by a knowledgeable reviewer (ideally the one who performed the examination) changes the operation from automated to computer-assisted, which is advisable anyway for critical cases.

The transcript generation must be tested to establish the accuracy rating, and what is acceptable. This is similar to the false positive and false negative ratings of other technologies.
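
One standard way to quantify transcript accuracy is the word error rate: the edit distance between a reference and a hypothesis transcript, divided by the reference length. A self-contained sketch (the acceptability threshold remains a per-use-case decision):

    # Word error rate (WER) via word-level edit distance.
    def wer(reference: str, hypothesis: str) -> float:
        ref, hyp = reference.split(), hypothesis.split()
        # dp[i][j] = edits needed to turn ref[:i] into hyp[:j]
        dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
        for i in range(len(ref) + 1):
            dp[i][0] = i
        for j in range(len(hyp) + 1):
            dp[0][j] = j
        for i in range(1, len(ref) + 1):
            for j in range(1, len(hyp) + 1):
                cost = 0 if ref[i - 1] == hyp[j - 1] else 1
                dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                               dp[i][j - 1] + 1,         # insertion
                               dp[i - 1][j - 1] + cost)  # substitution
        return dp[len(ref)][len(hyp)] / max(len(ref), 1)

    print(wer("role based access control", "airbag access control"))  # 0.5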

In this case, tampering with the model or the data feeding the model is not likely.

  • Step 2

An AI will extract the report data from the transcript. This avoids expert human labor for a baseline job. The approach assumes that all relevant data can be picked up from the transcript. It must include a way to deal with non-standard remarks, as those may be vital.

The responsibility for the queries and their accuracy lies fully with the company creating them, not with any tool builder. It is easy, requires limited technical knowledge, can be adapted, etc. But mistakes by the AI may be hard to spot, as full test coverage is utopian, which undermines the trust obtained through testing. Wrong behavior is far from obvious to fix.

Case 2: copyright

Goal

A collaborator is asked to write a position document on AI in the utilities sector. He is happy, as it presents an opportunity to shine and to use his knowledge of ChatGPT, Copilot and Gemini.

Approach

Using AI tools, the collaborator quickly collects a quite good overview with simple prompts like "what are the key concerns and opportunities for using AI in the utilities sector?". The phrasing and the contents of the responses are a great start; only minor edits, a bit of reshuffling, and some formatting to fit the corporate template are required.

For the finishing touch, a few generated pictures are added to illustrate the key topics.

It is a success! The document is even promoted to the corporate website to convince customers that the company knows this stuff.

AI stack

A few legal letters later …

It turned out that large parts of the text could be matched to specific pages on the websites of big companies that had provided a high-quality assessment of the topic, with copyright notice. The resemblance was actually more than 70%. They were not amused.

Also, the "generated" images were barely modified variations of images offered for sale for commercial use. This may trigger an investigation by intellectual property agencies into all of the company brochures, handouts, …

Some people investigated some of the reported figures and found them questionable. While these were marginal figures, they stirred up some commotion in the industry.

Conclusion

AI has proven it can produce quality results when trained on quality resources. It is very tempting to take that free gift without further ado, with gratitude to the unknown sources.

The breakthroughs made by what is called AI are staggering. Twenty years ago chess computers were so-so and Go computers were nowhere; now they are very strong. Interacting with computers via natural language and images just works, and the responses are good enough for casual usage. The impression that "computers are always right", combined with this positive experience, misleads many people into hyperbolic projections for the near future. AI is a probabilistic endeavor with a non-zero, unquantified failure rate.

What makes AI adoption tricky for systems with an important function is the lack of graceful degradation: they can go wrong unpredictably, with erroneous results, missing key information, or the inclusion of fake data.

The other downside is that intellectual property rights violations are lurking everywhere. It is nice to say that information wants to be free; in reality, authors want reasonable compensation for their efforts. They are the ones who feed the AI systems with valuable content, so the rewards should go to them in the first place.

Economic reality may reduce creative and artistic production. AI routinely creates pictures or music in the style of any artist, while it often took the artist years to find the magic combination that "works". The law protected their interests; that protection is broken if a new "Madonna" takes one minute to produce, without royalties.

The challenge for organizations is to know and work within these limits while enjoying the efficiency gains that have become possible. Whenever you include an AI component, the reliability of the result must be questioned and quantified. Testing will increase confidence; it does not lead to certainty.

AI applications are layered, and each layer has specific security properties. It is necessary to reflect on the baseline, the AI training. Is relevant, specialized input included from reputable sources? Is there a possibility of malicious interference during training? How accurate is the model? Is it a third-party model or component? How stable are the results? Is it possible to maliciously change the model afterwards? Does the system provide insight into its operation, like explanations?

In the combination layer, each part has its own properties. The reliability of one does not imply the reliability of the others. When an application is built on top of AI, it may hide the models, the base layer and the combination layers. They are still there, impacting the application.

In IT we are used to working with the architectural "layers" model. Networking is the best-known example, typically with 7 layers. Each layer abstracts away the details of the underlying layers, so you can forget about them and work at a higher level, because you know they are solid. AI layers are not that solid.