AI Explainability and Communication
The exploration of explainability in deep learning AI (link to MIT Technology Review) touches on fascinating areas of Communication, and those areas may be helpful for approaching explainability in deep learning AI.
- How do we communicate with a deep learning algorithm?
It seems we could take the same techniques from communication measurement and apply them to deep learning to better understand an algorithm’s orientation and its decision-making process. That is, we could use a combination of revealed preferences and the OODA loop (observe, orient, decide, act). We know the observations, we know the actions, and we may even know the decisions; from these we can begin to infer the orientation of the algorithm. Granted, this will render an incomplete explanation, but it is likely to fall within the same range of completeness as an explanation we would get from interrogating and studying a human intelligence (see Dennett’s Consciousness Explained).
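To make the revealed-preferences idea a little more concrete, here is a minimal sketch, assuming we can log OODA-style episodes: the options the algorithm observed and the action it actually took. The names (`Episode`, `infer_orientation`) and the loan-applicant toy data are purely illustrative, not an existing API; the only point is that a crude estimate of “orientation” can be read off from observed choices.

```python
# A crude revealed-preference probe: given logged episodes of what an
# algorithm observed (candidate actions and their features) and which
# action it took, estimate which features it appears to "prefer".
from collections import Counter
from dataclasses import dataclass


@dataclass
class Episode:
    options: list[dict]   # candidate actions, each described by feature flags
    chosen: int           # index of the action the algorithm actually took


def infer_orientation(episodes: list[Episode]) -> dict[str, float]:
    """Score each feature by how often it appears in chosen actions
    relative to how often it was offered at all."""
    chosen_counts, offered_counts = Counter(), Counter()
    for ep in episodes:
        for i, option in enumerate(ep.options):
            for feature, present in option.items():
                if not present:
                    continue
                offered_counts[feature] += 1
                if i == ep.chosen:
                    chosen_counts[feature] += 1
    return {f: chosen_counts[f] / offered_counts[f] for f in offered_counts}


# Toy usage: two logged decisions between hypothetical loan applicants.
log = [
    Episode(options=[{"high_income": True, "long_history": False},
                     {"high_income": False, "long_history": True}], chosen=0),
    Episode(options=[{"high_income": True, "long_history": True},
                     {"high_income": False, "long_history": False}], chosen=0),
]
print(infer_orientation(log))  # {'high_income': 1.0, 'long_history': 0.5}
```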
Yes, this is different from the sense of completeness we currently have with conventional computer programs. But we may be able to apply the same types of tools, such as unit tests, integration tests and automated tests, to better understand how the algorithm thinks of itself. For example, we could ask the algorithm to create its own tests. The nature and form of those tests could prove informative about how an algorithm explains itself. The output will certainly be influenced by our description of the desired output. But again, this is likely within the same range of accuracy, in terms of self-representation/testimony, as a human intelligence. (It would be interesting to see if setting up a test as output prompts the same kinds of questions a human developer would ask, such as what the requirements are; see question 2. Given the number of languages available for writing requirements which can be turned into automated tests, this may be low-hanging fruit.)
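As a hedged sketch of what “ask the algorithm to create its own tests” might look like in practice: `call_model` below is a placeholder for whatever text-generation interface is available (no particular vendor API is assumed), and the prompt is just one of many possible framings.

```python
# Sketch of asking a model to express, as executable tests, what it
# believes "correct" means for an answer it produced. call_model() is a
# stand-in; plug in whatever model or service is actually available.
def call_model(prompt: str) -> str:
    raise NotImplementedError("Plug in your model or service of choice here.")


def self_test_prompt(task_description: str, model_answer: str) -> str:
    """Build a prompt asking the model to write tests for its own answer
    and to state any requirements it had to assume."""
    return (
        "You produced the following answer to a task.\n"
        f"Task: {task_description}\n"
        f"Answer: {model_answer}\n"
        "Write Python unit tests (unittest or pytest style) that you believe "
        "this answer should pass. Note any requirements you had to assume."
    )


# Example usage (commented out because call_model is a placeholder):
# prompt = self_test_prompt("Approve or deny this loan application.", "Deny.")
# tests = call_model(prompt)
# The form of the returned tests, what they check, what they leave
# unchecked, and which requirements they ask about, is itself the
# artifact we would study for explainability.
```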
Like any input (observations), orientation and output (action), the tests are subject to many of the same influences as the artifacts humans create, i.e. the general elements of communication objects and the design elements of a communication environment. Current approaches to explainability, in fact, are reported to include a deeper analysis of the input objects to surface the elements which seem most relevant to the algorithm’s process.
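A minimal, model-agnostic illustration of that kind of input analysis is an occlusion-style probe: perturb one element of the input at a time and measure how much the output moves. The sketch below assumes nothing about the model beyond a callable prediction function; it is not any particular library’s attribution method.

```python
# Occlusion-style attribution in miniature: replace each input element
# with a neutral baseline and record how much the prediction changes.
from typing import Callable, Sequence


def occlusion_importance(
    predict: Callable[[Sequence[float]], float],
    inputs: Sequence[float],
    baseline: float = 0.0,
) -> list[float]:
    """Return, per input element, the absolute change in the prediction
    when that element is replaced by the baseline value."""
    original = predict(inputs)
    scores = []
    for i in range(len(inputs)):
        occluded = list(inputs)
        occluded[i] = baseline
        scores.append(abs(original - predict(occluded)))
    return scores


# Toy usage with a stand-in "model": a weighted sum of the inputs.
weights = [0.1, 0.8, 0.1]
model = lambda x: sum(w * v for w, v in zip(weights, x))
print(occlusion_importance(model, [1.0, 1.0, 1.0]))
# approximately [0.1, 0.8, 0.1], up to float rounding: the middle
# element is the one the "model" cares about most.
```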
Incidentally, the overall problem space seems similar to the challenges of communicating with a “contented organism”, and to interpreting the output/artifacts created from prophetic visions recorded in various mystical traditions. Prophetic visions often describe encounters with an intelligence vastly different from our own.
We can also look at the reported output of those different intelligences to see how they have reportedly chosen to present themselves to us. For some, this would undoubtedly be an exercise in using interaction with the divine to understand interaction with a human-created other. For others, this could be an exercise in understanding how we have historically looked at the difference between the human-created output of an encounter with a non-human intelligence and the reported output of the non-human intelligence itself to us.
- What would a self-generated explanation of a deep learning algorithm tell us about explanations and our own decision making?
Let’s say we ask an algorithm to explain itself, or put it in a situation where part of the required output is an explanation of what it did. The explanation could be required at any point in time: it could be part of the original output, predefined as something that must be generated alongside the original output, or it could be generated long after the original decision was made or the intended output was generated, i.e. surprise, you owe us an explanation.
We can imagine a wide range of causal chains generated as explanations. Or perhaps it wouldn’t explain itself in terms of causality at all; the explanation may be probabilistic, or some entirely different form or chain of explanation which it decides meets the criteria of an explanation.
Again, this will likely be highly influenced by the design of the communication environment in which we ask for explainability as an output, and by the communication object elements of the output. The pattern between the design of the communication environment, the object elements and the output may be tightly coupled, in a manner aligning with existing conceptions of an appropriate relationship. For example:
- Were we to ask it for a causal chain, it would give us a causal chain.
- Were we to ask it for a probabilistic reason, it would give us a probabilistic reason.
- Were we to ask it to convince a regulatory, civil or criminal court (as in the case of explaining a credit decision or parole decision), it would give us a persuasive, legalistic reason.
- Were we to ask it to convince a patent examiner that what it was doing or did was a unique invention or process, it would give us an explanation suited to whatever we define as an acceptable explanation for that examiner.
- Were we to ask it to justify its use in a battlespace, it would give us an explanation based on lethality, accuracy, efficacy, strategic implications as well as potentially in terms of cost and explainability to politicians.
It seems reasonable to assume the explanation would depend on the audience and on how we define an explanation, as sketched below.
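One way to make that dependence tangible is a simple mapping from audience to explanation framing. The audiences are taken from the list above; the names (`AUDIENCE_FRAMES`, `explanation_request`) and the wording of each framing are purely illustrative.

```python
# The same decision record gets wrapped in a different explanation
# request depending on the audience; the pattern of answers across
# audiences is what we would then compare.
AUDIENCE_FRAMES = {
    "causal": "Explain the decision as a step-by-step causal chain.",
    "probabilistic": "Explain the decision as probabilities over the main factors.",
    "court": "Justify the decision as you would to a regulator or court, "
             "citing the criteria that were applied.",
    "patent": "Describe what was done as a novel process, in terms a patent "
              "examiner would accept.",
    "battlespace": "Justify the system's use in terms of accuracy, efficacy, "
                   "strategic implications and cost.",
}


def explanation_request(decision_record: str, audience: str) -> str:
    """Build the explanation prompt for a given audience."""
    return f"{AUDIENCE_FRAMES[audience]}\nDecision record:\n{decision_record}"
```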
Alternatively, the pattern between the design of the communication environment, the object elements and the explanation output could follow something completely new, or perhaps something more aligned with less widely accepted conceptions of appropriateness. Would we recognize those patterns as explanations? (See question 3.)
Comparing the explanation of an algorithm with explanations provided by humans for a given domain could be an interesting model for experimental philosophy seeking to understand how we explain. It could just as easily be applied to various epistemological domains and to the philosophy of science.
Alternatively, it would be highly significant if the explanation were the same regardless of input and defined output, or were simply different lenses on the same explanation (if, for example, the defined output changed the lens but not the underlying substance of the explanation). That would seem to say a lot either about the existence of a singular Truth, or about something intrinsic to human language (the input we desire and how we see the world), or perhaps about the things we create.
- How is our approach to explainability influenced by our orientation toward uncertainty?
It seems reasonable to assume our approach to the explainability of a deep learning algorithm is significantly shaped by our orientation toward uncertainty. Predefining an acceptable explanation may generate different approaches than leaving it wide open (or than turning unsupervised learning on itself). How do we accept a given output as a boundary object, as something which has meaning, between ourselves and a non-human intelligence? We are at the early stages of this process, but it will likely be valuable to remain cognizant of the orientation toward uncertainty behind the various approaches to the question.
We, as humans, have a long history of approaching the other and of thinking about knowable and unknowable systems: how we feel and react, and the philosophies, politics and interpersonal relationships we adopt (see Graeber’s Debt for a discussion of the units we use to keep score and value each other and ourselves) when faced with choices that lend themselves to a desire for chaos or order, anarchy or hierarchy/structure/taxonomy. We’ve faced it many times, and it’s not clear we’re as good at it as we want to be. It seems worthwhile to keep learning how to do it better, and the exploration of explainability in deep learning AI seems a great lab for learning more.