Might enabling computational aids to “self-correct” when they’re out of sync with people be a path toward their exhibition of recognizably intelligent behavior? In episode 46, Neera Jain from Purdue University discusses her experiments into monitoring our trust in AI’s abilities so as to drive us more safely, care for our grandparents, and do work that’s just too dangerous for humans. Her article “Computational Modeling of the Dynamics of Human Trust During Human–Machine Interactions” was published on October 23, 2018 in IEEE Transactions on Human-Machine Systems and was co-authored with Wan-Lin Hu, Kumar Akash, and Tahira Reid.
Websites and other resources
- “The robot trust tightrope”
- The Jain Lab
- REID Lab
- “A Classification Model for Sensing Human Trust in Machines Using EEG and GSR”
Patrons of Parsing Science gain exclusive access to bonus clips from all our episodes and can also download mp3s of every individual episode.
Hosts / Producers
Ryan Watkins & Doug Leigh
How to Cite
Watkins, R., Leigh, D., & Jain, N. (2019, April 3). Parsing Science – Trusting Our Machines. figshare. https://doi.org/10.6084/m9.figshare.7955777
What’s The Angle? by Shane Ivers
Neera Jain: If humans don’t trust, and in turn are unwilling to use or interact with different types of automation, then there’s no point in us designing them to begin with.
Doug Leigh: This is Parsing Science, the unpublished stories behind the world’s most compelling science as told by the researchers themselves. I’m Doug Leigh…
Ryan Watkins: And I’m Ryan Watkins. Developed by Alan Turing in 1950, the Turing test is an assessment of a machine’s ability to exhibit intelligent behaviors that are indistinguishable from those of a human. Turing predicted that machines would be able to reliably pass such a test by the year 2000. While things haven’t quite come that far, might enabling machines to self-correct when they’re out of sync with people be a path toward this goal? Today, in episode 46 of Parsing Science, we talk with Neera Jain from Purdue’s School of Mechanical Engineering about her research into monitoring people’s trust in machines’ abilities to drive us autonomously, care for our grandparents, and do work that is just too dangerous for humans. Here’s Neera Jain…
Jain: So my name is Neera, and when I took physics in high school as a junior it was eye-opening, and I absolutely loved it. I loved that it involved a lot of math but also involved getting answers to a lot of questions about how things work, and I thought that that was extremely satisfying. So I got really excited and then of course applied to universities that had strong engineering programs, and my choice of mechanical engineering was largely driven by having really loved physics, and specifically within physics having loved one part of it called mechanics. And so I thought, why don’t I just try my hand at mechanical engineering. So that was really my trajectory getting into the field. Graduate school was something I wanted to do, and I subsequently spent several more years earning a master’s and a PhD in a subarea of mechanical engineering called dynamical systems and control, and it was during that process that I realized that I actually wanted to pursue a career in academia, and that’s why I’m here now.
Leigh: A first step in creating intelligent machines that are capable of building and maintaining trust with humans is developing a way for them to be aware of our mental, physical, and behavioral states, as Neera describes next.
Jain: What we were looking at was how do we really estimate human trust. There’s no way to ever say that you can truly measure it, and so we basically designed human subject experiments in which we collected a lot of this psychophysiological data and used different types of machine learning algorithms, specifically ones called classification algorithms, where we said, well, can we, using all this data, classify whether a human is trusting or not trusting. And really what we were able to mathematically compute is the likelihood that they will trust the machine, or in this case sort of a decision aid, an intelligent decision aid. And so that paper, and we’re still continuing to do that kind of work as well, was exploring whether or not we could use sensors like that to help a machine understand or get some estimate of whether or not a human is, at the very least, you know, in sort of a trusting bin, if you want to think of it that way, or a not trusting bin. And kind of an analogous way to think about that is that when two humans are interacting with one another, we’re interpreting the language, the body language of the other person, maybe even the tone of voice with which they’re speaking, and we’re using all of that to interpret whether or not they are happy, comfortable, stressed, whether or not they’re trusting, and any number of characteristics of that human at that given time. And if we want humans and machines to be able to work together the same way that human teams work, then we need to give machines the ability to do something like that.
And so that’s really where this whole idea comes from of saying, well, one, we want to identify sensors that we could use, a combination of psychophysiological data and maybe behavioral data, that a machine can use to interpret whether or not a human is trusting, and based on that they can adjust their quote unquote actions, which might really be how they’re communicating with that human through some user interface or something like that. And we’ve since done subsequent work where we have in fact closed the loop, so we have designed an algorithm in which a machine is able to collect data about the behavior of the human, make sense of it, and then actually change what is called the transparency of its user interface. So it’ll actually show either different or more information to the human to help them make a better decision or to understand whether or not they should or shouldn’t trust the advice, essentially, of like an autonomous decision aid system.
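The loop Jain describes, estimating a likelihood of trust and then adjusting how much the interface shows, can be illustrated with a toy sketch. This is not the team's algorithm: the logistic scoring, the feature weights, the thresholds, and the rule that lower estimated trust leads to more information being shown are all assumptions made here for illustration.

```python
import math


def trust_probability(features, weights, bias):
    """Map sensor-derived features to a probability of trust (logistic model).

    The features, weights, and bias are hypothetical placeholders; a real
    classifier would be fitted to labeled experimental data.
    """
    score = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-score))


def choose_transparency(p_trust, low=0.4, high=0.7):
    """Pick how much information the interface shows (assumed policy).

    Assumption: the less the human appears to trust the aid, the more
    supporting information the interface displays.
    """
    if p_trust < low:
        return "high"
    if p_trust < high:
        return "medium"
    return "low"
```

For example, `choose_transparency(trust_probability([0.2, -1.0], [1.5, 0.8], 0.0))` would select a transparency level from two made-up feature values.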
Watkins: Training machines to learn requires step-by-step sets of mathematical operations called algorithms. These instructions enable machines to predict the appropriate course of action in future events based on data about how the results of similar decisions have turned out in the past. Neera and her team are interested in developing systems that leverage people’s moment-by-moment confidence in the machines’ advice to update these predictions. So Doug and I asked Neera to explain how such an approach can help humans make more efficient and effective choices.
Jain: We have a unique opportunity right now because of our ability to collect so much data. We’re living in this world of big data, which has come from, one, advancements in computing power, and advancements in sensors and how they’re used to collect data. And so we have all this data, but as soon as we do, the way to process that data is through algorithms. And there’s a lot of use that can come from that data, and certainly one can look at data and analyze it offline, and we can do research that helps us in general understand how the world works in different ways, or understand how humans trust machines, for example. But if we want to start utilizing things like control systems, and control systems are largely based on the idea of feedback. So fundamentally, if I can sense something, I can use that information to make some new decision in order to change a system, for example. So a classic example would be temperature regulation in your oven. You set the oven to be a certain temperature, and then there’s a heating element that kicks on and off in order to keep the temperature at some fixed value that you’ve specified. So all of these different things are called control systems, and what we’re doing here is saying, well, I want to use that basic idea about decision-making, basically using data to help me make better decisions, and I want to design machines that are able to make better decisions too because I’ve given them a unique set of data. We want to design these machines so that they’re able to take in new types of data, whether it be human behavioral data or psychophysiological data, and make better decisions about how they’re interacting with humans. And in order to build machines that can do that, they need algorithms. So fundamentally, as soon as we start talking about machines and autonomous systems and so forth, now we’re talking about, at the end of the day, what’s driving them, and how they’re acting and behaving is all driven by computers and algorithms.
And so to get to that point we need some type of quantitative model, and if we can come up with quantitative models that describe all these things we now have this really cool ability to be able to feed information to a machine that helps it understand whether or not a human is trusting or not, and based on that that machine can change the information or modify how it’s delivering information to that human to help them do their job better.
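The oven example is a classic on/off feedback loop: sense the temperature, compare it with the setpoint, and switch the heating element accordingly. The sketch below is only an illustration of that idea. The function names, the heating and heat-loss rates, and the 5-degree switching band are assumed values, not figures from the episode.

```python
def heater_command(temperature, setpoint, heater_on, band=5.0):
    """Decide whether the heating element should be on.

    Uses a switching band around the setpoint so the element does not
    chatter on and off at every tiny fluctuation.
    """
    if temperature < setpoint - band:
        return True          # too cold: switch the element on
    if temperature > setpoint + band:
        return False         # too hot: switch the element off
    return heater_on         # inside the band: keep the current state


def simulate_oven(setpoint=180.0, start=20.0, steps=600,
                  heat_rate=2.0, loss_coeff=0.005, ambient=20.0):
    """Simulate the feedback loop with a deliberately crude oven model."""
    temperature = start
    heater_on = False
    history = []
    for _ in range(steps):
        # Feedback: the sensed temperature drives the next decision.
        heater_on = heater_command(temperature, setpoint, heater_on)
        heating = heat_rate if heater_on else 0.0
        cooling = loss_coeff * (temperature - ambient)
        temperature += heating - cooling
        history.append(temperature)
    return history
```

Calling `simulate_oven()` returns a temperature trace that climbs toward the setpoint and then oscillates in a narrow range around it, which is the "kicks on and off" behavior described above.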
Leigh: A variety of elements can inform the extent to which people may trust a machine and in turn how well people and machines can perform together on a given task. So Ryan and I were interested in hearing Neera’s perspective on what the most important of these factors are with regard to her investigations into our interactions with machines.
Jain: There are three different types of factors that typically affect trust: they’re situational, dispositional, and learned. So dispositional factors are things like gender and age. So those are things that certainly don’t change about ourselves very quickly but that do affect how we trust automation. And so in this particular paper, we explored, as a secondary effect, the effect of demographics, and saw differences between men and women in how their trust evolved as they interacted with machines, and also things like national culture. And another one of those can be sort of a fundamental bias that they might have in their first interaction with a machine. So based on our own individual experiences over time, maybe someone has a lot of experience with different types of machines and maybe those experiences have not been good, where somebody else has had several experiences over many years that have all been positive. And so that’s also gonna bias their initial trust with a machine. So that was one of the reasons for including some notion of bias to capture that particular piece of human trust. But again, the bias was in essence a dispositional factor that every human has and brings to a new interaction that they have with some new type of automation. And the other types of factors that affect trust are situational and learned. So situational is kind of what it alludes to: that’s things that are context dependent, like the specific task itself, so very dangerous tasks versus less dangerous ones; humans will be more or less willing to trust a machine in different scenarios. And then learned trust is how trust is changing based on their almost real-time experience with that machine. So when we talk about experience in this particular paper, and we model that as a function of the reliability of the machine itself, that was really trying to address this learned trust piece.
Watkins: AIs designed to manipulate or control people’s trust in them wouldn’t just be unethical; in applications like autonomous cars and healthcare robots they could be deadly. We asked Neera how it is that intelligent machines can be developed so as to maximize our trust in them when they’re most likely to be correct, but also to augment our mistrust when their recommendations are less certain.
Jain: The term that we prefer to use is calibrating trust, because at the end of the day we certainly want to help humans realize when they should trust machines, but we also want to help them recognize when they shouldn’t trust a machine. So for example, there are going to be situations in which machines make a bad decision and we actually need the human to then specifically not take the recommendation or advice of this automated system, and make their own decision because they’re actually getting faulty information. And so we call that trust calibration: we want to help the human calibrate their own trust in the machine by making sure they’re provided with all the right information that they need to do that calibration. So from that perspective, if in general we want humans to be willing to trust machines, then certainly we expect that all machines and systems often are going to have misses and false alarms, also known as false negatives and false positives, and ideally you design a system that doesn’t have any misses or false alarms. So it’s a hundred percent, you know, correct all the time, it’s never gonna give you any type of faulty information. But the reality is that most of these are, and if we think about the boy who cried wolf, and we defined a factor in here called the Cry Wolf factor, it was exactly that: if you’re interacting with some machine, you know, it gives you maybe just the wrong information several times. That alone is something that’s going to lead to you probably having less trust. But what we did with the false alarm and miss investigation was to say, well, the machine can be wrong in two different ways.
Watkins: Neera and her team carried out an experiment in which sensors measured how people’s minds and bodies responded when confronted with a virtual self-driving car, which had faulty sensors. We’ll hear how a machine’s potential for being wrong is linked to the severity of the consequences of those missteps after this short break.
Watkins: Here again is Neera Jain.
Jain: The scenario here was a human driving in a car, but they can’t actually see out their windshield. So they’re relying on a sensor that will either tell them that yes, there’s an obstacle ahead, essentially you should slow down or hit the brakes, or no, there is no obstacle, there’s a clear road and you can keep driving. In this case, the human is at the mercy of the information that they’re getting from the sensor. And it’s a big difference if the machine says, oh, there’s an obstacle, and you brake but it turns out that there wasn’t. That would be a false alarm, so it’s annoying that you hit the brakes and you could have just kept driving, but it is really dangerous if the sensor says there’s a clear road ahead and so you keep driving, and in fact there was an obstacle, and then all of a sudden your car has crashed. And so that now has a more significant risk associated with it. And so that’s kind of the fundamental idea, where there’s really an imbalance in terms of the severity of a miss versus a false alarm, and you can imagine that several false alarms can be annoying, but even one miss, for example, can be enough for somebody to say, you know what, I don’t trust this, I don’t want to use a sensor anymore, or I don’t want to rely on or use this automation anymore. So from the perspective of engineers who are designing different types of automation that are meant to help humans or make human lives better, it’s important to understand how things like misses and false alarms affect human trust. Because at the end of the day, if humans don’t trust, and in turn are unwilling to use or interact with different types of automation, then there’s no point in us designing them to begin with, and that’s actually more fundamentally how I came to be interested in this topic.
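The two ways the obstacle sensor can be wrong can be made concrete with a small tally. The sketch below is illustrative only: the function names and the trial list are made up here and are not data from the study.

```python
from collections import Counter


def classify_outcome(obstacle_present, sensor_says_obstacle):
    """Label one trial by comparing the sensor's report with the truth."""
    if obstacle_present and sensor_says_obstacle:
        return "hit"
    if obstacle_present and not sensor_says_obstacle:
        return "miss"                # false negative: the dangerous error
    if not obstacle_present and sensor_says_obstacle:
        return "false alarm"         # false positive: the annoying error
    return "correct rejection"


def error_rates(trials):
    """Return (miss rate, false alarm rate) for a list of trials.

    Each trial is a pair (obstacle_present, sensor_says_obstacle).
    """
    counts = Counter(classify_outcome(p, s) for p, s in trials)
    obstacle_trials = counts["hit"] + counts["miss"]
    clear_trials = counts["false alarm"] + counts["correct rejection"]
    miss_rate = counts["miss"] / obstacle_trials if obstacle_trials else 0.0
    fa_rate = counts["false alarm"] / clear_trials if clear_trials else 0.0
    return miss_rate, fa_rate


# Made-up trials for illustration: (obstacle_present, sensor_says_obstacle)
example_trials = [(True, True), (True, False), (False, True),
                  (False, False), (True, True), (False, False)]
```

Keeping the two rates separate, rather than reporting a single accuracy figure, reflects the point made above that a miss and a false alarm carry very different consequences.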
Leigh: Ryan’s undergraduate degree was in mathematics education, and I currently teach data science. So we couldn’t resist asking about the technical details under the hood, so to speak, of the computational model which Neera and her team applied to their autonomous driving experiment.
Jain: The metric that we use, which is called rise time here, is actually a standard metric for characterizing a first or really a second-order dynamic response of a system. So mathematically, if you have a second-order differential equation and you simulate that system response, then you get an exponential that increases and kind of flattens out, or decreases with time. And in our case it was switching back and forth, because we kept kind of perturbing the system, we kept switching the reliability of this system. But fundamentally the solution to a second-order differential equation is either an increasing exponential or a decaying exponential function, and so rise time is just defined as the time that it takes for that function to increase from 10% of its final value to 90% of its final value. And what we’re after by modeling dynamics is how things are changing, and so then we’re interested in metrics that give us a quick way to characterize, well, how quickly are things changing. And so we accept, and it’s well documented, that trust changes with time, and that it’s affected by things like a human’s experience, that learned trust, as they’re interacting with the machine, and what we were studying here in the context of rise time was understanding: is that time that it takes for trust to change different for men and women, or different for people of different cultures? Intuitively, we might expect that, right? If you’re somebody that’s more risk-averse, then I would actually expect your trust to drop pretty fast if you encounter a system that is giving you a lot of faulty information.
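The 10%-to-90% definition of rise time can be checked numerically. The sketch below uses the simplest case, a first-order step response of the form 1 - exp(-t/tau), which is an assumption made here for illustration and is not the trust model from the paper. For that response, the rise time works out to tau times the natural log of 9, about 2.2 time constants.

```python
import math


def first_order_step(t, tau, final_value=1.0):
    """Step response that rises exponentially and flattens at final_value."""
    return final_value * (1.0 - math.exp(-t / tau))


def rise_time(times, values, final_value):
    """Time taken to go from 10% to 90% of the final value.

    Assumes the response increases monotonically and reaches both levels.
    """
    t10 = next(t for t, v in zip(times, values) if v >= 0.1 * final_value)
    t90 = next(t for t, v in zip(times, values) if v >= 0.9 * final_value)
    return t90 - t10


tau = 2.0                                   # illustrative time constant
times = [i * 0.001 for i in range(20001)]   # 0 to 20 in steps of 0.001
values = [first_order_step(t, tau) for t in times]

measured = rise_time(times, values, final_value=1.0)
analytic = tau * math.log(9.0)              # closed form for this response
```

A smaller `tau` gives a shorter rise time, which is the sense in which the metric summarizes how quickly a quantity such as trust settles after a change.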
Watkins: Intelligent machines have applications in manufacturing, commercial electronics, and many other sectors. Self-driving cars and trucks, however, represent the state of the art in AI in consumer applications. So we wanted to know how much Neera trusts them.
Jain: I’m an engineer who knows a lot about how those algorithms are being designed, and I’m not sure I’m ready to get in a car that would drive itself. You know, it’s a really complex issue asking humans to put so much trust in automated systems. Another example that’s much simpler that we kind of overlook is if you think about stepping into an elevator. We do that all the time, but that elevator is also an automated system: you press a button, and on its own it’s gonna take you somewhere. I’m sure there’s some interesting stories about how they were received by the public when elevators first got installed, and I’m sure there were a lot of people who were hesitant to get in, and I’m sure anyone who’s ever been stuck in an elevator might still be hesitant. But the majority of society is willing to step into an elevator, and trust that it’s gonna take you where you just asked it to take you. So we’ve actually, as humans, been interacting with automation in a lot of different ways. Some of them are more obvious and others are not, and some we take for granted. And so there’s this huge spectrum of situations in which we can be interacting with them, and how our trust varies across those different situations is also a really interesting thing to think about.
Leigh: Since her research currently focuses on monitoring humans’ trust in automated systems, Ryan and I were curious to learn if Neera and her team may also study how machines might be taught to trust us.
Jain: As we start thinking more about human-machine teams, in the truest sense where both the human and the machine are making decisions that affect one another, but they each have decision-making authority, we do need to do that, and there are different ways in which we’re already designing machines to do that. So something that we, again, aren’t currently working on, but an example could be: imagine that you have a pilot in an aircraft, and maybe it’s like a fighter jet, so it’s really just a single pilot, there’s no passenger, just the pilot and the aircraft, and there are autopilot features, maybe you’ve got sensors on the human, and nominally the human pilot’s in charge, they’re flying, but maybe they pass out. And if the machine is able to sense those biometrics of the human, and actually realize that either they’ve passed out, or fallen asleep, or they’re basically incapable of continuing to fly that plane, then that triggers the machine, essentially the autopilot, to automatically fly the plane even though the human hasn’t put the plane into autopilot mode. And so where we really truly have these seamless human-machine teams, where the human is using their own decision-making capabilities to calibrate their trust of the machine and make decisions, but the machine can also do that, that’s kind of the Holy Grail to an extent, and so a lot of researchers including myself are thinking about those kinds of things. I think that’s where we’re headed.
Watkins: The study had four authors, all of whom were with Purdue’s School of Mechanical Engineering at the time, though they came from considerably different specializations. So Doug and I were eager to hear how being trained to think about experiments from diverse perspectives influenced their collaboration.
Jain: Dr. Reid is a mechanical engineer by training, but had the unique experience as a graduate student to not only experience training from that perspective, but also at the same time from the perspective of psychology by having an adviser who himself wasn’t fully a psychologist. So she was reading literature and understanding how those different communities think. Even as we then started to collaborate, we both speak that same language in terms of mechanical engineering, but even within engineering, within mechanical engineering, it’s such a diverse field that the types of research she does as a design engineer, and the types of problems that she looks at, and the types of tools that she uses are very, very different than what I use as a controls engineer. So we and our students spent time learning how to communicate with one another, and there were a lot of conversations, and drawing on whiteboards, drawing analogies between the things that we were thinking about and what Wan-Lin and Dr. Reid were thinking about, and what they really brought was significant expertise not only with the use of those psychophysiological sensors that I mentioned, but fundamentally human subject experimentation. How do you design an experiment in which you can even collect data that might capture something like human trust? I had no idea, but she has several years of experience designing and executing human subject studies. And I think that a lot of our success came from a lot of just excellent rapport and respect that everyone on the team had for one another, and I think the patience with one another, and openness to listen and understand how the other person was viewing the problem, and what that brings to the overall team. I think that’s the biggest challenge working at the intersection of disciplines, and in this case we were four mechanical engineers all sitting in the room, but even then there’s differences in how you’ve been trained to think about problems.
And so I think that there’s a lot of innovative solutions and creative engineering to be done if all engineers, even within engineering, and then certainly as you start to collaborate with maybe someone fully trained as a psychologist or outside the field, as you start to do that there’s again a lot of opportunity if you are willing to put in the work to really put yourself in the shoes of the other person.
Leigh: As Neera discusses next, despite their seeming ubiquity, today’s AI applications are generally capable of automating only relatively low-level tasks. We were curious to learn what higher-level tasks computers and robots might be able to do in the future, as well as whether she has any misgivings about the growing automation of jobs.
Jain: This was considered a lower level of automation. There’s different levels of automation, and this one is one in which the machine itself doesn’t actually have the ability to make a decision, it just gives information and recommendations. At the end of the day, it’s the human who actually is choosing whether or not to utilize that information, or how to utilize that information in their own decision-making. As soon as the machine is allowed to also make decisions, things do get a lot more complicated, and there’s a lot of other things that we have to think about as we’re designing those algorithms. But there are a lot of situations in which we want machines to be able to make different kinds of decisions based on maybe not so much explicitly how much they trust the human, but just based on what the human is doing, or feeling, or needing. And so another example is situations like having a robotic nurse in the home of someone who’s elderly, and assisting them with things like getting them their glasses, and those are things that are being actively researched, and we’re definitely headed in that direction. In that kind of situation, you know, we would want that robotic nurse to, one, be able to sense if the elderly individual feels either nervous, or anxious, or untrusting of the machine, and for that machine, or robot, to give that person some space, and not be sort of aggressive with them. But also recognize if the elderly person actually falls and is unable to ask for help; we want a machine, a robot rather, that’s smart enough to start offering assistance when the human is not able to ask for it themselves. So we want these different machines and robots to be able to do things that humans can do, fundamentally that’s what we’re trying to build, but we want to do it in such a way that ultimately improves quality of life, and helps humans.
But I think an important kind of thought that I do want to share, that I think about a lot, is as we design these more sophisticated robots and machines, we know that automation in many ways has displaced a lot of human jobs. There are a lot of people who are out of a job because they have effectively been replaced by some type of automated system, and I think that one thing that I’m personally committed to, and that really drives the research that I do, is that I’m not interested in designing machines that should be replacing humans, but that are augmenting or helping humans to do what they do best. So there are still certain tasks, and decisions, and other cognitive processing that we know humans can do way better than machines. But there are other situations where humans are doing things that are maybe very unsafe, that we really shouldn’t ask humans to do. And so in those situations, if we can develop assistive robots or other types of automation that help the humans, but still give the human fundamentally their decision-making power and help them to do the jobs that they want to do, and leverage what we know the human brain is capable of, I think that that’s really where the future should go and will go.
Watkins: That was Neera Jain, discussing her article: “Computational Modeling of the Dynamics of Human Trust During Human-Machine Interactions,” which she published in the November 2018 issue of IEEE’s Transactions on Human-Machine Systems, along with Wan-Lin Hu, Kumar Akash, and Tahira Reid. You’ll find a link to their paper on: www.parsingscience.org/e46 along with bonus audio and other materials we discussed during the episode.
Leigh: If you enjoyed Parsing Science, consider becoming a Patron for as little as $1 a month. As a sign of our thanks, you’ll get access to hours of unreleased audio from all of our episodes so far, as well as the same for all of our future ones. You’ll be helping us continue to bring you the unpublished stories of scientists from around the world, and be supporting what we hope is one of your favorite science shows. If you’re interested in learning more, head to: www.parsingscience.org/support for all the details.
Watkins: Next time, in episode 47 of Parsing Science, we’ll be joined by Amy Orben from the University of Oxford. She’ll talk with us about her research into the association between kids’ digital technology use and their well-being.
Amy Orben: What the research shows, and what we’ve naturally known, but I think what this paper visualizes is that data analysis can actually change what you then end up seeing in the data.
Watkins: We hope that you will join us again.