Building Machines that Learn and Think with People
Katherine M. Collins, Ilia Sucholutsky, +10 authors, Thomas L. Griffiths
Published in Nature Human Behaviour, 22 July 2024
DOI: 10.48550/arXiv.2408.03943 · Corpus ID: 271768888
Computer Science, Philosophy, Psychology
What do we want from machine intelligence? We envision machines that are not just tools for thought, but partners in thought: reasonable, insightful, knowledgeable, reliable, and trustworthy systems that think with us. Current artificial intelligence (AI) systems satisfy some of these criteria, some of the time. In this Perspective, we show how the science of collaborative cognition can be put to work to engineer systems that really can be called “thought partners,” systems built to meet our expectations and complement our limitations. We lay out several modes of collaborative thought in which humans and AI thought partners can engage and propose desiderata for human-compatible thought partnerships. Drawing on motifs from computational cognitive science, we motivate an alternative scaling path for the design of thought partners and ecosystems around their use through a Bayesian lens, whereby the partners we construct actively build and reason over models of the human and world.
Computers have long been seen as tools for thought. Steve Jobs called computers “bicycles for the mind”: tools that dramatically increase the efficiency, productivity, and joy of thinking. Now, thirty years later, this metaphor is beginning to change. Computer systems are increasingly referred to not as vehicles but as “copilots” 1, 2: we have moved from designing tools for thought to designing actual partners in thought.
The current wave of AI technologies, particularly language models, has catalyzed this transition. Users no longer need to know how to write code to engage intimately with computers; we can now interface through the medium of natural language. Humans already think alone and together, often through the medium of language 3 – and we long have, from developing new modes of thinking through questioning and debate, to teaching and learning through language. The apparent power of these new systems – getting closer to the kind of artificial intelligence (AI) imagined in the field’s early days 4, 5, 6, 7, 8, 9 – as well as the challenges faced by their current iterations, invites us to think about what it will take to build systems that truly act as effective thought partners. We argue that good thought partners are systems (1) which can understand us, (2) which we can understand, and (3) which have sufficient understanding of the world that we can engage on common ground.
One path to building such thought partners is to scale foundation models (e.g., LLMs 10) with large amounts of human demonstrations and feedback, along with “traces” of human thought scraped from web-scale data 11, 12, 13. While such an approach has produced systems that accurately mimic human behavior (e.g., producing fluent text), these machines do not robustly simulate human cognition (e.g., explicitly reasoning about the world or other minds) in ways expected by a true thought partner 14, 15, 16, 17, 18, 19, 3, 20.
What would it take to design systems that meet our criteria? One promising path is to design systems that build explicit models of the task, world, and human (where these models are structured 21, rather than distributionally learned from data) – drawing on formal frameworks grounded in cognitive psychology for understanding how humans think, alone and together. In this Perspective, we chart a new vision for the design of AI thought partners. Decades of work in the behavioral sciences provide valuable insights for designing human-centric, uncertainty-aware thought partners. Drawing on such research, we argue that effective thought partners are those which build models of the human and the world.
This toolkit includes foundation models 22, 23, 24, but is not limited to them. Indeed, foundation models like LLMs are fueling new motifs for thinking about human minds in computational terms (e.g., “rational meaning construction” 16) interleaved alongside techniques from probabilistic programming 25, 26, 27, 28, 29, goal-directed search 30, 31, 32, and other explicit, structured representations, e.g., of agents thinking about other agents 33, 34, 35. We already have tools that help us build machines that learn and think like people 36. We propose applying that toolkit to collaborative cognition – to build machines that learn and think with people.
When we think, we draw coherent inferences, make predictions, and act on these predictions – from assessing what birthday present to gift a treasured friend, to formulating a new scientific hypothesis and experiment plan to evaluate a theory. We flexibly draw on prior knowledge and update our beliefs through experience (as we discuss below). We not only solve problems, but imagine new ones 37. And we think together. For generations, humans have discussed and debated ideas, and developed ecosystems to disseminate such thoughts to new audiences. Much scientific innovation has come through collaboration, where advances are frequently fueled by engaging with diverse partners who offer new ideas yet share our values 38.
2.1 Modes of Collaborative Thought
As an illustration of the many ways that people and machines might think with each other, we highlight a few modes of collaborative thought (Table 1). This set of modes, partly inspired by characterizations of thinking and reasoning in psychology 39, 40, is not meant to be comprehensive of all aspects of thought. Rather, we see these modes as ripe for the further development of AI thought partners.
We next outline a few diverse domains in which the development of AI thought partners able to truly collaborate with humans (Figure 1) may be particularly valuable. We highlight common computational challenges that arise when considering what effective partnership might look like in each domain, foreshadowing our proposed desiderata. We later return to these case studies with concrete human-centric thought partner instantiations.
Thought Partners for Programming. Programming is a cognitively-demanding activity that requires gaining fluency in translating human intentions into formal, machine-interpretable languages. It is no surprise that decades of effort have gone into designing tools to help people program 41, 42, 43, 44, 45. New “programming assistant” tools like GitHub Copilot have rapidly gained enormous popularity and attention; however, these tools are often unreliable 46, 47, 48, e.g., failing to understand users’ intentions 49 and generating bugs that may be particularly risky alongside beginner programmers 50. Programming involves much more than just accurate in-line code suggestions – which, at the time of writing, GitHub Copilot specializes in. Humans make abstract, structural decisions, learn collaboratively, and need partners who can answer our questions – like why code behaves as it does, or why it fails to work. A good collaborative programming partner seeks to understand not only the programming language, but also their fellow programmer, inferring and reasoning about our overarching intentions, and adapting to both what we do and do not know.
Thought Partners for Embodied Assistance. Ensuring embodied agents can form accurate and physically-realizable plans is foundational for effective assistance we can trust – from guessing what a friend wants when we help them cook 51, to working with someone with different physical abilities 52, or carrying out a high-stakes search-and-rescue mission 53. While much current research on embodied AI and assistive robots focuses on learning specific skills or following simple instructions 54, 55, 56, evaluations suggest that even state-of-the-art language models fine-tuned on extensive human feedback continue to struggle with tasks that require reliable, effective planning towards novel goals 57, 58. Instead, ideal assistive partners understand our actions, words, and instructions as expressions of goals, beliefs, and intentions 59, 60, 61 that are grounded in physical possibilities 62, while also understanding that these can be shared across multiple minds 63, 64, 65. In addition, effective partners account for each other’s limitations in perception, planning, and world modeling, correcting for possible mistakes 66, 67, and acting so as to make their intentions more legible 68, 69.
Thought Partners for Storytelling. Another domain in which we may want thought partners is storytelling – for writers, filmmakers, and even scientists. Storytelling is a complex, iterative cognitive process 70, 71, with substantial opportunities for thought partners to collaboratively ideate and create with humans – from helping brainstorm new ideas and generate storylines, to improving writing style and tone 72, 73, 74, 75, 76, 77. For this process to be productive, a thought partner needs to understand more than just our authorial intentions and dispositions – they also need to understand the audience we are speaking to (that is, to understand the social world), including audience expectations and likely interpretations of the stories we are crafting for them.
Thought Partners for Medicine. Doctors need to sensemake, plan, deliberate, and continually learn in the face of new medical evidence. A primary care doctor is not unlike Sherlock Holmes – collating and integrating disparate bits of evidence with their prior beliefs to make decisions under uncertainty. Yet, doctors rarely have enough time to engage deeply with each patient 78, driving high rates of burnout with knock-on effects on patient care quality 79. Can we develop safe, reliable thought partners that can free doctors up to spend more time and communicate better with their patients? Already, foundation models are becoming proficient in medical assessments 80, 81, seemingly capable of easing the heavy burden on doctors by assisting and partnering 82, 83, and even providing preferable responses to patients 84. Yet, it is not clear that these systems understand us (and our cognitive limitations), understand the world (underlying biology), and enable us to understand them (which in this context, may be important for transparency and reliability 85, 86, 87, 88).
What then do we want from thought partners? There are many criteria for tools for thought that are of course relevant: efficiency, accuracy, robustness, fairness, cost, scalability, etc. But the domains above illuminate that what is distinctive about a thought partner is its relationship to the user 89. Looking to ideas from the behavioral sciences motivates three desiderata to guide the design of human-centered thought partners:
You understand me: We would like our thought partners to understand our goals, plans, (possibly false) beliefs, and resource limitations, taking into account what they have observed of us in the past and present in order to best collaborate with us in the future 90, 91. For example, a thought partner should adaptively change strategies when working with an expert, layperson, or child, meeting us where we are.
I understand you: We would like our thought partners to act in a way that is legible to us 68, 92, and communicate with us in the way we intuitively understand 93, 94, 95.
We understand the world: We would like our thought partners to be tethered to reality 96. This means being accurate and knowledgeable, but also working with a shared representation of the world, domain, or task 97, 98, 99. Further, our use of ‘we’ emphasizes that thought partnerships are fundamentally about synergy, moving beyond the sum of their parts.
Table 1. Modes of collaborative thought, ongoing challenges, and a sampling of existing systems.

| Mode | Ongoing Challenges | Sampling of Existing Systems |
|---|---|---|
| Collaborative planning: joint decision-making; decentralized cooperation; goal and task assistance | Reliable goal inference; value and intent alignment; scalable multi-agent planning | Collaborative robots 68, 100; video game sidekicks 101, 102; language-based assistants 35, 103 |
| Collaborative learning: pair & team problem-solving; identification of knowledge gaps; new problem construction | Strong & robust problem-solving abilities; personalized curriculum pacing; problem construction of targeted difficulty | Programming learning aids 104, 105, 106, 107 |
| Collaborative deliberation: debate & argumentation; critical review & discussion; consensus formation | Opinion diversity; verifiable reasoning; formation of common ground | Machine-assisted debating 110, 111, 112; consensus writing & opinion mapping 113, 114 |
| Collaborative sensemaking: explanation; visualization; data analytics | Exponential increases in data produced; accessible communication; fidelity of insights to the world | Probabilistic data modeling 115, 116, 117, 118, 119; machine-assisted theory discovery 120, 121, 122 |
| Collaborative creation & ideation: co-design; idea critiquing; brainstorming | Generation diversity; style consistency; modular customizability | Machine-assisted writing 72, 74, 123; prompted image creation 124, 125, 126; collaborative sketching 127, 128, 129 |
3 Engineering Human-Centered Thought Partners
Our core proposal is that our three desiderata can be engineered explicitly, building on theoretical motifs from computational cognitive science and cognitively-informed AI (summarized in Table 2), rather than left as emergent and potentially brittle properties arising implicitly in systems trained for other ends 20. Here, we articulate a framework for engineering thought partners designed to robustly and explicitly function as cooperative, collaborative actors. Humans are neither homogeneous, perfectly rational oracles, nor so unpredictable that modeling human behavior is hopeless. We argue that models that explain human cognition and choice as approximately optimal solutions given goals and constraints provide an ideal starting point for designing thought partners, and that a Bayesian formalism provides a probabilistically-sound common conceptual language that facilitates cross-talk between different disciplines 22, 155, 156.
3.1 Implementing Our Desiderata
What does it take to engineer real systems that meet our desiderata? First, we propose that a thought partner that understands us should explicitly model its human collaborator as such – as a cooperative agent with structured internal beliefs, knowledge, and goals – and fundamental resource limitations. Second, engineering a thought partner that we can understand benefits from looking at how humans model other humans; just as a good human collaborator seeks to learn and adapt to the relative strengths, imperfections, and computational bounds of their partner, we can build machine thought partners that also reason about the computational demands they are placing on another agent such that we can appropriately predict their behavior 18, 157. Finally, to build thought partners that understand the world – and learn and think synergistically alongside us – we argue that it is valuable to build on structured computational toolkits for grounding shared goals and communication into the environment and domain in which collaboration takes place.
3.2 Computational Cognitive Science Motifs
We now (non-exhaustively) spotlight several key insights about modeling humans, modeling humans modeling humans, and modeling humans modeling the world from computational cognitive science – “motifs” for reverse engineering the mind (Table 2) – that we believe can inform engineering of human-centered thought partners. While we acknowledge that there are communities within cognitive science that may disagree with some of these theories, we emphasize that the computational underpinnings of the motifs hold tremendous engineering potential for building thought partners in practice.
Probabilistic Models of Cognition. Decades of work in computational cognitive science have demonstrated the power of modeling aspects of human cognition as Bayesian inference through structured probabilistic generative world models 131, 158, 159, 21, 137. Such approaches have found empirical success in capturing a diversity of facets of human cognition from early word learning 160, to visual perception 161, 162, physical reasoning 99, 163, 164, concept learning 165, 166, 167, language processing and acquisition 158, 168, 169, 170, causal inference in children 171, 172 and adults 173, 174, memory reconstruction 175, and theory formation 176, 177, among many others. Probabilistic models of cognition, particularly those built using a Bayesian approach, have offered principled formalisms in capturing rapid belief updating 178 and how we may integrate our commonsense world knowledge with new evidence to inform the actions and decisions we take in the world 149. Probabilistic inference over structured representations, particularly drawing on Bayesian modeling and tools like meta-level Markov Decision Processes 179, has provided a computational account of how humans plan so flexibly, with the capability of forming rich hierarchical goals and subgoals, across varied timescales 149, 155, 180, 181, 182.
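To make the Bayesian motif concrete, here is a minimal sketch in Python of belief updating over a small hypothesis space. The toy number-concept task, the three hypotheses, and the size-principle likelihood are illustrative assumptions on our part, not code from any of the cited models; the point is only that hypotheses which better explain the observed evidence gain credence, weighted by the prior.

```python
# Minimal sketch of Bayesian belief updating over a toy hypothesis space.
# The hypotheses and data are invented for illustration only.

# Hypotheses about which numbers belong to a hidden concept.
hypotheses = {
    "even numbers":      lambda x: x % 2 == 0,
    "multiples of four": lambda x: x % 4 == 0,
    "powers of two":     lambda x: x > 0 and (x & (x - 1)) == 0,
}
prior = {h: 1.0 / len(hypotheses) for h in hypotheses}

def likelihood(h, x, domain_size=100):
    """Size-principle likelihood: probability of sampling x uniformly
    from the extension of hypothesis h over 1..domain_size."""
    rule = hypotheses[h]
    if not rule(x):
        return 0.0
    extension = sum(1 for n in range(1, domain_size + 1) if rule(n))
    return 1.0 / extension

def update(beliefs, x):
    """One step of Bayes' rule: posterior is proportional to prior times likelihood."""
    unnorm = {h: beliefs[h] * likelihood(h, x) for h in beliefs}
    z = sum(unnorm.values())
    return {h: (p / z if z > 0 else 0.0) for h, p in unnorm.items()}

beliefs = prior
for observation in [16, 4, 64]:          # observed positive examples
    beliefs = update(beliefs, observation)
    print(observation, {h: round(p, 3) for h, p in beliefs.items()})
```

Running the loop shows the posterior shifting from the broad "even numbers" hypothesis toward the narrower "powers of two" hypothesis as consistent examples accumulate.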
Theory of Mind and Communication. In our quest to build systems for collaborative cognition, we are guided by the success of Bayesian accounts of how we reason about others’ mental states, and how we communicate about them. In particular, Bayesian treatments of theory of mind (ToM) have offered strong accounts of how we may rapidly reason about each other’s beliefs, desires, goals, and intentions 33, 183, 184, 185, 147. We may build mental models 186, 187 of our thought partners, which can in turn be used to support communication and collaboration, informing the way we teach 188, 189, 190, infer whether to rely on a partner for help 191, and support rapid, flexible adaptation to new conversation partners 192, 193. We call particular attention to the Rational Speech Act (RSA) framework 150, 59, which models communicative partners as recursively reasoning about each other’s minds to inform what to say (from the perspective of the speaker) and how to interpret a received utterance (as the listener). Bayesian models provide a useful framework for formalizing such rich cross-partner inferences, allowing both social cognition and communication to be modeled with the same computational toolbox 194, 195.
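The RSA recursion is compact enough to sketch directly. The toy reference game below (two objects, two utterances, uniform priors) is an invented illustration of the standard literal-listener / pragmatic-speaker / pragmatic-listener chain, not code from the cited papers.

```python
import numpy as np

# Toy reference game, invented for illustration: two objects, two utterances.
# meaning[u][o] = 1 if utterance u is literally true of object o.
objects = ["blue circle", "blue square"]
utterances = ["blue", "square"]
meaning = np.array([[1, 1],    # "blue" is true of both objects
                    [0, 1]])   # "square" is true only of the square

alpha = 1.0  # speaker rationality

def normalize(m, axis):
    s = m.sum(axis=axis, keepdims=True)
    return np.divide(m, s, out=np.zeros_like(m), where=s > 0)

# Literal listener: P(object | utterance) proportional to literal meaning.
L0 = normalize(meaning.astype(float), axis=1)

# Pragmatic speaker: P(utterance | object) proportional to exp(alpha * log L0).
S1 = normalize(np.exp(alpha * np.log(np.maximum(L0.T, 1e-10))), axis=1)

# Pragmatic listener: P(object | utterance) proportional to S1.
L1 = normalize(S1.T, axis=1)

print("Literal listener hears 'blue':  ", dict(zip(objects, L0[0].round(2))))
print("Pragmatic listener hears 'blue':", dict(zip(objects, L1[0].round(2))))
```

Hearing the ambiguous utterance "blue", the literal listener is at chance, while the pragmatic listener favors the blue circle, reasoning that a speaker who meant the square would have said "square".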
Resource-Rationality and Tractable Theory-Building. Human brains also have limited resources such as time, memory, and attention that shape what we think about, how long we spend thinking, and even how we communicate our thoughts to others 196. Thus, we sometimes make systematically biased inferences 197, 198. Such “erroneous” judgments can be captured by modeling humans as making rational use of our finite resources; e.g., via approximate inference 178, 199 or bounded planning 67. Crucially, human cognition is tractable 200. Indeed, we can navigate large, potentially unbounded, hypothesis spaces to build theories of the world: a process that seems to demand some kind of heuristics and approximations, which may be resource-rational 182, 201, 196, 143, 142, 202, 17. One approach to modeling minds advocates thinking about humans as “world model builders” (or “hackers”) – conducting experiments and updating our beliefs about compressed representations of the world, where these representations may be expressed as programs 138, 176. Such representations – bolstered by tools like program synthesis – help explore suboptimal behavior 203.
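As a toy illustration of the resource-rational motif (an invented estimator, not a model from the cited work), the sketch below caps inference at a fixed budget of mental samples: with only a handful of samples, probability judgments are highly variable and often extreme, while a large budget converges to the true value, so seemingly "erroneous" judgments can fall out of a rational procedure run under tight resource limits.

```python
import random

# Toy resource-rational estimator, invented for illustration: judge the
# probability that a noisy process exceeds a threshold using only a small,
# fixed budget of mental samples.

def sample_outcome():
    """One imagined outcome of an uncertain process (e.g., a future event)."""
    return random.gauss(mu=0.0, sigma=1.0)

def judged_probability(threshold, budget):
    """Approximate P(outcome > threshold) with `budget` Monte Carlo samples."""
    hits = sum(sample_outcome() > threshold for _ in range(budget))
    return hits / budget

random.seed(0)
for budget in [3, 10, 1000]:
    judgments = [judged_probability(threshold=1.0, budget=budget) for _ in range(5)]
    print(f"budget={budget:4d}  judgments={[round(j, 2) for j in judgments]}")
# Small budgets yield variable (often extreme: 0 or 1) judgments;
# a large budget converges toward the true probability (about 0.16).
```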
3.3 Scaling Thought Partners via Probabilistic Programming
If Bayesian thought partners are to reason over models of their human thought partner and the world, these models need to continually evolve as new facts come to light and as the human thought partner themselves grows in their expertise, beliefs, and needs. Probabilistic programming 26 provides one powerful methodology for building, scaling, and performing inference in these kinds of rich models. For example, probabilistic programs can be learned from data 204, 116, and synthesized via LLMs that encode rich priors 16, 118, 205. Probabilistic programs also enable fast approximate inference in world models that cohere with human common-sense knowledge and domain expertise 115, 206, where the learned models are themselves amenable to modular inspection and editing by humans. Modern probabilistic programming languages 25, 27, 207 offer not just generic inference but programmable inference, that is, they automate the math for hybrids of optimization 208, 209, dynamic programming 210, and Monte Carlo inference 211. While such frameworks are certainly not the only methods to handle uncertainty and build effective and robust thought partners, we believe they are one promising and cognitively-grounded approach to instantiating thought partners today, as we discuss in our case studies.
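To give a flavor of the idea without relying on any particular PPL's API, here is a hand-rolled Python sketch of a probabilistic program: a generative model written as ordinary code with a latent variable, and inference by simple likelihood weighting. The collaborator-skill model and all numbers are invented for illustration; languages such as Gen or WebPPL provide this machinery natively, along with the programmable inference discussed above.

```python
import random
from collections import Counter

# Hand-rolled sketch of the probabilistic-programming idea: a generative
# model written as ordinary code, with inference by likelihood weighting.
# The model and numbers are invented for illustration.

def generative_model():
    """Latent skill level of a collaborator generates a success probability."""
    skill = random.choice(["novice", "intermediate", "expert"])
    p_correct = {"novice": 0.3, "intermediate": 0.6, "expert": 0.9}[skill]
    return skill, p_correct

def likelihood(p_correct, observations):
    """Probability of the observed sequence of correct/incorrect answers."""
    prob = 1.0
    for correct in observations:
        prob *= p_correct if correct else (1.0 - p_correct)
    return prob

def infer_skill(observations, num_particles=10_000):
    """Likelihood weighting: sample latents from the prior, weight by the data."""
    weights = Counter()
    for _ in range(num_particles):
        skill, p_correct = generative_model()
        weights[skill] += likelihood(p_correct, observations)
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}

random.seed(1)
observed = [True, True, True, False, True]   # collaborator got 4 of 5 right
print({s: round(p, 3) for s, p in infer_skill(observed).items()})
```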
3.4 Infrastructure around Thought Partners
The design of systems that learn and think with people necessitates not only careful construction of the thought partner (i.e., the machine itself), but also the infrastructure within which human and computational thought partners collaborate 157. Questions like “when and where should a human be able to engage a computational thought partner to ensure effective and appropriate use?” or “for a given problem, is the human or computational thought partner better suited to start first, in light of their respective strengths and weaknesses, costs of the task at hand, and particular mode of thought?” inform the design of the workflow that surrounds thought partnership. This sociotechnical ecosystem may be dictated by external regulations, organizational practices, or other principles 212, 73, 213, 214, 215, and crucially informed by studies of human behavior. For example, Article 14 of the EU AI Act requires users of high-risk AI systems “to correctly interpret the high-risk AI system’s output” and “to remain aware of the possible tendency of automatically relying or over-relying on the output.” Satisfying such requirements demands not only careful design of the thought partners themselves (e.g., so that we can understand them), but also careful design of the system of affordances 216, 217 and infrastructure around thought partnerships (for instance, communicating back to humans information about their reliance strategies). Disentangling thought partners from the infrastructure around them provides a modular scaffold for addressing unintentional thought partnership behavior, e.g., overreliance 218 and “illusions of understanding” 219. Bayesian modeling has already found success in inferring humans’ reliance strategies 220 and regions of the task space where a human versus machine can complement one another 221.
4 Case Studies in Engineering Thought Partners
We now return to the example domains previously introduced and discuss specific case studies (depicted in Figure 2). Our goal is to demonstrate the potential benefits of endowing thought partners with structured probabilistic models of the human and/or world, and provide a flavor of the kinds of infrastructure questions that may surround them to ensure that the thought partners we build work with people.
4.1 Thought Partners for Programming
We highlighted some visions for effective programming partnerships, such as a partner that can address “why” questions. One recent idea, from Chandra et al. 106, is to apply the Bayesian toolkit to explain surprising behavior of computer programs in a human-like way. Chandra et al. apply Bayesian models of mental state inference and rational communication 222 to design a system called “WatChat” that answers questions like “why did this program output that (surprising) result?” in a principled, human-like way. WatChat infers what erroneous mental model might cause the programmer to have expected something different (partner understands user) and generates an explanation that “debugs” that mental model (user understands partner). WatChat represents possible mental models themselves as “programs” whose “bugs” correspond to possible misconceptions; mental models can thus be inferred by Bayesian program synthesis (see Table 2). Such a framework can also be inverted to help design new questions for teachers or self-driven learners to identify misconceptions.
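The core inference pattern can be illustrated with a deliberately tiny example. The sketch below is not the WatChat implementation: the candidate mental models, the 0/1 likelihood, and the sorted-list scenario are invented to show the idea of representing misconceptions as runnable programs and scoring them against what the user expected.

```python
# Toy illustration (not the WatChat implementation) of inferring a likely
# misconception by treating candidate mental models as small programs and
# asking which one best explains what the programmer expected.

true_behavior = lambda xs: sorted(xs)            # what the code actually does

# Candidate mental models, each a runnable "program" encoding a possible
# (mis)understanding of the code. The specific misconceptions are invented.
mental_models = {
    "correct model":                               lambda xs: sorted(xs),
    "thinks sort is in-place only (returns None)": lambda xs: None,
    "thinks sort is descending":                   lambda xs: sorted(xs, reverse=True),
}

def posterior_over_models(inputs, user_expected):
    """Score each mental model by how well it reproduces the user's expectation."""
    scores = {}
    for name, model in mental_models.items():
        predicted = model(inputs)
        # Simple 0/1-style likelihood: does this mental model predict the expectation?
        scores[name] = 1.0 if predicted == user_expected else 1e-6
    z = sum(scores.values())
    return {name: s / z for name, s in scores.items()}

# The user ran sorted([3, 1, 2]) expecting [3, 2, 1] and was surprised by [1, 2, 3].
inputs = [3, 1, 2]
print("Actual behavior:", true_behavior(inputs))
beliefs = posterior_over_models(inputs, user_expected=[3, 2, 1])
print(beliefs)
print("Explanation should target:", max(beliefs, key=beliefs.get))
```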
4.2 Thought Partners for Embodied Assistance
Recall the challenge of collaboratively planning uncertain tasks, from a search-and-rescue mission to everyday cooking, wherein we typically want to infer shared goals and communicative intent from our partners. This cooperative logic can be modeled in a Bayesian architecture called Cooperative Language-Guided Inverse Plan Search (CLIPS) 35. By modeling humans as cooperative planners who use language to communicate joint plans to achieve their goals 65, CLIPS is able to infer those plans and goals from both the actions and instructions of human collaborators. This allows CLIPS to pragmatically follow human instructions, using context to disambiguate the multiple meanings that a request might have, while pro-actively assisting with the goals that underlie the instruction. For example, CLIPS can understand the likely intentions behind an instruction like “Can you prepare the vegetables while I knead the dough?”, inferring the shared goal of making pizza. These capabilities are made possible by using probabilistic programming infrastructure 25 to unite algorithms for Bayesian inverse planning 33, 184 and human-AI alignment 223, 51, 61 with LLMs. In particular, by using LLMs to evaluate the probability of a natural language instruction given a possible intention, CLIPS can infer intentions from natural language in a coherent Bayesian manner – demonstrating the power of combining tools from the Bayesian thought partner toolkit.
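To convey the shape of this inference (without reproducing CLIPS itself), the sketch below combines an action likelihood with an utterance likelihood to form a posterior over shared goals. The goals, the likelihood tables, and the hand-written "instruction score" standing in for an LLM-evaluated likelihood are all invented for illustration.

```python
# Stripped-down illustration of Bayesian goal inference from actions and
# language (not the CLIPS system). All values are invented for illustration.

goals = ["make pizza", "make salad", "bake bread"]
prior = {g: 1.0 / len(goals) for g in goals}

# P(observed action | goal): how consistent kneading dough is with each goal.
action_likelihood = {
    ("kneads dough", "make pizza"): 0.8,
    ("kneads dough", "make salad"): 0.01,
    ("kneads dough", "bake bread"): 0.7,
}

# P(instruction | goal): would a cooperative planner with this goal say this?
# (In CLIPS this role is played by an LLM scoring the instruction.)
utterance_likelihood = {
    ("prepare the vegetables", "make pizza"): 0.5,
    ("prepare the vegetables", "make salad"): 0.6,
    ("prepare the vegetables", "bake bread"): 0.02,
}

def infer_goal(action, instruction):
    """Posterior over shared goals given an observed action and an instruction."""
    unnorm = {
        g: prior[g]
           * action_likelihood[(action, g)]
           * utterance_likelihood[(instruction, g)]
        for g in goals
    }
    z = sum(unnorm.values())
    return {g: p / z for g, p in unnorm.items()}

posterior = infer_goal("kneads dough", "prepare the vegetables")
print({g: round(p, 2) for g, p in posterior.items()})  # mass concentrates on pizza
```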
4.3 Thought Partners for Storytelling
Storytelling is about crafting experience. Can we also apply the toolkit to help storytellers design experiences from first principles? Recent work has shown that a system grounded in Bayesian ToM can predict and even design interventions on the audience’s experience of a story 224, 225. Chandra et al. conceive of storytelling as “inverse inverse planning”: that is, starting with human social cognition, modeled as Bayesian inverse planning 33, and then optimizing narrative events to shape the model’s inferences over time. They show how a variety of storytelling techniques – from plot twists to stage mime – can be expressed in the language of inverse inverse planning to create animations that have a desired cognitive effect on viewers. Herein, we also highlight the breadth of thought partners for media beyond language, though the framework does nicely suggest a variety of natural extensions, such as integration into tools for creative writing 72, 73, 74, 75, 76, 77.
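A toy version of this idea can be sketched as a search over event orderings, scored by the inferences of a simple Bayesian observer. The characters, events, likelihoods, and "twist score" below are invented for illustration and are far simpler than the inverse-planning observer models in the cited work.

```python
from itertools import permutations

# Toy illustration of "inverse inverse planning": the storyteller searches
# over event orderings for the one that keeps a Bayesian observer confident
# in the wrong goal until the final reveal. All values are invented.

goals = ["ally", "traitor"]
true_goal = "traitor"
event_likelihood = {                      # P(event | character's goal)
    "shares supplies":  {"ally": 0.80, "traitor": 0.30},
    "defends the camp": {"ally": 0.70, "traitor": 0.40},
    "steals the map":   {"ally": 0.05, "traitor": 0.90},
}

def observer_beliefs(events):
    """Sequential Bayesian updating of the observer's beliefs about the goal."""
    beliefs = {g: 0.5 for g in goals}
    trajectory = [dict(beliefs)]
    for e in events:
        unnorm = {g: beliefs[g] * event_likelihood[e][g] for g in goals}
        z = sum(unnorm.values())
        beliefs = {g: p / z for g, p in unnorm.items()}
        trajectory.append(dict(beliefs))
    return trajectory

def twist_score(events):
    """High when the observer believes 'ally' right before the end, then flips."""
    traj = observer_beliefs(events)
    return traj[-2]["ally"] * traj[-1][true_goal]

best = max(permutations(event_likelihood), key=twist_score)
print("Most twist-like ordering:", best)
for step, beliefs in zip(("start",) + best, observer_beliefs(best)):
    rounded = {g: round(p, 2) for g, p in beliefs.items()}
    print(f"{step:18s} -> {rounded}")
```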
4.4 Thought Partners for Medicine
Finally, we envision medical thought partners that both understand us – reasoning about the doctor, patient, and care team as agents with goals, beliefs, and worries – and complement our capabilities, integrating swaths of evidence that exceed our cognitive capacities to inform diagnosis and treatment. While no system yet meets these desiderata, we believe a range of motifs and tools from the Bayesian thought partner toolkit can support the development of such systems for collaborative sensemaking and deliberation. We imagine Bayesian thought partners that can update their medical world knowledge in light of new insights in biology, e.g., editing a code snippet of the underlying probabilistic world model 16 or growing the representation in a non-parametric hierarchical Bayesian model 135. Such a model can then, similar to WatChat, synthesize new questions to ensure the human doctor’s own medical world model is sound. Early work demonstrates that we can employ elements of our toolkit, specifically probabilistic programming, to learn rich generative models for oncology and support efficient user queries 227. Yet, effective medical thought partners beckon a broader view of the ecosystem in which they are deployed 89, 228. If a doctor is over-relying on the output of the thought partner, or overburdened amidst a surge in patient queries, infrastructure around the human and thought partner can modulate when a patient query is routed to a human, routed to the AI thought partner, or deemed to require collaborative planning 229. Systems for routing based on probabilistic modeling are already proving successful in simulation 230.
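As a minimal illustration of such routing infrastructure (not any cited system), the sketch below routes a query to a human clinician or an AI thought partner by comparing expected costs that depend on estimated error rates and the clinician's current workload; all functions and numbers are invented assumptions.

```python
# Minimal sketch (not any cited system) of routing a patient query based on
# estimated error probabilities, error cost, and clinician workload.
# All numbers are invented for illustration.

def route_query(p_error_ai, p_error_human, error_cost, human_time_cost, workload):
    """Route to whichever option has the lower expected cost."""
    expected_cost = {
        "AI thought partner": p_error_ai * error_cost,
        "human clinician":    p_error_human * error_cost
                              + human_time_cost * (1.0 + workload),
    }
    return min(expected_cost, key=expected_cost.get), expected_cost

for workload in (0.5, 4.0):
    choice, costs = route_query(p_error_ai=0.08, p_error_human=0.02,
                                error_cost=100.0, human_time_cost=2.0,
                                workload=workload)
    rounded = {k: round(v, 1) for k, v in costs.items()}
    print(f"workload={workload}: route to {choice}  costs={rounded}")
```

With a light workload the query goes to the clinician; as workload surges, the same calculation routes it to the AI thought partner, illustrating how infrastructure-level policy, rather than the partner itself, can modulate reliance.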
There is much exciting work to be done to characterize when and how to build thought partners across modes of collaborative thought, which can advance the dissemination and creation of new knowledge alongside humans. We next lay out several key challenges for researchers and designers intent on pursuing a human-centered program of building machines that learn and think with people.
While there is substantial work to be done characterizing the space of possibilities for a single human and single AI thought partner (“dyadic”), we envision a future where many humans and many machines (“non-dyadic”), across roles and specialties in increasingly complex social systems 231, engage in the realm of thought 232, 233, 234. Already, researchers are exploring non-dyadic versions of many of the modes of thought and case studies laid out above, including collaborative learning with groups of humans accompanied by an AI thought partner 235 and medical robot collision avoidance systems that need to account for multiple humans 236. As in the dyadic setting, extensions to non-dyadic settings can be bolstered by a deepening understanding of human behavior in groups – expanding the Bayesian thought partner toolkit – as is already underway in the study of convention formation 192, 237. Looking ahead, citizen science is a promising example of the opportunities of creating large networks of humans and thought partners: Zooniverse, a large-scale galaxy classification crowdsourcing project, serves as a case study for exploring smart task allocation, blending human and machine classifications, and infrastructure changes that impact human participation and performance, with outcomes including both iterative scientific progress and serendipitous scientific discovery 238.
The assessment of thought partners demands a multi-faceted, cross-disciplinary suite of approaches. At minimum, the evaluation of AI thought partners must include some element of interactivity 239. Recent works have highlighted deficits in static evaluation of foundation models 240, 15, demonstrating the need to consider the interaction process in addition to the final output, the first-person perspective in addition to the third-party perspective, and notions of preference beyond quality. In addition to interactive user studies, we posit that studying different kinds of thought partners across modes of collaborative thought would benefit from a controlled, yet rich, playspace; games provide one such domain. Games offer a good formalism for the study of repeated interactions between multiple agents and grounds to explore rich patterns of thought in social, collaborative settings 241, 242, 243, 244.
5.3 Risks and Important Considerations
Computational thought partners are by no means a guaranteed nor universal good and come with certain risks. We call out three such spheres of risk: (i) reliance, critical thinking, and access, (ii) anthropomorphization, and (iii) misalignment.
First, AI thought partners could induce over-reliance and impair the development of critical thinking skills 245, 246, 247, 219, potentially acting as “steroids” for the mind 248. We are concerned about these risks; our emphasis on the infrastructure around thought partner use is explicitly intended to help practitioners take steps to address these challenges, motivating further design of infrastructure modifications like cognitive forcing functions 249, 250. Conversely, it is possible that some people may under-rely on a thought partner, particularly if there is inadequate AI literacy training for how to best make use of new thought partners 251, 252, 253. Already, research has found that the kinds of queries people make of AI systems can be informed by the amount of prior experience they have interacting with chatbots 15, meaning that students, researchers, and other practitioners in lower-income communities may be unable to maximize the value of thought partnering. It is important to ensure that the benefits of thought partners are not confined to an exclusive set of people.
Second, on the topic of anthropomorphization, we highlight an important distinction between human-centric and human-like thought partners 254. Our desideratum “I understand you” advocates for thought partners whose behavior we understand; while this could draw on how we understand other humans, we should be careful about interpreting such machine thought partners as we do humans. As Weizenbaum 6 illuminated with the ELIZA system, there are risks to developing computer systems that present themselves as human-like in ways that they are not: for example, by leading users to attribute undue intention to systems’ responses or (in the long run) leading society to devalue human intelligence 255. Human-like thought partners should maintain a categorical delineation between humans and machines to prevent overreliance 245, 256 and promote human dignity without encroaching on any partner’s self-worth 257. The term used to refer to a thought partner can affect the assumptions made about their capabilities (e.g., teammate implies the machine and human are on equal footing) or can detract from a partner’s human-like nature (e.g., tool would be less anthropomorphic).
Lastly, we note that insufficiently accurate, robust, or cognitively-grounded models can yield misalignment with humans, leading intended AI thought partners to act towards the wrong goals 258, provide wrong or misleading information 259, or violate safety constraints 260. A Bayesian approach to thought partnership can address some of these issues, enabling uncertainty-aware decision-making that avoids overconfidence 223, 261, 262. Yet, while inferring human thoughts and behavior can be used to design better collaborators, models of humans are inherently dual-use and can also be used to mislead, surveil, or manipulate 263. It is crucial to consider whether thought partners are aligned with society at large, or merely superficially aligned with users while serving more powerful interests 264.
If we are to build helpful and reliable human-AI thought partnerships, we advocate for design that explicitly recognizes and engages with the richness and diversity of human thought in an often unpredictable world. We have argued, supported by several case studies, that those engineering thought partners and the infrastructure around their use can benefit from drawing on motifs from computational cognitive science and cognitive-AI. The future of collaborative cognition is bright, but not without risk; continual collaboration and knowledge sharing amongst behavioral scientists, AI practitioners, domain experts, and related disciplines is crucial as we strive to build machines that truly learn and think with people.
Glossary of main terms
• Collaborative cognition: the process by which two or more agents work together in some aspect(s) of thinking (e.g., planning together, learning together, creating together).
• Thought partner: another entity (human or AI) that works with an agent to push forward some aspect(s) of thinking.
• Artificial Intelligence (AI): computational systems that are able to process inputs and engage in some aspect of learning, planning, reasoning, and/or decision-making. Used interchangeably with machines.
• Large language model (LLM): a particular kind of AI system which learns a distribution over text, often trained on large amounts of web-scale text data. LLMs are a class of large-scale foundation models.
• Agent: an entity that can process inputs, make decisions, and take actions in some environment.
• Dyad: a system with two agents (e.g., human-human, human-AI, AI-AI).
• Resource-rationality: the idea that human behavior and cognition can be viewed as rational under bounded constraints (e.g., under limited working memory).
• Probabilistic generative model: a model of how the data one observes about the world is generated by some probabilistic process, from which one can sample new observations and make queries about existing observations.
• Probabilistic programming language (PPL): a language for expressing probabilistic generative models as computer programs that interleave deterministic code (e.g., arithmetic, logic, or artificial neural networks) with random choices. PPLs allow users to specify probabilistic models and inference algorithms in a modular and compositional manner.
• Bayesian inference: a method for updating one’s beliefs over various aspects of the world, grounded in probability theory; in Bayesian inference, an agent updates their beliefs by assigning higher credence to hypotheses that better explain the evidence, weighted against the backdrop of their prior beliefs.
• Affordance: design features of a system that inform use.
We thank Richard Turner, Laura Schulz, Tyler Brooke-Wilson, Valerie Chen, Alena Rote, Lance Ying, Tony Chen, Matt Ashman, Mike Walmsley, Albert Jiang, Mateja Jamnik, Dj Dvijotham, Jonathan Ragan-Kelley, Will Crichton, Alex Lew, Tim O’Donnell, Joao Loula, Marty Tenenbaum, Mary McNaughton-Collins, and Jim Collins for valuable conversations that informed this work. KMC gratefully acknowledges support from the Marshall Commission and the Cambridge Trust. UB acknowledges support by ELSA (European Lighthouse on Secure and Safe AI) funded by the European Union under grant agreement No. 101070617; IS acknowledges funding from an NSERC fellowship (567554-2022); KC is supported by the Hertz Foundation, the Paul and Daisy Soros Fellowship, and an NSF Graduate Research Fellowship under grant #1745302; ML acknowledges funding from MSR; TZX acknowledges support from the OpenPhilanthropy AI Fellowship. VM acknowledges a gift from the Siegel Family Foundation. AW acknowledges support from a Turing AI Fellowship under grant EP/V025279/1, The Alan Turing Institute, and the Leverhulme Trust via CFI. TLG acknowledges support from ONR grant N00014-22-1-2813. Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the institutions listed above.