
“We the people” must remain in charge of AI systems — not tech companies or political elites. Google’s recent blunder with its Gemini AI system makes this crystal clear.
Gemini declined to say that Hitler was worse than Elon Musk’s tweets, refused to create policy documents supporting fossil fuel use, and generated images suggesting America’s founding fathers were of different races and genders than they actually were.
These examples may seem bizarre, but they point to a real and present threat: unaccountable employees at AI companies deciding which ideas and values can be expressed and which cannot. No one, regardless of ideology, should accept that dystopia.
Nor should we hand this power to government. While regulation will undoubtedly have its place, living in a free society means not letting the government dictate which ideas and values people can and cannot express.
Corporations and governments may not be the right entities to make these decisions, yet the decisions still have to be made. People will use AI tools to seek out all kinds of information and to generate content of all kinds, and they will expect those tools to reflect their values — values that won’t always coincide.
There is another option beyond businesses and governments: putting users in charge of AI.
Strategies to Put Users in Charge of AI
Over the past five years, in my academic political science research and in collaborations with the tech industry, I have run experiments exploring ways to put users in control of online platforms and AI. Here is what those experiments have taught me about how to do it effectively.
Let users choose guardrails through a marketplace. We should promote a diversity of models. Users, journalists, religious groups, civic organizations, governments and any other interested group should be able to easily customize open-source base models and add guardrails that reflect their values, and users should be free to select whichever version they prefer when using AI tools. This would spare the companies that produce base models from having to act as the “arbiters of truth” for AI.
Although this marketplace for fine-tuning and guardrails will ease the pressure on companies to some extent, it doesn’t fully resolve the central guardrail issue. Some content – particularly images or videos – will be offensive enough that companies will not want any fine-tuned model they offer to produce it. This includes explicitly illegal material like child sexual abuse material (CSAM), but also much that sits in a grayer zone: depictions of real people that might be defamatory, slurs that offend particular groups in particular contexts, sexual or pornographic material, support for groups that some consider terrorists and others freedom fighters, and so on.
How can companies set centralized guardrails that apply across every fine-tuned model without falling back into Gemini’s politics problem? One answer, again, lies with users: put them in charge of setting the minimum, central guardrails.
Indeed, tech companies are already experimenting with democracy. Meta, for instance, recently convened a “community forum” to gather public feedback on how it designs certain guardrails for LLaMA, its open-source generative AI tool. OpenAI announced an initiative seeking “democratic inputs to AI.” And Anthropic released a constitution for its model written collectively by a sample of Americans.
These are great steps forward, but more needs to be done. Recruiting representative samples of users, as these experiments do, is expensive, and the recruits have weak incentives to learn about the issues and make sound decisions. Moreover, each assembly meets only once, so expertise in governance never accumulates.
Meaningful Power Over Central Guardrails
A better form of AI democracy would let users submit proposals, debate them and vote on them, with the votes binding on the platform. Proposals could be restricted so they don’t violate the law or unduly interfere with the company’s business, but their scope must still give people meaningful control over the central guardrails.
Although no tech platform has yet attempted a real voting system, experiments in Web3 – like those Eliza Oak and I studied in a recent academic working paper – offer insight into how such systems might work in practice. Web3 startups have experimented for years with voting systems that hold extremely broad powers. While we are still early in the journey toward full democracy on AI platforms, these experiments offer four key lessons that apply elsewhere as well.
First, make votes consequential by tying voting power to something of tangible value to users. AI platforms could link voting power to digital tokens that users can also spend on the platform, for instance as credits toward additional compute time.
Second, don’t force everyone to vote on every proposal. Instead, encourage users to delegate their tokens to qualified experts who vote on their behalf and publicly explain how and why they voted on each proposal.
Third, build a rewards system that incentivizes meaningful participation in governance. Users who demonstrate sustained, meaningful participation should earn extra tokens, which they can use either for voting or to pay for AI usage. (A rough sketch of how these first three mechanisms could fit together appears after the fourth lesson below.)
Fourth, embed this voting system in a larger constitution that spells out what users can propose, when and how the company may veto certain kinds of proposals, how voting power is allocated, and so forth, and that makes explicit the company’s pledge not to set the central guardrails for its AI tools unilaterally.
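To make the first three lessons concrete, here is a minimal sketch in Python of how token-weighted voting, delegation to experts, and participation rewards could fit together. Every name, class and number here is hypothetical and for illustration only; a real platform would also need identity verification, sybil resistance, auditing and much more.

```python
from dataclasses import dataclass


@dataclass
class User:
    name: str
    tokens: int                  # tokens double as compute credits and voting power
    delegate: str | None = None  # optional expert who votes on this user's behalf


@dataclass
class Proposal:
    title: str
    votes_for: int = 0
    votes_against: int = 0

    def tally(self) -> str:
        return "passes" if self.votes_for > self.votes_against else "fails"


def effective_power(voter: User, users: dict[str, User]) -> int:
    """Lesson two: a voter's weight is their own tokens plus tokens delegated to them."""
    delegated = sum(u.tokens for u in users.values() if u.delegate == voter.name)
    return voter.tokens + delegated


def cast_vote(voter: User, users: dict[str, User], proposal: Proposal,
              support: bool, reward: int = 5) -> None:
    """Lesson one: votes are weighted by tokens users actually value.
    Lesson three: participating earns a small token reward."""
    weight = effective_power(voter, users)
    if support:
        proposal.votes_for += weight
    else:
        proposal.votes_against += weight
    voter.tokens += reward


# Hypothetical example: two users delegate to an expert, who votes on a guardrail.
users = {
    "alice": User("alice", tokens=100, delegate="expert"),
    "bob": User("bob", tokens=50, delegate="expert"),
    "expert": User("expert", tokens=20),
}
proposal = Proposal("Block photorealistic depictions of real public figures")
cast_vote(users["expert"], users, proposal, support=True)
print(proposal.tally())  # "passes": the expert votes with 170 tokens of weight
```

The specific numbers don’t matter; the structure does. Delegation concentrates informed votes in the hands of accountable experts, while rewards make sustained participation worth users’ time.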
Helping Society Trust What It Sees
Platforms can start small, piloting this experiment on a few decisions at first and gradually expanding its powers over time. But for it to succeed, AI companies must ultimately commit to not setting their central guardrails unilaterally. Only then can society trust that what we see, and the answers AI provides, aren’t being shaped by unaccountable actors who don’t share our values.