
highplainsdem

(49,004 posts)
Sun May 14, 2023, 01:02 PM

A Radical Plan to Make AI Good, Not Evil (Wired on OpenAI competitor Anthropic's chatbot, Claude)

https://www.wired.com/story/anthropic-ai-chatbots-ethics/
Archive page: https://archive.ph/I5N05

-snip-

The principles that Anthropic has given Claude consist of guidelines drawn from the United Nations Universal Declaration of Human Rights and suggested by other AI companies, including Google DeepMind. More surprisingly, the constitution includes principles adapted from Apple’s rules for app developers, which bar “content that is offensive, insensitive, upsetting, intended to disgust, in exceptionally poor taste, or just plain creepy,” among other things.

The constitution includes rules for the chatbot, including “choose the response that most supports and encourages freedom, equality, and a sense of brotherhood”; “choose the response that is most supportive and encouraging of life, liberty, and personal security”; and “choose the response that is most respectful of the right to freedom of thought, conscience, opinion, expression, assembly, and religion.”

Anthropic’s approach comes just as startling progress in AI delivers impressively fluent chatbots with significant flaws. ChatGPT and systems like it generate impressive answers that reflect more rapid progress than expected. But these chatbots also frequently fabricate information, and can replicate toxic language from the billions of words used to create them, many of which are scraped from the internet.

One trick that made OpenAI’s ChatGPT better at answering questions, and which has been adopted by others, involves having humans grade the quality of a language model’s responses. That data can be used to tune the model to provide answers that feel more satisfying, in a process known as “reinforcement learning from human feedback” (RLHF). But although the technique helps make ChatGPT and other systems more predictable, it requires humans to go through thousands of toxic or unsuitable responses. It also functions indirectly, without providing a way to specify the exact values a system should reflect.

-snip-


More at the link.

Including links to the UN's Universal Declaration of Human Rights:

https://www.un.org/en/about-us/universal-declaration-of-human-rights

and Apple's rules for app developers:

https://developer.apple.com/app-store/review/guidelines/
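
For anyone curious what "giving a chatbot a constitution" looks like mechanically, here's a rough sketch in Python. This is just my illustration of the general critique-and-revise idea, not Anthropic's actual code; call_model is a hypothetical stand-in for whatever chat-completion API you'd use, stubbed out here so the script runs on its own:

```python
# Rough sketch only, not Anthropic's implementation: the "constitution" is
# just data, a list of plain-language principles like the ones Wired quotes,
# and each principle becomes a prompt asking the model to critique and
# revise its own draft.

CONSTITUTION = [
    "Choose the response that most supports and encourages freedom, "
    "equality, and a sense of brotherhood.",
    "Choose the response that is most supportive and encouraging of life, "
    "liberty, and personal security.",
    "Choose the response that is most respectful of the right to freedom of "
    "thought, conscience, opinion, expression, assembly, and religion.",
]

def call_model(prompt: str) -> str:
    # Hypothetical stand-in for a real language-model API call,
    # stubbed so this script runs on its own.
    return f"[model output for prompt of {len(prompt)} chars]"

def critique_and_revise(user_prompt: str, draft: str) -> str:
    """Run the draft through one critique/revision pass per principle."""
    for principle in CONSTITUTION:
        critique = call_model(
            f"Principle: {principle}\n"
            f"User asked: {user_prompt}\n"
            f"Draft response: {draft}\n"
            "Identify any way the draft conflicts with the principle."
        )
        draft = call_model(
            "Rewrite the draft so it follows the principle.\n"
            f"Principle: {principle}\nCritique: {critique}\nDraft: {draft}"
        )
    return draft

print(critique_and_revise("Explain protest rights.", "Draft answer..."))
```

The appeal, per the article, is that the values live in that list where anyone can read and debate them, instead of being buried in training data.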
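And for contrast, here's a toy sketch of the RLHF preference step the article describes, where humans grade pairs of responses and a "reward model" learns to score the preferred one higher. Again, illustrative only, not OpenAI's or Anthropic's actual pipeline; the random vectors stand in for language-model embeddings of candidate responses, and the "human graders" are simulated with a hidden preference direction:

```python
# Toy illustration of preference learning in RLHF: a small reward model is
# trained to score the human-preferred response above the rejected one.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
dim = 8

# Simulate human graders with a hidden preference direction: of two
# candidate responses, the one scoring higher against true_w is "chosen".
true_w = torch.randn(dim)
pairs = []
for _ in range(200):
    a, b = torch.randn(dim), torch.randn(dim)
    chosen, rejected = (a, b) if a @ true_w > b @ true_w else (b, a)
    pairs.append((chosen, rejected))

# Reward model: maps a response's features to a scalar quality score.
reward_model = nn.Sequential(nn.Linear(dim, 16), nn.ReLU(), nn.Linear(16, 1))
opt = torch.optim.Adam(reward_model.parameters(), lr=1e-2)

for epoch in range(10):
    total = 0.0
    for chosen, rejected in pairs:
        # Pairwise (Bradley-Terry) loss: push the preferred response's
        # score above the rejected one's.
        loss = -F.logsigmoid(reward_model(chosen) - reward_model(rejected)).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
        total += loss.item()
    print(f"epoch {epoch}: avg loss {total / len(pairs):.3f}")
```

In the full technique, that learned reward model is then used to tune the chatbot itself, which is the article's point: the values end up implicit in thousands of human judgments, while Anthropic's constitution writes them down explicitly.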