Anthropic, an AI startup founded by ex-OpenAI executives, has unveiled its new large language model (LLM), Claude 2.
Available as a web beta in the US and UK and through a paid API, the new model boasts improved performance and capacity compared to its predecessor.
Claude 2 is an evolution of Claude 1.3, capable of searching documents, summarizing content, writing, coding, and answering questions. It’s similar to other LLMs like ChatGPT but accepts attachments, enabling users to upload files and have the AI analyze and use them.
Claude 2 outperforms 1.3 in several areas. For instance, it scores higher on various tests, including the multiple-choice sections of the bar exam and the US Medical Licensing Exam. It also outperforms its predecessor on math and coding problems, including the Codex HumanEval Python coding test.
Anthropic’s head of go-to-market, Sandy Banerjee, elaborates on these improvements: “We’ve been working on improving the reasoning and sort of self-awareness of the model, so it’s more aware of, ‘here’s how I like follow instructions,’ ‘I’m able to process multi-step instructions’ and also more aware of its limitations.”
The training data for Claude 2, compiled from websites, licensed data sets from third parties, and user data from early 2023, is more recent than that of Claude 1.3. However, the models are ultimately similar; Banerjee admits that Claude 2 is an optimized version of Claude 1.3.
Like other LLMs, Claude is far from infallible. TechCrunch says the AI has been manipulated to invent names for nonexistent chemicals and offer questionable instructions for producing weapons-grade uranium, among other things. However, Anthropic asserts that Claude 2 is “2x better” at providing “harmless” responses than its predecessor.
“[Our] internal red teaming evaluation scores our models on a very large representative set of harmful adversarial prompts,” Banerjee stated, “and we do this with a combination of automated tests and manual checks.” This is important to Anthropic, as the model’s neutral personality is central to the company’s marketing efforts.
Anthropic uses a technique called ‘constitutional AI,’ which instills values defined by a ‘constitution’ into models like Claude 2. The aim is to make the model’s behavior easier to understand and adjust as needed.
Anthropic’s vision is to create a “next-gen algorithm for AI self-teaching,” and Claude 2 is just one step toward this goal.
Banerjee concluded, “We’re still working through our approach. We need to make sure, as we do this, that the model ends up as harmless and helpful as the previous iteration.”
What is Claude?
Claude is an AI assistant developed by Google-backed Anthropic, a startup comprising a few ex-OpenAI researchers. It’s designed to be ‘helpful, honest, and harmless’ and is accessible via a chat interface and API.
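For developers, the API access mentioned above amounts to a plain HTTPS request. As a minimal sketch: the endpoint path, header names, model string, and Human/Assistant prompt format below reflect Anthropic’s public documentation at Claude 2’s launch, but treat them as assumptions and confirm against the current docs (a real Anthropic API key is required to actually send anything).

```python
import json

# Assumed endpoint for Anthropic's text-completion API at Claude 2's launch;
# verify against current documentation before use.
API_URL = "https://api.anthropic.com/v1/complete"


def build_claude_request(question: str, api_key: str) -> tuple[dict, dict]:
    """Return (headers, payload) for a single-turn Claude 2 completion."""
    headers = {
        "x-api-key": api_key,          # assumed auth header name
        "anthropic-version": "2023-06-01",
        "content-type": "application/json",
    }
    payload = {
        "model": "claude-2",
        # The completion API expected Human/Assistant turn markers in the prompt.
        "prompt": f"\n\nHuman: {question}\n\nAssistant:",
        "max_tokens_to_sample": 300,
    }
    return headers, payload


headers, payload = build_claude_request("Summarize this contract clause.", "YOUR_API_KEY")
print(json.dumps(payload, indent=2))
```

The payload could then be POSTed with any HTTP client; Anthropic also shipped official SDKs that wrap this request shape.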
Claude can assist with a wide range of tasks, including summarization, creative and collaborative writing, question answering, and coding.
Several companies have implemented Claude, including Notion, Quora, and DuckDuckGo. It has been used to improve Quora’s AI chat app, Poe, and is integrated into the productivity app Notion.
Other partners include Robin AI, a legal business that uses Claude to understand and redraft complex legal texts, and AssemblyAI, which uses Claude to transcribe and understand audio data at scale.
Claude’s ability to work with files potentially makes it better suited to some productivity-based uses than competitors like ChatGPT.
Users in the US and UK can judge that for themselves by trying the web beta.