
We are pleased to announce Claude 2, our new model. Claude 2 has improved performance, longer responses, and can be accessed via API as well as a new public-facing beta website, claude.ai. We have heard from our users that Claude is easy to converse with, clearly explains its thinking, is less likely to produce harmful outputs, and has a longer memory. We have made improvements from our previous models on coding, math, and reasoning. For example, our latest model scored 76.5% on the multiple choice section of the Bar exam, up from 73.0% with Claude 1.3. When compared to college students applying to graduate school, Claude 2 scores above the 90th percentile on the GRE reading and writing exams, and similarly to the median applicant on quantitative reasoning.
Think of Claude as a friendly, enthusiastic colleague or personal assistant who can be instructed in natural language to help you with many tasks. The Claude 2 API for businesses is being offered for the same price as Claude 1.3. Additionally, anyone in the US and UK can start using our beta chat experience today.
As we work to improve both the performance and safety of our models, we have increased the length of Claude’s input and output. Users can input up to 100K tokens in each prompt, which means that Claude can work over hundreds of pages of technical documentation or even a book. Claude can now also write longer documents - from memos to letters to stories up to a few thousand tokens - all in one go.
In addition, our latest model has greatly improved coding skills. Claude 2 scored 71.2%, up from 56.0%, on the Codex HumanEval, a Python coding test. On GSM8k, a large set of grade-school math problems, Claude 2 scored 88.0%, up from 85.2%. We have an exciting roadmap of capability improvements planned for Claude 2 and will be slowly and iteratively deploying them in the coming months.
We've been iterating to improve the underlying safety of Claude 2, so that it is more harmless and harder to prompt to produce offensive or dangerous output. We have an internal red-teaming evaluation that scores our models on a large representative set of harmful prompts using an automated test, and we also regularly check the results manually. In this evaluation, Claude 2 was 2x better at giving harmless responses compared to Claude 1.3. Although no model is immune from jailbreaks, we’ve used a variety of safety techniques (which you can read about here and here), as well as extensive red-teaming, to improve its outputs.
Claude 2 powers our chat experience, and is generally available in the US and UK. We are working to make Claude more globally available in the coming months. You can now create an account and start talking to Claude in natural language, asking it for help with any tasks that you like. Talking to an AI assistant can take some trial and error, so read up on our tips to get the most out of Claude.
We are also currently working with thousands of businesses who are using the Claude API. One of our partners is Jasper, a generative AI platform that enables individuals and teams to scale their content strategies. They found that Claude 2 was able to go head to head with other state-of-the-art models for a wide variety of use cases, but has particular strength for long-form, low-latency uses. "We are really happy to be among the first to offer Claude 2 to our customers, bringing enhanced semantics, up-to-date knowledge training, improved reasoning for complex prompts, and the ability to effortlessly remix existing content with a 3X larger context window," said Greg Larson, VP of Engineering at Jasper. "We are proud to help our customers stay ahead of the curve through partnerships like this one with Anthropic."
Sourcegraph is a code AI platform that helps customers write, fix, and maintain code. Their coding assistant Cody uses Claude 2’s improved reasoning ability to give even more accurate answers to user queries while also passing along more codebase context with up to 100K context windows. In addition, Claude 2 was trained on more recent data, meaning it has knowledge of newer frameworks and libraries for Cody to pull from. “When it comes to AI coding, devs need fast and reliable access to context about their unique codebase and a powerful LLM with a large context window and strong general reasoning capabilities,” says Quinn Slack, CEO & Co-founder of Sourcegraph. “The slowest and most frustrating parts of the dev workflow are becoming faster and more enjoyable. Thanks to Claude 2, Cody’s helping more devs build more software that pushes the world forward.”
We welcome your feedback as we work to responsibly deploy our products more broadly. Our chat experience is an open beta launch, and users should be aware that Claude – like all current models – can generate inappropriate responses. AI assistants are most useful in everyday situations, like serving to summarize or organize information, and should not be used where physical or mental health and well-being are involved. Please let us know if you’d like to talk to Claude in a currently unsupported area, or if you are a business that would like to start working with Claude.