platform, as well as several new capabilities aimed at simplifying how developers build and deploy AI applications. This announcement represents a significant step forward in Cloudflare's efforts to democratize AI and make it more accessible to developers worldwide.
With Workers AI, developers can now run machine learning models on Cloudflare's global network, leveraging the company's distributed infrastructure to deliver low-latency inference. The platform currently offers 14 curated Hugging Face models optimized for Cloudflare's serverless inference platform, supporting tasks such as text generation, embeddings, and sentence similarity. Developers can simply choose a model on Hugging Face, click "Deploy to Cloudflare Workers AI," and instantly distribute it across Cloudflare's global network of more than 150 cities with GPUs deployed. They can then interact with LLMs such as Mistral, Llama 2, and others via a simple REST API, as in the sketch below.
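As a rough illustration of what that REST API interaction looks like, here is a minimal TypeScript sketch. The account ID, API token, and model slug are placeholders, and the exact request/response shape should be confirmed against Cloudflare's Workers AI documentation.

```typescript
// Minimal sketch: calling a Workers AI model over the REST API.
// ACCOUNT_ID, API_TOKEN, and MODEL are placeholders -- substitute your own values.
const ACCOUNT_ID = "<your-account-id>";
const API_TOKEN = "<your-api-token>";
const MODEL = "@cf/meta/llama-2-7b-chat-int8"; // example model slug

async function runInference(prompt: string): Promise<string> {
  const response = await fetch(
    `https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/ai/run/${MODEL}`,
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${API_TOKEN}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ prompt }),
    }
  );

  // Assumes the usual Cloudflare API envelope, where the model output
  // is returned under a `result` field.
  const data = await response.json();
  return data.result.response;
}

runInference("Explain serverless inference in one sentence.").then(console.log);
```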
Additionally, Cloudflare has increased the rate limits for most large language models to 300 requests per minute, up from 50 requests per minute during the beta phase. Smaller models now have rate limits ranging from 1,500 to 3,000 requests per minute, further enhancing the platform's scalability and responsiveness.
Cloudflare's AI Gateway sits between applications and AI providers like OpenAI, Hugging Face, and Replicate, enabling developers to connect their applications to these providers with just a single line of code change, as illustrated below.
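The following TypeScript sketch shows what that one-line change typically amounts to when using the OpenAI Node SDK: overriding the client's base URL so requests are routed through the gateway. The account ID and gateway name are placeholders, and the exact endpoint format should be verified against Cloudflare's AI Gateway documentation.

```typescript
// Sketch: routing an existing OpenAI integration through an AI gateway.
import OpenAI from "openai";

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  // The only change from a direct OpenAI setup is this base URL, which sends
  // requests through the gateway instead of straight to api.openai.com.
  baseURL:
    "https://gateway.ai.cloudflare.com/v1/<ACCOUNT_ID>/<GATEWAY_NAME>/openai",
});

async function main() {
  const completion = await openai.chat.completions.create({
    model: "gpt-3.5-turbo",
    messages: [{ role: "user", content: "Hello from behind the gateway!" }],
  });
  console.log(completion.choices[0].message.content);
}

main();
```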