Pinned
Inferless
393 posts
- Join us for our first live townhall showcasing Inferless' new user interface 🎉 If you are an ML engineer who is struggling with issues like high cold-start times at peak loads and effective serverless orchestration for your ML workloads - we understand. We just released the
- 🚀 We’re officially LIVE on @ProductHunt. Check out the video below: it's a quick dive into the Inferless journey. We talk about why we started, the powerful benefits of Serverless GPUs, and give you a hands-on product demo. This video represents nearly two years of relentless
- 🎉 Exciting news from our Inferless community 🎉 🌟 We're thrilled to support our user @Tenyx_AI's incredible success. Their Tenyx_Chat-7B-v1 model is now leading the MT-Bench Leaderboard, surpassing giants like ChatGPT! We are a proud compute partner! 🚀 To help you get
- DM us! 💚🚀 Hiring for 2 roles (Product + Engineering) Hybrid Work Culture Salary range 30-90 LPA + ESOPs + Perks! If you wanna work at a startup with an engineering-first culture whose ambition is to build a cloud Infra company proudly from India - read onwards 🧵
- ▶︎ Inferless now supports text-to-speech streaming. Implement real-time TTS streaming using the Parler-TTS model today: • Quickly stream the generated audio from text input • Supports various voices & languages • Stream the audio chunk by chunk
- Llama 3.1 is now available on Inferless. Deploy Meta's latest open-source LLM with enterprise-grade performance and scalability. • Cold start: 15.44 sec • 74.79 tokens/sec • Running on A100 (80GB) Start building today: 🔗
  Quoted post: Starting today, open source is leading the way. Introducing Llama 3.1: Our most capable models yet. Today we’re releasing a collection of new Llama 3.1 models including our long awaited 405B. These models deliver improved reasoning capabilities, a larger 128K token context
  Link: docs.inferless.com, "Deploy Llama-3.1-8B-Instruct using Inferless - Inferless". Llama-3.1-8B-Instruct is a new state-of-the-art model from Meta's Llama-3.1 series of large language models. The repository is for the Llama-3.1-8B-Instruct model for deploying the model in the...
- 🚀 Dive into our latest blog post on LLMs Speed Benchmark and uncover the nuances of different Inference libraries! 📊 We've put Gemma 7B, Llama-2 7B, and Mistral 7B to the test across a range of scenarios to bring you comprehensive insights. Our detailed analysis covered
- 🚀 New Inferless Dashboard UI is live! Introducing a cleaner and more intuitive dashboard. 🎬 dub.sh/inferless-pw
- We're excited to recap our meetup on Deploying Generative AI Models, hosted with @huggingface & @peakxvpartners on June 10th, 2023 in Bangalore. Hundreds of AI, ML, and data science enthusiasts joined us to showcase their groundbreaking projects.
- 🎉 Kicking off the new year by optimizing @upstageai's SOLAR-10.7B-Instruct-v1.0 with the Auto-GPTQ library. When deployed on an A100 GPU using vLLM, the model shows impressive performance. Although Auto-GPTQ was an option, our experience suggests that vLLM is the superior choice
- 🎉 Congratulations to @SpoofSense on the launch of SpoofSense Face! 🚀 Your innovative Facial Liveness Detection model marks a significant step forward in safeguarding businesses against deepfake identity fraud. We're proud to be your Compute Partner and look forward to seeing
- Our Cofounder & CTO @nilesh_agarwal2 recently authored a tutorial on the @huggingface blog, where he explores how developers can utilize fractional GPUs in Kubernetes to potentially save up to 50% in costs. This approach involves dividing a single GPU into seven smaller units, each
- We're launching on Product Hunt in a week for the first time! If you've been waiting for an effortless way to deploy AI models, this is it. Would really appreciate your support. Set your reminders: be sure to click "Notify Me" to get updates on our launch on Product Hunt.












