Inferless
393 posts
Inferless
@Inferless_
Fastest serverless GPU inference for custom models (acq by @baseten)
San Francisco
inferless.com
Joined March 2023
52 Following · 856 Followers
  • Pinned
    Inferless
    @Inferless_
    March 20
    Announcing the acquihire of Inferless by Baseten
    From baseten.co
    623
  • Inferless
    @Inferless_
    August 20, 2024
    Join us for our first live townhall showcasing Inferless' new user interface 🎉 If you are an ML engineer who is struggling with issues like high cold-start times at peak loads and effective serverless orchestration for your ML workloads - we understand. We just released the
    5.9K
  • Inferless
    @Inferless_
    March 25, 2025
    🚀 We’re officially LIVE on @ProductHunt Check out the video below—it's a quick dive into the Inferless journey. We talk about why we started, the powerful benefits of Serverless GPUs, and give you a hands-on product demo. This video represents nearly two years of relentless
    6.7K
  • Inferless
    @Inferless_
    January 8, 2024
    🎉 Exciting news from our Inferless community 🎉 🌟 We're thrilled to support our user @Tenyx_AI's incredible success. Their Tenyx_Chat-7B-v1 model is now leading the MT-Bench Leaderboard, surpassing giants like Chat GPT! We are a proud compute partner! 🚀 To help you get
    2.7K
  • Inferless
    @Inferless_
    June 23, 2023
    DM us! 💚
    Aishwarya Goel (Ash)
    Baseten
    @aishwarya_08
    June 23, 2023
    🚀 Hiring for 2 roles (Product + Engineering). Hybrid work culture. Salary range 30-90 LPA + ESOPs + perks! If you wanna work at a startup with an engineering-first culture whose ambition is to build a cloud Infra company proudly from India - read onwards 🧵
    5K
  • Inferless
    @Inferless_
    August 1, 2024
    ▶︎ Inferless now supports text-to-speech streaming. Implement real-time TTS streaming using the Parler-TTS model today:
      • Quickly stream the generated audio from text input
      • Supports various voices & languages
      • Stream the audio chunk by chunk
    2.2K
  • Inferless
    @Inferless_
    July 25, 2024
    Llama 3.1 is now available on Inferless. Deploy Meta's latest open-source LLM with enterprise-grade performance and scalability.
      • Cold start: 15.44 sec
      • 74.79 tokens/sec
      • Running on A100 (80GB)
    Start building today: 🔗
    AI at Meta
    Meta
    @AIatMeta
    July 23, 2024
    Starting today, open source is leading the way. Introducing Llama 3.1: Our most capable models yet. Today we’re releasing a collection of new Llama 3.1 models including our long awaited 405B. These models deliver improved reasoning capabilities, a larger 128K token context
    docs.inferless.com
    Deploy Llama-3.1-8B-Instruct using Inferless - Inferless
    Llama-3.1-8B-Instruct is a new state-of-the-art model from Meta's Llama-3.1 series of large language models. The repository is for the Llama-3.1-8B-Instruct model for deploying the model in the...
    2.1K
  • Inferless
    @Inferless_
    March 22, 2024
    🚀 Dive into our latest blog post on LLMs Speed Benchmark and uncover the nuances of different Inference libraries! 📊 We've put Gemma 7B, Llama-2 7B, and Mistral 7B to the test across a range of scenarios to bring you comprehensive insights. Our detailed analysis covered
    224
  • Inferless
    @Inferless_
    August 20, 2024
    🚀 New Inferless Dashboard UI is live! Introducing a cleaner and more intuitive dashboard. 🎬 dub.sh/inferless-pw
    3.3K
  • Inferless
    @Inferless_
    June 14, 2023
    We're excited to recap our meetup on Deploying Generative AI Models, hosted with @huggingface & @peakxvpartners on June 10th, 2023 in Bangalore. Hundreds of AI, ML, and data science enthusiasts joined us to showcase their groundbreaking projects.
    526
  • Inferless
    @Inferless_
    January 4, 2024
    🎉 Kicking off the new year by optimizing @upstageai's SOLAR-10.7B-Instruct-v1.0 with the Auto-GPTQ library. When deployed on an A100 GPU using vLLM, the model shows impressive performance. Although Auto-GPTQ was an option, our experience suggests that vLLM is the superior choice
    565
  • Inferless
    @Inferless_
    January 31, 2024
    🎉 Congratulations to @SpoofSense on the launch of SpoofSense Face! 🚀 Your innovative Facial Liveness Detection model marks a significant step forward in safeguarding businesses against deepfake identity fraud. We're proud to be your Compute Partner and look forward to seeing
    1.8K
  • Inferless
    @Inferless_
    January 17, 2024
    Our Cofounder & CTO @nilesh_agarwal2 recently authored a tutorial on the @huggingface blog, where he explores how developers can utilize fractional GPUs in Kubernetes to potentially save up to 50% in costs. This approach involves dividing a single GPU into seven smaller units, each
    2K
  • Inferless
    @Inferless_
    March 18, 2025
    We're launching on Product Hunt in a week for the first time! If you've been waiting for an effortless way to deploy AI models, this is it. Would really appreciate your support. Set your reminders: be sure to click "Notify Me" to get updates on our launch on Product Hunt.
    474
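
To put the Llama 3.1 figures posted above in context (cold start 15.44 sec, 74.79 tokens/sec on an A100 80GB), here is a rough back-of-the-envelope sketch of request latency. It assumes, purely for illustration, that generation throughput is constant and that a cold request pays the full cold-start time before any tokens are produced; the helper name `estimate_latency_sec` and the 256-token example are hypothetical and not from Inferless.

```python
# Figures quoted in the Inferless Llama 3.1 post (A100 80GB).
COLD_START_SEC = 15.44
TOKENS_PER_SEC = 74.79


def estimate_latency_sec(num_tokens: int, cold: bool) -> float:
    """Hypothetical helper: rough time to generate `num_tokens` tokens.

    Assumes constant throughput and, for a cold request, that the full
    cold-start time is paid once before generation begins.
    """
    if num_tokens < 0:
        raise ValueError("num_tokens must be non-negative")
    generation_sec = num_tokens / TOKENS_PER_SEC
    return generation_sec + (COLD_START_SEC if cold else 0.0)


if __name__ == "__main__":
    tokens = 256  # illustrative response length, not from the post
    for cold in (False, True):
        label = "cold" if cold else "warm"
        print(f"{label}: {estimate_latency_sec(tokens, cold):.2f} sec")
```

Under these assumptions the cold-start cost is a fixed offset, so it dominates short responses and matters less as the response gets longer.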
