Skip to main content

New top story on Hacker News: Show HN: Tune LLaMa3.1 on Google Cloud TPUs

Show HN: Tune LLaMa3.1 on Google Cloud TPUs
3 by felarof | 0 comments on Hacker News.
Hey HN, we wanted to share our repo where we fine-tuned Llama 3.1 on Google TPUs. We’re building AI infra to fine-tune and serve LLMs on non-NVIDIA GPUs (TPUs, Trainium, AMD GPUs). The problem: Right now, 90% of LLM workloads run on NVIDIA GPUs, but there are equally powerful and more cost-effective alternatives out there. For example, training and serving Llama 3.1 on Google TPUs is about 30% cheaper than NVIDIA GPUs. But developer tooling for non-NVIDIA chipsets is lacking. We felt this pain ourselves. We initially tried using PyTorch XLA to train Llama 3.1 on TPUs, but it was rough: xla integration with pytorch is clunky, missing libraries (bitsandbytes didn't work), and cryptic HuggingFace errors. We then took a different route and translated Llama 3.1 from PyTorch to JAX. Now, it’s running smoothly on TPUs! We still have challenges ahead, there is no good LoRA library in JAX, but this feels like the right path forward. Here's a demo ( https://ift.tt/bkr2JVM ) of our managed solution. Would love your thoughts on our repo and vision as we keep chugging along!

Comments

Popular posts from this blog

New top story on Hacker News: Tell HN: I think I found Toyota's battery

Tell HN: I think I found Toyota's battery 173 by scythe | 29 comments on Hacker News. Recently there was a thread about a "breakthrough" in battery technology at Toyota. https://ift.tt/nUtv4yY Toyota has been putting out PR puff pieces about their "solid-state" (solid-electrolyte) batteries for years, but this story was unique in that it had a quote from Keiji Kaita, who holds some high-level role at Toyota. Anyway, I didn't think much of it, because there was no paper referenced in the Guardian article, which seemed to be the original source. But while reading about something else, I came across the paper "A near dimensionally invariable high-capacity positive electrode material", published in Nature Materials last December: https://ift.tt/24ZXPy5 This paper, reporting a cathode that has very little (much less than normal) change in size or shape when charged and discharged, claims reversible storage with a solid electrolyte. It stands to reaso

New top story on Hacker News: Show HN: Neucards – Privacy based digital contact card

Show HN: Neucards – Privacy based digital contact card 7 by bdominy | 1 comments on Hacker News. Neucards is an end-to-end encrypted contact information sharing and updating iOS app that protects your identity while letting you keep in touch with people. I started working on neucards as a side project more than ten years ago, and I decided three years ago to go full-time and try to build a community around it. There are two major problems that neucards addresses. First, most people end up with contact lists that are hopelessly out of date. Over time, people move, change jobs, or add social profiles and unless they tell you, chances are you could lose touch. Second, your contact information ends up in the wrong hands. There has been a huge increase in robocalls, unsolicited emails, data breaches, and online scams that is driven by accessing a person's contact info. Even worse, with AI now being able to imitate a person's voice or other mannerisms, knowledge about the connecti