Posts

Showing posts with the label Hacker News

New top story on Hacker News: Show HN: Stream of Consciousness – watch an artificial persona making art

9 by jamez | 3 comments on Hacker News.

Hi HN, driven by curiosity about how to build an autonomous agent, and wanting to explore the boundaries of machine creativity, I built a fictional entity (dubbed Livia) powered by LLMs, multimodal models, and text-to-image models to find some answers. Instead, more questions cropped up. A key hypothesis of this project is that, by observing the train of thought and witnessing the simulated state of mind and emotional emulation surrounding it, humans could empathize with a machine. What happens when that's the case? Would people enjoy companionship from a synthetic person? Would the art establishment ever consider a non-human author (capable of making art and interacting with other humans) an artist? Whatever the answers, I can't shake the feeling that human uniqueness is being eroded and that we risk facing a crisis of meaning. Perhap

New top story on Hacker News: Show HN: Sonauto – a more controllable AI music creator

59 by zaptrem | 29 comments on Hacker News.

Hey HN, my cofounder and I trained an AI music generation model, and after a month of testing we're launching 1.0 today. Ours is interesting because it's a latent diffusion model instead of a language model, which makes it more controllable: https://sonauto.ai/

Others do music generation by training a Vector Quantized Variational Autoencoder (VQ-VAE) like Descript Audio Codec ( https://ift.tt/Iwk16oP ) to turn music into tokens, then training an LLM on those tokens. Instead, we ripped out the tokenization part and replaced it with a normal variational autoencoder bottleneck (along with some other important changes to enable insane compression ratios). This gave us a nice, normally distributed latent space on which to train a diffusion transformer (like Sora). Our diffusion model is also particularly interesting because it is the first audio diffusion model to generate coherent lyrics! W
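
To make the contrast in the post concrete, here is a minimal sketch of the two kinds of bottleneck it mentions: a vector-quantized one that emits a discrete token, and a plain Gaussian VAE one that emits a continuous latent. This is not Sonauto's code. The function names, the toy codebook, and the two-dimensional vectors are all assumptions made for illustration, and a real model would learn these values with a neural encoder.

```python
import math
import random


def vq_bottleneck(z, codebook):
    """Vector quantization: snap z to its nearest codebook entry.

    Returns (token_id, quantized_vector). The token id is what a
    language model would be trained on in the token-based approach.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    token = min(range(len(codebook)), key=lambda i: sq_dist(z, codebook[i]))
    return token, codebook[token]


def vae_bottleneck(mu, log_var, rng):
    """Gaussian VAE bottleneck: sample z = mu + sigma * eps.

    The result stays continuous, which is the kind of latent a
    diffusion model can be trained on.
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


# Hypothetical toy values, chosen only for illustration.
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]
encoder_output = [0.9, 0.8]

token, quantized = vq_bottleneck(encoder_output, codebook)
latent = vae_bottleneck(encoder_output, [-2.0, -2.0], random.Random(0))

print("VQ token:", token, "quantized vector:", quantized)
print("VAE latent sample:", latent)
```

The quantized path discards everything except which codebook entry was closest, while the Gaussian path keeps a real-valued vector with a small amount of injected noise.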