I can't comment on the merit of the technical aspects, but I feel that, of all AI-generated content, AI-generated music in particular is about as interesting as AI-generated memoirs: sort of pointless. It lacks the human element that makes it relatable on an emotional level.
Are you saying you'll be able to tell AI music from human music because you can detect this human element? Or are you saying you'll retroactively hate a song once you figure out it was AI generated?
Can't speak for the person who brought the topic up, but I can't tell AI music apart from any human music. That's to say, music rules aren't that hard: you can learn a basic structure and a few popular scales, and write a completely soulless piece yourself.
Google "Minor Scale", choose any note as a starter, and here you go. You can even add any popular drum pattern.
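To make the "choose any note as a starter" bit concrete, here is a minimal sketch that spells out a natural minor scale from any root. The names `NOTE_NAMES`, `MINOR_STEPS` and `natural_minor_scale` are made up for illustration, and it only uses sharp spellings, so treat it as a toy rather than proper music notation.

```python
# The 12 pitch classes, spelled with sharps only (a simplification).
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Natural minor scale as semitone steps: whole, half, whole, whole, half, whole, whole.
MINOR_STEPS = [2, 1, 2, 2, 1, 2, 2]


def natural_minor_scale(root):
    """Return the seven note names of the natural minor scale starting on `root`."""
    index = NOTE_NAMES.index(root)
    scale = [root]
    # The last step only leads back to the octave, so it is skipped here.
    for step in MINOR_STEPS[:-1]:
        index = (index + step) % 12
        scale.append(NOTE_NAMES[index])
    return scale


print(natural_minor_scale("A"))
print(natural_minor_scale("E"))
```

Picking notes from that list at random over a stock drum loop is about all the "soulless piece" above needs.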
Soulless music written by humans is a dime a dozen. It's out there, people just choose to listen to better music. Scratch human written, there are algorithms based on music theory that could generate light music endlessly.
But completely original pieces, written by professionals aiming to express a feeling, are easily distinguishable from any AI song. An AI song doesn't leave you hanging or waiting; it doesn't know the value of breaking a pattern.
I think the point is more like, MusicAI might give you a progression from a Bob Dylan song, but it can't give you a Bob Dylan song and it will be easy to discern between the two.
> Or are you saying you'll retroactively hate a song once you figure out it was AI generated?
Can't speak for OP, but for me personally, I don't mind AI-augmented content, as long as it's done well. E.g., I recently played a game called "Slay the Princess", where they very clearly used Claude to write a lot of the dialogue/narration, but that didn't detract from the experience.
On the other hand, I hate it when I open a YouTube video and the script is 100% ChatGPT slop.
Music -- a lot of it is already made using software. If I enjoy listening to it, I don't care how it was made.
Yes, and people in 18th century used analogies like "humans are just intricate clockwork mechanisms". (Clockwork mechanisms were the most advanced technology at the time.)
I agree for fully generated work, but I think we’ll eventually reach a sweet spot for assisting tools that retain creativity while removing technical blockers. Things like beat quantization for producers but exponentially better.
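For anyone who hasn't used it, beat quantization is roughly this: snap each note onset to the nearest grid position. A minimal sketch, assuming onsets are given in seconds and a fixed tempo; the function name `quantize` and the `strength` parameter are illustrative and not taken from any particular DAW.

```python
def quantize(onsets_sec, bpm, subdivisions_per_beat=4, strength=1.0):
    """Move each onset toward the nearest grid line.

    strength=1.0 snaps fully to the grid, 0.0 leaves the timing untouched.
    """
    grid = 60.0 / bpm / subdivisions_per_beat  # grid spacing in seconds
    quantized = []
    for t in onsets_sec:
        target = round(t / grid) * grid  # nearest grid position
        quantized.append(t + strength * (target - t))
    return quantized


# Slightly sloppy sixteenth notes at 120 BPM (grid spacing is 0.125 s).
played = [0.13, 0.49, 1.02]
print(quantize(played, bpm=120))
print(quantize(played, bpm=120, strength=0.5))  # partial correction
```

The "exponentially better" version would presumably decide which deviations are mistakes and which are feel, instead of flattening all of them.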
The trick is going to be surfacing content in the sea of bullshit, though.
I’m not sure I get your point. Is it that you can’t make art with tools that produce average results?
If so, I really disagree: achieving great results with average building blocks is both feasible and, I’d say, usual. You can perfectly well imagine a very useful and successful app where any given individual block of code is nothing to write home about.
You can get a great jazz piano solo over a generated drum fill, you can stitch two recordings together by having AI generate the microseconds in the gap, you can produce a melody where AI adds human-like micro timing errors to the synth so it doesn’t sound like a robotic rendition, you can rap and generate adlibs as a second voice…
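The timing micro errors are the easiest of these to picture. A crude, non-AI sketch, assuming note onsets in seconds: add a small clamped Gaussian offset to each one. The name `humanize` and the default values are made up for illustration; a learned model would presumably make the offsets depend on the musical context rather than drawing them independently.

```python
import random


def humanize(onsets_sec, jitter_ms=8.0, max_ms=20.0, seed=None):
    """Shift each onset by a small random amount so it sounds less mechanical."""
    rng = random.Random(seed)  # a fixed seed makes the result reproducible
    humanized = []
    for t in onsets_sec:
        offset_ms = rng.gauss(0.0, jitter_ms)
        # Clamp so that no note drifts further than max_ms from the grid.
        offset_ms = max(-max_ms, min(max_ms, offset_ms))
        # Never move a note before the start of the clip.
        humanized.append(max(0.0, t + offset_ms / 1000.0))
    return humanized


grid = [0.0, 0.5, 1.0, 1.5]
print(humanize(grid, seed=1))
```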
What's giving Daniel Ek a hard-on is that the music industry realised AI isn't a big threat. AI first made Spotify's stock plummet from $300 to under $80, and once the realisation kicked in that the music industry is more about fame (i.e. real people other people want to relate to) than music itself, the stock price climbed up to $600.
About 30 years ago in a creative 2-week fit I wrote a mini-symphony. I’ve always envisioned expanding it into a full-length orchestral piece but don’t have the skills to do so.
I would love to use a music model that could help me do that.
I’m imagining feeding it my score and then iterating with it to create the final piece I’ve had stuck in my head all this time.
https://projectsam.com/libraries/the-free-orchestra Play around with something like this, and you'll be surprised by what you can come up with. Modern DAWs have given me a ton of freedom to create all types of music.
Nobody needs this thing indeed. Nice to have, but out of 100 producers I know, only 1 employs end-to-end AI in his process. People love to mold and generally touch what they creatively pursue. Even most elevator music (so-called vapourwave) was and is being made by humans.
Odd how you refer to vaporwave as elevator music. The original artists of the genre very lovingly remixed classic city pop from the 80s and 90s, and overlaid their own music on top.
I think you might be referring to "Muzak", a particular brand of background music that was sold to businesses to play in stores and elevators.
> And people that enjoy creating art will still do it regardless.
Some people will, but for me, I could never justify spending time and money to learn something an AI could do better than me. I'd constantly feel like I'm wasting my time.
That seems silly. I spend time and money to get better at guitar even though there are countless pro (and amateur) musicians who can do it better than me.
Yeah, but you don't have access to those people all the time. If I have a free tool on my laptop that can do everything better than me at my command, it feels silly to spend the time to master the instrument.
People still do calligraphy even though we have printers. People will actively spend money on a hobby they like. It's totally fair that you personally might not want to do it anymore if AI can, but I think the parent was saying other people will continue to do it if they enjoy it. There are tons of people who create copyright-free music anyway. I'm talking about those types of people. :)
I don't even care about profit. My success indicator is "making something I couldn't make before". Since AI can do it, I can't justify learning an instrument, even if I might enjoy the process a bit.
I think this sentiment is actually incredibly sad and more or less the crux of the matter.
AI is a tool intended to increase efficiency. The way you describe the act of making art/crafting things seems to come through the lens of efficiency, where you're wasting your time in the struggle of improvement.
The time spent struggling is not time wasted. The struggle is the single most important aspect of learning and improving. It's how lessons learned reach the deep subconscious, until using them is like walking or writing with a pen. No good musician, painter, sculptor, systems engineer, etc. became that way without that deep struggle. If someone says they never struggled, be wary of them.
If we round off every corner in the name of efficiency and the bottom line, all we're left with is toil. We don't benefit from that toil, the people who push for greater and greater efficiency do. No matter how efficient we get, the excesses produced by the efficiencies will be guarded. Food thrown away, excess milk dumped in the dirt, crops burned, profits stowed away.
The AI can only exist via previous struggle. The music produced by AI will only ever be a shadow of human struggle. It will never truly exceed that struggle. A series of vectors in memory on a graphics card can't know the feeling of a beat as it resounds in one's chest, or the resolution of a chord as the bridge turns around. It can't know the sympathetic emotions of a story told in lyrics. It can't know the nostalgia evoked by a shadow in a painting, or the familiarity of an expression in a face.
Only we as humans can create this, and it's sad and disgusting to see these tools used to degrade the human experience for an ounce of profit by people who refuse to allow themselves to understand the struggle of creating art.
I appreciate the in-depth response, and I agree for the most part; the issue is, I can choose to struggle on something more practical. I'd get less fulfillment, but in today's economy, it's a far safer option, and I'd still learn and grow.
Do you think you're perhaps overdoing the self-promotion?
> Please don't use HN primarily for promotion. It's ok to post your own stuff part of the time, but the primary use of the site should be for curiosity.
It doesn't seem to be the same content being pushed each time. Someone posting their book or the same article once a week vs. posting new content each week is not the same to me. To me, it's not obvious that it's against the rule, especially when the content seems to be engaging enough to make it to the FP.
I work on music models, and this is a very cool paper! There are no papers that go into depth on how token-based AR music models (that aren't absurdly inefficient like Yue) are trained. I'm particularly interested in your semantic tokens. I tried reproducing the CTC loss part, but my curve was very spiky and the model didn't seem to actually figure out any characters. The semantic tokens gave great acoustic info but gibberish lyrics. What did your CTC loss curves look like, and did you see anything similar at any point?
As a semi-aside, I feel like semantic tokens in general may end up being a bottleneck on how interesting model outputs can be.
How is it possible that text-to-score/notation is lagging text-to-audio in music generation? Generating audio seems wildly more complicated!
Since you are working in this space, I wonder if you could comment on my pet theories for why this is true: 1. Not enough training data (scores not available for most songs), or 2. Difficulty with tokenization of musical notation vs. audio
Were there any examples of "de novo" music generation using this? The only one I could find on the website was translating the vocals of an existing song, couldn't find any AI compositions.
I think saying "track record" is then factually wrong. Merriam-Webster defines track record as: "a record of past performance often taken as an indicator of likely future performance". In that case, you should be able to point to the record.
I assume you will still stand behind the essence of your comment. In that case, it would be better to say "Based on my experience playing with their models, I have strong reasons to believe that they continuously cheat on benchmarks by training on test datasets." You can then also add that this matches what you hear from others in the field.
Hmm, even if the model performed poorly on real-world tasks vs. the benchmark, it doesn't necessarily imply they train on the benchmarks themselves, right? Maybe they did the train/test split properly and didn't train on the test set, but the benchmark itself was bad at representing real-world tasks? If so, it seems pretty wild to accuse a company of training on test data. Maybe this is "vibe commenting" and I'm just out of the loop.