> I spent a decade as an electronic musician, spending literally thousands of hours dragging little boxes around on a screen. So much of creative work is defined by this kind of tedious grind. ... This isn't creative. It's just a slog. Every creative field - animation, video, software - is full of these tedious tasks. Of course, there’s a case to be made that the very act of doing this manual work is what refines your instincts - but I think it’s more of a “Just So” story than anything else. In the end, the quality of art is defined by the quality of your decisions - how much work you put into something is just a proxy for how much you care and how much you have to say.
Great insights here, thanks for sharing. That opening question really clicked for me.
That quote seriously rubs me the wrong way. "Dragging little boxes around" in a DAW is creative; it constitutes the entire process of composing electronic music. You are notating what notes to play, when and for how long they play, what instrument plays them, and any modifications to the default sound of that instrument. Is writing sheet music tedious? Sure, it can be, when the speed of notating by hand can't keep up with the speed your brain is thinking through ideas. But being tedious is not mutually exclusive with being creative, despite the attempt to explicitly contrast them as such, and the solution to the process of notating your creativity being tedious is not "randomly generate a bunch of notes and instruments that have little relation to the ones you're thinking of". This excerpt supposes that generative AI lets you automate the tedious part while keeping "the quality of your decisions", but it doesn't keep your decisions. It generates its own "decisions" from a broad, high-level prompt, and your role is reduced to merely deciding whether or not you like the generated content, which is not creativity.
I'd say that deciding "where a transient should go" is creative; manually aligning 15 other tracks over and over again is not (not to mention having to do it in both the DAW and Melodyne)...
I agree that "push button get image" AI generation is at best a bit cheap, at worst deeply boring. Art is resistance in a medium - but at what point is that resistance just masochism?
Georges Perec took this idea to the max when he wrote an entire novel without the letter "E" - in French! And then someone had the audacity to translate it into English (e excluded)! Would I ever want to do that? Hell no, but I'm very glad to live in a world where someone else is crazy enough to.
I've spent my 10,000 hours making "real" art and don't really feel the need to justify myself - but to all of the young people out there who are afraid to play with something new because some grumps on Hacker News might get upset:
It doesn't matter what you make or how you make it. What matters is why you make it and what you have to say.
> It doesn't matter what you make or how you make it. What matters is why you make it and what you have to say.
I want to add one point: that you make/ship something at all.
When the first image-generating models came out, my head was spinning with ideas for different images I'd want to generate, maybe to print out and hang on the wall. After an initial phase of playing around, the excitement faded, even though the models are more than good enough with a bit of fiddling. My walls are still pretty bare.
Turns out even reducing the cost of generating an image to zero does not make me in particular churn out ideas. I suspect this is true for most applications of AI for most people.
Related? Unrelated? Places like redbubble.com will let you print your design on t-shirts, dresses, mugs, pillow covers, bathroom curtains, bedspreads, shower curtains, stickers, posters, phone cases, and 50+ other things. And, cheap! You can order a single print t-shirt for ~$20.
When I first saw all the items I was like "yeah, I'm going to cover my house in custom stuff". But other than a few personal t-shirts I haven't done anything.
To be clear, as I said in another reply downthread, I think this particular project is creative, although creative in a fundamentally different way that does not replace existing creative expression. I also don't object to people doing "push button get image" for entertainment (although I do object to it being spammed all over the internet in spaces meant for human art and drowning out people who put effort into what they create, because 10,000 images can be generated in the time it takes to draw a single one). But "push button get image" is not making things yourself. I would give you credit for creating this because you put effort into fine-tuning a model and a bespoke pipeline to make this work at scale, but this project is exceptional and non-representative among generative AI usage, and "push button get image" does not have enough human decision-making involved for the human to really have any claim to have made the thing that gets generated. That is not creativity, and it is not capable of replacing existing expressions of creativity, which you've asserted multiple times in the article and thread. By all means push button and get image for as long as it entertains you, but don't pretend it is something it isn't.
To me, this project is the point. With AI, "push button get image" is now a thing, just like photography made "push button get image" a thing. People complained then that photography was not art. But then eventually we mostly found the art of photography. I think the same will happen for AI stuff. When everyone can do it, you need to do something else/more.
I don't know anything about electronic music or what a DAW is, but his usage of "dragging boxes around" could either be a gross reduction of the process of creating art, or it could genuinely be just mundane tasks.
It's like if someone says my job as a SWE is just pressing keys, or looking at screens. I mean, technically that's true, and a lot of what I do daily can certainly be considered mundane. But from an outsider's perspective, both mundane and creative tasks may look identical.
I play around with image/video gen, using both "single prompt to generate" à la Nano Banana or Sora, and also ComfyUI. Though what I create in ComfyUI often pales in comparison to what Nano Banana or Sora can generate given my hardware constraints, I would consider the stuff I make in ComfyUI more creative than what I make with Sora or Nano Banana, mainly because of how I need to orchestrate my ComfyUI workflow, LoRAs, knobs, fine-tuning, ControlNet, etc., not to mention prompt refinement.
I think creativity in art just boils down to the process required to get there, which I think has always been true. I can shred papers in my office, but when Banksy shredded his painting, it became a work of art, because of the circumstances in which it was created.
Just to provide a little more context, a DAW (Digital Audio Workstation) is, underneath all of the complexity and advanced features piled on top, a program which provides a visual timeline and a way to place notes on the timeline. The timeline is almost directly analogous to sheet music, except instead of displaying notes in a bespoke artificial language like sheet music does, they are displayed as boxes on the timeline, with the length of the box directly correlating to how long the note plays. Placing and dragging these boxes around makes up the foundation of working in a DAW, and every decision to place a box is a decision that shapes the resulting music.
Where to place boxes to make good music is not obvious, and typically takes a tremendous understanding of music theory, prior art, and experimentation. I think the comparison to an author or programmer "just pressing keys" is apt. Reducing it to the most basic physical representation undercuts all of the knowledge and creativity involved in the work. While it can be tedious sometimes, if you've thought of a structure that sounds good but there is a lot of repetition involved in notating it, there are a lot of software features to reduce the tedious aspects. A DAW is not unlike an IDE, and there are ways to package and automate repetitive musical structures and tasks and make them easy to re-use, just as programmers have tools to create structures that reduce the repetitive parts of writing code so they can spend more of their attention on the creative parts.
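To make the "boxes on a timeline" idea concrete, here is a rough sketch in Python of how those boxes could be modeled, and of the kind of reuse described above. It is only an illustration: the `Note` type and the `repeat_pattern` helper are hypothetical names made up for this example, not the data model or API of any real DAW.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Note:
    """One 'box' on the timeline (hypothetical model, not a real DAW format)."""
    pitch: int     # MIDI note number, e.g. 60 = middle C
    start: float   # where the box begins, in beats
    length: float  # how long the note plays, in beats (the width of the box)


def repeat_pattern(pattern, times, period):
    """Copy a pattern `times` times, shifting each copy by `period` beats."""
    return [
        replace(note, start=note.start + i * period)
        for i in range(times)
        for note in pattern
    ]


# Three hand-placed boxes: each one is a musical decision.
bassline = [Note(36, 0.0, 0.5), Note(36, 1.5, 0.5), Note(43, 3.0, 1.0)]

# The repetitive part (the same bar four times) is automated.
four_bars = repeat_pattern(bassline, times=4, period=4.0)
```

The decisions still live in the three hand-placed notes; only the copying is mechanical, which is roughly the split between creative and tedious work being argued about here.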
I can make and iterate on a piece of a track in my head.
I have no idea how to translate it to actual audio anyone else could hear in any way, apart from learning to ~code assembler~ drag a million boxes in a DAW.
Tedium in art is full of micro decisions. The sum of these decisions makes a subtle but big impact in the result. Skipping these means less expression.
Honestly, still figuring this out. Tried social media (Instagram, Threads, Bluesky) and reaching out to blogs/reviewers. Gets some traction but reach is limited. A German print magazine review helped, and the first HN post brought a spike.
What's funny – parent to parent word of mouth has been really helpful. I get lots of emails from parents with ideas or bug reports. They found the app through friends or family. That feels more sustainable than any marketing trick.
But I still need to find better channels with more reach. Most growth so far has been organic via App Store search. Open to suggestions if anyone has ideas! ;)
Thanks a lot. As an unrelated follow-up, I searched for the Spotify API and found that Spotify changed their criteria for Web API Extended Access last year. I suppose this will not affect your app, since existing API users can continue to use it. Would the new rule affect you if you were just starting to develop this app?
Yes, that change was actually the reason I went with the "users create their own Spotify app" approach. With the new criteria, getting extended quota as an indie dev is tough – Spotify wants commercial partnerships or big user numbers.
By having users create their own developer app, each user has their own quota and I stay out of the approval process entirely. Works well for a niche app like this.
When one of the senior executives from Bing Search visited my university for a talk, they personally told the director of the computer science department that they envied the fact that the director could use a 27-inch iMac at work, whereas they could only use one at home.
The head of Bing Search is never going to be in a position to issue an emergency patch for Windows. They are far more likely to cause such a situation than resolve one.
I love the idea and the execution. The onboarding experience is great as well. Thanks for sharing. I am curious about SOC 2: how much effort did you put in to acquire it, and what made you decide to pursue it?
> how much effort did you put in to acquire it, and what made you decide to pursue it?
We originally started looking into it when we were in the B2B space. On our end, we already took security pretty seriously so checking all the boxes was low lift.
The landing page looks great, and a free, privacy-preserving transcription tool is a fantastic idea.
Infrastructure-wise, how much does it cost to host the service at the moment? What services are you using to deploy it?
I am used to typing 'chat.openai.com' in the browser address bar. However, until today, I was always greeted with a login page. Now, access to ChatGPT 3.5 no longer requires me to log in.
> I cannot understand when I mispronounce words. I get corrected often and repeat back the correction but it sounds like what I said the first time.
I have the exact same problem.
I had trouble learning pronunciation when I was a kid, but I never had other learning problems.
As an adult, I struggled enough with pronunciation that I consulted two hearing specialists to determine if there were any issues with my ears. It turns out my hearing is fine.
Is it possible for someone to get a diagnosis of dyslexia based on pronunciation problems only?
I must say I am relieved to finally find someone who shares my experience.
You might be able to be diagnosed, but there’s no medication or cure. You can get extra time on tests if you’re younger.
You can Google dyslexia signs. I think the last time I looked, the signs included auditory processing anomalies, and the retina or lens was oddly shaped in 95% of dyslexics. Dyslexia is also an umbrella term for a bunch of things, so you’ll have to sort through to find your flavor.
Best fix is a good night's sleep. Also make sure you move closer to speech you want to comprehend, and that the people you interact with most enunciate and speak at a good volume (too many people speak poorly and could use 2 weeks of beginner theater classes).
Thank you for the kind, detailed reply. I wish I had known my condition might be dyslexia when I was younger. Taking a beginner theater class is now on my to-do list.
I did not intentionally look it up. I have an extension installed that tells me domain age whenever I visit a site.