I have been developing Lua-heavy embedded products as a freelancer for about 20 years now, including VoIP devices, home automation controllers, industrial routers, digital video recorders, and more. These systems typically consist of a Linux kernel, some libc implementation, the Lua interpreter and a few third-party support libs to help build the app. The Lua apps range from 30k to 100k lines of code, depending on the application. Some of these devices can be considered 'small' in 2025 terms: 8MB of flash, 64MB of RAM. Lua is doing great here.
All of these products are still alive today, actively supported and making my customers good money.
Some things come very naturally to Lua: Lua <=> C interfacing is a breeze, and while some modern languages are still struggling to figure out how to do proper async, Lua has been able to do this for decades. The language itself is minimal and simple but surprisingly powerful: a few smart constructs like coroutines, closures and metatables allow for a lot of different paradigms.
For new projects at this scale, I would still choose Lua + C/C++ as my stack. Over the last few years I have been visiting other ecosystems to see what I'm missing out on (Elixir, Rust, Nim), and while I learned to love all of those, I found none of them as powerful, low-friction and flexible as Lua.
I am currently working on an embedded system with 264KB of RAM and 4MB of flash. Do you think Lua could be used in such a limited setting? I am also considering the Berry scripting language [0].
Assuming your flash allows XIP (execute in place), so that all of the RAM is available for your Lua interpreter's data, you should at least be able to run some code, but don't expect to run any heavy, full-size applications on that. I don't know Berry but it sounds like a better fit for the scale of your device.
But sure, why not give it a try: Lua is usually easy to port to whatever platform, so just spin it up and see how it works for you!
I haven't worked on a system that limited (not even OpenWRT routers) since a dev board in college.
The experience I had there might be your best bet for something productive. That board came with a 'limited C-like compiler' (it took a mostly complete subset of C syntax and transcribed it to ASM).
You'll probably be doing a lot of things like executing in place from ROM, and strictly managing stack and scratch pad use.
The 64MB of RAM and 8MB (I assume that's 64Mbit) of ROM allow for highly liberating things like compressed executable code copied to faster RAM, code that is modified in place, and enough spare RAM to use scripting languages and large work buffers as desired.
It's more than generous. You can run it with much less resource utilisation than this. It only needs a few tens of kilobytes of flash (and you can cut it right back if you drop bits you don't need in the library code). 32 KiB is in the ballpark of what you need. As for RAM, the amount you need depends upon what your application requires, but it can be as little as 4-8 KiB, with needs growing as you add more library code and application logic and data.
If you compare this with what MicroPython uses, its requirements are well over an order of magnitude larger.
Some of our products use MicroPython though we also use a whole host of other technologies. Some of our devices are proof-of-concept (often designed to progress a theory) but we also deliver up to Class B solutions.
Carefully, at least for devices with higher classifications. Using pre/early allocation helps but, more importantly, we monitor memory use over time in realistic scenarios. We've built tooling, like a memory-profiler [1] that allows us to observe memory consumption and quantify performance over time.
However, it turns out that MicroPython has a simple and efficient GC - and once you eliminate code that gratuitously fragments memory it behaves quite predictably. We've tested devices running realistic scenarios for months without failure.
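To make the pre-allocation point concrete, here is a minimal sketch, not the tooling mentioned above. It assumes fixed-size frames arriving on a stream with a readinto() method; read_frames, free_bytes and FRAME_SIZE are hypothetical names chosen for illustration. gc.mem_free() exists on MicroPython but not on CPython, so the helper falls back to None there.

```python
import gc
import io

FRAME_SIZE = 8  # hypothetical frame length, for illustration only


def free_bytes():
    """Free heap bytes on MicroPython; None where gc.mem_free is absent (e.g. CPython)."""
    mem_free = getattr(gc, "mem_free", None)
    return mem_free() if mem_free else None


def read_frames(stream, handle, buf=None):
    """Read frames into one reusable buffer instead of allocating a new one per frame."""
    if buf is None:
        buf = bytearray(FRAME_SIZE)  # allocated once, up front
    view = memoryview(buf)
    count = 0
    while True:
        n = stream.readinto(buf)  # fills the existing buffer in place
        if not n:                 # 0 or None: nothing more to read
            break
        handle(view[:n])          # slicing a memoryview does not copy the data
        count += 1
    return count


# Stand-in for a UART or socket: 20 bytes arrive as frames of 8, 8 and 4 bytes.
checksums = []
frames = read_frames(io.BytesIO(bytes(range(20))),
                     lambda v: checksums.append(sum(v)))
```

Logging free_bytes() at intervals during a long, realistic run is one simple way to watch for slow growth or fragmentation over time.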
The embedded world is really vast. If it's something safety critical, regulations won't allow it. But the regulations say nothing about all the test rigs you'll be building. IoT is another domain where people do whatever they find convenient.
Every so often I have a need for a small cheap device interoperating with a larger system that I'm developing. Like something that sits on MODBUS and does a simple task when signalled. I've taken the RP2040 and Pico board and spun it into a gizmo that can do whatever I want with MicroPython, and it's an order of magnitude cheaper and faster than trying to spin it up in STM32Cube.
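As a rough illustration of how little code such a MODBUS gizmo needs, here is a sketch of the frame check for Modbus RTU, which is an assumption since the comment does not say which MODBUS variant is in use. The function names are hypothetical; the CRC parameters (initial value 0xFFFF, reflected polynomial 0xA001, CRC sent low byte first) are the standard ones for RTU. It is plain Python and should run unchanged on MicroPython.

```python
def modbus_crc16(data):
    """CRC-16/MODBUS: initial value 0xFFFF, reflected polynomial 0xA001."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001
            else:
                crc >>= 1
    return crc


def frame_ok(frame):
    """Check an RTU frame whose last two bytes are the CRC, low byte first."""
    if len(frame) < 4:  # address + function code + 2 CRC bytes at minimum
        return False
    received = frame[-2] | (frame[-1] << 8)
    return modbus_crc16(frame[:-2]) == received


# Standard check value for this CRC variant.
check = modbus_crc16(b"123456789")  # 0x4B37
```

A device loop would read bytes from the UART, call frame_ok() on each candidate frame, and only act on frames addressed to it that pass the check.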