
I've been working on https://asterai.io -- a platform for developing, running and managing AI agents.

It lets you create multiple agents, configure them via the web console (LLM parameters, system prompts and so on) and manage their plugins and functionality.

The system is fully plugin-based. Each plugin is a WASM program that exposes functions/tools the agent can call, and it can also hook into the query lifecycle. Because plugins are WASM, they can be written in various languages such as Rust, Go, TypeScript etc. Plugins can also act as libraries, which is possible because of WebAssembly Components (a great piece of software!). So you can dynamically call functions from other plugins within your agent, and you get type support for your chosen language too (with codegen via WASM Components tooling).
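To make the shape of this concrete, here is a minimal sketch in plain TypeScript of a host that registers plugins, runs a lifecycle hook, and lets one plugin call another as a library. Everything in it (`Plugin`, `AgentHost`, `beforeQuery`, `callTool`) is a hypothetical name made up for illustration. It is not the asterai API, and it does not involve WASM or the Component Model at all.

```typescript
// Hypothetical shapes for illustration only; not the actual asterai plugin API.
type ToolArgs = Record<string, unknown>;

interface Plugin {
  name: string;
  // Functions/tools the agent (or another plugin) can call by name.
  tools: Record<string, (args: ToolArgs) => unknown>;
  // Optional hook into the query lifecycle (assumed: runs before the query is sent).
  beforeQuery?: (query: string) => string;
}

class AgentHost {
  private plugins: Record<string, Plugin> = {};

  register(plugin: Plugin): void {
    this.plugins[plugin.name] = plugin;
  }

  // Pass the query through every registered plugin's hook, in registration order.
  prepareQuery(query: string): string {
    let current = query;
    for (const name of Object.keys(this.plugins)) {
      const hook = this.plugins[name].beforeQuery;
      if (hook) current = hook(current);
    }
    return current;
  }

  // Look up a tool by plugin and tool name, then call it.
  callTool(pluginName: string, tool: string, args: ToolArgs): unknown {
    const plugin = this.plugins[pluginName];
    const fn = plugin ? plugin.tools[tool] : undefined;
    if (!fn) throw new Error(`unknown tool ${pluginName}.${tool}`);
    return fn(args);
  }
}
```

A plugin acting as a library is then just a plugin whose tools are called by other plugins through the host, e.g. a `greeter` plugin calling `host.callTool("math", "add", args)` inside its own tool.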

More recently, I've been working on an SSH server for agents. The idea is that you can add public keys to your custom agent and then SSH into it to talk to it easily from the terminal.

If this sounds interesting, feel free to join our Discord! The project is still new and feedback is highly appreciated. http://asterai.io/discord



This looks interesting. How do you plan to handle agents that operate apps with a UI, for example Playwright, Obsidian etc.? Or is this out of scope?


Thanks!

That's a good question. Currently, there is one way to do it. The client querying the agent receives JSON-encoded values that are returned from plugin function calls made by the agent. These values arrive alongside the agent's token response stream (via SSE). So plugins can essentially emit events that the client forwards to the UI application, such as an instruction to click a button. The limitation is that there is no built-in way to send a success/error status back: it's one way only. It works well for actions that are infallible, such as simple UI actions.

The client here would also need a way to interact with the target program of course, e.g. from JavaScript in a browser you can click buttons and manipulate the DOM, or from a VS Code extension you can interact with the editor etc.
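Roughly, the client side of this could look like the sketch below. The event shape (`kind`, `name`, `value`) and the handler names are assumptions made up for the example, since the actual wire format isn't described here. Only the `data:` line framing with blank-line separators comes from SSE itself.

```typescript
// Hypothetical event shape; the real asterai stream format is not shown in this thread.
type AgentEvent =
  | { kind: "token"; text: string }
  | { kind: "plugin_value"; name: string; value: unknown };

// Parse a raw SSE body. Messages are separated by a blank line and
// carry their payload on lines that start with "data:".
function parseSse(raw: string): AgentEvent[] {
  const events: AgentEvent[] = [];
  for (const message of raw.split(/\r?\n\r?\n/)) {
    const data = message
      .split(/\r?\n/)
      .filter((line) => line.indexOf("data:") === 0)
      .map((line) => line.slice(5).replace(/^ /, ""))
      .join("\n");
    if (data.length === 0) continue;
    events.push(JSON.parse(data) as AgentEvent);
  }
  return events;
}

type UiHandler = (value: unknown) => void;

// Concatenate token events into the reply text and forward plugin values
// to whatever UI handler is registered under the same name.
function dispatch(events: AgentEvent[], handlers: Record<string, UiHandler>): string {
  let reply = "";
  for (const event of events) {
    if (event.kind === "token") {
      reply += event.text;
    } else {
      // One way only: nothing here reports success or failure back to the agent.
      const handler = handlers[event.name];
      if (handler) handler(event.value);
    }
  }
  return reply;
}
```

In a browser, a handler registered under a name like `click_button` would look up the element and click it. From an editor extension it would call the editor's own API instead.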

It's definitely something that can be improved though! I've been thinking about some type of MCP interoperability that could maybe assist with this.



