
Does this still work if you give it a pre-existing many-legged animal image, instead of first prompting it to add an extra leg and then prompting it to put the sneakers on all the legs?

I'm wondering if it only expects the additional leg because you literally just told it to add one. In that case it would just need to remember your previous instruction and its own previous action, rather than correctly identify the number of legs directly from the image.

I'll also note that photos of dogs wearing shoes are definitely something it has been trained on, albeit presumably more often dog booties than human sneakers.

Can you make it place the sneakers incorrectly on purpose? For example: "Place the sneakers on all the dog's knees."





My example was unclear. Each of those images on Imgur was generated using independent API calls, which means there was no "rolling context/memory".

In other words:

1. Took a personal image of my dog Lily

2. Had NB Pro add a fifth leg using the Gemini API

3. Downloaded the resulting image

4. Sent that image to BFL Flux2 Pro via the BFL API with the prompt "Place sneakers on all the legs of this animal".

5. Sent the same image to NB Pro via the Gemini API with the prompt "Place sneakers on all the legs of this animal".

So not only was there zero "continual context", I also used two entirely different models to cover my bases.
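
For anyone who wants to picture the flow, here is a rough sketch of what "independent calls" means. It is not my actual script: `run_independent_edits`, `ImageEditor`, and the stub editors are hypothetical names made up for illustration, and the real Gemini and BFL client code would sit inside each editor function. The point is only that each call gets the downloaded image plus the prompt and nothing else.

```python
from typing import Callable, Dict

# Hypothetical signature: an editor takes (image_bytes, prompt) and returns
# the edited image bytes. Real Gemini / BFL client code would live inside it.
ImageEditor = Callable[[bytes, str], bytes]


def run_independent_edits(
    image: bytes, prompt: str, editors: Dict[str, ImageEditor]
) -> Dict[str, bytes]:
    """Send the same image and prompt to each editor as a separate, stateless call."""
    results: Dict[str, bytes] = {}
    for name, edit in editors.items():
        # The only inputs are the downloaded image and the prompt text.
        # No chat history and no mention of the earlier "add a fifth leg" step.
        results[name] = edit(image, prompt)
    return results


def make_stub_editor(tag: str) -> ImageEditor:
    """Stand-in for a real API call; it just tags the bytes so the flow is visible."""
    def edit(image: bytes, prompt: str) -> bytes:
        return tag.encode() + b":" + image
    return edit


PROMPT = "Place sneakers on all the legs of this animal"
five_legged_dog = b"<bytes of the downloaded five-legged image>"  # placeholder data
outputs = run_independent_edits(
    five_legged_dog,
    PROMPT,
    {"flux2_pro": make_stub_editor("flux"), "nb_pro": make_stub_editor("nb")},
)
```

Swapping the stubs for real API wrappers would not change the structure: neither model ever sees the instruction that created the fifth leg.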

EDIT: Added images to the Imgur for the following prompts:

- Place red Dixie solo cups on the ends of every foot on the animal

- Draw a red circle around all the feet on the animal



