Hacker News

Tables are still the big unsolved problem for me.

There are a ton of potential tools out there, like Tabula and AWS Textract's table mode, but none of them has felt like the perfect solution.

I've been trying Gemini Pro 1.5 and Claude 3 Opus and they looked like they worked... but in both cases I spotted them getting confused and copying in numbers from the wrong rows.

I think the best I've tried is the camera import mode in iOS Excel! Just wish there was an API for calling that one programmatically.



Out of curiosity, have you tried ocrs by Robert Knight? https://github.com/robertknight/ocrs


No I hadn't heard of that one!


If you're on Windows try https://table2xl.com (disclosure: I'm the founder), it's more accurate than Excel's camera import. No API though.


Would this be helpful? https://github.com/facebookresearch/nougat

Seems like it can handle tables.


As mentioned above, give Unstract a try: https://github.com/Zipstack/unstract


I think the camera import in Excel for macOS works pretty well. You could probably call that version through an API.


Google and Azure have their own PDF table extraction services, but I have noticed Textract is a bit better.
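
For anyone who wants to try Textract's table mode mentioned in this thread, here is a rough sketch of turning its response into plain rows. It assumes the usual AnalyzeDocument response shape (TABLE blocks pointing at CELL blocks, which point at WORD blocks). The helper name tables_from_textract and the file name page.png are made up for illustration, and merged cells are ignored.

```python
from collections import defaultdict


def tables_from_textract(response):
    """Rebuild each TABLE block in a Textract AnalyzeDocument response as a list of rows."""
    blocks = {b["Id"]: b for b in response["Blocks"]}

    def child_ids(block):
        # Yield the Ids listed under the block's CHILD relationship, if any.
        for rel in block.get("Relationships", []):
            if rel["Type"] == "CHILD":
                yield from rel["Ids"]

    tables = []
    for block in response["Blocks"]:
        if block["BlockType"] != "TABLE":
            continue
        cells = defaultdict(dict)  # row index -> {column index: cell text}
        for cell_id in child_ids(block):
            cell = blocks[cell_id]
            if cell["BlockType"] != "CELL":
                continue
            words = [blocks[i]["Text"] for i in child_ids(cell)
                     if blocks[i]["BlockType"] == "WORD"]
            cells[cell["RowIndex"]][cell["ColumnIndex"]] = " ".join(words)
        n_cols = max((c for row in cells.values() for c in row), default=0)
        # Textract row and column indices are 1-based; empty cells become "".
        tables.append([[cells[r].get(c, "") for c in range(1, n_cols + 1)]
                       for r in sorted(cells)])
    return tables


# Calling the real service needs boto3 and AWS credentials (not run here):
#   import boto3
#   client = boto3.client("textract")
#   with open("page.png", "rb") as f:  # hypothetical file name
#       response = client.analyze_document(
#           Document={"Bytes": f.read()}, FeatureTypes=["TABLES"])
#   print(tables_from_textract(response))
```

Keeping the parsing separate from the API call makes it easy to spot-check rows against the source image, which matters given how often numbers end up in the wrong row.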



