Hacker News | RealLast's comments

Yep, fully open-source!


You guys have a discord or anything like that by any chance?


You guys are so funny, when papers like these exist: https://arxiv.org/abs/2404.11757

Numerous studies, including the OpenTSLM paper, have proven they are not able to do this out of the box. Did you even look at the results? The paper directly compares OpenTSLM against standard text-only baselines: Gemma3-270M outperforms GPT-4o when the latter uses tokenized time series alone. So I can only assume you're being ironic.


I understand how annoying it is when people post shallow dismissals of your work on the internet, but please don't give in to the annoyance when replying. It makes the thread worse, and it's against the HN guidelines: https://news.ycombinator.com/newsguidelines.html.

I don't know if this is your work or not, but I appreciate your wanting to defend it...we just need you to do that in a way that doesn't attack others, no matter how wrong they are or you feel they are. Easier said than done of course, but we're all working on it together.


An experiment is not a proof.

If this is the level of one of the contributors to the OpenTSLM paper (which you very obviously are), no wonder due diligence wasn't done properly.


It’s less about proof and more about demonstrating a new capability that TSLMs enable. To be fair, the paper did test standard LLMs, which consistently underperformed. @iLoveOncall, can you point to examples where out of the box models achieved good results on multiple time-series? Also, what kind of time-series data did you analyze with Claude 3.5? What exactly did you predict, and how did you assess reasoning capabilities?


Check it out: they're built entirely on Llama and Gemma and output text. The models are open-source.


The full paper is on the website. The arXiv release of the exact same paper is pending. Click the "Read the white paper" button to get the full paper.


Hosting your paper on your own website instead of arxiv is fine of course. My question is why does the website sound like some of the shadiest investment pitches I've seen? Doesn't make me want to read the paper.


Please don't treat people in a hostile fashion when discussing their work on HN. That's the opposite of the kind of community we want here.

https://news.ycombinator.com/newsguidelines.html


What was hostile about what they said?


"why does the website sound like some of the shadiest investment pitches I've seen" is no way to welcome someone sharing their work in good faith.


I think you missed the point. Would you use an image-analysis library to describe an image, or to reason over a sequence of images? Check out some of the plots in the paper to see what these models can do.


I would if the image analysis library was backed by a VLM. I have not fully read the paper, but couldn't figure 6 have been done by an LLM writing a script that calls libraries for time series feature extraction and writing a hypothesis test or whatever? They will do the heavy lifting and return a likelihood ratio or some statistic that is interpretable to an LLM.
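As a rough illustration of the tool-calling alternative described above (the function names, features, and test are my own choices, not anything from the paper): an LLM could emit a short script like this, extract a few interpretable statistics, and then reason over the numeric summary in text.

```python
import numpy as np

def summarize_series(x: np.ndarray) -> dict:
    """A few interpretable features an LLM could reason over as text."""
    t = np.arange(len(x))
    slope = np.polyfit(t, x, 1)[0]  # linear trend
    return {"mean": float(x.mean()),
            "std": float(x.std()),
            "trend_slope": float(slope)}

def welch_t(a: np.ndarray, b: np.ndarray) -> float:
    """Welch's t statistic; |t| much larger than 2 suggests the series differ."""
    se = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
    return float((a.mean() - b.mean()) / se)

rng = np.random.default_rng(0)
baseline = rng.normal(0.0, 1.0, 500)
shifted = rng.normal(0.5, 1.0, 500)  # simulated regime change
print(summarize_series(shifted))
print(welch_t(baseline, shifted))    # large magnitude -> clear difference
```

Whether this pipeline matches an end-to-end TSLM on subtle signals is exactly the empirical question the thread is debating.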


OpenTSLM models are designed precisely to capture these subtle signals; that was one of the original motivations. The model integrates the raw time series via cross-attention, using representations learned by a dedicated raw time-series encoder.


Can you explain how? If I'm understanding the paper right, the timeseries encoding is a Conv1D and the cross-attention layer is constrained to output the token space of a pre-trained LLM. My naive expectation is these constraints would make the model less expressive / fine-tunable to pick up on these types of subtle signals.

But obviously ML is an empirical field, so if you found that a constrained architecture worked well in practice, that's an interesting result in its own right.


Sure! There is more after the 1D conv: another transformer encoder that captures further features of the time series. The LLM can then essentially query this encoder for information, which also lets it pick up more subtle patterns. In a way it's similar to how some vision-language models work.
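A toy numpy sketch of the shape of that pipeline as described (all dimensions, names, and the single-layer attention are illustrative stand-ins, not the paper's actual architecture): a 1-D conv produces patch embeddings, a small encoder refines them, and the LLM side attends over them via cross-attention.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16                       # shared embedding width (illustrative)
series = rng.normal(size=200)

# 1) Conv1D front end: apply a filter bank to non-overlapping patches
#    so each patch of the raw series becomes one d-dim embedding.
kernel = rng.normal(size=(d, 8))                         # d filters, width 8
patches = series[: len(series) // 8 * 8].reshape(-1, 8)  # (25, 8)
conv_emb = patches @ kernel.T                            # (25, d)

# 2) Encoder stand-in: one self-attention pass refining the patch embeddings.
def attention(q, k, v):
    scores = q @ k.T / np.sqrt(q.shape[-1])
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v

ts_tokens = attention(conv_emb, conv_emb, conv_emb)      # (25, d)

# 3) Cross-attention: LLM hidden states query the time-series tokens,
#    so the language model can "ask" the encoder for signal features.
llm_hidden = rng.normal(size=(4, d))                     # 4 text positions
fused = llm_hidden + attention(llm_hidden, ts_tokens, ts_tokens)
print(fused.shape)           # (4, d): text stream enriched with series info
```

The key point the sketch shows is that the text stream never sees digit tokens; it only sees learned continuous embeddings of the signal, queried on demand.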

