This is such a good post. I'm pretty humbled by your words about us being "everywhere that's of interest" and that "we're highly respected." It's hard to see that when you're in the weeds, so I just wanted to say I appreciate it.
Regarding proprietary… I get it. I was the CEO of BlazingSQL, and we were fully OSS with an open-core model. The number of Fortune 500 customers deploying us at scale without paying us back in money, feedback, or even testimonials was honestly heartbreaking.
When Josh (our CEO) and I were in the early days of Voltron Data, we thought maybe we could hold ourselves accountable to the open-source community with a new model, which we now call open-periphery, where, as you said, the interchanges, standards, and protocols are open, allowing companies and developers to build resilient, evolvable data stacks.
Open-periphery also means we don't have to debate what goes back to the community and what goes into the proprietary code because there is such a clear delineation. Open-periphery is our way of thinking about OSS business models, and it's the solution we came up with to ensure we can continue to invest in open-source and next-generation query engines.
Howdy, full disclosure: I'm the CEO at BlazingSQL (BSQL).
I'm not incredibly familiar with Ares beyond the linked article, but we aren't a DBMS, and we don't manage data in any way.
BlazingSQL is a SQL engine; it's easier to think of it as similar to SparkSQL, Presto, Drill, etc.
We're core contributors to RAPIDS cuDF (CUDA DataFrame), which is a Python and C++ library for Apache Arrow data in GPU memory. The Python library follows a pandas-like API, and the compute kernels are in C/C++.
BSQL binds to the same C++ core as the pandas-like cuDF API. This lets users interact with a single DataFrame through either SQL or pandas-style calls, depending on their needs or preferences. That interoperability means the rest of the RAPIDS stack (data viz, ML, graph, signal processing, DL, etc.) can be applied to a variety of different use cases with the same DataFrame.
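To make that concrete, here's a minimal sketch of what the SQL/pandas interop looks like, assuming the public BlazingContext API; the table name and columns are just illustrative:

    import cudf
    from blazingsql import BlazingContext

    bc = BlazingContext()

    # Build a GPU DataFrame with the pandas-like cuDF API
    # (column names and values are made up for the example)
    gdf = cudf.DataFrame({
        "passenger_count": [1, 2, 5, 1],
        "fare_amount": [7.5, 12.0, 31.25, 9.0],
    })

    # Register the same GPU DataFrame as a SQL table and query it with BSQL
    bc.create_table("trips", gdf)
    result = bc.sql(
        "SELECT passenger_count, AVG(fare_amount) AS avg_fare "
        "FROM trips GROUP BY passenger_count"
    )

    # The result is itself a cuDF DataFrame, so you can drop straight back
    # into pandas-style operations on the GPU
    print(result.sort_values("avg_fare"))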
The DataFrame also has performant libraries for IO, Joins, Aggregations, Math operations, and more.
Again, think of BSQL as a query engine that runs queries on data wherever and however you have it. Here's a BSQL user running 1-2 minute queries on 1.5TB of CSV files using 2 GPUs: https://twitter.com/tomekdrabas/status/1303824164273270789
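A rough sketch of that "query the data where it lives" idea, assuming a hypothetical directory of CSV files (the path and schema are made up):

    from blazingsql import BlazingContext

    bc = BlazingContext()

    # Register a table directly on top of raw CSV files -- no load/ETL step,
    # the engine scans the files at query time (path is hypothetical)
    bc.create_table("taxi", "/data/taxi/*.csv")

    gdf = bc.sql("""
        SELECT passenger_count, COUNT(*) AS trips
        FROM taxi
        GROUP BY passenger_count
    """)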
GoAi was an effort to get GPU developers on the same page and working together to build an ecosystem for analytics on GPUs.
RAPIDS is a project that was born out of GoAi to bring that ecosystem to Python.
It is built on Apache Arrow (albeit in GPU memory) and includes many of the original GoAi members, like my team at BlazingSQL, along with Anaconda, Nvidia, and many, many others.
So, the part that confuses me about this argument is that we already live in an Intel world where they have 98% market share in servers, so we're already at the whim of a single company. Why not challenge that dominance?
Not the same. Two companies make x86 processors, and in the very specific case of this article/comment thread, more than one company supports OpenCL. Nvidia/CUDA is a one-pony show, no matter how you look at it.
Yeah, we were totally ignorant of PartiQL until your post. Now that we're looking at it, though, it looks boss! It agrees with many of our theses, and there looks to be a lot to glean from this project as well.