rickn's comments

(MemSQL VP of Product Management here)

We didn't intend the benchmarks to be a sales pitch. We are proud of the performance of MemSQL and what it has been able to achieve in our customers' workloads. We wanted a way to show what we are capable of with concrete numbers. We chose the standard benchmarks because they are well understood, not because they were necessarily representative of any given customer.

In general, benchmarks are useful to understand the strengths and weaknesses of a product at a basic level and how it compares to its peers, but we strongly encourage anyone evaluating their options to do a proper POC comparison on their actual use cases.


[Director of Product Management for MemSQL]

MemSQL has two storage modes: Rowstore and Columnstore. The rowstore is "in-memory" and the columnstore is "on-disk", but those are oversimplifications. The rowstore data is stored in memory, but we keep a snapshot of the data on disk. We also keep the transaction log (a record of all changes since the snapshot was taken) on disk. So queries can be satisfied fully from memory (because that is where the current data lives), but writes go to memory and to the transaction log on disk. If the machine reboots, the snapshot is loaded from disk back into memory and the transaction log is replayed. When that is complete, you are back to where you were when the machine rebooted, with no loss of committed data. Columnstore data is always stored on disk, although we use a rowstore in front of the columnstore that is hidden from the user and acts as a buffer of sorts, so that writes to the columnstore can be pretty fast. More details on how the columnstore works can be found here: https://docs.memsql.com/concepts/v6.7/columnstore/#how-the-m...
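To make the snapshot-plus-log recovery idea concrete, here is a minimal sketch in Python. It is a toy key-value store, not MemSQL's implementation: the class name `ToyRowstore`, the file names, and the JSON log format are all assumptions made up for illustration.

```python
import json
import os
import tempfile


class ToyRowstore:
    """Toy in-memory store with an on-disk snapshot and an append-only log.

    Hypothetical illustration only; it does not reflect MemSQL internals.
    """

    def __init__(self, directory):
        self.snapshot_path = os.path.join(directory, "snapshot.json")
        self.log_path = os.path.join(directory, "txn.log")
        self.data = {}
        self._recover()

    def _recover(self):
        # Load the last snapshot, then replay every change logged after it.
        if os.path.exists(self.snapshot_path):
            with open(self.snapshot_path) as f:
                self.data = json.load(f)
        if os.path.exists(self.log_path):
            with open(self.log_path) as f:
                for line in f:
                    key, value = json.loads(line)
                    self.data[key] = value

    def write(self, key, value):
        # Append the change to the on-disk log, then apply it in memory.
        with open(self.log_path, "a") as f:
            f.write(json.dumps([key, value]) + "\n")
            f.flush()
            os.fsync(f.fileno())
        self.data[key] = value

    def read(self, key):
        # Reads are served from memory only.
        return self.data.get(key)

    def snapshot(self):
        # Persist the current state, then empty the log it supersedes.
        tmp_path = self.snapshot_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(self.data, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, self.snapshot_path)
        open(self.log_path, "w").close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        store = ToyRowstore(d)
        store.write("a", 1)
        store.snapshot()
        store.write("b", 2)          # only in memory and in the log
        restarted = ToyRowstore(d)   # simulates a reboot
        print(restarted.data)
```

After the simulated reboot, the new instance sees both the snapshotted row and the row that existed only in the log, which is the "no loss of committed data" property described above. A real engine would also have to handle deletes, concurrency, and partially written log records, all of which this sketch ignores.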


I am the Director of Product Management for MemSQL.

The 128 GB limit applies to the whole cluster. So if you have two nodes in the cluster, they would each have to be 64 GB or less. If you have four nodes, they would all have to be 32 GB or less. To have a highly available system, we recommend 4 nodes (a master aggregator, a child aggregator, and two leaf nodes). You can read more about the cluster architecture here: https://docs.memsql.com/concepts/v6.7/distributed-architectu...
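The arithmetic can be sketched as a small Python helper. The function names are hypothetical, and the sketch assumes every node in the cluster counts toward the limit; it is not an official sizing tool.

```python
CLUSTER_LIMIT_GB = 128  # cluster-wide limit quoted in the comment above


def max_memory_per_node_gb(node_count, cluster_limit_gb=CLUSTER_LIMIT_GB):
    """Largest per-node memory if all nodes are the same size."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return cluster_limit_gb / node_count


def fits_limit(node_sizes_gb, cluster_limit_gb=CLUSTER_LIMIT_GB):
    """True if the combined memory of all nodes stays within the limit."""
    return sum(node_sizes_gb) <= cluster_limit_gb


if __name__ == "__main__":
    for nodes in (2, 4):
        print(nodes, "nodes:", max_memory_per_node_gb(nodes), "GB each")
```

The `fits_limit` helper covers clusters with unevenly sized nodes, where only the total matters.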


How does MemSQL reach consensus between nodes?


My name is Rick Negrin. I run the Product Management team at MemSQL, a scalable relational database. I recently wrote a blog post with my thoughts on NoSQL vs. relational databases, and I'd love to hear the community’s thoughts on it.


I notice not a single mention of the CAP theorem. I have a hard time taking any "scalable" RDBMS solution seriously without a discussion on how you scale well-known problems with deletes and updates, maintaining b+tree indexes, distributed joins, distributed data, node loss/partitions, and distributed transactions/updates.

If you are a new distributed system and don't have Jepsen tests or a similar level of discussion on how your database handles partition events, then that tends to be a harbinger of snake oil in distributed systems.

Granted, I only did a quick search for CAP in your article, but the initial paragraph of your article didn't exactly invite further investigation or time investment.

If your "NoSQL" is just a single-node MongoDB, then you should state that rather than making a blanket statement. As is, your categorization of "NoSQL" is unspecified and unqualified, leading me to believe this is a management-level article with little regard for the real issues in large-scale distributed databases. So why would I think your software considers such problems either?


Hi Rick, I've been following MemSQL for a few years now. Are there any plans to release a "community" edition? Last time I checked, about 1.5 years ago, JSON support was very basic and EE pricing (I don't remember the exact numbers) was rather high. Thanks


There already was a community edition[0]. And it was replaced by a "developer" edition that can no longer be used in production. It seems it didn't pan out as a marketing strategy and I don't think it's coming back.

[0] https://news.ycombinator.com/item?id=9577663


What are you looking for in a "community" edition?


EE features without support. I hate to bring up Mongo as an example, but something similar, where support, additional software/plugins, and cloud hosting are where the $ is made.

I did thorough testing of MemSQL two years ago but went with Aurora instead. I would love to see how the product has evolved since (Spark and Streaming integration was just being rolled out at the time), but something tells me pricing will be a deal breaker.


The point is that just selling support is not a convenient business model. I can definitely understand why they are trying to sell features.


Thing is, they are competing with PostgreSQL, which you can try extensively for free before opting for paid support.

"Free" is already hard to beat. The fact that you can't extensively test a solution is a real turn-off for me (unless you negotiate with sales people, which is not a nerd's cup of tea).


Something that you can use in production based on what your db advertises as its advantages (high availability + sharding).


I was asking because I am the creator of RediSQL[1] -- SQL steroids for Redis -- which is a less sophisticated product than MemSQL but still has its own use cases.

Maybe that would be enough for the parent; if not, it would be very interesting to know what is missing.

[1]: http://redbeardlab.tech/rediSQL/


The same answer applies to your product.


Honestly, I believe that for small workloads you can definitely use RediSQL in production. It will happily hold your cache, or it will be a great SQL database.

However, I need a way to draw a line between people just using the free product and people actually supporting the project, so offering as a paid feature something that big companies will require seemed to me the only way to go.

Unfortunately, I have neither the capital nor the bandwidth to go with a fully open source product and sell just support, which I don't believe is a good business model anyway.

If you were in my shoes, would you do something different?


I haven't followed any links posted in this thread, but some things I see often are: free for non-commercial use, timed commercial use usually in the region of 30 days, or rates based on reads and writes. The last one seems like a winner from what you've described as your situation.


To be honest, I fail to see what I could use your product for so I'm out of the target audience.

Assuming NoSQL is for something very efficient or very scalable, I need some room to use it before I have to shell out $$. There are many products where I have to pay before going to production.


It really depends on what you are building.

If I were building a fast prototype I would not use a postgres box anymore but just a redis one.

If you need to cache data in a way more complex than just key->value you don't have too many alternatives at the moment.

If you want an easy and fast way to have an SQL engine in memory, again, it is not going to be simple.

If you need a separate database for each of your users, there are not many alternatives that I am aware of.

It is definitely not a revolutionary product, but it has its niche. Any of the problems that I mentioned can be solved in a different way, but those different ways are quite complex.


EDIT: I couldn't reply directly to the message before; now I can, so I just copied my previous comment verbatim below.

