The memory expectations of so many programmers going into pandas baffle me. In particular, no one was batting an eye that a 700 meg CSV file would take gigs to hold in memory. Just convincing them to specify the dtypes and to use categorical where appropriate has cut much of our memory requirements by over 70%. Not shockingly, things run faster, too.
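For anyone who hasn't tried it, here is a minimal sketch of what "specify the dtypes and use categorical" can look like. The column names and the generated data are made up for illustration, and the actual savings depend entirely on your data and pandas version.

```python
import io

import pandas as pd

# Hypothetical data standing in for a large CSV file on disk.
csv_text = "city,status,count\n" + "".join(
    f"{'Oslo' if i % 2 else 'Lima'},{'open' if i % 3 else 'closed'},{i % 100}\n"
    for i in range(10_000)
)

# Default load: pandas infers a dtype for every column.
default_df = pd.read_csv(io.StringIO(csv_text))

# Explicit dtypes: repeated strings become categoricals, and the small
# integers (0..99 here) fit in int8 instead of the inferred int64.
typed_df = pd.read_csv(
    io.StringIO(csv_text),
    dtype={"city": "category", "status": "category", "count": "int8"},
)

# deep=True also counts the memory held by the string values themselves.
default_bytes = default_df.memory_usage(deep=True).sum()
typed_bytes = typed_df.memory_usage(deep=True).sum()

print(f"default dtypes:  {default_bytes:,} bytes")
print(f"explicit dtypes: {typed_bytes:,} bytes")
```

Passing `dtype` to `read_csv` means the wide inferred columns are never built for those columns in the first place, which matters when the file barely fits in memory.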
If there are efforts to help this be even better, I heartily welcome them.
When I'm teaching Pandas, the first thing we do after loading the data is inspect the types. Especially if the data is coming from a CSV. A few tricks can save 90+% of the memory usage for categorical data.
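Roughly what that inspection step looks like, as a sketch with a made-up `country` column. Whether a column is worth converting depends on how few distinct values it has relative to the row count.

```python
import io

import pandas as pd

# Hypothetical CSV with a low-cardinality string column.
csv_text = "country,amount\n" + "".join(
    f"{('NO', 'PE', 'JP')[i % 3]},{i * 0.5}\n" for i in range(9_000)
)
df = pd.read_csv(io.StringIO(csv_text))

# First look: which dtypes did pandas infer, and what does each column cost?
print(df.dtypes)
print(df.memory_usage(deep=True))

# Few distinct values across many rows suggests a categorical candidate.
print(df["country"].nunique(), "distinct values in", len(df), "rows")

before = df["country"].memory_usage(deep=True)
df["country"] = df["country"].astype("category")
after = df["country"].memory_usage(deep=True)

print(f"country column: {before:,} bytes -> {after:,} bytes")
```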
This should be a step in the right direction, but it will probably still require manually specifying types for CSVs.
Yeah, I expect most efforts will just make the pain less painful. And specifying the data types is not some impossible task; it can also help with other things.