Hacker News
Google Workspace for Education storage policy changes (support.google.com)
37 points by decrypt on Feb 19, 2021 | hide | past | favorite | 54 comments


Is Google running out of storage? In the last 18 months they seem to have completed a 180-degree turn on their "store all the things" policy.


My guess would be abuse. If you check eBay, there are many offers of "lifetime unlimited Google Drive" accounts that use Google's education tier. A similar exploit existed for GSuite: if you bought a 5 (or 10?) seat license, your entire organization would get unlimited storage. That was discontinued as well.


That part was never even enforced. I have a single-seat GSuite (or, now Google Workspace Plus) account and have unlimited storage. It's the only thing keeping me on GSuite, honestly, because I really prefer Outlook to Gmail.


I always thought single-seat accounts couldn't get unlimited storage; the cap was just higher.


I wonder if they're getting a lot less value out of the data than they thought they would (I'm presuming data to aid in targeting is how they justify free unlimited storage)?


Google doesn't collect any data from Workspace for Education users for targeted ads.

From https://support.google.com/a/answer/139019: "Google Workspace for Education does not collect student data for advertising purposes.

We scan Gmail to keep users’ mail secure. Scanning includes virus and spam protection and is 100% automated. We scan all incoming email, relevant search results, and features, such as Gmail Priority Inbox."

(Disclaimer: I work at Google, but not on Workspace or Ads - I don't represent these teams.)


>Google doesn't collect any data from Workspace for Education users for targeted ads.

That doesn't mean they don't collect data for other purposes, such as analytics and tracking of how to change the service, and that data is still valuable.


Likely that, and when they made the initial offer they estimated average storage per user. That estimate was probably correct at the time, but usage has changed as users have begun to store more and larger files in the cloud.


Those business agreements for schools supposedly didn't allow them to use the data that way anyway. Not sure I could trust that, though. Everything I upload (well, just about) gets encrypted before being uploaded.


* Google Photos Unlimited has shut down.

* Google Drive unlimited for businesses has shut down.

* Google Drive unlimited for education has shut down.

* Google Drive files are now deleted if you've not accessed your account recently (no definition of recent)


> no definition of recent

They've been very clear that it's 2 years, and they will notify you before they remove anything.


Maybe they're finally capitalizing on all their locked-in users.


idk why this is getting downvoted - it's a legitimate criticism of the way Google will enter a market with low prices (or free services), steal market share, and then boost prices once they have market power and/or significant user anchoring.

Ex: see all the Google Photos competitors doing the "I told you so" roadshow when Google announced the end of unlimited free storage. Other apps couldn't compete and/or died out, and so with market share in hand Google could "raise" prices.

Further, Google doesn't hide that they want to grow "services" revenue, and had mentioned it in previous investor calls. I'm guessing they're seeing the success Tim Cook has had with diversifying revenue and growing services like crazy, and now they want to leverage that channel for growth.


Could be GDPR? Storing all the data on everyone now has more potential cost than just the hard drive.


> baseline of 100 TB

> Institutions with greater than 20,000 [...] active users [...] will be provided with additional storage.

So >=5GB per user. Not great, not terrible. I guess most students who use their Google student account for educational purposes only don't need more than that.

It might get tricky for teachers and some STEM students, however.
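As a quick sanity check on the per-user figure, using the pool size and user count quoted from the policy:

```python
# 100 TB pooled storage divided across the 20,000-active-user threshold
pool_tb = 100
users = 20_000
per_user_gb = pool_tb * 1000 / users  # decimal TB -> GB
print(per_user_gb)  # 5.0 GB per user
```

With fewer than 20,000 users the per-user share is correspondingly larger, since the 100 TB baseline is fixed.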


If any Google employee reads this, how do you check how much data is in a specific drive folder? I haven't found an official method.

I need to free up space, thanks to the new storage policy. It's kinda hard to track down which directories are using the most data when there is no way to check...


(Googler, opinions are my own.) I also don't work on Drive; this is my rough understanding of things.

There may not be a way to do this yet, but hopefully it's coming someday. One key thing to understand is that Google Drive appears to be implemented as a graph, not a tree. This becomes apparent when you realize that any item (folder or file) could be placed into multiple directories on your drive. This allowed for cycles and lots of other crazy things within the filesystem (as you aren't guaranteed a single parent folder).

Now, this behavior finally changed in Sept 2020[1], and they are backfilling all multi-links to be shortcut (symlink) based. This should finally allow for proper directory navigation and size calculation. It also fixes all kinds of other problems that a graph filesystem causes.

[1] https://cloud.google.com/blog/products/g-suite/simplifying-g...
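A rough sketch of that backfill, assuming a much-simplified data model (the real Drive API is more involved, and these field names are hypothetical): in the old model an item carried a list of parent IDs, and the migration effectively keeps one parent and replaces each extra parent with a shortcut item pointing at the original.

```python
# Simplified illustration: multi-parent items converted to shortcut-based ones.
# A file listing several parents is what makes the filesystem a graph.
old_items = {
    "file1": {"parents": ["folderA", "folderB"]},  # same file in two folders
}

def to_shortcuts(items):
    """Keep the first parent; turn every extra parent into a shortcut item."""
    new_items = {}
    for item_id, item in items.items():
        first, *rest = item["parents"]
        new_items[item_id] = {"parents": [first]}
        for i, parent in enumerate(rest):
            # the shortcut lives in the extra folder and targets the real file
            new_items[f"{item_id}_shortcut{i}"] = {
                "parents": [parent],
                "shortcut_target": item_id,
            }
    return new_items

new = to_shortcuts(old_items)
# file1 now has a single parent; folderB holds a shortcut to it instead
```

After the conversion, every real item has exactly one parent, so the hierarchy is a tree again and per-folder sizes become well-defined.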


> This allowed for cycles and lots of other crazy things within the filesystem (as you aren't guaranteed to have a single parent folder).

That doesn't prevent a folder size calculation. It does mean you need to do a graph traversal, and you might need to supplement the feature with a visualization akin to what you get from windirstat/baobab/etc, so users know what's factoring into the calculation.

It also likely means this isn't an instantaneous operation, but again, I look at windirstat/baobab as examples for how to present a UX that's workable.

But that doesn't prevent the feature from being built, and the fact that rclone already has it says to me the gdrive team has simply chosen not to make this functionality available.
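A minimal sketch of the traversal described above, assuming each item exposes a size and child list (hypothetical names, not the actual Drive API): a visited set makes the walk terminate even with cycles, and each file is counted once no matter how many folders link to it.

```python
from collections import deque

def folder_size(root_id, get_item):
    """Total bytes reachable from root_id in a graph-shaped filesystem.

    get_item(item_id) -> {"size": int, "children": [ids]}; the visited set
    guards against cycles and multi-parent double counting.
    """
    total = 0
    visited = {root_id}
    queue = deque([root_id])
    while queue:
        item = get_item(queue.popleft())
        total += item["size"]
        for child in item["children"]:
            if child not in visited:
                visited.add(child)
                queue.append(child)
    return total

# Example graph with a cycle (a <-> b) and a file linked from both folders:
graph = {
    "a": {"size": 0, "children": ["b", "f"]},
    "b": {"size": 0, "children": ["a", "f"]},
    "f": {"size": 10, "children": []},
}
print(folder_size("a", graph.__getitem__))  # 10, not 20
```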


I agree that it doesn't prevent it. Traversing a graph is a simple compsci problem, but optimizing it is more complicated. My guess is that the graph structure prevents caching of folder sizes, so any size check requires a full graph walk, which is not cheap.


Another complication I thought of: permissions. While this may not matter as much for the consumer Drive product, for Google Drive within an organization permissions are per object. So two people looking at the same folder may see different content. This impacts folder size, as any given person may only see a subset of the content within a folder.


I think it stands to reason that if computing folder sizes was a trivial thing to implement the Drive folks would've done it by now. It's obviously useful.


It equally stands to reason that by not offering this feature, they make it more difficult for users to manage their quota, which means they can drive folks to higher-priced plans. This would be the "malice" take.

Given Google's history with UX and support of existing products, the other possibility is the product is simply understaffed/undermaintained and poorly managed. This is the "incompetence" take.

And again, to emphasize: rclone can do this today. Is it a bit slow (and possibly limited)? Absolutely. But it works, and it's how I've turned to managing my own gdrive quota, since Google seems unwilling or unable to offer the functionality themselves.


I disagree - rclone can solve a substantially simpler and different problem (i.e. computing the size of a specified hierarchy on-demand). Being able to store and produce sizes of arbitrary folders in Drive for however many millions of users in a reasonable time frame is a substantially more challenging problem, and what we're talking about here. Google probably hasn't provided the on-demand version of this because it's trivial to script with their API.

Given that Google does offer you tools for managing your quota, and knowing it's a very difficult problem to solve, I'm disinclined to believe this is malicious or incompetent.


I'm not a Google employee, but as far as I'm aware, there isn't one, and frankly, at this point I have to wonder if the lack of such a feature is a malicious way of preventing users from managing their storage utilization.

I've also had to resort to using rclone for this, which is absurd.

P.S. And don't forget: the data in the Trash folder counts against your quota. So if you use rclone to get folder sizes and start deleting things, you better be sure to empty the trash.

And that gets very difficult if the Trash has a lot of items in it, as the web interface will only delete a limited number of items from the Trash at a time.

As a result, you might also want to use rclone to delete permanently and bypass the Trash entirely.

Once again, while I tend to assume incompetence over malice for most things, in this case, you really gotta wonder...


Trash is now auto-emptied.

You can also view objects by the quota they consume at https://drive.google.com/drive/u/0/quota


> Trash is now auto-emptied.

After 30 days, IIRC. That's not helpful if you're trying to manage quota now, and certainly isn't an excuse for the truly astonishingly bad UX around the trash bin.

> You can also view objects by the quota they consume at https://drive.google.com/drive/u/0/quota

Objects as in individual files. It does not show you which folders are, in aggregate, consuming large amounts of storage. As a result this is utterly useless unless your quota is taken up by a few large objects. I legitimately have no idea what use case they thought they were addressing when they built that view.


I think the UX around the trash bin is pretty good and I've always found the quota pane useful.


Here are my thoughts on the issue. (To be clear, my thoughts have no input from my workplace.)

A) Why wasn't proper usage reporting built in from the start? If you are limiting a user by a usage metric, then you should be giving them a way to manage that usage intelligently. It does sound like Drive is a complicated structure, but, well, that's no excuse for not supporting your users properly.

B) I'm not surprised by Google ending the unlimited plan. But I still consider it bait and switch. Promising "unlimited" in the first place is impossible; it's false advertising. All the companies that do (or did) that really should be fined. At the very least we should hit them with a class action.

C) Google's method of ending the unlimited plan is not very good. They really should have done something like:

"We are ending the unlimited plan. New users will receive 100TB. Current users with more than 100TB of data will be granted a perpetual 120% of their current data usage as of last night at midnight UTC. Current users with less than 100TB, will have a 120TB limit."

That would have been far more fair, and much less of a bait and switch.
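That proposed grandfathering rule can be written out as a quick sketch (figures from the proposal above; 120% computed with integer ratios to keep the arithmetic exact):

```python
def new_limit_tb(current_usage_tb):
    """Proposed rule: users above 100 TB keep 120% of their current usage
    as a perpetual limit; everyone else gets a flat 120 TB cap."""
    if current_usage_tb > 100:
        return current_usage_tb * 6 / 5  # 120% of current usage
    return 120.0

print(new_limit_tb(500))  # 600.0 TB for a 500 TB user
print(new_limit_tb(30))   # 120.0 TB
```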

Anyway, thanks to everyone who replied. The tips about rclone really helped out.


Rclone's size command does this.

    rclone size gdrive_remote:/path/to/folder


rclone also has the really nifty ncdu command: https://rclone.org/commands/rclone_ncdu/


Just tried that. It works. Thanks!


Nice. That worked well. Thanks!


It’s pretty obvious why. It’s the people from r/datahoarders and r/plex who were abusing Drive, and beyond them the really cheap accounts sold on eBay.


That has to be a reason. I use somebody's Plex share that has 0.55PB of videos on Google Drive. Someone on Reddit famously bragged about having 1.6PB.

The person running my Plex uses an unlimited Enterprise plan, so this won't stop them.


I think you're being naive if you think they won't roll this out to Enterprise soon as well.


It already has, but it isn't enforced yet. Under the new "as much storage as you need" terms, you have to ask them to increase it, which I assume means they'll look at bandwidth usage and check whether any one user is abusing it. Good luck if it's a one-person "enterprise", too. It's kind of like Dropbox's "unlimited" plan.


Are hard drives harder or more expensive to come by these days due to COVID-related production issues? Or is storage not decreasing in price according to their benchmark? Google Workspace only offers 30 GB in its lower-priced plan ($6/month), which seems a bit low too.


Hard disk drive sizes have been static over the last couple of years. 12TB is roughly the maximum economic size you can get right now.


Counterpoint: I just upgraded a small fleet of drives to 16TB models because that was the optimal price/GB after checking a multitude of supply channels (at least at 7200 RPM).

My last upgrade was to 4TB and I've no doubt my next in a few years will be to 64 (though might be a different technology eg. SSD if the relative economics change).


> We recently announced a new storage model, which provides schools and universities with a baseline of 100 TB of pooled storage shared across all users.

What? My friends and I alone use 5TB each, and our uni has around 8000 students.

100TB for everyone combined is ridiculously low.


Is that 5TB school work, or are you using it for personal files too?


About 40 GB is school work, the rest is personal files. Huge music collections, comics, course dumps, photos, downloaded youtube channels, etc.


And you're genuinely wondering why they are imposing limits? Sure, the new limits are <40GB/student, but most students are probably only using like 1GB of storage anyway.

They write some essays, store some PDF's, share some Google Docs documents, use a music streaming service, and don't download YouTube videos.


It appears that this is exactly the behavior they want to prevent - people hoarding personal (and illegal) files on institutional/corporate accounts.


I don't think you can complain about this while using your school-supplied account for that much private data.

Seems acceptable for schools to start enforcing acceptable-use policies that exclude using these accounts for anything but schoolwork if it saves money and resources to spend on other things.


Before abundant cloud storage somebody with internship money had to buy their own NAS. Kids these days.


As a master's student, I am seriously considering doing that, since I don't want to depend on streaming and cloud backups anymore.

Any pointers, good sir/ma'am, on how to set up my own physical backups? Like a NAS or similar?


Synology NAS. Seagate IronWolf hard drives are a good pairing. You can set up your NAS to back up to AWS Glacier Deep Archive or the GCP equivalent so you still have redundancy.


Synology is good out of the box. I have a custom-ish U-NAS case.


Note that RAID isn't a replacement for backups. Personally, I just use USB-connected drives, Time Machine backups (on a Mac), and Backblaze for a secondary cloud backup.


Downloading YouTube videos just to re-upload them to the same company's storage array. This is hilarious.


Yes, because Google has a habit of removing content from YouTube, even educational content. (Last year they made a policy of removing infosec content they deem dangerous.)

Moreover, some YouTubers remove their own stuff.


YouTube's copyright enforcement can hit videos at random; parodies like DBZA have been lucky enough to reappear after a few months. Other channels discussing movies, or really anything else that might contain a few seconds of copyrighted music in the background, can suffer random bouts of silence. So there are good reasons someone might feel the need to keep a personal backup outside of YouTube.


lol, so in other words, you're exactly the reason they're doing this.

Why did you think your university account was for "whatever random crap I want to put in it", and also "as much data as I could ever want to use"?



