> to write e-mails, draft course descriptions, structure grant applications, revise publications, prepare lectures, create exams and analyse student responses, and even as an interactive tool as part of my teaching.
So your grant applications were written by AI, your lectures were written by AI, your publications were written by AI, and your students' exams were marked by AI? Buddy, what were YOU doing?
You're assuming the AI was doing all the work as the primary author, then? I mean, it could be true. But, steelmanning: what if the AI was acting in a supporting role, like calculators or computers? (People used to assume users of those were lazy too, I'm sure.)
The lost "work" is two years of ChatGPT logs. Sounds like AI systems had concrete benefits to this researcher in a number of applications, but I'm not sure how I feel that their discussions with AIs are so casually being described as "work". Seems slightly misleading?
I consider it a tool. A tool multiplies performance. Though from research it appears the multiplier is nonlinear, ranging from "a complete greenhorn makes an app that would otherwise take them weeks just to learn the skills for", through "low double-digit improvements just from saving time on writing boilerplate / looking up common problems with libs", all the way to "the time wasted trying to make the LLM do it is more than just doing it yourself".
If you can't use your tools properly (i.e. in this case, have backups) you will hurt yourself. And trying to blame it on tools that come with NO guarantees in the first place doesn't help.
> However, my case reveals a fundamental weakness: these tools were not developed with academic standards of reliability and accountability in mind.
Yeah, the article is ridiculous. I'm not trying to defend it but rather to extrapolate, in particular about the "bro, you are not working by chatting with ChatGPT" point.
If we consider it a tool, then why is it not work?
And to be clear I’m not even sure what I think. I’m throwing the question out there because I’m curious about what other devs think out here.
Just the first thing that comes to mind: ChatGPT can act as an enhanced Jupyter notebook where you specify tasks in English. This isn't an analogy; they literally run Jupyter kernels in the backend with chat as the frontend.
There's also canvas mode for iterating on actual documents, and the search/retrieval features make it a genuinely useful research tool independent of generation.
And this is me defending OpenAI, which I've stopped using. Other systems are more capable.
That’s not very generous. Keeping files in the Recycle Bin is an incorrect use of the Recycle Bin. Keeping conversations in your ChatGPT history is how it’s supposed to be used.
Never rely on any subscription-based service for any data that is important. Never use data formats that lock you in. Especially not online services without (automatic) export options.
Keep a copy (cloud) and a backup (offline) for all your own data.
Any _legitimate_ organisation following the GDPR will allow export of your data; and you shouldn't be stupid enough to trust sensitive or valuable data with some dodgy organisation that doesn't follow the GDPR.
Obviously this doesn't alleviate the need for proper offline backups of your own valuable data!
How many people here rely on Google Docs? I have for many things for years. How many of you regularly back up your Google Docs? I have taken a Google Takeout a few times over the years. But no. Why? Because I have never heard of Google “losing” docs or emails or anything like that except when a user deliberately deletes things. Same with AWS S3. It just doesn’t lose files. Ops mistakes and hackers can make it lose files but the tech is rock solid.
I think it’s very reasonable to assume cloud services don’t need to be backed up, because many of them are based on extremely reliable technologies.
Obviously mistakes like this can happen, and if they’d had a backup OP would be better off.
But I can’t help but think that there’s a lot of schadenfreude here from people who dislike AI at seeing somebody suffer for having a strong reliance on it.
the threat here is not "cloud losing your data". the threat is "cloud denying access to your data". it's like when someone breaks up with you and you still have stuff at their house. good luck getting that back.
I've had my fair share of data-loss lessons. For us, it is easy to say, “Why didn’t you back up?” But most people have an innate trust in tools, especially from big companies such as Apple, Google, and Facebook/Meta. I have heard so many people happily claim, “I won’t worry. I have it on Google.”
Even for my daughters’ much simpler school homework, projects, and the usual drawings/sketches, I’ve set up Backups so they don’t cry when their work gets lost. I set up the Macs I handed down to them to be backed up to iCloud, and added a cheap HDD for Time Machine. They think I’m a Magician with Computers when I teach them to use the Time Machine and see the flying timeline of their work. The other thing is the Google Workspace for Schools. I have found that having a local copy always available via a tool (such as InSync) does wonders.
The only sob story now is games. They sometimes lose points, the in-game coin thingies, and run into bugs that reset their gameplay earnings. I have no idea how to help them there besides emotional support and explaining how the world works — one step at a time.
How about if ChatGPT/Claude writes a local Markdown copy of each conversation? Won’t that be Nice?
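In the meantime something close to that can be done by hand from the export feature. A minimal sketch, assuming a simplified export shape of one title plus a flat message list per conversation (the real ChatGPT export nests messages in a tree, so it would need extra unpacking):

```python
# Turn an exported conversations JSON file into one local Markdown file per chat.
# Assumes a simplified format: [{"title": ..., "messages": [{"role": ..., "content": ...}]}]
import json
from pathlib import Path

def export_to_markdown(export_file: str, out_dir: str = "chat_backup") -> None:
    conversations = json.loads(Path(export_file).read_text(encoding="utf-8"))
    out = Path(out_dir)
    out.mkdir(exist_ok=True)
    for i, convo in enumerate(conversations):
        title = convo.get("title") or f"conversation-{i}"
        lines = [f"# {title}", ""]
        for msg in convo.get("messages", []):
            lines.append(f"**{msg.get('role', 'unknown')}:**")
            lines.append(msg.get("content", ""))
            lines.append("")
        # Keep only filesystem-safe characters in the file name.
        safe = "".join(c if c.isalnum() or c in " -_" else "_" for c in title)
        name = safe.strip() or f"conversation-{i}"
        (out / f"{name}.md").write_text("\n".join(lines), encoding="utf-8")
```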
I don't think such an idea is consistent with the existence of trashbin features, or the non-insignificant use of data recovery tools on normally operating devices.
I can definitely see the perspective in clarifying that ChatGPT didn't lose anything, the person did, but that's about it.
A lot of snark in the comments, but I think the author is absolutely right: this should have come with a big warning, and that warning should have had 3 options:
1) go ahead and delete everything
2) back up and then go ahead
3) abort and keep things as they are
ChatGPT definitely wants to be the copilot of all your work. The guy didn’t just have chats, he had drafts that his virtual assistant helped formulate and proofread. Given how big and widely used ChatGPT has become, it shouldn’t be a surprise to anyone tech savvy that it is being used for serious work outside of vibecoders.
I still don't understand what could have happened here. I'm not a ChatGPT user so I'm not familiar with the UI.
He starts out saying he "disabled data consent". That wording by itself doesn't imply deleting the content at all. The content could theoretically live in local storage etc. He says the data was immediately deleted with no warning.
Then OpenAI replies that there is a confirmation prompt, but doesn't say what the prompt says. It could still be an opaque message.
Finally, he admits he "asked them to delete" the data.
It's always interesting to see how hostile and disparaging people can start to act when given the license. Hate AI yourself, or internalize its social standing as hated even just a little, and this article becomes a grand autoexposé, a source for immense shame and schadenfreude.
The shame is not that he was such an imbecile as to not have appropriate backups; it is that he is basically defrauding his students, his colleagues, and the academic community by nonchalantly admitting that a big portion of his work was AI-based. Did his students consent to have their homework and exams fed to AI? Are his colleagues happy to know that most of the data in their co-authored studies was probably spat out by AI? Do you people understand the situation?
It's not that I don't see or even agree with concerns around the misuse and defrauding angle to this, it's that it's blatantly clear to me that's not why the many snarky comments are so snarky. It's also not as if I was magically immune to such behavioral reflexes either, it's really just regrettable.
Though I will say, it's also pretty clear to me that many taking issue with the misuse angle do not seem to think that any amount or manner of AI use can be responsible or acceptable, rendering all use of it misuse - that is not something I agree with.
It seems you are desperately trying to make a strawman without any sensible argument. I don't personally think it is "snarky" to call things as they are, plain and simple: you, as a supposed expert and professional academic, post a blog on Nature crying that "AI stole my homework"; it's only natural you get the ridicule you deserve. That's the bare minimum; he should be investigated by the institution he works for.
A reasonable amount of AI use is certainly acceptable, where "reasonable" depends on the situation; for any academic-related job this amount should be close to zero, and no material produced by any student/grad/researcher/professor should be fed to third-party LLM models without explicit consent. Otherwise what even is the point? Regurgitating slop is not academic work.
Sorry to hear that's how my comments seem to you; I can assure you I put plenty of sense into them, although I cannot find that sense on your behalf.
If you think considering others desperate, senseless, and erroneously reasoning without any good reason improves your understanding of them, and that snarky commentary magically stops being snark, or becomes acceptable, because it describes something you consider a big truth, that's on you. We'll have to agree to disagree on that one.
The author is an absolute and utter embarrassment for all the good academic professionals out there, and he is also literally admitting to defrauding his students of their precious money, which they thought was going to human-led instruction; he has also put all of his colleagues in a very dodgy position right now. It is preposterous that we are even arguing about it; it is a sign of how much AI-sloppiness is permeating our lives. It is crazy to think that you can be entitled to hand years of work to a chatbot without even caring and then write an article like this: "uh oh, AI ate my homework".
It is not the students' money - academic education is basically free in Germany. But they are still defrauded of the valuable time and effort they spent following classes they thought were worth it.
Are we supposed to feel sorry for this person, or chuckle at them?
This is like storing all your data on a floppy disk you never back up and then accidentally dropping it in the toilet.
As much as I want to dunk because it's an AI user, I do think it's really frustrating and bad design when actions don't clearly indicate they will make permanent deletions. I've been bitten by similar things before because the effects weren't obvious to me, partly because when I design things myself I automatically add a warning before anything unintuitive, so I expect others to do the same. Even if I do have backups, restoring is usually a big annoyance, and I'd rather never need to.
I have a hard time chuckling at data loss. Especially given that exporting and backing up your data from online services has an even smaller tradition than taking backups of one's local data, which is sadly rare in itself on the individual level.
Not that I've ever had the heart to say it to a friend who has shown up with a computer that won't boot and data they must recover. Sometimes it's the same friend a second time.
This is exactly why I don't rely on the web UI for anything critical. It seems like a mistake to treat a chat log as a durable filesystem. I just hit the API and store the request/response pairs in a local Postgres DB. It's a bit of extra boilerplate to manage the context, but at least I own the data and can back it up properly.
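For anyone curious, a minimal sketch of that setup; it assumes the current `openai` Python SDK and `psycopg2`, and the database/table names are just placeholders:

```python
# Call the API directly and keep every request/response pair in a local
# Postgres table, so the conversation history is yours to back up.
import psycopg2
from psycopg2.extras import Json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
conn = psycopg2.connect("dbname=chatlog")  # placeholder local database

with conn, conn.cursor() as cur:
    cur.execute("""
        CREATE TABLE IF NOT EXISTS chat_log (
            id SERIAL PRIMARY KEY,
            created_at TIMESTAMPTZ DEFAULT now(),
            request JSONB NOT NULL,
            response JSONB NOT NULL
        )
    """)

def ask(messages, model="gpt-4o-mini"):
    """Send a chat request and persist both sides of the exchange locally."""
    resp = client.chat.completions.create(model=model, messages=messages)
    with conn, conn.cursor() as cur:
        cur.execute(
            "INSERT INTO chat_log (request, response) VALUES (%s, %s)",
            (Json({"model": model, "messages": messages}), Json(resp.model_dump())),
        )
    return resp.choices[0].message.content

print(ask([{"role": "user", "content": "Summarise Hyrum's Law in one line."}]))
```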
The question I have: did he anonymize the data or ask his students before putting their names and work into ChatGPT? Because that's a GDPR violation if he didn't.
Someone in the comments of a YT video on that article said that the author most likely breached German privacy laws by submitting their students' homework to a US service.
This guy [1] (in Swedish) was digitizing a municipal archive. 25 years later, the IT department (allegedly) accidentally deleted his entire work. With no backup.
Translated:
> For at least 25 years, work was underway to create a digital, searchable list of what was in the central archive in Åstorp municipality. Then everything was deleted by the IT department.
> “It felt empty and meaningless,” says Rasko Jovanovic.
> He saw his nearly 18 years of work in the archive destroyed. HD was the first to report on it.
> “I was close, so close to taking sick leave. I couldn't cope,” he says.
> The digital catalog showed what was in the archive, which dates back to the 19th century, and where it could be found.
> “If you ask me something today, I can't find it easily, I have to go down and go through everything.”
> “Extremely unfortunate”
> Last fall, the IT department found a system that had no owner or administrator. They shut down the system. After seven months, no one had reported the system missing, so they deleted everything. It was only in September that Åstorp discovered that the archive system was gone.
> “It's obviously very unfortunate,” says Thomas Nilsson, IT manager.
> Did you make a mistake when you deleted the system?
> “No. In hindsight, it's clear that we should have had different procedures in place, but the technician who did this followed our internal procedures.”
In typical Swedish fashion, there cannot have been a mistake made, because procedures were followed! Or to put it in words that accurately reflect having 25 years of work removed: "Own it, you heartless bastard."
I'm pretty anti-AI but this isn't really anything to do with AI. The same problem would arise with any online service that you use to hold important data. And it's pretty evil for any such service to have a trap "delete all my stuff with no warning" button.
One big reason I can think of that would make one want a permanent data purge feature is that the data is not on their premises but on the service provider's. I think the GDPR might even require such a feature under a similar rationale.
So maybe a better formulation would be to force the user to transfer out a copy of their data before allowing deletion? That way, the service provider could well and truly wash their hands of this issue.
Forcing an export is an interesting idea. But, like, from the article it sounds like almost anything would be a better flow. It didn't even warn that any data would be deleted at all.
One further refinement I can think of is bundling in a deletion code with the export archive, e.g. a UUID. Then they could request the user to put in that code into a confirmation box, thereby "guaranteeing" the user did indeed download the whole thing and that the service provider is free to nuke it.
It wouldn't really be a guarantee in technical actuality, but one would really need to go out of their way to violate it. I guess this does make me a baddie insofar as this is probably how "theater" is born: rituals that do not and cannot actually carry the certainty they project, only an empirical one if that.
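A toy sketch of that flow, with a hypothetical service backend: the export bundle carries a one-time deletion code, and the purge endpoint refuses to run until that exact code is echoed back:

```python
# Export-before-delete: bundle a deletion code into the archive, and only
# accept the purge request once the user proves they hold that archive.
import io
import json
import uuid
import zipfile

_pending_deletions = {}  # user_id -> deletion code issued with the export

def build_export(user_id: str, conversations: list[dict]) -> bytes:
    """Return a zip of the user's data, with the one-time deletion code inside."""
    code = str(uuid.uuid4())
    _pending_deletions[user_id] = code
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("conversations.json", json.dumps(conversations, indent=2))
        zf.writestr("DELETION_CODE.txt", code)
    return buf.getvalue()

def purge_account(user_id: str, code_from_user: str) -> bool:
    """Only delete once the user echoes back the code from their export."""
    if _pending_deletions.get(user_id) != code_from_user:
        raise PermissionError("Export your data first; the archive contains the code.")
    del _pending_deletions[user_id]
    # ... actually delete the account data here ...
    return True
```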
A typical example of Hyrum's Law: "...all observable behaviors of your system will be depended on by somebody." It's like how your draft folder feature will be used as a secret messaging app by a general and his mistress, or, as Don Norman points out, your flat-topped parapet will be used as a table for used cups, or your reliable data store of chats will be used as academic research storage.
But I have to say, quite an incredible choice! ChatGPT released in Nov 2022. This scientist was an early adopter and immediately started putting his stuff in there with the assumption that it would live there forever. Wow, quite the appetite for risk.
But I can't call him too many names. I have a similar story of my own: one thing I once did was ETL a bunch of advertising data into a centralized data lake. We did this through the use of a Facebook App that customers would log in to and authorize ads insights access to. One of the things you need to do is certify that you are definitely not going to do bad things with the data. All we were doing was calculating ROAS and stuff like that: aggregate data. We were clean.
But you do have to certify that you are clean if you even go close to user data, which means answering a questionnaire (periodically). I did answer the questionnaire, but everyone who has used anything near Meta's business and advertising programs (at the control plane; the ad delivery plane must be stupendous) knows they are anything but reliable. The flaky thing popped up an alert the next day that I had to certify again, and it wouldn't go away. Okay, fine, I do need the one field, but how about I just turn off the permission and try to work without it? I don't want anyone thinking I'm doing shady stuff when I'm not.
Only problem? If you have an outstanding questionnaire and you want to remove a permission, you have to switch from Live to Development. That's fine too, normally; it's a 5-second toggle that works every time. Except if you have an outstanding questionnaire you cannot switch from Development back to Live. We were suddenly stuck, no data, nothing, and every client was getting this page about the app not being approved. And there's nothing to be done but to beg Meta Support, who will ignore you. I just resubmitted the app, we waited 24 hours, and through the love of God it all came back.
But I was oh-so-cavalier clicking that bloody button! The kind of mistake you make once before you treat any Data Privacy Questionnaire like it's the Demon Core.
It was because of the NYT v. OpenAI case; however, since mid-October they are no longer under that legal order. What they keep retaining now and what they don't, nobody knows, but even if they still had the data they surely wouldn't blow their cover.
> [...] two years of carefully structured academic work disappeared [...]
> [...] but large parts of my work were lost forever [...]
I wouldn't really say parts of his work were lost. At most the output of an AI agent, nothing more.
If somehow e-mails, course descriptions, lectures, grant applications, exams and other tools, over the period of two years disappeared in an instant, they did not really exist to begin with.
For one, the actually important stuff is the deliverable of these chats, meaning these documents should exist somewhere. If we're being honest, everything should be recreatable in an instant, given the outputs and if the actual intellectual work was being done by Mr. Bucher.
Does it suck to lose data? Even if just some AI tokens we developed an attachment to? Sure.
Would I have outed myself and my work so shamelessly, to the point of admitting that clicking a "don't retain my data" option undermines my work like this? Not really.
How can you lose "important work" of multiple years? -- it can't have been that important, and how can somebody _expected to become management_ be so incompetent?
"...two years of carefully structured academic work disappeared. No warning appeared. There was no undo option. Just a blank page. Fortunately, I had saved partial copies of some conversations and materials, but large parts of my work were lost forever."
-- stupid: that drive could have died, the building could have burned down, the machine could have been stolen, the data could have been accidentally deleted...
and all there was: "a partial" backup.
I mean, this isn't even a scenario where he didn't know about the data ("carefully structured") and discovered it wasn't covered by the backup scheme (that would be a _real_ problem).
Another problem would be if your churn is so high that backing up becomes a real issue (bandwidth, latency, money, ...). None of that applies.
And yet they reserve a spot in "nature" for such whining and incompetence?
That's a harsh assumption, IMHO. I don't assume either way, but I think most of us have met someone who is brilliant in one domain but just hopeless at a lot of other things.
How fair would you find it if a potential employer concluded from your (hypothetical) incompetence at picking a suitable hairstyle and outfit for an interview that you were not fit for a non-customer-facing role?
While it absolutely makes sense to keep your important data backed up, I know people who were great academics in their field and yet managed to delete all their PhD work (before services like Dropbox and OneDrive became common).
Hot take: an actual "irreversibly delete X with the next action" control is simply too powerful for most people and bad design, and has probably caused considerably more harm than good in the world. It's particularly silly in software, where few reasons exist for it to be a real thing.
What the average human needs is laws and enforcement, and trust in both.
once upon a time i had a boss who asked for a "super admin" account to "trump" the domain administrators..and a "master key" to decrypt any file , in case the user lost their key.
> once upon a time i had a boss who asked for a "super admin" account to "trump" the domain administrators..and a "master key" to decrypt any file , in case the user lost their key.
Key escrow is a well-known concept in cryptography: a trusted party (or the organisation itself) holds a copy of, or a way to recover, each user's decryption key, so encrypted data isn't lost when someone loses theirs.
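A toy illustration using the Python `cryptography` package: each file gets its own data key, and that key is wrapped twice, once for the user and once for the escrow (the "master key" holder), so either party can recover the plaintext:

```python
# Envelope encryption with escrow: the data key is stored encrypted under both
# the user's key and the organisation's escrow key.
from cryptography.fernet import Fernet

user_key = Fernet.generate_key()    # belongs to the employee
escrow_key = Fernet.generate_key()  # held by the organisation ("master key")

def encrypt_with_escrow(plaintext: bytes) -> dict:
    data_key = Fernet.generate_key()
    return {
        "ciphertext": Fernet(data_key).encrypt(plaintext),
        "data_key_for_user": Fernet(user_key).encrypt(data_key),
        "data_key_for_escrow": Fernet(escrow_key).encrypt(data_key),
    }

def recover(blob: dict, holder_key: bytes, wrapped_field: str) -> bytes:
    data_key = Fernet(holder_key).decrypt(blob[wrapped_field])
    return Fernet(data_key).decrypt(blob["ciphertext"])

blob = encrypt_with_escrow(b"quarterly report draft")
# The user lost their key? The escrow copy still unwraps the data key:
print(recover(blob, escrow_key, "data_key_for_escrow"))
```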
From my perspective as an employee of a German academic institution, administrations are still figuring out if and how to regulate the use of AI, while some professors rely pretty heavily on AI tools, so the story is completely believable. However the double naïveté demonstrated here is strange.
Generating boilerplate prose for grant applications was the beginning, around 2023, which is absolutely understandable. The DFG recently allowed the use of AI for reviewers, too, to read and summarise the AI-generated applications.
Researchers using qualitative methods seem (in general) to be more sceptical.
I wish we had an open debate about the implications instead of half assed institutional rules that will and must be widely ignored by researchers.
Between the lines, it sounds like German academia doesn't bother to warn - formally or informally - its researchers that failing to have working backups can be a train wreck for their careers.
The issue is not backups; the issue is that he is publicly and nonchalantly admitting that most of his work for the past years was AI-based, which may or may not constitute fraud given his professional position. Imagine being a student paying thousands upon thousands expecting expert, human-led instruction just to get this; imagine being a fellow researcher and suddenly not being able to trust this guy's current and past work.
The worst thing is all the people looking at this behaviour as normal and totally acceptable; this is where AI-sloppiness is taking us, guys. I hope it's just the AI bros talking in the comments, otherwise we are screwed.
I like her meta observation, that using ChatGPT for 2 years rots your brain so badly you somehow think it's a good idea to write an article like this, with your real name and professional/academic reputation attached to it, and get it published somewhere as high profile as Nature.
Someone on my Mastodon feed commented that if they'd done that "you wouldn't be able to torture it out of me" and that they'd never admit it to anyone.
Good commentary, good video. She is a little bit too harsh about the data loss though. The author did not realize that disabling data sharing would delete the history of interactions that had already occurred, probably because he did not register that everything was stored on an external server. And it's quite possible there was no proper warning about that.
I feel that makes her point weaker. Because she is apart from that completely right: the work practice admitted to here is horrible. It likely includes leaking private emails to a US company, and in every case it meant the job of teaching and publishing wasn't done properly, not even close.
You mean toggling the data setting? It's on the program to make the implications visible; that's a big part of designing for usability. It's possible ChatGPT did that and the user was unexpectedly dense, but it's more likely the implications were not properly explained/shown. That's why you add undo functionality, which the user even tried to find. Here, given the legal component, an undo available for a short time frame seems like a good fit.
But your comment could equally be about the fact of using ChatGPT for the job in the first place, which I wouldn't justify at all.