Forensic analysis of the Windows telemetry for diagnostics (arxiv.org)
104 points by cx0der on March 2, 2020 | 12 comments


Telemetry data is stored in: "%ProgramData%\Microsoft\Diagnosis\Events_*.rbs"

The paper describes the format of these files, and what data can be obtained from them, including a comparison with other sources of similar information.

Recorded data includes: (1) Windows version, registration details, installed and uninstalled programs; (2) hardware devices with serial numbers; (3) process execution data (at Enhanced or Full levels only, data might not include processes that only ran briefly); (4) partition table and boot timestamps (when the system was powered on and off).

In the analyzed examples the data was available for roughly the past three months.
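
For anyone who wants to check their own machine, here is a minimal Python sketch that lists the files matching the path pattern quoted above. The function name `find_rbs_files` and the `C:\ProgramData` fallback are my own assumptions, not something from the paper. It only enumerates the files; it does not parse the RBS format.

```python
import glob
import os


def find_rbs_files(program_data=None):
    """Return the sorted paths matching <ProgramData>\\Microsoft\\Diagnosis\\Events_*.rbs.

    program_data: optional override for the %ProgramData% directory
    (useful when examining a mounted disk image instead of the live system).
    """
    # Assumption: fall back to the usual default location when the
    # ProgramData environment variable is not set.
    base = program_data or os.environ.get("ProgramData", r"C:\ProgramData")
    pattern = os.path.join(base, "Microsoft", "Diagnosis", "Events_*.rbs")
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    for path in find_rbs_files():
        print(path, os.path.getsize(path), "bytes")
```

Reading that directory may need an elevated prompt, so an empty result does not necessarily mean the files are absent.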


> hardware devices with serial numbers;

Ouch. That sounds a lot like personally identifiable information.


> Since PII have not found so far and Microsoft stated privacy principles with no personal content

I'm going to have to disagree with the authors of the paper, here.

Whilst the information they've found may not appear to be PII at first, it is very far from anonymous.

It has everything required for active fingerprinting of individual devices - namely, the UIDs of the hardware of the computer. Things that don't regularly change, and things that may show habits.

Combining this dataset with another is all it would take to break from pseudo-anonymous to known individuals. However, enough information is there to uniquely fingerprint most users.
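
To make the fingerprinting point concrete, here is a rough Python sketch. It assumes you already have a list of hardware serial strings from some source; the function name and the sample values are made up for illustration and do not come from the paper or from real telemetry. Because the serials rarely change, hashing them gives the same identifier every time, which is what lets two separate datasets be joined on it.

```python
import hashlib


def device_fingerprint(serials):
    """Derive a stable identifier from a collection of hardware serial strings."""
    # Normalise and sort so the result does not depend on ordering,
    # letter case, or stray whitespace.
    normalized = sorted(s.strip().upper() for s in serials if s.strip())
    digest = hashlib.sha256("\n".join(normalized).encode("utf-8"))
    return digest.hexdigest()


if __name__ == "__main__":
    # Hypothetical serials, invented for this example.
    from_telemetry = ["disk-0001", "NIC-ABCD", "gpu-42"]
    from_other_dataset = ["GPU-42", "DISK-0001", " nic-abcd "]
    # The same hardware seen in two datasets yields the same fingerprint.
    print(device_fingerprint(from_telemetry) == device_fingerprint(from_other_dataset))
```

No name or birth date is involved anywhere, yet the identifier still singles out one machine.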


Still not legally PII. Correlated assumptions don't qualify. Bad PR but not PII. PII means very specific things: "a person called jack ryan owns this PC" is not PII, "the owner of this PC was born on 01/02/03" is PII.

From a PR perspective it sort of looks bad, but please look at the comparison table in the paper, where similar data is collected in the registry, logs, prefetch, etc. Unlike *nix, Windows does a ton of activity logging, and it has been this way for a long time. Most people know of the application, system and security logs, for example, but in the same log directory there are usually 100-200 other log files, including IE browsing-related logs.


> Still not legally PII. Correlated assumptions don't qualify. Bad PR but not PII

Which is why I didn't call it PII, but also emphasised that it is also not anonymous.


Sure. But most other places (unsure about Apple) don't automatically ship those logs off to the maker of the OS for centralised bulk collection, analysis, and monetisation.


Did not say it was nice of them.


RBS file parsers (Python) the authors wrote, along with the sample telemetry data files used in the study: https://github.com/JaehyeokHan/Windows-Telemetry


Excellent paper.

I have questions:

1) Is turning off telemetry (opt-out) effective against this? 2) How will this differ between licenses? I would be very interested to see what is collected when you have something like an E5 license and have Defender ATP and AIP turned on (I don't have that currently). I recall it sends a ton of data (>2000k DNS requests/hour for an active user, just for new connections to MS); perhaps some of that is left on disk? Would file classification with AIP (e.g. when a new document/email is created) be logged? Is it fair to assume the Win10 they tested with is not an enterprise edition?


Regarding your first question: there's only one edition that lets you turn off telemetry completely: Windows 10 Enterprise. You can set it to "Security", which means nothing but the following information is sent: "Information that’s required to help keep Windows, Windows Server, and System Center secure, including data about the Connected User Experiences and Telemetry component settings, the Malicious Software Removal Tool, and Windows Defender."


I was hoping to find out if it collects keystrokes, but the authors don't seem to mention it.


MS Teams does when you type into a chat box, but it depends on whether your company is opted out somehow, or maybe whether they use EU servers. There's a setting that gets pulled down and overwritten every launch. I had to create a mitmproxy edit to save my own config for the Windows client.



