Hacker News
Categories for Your Note Archive Are a Bad Idea (2015) (zettelkasten.de)
33 points by Tomte on July 26, 2022 | 20 comments


This is an argument against taxonomic classification, not categories per se. In Wikipedia, pages can belong to multiple categories, and categories can have multiple parent categories. It's more complex than single hierarchical taxonomies, but it's also a lot more expressive.

That expressiveness is something I find missing from a lot of tag systems. Most have no structure at all, when even a little structure makes things a lot more convenient. At the very least, I want hierarchical tags. Let things I tag "physics" show up in a search for "science"!

(I think most systems don't have structured tags because it raises a lot of implementation and UI questions. How do you prevent cycles? Do searches go one layer deep or are they fully transitive?)
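For what it's worth, the two questions above can be answered in a few lines. This is only a minimal sketch, assuming tags form a graph where each tag may have several parents; the `TagGraph` class and its method names are made up for illustration and don't come from any real tagging system. Cycles are prevented by refusing any parent link that would make a tag its own ancestor, and searches are fully transitive.

```python
from collections import defaultdict


class TagGraph:
    """Hypothetical structured-tag store: each tag may have several parents."""

    def __init__(self):
        self.parents = defaultdict(set)  # tag -> direct parent tags
        self.items = defaultdict(set)    # tag -> items tagged with it directly

    def ancestors(self, tag):
        """All tags reachable by following parent links (fully transitive)."""
        seen, stack = set(), [tag]
        while stack:
            for parent in self.parents.get(stack.pop(), ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen

    def add_parent(self, child, parent):
        """Add a parent link, refusing any link that would create a cycle."""
        if child == parent or child in self.ancestors(parent):
            raise ValueError(f"{child} -> {parent} would create a cycle")
        self.parents[child].add(parent)

    def tag(self, item, tag):
        self.items[tag].add(item)

    def search(self, tag):
        """Items tagged with `tag` itself or with any descendant of `tag`."""
        found = set()
        for candidate, tagged in self.items.items():
            if candidate == tag or tag in self.ancestors(candidate):
                found |= tagged
        return found
```

With `add_parent("physics", "science")` in place, an item tagged "physics" shows up in `search("science")`, while a later `add_parent("science", "physics")` is rejected. A one-layer-deep search would check `self.parents` directly instead of calling `ancestors`.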

See also the first comment in TFA:

> I maintained the largest collection of dishes served by restaurants for more than 10 years. [...] 70% of the metadata was captured via tags, manually added or automatically extracted from plain text. Our clients notified us that some of the entries were of poor quality: for instance some dishes were tagged with "vegan" and "beef" at the same time.


#vegan and #beef is an interesting example which I didn't get.

Either a dish is vegan, or it has beef. So a dish tagged #vegan and #beef is either wrong about the #vegan tag, or has... vegan beef, which ontology aside, exists in the sense that there are foods with this name.

Hierarchies can't fix misclassification: a beef-containing dish doesn't belong in the #Vegan: namespace either.

As for organizing tags, something I'd like to see (pending a reasonable UX) is just being able to tag tags. #physics could be tagged #science, and #science could be tagged #physics; cycles aren't a problem for set union.
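To illustrate the set-union point, here is a rough sketch, assuming tags-on-tags are stored as a plain dict; `expand`, `search`, `tagged_with` and `items_by_tag` are hypothetical names invented for this example. Searching a tag unions the items of every tag that carries it, directly or indirectly, and a visited set means a cycle just stops growing the result instead of looping forever.

```python
def expand(tag, tagged_with):
    """Return `tag` plus every tag that carries it, directly or indirectly.

    `tagged_with` maps a tag to the set of tags applied to it (assumed layout).
    """
    result, frontier = {tag}, [tag]
    while frontier:
        current = frontier.pop()
        for candidate, labels in tagged_with.items():
            if current in labels and candidate not in result:
                result.add(candidate)
                frontier.append(candidate)
    return result


def search(tag, tagged_with, items_by_tag):
    """Union of the items under every tag in the expansion of `tag`."""
    found = set()
    for t in expand(tag, tagged_with):
        found |= items_by_tag.get(t, set())
    return found
```

If #physics and #science tag each other, searching either one returns the same union, which is arguably what a cycle ought to mean here.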


A dish with an option could legitimately be both. E.g., burritos at my local Mexican place allow a variety of meat and non-meat fillings for a single item on the menu.

If they didn't separate, e.g., burrito options into different entries, then you'd get a burrito entry with both "vegan" and "beef" correctly.


The music site RateYourMusic has a hierarchical genre tree. They might organize it like this:

- beef [meta]

- - beef (co-parent: meat)

- - vegan beef (co-parent: vegetarian faux meat)


A dish tagged #vegan and #beef would imply a vegan meat version.


Or a vegan dish that is also good if you add meat.


That would only exist in an alternate universe.

One universe of vegan dishes and non-vegan dishes.

Another universe where the tag is generated by meat chefs for vegans: this vegan dish would be great if only you added some meat... or something like that.


I like to go to the restaurant chain called Wagamama. They have vegan dishes where you can optionally add different kinds of meat.


> "I like to go to the restaurant chain called Wagamama."

The Wagamama-universe. haha.

I don't know this chain, but your example poses an interesting problem. Is a Caesar salad, add chicken, a new dish? I would argue it's not a new recipe, but a manifestation of how the restaurant caters to patrons' desires. Or, a manifestation of how salads can be minimally altered with grilled chicken to turn a side dish into a main dish for lunch. So maybe there are better, less confusing, tags for restaurants that permit these alterations? `add-meat-optional`. And that's where I find this to be an interesting example.

So now... You're in your sandbox classifying "important things", and you might think to add an `add-meat` tag (or whatever). But what if you're just wasting time with your phone, standing in a queue, adding bookmarks in Firefox (does the phone app have tags? I don't know)? So now... you might not recall the rules of your ontology. Instead you're drawn into the _alternate universe_ created by the website's design, or its branding. Branding intended to tweak your perception, and you start adding strange tag combinations without thinking about how this will goof up your search algorithm. Tag salad, haha.


Categories are an excellent idea if the system is small enough and everybody using it is familiar with the convention. It's just that you have to accept they are flawed, inaccurate, partial and biased.

That doesn't mean they are not useful.

Think about supermarket scales: each vegetable or fruit is in a single category. The category is flawed, but nobody wants to have to type a precise list of tags to filter what they need. You want to go to vegetables and choose tomatoes, not have an inner debate about tomato taxonomy. Isn't it a fruit? Wait, is fruit a biology thing or a commercial concept? I don't care, give me my damn tomatoes!

It's the same for my notes. Some stuff is in weird directories. But I know how I sort them, so I can find them quickly. I don't want to have to be precise or accurate, only to find what I need.

Categories don't replace tags, but tags are mainly useful for advanced searches and filtering. Anyway, the best systems out there treat a category as a tag, just the main tag.


Is anyone aware of classifiers that work well on {content text} + {tag collection} + {search history} and generate categories? Specifically something that would be usable on personal notes.

Categories seem incredibly useful at exploration-time, but impossible to write at creation-time.

Tags seem incredibly easy to write at creation-time, but aren't useful at exploration-time.

Why not both, with a fuzzy link between them?


I don't have any kind of system for notes, but I've found the "tagmash" feature on LibraryThing to be useful: https://blog.librarything.com/2007/07/tagmash/


My go-to is to consider the first tag to be the category, or the directory the note is in. Depending on the system, you may reverse it: the category is the main tag.


I'm just waiting for the tongue-in-cheek article from someone that says "Why you should use the Dewey Decimal Classification to organize your notes". And then the refutation based on the limitations of DDC, and the inevitable follow-on article that says "Why you should sort your notes by Library of Congress Classification instead of Dewey Decimal".


I can already give the refutation for LoC.

Library of Congress Classification Outline: Class D - World History and History of Europe, Asia, Africa, Australia, New Zealand, Etc. https://www.loc.gov/aba/cataloging/classification/lcco/lcco_...


Categories are mutually exclusive.

Tags aren't. I much prefer searching in intersections of tags to searching for needles that may be in different haystacks.
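Searching in intersections is cheap to do if you keep an inverted index from tag to notes. A minimal sketch, assuming such an index is just a dict of sets; `search_all` and the sample data are hypothetical:

```python
def search_all(tags, index):
    """Return the notes that carry every tag in `tags`.

    `index` maps a tag to the set of note ids carrying it (assumed layout).
    An unknown tag matches nothing, and so does an empty query.
    """
    sets = [index.get(tag, set()) for tag in tags]
    return set.intersection(*sets) if sets else set()


index = {
    "python": {"note-1", "note-2", "note-3"},
    "jupyter": {"note-2", "note-3"},
    "data-analysis": {"note-3", "note-4"},
}
matches = search_all(["python", "jupyter", "data-analysis"], index)
```

Each extra tag can only narrow the result, which is the "one haystack, several filters" behaviour described above.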


No, they aren't. Storage position is mutually exclusive, if you choose so, but that's not a property of categories vs. tags.

E.g. the Library of Congress assigns books to several subject headings (categories) where applicable.


Depends on what meaning you put on each word. There is no One True Meaning in this case.


I'm putting the finishing touches on my own[1] wiki/personal knowledge base and eventually decided against any categorization whatsoever.

The main reason is that I don't want to be caught up wasting time _curating_ my notes, which all other note-keeping software seems to encourage or require. My notes are a tool that I depend on simply to function adequately and are neither hobby nor art. I don't want to answer the same questions over and over, like:

* Which category does this page fit into?

* If there are two categories, which is more important?

* Which set of tags most accurately describes the content of this page?

* And will those change when I edit it later?

I find all of my content in three ways:

1. Most articles have one obvious title, e.g. "Python" or "Proxmox". I have a shortcut set up in my browser so that when I go to the URL bar, I type "w python" and it will take me to the page in my notes about Python.

2. The wiki has a very good search engine, which can search page titles and page bodies. Similar to the above, I type "ws python" to get search results of all pages that mention "python" somewhere in them.

3. Very occasionally, I will click on a link on one page to go to another page.

And what would be the point of categorizing all my notes? Every single time I go to my wiki, it's to either write down something specific or search for something specific. I have _never_ wanted to see a list of all of my pages about programming languages for example. Or every page tagged "bash".

I think as software engineers building our own tools, we sometimes build features because they sound interesting and we know how to do it, or because the project doesn't "feel" complete without them. Not because we'll ever actually use them.

When I _do_ want to break up a large subject (e.g. Python) into multiple pages, I just create one "Python" page and link to all of the others from that page.

The one concession I've made to categorization/organization is that I've added a feature where two pages can be marked as "related" to one another. This is mainly to avoid having a manually-edited "See Also" section on pages that touch upon topics covered on other pages.
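For readers wondering what a "related" marker amounts to, it is basically a symmetric relation between pages. The following is only an illustrative sketch and not how the linked project implements it; the `RelatedPages` class and its methods are invented for this example.

```python
from collections import defaultdict


class RelatedPages:
    """Hypothetical store for a symmetric "related" relation between pages."""

    def __init__(self):
        self._related = defaultdict(set)  # page title -> related page titles

    def relate(self, a, b):
        """Mark two distinct pages as related, in both directions."""
        if a == b:
            return
        self._related[a].add(b)
        self._related[b].add(a)

    def related_to(self, page):
        """Titles related to `page`, sorted for stable display."""
        return sorted(self._related.get(page, ()))
```

Because the link is stored in both directions, marking "Python" as related to "Jupyter" once is enough for each page to list the other, with no hand-edited "See Also" section on either side.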

[1]: https://github.com/cu/silicon


Have you considered tagging your notes according to how you intend or potentially expect to use them? Instead of spending time thinking about where the note came from or what it's about, mark it with whatever topic or publication you are working on when you make the note. When you come back to use your KB to write or build something, you can start with the set of notes you previously collected for that purpose.

If you're skeptical, try it a little bit. It's kind of amazing how it can jumpstart work. In fact, you can backfill notes you already have using this concept. Whenever you are working on some output and you look to your KB for guidance, mark any notes you use according to why you found them useful.

For example, it's probable that not all the notes about "Python" will be relevant when working on Jupyter Notebooks. But you might also want any relevant notes about data analysis in Jupyter. Tag the notes you use in the output, and next time you revisit the subject, there will already be a set of useful starting points.



