Hacker News
Why Folders are holding back hassle-free File Management & Tags are the Future. (doctape.com)
20 points by cedel2k1 on Aug 13, 2012 | 57 comments


I tried tagging my files for a while. I quit. Part of it was that tagging is done via an add-on, of course, and it added an additional step to creating a file.

But I also feel like there are a lot of problems that need to be solved about streamlining the process of tagging files, dealing with typos, and auto-creating maps of the tag structure.

And even if we did move to a wholly tag-based userland, your non-technical relative/coworker would STILL find ways to lose their documents.

Maybe tagging will be a better way. But I think "files" in "folders" have survived for the entire history of file systems for a good reason - they're conceptually simple for our brains to manipulate. There's enough of an aura of physicality to them that we can leverage the part of our brain that's good at storing maps. Honestly I think the way to move forwards is more along the lines of Raskin[1] - interfaces designed to make things MORE spatial - than into tags, which are incredibly ethereal to our brains.

Files and folders are, I think, a local maximum of efficiency. Tags may be more efficient in the long run, but there's a painful trough of uselessness to trudge through before we can get there.

(Other problems I see with tags: - importing existing filesystems, without losing the existing hierarchies - do YOU want to go back and manually tag every file you've ever made and are still holding on to?

- can't control the importance of information. Sure, that project you just finished gets drawn huge in your tag cloud. But it's done, you want to move on, you don't even want to think about it right now because you're in the middle of the next one.

- suggesting tags as you type is not good enough, there needs to be a lot more work put into associating tags, so that if I'm saving a file with, say, the tag for my current comics project, I'm instantly presented with the associated tags of "web final images", "book file images", "fan art", "model sheets", and various subtags of those. )

[1]: http://www.raskinformac.com/features.php - and now I kinda want to try that out again, it's been a while.


Let's see. I'm sitting in front of a workstation with about ten gigabytes of data. It stores all manner of projects, from pure mechanical to pure electronics as well as software-only products. They are organized quite well by product or project as required. Each product lives in a stand-alone directory that is fully self-contained. Need to work on that project on another workstation? Clone it or fork it onto that workstation and you have all files relevant to that project and nothing else.

This method works very well and has been in use for quite some time across multiple workstations, operating system revisions, tools and software revisions as well as a number of engineers. The only effort required in order to maintain this system is to abide by common sense agreed-upon directory structures. For example, if working on an electro-mechanical project, the fasteners might go in the "Fasteners" directory under the "Mechanical" folder and the embedded software might go under "Software", which, in turn, lives within the "Electronics" directory. Not tagging. Nothing that can get lost or royally FUBAR'ed if tags are lost or corrupted, etc. Just a darn simple file cabinet analog that works very well and is perfectly usable, easy to understand and fully searchable.

Don't get me wrong, I like tags. They are great for certain applications. I just don't see them as practical for anything I've ever touched in terms of project or file management.


Yet another company promoting a file management app to fix a problem that does not exist. It's not even particularly unique. I've seen at least a dozen such solutions. Even tried a couple of them (Anyone remember Brain - an associative array file management platform?). I always gave them up.

The real problem, I think, is that folders reflect reality. Tags do not. Most people think of a file as belonging in one specific category. So you tag it as such. Very few people would bother trying to remember multiple tags to apply to their files. The process of organization is too cumbersome, so you stop doing it.

So what do you get if you only apply a single tag to a file? You get a flat list of folders. If the tags have a hierarchy, you get exactly the same thing as using folders. So why bother with tags?

With my Email I don't even use folders. In that case, I don't bother organizing anything. Search works just fine for me. In fact, search works just fine for most of my files these days.


"folders reflect reality" +1


I fought tooth and nail to not support proper albums for The OpenPhoto Project. After about 6 months I gave in and we implemented a native album API[1] (not something that rides on top of tags).

It's true. The argument that tags do everything that albums do and more is false. Perhaps for an engineer but the brain thinks in terms of albums (or folders for files). Nature or nurture, it doesn't matter. This is not a battle you win with engineering.

[1] https://github.com/openphoto/frontend/issues/124


The obvious benefit to using tagging is that tags can be anything. The obvious detriment to using tagging is that tags can be anything. Take, for example, a contact in an address book. Did I tag "contact", "friend", "acquaintance", "people I know", or something else? How do I know what to search for? Obviously a contrived example but it easily extends for any type of document; put simply, a hierarchy is much easier to remember than a "cloud" of unlimited options. Not to mention the time it takes to save a file in a hierarchy versus tagging (how many tags should I use for optimal performance?)

I think the true future is in a hybrid. Improve the metadata on a file (Windows is making giant leaps in this territory), improve file search indexing and capabilities, and retain a hierarchy. This allows ease of both searching and browsing, and is not any different from what we're doing today (no retraining Grandma).


One way to deal with ambiguous tags is to use aliases and related-to indicators.
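A minimal sketch of the alias idea, assuming a hand-maintained alias table. The names `ALIASES`, `canonical`, and `canonical_tags` are hypothetical and not taken from any particular tagging tool; the example tags are the address-book ones from the parent comment.

```python
# Hypothetical alias table: every spelling a user might type maps to one canonical tag.
ALIASES = {
    "friend": "contact",
    "acquaintance": "contact",
    "people i know": "contact",
}


def canonical(tag):
    """Return the canonical form of a tag, falling back to the cleaned-up tag itself."""
    key = tag.strip().lower()
    return ALIASES.get(key, key)


def canonical_tags(tags):
    """Normalize a collection of tags and drop duplicates created by aliasing."""
    return {canonical(t) for t in tags}


print(canonical_tags(["Friend", "contact", " Acquaintance "]))  # {'contact'}
```

Someone still has to maintain the table, which is the taxonomy effort the reply below worries about.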

But I agree with you that a system can't discard the file hierarchy without discarding important qualities many people need.


That's of course a good point. However, one of the things the tagging proponents claim is that tagging reduces the effort of taxonomy. Having to manage aliases and tag links would then negate any such benefit, and might even be much more time consuming.

Now, were we to talk about some standard of tagging that has link relationships baked in, we might be talking. But then we sacrifice customization.


There have been tagging apps that I've tried in the past (e.g. http://www.ironicsoftware.com/leap/) but in my experience, tagging is significantly more work than using folders.


Tags are not a replacement for folders, and the opposite is true too.

A folder hierarchy defines a directed graph. Combining tags loses the "directed" part. It may be good enough for some, but it's not the same and will fall short in some cases. This difference is the first point mentioned in the article, which says that folders are inflexible. In some cases, yes; then use tags. In other cases, folders will be exactly what's needed and tags will be cumbersome.

The two other limitations (related to searching and sharing) are bogus IMHO. They're not limitations of a folder system, but maybe of some implementations.


Tags are good in conjunction with folders, because they provide another way to search when you forget where a file is, and you don't remember the file name.

But folders are really good at collecting files with a similar purpose together. Does a good tagging system have a notion of a "collection" or something, instead of just categories and tags? If so, it just seems like folders by a different name. Moving away from folders entirely, I would be wary of losing track of some seldom-used files completely.

But maybe this is just as much about how files are stored on disk, and disk efficiency.


The ideal implementation of tags (at least during the transition) is tags as pseudo-folders. That is, you drop a document into "Projects"->"2012"->"Foo" and the flexibility of tags is simply hidden away until needed.

When it is needed, that file can also be 'added to' the 'folders' "Teams"->"Alpha" and "Clients"->"XYZ Corp".

So, sort of like folders by a different name. But more like folders++. With complexity hidden until desired.

How files are physically stored is neither here nor there.


I completely disagree with this based on my experience working with several companies who moved to Google Apps. Some of these companies are small businesses (10-25 employees), but some are large organizations with hundreds of employees and retail locations around the country.

The problem with tags as pseudo-folders is that it breaks a core promise of "folders". Folders are a metaphor, and when you use a metaphor like files & folders, you're making a promise to the user. You're saying: "I'm making these things familiar to you so that you don't have to learn a new behavior."

To be clear, I have no problem with tags, and I do believe that the standard "hierarchy of folders" we use today is untenable with the volume of documents we generate, but extending folders is a bad idea.

The reason this breaks so badly is that users expect to preserve the basic move/copy dichotomy of a file/folder when we represent items as such. This expectation is rooted in the metaphor of a file or folder in the real world. I cannot place a physical document into two folders without copying it. When I copy the document, changes made to the replica are not retroactively applied to the parent.

I have seen many Google Docs users lose hours of work because they used the "collections" feature of Google Apps incorrectly. For example, I've seen plenty of users do the following in an attempt to copy a file to a new collection in Google Docs:

* Locate an item that is already in a Collection

* Check the item in the Document List

* Click the "Organize" icon (it's a folder)

* Check the box next to another collection where they want the copy

* Click "Apply Changes"

The vast majority of the users I work with assume that the "same" document cannot exist in two places at once. They assume this because that is the way that files & folders have always worked. If an action results in the appearance of two documents, there are two documents. The item has been copied. It doesn't matter whether the language "copy" is used. The action is implicit in the learned behavior of files & folders.
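The mismatch can be sketched in a few lines. This is only an illustration of the two mental models, assuming a document is a plain dictionary and a collection is a list; it says nothing about how Google Docs is actually implemented.

```python
import copy

doc = {"title": "Q3 plan", "body": "draft"}

# "Adding to a collection" as tagging: both collections hold the same document.
collection_a = [doc]
collection_b = [doc]
collection_b[0]["body"] = "rewritten"
print(collection_a[0]["body"])  # rewritten

# What users expected: a real copy that can be edited independently.
duplicate = copy.deepcopy(doc)
duplicate["body"] = "scratch edits"
print(doc["body"])  # rewritten
```

An edit made "in" the second collection changes the document seen from the first, which is exactly the surprise that costs people their work.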

Tags should be called tags. Users are smarter than you think. As engineers, we might be frustrated by the assumption made above, but in reality, the user is drawing an inference based on past experiences. They're applying conceptual knowledge that we taught them. That's "smart".

Implement a tagging system in a way that solves users' problems and they'll flock to it, but don't make it opaque by masquerading as something they already know.


It's not about whether users aren't smart enough for tags. It's whether tags work the way users want them to.

Tags are for organization. People like to organize things in terms of visual hierarchies and landmarks. Raw tagging systems aren't popular, because most people don't memorize precise strings. [1] They remember roughly what directories look like and 'know where things are' based on whether they believe they can confidently determine "it" from "not it" while they navigate a structure.

e.g. Was the tag "RaptorX" or "2012 Predator" or "FooBird"? The user doesn't know. But they do know that when they navigate their project folders, they'll be able to pick out whichever project was fuzzily translated to "something like a bird of prey" in their mind. And they'll be able to do this as they navigate into a tree, as fast as if they had remembered exact strings. [2]

So when people are looking for something quickly, they're going to like the visual hierarchy. And when they're first saving a file [3], or when they're looking to open something when they 'know' where it is, they respond very positively to a visual hierarchy.

It's not about 'hiding' the fact that those 'folders' are tags. Or hiding the fact that files can live in multiple places (with multiple versions no less). We're still talking about new software that people would have to be trained on and told about its particular attributes.

So the fact that, yes, you need a dialogue to capture explicit intentions when doing move/copy operations in a pseudo-folder tag system is not a problem.

[1] Also why command line OSs and command line navigation are unpopular. People really like visual organization.

[2] In my experience, this is what people are really saying when they insist they'd rather navigate than search, even when they 'know' where something is. It's because they "know" where something is, in terms of landmarks in the visual tree. They know in a fuzzy way. And fuzzy searching is either frustrating, or requires nailing down enough tangential data (user/date/file type) that searching isn't any faster than navigating.

[3] Almost no-one wants to think about proper taxonomy or organization or an exhaustive document profile when they're drafting a new document. They just want to stash things somewhere where they can find them tomorrow and/or reasonably direct a third party to them. Proper organization is almost always a discrete mental process that happens after a document is 'done'.


I agree with pretty much everything you said there. I guess you could summarize my post as, "just don't call them folders."

I do like that idea though. Users often treat folders like tags. In a manner of speaking, folders are tags, in that they label the contents. They just happen to be organized in a hierarchy. The tags get more specific as you dig deeper into the tree.

I'll have a go at reasoning by anecdote. For some reason, the default behavior of the Finder search field searches your entire computer as you type. This always struck me as odd, because it is 180 degrees from what I wanted. I would frequently:

* Navigate to a relatively shallow point in a folder hierarchy

* Start searching from there, expecting that my search would be constrained to this location

I do this because I know (generally) where something might be, but I can't recall exactly where.

I ended up reconfiguring Finder to behave like this.

If you split the folder path by separator, you often get a set of tags. So, by drilling down into a folder hierarchy, you're "scoping" your search by tags, getting more detailed as you go.
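A rough sketch of that observation, assuming POSIX-style paths. The helper names `path_to_tags` and `in_scope` are made up for illustration, and the sample path reuses the "Projects"->"2012"->"Foo" example from upthread.

```python
from pathlib import PurePosixPath


def path_to_tags(path):
    """Split a folder path into its components, broadest first."""
    parts = PurePosixPath(path).parts
    # Drop the root marker ("/") so only the named folders remain.
    return [p for p in parts if p != "/"]


def in_scope(path, scope):
    """True if the path's leading tags match the scope, i.e. it lives under that folder."""
    tags = path_to_tags(path)
    return tags[: len(scope)] == scope


print(path_to_tags("/Projects/2012/Foo"))            # ['Projects', '2012', 'Foo']
print(in_scope("/Projects/2012/Foo", ["Projects"]))  # True
```

Note that the result is an ordered list rather than a set: the order is what makes drilling down go from broad to specific.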

I think hierarchy may be the missing piece of the tagging puzzle. A hierarchy would provide guidance (and a visual component). It says, "start broad, then get specific". Imagine a workflow where you would navigate to a specific point in the tag tree, then create a document. Doing so would associate the document with the current tag set without any additional work. If needed, the user could add additional tags, drilling down as they go.

My most important point about "folders" is just that we should avoid referring to these new systems as such. If we want to make progress, we need to unhinge users' thoughts from old concepts.


> "My most important point about "folders" is just that we should avoid referring to these new systems as such."

Fair enough and I absolutely agree. Though be prepared for users to refer to them as such.

And as to tag hierarchies - that's where things have been going. At least, those things I follow (DMSs, CMSs-not-named-Wordpress). It also addresses the concern raised upthread of "Company A\Secrets\Company B vs Company B\Secrets\Company A". With hierarchical tags, this isn't remotely a problem.


I think that the problem with tags is the lack of intrinsic hierarchy. While you can flatten the hierarchy by enumerating all the tags, this approach is boring if done manually or requires an accurate ontology if done automatically. Another risk is a result of the paradox of choice. Which tags are appropriate? Did I use the same word as before? Will I be able to remember it? With folders you are able to further refine your lookup, and a document must be in a folder anyway, while it can lack a tag. Of course these are just points that can be overcome one way or another by a good implementation, but folders, symbolic links and tags together are IMHO a better alternative than tags alone.


Of course, folders imply more of a relationship than tags do.

If I'm keeping files on what multiple companies are up to I might have: Company A -> Secrets -> Company B and Company B -> Secrets -> Company A

To denote the secret files that each company is keeping about the other. The tags would be identical, but the folder structures entirely separate.
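The point can be shown directly by modelling a folder path as an ordered tuple and a flat tag set as an unordered set (an assumption made purely for illustration):

```python
path_a = ("Company A", "Secrets", "Company B")
path_b = ("Company B", "Secrets", "Company A")

# As ordered paths the two locations are distinct...
print(path_a == path_b)                        # False
# ...but as unordered tag sets they collapse into the same thing.
print(frozenset(path_a) == frozenset(path_b))  # True
```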


To make them work like folders, you would need a query like "Team and Alpha but nothing else" because otherwise each pseudo-folder would also have all subfolders' documents in it.
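A small sketch of the two kinds of query, assuming an in-memory index from file name to tag set. `INDEX`, `having_all`, and `having_exactly` are hypothetical names, and the file names are invented.

```python
# Hypothetical index: file name -> set of tags.
INDEX = {
    "roster.txt": {"Teams", "Alpha"},
    "budget.xls": {"Teams", "Alpha", "2012"},
    "notes.txt": {"Teams", "Beta"},
}


def having_all(tags):
    """Usual tag search: every file carrying at least these tags."""
    wanted = set(tags)
    return sorted(name for name, t in INDEX.items() if wanted <= t)


def having_exactly(tags):
    """Folder-like listing: only files whose tag set is exactly these tags."""
    wanted = set(tags)
    return sorted(name for name, t in INDEX.items() if t == wanted)


print(having_all(["Teams", "Alpha"]))      # ['budget.xls', 'roster.txt']
print(having_exactly(["Teams", "Alpha"]))  # ['roster.txt']
```

The "at least" query pulls in `budget.xls` from the would-be subfolder; only the exact match behaves like opening a single folder.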


Umm, there's this thing called hard/soft links...


This seems the most intuitive. You could have some flattened view of a folder where it shows everything with tags, or you could navigate by folders. So if you have two folders and two subfolders by the same name, with the flattened view you can see all of the similar items. However, doing away with hierarchical folder structure is a bad idea.

    > module-1
        > views
        > other-thing
            > views
    > module-2
        > views
    > module-3
        > views


Eventually users might get to the point that the hierarchy is unnecessary. Surely it's more efficient to type out "2012, foo, xyz corp, alpha". And it's certainly easier to type out a few tags than navigate a few trees to do a search.

But people have stubbornly resisted that for so long, I wouldn't hold my breath. [1]

[1] It seems just about every document management system since the 90s was designed under the "search, don't scan!" assumption, stubbornly clinging to theoretical efficiency and then grumpily (and often painfully) adding back in 'artificial' folder structures built from the tags for both file->open and file->save UI.


Not necessarily. The biggest problem I've had with tags is consistency. When I'm tagging photos of my kids, every time I have to remember "did I tag these with kids, rebecca, becky, daughter, or children?" It gets tedious. Sure, you can maintain a list of commonly-used tags, but that list often changes on a per-workflow basis. Folders give you the benefits of tags with a hierarchy and a smaller search space. When I open my work folder, I see a list of each of my projects. From there, I can go into that project, and see directories that are specific to that project. In a tag-based system, I have to remember all of the requisite tags from the beginning, or I have to go through an iterative process adding tags, seeing if I got the right one, and if so, then looking at the new list of suggested tags. Folders let the computer remember how things are organized and let the human select from that organization.

Also, this article gets the details wrong. For example, you certainly can nest folders in the real world. Take a personnel file cabinet, with a 2012 drawer, with a hang folder for each employee that contains a manila folder labeled timecards. That's 4 levels of nesting and it works fine.


Tags subsume folders in the UI. They can replicate everything folders do, and then do more.


Can you give some examples of this? Would this be an emulation layer to support classical hierarchies or something else entirely? Academically, I've seen some research in this area from file systems researchers and it certainly has not been a trivial problem for them, particularly due to nuances that emerge like cycles in graphs (maybe? I know there are some data structure oddities, though I'm not certain what they are), labeling, and when requirements like backwards-compatibility get thrown into the mix.

It's often said that the devil is in the details. Simply saying, "Tags subsume folders in the UI" likely has many cascading implications and edge cases.


Trivially, you can replace folders with a certain class of tags for which you enforce the folder hierarchy rules. That is, tags of this type are organized in a folder-like hierarchy, and if you want to apply a tag to a file, all the tags above it in the hierarchy are automatically also added to the file. In this way, you could have multiple parallel folder hierarchies if you really wanted to.

But this is obviously messy if you don't do it right. I agree that implementing tags in a way that completely replicates the abilities of folders without burdening the user is non-trivial.
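The rule described above (applying a tag also applies every tag above it) might look like this. It is a sketch under the assumption that each tag has at most one parent; `PARENT`, `with_ancestors`, and `apply_tag` are hypothetical names, and the tags reuse the examples from upthread.

```python
# Hypothetical tag hierarchy: child tag -> parent tag (None marks a top-level tag).
PARENT = {
    "Projects": None,
    "2012": "Projects",
    "Foo": "2012",
    "Clients": None,
    "XYZ Corp": "Clients",
}


def with_ancestors(tag):
    """Return the tag plus every tag above it in the hierarchy."""
    tags = []
    while tag is not None:
        tags.append(tag)
        tag = PARENT[tag]
    return tags


def apply_tag(file_tags, tag):
    """Add a tag to a file's tag set, automatically adding its ancestors too."""
    file_tags.update(with_ancestors(tag))
    return file_tags


tags = apply_tag(set(), "Foo")
apply_tag(tags, "XYZ Corp")  # a second, parallel hierarchy on the same file
print(sorted(tags))  # ['2012', 'Clients', 'Foo', 'Projects', 'XYZ Corp']
```

Keying the hierarchy by bare tag name means "2012" can only live under one parent, which is one of the ways this gets messy in practice.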


What do these people who continually trash the concept of folders do for a living? Do they even work at all?


Again, BeOS had a working indexing system so the folders it supported didn't effectively matter and no tagging was needed. Of course, you could create virtual folders for queries.

What made the difference to other indexing systems was that the queries were fast and real time because it was all implemented in the file system itself. That's what's needed to make people first trust indexing and subsequently use it as a daily tool, like we use 'ls'.


How would this file system work for the non-technical user? I'm sure there will be plenty more people chiming in on the inefficiency and inferiority of the folder structure, but how about the average user?

Genuinely curious. The more I think about this type of system, the more questions I have. It seems like you would only want to use this with an SSD or other non-disk based storage.


"How would this file system work for the non-technical user?"

It does raise a number of questions that aren't immediately obvious without immediately falling back to a hybrid approach, even if it's internally. But then, you're back to folder structures again, just in a way that is opaque to the user.

For me, I was actually curious how existing namespaces in languages that are so intertwined with nested folder structures would now work (such as Java and Python).


Not to mention web browsers and web servers, which expect URLs to map to a hierarchical file system. E.g., http://example.com/dir1/dir2/foo.htm.


Currently, I am working on a file management tool that uses tags fairly heavily. Obviously I think tags can be useful.

But the claim that folders can or should be eliminated is seriously misguided.

* Tags aren't enough once your file number exceeds a certain amount.

* A file path provides a unique, human-readable and human-understandable identifier for every file. A system where pieces of data don't have a unique identifier is going to be a serious drag at some point. Even average people need logical consistency occasionally.

* While folders may be a somewhat hard metaphor for new and casual users to understand, so are tags. A many-to-many relationship isn't actually something people quickly understand fully, even if they can quickly get a fuzzy understanding of it.

* When you throw out the spatial metaphor of folders and files, you've thrown out the guarantee that a file is "somewhere". I've lost files, and seen others lose files, in a disturbing fashion on Macs. Without a keyword for file X that you just uploaded, you can't "look around" for it.


Indeed. Have enough tags (and with files, you'll end up with enough tags) and you'll need ontologies for those tags; you'll end up recreating hierarchical structure out of naming conventions. The same tag "name" within different contexts will have different meanings, and the concept of "context" by itself tends towards hierarchy of analytic decomposition.


>A system where pieces of data don't have a unique identifier is going to be a serious drag at some point

Reminds me of trying to organize/tag my mp3 library. Those fringe cases of imperfect tagging get so cumbersome to manage. Also, with folders, it's so easy to just dump x number of files into a folder and have it organized. Batch tag editing just isn't as intuitive. Now I have an Android and use that as my mp3 player, and Poweramp has a "Folders" view that I almost exclusively use.


Pfshaw, tags are old hat. They are also just a re-hashing of the humble search keyword of olde.

The real new hat is search and all major OSes are doing this more or less correctly now. I don't recall the last time I browsed to a document.

Also worth mentioning that the concept of "documents" is itself a quickly aging hat. With very few exceptions, most documents are tied pretty closely to just one app, and those apps in turn can have more specific ways of handling those documents (e.g. projects, libraries, etc.).

The internal representation of files and programmatic access is another matter - we do need a better way to segregate access and assign metadata to files. Possibly a relational file system would be the way to go, but it's not a problem that very urgently needs solving for anyone so nothing too exciting there for a while....


Folders work standalone, tags don't.

Tags don't work without a search feature - we might tag religiously but others don't and won't.

If you're implementing a search feature then you no longer need tags.

It won't take off. Especially as these days we seem to be moving away from files full stop.


While I do like tags for myself, they are also inherently problematic with a group. I tag a photo of my friend "Chardo" (his nickname). Friend #2 tags another photo of him "Richard" (his real name). Now viewing all items tagged "Richard" won't show all the photos of Richard. (Replace human names with names of places, descriptions ["bbq" vs "tom's party"], etc)

Do you have any plans for how to address this?


There's not really any need to address it specially. Tags only get added to the extent that users care, so mismatches are a problem of abundance, and to the extent tags are meaningful/accurate, they facilitate editing to personal preference.

I'm sure that successful file managers will also have some degree of fuzzy matching. But tags are still useful, even if we never realize the RDF dream of a perfect global ontology.


Do they, though? I'm not very good at remembering which tags I use, and once you have hundreds of documents sharing a single tag, if you mis-tag a new file, you may never find it again.

Folders make using previous organizational decisions easier than creating a new organization. Incentivizing that behavior is a net win.


Do what what?

I'm not all gaga over tags, keeping 'primary organizational tags' around makes lots of sense to me. We can even keep on calling them folders. My point is that you really have to construct a silly, user-hostile system for a tagging capability to actually make things worse than no tagging.


"oh I just forgot to explicitly mark this file as private, now everyone in the world can see it" - how about no?


If you wanted to "tag" files, could you not simply create hard links to the files inside a folder "tagname"?

Of course, using hard links is currently not easy for the technically incompetent, but you could change that, and it ought to be easier as you're just interfacing with existing capability.
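A minimal sketch of that idea using Python's standard `os.link`. The `tag_file` helper and the `tags/<tagname>/` layout are assumptions for illustration, and the demo runs in a temporary directory.

```python
import os
import tempfile


def tag_file(path, tag, tag_root):
    """Tag a file by hard-linking it into tag_root/<tag>/ under its own name."""
    tag_dir = os.path.join(tag_root, tag)
    os.makedirs(tag_dir, exist_ok=True)
    link = os.path.join(tag_dir, os.path.basename(path))
    if not os.path.exists(link):
        os.link(path, link)  # new directory entry, same underlying file
    return link


with tempfile.TemporaryDirectory() as root:
    doc = os.path.join(root, "return.txt")
    with open(doc, "w") as f:
        f.write("draft")

    tags_root = os.path.join(root, "tags")
    link = tag_file(doc, "Taxes", tags_root)

    # Both names refer to one file, so an edit through one is visible through the other.
    with open(doc, "a") as f:
        f.write(" v2")
    with open(link) as f:
        print(f.read())                 # draft v2
    print(os.path.samefile(doc, link))  # True
```

Hard links only work within a single filesystem, and two files with the same name cannot share a tag folder in this layout, so a real tool would need to handle both cases.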


How do you search for files with multiple tags? E.g.: "Taxes" AND "2009"


Good question. As a first pass, I'd think a search assistant that runs two searches in the background and takes the intersection of the results.
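As a rough illustration of that first pass, assuming each tag is a folder of hard links as proposed above: list each tag folder, identify files by device and inode number, and intersect. The function names and the sample file names are hypothetical.

```python
import os
import tempfile


def files_with_tag(tag_root, tag):
    """Map file identity (device, inode) -> path for every entry in one tag folder."""
    tag_dir = os.path.join(tag_root, tag)
    found = {}
    for name in os.listdir(tag_dir):
        path = os.path.join(tag_dir, name)
        st = os.stat(path)
        found[(st.st_dev, st.st_ino)] = path
    return found


def files_with_all_tags(tag_root, tags):
    """Run one lookup per tag and keep only files present in every result."""
    if not tags:
        return []
    results = [files_with_tag(tag_root, t) for t in tags]
    common = set(results[0]).intersection(*results[1:])
    # Report each match by its path in the first tag's folder.
    return sorted(results[0][key] for key in common)


with tempfile.TemporaryDirectory() as root:
    for tag in ("Taxes", "2009"):
        os.makedirs(os.path.join(root, tag))
    a = os.path.join(root, "Taxes", "return-2009.txt")
    b = os.path.join(root, "Taxes", "return-2010.txt")
    for p in (a, b):
        open(p, "w").close()
    os.link(a, os.path.join(root, "2009", "return-2009.txt"))

    matches = files_with_all_tags(root, ["Taxes", "2009"])
    print([os.path.basename(p) for p in matches])  # ['return-2009.txt']
```

Comparing by inode rather than by name means a file still matches if it was renamed inside one of the tag folders.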

I'm a big fan of building on existing capabilities, as you can probably tell.

One other thought- in Explorer or Finder, is it possible to search within a set of search results?


Has Microsoft done away with folders in their upcoming OS? Just curious. What about the file systems of other OS's out there?

It is indeed true that folders are the least efficient aspect of operating systems but it is strange that we're still holding on to them.


I don't know if you could call it the "least efficient" - they seem (to me) to be a simple way to understand what and where something is. Look at the problem users are having with Mountain Lion's "Save As" feature to see what could go awry once you start abstracting the user away from files.


Regarding what and where something is...

Let's look at two examples of the most common uses of folders: a document file and a photograph file.

For me, the least efficient aspect comes with not being able to store a single document in multiple places (folders). If I have a document on Subject A written by Authors B and C (two authors), then I may want this single document to show up in multiple places; for example, Subject A for me could be XYZ, ABC, etc. The best I can do is create shortcuts to this file in multiple folders (I shouldn't duplicate the file). So here, tags would come in handy.

Now, I don't know about you but I have spent significant amount of time thinking over how to folder my photographs (and I am not a professional so not talking in GIGs).

All in all, the folders are not very efficient on their own.


No. Metro is just an interface.

Curious, how would an OS work with no folders? All the system files just have a "system" tag?

Personally, I've learned folders. Now I have trouble thinking about a filesystem that wouldn't have them. How would you organize files on a physical disk with no file structure?


You can do away with folders in the user experience without getting rid of them under the hood. They are pretty indispensable for how software is structured.


Folders are important, of course, but there should be very few of them at the top: a system folder and a user's folder. Everything else should be tags.


If they did away with folders, no existing software that was written for a prior version of Windows could possibly work -- you would need to write a new version of every program which was not dependent on folders in any way.

This would be a lot of work, since all existing software looks for users' files in folders, looks for its own internal configuration files in folders, loads DLLs from paths which are lists of folders, creates folders to put itself into when it's installed, calls APIs to open files with hierarchical names, uses current and temporary directories, etc., etc. This is true for every operating system that I can think of, not just Windows.


Windows 8 still has folders.


I do agree that files/folders are bad in general, but tags will probably not fit the bill either. I think the future file system should lie within graph theory. Something like the semantic web: if you create a file, you need to create at least one semantic link to other files. If you remove one, you need to bridge the separated graphs. But how to make the semantic links make sense to humans requires far more work and research, I guess.


Yes, managing files (especially documents) with tags is a good idea in many cases, but not all cases.

I had an idea for a tag-based file manager for Windows, I even have registered a domain which I think is cool -

taganizer.com = tag + organizer

But I didn't go for that idea for various reasons, I'm now working on http://liveditor.com.


People have known that folders are doomed for a while now. For instance, it's been part of Gmail's core philosophy since it was launched 8 years ago. The question is when the major operating systems are going to phase them out from the user experience. (Presumably, folder hierarchies will still exist under the hood.)


I don't see folders going away. (I like them much better than tags, frankly)

Google Docs started out with no folders. They added them with Google Drive http://support.google.com/drive/bin/answer.py?hl=en&answ...

EverNote used to just have tags. Now they have "stacks", which seem to be hierarchical tags. I'm not sure who really uses that kind of stuff, it seems like quite a bit of work

DropBox is obviously predicated on the idea of folders.


GoogleDoc has them because it's integrated with GoogleDrive. GoogleDrive and DropBox have them for the same reason: for compatibility with the existing OS paradigm. This doesn't say much about the inevitability of the demise of folders except that Google and DropBox recognize that it won't happen for many years. But this is obvious given that Windows 8 and Mac OS X are both built on folders.

(I don't know anything about EverNote.)



