Unless many specialized plug-ins are created, metadata has to be gathered from filenames. To obtain useful results without a plug-in for every type of file, the keyword selection algorithm must be optimized. I expect the most promising approach to be based on some form of learning algorithm. How this should be implemented is a question that cannot be answered easily, and I believe the subject deserves closer examination.
No indexing method can be proven successful, since the input is always human-created data and therefore inherently unreliable. Finding heuristics based on human-computer interaction is, I think, the only way of solving this problem. I will give two such heuristics as examples. First, exploit the fact that people order their data in a manner that is unknown to us yet most probably rational. Look at the directory structure and try to find features shared by the elements within certain subsets. The fewer elements a directory contains, the more closely they will usually be related in some way. Second, many people create, perhaps somewhat unconsciously, a personal encoding scheme for any subset of data. For instance, all work-related files are numbered with their project code, all music files start with the name of the artist, or all files in the favorites directory reference web resources. The rules of thumb I have just given are by no means revolutionary, nor are they the only ones to be found. They are simply examples of everyday practice, learned from personal experience, that can be used to broaden the way metadata input is dealt with.
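To make the two heuristics concrete, here is a minimal sketch in Python. It is only an illustration under assumptions of my own, not a proposed implementation: the function names (tokenize, shared_keywords, leading_token_scheme), the tokenization on non-alphanumeric characters and the 60 percent threshold (min_share) are all hypothetical choices. The first function looks for keywords shared by most files in one directory; the second guesses at a personal encoding scheme by checking whether most filenames start with the same token.

```python
import re
from collections import Counter
from pathlib import PurePath


def tokenize(filename):
    """Split a filename (extension dropped) into lowercase keyword tokens."""
    stem = PurePath(filename).stem
    return [t.lower() for t in re.split(r"[^A-Za-z0-9]+", stem) if t]


def shared_keywords(filenames, min_share=0.6):
    """Heuristic 1: tokens occurring in at least min_share of the files.

    min_share is an assumed, tunable threshold, not a derived value.
    """
    if not filenames:
        return []
    counts = Counter()
    for name in filenames:
        counts.update(set(tokenize(name)))  # count each token once per file
    threshold = min_share * len(filenames)
    return sorted(t for t, c in counts.items() if c >= threshold)


def leading_token_scheme(filenames, min_share=0.6):
    """Heuristic 2: the most common first token, if enough files share it.

    Returns None when no leading token is common enough.
    """
    firsts = [tokens[0] for tokens in map(tokenize, filenames) if tokens]
    if not firsts:
        return None
    token, count = Counter(firsts).most_common(1)[0]
    return token if count >= min_share * len(filenames) else None


if __name__ == "__main__":
    # Hypothetical directory listings, invented for illustration.
    music = [
        "Miles Davis - So What.mp3",
        "Miles Davis - Blue in Green.mp3",
        "Miles Davis - All Blues.mp3",
    ]
    work = ["p1042_report.doc", "p1042_budget.xls", "notes.txt"]
    print(shared_keywords(music))
    print(leading_token_scheme(work))
```

A fixed threshold is the weakest part of this sketch. Since small directories tend to be more coherent than large ones, the threshold could plausibly be made to depend on the directory size, and this is the kind of parameter a learning algorithm might tune.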