I'm actually working on this problem. It's remarkable how much content we generate, and how many services and mediums we generate it on. Simply fetching all of that data and hosting it centrally is a large enough problem, let alone indexing all of it.