Finding well-known persons (celebrities) in text documents is a common task with many applications. There are thousands of celebrities around the world who are well known in one or more domains, such as sports, science, film, and music.
Many people are interested in gathering information about celebrities, including their images, awards, age, family, background, activities, and other people's opinions about them. In addition, companies that want to employ celebrities in advertisements and other marketing activities are interested in finding the celebrities that best fit their products and image. They also like to rank celebrities on different aspects, such as trustworthiness, performance, opinion leadership, and uniqueness. Finally, celebrities themselves are eager to know what the public thinks of them and how they rank among other celebrities in the same field.
This information, and more, is readily available on the Web in the form of structured data, textual documents, blogs, tweets, web services, etc. We plan to build a system that automatically harvests celebrity data from the Web and presents it in a comprehensive portal for the general public, marketing departments, and the celebrities themselves.