Summer Semester 2009/10
Winter Semester 2009/10
Summer Semester 2010/11
Exploration of Internet Resources PI_ITI1104
Course content:
1. Web structure mining.
2. Web content retrieval and processing.
3. Keyword-based web document search.
4. Link-based ranking of search results.
5. Introduction to data clustering.
6. Web document clustering.
7. Tolerance rough sets in web document clustering.
8. Summary.
Learning outcomes:
The ability to apply existing tools to solve web mining problems.
Bibliography
a) basic references:
1. Zdravko Markov, Daniel T. Larose: Data Mining the Web: Uncovering Patterns in Web Content, Structure, and Usage (Polish edition), Wydawnictwo Naukowe PWN, 2009
2. David Hand, Heikki Mannila, Padhraic Smyth: Data Mining (Polish edition), Wydawnictwa Naukowo-Techniczne, 2005
3. Soumen Chakrabarti: Mining the Web: Discovering Knowledge from Hypertext Data, Morgan Kaufmann, 2002
b) supplementary references:
1. Bing Liu: Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data, Springer, 2010
2. Ngo Chi Lang: A Tolerance Rough Set Approach to Clustering Web Search Results, Warsaw University, 2003
3. Sašo Džeroski, Nada Lavrač (eds.): Relational Data Mining, Springer, Berlin, 2001