Laboratory exercises
Labs: 5 x 4 hours; first mandatory lab Thursday Jan 31, 8:15 - 12:00 in E:4119
Optional lab on PHP programming Thursday Jan 24, 8 - 12 in E:4119
Laboratory exercises (PDF)
Schedule
- Thursday Jan 24: 0. Optional lab: PHP programming
- Thursday Jan 31: 1. Information retrieval basics: similarity, tf-idf, recall/precision
- Thursday Feb 7: 2. Link-based ranking, Query languages
- Thursday Feb 14: 3. Text pre-processing for indexing
- Thursday Feb 21: 4. Concepts using LSI; Document classification using SVM
- Thursday Feb 28: 5. Browsing vs searching, Search Engines vs meta-search
- Thursday Mar 7: Spare.
Hints, tips and useful links
If you want to download and run the software needed for the laboratory exercises (PHP, SVD, SVM) on your own machine, just talk to me. It is all free software available on the net.
The code examples and test data provided in the S:\ directories on the Windows machines in the lab room are available here: Zipped archive, or Tar archive.
The Python skeleton is available here: Zipped archive, or Tar archive.
- PHP Cheat Sheets:
- PCRE cheat sheet.
- PHP Basics Quick Reference Sheet
- The PHP cheat sheet is a one-page reference sheet, listing date format arguments, regular expression syntax and common functions.
- A PHP class for stemming.
- A freely available IR book: Introduction to Information Retrieval by Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, Cambridge University Press, 2008. It covers approximately the same methods as the course book, but in a slightly different way.
- LSI:
- Vector Model: