Installing and configuring all the software needed for this course on your own machine can be tedious, so we have prepared a virtual machine (VM) that you can download and use instead. You can download the VM from here (size: ~7.46GB). The username/password is csdeptucy.

| Week | Description | Material |
|------|-------------|----------|
| 2 | Inverted Index and the Boolean Model using NLTK and Apache OpenNLP | LAB01.pdf |
| 3 | Apache Lucene | LAB02.pdf, Lucene 1 Solution, Lucene 2 Solution |
| 4 | Apache Solr | LAB03.pdf |
| 5 | ElasticSearch | LAB04.pdf |
| 6 | Apache Hadoop 1 | LAB05.pdf, Hadoop 1 Source Code, Hadoop 1 Solution |
| 7 | Apache Hadoop 2 | LAB06.pdf, Hadoop 2 Source Code, Hadoop 2 Solution |
| 8 | Apache Hadoop 3 | LAB07.pdf, Hadoop 3 Source Code, Hadoop 3 Solution |
| 9 | Apache Nutch | LAB08.pdf |
| 10 | Apache Tika | LAB09.pdf |
| 11 | Text Clustering and Classification in Python | LAB10 |
| 12 | Apache Spark | LAB11.pdf |
| 13 | Projects Presentations | |