
HTW Berlin
Fachbereich 4
Internationaler Studiengang
Internationale Medieninformatik (Master)
Semantic Modeling
Summer Term 2016

Lab 8: Using Weka

  1. Start the Weka Explorer and load the weather.nominal.arff data set. What attributes do the instances have? How can you find information about the attributes? What are the values that the attribute temperature can have?
  2. Load the iris dataset iris.arff. How many instances does this dataset have? How many attributes? What is the range of possible values of the attribute petallength?
  3. Reload the weather data. What is the function of the first column in the Viewer window?
  4. What is the class value of instance number 8 in the weather data?
  5. Reload the iris data and open it in the editor. How many numeric and how many nominal attributes does this dataset have?
  6. Reload the weather dataset and use a filter to remove an attribute. Then use the filter weka.filters.unsupervised.instance.RemoveWithValues to remove all instances in which the humidity attribute has the value high.
  7. Undo the changes to the dataset that you just performed, and verify that the data has reverted to its original state.
  8. Now play with Weka's data visualization facilities. Since numeric data work best, use the iris data. Click on the Visualize tab. Can you make a scatter plot of petalwidth against petallength? What does the Jitter slider do? Can you get anything else interesting to work?
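
The ARFF files used in the steps above are plain text: a header that declares each attribute as numeric or as a set of nominal values, followed by one comma-separated instance per line. The following Python sketch shows that structure and what an instance filter like the one in step 6 does conceptually. It is only an illustration: the toy data, the function names parse_arff and remove_with_value, and the simplified parsing (no quoting, no sparse format, no missing values) are assumptions made for this example. It is neither Weka's parser nor Weka's filter implementation.

```python
# Minimal ARFF reader: a sketch that handles only simple, unquoted,
# comma-separated files. It is not Weka's parser.

TOY_ARFF = """\
% Invented toy data, NOT one of the Weka sample files.
@relation toy

@attribute colour {red, green, blue}
@attribute size numeric
@attribute label {yes, no}

@data
red,1.5,yes
green,2.0,no
blue,0.7,yes
red,3.1,no
"""


def parse_arff(text):
    """Return (attributes, instances) for a simple ARFF string.

    attributes: list of (name, kind), where kind is the declared type as a
    lowercase string (e.g. "numeric") or a list of nominal values.
    instances: list of rows, each a list of raw string values.
    """
    attributes, instances, in_data = [], [], False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and comments
        lower = line.lower()
        if in_data:
            instances.append([v.strip() for v in line.split(",")])
        elif lower.startswith("@attribute"):
            _, name, spec = line.split(None, 2)
            if spec.startswith("{"):
                # nominal attribute: collect the allowed values
                kind = [v.strip() for v in spec.strip("{}").split(",")]
            else:
                kind = spec.lower()
            attributes.append((name, kind))
        elif lower.startswith("@data"):
            in_data = True
    return attributes, instances


def remove_with_value(attributes, instances, attr_name, value):
    """Keep only the instances whose attr_name differs from value."""
    index = [name for name, _ in attributes].index(attr_name)
    return [row for row in instances if row[index] != value]


attributes, instances = parse_arff(TOY_ARFF)
nominal = [name for name, kind in attributes if isinstance(kind, list)]
numeric = [name for name, kind in attributes if kind == "numeric"]
filtered = remove_with_value(attributes, instances, "colour", "red")

print("attributes:", [name for name, _ in attributes])
print("nominal:", nominal, "numeric:", numeric)
print("instances before/after filter:", len(instances), len(filtered))
```

Opening weather.nominal.arff or iris.arff in a text editor shows the same layout, which is a useful cross-check for the answers you read off the Explorer.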

Prepare a report detailing what you did. You should work in groups of 2 or 3. Every group member should submit their own copy of the written report to the Moodle area by 22.00 on the evening before the session in which it is due.

This exercise is taken from Witten, Frank & Hall: Data Mining, 3rd Edition.


Some rights reserved. CC-BY-NC Prof. Dr. Debora Weber-Wulff
Questions or comments: <weberwu@htw-berlin.de>