IFI Summer School 2014 - Introduction to Machine Learning with Python
Konstantin Tretyakov
University of Tartu,
BIIT Research Group,
STACC
The process of observing the world, discovering patterns in observations, and describing them in terms of concise models has always been at the core of science (and perhaps even of human life in general). Nowadays, however, thanks to the development of computing technologies, the data that we can collect and store is so vast and diverse that no single human is capable of processing it. Machine learning (also known as "data mining" or "pattern analysis") is a field that deals with algorithms for discovering patterns and estimating ("learning") useful models from data. In just a couple of decades, machine learning has grown from a rather niche area of computer science and statistics into a flourishing field that lies at the heart of countless pieces of software and hardware we use in everyday life.
The course offers a gentle, hands-on introduction to the core principles and techniques of machine learning. Topics covered are general probabilistic modeling and estimation, trees, linear models, some instance-based and, time permitting, some unsupervised techniques. Students will implement and apply most of the discussed algorithms using the interactive environment IPython. Some basic familiarity with programming, probability theory and linear algebra (or at least a vague recollection of those areas) is expected from the participants.
- Time: 9:00–17:00, June 24, 2014
- Location: University of Zurich BIN (Department of Informatics, Binzmühlestrasse 14, 8050 Zürich).
  The course will be held in the computer classroom 2.B.04.
- General information: available at the IFI Summer School website
Thank you for attending! It was a pleasure to give the course. If you have any questions (or something to say in general), feel free to contact me by e-mail or via Facebook.
Tutorial materials
- Slides:
  - Introduction
  - The Optimization Perspective
  - The Probabilistic Perspective
  - Conclusion
  - Extra: Unsupervised learning
    (This part was prepared in case time remained beyond the "normal program". That did not happen, so it was not presented; however, the slides may contain some hints for those who want to check out the worksheets numbered 4-...)
- IPython workbooks: (zip)
To use the tutorial materials you will need to have Python installed along with some additional packages. There are several ways to install them, but for most users the most convenient option is Anaconda Python, a Python distribution preloaded with all the necessary packages. Users of Debian-based Linux systems (e.g. Ubuntu) who already use Python (and thus know what they are doing) can install the necessary packages by running:
$ sudo apt-get install ipython-notebook python-numpy python-scipy python-matplotlib
$ sudo apt-get install python-sklearn python-imaging python-nltk
Similarly, Mac users who already have Python installed along with MacPorts can obtain the needed packages by running:
$ sudo port install py27-zmq py27-tornado py27-nose py27-ipython py27-numpy py27-scipy
$ sudo port install py27-matplotlib py27-scikit-learn py27-pil py27-nltk
Unpack the zip file with the workbooks, open the command line in the resulting directory, and launch ipython as follows:
$ ipython notebook --pylab=inline
Then open your browser and point it to the address http://localhost:8888/ (ideally, it should open automatically as soon as you invoke the above command).
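If a workbook fails on an import, it may help to check which of the packages installed above Python can actually find. The following is a minimal sketch, not part of the original course materials: the helper name `missing_packages` and the list `REQUIRED` are hypothetical, and the import names in the list are assumptions about how the installed packages map to Python modules (for example, `python-sklearn` to `sklearn` and `python-imaging` to `PIL`).

```python
import importlib

# Import names assumed to correspond to the packages installed above;
# adjust this list if your distribution names the modules differently.
REQUIRED = ["IPython", "numpy", "scipy", "matplotlib", "sklearn", "PIL", "nltk"]


def missing_packages(names):
    """Return the names from `names` that fail to import."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # The module is not installed or not on the import path.
            missing.append(name)
    return missing
```

Running `missing_packages(REQUIRED)` in a notebook cell or a plain Python session returns an empty list when every package can be imported; any names it does return are the ones to install.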