AACIMP Summer School 2015 - Introduction to Machine Learning
Konstantin Tretyakov
University of Tartu,
BIIT Research Group
The process of observing the world, discovering patterns in observations, and describing them in terms of concise models has always been at the core of science (and perhaps even human life in general). Nowadays, however, thanks to the development of computing technologies, the data that we can collect and store is so vast and diverse that no single human is capable of processing it. Machine learning (also known as "data mining" or "pattern analysis") is a field that deals with algorithms for discovering patterns and estimating ("learning") useful models from data. In just a couple of decades, machine learning has grown from a rather niche area of computer science and statistics into a flourishing field that lies at the heart of countless pieces of software and hardware we use in everyday life.
The course offers a gentle, hands-on introduction to the core principles and techniques of machine learning. The topics covered are general probabilistic modeling and estimation, trees, linear models, some instance-based methods and, time permitting, some unsupervised techniques. Students will implement and apply the discussed algorithms using the interactive environment IPython. Some basic familiarity with programming, probability theory and linear algebra (or at least a vague recollection of those areas) is expected from the participants.
Tutorial materials
- Slides:
- IPython workbooks: (zip)
In order to use the tutorial materials you will need to have Python installed along with some additional packages. There are several ways to install it, but for most users the most convenient option is to install Anaconda Python, which is a Python distribution preloaded with all the necessary packages. Users of Debian-based Linux systems (e.g. Ubuntu), who already use Python (and thus know what they are doing), can install the necessary packages by running:
$ sudo apt-get install ipython-notebook python-numpy python-scipy python-matplotlib
$ sudo apt-get install python-sklearn python-imaging python-nltk
Similarly, Mac users who already have Python and MacPorts installed can obtain the needed packages by running:
$ sudo port install py27-zmq py27-tornado py27-nose py27-ipython py27-numpy py27-scipy
$ sudo port install py27-matplotlib py27-scikit-learn py27-pil py27-nltk
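Whichever installation route is used, it can be useful to confirm that the packages are importable before opening the workbooks. The following is a minimal sketch, not part of the official tutorial materials. The helper name missing_modules is hypothetical, and the list of import names (for example, PIL for the imaging package and sklearn for scikit-learn) is an assumption based on the packages named above.

```python
import importlib

# Import names assumed to correspond to the packages installed above.
REQUIRED_MODULES = ["IPython", "numpy", "scipy", "matplotlib",
                    "sklearn", "PIL", "nltk"]


def missing_modules(names):
    """Return the subset of `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except Exception:
            # ImportError is the usual case; a broken installation may
            # raise something else, which is treated as missing too.
            missing.append(name)
    return missing


missing = missing_modules(REQUIRED_MODULES)
print("Missing modules: " + (", ".join(missing) or "none"))
```

If the script reports any missing modules, installing the corresponding package (or switching to Anaconda) should resolve the problem.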
Unpack the zip file with the workbooks, open the command line in the resulting directory, and launch ipython as follows:
$ ipython notebook --pylab=inline
Then open your browser and point it to the address http://localhost:8888/ (ideally, it should open automatically as soon as you invoke the above command).
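If the browser shows nothing at that address, a quick check from a separate Python session can tell whether anything is listening on the port at all. This is only a rough sketch: the function name port_is_open is hypothetical, and the check confirms that something accepts TCP connections on port 8888, not that it is the notebook server.

```python
import socket


def port_is_open(host="localhost", port=8888, timeout=2.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        connection = socket.create_connection((host, port), timeout)
    except (socket.error, socket.timeout):
        # Connection refused, timed out, or host not resolvable.
        return False
    connection.close()
    return True


print("Port 8888 reachable: " + str(port_is_open()))
```

A result of False usually means the ipython command is not running or has chosen a different port, which it prints in the terminal on startup.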