53 matching results for "machine learning":
Submitted May 16, 2017 to Scientific Software Auto-sklearn is an automated machine learning toolkit and a drop-in replacement for a scikit-learn estimator. Auto-sklearn frees a machine learning user from algorithm selection and hyperparameter tuning. It leverages recent advances in Bayesian optimization, meta-learning and ensemble construction.
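As a rough sketch of the drop-in claim, usage mirrors a plain scikit-learn estimator (the dataset and time budget below are illustrative, not from the announcement):

```python
import autosklearn.classification
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# AutoSklearnClassifier exposes the usual scikit-learn fit/predict interface,
# but internally runs algorithm selection and hyperparameter tuning.
automl = autosklearn.classification.AutoSklearnClassifier(
    time_left_for_this_task=120)  # seconds; an illustrative budget
automl.fit(X_train, y_train)
print(automl.score(X_test, y_test))
```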
|
Submitted May 14, 2017 (Edited May 16, 2017) to Science Courses and Tutorials This is an introductory course in machine learning (ML) that covers the basic theory, algorithms, and applications. ML is a key technology in Big Data, and in many financial, medical, commercial, and scientific applications. It enables computational systems to adaptively improve their performance with experience accumulated from the observed data. ML has become one of the hottest fields of study today, taken up by undergraduate and graduate students from 15 different majors at Caltech. This course balances theory and practice, and covers the mathematical as well as the heuristic aspects.
|
Submitted Apr 26, 2017 to Scientific Data DBpedia is a crowd-sourced community effort to extract structured information from Wikipedia and make this information available on the Web. DBpedia allows you to ask sophisticated queries against Wikipedia, and to link the different data sets on the Web to Wikipedia data. We hope that this work will make it easier for the huge amount of information in Wikipedia to be used in new and interesting ways. Furthermore, it might inspire new mechanisms for navigating, linking, and improving the encyclopedia itself.
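To make the "sophisticated queries" concrete, a minimal sketch against the public SPARQL endpoint (the SPARQLWrapper package and this particular query are illustrative assumptions):

```python
from SPARQLWrapper import SPARQLWrapper, JSON

# Ask DBpedia's public endpoint for people born in Berlin, using the
# structured data extracted from Wikipedia infoboxes.
sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setQuery("""
    SELECT ?person ?name WHERE {
        ?person dbo:birthPlace dbr:Berlin ;
                rdfs:label ?name .
        FILTER (lang(?name) = "en")
    } LIMIT 10
""")
sparql.setReturnFormat(JSON)
for result in sparql.query().convert()["results"]["bindings"]:
    print(result["name"]["value"])
```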
|
Submitted Apr 23, 2017 to Science Courses and Tutorials These notes form a concise introductory course on probabilistic graphical models. They are based on Stanford CS228, taught by Stefano Ermon, and have been written by Volodymyr Kuleshov, with the help of many students and course staff. The notes are still under construction! Although we have written up most of the material, you will probably find several typos. If you do, please let us know, or submit a pull request with your fixes to our GitHub repository; you too can help make these notes better by contributing improvements.
This course starts by introducing probabilistic graphical models from the very basics and concludes by explaining from first principles the variational auto-encoder, an important probabilistic model that is also one of the most influential recent results in deep learning.
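Since the notes build up to the variational auto-encoder, the central object is the evidence lower bound (ELBO) it maximizes; in standard notation (the notes' own notation may differ):

$$\log p_\theta(x) \;\ge\; \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big] \;-\; \mathrm{KL}\big(q_\phi(z \mid x)\,\|\,p(z)\big)$$

where $q_\phi(z \mid x)$ is the encoder (inference network), $p_\theta(x \mid z)$ the decoder, and $p(z)$ the prior over latent variables; encoder and decoder parameters are trained jointly to maximize the bound.
|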
Submitted Apr 21, 2017 to Science Books CIML is a set of introductory materials that covers most major aspects of modern machine learning (supervised learning, unsupervised learning, large margin methods, probabilistic modeling, learning theory, etc.). Its focus is on broad applications with a rigorous backbone. A subset can be used for an undergraduate course; a graduate course could probably cover the entire material and then some.
|
Submitted Apr 20, 2017 (Edited Apr 20, 2017) to Scientific Data Today, we are excited to announce the first in what we plan to be a series of public dataset releases. Our dataset releases will be oriented around various problems of relevance to Quora and will give researchers in diverse areas such as machine learning, natural language processing, network science, etc. the opportunity to try their hand at some of the challenges that arise in building a scalable online knowledge-sharing platform. Our first dataset is related to the problem of identifying duplicate questions.
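As a sketch of how one might start on the duplicate-question problem (the file name, column names, and the word-overlap baseline are assumptions about the release, not Quora's stated format):

```python
import pandas as pd

# Assumed layout: a TSV of question pairs with a binary duplicate label.
pairs = pd.read_csv("quora_duplicate_questions.tsv", sep="\t")

def jaccard(q1, q2):
    """Crude duplicate signal: word-set overlap between the two questions."""
    a, b = set(str(q1).lower().split()), set(str(q2).lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0

pairs["jaccard"] = [jaccard(q1, q2)
                    for q1, q2 in zip(pairs["question1"], pairs["question2"])]
# A threshold on this score gives a simple baseline for is_duplicate.
print(pairs[["jaccard", "is_duplicate"]].groupby("is_duplicate").mean())
```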
|
Submitted Apr 20, 2017 to Science Books The book by Jure Leskovec, Anand Rajaraman, and Jeff Ullman is based on Stanford Computer Science course CS246: Mining Massive Datasets (and CS345A: Data Mining). The book, like the course, is designed at the undergraduate computer science level with no formal prerequisites. To support deeper explorations, most of the chapters are supplemented with further reading references. By agreement with the publisher, you can download the book for free.
|
Submitted Apr 13, 2017 (Edited Apr 13, 2017) to Science Books These notes are an attempt to extract essential machine learning concepts for beginners. They are a draft and will be updated; they likely won't be typo-free for a while. They are dry and lack examples to complement and illustrate the general ideas. Notably, they also lack references, which will (hopefully) be added soon. The mathematical appendix is due to Andre Wibisono's notes for the math camp of the 9.520 course at MIT.
|
Submitted Apr 13, 2017 to Science Courses and Tutorials The course covers foundations and recent advances in Machine Learning from the point of view of Statistical Learning and Regularization Theory. The goal of this course is to provide students with the theoretical knowledge and the basic intuitions needed to use and develop effective machine learning solutions to challenging problems.
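For orientation, the regularization-theory viewpoint centers on penalized empirical risk minimization, which in its standard Tikhonov form reads:

$$\min_{f \in \mathcal{H}} \; \frac{1}{n} \sum_{i=1}^{n} V\big(y_i, f(x_i)\big) + \lambda \, \|f\|_{\mathcal{H}}^{2}$$

where $V$ is a loss function, $\mathcal{H}$ a hypothesis space (typically a reproducing kernel Hilbert space), and $\lambda > 0$ trades data fit against the smoothness of $f$.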
|
Submitted Mar 31, 2017 to Science Courses and Tutorials In the last few years, new inference methods have allowed big advances in probabilistic generative models. These models let us generate novel images and text, find meaningful latent representations of data, take advantage of large unlabeled datasets, and even let us do analogical reasoning automatically. This course will tour recent innovations in inference methods such as recognition networks, black-box stochastic variational inference, and adversarial autoencoders. It will also cover recent advances in generative model design, such as deconvolutional image models, thought vectors, and recurrent variational autoencoders. The class will have a major project component.
|
Submitted Mar 24, 2017 to Science Blogs t-SNE is a very popular algorithm for drastically reducing the dimensionality of your data in order to present it visually. It is capable of mapping hundreds of dimensions down to just 2 while preserving important data relationships: samples that are close in the original space stay close in the reduced space. t-SNE works quite well for small and moderately sized real-world datasets and does not require much tuning of its hyperparameters. In other words, if you've got fewer than 100,000 points, you can apply that magic black box and get a beautiful scatter plot in return.
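In scikit-learn terms, that black box is a couple of lines; a minimal sketch (the dataset and perplexity value are illustrative):

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

X, y = load_digits(return_X_y=True)

# Map the 64-dimensional digit images down to 2 dimensions for plotting.
embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)
plt.scatter(embedding[:, 0], embedding[:, 1], c=y, s=5)
plt.show()
```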
|
Submitted Mar 13, 2017 to Science Blogs Given all the hype and buzz around machine learning and IoT, it can be difficult to cut through the noise and understand where the actual value lies. In this week’s #askIoT post, I’ll explain how machine learning can be valuable for IoT, when it’s appropriate to use, and some applications and use cases currently out in the world today.
|
Submitted Mar 09, 2017 to Science Research Groups » Computer Science The Mizar project started around 1973 as an attempt to reconstruct mathematical vernacular in a computer-oriented environment.
Since 1989, the most important activity in the Mizar project, apart from continual improvement of the Mizar System, has been the development of a database for mathematics. International cooperation (the main partners: Shinshu University in Nagano and University of Alberta in Edmonton) resulted in creating a database which includes more than 9400 definitions of mathematical concepts and more than 49000 theorems (see Megrez MML Browsing for more statistics).
|
Submitted Mar 08, 2017 to Science Blogs The joint many-task model tackles multiple NLP tasks with a single architecture. Tasks are layered so that each task benefits from the training of the closely related tasks above and below it. Though applied to specific NLP objectives, the proposed model introduces a powerful concept for future research.
|
Submitted Mar 08, 2017 to Scientific Data AudioSet consists of an expanding ontology of 632 audio event classes and a collection of 2,084,320 human-labeled 10-second sound clips drawn from YouTube videos. The ontology is specified as a hierarchical graph of event categories, covering a wide range of human and animal sounds, musical instruments and genres, and common everyday environmental sounds.
By releasing AudioSet, Google Research hopes to provide a common, realistic-scale evaluation task for audio event detection, as well as a starting point for a comprehensive vocabulary of sound events.
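The ontology itself is distributed as a machine-readable file; a minimal sketch of walking it (the ontology.json file name and the id/name/child_ids fields are my reading of the public release, so treat them as assumptions):

```python
import json

# Each ontology node has an id, a human-readable name, and child_ids
# pointing to more specific sound-event classes.
with open("ontology.json") as f:
    nodes = {node["id"]: node for node in json.load(f)}

def print_subtree(node_id, depth=0):
    node = nodes[node_id]
    print("  " * depth + node["name"])
    for child_id in node.get("child_ids", []):
        print_subtree(child_id, depth + 1)

# Roots are nodes that never appear as another node's child.
children = {c for n in nodes.values() for c in n.get("child_ids", [])}
for root_id in nodes.keys() - children:
    print_subtree(root_id)
```
|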
Submitted Mar 08, 2017 to Science Courses and Tutorials This tutorial covers the skip-gram neural network architecture for Word2Vec. My intention with this tutorial was to skip over the usual introductory and abstract insights about Word2Vec, and get into more of the details. Specifically, here I'm diving into the skip-gram neural network model.
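For orientation, the skip-gram training objective in the standard notation of the Word2Vec papers (the tutorial's own notation may differ): maximize the average log-probability of the context words around each center word,

$$\frac{1}{T} \sum_{t=1}^{T} \sum_{\substack{-c \le j \le c \\ j \neq 0}} \log p(w_{t+j} \mid w_t), \qquad p(w_O \mid w_I) = \frac{\exp\big({v'_{w_O}}^{\top} v_{w_I}\big)}{\sum_{w=1}^{W} \exp\big({v'_{w}}^{\top} v_{w_I}\big)}$$

where $v$ and $v'$ are the input and output word vectors, $c$ is the window size, and $W$ is the vocabulary size; in practice the full softmax is replaced by negative sampling or the hierarchical softmax.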
|
Submitted Mar 03, 2017 (Edited Mar 03, 2017) to Scientific Software Edward is a Python library for probabilistic modeling, inference, and criticism. It is a testbed for fast experimentation and research with probabilistic models, ranging from classical hierarchical models on small data sets to complex deep probabilistic models on large data sets. Edward fuses three fields: Bayesian statistics and machine learning, deep learning, and probabilistic programming.
Edward is built on top of TensorFlow. It enables features such as computational graphs, distributed training, CPU/GPU integration, automatic differentiation, and visualization with TensorBoard.
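A minimal sketch in the spirit of Edward's documented Bayesian linear regression example (shapes and data are placeholders, and API details follow Edward 1.x as I recall them, so treat them as assumptions):

```python
import edward as ed
import numpy as np
import tensorflow as tf
from edward.models import Normal

N, D = 50, 3
X_train = np.random.randn(N, D).astype(np.float32)
y_train = X_train @ np.array([1.0, -2.0, 0.5], dtype=np.float32)

X = tf.placeholder(tf.float32, [N, D])
w = Normal(loc=tf.zeros(D), scale=tf.ones(D))        # prior over weights
b = Normal(loc=tf.zeros(1), scale=tf.ones(1))        # prior over intercept
y = Normal(loc=ed.dot(X, w) + b, scale=tf.ones(N))   # likelihood

# Variational approximation, fit by minimizing KL(q || p).
qw = Normal(loc=tf.Variable(tf.zeros(D)),
            scale=tf.nn.softplus(tf.Variable(tf.zeros(D))))
qb = Normal(loc=tf.Variable(tf.zeros(1)),
            scale=tf.nn.softplus(tf.Variable(tf.zeros(1))))
inference = ed.KLqp({w: qw, b: qb}, data={X: X_train, y: y_train})
inference.run(n_iter=500)
```
|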
Submitted Mar 02, 2017 to Scientific Software Mathematical optimization is a well-studied language for expressing solutions to many real-life problems that come up in machine learning and in many other fields, such as mechanics, economics, EE, operations research, control engineering, geophysics, and molecular modeling. As we build our machine learning systems to interact with real data from these fields, we often cannot (but sometimes can) simply "learn away" the optimization sub-problems by adding more layers to our network. Well-defined optimization problems may be added if you have a thorough understanding of your feature space, but oftentimes we don't have this understanding and resort to automatic feature learning for our tasks.
Until this repository, no modern deep learning library had provided a way of adding a learnable optimization layer into a model formulation (other than simply unrolling an optimization procedure, which is inefficient and inexact) that we can quickly try, to see if it's a nice way of expressing our data. See our paper OptNet: Differentiable Optimization as a Layer in Neural Networks and code at locuslab/optnet if you are interested in learning more about our initial exploration in this space of automatically learning quadratic program layers for signal denoising and sudoku.
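Drawing on the companion qpth package from the same authors, a rough sketch of calling a differentiable QP solver (shapes and the empty-tensor convention for "no equality constraints" are my reading of the README, so treat them as assumptions):

```python
import torch
from qpth.qp import QPFunction

nz = 3
L = torch.randn(nz, nz)
Q = L @ L.t() + 1e-3 * torch.eye(nz)     # positive definite quadratic cost
p = torch.randn(nz, requires_grad=True)  # differentiable linear cost
G = torch.eye(nz)                        # inequality constraints: z <= h
h = torch.ones(nz)
e = torch.Tensor()                       # no equality constraints

# Solves argmin_z 0.5 z'Qz + p'z s.t. Gz <= h, and is differentiable
# with respect to all problem data, so it can sit inside a network.
z_star = QPFunction(verbose=False)(Q, p, G, h, e, e)
z_star.sum().backward()                  # gradients flow through the solver
print(p.grad)
```
|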
Submitted Mar 01, 2017 to Scientific Software Scikit-plot is an intuitive library to add plotting functionality to scikit-learn objects.
Scikit-plot is the result of an unartistic data scientist's dreadful realization that visualization is one of the most crucial components in the data science process, not just a mere afterthought. Gaining insights is simply a lot easier when you're looking at a colored heatmap of a confusion matrix complete with class labels rather than a single-line dump of numbers enclosed in brackets. Besides, if you ever need to present your results to someone (virtually any time anybody hires you to do data science), you show them visualizations, not a bunch of numbers in Excel. That said, there are a number of visualizations that frequently pop up in machine learning. Scikit-plot is a humble attempt to provide aesthetically challenged programmers (such as myself) the opportunity to generate quick and beautiful graphs and plots with as little boilerplate as possible.
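A minimal sketch of the kind of one-liner the library aims for (the dataset and classifier are illustrative; plot_confusion_matrix is from scikit-plot's metrics module as I understand it):

```python
import matplotlib.pyplot as plt
import scikitplot as skplt
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

preds = LogisticRegression(max_iter=1000).fit(X_train, y_train).predict(X_test)

# One call replaces the usual single-line dump of numbers in brackets.
skplt.metrics.plot_confusion_matrix(y_test, preds)
plt.show()
```
|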
Submitted Feb 25, 2017 to Science Videos and Lectures Dozens of playlists by Victor Lavrenko of the University of Edinburgh, each consisting of multiple ~5-10 minute videos on subjects across machine learning, from basic clustering, regression, and probability algorithms to neural networks, information retrieval, search engines, and text analysis.
|