Science Courses and Tutorials
Science education websites including university courses online, massive open online courses, and tutorials. No commercial sites.
344 listings
Submitted May 14, 2017 (Edited May 16, 2017) to Science Courses and Tutorials This is an introductory course in machine learning (ML) that covers the basic theory, algorithms, and applications. ML is a key technology in Big Data, and in many financial, medical, commercial, and scientific applications. It enables computational systems to adaptively improve their performance with experience accumulated from the observed data. ML has become one of the hottest fields of study today, taken up by undergraduate and graduate students from 15 different majors at Caltech. This course balances theory and practice, and covers the mathematical as well as the heuristic aspects.
|
Submitted Apr 23, 2017 to Science Courses and Tutorials These notes form a concise introductory course on probabilistic graphical models. They are based on Stanford CS228, taught by Stefano Ermon, and have been written by Volodymyr Kuleshov, with the help of many students and course staff. The notes are still under construction! Although we have written up most of the material, you will probably find several typos. If you do, please let us know, or submit a pull request with your fixes to our GitHub repository.
This course starts by introducing probabilistic graphical models from the very basics and concludes by explaining from first principles the variational auto-encoder, an important probabilistic model that is also one of the most influential recent results in deep learning. |
Submitted Apr 21, 2017 to Science Courses and Tutorials A handy Python notebook containing commonly used PyTorch functions.
|
Submitted Apr 15, 2017 (Edited Apr 16, 2017) to Science Courses and Tutorials This notebook was originally prepared for the workshop Advanced Text Analysis with SpaCy and Scikit-Learn, presented as part of NYCDH Week 2017. Here, we try out features of the spaCy library for natural language processing. We also do some statistical analysis using the scikit-learn library.
|
Submitted Apr 13, 2017 to Science Courses and Tutorials The course covers foundations and recent advances of Machine Learning from the point of view of Statistical Learning and Regularization Theory. The goal of this course is to provide students with the theoretical knowledge and the basic intuitions needed to use and develop effective machine learning solutions to challenging problems.
|
Submitted Apr 11, 2017 to Science Courses and Tutorials In this project we will be teaching a neural network to translate from French to English.
|
Submitted Apr 10, 2017 to Science Courses and Tutorials Smart, data-driven applications are the future. This class teaches the new engineering principles that have emerged to create and run such applications.
To prepare students to meet these challenges, this course brings together topics from multiple areas of Computer Science: database systems, distributed computing, algorithms, and machine learning. A lot of the course material is drawn from recent research literature. |
Submitted Apr 05, 2017 to Science Courses and Tutorials This guide provides a collection of examples and best practices for creating interactive explanatory research articles using the Distill web framework.
|
Submitted Apr 01, 2017 to Science Courses and Tutorials When I first started investigating the world of deep learning, I found it very hard to get started. There wasn’t much documentation, and what existed was aimed at academic researchers who already knew a lot of the jargon and background. Thankfully that has changed over the last few years, with a lot more guides and tutorials appearing.
I always loved EC2 for Poets though, and I haven’t seen anything for deep learning that’s aimed at as wide an audience. EC2 for Poets is an explanation of cloud computing that removes a lot of the unnecessary mystery by walking anyone with basic computing knowledge step-by-step through building a simple application on the platform. In the same spirit, I want to show how anyone with a Mac laptop and the ability to use the Terminal can create their own image classifier using TensorFlow, without having to do any coding. |
Submitted Apr 01, 2017 to Science Courses and Tutorials In TensorFlow for Poets, I showed how you could train a neural network to recognize objects using your own custom images. The next step is getting that model into users’ hands, so in this tutorial I’ll show you what you need to do to run it in your own iOS application.
|
Submitted Mar 31, 2017 to Science Courses and Tutorials In the last few years, new inference methods have allowed big advances in probabilistic generative models. These models let us generate novel images and text, find meaningful latent representations of data, take advantage of large unlabeled datasets, and even let us do analogical reasoning automatically. This course will tour recent innovations in inference methods such as recognition networks, black-box stochastic variational inference, and adversarial autoencoders. It will also cover recent advances in generative model design, such as deconvolutional image models, thought vectors, and recurrent variational autoencoders. The class will have a major project component.
|
Submitted Mar 31, 2017 to Science Courses and Tutorials TensorFlow is a powerful open-source software library for machine learning developed by researchers at Google Brain. It has many pre-built functions to ease the task of building different neural networks. TensorFlow allows distribution of computation across different computers, as well as multiple CPUs and GPUs within a single machine. TensorFlow provides a Python API, as well as a less documented C++ API. For this course, we will be using Python.
This course will cover the fundamentals and contemporary usage of the TensorFlow library for deep learning research. We aim to help students understand the graphical computational model of TensorFlow, explore the functions it has to offer, and learn how to build and structure models best suited for a deep learning project. Through the course, students will use TensorFlow to build models of different complexity, from simple linear/logistic regression to convolutional neural networks and recurrent neural networks with LSTM, to solve tasks such as word embeddings, translation, and optical character recognition. Students will also learn best practices to structure a model and manage research experiments. |
Submitted Mar 31, 2017 to Science Courses and Tutorials Defining your models in TensorFlow can easily result in one huge wall of code. How can you structure your code in a readable and reusable way?
This article explains how to use Python property decorators to strip out redundant code and how to organize your graph with scopes. |
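The property-decorator idea from the article on structuring TensorFlow models can be sketched in plain Python. This is only an illustration, not the article's own code: `lazy_property` and the `Model` class are hypothetical names, and a list computation stands in for graph construction so the sketch runs without TensorFlow.

```python
import functools


def lazy_property(function):
    """Turn a method into a property whose body runs only once per instance."""
    attribute = "_cache_" + function.__name__

    @property
    @functools.wraps(function)
    def decorator(self):
        # Build the value on first access, then reuse the stored result.
        if not hasattr(self, attribute):
            setattr(self, attribute, function(self))
        return getattr(self, attribute)

    return decorator


class Model:
    """Hypothetical stand-in for a model class; no TensorFlow required."""

    def __init__(self, inputs):
        self.inputs = inputs
        self.build_count = 0

    @lazy_property
    def prediction(self):
        # In a real model this is where graph operations would be defined.
        self.build_count += 1
        return [2 * x for x in self.inputs]


model = Model([1, 2, 3])
first = model.prediction   # runs the method body
second = model.prediction  # returns the cached value
```

Caching the result means repeated attribute access does not rebuild the same part of the model, which is the redundancy the article's description refers to.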
Submitted Mar 28, 2017 (Edited Apr 10, 2017) to Science Courses and Tutorials Although spatial interaction modeling is a fundamental technique in many geographic disciplines, relatively little software exists for spatial interaction modeling and for the analysis of flow data. This applies particularly to the realm of free and open source software. As a result, this primer introduces the recently developed spatial interaction modeling (SpInt) module of the Python Spatial Analysis Library (PySAL). The underlying conceptual framework of the module is first highlighted, followed by an overview of the main functionality, which will be illustrated using migration data. Finally, some future additions are discussed.
|
Submitted Mar 28, 2017 to Science Courses and Tutorials An introduction to quantum computing using the pyQuil Python library. This tutorial covers such topics as qubit operations, Pauli operators, multi-qubit operations, the Quantum Abstract Machine, classical/quantum interaction, the probabilistic halting problem, and more, complete with pyQuil code examples.
|
Submitted Mar 27, 2017 (Edited Mar 27, 2017) to Science Courses and Tutorials This note describes how to organize the written thesis, which is the central element of your graduate degree. To know how to organize the thesis document, you first have to understand what graduate-level research is all about, so that is covered too. In other words, this note should be helpful when you are just getting started in your graduate program, as well as later when you start to write your thesis.
|
Submitted Mar 27, 2017 to Science Courses and Tutorials Mining word associations from a body of text is often one of the first Natural Language Processing techniques used when mining text data. Word associations are useful for performing NLP tasks such as part-of-speech tagging, parsing, and entity extraction. We will take a brief look at one type of word association called paradigmatic association and show how we can use the Neo4j graph database to help model our text corpus as a graph and implement a simple paradigmatic relation mining algorithm.
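To give a rough sense of what a paradigmatic relation is: two words are paradigmatically related when they tend to appear in the same surrounding contexts. The sketch below is not the tutorial's Neo4j implementation; it swaps the graph database for in-memory Python sets and, as an assumption, scores word pairs by the Jaccard similarity of their (left neighbour, right neighbour) contexts. The toy corpus and the function names are made up for illustration.

```python
from collections import defaultdict


def context_sets(sentences):
    """Map each word to the set of (left, right) neighbour pairs around it."""
    contexts = defaultdict(set)
    for sentence in sentences:
        # Pad with boundary markers so first and last words get a context.
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for i in range(1, len(tokens) - 1):
            contexts[tokens[i]].add((tokens[i - 1], tokens[i + 1]))
    return contexts


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Toy corpus (an assumption for illustration only).
corpus = [
    "my cat eats fish",
    "my dog eats fish",
    "my cat sleeps",
    "my dog sleeps",
]
contexts = context_sets(corpus)

# "cat" and "dog" share all their contexts; "cat" and "fish" share none.
cat_dog = jaccard(contexts["cat"], contexts["dog"])
cat_fish = jaccard(contexts["cat"], contexts["fish"])
```

In a graph database the same idea would presumably be expressed as words connected through shared context nodes, with the overlap computed by a query rather than by set operations.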
|
Submitted Mar 27, 2017 to Science Courses and Tutorials The ‘tipping problem’ is commonly used to illustrate the power of fuzzy logic principles to generate complex behavior from a compact, intuitive set of expert rules.
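A minimal sketch of the tipping problem in plain Python follows. Everything specific here is an assumption chosen for illustration rather than taken from the tutorial: the 0 to 10 rating scales, the 0 to 25 percent tip range, the triangular membership shapes, the three rules, and centroid defuzzification.

```python
def triangle(x, a, b, c):
    """Triangular membership rising from a to a peak at b, falling to c."""
    if x == b:
        return 1.0
    if x <= a or x >= c:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def tip_percent(food, service, steps=251):
    """Suggest a tip (0-25%) from food and service ratings (0-10)."""
    # Fuzzify the crisp inputs.
    poor_service = triangle(service, 0, 0, 5)
    average_service = triangle(service, 0, 5, 10)
    good_service = triangle(service, 5, 10, 10)
    poor_food = triangle(food, 0, 0, 5)
    good_food = triangle(food, 5, 10, 10)

    # Rules (OR is taken as max):
    #   poor service OR poor food -> low tip
    #   average service           -> medium tip
    #   good service OR good food -> high tip
    low = max(poor_service, poor_food)
    medium = average_service
    high = max(good_service, good_food)

    # Clip each output set by its rule strength, merge the clipped sets
    # with max, and take the centroid of the result over a sampled range.
    numerator = denominator = 0.0
    for i in range(steps):
        tip = 25.0 * i / (steps - 1)
        membership = max(
            min(low, triangle(tip, 0, 0, 13)),
            min(medium, triangle(tip, 0, 13, 25)),
            min(high, triangle(tip, 13, 25, 25)),
        )
        numerator += tip * membership
        denominator += membership
    # No rule fires only for ratings outside 0-10; return 0.0 in that case.
    return numerator / denominator if denominator else 0.0


generous = tip_percent(food=9, service=9)
stingy = tip_percent(food=2, service=2)
```

Three short rules are enough to produce a smooth, nonlinear mapping from two ratings to a tip, which is the point the listing's description makes about compact expert rules.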
|
Submitted Mar 25, 2017 to Science Courses and Tutorials Software architecture has become increasingly important in the last 15 years in the software engineering community. At the heart of every well-engineered software system is its software architecture. Software architecture deals with the high-level building blocks that represent an underlying software system. These building blocks are the components (units of computation in a system), the connectors (models of the interactions between software components), and the configurations (arrangements of software components and connectors, and the rules that guide their composition). Software architectures that are found particularly useful for families of systems are often codified into architectural styles.
This course will afford the student a complete treatment of software architecture, its foundation, principles, and elements, including those described above. The class is centered around reading assignments and homework that will test comprehension of the course material. A class project will require the student to apply the architectural techniques learned during the course (e.g., architectural recovery, architectural styles, domain-specific software architectures), together with programming and implementation effort, to design and implement a real-world software system. In addition to foundations and practical experience with software architectures, the class will also introduce the student to the state of the art in software architecture research, future trends, and the state of the practice. Students are expected to attend class regularly, participate (as directed) in all class discussions, and, most importantly, have fun! |
Submitted Mar 25, 2017 (Edited Mar 25, 2017) to Science Courses and Tutorials This course is designed as an advanced course in data analytics and big data. The course introduces students to the area of content detection and analysis. This involves understanding digital file formats, detecting them, and extracting data from them. Emphasis areas include Document Type Detection; Parsing and extraction; Metadata understanding and analysis; Language Identification and detection from files; and, finally, file formats and representation. The class also has a specific focus on Content Detection and Analysis from large data sets. Datasets used in the course are publicly collected by the instructor or his collaborators involved in national Big Data initiatives including DARPA, NASA and other projects. The course is designed to be accessible to students with experience programming in Java and in Python at an intermediate level. The first half of the course focuses on Java, using the Tika framework as the core technology for instruction. The instructor is the co-inventor of Tika and has deep experience in the technology and in search engine technology from Apache. The second half of the course introduces the students to the use of Python programming for Content Detection and Analysis using Tika, ElasticSearch™, Solr, Nutch and Apache Hadoop™. The course will be a combination of lecture, in-class discussion, readings, group-based assignments and a final exam.
|