This is the first day of my participation in the August More Text Challenge.

These notes follow Andrew Ng’s machine learning course. The PyTorch notes will be added later.


Machine Learning definition

Machine Learning: Field of study that gives computers the ability to learn without being explicitly programmed.

— Arthur Samuel (1959)

Well-posed Learning Problem: A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.

— Tom Mitchell (1998)

In other words, a learning problem is well posed when a computer program learns from experience E to carry out a task T, and its performance on T, as measured by P, improves as E grows.

Supervised learning

In supervised learning, we are given a data set in which every sample is labeled with the correct answer, and the algorithm learns to predict the “correct answer” for new samples.

Supervised Learning: We gave the algorithm a data set, in which the right answers were given. The task of the algorithm was to just produce more of these right answers.

  • Regression: Predict continuous valued output
  • Classification: Discrete valued output

Regression problems

Given the study time and test score of each sample, we fit a simple function. Once the function is fitted, either value can be estimated from the other: a known score of 75 points suggests about 3 hours of review, and a known 3 hours of review suggests a score of about 75 points.
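The fit described above can be sketched in a few lines. This is only a minimal illustration with an ordinary least-squares straight line. The data values and the names fit_line, predict_score and predict_hours are made up for this note and do not come from the course.

```python
# Made-up samples: hours of review and the test score obtained.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 66.0, 74.0, 86.0, 97.0]


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(hours, scores)


def predict_score(h):
    # Forward direction: review time -> expected score.
    return slope * h + intercept


def predict_hours(s):
    # Reverse direction: invert the fitted line to go from score -> time.
    return (s - intercept) / slope


print(predict_score(3.0))   # about 75 points for 3 hours of review
print(predict_hours(75.0))  # about 3 hours of review for 75 points
```

The output is a continuous value, which is what makes this a regression problem.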

Classification problems

Suppose the height, weight and sex of each person in the data set are known, and the two sexes differ clearly in height and weight. Then, given a new height and weight, you can infer the sex.
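As a rough sketch of this idea, the code below assigns a new sample to the class whose average height and weight is closest (a nearest-centroid rule). I picked this technique only because it is short. The course does not introduce it here, and the samples and function names are invented for illustration.

```python
# Made-up labeled samples: (height in cm, weight in kg) -> sex.
samples = [
    ((178.0, 75.0), "male"),
    ((182.0, 82.0), "male"),
    ((175.0, 70.0), "male"),
    ((160.0, 52.0), "female"),
    ((165.0, 58.0), "female"),
    ((158.0, 50.0), "female"),
]


def centroids(data):
    """Average height and weight for each label."""
    sums = {}
    for (height, weight), label in data:
        total = sums.setdefault(label, [0.0, 0.0, 0])
        total[0] += height
        total[1] += weight
        total[2] += 1
    return {label: (s[0] / s[2], s[1] / s[2]) for label, s in sums.items()}


def classify(height, weight, centers):
    """Return the label whose centroid is nearest to the new sample."""
    def sq_dist(center):
        return (height - center[0]) ** 2 + (weight - center[1]) ** 2

    return min(centers, key=lambda label: sq_dist(centers[label]))


centers = centroids(samples)
print(classify(180.0, 78.0, centers))  # male
print(classify(162.0, 55.0, centers))  # female
```

Here the output is one of a fixed set of labels rather than a number on a continuous scale, which is the difference from regression.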

Examples:

You’re running a company and you want to develop learning algorithms to address each of two problems.

Problem 1: You have a large inventory of identical items. You want to predict how many of these items will sell over the next 3 months.

Problem 2: You’d like software to examine individual customer accounts and, for each account, decide if it has been hacked/compromised.

Answer: treat problem 1 as a regression problem and problem 2 as a classification problem.

Problem 1 predicts sales: a model is fitted to the historical sales data and then used for prediction. The number of items sold is large enough to be treated as a continuous value, so it is a regression problem.

Problem 2 checks accounts: the output is discrete, either compromised or not compromised, so it is a classification problem.

Unsupervised learning

In unsupervised learning, the data set is given without labels. The program groups the input data on its own to find structure and patterns in it.

In Unsupervised Learning, we are given data that doesn’t have any labels, or that all has the same label, or really no labels.

  • Clustering: an algorithm groups similar samples into clusters
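To make clustering concrete, here is a small sketch of k-means, one common clustering algorithm (the notes above do not name a specific one, so this choice is mine). The points, the starting centers and the function names are made up for illustration. Note that no labels are passed in anywhere.

```python
# Made-up unlabeled 2-D points that visibly form two groups.
points = [
    (1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
    (8.0, 8.0), (9.0, 8.5), (8.5, 9.0),
]


def sq_dist(a, b):
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def mean(cluster):
    n = len(cluster)
    return (sum(p[0] for p in cluster) / n, sum(p[1] for p in cluster) / n)


def kmeans(data, centers, iterations=10):
    """Alternate between assigning points and moving the centers."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in centers]
        for p in data:
            nearest = min(range(len(centers)),
                          key=lambda i: sq_dist(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: each center moves to the mean of its points
        # (an empty cluster keeps its old center).
        centers = [mean(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters


# Assumption: start from the first and last point as initial centers.
centers, clusters = kmeans(points, [points[0], points[-1]])
for center, cluster in zip(centers, clusters):
    print(center, cluster)
```

With this starting choice the first three points end up in one cluster and the last three in the other. A real implementation would usually pick the starting centers at random and stop once the assignments no longer change.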

As a former biology student, my first reaction is that clustering is widely used in bioinformatics. For example, given a set of DNA sequences, clustering can automatically divide them into different species. The original post shows a heat map here.

Examples:

Of the following examples, which would you address using an unsupervised learning algorithm?

  • Given email labeled as spam/not spam, learn a spam filter.

  • Given a set of news articles found on the web, group them into sets of articles about the same story.

  • Given a database of customer data, automatically discover market segments and group customers into different market segments.

  • Given a dataset of patients diagnosed as either having diabetes or not, learn to classify new patients as having diabetes or not.

The answer:

  • Given a set of news articles found on the web, group them into sets of articles about the same story.
  • Given a database of customer data, automatically discover market segments and group customers into different market segments.

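The first option (deciding whether an email is spam) is supervised learning: the data is labeled and the output is discrete, so it is classification.

The fourth option (identifying patients with diabetes) is also supervised learning: the data is labeled and the output is discrete, so it is classification.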