Is the biggest problem with AI development the lack of a platform?

Kai-fu Lee made a very pertinent comment about artificial intelligence during his speech at Tsinghua University. He also explained in detail how ordinary enterprises can apply artificial intelligence to build competitive advantages and technical barriers. In short, the artificial intelligence industry is still very constrained: there is no platform. But precisely because of this limitation, the barriers built up now are also the highest. Given the many advantages of AI, we believe enterprises should start using AI to assist their work and also recruit related talent.

But talk alone is not enough. How do you overcome the difficulties and challenges so that artificial intelligence helps your work and your career? Let’s explain Kai-fu Lee’s speech with an example.

Giiso Information, founded in 2013, is a leading technology provider in the field of “artificial intelligence + information” in China, with strong technologies in big data mining, intelligent semantics, knowledge graphs, and other fields. Its products include editing robots, writing robots, and other artificial intelligence products. On the strength of its technology, the company received angel-round investment when it was founded and a pre-A round investment of $5 million from GSR Venture Capital in August 2015.

Now, suppose you’re a programmer

I work in media myself, but I have no qualms about poking fun at my own trade. Suppose you work in the IT department of a media company. At most media outlets, the main daily job is to copy articles from other sites, add some attribution, and post them on their own site. You are just a rank-and-file coder, but who cares: you want to save the company’s miserable editors. So you decide to write a program of your own that lets editors copy articles with one click, or even automatically. How would you do that?

Using artificial intelligence to copy and paste may seem like overkill, but this seemingly mechanical work still requires some adaptability. Besides the body text, a web page contains a mess of ads and links. As long as the site’s designer is reasonably competent, a human can tell at a glance which part is the body and which is irrelevant. But how does an algorithm recognize the difference between body text and ads or irrelevant links? And how does the algorithm find what is worth copying on a site (the so-called “hot spots”)?

Most importantly, once you think it through carefully, there is a lot to take care of. The if/else you usually rely on no longer seems to be enough, so what language should you use to implement your amazing algorithm?

This brings us to the first challenge Lee mentioned for deep learning today: there is no platform.

Deep learning challenge # 1: Platforms

There is no unified platform for AI. When it comes to deep learning, people today know that it works without fully knowing why. That’s why Google has recently spent a lot of money hiring the best talent in the industry, offering young people salaries of more than $2 million a year. How can people in their twenties with fresh PhDs be worth so much?

Why so expensive? Kai-fu Lee mentioned that once these people are put to work in various fields of AI research, they may soon create tens or even hundreds of millions of dollars of value. What he perhaps did not spell out is that AI development is really hard right now, and it is hard because there is no platform.

The concept of a platform is rather vague. AI development is still at the stage of crossing the river by feeling for the stones, so no one can predict the exact form a so-called “platform” will take. The topic could fill an article of its own, but put simply, a platform is something like a “standard.” For example, neural network algorithms bring to mind many concepts, such as CNN, RNN, and DNN, and their concrete implementations vary wildly. All AI programming today starts from zero and builds algorithms bit by bit. But suppose that one day something like iOS or Android emerges that settles on a good algorithm (not necessarily the best one) and packages it into a program. Anyone who later wants to do neural-network development would then only need to call the API it provides. That would greatly simplify the development of deep learning.
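To make “starting from zero” concrete, here is a minimal sketch of a single artificial neuron (a perceptron) learning the logical AND function in plain Python. The function names, learning rate, and epoch count are illustrative assumptions, not anything from Lee’s speech. The point is only that, without a platform, even this toy case needs a hand-written training loop.

```python
def predict(weights, bias, inputs):
    """Return 1 if the weighted sum is above zero, else 0."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0


def train_perceptron(samples, epochs=20, learning_rate=0.5):
    """Hand-written perceptron learning rule (hypothetical helper)."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            # Nudge each weight in the direction that reduces the error.
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Truth table for logical AND: (inputs, expected output).
AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train_perceptron(AND_SAMPLES)
print([predict(weights, bias, x) for x, _ in AND_SAMPLES])  # → [0, 0, 0, 1]
```

With a platform in the sense described above, the training loop would be hidden behind a single API call.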

Deep learning challenge # 2: Data collection and computation

Of course, for a programmer with the whole world at heart, this is no obstacle. You quickly find a suitable framework, such as TensorFlow or scikit-learn, and happily start writing code. But the next problem may not be so easy to solve. Two factors determine how well your algorithm trains: the amount of training data and the speed of training. Kai-fu Lee split this into two problems, but since both concern algorithm training, we treat them as one.

Deep learning networks are too large and require huge amounts of data.

Because there is so much data, training is extremely slow and requires a very large amount of computing power.

Locating the body text within a page is still the easier problem. If you harden your heart and decide to copy only from a handful (a dozen or so) of mainstream media sites, if/else is enough: although the layout rules differ from site to site, the articles within any one site basically follow the same rule. If you really want a general algorithm, the rules are not hard to find either. For example, text density rises sharply in the body while HTML code density plummets, and the body is almost always wrapped in <p></p> tags. If a site publishes four or five hundred articles a day, about ten days of training data can make the algorithm very accurate.
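The text-density rule can be sketched in a few lines of Python. This is only a rough heuristic under stated assumptions: the page is processed line by line, tags are stripped with a simple regular expression rather than a real HTML parser, and the thresholds `min_density` and `min_chars`, the function names, and the sample page are all made up for illustration.

```python
import re

# Naive tag matcher; a real extractor would use an HTML parser.
TAG_RE = re.compile(r"<[^>]+>")


def text_density(line):
    """Fraction of a line's characters that remain after stripping tags."""
    stripped = line.strip()
    if not stripped:
        return 0.0
    text = TAG_RE.sub("", stripped).strip()
    return len(text) / len(stripped)


def extract_body(html, min_density=0.6, min_chars=40):
    """Keep lines that are mostly text and long enough to be prose."""
    body = []
    for line in html.splitlines():
        text = TAG_RE.sub("", line).strip()
        if len(text) >= min_chars and text_density(line) >= min_density:
            body.append(text)
    return body


# Hypothetical page: navigation, an ad, two body paragraphs, a footer.
SAMPLE = """<div class="nav"><a href="/">Home</a> <a href="/news">News</a></div>
<div class="ad"><a href="http://ads.example.com/x"><img src="banner.png"></a></div>
<p>Kai-fu Lee argued in his speech that the lack of a common platform is what makes deep learning so hard to apply.</p>
<p>He also pointed out that training needs enormous amounts of data and computation, which few companies have.</p>
<div class="footer"><a href="/about">About us</a> | <a href="/contact">Contact</a></div>"""

for paragraph in extract_body(SAMPLE):
    print(paragraph)
```

On this sample only the two `<p>` lines survive, because the navigation, ad, and footer lines are dominated by markup and carry little text.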

The real difficulty lies in the “hot spots”!

The hot spots on the Internet change every day. How does your algorithm know what’s hot today? How does it know whether the article it is scanning relates to a hot topic, and how well that article is written? To train an algorithm to judge all this, it would have to scan tens of millions of articles. As the ultimate hobbyist coder, you feel for the first time how weak the 8-core i7 and GTX Titan in the computer in front of you are. Maybe you can sneak some time on the company’s servers during off-peak hours.
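As a very rough illustration of what “knowing what’s hot today” could mean, the sketch below flags terms that appear far more often in today’s headlines than in earlier ones. It is a simple frequency comparison, not a trained model. The stopword list, the thresholds `min_count` and `ratio`, and the sample headlines are all assumptions made up for this example.

```python
import re
from collections import Counter

WORD_RE = re.compile(r"[a-z]+")
# Tiny illustrative stopword list; a real system would need a fuller one.
STOPWORDS = {"the", "a", "an", "of", "to", "in", "and", "is", "on", "for", "at"}


def term_counts(headlines):
    """Count how many headlines mention each non-stopword term."""
    counts = Counter()
    for headline in headlines:
        words = set(WORD_RE.findall(headline.lower())) - STOPWORDS
        counts.update(words)
    return counts


def hot_terms(today, previous, min_count=2, ratio=2.0):
    """Terms whose count today is at least `ratio` times the smoothed baseline."""
    now, before = term_counts(today), term_counts(previous)
    hot = [
        term for term, count in now.items()
        # Add 1 to the baseline so unseen terms do not divide by zero.
        if count >= min_count and count / (before[term] + 1) >= ratio
    ]
    return sorted(hot)


PREVIOUS = [
    "Stock market closes higher",
    "New phone released in spring",
    "Stock market dips on rate fears",
]
TODAY = [
    "AlphaGo beats Lee Sedol in first game",
    "What AlphaGo means for AI research",
    "Stock market flat as AlphaGo dominates headlines",
]

print(hot_terms(TODAY, PREVIOUS))  # → ['alphago']
```

Even this toy version needs a history of past headlines to compare against, which hints at why judging hot spots at Internet scale takes so much data.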

For deep learning algorithms to evolve to a fairly high level on their own, Lee estimates that data on the order of at least 1 billion samples is needed, which is quite difficult to collect. Moreover, the data only realizes its full value when all of it is your own. And because the amount of data is so large, the computation required is also enormous. To make real headway in deep learning, it is better to have your own computing equipment, such as your own server cluster. That is why the early successes in AI have come from world-class companies like Microsoft, Google, and Facebook, which not only have more money and better people but, most importantly, have a lot of data.
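A back-of-the-envelope calculation shows why a single desktop machine falls short at this scale. The throughput of 100 samples per second per machine and the cluster size of 50 are purely hypothetical numbers chosen for illustration, and the sketch assumes the work divides perfectly evenly across machines, which real clusters only approximate.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def days_to_process(num_samples, samples_per_second, machines=1):
    """Days needed for one pass over the data at a fixed throughput."""
    return num_samples / (samples_per_second * machines) / SECONDS_PER_DAY


# Hypothetical throughput: 100 samples per second per machine.
single = days_to_process(1_000_000_000, 100)
cluster = days_to_process(1_000_000_000, 100, machines=50)
print(f"one machine: {single:.1f} days, 50 machines: {cluster:.1f} days")
# → one machine: 115.7 days, 50 machines: 2.3 days
```

Under these assumptions, one pass over a billion samples drops from nearly four months to a couple of days, and training usually needs many passes.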

Deep learning challenge # 3: No feedback

“Strange but reasonable: machines cannot tell you the why and the how in human language. Even though a trained machine does a great job at deep learning tasks such as face recognition and speech recognition, it is not like a human: it cannot tell you how it did it. Although some people are researching this, deep learning remains hard to apply today in fields where you must tell others what to do and explain why. For example, AlphaGo beat Lee Sedol. If you ask AlphaGo why it made a certain move, it will not be able to answer.”

A deep learning algorithm can only keep modifying itself according to its initial design. It has no effective way to tell you how it improved or how it makes each choice after improving. So much of the time you can only guess at its state from its final performance and then improve the original algorithm by trial and error, like a blind cat stumbling on a dead mouse. Nor can you necessarily learn from the algorithm’s progress to improve yourself, as in the AlphaGo example cited by Kai-fu Lee.

Of course, as a world-class coder, you know all this, and you can spot the problem from the algorithm’s final performance, because in this case the problem shows up fairly obviously in the results.

In this imaginary world, we decided to give all the helpless editors a happy ending: the coder with the world at heart successfully develops a “fully automatic article reprint machine,” which finally frees up some of the editors’ energy to write polished articles of their own.

So you see, while we don’t know whether AI will end up destroying us, it looks like it will soon be saving a lot of people.
