Preface

I have wanted to write about open source for a long time. It has been six years since I started working with Hadoop in 2010. I went from being one of the earliest Hadoop users to Contributor, then Committer, and finally a member of the PMC (Project Management Committee), with both frustration and joy along the way. Since I deal with open source every day, I would like to share some personal thoughts here.

This article is my personal opinion and does not represent the views of my employer, Hortonworks, or the Apache Software Foundation.

Copyright notice: This article is published by LeftNoteasy on LeftNoteasy.cnblogs.com. It may be quoted in part or in whole, provided the full text of this copyright notice is retained. If you have questions, contact Wheeleast (at) gmail.com or reach me at @leftNoteasy.

What is open source?

I believe many people are confused about the concept of open source. I touched on it in a previous blog post, but I want to expand on it.

Open mind

The first thing to be clear about is that open source is a spirit: I want to share my work so that more people can use it. What people get out of open source varies. Some do it for financial gain, others out of interest. But once you open source something, you no longer get to decide how anyone else uses it.

Some comments on my previous blog post said, "I don’t want to open source because I don’t want others to sell my code for money." To that I can only say: open source is not for you.

Open source licenses

Open source software needs an appropriate license. For details, see How to Select an Open Source License. If none of the licenses described there suits you, you can always write your own.

Take a popular license, the MIT license, as an example of what a license requires. The full English text is long, so here are a few excerpts with my annotations:

… including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software

You can change my code any way you want, package it and sell it!

… THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED…

The code is as it is, without any warranty!

… IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY…

I’m not responsible for it!

MIT is a relatively permissive license, but there are also stricter licenses such as the GPL: if you distribute a program that incorporates or is derived from my GPL code, you must release that program under the GPL as well. Xiaomi was involved in some disputes about the GPL before. It is therefore important to choose a license that fits your circumstances, and when you use someone else’s open source program, to check that you are not violating the author’s license.
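As a rough illustration of checking the license of code you depend on, here is a minimal Python sketch. It assumes the source files carry an `SPDX-License-Identifier` comment, which is a common convention but not something every project follows. The `PERMISSIVE` and `COPYLEFT` sets and the function names are hypothetical and deliberately incomplete; this is not legal advice, only a way to flag files that deserve a closer look.

```python
import re

# Hypothetical, incomplete groupings for illustration only.
PERMISSIVE = {"MIT", "Apache-2.0", "BSD-3-Clause"}
COPYLEFT = {"GPL-2.0-only", "GPL-2.0-or-later", "GPL-3.0-only", "GPL-3.0-or-later"}

# Matches a tag such as "SPDX-License-Identifier: MIT" anywhere in the text.
SPDX_TAG = re.compile(r"SPDX-License-Identifier:\s*([A-Za-z0-9.+-]+)")


def find_license(source_text):
    """Return the SPDX identifier declared in the text, or None if absent."""
    match = SPDX_TAG.search(source_text)
    return match.group(1) if match else None


def classify(license_id):
    """Group a license identifier as permissive, copyleft, or unknown."""
    if license_id in PERMISSIVE:
        return "permissive"
    if license_id in COPYLEFT:
        return "copyleft"
    return "unknown"


if __name__ == "__main__":
    sample = "# SPDX-License-Identifier: GPL-3.0-only\nprint('hello')\n"
    license_id = find_license(sample)
    print(license_id, classify(license_id))
```

A file classified as "copyleft" or "unknown" is a signal to read the actual license text before reusing the code.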

The open source community

The open source community is the ecosystem on which open source software lives. The community is made up of users and developers.

Open source community users

Users come from all over the world. They use the software, complain about it (sometimes very sarcastically), and contribute to it. The Hadoop community, for example, has many users who post their problems to the user and developer mailing lists and to JIRA (the bug database), where the developer community can see and fix them. The most welcome users are those who understand the software deeply, dig into the code to find the cause of a bug, and propose a solution.

Open source community developers

Developers may come from many companies or from a single company, depending on the purpose of the project. For example, the core developers of Hadoop come from companies that use Hadoop heavily or sell Hadoop-related products, such as Yahoo, Microsoft, Hortonworks, and Cloudera. Some open source projects, on the other hand, are not keen on participation from other companies. Google’s open source products, for example TensorFlow and Android, generally see few contributions from other companies.

A good open source community forms a virtuous circle: users use the software, find bugs, and propose improvements; developers use that feedback to keep improving the product; and new developers keep joining, so the project carries on even when older developers move on.

A bad open source community, on the other hand, has little participation, ignores user feedback, cannot ship new releases, and is effectively a dead project.

Who are the developers in the open source community?

Are developers in the open source community unpaid?

A common belief is that people in the open source community are technology idealists who live for technology and care nothing for money: they do their day job for the company and work on open source projects after hours.

This is a big misconception. The vast majority of developers in the open source community contribute code as part of their jobs, because these open source projects are part of their company’s IT infrastructure. Of course, there are also many great experts who create amazing software purely out of their own interest. Personally, I admire Linus Torvalds the most: he created Linux and Git entirely on the strength of his own skill, experience, and interest.

Is the open source community a pure land of technology?

Another common misconception is that the open source community is a place where programmers live in their own technical world, oblivious to everything outside and focused only on programming. In fact, wherever there are people, there is politics. My friend Zhijie wrote an article about the Hadoop two-party system, which is well worth reading.

As in any company, the open source community has its share of not-so-nice behavior:

  • Taking credit for other people’s work
  • Forming cliques, playing politics, and trading barbs in discussion threads
  • Responding to opinions they disagree with by launching nasty personal attacks

But I have to say that a good open source community also has more technology gurus and a better atmosphere than the average company, so hopefully this doesn’t scare you off. It is not a pure land, but it is cleaner than most places.

How do open source projects make money?

There are many ways to make money from open source projects, and open sourcing your work does not mean simply giving it away for others’ benefit (although it sometimes feels like that). Aside from purely personal interest, the main reason companies open source their software is that they can no longer sell it as a closed-source product.

For example, would Android be open source if there were no iOS? Without Unix and Windows, it is hard to say whether Linux would have been open source. So the best way for No. 2 to beat No. 1 is for No. 2 to open source its most important technology.

Once open source goes out, there are many ways to make money, such as:

  • Selling technical support: You use my open source software for free, but I can fix problems for you and add the features you need. This is the most common way to make money. Its biggest problem is a relatively low profit margin, because the labor cost of technical support is high.
  • Selling training: This is similar to the above. I understand the software best, so if you want to learn it, you naturally come to me.
  • Selling advanced features: Some vendors provide more advanced features on top of the open source software, and these features are often closed source. Red Hat has done this successfully, but there are several major problems:
    • 1) Users may worry about lock-in. In many cases, enterprise users choose open source software precisely because they do not want to be locked in to one vendor’s software, in case that vendor goes bankrupt or charges excessive prices.
    • 2) How valuable are the advanced features? Is there free open source software that offers something similar?
  • Selling cloud services: Many companies now offer open source software as a cloud service. For example, TensorFlow can run on Google Cloud Platform, and Docker can run on Docker’s own cloud. This is currently a popular way to profit from open source software.

In any case, products built around open source projects tend to be much cheaper than comparable closed-source products, and where there is a healthy community, I expect fewer and fewer people to use the closed-source alternatives in the coming years. As an example, Teradata’s revenue for its latest quarter was down from the year-earlier period, a decline I attribute to competition from a variety of open source projects; see Teradata Reports 2016 First Quarter Results.

Do open source projects represent the highest code quality?

The short answer is no.

The following happens all the time in most open source communities: a developer updates their checkout, exclaims "I can’t believe this got committed to the codebase", and opens a new discussion thread to start a fight.

There are several reasons why code quality in the open source community falls short:

1) Open source projects usually have a strict code review policy: all code entering the repository must be reviewed and approved by the person responsible for the relevant area before it can be committed. Even so, developer inexperience and reviewer oversight often let some not-so-good code through.

2) In addition, the open source community sometimes has a lot of inefficient discussion, and the final decision is sometimes a compromise between conflicting requirements.

Therefore, across the IT field today, open source projects are generally not the best in their class. In distributed systems, for example, Google’s internal GFS is probably far ahead of Hadoop HDFS, and Google Borg is years ahead of YARN, Mesos, and K8s.

Still, you should not judge a project only by its weak spots. Finding some feeble code in Hadoop does not mean that all of the code is garbage. The architecture, the user-facing interfaces, and the most critical core code are led by experts and have been through a great deal of careful thought.

Moreover, because every line you write in the open source community is seen by others, most people, mindful of their personal reputation, take more care with the code they contribute. At the same time, developers who consistently submit low-quality code get labeled "unreliable" and have a hard time continuing to work in the community.

So in summary:

The quality of open source code is not the best, but it is still pretty good. And because an open source community is built by many people, good projects live on even after the person or company that started them is gone.

Since this is meant as an introductory article, I’ll leave it there. In later posts I’ll talk about some more in-depth topics about open source.