Zhang Liang, co-founder of Zhihu, asked Teacher Yu Jun a question on Zhihu: “Based on your experience, what are the three to five product improvements that Zhihu needs to make most urgently?” The first point in Yu Jun’s answer was: “Explore personalized content push. I know Zhihu has a lot of content that would interest me, but I am rarely willing to click on what Zhihu pushes to me, so I always feel as if I have entered a treasure mountain and come back empty-handed. NetEase Cloud Music, Taobao, and Toutiao are all good examples to learn from.”

In fact, Zhihu launched a lab feature called “Enable the new home page feed experience” in November 2013. One of its characteristics is that “the content in the feed will be adjusted according to the relationships between users, users’ interest in topics, and the quality of the content, instead of being ordered strictly by time”. For some reason it still does not seem to have been officially rolled out: the lab feature remains off by default, and users have to turn it on manually. If you did not know about this feature, you can turn it on under “Settings – Lab” and give it a try.

Some time ago, Xavier Amatriain, Quora’s VP of Engineering and a machine learning expert, gave a talk at the Question Answering Workshop at WWW 2016: “Machine Learning for Q&A Sites: The Quora Example” [1]. I studied it over the weekend and am sharing my notes here.

Quora’s Mission: To share and grow the world’s knowledge.

Quora mainly considers three factors: Relevance, Quality, and Demand.

Quora’s core data models and the relationships between them.

Feed Ranking

One of the core problems of recommendation at Quora is Personalized Feed Ranking. Quora links “knowledge” together with questions, answers and topics at the core, then grades content quality based on user actions such as upvotes and downvotes, and finally makes knowledge flow within the community through the follow relationships between people and questions. Personal feeds are the main carriers of this “flow”. Xavier says Feed Ranking is harder at Quora than at Netflix, which makes sense: Xavier would not have made the jump without a bigger challenge. The primary goal of Quora Feed Ranking is to ensure that the content pushed into users’ feeds is highly relevant to their interests. Another consideration is the follow relationships and interactions between users, which Xavier calls social relevance. Timeliness matters as well: for example, questions and answers related to hot events should be pushed into users’ feeds promptly.

1. Objective: present the most interesting stories for a user at a given time

  • Interesting = topical relevance + social relevance + timeliness

  • Stories = questions + answers

2. A personalized learning-to-rank approach is mainly used.

3. Xavier confirmed that relevance-based ordering significantly increased user engagement compared with time-based ordering.

4. Challenges

  • potentially many candidate stories

  • real-time ranking

  • optimize for relevance

The basic data Quora uses for Feed Ranking is what it calls “impression logs”: records of the actions users take on the stories shown to them.

Around these basic actions, Quora defines a relevance function.

In a nutshell, a “behavior weighting function” is used to predict the user’s interest in a story. There are two alternative ways to calculate it: one is to put all the actions into a single regression model and predict the final value directly; the other is to predict the probability of each action separately (upvote, read, share, etc.) and then combine them in a weighted sum. The first is simple but less interpretable. The second makes better use of each action signal, but it needs a separate classifier for each action, which is computationally more expensive.
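The second approach can be sketched roughly as follows. This is only an illustration: the action names, the weights in `ACTION_WEIGHTS`, and the probabilities are made up, and in a real system each probability would come from its own trained classifier rather than being hard-coded.

```python
from typing import Dict

# Hypothetical weights: how much each predicted action counts toward "interest".
ACTION_WEIGHTS: Dict[str, float] = {
    "upvote": 1.0,
    "read": 0.2,
    "share": 1.5,
    "downvote": -2.0,
}


def relevance_score(action_probs: Dict[str, float],
                    weights: Dict[str, float] = ACTION_WEIGHTS) -> float:
    """Weighted sum of the predicted probability of each action."""
    return sum(weights[a] * p for a, p in action_probs.items() if a in weights)


# Stand-ins for the outputs of one classifier per action (assumed values).
story_a = {"upvote": 0.10, "read": 0.60, "share": 0.02, "downvote": 0.01}
story_b = {"upvote": 0.02, "read": 0.30, "share": 0.00, "downvote": 0.05}

# Rank candidate stories by descending relevance score.
ranked = sorted(
    [("a", story_a), ("b", story_b)],
    key=lambda item: relevance_score(item[1]),
    reverse=True,
)
```

Keeping the per-action probabilities separate is what makes this variant easier to interpret: a change in the ranking can be traced back to a specific action and its weight.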

The three main models Quora uses are as follows.

Xavier also emphasizes the importance of feature engineering. Work in this area helps a great deal in getting good ranking results, especially if features can be updated online in real time so that the system can respond to user behavior more promptly. The main features Quora uses include:

  • user (e.g. age, country, recent activity)

  • story (e.g. popularity, trendiness, quality)

  • interactions between the two (e.g. topic or author affinity)
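As a rough illustration of these three feature groups, the sketch below assembles one feature dictionary for a (user, story) pair. All class names, field names, and the 7-day activity threshold are hypothetical and chosen only for the example; they are not Quora’s actual features.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class User:
    days_since_last_visit: int
    # Hypothetical per-topic affinity in [0, 1], e.g. learned from past actions.
    topic_affinity: Dict[str, float] = field(default_factory=dict)


@dataclass
class Story:
    topics: List[str]
    popularity: float
    quality: float


def build_features(user: User, story: Story) -> Dict[str, float]:
    """Combine user, story, and user-story interaction features."""
    affinities = [user.topic_affinity.get(t, 0.0) for t in story.topics]
    return {
        # User feature: recent activity (7-day threshold is an assumption).
        "user_recently_active": 1.0 if user.days_since_last_visit <= 7 else 0.0,
        # Story features.
        "story_popularity": story.popularity,
        "story_quality": story.quality,
        # Interaction feature: strongest affinity between user and story topics.
        "max_topic_affinity": max(affinities, default=0.0),
    }


features = build_features(
    User(days_since_last_visit=2, topic_affinity={"machine-learning": 0.9}),
    Story(topics=["machine-learning", "cooking"], popularity=0.4, quality=0.8),
)
```

A dictionary like this would then be fed to whatever ranking model is in use; updating values such as the affinities online is what allows the ranking to react quickly to new user behavior.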

In the grand scheme of things, Quora’s Feed Ranking is nothing special; the approach is pretty much standard in the industry. What sets Quora apart is that its data model is more complex and its relationships more diverse than on other sites. For example, a user can follow other users, questions, and topics.

  1. Following users brings in a wider and more diverse range of information. Surprising content is likely to come from the interesting users someone follows, but this is also the most likely source of irrelevant noise. The most important work in this area is evaluating user expertise.

  2. Question/Answer is the core content element of Quora and also the force that drives the flow of knowledge through the Quora system. The main work here is guiding more highly expert users to contribute high-quality answers, and stimulating the production of more good questions (even automatically generated ones). Computing an Answer ranking also involves anti-spam work.

  3. A Topic aggregates the content on one subject. It plays an extremely important role in Quora’s information architecture and is the skeleton of the knowledge structure. Quora calls this the Topic Network, and building it is itself a very big challenge. Other problems to be solved include how to find (potentially) high-quality questions under a Topic, how to reduce low-quality questions, and how to filter or merge duplicate questions.

Quora has explored each of these core problems further.

Answer Ranking

Goal: Given a question and N answers, come up with the ideal ranking of those N answers.

Quora mainly considers the following three dimensions for Ranking calculation, each of which contains many features.

  1. The quality of the content itself. Quora has clear guidelines [2] on what constitutes a “good answer”, such as one that is fact-based, reusable, provides explanation, well formatted, etc.

  2. Interaction, including upvotes/downvotes, comments, shares, bookmarks, clicks, etc.

  3. Some characteristics of the respondent, such as the respondent’s expertise in the question area.

In addition, this work has two parts: non-personalized and personalized. For some kinds of questions the ranking is non-personalized, and the best answer is the same for all users; for other questions the ranking is personalized, and each person has their own judgment of the best answer. In short, Answer Ranking is very important for Quora and is done in great detail. There is a dedicated article about it on the Quora blog, and interested readers can go to the original [3].

Ask2Answers

A2A (Ask to Answer) is one of the most important features of Quora’s product. Quora could simply recommend questions directly to the answerers the system considers suitable, and that is what it did at first. But A2A is also a social action. One essence of social interaction is making it easy for users to show off, and casually opening an answer with “thanks for the invite” does exactly that. It is a seemingly simple feature, yet Quora has worked hard to model A2A as a machine learning problem: given a question and a viewer, rank all other users based on how “well-suited” they are, where well-suited = likelihood of the viewer sending a request + likelihood of the candidate adding a good answer. In other words, both the likelihood that the viewing user will send an invitation and the likelihood that the invitee will respond with a good answer are taken into account. There is also an article on the Quora blog detailing their approach [4].
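A minimal sketch of this ranking, assuming the two likelihoods have already been predicted by separate models. The function names, user names, and probability values are hypothetical, and the plain sum simply follows the “+” in the description above; the actual combination is described in [4].

```python
from typing import Dict, List, Tuple


def a2a_score(p_request: float, p_good_answer: float) -> float:
    """Combine the two likelihoods with a plain sum, as in the description."""
    return p_request + p_good_answer


def rank_candidates(candidates: Dict[str, Tuple[float, float]]) -> List[str]:
    """Order candidate answerers from most to least well-suited."""
    return sorted(candidates, key=lambda u: a2a_score(*candidates[u]), reverse=True)


# candidate -> (P(viewer sends a request), P(candidate adds a good answer));
# the values are assumed stand-ins for model predictions.
candidates = {
    "alice": (0.30, 0.50),
    "bob": (0.60, 0.10),
    "carol": (0.05, 0.40),
}
order = rank_candidates(candidates)
```

Note how the first term depends on the viewer as well as the candidate: someone the viewer is unlikely to ever ask ranks lower even if they would write a good answer.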

Topic Network

Quora has worked hard to get users to tag content properly, and the benefits of that sustained effort are beginning to show. They found [5]:

  1. Topics are rapidly diversifying as their user base expands.

  2. Many fields have self-organized into fairly good hierarchical knowledge structures.

Quora believes it is possible to organize domain knowledge by relying on communities.

User Trust/Expertise Inference

This is another very important thing for Quora. Quora needs to identify the experts in a field and then, through the product, guide them to contribute more quality answers in that field. Quora takes into account how many questions a user has answered in a particular area, and how many likes, clicks, thanks, shares, bookmarks, and views those answers have received. Another important aspect is the propagation of expertise. For example, if Xavier, himself an expert in recommender systems, upvotes an answer in that field, the author of the answer is likely to have a high degree of expertise in recommender systems too.
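The propagation idea can be sketched as follows for a single topic. This is a simplified illustration, not Quora’s actual method: `base` stands for an expertise score derived from a user’s own activity, and the `damping` factor and number of rounds are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def propagate_expertise(base: Dict[str, float],
                        upvotes: List[Tuple[str, str]],
                        damping: float = 0.5,
                        rounds: int = 1) -> Dict[str, float]:
    """Spread expertise from upvoters to answer authors within one topic.

    base:    user -> expertise from the user's own activity (assumed given)
    upvotes: (voter, author) pairs for answers in this topic
    """
    scores = dict(base)
    for _ in range(rounds):
        received = defaultdict(float)
        for voter, author in upvotes:
            # An upvote carries weight proportional to the voter's expertise.
            received[author] += scores.get(voter, 0.0)
        users = set(base) | set(received)
        scores = {u: base.get(u, 0.0) + damping * received[u] for u in users}
    return scores


base = {"xavier": 1.0, "newbie": 0.0, "author": 0.1}
upvotes = [("xavier", "author"), ("newbie", "other")]
scores = propagate_expertise(base, upvotes)
```

Here an upvote from the expert lifts `author` well above the baseline, while an upvote from a user with no expertise in the topic adds nothing to `other`.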

Other related problems include topic recommendation, user recommendation, related questions, duplicate questions, anti-spam, etc. Quora uses machine learning extensively to solve them.

Quora’s greatest treasure is the vast amount of valuable content it has accumulated over the years across many fields, so it is no surprise that Quora mines it. “Mapping the Discussion on Quora Over Time through Question Text” [6] is a good data mining case study. According to her profile, the author, Tao Wenwen, came from Peking University’s Yuanpei Program and is yet another student with outstanding results. I feel Zhihu could get moving on this kind of work too.

Facebook and several other topics over time

Top U.S. themes for the fourth quarter of 2014

References:

[1] http://www.slideshare.net/xamat/machine-learning-for-qa-sites-the-quora-example

[2] https://www.quora.com/What-does-a-good-answer-on-Quora-look-like-What-does-it-mean-to-be-helpful/answer/Quora-Official-Account

[3] https://engineering.quora.com/A-Machine-Learning-Approach-to-Ranking-Answers-on-Quora

[4] https://engineering.quora.com/Ask-To-Answer-as-a-Machine-Learning-Problem

[5] https://data.quora.com/The-Quora-Topic-Network-1

[6] https://data.quora.com/Mapping-the-Discussion-on-Quora-Over-Time-through-Question-Text
