Based on the Beijing listings data released by Airbnb on April 17, 2019, this article analyzes listing information and listing popularity ("heat") in Beijing, presents descriptive statistics and a regression model built around district, price and popularity, explores customer preferences, and offers suggestions for Airbnb's operating strategy in Beijing.
01 Background
Shared accommodation was one of the first sectors through which the sharing economy entered China. Domestic short-term homestay booking platforms were launched at the end of 2011. After nearly eight years of development, the domestic short-term homestay market has settled into a structure in which Airbnb, Muniao and Tujia form the first tier and Zhenguo is a newer entrant. Founded in the United States, Airbnb focused its development on countries and regions other than China before 2017. As of August 2019, Airbnb's supply covered more than 191 countries with more than 6 million listings, of which only 150,000 were in China.
In recent years, China's shared accommodation industry has developed rapidly. As short-term rental has become more standardized and better regulated, user demand has gradually awakened: expectations have moved from entry-level homestays that merely allow laundry and cooking to higher requirements for decoration style and facility quality. The purpose of this analysis is therefore to examine the characteristics of Airbnb's listings, users' preferences, and the revenue that listings bring to the platform.
02 Data Description
The data used in this article are the Beijing listings released by Airbnb on April 17, 2019. There are 28,452 listings in total. Each record includes the listing ID and name, the host ID and name, the district, the location (longitude and latitude), the room type, the price, the number of reviews, the date of the last review, the average number of reviews per month, the number of listings the host has on the platform, the available rental time, and other information.
Define analysis objectives and assumptions
- Analyze the overall data and describe the housing characteristics of each district;
- Explore customer preferences, mainly through text analysis and user reviews;
- Identify niche areas for growth.
- Assumption: users review a listing after completing an order, so the number of reviews reflects the number of orders for that listing.
I. Descriptive statistical analysis
Statistical analysis of distribution by region
1. Regional distribution of housing supply quantity
Count the number of listings in each district of Beijing and plot the result.
Chaoyang District has the most listings, more than three times as many as Dongcheng District. Every district other than Chaoyang has fewer than 3,500 listings, and Shijingshan District, Mentougou District and Pinggu District have the fewest.
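The per-district count described above can be sketched in a few lines. This is only an illustration under stated assumptions: the field name `neighbourhood` and the toy rows are hypothetical and are not taken from the released file.

```python
from collections import Counter

# Toy rows standing in for the listings file. The field name
# "neighbourhood" is an assumption; all values are made up.
listings = [
    {"id": 1, "neighbourhood": "Chaoyang"},
    {"id": 2, "neighbourhood": "Chaoyang"},
    {"id": 3, "neighbourhood": "Chaoyang"},
    {"id": 4, "neighbourhood": "Dongcheng"},
    {"id": 5, "neighbourhood": "Haidian"},
    {"id": 6, "neighbourhood": "Haidian"},
]


def listings_per_district(rows):
    """Return (district, count) pairs sorted from most to fewest listings."""
    counts = Counter(row["neighbourhood"] for row in rows)
    return counts.most_common()


district_counts = listings_per_district(listings)
for district, n in district_counts:
    print(f"{district:<10} {n}")
```

The sorted pairs can be passed directly to a bar chart to reproduce the kind of figure the text refers to.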
2. Regional distribution of prices
Draw box plots of price by district.
The raw plot contains too many outliers to read, so the box plot is redrawn for listings priced below 4,000 yuan to show the regional distribution of prices. Prices in Huairou District, Pinggu District, Yanqing County and Miyun County are significantly higher than in other areas, while prices in Chaoyang and Haidian Districts are not high overall despite their large number of listings. Dongcheng District, which also has many listings, is priced higher than Chaoyang and Haidian.
Price distributions differ clearly by district. Prices in Chaoyang and Haidian Districts are concentrated below 1,000 yuan, while prices in Huairou District are more dispersed, with a high proportion of listings above 2,000 yuan.
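A box plot is a picture of per-group quartiles, so the effect of the 4,000 yuan cap can be illustrated by computing those quartiles directly. The sketch below is a minimal example, not the author's code: the field names and all prices are hypothetical.

```python
import statistics
from collections import defaultdict

PRICE_CAP = 4000  # yuan; the cap used in the text for the second plot

# Toy rows; field names and values are assumptions for illustration.
rows = [
    {"neighbourhood": "Chaoyang", "price": 300},
    {"neighbourhood": "Chaoyang", "price": 400},
    {"neighbourhood": "Chaoyang", "price": 500},
    {"neighbourhood": "Chaoyang", "price": 68000},  # extreme outlier
    {"neighbourhood": "Huairou", "price": 1200},
    {"neighbourhood": "Huairou", "price": 2600},
    {"neighbourhood": "Huairou", "price": 3800},
]


def price_summary(rows, cap):
    """Per-district price quartiles, ignoring listings priced at or above `cap`."""
    by_district = defaultdict(list)
    for row in rows:
        if row["price"] < cap:
            by_district[row["neighbourhood"]].append(row["price"])

    summary = {}
    for district, prices in by_district.items():
        if len(prices) < 2:
            continue  # statistics.quantiles needs at least two values
        q1, median, q3 = statistics.quantiles(prices, n=4, method="inclusive")
        summary[district] = {"n": len(prices), "q1": q1, "median": median, "q3": q3}
    return summary


summary = price_summary(rows, PRICE_CAP)
for district, stats in summary.items():
    print(district, stats)
```

With the cap applied, the single extreme listing no longer stretches the Chaoyang group, which is the reason the text gives for redrawing the plot.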
3. Popularity by district
In this analysis, the number of user reviews is used as a measure of a listing's popularity. The number of reviews is tallied by district and shown as a box plot.
This plot also contains many outliers, so the number of reviews is capped at 50 to give a clearer picture of popularity.
Districts with many listings, such as Haidian, Dongcheng and Chaoyang, are clearly more popular. In particular, although Dongcheng District has fewer listings than Chaoyang District, its listings are more popular than those in Haidian and Chaoyang.
4. Proportion of housing types by region
In most districts, the proportion of entire homes is higher than that of private rooms, and the proportion of shared rooms is the lowest. However, in Haidian District, Miyun County, Huairou District, Yanqing County and Changping District the ratio of entire homes to private rooms is close to 1:1, which clearly differs from the pattern in the other districts.
From the analysis so far, the supply of listings in Beijing shows the following regional characteristics:
- Chaoyang District, Dongcheng District, Haidian District, etc.: a large number of listings at low prices, mostly entire homes, apartments and private rooms, with some shared rooms;
- Huairou District, Miyun County, Yanqing County, etc.: few listings at high prices, with a ratio of entire homes to private rooms close to 1:1.
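The room-type proportions discussed in this subsection amount to a normalized cross-tabulation of district by room type. The following sketch shows one way to compute it; the room-type labels and the toy rows are assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Toy (district, room_type) pairs; labels and values are made up.
rows = [
    ("Chaoyang", "Entire home/apt"),
    ("Chaoyang", "Entire home/apt"),
    ("Chaoyang", "Entire home/apt"),
    ("Chaoyang", "Private room"),
    ("Miyun", "Entire home/apt"),
    ("Miyun", "Entire home/apt"),
    ("Miyun", "Private room"),
    ("Miyun", "Private room"),
]


def room_type_shares(rows):
    """For each district, return the share of listings of each room type."""
    counts = defaultdict(Counter)
    for district, room_type in rows:
        counts[district][room_type] += 1

    shares = {}
    for district, counter in counts.items():
        total = sum(counter.values())
        shares[district] = {rt: n / total for rt, n in counter.items()}
    return shares


shares = room_type_shares(rows)
for district, dist in shares.items():
    print(district, {rt: round(s, 2) for rt, s in dist.items()})
```

In the toy data, the second district has equal shares of entire homes and private rooms, which is the 1:1 pattern the text describes for the outer districts.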
Statistical analysis by price
1. Popularity and price
Count the number of reviews in each price range, as shown in the figure below.
On the whole, popularity decreases as price increases. However, popularity is relatively high in the 2001-2500 price range, which may indicate that a particular type of listing is concentrated in this range. To investigate, the word clouds of the segmented listing names in the 1001-2000 and 2001-3000 price ranges are compared:
The word cloud of listing names in the 1001-2000 range shows the main characteristics of listings at this price. The most common listing types are apartments, small courtyards and siheyuan (quadrangle courtyards). Major Beijing attractions such as the Forbidden City, Tiananmen and Nanluoguxiang also appear frequently, as do words such as "warm" and "luxurious". Listings in this price range presumably emphasize distinctly Beijing features in order to attract tourists visiting the city.
The word cloud for the 2001-3000 range shows a different combination. In terms of listing type, siheyuan, small courtyard and villa are the most frequent words, followed by family, homestone, home party and so on, while words about proximity to the subway and tourist attractions are less frequent. It can be speculated that most users at this price want to rent a place suitable for parties, and that they may not be tourists from elsewhere but local Beijing users.
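The review counts per price range that open this subsection can be sketched as a simple binning step. The bucket width of 500 yuan is inferred from the "2001-2500" range mentioned in the text; the function names and the toy data are hypothetical.

```python
from collections import defaultdict


def price_bucket(price, width=500):
    """Label the bucket a price falls in, e.g. 2300 -> '2001-2500'.

    Assumes price >= 1.
    """
    upper = ((int(price) - 1) // width + 1) * width
    lower = upper - width + 1
    return f"{lower}-{upper}"


def reviews_by_bucket(rows, width=500):
    """Return {bucket: (number of listings, total reviews)}."""
    totals = defaultdict(lambda: [0, 0])
    for price, reviews in rows:
        bucket = totals[price_bucket(price, width)]
        bucket[0] += 1
        bucket[1] += reviews
    return {label: tuple(pair) for label, pair in totals.items()}


# Toy (price, number_of_reviews) pairs; all values are made up.
rows = [(300, 40), (450, 20), (800, 10), (2300, 12), (2800, 2)]
buckets = reviews_by_bucket(rows)
for label, (n_listings, n_reviews) in buckets.items():
    print(f"{label:>10}  listings={n_listings}  reviews={n_reviews}")
```

Keeping the listing count next to the review total makes it possible to compare either total or average reviews per bucket, since the text does not say which of the two the figure plots.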
2. Regional distribution of houses priced below and above 2,000 yuan
The analysis above suggests that 2,000 yuan is a reasonable boundary between short-term rental listings serving different purposes. To see where these two groups of listings are located, the following statistics were computed.
The regional distribution clearly differs between the two price ranges. About 50% of listings priced below 2,000 yuan are located in Chaoyang and Haidian Districts, while listings priced above 2,000 yuan are concentrated in Huairou District, Changping District, Yanqing County and Miyun County. This is consistent with the earlier results: Chaoyang and Haidian are in the central urban area, close to tourist attractions, and most of their listings are apartments. Huairou District, Changping District and Miyun County are far from the city center, and most of their listings are villas, which are suitable for parties and team-building events and command relatively high rents.
3. Revenue analysis of listings priced below and above 2,000 yuan
Assuming that the number of reviews equals the number of orders, revenue = price × number of orders.
| | Houses under 2000 | Houses above 2000 |
|---|---|---|
| Average price (yuan) | 444.74 | 4972.14 |
| Proportion of houses | 0.96 | 0.04 |
| Proportion of orders | 0.99 | 0.01 |
| Proportion of total revenue | 0.89 | 0.11 |
Listings priced above 2,000 yuan make up only 4% of listings and 1% of orders, yet they generate 11% of revenue. Villa rentals may therefore have room to grow.
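The rows of the table above follow from the stated rule, revenue = price × number of orders, with reviews standing in for orders. The sketch below reproduces the calculation on toy data; the treatment of a price of exactly 2,000 yuan (placed in the lower segment) is an assumption, and all values are made up.

```python
# Toy (price, number_of_reviews) pairs; all values are made up.
listings = [(400, 50), (500, 30), (300, 20), (5000, 2)]
BOUNDARY = 2000  # yuan; prices above this go to the "above" segment (assumption)


def segment_summary(rows, boundary):
    """Average price and shares of listings, orders and revenue per segment."""
    segments = {"under": [], "above": []}
    for price, reviews in rows:
        key = "above" if price > boundary else "under"
        segments[key].append((price, reviews))

    total_listings = len(rows)
    total_orders = sum(reviews for _, reviews in rows)
    total_revenue = sum(price * reviews for price, reviews in rows)

    summary = {}
    for name, seg in segments.items():
        orders = sum(reviews for _, reviews in seg)
        revenue = sum(price * reviews for price, reviews in seg)
        summary[name] = {
            "avg_price": sum(p for p, _ in seg) / len(seg) if seg else 0.0,
            "listing_share": len(seg) / total_listings,
            "order_share": orders / total_orders,
            "revenue_share": revenue / total_revenue,
        }
    return summary


summary = segment_summary(listings, BOUNDARY)
for name, stats in summary.items():
    print(name, {k: round(v, 2) for k, v in stats.items()})
```

Even in this small example, the expensive segment's revenue share is far larger than its order share, which is the same effect the table reports for the real data.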
User preference analysis
The number of reviews reflects how popular a listing has been since its host began short-term renting. To analyze user preferences, word frequencies were computed over the descriptions (i.e., the name field) of the 2,000 listings with the most reviews, and the following word cloud was drawn:
Apartment-style listings are popular, and popular listings tend to be close to the subway. Descriptions such as "warm", "sunny" and "home" appear most frequently among popular listings, followed by Sanlitun, Tiananmen, the Forbidden City, Nanluoguxiang, Taikoo Li and other popular tourist spots. Presumably, the guests of popular listings (those with many reviews) are mostly tourists, who choose locations close to the subway and other public transportation and near popular attractions. When traveling, these guests value the comfort and warmth of a short-term rental, which is also what distinguishes them from guests who choose hotels.
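The word-frequency step behind the word cloud can be sketched as follows. This is a simplified illustration: the toy names are in English and are split with a regular expression, whereas real Chinese listing names would need a word segmenter; the field names, stopword list and data are all hypothetical.

```python
import re
from collections import Counter

# Toy listings; field names and values are assumptions for illustration.
listings = [
    {"name": "Warm apartment near subway", "number_of_reviews": 120},
    {"name": "Sunny apartment near Sanlitun", "number_of_reviews": 95},
    {"name": "Courtyard villa for parties", "number_of_reviews": 3},
    {"name": "Cozy apartment, subway 5 min", "number_of_reviews": 80},
]

STOPWORDS = frozenset({"near", "for", "min"})


def top_words(rows, top_n_listings, top_k_words, stopwords=STOPWORDS):
    """Most frequent name words among the listings with the most reviews."""
    ranked = sorted(rows, key=lambda r: r["number_of_reviews"], reverse=True)
    counter = Counter()
    for row in ranked[:top_n_listings]:
        # Simplified tokenizer: lowercase alphabetic runs only.
        tokens = re.findall(r"[a-z]+", row["name"].lower())
        counter.update(t for t in tokens if t not in stopwords)
    return counter.most_common(top_k_words)


words = top_words(listings, top_n_listings=3, top_k_words=2)
print(words)
```

The resulting (word, count) pairs are the input a word-cloud library would typically take; the text uses the top 2,000 listings rather than the top 3 used here.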
Regression model
To further understand how listing popularity relates to price, district and other factors, the number of reviews is used as the measure of popularity and taken as the dependent variable; district, price, room type and the number of listings owned by the host are the independent variables. A linear regression was fitted to the data, and the regression coefficients and significance tests of the retained variables are shown in the following table. The p-value of the final model is less than 0.001 and the adjusted R-squared is 0.057, so the model is statistically significant but explains only a small share of the variation in the number of reviews.
According to the regression coefficients, the following conclusions hold when other factors are held constant:
- District: consistent with the box plots, Daxing District, Miyun County, Pinggu District, Yanqing County, Huairou District, Fangshan District, Changping District, etc. are less popular than Shunyi District (the comparison baseline), while Dongcheng District, Fengtai District, Chaoyang District, Haidian District and Xicheng District are more popular than Shunyi District;
- Room type: the number of reviews for entire homes and private rooms is 1.397 and 2.36 times that of shared rooms, respectively;
- Price: negatively correlated with the number of reviews; the higher the price, the fewer the reviews;
- Number of listings owned by the host: negatively correlated with the number of reviews.
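To make the regression setup concrete, here is a minimal ordinary least squares sketch using NumPy. It is not the author's model: it keeps only price and a single room-type dummy, omits the district dummies and host listing count, and uses noiseless made-up data so that the fitted coefficients are known in advance.

```python
import numpy as np

# Toy data generated without noise from
#   reviews = 20 - 0.005 * price + 5 * is_entire_home
# All values are made up for illustration.
price = np.array([200.0, 400.0, 1000.0, 2000.0, 3000.0])
is_entire_home = np.array([0.0, 1.0, 0.0, 1.0, 0.0])  # dummy for room type
reviews = np.array([19.0, 23.0, 15.0, 15.0, 5.0])

# Design matrix: intercept column, price, room-type dummy.
X = np.column_stack([np.ones_like(price), price, is_entire_home])

# Ordinary least squares via NumPy's least-squares solver.
coef, _, _, _ = np.linalg.lstsq(X, reviews, rcond=None)
intercept, price_coef, entire_coef = coef

print(f"intercept={intercept:.3f}")
print(f"price={price_coef:.4f}")
print(f"entire_home={entire_coef:.3f}")
```

A categorical variable such as district would be encoded the same way, with one dummy column per district except the baseline, so each coefficient is read relative to that baseline.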
3. Analysis conclusion
From the descriptive statistics, text analysis and regression analysis above, we can see that, beyond meeting general short-term rental demand, Airbnb's supply in Beijing also places emphasis on quality and refinement.
- Haidian District, Chaoyang District and other central urban areas are popular choices for tourists. Prices in these areas are generally around 500 yuan, and the main demand is proximity to public transportation and tourist attractions. These areas have a large number of listings, mostly private rooms, with a large order volume. In Huairou District, Miyun County and similar areas, short-term rental prices are significantly higher, most listings are entire homes, and the order volume is relatively small.
- Text analysis of the descriptions of popular short-term rental listings reveals clear preferences: proximity to public transportation and tourist attractions, and a sunny, warm, homelike style. The analysis of order volume across price ranges also shows clear differences between price levels. Short-term rental listings in Beijing fall broadly into two types. The first is apartments, which may be popular with tourists from outside Beijing; they are usually priced below 1,000 yuan and are close to public transportation and tourist attractions. The second is villas priced above 2,000 yuan, used mainly for parties and other group activities and located mainly in Huairou, Miyun and similar areas.
- Airbnb has been expanding its presence in the domestic market in recent years, mainly targeting high-end listings. The data show that listings priced above 2,000 yuan account for only 4% of listings and 1% of order volume but 11% of revenue. Moreover, listings in the 2,000 to 2,500 yuan range show a clear rise in popularity, which supports the feasibility of a strategy focused on mid-range and high-end listings. In addition, given the fierce competition among the major short-term rental platforms in recent years, continuing to target high-end listings may be a feasible strategy for Airbnb.