Redis: Storing and Analyzing Merchant Information
- Process the Meituan merchant information dataset, clean the data, and store it in a Redis database
- Read and analyze the data to extract its value and support the operations strategy
Install the Redis client module
pip install redis
Storing the data
Data initialization
import pandas as pd
df = pd.read_csv('CSV', encoding='UTF-8')  # 'CSV' stands for the path to the data file
df = df.dropna().drop(['id'], axis=1)  # drop incomplete rows and the 'id' column
Database connection
import redis
pool = redis.ConnectionPool(host='127.0.0.1', port=6379, decode_responses=True, encoding='UTF-8')
r = redis.StrictRedis(connection_pool=pool)  # connect to Redis
If necessary, empty the Redis database first, for example with r.flushdb(). Warning: this deletes all data in the current database!
Write data to the database
The information for each store will be written to Redis as a List
for a in zip(df['name'], df['store ID'], df['score'], df['address'], df['comments'], df['average']):
    r.lpush(a[0], a[1], a[2], a[3], a[4], a[5])
    print(r.lrange(a[0], 0, -1))
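One detail matters for the read step: LPUSH inserts each value at the head of the list, so LRANGE returns the fields in the reverse of the order they were pushed. The sketch below imitates this with a collections.deque instead of a real Redis server. The helper names lpush and lrange_all and the sample values are made up for illustration.

```python
from collections import deque

def lpush(store, key, *values):
    """Imitate Redis LPUSH: each value is inserted at the head of the list."""
    lst = store.setdefault(key, deque())
    for value in values:
        lst.appendleft(value)
    return len(lst)

def lrange_all(store, key):
    """Imitate LRANGE key 0 -1: return the whole list from head to tail."""
    return list(store.get(key, []))

store = {}  # stands in for the Redis database
# Hypothetical sample row: store ID, score, address, comments, average
length = lpush(store, "Sample Shop", "1001", "4.8", "1 Example Road", "3200", "45")
fields = lrange_all(store, "Sample Shop")
print(fields)  # ['45', '3200', '1 Example Road', '4.8', '1001']
```

Index 0 therefore holds the average spend and index 4 the store ID. Running the write loop a second time would push the same five fields onto each list again.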
Data analysis
Process the data and extract its value to support the operations strategy
View all keys
keys = r.keys()
print(keys)
Read all data
df=pd.DataFrame()
keys = r.keys()
id=[]
score=[]
dir=[]
num=[]
price=[]
for key in keys:
    key_list = r.lrange(key,0,-1)
    id.append(key_list[4])
    score.append(key_list[3])
    dir.append(key_list[2])
    num.append(key_list[1])
    price.append(key_list[0])
df['name']=keys  # store name
df['id']=id  # store ID
df['score']=score  # rating
df['dir']=dir  # merchant address
df['num']=num  # number of comments
df['price']=price  # average spend
df
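The five parallel lists can also be replaced by one record per store. The following is a minimal sketch of that idea, not the author's code. A plain dict stands in for what r.lrange(key, 0, -1) returns for each key, and the name to_records and the sample data are hypothetical. The field order assumes the values were pushed with LPUSH as store ID, score, address, comments, average, so they come back reversed.

```python
# Order of the fields as LRANGE returns them after the LPUSH write
FIELDS = ("price", "num", "dir", "score", "id")

def to_records(lists_by_key):
    """Turn {store name: [price, num, dir, score, id]} into a list of dicts."""
    records = []
    for name, values in lists_by_key.items():
        if len(values) != len(FIELDS):
            continue  # skip keys that do not hold exactly one store's fields
        record = {"name": name}
        record.update(zip(FIELDS, values))
        records.append(record)
    return records

# Hypothetical data standing in for the lists read from Redis
sample = {
    "Sample Shop": ["45", "3200", "1 Example Road", "4.8", "1001"],
    "Broken Key": ["only", "two"],
}
records = to_records(sample)
print(records)
```

Passing the result to pd.DataFrame(records) gives the same columns as the loop above, and the length check skips any key that does not hold exactly five fields.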
Highly rated merchants
Merchants with a rating above 4.5
df['num']=df['num'].astype('int')
df['price']=df['price'].astype('double')
df['score']=df['score'].astype('double')
df1=df[df['num']>=0]
df2=df1[df1['price']>0]
df3=df2[df2['score']>4.5]
df4=df3[df3['score']<=5]
df4.index = range(len(df4))
df4
Popular merchants
Merchants with at least 3000 reviews
df['num']=df['num'].astype('int')
df['price']=df['price'].astype('double')
df['score']=df['score'].astype('double')
df1=df[df['num']>=3000]
df2=df1[df1['price']>0]
df3=df2[df2['score']>0]
df4=df3[df3['score']<=5]
df4.index = range(len(df4))
df4
Top merchants
Merchants that meet both the popular and the highly rated criteria
df['num']=df['num'].astype('int')
df['price']=df['price'].astype('double')
df['score']=df['score'].astype('double')
df1=df[df['num']>=3000]
df2=df1[df1['price']>0]
df3=df2[df2['score']>4.5]
df4=df3[df3['score']<=5]
df4.index = range(len(df4))
df4
Inexpensive restaurants
Average spend per person between 5 and 50 yuan
df['num']=df['num'].astype('int')
df['price']=df['price'].astype('double')
df['score']=df['score'].astype('double')
df1=df[df['num']>=0]
df2=df1[df1['price']>5]
df3=df2[df2['score']>0]
df4=df3[df3['score']<=5]
df5=df4[df4['price']<50]
df5.index = range(len(df5))
df5
High-end restaurants
Average spend per person above 200 yuan
df['num']=df['num'].astype('int')
df['price']=df['price'].astype('double')
df['score']=df['score'].astype('double')
df1=df[df['num']>=0]
df2=df1[df1['price']>200]
df3=df2[df2['score']>0]
df4=df3[df3['score']<=5]
df4.index = range(len(df4))
df4
Top-rated high-end restaurants
Average spend per person above 200 yuan and a rating above 4.5
df['num']=df['num'].astype('int')
df['price']=df['price'].astype('double')
df['score']=df['score'].astype('double')
df1=df[df['num']>=0]
df2=df1[df1['price']>200]
df3=df2[df2['score']>4.5]
df4=df3[df3['score']<=5]
df4.index = range(len(df4))
df4
Merchants with a perfect score
Stores with a rating of 5.0 and at least 1000 reviews
df['num']=df['num'].astype('int')
df['price']=df['price'].astype('double')
df['score']=df['score'].astype('double')
df1=df[df['num']>=1000]
df2=df1[df1['price']>0]
df3=df2[df2['score']==5]
df3.index = range(len(df3))
df3
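All of the filters above follow the same pattern: convert the three numeric columns, then narrow the frame step by step. As a sketch only, the conditions could be combined into one boolean mask inside a helper. The function filter_merchants, its parameter names, and the sample rows are assumptions for illustration. pd.to_numeric with errors='coerce' is used here in place of astype, so that unparsable strings become NaN and drop out of the result instead of raising an error.

```python
import pandas as pd

def filter_merchants(df, min_num=0, min_price=0, max_price=None,
                     min_score=0, max_score=5):
    """Return rows with num >= min_num, min_price < price (< max_price),
    and min_score < score <= max_score, with a fresh 0..n-1 index."""
    out = df.copy()
    for col in ("num", "price", "score"):
        # Values read from Redis are strings; invalid ones become NaN
        out[col] = pd.to_numeric(out[col], errors="coerce")
    mask = (
        (out["num"] >= min_num)
        & (out["price"] > min_price)
        & (out["score"] > min_score)
        & (out["score"] <= max_score)
    )
    if max_price is not None:
        mask &= out["price"] < max_price
    return out[mask].reset_index(drop=True)

# Hypothetical sample rows, stored as strings like the data read from Redis
sample = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "num": ["3500", "120", "4000", "50"],
    "price": ["45", "260", "300", "30"],
    "score": ["4.8", "4.9", "4.2", "n/a"],
})
top = filter_merchants(sample, min_num=3000, min_score=4.5)
cheap = filter_merchants(sample, min_price=5, max_price=50)
print(top["name"].tolist())    # ['A']
print(cheap["name"].tolist())  # ['A']
```

With such a helper, each category becomes a single call with different thresholds. The exact match on a rating of 5 in the last filter would need an extra parameter, which is left out of this sketch.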