Preface

The text and images in this article come from the internet and are intended for learning and discussion only, with no commercial purpose. If you have any questions, please contact us.

Basic Environment Configuration

  • Python 3.6
  • PyCharm
  • requests
  • parsel

The required modules can be installed with pip.
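
Before running the crawler, it can help to confirm that the third-party modules are importable. The following is a minimal sketch, not part of the original article; the helper name `missing_modules` is hypothetical, and it only checks top-level module names.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# The two third-party packages listed in the environment section.
required = ['requests', 'parsel']
missing = missing_modules(required)

if missing:
    print('Install with: pip install ' + ' '.join(missing))
else:
    print('All required modules are available.')
```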

![](https://p1-tt-ipv6.byteimg.com/large/pgc-image/3e80663d28c944e68478987beb13a023)

    '''
    action games:  http://www.4399.com/flash_fl/2_1.htm
    sports games:  http://www.4399.com/flash_fl/3_1.htm
    puzzle games:  http://www.4399.com/flash_fl/5_1.htm
    shooter games: http://www.4399.com/flash_fl/4_1.htm
    ...
    '''

![](https://p6-tt-ipv6.byteimg.com/large/pgc-image/cb6a11d95ff74654973a0b45c83e84e5)
![](https://p9-tt-ipv6.byteimg.com/large/pgc-image/2a6afefbcfa44c1a90583441d31aff78)
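
The category addresses listed above share one pattern: a category number, an underscore, and a page number. A minimal sketch of building the listing-page URLs from that pattern follows. The names `URL_TEMPLATE`, `CATEGORIES`, and `page_urls` are hypothetical, the category-to-number mapping is read off the list above, and it is an assumption that every category paginates the same way the puzzle category does.

```python
# Pattern inferred from the category URLs; the placeholder names are illustrative.
URL_TEMPLATE = 'http://www.4399.com/flash_fl/{category}_{page}.htm'

# Category numbers as they appear in the URLs listed in the article (assumed mapping).
CATEGORIES = {'action': 2, 'sports': 3, 'shooter': 4, 'puzzle': 5}


def page_urls(category, first_page, last_page):
    """Return listing-page URLs for one category, first_page to last_page inclusive."""
    category_id = CATEGORIES[category]
    return [
        URL_TEMPLATE.format(category=category_id, page=page)
        for page in range(first_page, last_page + 1)
    ]


for url in page_urls('puzzle', 1, 3):
    print(url)
```

Keeping the pattern in one template makes it easy to switch categories without editing the crawl loop itself.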

    import csv
    from urllib.parse import urljoin

    import parsel
    import requests

    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 '
                      '(KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36'
    }

    # utf-8-sig writes a BOM so that Excel detects the encoding correctly.
    with open('4455 miniclip.csv', mode='a', encoding='utf-8-sig', newline='') as f:
        csv_writer = csv.DictWriter(f, fieldnames=['game address', 'game name'])
        csv_writer.writeheader()
        # Pages 1 to 105 of the puzzle category.
        for page in range(1, 106):
            url = 'http://www.4399.com/flash_fl/5_{}.htm'.format(page)
            response = requests.get(url=url, headers=headers)
            # Let requests guess the page encoding so the titles decode properly.
            response.encoding = response.apparent_encoding
            selector = parsel.Selector(response.text)
            # Each game is an <li> inside the element with id "classic".
            for li in selector.css('#classic li'):
                data_url = li.css('a::attr(href)').get()
                if data_url is None:
                    continue  # skip list items that contain no link
                # urljoin handles both relative and absolute hrefs.
                new_url = urljoin(url, data_url)
                title = li.css('img::attr(alt)').get()
                dit = {'game address': new_url, 'game name': title}
                print(new_url, title)
                csv_writer.writerow(dit)

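
The `href` values on a listing page may be root-relative, protocol-relative, or already absolute, so the game address has to be normalized before it is saved. A minimal sketch of one way to do this with the standard library's `urllib.parse.urljoin` follows. The helper name `absolute_link` and the sample hrefs are hypothetical and are not taken from the site.

```python
from urllib.parse import urljoin

# The listing page the links were found on (first puzzle page from the article).
PAGE_URL = 'http://www.4399.com/flash_fl/5_1.htm'


def absolute_link(page_url, href):
    """Resolve an href found on page_url into an absolute URL."""
    return urljoin(page_url, href)


# Hypothetical hrefs in the three forms a page might use.
sample_hrefs = [
    '/flash/12345.htm',                     # root-relative
    '//www.4399.com/flash/12345.htm',       # protocol-relative
    'http://www.4399.com/flash/12345.htm',  # already absolute
]

for href in sample_hrefs:
    print(absolute_link(PAGE_URL, href))
```

All three forms resolve to the same address, which avoids building a malformed URL when a link is already absolute.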
![](https://p9-tt-ipv6.byteimg.com/large/pgc-image/a7bbf0f8df7f4c398a458c44664f643c)

That is a lot of data. In total, 32,548 records were stored here.
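
To check a record count like this, the saved CSV can be read back and counted. The sketch below is an illustration only: the helper name `count_records` is hypothetical, the sample rows are made up, and it assumes the column header `game address` used when writing the file.

```python
import csv
import io


def count_records(lines, key='game address'):
    """Return (total rows, rows with a distinct value in the key column)."""
    reader = csv.DictReader(lines)
    total = 0
    seen = set()
    for row in reader:
        total += 1
        seen.add(row[key])
    return total, len(seen)


# Made-up sample data standing in for the real CSV file.
sample = io.StringIO(
    'game address,game name\n'
    'http://www.4399.com/flash/1.htm,Game A\n'
    'http://www.4399.com/flash/2.htm,Game B\n'
    'http://www.4399.com/flash/1.htm,Game A\n'
)

total, unique = count_records(sample)
print(total, unique)
```

For the real file, pass an open file object instead of the sample, for example one opened with `encoding='utf-8-sig'` and `newline=''` to match how it was written.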