Java multithreading + resumable downloads (breakpoint continuation): downloading network resources

Preface

When I started on this, the goal was to use Java to scrape the download addresses of shows and cache them locally to watch at leisure (a half-terabyte disk has to be filled with something; leaving it empty would be a waste)

I also had to consider some common problems, such as a sudden power cut or network drop: a file that is only partly downloaded is ruined and has to be started again from scratch, so a resume-from-breakpoint function is a must

The initial design was to download file fragments to the local machine and store them as files such as 1.temp, 2.temp, 3.temp… and, once everything had been downloaded, merge them into the xx.mp4 we need

During development I read a lot of material and found that Java can read and write file data at a given offset. So the idea became: first generate a local file of the same size as the target, then let multiple threads each update their own region of that file

Background knowledge

java.net.URL

Construct a URL object by passing an HTTP resource link to new URL(String)

Example

URL resource = new URL("https://www.wahaotu.com/uploads/allimg/201904/1555074510295049.jpg");
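A URL object can also hand back the parts of the link, which is one way to derive a local file name. The following is a minimal sketch rather than the approach used in the finished code further down; the class and method names (UrlFileNameSketch, fileNameOf) are made up for illustration. It assumes the file name is simply the last segment of the URL path.

```java
import java.net.MalformedURLException;
import java.net.URL;

class UrlFileNameSketch {
    // Returns the last segment of the URL path. getPath() leaves out the
    // query string, so "a.jpg?x=1" still yields "a.jpg".
    static String fileNameOf(String link) throws MalformedURLException {
        String path = new URL(link).getPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) throws MalformedURLException {
        String link = "https://www.wahaotu.com/uploads/allimg/201904/1555074510295049.jpg";
        System.out.println(fileNameOf(link));
    }
}
```

Using getPath() rather than splitting the whole link on "/" keeps a trailing query string out of the file name.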

HttpURLConnection

Used to connect to the resource that the URL points to

The main points used are:

Open the resource connection

HttpURLConnection conn = (HttpURLConnection) url.openConnection();

openConnection() is declared to return a URLConnection, so we need to cast the result to the more specific subclass HttpURLConnection

Set the request header

conn.setRequestProperty(key, value);

We are simulating an HTTP request through the URL. To make the request look more natural, we add a User-Agent header

Since we request the resource from the server in segments, we need to add the Range request header with a value in the format bytes=start-end. Both ends are inclusive, so the range covers [start, end]. The start and end positions of each segment have to be computed in advance, and the header must be set before the input stream is retrieved
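The Range header can be sketched as follows. This is a minimal illustration, not the code used later in this post: RangeRequestSketch, rangeHeader and openRange are hypothetical names, and the User-Agent value is a placeholder. A server that honors the header normally answers with status 206 (HttpURLConnection.HTTP_PARTIAL); one that ignores it sends the whole resource with status 200.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

class RangeRequestSketch {
    // Builds the header value for the inclusive byte range [start, end].
    static String rangeHeader(long start, long end) {
        if (start < 0 || end < start) {
            throw new IllegalArgumentException("invalid range: " + start + "-" + end);
        }
        return "bytes=" + start + "-" + end;
    }

    // Prepares a connection that asks only for bytes [start, end].
    // Nothing is sent yet; headers must be set before getInputStream() is called.
    static HttpURLConnection openRange(String link, long start, long end) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(link).openConnection();
        conn.setRequestProperty("Range", rangeHeader(start, end));
        conn.setRequestProperty("User-Agent", "Mozilla/5.0"); // placeholder value
        return conn;
    }

    public static void main(String[] args) {
        System.out.println(rangeHeader(0, 9));   // the first 10 bytes
        System.out.println(rangeHeader(10, 19)); // the next 10 bytes
    }
}
```

Because both ends are inclusive, consecutive segments must not share a boundary: the segment after 0-9 starts at 10, not at 9.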

Get the resource input stream

conn.getInputStream()

Once everything is set up, we can get the input stream and save the file with ordinary byte-stream IO

Close the connection

conn.disconnect()

The same goes for the resource streams: they need to be closed manually to avoid wasting resources. You can also use the try-with-resources statement introduced in JDK 7 to close the file output stream automatically
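As an illustration of try-with-resources, here is a small sketch that copies an input stream to a file. The names StreamCopySketch and copyToFile are invented for this example; in a download the input would come from conn.getInputStream(), while the demo uses an in-memory stream so that it runs without a network.

```java
import java.io.ByteArrayInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

class StreamCopySketch {
    // Copies everything from 'in' to the target file and returns the byte count.
    // Both streams are closed automatically, even if an exception is thrown.
    static long copyToFile(InputStream in, Path target) throws IOException {
        long total = 0;
        byte[] buffer = new byte[8 * 1024];
        try (InputStream source = in;
             OutputStream out = new FileOutputStream(target.toFile())) {
            int read;
            while ((read = source.read(buffer)) != -1) {
                out.write(buffer, 0, read);
                total += read;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempFile("copy-sketch", ".bin");
        long copied = copyToFile(new ByteArrayInputStream(new byte[] {1, 2, 3}), target);
        System.out.println("copied " + copied + " bytes");
        Files.delete(target);
    }
}
```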

RandomAccessFile

This is a class for random-access reading and writing of files. Online explanations often describe it as a wrapper around InputStream and OutputStream; strictly speaking it extends neither and instead implements the DataInput and DataOutput interfaces, but in use it feels much the same. The class provides methods for both read and write operations

Important: this class supports random reads and writes! That is, we can start reading or writing at any position in the file

Constructors

new RandomAccessFile(String fileName, String mode)
new RandomAccessFile(File file, String mode)

If we pass in a file name, it is converted to a File internally and the second constructor is called

Mode

A minor flaw: no enum is used. Reading the source code shows that the mode is checked by string comparison, and four values are supported (all lowercase)

"r", read-only
"rw", readable and writable
"rws", readable and writable, and every update to the file's content or metadata must be written synchronously to the underlying storage device
"rwd", readable and writable, and every update to the file's content must be written synchronously to the underlying storage device

Set the file offset pointer

Two methods tell the JVM at which offset in the file to read or write

seek(long pos)

Sets the file-pointer offset, measured from the beginning of the file, at which the next read or write occurs. The offset may be set beyond the end of the file. Setting the offset beyond the end of the file does not change the file length. The file length changes only when data is written after the offset has been set beyond the end of the file.

skipBytes(int n)

Attempts to skip over n bytes of input, discarding the skipped bytes. It may skip fewer bytes than requested, for example when the end of the file is reached. Internally the method ends up calling seek()
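To make the effect of seek() concrete, the sketch below writes the two halves of a short string into a file in reverse order. SeekSketch and writeAt are illustrative names, not part of the finished code; the same idea lets each download thread write its slice at its own position.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class SeekSketch {
    // Writes 'data' into the file starting at byte position 'offset'.
    static void writeAt(Path file, long offset, byte[] data) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            raf.seek(offset); // move the file pointer
            raf.write(data);  // the write starts at that position
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("seek-sketch", ".bin");
        // Write the second half first, then the first half.
        writeAt(file, 5, "World".getBytes(StandardCharsets.US_ASCII));
        writeAt(file, 0, "Hello".getBytes(StandardCharsets.US_ASCII));
        System.out.println(new String(Files.readAllBytes(file), StandardCharsets.US_ASCII)); // prints "HelloWorld"
        Files.delete(file);
    }
}
```

The order of the writes does not matter, because each write lands at the position chosen with seek().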

Quickly generate a blank file of a specified size

setLength(fileLength)
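A minimal sketch of pre-allocating the target file with setLength(); PreallocateSketch and preallocate are hypothetical names. In a real download the length would come from the server (for example conn.getContentLengthLong()); here it is a fixed number so the example runs offline.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

class PreallocateSketch {
    // Grows (or truncates) the file to exactly 'length' bytes.
    static void preallocate(Path file, long length) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            raf.setLength(length);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("prealloc-sketch", ".bin");
        preallocate(file, 4096);
        System.out.println("size: " + Files.size(file));
        Files.delete(file);
    }
}
```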

Function planning

Derive the file name from the URL, or specify it manually
Generate the local target file (skip if it already exists)
Generate a local progress log file
  If the file already exists, load the indexes that have already been downloaded
Slice the file resource, generate the indexes, and remove those already downloaded
Start the thread pool and assign the indexes still to be downloaded to its threads
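The slicing and resume steps in the plan above can be sketched in isolation. This is a simplified illustration under assumptions, not the finished code: SlicePlanSketch and its methods are made-up names, and the progress record is assumed to be one line of comma-separated slice indexes.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class SlicePlanSketch {
    // Maps slice index -> Range header value; every range is inclusive.
    static Map<Integer, String> slice(long fileLength, long sliceSize) {
        if (sliceSize <= 0) {
            throw new IllegalArgumentException("sliceSize must be positive");
        }
        Map<Integer, String> ranges = new LinkedHashMap<>();
        int index = 0;
        for (long start = 0; start < fileLength; start += sliceSize) {
            long end = Math.min(start + sliceSize, fileLength) - 1;
            ranges.put(index++, "bytes=" + start + "-" + end);
        }
        return ranges;
    }

    // Parses a progress line such as "0,2" into the set of finished indexes.
    static Set<Integer> parseDone(String line) {
        Set<Integer> done = new TreeSet<>();
        if (line == null || line.trim().isEmpty()) {
            return done;
        }
        for (String s : line.split(",")) {
            done.add(Integer.valueOf(s.trim()));
        }
        return done;
    }

    // Returns the indexes that still have to be downloaded.
    static Set<Integer> remaining(Map<Integer, String> ranges, Set<Integer> done) {
        Set<Integer> todo = new TreeSet<>(ranges.keySet());
        todo.removeAll(done);
        return todo;
    }

    public static void main(String[] args) {
        Map<Integer, String> ranges = slice(25, 10);
        System.out.println(ranges); // {0=bytes=0-9, 1=bytes=10-19, 2=bytes=20-24}
        System.out.println(remaining(ranges, parseDone("0,2"))); // [1]
    }
}
```

After a restart, only the indexes returned by remaining() need to be handed to the thread pool.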

The finished code

// v3: multithreaded, sliced file copy; the file source is now a network resource
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

@SuppressWarnings("resource")
public class FileDown3 {
  // Configure the thread pool
  int threadSize = 16;
  private ExecutorService threads = Executors.newFixedThreadPool(threadSize);

  static final String FILE_ACCESS_MODE = "rwd";

  String source; // Source, HTTP link
  String dir = "D:/a-log/down/"; // Local download directory
  String fileName; // Name of the file to download
  String tempFileName; // Temp file that records the progress locally

  static final int LEN = 1024 * 1024 * 10; // File slice size: 10 MB

  private Set<Integer> used = new HashSet<>(); // Slices already downloaded
  private Set<Integer> todo = new HashSet<>(); // Slices still to download

  private Map<Integer, String> ranges = new HashMap<>(); // Slice index -> Range header value

  // Open a connection to the resource behind the URL
  private HttpURLConnection getConn() throws Exception {
    URL url = new URL(source);
    return (HttpURLConnection) url.openConnection();
  }

  private static String getFileNameFromPath(String path) {
    String[] dirs = path.split("/");
    return dirs[dirs.length - 1];
  }

  private String getLocalPath() {
    return dir + fileName;
  }

  public FileDown3(String source) {
    this(source, getFileNameFromPath(source));
  }

  public FileDown3(String source, String fileName) {
    this.source = source;
    this.fileName = fileName;
    this.tempFileName = getLocalPath() + ".temp"; // Cache file, progress record
    init();
  }

  // Initialize the operation
  private void init() {
    try {
      System.out.println("===> Create local file");
      createLocalFileIfNotExist();
      System.out.println("===> Create progress file");
      processProgressFile();
      System.out.println("===> File slicing");
      createDownIndexBySplit();
      System.out.println("===> Initialization completed");
    } catch (Exception e) {
      e.printStackTrace();
    }
  }

  // Generate the local target file if it does not exist
  private void createLocalFileIfNotExist() throws Exception {
    File file = new File(getLocalPath());
    if (!file.exists()) {
      // Pre-allocate a blank file of the same size as the remote resource
      try (RandomAccessFile accessFile = new RandomAccessFile(file, FILE_ACCESS_MODE)) {
        accessFile.setLength(getConn().getContentLengthLong());
      }
    }
  }

  // Process the local progress log file
  private void processProgressFile() throws IOException {
    File temp = new File(tempFileName);
    if (!temp.exists()) { // Create one
      temp.createNewFile();
    } else { // If it exists, load the indexes that were already downloaded
      String str;
      // try-with-resources closes the reader even when the file is empty
      try (BufferedReader bufferedReader = new BufferedReader(new FileReader(temp))) {
        str = bufferedReader.readLine();
      }
      if (str == null || str.isEmpty()) {
        return;
      }
      for (String s : str.split(",")) {
        used.add(Integer.valueOf(s));
      }
    }
  }

  // Cut the file into indexed slices
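  private void createDownIndexBySplit() throws Exception {
    // Use long arithmetic so that files over 2 GB do not overflow an int
    long fileLen = getConn().getContentLengthLong();
    // Ranges are inclusive, e.g. [0, 9] [10, 19], so each slice ends one byte before the next starts
    for (long start = 0; start < fileLen; start += LEN) {
      long end = Math.min(start + LEN, fileLen) - 1;
      ranges.put((int) (start / LEN), "bytes=" + start + "-" + end);
      // System.out.println(ranges.get((int) (start / LEN)));
    }
    todo.addAll(ranges.keySet());
    // Drop the indexes that have already been downloaded
    todo.removeAll(used);
  }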

  void successDown() {
    new File(tempFileName).deleteOnExit();
  }

  public void down() {
    downByMultithread();
    // successDown();
  }

  // Write the file in multithreaded mode
  private void downByMultithread() {
    todo.stream().forEach(i -> threads.execute(new DownThread(i)));
    threads.shutdown();
  }

  class DownThread implements Runnable {
    Integer index;
    byte[] bs = new byte[1024 * 128];

    DownThread(Integer index) {
      this.index = index;
    }

    @Override
    public void run() {
      try {
        HttpURLConnection conn = getConn();
        // Set the byte range of this slice
        conn.setRequestProperty("Range", ranges.get(index));
        // Make the request look like a browser's to avoid a 403
        conn.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:89.0) Gecko/20100101 Firefox/89.0");
        RandomAccessFile fos = new RandomAccessFile(getLocalPath(), FILE_ACCESS_MODE);
        // Move the file pointer to the start of this slice
        // fos.skipBytes(LEN * index); // skipBytes() ends up calling seek() anyway
        fos.seek((long) LEN * index); // long arithmetic avoids int overflow on large files
        // Write the data
        InputStream is = conn.getInputStream();
        int read;
        while ((read = is.read(bs)) != -1) {
          fos.write(bs, 0, read);
        }
        fos.close();
        conn.disconnect();
        synchronized (FileDown3.class) {
          // Update the index
          used.add(index);
          System.out.printf("Current file: [%s], Download fragment: [%d], Progress: [%d %%] \n", fileName, index, (int) (used.size() * 100 / ranges.keySet().size()));
          try {
            // Persist the progress to the file
            String memo = used.stream().map(n -> n.toString()).collect(Collectors.joining(","));
            Files.write(Paths.get(tempFileName), memo.getBytes());
          } catch (IOException e) {
            e.printStackTrace();
          }
        }
      } catch (Exception e) {
        e.printStackTrace();
      }
    }
  }

  public static void main(String[] args) {
    FileDown3 down1 = new FileDown3("https://www.wahaotu.com/uploads/allimg/201904/1555074510295049.jpg");
    down1.down();
  }
}

Download results

As you can see, the bandwidth is saturated

Conclusion

Sometimes a download is slow because the server deliberately throttles it: the server checks the speed of a single connection and, once that connection exceeds the maximum allowed speed, makes it wait for a while before it can download the rest of the data. (This is my guess, based on what I found when searching for ways to limit HttpURLConnection download speed.)

Of course, if your own connection is the slow part, you cannot blame someone else's server