[Network] 1. File Transfer Optimization Sharing
Background (Sensitive Data Masked)
- For various reasons, we need to store data in overseas OSS, so we built a proxy service that manages the data and performs encryption. Along the way we found that uploads and downloads were very slow. After a series of investigations we located the root cause and worked out a solution, and this post shares the troubleshooting process.
- One prerequisite, of course, is internal network connectivity over a dedicated line, so that transfers can approach the theoretical physical limit. A long and complex public-network path is suitable neither for file security nor for long-running transfers of large files.
Service-Level Issues
- Initially, we suspected that writing data to disk was too slow. Uploads are written to disk first so that large files do not have to be buffered in memory, while downloads are streamed directly, which is reasonable. The only possible improvement here would be to encrypt and transmit uploads as a stream too, but this is not a significant factor in the current problem.
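The streaming idea mentioned above can be sketched as a generator that transforms each chunk as it is read, so nothing is staged on disk. This is only an illustration: `stream_transformed`, `placeholder_cipher`, and the 64 KiB chunk size are hypothetical names and values, and the XOR stand-in is not real encryption. A real service would plug in a proper streaming cipher.

```python
import io
from typing import BinaryIO, Callable, Iterator

CHUNK_SIZE = 64 * 1024  # hypothetical chunk size, not taken from the service


def stream_transformed(src: BinaryIO,
                       transform: Callable[[bytes], bytes],
                       chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield transformed chunks as they are read, without staging a file on disk."""
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        yield transform(chunk)


def placeholder_cipher(chunk: bytes) -> bytes:
    """Stand-in only: XOR with a fixed byte is NOT encryption."""
    return bytes(b ^ 0x5A for b in chunk)


# Usage: feed the generator to any uploader that accepts an iterable of bytes.
source = io.BytesIO(b"example payload " * 1000)
total = sum(len(part) for part in stream_transformed(source, placeholder_cipher))
print(f"streamed {total} bytes")
```

Because each chunk is produced on demand, memory use stays bounded by the chunk size regardless of file size.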
Symptom
- Uploading 1 MB of encrypted data with our test script took nearly 2 seconds.
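To put that number in perspective, here is a small sketch that converts the observation into an effective throughput. It assumes that "1M" means 1 MiB and that the elapsed time is exactly 2 seconds. `throughput_mbit_per_s` is a hypothetical helper name.

```python
def throughput_mbit_per_s(size_bytes: int, seconds: float) -> float:
    """Effective throughput in megabits per second."""
    return size_bytes * 8 / seconds / 1_000_000


size_bytes = 1024 * 1024  # assumption: "1M" is 1 MiB
elapsed_s = 2.0           # assumption: "nearly 2 seconds" rounded to 2
print(f"{throughput_mbit_per_s(size_bytes, elapsed_s):.2f} Mbit/s")
```

A figure of a few Mbit/s is far below what a dedicated line normally offers, which suggests that latency, not bandwidth, is the limiting factor.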
Packet Capture
- After we discussed it with the operations team, they suspected a network issue and captured packets to investigate.
Packet Capture Demonstration
Ping Packets
Three-Way Handshake
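Since `connect()` returns once the SYN-ACK has arrived, timing it gives a rough round-trip measurement without a packet capture. The sketch below uses only the standard library. `tcp_connect_time` is a hypothetical helper, and if a hostname is passed instead of an IP address the result also includes DNS resolution time.

```python
import socket
import time


def tcp_connect_time(host: str, port: int, timeout: float = 5.0) -> float:
    """Return the seconds spent establishing a TCP connection (about one RTT)."""
    start = time.perf_counter()
    # create_connection returns after the three-way handshake completes locally,
    # i.e. after SYN has been sent and SYN-ACK has been received.
    with socket.create_connection((host, port), timeout=timeout):
        elapsed = time.perf_counter() - start
    return elapsed


# Usage (hypothetical endpoint):
#   print(f"{tcp_connect_time('proxy.example.internal', 443) * 1000:.1f} ms")
```

Comparing this value with the ping time helps separate network latency from delays introduced by the service itself.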
Four-Way Teardown
tcpdump Flags
- tcpdump prints the TCP flags of each packet in square brackets. They show the state of the connection or the action being taken, and a single packet often carries several flags at once. Common flags include:
- S (SYN): Used to establish a connection between two hosts. It is set in the first two packets of the three-way handshake (SYN, then SYN-ACK).
- . (ACK): A dot means the ACK flag is set, that is, the packet acknowledges data received from the peer. Almost every packet after the initial SYN carries it. A packet with no flags at all is printed as [none].
- P (PSH): Asks the receiver to hand the data to the application right away instead of waiting for the buffer to fill.
- F (FIN): The sender has no more data to send and wants to close its side of the connection. Each side sends one FIN during the four-way teardown.
- R (RST): Resets a connection that is in an invalid state or has hit an error. It is also used to reject unwanted connection attempts.
- W (CWR): Congestion Window Reduced. The sender has shrunk its congestion window in response to Explicit Congestion Notification (ECN).
- E (ECE): ECN-Echo. The receiver has seen a packet marked with ECN, meaning there is congestion on the path.
- For example, [S] is a SYN packet, the first step in establishing a TCP connection, and [S.] is the SYN-ACK reply, the second step. [P.] is a PSH packet that also acknowledges earlier data and carries a payload the sender wants delivered promptly. [F.] is a FIN packet that also carries an ACK, which means the sender is closing its side of the connection.
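As a reading aid for the list above, here is a minimal decoder that expands the bracketed flag field from a tcpdump line into flag names. `decode_flags` and `FLAG_NAMES` are hypothetical names. The sketch covers the single-character flags listed here plus U (URG), and it treats the literal "none" as a packet without flags.

```python
from typing import List

# Single-character flag codes as printed by tcpdump.
FLAG_NAMES = {
    "S": "SYN", "F": "FIN", "P": "PSH", "R": "RST",
    "U": "URG", "W": "CWR", "E": "ECE", ".": "ACK",
}


def decode_flags(field: str) -> List[str]:
    """Expand a field such as '[S.]' into ['SYN', 'ACK']."""
    inner = field.strip().strip("[]")
    if inner == "none":
        return []
    names = []
    for ch in inner:
        if ch not in FLAG_NAMES:
            raise ValueError(f"unknown flag character: {ch!r}")
        names.append(FLAG_NAMES[ch])
    return names


for sample in ("[S]", "[S.]", "[P.]", "[F.]", "[.]"):
    print(sample, "->", "+".join(decode_flags(sample)))
```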
Why the Four-Way Teardown Shows Only Three Packets in tcpdump
- There are several possible reasons why tcpdump shows only three packets when a connection closes:
- One possibility is that the passive closer (the side that receives the first FIN) sends its own FIN in the same packet as the ACK, merging the second and third steps of the teardown into one packet. This happens when the passive closer has no more data to send, so it can go straight to the LAST_ACK state and wait for the final ACK from the active closer.
- Another possibility is that the active closer (the side that sent the first FIN) does not acknowledge the passive closer's FIN promptly and instead, after some delay, sends a packet with the RST flag set, forcibly resetting the connection. In this case the active closer has probably hit an error or a timeout, so it skips the 2MSL wait and goes straight to the CLOSED state.
- A third possibility is that tcpdump missed some packets because of network delay or packet loss. In this case, capture again or widen the capture time window and check whether the complete four-way teardown appears.
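The three cases above can be told apart mechanically from a capture. The sketch below is a rough classifier, not a full TCP state machine: `classify_teardown` is a hypothetical helper that takes (sender, flags) pairs in capture order and assumes no data packets are interleaved after the first FIN.

```python
from typing import List, Tuple


def classify_teardown(packets: List[Tuple[str, str]]) -> str:
    """Classify a connection close from (sender, flags) pairs such as ("A", "[F.]")."""
    # Only the packets from the first FIN onward belong to the teardown.
    start = next((i for i, (_, flags) in enumerate(packets) if "F" in flags), None)
    if start is None:
        return "no FIN seen"
    closing = packets[start:]
    if any("R" in flags for _, flags in closing):
        return "reset"
    fin_senders = {sender for sender, flags in closing if "F" in flags}
    if len(fin_senders) < 2:
        return "incomplete"  # one side never sent FIN, or the capture missed it
    if len(closing) == 3:
        return "merged FIN+ACK (three packets)"
    if len(closing) == 4:
        return "classic four packets"
    return "other"


print(classify_teardown([("A", "[F.]"), ("B", "[F.]"), ("A", "[.]")]))
```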
Actual Data
- This is a capture of an actual 1 MB upload, simplified for analysis.
- Every jump in the timestamps occurs on a packet coming back from the server, so the picture is clear: the time is spent waiting on round trips. Given the physical distance between Shenzhen and the US East Coast, a round trip of about 200 ms is already close to the limit, so the latency itself is reasonable.
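A back-of-the-envelope estimate shows why round trips dominate. The sketch below is not the author's analysis and rests on several stated assumptions: a great-circle distance of roughly 13,000 km, light travelling at about 200,000 km/s in fiber, a 1460-byte MSS, an initial congestion window of 10 segments that doubles every round trip with no loss, and a hypothetical allowance of 3 round trips for connection and TLS setup.

```python
def min_rtt_ms(distance_km: float, fiber_km_per_s: float = 200_000.0) -> float:
    """Lower bound on round-trip time over fiber for a given one-way distance."""
    return 2 * distance_km * 1000 / fiber_km_per_s


def slow_start_round_trips(size_bytes: int, mss: int = 1460, init_cwnd: int = 10) -> int:
    """Round trips needed to send size_bytes if cwnd doubles each RTT without loss."""
    sent, cwnd, rtts = 0, init_cwnd * mss, 0
    while sent < size_bytes:
        sent += cwnd   # one congestion window is sent per round trip
        cwnd *= 2      # idealized slow start: the window doubles every RTT
        rtts += 1
    return rtts


def estimate_upload_s(size_bytes: int, rtt_s: float, setup_rtts: int = 3) -> float:
    """Rough upload time: setup round trips plus slow-start round trips."""
    return (setup_rtts + slow_start_round_trips(size_bytes)) * rtt_s


print(f"minimum RTT for 13,000 km: {min_rtt_ms(13_000):.0f} ms")
print(f"slow-start round trips for 1 MiB: {slow_start_round_trips(1024 * 1024)}")
print(f"estimated upload time at 200 ms RTT: {estimate_upload_s(1024 * 1024, 0.2):.1f} s")
```

Under these assumptions a fresh connection needs around ten round trips to move 1 MiB, which is in the same range as the observed upload time even though the link itself is not saturated.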
Soul-Searching Question
- At this point a hard question comes up: if the latency is already near its limit, why was the transfer faster over the public network before?
- After talking with colleagues in a sister department and reproducing their code, we ran a test with the AWS SDK.
Results

