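Problem
When requesting data with jsoup (version 1.11.2), the timeout was set to 1 minute, but the request timed out after only 30 seconds with SocketTimeoutException: Read timed out.
Sample code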
Connection.Response res = Jsoup.connect(url).timeout(60000).ignoreContentType(true).execute();
Stack trace
java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:735)
at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:678)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1587)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1492)
at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480)
at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:734)
at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:706)
at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:299)
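To be safe, I captured the traffic with Wireshark. After the client (192.168.8.12) sent the request (packet 366), the server (172.19.80.110) acknowledged it immediately (packet 368, ACK only, no data). Thirty seconds later the server still had not sent any data, so the client closed the connection by sending a FIN (packet 369).
Solution
First, I checked the Javadoc: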
/**
* Set the total request timeout duration. If a timeout occurs, an {@link java.net.SocketTimeoutException} will be thrown.
* The default timeout is 30 seconds (30,000 millis). A timeout of zero is treated as an infinite timeout.
* Note that this timeout specifies the combined maximum duration of the connection time and the time to read
* the full response.
* @param millis number of milliseconds (thousandths of a second) before timing out connects or reads.
* @return this Connection, for chaining
* @see #maxBodySize(int)
*/
Connection timeout(int millis);
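According to the Javadoc, the timeout is the combined total of the connect time and the read time, with a default of 30 seconds. With a 60-second timeout configured, a read timeout at 30 seconds clearly does not match that description.
Following the stack trace, I located the source code: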
org.jsoup.helper.HttpConnection
private static HttpURLConnection createConnection(Connection.Request req) throws IOException {
final HttpURLConnection conn = (HttpURLConnection) (
req.proxy() == null ?
req.url().openConnection() :
req.url().openConnection(req.proxy())
);
conn.setRequestMethod(req.method().name());
conn.setInstanceFollowRedirects(false); // don't rely on native redirection support
conn.setConnectTimeout(req.timeout());
conn.setReadTimeout(req.timeout() / 2); // gets reduced after connection is made and status is read
// unrelated code omitted
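Note that conn.setConnectTimeout(req.timeout()) sets the connect timeout to the full 60 s,
while conn.setReadTimeout(req.timeout() / 2) sets the read timeout
to 30 s (60 / 2), which matches the 30-second gap between packets 368 and 369. That settles it: in this version, jsoup's timeout does not behave exactly as the Javadoc describes. More precisely, the connect timeout is the value passed in, and the read timeout is initially half of that value (the code comment notes that it is reduced further once the connection is made and the status has been read).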
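The effect of that split can be reproduced without jsoup. The following is a minimal sketch using only the JDK, not jsoup's actual code: it assumes a local server that accepts the connection but never replies, and applies the same setConnectTimeout(t) / setReadTimeout(t / 2) pair. The class name HalvedReadTimeoutDemo and the method millisUntilReadTimeout are made up for this illustration.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.Proxy;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.net.URL;

final class HalvedReadTimeoutDemo {

    /**
     * Sends a GET to a local server that accepts the connection but never replies,
     * using the same timeout split as the jsoup snippet above.
     * Returns the milliseconds elapsed before SocketTimeoutException was thrown,
     * or -1 if the request did not time out.
     */
    static long millisUntilReadTimeout(int timeoutMillis) throws IOException {
        try (ServerSocket silentServer = new ServerSocket(0)) {
            Thread holder = new Thread(() -> {
                try (Socket accepted = silentServer.accept()) {
                    // Hold the connection open and send nothing.
                    Thread.sleep(timeoutMillis * 2L);
                } catch (IOException | InterruptedException ignored) {
                    // Demo only: nothing to clean up.
                }
            });
            holder.setDaemon(true);
            holder.start();

            URL url = new URL("http://127.0.0.1:" + silentServer.getLocalPort() + "/");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection(Proxy.NO_PROXY);
            conn.setConnectTimeout(timeoutMillis);   // full value, as in createConnection
            conn.setReadTimeout(timeoutMillis / 2);  // half value, as in createConnection

            long start = System.nanoTime();
            try {
                conn.getResponseCode();              // blocks while waiting for the status line
                return -1;
            } catch (SocketTimeoutException e) {
                return (System.nanoTime() - start) / 1_000_000;
            } finally {
                conn.disconnect();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Timed out after ~" + millisUntilReadTimeout(2000) + " ms");
    }
}
```

With a 2000 ms argument, the read should give up after roughly 1000 ms, the same ratio as the 60 s / 30 s observed in the capture. Based on the snippet above, a caller who needs a 60 s wait for the first response bytes in jsoup 1.11.2 could presumably pass timeout(120000). This is an inference from the quoted code only and has not been checked against other jsoup versions.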
Summary
In the end, this problem comes back to fundamentals: a TCP connection has two timeouts, a connect timeout and a read timeout, which correspond to the following Java APIs:
java.net.Socket
connect timeout
connect(SocketAddress endpoint, int timeout)
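Connects this socket to the server with a specified timeout value.
The connect timeout bounds the time taken by the three-way handshake.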
read timeout
setSoTimeout(int timeout)
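Enables/disables SO_TIMEOUT with the specified timeout, in milliseconds.
The read timeout is the maximum gap between one data packet and the next (how long a single read may block waiting for data), not the time allowed for reading the whole response.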
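To illustrate both calls, here is a small sketch assuming a local server that trickles out one byte at a time. The class ReadTimeoutGapDemo, the method readSlowResponse, and the chosen delays are all made up for this example. The client connects with connect(SocketAddress, int) and then sets SO_TIMEOUT with setSoTimeout(int). Reading succeeds as long as each gap between bytes stays under the read timeout, even when the total transfer takes longer than that timeout.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

final class ReadTimeoutGapDemo {

    /**
     * Reads from a local server that sends byteCount bytes, pausing gapMillis
     * before each one. Returns {bytesRead, elapsedMillis}. Throws
     * SocketTimeoutException if a single gap exceeds readTimeoutMillis.
     */
    static long[] readSlowResponse(int byteCount, int gapMillis, int readTimeoutMillis)
            throws IOException {
        try (ServerSocket server = new ServerSocket(0)) {
            Thread writer = new Thread(() -> {
                try (Socket peer = server.accept()) {
                    peer.setTcpNoDelay(true);        // send each byte right away
                    OutputStream out = peer.getOutputStream();
                    for (int i = 0; i < byteCount; i++) {
                        Thread.sleep(gapMillis);     // gap between data packets
                        out.write('x');
                        out.flush();
                    }
                } catch (IOException | InterruptedException ignored) {
                    // Demo only: the client may already have given up.
                }
            });
            writer.setDaemon(true);
            writer.start();

            try (Socket client = new Socket()) {
                // Connect timeout: limits the handshake only.
                client.connect(new InetSocketAddress("127.0.0.1", server.getLocalPort()), 1000);
                // Read timeout: limits each blocking read, not the whole transfer.
                client.setSoTimeout(readTimeoutMillis);

                InputStream in = client.getInputStream();
                long start = System.nanoTime();
                long bytesRead = 0;
                while (in.read() != -1) {
                    bytesRead++;
                }
                long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
                return new long[] {bytesRead, elapsedMillis};
            }
        }
    }

    public static void main(String[] args) throws IOException {
        long[] result = readSlowResponse(8, 200, 1000);
        System.out.println(result[0] + " bytes in " + result[1] + " ms without a timeout");
    }
}
```

In this sketch, 8 bytes at 200 ms intervals take well over the 1000 ms read timeout in total, yet no exception is thrown because no single wait reaches 1000 ms. Reversing the proportions, for example a gap of 800 ms with a 200 ms read timeout, makes the first read fail with SocketTimeoutException.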
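Understanding these two concepts correctly makes problems like this one much easier to solve.
Original article by ItWorker. If you repost it, please credit the source: https://blog.ytso.com/tech/pnotes/20265.html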