  • Java WebClient: simulating a login to fetch API data

    1,000+ views 2020-01-14 10:14:17

    Using WebClient in Java to log in from the backend and scrape data

    WebClient

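    WebClient is a non-blocking, reactive HTTP client that Spring WebFlux has provided since version 5.0. Its reactive model is built on Reactor. It offers get, post, put, delete and similar methods that correspond to the standard HTTP request methods. You create an instance with WebClient.create(), choose the call type with get(), post() and so on, set the request path with uri(), send the request and obtain the response with retrieve(), and use bodyToMono(String.class) to have the result handled as a String wrapped in a Reactor Mono. Note that the code in this post does not use Spring's WebClient: the dependency and the classes used below (HtmlPage, HtmlInput, NicelyResynchronizingAjaxController) belong to HtmlUnit, whose browser-simulating class is also named WebClient (com.gargoylesoftware.htmlunit.WebClient).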

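    Scraping data from the backend mostly comes down to keeping cookies alive. Some sites encrypt their login flow in complicated ways, so you can let WebClient simulate the login as a browser would, which avoids reimplementing that encryption. As long as the WebClient does not clear its cookies, the session persists.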
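The cookie-persistence idea can be sketched with the JDK's own java.net.CookieManager, without HtmlUnit. This is only an illustration under stated assumptions: the class name CookieJarSketch, the cookie SESSION=abc123, and the example.com URLs are hypothetical, and a real client would feed the jar from actual response headers.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.CookieManager;
import java.net.URI;
import java.util.List;
import java.util.Map;

// Illustrative only: class, cookie, and URL names are hypothetical.
class CookieJarSketch {
    // One long-lived cookie jar plays the role of the long-lived client.
    private final CookieManager jar = new CookieManager();

    // Record the Set-Cookie header that a login response would carry.
    void rememberLogin(URI loginUri, String setCookieValue) {
        try {
            jar.put(loginUri, Map.of("Set-Cookie", List.of(setCookieValue)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Build the Cookie header value that would accompany a later request.
    String cookieHeaderFor(URI apiUri) {
        try {
            Map<String, List<String>> headers = jar.get(apiUri, Map.of());
            return String.join("; ", headers.getOrDefault("Cookie", List.of()));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Clearing the jar is the equivalent of losing the session.
    void clear() {
        jar.getCookieStore().removeAll();
    }
}
```

As long as the same jar instance is reused, the session cookie stored at login is offered again for later requests to the same host; after clear() it is gone and a new login would be needed.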

    Dependency
            <dependency>
    			<groupId>net.sourceforge.htmlunit</groupId>
    			<artifactId>htmlunit</artifactId>
    			<version>2.36.0</version>
    		</dependency>
    
    Login
    public void login() throws FailingHttpStatusCodeException, IOException {
    		HtmlPage page = null;
    		// Fetch the login page
    		URL url = new URL("****");// URL of the login page
    		page = (HtmlPage) wc.getPage(url);// page now holds the parsed page
    		HtmlInput usernameInput = page.getHtmlElementById("LoginName");// look up the input fields by element id
    		HtmlInput pswInput = page.getHtmlElementById("Password");
    		// Fill in the input fields
    		usernameInput.setValueAttribute("admin");
    		pswInput.setValueAttribute("cjgly**y");
    		// Get the login button
    		HtmlInput btn = page.getHtmlElementById("btnLogin");
    		try {
    			btn.click();// click the login button
    			Log.debug(wc.getCookies(url).toString());
    		} catch (IOException e) {
    			e.printStackTrace();
    			hLog.error(e);
    		}
    	}
    
    Fetching data
    // Url is the address of the API endpoint to scrape
    public String ajax(String Url) throws FailingHttpStatusCodeException, IOException {
    		URL url = new URL(Url);
    		String res = "";
    		res = wc.getPage(url).getWebResponse().getContentAsString();
    		if (analysisRes(res)) {// analysisRes() inspects the response and returns true when a login is needed
    			login();
    			res = wc.getPage(url).getWebResponse().getContentAsString();
    		}
    		Log.debug(res);
    		// Check here whether the response is valid
    		return res;// the data returned by the API
    	}
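The fetch, detect-login, re-login, fetch-again flow of ajax() can be sketched independently of HtmlUnit. This is a minimal sketch, not the author's implementation: ReloginFetcher and its three callbacks are hypothetical stand-ins for the page fetch, analysisRes(), and login(). It also adds one check the original lacks, failing loudly if the second response still looks like a login page.

```java
import java.util.function.Function;
import java.util.function.Predicate;

// Illustrative only: all names here are hypothetical stand-ins.
class ReloginFetcher {
    private final Function<String, String> fetch;  // stands in for the getPage(...) call chain
    private final Predicate<String> needsLogin;    // stands in for analysisRes(res)
    private final Runnable login;                  // stands in for login()

    ReloginFetcher(Function<String, String> fetch, Predicate<String> needsLogin, Runnable login) {
        this.fetch = fetch;
        this.needsLogin = needsLogin;
        this.login = login;
    }

    // Fetch once; if the body shows the session is gone, log in and fetch once more.
    String get(String url) {
        String body = fetch.apply(url);
        if (needsLogin.test(body)) {
            login.run();
            body = fetch.apply(url);
            if (needsLogin.test(body)) {
                // Assumption: returning a login page as "data" is worse than failing.
                throw new IllegalStateException("still not logged in after re-login: " + url);
            }
        }
        return body;
    }
}
```

Retrying exactly once keeps a broken login from turning into an endless loop of login attempts.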
    

    As long as the webClient is kept open, wc.getPage(url).getWebResponse().getContentAsString(); returns the data from any of the site's API endpoints.

    Full code
    
    public class  Client {
    
    	WebClient wc;
    	public  Client(WebClient webClient) {
    		wc = webClient;
    		wc.getOptions().setJavaScriptEnabled(true);// enable JavaScript
    		wc.getOptions().setCssEnabled(true);// enable CSS; usually not needed
    		wc.setAjaxController(new NicelyResynchronizingAjaxController());// enable AJAX handling; without it, calls to the page's API fail
    	}
    
    	public void login() throws FailingHttpStatusCodeException, IOException {
    		HtmlPage page = null;
    		// Fetch the login page
    		URL url = new URL("http://****");
    		page = (HtmlPage) wc.getPage(url);
    		HtmlInput usernameInput = page.getHtmlElementById("LoginName");
    		HtmlInput pswInput = page.getHtmlElementById("Password");
    		// Fill in the input fields
    		usernameInput.setValueAttribute("admin");
    		pswInput.setValueAttribute("cjgly**y");
    		// Get the login button
    		HtmlInput btn = page.getHtmlElementById("btnLogin");
    		try {
    			btn.click();
    			Log.debug(wc.getCookies(url).toString());
    		} catch (IOException e) {
    			e.printStackTrace();
    			hLog.error(e);
    		}
    	}
    
    	public String ajax(String Url) throws FailingHttpStatusCodeException, IOException {
    		URL url;
    		String res = "";
    		url = new URL(Url);
    		res = wc.getPage(url).getWebResponse().getContentAsString();
    		if (analysisRes(res)) {// analysisRes() returns true when a login is needed
    			login();
    			res = wc.getPage(url).getWebResponse().getContentAsString();
    		}
    		Log.debug(res);
    		// Check here whether the response is valid
    		return res;
    	}
    
    	 
    }
    
  • Java WebClient summary

    1,000+ views 2016-11-08 15:42:00
    private WebClient getAWebClient() {
            WebClient webClient = new WebClient(BrowserVersion.FIREFOX_24);
            webClient.getOptions().setTimeout(20000);
            // webClient.getCookieManager().setCookiesEnabled(true);
            webClient.getOptions().setThrowExceptionOnFailingStatusCode(false);
            webClient.getOptions().setThrowExceptionOnScriptError(false);
            webClient.getOptions().setCssEnabled(false);
            webClient.getOptions().setJavaScriptEnabled(false);
            webClient.addRequestHeader("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");
            webClient.addRequestHeader("Accept-Encoding", "gzip, deflate");
            webClient.addRequestHeader("Accept-Language", "en-US,en;q=0.5");
            webClient.addRequestHeader("Cache-Control", "max-age=0");
            webClient.addRequestHeader("Connection", "keep-alive");
            webClient.addRequestHeader("Host", "www.amazon.com");
            webClient.addRequestHeader("User-Agent", "Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Firefox/24.0");
            return webClient;
        }
    /**
         * Crawl a page
         */
        public StringBuilder crawlPage(String url) {
            StringBuilder builder = new StringBuilder();
            logger.info(Thread.currentThread().getName() + " crawl " + url);
            // the MyGetPage code goes here
            webClient.getCookieManager().clearCookies();
            logger.info(Thread.currentThread().getName() + " webClient.getCookieManager().clearCookies();");
            File file = new File(cookiePathAppendRandom());
            logger.info(Thread.currentThread().getName() + " File file = new File(cookiePathAppendRandom());");
            if (file.exists()) {
                FileInputStream fin = null;
                try {
                    fin = new FileInputStream(file);
                } catch (FileNotFoundException e1) {
                    e1.printStackTrace();
                }
                CookieStore cookieStore = null;
                ObjectInputStream in;
                try {
                    in = new ObjectInputStream(fin);
                    cookieStore = (CookieStore) in.readObject();
                    in.close();
                } catch (IOException e) {
                    logger.error(e);
                } catch (ClassNotFoundException e) {
                    logger.error(e);
                }
                if (cookieStore != null) { // deserialization above may have failed
                    List<org.apache.http.cookie.Cookie> l = cookieStore.getCookies();
                    for (org.apache.http.cookie.Cookie temp : l) {
                        Cookie cookie = new Cookie(temp.getDomain(), temp.getName(), temp.getValue(), temp.getPath(),
                                temp.getExpiryDate(), false);
                        webClient.getCookieManager().addCookie(cookie);
                    }
                }
            }
            logger.info(Thread.currentThread().getName() + " MyGetPage start,url:" + url);
            HtmlPage page = MyGetPage(new StringBuffer(url));
            logger.info(Thread.currentThread().getName() + " MyGetPage end,url:" + url);
            if (page == null) {
                // Models that hit an exception during crawling can be collected in one list and sent to the server to be put back on the crawl queue
                logger.info("Page null!");
                AmazonCrawlModel model=new AmazonCrawlModel(crawlId, crawlURLId, url, depth,ischange);
                exceptionFun(model);
                return (new StringBuilder("getNullPage"));
            }
            logger.info(Thread.currentThread().getName() + " builder.append(page.asXml());");
            builder.append(page.asXml());
            logger.info(Thread.currentThread().getName() + " return builder;");
            logger.info(Thread.currentThread().getName() +" CrawlPage $Length="+builder.toString().length());
            if(builder.toString().length()<=300){
                AmazonCrawlModel model=new AmazonCrawlModel(crawlId, crawlURLId, url, depth,ischange);
                exceptionFun(model);
                return (new StringBuilder("getNullPage"));
            }
            return builder;
        }

     

    /***
         * Custom getPage: when a CAPTCHA page appears, keeps solving it until it succeeds
         * 
         */
        private HtmlPage MyGetPage(StringBuffer URL) {
            HtmlPage page = null;
            boolean flag = true;
            int TryTimeCnt = 1;
            int UnknowHostTryTimeCnt = 1;
            while (flag) {
                flag = false;
                try {
                    logger.info(Thread.currentThread().getName() + " webClient.getPage : " + URL + ",CrawlURL_id:"
                            + crawlURLId);
                    page = webClient.getPage(URL.toString());
                    Document doc = Jsoup.parse(page.asXml());
                    int robotchecknum = 1;
                    while (doc.select("title").text().equals("Robot Check")) {
                        logger.info(Thread.currentThread().getName() + " " + dayformat1.format(System.currentTimeMillis())
                                + " [Robot Check,URL:" + URL + "]");
                        String captcha_str = AmazonGetCaptcha.GetCaptcha(new StringBuilder(doc.toString()));
                        logger.info(Thread.currentThread().getName() + " " + dayformat1.format(System.currentTimeMillis())
                                + " end AmazonGetCaptcha.GetCaptcha");
                        logger.info(dayformat1.format(new Date()) + " " + Thread.currentThread().getName() + " : "
                                + captcha_str);
    
                        HtmlForm form = null;
    
                        logger.info(Thread.currentThread().getName() + " page.getForms().get(0) Start");
                        form = page.getForms().get(0);
                        logger.info(Thread.currentThread().getName() + " page.getForms().get(0) End");
    
                        HtmlButton button = null;
    
                        logger.info(Thread.currentThread().getName() + " form.getElementsByTagName(button).get(0) Start");
                        button = (HtmlButton) form.getElementsByTagName("button").get(0);
                        logger.info(Thread.currentThread().getName() + " form.getElementsByTagName(button).get(0) End");
    
                        logger.info(Thread.currentThread().getName() + " setValueAttribute Start");
                        form.getInputByName("field-keywords").setValueAttribute(captcha_str);
                        logger.info(Thread.currentThread().getName() + " setValueAttribute End");
    
                        logger.info(Thread.currentThread().getName() + " button.click Start");
                        boolean click_flag = false;
                        while (!click_flag) {
                            try {
                                click_flag = true;
                                page = button.click();
                            } catch (Exception e1) {
                                logger.error(Thread.currentThread().getName() + " button.click failed: " + e1);
                                //e1.printStackTrace();
                                click_flag = false;
                            }
                        }
                        logger.info(Thread.currentThread().getName() + " button.click end");
                        while (page.asXml() == null) {
                            logger.info(Thread.currentThread().getName() + " page xml null");
                            logger.info(Thread.currentThread().getName() +" "+ page.asXml());
                            page.refresh();
                            logger.info(Thread.currentThread().getName() + " refresh End!");
                        }
                        logger.info(Thread.currentThread().getName() + " button.click End");
    
                        logger.info(Thread.currentThread().getName() + " Start ParsePage!");
                        doc = Jsoup.parse(page.asXml());
                        if (!doc.select("title").text().equals("Robot Check")) {
                            logger.info(Thread.currentThread().getName() + " " + doc.select("title").text());
                            logger.info(Thread.currentThread().getName() + " "
                                    + dayformat1.format(System.currentTimeMillis()) + " [Robot Check,captcha success:"
                                    + captcha_str + ",try num:" + robotchecknum + "]");
                        }
                        robotchecknum++;
                    }
    
                } catch (FailingHttpStatusCodeException e) {
                    logger.error(Thread.currentThread().getName() +" "+ e);
                    flag = true;
                } catch (MalformedURLException e) {
                    logger.error(Thread.currentThread().getName() +" "+ e);
                    flag = true;
                }catch(UnknownHostException e) {
                    logger.error(Thread.currentThread().getName() +" "+ e);
                    flag = true;
                    logger.info("found UnknownHostException,start sleep 20 min");
                    try {
                        Thread.sleep(1000*60*Integer.parseInt(Configuration.getProperties("unknowhost_sleeptime")));
                    } catch (InterruptedException e1) {
                        logger.error(Thread.currentThread().getName() +" "+ e1);
                    }
                    logger.info("found UnknownHostException,end sleep 20 min");
                    UnknowHostTryTimeCnt++;// increment the failure count
                    logger.info(Thread.currentThread().getName() + " " + dayformat1.format(System.currentTimeMillis())
                            + " [UnknowHostTryTimeCnt:" + UnknowHostTryTimeCnt + "]");
                    if (UnknowHostTryTimeCnt > Integer.parseInt(Configuration.getProperties("unknowhost_maxtrytime"))) {
                        return null;
                    }
                }catch (Exception eq) {
                    logger.error(Thread.currentThread().getName() + " "+eq);
                    TryTimeCnt++;// increment the failure count
                    logger.info(Thread.currentThread().getName() + " " + dayformat1.format(System.currentTimeMillis())
                            + " [TryTimeCnt:" + TryTimeCnt + "]");
                    if (TryTimeCnt > 5) {
                        return null;
                    }
                    try {
                        Thread.sleep(1000);
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                        logger.error(Thread.currentThread().getName() + e);
                    }
                    flag = true;
                }
                try {
                    Thread.sleep(random.nextInt(500) + 1500);
                } catch (InterruptedException e) {
                    logger.error(Thread.currentThread().getName() + e);
                    flag = true;
                }
            }
            return page;
        }
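MyGetPage caps its generic retries at 5 attempts, but its CAPTCHA loop and its inner button.click loop have no upper bound. A bounded retry helper, sketched here with JDK types only, is one way to keep every loop finite. BoundedRetry and its parameters are hypothetical names; like the original, it returns null when it gives up.

```java
import java.util.concurrent.Callable;

// Illustrative only: a generic bounded retry, not the original crawler's code.
class BoundedRetry {
    // Run the task up to maxAttempts times, pausing between failed attempts.
    // Returns the first successful result, or null once attempts are exhausted.
    static <T> T call(Callable<T> task, int maxAttempts, long pauseMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.call();
            } catch (Exception e) {
                if (e instanceof InterruptedException) {
                    Thread.currentThread().interrupt(); // preserve the interrupt flag
                    return null;
                }
                if (attempt == maxAttempts) {
                    break; // no pause after the final failure
                }
            }
            try {
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return null;
            }
        }
        return null;
    }
}
```

A call such as BoundedRetry.call(() -> button.click(), 5, 1000L) would replace an open-ended while loop with at most five attempts, assuming the caller treats a null result as a failed page.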

     

  • A custom HTTP request class, commonly used in Java backend code to fetch data from a URL

    ------------------------------ BEGIN ----------------------------------

    1. Submitting data to a Web API URL (POST)

    a. The method:

    /**
     * @Title : httpPostJson
     * @Function: sends JSON via HTTP POST
     * @param url
     * @param jsonStr
     * @throws IOException
     * @throws ClientProtocolException
     */
    public static String httpPostJson(String url, String jsonStr) throws ClientProtocolException, IOException {
    	String resultStr = null;
    	
    	// Create a DefaultHttpClient instance
    	HttpClient httpClient = new DefaultHttpClient();
    	// Create an HttpPost object for the target URL
    	HttpPost httpPost = new HttpPost(url);
    	//httpPost.setHeader("Content-Type", "application/json; charset=UTF-8");
    	// Set it this way instead; the C# API requires it
    	httpPost.setHeader("contentType", "application/json;charset=UTF-8");
    
    	// Request body, explicitly UTF-8 encoded
    	StringEntity se = new StringEntity(jsonStr, "UTF-8");
    	se.setContentType("text/json");
    	//se.setContentEncoding(new BasicHeader(HTTP.CONTENT_TYPE, "application/json"));
    	// Set it this way instead; the C# API requires it
    	se.setContentEncoding(new BasicHeader("contentType", "application/json"));
    	// Attach the body to the request
    	httpPost.setEntity(se);
    	
    	// Call HttpClient's execute() with the HttpPost object;
    	// execute() returns an HttpResponse that contains everything the server sent back
    	HttpResponse response = httpClient.execute(httpPost);
    
    	// Handle the result
    	if(null != response && null != response.getStatusLine()){
    		// Check the status code first; 200 means the request and the response both succeeded
    		if(response.getStatusLine().getStatusCode() == 200){
    			// getEntity() returns an HttpEntity instance
    			// EntityUtils.toString() converts the HttpEntity to a string; the charset is set to UTF-8 in case the response contains Chinese text
    			resultStr = EntityUtils.toString(response.getEntity(), "UTF-8");
    		}else{
    			System.out.println("HttpResponse StatusCode:" + response.getStatusLine().getStatusCode());
    		}
    	}
    
    	return resultStr;
    }

    b. Test call:

    String result = httpPostJson(url, paramsStr);
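DefaultHttpClient has been deprecated since Apache HttpClient 4.3. As a rough alternative, the same JSON POST can be sketched with the JDK's built-in java.net.http.HttpClient, assuming Java 11 or later. JsonPostSketch and its method names are hypothetical; the sketch sends the standard Content-Type header, so a server that really depends on the nonstandard contentType header used above would need that header added as well.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Illustrative only: a JDK 11+ counterpart to httpPostJson, not a drop-in replacement.
class JsonPostSketch {
    // Build the POST request separately so it can be inspected without a network call.
    static HttpRequest buildRequest(String url, String json) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("Content-Type", "application/json; charset=UTF-8")
                .POST(HttpRequest.BodyPublishers.ofString(json, StandardCharsets.UTF_8))
                .build();
    }

    // Send the request; like the original, return the body only for HTTP 200, else null.
    static String post(String url, String json) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<String> response = client.send(
                buildRequest(url, json),
                HttpResponse.BodyHandlers.ofString(StandardCharsets.UTF_8));
        return response.statusCode() == 200 ? response.body() : null;
    }
}
```

Usage would mirror the test call above: String result = JsonPostSketch.post(url, paramsStr);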


    2. A more complex class, WebClient, commonly used to GET data; POST works too

    package com.aaa.bbb.ccc.ddd;
    
    import java.io.IOException;
    import java.security.KeyManagementException;
    import java.security.NoSuchAlgorithmException;
    import java.security.cert.CertificateException;
    import java.security.cert.X509Certificate;
    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.Map.Entry;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    
    import javax.net.ssl.SSLContext;
    import javax.net.ssl.TrustManager;
    import javax.net.ssl.X509TrustManager;
    
    import org.apache.http.Header;
    import org.apache.http.HttpException;
    import org.apache.http.HttpHost;
    import org.apache.http.HttpRequest;
    import org.apache.http.HttpResponse;
    import org.apache.http.NameValuePair;
    import org.apache.http.auth.AuthScope;
    import org.apache.http.auth.UsernamePasswordCredentials;
    import org.apache.http.client.entity.UrlEncodedFormEntity;
    import org.apache.http.client.methods.HttpGet;
    import org.apache.http.client.methods.HttpPost;
    import org.apache.http.client.methods.HttpUriRequest;
    import org.apache.http.client.params.ClientPNames;
    import org.apache.http.client.params.CookiePolicy;
    import org.apache.http.conn.routing.HttpRoute;
    import org.apache.http.conn.routing.HttpRoutePlanner;
    import org.apache.http.conn.scheme.Scheme;
    import org.apache.http.conn.scheme.SchemeRegistry;
    import org.apache.http.conn.ssl.SSLSocketFactory;
    import org.apache.http.impl.client.DefaultHttpClient;
    import org.apache.http.message.BasicNameValuePair;
    import org.apache.http.params.CoreProtocolPNames;
    import org.apache.http.params.HttpConnectionParams;
    import org.apache.http.params.HttpParams;
    import org.apache.http.protocol.HTTP;
    import org.apache.http.protocol.HttpContext;
    import org.apache.http.util.EntityUtils;
    
    public class WebClient {
    	private DefaultHttpClient httpClient = new DefaultHttpClient();
    	private String url;
    	private HTTPMethod method;
    	private byte[] content;
    	private Map<String, String> headers = new HashMap<String, String>();
    	private int responseCode;
    	private List<NameValuePair> postParameter = new ArrayList<NameValuePair>();
    
    	private static final Pattern pageEncodingReg = Pattern.compile("content-type.*charset=([^\">\\\\]+)",Pattern.CASE_INSENSITIVE);
    	private static final Pattern headerEncodingReg = Pattern.compile("charset=(.+)", Pattern.CASE_INSENSITIVE);
    
    	public static void main(String[] args) throws Exception {
    		WebClient web = new WebClient("http://localhost:8081/test/api?s=1",HTTPMethod.GET);
    		// web.enableProxy("10.10.10.10", 8080, false, null, null,"127.0.0.1");
    		System.out.println(web.getTextContent());
    		System.out.println("------------------------------------------");
    		
    		/**
    		 * web.setUrl("https://mail.google.com/mail/"); 
    		 * System.out.println(web.getTextContent());
    		 * System.out.println("------------------------------------------");
    		 * web.setUrl("http://www.snee.com/xml/crud/posttest.cgi");
    		 * web.setMethod(HTTPMethod.POST);
    		 * web.addPostParameter("fname", "ababab");
    		 * web.addPostParameter("lname", "cdcdcd");
    		 * System.out.println(web.getTextContent());
    		 * System.out.println("------------------------------------------");
    		 */
    	}
    
    	public WebClient(String url, HTTPMethod method) {
    		this(url, method, false, null, 0, false, null, null, null);
    	}
    
    	public WebClient(String url, HTTPMethod method, String proxyHost, int proxyPort) {
    	    this(url, method, true, proxyHost, proxyPort, false, null, null, null);
    	}
    
    	public WebClient(String url, HTTPMethod method, boolean useProxy, String proxyHost, int proxyPort, boolean needAuth, String username, String password, String nonProxyReg) {
    	    setUrl(url);
    	    setMethod(method);
    	    if (useProxy) {
    	    	enableProxy(proxyHost, proxyPort, needAuth, username, password,nonProxyReg);
                }
    	}
    
    	public void setMethod(HTTPMethod method) {
    		this.method = method;
    	}
    
    	public void setUrl(String url) {
    		if (isStringEmpty(url)) {
    			throw new RuntimeException("URL must not be empty.");
    		}
    		this.url = url;
    		headers.clear();
    		responseCode = 0;
    		postParameter.clear();
    		content = null;
    		if (url.startsWith("https://")) {
    			enableSSL();
    		} else {
    			disableSSL();
    		}
    	}
    
    	public Map<String, String> getRequestHeaders() {
    		return headers;
    	}
    
    	public void setHeaders(Map<String, String> headers) {
    		this.headers = headers;
    	}
    
    	public void addPostParameter(String name, String value) {
    		this.postParameter.add(new BasicNameValuePair(name, value));
    	}
    
    	public void setTimeout(int connectTimeout, int readTimeout) {
    		HttpParams params = httpClient.getParams();
    		HttpConnectionParams.setConnectionTimeout(params, connectTimeout);
    		HttpConnectionParams.setSoTimeout(params, readTimeout);
    	}
    
    	private void enableSSL() {
    		try {
    			SSLContext sslcontext = SSLContext.getInstance("TLS");
    			sslcontext.init(null, new TrustManager[] {truseAllManager}, null);
    			SSLSocketFactory sf = new SSLSocketFactory(sslcontext, SSLSocketFactory.ALLOW_ALL_HOSTNAME_VERIFIER);
    			// sf.setHostnameVerifier();
    			Scheme https = new Scheme("https", 443, sf);
    			httpClient.getConnectionManager().getSchemeRegistry().register(https);
    		} catch (KeyManagementException e) {
    			e.printStackTrace();
    		} catch (NoSuchAlgorithmException e) {
    			e.printStackTrace();
    		}
    	}
    
    	private void disableSSL() {
    		SchemeRegistry reg = httpClient.getConnectionManager().getSchemeRegistry();
    		if (reg.get("https") != null) {
    			reg.unregister("https");
    		}
    	}
    
    	public void disableProxy() {
    		httpClient.getCredentialsProvider().clear();
    		httpClient.setRoutePlanner(null);
    	}
    
    	public void enableProxy(final String proxyHost, final int proxyPort, boolean needAuth, String username, String password, final String nonProxyHostRegularExpression) {
    		if (needAuth) { httpClient.getCredentialsProvider()
    				.setCredentials(new AuthScope(proxyHost, proxyPort), new UsernamePasswordCredentials(username, password));
    		}
    		// httpClient.getParams().setParameter(ConnRoutePNames.DEFAULT_PROXY,new
    		// HttpHost(proxyHost, proxyPort));
    		httpClient.setRoutePlanner(new HttpRoutePlanner() {
    			@Override
    			public HttpRoute determineRoute(HttpHost target, HttpRequest request, HttpContext context) throws HttpException {
    				HttpRoute proxyRoute = new HttpRoute(target, null, new HttpHost(proxyHost, proxyPort), "https".equalsIgnoreCase(target.getSchemeName()));
    				if (nonProxyHostRegularExpression == null) {
    					return proxyRoute;
    				}
    				Pattern pattern = Pattern.compile(nonProxyHostRegularExpression, Pattern.CASE_INSENSITIVE);
    				Matcher m = pattern.matcher(target.getHostName());
    				if (m.find()) {
    					return new HttpRoute(target, null, target, "https".equalsIgnoreCase(target.getSchemeName()));
    				} else {
    					return proxyRoute;
    				}
    			}
    		});
    	}
    
    	private void fetch() throws IOException {
    		if (url == null || method == null) {
    			throw new RuntimeException("Invalid arguments.");
    		}
    		httpClient.getParams().setParameter(ClientPNames.COOKIE_POLICY, CookiePolicy.BROWSER_COMPATIBILITY);
    		HttpResponse response = null;
    		HttpUriRequest req = null;
    		if (method.equals(HTTPMethod.GET)) {
    			req = new HttpGet(url);
    		} else {
    			req = new HttpPost(url);
    			((HttpPost) req).setEntity(new UrlEncodedFormEntity(this.postParameter, HTTP.UTF_8));
    		}
    		for (Entry<String, String> e : headers.entrySet()) {
    			req.addHeader(e.getKey(), e.getValue());
    		}
    		req.getParams().setBooleanParameter(CoreProtocolPNames.USE_EXPECT_CONTINUE, false);
    		response = httpClient.execute(req);
    		Header[] header = response.getAllHeaders();
    		headers.clear();
    		for (Header h : header) {
    			headers.put(h.getName(), h.getValue());
    		}
    		content = EntityUtils.toByteArray(response.getEntity());
    		responseCode = response.getStatusLine().getStatusCode();
    	}
    
    	private boolean isStringEmpty(String s) {
    		return s == null || s.length() == 0;
    	}
    
    	public int getResponseCode() throws IOException {
    		if (responseCode == 0) {
    			fetch();
    		}
    		return responseCode;
    	}
    
    	public Map<String, String> getResponseHeaders() throws IOException {
    		if (responseCode == 0) {
    			fetch();
    		}
    		return headers;
    	}
    
    	public byte[] getByteArrayContent() throws IOException {
    		if (content == null) {
    			fetch();
    		}
    		return content;
    	}
    
    	public String getTextContent() throws IOException {
    		if (content == null) {
    			fetch();
    		}
    		if (content == null) {
    			throw new RuntimeException("Failed to fetch content.");
    		}
    		String headerContentType = null;
    		if ((headerContentType = headers.get("Content-Type")) != null) {
    			// use http header encoding
    			Matcher m1 = headerEncodingReg.matcher(headerContentType);
    			if (m1.find()) {
    				return new String(content, m1.group(1));
    			}
    		}
    		String html = new String(content);
    		Matcher m2 = pageEncodingReg.matcher(html);
    		if (m2.find()) {
    			html = new String(content, m2.group(1));
    		}
    		return html;
    	}
    
    	public DefaultHttpClient getHttpClient() {
    		return httpClient;
    	}
    
    	public enum HTTPMethod {
    		GET, POST
    	}
    
    	private static TrustManager truseAllManager = new X509TrustManager() {
    		public X509Certificate[] getAcceptedIssuers() {
    			return null;
    		}
    
    		public void checkServerTrusted(X509Certificate[] chain, String authType) throws CertificateException {}
    
    		public void checkClientTrusted(X509Certificate[] chain, String authType) throws CertificateException {}
    	};
    }
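getTextContent() picks the charset from the Content-Type header, but its pattern charset=(.+) is greedy, so a quoted value or a trailing parameter would end up in the charset name and make new String(content, name) throw. A slightly more defensive variant might look like the sketch below. CharsetSniffer and its method names are hypothetical, and falling back to UTF-8 is an assumption; the original falls back to the platform default and then to the page's meta tag.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: header-based charset detection with a safe fallback.
class CharsetSniffer {
    // Stop at a quote, semicolon, or whitespace so trailing parameters are not captured.
    private static final Pattern CHARSET =
            Pattern.compile("charset\\s*=\\s*\"?([^\";\\s]+)", Pattern.CASE_INSENSITIVE);

    // Return the charset named in a Content-Type value, or the fallback if absent or unknown.
    static Charset fromContentType(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        Matcher m = CHARSET.matcher(contentType);
        if (!m.find()) {
            return fallback;
        }
        try {
            return Charset.forName(m.group(1));
        } catch (IllegalArgumentException e) {
            // Covers both illegal and unsupported charset names.
            return fallback;
        }
    }

    // Decode a response body using the header's charset, defaulting to UTF-8 (an assumption).
    static String decode(byte[] body, String contentType) {
        return new String(body, fromContentType(contentType, StandardCharsets.UTF_8));
    }
}
```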
  • WebClient JAR package

    2018-09-04 10:26:17
  • Java Maven dependencies: WebClient

    1,000+ views 2019-09-30 14:54:09
        <dependencies>
            <dependency>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-starter-webflux</artifactId>
                <version>2.1.8.RELEASE</version>
            </dependency>
            <dependency>
                <groupId>org.projectlombok</groupId>
                <artifactId>lombok</artifactId>
                <optional>true</optional>
                <version>1.18.10</version>
            </dependency>
            <dependency>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-starter-test</artifactId>
                <scope>test</scope>
                <version>2.1.8.RELEASE</version>
            </dependency>
            <dependency>
                <groupId>io.projectreactor</groupId>
                <artifactId>reactor-test</artifactId>
                <scope>test</scope>
                <version>3.2.12.RELEASE</version>
            </dependency>
        </dependencies>

     

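  • One goal of this technical blog update is to show how to use Java-based crawler tools for targeted crawling of web resources. As a teaser: I have been working on web crawlers for some time and have built my own Java crawler framework, called the stupy framework, combined with Spark, scalar...
  • Covers using HttpClient in a Java web application to simulate a browser login and then issue requests; readers who need it can use it as a reference
  • [Winform] Study notes (2): WebClient asynchronous callbacks to a Java backend API. Background: the project needs a Winform desktop program in which Winform mainly provides the front-end UI and all data comes from a Java backend API, so this post shows Winform calling a Java backend API...
  • Implementing a Java crawler with htmlunit/WebClient makes it possible to download CSS and JS files, preserving as much of the page's original styling and dynamic behavior as possible. Most importantly, it can capture data that JS loads dynamically. The code is as follows: where webClient....
  • External packages: none required beyond the default Java packages such as java.io and java.net. Operating system: Windows 7. Command-line interface: Windows Command Prompt, used to run/test the program. ### Directory structure. Server: contains the source files of the server implementation and the default...
  • Asynchronous calls to Java interfaces

    2020-08-26 00:11:43
    Covers asynchronous calls to Java interfaces; let's work through it together
  • : part of the Spring WebFlux package; non-blocking and reactive; requires Java 8 or later; supports functional programming; performs well. RestTemplate: part of the spring-web module; follows RESTful principles; simple code; relies on the JDK's HTTP connection tools by default. HttpClient: Apache HttpClient...
  • import java.io.BufferedReader; import java.io.DataOutputStream; import java.io.IOException; import java.io.InputStream; import java.io.InputStreamReader; import java.net.HttpURLConnection; impor
  • Sending various requests with WebClient

    1,000+ views 2019-08-06 16:58:12
    WebClient webClient = WebClient.create(); String url = "http://localhost:8091/flux/object"; /** * GET request * returns a single string */ Mono<String> mono = webClient.get().uri(url).retrieve()....
  • I have recently been working on a project in which scanning the QR code on a business license yields a URL. The link redirects to a company information disclosure page, and the required information has to be extracted through that link (company name, ...The approach finally adopted: WebClient simulates a browser client, with JS dynamic...
  • WebClientDemo

    2014-11-13 14:46:53
    A WebClient example that runs directly in Eclipse
  • Demo of HTTPS requests based on javaWeb-httpclient

    Popular discussion 2014-09-17 11:15:00
    This article presents an example of making HTTPS requests. For help, see: http://blog.csdn.net/zhangxiaowei_/article/details/39339775
  • POST implementation with WebClient

    2012-10-29 11:40:56
    Uses Silverlight to simulate a web page's POST method for uploading data.
  • NULL Blog post link: https://piranha.iteye.com/blog/2119924
  • Today I had some spare time after work to write up the webService and webClient I built a while ago. The steps are as follows: a complete callback example of developing a Java webService and a Java webClient in MyEclipse
  • at java.util.Optional.orElseGet(Optional.java:267) at org.springframework.web.reactive.function.BodyExtractors.readWithMessageReaders(BodyExtractors.java:250) at org.springframework.web.reactive....
  • Spring 5 officially introduced WebClient, reportedly as the future replacement for RestTemplate; I can't say whether that is true. Since there is something new, let's look at how to use it and keep some notes along the way. As usual, results first, then the problems: import org.springframework....
  • In this article we look at several ways to accomplish this task by making parallel service calls with the Spring reactive WebClient. 2. A recap of reactive programming. As a quick recap, WebClient was introduced in Spring 5 and is included in the Spring Web Reactive module...
  • // WebClient usage MyImageServerEntities db = new MyImageServerEntities(); public ActionResult Index() { return View(); } public ActionResult FileUpl...
  • Fetching network resources, using dynamic proxy IPs to work around per-IP request limits
  • Preface: as described in the book 《Java异步编程实战》 (Java Asynchronous Programming in Practice), Spring 5 introduced the WebFlux stack, which exists alongside the Web Servlet stack and supports asynchronous processing at the framework level. With Spring WebFlux you can...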
