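I have recently been learning to scrape data with a web crawler. With HtmlUnit I can get data that is loaded dynamically. Some sites require logging in first, though, and that is where I run into trouble: I cannot get the data that should be available after login.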
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.NicelyResynchronizingAjaxController;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlButton;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import com.gargoylesoftware.htmlunit.html.HtmlPasswordInput;
import com.gargoylesoftware.htmlunit.html.HtmlTextInput;
import com.gargoylesoftware.htmlunit.util.Cookie;

public static void TianyaTestByHtmlUnit() {
    // try-with-resources closes the client even when an exception is thrown
    try (WebClient webClient = new WebClient(BrowserVersion.FIREFOX_45)) {
        webClient.getOptions().setUseInsecureSSL(true);
        webClient.getOptions().setJavaScriptEnabled(true);
        webClient.getOptions().setCssEnabled(false);
        webClient.setAjaxController(new NicelyResynchronizingAjaxController());
        webClient.getOptions().setRedirectEnabled(true);
        // A ScriptException means the page's JavaScript has an error.
        // Most browsers tolerate such errors, but HtmlUnit is stricter,
        // so do not abort on script errors or failing status codes.
        webClient.getOptions().setThrowExceptionOnFailingStatusCode(false);
        webClient.getOptions().setThrowExceptionOnScriptError(false);
        webClient.setJavaScriptTimeout(20000);
        webClient.getCookieManager().setCookiesEnabled(true);
        // Load the login page
        HtmlPage page = webClient.getPage("http://passport.tianya.cn/login.jsp");
        System.out.println("Original page data =" + page.asXml());
        HtmlTextInput username = (HtmlTextInput) page.getElementById("userName");
        username.type("lms_test_****");
        HtmlPasswordInput password = (HtmlPasswordInput) page.getElementById("password");
        password.click();
        password.type("liu*****");
        //HtmlAnchor submit = page.getAnchorByName("loginBtn");
        HtmlButton submit = (HtmlButton) page.getElementById("loginBtn");
        webClient.waitForBackgroundJavaScript(4000);
        submit.click();
        // Wait for the login request and any script-driven redirect to finish
        webClient.waitForBackgroundJavaScript(10000);
        // Re-read the page from the window: the page returned by click()
        // may be stale if JavaScript navigated after the click
        HtmlPage nextPage = (HtmlPage) webClient.getCurrentWindow().getEnclosedPage();
        System.out.println("After click login button =" + nextPage.asXml());
        Set<Cookie> cookies = webClient.getCookieManager().getCookies();
        Map<String, String> responseCookies = new HashMap<String, String>();
        for (Cookie c : cookies) {
            responseCookies.put(c.getName(), c.getValue());
            System.out.println("cookie name --" + c.getName() + " value:" + c.getValue());
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}
The data returned by nextPage.asXml() is always almost the same as before logging in. Any help from the experts would be appreciated!
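If the cookies collected in responseCookies are meant to be reused by another HTTP client after login, they can be joined into a single Cookie request header. This is only a minimal sketch using the standard library; the class name CookieHeader and the method toHeader are hypothetical and not part of HtmlUnit.

```java
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of HtmlUnit
final class CookieHeader {
    /** Joins name/value pairs into the value of a Cookie request header. */
    static String toHeader(Map<String, String> cookies) {
        StringJoiner joiner = new StringJoiner("; ");
        for (Map.Entry<String, String> e : cookies.entrySet()) {
            joiner.add(e.getKey() + "=" + e.getValue());
        }
        return joiner.toString();
    }
}
```

The result can be sent as the Cookie header of a later request. Whether the site accepts it depends on how the session is bound on the server side.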
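1. First, a simple requirement: after logging in to the WeChat mini program, get the user's info and jump to the main page. When the user taps the personal center, the page shows the user's profile.
2. Folder layout
The login folder holds the login logic and pages: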
The wxml page:
<!--miniprogram/pages/login.wxml-->
<view class="login-container">
  <view class="title">WeChat Mini Program App</view>
  <view class="login-box">
    <label>Username</label>
    <input placeholder='Account / phone / WeChat / QQ' name="username" value='1433223'></input>
    <label>Password</label>
    <input password='true' name="pwd" value='123456'></input>
    <button bindtap="login" class="login-btn" form-type='submit'>Log in</button>
    <view class="three-line">------Third-party login------</view>
    <button class="login-btn" style="background-color:green">Log in with WeChat</button>
  </view>
</view>
The wxss styles:
/* pages/login/login.wxss */
page {
  height: 100%; /* setting height on page makes the page fill the screen */
  background-color: #fafafa;
}
.login-container {
  padding: 0 10%;
  height: 100%;
}
.title {
  font-size: large;
  text-align: center;
  padding-top: 10%;
  font-weight: bold;
}
.login-box {
  margin-top: 10%;
  padding: 10% 5%;
  background-color: white;
  border-radius: 10px;
  box-shadow: 0 4px 4px #888888;
}
.login-box>input {
  margin: 5% 0 8% 0;
  border-bottom: 1rpx solid lightgray;
}
.login-btn {
  width: 100%!important;
  background-color: #2f6afd;
  color: white;
  font-weight: normal;
}
.three-line {
  margin: 8% 0;
  text-align: center;
  font-size: 12px;
  color: gray;
}
The login.js page code:
Page({
  login: function () {
    // Jump to the tabBar page at the bottom
    wx.switchTab({
      url: '/pages/index/index'
    });
    // Call the WeChat API to get the logged-in user's info
    wx.getUserInfo({
      success: function (res) {
        console.log(res);
        // On success, save the info to storage
        wx.setStorage({
          data: res.userInfo,
          key: 'userInfo',
        })
      }
    })
  }
})
The wxml for the person page:
<!--pages/person/index.wxml-->
<view class="container">
  <view class="info-box">
    <view style="text-align: center;margin-bottom:10%;">
      <image class="avatar-img" src="{{avatarUrl}}" />
    </view>
    <text>Nickname: {{nickName}}</text>
    <text>Gender: {{gender}}</text>
    <text>Country: {{country}}</text>
    <text>Province: {{province}}</text>
  </view>
</view>
The js for the person page:
// pages/person/index.js
Page({
  /**
   * Initial page data
   */
  data: {
    nickName: "",
    avatarUrl: "",
    gender: "",
    province: "",
    city: "",
    country: ""
  },
  /**
   * Lifecycle function: runs when the page loads
   */
  onLoad: function (options) {
    // Read the user info saved in storage (empty object if nothing was saved)
    var userInfo = wx.getStorageSync('userInfo') || {};
    var that = this;
    // Map the numeric gender code to display text
    if (userInfo.gender == 1) {
      userInfo.gender = 'Male'
    } else if (userInfo.gender == 2) {
      userInfo.gender = 'Female'
    } else {
      userInfo.gender = 'Unknown'
    }
    // Assign the values to data
    that.setData({
      nickName: userInfo.nickName,
      avatarUrl: userInfo.avatarUrl,
      gender: userInfo.gender,
      province: userInfo.province,
      city: userInfo.city,
      country: userInfo.country
    })
  }
})
Screenshot: the login screen.
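I log in with ajax, and the page is not refreshed after a successful login. How can the page then get the logged-in user's info, for example ${sessionScope.member.loginName}? If anyone has a good approach, please share.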
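WeChat login on an H5 page: getting the user's basic info
1. First, apply for a test account on the WeChat Official Accounts Platform so you can test locally.
In the test account management page, find your appID and appsecret.
2. In the interface permission table, under the web account entry, edit the callback domain for OAuth2.0 web authorization. You can use natapp (NATAPP) to get a free tunnel; this small tool exposes your local address (the address your project runs on) to the public internet.
3. Scan the test account's QR code to follow the test official account.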
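4. In your project, create an HTML page that reads the code from the WeChat callback (the code can be exchanged for open_id and access_token):
var Appid = "your appID";
// Helper not shown in the original post; a minimal version that reads a query-string parameter
function getUrlParam(name) {
  return new URLSearchParams(location.search).get(name);
}
var code = getUrlParam("code"); // the code from the callback page
console.log("code:" + code);
if (code == null || code === "") {
  var fromurl = location.href;
  var s = encodeURIComponent(fromurl);
  var url = 'https://open.weixin.qq.com/connect/oauth2/authorize?appid=' + Appid + '&redirect_uri=' + s + '&response_type=code&scope=snsapi_userinfo&state=STATE&connect_redirect=1#wechat_redirect';
  location.href = url;
}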
If you do not want the front end to redirect the page itself, you can have a backend endpoint build the URL used to obtain the code:
// Takes two parameters: backUrl (the redirect URL) and stateParam (see the link in the front-end code above for the parameters)
public static String getAuthCodeUrl(String backUrl, String stateParam) {
    StringBuffer getCodeUrl = new StringBuffer();
    getCodeUrl.append("https://open.weixin.qq.com/connect/oauth2/authorize?");
    getCodeUrl.append("appid=" + appid);
    String backUri = backUrl; // e.g. "http://www.evshare.com.cn/wap/phone/index/do/toIndex.jsp"
    backUri = backUri.replaceAll(":", "%3A");
    backUri = backUri.replaceAll("/", "%2F");
    getCodeUrl.append("&redirect_uri=" + backUri); // redirect URL
    // snsapi_base only yields the openid; use snsapi_userinfo if you need the user info in step 6
    getCodeUrl.append("&response_type=code&scope=snsapi_base");
    getCodeUrl.append("&state=" + stateParam); // extra parameter
    getCodeUrl.append("#wechat_redirect");
    return getCodeUrl.toString();
}
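The manual replaceAll calls above only escape ":" and "/", so a redirect URL that carries its own query string ("?", "=", "&") would not be encoded fully. A minimal sketch of the same URL construction with the standard library's URLEncoder follows; the class name AuthUrlSketch and both method names are hypothetical, and the appid in the usage line is a placeholder.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper illustrating full percent-encoding of the parameters
final class AuthUrlSketch {
    /** Percent-encodes a query parameter value as UTF-8. */
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a charset every JVM must support
            throw new IllegalStateException(e);
        }
    }

    /** Builds the authorize URL with every parameter value encoded. */
    static String buildAuthorizeUrl(String appId, String redirectUri, String scope, String state) {
        return "https://open.weixin.qq.com/connect/oauth2/authorize"
                + "?appid=" + encode(appId)
                + "&redirect_uri=" + encode(redirectUri)
                + "&response_type=code"
                + "&scope=" + encode(scope)
                + "&state=" + encode(state)
                + "#wechat_redirect";
    }
}
```

For example, AuthUrlSketch.buildAuthorizeUrl("wxTEST", "http://example.com/cb?x=1", "snsapi_userinfo", "STATE") encodes the redirect URL as http%3A%2F%2Fexample.com%2Fcb%3Fx%3D1.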
5. Once you have the code from the steps above, call a backend endpoint to get the open_id and access_token (you can search Baidu for a WeChat Java utility class):
// Uses Apache Commons HttpClient 3.x and json-lib
import java.util.HashMap;
import java.util.Map;
import net.sf.json.JSONObject;
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.methods.GetMethod;

private static final String appid = "the test appid you just applied for";
private static final String secret = "the test appsecret you just applied for";
private static final String AUTH_ACCESS_TOKEN = "https://api.weixin.qq.com/sns/oauth2/access_token?grant_type=authorization_code&code=";

public static Map<String, String> getUserBasicData(String code) {
    Map<String, String> map = new HashMap<String, String>();
    try {
        String url = AUTH_ACCESS_TOKEN + code + "&appid=" + appid + "&secret=" + secret;
        HttpClient client = new HttpClient();
        GetMethod get = new GetMethod(url);
        get.getParams().setContentCharset("UTF-8");
        client.executeMethod(get);
        String str = get.getResponseBodyAsString();
        JSONObject obj = JSONObject.fromObject(str);
        // Missing fields fall back to an empty string
        String openid = obj.get("openid") == null ? "" : (String) obj.get("openid");
        String access_token = obj.get("access_token") == null ? "" : (String) obj.get("access_token");
        map.put("open_id", openid);
        map.put("access_token", access_token);
    } catch (Exception e) {
        e.printStackTrace();
    }
    return map;
}
One note here: the appid used by the front end and the back end must be the same, otherwise WeChat returns an error code.
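When a call fails, WeChat answers with a JSON body that contains errcode and errmsg instead of the expected fields, so it helps to check for errcode before reading openid or access_token. The sketch below pulls the code out with a regular expression only so that it runs without a JSON library; in the code above you would read it from the parsed JSONObject instead. The class name WeChatError and the method errcodeOf are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the regex stands in for a real JSON parser
final class WeChatError {
    private static final Pattern ERRCODE = Pattern.compile("\"errcode\"\\s*:\\s*(-?\\d+)");

    /** Returns the errcode found in the response body, or 0 when none is present. */
    static int errcodeOf(String body) {
        Matcher m = ERRCODE.matcher(body);
        return m.find() ? Integer.parseInt(m.group(1)) : 0;
    }
}
```

A non-zero result means the response is an error and the errmsg field should be logged rather than silently mapping the missing fields to empty strings.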
6. Once the steps above have returned the openid (the unique identifier of the WeChat user) and the access_token, add one more method to fetch the WeChat user's info (avatar, nickname, address, and so on):
private static final String USER_INFO_URL = "https://api.weixin.qq.com/sns/userinfo?access_token="; // fetches the WeChat user's info

public static JSONObject getWXUser(String openId, String appAccessToken) {
    JSONObject obj = null;
    try {
        String url = USER_INFO_URL + appAccessToken + "&openid=" + openId;
        HttpClient client = new HttpClient();
        GetMethod get = new GetMethod(url);
        get.getParams().setContentCharset("UTF-8");
        client.executeMethod(get);
        String str = get.getResponseBodyAsString();
        obj = JSONObject.fromObject(str);
        System.out.println("obj:" + obj);
    } catch (Exception e) {
        e.printStackTrace();
    }
    return obj;
}
With that, you have the info of the user logged in to WeChat, obtained inside WeChat's built-in browser.