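While working on:
[Solved] PySpider request simulating the Xiaohuasheng app's parentChildReadingBookQuery2 returns empty data
I confirmed that when fetching different pages of parentChildReadingBookQuery2 data, i.e. with different offset values:
0
10
20
the corresponding timestamp and signature values are different each time.
So the next step is to work out exactly how they are generated.
Go back to the Java source recovered earlier in:
[Solved] Decompiling the dex and jar sources containing the business logic from different versions of the Xiaohuasheng apk
and see whether the logic can be found there.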

com/huili/readingclub/activity/classroom/SelfReadingActivity.java
private void getSupportingResources()
{
JsonObject localJsonObject1 = new JsonObject();
JsonObject localJsonObject2 = new JsonObject();
localJsonObject2.addProperty("userId", MainActivity.userId);
localJsonObject2.addProperty("fieldName", this.mFieldName);
localJsonObject2.addProperty("fieldValue", this.mFieldValue);
localJsonObject2.addProperty("grade", this.mGrades);
localJsonObject2.addProperty("level", this.mLevels);
localJsonObject1.addProperty("J", localJsonObject2.toString());
localJsonObject1.addProperty("C", Integer.valueOf(0));
XutilsHttpClient.sendHttpJson(this, HttpRequest.HttpMethod.POST, "http://www.xiaohuasheng.cn:83/Reading.svc/getSupportingResourcesInSelfReadingBookQueryCondition", localJsonObject1.toString(), new RequestCallBack()
{
...
-> there is no timestamp or signature here
-> which suggests they are generated centrally by a wrapper in the library
-> so look at:
XutilsHttpClient.sendHttpJson
->
com/huili/readingclub/network/XutilsHttpClient.java
public static void sendHttpJson(Context paramContext, HttpRequest.HttpMethod paramHttpMethod, String paramString1, String paramString2, RequestCallBack<String> paramRequestCallBack) ...
After looking around, searching for:
"signature"
finally led to:
com/huili/readingclub/model/ModelSecurity.java
public static void addSignature(RequestParams paramRequestParams, String paramString)
{
Object localObject = paramString;
if (paramString.startsWith("http://www.xiaohuasheng.cn:83")) {
localObject = paramString.substring("http://www.xiaohuasheng.cn:83".length());
}
if (((String)localObject).startsWith("/UserService.svc/getToken/")) {
return;
}
int i;
if (StringUtil.isNullOrEmpty(MainActivity.userId)) {
i = 0;
} else {
i = Integer.parseInt(MainActivity.userId);
}
int j = i;
if (i < 1) {
j = 0;
}
paramString = "";
if (j != 0) {
paramString = getToken(MainActivity.userId);
}
long l = DateUtils.get1970ToNowSeconds();
if (j == 0)
{
paramString = new StringBuilder();
paramString.append(l);
paramString.append((String)localObject);
paramString.append("AyGt7ohMR!xx#N");
paramString = StringUtil.md5(paramString.toString());
}
else
{
StringBuilder localStringBuilder = new StringBuilder();
localStringBuilder.append(MainActivity.userId);
localStringBuilder.append(l);
localStringBuilder.append((String)localObject);
localStringBuilder.append(paramString);
localStringBuilder.append("AyGt7ohMR!xx#N");
paramString = StringUtil.md5(localStringBuilder.toString());
}
if (j != 0) {
paramRequestParams.addHeader("userId", MainActivity.userId);
}
localObject = new StringBuilder();
((StringBuilder)localObject).append(l);
((StringBuilder)localObject).append("");
paramRequestParams.addHeader("timestamp", ((StringBuilder)localObject).toString());
paramRequestParams.addHeader("signature", paramString);
}
Postscript: I later found that the source exported by Jadx is more readable:
public static void addSignature(RequestParams requestParams, String str) {
if (str.startsWith(MyConfig.SERVER_PORT)) {
str = str.substring(MyConfig.SERVER_PORT.length());
}
if (!str.startsWith("/UserService.svc/getToken/")) {
StringBuilder stringBuilder;
int parseInt = StringUtil.isNullOrEmpty(MainActivity.userId) ? 0 : Integer.parseInt(MainActivity.userId);
if (parseInt < 1) {
parseInt = 0;
}
String str2 = "";
if (parseInt != 0) {
str2 = getToken(MainActivity.userId);
}
long j = DateUtils.get1970ToNowSeconds();
if (parseInt == 0) {
stringBuilder = new StringBuilder();
stringBuilder.append(j);
stringBuilder.append(str);
stringBuilder.append(MyConfig.SECRET_KEY);
str = StringUtil.md5(stringBuilder.toString());
} else {
StringBuilder stringBuilder2 = new StringBuilder();
stringBuilder2.append(MainActivity.userId);
stringBuilder2.append(j);
stringBuilder2.append(str);
stringBuilder2.append(str2);
stringBuilder2.append(MyConfig.SECRET_KEY);
str = StringUtil.md5(stringBuilder2.toString());
}
if (parseInt != 0) {
requestParams.addHeader(USER.USERID, MainActivity.userId);
}
stringBuilder = new StringBuilder();
stringBuilder.append(j);
stringBuilder.append("");
requestParams.addHeader("timestamp", stringBuilder.toString());
requestParams.addHeader("signature", str);
}
}
which makes the program logic easier to understand.
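A minimal Python sketch of the two branches of addSignature might look like the following. It assumes that StringUtil.md5 returns the lowercase hex MD5 of the UTF-8 bytes (its body is not shown in the decompiled source). The names build_signature and DEMO_SECRET, and all sample values, are hypothetical and only for illustration; they are not the app's real key or token.

```python
import time
from hashlib import md5

# Host prefix that addSignature strips before signing (taken from the Java source).
HOST_PREFIX = "http://www.xiaohuasheng.cn:83"
# Hypothetical placeholder, NOT the app's real secret key.
DEMO_SECRET = "demo-secret"


def build_signature(payload, timestamp, secret_key, user_id="", token=""):
    """Sketch of addSignature: returns the hex signature, or None when skipped."""
    if payload.startswith(HOST_PREFIX):
        payload = payload[len(HOST_PREFIX):]
    # The Java code returns early (no signature) for the getToken endpoint.
    if payload.startswith("/UserService.svc/getToken/"):
        return None
    logged_in = bool(user_id) and int(user_id) >= 1
    if logged_in:
        # userId + timestamp + payload + token + secret
        raw = user_id + str(timestamp) + payload + token + secret_key
    else:
        # timestamp + payload + secret
        raw = str(timestamp) + payload + secret_key
    # Assumption: StringUtil.md5 == lowercase hex MD5 over UTF-8 bytes.
    return md5(raw.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    ts = int(time.time())
    print(build_signature("/Reading.svc/demo", ts, DEMO_SECRET))
```

The logged-in and anonymous branches differ only in whether userId and the token are mixed into the hashed string, so one helper with optional arguments covers both.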
I also found where addSignature is called:
com/huili/readingclub/network/XutilsHttpClient.java
public static void sendHttpJson(Context paramContext, HttpRequest.HttpMethod paramHttpMethod, String paramString1, String paramString2, RequestCallBack<String> paramRequestCallBack)
{
if (!isNetWorkAvaiable(paramContext))
{
paramRequestCallBack.onFailure(new HttpException("检查网络设置"), "检查网络设置");
return;
}
RequestParams localRequestParams = new RequestParams();
localRequestParams.addHeader("Content-Type", "application/json");
localRequestParams.addHeader("Authorization", "NSTp9~)NwSfrXp@\\");
try
{
if (paramHttpMethod == HttpRequest.HttpMethod.GET) {
ModelSecurity.addSignature(localRequestParams, paramString1);
} else if (paramHttpMethod == HttpRequest.HttpMethod.POST) {
ModelSecurity.addSignature(localRequestParams, (String)JsonUtil.jsonToMap(paramString2).get("J"));
}
}
...
As can be seen, for a POST the call passes in:
- localRequestParams
- the value of the J field from the JSON body
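The choice of what gets signed can be sketched in Python as below. This is only an illustration based on the fragment of sendHttpJson shown above (the rest of the method is elided); signing_input is a hypothetical helper name and the sample body is made up.

```python
import json


def signing_input(method, url, body=""):
    """Return the string handed to addSignature, per the fragment shown above."""
    method = method.upper()
    if method == "GET":
        # For GET the full URL is passed; addSignature strips the host prefix itself.
        return url
    if method == "POST":
        # For POST only the value of the "J" field of the JSON body is passed.
        return json.loads(body)["J"]
    # The fragment shows no signing call for any other method.
    return None


if __name__ == "__main__":
    demo_body = json.dumps({"J": json.dumps({"offset": 0, "limit": 10}), "C": 0})
    print(signing_input("POST", "http://www.xiaohuasheng.cn:83/Reading.svc/demo", demo_body))
```

One practical consequence: for a POST, the string that is signed has to be exactly the J string that is sent, so it is safer to build J once and reuse it for both the body and the signature.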
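Then work out how to convert this into Python code:
[Solved] Implementing the md5 signature generation of Xiaohuasheng's addSignature in Python
With the correct signature calculation in hand, generate the timestamp in Python, call the md5 signature calculation, and add the related functions: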
#!/usr/bin/env python
# -*- encoding: utf-8 -*-
# Created on 2019-03-27 15:35:20
# Project: XiaohuashengApp
from pyspider.libs.base_handler import *
import os
import json
import codecs
import base64
import gzip
import copy
import time
# import datetime
from datetime import datetime, timedelta
from hashlib import md5
######################################################################
# Const
######################################################################
SelfReadingUrl = "http://www.xiaohuasheng.cn:83/Reading.svc/selfReadingBookQuery2"
ParentChildReadingUrl = "http://www.xiaohuasheng.cn:83/Reading.svc/parentChildReadingBookQuery2"
RESPONSE_OK = "1001"
######################################################################
# Config & Settings
######################################################################
OutputFolder = "/Users/crifan/dev/dev_root/company/xxx/projects/crawler_projects/crawler_xiaohuasheng_app/output"
DefaultPageSize = 10
UserAgentNoxAndroid = "Mozilla/5.0 (Linux; U; Android 4.4.2; zh-cn; A0001 Build/KOT49H) AppleWebKit/533.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1"
gUserId = "1134723"
gAuthorization = """NSTp9~)NwSfrXp@\\"""
# Timestamp = "1553671960"
# Signature = "bc8d887e7f16ef92269f55bfd74e131b"
# Timestamp = "1553671956"
# Signature = "a1f51f6852f42d934f0ac154dc374500"
# Timestamp = "1553675572"
# Signature = "d516df23120d3c9f0397475f03c61199"
# gTimestamp = "1553845333"
# gSignature = "2c30a3ac5898fa43eeececfa21560f5b"
gUserToken = "40d2267f-359e-4526-951a-66519e5868c3"
gSecretKey = "AyGt7ohMR!xxx#N"
gHeaders = {
    "Host": "www.xiaohuasheng.cn:83",
    "User-Agent": UserAgentNoxAndroid,
    "Content-Type": "application/json",
    "userId": gUserId,
    "Authorization": gAuthorization,
    # "timestamp": gTimestamp,
    # "signature": gSignature,
    "cookie": "ASP.NET_SessionId=dxf3obxgn5t4w350xp3icgy0",
    # "Cookie2": "$Version=1",
    "Accept": "*/*",
    "Accept-Encoding": "gzip, deflate",
    "cache-control": "no-cache",
    "Connection": "keep-alive",
    # "content-length": "202",
}
######################################################################
# Common Util Functions
######################################################################
def getCurTimestamp(withMilliseconds=False):
    """
    get current time's timestamp
        (default)not milliseconds -> 10 digits: 1351670162
        with milliseconds -> 13 digits: 1531464292921
    """
    curDatetime = datetime.now()
    return datetimeToTimestamp(curDatetime, withMilliseconds)

def datetimeToTimestamp(datetimeVal, withMilliseconds=False):
    """
    convert datetime value to timestamp
    eg:
        "2006-06-01 00:00:00.123" -> 1149091200
        if with milliseconds -> 1149091200123
    :param datetimeVal:
    :return:
    """
    timetupleValue = datetimeVal.timetuple()
    timestampFloat = time.mktime(timetupleValue) # 1531468736.0 -> 10 digits
    timestamp10DigitInt = int(timestampFloat) # 1531468736
    timestampInt = timestamp10DigitInt
    if withMilliseconds:
        microsecondInt = datetimeVal.microsecond # 817762
        microsecondFloat = float(microsecondInt)/float(1000000) # 0.817762
        timestampFloat = timestampFloat + microsecondFloat # 1531468736.817762
        timestampFloat = timestampFloat * 1000 # 1531468736817.7621 -> 13 digits
        timestamp13DigitInt = int(timestampFloat) # 1531468736817
        timestampInt = timestamp13DigitInt
    return timestampInt
...
######################################################################
# Main
######################################################################
class Handler(BaseHandler):
    crawl_config = {
        # 'headers': {
        #     "Host": "www.xiaohuasheng.cn:83",
        #     "User-Agent": UserAgentNoxAndroid,
        #     "Content-Type": "application/json",
        #     "userId": gUserId,
        #     "Authorization": gAuthorization,
        #     # "timestamp": gTimestamp,
        #     # "signature": gSignature,
        #     "cookie": "ASP.NET_SessionId=dxf3obxgn5t4w350xp3icgy0",
        #     # "Cookie2": "$Version=1",
        #     "Accept": "*/*",
        #     "Accept-Encoding": "gzip, deflate",
        #     "cache-control": "no-cache",
        #     "Connection": "keep-alive",
        #     # "content-length": "202",
        # },
    }

    def on_start(self):
        offset = 0
        limit = DefaultPageSize
        self.getParentChildReading(offset, limit)

    def getParentChildReading(self, offset, limit):
        print("offset=%d, limit=%d" % (offset, limit))
        # jTemplate = """{"userId":"%s","fieldName":"","fieldValue":"全部类别","theStageOfTheChild":"","parentalEnglishLevel":"","supportingResources":"有音频","offset":%d,"limit":%d}"""
        jTemplate = "{\"userId\":\"%s\",\"fieldName\":\"\",\"fieldValue\":\"全部类别\",\"theStageOfTheChild\":\"\",\"parentalEnglishLevel\":\"\",\"supportingResources\":\"有音频\",\"offset\":%d,\"limit\":%d}"
        jcJsonDict = {
            "J": jTemplate % (gUserId, offset, limit),
            "C": 0
        }
        # print("jcJsonDict=%s" % jcJsonDict)
        parentChildReadingParamDict = {
            "offset": offset,
            "limit": limit,
            "jTemplate": jTemplate,
            "jcJsonDict": jcJsonDict
        }
        # for debug
        # jcJsonDict = {"J":"{\"userId\":\"1134723\",\"fieldName\":\"\",\"fieldValue\":\"全部类别\",\"theStageOfTheChild\":\"\",\"parentalEnglishLevel\":\"\",\"supportingResources\":\"\",\"offset\":0,\"limit\":10}","C":0}
        # jcJsonDict = {"J":"{\"userId\":\"1134723\",\"fieldName\":\"\",\"fieldValue\":\"全部类别\",\"theStageOfTheChild\":\"\",\"parentalEnglishLevel\":\"\",\"supportingResources\":\"有音频\",\"offset\":0,\"limit\":10}","C":0}
        # jcJsonDict = {"J":"{\"userId\":\"1134723\",\"fieldName\":\"\",\"fieldValue\":\"全部类别\",\"grades\":\"\",\"levels\":\"\",\"supportingResources\":\"有音频\",\"offset\":0,\"limit\":10}","C":0}
        # jcJsonDict = {"J":"{\"userId\":\"1134723\",\"fieldName\":\"\",\"fieldValue\":\"全部类别\",\"grades\":\"\",\"levels\":\"\",\"supportingResources\":\"\",\"offset\":0,\"limit\":10}","C":0}
        # jcJsonDict = {"J":'{"userId":"1134723","fieldName":"","fieldValue":"全部类别","grades":"","levels":"","supportingResources":"","offset":0,"limit":10}',"C":0}
        # jcJsonDict = {"J":"{\"userId\":\"1134723\",\"fieldName\":\"\",\"fieldValue\":\"全部类别\",\"theStageOfTheChild\":\"\",\"parentalEnglishLevel\":\"\",\"supportingResources\":\"有音频\",\"offset\":0,\"limit\":10}","C":0}
        jValueStr = jcJsonDict["J"]
        # print("jValueStr=%s" % jValueStr)
        jcJsonDictStr = json.dumps(jcJsonDict)
        # print("jcJsonDictStr=%s" % jcJsonDictStr)
        # # for debug
        # jcJsonDictStr = '{"J":"{\"userId\":\"1134723\",\"fieldName\":\"\",\"fieldValue\":\"全部类别\",\"theStageOfTheChild\":\"\",\"parentalEnglishLevel\":\"\",\"supportingResources\":\"\",\"offset\":0,\"limit\":10}","C":0}'
        # print("jcJsonDictStr=%s" % jcJsonDictStr)
        curHeaders = copy.deepcopy(gHeaders)
        # print("curHeaders=%s" % curHeaders)
        # print("type(curHeaders)=%s" % type(curHeaders))
        curTimestampInt = getCurTimestamp()
        # # for debug
        # curTimestampInt = int(gTimestamp)
        # print("curTimestampInt=%s" % curTimestampInt)
        curTimestampStr = str(curTimestampInt)
        # print("curTimestampStr=%s" % curTimestampStr)
        curHeaders["timestamp"] = curTimestampStr
        # jValueParamStr = jcJsonDictStr
        # jcJsonDictFormattedStr = "%s" % jcJsonDict
        # jcJsonDictFormattedStr = jcJsonDictFormattedStr.replace(" ", "")
        # print("jcJsonDictFormattedStr=%s" % jcJsonDictFormattedStr)
        # jValueParamStr = jcJsonDictFormattedStr
        jValueParamStr = jValueStr
        calculatedSignature = self.generateSignature(curTimestampInt, jValueParamStr)
        # print("calculatedSignature=%s" % calculatedSignature)
        curHeaders["signature"] = calculatedSignature
        # # for debug
        # curHeaders["signature"] = gSignature
        # print("debug signature=%s" % curHeaders["signature"])
        self.crawl(ParentChildReadingUrl,
            method="POST",
            # data=jcJsonDict,
            data= jcJsonDictStr,
            callback=self.getParentChildReadingCallback,
            headers=curHeaders,
            save=parentChildReadingParamDict
        )

    def generateSignature(self, timestampInt, jValueParamStr):
        # print("generateSignature: timestampInt=%d, jValueParamStr=%s" % (timestampInt, jValueParamStr))
        # userId = "1134723"
        userId = gUserId
        timestamp = "%s" % timestampInt
        # localObject = "/Reading.svc/parentChildReadingBookQuery2"
        # localObject = jValueParamStr
        # userToken = "40d2267f-359e-4526-951a-66519e5868c3"
        userToken = gUserToken
        # fixedSault = "AyGt7ohMR!xx#N"
        # secretKey = "AyGt7ohMR!xx#N"
        secretKey = gSecretKey
        # strToCalc = userId + timestamp + localObject + jValueParamStr + fixedSault
        # strToCalc = timestamp + localObject + fixedSault
        strToCalc = userId + timestamp + jValueParamStr + userToken + secretKey
        # print("strToCalc=%s" % strToCalc)
        encodedStr = strToCalc.encode()
        # encodedStr = strToCalc.encode("UTF-8")
        # print("encodedStr=%s" % encodedStr)
        md5Result = md5(encodedStr)
        # print("md5Result=%s" % md5Result) # md5Result=<md5 HASH object @ 0x1044f1df0>
        # md5Result = md5()
        # md5Result.update(strToCalc)
        # md5Digest = md5Result.digest()
        # print("md5Digest=%s" % md5Digest) #
        # print("len(md5Digest)=%s" % len(md5Digest))
        md5Hexdigest = md5Result.hexdigest()
        # print("md5Hexdigest=%s" % md5Hexdigest)
        # print("len(md5Hexdigest)=%s" % len(md5Hexdigest))
        # md5Hexdigest=c687d5dfa015246e6bdc6b3c27c2afea
        print("md5=%s from %s" % (md5Hexdigest, strToCalc))
        return md5Hexdigest
        # return md5Digest

    def extractResponseData(self, respJson):
        """
        {
            "C": 2,
            "J": "H4sIAA.......AA=",
            "M": "1001",
            "ST": null
        }
        """
        # respJson = json.loads(respJson)
        respM = respJson["M"]
        if respM != RESPONSE_OK:
            return None
        encodedStr = respJson["J"]
        decodedStr = base64.b64decode(encodedStr)
        # print("decodedStr=%s" % decodedStr)
        decompressedStr = gzip.decompress(decodedStr)
        # print("decompressedStr=%s" % decompressedStr)
        decompressedStrUnicode = decompressedStr.decode("UTF-8")
        # print("decompressedStrUnicode=%s" % decompressedStrUnicode)
        decompressedJson = json.loads(decompressedStrUnicode)
        respDataDict = decompressedJson
        return respDataDict

    def getParentChildReadingCallback(self, response):
        respUrl = response.url
        print("respUrl=%s" % respUrl)
        prevParaDict = response.save
        print("prevParaDict=%s" % prevParaDict)
        respJson = response.json
        print("respJson=%s" % respJson)
        respData = self.extractResponseData(respJson)
        print("respData=%s" % respData)
        if respData:
            bookSeriesList = respData
            for eachBookSerie in bookSeriesList:
                print("eachBookSerie=%s" % eachBookSerie)
                # self.getStorybookDetail(eachBookSerie)
            prevOffset = prevParaDict["offset"]
            limit = prevParaDict["limit"]
            print("prevOffset=%d, limit=%d" % (prevOffset, limit))
            offset = prevOffset + limit
            self.getParentChildReading(offset, limit)
        else:
            print("!!! %s return no more data: %s" % (response.url, respJson))
With this in place, calling the API should return the expected JSON string containing the J field.
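To make the shape of that J field concrete, here is a small round-trip sketch of the envelope, assuming (as extractResponseData does) that J is base64-encoded, gzip-compressed UTF-8 JSON. The helper names encode_j and decode_j and the sample payload are hypothetical; the sample is built locally rather than taken from a real response.

```python
import base64
import gzip
import json


def decode_j(encoded):
    """Undo the envelope: base64 -> gzip -> UTF-8 -> JSON."""
    compressed = base64.b64decode(encoded)
    text = gzip.decompress(compressed).decode("utf-8")
    return json.loads(text)


def encode_j(obj):
    """Build a J value the same way, only to have something to decode locally."""
    raw = json.dumps(obj, ensure_ascii=False).encode("utf-8")
    return base64.b64encode(gzip.compress(raw)).decode("ascii")


if __name__ == "__main__":
    # Made-up envelope with the same keys as the response shown in the docstring.
    sample = {"C": 2, "J": encode_j([{"title": "demo"}]), "M": "1001", "ST": None}
    if sample["M"] == "1001":
        print(decode_j(sample["J"]))
```

Gzip output always begins with the bytes 1f 8b 08, which is why the base64 text of a real J value starts with "H4sI", as in the sample response above.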

Paging then continues until the last page of data has been fetched.
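The stop condition can be illustrated in isolation with a plain loop. This is only a sketch of the paging idea, not how PySpider schedules requests: iter_pages and fake_fetch are hypothetical names, and fake_fetch stands in for the real signed API call.

```python
def iter_pages(fetch_page, limit=10):
    """Yield (offset, items) page by page until a page comes back empty."""
    offset = 0
    while True:
        items = fetch_page(offset, limit)
        if not items:
            # An empty (or None) page means there is no more data.
            break
        yield offset, items
        offset += limit


def fake_fetch(offset, limit):
    """Stand-in for the API call: serves slices of 25 fake records."""
    records = list(range(25))
    return records[offset:offset + limit]


if __name__ == "__main__":
    for offset, items in iter_pages(fake_fetch):
        print("offset=%d, got %d items" % (offset, len(items)))
```

In the handler above the same idea is spread across two methods: the callback computes offset = prevOffset + limit and issues the next request, and the else branch is where the loop ends.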

Please credit the source when reposting: 在路上 » [Solved] How timestamp and signature are generated when the Xiaohuasheng app calls the parentChildReadingBookQuery2 API