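Working on:
[Unsolved] Optimizing some details after using Python to publish Evernote posts to WordPress
While continuing the optimization, I found a problem:
Previously, after syncing an Evernote note, even once every resource had been updated to empty, re-fetching the note's resources still returned one.
Later I worked it out myself. My guess is that it is:
the image that the post's thumbnail belongs to
For example, this post: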

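After the update, the note content contained no <en-media> any more, yet fetching still returned a resource.
The original code then:
uploaded it first, for example:
Note: the address ends in -4, which shows that repeated debugging had uploaded it several times, so the name had already collided 4 times
The corresponding image is: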

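This is clearly the original full-size image of the note's thumbnail.
But the code uploaded it every time, only to find afterwards that the note contains no such en-media.
So I decided to optimize it as follows:
before uploading, first check whether a matching en-media exists;
if it does not, skip the upload.
This avoids useless uploads.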
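The check described above can be sketched without the Evernote SDK. This is only an illustration under assumptions: resources are modelled as plain (name, body bytes) pairs, the helper names body_hash_hex, referenced_hashes and resources_to_upload are hypothetical, and a regular expression stands in for the BeautifulSoup lookup. It relies on the hash attribute of <en-media> being the hex MD5 of the resource body, and it matches on the hash only, not on the MIME type.

```python
import hashlib
import re

# Matches the hash attribute of every <en-media ...> tag in the ENML content.
EN_MEDIA_HASH = re.compile(r'<en-media\b[^>]*\bhash="([0-9a-f]{32})"', re.IGNORECASE)


def body_hash_hex(body: bytes) -> str:
    """Hex MD5 of a resource body (hypothetical helper for this sketch)."""
    return hashlib.md5(body).hexdigest()


def referenced_hashes(enml: str) -> set:
    """Collect the hashes that the note content actually refers to."""
    return {h.lower() for h in EN_MEDIA_HASH.findall(enml)}


def resources_to_upload(enml: str, resources: list) -> list:
    """Keep only the resource names whose body hash appears in an <en-media> tag.

    resources is an assumed list of (name, body_bytes) pairs, not a real
    Evernote Resource list.
    """
    wanted = referenced_hashes(enml)
    return [name for name, body in resources if body_hash_hex(body) in wanted]


if __name__ == "__main__":
    inline_body = b"bytes of an image used inside the note"
    thumb_body = b"bytes of the thumbnail source image"
    enml = '<en-note><en-media hash="%s" type="image/png"></en-media></en-note>' % body_hash_hex(inline_body)
    # Only the inline image is referenced, so the thumbnail image is skipped.
    print(resources_to_upload(enml, [("inline.png", inline_body), ("thumb.jpg", thumb_body)]))
```

With a note whose content no longer contains any <en-media>, the returned list is empty, so nothing would be uploaded.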
In the end the code was changed to:
libs/crifan/crifanEvernoteToWordpress.py
    def uploadNoteImageToWordpress(self, curNoteDetail, curResource, curResList=None):
        """Upload a single note image to WordPress, and sync to note (replace en-media with img)

        Args:
            curNoteDetail (Note): evernote Note
            curResource (Resource): evernote Note Resource
            curResList (list): evernote Note Resource list
        Returns:
            upload image url(str)
        Raises:
        """
        if not curResList:
            curResList = curNoteDetail.resources

        uploadedImgUrl = ""

        isImg = self.evernote.isImageResource(curResource)
        if not isImg:
            logging.warning("Not upload resource for NOT image for %s", crifanEvernote.genResourceInfoStr(curResource))
            return uploadedImgUrl

        # skip resources that no <en-media> node in the note content refers to
        foundResEnMediaSoup = crifanEvernote.findResourceSoup(curResource, curNoteDetail=curNoteDetail)
        if not foundResEnMediaSoup:
            logging.warning("Not need upload resource %s to wordpress for not found related <en-media> node", crifanEvernote.genResourceInfoStr(curResource))
            return uploadedImgUrl

        isUploadOk, respInfo = self.uploadImageToWordpress(curResource)
        if isUploadOk:
            # {'id': 70491, 'url': 'https://www.crifan.com/files/pic/uploads/2020/11/c8b16cafe6484131943d80267d390485.jpg', 'slug': 'c8b16cafe6484131943d80267d390485', 'link': 'https://www.crifan.com/c8b16cafe6484131943d80267d390485/', 'title': 'c8b16cafe6484131943d80267d390485'}
            uploadedImgUrl = respInfo["url"]
            logging.info("uploaded url %s", uploadedImgUrl)
            # "https://www.crifan.com/files/pic/uploads/2020/03/f6956c30ef0b475fa2b99c2f49622e35.png"
            # replace en-media with img
            respNote = self.syncNoteImage(curNoteDetail, curResource, uploadedImgUrl, curResList)
            # logging.info("Complete sync image %s to note %s", uploadedImgUrl, respNote.title)
        else:
            logging.warning("Failed to upload image resource %s to wordpress", curResource)

        return uploadedImgUrl

and:
libs/crifan/crifanEvernote.py
    @staticmethod
    def findResourceSoup(curResource, soup=None, curNoteDetail=None):
        """Find related <en-media> BeautifulSoup soup from Evernote Resource

        Args:
            curResource (Resource): Evernote Resource
            soup (Soup): BeautifulSoup soup of note content
            curNoteDetail (Note): Evernote note, with detail content
        Returns:
            soup node
        Raises:
        """
        if not soup:
            soup = crifanEvernote.noteContentToSoup(curNoteDetail)

        curMime = curResource.mime # 'image/png'
        logging.debug("curMime=%s", curMime)
        # # method 1: calc again
        # curResBytes = curResource.data.body
        # curHashStr1 = utils.calcMd5(curResBytes) # 'dc355da030cafe976d816e99a32b6f51'

        # method 2: convert from body hash bytes
        curHashStr = utils.bytesToStr(curResource.data.bodyHash)
        logging.debug("curHashStr=%s", curHashStr)
        # b'\xae\xe1G\xdb\xcdh\x16\xca+@IF"\xff\xfa\xa3' -> 'aee147dbcd6816ca2b40494622fffaa3'

        # imgeTypeP = re.compile("image/\w+")
        curResSoup = soup.find("en-media", attrs={"type": curMime, "hash": curHashStr})
        logging.debug("curResSoup=%s", curResSoup)
        # <en-media hash="aee147dbcd6816ca2b40494622fffaa3" type="image/png" width="370"></en-media>
        return curResSoup

In addition, whenever a resource is printed while debugging, the default output includes its whole data:
So along the way I also added a helper that generates an info string for a resource
libs/crifan/crifanEvernote.py
    @staticmethod
    def genResourceInfoStr(curResource):
        """Generate resource info str, use for debug print

        Args:
            curResource (Resource): Evernote Resource
        Returns:
            resource info(str)
        Raises:
        """
        resInfoStr = "Resource(name=%s,mime=%s,guid=%s)" % (curResource.attributes.fileName, curResource.mime, curResource.guid)
        return resInfoStr
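To see why such a helper keeps debug output short, here is a small sketch. It does not use the Evernote SDK: FakeResource and FakeAttributes are hypothetical stand-ins that only mimic the fields read above (attributes.fileName, mime, guid) plus a data payload, and gen_resource_info_str is an illustrative copy of the same formatting.

```python
from dataclasses import dataclass


@dataclass
class FakeAttributes:
    """Hypothetical stand-in for the resource attributes object."""
    fileName: str


@dataclass
class FakeResource:
    """Hypothetical stand-in for an Evernote Resource with a large payload."""
    guid: str
    mime: str
    attributes: FakeAttributes
    data: bytes


def gen_resource_info_str(res) -> str:
    """Format only the identifying fields, leaving the payload out."""
    return "Resource(name=%s,mime=%s,guid=%s)" % (res.attributes.fileName, res.mime, res.guid)


if __name__ == "__main__":
    res = FakeResource(
        guid="guid-1",
        mime="image/png",
        attributes=FakeAttributes(fileName="demo.png"),
        data=b"\x00" * 1000,
    )
    # The default repr would include all 1000 payload bytes; the info string does not.
    print(gen_resource_info_str(res))
    print(len(repr(res)), "vs", len(gen_resource_info_str(res)))
```

The same idea applies to any log call that receives the resource object itself: passing the short info string keeps the payload out of the log.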
Then I debugged it to see the effect

It works:
when no en-media is found, the image is not uploaded