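While working on:
【Solved】Connecting to a local MongoDB from Python and saving files with GridFS
I had already managed to save a file with put from the GridFS API.
The next step is to work out how to store extra information along with the file at put time.
Try passing additional parameters: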
<code>audioFileId = fsCollection.put(
    audioFp,
    filename="Lots of Hearts_withContentType.mp3",
    content_type="application/mpeg",
    metadata={
        "keywords": {
            "series": "All Aboard Reading",
            "name": "Lots of Hearts",
            "keywords": [
                "hearts"
            ],
            "leadingActor": "",
            "topic": "",
            "contentKeywords": []
        },
        "fitAgeStartYear": 3,
        "fitAgeEndYear": 6,
        "isFiction": False
    })
</code>Result:
At least it runs without errors:

audioFileId=5abc9525a4bc715e187c6d6d
Look at the stored document:
<code>{
    "_id" : ObjectId("5abc9525a4bc715e187c6d6d"),
    "contentType" : "application/mpeg",
    "chunkSize" : 261120,
    "metadata" : {
        "keywords" : {
            "name" : "Lots of Hearts",
            "series" : "All Aboard Reading",
            "topic" : "",
            "contentKeywords" : [ ],
            "leadingActor" : "",
            "keywords" : [
                "hearts"
            ]
        },
        "isFiction" : false,
        "fitAgeStartYear" : 3,
        "fitAgeEndYear" : 6
    },
    "filename" : "Lots of Hearts_withContentType.mp3",
    "length" : 4795707,
    "uploadDate" : ISODate("2018-03-29T07:26:29.264Z"),
    "md5" : "955d19f230a5824e0fd5f41bee3dda21"
}
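</code>Sure enough, it works: a dict, or any other value that can be stored as BSON, can be put straight into metadata.
Next, see how to read the metadata back out.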
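Because the metadata ends up as an ordinary sub-document of the file document, it should also be possible to look files up by it. The following is only a minimal sketch: `metadata_filter` and `find_by_metadata` are hypothetical helper names, not part of pymongo, and `fs` is assumed to be a `gridfs.GridFS` instance (`fsCollection` in this post). It relies on `GridFS.find()` accepting a normal MongoDB filter and on dot notation to reach nested fields.

```python
def metadata_filter(conditions):
    """Prefix every key with 'metadata.' so it addresses a field inside metadata.

    Keys may use dot notation, e.g. 'keywords.series'.
    """
    return {"metadata." + key: value for key, value in conditions.items()}


def find_by_metadata(fs, conditions):
    """Return the GridOut objects whose metadata matches every condition.

    fs is assumed to be a gridfs.GridFS instance; this helper is hypothetical.
    """
    return list(fs.find(metadata_filter(conditions)))
```

A call might look like `find_by_metadata(fsCollection, {"keywords.series": "All Aboard Reading", "isFiction": False})`, which queries on `metadata.keywords.series` and `metadata.isFiction`.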
“get(file_id, session=None)
Get a file from GridFS by “_id”.
Returns an instance of GridOut, which provides a file-like interface for reading.
Parameters:
* file_id: “_id” of the file to get
* session (optional): a ClientSession
Changed in version 3.6: Added session parameter.”
Here get returns a GridOut object:
http://api.mongodb.com/python/current/api/gridfs/grid_file.html#gridfs.grid_file.GridOut
class gridfs.grid_file.GridOut(root_collection, file_id=None, file_document=None, session=None)
and its documentation says:
metadata
Metadata attached to this file.
This attribute is read-only.
So it should be possible to read it directly as an attribute.
I also noticed:
“Changed in version 3.0: Creating a GridOut does not immediately retrieve the file metadata from the server. Metadata is fetched when first needed.”
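My reading of this:
metadata is lazy loaded.
If it is never used, it is never fetched; it is fetched from the server the first time it is accessed.
In our own code, though, we should not need to care about this.
Try it:
While writing the code, PyCharm was able to recognize GridOut:

Likewise, put on the fs collection is now recognized as well:

With this code: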
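To make the point concrete, here is a small sketch of reading the file attributes through the object returned by get. `describe_file` is a hypothetical helper name, and `fs` is assumed to be a `gridfs.GridFS` instance; `filename`, `content_type`, `length` and `metadata` are read-only attributes of `GridOut`. Whether the file document has already been fetched is invisible to the caller, the attributes are simply read.

```python
def describe_file(fs, file_id):
    """Collect a few GridOut attributes into a plain dict.

    fs is assumed to be a gridfs.GridFS instance; this helper is hypothetical.
    """
    grid_out = fs.get(file_id)  # returns a GridOut
    return {
        "filename": grid_out.filename,
        "content_type": grid_out.content_type,
        "length": grid_out.length,
        # The custom dict passed to put(metadata=...) comes back here.
        "metadata": grid_out.metadata,
    }
```

For example, `describe_file(fsCollection, audioFileId)["metadata"]` should give back the dict that was passed to put.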
<code>audioFileId = fsCollection.put(
    audioFp,
    filename="Lots of Hearts_withContentType.mp3",
    content_type="application/mpeg",
    metadata={
        "keywords": {
            "series": "All Aboard Reading",
            "name": "Lots of Hearts",
            "keywords": [
                "hearts"
            ],
            "leadingActor": "",
            "topic": "",
            "contentKeywords": []
        },
        "fitAgeStartYear": 3,
        "fitAgeEndYear": 6,
        "isFiction": False
    })
logging.info("audioFileId=%s", audioFileId)
readOutAudioFile = fsCollection.get(audioFileId)
logging.info("readOutAudioFile=%s", readOutAudioFile)
audioFileMedata = readOutAudioFile.metadata
logging.info("audioFileMedata=%s", audioFileMedata)
</code>While debugging, the metadata is visible:


and it can be read out:

<code>2018/03/29 03:33:52 LINE 83 INFO audioFileId=5abc96dfa4bc715f473f0297
2018/03/29 03:36:03 LINE 86 INFO readOutAudioFile=<gridfs.grid_file.GridOut object at 0x110ec7c90>
2018/03/29 03:36:38 LINE 88 INFO audioFileMedata={u'keywords': {u'name': u'Lots of Hearts', u'series': u'All Aboard Reading', u'topic': u'', u'keywords': [u'hearts'], u'leadingActor': u'', u'contentKeywords': []}, u'isFiction': False, u'fitAgeStartYear': 3, u'fitAgeEndYear': 6}
</code>【Summary】
When saving a file with put from pymongo's gridfs API, any extra information you want to attach can be stored directly in metadata, for example:
<code># Open in binary mode: an mp3 is binary data
with open(curAudioFullFilename, "rb") as audioFp:
    audioFileId = fsCollection.put(
        audioFp,
        filename="Lots of Hearts_withContentType.mp3",
        content_type=fileMimeType,
        # Any custom fields go into metadata
        metadata={
            "keywords": {
                "series": "All Aboard Reading",
                "name": "Lots of Hearts",
                "keywords": [
                    "hearts"
                ],
                "leadingActor": "",
                "topic": "",
                "contentKeywords": []
            },
            "fitAgeStartYear": 3,
            "fitAgeEndYear": 6,
            "isFiction": False
        })
</code>The saved file document then looks like this:
<code>{
    "_id" : ObjectId("5abc997ea4bc71611bd37613"),
    "contentType" : "application/mpeg",
    "chunkSize" : 261120,
    "metadata" : {
        "keywords" : {
            "name" : "Lots of Hearts",
            "series" : "All Aboard Reading",
            "topic" : "",
            "contentKeywords" : [ ],
            "leadingActor" : "",
            "keywords" : [
                "hearts"
            ]
        },
        "isFiction" : false,
        "fitAgeStartYear" : 3,
        "fitAgeEndYear" : 6
    },
    "filename" : "Lots of Hearts_withContentType.mp3",
    "length" : 4795707,
    "uploadDate" : ISODate("2018-03-29T07:45:02.535Z"),
    "md5" : "955d19f230a5824e0fd5f41bee3dda21"
}
</code>
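The summary code passes a `fileMimeType` variable without showing where it comes from. One possible way to obtain such a value, sketched here purely as an assumption, is the standard library's `mimetypes` module. `guess_content_type` and `put_file_with_metadata` are hypothetical helper names, and `fs` is again assumed to be a `gridfs.GridFS` instance. Note that `mimetypes` maps `.mp3` to `audio/mpeg`, the registered type for MP3, rather than the `application/mpeg` used in the experiments above.

```python
import mimetypes
import os


def guess_content_type(filename, default="application/octet-stream"):
    """Guess a MIME type from the file extension, with a generic fallback."""
    content_type, _encoding = mimetypes.guess_type(filename)
    return content_type or default


def put_file_with_metadata(fs, path, metadata):
    """Save the file at path into GridFS together with a metadata dict.

    fs is assumed to be a gridfs.GridFS instance; this helper is hypothetical.
    Returns the _id of the stored file, as fs.put() does.
    """
    # Binary mode, so the bytes are stored unchanged.
    with open(path, "rb") as fp:
        return fs.put(
            fp,
            filename=os.path.basename(path),
            content_type=guess_content_type(path),
            metadata=metadata,
        )
```

A call such as `put_file_with_metadata(fsCollection, curAudioFullFilename, {"isFiction": False})` would then store the file under its base name with the guessed content type.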