How frequency counting works in Python word clouds: counting and displaying high-frequency words in a novel with Python

I need to generate a word cloud in Python where the lower a word's frequency, the larger it is drawn. This is my teacher's requirement. Could someone please help? Thanks, it's urgent!

First you need a reference point; without one you cannot decide how large each word should be. Suppose a word with frequency 0.5 is given font size 10. Compare every other word's frequency against it and use the resulting ratio as the weight that scales the font up or down. Choose this weight carefully, otherwise the large words become enormous and the small ones tiny. Another method is to sort the words by frequency and split them into 10 groups, with a large change in font size between groups and, within each group, a font size that still follows frequency.
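As a rough sketch of the two approaches described above (not the answerer's own code): the function names, the clamping bounds `min_size` and `max_size`, the `invert` flag, and the choice to let in-group variation use half of each size band are all assumptions made for illustration. `invert=True` covers the case in the question, where rarer words should be larger.

```python
import math


def font_sizes_by_reference(freqs, ref_freq=0.5, ref_size=10.0,
                            min_size=6.0, max_size=60.0, invert=True):
    """Scale each word relative to a reference point (ref_freq -> ref_size)."""
    sizes = {}
    for word, freq in freqs.items():
        if freq <= 0:
            continue  # the ratio is undefined for words that never occur
        # invert=True: rarer than the reference means a larger ratio
        ratio = ref_freq / freq if invert else freq / ref_freq
        # clamp so that extreme ratios do not produce huge or tiny fonts
        sizes[word] = max(min_size, min(max_size, ref_size * ratio))
    return sizes


def font_sizes_by_group(freqs, n_groups=10, min_size=10.0, max_size=100.0,
                        invert=True):
    """Rank words by frequency, split the ranking into groups, size by group."""
    # Order the words so that the one that should be largest comes first.
    ordered = sorted(freqs, key=freqs.get, reverse=not invert)
    per_group = math.ceil(len(ordered) / n_groups)
    band = (max_size - min_size) / n_groups  # size step between groups
    sizes = {}
    for i, word in enumerate(ordered):
        group, pos = divmod(i, per_group)
        # Inside a group, sizes shrink over half a band (hypothetical choice),
        # which leaves a visible jump between neighbouring groups.
        sizes[word] = max_size - group * band - (pos / per_group) * band / 2
    return sizes
```

Note that these functions return font sizes directly. The wordcloud package picks its own font sizes from relative weights, so with that package the ratio itself, rather than a point size, is what would be passed on.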

Use jieba for word segmentation and the wordcloud package to draw the word cloud. That is all you need.
import jieba
import numpy as np
import wordcloud
from PIL import Image

# Read the file contents
file = 'd:/艾萨克·阿西莫夫/奇妙的航程.TXT'
with open(file, 'r', encoding='gbk') as f:
    text = f.read()

# Segment with jieba, because wordcloud uses spaces to find word boundaries
text = ' '.join(jieba.cut(text))

# Mask image; a single-color image is enough
# (scipy.misc.imread was removed from SciPy, so the image is loaded with Pillow)
color_mask = np.array(Image.open('D:/Pictures/7218.png'))

# Create the word cloud object. The text is Chinese, so specify a Chinese font,
# otherwise the output may be garbled.
# WordCloud's parameters control many things; see the package documentation.
w = wordcloud.WordCloud(font_path='C:/Windows/Fonts/msyh.ttc', max_words=100, mask=color_mask)

# Load the space-separated string
w.generate(text)

# Write the image
w.to_file('d:/img1.png')
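The script above sizes words in the usual direction, with frequent words drawn larger. One possible way to get the reverse, as the question asks, is to invert the counts and hand them to `WordCloud.generate_from_frequencies`. This is only a sketch: `inverted_frequencies` and `render_inverted_cloud` are hypothetical helper names, the weight `max_count / count` is one of several reasonable choices, and the token list is assumed to come from something like `jieba.lcut(text)`.

```python
from collections import Counter


def inverted_frequencies(counts):
    """Weight each word by max_count / count, so the rarest word weighs most."""
    if not counts:
        return {}
    top = max(counts.values())
    return {word: top / count for word, count in counts.items() if count > 0}


def render_inverted_cloud(tokens, font_path, out_path, max_words=100):
    """Draw a cloud in which, among the top words, rarer ones are larger."""
    # Third-party dependency, imported lazily so the helper above works alone.
    from wordcloud import WordCloud

    counts = Counter(t for t in tokens if len(t) > 1)
    # Keep only the most frequent words first; otherwise words that occur
    # once (there are usually thousands in a novel) would dominate the image.
    kept = dict(counts.most_common(max_words))
    cloud = WordCloud(font_path=font_path, max_words=max_words)
    cloud.generate_from_frequencies(inverted_frequencies(kept))
    cloud.to_file(out_path)
```

The final sizes are still chosen by wordcloud's layout, which shrinks words until they fit, so the inverted weights control the ordering of sizes rather than exact point sizes.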

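The wordcloud and jieba libraries can produce the effect shown in the picture.
The process is: segment the text into words, count how often each word occurs, pick the words with the highest frequencies, set each word's font size according to its frequency, assign colors at random, and then render the image.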
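The counting and selection steps can be sketched with the standard library alone. This is a minimal illustration, not what wordcloud does internally: `top_words`, `min_len`, and `stopwords` are hypothetical names, and the token list below is made-up sample data standing in for the output of a segmenter such as `jieba.lcut(text)`.

```python
from collections import Counter


def top_words(tokens, n=100, min_len=2, stopwords=frozenset()):
    """Count tokens and return the n most frequent as (word, count) pairs."""
    cleaned = (t.strip() for t in tokens)
    # Drop punctuation and single-character tokens, plus any stopwords.
    counts = Counter(t for t in cleaned
                     if len(t) >= min_len and t not in stopwords)
    return counts.most_common(n)


# Made-up sample tokens; in practice these would come from the segmenter.
tokens = ["飞船", "的", "航程", "飞船", "，", "科学", "飞船", "航程"]
for word, count in top_words(tokens, n=3):
    print(word, count)
```

The resulting pairs can be turned into a dict with `dict(...)` and used as the frequency table for sizing.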

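  • How to make a word cloud in Python, step by step
    A: 3. Inside the folder there is a file named pip.exe. This is the program that Python officially provides for installing third-party Python libraries, and we can run it from the VS Code terminal. This is the underlying reason we needed to find the Python install location. 4. Making a word cloud in Python requires importing the wordcloud and PIL packages, where PIL (Python Imaging Library) is the Python platform's image...
  • How do I use the wordcloud whl file in Python?
    A: We then feed it a piece of text, and its program logic draws a word cloud image from that text. Separation: wordcloud uses spaces as the delimiter to split the text into words. Words that occur more often get a larger font and words that occur less often get a smaller font, while very short words are filtered out. Font size, color, and canvas size are configured from these counts. width sets the width of the image generated by the word cloud object; the default is 400.
  • How to make a word cloud in Python from words that are already sorted
    A: frequencies(keywords)
       image_color = ImageColorGenerator(graph)
       plt.imshow(wc)
       plt.imshow(wc.recolor(color_func=image_color))
       plt.axis("off")
       plt.show()
       wc.to_file('dream.png')
       That is everything in this article on how to generate a word cloud in Python (recommended). We hope it serves as a reference, and we hope you will continue to support...
  • A word cloud generated with Python 3.7 reports success, but there is no image?
    A: According to your code, the generated word cloud image is named aaaaa.png. Open the folder where your Python file is stored and look for aaaaa.png there; opening it shows the generated word cloud.
  • Sharing example code for a customized word cloud in Python
    A: coding=utf-8
       # using python27
       from os import path
       from PIL import Image
       import numpy as np
       import matplotlib.pyplot as plt
       from wordcloud import WordCloud, STOPWORDS, ImageColorGenerator
       # d = path.dirname(__file__)
       # Read the whole text.
       text = open(r'C:\Study\Python\wordcloud_\alice....
  • How to make a word cloud with Python
    A: The jieba module is recommended for word segmentation, and WordCloud for drawing the word cloud.
       -*- coding: utf-8 -*-
       from PIL import Image
       import numpy as np
       import matplotlib.pyplot as plt
       import jieba
       from wordcloud import WordCloud, STOPWORDS
       # Read the whole text.
       text = open('内容.txt', 'r').read()
       text = " ".join(jieba....
  • How are Python's jieba library and word cloud library used? A walkthrough
    A: Python is a relatively simple programming language. 4. Next we look at the following problem: the next program removes the words we do not need. 5. We edit the code function. 6. We then learn that only list data can be sorted, and only a string can be displayed as a word cloud. 7. Continue down to view the resulting program file. 8. The final result. That covers "Python's jieba library and word cloud library...
  • How to make the text denser when making a word cloud in Python
    A: The more words you select (max_words) and the larger the sample (the length of the text), the closer the result gets to a normal distribution. Under a normal distribution, low-frequency words are relatively numerous (the long-tail rule). After that, adjust max_font_size, min_font_size, and relative_scaling; the values of these three parameters affect how dense the cloud appears. If max_words is small, the long tail is not pronounced, the cloud has few small words, and many areas have no words...
  • Besides word frequency counting in Python, is there a simpler, more practical method?
    A: You can also use a dedicated word frequency tool. It is very easy to operate, its segmentation is even more accurate than Python's, and mature word frequency software is already on the market. Here the online tool "微词云" is used for word frequency counting. Steps: "微词云" → open the create word cloud page → click import words → select [Import after filtering and segmentation]
  • How to visualize public-opinion time series with Python
    A: For the detailed steps, see the article "How to Make a Word Cloud with Python". The Excel file restaurant-comments.xlsx, which my assistant worked hard to prepare, can be downloaded here. Open it in Excel and, if everything looks fine, move the file into our working directory demo. Because this example analyzes Chinese reviews, the package used is SnowNLP. For the basic usage of sentiment analysis, see "How to Do Sentiment Analysis with Python?"...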
