{"id":5381,"date":"2022-04-02T20:40:24","date_gmt":"2022-04-02T12:40:24","guid":{"rendered":"https:\/\/egonlin.com\/?p=5381"},"modified":"2022-04-02T20:40:24","modified_gmt":"2022-04-02T12:40:24","slug":"06-06-%e5%ae%9e%e4%be%8b10-%e6%96%87%e6%9c%ac%e8%af%8d%e9%a2%91%e7%bb%9f%e8%ae%a1","status":"publish","type":"post","link":"https:\/\/egonlin.com\/?p=5381","title":{"rendered":"06-06 Example 10: Text Word Frequency Statistics"},"content":{"rendered":"<h1>I. Analyzing the &quot;Text Word Frequency Statistics&quot; Problem<\/h1>\n<h2>1.1 Problem Analysis<\/h2>\n<p>Text word frequency statistics<\/p>\n<ul>\n<li>Requirement: given an article, which words appear in it, and which appear most often?<\/li>\n<li>How should we go about it?<\/li>\n<\/ul>\n<p>English text &#8211;&gt; Chinese text<\/p>\n<ul>\n<li>\n<p>English text: Hamlet, analyzing word frequency<\/p>\n<\/li>\n<li>\n<p>Chinese text: Romance of the Three Kingdoms, analyzing characters<\/p>\n<\/li>\n<\/ul>\n<h1>II. Worked Example: &quot;Hamlet English Word Frequency Statistics&quot;<\/h1>\n<ul>\n<li>Denoise and normalize the text<\/li>\n<li>Use a dictionary to represent word frequencies<\/li>\n<\/ul>\n<pre><code class=\"language-python\"># CalHamletV1.py\n\ndef getText():\n    # Read the whole file and lowercase it so that counting is case-insensitive\n    txt = open(&quot;hamlet.txt&quot;, &quot;r&quot;).read()\n    txt = txt.lower()\n    # Replace punctuation with spaces so that split() yields clean words\n    for ch in &#039;!&quot;#$%&amp;()*+,-.\/:;&lt;=&gt;?@[\\\\]^_\u2018{|}~&#039;:\n        txt = txt.replace(ch, &quot; &quot;)\n    return txt\n\nhamletTxt = getText()\nwords = hamletTxt.split()\ncounts = {}\nfor word in words:\n    # get() returns 0 the first time a word is seen\n    counts[word] = counts.get(word, 0) + 1\nitems = list(counts.items())\n# Sort (word, count) pairs by count, highest first\nitems.sort(key=lambda x: x[1], reverse=True)\nfor i in range(10):\n    word, count = items[i]\n    print(&quot;{0:&lt;10}{1:&gt;5}&quot;.format(word, 
count))\nthe         948\nand         855\nto          650\nof          581\nyou         494\na           468\nmy          447\ni           443\nin          373\nhamlet      361<\/code><\/pre>\n<ul>\n<li>The results are sorted in descending order<\/li>\n<li>Observe how many times each word occurs<\/li>\n<\/ul>\n<h1>III. Worked Example: &quot;Character Appearance Statistics for Romance of the Three Kingdoms&quot; (Part 1)<\/h1>\n<p><img src=\"https:\/\/egonlin.com\/wp-content\/uploads\/2022\/04\/\u6587\u672c\u8bcd\u9891\u7edf\u8ba11.gif\" alt=\"\" \/><\/p>\n<ul>\n<li>Segment the Chinese text into words<\/li>\n<li>Use a dictionary to represent word frequencies<\/li>\n<\/ul>\n<pre><code class=\"language-python\"># CalThreeKingdomsV1.py\n\nimport jieba\ntxt = open(&quot;threekingdoms.txt&quot;, &quot;r&quot;, encoding=&quot;utf-8&quot;).read()\n# lcut() segments the text and returns a list of words\nwords = jieba.lcut(txt)\ncounts = {}\nfor word in words:\n    # Skip single-character tokens such as punctuation and particles\n    if len(word) == 1:\n        continue\n    else:\n        counts[word] = counts.get(word, 0) + 1\nitems = list(counts.items())\nitems.sort(key=lambda x: x[1], reverse=True)\nfor i in range(15):\n    word, count = items[i]\n    print(&quot;{0:&lt;10}{1:&gt;5}&quot;.format(word, count))\nBuilding prefix dict from the default dictionary ...\nLoading model from cache 
\/var\/folders\/mh\/krrg51957cqgl0rhgnwyylvc0000gn\/T\/jieba.cache\nLoading model cost 1.030 seconds.\nPrefix dict has been built succesfully.\n\n\u66f9\u64cd          953\n\u5b54\u660e          836\n\u5c06\u519b          772\n\u5374\u8bf4          656\n\u7384\u5fb7          585\n\u5173\u516c          510\n\u4e1e\u76f8          491\n\u4e8c\u4eba          469\n\u4e0d\u53ef          440\n\u8346\u5dde          425\n\u7384\u5fb7\u66f0         390\n\u5b54\u660e\u66f0         390\n\u4e0d\u80fd          384\n\u5982\u6b64          378\n\u5f20\u98de          358<\/code><\/pre>\n<h1>IV. Worked Example: &quot;Character Appearance Statistics for Romance of the Three Kingdoms&quot; (Part 2)<\/h1>\n<h2>4.1 Character Appearance Statistics for Romance of the Three Kingdoms<\/h2>\n<p>Tie word frequencies to characters so that the analysis targets the actual question<\/p>\n<p>Word frequency statistics &#8211;&gt; character statistics<\/p>\n<pre><code class=\"language-python\">#CalThreeKingdomsV2.py\nimport jieba\ntxt = open(&quot;threekingdoms.txt&quot;, &quot;r&quot;, encoding=&quot;utf-8&quot;).read()\n# Frequent words that are not character names\nexcludes = {&quot;\u5c06\u519b&quot;, &quot;\u5374\u8bf4&quot;, &quot;\u8346\u5dde&quot;, &quot;\u4e8c\u4eba&quot;, &quot;\u4e0d\u53ef&quot;, &quot;\u4e0d\u80fd&quot;, &quot;\u5982\u6b64&quot;}\nwords = jieba.lcut(txt)\ncounts = {}\nfor word in words:\n    if len(word) == 1:\n        continue\n    # Merge aliases of the same character under one name\n    elif word == &quot;\u8bf8\u845b\u4eae&quot; or word == &quot;\u5b54\u660e\u66f0&quot;:\n        rword = &quot;\u5b54\u660e&quot;\n    elif word == &quot;\u5173\u516c&quot; or word == &quot;\u4e91\u957f&quot;:\n        rword = &quot;\u5173\u7fbd&quot;\n    elif word == &quot;\u7384\u5fb7&quot; or word == &quot;\u7384\u5fb7\u66f0&quot;:\n        rword = &quot;\u5218\u5907&quot;\n    elif word == &quot;\u5b5f\u5fb7&quot; or word == &quot;\u4e1e\u76f8&quot;:\n        rword = &quot;\u66f9\u64cd&quot;\n    else:\n        rword = word\n    counts[rword] = 
counts.get(rword, 0) + 1\nfor word in excludes:\n    # pop() with a default avoids a KeyError if an excluded word never occurred\n    counts.pop(word, None)\nitems = list(counts.items())\nitems.sort(key=lambda x: x[1], reverse=True)\nfor i in range(10):\n    word, count = items[i]\n    print(&quot;{0:&lt;10}{1:&gt;5}&quot;.format(word, count))\n\u66f9\u64cd         1451\n\u5b54\u660e         1383\n\u5218\u5907         1252\n\u5173\u7fbd          784\n\u5f20\u98de          358\n\u5546\u8bae          344\n\u5982\u4f55          338\n\u4e3b\u516c          331\n\u519b\u58eb          317\n\u5415\u5e03          300<\/code><\/pre>\n<ul>\n<li>Segment the Chinese text into words<\/li>\n<li>Use a dictionary to represent word frequencies<\/li>\n<li>Extend the program to solve the problem<\/li>\n<li>Refine further based on the results<\/li>\n<\/ul>\n<p>Announcing the top 20 characters in Romance of the Three Kingdoms by number of appearances: \u66f9\u64cd, \u5b54\u660e, \u5218\u5907, \u5173\u7fbd, \u5f20\u98de, \u5415\u5e03, \u8d75\u4e91, \u5b59\u6743, \u53f8\u9a6c\u61ff, \u5468\u745c, \u8881\u7ecd, \u9a6c\u8d85, \u9b4f\u5ef6, \u9ec4\u5fe0, \u59dc\u7ef4, \u9a6c\u5cb1, \u5e9e\u5fb7, \u5b5f\u83b7, \u5218\u8868, \u590f\u4faf\u60c7<\/p>\n<h1>V. Generalizing &quot;Text Word Frequency Statistics&quot;<\/h1>\n<h2>5.1 Extending to Other Applications<\/h2>\n<ul>\n<li>Dream of the Red Chamber, Journey to the West, Water Margin\u2026<\/li>\n<li>Government work reports, research papers, news reports \u2026<\/li>\n<li>And beyond that? Word clouds are coming up later\u2026<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>I. Analyzing the &quot;Text Word Frequency Statistics&quot; Problem 1.1 Problem Analysis Text word frequency statistics 
Requirement: given an article, which words appear [&hellip;]<\/p>\n","protected":false},"author":3,"featured_media":5382,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[371,377],"tags":[],"_links":{"self":[{"href":"https:\/\/egonlin.com\/index.php?rest_route=\/wp\/v2\/posts\/5381"}],"collection":[{"href":"https:\/\/egonlin.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/egonlin.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/egonlin.com\/index.php?rest_route=\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/egonlin.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=5381"}],"version-history":[{"count":0,"href":"https:\/\/egonlin.com\/index.php?rest_route=\/wp\/v2\/posts\/5381\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/egonlin.com\/index.php?rest_route=\/wp\/v2\/media\/5382"}],"wp:attachment":[{"href":"https:\/\/egonlin.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=5381"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/egonlin.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=5381"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/egonlin.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=5381"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}