{"id":482,"date":"2014-09-13T20:38:00","date_gmt":"2014-09-13T12:38:00","guid":{"rendered":"http:\/\/note.systw.net\/note\/?p=482"},"modified":"2023-11-02T20:39:50","modified_gmt":"2023-11-02T12:39:50","slug":"mahout-recommend","status":"publish","type":"post","link":"https:\/\/systw.net\/note\/archives\/482","title":{"rendered":"Mahout Recommend"},"content":{"rendered":"\n<p>mahout Itembased Collaborative Filtering<\/p>\n\n\n\n<p><strong>#mahout recommenditembased<\/strong><br>Usage:<br>-i &lt;input&gt;<br>-o &lt;output&gt;<br>-n &lt;number of recommendations&gt; how many items to recommend<br>-b boolean data: no rating column is needed; the input only needs the user and item columns<br>-s &lt;similarityClassname&gt; commonly used values:<br>\u3000SIMILARITY_COOCCURRENCE,<br>\u3000SIMILARITY_LOGLIKELIHOOD,<br>\u3000SIMILARITY_TANIMOTO_COEFFICIENT,<br>\u3000SIMILARITY_CITY_BLOCK,<br>\u3000SIMILARITY_COSINE,<br>\u3000SIMILARITY_PEARSON_CORRELATION,<br>\u3000SIMILARITY_EUCLIDEAN_DISTANCE<\/p>\n\n\n\n<p><br>&#8230;&#8230;&#8230;&#8230;&#8230;&#8230;&#8230;&#8230;&#8230;..<\/p>\n\n\n\n<p>DEMO<\/p>\n\n\n\n<p>user x book rating table (5 is the highest rating, 1 is the lowest)&nbsp;<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table><tbody><tr><td>&nbsp;<\/td><td>book 1&nbsp;<\/td><td>book 2&nbsp;<\/td><td>book 3&nbsp;<\/td><\/tr><tr><td>user 1&nbsp;<\/td><td>&nbsp;5<\/td><td>4&nbsp;<\/td><td>5&nbsp;<\/td><\/tr><tr><td>user 2&nbsp;<\/td><td>&nbsp;4&nbsp;<\/td><td>5&nbsp;<\/td><td>4&nbsp;<\/td><\/tr><tr><td>user 3&nbsp;<\/td><td>&nbsp;5<\/td><td>4&nbsp;<\/td><td>&nbsp;<\/td><\/tr><tr><td>user 4&nbsp;<\/td><td>&nbsp;1<\/td><td>2&nbsp;<\/td><td>&nbsp;<\/td><\/tr><tr><td>user 5&nbsp;<\/td><td>&nbsp;2&nbsp;<\/td><td>1&nbsp;<\/td><td>1&nbsp;<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p><br><strong>#vi 
recom.data<\/strong><br>1,1,5<br>1,2,4<br>1,3,5<br>2,1,4<br>2,2,5<br>2,3,4<br>3,1,5<br>3,2,4<br>4,1,1<br>4,2,2<br>5,1,2<br>5,2,1<br>5,3,1<\/p>\n\n\n\n<p><strong># hadoop fs -mkdir testdata<br># hadoop fs -put recom.data testdata<br># hadoop fs -ls -R testdata<\/strong><br>-rw-r--r-- 3 root hdfs 288374 2014-02-05 21:53 testdata\/recom.data<\/p>\n\n\n\n<p><strong>#mahout recommenditembased -i testdata -o output -s SIMILARITY_EUCLIDEAN_DISTANCE<\/strong><br>&#8230;omit&#8230;<br>{--booleanData=[false], --endPhase=[2147483647], --input=[tasteinput], --maxPrefsPerUser=[10], --maxPrefsPerUserInItemSimilarity=[1000], --maxSimilaritiesPerItem=[100], --minPrefsPerUser=[1], --numRecommendations=[10], --output=[tasteoutput], --similarityClassname=[SIMILARITY_EUCLIDEAN_DISTANCE], --startPhase=[0], --tempDir=[temp]}<br>14\/03\/02 09:52:37 INFO common.AbstractJob: Command line arguments: {--booleanData=[false], --endPhase=[2147483647], --input=[tasteinput], --maxPrefsPerUser=[1000], --minPrefsPerUser=[1], --output=[temp\/preparePreferenceMatrix], --ratingShift=[0.0], --startPhase=[0], --tempDir=[temp]}<br>&#8230;omit&#8230;<br>File Input Format Counters<br>Bytes Read=287<br>File Output Format Counters<br>Bytes Written=32<br>14\/09\/04 05:46:56 INFO driver.MahoutDriver: Program took 434965 ms (Minutes: 7.249416666666667)<\/p>\n\n\n\n<p><strong><br>#hadoop fs -cat output\/part-r-00000<br><\/strong>3 [3:4.4787264]<br>4 [3:1.5212735]<\/p>\n","protected":false},"excerpt":{"rendered":"<p>mahout Itembased Collaborative 
&#8230;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"fifu_image_url":"","fifu_image_alt":"","_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[13],"tags":[],"class_list":["post-482","post","type-post","status-publish","format-standard","hentry","category-dataanalysis"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/systw.net\/note\/wp-json\/wp\/v2\/posts\/482","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/systw.net\/note\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/systw.net\/note\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/systw.net\/note\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/systw.net\/note\/wp-json\/wp\/v2\/comments?post=482"}],"version-history":[{"count":0,"href":"https:\/\/systw.net\/note\/wp-json\/wp\/v2\/posts\/482\/revisions"}],"wp:attachment":[{"href":"https:\/\/systw.net\/note\/wp-json\/wp\/v2\/media?parent=482"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/systw.net\/note\/wp-json\/wp\/v2\/categories?post=482"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/systw.net\/note\/wp-json\/wp\/v2\/tags?post=482"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}