Comments on Niniane's Blog: tfidf package updated
Feed last updated: 2023-11-30T11:57:43.224-08:00

Anonymous, 2010-01-21T22:19:06.904-08:00
Sigh. Looks like the hosted code might not have been originally written by Sanjay Ghemawat and Jeff Dean.
I see a lot of the following comments in the code:
// Author: kenton@google.com (Kenton Varda)
// Based on original Protocol Buffers design by
// Sanjay Ghemawat, Jeff Dean, and others.

(Also, their names don't seem to be mentioned in the project members list:
http://code.google.com/p/protobuf/people/list)

Thanks, anyway!

N, 2010-01-21T13:47:04.259-08:00
Haha. If you want to read awesome Googler code, you should look at code by Jeff Dean and Sanjay Ghemawat. Some of their libraries are so beautiful -- impossibly short, yet they perform every function you'd want from the library. It is art!

The only public code of theirs I can think of is the protocol buffer code:
http://code.google.com/apis/protocolbuffers

Anonymous, 2010-01-21T03:56:38.619-08:00
Thanks. Have you hosted more of your projects elsewhere? I just wanted to read code written by an awesome Xoogler. :)

N, 2010-01-21T01:30:19.285-08:00
All that means is:
- you create a vector of words for each document
- you figure out how similar two vectors are
- that tells you how similar the documents are

The description just sounds more challenging when condensed.

Justin K, 2010-01-21T01:26:56.986-08:00
http://en.wikipedia.org/wiki/Tf-idf
"The tf-idf weighting scheme is often used in the vector space model together with cosine similarity to determine the similarity between two documents."

Wow. I am not worthy.