Advice on writing a faster search?
Any tips/resources on writing a faster search engine?
Tools: .NET, SQL Server.
I mean this: http://incubator.apache.org/lucene.net/
'Faster' is relative.
Do you want faster web server response? Faster database response? I really don't like saying 'throw hardware at it', but SQL Server really, really does behave better when its data and its logs are on separate physical disks from one another and from the OS.
Google dotlucene, a dotnettified version of Lucene... very, very good.
Precaching seems effective: http://research.yahoo.com/node/579/2922
Maybe use Squid in reverse proxy mode.
Try this search problem.
Start with a keyword -> id index. Now add a twist: a lat+lon -> id index. Search algorithm is find ids within lat+lon that match also have a keyword match. Oh, there are 1 billion ids and 20GB of keyword data. It's late.
> Search algorithm is find ids within lat+lon that match also have a keyword match.
Search parameters are a location + keywords. The algorithm is: find all ids in the lat+lon -> ids index that are within an N-mile radius of the specified location, then intersect those ids with the ids that match the keywords. They're plural too.
keyword -> id1,id2,id3,...,idN
id -> lat+lon1,lat+lon2,lat+lon3,...,lat+lonN
lat+lon -> id1,id2,id3,...,idN
lucene.net looks really cool. we wouldn't need sql server with it, making it superfast i imagine.
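For anyone trying to picture the location + keywords search described above, here is a rough Python sketch of the three mappings and the radius-then-intersect step. Nothing in it comes from the thread beyond the index layout: the class name `GeoKeywordIndex`, the 0.5-degree grid cells standing in for the lat+lon -> ids index, and the rule that all keywords must match are assumptions made for illustration. It ignores wraparound at ±180° longitude, and plain in-memory sets would not hold 1 billion ids, so treat it as a picture of the logic rather than a real design.

```python
import math
from collections import defaultdict

EARTH_RADIUS_MILES = 3958.8
CELL_DEG = 0.5  # assumed grid cell size in degrees; tune for real data


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in miles."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


class GeoKeywordIndex:
    """Hypothetical in-memory version of the three mappings in the thread."""

    def __init__(self):
        self.keyword_to_ids = defaultdict(set)   # keyword -> ids
        self.id_to_points = defaultdict(list)    # id -> lat+lon pairs
        self.cell_to_ids = defaultdict(set)      # lat+lon grid cell -> ids

    @staticmethod
    def _cell(lat, lon):
        return (math.floor(lat / CELL_DEG), math.floor(lon / CELL_DEG))

    def add(self, doc_id, keywords, points):
        for kw in keywords:
            self.keyword_to_ids[kw.lower()].add(doc_id)
        for lat, lon in points:
            self.id_to_points[doc_id].append((lat, lon))
            self.cell_to_ids[self._cell(lat, lon)].add(doc_id)

    def ids_within(self, lat, lon, radius_miles):
        # Approximate bounding box around the search circle; it only has to
        # over-cover, because the exact distance check below filters it.
        dlat = math.degrees(radius_miles / EARTH_RADIUS_MILES)
        widest_lat = min(abs(lat) + dlat, 89.0)
        dlon = dlat / math.cos(math.radians(widest_lat))
        lo_row, lo_col = self._cell(lat - dlat, lon - dlon)
        hi_row, hi_col = self._cell(lat + dlat, lon + dlon)

        candidates = set()
        for row in range(lo_row, hi_row + 1):
            for col in range(lo_col, hi_col + 1):
                candidates |= self.cell_to_ids.get((row, col), set())

        # Keep an id if any of its points is inside the radius.
        return {
            doc_id
            for doc_id in candidates
            if any(
                haversine_miles(lat, lon, plat, plon) <= radius_miles
                for plat, plon in self.id_to_points[doc_id]
            )
        }

    def search(self, lat, lon, radius_miles, keywords):
        """Ids within the radius that match every keyword (assumed AND)."""
        result = self.ids_within(lat, lon, radius_miles)
        for kw in keywords:
            result &= self.keyword_to_ids.get(kw.lower(), set())
            if not result:
                break
        return result
```

Usage would look something like `index.add(1, ["coffee", "wifi"], [(47.6062, -122.3321)])` followed by `index.search(47.6062, -122.3321, 50, ["coffee"])`. Whether it is cheaper to start from the location side or the keyword side depends on which produces the smaller id set, so a real system would probably pick the order per query.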
re: latitudes and longitudes: interesting, but i'm not sure i understand. are we looking at a table as a map? i'm still leaning towards utilizing sql server's full text indexing somehow. while it may not be as blazingly fast as a pure search engine, it might be faster and easier to develop with. any advice in this regard would be much appreciated! (i'm still going to play around with lucene, so thank you for that as well)