Web-app version of Markov-Chain domain generator (be gentle on the server) (suggestly.com)
31 points by codeismightier on Oct 26, 2008 | 33 comments


Very cool, I was actually working on a start-up that suggested available domain names, and the Markov-Chain was just one of the ways I did this.

I'd also suggest looking into natural language processing (NLTK is a great Python toolkit: http://nltk.sourceforge.net/index.php/Main_Page). Between NLTK, Markov chains, and affixes, you have enough reading to keep you busy for quite a while.

Also, consider using Yahoo BOSS search results to find synonyms through word association. Related words like table, desk, and wood are extremely useful for generating domain names. This is fairly simple to do with some basic natural language processing on the search results, and it yields awesome results.

Finally, there's a fantastic business model behind this. Beyond the obvious domain registration affiliate programs, you could keep a database of all available domain names and sell it to people who do this for a living (i.e., domain squatters). This type of information is very valuable to them.

Unfortunately I had to give up on this idea because life gets in the way (and will for at least the next 2 years) but it looks like you've got a great start.
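
For anyone curious what the Markov-chain part can look like, here is a minimal sketch in Python. It is not the code behind the site: the training words, the chain order of 2, and the names build_model and generate are all assumptions made for illustration.

```python
import random
from collections import defaultdict

def build_model(words, order=2):
    """Map each `order`-letter context to the letters that followed it."""
    model = defaultdict(list)
    for word in words:
        padded = "^" * order + word.lower() + "$"   # ^ marks start, $ marks end
        for i in range(len(padded) - order):
            model[padded[i:i + order]].append(padded[i + order])
    return model

def generate(model, order=2, max_len=12, rng=random):
    """Walk the chain from the start context until the end marker or max_len."""
    context = "^" * order
    out = []
    while len(out) < max_len:
        nxt = rng.choice(model[context])
        if nxt == "$":
            break
        out.append(nxt)
        context = context[1:] + nxt   # slide the context window by one letter
    return "".join(out)

# Hypothetical training words; a real generator would use a much larger corpus.
corpus = ["guitar", "guitars", "glorious", "various", "hilarious", "integration"]
model = build_model(corpus)
names = [generate(model, rng=random.Random(seed)) for seed in range(5)]
print(names)
```

Each generated name would still need an availability check, and a higher order gives more word-like but less varied output.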


Very interesting. I would be interested in hearing more about this. Can you leave your email? Mine is yongqli@10gic.net


Sure, you can contact me at bjasper@gmail.com

I'd be more than happy to go over in more detail some of the stuff I just mentioned. Getting into natural language processing gets difficult extremely fast, so I'm not sure how well I can explain that stuff.



I would make the code more efficient. It is firing off an AJAX request after practically every keystroke. Each request took ~20 seconds. The page is also taking forever to load, probably because of all the requests being sent to the server. Why not just have people press enter to initiate retrieval of a result? That would improve user experience a ton.

You should also queue AJAX requests and only allow one or two at a time from the browser. That way, if you are really attached to this request-after-every-keystroke feature, the server can handle the load slightly better.

Why do you need to make an AJAX call for a 1 character string? I typed in 't' and an AJAX request was fired off. That was 20 seconds ago and I am still waiting.

It also appears that when the browser receives a result it runs several more AJAX calls recursively on all of the returned items. All of that should be done on the server. The overhead of all those HTTP requests is just too much.

I would do the following to make this better:

1) Have a minimum query length for search

2) Don't do a request for every keystroke

3) Make an AJAX request queue

4) Do all of the recursion on the server, don't open a new HTTP request for each recursive call
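
Points 1 and 3 can be sketched roughly as follows. This is a language-neutral illustration written in Python with asyncio, not browser code: fake_lookup stands in for the real request, and the minimum length of 2 and the limit of 2 in-flight requests are assumed values.

```python
import asyncio

MIN_QUERY_LENGTH = 2   # assumed threshold
MAX_IN_FLIGHT = 2      # "one or two at a time"

async def fake_lookup(query):
    """Hypothetical stand-in for the real suggestion request."""
    await asyncio.sleep(0.01)
    return f"results for {query}"

class RequestQueue:
    def __init__(self, fetch, max_in_flight=MAX_IN_FLIGHT):
        self._fetch = fetch
        self._slots = asyncio.Semaphore(max_in_flight)
        self.in_flight = 0
        self.peak = 0              # highest concurrency actually observed

    async def submit(self, query):
        query = query.strip()
        if len(query) < MIN_QUERY_LENGTH:
            return None            # too short: never reaches the server
        async with self._slots:    # waits here while the limit is reached
            self.in_flight += 1
            self.peak = max(self.peak, self.in_flight)
            try:
                return await self._fetch(query)
            finally:
                self.in_flight -= 1

async def main():
    queue = RequestQueue(fake_lookup)
    queries = ["t", "ta", "tab", "tabl", "table"]
    results = await asyncio.gather(*(queue.submit(q) for q in queries))
    return queue, results

queue, results = asyncio.run(main())
print(results, queue.peak)
```

In the browser the same shape would wrap the AJAX call: short queries are dropped up front and extra requests wait for a free slot instead of all hitting the server at once.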

I think this will be a lot better for server load and usability. Right now it is horribly inefficient.

I have no use for it, but it would be pretty cool if I were looking for a domain name.


Ok, I'll work on that right now.


Ok, now the user has to press enter for a request to be initiated.

The reason that the script issues a recursive call is that I want the user to see the suggestions right away and check the status later since the DNS requests can take a while.


I can't actually see the site yet, so I'm totally guessing, but a better way to do AJAX autocomplete-type things is to use setTimeout or setInterval to poll, rather than onkeydown.
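
A rough sketch of the polling idea, written in Python so the logic can be run on its own. The tick method plays the role of the timer callback. The InputPoller name, the minimum length of 2, and the rule of sending only once the value has stopped changing are assumptions, not something taken from the site.

```python
class InputPoller:
    """Decide, on each timer tick, whether the input is worth sending."""

    def __init__(self, min_length=2):
        self.min_length = min_length   # assumed threshold
        self._seen = ""                # value observed on the previous tick
        self._sent = ""                # last value actually sent

    def tick(self, current_value):
        """Return the query to send on this tick, or None to stay quiet."""
        value = current_value.strip()
        stable = value == self._seen   # unchanged since the previous tick
        self._seen = value
        if stable and value != self._sent and len(value) >= self.min_length:
            self._sent = value
            return value
        return None

poller = InputPoller()
# Simulated field contents at successive ticks while the user types "table".
ticks = ["t", "tab", "table", "table", "table"]
sent = [poller.tick(v) for v in ticks]
print(sent)   # → [None, None, None, 'table', None]
```

Five ticks produce a single request, and only after the user has paused.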


Are you caching the results on your server? You should cache the results for a better response time. That way you can return more information to the browser without checking availability for the same queries over and over.

For example, save the availability of picklenickel.com for 24 hours (or maybe more, I don't know). Then you can do some recursive calls using cached data. Making all those concurrent HTTP requests seems impractical and has to be affecting your server's responsiveness.
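
Something like this minimal in-memory sketch is what I have in mind. TTLCache and slow_availability_check are made-up names, the lookup is a stand-in for the real DNS check, and a real deployment would probably want a shared store rather than a per-process dict.

```python
import time

class TTLCache:
    def __init__(self, ttl_seconds=24 * 60 * 60, clock=time.time):
        self.ttl = ttl_seconds
        self._clock = clock           # injectable so expiry can be simulated
        self._entries = {}            # domain -> (expires_at, value)

    def get_or_compute(self, domain, compute):
        now = self._clock()
        hit = self._entries.get(domain)
        if hit is not None and hit[0] > now:
            return hit[1]             # still fresh: skip the slow lookup
        value = compute(domain)
        self._entries[domain] = (now + self.ttl, value)
        return value

lookups = []

def slow_availability_check(domain):
    """Hypothetical stand-in for the real DNS lookup."""
    lookups.append(domain)
    return True

fake_now = [0.0]
cache = TTLCache(clock=lambda: fake_now[0])
cache.get_or_compute("picklenickel.com", slow_availability_check)
cache.get_or_compute("picklenickel.com", slow_availability_check)  # served from cache
fake_now[0] += 24 * 60 * 60 + 1
cache.get_or_compute("picklenickel.com", slow_availability_check)  # expired, looked up again
print(len(lookups))   # → 2
```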


I am running a local resolver (bind9), so I would assume that it is being cached.


Hmm. It's strange that the initial requests usually take 20+ s while the subsequent recursive requests take ~200 ms each. Maybe a bottleneck in the code?


query_too_long


The ability to generate a two-word domain, with the first word fixed and the second generated by a Markov process, would be really nice. All short domains are taken anyway :(


I searched for my name, "kwame". The request took 50 seconds.

It returned a bunch of results as well and then did recursive calls for each of those, including "kwamile". The time it took for each of these calls was under 300 ms.

I typed "kwamile" in the search field and pressed enter. It took a long time, but it should have been immediate because this search had already been conducted.

You should just implement the caching yourself. That will improve things A LOT.


What an insanely great idea! It saves startups the hours they would normally spend brainstorming every variant of a noun or verb for a domain name and doing DNS lookups.

I tried 'guitars' and it instantly reported that 'gitarious' was available. I would however consider starting to search automatically after the user has typed two or more characters.


> I tried 'guitars' and it instantly reported that 'gitarious' was available.

Not anymore, it's not :)


"Instantly check domain name availability and get suggestions"

Not very instant, is it?

Other than that, great idea and good job :)


It should be much faster now.


It's an interesting idea but it would be much more useful if the system didn't just vary individual letters without regard to word boundaries.

For example, if I enter "integration" the suggestions include "integoration", "integormation", etc. That's not very useful. I think you need to have some kind of dictionary to generate better suggestions.

I'm not saying that only dictionary words should be used, but variations of letters have very different value depending on where inside a dictionary word they occur.


For some reason the jQuery file just took 40 s to download, but the HTML took ~146 ms. Since you include jQuery in your head tag (which is necessary), the whole page did not show for 40 s. I'm not sure why the JS took that long. Maybe the server load increased between the time the page loaded the HTML and the time it loaded the JS, though that doesn't seem likely.


My browser is not caching the jQuery file. You can force the browser to cache it by sending a caching header from a server-side script. This should speed things up too, although I think the main bottleneck now is all those server-to-server requests.
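
As a rough illustration of the header idea, here is a sketch using Python's built-in http.server. I don't know what the site actually runs on, so treat the stack, the one-year lifetime, and the rule of matching on ".js" as assumptions. Cache-Control and Expires are the standard HTTP headers for this.

```python
import time
from email.utils import formatdate
from http.server import SimpleHTTPRequestHandler

ONE_YEAR = 365 * 24 * 60 * 60   # assumed lifetime for rarely-changing static files

def cache_headers(max_age=ONE_YEAR, now=None):
    """Build headers that let browsers reuse a static file without refetching."""
    now = time.time() if now is None else now
    return {
        "Cache-Control": f"public, max-age={max_age}",
        "Expires": formatdate(now + max_age, usegmt=True),
    }

class CachingHandler(SimpleHTTPRequestHandler):
    """Serves files from the current directory; .js responses get cache headers."""

    def end_headers(self):
        # getattr guards against malformed requests where path was never set.
        if getattr(self, "path", "").endswith(".js"):
            for name, value in cache_headers().items():
                self.send_header(name, value)
        super().end_headers()

headers = cache_headers(max_age=3600, now=0)
print(headers)
```

The handler could be passed to http.server.HTTPServer to try it locally. Any other server can send the same two headers.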


Ok, now it loads jQuery from Google.


Cool. The page comes up much more quickly now. I don't know if this is due to the changes or what but it is much more responsive.


Is there some technical reason why you display domains that are already taken? They seem to take up a lot of space that could be better used for extra suggestions.


You can still purchase these domains from the owner.


Can't access as of 6:45 PM PST. I want to go to Slicehost headquarters, give your slice a hug and tell him everything is going to be OK. :(


Actually it's on a Linode right now. Linode gives more RAM and hard disk space, so I'm migrating over. Slicehost might give more support but I don't need it.


this is really useful, thanks


If I type a space, I'm given all of your prefixes and suffixes :-D


Please add document.getElementById("search-input").focus(); (assuming the search box has the id search-input).


Thanks. Done.


It is working MUCH faster, great work.


I just registered 1300 domains beginning with "ass."



