How to make a site like howhowto
Howhowto is a vertical search engine for the "how to" knowledge domain. That's a fancy way of saying that it only tries to search for information about how to do things. This gives you a different way to search than if you use a general purpose search engine such as Google or Yahoo, where results can be lost in the billions of pages indexed by these engines.
It's possible to make other vertical search engines for other knowledge domains. If you have a field of expertise or a good grasp of the web resources that are available for a particular topic then you might consider making a site like this yourself.
The general process consists of three steps:
(The easy part!) Go to Google Co-op and sign up to make a custom search engine.
(The hard part!) You have to decide which sites to include.
If you only want to search one site, or if the knowledge domain isn't too big, you can do this step quickly (Google even has a tool that lets you make a search engine on the fly in about five seconds).
But if you want to make a site that indexes an entire domain of knowledge, you'll have to develop a strategy for locating perhaps thousands of relevant sites, and you'll have to find a way to narrow your search to just the parts of these sites that have the relevant information. This step can take many hours. How you do it depends on the particular domain you've chosen, but here are some common problems:
- You can only search pages that have been indexed by Google. For various reasons, Google may not have indexed parts of sites that interest you (one reason: the site owner doesn't want Google to index these parts). You can get a sense of which pages have been indexed by Google by using the site: operator (for example, you would do a Google search for site:example.com). You can refine your search by adding the inurl: operator (for example: inurl:bob gives you pages that have bob in their address). Sometimes you might find that Google has actually indexed the pages you want under a different address than the one you expected.
- To narrow the search to pages that actually contain relevant information, you can specify a URL pattern instead of an address. You use an asterisk - * - as a wildcard symbol to stand for parts of the address that can take on any string value. Unfortunately, not every site has structured its address scheme to make this easy for you, so you may have to resort to looking for certain keywords in the address or some other trick. You can also exclude sites or parts of sites to narrow the search further.
- Your search results are at the mercy of Google's PageRank algorithm, which means that popular sites tend to show up earlier in the results pages, crowding out smaller sites. To some extent you can overcome this by labeling your sites with tags so your users can refine their searches (though we don't do this at howhowto).
- If you want your search engine to remain current, you'll need a way to discover new relevant sites as they appear. Monitoring a wide variety of RSS feeds can help you do this.
- Over time you may want to adjust your list of sites to better fit the actual searches that your visitors are doing. Google Co-op keeps track of the most popular keywords people are using with your search engine, but it can be a long time before a pattern of popular searches emerges. Google Analytics can give you a better idea of how your search engine is performing.
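If you keep your list of sites in a file, it can help to try out your patterns on your own machine before you enter them into your search engine. Here is a minimal sketch in Python of the site: and inurl: queries and the asterisk patterns described above. The function names and the example.com addresses are made up for illustration, and the wildcard matching is only a rough local approximation, not Google's actual matching rules.

```python
import re
from urllib.parse import urlsplit


def build_index_query(site, inurl=None):
    """Compose a query string that uses the site: and inurl: operators."""
    parts = [f"site:{site}"]
    if inurl:
        parts.append(f"inurl:{inurl}")
    return " ".join(parts)


def pattern_to_regex(pattern):
    """Turn a URL pattern with * wildcards into a compiled regular expression."""
    # Escape the literal pieces and let each * stand for any string.
    escaped = (re.escape(piece) for piece in pattern.split("*"))
    return re.compile("^" + ".*".join(escaped) + "$")


def is_included(url, include, exclude=()):
    """Return True if url matches an include pattern and no exclude pattern."""
    # Compare host + path only, so patterns can omit the http:// prefix.
    parts = urlsplit(url)
    target = parts.netloc + parts.path
    if any(pattern_to_regex(p).match(target) for p in exclude):
        return False
    return any(pattern_to_regex(p).match(target) for p in include)


if __name__ == "__main__":
    # Hypothetical patterns for an imaginary site.
    include = ["example.com/howto/*"]
    exclude = ["example.com/howto/archive/*"]

    print(build_index_query("example.com", inurl="howto"))
    for url in (
        "http://example.com/howto/fix-a-bike",
        "http://example.com/howto/archive/old-page",
        "http://example.com/forum/thread-12",
    ):
        print(url, is_included(url, include, exclude))
```

Pasting the string from build_index_query into a Google search shows you which of those pages have actually been indexed, and is_included lets you check a handful of sample addresses against your include and exclude patterns before you commit to them.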
(The part of medium difficulty.) Make it look pretty! You don't have to parachute in teams of web designers from around the world, as we so obviously did to make howhowto, but at least you want to make your search engine usable. Spend some time using it yourself before you give it to the world.
Have a look at the Google Co-op directory to see what others have done!
- Steve Kangas
About Steve
It's fun to make things! Howhowto came about because I wanted a better way to get information about how to do stuff. And it's fun to make websites! Years ago I made a site called bookmarklets.com which enjoyed some success. This is one of my first sites since then. Enjoy!