The first thing required is to set up accounts and projects with the relevant providers.
I won't describe this process, as all three providers document it pretty well.
Bing Developer Center
Yahoo Developer Network
Google Developers Console
A couple of tips for the above sites:
- Bing: Set up both the web and synonym searches.
- Yahoo: In the BOSS console, under manage account, put in a daily limit $ amount (or turn off the limit), as they only allow 1 free query a day, so only the first request works.
- Google: It doesn't look like you can set it up to search the whole web at first, but after creating your custom search engine you can select "Search the entire web but emphasize included sites", so don't worry about that.
All these providers allow for many options while searching (e.g. images, location, news, video, etc.); however, in this initial example I have limited it to just a pure and simple web search.
All the code will be available in my blog GitHub repository.
Going through the main points:
There is a BasicWebSearch interface that takes the search term and returns SearchResults.
SearchResults contains results in a map keyed by a result type enum.
The implementations of BasicWebSearch, namely BingSearch, GoogleSearch and YahooSearch, call the relevant search engine with the search term and then convert the results into a SearchResult. In the case of Yahoo and Bing, I map the JSON result to the SearchResult myself. For Google, the search client included in the dependencies does that mapping.
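To make that structure concrete, here is a minimal sketch of how the interface and the result container could fit together. Only the names BasicWebSearch, SearchResults and SearchResult come from the post; the ResultType values, the fields on SearchResult, the method names and the StubSearch class are assumptions for illustration and may differ from the code in the repository.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical result categories; the real enum may use different values.
enum ResultType { WEB, IMAGE, NEWS, VIDEO }

// A single hit. These three fields are an assumed minimum, not the real class.
final class SearchResult {
    final String title;
    final String url;
    final String description;

    SearchResult(String title, String url, String description) {
        this.title = title;
        this.url = url;
        this.description = description;
    }
}

// Groups hits in a map keyed by the result type enum.
final class SearchResults {
    private final Map<ResultType, List<SearchResult>> results =
            new EnumMap<>(ResultType.class);

    void add(ResultType type, SearchResult result) {
        results.computeIfAbsent(type, key -> new ArrayList<>()).add(result);
    }

    // Returns a read-only view; an empty list if nothing was stored for the type.
    List<SearchResult> get(ResultType type) {
        return Collections.unmodifiableList(
                results.getOrDefault(type, Collections.emptyList()));
    }
}

// Takes the search term and returns the grouped results.
interface BasicWebSearch {
    SearchResults search(String searchTerm);
}

// Stand-in implementation that returns canned data instead of calling a provider.
final class StubSearch implements BasicWebSearch {
    @Override
    public SearchResults search(String searchTerm) {
        SearchResults results = new SearchResults();
        results.add(ResultType.WEB, new SearchResult(
                "Result for " + searchTerm,
                "http://example.com/",
                "Canned description"));
        return results;
    }
}
```

A caller would then only depend on the interface, for example `new StubSearch().search("java").get(ResultType.WEB)`, and a real provider class could be swapped in without changing the calling code.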
Now for the main code bits:
SearchSettings
As this is just an example, I have included the search settings in the following class; be sure to replace them with your own values.
UrlConnectionHandler
As both Bing and Yahoo use an HttpURLConnection, I figured I would centralise the handling of that. The only difference between the two is that Bing uses basic authentication, while for Yahoo I went with the OAuth implementation.
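As a rough illustration of what such a shared handler might look like, the sketch below builds a standard HTTP Basic `Authorization` header and performs a GET over `HttpURLConnection`. The class name UrlConnectionSketch and both method names are made up for this example and are not the actual UrlConnectionHandler; OAuth signing for Yahoo is left out, on the assumption that the caller supplies an already built header value.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not the UrlConnectionHandler from the repository.
final class UrlConnectionSketch {

    // Builds the value of an HTTP Basic Authorization header: "Basic base64(user:password)".
    static String basicAuthHeader(String user, String password) {
        String credentials = user + ":" + password;
        return "Basic " + Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
    }

    // Performs a GET and returns the response body as a string.
    // authorizationHeader may be null when no authentication is needed.
    static String get(String url, String authorizationHeader) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) new URL(url).openConnection();
        try {
            connection.setRequestMethod("GET");
            if (authorizationHeader != null) {
                connection.setRequestProperty("Authorization", authorizationHeader);
            }
            int status = connection.getResponseCode();
            if (status != HttpURLConnection.HTTP_OK) {
                throw new IOException("Unexpected HTTP status " + status);
            }
            try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                    connection.getInputStream(), StandardCharsets.UTF_8))) {
                StringBuilder body = new StringBuilder();
                String line;
                while ((line = reader.readLine()) != null) {
                    body.append(line).append('\n');
                }
                return body.toString();
            }
        } finally {
            connection.disconnect();
        }
    }
}
```

Keeping the header construction separate from the request means the same `get` method could serve both providers: one caller passes a Basic header, the other passes whatever header its OAuth library produces.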
BingSearch
BingResultParser
YahooSearch
YahooResultParser
GoogleSearch
GoogleSearchResult
Google returns a whole bunch of extra information, so I extended the base SearchResult to add all of it, just in case I ever need it.
Maven Dependencies
Very interesting. Thanks for the post.
So, only Google returns the extra meta (pagemap) data? That is unfortunate, as their service is ridiculously expensive, while the others are more reasonable.
It is the best post on making a customized own search engine and I think that it will be worth to try all of them. Thanks for sharing the good and worthy post.
It's cool that you report such things.