How to Install KNOW-Crawler
You do not need to install KNOW-Crawler. To use it, log in to the crawler settings page with your UW credentials at http://depts.washington.edu/knowcse2/firstpage.html (under construction). The settings page is password protected, and only selected members of UW’s KNOW Project will be given access to it.
How to Run KNOW-Crawler
Only verified members of UW’s KNOW Project will be able to run the crawler. Once you are logged in and can access the front page, navigate to either the schedule customization page or the news source customization page. Each page lets you change the web crawler’s settings. These settings are then submitted to the server, which runs the web crawler automatically.
How to Use KNOW-Crawler
Using KNOW-Crawler is easy. The web crawler and annotation database are already installed on a UW server, and the web crawling script is set up to run automatically on that server. All you need to do is specify a time schedule for when the crawler should run and a list of news sources to crawl. A list of international news sources is already stored in the database. Verified members of UW’s KNOW Project will have access to a customization page on the KNOW Project site, where they can modify the crawling schedule and the news source list. Only those verified members will be allowed to change the crawler’s settings (also referred to as scheduling the crawler).
To schedule the crawler:
Navigate to the Time Schedule page by following the link “Change Crawler’s time”.
Specify the time of day (hour and minute) at which the crawler should start searching and annotating news articles, then schedule it to run either daily or weekly. With the daily option, the crawler runs 7 days per week at the specified time. Choosing the weekly option activates a list of checkboxes; the checked boxes specify the days on which the crawler will run.
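The daily and weekly options described above can be pictured with a short Python sketch. This is only an illustration of the scheduling rule, not the crawler's actual code: the names CrawlSchedule, active_days and next_run, and the convention that 0 means Monday, are assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class CrawlSchedule:
    """Hypothetical model of the settings on the Time Schedule page."""
    hour: int
    minute: int
    mode: str = "daily"                # "daily" or "weekly"
    weekdays: frozenset = frozenset()  # checked days, 0 = Monday ... 6 = Sunday

    def active_days(self):
        # Daily means all 7 days; weekly means only the checked days.
        if self.mode == "daily":
            return frozenset(range(7))
        return frozenset(self.weekdays)


def next_run(schedule, now):
    """Return the first scheduled start time strictly after `now`."""
    days = schedule.active_days()
    if not days:
        raise ValueError("a weekly schedule needs at least one day checked")
    # Look at today and the following 7 days, so that a schedule whose only
    # active day is today (with the time already passed) rolls to next week.
    for offset in range(8):
        candidate = (now + timedelta(days=offset)).replace(
            hour=schedule.hour, minute=schedule.minute, second=0, microsecond=0
        )
        if candidate > now and candidate.weekday() in days:
            return candidate
    raise AssertionError("unreachable: an active day always occurs within 8 days")


if __name__ == "__main__":
    monday_morning = datetime(2024, 1, 1, 10, 0)  # 1 January 2024 is a Monday
    print(next_run(CrawlSchedule(2, 30), monday_morning))
    print(next_run(CrawlSchedule(2, 30, "weekly", frozenset({0, 3})), monday_morning))
```

In this sketch a weekly schedule with no day checked is rejected, since it would never run; whether the real settings page enforces the same rule is not stated here.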
To view and edit news source list:
Navigate to the News Source List page by following the link “Change the crawled websites”. You will see the list of all news websites that the crawler searches for articles to annotate.
You can edit this list by adding or removing news sources.
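Adding and removing news sources amounts to editing a list of website addresses. The Python sketch below shows one way such edits could be handled so that the same site is not listed twice. It is an assumption-laden illustration, not the crawler's implementation: the function names normalize_source, add_source and remove_source are hypothetical, and the example.com and example.org addresses are placeholders.

```python
from urllib.parse import urlsplit


def normalize_source(url):
    """Reduce a URL to a canonical form so duplicates are easy to spot."""
    parts = urlsplit(url.strip())
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not a valid news source URL: {url!r}")
    # Lowercase the host and drop any trailing slash from the path.
    return f"{parts.scheme}://{parts.netloc.lower()}{parts.path.rstrip('/')}"


def add_source(sources, url):
    """Append the source unless an equivalent entry is already listed."""
    normalized = normalize_source(url)
    if normalized not in sources:
        sources.append(normalized)
    return sources


def remove_source(sources, url):
    """Remove the source if it is listed; do nothing otherwise."""
    normalized = normalize_source(url)
    if normalized in sources:
        sources.remove(normalized)
    return sources


if __name__ == "__main__":
    sources = []
    add_source(sources, "http://Example.com/news/")
    add_source(sources, "http://example.com/news")  # duplicate, ignored
    add_source(sources, "https://example.org")
    remove_source(sources, "https://example.org/")
    print(sources)
```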
Download and stay up to date: http://code.google.com/p/know-crawler/downloads/list
Read more here: http://code.google.com/p/know-crawler/