Project ID : 100499

Project budget : $1,500

Development period : 7 days

Skills : CSS XML

Category : Web application development - Website development

Posted : 2010-04-15

Description

This task involves scraping one particular section from each of 5 public websites, all of which permit scraping in their robots.txt. Some details:

-The section of each website can be determined via a keyword in the URL.

-The number of documents varies per site but averages about 10K.

-Some structured data will need to be extracted from each page, such as URL, page title and other information within HTML or CSS tags. Approximately 9 attributes will be extracted.

-Data should be delivered as a CSV or XML file in a previously agreed-upon format.

-If the task is completed to a high standard, weekly or bi-weekly refreshes can be negotiated at additional cost.
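
To make the requirements above concrete, here is a minimal sketch of the per-page steps: check robots.txt, select pages by a keyword in the URL, extract attributes, and write CSV. It is only an illustration under stated assumptions, not the agreed method or format. The function names, the `example.com` URLs, the `products` keyword, and the two-column layout (URL and page title only, out of the roughly 9 attributes) are all hypothetical. It uses the Python standard library and takes robots.txt and page content as strings, so fetching is left out.

```python
import csv
import io
from html.parser import HTMLParser
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, url, user_agent="*"):
    """Check a URL against robots.txt rules supplied as text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def in_section(url, keyword):
    """Treat a page as part of the target section if the keyword is in its URL path."""
    return keyword in urlparse(url).path


class TitleParser(HTMLParser):
    """Collect the text found inside <title> tags."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_record(url, html):
    """Build one output row; only two of the attributes are shown here."""
    parser = TitleParser()
    parser.feed(html)
    return {"url": url, "title": parser.title.strip()}


def to_csv(records):
    """Serialize rows to CSV text with a header line (hypothetical column layout)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["url", "title"])
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


if __name__ == "__main__":
    robots = "User-agent: *\nDisallow: /private/\n"
    url = "http://example.com/products/1"
    page = "<html><head><title> Widget </title></head><body></body></html>"
    if is_allowed(robots, url) and in_section(url, "products"):
        print(to_csv([extract_record(url, page)]))
```

With about 10K documents per site, a real run would also need polite rate limiting and error handling, which this sketch omits.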

Please contact me if you have any questions or concerns. I look forward to working with you.

Project bids

	Bidder	Country/Region
	4 Xing910
	3 Itgenes (winning bid)
	3 Djworth
	3 Cmaxo
	2 Gdinnovative
	2 Talentmainly
	2 Infomediatech


Scraping/Mining data from 5 Public Websites
