YaCy

YaCy is a personal Web crawler and Web search engine. It is also a P2P-based Web index exchange network with no central server and no possibility of censorship. Web crawls can be done locally, or you can trigger a collaborative Web crawl with all other YaCy peers. YaCy is fun to use and shows interesting text, image, audio, and video search results with direct links to Ogg, MP3, and video files. It has a cooperative bookmark system and many Web publishing functions.

Tags: Communications, File Sharing, Information Management, Internet, Proxy Servers, Web, HTTP Servers, Indexing/Search, DNS, Dynamic Content, Networking
Licenses: GPL
Operating Systems: OS Independent
Implementation: Java

Recent releases

  • 14 Jan 2009 18:58

Changes: The full international character set and all UTF-8 characters are now supported for indexing and search. Support has been added for searching with the site:, inurl:, and filetype: operators. A public API has been added for the search results, the indexing, and the link structure, in XML and JSON syntax.

  • 05 Oct 2008 08:07

Changes: This is a quick release with a lot of security fixes and bugfixes.

  • 01 Oct 2008 16:09

Changes: Automatic re-crawling and a combination of Crawls and Bookmarks have been added. It's now possible to customize a personal search portal with YaCy. The functional range for Windows users has been enhanced.

  • 10 Jun 2008 14:53

Changes: The IP and seed handling were improved. Crawl starting was slightly changed. The basic configuration is now very easy as a result of changes to the authorization mechanism. The way networks are defined and switched was improved. YaCy is now SRU compliant. Work on the YaCy-UI rich client is ongoing. In addition, some minor security vulnerabilities have been fixed and a lot of bugfixes have been made.

Changes: Some minor security vulnerabilities have been fixed. Some bugfixes have been made.

Recent comments

08 Mar 2006 07:08 Orbiter

Re: YAcY is a badly behaved robot
Neither is true:

1) YaCy has respected robots.txt since mid-2005; it never ignored robots.txt on purpose. That was simply when support was first implemented.

2) There is no referrer spam. YaCy shows that the page was indexed by a YaCy peer. Since the corresponding web page is then referenced not only by this peer but by all peers, there must be a central address where a referred page can see that it was referenced by a non-centralized web crawler. This is a unique problem that centralized crawlers do not have. In this case YaCy is just being honest and refers to the YaCy project page. This feature was removed with YaCy 0.43 because too many people had been confused by this referrer.

06 Mar 2006 15:42 Low012

Re: YAcY is a badly behaved robot

> 1. YAcY doesn't ask for robots.txt, let alone follow it.
> 2. YAcY posts the yacy web address as the HTTP Refer[r]er header similar to spam bots.

These issues have been resolved for some time now.

27 Feb 2006 17:43 pgregg

YAcY is a badly behaved robot
1. YAcY doesn't ask for robots.txt, let alone follow it.

2. YAcY posts the yacy web address as the HTTP Refer[r]er header, similar to spam bots. Well-behaved bots may put their URL into the User-Agent header.

I only came across this project whilst researching HTTP Referrer spammers. Nice idea, shame about the implementation.
