$_$_BEGIN_HTML $_$_END_HTML $_$_TITLE Search engine robots $_$_DESCRIPTION This page lists the search engine robots known to JafSoft Limited $_$_KEYWORDS scooter, gulliver, slurp, googlebot, netmind, alexa, ia_archiver, architectspider, $_$_KEYWORDS ultraseek, lycos_spider, diibot, nttdirectory_robot, Linkwalker, $_$_KEYWORDS linkalarm, linklint, linkscan, linkchecker, linkverify, linkbot, $_$_KEYWORDS xenu's link sleuth, go!zilla, getright, getsmart, download wonder, $_$_KEYWORDS netzip download, ecatch, MSIECrawler, MSProxy, CNET_Snoop, search engine robots $_$_TABLE_HEADER_ROWS 1 $_$_TABLE_MIN_COLUMN_SEPARATION 2 $_$_CHANGE_POLICY column merging factor : 0 $_$_CHANGE_POLICY default table width : 75% $_$_RESET_HTML_FRAGMENT HTML_HEADER $_$_BEGIN_HTML
Search engine robots that visit your web site

$_$_END_HTML

*Contents of this page*

$_$_CONTENTS_LIST

Search engines and other sites send robots to read and index your pages. This page reverses that process and indexes the robots.

This information has been gleaned from the server logs for www.jafsoft.com. Whenever a page is read from a web site, the log file records a number of details, including the time, the IP address and usually the referrer page and the user agent. You can see this in our [[HYPERLINK URL,log_sample.html,"analysis of a server log sample"]].

Unlike many pages that list web robots, this page actually tries to visit the robots themselves. Where possible, links are provided to the robots' home pages, and descriptions are given of what they're up to. This page is updated regularly as more information is found (the last update was on *[[TIMESTAMP]]*).

Well-behaved robots will identify themselves, often supplying web or email addresses you can contact. In any case, the pattern of pages being read and the IP addresses being used soon sorts the men from the robots. Good robots will read robots.txt to see what your site policy is, but there are other ways of spotting robots.

In addition to the search engine robots, other "user agents" will visit your site, e.g. to validate links to your site from other people's pages. Often these will just request the HEAD of the file, rather than doing a GET on the whole file.

You can also visit our page [search_engines].

_*This page is regularly converted from this [[SOURCE_FILE "text file"]] by the author's own text to HTML converter [AscToHTM_abs]. The last update was on [[TIMESTAMP]]. This software is available as shareware (cost $40)*_

Search engine robots and others
===============================

The following table lists the search engines that spider the web, the IP addresses that they use, and the robot names they send out to visit your site.
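As a rough illustration of how a list like this can be pulled out of a server log, the sketch below tallies user agents and notes which ones fetched robots.txt or only ever sent HEAD requests. It assumes the widely used "combined" log format; the regular expression, the function name `summarise_agents`, and the sample hosts and agent names are all made up for the example and are not taken from the jafsoft.com logs.

```python
import re
from collections import defaultdict

# Assumed "combined" log format (your server may log differently):
# host ident user [time] "METHOD path PROTO" status bytes "referrer" "agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def summarise_agents(lines):
    """Group log entries by user agent, noting hosts and robot-like habits."""
    agents = defaultdict(lambda: {
        "hosts": set(),            # every host/IP the agent was seen from
        "hits": 0,                 # number of requests made
        "read_robots_txt": False,  # did it ever ask for /robots.txt?
        "head_only": True,         # has it only ever sent HEAD requests?
    })
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # skip lines that don't fit the assumed format
        entry = agents[match["agent"]]
        entry["hosts"].add(match["host"])
        entry["hits"] += 1
        if match["path"] == "/robots.txt":
            entry["read_robots_txt"] = True
        if match["method"] != "HEAD":
            entry["head_only"] = False
    return dict(agents)


# Hypothetical log entries, for illustration only.
SAMPLE = [
    'crawler1.example.com - - [01/Jan/2001:00:00:01 +0000] '
    '"GET /robots.txt HTTP/1.0" 200 120 "-" "ExampleBot/1.0"',
    'crawler2.example.com - - [01/Jan/2001:00:00:09 +0000] '
    '"GET /index.html HTTP/1.0" 200 5120 "-" "ExampleBot/1.0"',
    'checker.example.org - - [01/Jan/2001:00:05:00 +0000] '
    '"HEAD /index.html HTTP/1.0" 200 - '
    '"http://www.example.org/links.html" "ExampleLinkChecker/0.1"',
    'this line is not a log entry',
]

if __name__ == "__main__":
    for agent, info in sorted(summarise_agents(SAMPLE).items()):
        print(agent, sorted(info["hosts"]), info)
```

An agent that reads robots.txt and turns up from several related hosts is very likely a robot, while one that only sends HEAD requests is more likely a link checker. Neither test is conclusive on its own.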
Version numbers are usually included in the robot names, but are omitted here except where the version implies a visit from a different IP address or (as in inktomi) a different search engine. Often multiple IP addresses are used, in which case we just give a flavour of the names or numbers. Inktomi is a company that offers search engine technology and is used by a number of sites (e.g. www.snap.com and www.hotbot.com). Wherever appears this indicates a number of different digits may be used. $_$_BEGIN_TABLE $_$_TABLE_MAY_BE_SPARSE $_$_TABLE_ALIGN CENTER Home page/search engine | Robot identifier | IP address(es) ======================================================================================= www.aesop.com | AESOP_com_SpiderMan | 209.189.115.49 | | www.alexa.com | ia_archiver | green.alexa.com | | sarah.alexa.com | | www.altavista.com | Scooter | test-scooter.pa.alta-vista.net | | brillo.pa.alta-vista.net | | av-dev4.pa.alta-vista.net | | scooter.aveurope.co.uk | | bigip1-snat.sv.av.com | Mercator | mercator.pa-x.dec.com | | scooter.pa.alta-vista.net | | election2000crawl-complaints-to-admin.webresearch.pa-x.dec.com | Scooter2_Mercator_3-1.0 | scooter.sv.av.com | roach.smo.av.com-1.0 | avfwclient.sv.av.com | Tv_Merc_resh_26_1_D-1.0 | tv.sv.av.com | | www.altavista.co.uk | AltaVista-Intranet | host-119.altavista.se | jan.gelin@av.com | | | www.alltheweb.com | FAST-WebCrawler | 209.67.247.154 | crawler@fast.no | | www.fast.no/faq/faqfastwebsearch/faqfastwebcrawler.html | | | Wget | ext-gw.trd.fast.no | | www.acoon.de | Acoon Robot | 194.231.42.178 | | www.atomz.com | Atomz | router-sc.atomz.com | | www.crawler.de | Crawler | crawlit.crawler.de | admin@crawler.de | | | www.daum.net | RaBot | 210.183.28.46 | Agent-admin/ phortse@hanmail.net | | contact/jylee@kies.co.kr | 211.50.57.6 | | | RaBot | 202.30.94.34 | Agent-admin/ webmaster@kisco.go.kr | | | www.excite.com | ArchitextSpider | Musical instruments are used | | in the name such as viola.excite.com | | 
cello.excite.com | | piano.excite.com | | kazoo.excite.com | | ride.excite.com | | sabian.excite.com | | sax.excite.com | | bugle.excite.com | | snare.excite.com | | ziljian.excite.com | | bongos.excite.com | | maturana.excite.com | | mandolin.excite.com | | piccolo.excite.com | | kettle.excite.com | | ichiban.excite.com | | (and the rest of the band) | | more recently first names are being | | used like philip.excite.com | | peter.excite.con | | perdita.excite.com | | macduff.excite.com | | agouti.excite.com | | | | | | (excite) | ArchitectSpider | crimpshrine.atext.com | | ichiban.atext.com | | www.euroseek.net | Arachnoidea | 212.209.54.134 | arachnoidea@euroseek.net | | | www.ezresults.com | EZResult | 216.28.23.59 | | www.findsame.com | DIIbot | 207.230.106.188 (see also www.powerinter.net | robot@digital-integrity.com | below) | | | | www.fireball.de | KIT-Fireball | ???? | | www.geckobot.com | geckobot | ???.rdc1.az.coxatwork.com | | www.gendoor.com | GenCrawler | ???? (Genealogical Search Engine) | | | | www.google.com | Googlebot | c.googlebot.com | googlebot@googlebot.com | | http://googlebot.com/ | | | www.goo.ne.jp | moget/2.0 | 202.229.31.13 | moget@goo.ne.jp | | | (inktomi) | Slurp.so/1.0 | q2004.inktomisearch.com | slurp@inktomi.com | j5006.inktomisearch.com | | (inktomi) | Slurp/2.0j | 202.212.5.34 | slurp@inktomi.com | goo313.goo.ne.jp | www.inktomisearch.com | | | (inktomi) | Slurp/2.0-KiteHourly | y400.inktomi.com | slurp@inktomi.com; | | www.inktomi.com/slurp.html | | | (inktomi) | Slurp/2.0-OwlWeekly | 209.185.143.198 | spider@aeneid.com | | www.inktomi.com/slurp.html | | | (inktomi) | Slurp/3.0-AU | j6000.inktomi.com | slurp@inktomi.com | | www.inktomisearch.com | | | www.hubat.com | Hubater | 209.114.176.250 | | www.infoseek.com | UltraSeek | cde2c923.infoseek.com | | cde2c91f.infoseek.com | InfoSeek Sidewinder | cca26215.infoseek.com | | www.informatch.com/mediabot/ | MP3Bot | 212.204.169.52 | | www.ip3000.com | C-PBWF-ip3000.com-crawler | 
www.ip3000.com | ip3000.com-crawler | | | www.lexis-nexis.com | LNSpiderguy | firewall5.lexis-nexis.com | | www.looksmart.com | MantraAgent | fjupiter.looksmart.com | | www.lycos.com | Lycos_Spider_(T-Rex) | bos-spider.bos.lycos.com | | 216.35.194.188 | | www.mirago.co.uk | HenryTheMiragoRobot | 194.202.39.46 | | www.northernlight.com | Gulliver | marvin.northernlight.com | | taz.northernlight.com | | www.portaljuice.com | PJspider | timber.nextopia.com | | www.powerinter.net | DIIbot | node-d8e93393.powerinter.net but it won't let us in :-( | | | | http://navi.ocn.ne.jp/ | nttdirectory_robot | lilis00.navi.ocn.ne.jp | super-robot@super.navi.ocn.ne.jp | | griffon | lilis04.navi.ocn.ne.jp | griffon@super.navi.ocn.ne.jp | | | www.maxbot.com | Spider/maxbot.com | search.wport.com | admin@maxbot.com | | | ??? | various (fakes agent on each access) | pool0058.cvx2-bradley.dialup.earthlink.net | | ??? | gazz/1.0 | deleuze.infobee.ne.jp | gazz@nttrd.com | derrida.infobee.ne.jp | | ??? | ??? | search-8.xift.com | | www.nationaldirectory.com | NationalDirectory-SuperSpider | spider.nationaldirectory.com | | 209.116.58.143 | | www.pinpoint.com | CrawlerBoy Pinpoint.com | nitrogen.pinpoint.com | | www.petersnews.com | user.ip3000.com | news.petersnews.com | | http://www.vestris.com/alkaline | AlkalineBOT | host130.uv-ray.com | | www.singingfish.com | asterias | grouper.singingfish.com | | www.speedfind.de | speedfind ramBot xtreme | BWEB.highway.telekom.at | | www.surfnomore.com | Surfnomore Spider v1.1 | 165.90.194.245 | | www.supersnooper.com | Robot@SuperSnooper.Com | 207.8.212.162 | | www.travel-finder.com | ESISmartSpider | 202.46.33.15 | | www.uksearcher.co.uk | UK Searcher Spider | - | | www.walhello.com | appie | ...speed.planet.nl | | www.websmostlinked.com | Nazilla | - | | www.webwombat.com.au | www.WebWombat.com.au | 202.139.99.131 | | www.webtop.com | MuscatFerret | ferret.webtop.com | | www.whizbanglabs.com | WhizBang! 
Lab | 216.250.143.108 | | | | www.wisenut.com | ZyBorg | - (in beta) | (info@WISEnut.com) | | | www.wire.co.uk | WIRE WebRefiner: | brighton.wire.co.uk | webrefiner@wire.co.uk | | | www.worldsearchcenter.com | WSCbot | ??? | | | libwww-perl | www.linpro.no/lwp/ | | http://verno.ueda.info.waseda.ac.jp/ | | Iron33 | 207.18.183.251
$_$_END_TABLE

Link Checkers, Link monitors and bookmark managers
==================================================

Link checkers and bookmark managers are run by people wanting to keep their pages and bookmarks up to date. Being visited by a link checker is good news, as it means that someone has linked to you and cares that you're still alive.

Link monitors regularly check your pages for changes, usually because someone has selected your page as "one to watch". (pause for warm glow :-)

If you have access to the server log, check the referrer page to try and get the URL from which you are linked. Sometimes these URLs are inside password-protected parts of sites, so you won't be able to view the page. If you build up a list of sites that link to you, these are the guys you should tell when you move (moral: never move).

It's also quite common for the link checker to give no indication of which URL it's coming from. Some link checkers always come from the same IP address; more usually they come from the client's site. It depends on whether the site owner has purchased a copy of the link checking software, or signed up to some centralized link checking service. If they blank the referrer URL field but you have the client's IP address, you can always try visiting that address and surfing their site.

Some of these tools appear to imply they're extracting email addresses (e.g. EmailSiphon). As such they're probably unwelcome visitors, since these addresses are probably being collected for spammers.
You can read more about this at www.csc.ncsu.edu/~brabec/antispam.html A page listing various link checkers (and other tools) can be found at www.softwareqatest.com/qatweb1.html#LINK $_$_BEGIN_TABLE $_$_TABLE_ALIGN CENTER Robot identifier IP address(es) Link Checker home page ======================================================================================= LinkWalker lw.seventwentyfour.com www.seventwentyfour.com 209.167.50.23 LinkAlarm linkalarm.com www.linkalarm.com NetMind-Minder marvin.netmind.com (retired) www.netmind.com gary.netmind.com meg.netmind.com inyanga.netmind.com leo.netmind.com gemini.netmind.com Check&Get http://checkget.udm.net/ (also shown as referrer page) CheckWeb www.asi.fr/~duby/chkweb.htm CNET_Snoop www.download.com (only if you have software listed at that site) EmailSiphon We don't list information like this on this site. EmailWolf www.pixeltech.com.au/~msw/ewolf/index.html The Informant cosmo.dartmouth.edu http://informant.dartmouth.edu/ The Intraformant jdwhatsnew.cgi www.jdrowell.com/Linux/Projects/jdwhatsnew LinkLint-checkonly -- www.goldwarp.com/bowlin/linklint/ javElink salix.ingetech.com www.dailydiffs.com Lambda LinkCheck 195.139.70.25 www.stud.ifi.uio.no/~lmariusg/download/python/LinkCheck.html LinkScan Server www.elsop.com LinkSweeper www.lss.com.au/lss/windows/ls/linksweeper.htm LinkVerify Spider frances.yourwebhost.com www.enduser.co.uk/linkverify/ Linkbot www.tetranetsoftware.com/products/linkbot.htm Morning Paper www.boutell.com/morning/ NetLookout -- www.frugalsoft.com/lookout/ NetMechanic gamma.netmechanic2.com www.netmechanic.com www.elsop.com Rational SiteCheck www.rational.com/products/teamtest/prodinfo/sitecheck.jtmpl Robozilla h-206---.netscape.com http://directory.mozilla.org/ (checks links in the dmoz directory) SyncIT www.bookmarksync.com WatzNew Agent www.watznew.com WebTrends Link Analyzer www.webtrends.com Xenu's Link Sleuth www.snafu.de/~tilman/xenulink.html $_$_END_TABLE Validators ========== Validators 
check your web pages for HTML correctness and standards compliance. Since other people are unlikely to send a validator to *your* site, you don't usually see much of this. However, if you choose to validate your own site, then the validation attempts will appear in your logs. Consequently the list below is limited to the on-line validator I use (and recommend) and a URL submission service that I use.

$_$_BEGIN_TABLE
Robot Identifier    IP address           Validator home page
====================================================================
W3C_Validator       abyss.w3.org         http://validator.w3.org/
Tooter              selfpromotion.com    www.selfpromotion.com. This is used as part
                                         of a link submission agent
                                         (trebor@animeigo.com)
$_$_END_TABLE

FTP clients and download managers
=================================

If you offer files for download, then you'll start to be visited by various FTP clients. Clients like Go!Zilla and GetRight are smart in that they can resume downloads that have been interrupted. This relies on your web server supporting requests for part of a file (byte ranges), but that's fairly standard these days.

If your download files are over 1MB in size (or if your server is slow), you'll often see the same IP address make multiple partial downloads of your file (look at the file size). In the case of clients like Go!Zilla and GetRight, if these add up to the right number of bytes, then chances are the download succeeded.
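The adding-up described above can be sketched as follows. This is only an illustration: it assumes you have already extracted (host, path, bytes sent) triples from your log and that you know the true size of each download file. The function name `likely_completed`, the file name and the addresses are hypothetical. Totals at or above the file size are counted as complete, on the assumption that a resuming client may re-fetch a little of what it already has.

```python
from collections import defaultdict


def likely_completed(entries, file_sizes):
    """Return (host, path) pairs whose transfers add up to the file size.

    entries    -- iterable of (host, path, bytes_sent) tuples from a log
    file_sizes -- dict mapping download path -> real file size in bytes
    """
    totals = defaultdict(int)
    for host, path, bytes_sent in entries:
        if path in file_sizes:  # ignore requests for anything else
            totals[(host, path)] += bytes_sent
    # ">=" rather than "==" because resumed transfers can overlap slightly.
    return sorted(
        key for key, total in totals.items()
        if total >= file_sizes[key[1]]
    )


# Hypothetical data, for illustration only.
FILE_SIZES = {"/download/example.zip": 1_500_000}
ENTRIES = [
    ("192.0.2.10", "/download/example.zip", 600_000),  # interrupted
    ("192.0.2.10", "/download/example.zip", 500_000),  # resumed
    ("192.0.2.10", "/download/example.zip", 400_000),  # resumed again
    ("192.0.2.20", "/download/example.zip", 600_000),  # never resumed
    ("192.0.2.20", "/index.html", 5_000),              # not a download
]

if __name__ == "__main__":
    for host, path in likely_completed(ENTRIES, FILE_SIZES):
        print(host, "probably completed", path)
```

This only works for clients that resume. A client that restarts from the beginning each time can also pass the test without ever finishing, so treat the result as a hint rather than proof.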
$_$_BEGIN_TABLE $_$_TABLE_LAYOUT 2,"31","255" $_$_TABLE_ALIGN CENTER Client Identifier FTP Client home page ================================================ BatchFTP www.dynamicnet.net/products/batchftp.htm ChinaClaw http://go2.163.com/~22787/chinaclaw.htm (Chinese) (Chinese download utility) DA www.lidan.com www.downloadaccelerator.com Download Demon www.netzip.com Download Wonder www.forty.com Go!Zilla www.gozilla.com GetRight www.getright.com MyGetRight GetSmart http://members.xoom.com/m507/ JetCar (or FlashGet) www.amazesoft.com LeechFTP http://stud.fh-heilbronn.de/~jdebis/leechftp/ Mass Downloader www.geocities.com/SiliconValley/Vista/2865/md.htm NetZip Downloader www.netzip.com SmartDownload NetAnts www.netants.com Net Vampire www.netvampire.com Octopus http://moskalyuk.com/octopus/ RealDownload http://service.real.com/help/faq/rdown4/rdownfaqa01.html $_$_END_TABLE Browsers ======== Most browsers identify themselves with a string that begins "Mozilla...". I've chosen not to document those (as yet). Here are a few of the rarer browser identifiers that I've seen. $_$_BEGIN_TABLE $_$_TABLE_ALIGN CENTER Browser identifier Information ------------------------------------------- xChaos_Arachne http://browser.arachne.cz/ (DOS-compatible browser. 
Linux version under development) IBrowse http://www.hisoft.co.uk/ (search for IBrowse) Amiga-based browser ICab http://www.icab.de/index.html (Macintosh-only) Konqueror http://www.konqueror.org/konq-browser.html (Linux KDE browser) Lynx http://lynx.browser.org/ (Cross-platform text-based browser) OmniWeb http://www.omnigroup.com/products/omniweb/ (Macintosh-only) Opera http://www.opera.com/ (Cross-platform, small, efficient and standards-led browser) pwWebSpeak http://www.prodworks.com/issound/catalog/catalog_pwwebspeak.html Audio Browser QWeb http://sunsite.auc.dk/qweb/ (Linux browser) (see also http://browswerwatch.internet.com/news/story/qweb8.html) VMS_Mosaic http://vaxa.wvnet.edu/vmswww/vms_mosaic.html (OpenVMS-only version of Mosaic, a pre-Netscape browser) WannaBe http://mindstory.com/wb2/ (Macintosh text-only browser) $_$_END_TABLE Offline browsers and other agents ================================= $_$_BEGIN_TABLE $_$_TABLE_ALIGN CENTER Agent Identifier Agent home page ================================================= AnswerChase www.answerchase.com/advan.html a personal search robot. beholder or www.vigiltech.com/esensedisclaim.html e-sense www.vigiltech.com/esensedisclaim.html contype Possibly Adobe Acrobat or Reader or Adobe Acrobat Reader used with MSIE (I have been unable to confirm this) DaviesBot www.wholeweb.net/web/ DigOut4U www.arisem.com/Enu/ DISCoFinder www.ars.ru/eng/products/discof.asp eCatch www.ecatch.com EirGrabber http://www2p.biglobe.ne.jp/~eir/index.htm (Japanese software from the "Eir Project") Excalibur Internet Spider www.excalib.com/products/ispi/index.shtml ExtractorPro -- FairAd Client www.hager.co.at/fordelka/fairad.htm (German) A German pay-to-surf client FavOrg http://www.zdnet.com/pcmag/stories/solutions/0,8224,2649295,00.html A utility written by PC Magazine to fetch icon files (favicon.ico) for your IE favorites Favorites Sweeper www.manitoolssoftware.cjb.net. 
Another "favorites" tidy-up utility GigaBaz http://brainbot.com/web/en/ GigaBazVStheWeb crawler@brainbot.com Giskard http://212.145.12.170/ (Spanish) www.oralco.com (Trivia note: Giskard is probably named after the Isaac Asimov robot) infoGIST www.infogist.com iSiloWeb www.isilo.com/screensh.htm (for Palm Pilot) larbin http://pauillac.inria.fr/~ailleret/prog/larbin/index-eng.html LexiBot www.lexibot.com Links http://gossamer-threads.com/scripts/links/ (Link management cgi script) logikabot www.logika.net Kenjin Spider www.kenjin.com/kenjin/info.html Mata Hari www.thewebtools.com (Internet search agent) MoveAnnouncer www.moveannouncer.com (notifies webmasters when your pages have moved) MSIECrawler (Microsoft IE4.0) MSProxy NEC Research Agent http://heavenly.nj.nec.com/ Research "Inquirus" (meta?) search engine NexTools WebAgent www.igsnet.com/igs/wagent.html Offline Explorer www.metaproducts.com/OE.html Oxxbot1 www.oxxfordinfo.com (Data mining bot on IP 216.0.86.75) NetAttache Offline browser www.tympani.com/store/NAProDownload.html ParaSite www.ianett.com/parasite/ Phoaks www.phoaks.com/index.html. An index of web resources listed in UseNet. See also www.public.iastate.edu/~CYBERSTACKS/Aristotle.htm Pita (Chub.Stanford.EDU) -- PolyBot http://cis.poly.edu/polybot/ crawls from weasel.poly.edu and grampus.poly.edu PureSight www.puresight.com/Products/PureSightHomeDescription.htm Searchworks Spider www.nedesign.com/Phipps/products.html SilentSurf http://www4.silentsurf.com/ SiteMapper www.trellian.com/mapper/index.html SiteSnagger www.zdnet.com/pcmag/pctech/content/17/04/ut1704.001.html SpaceBison http://members.tripod.com/Proxomitron/features.html A web filter that is "ShonenWare", i.e. you should purchase a Shonen Knife CD if you use it. Shonen Knife are a great Japanese band, much loved by the late Kurt Cobain. Sometimes this sets the referrer page to the band's home page at http://www.mmjp.or.jp/knife/ (or maybe the users just happen to go there themselves). 
SpotOn www.spoton.com (IE add-on that organizes your browsing) SQ Webscanner http://macinsearch.com/users/webscanner/ (on holiday last time I looked) SuperBot www.sparkleware.com/superbot/index.html Teleport Pro www.tenmax.com/teleport/pro/home.htm teoma_agent1 www.teoma.com teoma_admin@hawkholdings.com Another "coming soon" search tool. Crawls from IP address 63.236.92.148. Hawk Holdings is the holding company. The venture is between qwest.net and Baxter Investments UCmore www.ucmore.com A browser plug-in (initially IE only) that searches for related pages and categories. In my experience this seems to entail accessing a favicon.ico file on a daily basis (presumably to refresh the "favorites" list) UdmSearch http://search.mnogo.ru/ Search engine technology, as used at sites such as www.maplesearch.com. Now called mnoGoSearch. vspider www.verity.com/products/intspider/ A commercial spidering product. Webbandit http://softwaresolutions.net/webbandit/index.htm Webclipping.com www.Webclipping.com webcollage Forms a collage from randomly selected web images www.jwz.org/webcollage/ Pet project of one of the authors of Netscape. Seems to come from differing IP nodes. WebCompass ??? (Quarterdeck search engine software) WebCopier www.maximumsoft.com WebFetch www.webfetch.com WebGather http://pccms.pku.edu.cn:8000/ Chinese search project Webpush www.webhauler.com/webpush.htm WebReaper www.otway.com/webreaper/ Webrobot www.multimania.com/dilletb/WebRobot/ WebVCR www.netresultscorp.com/fs_webvcr_info.html WebStripper www.solentsoftware.com/webstripper/ WebTwin www.WebTwin.com Converts websites into help files. 
webwasher www.webwasher.com/en/products/wwash/functions.htm (browser filter) WebZIP www.spidersoft.com Zeus 1500 Webster Pro www.homepagesw.com/webster_overview.htm Zeus 2500 Webster Pro Zeus 4300 Webster Pro
$_$_END_TABLE

Other miscellaneous agents
==========================

These are agents that we've seen but been unable to find information on, or which are slightly unusual in origin. If you have any additional information on any of these, feel free to send it to search@jafsoft.com

[[IGNORE_THIS table is broken. highlighting * is lost]]

$_$_BEGIN_TABLE
$_$_TABLE_ALIGN CENTER
User Agent                    Information
-------------------------------------------------
Albert Indexer                www.albert.com/papers.htm
                              Multi-lingual search technology
Aranha                        Seems to be from a yet-to-be launched site
                              www.girafa.com. Spiders using IP 212.150.51.90
                              which also seems to be Aranha.girafa.com
AVSearch                      Seems to be the AltaVista personal search agent.
                              The crawling site is sometimes referred to in the
                              agent name
Checkbot                      Seems to come from www.oxxfordinfo.com who offer
                              B2B services
Digimarc WebReader            Digimarc searches images on the web looking for
                              digital watermarks. More details at
                              www.digimarc.com/about/index.shtml
EchO!/2.0                     Spiders from 194.254.160.3, which would seem to be
                              part of www.voila.com, a French-based search engine.
FinaleRobot                   The www.expressus.com site describes an Interactive Natural
robot-master@expressus.com    Language encyclopedia that will become a search
                              engine at www.final-e.com. Good name, but at present
                              it just maps back onto the ExpressUs site (not such
                              a good name).
                              Crawls from IP address 64.114.34.115
GentleSpider                  Some sort of spider that usually visits using an IP
                              address from within www.research.att.com or
                              crawler.tivra.com
Gulper Web Bot                www.ecsl.cs.sunysb.edu/~maxim/cgi-bin/Link/GulperBot
                              (Open research project to produce opinion-based
                              search engine)
InterGO                       www.teachersoft.com
                              http://browserwatch.internet.com/news/story/intergo1.html
                              This was a child-safe browser, but it seems no
                              associated page remains
InternetArchive               Presumably www.internetarchive.com, but that's in
                              "stealth mode"
Internet Ninja                www.ifour.co.jp (Japanese Macintosh browser?)
InternetSeer                  A web monitoring service. More details at
                              www.internetseer.com/support/faq.jsp
Katriona                      Something to do with the European Regional Internet
                              Registry (RIPE). Browses using IP address
                              213.219.19.148
larbin                        And from the people that brought you xyro (see below),
sebastien.ailleret@inria.fr   comes another, newer bot. This one seems to crawl from
ghi@lcs.mit.edu               the IP address cremant.inria.fr. *Update* more
                              recently it's also been seen coming from
                              barracutta.lcs.mit.edu
cosmos                        And then there was "cosmos", crawling from
                              pomelos.inria.fr. Seems these people are a webbot
                              factory. Cosmos doesn't offer an email address.
LEIA                          *Unable to find* (Too many "Star Wars" references
                              get in the way)
libwww-perl                   The PERL programming language comes with a number of
                              routines for constructing web-aware scripts. This
                              and related strings are the default user agent
                              identifiers, although it's perfectly easy to change
                              this to be whatever you want.
MultiText                     Research project to index the last week's news items
                              http://multitext.uwaterloo.ca/NetSearch.html
NetCruiser                    www.netcruiser-software.com/products.html
                              It's not clear to me *which* of these products this
                              might be, but I'm assuming it's one of them.
ORA_checksite                 http://www.oreilly.com/openbook/webclient/ch06.html
                              Identifier used in a sample perl program in the
                              online book "Web Client Programming with Perl". The
                              program is used to check links.
                              Obviously people have tried it, and it works :-)
PintaSpider                   *Unable to find* But the spider came from www.cnet.fr
PitSpyder Thread0             *Unable to find*
psbot                         www.picsearch.org/bot.html
                              A bot indexing pictures. Crawls from
                              ps.direct2internet.com
RepoMonkey Bait & Tackle      A bit of detective work here. Recent entries in the
                              log file link this to the site www.hungryhippo.com,
                              although the robot always appears to come from an IP
                              address at backflip.com (a bookmarking service).
                              Visiting www.hungryhippo.com reveals a "coming soon"
                              site. Looking at the HTML source leads to another
                              page at http://www.mezzaluna.net/hungryhippo.com/
                              (appears identical). The META tags for this page all
                              appear to be references to day trading, futures,
                              training and the like, although we did spot the word
                              "fibonacci" (our favourite :-). So... possibly a
                              future search engine related to stock trading? Or
                              maybe the Monkey and Hippo are just feeding me a red
                              herring? There's more. The picture on the Kenjin
                              site at www.kenjin.com/kenjin/info.html is currently
                              the same as that at HungryHippo. Kenjin is an
                              Autonomy company.
Robot2.0(PingSoft)            There are several "PingSoft"s around, but I suspect
                              that this belongs to one of the products listed at
                              http://pingsoft.com.cn/english/e_index.html (e.g.
                              SmartHunter) since I was visited from a Chinese IP
                              address.
ru-robot                      Unable to find details on this, but I'm guessing it's
0.1_hseo(at)cs.rutgers.edu    a research spider from www.rutgers.edu. Crawls using
                              the IP teal.rutgers.edu
TaWWWantula                   *Unable to find*
TeraCrawl                     *Unable to find*
unlostBot                     www.unlost.com is "under construction". The robot came
unlostBot@unlost.com          from IP address 212.37.219.147 which is in France.
utopy                         Coming soon at www.utopy.com (requires flash). This
crawler@utopy.com             venture-capital funded site is "running in stealth
                              mode" before launching the "new new thing" (is that
                              a typo?).
                              One of the Flash pages defines Utopia (geddit?), and
                              some of the browsing is done by IP addresses at
                              ...myutopy.com.
UtilMind HTTPGet              Probably the perl-based (uses the httpget library)
                              web page grabber "Web Thief", described at
                              www.utilmind.com/scripts/webthief.html
UrlScope                      *Unable to find*
VCI WebViewer                 Web browser object that may be incorporated into
                              software www.homepagesw.com/webster_dl.htm
WAVETools                     A set of Delphi components offered to build Internet
                              applications from www.transerve.com
Web Hound                     *Unable to find* Or rather, I found several different
                              "web hounds", so can't tell which this was.
Web Magnet                    www.webmagnet.com
                              This appears to be a tool used by this web
                              consultancy.
WebSymmetrix                  Originates in Korea, and is possibly related to their
                              National Computerization Agency. Uses IP address
                              210.183.28.39
WhosTalking                   http://softwaresolutions.net/whostalking/
                              Software that tracks Trademark usage
xyro                          Seems to be a spider associated with a French
xcrawler@inria.fr             research institute. Usually crawls using the IP
                              address vamos.inria.fr
$_$_END_TABLE

Sites that regularly visit
==========================

Some IP addresses or sites may regularly visit you, although the user agent may be obscure, or even change. Here are a few that I've been able to work out.

$_$_BEGIN_TABLE
Site address(es)          Description
--------------------------------------------------------
proxy.netsetter.org       This is a site that offers a speed-up to your
                          surfing, in return for being able to monitor
                          people's surfing habits. The speed-ups are achieved
                          through a variety of techniques, and the monitoring
                          info is sold on, although your privacy is protected.
                          Visit www.netsetter.org for more details.
pwoshoes.transport.com    *Not known*
...lightrealm.com         This site daily reads any xml files submitted to a
                          shareware site in PAD format. PAD is a means for
                          describing shareware devised by the Association of
                          Shareware Professionals (www.asp-shareware.org).
                          This site is performing daily checks, looking to
                          automatically update its lists with any changes.
$_$_END_TABLE

Awards for this page
====================

$_$_BEGIN_HTML
Spider award for achieving a top 10 position in search engines
$_$_END_HTML All awards gratefully received :-)