Crawling some external sites failed

Ovidiu Becheş-Puia
  • I'm using Search Server Express with WSS 3.0, and I want to crawl external public websites. One of them is:

    http://www.av.se/

    When I run a full crawl, it fails with:

    http://www.av.se
    Access is denied. Check that the Default Content Access Account has access to this content, or add a crawl rule to crawl this content.

    Local sites and other public sites are getting crawled OK.

    What is wrong with that site?

    Could you add it to your content sources and run a full crawl as a test?

  • Check here. I've run into this a few times, and this KB article has really saved me.

    http://support.microsoft.com/kb/896861

    Cheers, Matt B.
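
For an external site, it can also help to check outside SharePoint what the site tells a crawler. The sketch below is only a diagnostic aid, not a confirmed cause for this error: it tests whether a given robots.txt text permits a URL for a crawler's user agent, and it maps an HTTP status code to a rough hint. The user-agent string and both helper names are assumptions made for illustration; substitute the agent your own search server sends.

```python
from urllib.robotparser import RobotFileParser

# Assumption: user-agent token for the crawler. Replace it with the value
# configured on your own search server.
CRAWLER_AGENT = "MS Search 6.0 Robot"


def allowed_by_robots(robots_txt: str, url: str, agent: str = CRAWLER_AGENT) -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


def explain_status(status: int) -> str:
    """Map an HTTP status code to a rough hint about crawl access."""
    if status in (401, 407):
        return "authentication required"
    if status == 403:
        return "forbidden"
    if 200 <= status < 300:
        return "ok"
    return "other"
```

To use it, download http://www.av.se/robots.txt by hand, pass its text to `allowed_by_robots` together with the start address, and compare the result with what the crawl log reports.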

Tags
search crawling internet-sites