Question:
Hi, I'm new to Chilkat as well as C#.
I'm trying to crawl my own website, http://www.sh3lls.net, to get all the links on it: outbound links to other domains as well as links to other pages on the same site. I'm working on a project and am using my own website to avoid any TOS issues. When I crawl, I get all the links to other domains, but I don't get any links from my own domain, www.sh3lls.net. Below is the source code:
public string url;
Chilkat.StringArray extractURL(string url)
{
    bool success;
    int i;
    string url2;
    url = txtURL.Text;
    Chilkat.StringArray urlList = new Chilkat.StringArray();
    Chilkat.Spider crawl = new Chilkat.Spider();
    crawl.Initialize(url);
    crawl.AddUnspidered(url);
    success = crawl.CrawlNext();
    urlList.Unique = true;
    urlList.Clear();
    for (i = 0; i <= crawl.NumOutboundLinks - 1; i++)
    {
        url2 = crawl.GetOutboundLink(i);
        //MessageBox.Show(crawl.CanonicalizeUrl(url));
        //MessageBox.Show(crawl.CanonicalizeUrl(url2));
        MessageBox.Show(url);
        MessageBox.Show(url2);
        if (url2.Contains(url))
        {
            MessageBox.Show("works");
            urlList.Append(url2);
            txtLog.Text += "Found New Page to Save " + url2 + "\r\n";
            if (crawl.LastFromCache != true)
            {
                crawl.SleepMs(1000);
            }
        }
    }
    return urlList;
}
Kindly help me get the links from the same domain.
Thanks
Hi, sorry, but I think I found the problem. The cause is the way the links are added on my site; it is not an issue with Chilkat.
Thanks
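For anyone who runs into the same thing: the url2.Contains(url) check in the code above is a plain substring test, so it only matches links written exactly like the URL typed into the text box. A link written without "www." or as a relative path would be skipped. Below is a minimal sketch of a host-based comparison using only System.Uri. The class LinkCheck and the methods IsSameSite and StripWww are hypothetical names made up for this illustration; they are not part of Chilkat. Also, if I remember the Chilkat documentation correctly, GetOutboundLink only returns links to other domains and same-domain links go to the unspidered queue (NumUnspidered / GetUnspideredUrl), so that is worth checking as well.

```csharp
using System;

// Hypothetical helper for illustration only; not part of the Chilkat API.
public static class LinkCheck
{
    // Returns true when 'link' points at the same host as 'siteUrl',
    // ignoring letter case and a leading "www.".
    public static bool IsSameSite(string siteUrl, string link)
    {
        Uri baseUri;
        if (!Uri.TryCreate(siteUrl, UriKind.Absolute, out baseUri))
        {
            return false; // the site URL itself is not a valid absolute URL
        }

        Uri target;
        // Resolves relative links such as "contact.html" against the site URL;
        // absolute links are kept as they are.
        if (!Uri.TryCreate(baseUri, link, out target))
        {
            return false;
        }

        return string.Equals(StripWww(baseUri.Host), StripWww(target.Host),
                             StringComparison.OrdinalIgnoreCase);
    }

    // Drops a leading "www." so that www.sh3lls.net and sh3lls.net compare equal.
    private static string StripWww(string host)
    {
        return host.StartsWith("www.", StringComparison.OrdinalIgnoreCase)
            ? host.Substring(4)
            : host;
    }
}
```

With url set to "http://www.sh3lls.net", a link such as "http://sh3lls.net/page.html" fails the Contains test, while LinkCheck.IsSameSite(url, url2) treats it as the same site. Whether "www." should be ignored is a design choice; remove StripWww if the two hosts should count as different sites.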