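How to download all of a website's resources in C# with CefSharp
Sometimes you come across a nicely designed web page and want to keep a copy, but the browser's "Save As" puts the images, JS, CSS and everything else into a single folder, which is messy. How can you download all of a site's resources (HTML, JS, CSS, images, video, fonts) sorted by type? Doing it by hand is tedious and slow, so this post shows how to download a whole site's resources with CefSharp.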
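To make "sorted by type" concrete, here is a minimal sketch of how a resource URL could be mapped to a category folder by its file extension. The class name `ResourceCategories`, the folder names, and the list of extensions are assumptions chosen for illustration; they are not part of CefSharp or of the handler shown later in this post.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of CefSharp): picks a category folder
// for a resource URL based on the file extension in its path.
public static class ResourceCategories
{
    // Extension-to-folder table; extend it as needed. Lookups ignore case.
    private static readonly Dictionary<string, string> FolderByExtension =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { ".html", "html" }, { ".htm", "html" },
            { ".js", "js" },
            { ".css", "css" },
            { ".png", "images" }, { ".jpg", "images" }, { ".jpeg", "images" },
            { ".gif", "images" }, { ".svg", "images" }, { ".webp", "images" },
            { ".mp4", "video" }, { ".webm", "video" },
            { ".woff", "fonts" }, { ".woff2", "fonts" }, { ".ttf", "fonts" },
        };

    public static string FolderFor(string url)
    {
        // Uri.AbsolutePath excludes the query string and the fragment,
        // so "app.js?v=3" is still recognised as a .js file.
        string path = new Uri(url).AbsolutePath;
        int slash = path.LastIndexOf('/');
        int dot = path.LastIndexOf('.');

        // Only count a dot that appears in the last path segment.
        string extension = dot > slash ? path.Substring(dot) : "";

        string folder;
        return FolderByExtension.TryGetValue(extension, out folder) ? folder : "other";
    }
}
```

Anything without a known extension (for example a page URL ending in `/`) lands in an `other` folder in this sketch; you may prefer to decide by the response's MIME type instead.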
If you don't know how to add CefSharp to a C# project, see this article: http://blog.bfw.wiki/user6/15572101669805510054.html
With CefSharp added, let's give it a try.
1. Implement a custom CefSharp IRequestHandler
Initialize the browser:
browser = new ChromiumWebBrowser("http://www.baidu.com/")
{
    Dock = DockStyle.Fill,
};
browser.RequestHandler = new MyRequestHandler();
The custom IRequestHandler:
using System;
using System.IO;
using CefSharp;

public class MyRequestHandler : IRequestHandler
{
    // Base URL of the page being browsed: everything up to and including the last '/'
    private String hosturl = "";

    public bool CanGetCookies(IWebBrowser chromiumWebBrowser, IBrowser browser, IFrame frame, IRequest request)
    {
        // throw new NotImplementedException();
        return false;
    }

    public bool CanSetCookie(IWebBrowser chromiumWebBrowser, IBrowser browser, IFrame frame, IRequest request, Cookie cookie)
    {
        //throw new NotImplementedException();
        return false;
    }

    public bool GetAuthCredentials(IWebBrowser browserControl, IBrowser browser, IFrame frame, bool isProxy, string host, int port, string realm, string scheme, IAuthCallback callback)
    {
        return false;
    }

    public IResponseFilter GetResourceResponseFilter(IWebBrowser chromiumWebBrowser, IBrowser browser, IFrame frame, IRequest request, IResponse response)
    {
        return null;
        //throw new NotImplementedException();
    }

    public bool OnBeforeBrowse(IWebBrowser chromiumWebBrowser, IBrowser browser, IFrame frame, IRequest request, bool userGesture, bool isRedirect)
    {
        // Remember the base URL so resource URLs can be turned into relative paths later
        hosturl = request.Url.Substring(0, request.Url.LastIndexOf('/')) + "/";
        String path = request.Url.Replace(hosturl, "");
        // Directory.CreateDirectory(s); // create the folder if it does not exist
        // Download the page source
        if (path != "")
        {
            HttpDownFile dt = new HttpDownFile();
            dt.Download(request.Url, Directory.GetCurrentDirectory() + "\\" + path);
        }
        //MessageBox.Show(hosturl);
        return false; // false lets the navigation continue
        // throw new NotImplementedException();
    }

    public CefReturnValue OnBeforeResourceLoad(IWebBrowser browserControl, IBrowser browser, IFrame frame, IRequest request, IRequestCallback callback)
    {
        // Download static resources
        // You can also check the URL here
        //MessageBox.Show(request.Url);
        // Path of the resource relative to the base URL recorded in OnBeforeBrowse
        String path = request.Url.Replace(hosturl, "");
        if (path != "")
        {
            HttpDownFile dt = new HttpDownFile();
            dt.Download(request.Url, Directory.GetCurrentDirectory() + "\\" + path.Replace("/", "\\"));
...
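The handler above builds the local path by stripping `hosturl` from the request URL and hands the result to `HttpDownFile`, whose source is not shown in this part of the post. As a rough sketch of what such a step could look like, the code below derives a local path from a URL with `System.Uri` (which drops the query string) and downloads the file with `HttpClient`. The class `DownloadSketch`, both method names, the per-host folder, and the `index.html` fallback are assumptions for illustration only, not the post's actual `HttpDownFile` implementation.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical stand-in for the post's HttpDownFile class (its source is not shown).
public static class DownloadSketch
{
    // One shared HttpClient for all downloads.
    private static readonly HttpClient Client = new HttpClient();

    // Turns a URL into a local path under rootFolder, one folder per host,
    // e.g. out\www.example.com\css\site.css on Windows.
    public static string LocalPathFor(string url, string rootFolder)
    {
        var uri = new Uri(url);

        // AbsolutePath has no query string or fragment.
        string relative = uri.AbsolutePath.TrimStart('/');

        // Assumption: URLs that name a directory are saved as index.html.
        if (relative.Length == 0 || relative.EndsWith("/", StringComparison.Ordinal))
        {
            relative += "index.html";
        }

        string result = Path.Combine(rootFolder, uri.Host);
        foreach (string segment in relative.Split('/'))
        {
            if (segment.Length == 0) continue;

            // Replace characters the file system does not allow in a file name.
            string safe = segment;
            foreach (char c in Path.GetInvalidFileNameChars())
            {
                safe = safe.Replace(c, '_');
            }
            result = Path.Combine(result, safe);
        }
        return result;
    }

    // Downloads url and writes the bytes to localPath, creating folders as needed.
    public static async Task DownloadAsync(string url, string localPath)
    {
        string folder = Path.GetDirectoryName(localPath);
        if (!string.IsNullOrEmpty(folder))
        {
            Directory.CreateDirectory(folder);
        }

        byte[] data = await Client.GetByteArrayAsync(url);
        File.WriteAllBytes(localPath, data);
    }
}
```

Inside `OnBeforeResourceLoad` you could call `DownloadSketch.LocalPathFor(request.Url, Directory.GetCurrentDirectory())` in place of the string replacement, which also keeps resources served from other hosts in separate folders. Keep in mind that this fetches each resource a second time, outside the browser, so it does not carry the browser's cookies or session.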