How to read a website's source with Java

I want to copy the source of www.baidu.com into a file called baidu.txt on my F: drive. How do I write the Java code for that? I hope an expert can tell me. Thanks!


Suggested answer, 2008-12-13

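You can't read the site's actual source files. At most you can read the page code that Baidu serves, and that code has already been processed on the server, so what reaches you is entirely HTML. Browsers only understand HTML, CSS, and client-side scripts; they don't understand anything else.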


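Other answers

Answer 1, 2008-12-13

Heh, what you describe sounds a lot like the "Radar" information-scraping system we use, which can pull pages from the major websites down to a local machine.

Answer 2, 2008-12-13

You want Baidu's source files too? That's not possible. What you're after is probably the content the page displays, or the contents of the HTML file.

Answer 3, 2008-12-13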

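The URL class in the java.net package has the methods you need. If you know Java, work it out yourself: read the page through an input stream. If you don't, here is a source program:

import java.net.*;
import java.io.*;

public class GetURLPage {
    public static void main(String[] args) {
        try {
            URL url = new URL("http://www.baidu.com/index.htm");
            // No charset is given, so the JVM's default charset is used for decoding
            try (BufferedReader br = new BufferedReader(new InputStreamReader(url.openStream()))) {
                String s;
                // readLine() returns null at end of stream; calling s.equals(null) on it would throw
                while ((s = br.readLine()) != null) {
                    System.out.println(s);
                }
            }
        } catch (IOException e) {
            // Report the failure instead of silently swallowing it
            e.printStackTrace();
        }
    }
}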
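
One thing answer 3 leaves open is the character encoding: `InputStreamReader` without a charset argument decodes with the JVM's default charset, which may not match what the server sends. Below is a minimal sketch of the same stream-based read with the charset passed in explicitly. The class name `PageText`, the method `readPage`, and the command-line arguments are made up for illustration and do not come from the answers here; which charset a given page really uses is something you would have to check yourself.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URI;
import java.net.URL;
import java.nio.charset.Charset;

public class PageText {

    // Hypothetical helper: reads everything served at pageUrl and
    // decodes it with the given charset, one line at a time.
    public static String readPage(URL pageUrl, Charset charset) throws IOException {
        StringBuilder text = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(pageUrl.openStream(), charset))) {
            String line;
            // readLine() strips the line terminator, so add '\n' back
            while ((line = reader.readLine()) != null) {
                text.append(line).append('\n');
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java PageText <url> <charset>");
            return;
        }
        URL url = URI.create(args[0]).toURL();
        System.out.print(readPage(url, Charset.forName(args[1])));
    }
}
```

Running it as `java PageText http://www.baidu.com UTF-8` would print the page, assuming the page really is served as UTF-8.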

Answer 4, 2008-12-14

// Take a look, I've got it working for you.

import java.io.*;
import java.net.URL;

public class Test {
    public static void main(String[] args) {
        String url = "http://www.baidu.com"; // site address
        String wenben = "F:\\baidu.txt";     // output path
        try {
            cunchu(wenben, readPage(url));
            System.out.println("Saved successfully");
        } catch (IOException e) {
            System.out.println("An I/O error occurred: " + e.getMessage());
        }
    }

    // Reads the page at uu line by line and returns its full text.
    public static String readPage(String uu) throws IOException {
        StringBuilder page = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new URL(uu).openStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                page.append(line).append("\n");
            }
        }
        return page.toString();
    }

    // Writes the text to the file at path, creating or overwriting it.
    public static void cunchu(String path, String text) throws IOException {
        // Closing the writer flushes any buffered text to disk
        try (FileWriter out = new FileWriter(path)) {
            out.write(text);
        }
    }
}

This answer was accepted by the asker.
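
If the goal is only to get the page into F:\baidu.txt, the text does not need to be decoded at all. Here is a rough sketch, assuming Java 7 or later, that copies the raw bytes with `Files.copy`, so the file keeps whatever encoding the server sent. `PageSaver` and `savePage` are hypothetical names and are not part of the accepted answer.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class PageSaver {

    // Hypothetical helper: copies the raw bytes served at pageUrl into
    // target, replacing any existing file. Returns the bytes written.
    public static long savePage(URL pageUrl, Path target) throws IOException {
        try (InputStream in = pageUrl.openStream()) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java PageSaver <url> <output-file>");
            return;
        }
        URL url = URI.create(args[0]).toURL();
        Path target = Paths.get(args[1]);
        long bytes = savePage(url, target);
        System.out.println("Saved " + bytes + " bytes to " + target);
    }
}
```

For the question as asked, that would be `java PageSaver http://www.baidu.com F:\baidu.txt`. Working on bytes sidesteps the charset question, at the cost of not being able to inspect or edit the text along the way.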
