A roundup of ways to fetch web pages in PHP: automatically collecting page content with PHP

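Contents:

1. Ways to get a web page's source in PHP
2. Fetching web page source with PHP
3. Several ways to get web page content in PHP
4. Getting the content of a specific web page in PHP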

Ways to get a web page's source in PHP

The following methods are worth considering:

Method 1: fetch with file_get_contents

$url = ''; // the address is blank in the source; put the page URL here
$fh = file_get_contents($url); // returns the page as a string, or false on failure
echo $fh;

Method 2: read the page source with fopen

$url = ''; // the address is blank in the source; put the page URL here
$handle = fopen($url, 'rb'); // opening a URL requires allow_url_fopen to be enabled
if ($handle === false) {
    exit('Could not open the URL');
}
$contents = '';
while (!feof($handle)) {
    $contents .= fread($handle, 8192); // read in chunks of up to 8192 bytes
}
fclose($handle);
echo $contents; // output the fetched content

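Method 3: fetch the page source with cURL

$url = ''; // the address is blank in the source; put the page URL here
$UserAgent = 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; SLCC1; .NET CLR 2.0.50727; .NET CLR 3.0.04506; .NET CLR 3.5.21022; .NET CLR 1.0.3705; .NET CLR 1.1.4322)';
$curl = curl_init(); // create a new cURL handle
curl_setopt($curl, CURLOPT_URL, $url); // set the URL to fetch
curl_setopt($curl, CURLOPT_HEADER, 0); // 0 leaves the response headers out of the output, 1 includes them
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1); // return the response from curl_exec() as a string instead of printing it; curl_exec() returns false on failure
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false); // skips certificate verification; insecure, so avoid it outside of testing
curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, 0); // skips the host name check; insecure, so avoid it outside of testing
curl_setopt($curl, CURLOPT_ENCODING, ''); // an empty string sends an Accept-Encoding header listing every encoding cURL supports
// Values of the "Accept-Encoding: " header include "identity", "deflate" and "gzip".
curl_setopt($curl, CURLOPT_USERAGENT, $UserAgent);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1);
// A non-zero value makes cURL follow any "Location: " header the server sends in its response (this is recursive: a redirect target may redirect again).
$data = curl_exec($curl);
echo $data;
// echo curl_errno($curl); // 0 means the transfer succeeded
curl_close($curl); // close the cURL handle and free its resources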

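Further reading

PHP (the name stands for "PHP: Hypertext Preprocessor") is a general-purpose open-source scripting language. Its syntax borrows from C, Java and Perl alongside constructs of its own, which makes it easy to learn. It is widely used, mainly for web development, and is often described as serving dynamic pages faster than CGI programs or Perl scripts.

Unlike many other languages used to build dynamic pages, PHP code is embedded in the HTML document (HTML being an application of Standard Generalized Markup Language) and executed there, which is considerably more efficient than a CGI program that generates all of the HTML markup itself. PHP can also run compiled code; compilation can obfuscate the code and optimize it so that it runs faster.

Reference: the Baidu Baike entry for PHP (Hypertext Preprocessor).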

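Fetching web page source with PHP

You can use the file_get_contents function to get the source: pass it the site's URL and it returns the page as a single string, which you then format or parse as needed.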
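As one illustration of what "formatting" the returned string might look like, here is a minimal sketch that extracts the page title from an HTML string using PHP's DOM extension. The helper name extractTitle and the example.com address are assumptions made for illustration; they do not come from the original answer, and the DOM extension must be available.

```php
<?php
// Hypothetical helper: pull the <title> text out of an HTML string.
function extractTitle(string $html): ?string
{
    if ($html === '') {
        return null; // DOMDocument::loadHTML() rejects an empty string
    }
    $doc = new DOMDocument();
    // Collect parser warnings internally; real-world markup is rarely well formed.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $titles = $doc->getElementsByTagName('title');
    return $titles->length > 0 ? trim($titles->item(0)->textContent) : null;
}

// Usage with a hypothetical address:
// $html = file_get_contents('https://example.com/');
// if ($html !== false) {
//     echo extractTitle($html);
// }
```

The fetch itself stays a plain file_get_contents call; only the post-processing step is new here, and a regular expression or another parser would serve equally well for simple cases.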

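Several ways to get web page content in PHP

A quick roundup of the ways to get web page content in PHP:

Use file_get_contents to fetch the content with a GET request.

Use fopen to open the URL and fetch the content with a GET request.

Use the curl library; before relying on it, check whether the curl extension is enabled in php.ini.

Use file_get_contents to send a POST request to the URL.

Use fopen to open the URL and fetch the content with a POST request.

Use fsockopen to open the URL and get the complete response, headers and body included.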
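The fsockopen approach in the list above has no accompanying code in this article, so the following is a minimal sketch of it, assuming a plain HTTP (not HTTPS) server on port 80. The function names fetchWithSocket and splitHttpResponse, the default timeout and the example.com host are all hypothetical.

```php
<?php
// Hypothetical helper: split a raw HTTP response into its header block and body.
function splitHttpResponse(string $raw): array
{
    // Headers and body are separated by the first blank line (CRLF CRLF).
    $parts = explode("\r\n\r\n", $raw, 2);
    return ['header' => $parts[0], 'body' => $parts[1] ?? ''];
}

// Hypothetical helper: send a plain HTTP/1.0 GET request over a socket.
function fetchWithSocket(string $host, string $path = '/', int $port = 80, float $timeout = 5.0): ?array
{
    $fp = @fsockopen($host, $port, $errno, $errstr, $timeout);
    if ($fp === false) {
        return null; // connection failed; $errno and $errstr hold the reason
    }
    stream_set_timeout($fp, (int) ceil($timeout)); // limit how long each read may block

    // HTTP/1.0 with "Connection: close": the server closes the socket when the
    // response ends, and the body is not sent with chunked encoding.
    fwrite($fp, "GET {$path} HTTP/1.0\r\nHost: {$host}\r\nConnection: close\r\n\r\n");

    $raw = '';
    while (!feof($fp)) {
        $chunk = fread($fp, 8192);
        if ($chunk === false || $chunk === '') {
            break; // read error, timeout, or end of stream
        }
        $raw .= $chunk;
    }
    fclose($fp);
    return splitHttpResponse($raw);
}

// Usage with a hypothetical host:
// $response = fetchWithSocket('example.com', '/');
// if ($response !== null) {
//     echo $response['header'], "\n\n", $response['body'];
// }
```

Working at the socket level gives you the status line and every header verbatim, at the cost of handling redirects, TLS and encodings yourself; for most pages the stream or cURL approaches are less work.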

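Getting the content of a specific web page in PHP

1. Use file_get_contents to send a POST request to the URL

<?php
$url = ''; // the address is blank in the source; put the target URL here
$data = array('foo' => 'bar');
$data = http_build_query($data); // encode the fields as "foo=bar"
$opts = array(
    'http' => array(
        'method'  => 'POST',
        'header'  => "Content-type: application/x-www-form-urlencoded\r\n" .
                     "Content-Length: " . strlen($data) . "\r\n",
        'content' => $data
    )
);
$ctx = stream_context_create($opts);
$html = @file_get_contents($url, false, $ctx); // @ hides the warning; $html is false on failure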

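2. Use file_get_contents to fetch the content with a GET request

<?php
$url = ''; // the address is blank in the source; put the page URL here
$html = file_get_contents($url);
echo $html;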
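The GET example above echoes whatever file_get_contents returns, including false when the request fails. A minimal sketch of a more defensive variant follows, assuming a stream context with a timeout is acceptable; the function name fetchWithTimeout and its default of five seconds are hypothetical choices.

```php
<?php
// Hypothetical helper: GET a URL with a timeout and return null on failure.
function fetchWithTimeout(string $url, float $timeout = 5.0): ?string
{
    // The 'http' options apply to http:// and https:// URLs only.
    $ctx = stream_context_create([
        'http' => [
            'method'  => 'GET',
            'timeout' => $timeout, // read timeout in seconds
        ],
    ]);
    $html = @file_get_contents($url, false, $ctx); // @ hides the warning on failure
    return $html === false ? null : $html;
}

// Usage with a hypothetical address:
// $html = fetchWithTimeout('https://example.com/');
// echo $html ?? 'request failed';
```

Returning null rather than false lets callers use the ?? operator, which is a matter of taste; the essential part is checking the result before printing it.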

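3. Use fopen to open the URL and fetch the content with a GET request

<?php
$url = ''; // not set in the source; put the page URL here
$fp = fopen($url, 'r');
$meta = stream_get_meta_data($fp); // stream metadata; for HTTP the response headers are in 'wrapper_data'
$header = implode("\n", $meta['wrapper_data']);
$result = '';
while (!feof($fp)) {
    $result .= fgets($fp, 1024);
}
echo "url header: {$header} <br>";
echo "url body: $result";
fclose($fp);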

4. Use fopen to open the URL and fetch the content with a POST request

<?php
$data = array('foo2' => 'bar2', 'foo3' => 'bar3');
$data = http_build_query($data);
$opts = array(
    'http' => array(
        'method'  => 'POST',
        'header'  => "Content-type: application/x-www-form-urlencoded\r\nCookie: cook1=c3; cook2=c4\r\n" .
                     "Content-Length: " . strlen($data) . "\r\n",
        'content' => $data
    )
);
$context = stream_context_create($opts);
$html = fopen(';id2=i4', 'rb', false, $context); // the URL is truncated in the source; only its tail ";id2=i4" survives
$w = fread($html, 1024); // reads at most the first 1024 bytes of the response
echo $w;

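5. Use the curl library; before relying on it, check whether the curl extension is enabled in php.ini

<?php
$ch = curl_init();
$timeout = 5; // connection timeout in seconds
curl_setopt($ch, CURLOPT_URL, ''); // the URL is blank in the source; put the page URL here
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the response instead of printing it
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$file_contents = curl_exec($ch);
curl_close($ch);
echo $file_contents;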
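The heading above recommends checking that the curl extension is enabled, but the code does not do so and does not inspect the result of curl_exec. A minimal sketch that adds both checks follows; the function name fetchWithCurl, the shape of the returned array and the choice to treat HTTP status codes of 400 and above as failures are assumptions, not part of the original.

```php
<?php
// Hypothetical helper: fetch a URL with cURL and report failures explicitly.
function fetchWithCurl(string $url, int $timeout = 5): array
{
    if (!extension_loaded('curl')) {
        return ['ok' => false, 'error' => 'curl extension is not loaded', 'status' => null, 'body' => null];
    }
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,         // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,         // follow redirects
        CURLOPT_CONNECTTIMEOUT => $timeout,     // seconds allowed for connecting
        CURLOPT_TIMEOUT        => $timeout * 2, // seconds allowed for the whole transfer
    ]);
    $body = curl_exec($ch);
    if ($body === false) {
        // Transport-level failure: DNS, connection, unsupported protocol, timeout.
        return ['ok' => false, 'error' => curl_error($ch), 'status' => null, 'body' => null];
    }
    $status = curl_getinfo($ch, CURLINFO_RESPONSE_CODE);
    // Assumption: HTTP 4xx and 5xx responses count as failures.
    // The handle is released when $ch goes out of scope.
    return ['ok' => $status < 400, 'error' => null, 'status' => $status, 'body' => $body];
}

// Usage with a hypothetical address:
// $result = fetchWithCurl('https://example.com/');
// echo $result['ok'] ? $result['body'] : 'failed: ' . $result['error'];
```

Unlike the examples in method 3 earlier, this sketch leaves certificate verification at its default (enabled), which is the safer choice for HTTPS pages.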

Original article by PPTK. If you republish it, please credit the source: https://www.506064.com/zh-tw/n/132482.html
