Another PHP simple DOM selection

I have touched on the PHP Simple HTML DOM Parser before; it's useful for grabbing data from websites where no API is available. It's one of several ways to do this, alongside PHP's native DOMDocument. In this post I combine it with cURL, which fetches the page for the parser to work on.

The idea was to get the water level for one or more reservoirs and the date it was last updated. First, a function that uses cURL to fetch the webpage contents:

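// Assumes simple_html_dom.php sits alongside this script; adjust the path as needed.
require_once 'simple_html_dom.php';

// Fetch a page with cURL and return it as a Simple HTML DOM object (false on failure).
function get_html($url)
{
    $ch = curl_init();
    // Skips TLS certificate checks; remove this line if the site's certificate is valid.
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
    curl_setopt($ch, CURLOPT_URL, $url);
    // Return the response body as a string instead of printing it.
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    // Some sites reject requests that lack a browser-like user agent.
    curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/525.13 (KHTML, like Gecko) Chrome/0.A.B.C Safari/525.13");
    $data = curl_exec($ch);
    curl_close($ch);
    if ($data === false) {
        return false; // the request failed, so there is nothing to parse
    }
    return str_get_html($data);
}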

This returns the page at that URL parsed into a Simple HTML DOM object, ready to be queried.
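
A fetch can fail for reasons that have nothing to do with parsing: a DNS error, a timeout, or an HTTP 404. As a rough sketch only, a stricter fetch step might throw instead of passing bad data along. The function name fetch_html_string, the timeout parameter and the choice to treat any status of 400 or above as an error are my own assumptions, not part of the original script.

```php
<?php
// Hypothetical stricter fetch step: returns the raw HTML string, or throws
// if the transfer fails or the server answers with an error status.
function fetch_html_string(string $url, int $timeoutSeconds = 15): string
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);            // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);            // follow redirects
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeoutSeconds); // give up if connecting takes too long
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds);        // cap the whole transfer as well

    $body   = curl_exec($ch);
    $error  = curl_error($ch);
    $status = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);

    if ($body === false) {
        throw new RuntimeException("cURL failed for $url: $error");
    }
    if ($status >= 400) {
        throw new RuntimeException("HTTP $status returned for $url");
    }
    return $body;
}
```

The returned string can then be handed to str_get_html() inside a try/catch, so a failed request is reported rather than silently producing an empty result.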

Now use the find method with CSS-style selectors to pinpoint the data:

$html = get_html('https://theurl.com/page');
$level = $html->find('tr.capacityPercentage td', 0)->innertext;
$updated = $html->find('span.value', 0)->innertext;
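
One thing to watch: when a selector matches nothing, find() with an index returns null, and reading ->innertext from null raises an error. If the site changes its markup, the two lines above will break. A small wrapper can guard against that. This is only a sketch, and the name first_inner_text is hypothetical; it accepts any object that exposes a find($selector, $index) method, which is how the parser's objects behave.

```php
<?php
// Hypothetical helper: returns the trimmed inner text of the first match,
// or null when the page failed to parse or the selector matches nothing.
function first_inner_text($dom, string $selector): ?string
{
    if (!is_object($dom)) {
        return null; // e.g. the parse step returned false
    }
    $node = $dom->find($selector, 0);
    return $node === null ? null : trim($node->innertext);
}
```

Called as first_inner_text($html, 'tr.capacityPercentage td'), it gives back either the text or null, so the caller can check for null before using the value.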

This is the raw HTML for the section of the page I was interested in:

<div id="StorageLevels">
            <h3>
                Water Storage Levels
            </h3>
            <!-- Storage Level -->
            <div id="StorageLevels_lakelake" class="storageLevels"><div id="StorageLevels_lakeeppalockInner" class="storageLevelsInner">
        <div class="storageLevelInfo levelDecreased capacity4 indicateLevels">
    <div class="date lastUpdated top">
        <span class="label">Last Updated</span>
        <span class="value">25/02/2019</span>
    </div>
    <div class="storageLevelsHistory months5">
<div class="storageLevelsGraph">
    <table cellspacing="0" cellpadding="0" border="0">
        <tbody><tr class="levels">
            <td id="OctData" valign="bottom">
                <div id="OctPercent" class="storageLevelPercentText">53.2<span class="percentSymbol">%</span></div>
                <div class="storageLevelPercentBar"><img src="/images/gmw/spacer.gif" alt="" width="30" height="53"></div>
            </td>
            <td id="NovData" valign="bottom">
                <div id="NovPercent" class="storageLevelPercentText">51.0<span class="percentSymbol">%</span></div>
                <div class="storageLevelPercentBar"><img src="/images/gmw/spacer.gif" alt="" width="30" height="51"></div>
            </td>
            <td id="DecData" valign="bottom">
                <div id="DecPercent" class="storageLevelPercentText">49.4<span class="percentSymbol">%</span></div>
                <div class="storageLevelPercentBar"><img src="/images/gmw/spacer.gif" alt="" width="30" height="49"></div>
            </td>
            <td id="JanData" valign="bottom">
                <div id="JanPercent" class="storageLevelPercentText">45.8<span class="percentSymbol">%</span></div>
                <div class="storageLevelPercentBar"><img src="/images/gmw/spacer.gif" alt="" width="30" height="46"></div>
            </td>
            <td id="FebData" valign="bottom">
                <div id="FebPercent" class="storageLevelPercentText">42.9<span class="percentSymbol">%</span></div>
                <div class="storageLevelPercentBar"><img src="/images/gmw/spacer.gif" alt="" width="30" height="43"></div>
            </td>
        </tr>
        <tr class="months">
            <th id="OctTitle">Oct</th>
            <th id="NovTitle">Nov</th>
            <th id="DecTitle">Dec</th>
            <th id="JanTitle">Jan</th>
            <th id="FebTitle">Feb</th>
        </tr>
    </tbody></table>
</div>
    </div>
    <div class="storageVolumes">
        <table cellspacing="0" cellpadding="0" border="0">
                <tbody><tr class="currentVolume">
                    <th class="label">Current Volume</th>
                    <td class="value">130785</td>
                </tr>
                <tr class="capacityPercentage">
                    <th class="label">% of Capacity</th>
                    <td class="value">42.93</td>
                </tr>
                <tr class="capacityML">
                    <th class="label">Capacity (ML)</th>
                    <td class="value">304651</td>
                </tr>
        </tbody></table>
    </div>
    <div class="moreInfo">

 

To get the water level percentage (42.93), I selected the <tr> with a class of “capacityPercentage” and took the inner text of its first (index 0) <td>.

For the updated date (25/02/2019), I took the inner text of the first (index 0) <span> with a class of “value”.
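
For comparison, the same two values can be pulled out with PHP's built-in DOMDocument and DOMXPath, with no extra library. The following is a minimal sketch, not the approach used above: the markup is a trimmed copy of the excerpt, the variable names are my own, and converting the results to a float and a DateTime is an extra step I have added for illustration.

```php
<?php
// Trimmed copy of the markup shown earlier, enough to exercise both queries.
$markup = <<<'HTML'
<div id="StorageLevels">
  <div class="date lastUpdated top">
    <span class="label">Last Updated</span>
    <span class="value">25/02/2019</span>
  </div>
  <table>
    <tr class="currentVolume"><th class="label">Current Volume</th><td class="value">130785</td></tr>
    <tr class="capacityPercentage"><th class="label">% of Capacity</th><td class="value">42.93</td></tr>
  </table>
</div>
HTML;

libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$doc = new DOMDocument();
$doc->loadHTML($markup);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Roughly equivalent to find('tr.capacityPercentage td', 0)
$levelNode = $xpath->query(
    '//tr[contains(concat(" ", normalize-space(@class), " "), " capacityPercentage ")]//td'
)->item(0);

// Roughly equivalent to find('span.value', 0)
$dateNode = $xpath->query(
    '//span[contains(concat(" ", normalize-space(@class), " "), " value ")]'
)->item(0);

// Convert the raw text into typed values; either may be null if the node is missing.
$level   = $levelNode ? (float) trim($levelNode->textContent) : null;
$updated = $dateNode
    ? DateTime::createFromFormat('!d/m/Y', trim($dateNode->textContent))
    : null;

echo $level, "\n";                                             // 42.93
echo $updated ? $updated->format('Y-m-d') : 'unknown', "\n";   // 2019-02-25
```

The XPath class tests are wordier than the CSS-style selectors, which is a fair part of why the Simple HTML DOM syntax is pleasant for quick jobs like this one.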

You can read more on PHP Simple HTML DOM selectors in the library's documentation.