DOM and HTML Parsing

DOMDocument

PHP ships with an HTML parser called DOMDocument. It is part of the DOM extension, which is enabled by default.

// Create a new DOMDocument object.
$dom = new DOMDocument();
 
// Load the HTML content into the object.
$dom->loadHTML($html);
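
Real-world markup is often malformed, and loadHTML() reports parser problems as PHP warnings. The following is a minimal sketch of one way to collect those problems quietly instead. The sample markup in $html is an assumption made for illustration, and which errors are reported (if any) depends on the installed libxml version.

```php
<?php
// Hypothetical sample markup with a deliberately unclosed <p> tag.
$html = '<div id="mydiv"><p>Hello world</div>';

// Buffer libxml parse errors internally instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument();
$loaded = $dom->loadHTML($html);

// Read whatever the parser recorded, then clear the buffer
// and restore the previous error-handling mode.
$errors = libxml_get_errors();
libxml_clear_errors();
libxml_use_internal_errors($previous);

foreach ($errors as $error) {
    echo trim($error->message), ' (line ', $error->line, ")\n";
}
```

The parser still builds a usable tree from the broken markup, so the lookups described in the rest of this section work the same way afterwards.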

Once the HTML is loaded into the object, you can access its nodes and child elements:

Get an element by its ID

$mydivObj = $dom->getElementById('mydiv');
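
getElementById() returns null when no element has the requested ID, so it is worth checking the result before using it. A short sketch, assuming a made-up snippet of markup in $html:

```php
<?php
// Hypothetical sample markup for illustration.
$html = '<div id="mydiv" class="box">Hello <b>world</b></div>';

$dom = new DOMDocument();
$dom->loadHTML($html);

$mydivObj = $dom->getElementById('mydiv');

if ($mydivObj !== null) {
    // Text of the element and all of its descendants, without tags.
    echo $mydivObj->textContent, "\n";
    // Value of a single attribute; an empty string if the attribute is absent.
    echo $mydivObj->getAttribute('class'), "\n";
}

// A missing ID yields null rather than throwing an exception.
var_dump($dom->getElementById('missing'));
```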

Get all elements of a type

$anchors = $dom->getElementsByTagName('a'); 
foreach($anchors as $anchor)
{
    echo $dom->saveHTML( $anchor ); // Prints the anchor's full HTML markup; use $anchor->textContent for the text only.
}
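
A common use of this loop is to collect link targets and link text. The sketch below assumes a small sample document in $html. The array name $links and its keys are illustrative choices, not part of the DOM API.

```php
<?php
// Hypothetical sample markup for illustration.
$html = '<p><a href="/one">First</a> and <a href="/two">Second <em>link</em></a></p>';

$dom = new DOMDocument();
$dom->loadHTML($html);

$links = [];
foreach ($dom->getElementsByTagName('a') as $anchor) {
    $links[] = [
        'href' => $anchor->getAttribute('href'), // The link target.
        'text' => $anchor->textContent,          // Visible text, nested tags stripped.
    ];
}

print_r($links);
```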

More Resources
Parse html DOM with DOMDocument

Remove Tag Attributes


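function removeAttribute( $html, $attributeName )
{
    // Escape the name so any regex metacharacters in it are matched literally.
    $name = preg_quote( $attributeName, '/' );

    // Handles double- and single-quoted values; a regex remains fragile on arbitrary HTML.
    return preg_replace( '/(<[^>]+?)\s+' . $name . '\s*=\s*("[^"]*"|\'[^\']*\')/i', '$1', $html );
}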

echo removeAttribute( '<div style="color:#CCC;"></div>', 'style' ); // Prints '<div></div>'
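
Because regular expressions can misfire on unusual markup, the same job can also be done with DOMDocument. The function below is a sketch rather than a drop-in replacement. The name removeAttributeWithDom is made up for this example, and it assumes a non-empty, ASCII, body-level fragment (loadHTML() wraps the input in html and body elements, which are skipped again on output).

```php
<?php
// Hypothetical helper: strips one attribute from every element in a fragment.
function removeAttributeWithDom(string $html, string $attributeName): string
{
    $previous = libxml_use_internal_errors(true); // Silence parser warnings.
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Visit every element and drop the attribute where it is present.
    foreach ($dom->getElementsByTagName('*') as $element) {
        if ($element->hasAttribute($attributeName)) {
            $element->removeAttribute($attributeName);
        }
    }

    // Serialize only the children of <body>, i.e. the original fragment.
    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }

    $output = '';
    foreach ($body->childNodes as $child) {
        $output .= $dom->saveHTML($child);
    }
    return $output;
}

echo removeAttributeWithDom('<div style="color:#CCC;"></div>', 'style'); // Prints '<div></div>'
```

The output is re-serialized by the parser, so quoting and whitespace may differ from the input even where no attribute was removed.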