DOM and HTML Parsing


PHP ships with a handy HTML parser called DOMDocument, provided by the DOM extension (enabled by default).

// Create a new DOMDocument object.
$dom = new DOMDocument();
// Load the HTML content into the object ($html holds the markup as a string).
$dom->loadHTML($html);
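
Real-world markup is often not perfectly valid, and loadHTML() reports parser complaints as PHP warnings. A minimal sketch of loading with those warnings collected instead of printed; the sample markup and the variable names ($html, $previous, $loaded, $errors) are assumptions made for illustration:

```php
<?php
// Hypothetical sample markup, for illustration only.
$html = '<html><body><div id="mydiv"><a href="/home">Home</a></div></body></html>';

// Buffer libxml parse messages instead of raising PHP warnings.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument();
$loaded = $dom->loadHTML($html); // true on success, false on failure

// Read whatever the parser reported, then clear the buffer
// and restore the previous error-handling mode.
$errors = libxml_get_errors();
libxml_clear_errors();
libxml_use_internal_errors($previous);

echo $loaded ? "HTML loaded\n" : "Could not load HTML\n";
echo count($errors) . " parser message(s)\n";
```

Note that loadHTML() assumes ISO-8859-1 unless the markup declares a charset, so UTF-8 input usually needs a meta charset tag or a prior conversion.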

Once the HTML is loaded into the object, access nodes and child elements:

Get an element by its ID

$mydivObj = $dom->getElementById('mydiv');
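
getElementById() returns null when no element has the requested ID, so it is worth checking the result before using it. A small sketch, assuming a made-up fragment and the ID 'nope' as an example of a missing element:

```php
<?php
$dom = new DOMDocument();
// Hypothetical fragment; libxml wraps it in <html><body> automatically.
$dom->loadHTML('<div id="mydiv">Hello <b>world</b></div>');

$mydivObj = $dom->getElementById('mydiv');
if ($mydivObj !== null) {
    // textContent is the text of the element and all of its descendants.
    echo $mydivObj->textContent, "\n";
}

// An ID that does not exist in the document yields null, not an exception.
$missing = $dom->getElementById('nope');
var_dump($missing === null);
```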

Get all elements of a type

$anchors = $dom->getElementsByTagName('a'); 
foreach ($anchors as $anchor) {
    echo $dom->saveHTML( $anchor ); // Prints the anchor's full HTML markup; use $anchor->textContent for the text only.
}
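
In practice you often want the link target and the link text rather than the raw markup. A minimal sketch, assuming a made-up paragraph with two links; the $links array is a name introduced here for illustration:

```php
<?php
$dom = new DOMDocument();
// Hypothetical markup with two anchors.
$dom->loadHTML('<p><a href="/a">First</a> <a href="/b">Second <em>link</em></a></p>');

$links = [];
foreach ($dom->getElementsByTagName('a') as $anchor) {
    $links[] = [
        'href' => $anchor->getAttribute('href'), // attribute value, '' if absent
        'text' => $anchor->textContent,          // text only, nested tags stripped
        'html' => $dom->saveHTML($anchor),       // the anchor's own markup
    ];
}

print_r($links);
```

getAttribute() returns an empty string for a missing attribute, so hasAttribute() is the way to tell "empty" from "not set".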

More Resources
Parse html DOM with DOMDocument

Remove Tag Attributes

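function removeAttribute( $html, $attributeName ) {
    // preg_quote() keeps the attribute name from being read as regex syntax.
    $name = preg_quote( $attributeName, '/' );
    return preg_replace( '/(<[^>]+) ' . $name . '=".*?"/i', '$1', $html );
}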

echo removeAttribute( '<div style="color:#CCC;"></div>', 'style' ); // Prints '<div></div>'
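
The regular expression above only matches double-quoted values, and it cannot tell real tags from tag-like text. As an alternative, the same job can be sketched with DOMDocument itself. The helper name removeAttributeWithDom() is hypothetical, and the sketch assumes the input is an HTML fragment in an ASCII-compatible encoding:

```php
<?php
// Hypothetical helper, not part of the original snippet.
function removeAttributeWithDom(string $html, string $attributeName): string
{
    $previous = libxml_use_internal_errors(true); // keep parse warnings quiet
    $dom = new DOMDocument();
    // Wrap the fragment in <body> so its nodes can be found again afterwards.
    $dom->loadHTML('<body>' . $html . '</body>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // The HTML parser lowercases attribute names, so compare in lowercase.
    $name = strtolower($attributeName);
    foreach ($dom->getElementsByTagName('*') as $element) {
        if ($element->hasAttribute($name)) {
            $element->removeAttribute($name);
        }
    }

    // Serialize only the children of <body>, i.e. the original fragment.
    $body = $dom->getElementsByTagName('body')->item(0);
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

$clean = removeAttributeWithDom(
    '<div style="color:#CCC;" class="x"><span style=\'a\'>t</span></div>',
    'style'
);
echo $clean, "\n";
```

Because the markup is parsed and re-serialized, the output may be normalized (quoting, entity encoding) compared with the input, which the regex version leaves untouched.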