Removing duplicate product images in Magento
A problem we have run into when migrating product data from other carts into Magento is that some products end up with the same image attached more than once, either because of issues with the data exported from the previous cart or because of the way the import was done.
To resolve this issue I wrote a small script to check the images on each product and remove any duplicates. The script loops through each product, gets all the images for that product and computes an MD5 hash of each image file, which is stored in an array for that product. If the hash is already in the array, an earlier image in the loop must have had identical contents, so the current image is removed from the product.
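The hashing idea can be tried on its own, outside Magento. The following is only a minimal sketch: the function name `findDuplicateFiles` and the list of paths are hypothetical and not part of the script in this post, and it uses PHP's built-in `md5_file()`.

```php
<?php
/**
 * Return the paths whose contents duplicate an earlier file in the list.
 * $paths is a hypothetical list of image files belonging to one product.
 */
function findDuplicateFiles(array $paths)
{
    $seen = array();        // MD5 hashes already encountered
    $duplicates = array();  // paths whose hash was seen before

    foreach ($paths as $path) {
        if (!is_file($path)) {
            continue; // skip entries whose file is missing on disk
        }
        $hash = md5_file($path);
        if (isset($seen[$hash])) {
            $duplicates[] = $path;   // same contents as an earlier file
        } else {
            $seen[$hash] = true;     // first time this content is seen
        }
    }

    return $duplicates;
}
```

Only the second and later copies are reported, so the first occurrence of each image is always kept.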
A safety check is also in place so that the base image is never removed. My initial run without this check would sometimes remove the copy of the image that was set as the base image and keep the other copy, leaving the product with no base image set. You may want to add the same check for the thumbnail and small image too.
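One possible way to extend the safety check to the other image roles is sketched here. The helper name `protectedImageFiles` is hypothetical; inside the script its input would presumably come from `$_product->getImage()`, `$_product->getSmallImage()` and `$_product->getThumbnail()`, which return `no_selection` when no image is assigned to a role.

```php
<?php
/**
 * Build the list of gallery files that must never be removed.
 * $roleValues holds the file assigned to each image role
 * (base image, small image, thumbnail).
 */
function protectedImageFiles(array $roleValues)
{
    $protected = array();
    foreach ($roleValues as $file) {
        // Ignore empty values and the 'no_selection' placeholder
        if ($file && $file !== 'no_selection') {
            $protected[] = $file;
        }
    }
    // Several roles often share one file, so drop repeats
    return array_values(array_unique($protected));
}

// Example: base and small image share a file, thumbnail is unset
$protected = protectedImageFiles(array('/s/h/shoe.jpg', '/s/h/shoe.jpg', 'no_selection'));
```

In the loop over gallery images, `in_array($_image->getFile(), $protected)` would then take the place of the comparison against the base image alone, and the hash of each protected file would be stored up front, as the script already does for the base image.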
<?php
require_once 'app/Mage.php';
//Mage::app('default');
Mage::app()->setCurrentStore(Mage_Core_Model_App::ADMIN_STORE_ID);

// Show every error while the script runs
error_reporting(E_ALL | E_STRICT);
Mage::setIsDeveloperMode(true);
ini_set('display_errors', 1);
ob_implicit_flush(1);

$mediaApi  = Mage::getModel('catalog/product_attribute_media_api');
$_products = Mage::getModel('catalog/product')->getCollection();
$mediaDir  = Mage::getBaseDir('media') . '/catalog/product';

$i     = 0;
$total = count($_products);
$count = 0; // number of duplicate images removed

foreach ($_products as $_prod) {
    // Load the full product so the media gallery is available
    $_product    = Mage::getModel('catalog/product')->load($_prod->getId());
    $_md5_values = array();

    // Protect the base image: store its hash first so any copy of it is treated as the duplicate
    $base_image = $_product->getImage();
    if ($base_image && $base_image != 'no_selection') {
        $filepath = $mediaDir . $base_image;
        if (is_file($filepath)) {
            $_md5_values[] = md5_file($filepath);
        }
    }

    $i++;
    echo "\r\n processing product $i of $total ";

    // Loop through product images
    $_images = $_product->getMediaGalleryImages();
    if ($_images) {
        foreach ($_images as $_image) {
            // Never remove the base image itself
            if ($_image->getFile() == $base_image) {
                continue;
            }

            $filepath = $mediaDir . $_image->getFile();
            if (!is_file($filepath)) {
                continue; // file missing on disk, nothing to compare
            }
            $md5 = md5_file($filepath);

            if (in_array($md5, $_md5_values)) {
                // Same contents as an earlier image on this product
                $mediaApi->remove($_product->getId(), $_image->getFile());
                echo "\r\n removed duplicate image from " . $_product->getSku();
                $count++;
            } else {
                $_md5_values[] = $md5;
            }
        }
    }
}

echo "\r\n\r\n finished, removed $count duplicated images";
Just create a script in the root folder of your Magento installation with this code and run it, preferably from the command line if you have access, although it should also work via a browser.
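If the script may be run either way, a few lines at the top can adapt it to the environment. This is only a sketch under assumptions that are not part of the original script: the helper name `progressLineEnding` is hypothetical, and lifting the time limit with `set_time_limit(0)` is a suggestion for large catalogues run through a web server.

```php
<?php
/**
 * Pick a line ending for progress output: a plain newline on the
 * command line, an HTML break when served through a web server.
 */
function progressLineEnding($sapi)
{
    return $sapi === 'cli' ? PHP_EOL : "<br>\n";
}

// php_sapi_name() returns 'cli' when PHP runs from the command line
$eol = progressLineEnding(php_sapi_name());

// Assumption: a large catalogue may exceed the web server's default
// execution time limit; 0 removes the limit for this run.
set_time_limit(0);

echo "processing product 1 of 100" . $eol;
```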
More Help with Magento
If you need more help with Magento, see my extended post on how to configure a new Magento install for SEO.