Sunday, December 20, 2009

The filtering process could not load the item. This is possibly caused by an unrecognized item format or item corruption.

I was constantly getting this error while running a crawl on my “All local portals” content source. At first I suspected that the .doc IFilter was not properly registered. But after verifying that all the registry entries were intact as described on MSDN, I decided to troubleshoot it in more detail.

Even after I removed all document libraries and Word files from my site, I was still getting the same error. I still don’t know why this error occurs, but I was able to get my .doc files indexed using a workaround.

 

Create a new Content Database that holds only the site containing your .doc, .docx or .pdf document libraries, as shown in the snapshot below.

image

The trick here is to narrow the crawl scope instead of crawling the entire Web Application (http://manmoss:2222/ in this case). Crawl only the site in the new, smaller Content Database. You should then see success in the Crawl Logs.
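
The workaround above can be sketched as a small Python helper. This is only an illustration under assumptions: the post does not say how the content database was created, so the use of `stsadm -o createsiteinnewdb` is my guess at one way to do it, and the site path `sites/docs`, the database name `WSS_Content_Docs`, and the owner account are all hypothetical. The script only builds the command line and the narrowed start address; it does not run anything.

```python
from typing import List


def narrowed_start_address(web_app_url: str, site_path: str) -> str:
    """Join the web application URL and a site path into one crawl start address."""
    return web_app_url.rstrip("/") + "/" + site_path.strip("/")


def build_createsiteinnewdb_args(site_url: str, owner_login: str,
                                 owner_email: str, database_name: str) -> List[str]:
    """Return the stsadm argument list for creating a site collection in its own
    content database. Nothing is executed here; the caller decides whether to run it."""
    return [
        "stsadm", "-o", "createsiteinnewdb",
        "-url", site_url,
        "-ownerlogin", owner_login,
        "-owneremail", owner_email,
        "-databasename", database_name,
    ]


if __name__ == "__main__":
    # http://manmoss:2222/ comes from the post; every other value is hypothetical.
    start = narrowed_start_address("http://manmoss:2222/", "sites/docs")
    args = build_createsiteinnewdb_args(
        start, r"MANMOSS\admin", "admin@example.com", "WSS_Content_Docs"
    )
    print("Start address for the narrowed content source:", start)
    print("Command to review before running:", " ".join(args))
```

The printed start address is what you would enter in a new content source in place of the whole Web Application URL, so that the crawler visits only the smaller site.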