We have a system that generates pdf's based upon data from our Oracle database. The pdf's are then stored in a BLOB column in a table. We recently realized that at least one of those pdf's was corrupt. Is there a way within Oracle to determine if others are corrupt?
I initially thought about trying to write a script that would pull each blob, run it through some kind of pdf validator and write a log entry if it were bad. Hmmm, use plsql and call out to a validator on the O/S? Write a java routine to do the same thing? What validator would I use?
Then I had an idea--Oracle Text (formerly Intermedia and Context). I could create a context index on the blob column and see if there are any errors.
CREATE INDEX myIndex ON my_table(my_blob) INDEXTYPE IS ctxsys.context;
If the Oracle Text engine is unable to index the blob it writes a log entry to the view ctx_user_index_errors including the rowid. Depending on the database version you might have issues with very recent versions of the pdf, but that was not an issue in our case as every pdf was created by the same pdf generator.
I think Oracle Text is underused. This is just one more example of how to make it work for you.