Saturday, April 15, 2017

The Man in the Middle

I recently ran into an Oracle database bug (version 12c). It's definitely a bug, and there may even be a patch for it, maybe. We've all been in this position, though. There's definitely a bug. We've identified something similar in the Oracle support site bug database. There's a patch for it. How long will it take to get the patch installed? Can I even install the patch on an Oracle Database Appliance (or Exadata or whatever other special circumstance)? What happens if the patch doesn't fix the specific issue?

Here is my scenario. I need to interact with the National Institutes of Health taxonomy database. It has REST based services--check them out:

https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=taxonomy&term=Metazoa&usehistory=n

https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=taxonomy&id=33208

So, super easy, I can use apex_web_service.make_rest_request to make the request, parse the response and do some science!

Let's see what happens:

select 
  apex_web_service.make_rest_request(
        p_url => 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=taxonomy&id=33208',
        p_http_method => 'GET'
        ) taxonomy_info
        from dual
        ;

ORA-29273: HTTP request failed
ORA-28860: Fatal SSL error
ORA-06512: at "SYS.UTL_HTTP", line 368
ORA-06512: at "SYS.UTL_HTTP", line 1118
ORA-06512: at "APEX_050100.WWV_FLOW_WEB_SERVICES", line 636
ORA-06512: at "APEX_050100.WWV_FLOW_WEB_SERVICES", line 760
ORA-06512: at "APEX_050100.WWV_FLOW_WEBSERVICES_API", line 236
ORA-06512: at line 1
29273. 00000 -  "HTTP request failed"
*Cause:    The UTL_HTTP package failed to execute the HTTP request.
*Action:   Use get_detailed_sqlerrm to check the detailed error message.
           Fix the error and retry the HTTP request.

Now, I know that I've used this same kind of call from this same database before, so I do a little more digging and try this:

select 
  apex_web_service.make_rest_request(
        p_url => 'https://oracle.com',
        p_http_method => 'GET'
        )
        from dual
        ;

ORA-29273: HTTP request failed
ORA-29024: Certificate validation failure
ORA-06512: at "SYS.UTL_HTTP", line 1258
ORA-06512: at "APEX_050100.WWV_FLOW_WEB_SERVICES", line 717
ORA-06512: at "APEX_050100.WWV_FLOW_WEB_SERVICES", line 760
ORA-06512: at "APEX_050100.WWV_FLOW_WEBSERVICES_API", line 236
ORA-06512: at line 1
29273. 00000 -  "HTTP request failed"
*Cause:    The UTL_HTTP package failed to execute the HTTP request.
*Action:   Use get_detailed_sqlerrm to check the detailed error message.
           Fix the error and retry the HTTP request.

It's a different error, but still and error! Now I'm getting concerned. I assume it has to do with using https because both errors reference SSL/TLS. I try a variety of http and https sites. Most work fine, but occasionally I get one of the two errors above. I think they are actually two different errors. The second can probably be fixed by adding some certs to the db wallet, but the first, the one I really care about, appears to need a database patch: Patch 24666032. I'm not sure that would fix it, and for a variety of reasons, I can't get the patch installed and tested in a timely fashion.

But...the scientists have work to do and I don't want to stand in the way of science. So, I developed a hack. Please don't judge me. Of course, the title of this post has already given it away. It's seems so obvious when you already know the answer, but it took some pondering to come up with it: the man in the middle. While I am the man in the middle, between the scientists and the data, I need another man in the middle to solve the problem. That MITM is an Apache Reverse Proxy.

As I mentioned, I know this database can use apex_web_service.make_rest_request, definitely with my own internal development web server. I set up a reverse proxy on our internal apache server to route the request to the NIH REST server.


  1. Added an entry to our internal DNS server
    [internal IP address]   ncbi.concept2completion.com
  2. Created a virtual host entry on our internal Apache web server and restarted
    <VirtualHost *:443>
       ServerName ncbi.concept2completion.com
     
       ProxyRequests on
       SSLProxyEngine on
       ProxyPass / https://eutils.ncbi.nlm.nih.gov/
       ProxyPassReverse / https://eutils.ncbi.nlm.nih.gov/
    </VirtualHost>
  3. Pointed the apex_web_service.make_rest_request to our internal server.

select 
  apex_web_service.make_rest_request(
        p_url => 'https://ncbi.concept2completion.com/entrez/eutils/efetch.fcgi?db=taxonomy&id=33208',
        p_http_method => 'GET'
        ) taxonomy_info
        from dual
        ;


<?xml version="1.0" ?>
<!DOCTYPE TaxaSet PUBLIC "-//NLM//DTD Taxon, 14th January 2002//EN" "https://www.ncbi.nlm.nih.gov/entrez/query/DTD/taxon.dtd">
<TaxaSet><Taxon>
    <TaxId>33208</TaxId>
    <ScientificName>Metazoa</ScientificName>
    <OtherNames>
        <GenbankCommonName>metazoans</GenbankCommonName>
        <BlastName>animals; animals</BlastName>
        <Synonym>Animalia</Synonym>
        <CommonName>multicellular animals</CommonName>
    </OtherNames>
    <ParentTaxId>33154</ParentTaxId>
    <Rank>kingdom</Rank>
    <Division>Invertebrates</Division>
    <GeneticCode>
        <GCId>1</GCId>
        <GCName>Standard</GCName>
    </GeneticCode>
    <MitoGeneticCode>
        <MGCId>1</MGCId>
        <MGCName>Standard</MGCName>
    </MitoGeneticCode>
    <Lineage>cellular organisms; Eukaryota; Opisthokonta</Lineage>
    <LineageEx>
        <Taxon>
            <TaxId>131567</TaxId>
            <ScientificName>cellular organisms</ScientificName>
            <Rank>no rank</Rank>
        </Taxon>
        <Taxon>
            <TaxId>2759</TaxId>
            <ScientificName>Eukaryota</ScientificName>
            <Rank>superkingdom</Rank>
        </Taxon>
        <Taxon>
            <TaxId>33154</TaxId>
            <ScientificName>Opisthokonta</ScientificName>
            <Rank>no rank</Rank>
        </Taxon>
    </LineageEx>
    <CreateDate>1995/02/27 09:24:00</CreateDate>
    <UpdateDate>2017/02/16 16:52:33</UpdateDate>
    <PubDate>1992/05/26 01:00:00</PubDate>
</Taxon>

</TaxaSet>


A hack? Yes, but done in the name of science.

Stay tuned for my next post on how to solve the parsing issue with the XML shown above. Also, we now have a pl/sql API for interacting with the NIH efetch service--it's extensive and provides access to the taxonomy lineage, synonyms, etc. Contact me if you are interested in the API.

No comments: