Generating Prometheus Metrics directly from your Oracle Database

Posted by Dirk Nachbar on Monday, March 02, 2020
Prometheus (https://prometheus.io/) is getting more and more common as a Monitoring Solution, many products are offering out-of-box Prometheus formatted metrics (e.g WildFly, Spring Boot and so on). Even Oracle is providing for the Oracle WebLogic Server a weblogic-monitoring-exporter, which is generating Prometheus formatted metrics (see https://github.com/oracle/weblogic-monitoring-exporter and checkout my previous blog post "WebLogic Monitoring with Prometheus and Grafana" https://dirknachbar.blogspot.com/2020/02/weblogic-monitoring-with-prometheus-and.html).

Many people are using Oracle Enterprise Manager for Oracle Database Monitoring, which makes sense if you have a certain environment size, but for some small environments, e.g. you have a couple of database, Oracle Enterprise Manager might be an overkill. On top, if you are running your Oracle Database in Docker, you might want to have an agent-less and lightweight monitoring alternative.

So why not to use Prometheus? The problem is, Oracle is not providing Prometheus formatted metrics for the Oracle Database :-( but with a little bit of PL/SQL and Oracle REST Data Services (ORDS) you can get Prometheus formatted metrics out of your Oracle Database and you can define which metrics you want to get.

The first step is to create a procedure which will collect your Prometheus formatted metrics, for this I am simply using the good old PL/SQL Webtool Kit. In my case I will create the procedure under the Oracle Database User dbsnmp, as this user got already the required permissions on the various v$ and dba_% views.

The basic concept is really simply, at first define all the cursors you will need for collecting your required metrics, e.g. "c_process is SELECT COUNT(*) as count FROM v$process;" and later you will loop over the cursor and generate a output with htp.print. That's all ...

Below is an example, which you can use as starting point to collect your metrics and you can easily add your own required metrics or remove them.

create or replace NONEDITIONABLE procedure prometheus_metrics IS
-- =======================================================================
-- Procedure prometheus_metrics
-- Purpose: Generating Prometheus formatted metrics from the Oracle DB
--
-- Author: Dirk Nachbar, https://dirknachbar.blogspot.com
--
-- =======================================================================
--
   cursor c_processes is
          SELECT COUNT(*) as count FROM v$process;
   --
   cursor c_sessions is
          SELECT status, type, COUNT(*) as value FROM v$session GROUP BY status, type;
   --
   cursor c_activities is
          SELECT name, value FROM v$sysstat WHERE name IN ('parse count (total)', 'execute count', 'user commits', 'user rollbacks');
   --
   cursor c_version is
          select 'Oracle' as oracle, substr(version_full, 1,2)||'.'|| substr(version_full, 4,1) as value from v$instance;
   --
   cursor c_tablespaces is
            SELECT
    Z.name       as tablespace,
    dt.contents  as type,
    Z.bytes      as bytes,
    Z.max_bytes  as max_bytes,
    Z.free_bytes as free
  FROM
  (
    SELECT
      X.name                   as name,
      SUM(nvl(X.free_bytes,0)) as free_bytes,
      SUM(X.bytes)             as bytes,
      SUM(X.max_bytes)         as max_bytes
    FROM
      (
        SELECT
          ddf.tablespace_name as name,
          ddf.status as status,
          ddf.bytes as bytes,
          sum(coalesce(dfs.bytes, 0)) as free_bytes,
          CASE
            WHEN ddf.maxbytes = 0 THEN ddf.bytes
            ELSE ddf.maxbytes
          END as max_bytes
        FROM
          sys.dba_data_files ddf,
          sys.dba_tablespaces dt,
          sys.dba_free_space dfs
        WHERE ddf.tablespace_name = dt.tablespace_name
        AND ddf.file_id = dfs.file_id(+)
        GROUP BY
          ddf.tablespace_name,
          ddf.file_name,
          ddf.status,
          ddf.bytes,
          ddf.maxbytes
      ) X
    GROUP BY X.name
    UNION ALL
    SELECT
      Y.name                   as name,
      MAX(nvl(Y.free_bytes,0)) as free_bytes,
      SUM(Y.bytes)             as bytes,
      SUM(Y.max_bytes)         as max_bytes
    FROM
      (
        SELECT
          dtf.tablespace_name as name,
          dtf.status as status,
          dtf.bytes as bytes,
          (
            SELECT
              ((f.total_blocks - s.tot_used_blocks)*vp.value)
            FROM
              (SELECT tablespace_name, sum(used_blocks) tot_used_blocks FROM gv$sort_segment WHERE  tablespace_name!='DUMMY' GROUP BY tablespace_name) s,
              (SELECT tablespace_name, sum(blocks) total_blocks FROM dba_temp_files where tablespace_name !='DUMMY' GROUP BY tablespace_name) f,
              (SELECT value FROM v$parameter WHERE name = 'db_block_size') vp
            WHERE f.tablespace_name=s.tablespace_name AND f.tablespace_name = dtf.tablespace_name
          ) as free_bytes,
          CASE
            WHEN dtf.maxbytes = 0 THEN dtf.bytes
            ELSE dtf.maxbytes
          END as max_bytes
        FROM
          sys.dba_temp_files dtf
      ) Y
    GROUP BY Y.name
  ) Z, sys.dba_tablespaces dt
  WHERE
    Z.name = dt.tablespace_name;
    --   
    cursor c_locked_accounts is
           select username, case when account_status like '%LOCKED%' then '1' else '0' END value from dba_users where username in ('SYSTEM', 'SYS', 'DEMO');
    --
    cursor c_top_sessions is
           SELECT a.username as username, a.sid, a.machine as machine, a.program as program, Trunc(b.value/1024) as memorykb
           FROM   v$session a,
                  v$sesstat b,
                  v$statname c
           WHERE  a.sid = b.sid
           AND    a.username is not null
           AND    b.statistic# = c.statistic#
           AND    c.name = 'session pga memory'
           AND    a.program IS NOT NULL
           AND    rownum <= 10
           ORDER BY b.value DESC;
     --
     cursor c_fileio is
            SELECT ddf.tablespace_name as tablespace_name, d.name as filename, f.phyblkrd as blocksread, f.phyblkwrt as blockswriten, f.phyblkrd + f.phyblkwrt as totalio FROM v$filestat f, v$datafile d, dba_data_files ddf WHERE  d.file# = f.file# and ddf.file_id = d.file# ORDER BY f.phyblkrd + f.phyblkwrt DESC;
     --
     cursor c_sga_info is
            SELECT decode(name, 'Fixed SGA Size','FixedSGASize','Redo Buffers','RedoBuffers','Buffer Cache Size','BufferCacheSize','In-Memory Area Size','InMemoryAreaSize','Shared Pool Size','SharedPoolSize','Large Pool Size','LargePoolSize','Java Pool Size','JavaPoolSize','Streams Pool Size','StreamsPoolSize','Shared IO Pool Size','SharedIOPoolSize','Data Transfer Cache Size','DataTransferCacheSize','Granule Size','GranuleSize','Maximum SGA Size','MaximumSGASize','Startup overhead in Shared Pool','StartupOverheadInSharedPool','Free SGA Memory Available','FreeSGAMemoryAvailable') name,
                   bytes as value 
            FROM v$sgainfo;
     --
     cursor c_db_status is
            SELECT instance_name as name, decode(status, 'STARTED', 0, 'MOUNTED', 1, 'OPEN', 2, 'OPEMN MIGRATE', 3) as value
            FROM v$instance;
     --
     cursor c_archiver_status is
            SELECT instance_name as name, decode(archiver, 'STOPPED', 0, 'STARTED', 1, 'FAILED', 2) as value
            FROM v$instance;

--
begin
owa.num_cgi_vars := 0;
htp.init();
--
-- Oracle Version
--
    htp.print('# HELP oracle_version Oracle DB Version Number');
    htp.print('# TYPE oracle_version counter');
    for rec_version in c_version loop
        htp.print('oracle_version ' || rec_version.value);
    end loop;

--
-- Oracle Processes
--
    htp.print('# HELP oracle_processes Current Number of processes');
    htp.print('# TYPE oracle_processes counter');
    for rec_processes in c_processes loop
        htp.print('oracle_processes ' || rec_processes.count);
    end loop;
--
-- Oracle Sessions
--
    htp.print('# HELP oracle_sessions Current Number of sessions');
    htp.print('# TYPE oracle_sessions counter');
    for rec_sessions in c_sessions loop
        htp.print('oracle_sessions{status="'||rec_sessions.status||'",type="'||rec_sessions.type||'"} ' || rec_sessions.value);
    end loop;
--
-- Oracle Activities
--
    htp.print('# HELP oracle_activity Metrics from v$sysstat');
    htp.print('# TYPE oracle_activity counter');
    for rec_activities in c_activities loop
        htp.print('oracle_activities{name="'||rec_activities.name||'"} ' || rec_activities.value);
    end loop;
--
-- Oracle Tablespaces
--
    -- bytes
    htp.print('# HELP oracle_tablespace_bytes Generic counter metric of tablespace bytes in Oracle DB');
    htp.print('# TYPE oracle_tablespace_bytes gauge');
    for rec_tablespaces in c_tablespaces loop
        htp.print('oracle_tablespace_bytes{tablespace="'||rec_tablespaces.tablespace||'",type="'||rec_tablespaces.type||'"} ' || rec_tablespaces.bytes);
    end loop;
    -- free bytes
    htp.print('# HELP oracle_tablespace_free Generic counter metric of tablespace free bytes in Oracle DB');
    htp.print('# TYPE oracle_tablespace_free gauge');
    for rec_tablespaces in c_tablespaces loop
        htp.print('oracle_tablespace_bytes_free{tablespace="'||rec_tablespaces.tablespace||'",type="'||rec_tablespaces.type||'"} ' || rec_tablespaces.free);
    end loop;
    -- max bytes
    htp.print('# HELP oracle_tablespace_max_bytes Generic counter metric of tablespace max bytes in Oracle DB');
    htp.print('# TYPE oracle_tablespace_max_bytes gauge');
    for rec_tablespaces in c_tablespaces loop
        htp.print('oracle_tablespace_max_bytes{tablespace="'||rec_tablespaces.tablespace||'",type="'||rec_tablespaces.type||'"} ' || rec_tablespaces.max_bytes);
    end loop;
--
-- Locked Accounts
--
    htp.print('# HELP oracle_locked_accounts Metrics from dba_users');
    htp.print('# TYPE oracle_locked_accounts counter');
    for rec_locked_accounts in c_locked_accounts loop
        htp.print('oracle_locked_accounts{username="'||rec_locked_accounts.username||'"} ' || rec_locked_accounts.value);
    end loop;
--
-- File IO
--
    htp.print('# HELP oracle_fileio Metrics from v$filestat, v$datafile');
    htp.print('# TYPE oracle_fileio counter');
    for rec_fileio in c_fileio loop
        htp.print('oracle_fileio{tablespace_name="'||rec_fileio.tablespace_name||'",filename="'||rec_fileio.filename||'",type="blocksread"} ' || rec_fileio.blocksread);
        htp.print('oracle_fileio{tablespace_name="'||rec_fileio.tablespace_name||'",filename="'||rec_fileio.filename||'",type="blockswriten"} ' || rec_fileio.blockswriten);
        htp.print('oracle_fileio{tablespace_name="'||rec_fileio.tablespace_name||'",filename="'||rec_fileio.filename||'",type="totalio"} ' || rec_fileio.totalio);
    end loop;
--
-- Top Sessions
--
    htp.print('# HELP oracle_top_sessions Metrics from v$session, v$sesstat, v$statname');
    htp.print('# TYPE oracle_top_sessions counter');
    for rec_top_sessions in c_top_sessions loop
        htp.print('oracle_top_sessions{username="'||rec_top_sessions.username||'",sid="'||rec_top_sessions.sid||'",program="'||rec_top_sessions.program||'"} ' || rec_top_sessions.memorykb);
    end loop;
--
-- SGA Info
--
    htp.print('# HELP oracle_sga_info Metrics from v$sgainfo');
    htp.print('# TYPE oracle_sga_info counter');
    for rec_sga_info in c_sga_info loop
        htp.print('oracle_sga_info{name="'||rec_sga_info.name||'"} ' || rec_sga_info.value);
    end loop;
--
-- DB Status
--
   htp.print('# HELP oracle_db_status from v$instance');
   htp.print('# TYPE oracle_db_status counter');
   for rec_db_status in c_db_status loop
       htp.print('oracle_db_status{name="'||rec_db_status.name||'"} ' || rec_db_status.value);
   end loop;
--
-- Archiver Status
--
   htp.print('# HELP oracle_archiver_status from v$instance');
   htp.print('# TYPE oracle_archiver_status counter');
   for rec_archiver_status in c_archiver_status loop
       htp.print('oracle_archiver_status{name="'||rec_archiver_status.name||'"} ' || rec_archiver_status.value);
   end loop;
end;
/
show err

To test, if the metrics are generated correctly enable the OWA Output in your SQL Developer and execute the above created procedures.



As next we have to setup the Oracle REST Data Services (ORDS). I am using as a showcase the ORDS 19.4 release and will do a setup in Standalone Mode, so with the integrated Jetty Server, you can also deploy your ords.war into a Tomcat or an Oracle WebLogic Server.

Download the ORDS 19.4 from Technical Resources from Oracle (former OTN) https://www.oracle.com/database/technologies/appdev/rest-data-services-downloads.html, transfer the zip file to your server and unzip it somewhere on your server, I am using /u00/app/oracle/product/ords-19.4 (so I am a bit OFA aligned).
Then you have to run the install process:

 #
[oracle@prometheus ords-19.4]$ cd /u00/app/oracle/product/ords-19.4
[oracle@prometheus ords-19.4]$ java -jar ords.war install advanced
This Oracle REST Data Services instance has not yet been configured.
Please complete the following prompts


Enter the location to store configuration data: .
Specify the database connection type to use.
# Choose Option 1 for a Basic Connection String
Enter number for [1] Basic  [2] TNS  [3] Custom URL [1]:1
# Provide the Servername of your Database Server Host
Enter the name of the database server [localhost]:prometheus
# Provide the Listener Port or confirm the default value with 1521
Enter the database listen port [1521]:
# If you are using Database Service Names, choose Option 1
Enter 1 to specify the database service name, or 2 to specify the database SID [1]:
# Provide the Database Service Name
Enter the database service name:DB193
# Choose Option 2
Enter 1 if you want to verify/install Oracle REST Data Services schema or 2 to skip this step [1]:2
# Confirm the default Option 1
Enter 1 if you want to use PL/SQL Gateway or 2 to skip this step.
If using Oracle Application Express or migrating from mod_plsql then you must enter 1 [1]:
# Provide the Database User, in my case dbsnmp
Enter the PL/SQL Gateway database user name [APEX_PUBLIC_USER]:dbsnmp
# Provide 2 times the password of the above provided Database User
Enter the database password for dbsnmp:
Confirm password:
# Choose Option 3
Enter a number to select a feature to enable [1] SQL Developer Web [2] REST Enabled SQL [3] None [1]:3
2020-03-02T08:33:12.149Z INFO   reloaded pools: []
# For showcases I am using the Standalone Mode, so choose Option 1
Enter 1 if you wish to start in standalone mode or 2 to exit [1]:
# Choose if you are using http or https, I choose 1 = http
Enter 1 if using HTTP or 2 if using HTTPS [1]:1
# Provide the http Port, in my case 7778
Enter the HTTP port [8080]:7778
2020-03-02 09:33:34.927:INFO::main: Logging initialized @84052ms to org.eclipse.jetty.util.log.StdErrLog
2020-03-02 09:33:34.967:INFO:oeju.TypeUtil:main: JVM Runtime does not support Modules
2020-03-02T08:33:35.017Z INFO   HTTP and HTTP/2 cleartext listening on host: localhost port: 7778
2020-03-02T08:33:35.039Z INFO   Disabling document root because the specified folder does not exist: /u00/app/oracle/product/ords-19.4/ords/standalone/doc_root
2020-03-02 09:33:35.342:INFO:oejs.Server:main: jetty-9.4.24.v20191120; built: 2019-11-22T11:09:44.612Z; git: 8b8c80157294e38f81ef8ea2358a0c49bf5db918; jvm 1.8.0_232-b09
2020-03-02 09:33:35.380:INFO:oejs.session:main: DefaultSessionIdManager workerName=node0
2020-03-02 09:33:35.380:INFO:oejs.session:main: No SessionScavenger set, using defaults
2020-03-02 09:33:35.381:INFO:oejs.session:main: node0 Scavenging every 660000ms
2020-03-02T08:33:36.218Z INFO   Configuration properties for: |apex||
db.connectionType=basic
db.hostname=prometheus
db.port=1521
db.servicename=DB193
db.password=******
db.username=dbsnmp
resource.templates.enabled=true

2020-03-02T08:33:36.222Z WARNING *** jdbc.MaxLimit in configuration |apex|| is using a value of 10, this setting may not be sized adequately for a production environment ***
2020-03-02T08:33:36.222Z WARNING *** jdbc.InitialLimit in configuration |apex|| is using a value of 3, this setting may not be sized adequately for a production environment ***
2020-03-02T08:33:37.045Z INFO   Oracle REST Data Services initialized
Oracle REST Data Services version : 19.4.0.r3521226
Oracle REST Data Services server info: jetty/9.4.24.v20191120

2020-03-02 09:33:37.524:INFO:oejsh.ContextHandler:main: Started o.e.j.s.ServletContextHandler@4470f8a6{/ords,null,AVAILABLE}
2020-03-02 09:33:37.535:INFO:oejs.AbstractConnector:main: Started ServerConnector@74eb909f{HTTP/1.1,[http/1.1, h2c]}{0.0.0.0:7778}
2020-03-02 09:33:37.535:INFO:oejs.Server:main: Started @86661ms

# Now hit Control-C to stop the Jetty Server

^C2020-03-02 09:36:07.794:INFO:oejs.AbstractConnector:Thread-1: Stopped ServerConnector@74eb909f{HTTP/1.1,[http/1.1, h2c]}{0.0.0.0:7778}
2020-03-02 09:36:07.795:INFO:oejs.session:Thread-1: node0 Stopped scavenging
2020-03-02 09:36:07.811:INFO:oejsh.ContextHandler:Thread-1: Stopped o.e.j.s.ServletContextHandler@4470f8a6{/ords,null,UNAVAILABLE}

After the Jetty Server is stopped, we can setup the Database Definition for our Prometheus Metrics.

 #
[oracle@prometheus ords-19.4]$ java -jar ords.war setup --database prometheus_metrics
Specify the database connection type to use.
# Choose Option 1 for a Basic Connection String
Enter number for [1] Basic  [2] TNS  [3] Custom URL [1]:1
# Provide the Servername of your Database Server Host
Enter the name of the database server [localhost]:prometheus
# Provide the Listener Port or confirm the default value with 1521
Enter the database listen port [1521]:
# If you are using Database Service Names, choose Option 1
Enter 1 to specify the database service name, or 2 to specify the database SID [1]:
# Provide the Database Service Name
Enter the database service name:DB193
# Choose Option 2
Enter 1 if you want to verify/install Oracle REST Data Services schema or 2 to skip this step [1]:2
# Confirm the default Option 1
Enter 1 if you want to use PL/SQL Gateway or 2 to skip this step.
If using Oracle Application Express or migrating from mod_plsql then you must enter 1 [1]:
# Provide the Database User, in my case dbsnmp
Enter the PL/SQL Gateway database user name [APEX_PUBLIC_USER]:dbsnmp
# Provide 2 times the password of the above provided Database User
Enter the database password for dbsnmp:
Confirm password:
# Choose Option 3
Enter a number to select a feature to enable [1] SQL Developer Web [2] REST Enabled SQL [3] None [1]:3
2020-03-02T08:37:46.632Z INFO   reloaded pools: [|apex||, |prometheus_metrics||]

As next we need to define the mapped URL for the above created Database Definition

 #
[oracle@prometheus ords-19.4]$ java -jar ords.war map-url --type base-path /prometheus_metrics prometheus_metrics
2020-03-02T08:38:32.462Z INFO   Creating new mapping from: [base-path,/prometheus_metrics] to map to: [prometheus_metrics, null, null]

Finally we can startup the ords.war in standalone mode

 #
[oracle@prometheus ords-19.4]$ java -jar ords.war standalone
2020-03-02 09:38:43.552:INFO::main: Logging initialized @1525ms to org.eclipse.jetty.util.log.StdErrLog
2020-03-02 09:38:43.595:INFO:oeju.TypeUtil:main: JVM Runtime does not support Modules
2020-03-02T08:38:43.652Z INFO   HTTP and HTTP/2 cleartext listening on host: localhost port: 7778
2020-03-02T08:38:43.682Z INFO   Disabling document root because the specified folder does not exist: /u00/app/oracle/product/ords-19.4/ords/standalone/doc_root
2020-03-02 09:38:44.283:INFO:oejs.Server:main: jetty-9.4.24.v20191120; built: 2019-11-22T11:09:44.612Z; git: 8b8c80157294e38f81ef8ea2358a0c49bf5db918; jvm 1.8.0_232-b09
2020-03-02 09:38:44.330:INFO:oejs.session:main: DefaultSessionIdManager workerName=node0
2020-03-02 09:38:44.330:INFO:oejs.session:main: No SessionScavenger set, using defaults
2020-03-02 09:38:44.331:INFO:oejs.session:main: node0 Scavenging every 600000ms
2020-03-02T08:38:45.500Z INFO   Configuration properties for: |apex||
db.connectionType=basic
db.hostname=prometheus
db.port=1521
db.servicename=DB193
db.password=******
db.username=dbsnmp
resource.templates.enabled=true

2020-03-02T08:38:45.507Z WARNING *** jdbc.MaxLimit in configuration |apex|| is using a value of 10, this setting may not be sized adequately for a production environment ***
2020-03-02T08:38:45.507Z WARNING *** jdbc.InitialLimit in configuration |apex|| is using a value of 3, this setting may not be sized adequately for a production environment ***
2020-03-02T08:38:46.221Z INFO   Configuration properties for: |prometheus_metrics||
db.connectionType=basic
db.hostname=prometheus
db.port=1521
db.servicename=DB193
db.password=******
db.username=dbsnmp
resource.templates.enabled=true

2020-03-02T08:38:46.221Z WARNING *** jdbc.MaxLimit in configuration |prometheus_metrics|| is using a value of 10, this setting may not be sized adequately for a production environment ***
2020-03-02T08:38:46.221Z WARNING *** jdbc.InitialLimit in configuration |prometheus_metrics|| is using a value of 3, this setting may not be sized adequately for a production environment ***
2020-03-02T08:38:46.594Z INFO   Oracle REST Data Services initialized
Oracle REST Data Services version : 19.4.0.r3521226
Oracle REST Data Services server info: jetty/9.4.24.v20191120

2020-03-02 09:38:47.093:INFO:oejsh.ContextHandler:main: Started o.e.j.s.ServletContextHandler@47db50c5{/ords,null,AVAILABLE}
2020-03-02 09:38:47.103:INFO:oejs.AbstractConnector:main: Started ServerConnector@2438dcd{HTTP/1.1,[http/1.1, h2c]}{0.0.0.0:7778}
2020-03-02 09:38:47.103:INFO:oejs.Server:main: Started @5078ms

Just to test, open a browser and point to http://<servername>:<Jetty Port>/ords/prometheus_metrics/PROMETHEUS_METRICS and you show see as follows:


The final step is to add in your Prometheus Configuration File prometheus.yml under the scrape_configs section a job_name for the Prometheus formatted metrics out of your Oracle Database

 
. . .
scrape_configs:
. . .
# Align the IP address and Ports for the targets with your settings
  - job_name: 'oracle_db'
    scrape_interval: 10s
    scrape_timeout: 5s
    metrics_path: /ords/prometheus_metrics/PROMETHEUS_METRICS
    static_configs:
      - targets: ['172.17.0.1:7778']
. . .

Restart your Prometheus Server and check the status of the new target in the Prometheus Console


Under the Graph section in the Prometheus Console you can find your defined metrics from the Oracle Database


Finally you can create in your Grafana a Dashboard to reflect your Oracle Metrics:


As you can see, with a little bit of work, you can collect from your Oracle Database with really easy methods Prometheus formatted metrics, push them into your Prometheus system and build a nice dashboard with Grafana. You can even setup thresholds for your Oracle Database metrics and generate alerts either using the Prometheus Alertmanager or trigger the alerts via Grafana.