SQL Server table to tab-delimited files for ELK using the BCP utility

One of our customers recently implemented ELK for log analysis, and they wanted to pull data from an activity table called t_activity in all of their production databases. You might ask what application activity is doing inside a relational database :) We agree, and they are working on moving it out. These tables are huge and grow at a frenetic pace, so we were looking for an efficient way to spool the data out.

The Logstash JDBC input plugin can connect to relational databases and execute SQL queries at regular intervals to pull data, but I didn't like the idea of a log analysis system talking to production databases, even if the queries use NOLOCK. Besides, cron-style jobs have to run on the Logstash server to poll the table, and if we decide to disable polling for a particular server, the configuration change requires a Logstash restart.

We finally decided to run a SQL Agent job on the production servers to perform a differential load based on the activity id at regular intervals. We needed to create tab-delimited output files from these huge tables without locking the table and causing performance issues, and bcp is the perfect tool for the job: it can create tab-delimited files (or files with custom delimiters) and export data very fast. Below are the steps for configuring this job against an imaginary database/table.

Running a full table load each time is not a good idea, so we need a small state table to record how far the previous load got and drive the differential loads.

-- Create a table for storing the load status
CREATE TABLE [dbo].[dba_LoadStatus](  
    [id] [int] NOT NULL
) ON [PRIMARY]
GO  
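
If dba_LoadStatus starts out empty, MAX(id) returns NULL and the first differential load will pick up nothing, so it is worth seeding the table with a starting value (0 here, assuming activity ids start above that):

-- Seed the state table so the first run has a baseline to compare against
INSERT INTO [dbo].[dba_LoadStatus] (id) VALUES (0)
GO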

Create a stored procedure that looks up the load status on each job run and performs a differential load. At the end of the load, it writes MAX(id) from the source table into the state table for the next run.

USE [Aurora]  
GO

SET ANSI_NULLS ON  
GO

SET QUOTED_IDENTIFIER ON  
GO

CREATE PROCEDURE [dbo].[dba_activityload_elk]
AS
DECLARE @newloadid int
DECLARE @oldloadid int

-- High-water marks: the latest activity id, and the id recorded by the previous run
SELECT @newloadid = MAX(id) FROM t_Activity
SELECT @oldloadid = MAX(id) FROM dba_LoadStatus

-- Return only the rows added since the last run
SELECT dbo.f_UtcToPacificDate(timestamp) AS TimeStamp,
       isdeleted, 
       id, 
       type, 
       sessionid, 
       '[' + username + ']'       AS username, 
       '[' + campaignname + ']'   AS CampaignName, 
       '[' + message + ']'        AS Message, 
       '[' + datum + ']'          AS Datum 
FROM   t_activity 
WHERE  id > @oldloadid

-- Record the new high-water mark for the next run
INSERT INTO dba_LoadStatus VALUES (@newloadid)
GO
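
A quick manual run is an easy way to sanity-check the procedure before wiring it into a job (assuming the state table above has been created and seeded):

-- Run the differential load once and confirm the state table advanced
EXEC [dbo].[dba_activityload_elk]
SELECT MAX(id) AS LastLoadedId FROM [dbo].[dba_LoadStatus]
GO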

The next step is a bit tricky and some of you might not like it: enabling xp_cmdshell. Well, I know it's evil and a cute kitten dies each time it runs, but it gets the job done here.

EXEC sp_configure 'show advanced options', 1;  
GO  
RECONFIGURE;  
GO  
EXEC sp_configure 'xp_cmdshell', 1;  
GO  
RECONFIGURE;  
GO  
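
A quick way to confirm xp_cmdshell works (and that the SQL Server service account can reach the output folder used below) is to run a harmless shell command:

-- Should return a directory listing rather than a permissions error
EXEC master..xp_cmdshell 'dir D:\Logs'
GO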

Now we need a job that runs every 5 minutes, spools data only if new rows were added since the last run, and creates a new tab-delimited file on each run. Here we go:

DECLARE @OutputFile NVARCHAR(100),  
        @FilePath   NVARCHAR(100), 
        @bcpCommand NVARCHAR(1000), 
        @newloadid  INT, 
        @oldloadid  INT 

SELECT @newloadid = Max(id)  
FROM   [Aurora].[dbo].[t_activity]

SELECT @oldloadid = Max(id)  
FROM   [Aurora].[dbo].[dba_loadstatus]

IF @newloadid = @oldloadid  
  BEGIN 
      RAISERROR('No new activity records were added since the last run. Skipping this load.', 1, 1) 

      GOTO EndMe 
  END 

SET @bcpCommand =  
'bcp "exec [Aurora].dbo.dba_activityload_elk " queryout '  
SET @FilePath = 'D:\Logs\'  
SET @OutputFile = 'dev-engineactivity' + '-'  
                  + CONVERT(VARCHAR(500), Getdate(), 112) + '-' 
                  + Replace(CONVERT (VARCHAR(8), Getdate(), 108), ':', '') 
                  + '.log' 
SET @bcpCommand = @bcpCommand + @FilePath + @OutputFile  
                  + ' -c -T -S' + @@servername 

EXEC master..Xp_cmdshell  
  @bcpCommand 

EndMe:  

This script can be scheduled as a SQL Agent job that runs every 5 minutes.
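
For reference, here is a minimal sketch of scheduling it through the msdb stored procedures (the job and schedule names are made up, and the @command body should be the full spool script above):

USE msdb
GO
EXEC dbo.sp_add_job @job_name = N'ELK activity export'
EXEC dbo.sp_add_jobstep @job_name = N'ELK activity export',
     @step_name = N'bcp differential load',
     @subsystem = N'TSQL',
     @database_name = N'Aurora',
     @command = N'/* paste the spool script from above here */'
EXEC dbo.sp_add_jobschedule @job_name = N'ELK activity export',
     @name = N'Every 5 minutes',
     @freq_type = 4,              -- daily
     @freq_interval = 1,
     @freq_subday_type = 4,       -- units of minutes
     @freq_subday_interval = 5
EXEC dbo.sp_add_jobserver @job_name = N'ELK activity export'
GO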

Now that we have the files, how do we get them into ELK? You can use Filebeat as a lightweight shipper for this log data.
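
A minimal Filebeat configuration sketch might look like the one below; the paths and the Logstash host are assumptions, and the input type varies with the Filebeat version:

# filebeat.yml - pick up the bcp output files and ship them to Logstash
filebeat.inputs:
  - type: log
    paths:
      - D:\Logs\dev-engineactivity-*.log
output.logstash:
  hosts: ["your-logstash-host:5044"]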

Hope this helps someone!
