

This post of Merpay Advent Calendar 2019 is brought to you by the Merpay AML/CFT team.

As explained in our previous articles (1 and 2), Merpay's AML (Anti-Money Laundering) system uses Splunk as its centralized database and rule engine.

There are various microservices within Merpay and Mercari, so there are several ways in which we obtain the data, but most of them use Google Cloud Platform (GCP) Pub/Sub.

Our bridge server subscribes to Pub/Sub and preprocesses the data for transaction monitoring purposes. This includes standardizing field names and values from the differing microservices, and also enriching the data with extra information:

(Table: fields before enrichment. Figure: data enrichment for an item purchase event.)

After preprocessing, we serialize the data to JSON and it is ready to be sent to Splunk.

There are many ways to send data to Splunk, each with its own pros and cons. Therefore, to select the best one, we first have to know our needs.

Speed: For timely detection of suspicious transactions, data must arrive with low latency. Initially we aimed for a latency of 5 seconds.

Robustness: Even if there is a temporary network or system failure, all the data must eventually be indexed by Splunk. Data loss must be avoided even if an OS or disk failure happens.

We can send data to AWS Kinesis Firehose and then configure it to forward the data to Splunk's HTTP Event Collector. Let's look at the alternatives to understand why we eventually chose to develop our own aggregator.
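As a rough illustration of the preprocessing described above, the sketch below standardizes per-service field names, enriches the event, and serializes it to JSON. The field names, aliases, and enrichment values here are hypothetical, not Merpay's actual schema:

```python
import json

# Hypothetical mapping from per-microservice field names to a common schema.
FIELD_ALIASES = {
    "buyer_id": "user_id",
    "customer": "user_id",
    "amount_jpy": "amount",
}

def standardize(raw: dict) -> dict:
    """Rename differing per-service fields to common names."""
    return {FIELD_ALIASES.get(k, k): v for k, v in raw.items()}

def enrich(event: dict) -> dict:
    """Attach extra context used by the monitoring rules (illustrative values)."""
    enriched = dict(event)
    enriched["service"] = "item_purchase"  # event type label
    enriched["currency"] = "JPY"           # assumed default currency
    return enriched

def preprocess(raw: dict) -> str:
    """Standardize, enrich, and serialize one message payload."""
    return json.dumps(enrich(standardize(raw)), sort_keys=True)

payload = preprocess({"buyer_id": "u123", "amount_jpy": 4500})
print(payload)
# → {"amount": 4500, "currency": "JPY", "service": "item_purchase", "user_id": "u123"}
```

In the real bridge server, `preprocess` would run inside the Pub/Sub subscriber callback before the event is forwarded to Splunk.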

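For context on the delivery side, a minimal sketch of sending one JSON event to Splunk's HTTP Event Collector (HEC) follows. The host and token are placeholders; 8088 is HEC's default port, and `Authorization: Splunk <token>` is HEC's standard auth header:

```python
import json
import urllib.request

def build_hec_request(host: str, token: str, event: dict) -> urllib.request.Request:
    """Build an HTTP POST request for Splunk's HTTP Event Collector."""
    body = json.dumps({"event": event, "sourcetype": "_json"}).encode()
    return urllib.request.Request(
        url=f"https://{host}:8088/services/collector/event",
        data=body,
        headers={
            "Authorization": f"Splunk {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_hec_request("splunk.example.com", "PLACEHOLDER-TOKEN", {"user_id": "u123"})
print(req.full_url)
# Actually sending it would be: urllib.request.urlopen(req)
# (requires a reachable HEC endpoint and a valid token).
```

Kinesis Firehose, described above, targets this same HEC endpoint on Splunk's behalf; the trade-offs between that and our own aggregator are discussed next.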