First, let me introduce our use case: a real-time data analytics platform.
An external system produces time-series data every second. Each record consists of [id, time, value] fields, and the system exposes a REST API for querying the data.
We have many (more than 100) standalone CPP programs that analyze the time-series data and write KPIs into a database. The computation is real-time: every second, each CPP program reads the data, processes it, and sends its KPI result to the database.
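To make the per-program workload concrete, here is a minimal sketch of the loop each program runs. The query URL, response format, and the KPI functions are illustrative assumptions, not the real APIs:

```cpp
#include <chrono>
#include <string>
#include <thread>
#include <curl/curl.h>

// libcurl write callback: append the response body to a std::string.
static size_t collect(char* data, size_t size, size_t nmemb, void* out) {
    static_cast<std::string*>(out)->append(data, size * nmemb);
    return size * nmemb;
}

// Blocking HTTP GET; returns the response body.
static std::string httpGet(const std::string& url) {
    std::string body;
    CURL* curl = curl_easy_init();
    if (!curl) return body;
    curl_easy_setopt(curl, CURLOPT_URL, url.c_str());
    curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, collect);
    curl_easy_setopt(curl, CURLOPT_WRITEDATA, &body);
    curl_easy_perform(curl);
    curl_easy_cleanup(curl);
    return body;
}

// Placeholders for the program-specific parts.
static double computeKpi(const std::string& rawTsData) { return 0.0; }
static void sendKpiToDb(double kpi) { /* e.g. INSERT into the KPI table */ }

int main() {
    // Hypothetical query URL; each program asks only for its own IDs.
    const std::string url = "http://external-system/api/ts?ids=ID1,ID2,ID3,ID4";
    while (true) {
        double kpi = computeKpi(httpGet(url));  // read + process, once per second
        sendKpiToDb(kpi);
        std::this_thread::sleep_for(std::chrono::seconds(1));
    }
}
```

With 100+ programs each running this loop, the external system and the DB each see 100+ requests every second, which is exactly problems (1) and (2) below.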
The system architecture we use is very simple:
But the system has problems:
1. Every second, the external system receives a huge number of HTTP requests, which degrades its performance.
2. The DB suffers from the same problem as (1).
3. We have no suitable tools to manage the CPP programs; we don't know when or why they crash. We want to receive an alert when any of them has a problem.
4. For the same lack of tooling, we can only deploy and start the CPP programs one by one.
5. Many programs request the same time-series data. For example, program A requests [ID1, ID2, ID3, ID4], program B [ID2, ID4, ID6, ID7], and program C [ID3, ID5, ID5, ID7], so a large amount of duplicate data appears across the requests (quantified in the sketch below).
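To quantify the duplication in problem (5): the three example requests fetch 12 IDs in total, but only 7 are distinct, so roughly 40% of the transferred data is redundant. One consolidated fetch of the union would avoid that:

```cpp
#include <iostream>
#include <set>
#include <string>
#include <vector>

int main() {
    std::vector<std::vector<std::string>> requests = {
        {"ID1", "ID2", "ID3", "ID4"},   // program A
        {"ID2", "ID4", "ID6", "ID7"},   // program B
        {"ID3", "ID5", "ID5", "ID7"},   // program C
    };
    std::set<std::string> unique;       // a set keeps one copy of each ID
    std::size_t total = 0;
    for (const auto& req : requests) {
        unique.insert(req.begin(), req.end());
        total += req.size();
    }
    std::cout << total << " IDs requested, " << unique.size()
              << " distinct\n";         // prints: 12 IDs requested, 7 distinct
}
```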
After some investigation, we think the WSO2 products are the best choice to solve our problems, so we changed the architecture:
We use DAS to fetch the TS data, dispatch it, and collect the KPI results. GREG is used to manage the CPP programs' lifecycle.
In GREG:
- Define a new artifact type that holds the CPP program (.exe or script). We want to use the publisher web console to publish new CPP programs and manage each program's lifecycle (start/stop/pause/reset); this part is still in development, and we cannot yet confirm it can be achieved (see the sketch after this list).
- Upload the CPP program files to the Enterprise Store so that users can subscribe to them from the GREG publisher.
- Monitor every CPP program.
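Both the lifecycle item and the monitoring item need some way to enumerate the registered programs. A hypothetical sketch, assuming the new artifact type is exposed through GREG's Governance REST API under a plural short name like cppprograms (the real path depends on the artifact type definition):

```cpp
#include <iostream>
#include <string>

// Stub here for brevity; the libcurl version from the first sketch applies.
static std::string httpGet(const std::string& url) { return "{}"; }

int main() {
    // Hypothetical path; 9443 is GREG's default management port.
    std::cout << httpGet("https://greg-host:9443/governance/cppprograms")
              << "\n";   // expected: the list of registered CPP artifacts
}
```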
In DAS:
- Create a custom receiver that gets the ID list from GREG every 30s and fetches the time-series data from the external system (see the sketch after this list).
- Create the stream, which persists the event data.
- Create the execution plan, which uses Siddhi to reorganize the time-series data for each CPP program.
- Create an HTTP receiver to receive the KPI results sent by the CPP programs.
- Create the publisher to send the KPIs to the external DB store.
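The custom receiver is where the deduplication from problem (5) pays off. The sketch below shows only its scheduling logic: refresh the ID list from GREG every 30 s, then issue one batched request per second for the union of all IDs. The real receiver would be implemented inside DAS (in Java, against its event adapter API); the endpoints and the hand-off function here are assumptions:

```cpp
#include <chrono>
#include <string>
#include <thread>

// Stubs for brevity; httpGet is the libcurl version from the first sketch.
static std::string httpGet(const std::string& url) { return "[]"; }
static std::string fetchIdListFromGreg() {            // union of all programs' IDs
    return "ID1,ID2,ID3,ID4,ID5,ID6,ID7";             // placeholder
}
static void pushToStream(const std::string& events) { /* hand off to the DAS stream */ }

int main() {
    std::string ids;
    // Start far enough in the past to force a refresh on the first pass.
    auto lastRefresh = std::chrono::steady_clock::now() - std::chrono::seconds(31);
    while (true) {
        auto now = std::chrono::steady_clock::now();
        if (now - lastRefresh >= std::chrono::seconds(30)) {
            ids = fetchIdListFromGreg();              // refresh the ID list every 30 s
            lastRefresh = now;
        }
        // One batched request per second replaces 100+ per-program requests.
        pushToStream(httpGet("http://external-system/api/ts?ids=" + ids));
        std::this_thread::sleep_for(std::chrono::seconds(1));
    }
}
```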
So, are there any problems with our architecture? And is this the best way to use DAS and GREG?
Thanks for any suggestions.
Source: https://stackoverflow.com/questions/35934015/the-architecture-we-use-with-das-and-greg